Posts by Skip Da Shu

1) Questions and Answers : Preferences : Update preferences doesn't stick. (Message 41585)
Posted 1 Feb 2011 by Skip Da Shu
Post:
I changed my resource allocation and hit update; it waited a while and returned with the allocation unchanged. Three times.
2) Questions and Answers : Unix/Linux : platform 'i586-mandriva-linux-gnu' not found (Message 38661)
Posted 9 Jan 2010 by Skip Da Shu
Post:
Grrrr. Stupid Mandriva packagers.

They're compiling it wrong. So the server is saying "Hey, I don't have an app for that platform. Now if you were asking for i686-pc-linux-gnu, we'd be all set."

Here's what I'd do if I were you. I'd run down all the work on your machine, if there is any. Uninstall BOINC using the package manager. Grab BOINC from here. Follow the instructions here to install.

And if you really want to help, file a bug against the package. Point the packager to this Wiki page and tell them to make note of the configure command.


I believe there are other ways around this (using the anonymous platform method) but that seems like a pain to me.


I CANNOT believe this problem STILL exists... somebody handed me a Mandriva machine (2010) and I was installing BOINC for them... after getting around the gui_rpc_auth.cfg problem where the manager will not attach to the client I'd hoped I was done... attach any project... and I get this crap... 7 months later. They need to take a look at the Debian package. It just works.
3) Message boards : Number crunching : Upload problem (Message 37342)
Posted 25 Jun 2009 by Skip Da Shu
Post:

Apparently this didn't happen after all, and that server is still physically in Switzerland.


It is indeed.
I've just heard that it's up again and I'm reconfiguring it.
Interestingly, the link to the 2005 problems refers to both power supplies on uploader1.atm failing. This is the same machine that failed recently, only more severely. I think it's about time to retire that hardware and any future machine appearing on that url will be new.


Reading that historical account of the 'Bern Floods', I thought I picked up that y'all are using Dell stuff. Living about 10 miles from Dell, you'd be surprised how many Dell server parts are lying around within a 50m radius of here. Next time you need a PSU, post up the model... we might be able to locate one in a matter of hours. I'd call Steve first... about 3 miles up the hwy from me... unless he's 'cleaned up' he has an entire room in his house full of Dell stuff... mostly server stuff... I don't ask. LOL.
4) Message boards : Number crunching : problem uploading (Message 37338)
Posted 24 Jun 2009 by Skip Da Shu
Post:

Number Crunching is also the appropriate section for posts about problems; there are currently 2 threads there discussing it.


Les, I agree this s/b in number crunching... but...

The link in this (below) on the front page directs folks here.

Recent updates

* Server space shortage
2009-06-10


Think maybe that should be redirected to the news thread in number crunching.

5) Message boards : Number crunching : Just a thought... Donated / Loaned Hardware (Message 37202)
Posted 13 Jun 2009 by Skip Da Shu
Post:
Mo/Milo, I know this is a long-shot but...

The folks that crunch for CPDN are a wide and varied group. I'm sure folks' jobs also range all over the place, from mere Rocket Scientist to highly esteemed Sanitation Engineer. However, I strongly suspect that there's a higher than average % of folks that are somehow connected to IT (and/or are just hobby hardware geeks). As I sit here and ponder the disk space and power supply problems going on, my eyes keep drifting to my 'parts shelves'. Specifically to the 7 extra HDDs, 4 extra power supplies and 17 power cords over there. Now my stuff is officially certified 'Junk' and all older than dust. But I can't help but wonder if someone else has access to, or happens to have, a spare PSU or a "raid array" of the same make/model that y'all need. Not likely, but posting up the specific make/model of the parts needed wouldn't take more than a couple of minutes and we might just get lucky. Maybe somebody with more means than I would be willing to ship y'all a PSU as a 'loaner'. Who knows? Maybe worth a shot in the future.

Hmmmm... Now that reminds me, where is that old dead ProLiant server box I kept around for the PSU and SCSI card?
6) Message boards : Number crunching : HADAM3P NOT Compatible with Pentium3s & AthlonXPs (Message 36894)
Posted 9 May 2009 by Skip Da Shu
Post:
I am using an Athlon XP 2400 chip and was getting this sort of problem with HADAM3P.


Those really old systems are susceptible to VCore voltage drops when models are running. I have an old Athlon MP 1200 system that does this. It can run most applications fine, but climate models and other BOINC apps don't always finish. By monitoring the voltages, I found the problem. Turns out the only fix is a new motherboard. :(
Unless the BIOS allows you to set the vcore.
7) Message boards : Number crunching : to how many computers is a WU sent? (Message 35942)
Posted 16 Jan 2009 by Skip Da Shu
Post:
Please see my post here in reply to a similar question on our other board.
.......................
Les,
Unencumbered by any actual knowledge...

I was thinking that it's not a matter of modifying the edit in code but rather some parameterization to set "'X' # of units will be sent out" so that the edit is passed / message not triggered.
Oh well, no response needed... just rambling thoughts.
8) Message boards : Number crunching : upload server (Message 35941)
Posted 16 Jan 2009 by Skip Da Shu
Post:
The server has its new RAID array and has had an OS upgrade, but there are problems with the array that mean it can't come back on line yet. This is supposed to have been worked on this afternoon and I'll hopefully find out more tomorrow.
As it's in a different department where I don't have machine room access I have to depend on the sysadmin there to deal with it.
...........
Offer 'em a 12-pack. It'll be fixed within the hour.
In days long gone this would always get my mainframe batch jobs through much faster at month-end.
9) Questions and Answers : Unix/Linux : Boinc on ubuntu server (Message 35940)
Posted 16 Jan 2009 by Skip Da Shu
Post:
For Xubuntu/Ubuntu you may also need to install the libstdc++5 package. It's been too long for my memory to be accurate on this but... I don't think this is required for CPDN but it is for some BOINC project apps. There were some posts on this back around Nov/Dec 2007... I'll take a quick look because I thought I'd documented what projects require what back then.
10) Message boards : Number crunching : Workunit error - check skipped (Message 35075)
Posted 21 Sep 2008 by Skip Da Shu
Post:
I am noticing that most if not all of my returned WUs that are marked as "success" have 'valid state' as "Workunit error - check skipped". Because some of these are OC'd dedicated crunchers, I was about to undertake an across-the-board reduction in clock speeds until I checked a couple of 'not overclocked' machines http://climateapps2.oucs.ox.ac.uk/cpdnboinc/show_host_detail.php?hostid=8270 and saw it getting the same result. Am I really contributing anything at all these days? Is this 'normal'?
11) Message boards : Number crunching : Where do all the errors come from? (Message 31571)
Posted 3 Dec 2007 by Skip Da Shu
Post:
If the answer to Mike's question is positive, have you recently run stability checks on the machine? Unless I looked at the wrong entries, the machine had 10 Models in WinXP SP1, none of which were successful. (Yes, three show 'Success' but they, too, failed; BOINC entries must be taken with a grain of salt.)

24 hours of dual Prime-95 wouldn't hurt. Just to be sure.


The problem appears to be the IA32 libs. However, it appears I tested stable under WinXP SP1 with FSB set at 216 and default vcore. Then at some unknown point I bumped the vcore a notch and the FSB to 218 and didn't run a full test (OCCT or Prime95). I haven't got Prime95 installed yet but backed the FSB down to 215 and left vcore up 1 notch just to be safe until I can find/install Prime95 or another stress tester. Thanx
12) Message boards : Number crunching : Where do all the errors come from? (Message 31570)
Posted 3 Dec 2007 by Skip Da Shu
Post:

Did you install IA32 support in Ubuntu? (I gather that Ubuntu 64-bit does NOT support 32-bit apps by default).

I used a package manager and found an IA32 library noted as 'shared 32bit libs for AMD64 system'. Installed that and it has resolved the code 22 on QMC and E@H WUs. Assuming it'll do the same for CPDN but have a couple more hours before I can get another one to verify.

Thank you VERY much.

UPDATE: A CPDN HadSM3 Slab WU has now been running for about 5 minutes. :-)
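
For anyone hitting the same thing: one way to confirm that a project app is a 32-bit binary (and so needs the IA32 libs on a 64-bit install) is to read the class byte of its ELF header. This is only a minimal sketch; `elf_class` is a made-up helper name, and it looks at nothing beyond the first five bytes of the file.

```python
def elf_class(path):
    """Return 32 or 64 for an ELF binary, or None if the file is not ELF.

    Byte 4 of the ELF identification block (EI_CLASS) is 1 for 32-bit
    objects and 2 for 64-bit objects.
    """
    with open(path, "rb") as f:
        header = f.read(5)
    # Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    return {1: 32, 2: 64}.get(header[4])
```

Calling `elf_class()` on an app file in the BOINC project directory (the exact path depends on the install) and getting 32 back on a 64-bit system would be consistent with the code 22 failures above.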
13) Message boards : Number crunching : Where do all the errors come from? (Message 31563)
Posted 2 Dec 2007 by Skip Da Shu
Post:
Well y'all didn't make me feel all warm and fuzzy about solving my errors... let's take a run at it.

Today (after a few days of winding down WUs on the machine) I formatted the HDD and installed Xubuntu v7.10 (64bit) on this machine. It's been running WinXP for some time with multiple projects on it. It's an AMD X2 4200+ with 2 x 256MB of PC4000 RAM. It's a dedicated number cruncher and is normally "headless".

I installed the v5 stdc++ libs (Gutsy comes with v6) required by several project apps (QMC, E@H, Leiden, WCG and perhaps CPDN). Used the package install to get the AMD64 version of BOINC v5.10.8 up and running as a daemon. I encountered these errors:

Sat 01 Dec 2007 06:24:50 PM CST|QMC@HOME|Reason: Unrecoverable error for result three_ad_anthracene.3996_0 (process exited with code 22 (0x16, -234))

Sat 01 Dec 2007 06:25:47 PM CST|climateprediction.net|Reason: Unrecoverable error for result hadsm3fub_0107_005913005_1 (process exited with code 22 (0x16, -234))

Sat 01 Dec 2007 06:25:51 PM CST|Einstein@Home|Reason: Unrecoverable error for result h1_0666.20_S5R2__265_S5R3a_1 (process exited with code 22 (0x16, -234))

Sat 01 Dec 2007 06:25:53 PM CST|World Community Grid|Reason: Unrecoverable error for result dddt0201k0629_ZINC06913243-0000_00_0 (process exited with code 22 (0x16, -234))
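
To see at a glance whether every project fails with the same exit code, message lines like the four above can be grouped by code. A rough sketch follows, assuming the pipe-separated `time|project|message` layout shown in this post; `failures_by_code` and the regular expression are my own and not part of BOINC.

```python
import re
from collections import defaultdict

# Matches lines of the form shown above:
#   <timestamp>|<project>|Reason: Unrecoverable error for result <name>
#   (process exited with code <N> (...))
LINE_RE = re.compile(
    r"^(?P<when>[^|]+)\|(?P<project>[^|]+)\|"
    r"Reason: Unrecoverable error for result (?P<result>\S+) "
    r"\(process exited with code (?P<code>\d+) "
)

def failures_by_code(lines):
    """Group (project, result) pairs by the exit code reported in each line."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # lines in any other format are skipped
            code = int(match.group("code"))
            grouped[code].append((match.group("project"), match.group("result")))
    return dict(grouped)

LOG = [
    "Sat 01 Dec 2007 06:24:50 PM CST|QMC@HOME|Reason: Unrecoverable error for result three_ad_anthracene.3996_0 (process exited with code 22 (0x16, -234))",
    "Sat 01 Dec 2007 06:25:47 PM CST|climateprediction.net|Reason: Unrecoverable error for result hadsm3fub_0107_005913005_1 (process exited with code 22 (0x16, -234))",
    "Sat 01 Dec 2007 06:25:51 PM CST|Einstein@Home|Reason: Unrecoverable error for result h1_0666.20_S5R2__265_S5R3a_1 (process exited with code 22 (0x16, -234))",
    "Sat 01 Dec 2007 06:25:53 PM CST|World Community Grid|Reason: Unrecoverable error for result dddt0201k0629_ZINC06913243-0000_00_0 (process exited with code 22 (0x16, -234))",
]

for code, items in failures_by_code(LOG).items():
    print(code, [project for project, _ in items])
```

All four projects land under the same code, which points at something common to the machine rather than at any one project.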


One thing that makes me think it's app dependent is that one of WCG's other apps runs fine.

Any thoughts?

PS: I see "execv: No such file or directory" in this returned result http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=7013096
14) Questions and Answers : Wish list : Merging old dead computers (Message 27807)
Posted 9 Apr 2007 by Skip Da Shu
Post:
Deleting won't happen, because the workunits do not get purged from the database. The units crunched on your machine are still being used for science, so they cannot remove it. This project is different from others that have a separate database for the units; CPDN does not. It keeps it on the online database.

They auto-hide, so just ignore them.


Look at 50038
15) Questions and Answers : Wish list : Merging old dead computers (Message 27805)
Posted 9 Apr 2007 by Skip Da Shu
Post:
I have 17 computers listed that I no longer have and / or they no longer crunch. 11 (about to be 12) of these have not contacted the server in over 13 months. We should have the ability to delete them after this much time. If that's not possible will you set up an email address where we can send you machine numbers and you delete them from the database? -- Skip
16) Questions and Answers : Preferences : Unable to merge computers (Message 20068)
Posted 9 Feb 2006 by Skip Da Shu
Post:
... I don't think that facility was created...

Is not host merging and deleting a stock part of BOINC? Does CPDN have/run no jobs that clean up / remove / purge old returned and/or old unreturned work from the queues/DBs? Is this what you are referring to as not being created?


But I'm sure they know about the 'problem'.


Do I detect a 'wee bit' of sarcasm here? Not much of a 'problem' when you only have 3 computers listed and only 1 that does anything... I'd guess that it's really 1 computer (why will they not merge? They appear identical)


At the moment, they are busy with the science, which is what their funders want.


Ahhh the old 'science card'. Well let me give you the stock reply also then... there is NO science without the crunchers.
17) Questions and Answers : Preferences : CPU speed + errors + continuous trickling (Message 20064)
Posted 9 Feb 2006 by Skip Da Shu
Post:
If set to blank screen, there should be no calculations done. You could verify that by watching Task Manager when Blank screen kicks in.


Wait a minute here... how do you see the task manager after the screen goes blank??!

General comment on Memtest86+ & Prime95: I find memtest to be a good final memory test. If I can get it to run w/o errors I don't have memory errors with anything I run. In fact Test5 on AMD XPs (especially dual channel) will find things nothing else does. So, great tool and a prerequisite for any sort of stable OC'ing set up.

However, I have some reservations about the absoluteness of Prime95. I personally believe CPDN to be a better FPU / CPU tester than Prime95. If P95 runs for an hour then the machine is ready to be tested with CPDN. Suspend all the other projects and if it runs overnight it's 'good to go'.

I'd be curious to see the General Prefs massic80 is using. Also has either machine been tweaked at all? I'd better read on.
18) Questions and Answers : Preferences : Unable to merge computers (Message 20062)
Posted 9 Feb 2006 by Skip Da Shu
Post:
Sorry, no way around it of which I'm aware.

In time, the machine count can become a meaningless indicator. Unfortunate, because it renders comparative stats questionable to meaningless.


So with a big stick in hand, whose arm do we need to twist to get some clean-up done?

I've got old machines whose last activity was 2004. Machines with no id, no units, no ip, no nothing... a trash entry. Machines that finished all work in 2005 or have only past due w/u's from 2005... all forms of trash. I've taken to hiding them as it's such a mess.

Who do we need to write to get some action in this area?

19) Message boards : Number crunching : To Completion Time (Message 16466)
Posted 6 Oct 2005 by Skip Da Shu
Post:
... text deleted...
If you suspended the slab unit now and set no new work for CP, I would estimate that around 1400 of 3600 hours to 28 Feb would be taken completing the sulphur model. That's about 39% of your time rather than the set 25%. It will then catch up on work on other projects before giving CP more time to do the slab before its deadline.

If you continue doing the slab, CP will end up taking much more of the time available than 39%. So if you want to avoid this, and keep resource usage more in line with what you have set, it could be sensible to suspend the slab model for 5 months. Since you are only 9 hours into the slab, this is what I would recommend.
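
The 39% figure in the quote can be checked with a couple of lines. This is just a sketch of the arithmetic using the round numbers given there (1400 hours needed, 3600 hours until the deadline, 25% configured share); the function name is invented for illustration.

```python
def required_share(hours_needed, hours_until_deadline):
    """Fraction of wall-clock time a model needs to finish by its deadline."""
    return hours_needed / hours_until_deadline

configured_share = 0.25                    # resource share set for CP
sulphur_share = required_share(1400, 3600) # numbers taken from the quote

print(f"needed: {sulphur_share:.0%}, configured: {configured_share:.0%}")
print("exceeds configured share:", sulphur_share > configured_share)
```

So even with the slab suspended, the sulphur model needs noticeably more than the configured share to make its deadline.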
... text deleted

Good clear thinking! I'm now upset with myself that I didn't think of suspending the slab and running the other one 1st. Duh. Thanx much Crandles.
20) Message boards : Number crunching : To Completion Time (Message 16445)
Posted 5 Oct 2005 by Skip Da Shu
Post:
Are the estimated time to completion predictions for AMD based Windows machines close to reality for the sulfur cycle WUs?

One of my machines is working a slab WU (1.75% complete) with 9 hours crunched and an estimated remaining of about 334 hours. Since CPDN gets 25% of the machine's time, that's 6 hours per day or about 55 days to complete. Well before the 9/2006 deadline. However, 55 days from now will put us into early December. Still no problem... but... I have a sulfur cycle 4.19 WU sitting in the queue behind it with a deadline of 2/28/2006 that has not started and has an estimated remaining of about 1543 hours! If I did the math right, 1543 at 6 per day is around 257 days. And this thing will not start till December 1st! Unless someone tells me that the estimated completion time will run down way faster than the actual time crunched... I need to abort this WU before it starts.
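
Here is the same math as a small script, in case anyone wants to plug in their own numbers. It is a sketch under two assumptions that are mine and not BOINC's actual scheduling rules: the two WUs run strictly one after the other, and the estimated remaining hours are accurate. The helper name is hypothetical.

```python
import math
from datetime import date, timedelta

def days_to_finish(hours_remaining, resource_share, hours_per_day=24.0):
    """Calendar days needed when a project only gets a share of each day."""
    return hours_remaining / (resource_share * hours_per_day)

share = 0.25                               # CPDN's share: 6 hours per day
slab_days = days_to_finish(334, share)     # slab WU, 334 hours remaining
sulfur_days = days_to_finish(1543, share)  # sulfur cycle WU, 1543 hours

posted = date(2005, 10, 5)
# Round up to whole days; the sulfur WU starts only when the slab is done.
slab_done = posted + timedelta(days=math.ceil(slab_days))
sulfur_done = slab_done + timedelta(days=math.ceil(sulfur_days))
sulfur_deadline = date(2006, 2, 28)

print(f"slab: about {slab_days:.0f} days, done around {slab_done}")
print(f"sulfur: about {sulfur_days:.0f} days, done around {sulfur_done}")
print("sulfur misses deadline:", sulfur_done > sulfur_deadline)
```

On these estimates the sulfur cycle WU would finish months after its deadline, which is why the only open question is whether the estimate itself runs down faster than real time.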

For some reason when I upgraded from 4.45 to 4.72 I downloaded both of these WUs.

