Posts by old_user733

11) Message boards : Number crunching : trickle uploads stopping at 7.3 MB. Is it just me? (Message 32462)
Posted 5 Feb 2008 by Profile old_user733
Post:
It was the Riverbed. With that bypassed, the transfers went smoothly.
12) Message boards : Number crunching : trickle uploads stopping at 7.3 MB. Is it just me? (Message 32456)
Posted 5 Feb 2008 by Profile old_user733
Post:
Sorry I had to go, my ride was there. We have a thing at our location which is called a "Riverbed"...

http://www.riverbed.com/products/appliances/

which gets in the middle of all transfers, like a cache. I am becoming suspicious that it may be at fault here. Will explore this more tomorrow.
13) Message boards : Number crunching : trickle uploads stopping at 7.3 MB. Is it just me? (Message 32455)
Posted 4 Feb 2008 by Profile old_user733
Post:
I ran Wireshark on one of the machines with the problem, and at the point where the stoppage occurs I see a "TCP ZeroWindow" message that looks like it's coming back from climateapps3.oucs.ox.ac.uk (163.1.13.134). This is on the ACK, and I can see a steadily decreasing window size on the preceding ACKs.
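
For anyone unfamiliar with the pattern: each TCP ACK advertises how many more bytes the receiver will accept, and once that figure reaches zero the sender has to pause. Wireshark flags that case as "TCP ZeroWindow" (the display filter `tcp.analysis.zero_window` picks those packets out). Here is a minimal Python sketch of what a steadily shrinking window looks like. The window sizes are made-up illustrative values, not taken from the actual capture, and both function names are hypothetical.

```python
def find_zero_window(windows):
    """Return the index of the first zero window, or None if the window never closes."""
    for index, size in enumerate(windows):
        if size == 0:
            return index
    return None


def shrinks_steadily(windows):
    """Return True if each advertised window is no larger than the one before it."""
    return all(later <= earlier for earlier, later in zip(windows, windows[1:]))


# Hypothetical window sizes (bytes) advertised in successive ACKs.
# These are illustrative only; they are NOT values from the real capture.
ack_windows = [65535, 49152, 32768, 16384, 4096, 0]

stall_index = find_zero_window(ack_windows)
if stall_index is None:
    print("Window never reached zero")
else:
    steady = shrinks_steadily(ack_windows[: stall_index + 1])
    print(f"Zero window at ACK {stall_index}, steady decrease before it: {steady}")
```

A window that drains like this usually means something on the receiving side of the connection has stopped reading data, which fits a stall partway through a large upload.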

Got to go, more on this later...
14) Message boards : Number crunching : trickle uploads stopping at 7.3 MB. Is it just me? (Message 32451)
Posted 3 Feb 2008 by Profile old_user733
Post:
Hmmm, well, you've answered at least part of the question. It does seem to be just me. To answer the two questions implied in your response: no, we don't use a proxy server, and I've been running CPDN for years from this site without ever having this issue before.

It's a good bet that somebody in the IT department has been fiddling with something, then. I'll have to complain about it tomorrow.

Thanks for the help.
15) Message boards : Number crunching : trickle uploads stopping at 7.3 MB. Is it just me? (Message 32448)
Posted 3 Feb 2008 by Profile old_user733
Post:
OK, so it must be something on my end. Just want to make sure before I start complaining to our IT department. Here's an example from the message log of what's happening...

2/3/2008 9:34:31 AM|climateprediction.net|Started upload of hadcm3iozn_cq15_2000_80_75899506_6_5.zip
2/3/2008 9:38:51 AM||Project communication failed: attempting access to reference site
2/3/2008 9:38:51 AM|climateprediction.net|Temporarily failed upload of hadcm3iozn_cq15_2000_80_75899506_6_5.zip: http error
2/3/2008 9:38:51 AM|climateprediction.net|Backing off 2 hr 8 min 39 sec on upload of hadcm3iozn_cq15_2000_80_75899506_6_5.zip
2/3/2008 9:38:52 AM||Access to reference site succeeded - project servers may be temporarily down.

The zip file in question is 15.83MB, and the transfer started and went very quickly up to 7.30MB, then abruptly halted. After a while, the upload timed out. The same thing is also happening on another machine which has 3 of these files trying to upload.
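
As a rough check on those numbers, here is a minimal Python sketch that reads the timestamps from two of the log lines above and works out how long the upload ran before failing, and how far through the file it stalled. The helper name `parse_log_time` is made up for illustration, and the `time|project|message` layout is assumed from the excerpt rather than from any BOINC documentation.

```python
from datetime import datetime

# Two lines copied from the message log excerpt in the post.
LOG_LINES = [
    "2/3/2008 9:34:31 AM|climateprediction.net|Started upload of hadcm3iozn_cq15_2000_80_75899506_6_5.zip",
    "2/3/2008 9:38:51 AM|climateprediction.net|Temporarily failed upload of hadcm3iozn_cq15_2000_80_75899506_6_5.zip: http error",
]


def parse_log_time(line):
    """Return the timestamp of a 'time|project|message' log line (assumed layout)."""
    stamp = line.split("|", 1)[0]
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")


started = parse_log_time(LOG_LINES[0])
failed = parse_log_time(LOG_LINES[1])
elapsed_seconds = (failed - started).total_seconds()

file_size_mb = 15.83   # size of the zip file, from the post
stalled_at_mb = 7.30   # point where the transfer halted, from the post
stall_percent = 100 * stalled_at_mb / file_size_mb

print(f"Upload ran for {elapsed_seconds:.0f} s before failing")   # 260 s
print(f"Stalled at {stall_percent:.1f}% of the file")             # 46.1%
```

So the transfer gave up 4 minutes 20 seconds after it started, having stalled a little under halfway through the file.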
16) Message boards : Number crunching : trickle uploads stopping at 7.3 MB. Is it just me? (Message 32440)
Posted 3 Feb 2008 by Profile old_user733
Post:
I just noticed that all my machines are stuck trying to upload trickles and/or results. They start out uploading just fine, then the upload freezes at around 7.3MB, or just under halfway. The upload times out, then the process repeats. Is there a server problem there, or is it on my end?
17) Message boards : Number crunching : Server State Over, but wu is in progress! (Message 20290)
Posted 15 Feb 2006 by Profile old_user733
Post:
> Don't abort! The CC is stupid in that CPDN does not care that the deadline has been passed, but with other projects, it is a big deal. The original bunch of work units were sent out with too short of a deadline. Later sulphurs had a deadline of about a year. Congratulations on getting so far with your P3 1 GHz and good luck on finishing!


OK, OK!! I'll take my finger off the button! ;-) Thanks for the quick reply.

Seriously, though, should I detach this machine from CPDN in the future? It looks like the new models will, if anything, require even more powerful CPUs. This machine is at the low end of my collection, so I will still be running CPDN with other, faster machines in any case. Is there a minimum recommended machine?

Thanks.
-Gene
18) Message boards : Number crunching : Server State Over, but wu is in progress! (Message 20287)
Posted 15 Feb 2006 by Profile old_user733
Post:
On this note, I have a machine which is running a sulphur cycle 4.19 model. This is the WU:

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/workunit.php?wuid=711985

It is now overdue by 6 days and the BOINC CC is telling me I should abort it, but it is on phase 5. The deadline seems a bit short for this WU: I got it on Sept. 12, 2005, and I thought these had about a year to finish.

Anyway, the WU is listed as "over" with "too many total results" as the error. Should I abort? I'm really reluctant to do so since it is on phase 5, unless the result is worthless. It is continuing to run normally otherwise.

The machine is a 1 GHz P3 with no other project running at the moment, and BOINC has been in EDF mode essentially since I got this WU. It is now running BOINC CC 5.2.15.
19) Message boards : Number crunching : Announcement: Database residual problem - misallocated WUs (Message 12902)
Posted 26 May 2005 by Profile old_user733
Post:
I just got some WU's on a couple of machines, but I don't see them listed in my "results" page (yet). Is there a delay before they show up?
-----
Actually, two results showed up for one of my machines, but they are different than the ones I got. The other machine's WU's did not show up. I'm thinking I will have to abort them all and try again.
-----
These are the problem WU's:
Host: 6415 - Result ID: 880949 - Name: 3y7o_100206157 (not on machine)
Host: 6415 - Result ID: 880941 - Name: 3y7g_100206149_0 (not on machine)
Host: 6415 - Name: 3zvo_100208339_0 (not in results page) - aborting
Host: 6415 - Name: 3zx7_100208394_0 (not in results page) - aborting
Host: 1113 - Name: 3zx1_100208388_0 (not in results page) - aborting
Host: 1113 - Name: 3xv6_100205703_0 (not in results page) - aborting
-----
Interesting note: Results 880941 and 880949 are listed as being sent to me at an earlier time than I got the other four units.
-----
Update: After aborting the 4 unlisted WU's, I got 1 WU per machine that DID show up on my results page. Whew!
20) Message boards : Number crunching : Boinc 4.36 dev version. dwnlded 3 cpdn, what to do (Message 12519)
Posted 11 May 2005 by Profile old_user733
Post:
I've got one machine which dl'ed an extra WU which it won't start for maybe a month at the going rate. I know it'll finish in time, but doesn't CPDN assume you trashed it if it doesn't start trickling?

21) Questions and Answers : Windows : Boinc v4.09 conflict w/ Climate Prediction (Message 5106)
Posted 6 Oct 2004 by Profile old_user733
Post:
By any chance do you have a dual processor or HT machine? There is an existing problem with BOINC GUI 4.09 and multi-processor machines when there is not enough work for both processors.

I myself am locked out today after having lost 220 hours of work on one CPDN WU, and erroring out of about 3 more new WU's in an attempt to fix it. Apparently the fix (until a new BOINC GUI comes out) is to set your GENERAL prefs to use only 1 processor. This may or may not work, since upon rebooting or restarting BOINC, it hangs right away, before you can attempt to update.

If you kill the BOINC GUI process, then the two HADSM processes (in that order), no harm is done to the WU's (however, the problem remains, and when you start up BOINC again it will hang). If you kill the HADSM processes first, the BOINC GUI will dump them claiming a processing error. If you have a lot of time invested in a WU, you will lose it. However, this seems to be the only way to get the client to attempt an update.

22) Questions and Answers : Windows : Problems with BOINC 4.09 on Windows XP (Message 4894)
Posted 1 Oct 2004 by Profile old_user733
Post:
>
> I would suggest updating your project preferences (
> http://climateapps2.oucs.ox.ac.uk/cpdnboinc/prefs.php?subset=global and
> http://climateapps2.oucs.ox.ac.uk/cpdnboinc/prefs.php?subset=project ),
> reinstalling BOINC 4.09, running the benchmarks, and doing a project update
> for all the attached projects on the machine.
>
> My guess is the problem has nothing to do with your hardware or OS.
>
> Are there any suspect messages listed in the messages tab of the boinc_gui
> application?
>
>
Further info: This machine was attached to CPDN, Seti@Home, and Predictor. Predictor is obviously not giving any work lately, and Seti had run out. There was only one CPDN WU on this HT machine. So, since things were working again under 4.05, I could see that Seti had stopped giving me work because of the client version. I again tried upgrading to 4.09 (completely shutting BOINC down before upgrading), but, again, it got into this state (and it was NOT running the benchmarks).

I then tried this -- I backed off again to 4.05, had it running again, then attached to LHC@home and got some work. I upgraded again to 4.09 and everything is now working. Could it have something to do with only having work for one of two processors?

How come I can't get another WU from CPDN? So far, out of four machines, two being duals, one being HT, and one being a single, only one of the duals has two WU's from CPDN. Prefs are the same. (I know, a different problem.)
23) Questions and Answers : Windows : Problems with BOINC 4.09 on Windows XP (Message 4759)
Posted 28 Sep 2004 by Profile old_user733
Post:
OK, more details. Setting size to zero didn't help. I also noticed there was NO Seti process running as it had run out of work, and S@H isn't responding. Going to try reinstalling 4.05.

Also, I'm running 4.09 on three other machines with no problem. One is a P4 (non-HT) and two are PIII duals.

-------
Edit-- Back up and running with 4.05...
24) Questions and Answers : Windows : Problems with BOINC 4.09 on Windows XP (Message 4758)
Posted 28 Sep 2004 by Profile old_user733
Post:
This is not an answer, but more details...

I just switched to BOINC 4.09 and am running WinXP Pro, SP2. The CPU is a 3.2GHz P4 HT. I didn't get to see the screensaver part: when I turned the monitor on this morning, the screen was black and the machine was unresponsive to mouse wiggles. I was able to do a Ctrl-Alt-Del to get control, and rebooted. On reboot, BOINC started up, but when I clicked the icon in the systray, nothing happened. I checked the running processes and saw hadsm3um_4.03_w (50%), hadsm3_4.03_win (0%), boinc_gui (12%). My virus scanner was taking up the remaining time scanning client_state.xml as it was being rewritten constantly.

I looked at client_state.xml and there is an odd line in it which may be a symptom. Here are the first few lines in the section up to the odd one:

{project}
{master_url}http://climateprediction.net/{/master_url}
{project_name}climateprediction.net{/project_name}
{share_size}0.000000{/share_size}
{size}-7307860.000000{/size}

I noticed that in properly working installations, the "size" is 0.000000. I'm going to try just changing that one line to see what happens.
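
For checking other installations for the same symptom, here is a minimal Python sketch that scans text for a negative "size" entry. It matches the brace form exactly as quoted above. A real client_state.xml presumably uses angle brackets instead (an assumption on my part), so the pattern would need adjusting before pointing it at an actual file. The function name `find_negative_sizes` is made up for illustration.

```python
import re

# The lines quoted in the post, with the braces exactly as posted.
SNIPPET = """{project}
{master_url}http://climateprediction.net/{/master_url}
{project_name}climateprediction.net{/project_name}
{share_size}0.000000{/share_size}
{size}-7307860.000000{/size}
"""


def find_negative_sizes(text):
    """Return every {size} value in the text that is below zero."""
    # The literal "{size}" does not match "{share_size}", so only the
    # plain size entries are picked up.
    values = re.findall(r"\{size\}(-?\d+(?:\.\d+)?)\{/size\}", text)
    return [float(value) for value in values if float(value) < 0]


print(find_negative_sizes(SNIPPET))  # [-7307860.0]
```

An empty list would mean no negative "size" entry was found, which is what the properly working installations should give.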




