file_xfer_error

old_user15825

Joined: 9 Sep 04
Posts: 5
Credit: 101,373
RAC: 0
Message 25398 - Posted: 1 Dec 2006, 6:16:41 UTC

30.11.2006 14:10:28|climateprediction.net|Restarting task hadcm3ohc_1k23_05621010_1 using hadcm3 version 515
1.12.2006 2:20:55||Rescheduling CPU: application exited
1.12.2006 2:20:55|climateprediction.net|Computation for task hadcm3ohc_1k23_05621010_1 finished
1.12.2006 2:20:56|climateprediction.net|Unrecoverable error for result hadcm3ohc_1k23_05621010_1 (<file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_2.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_3.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_4.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_5.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_6.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_7.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_8.zip</file_name> <error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_9.zip</file_name> <error
1.12.2006 2:20:56|climateprediction.net|Deferring scheduler requests for 1 minutes and 0 seconds
1.12.2006 2:21:57|climateprediction.net|Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
1.12.2006 2:21:57|climateprediction.net|Reason: To fetch work
1.12.2006 2:21:57|climateprediction.net|Requesting 8640 seconds of new work, and reporting 1 completed tasks
1.12.2006 2:22:03|climateprediction.net|Scheduler request succeeded

Is there any known reason for this?

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 25399 - Posted: 1 Dec 2006, 6:56:54 UTC

Yes, it's well known.
When something happens to the model such that BOINC loses contact with the science program, BOINC assumes the model has stopped because it has finished, and tries to upload the final zip files. As the model HASN'T finished, the files don't exist, which gives a -161 (not found) error.

The real reason for the crash lies in any other error messages that you got, or that have been sent to the server to be appended to the model's page in your account.

There are some hints and tips about this in this thread.
Sometimes updating the graphics card drivers from the maker's website helps.
As does suspending BOINC before running graphics-intensive programs:
games, video editing, audio/video streaming, Skype, etc.
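
To make the explanation above concrete, here is a minimal Python sketch that pulls the file names and error codes out of a BOINC message line like the one quoted at the top of this thread. The function name `parse_xfer_errors` and the regular expression are illustrative assumptions based only on the log format shown here; they are not part of BOINC.

```python
import re

# Matches one <file_name>...</file_name> <error_code>...</error_code> pair,
# as seen in the "Unrecoverable error for result" line quoted in this thread.
XFER_ERROR_RE = re.compile(
    r"<file_name>([^<]+)</file_name>\s*<error_code>(-?\d+)</error_code>"
)


def parse_xfer_errors(log_text):
    """Return a list of (file_name, error_code) pairs found in log_text.

    Truncated entries (a file name with no complete error code) are skipped.
    """
    return [(name, int(code)) for name, code in XFER_ERROR_RE.findall(log_text)]


if __name__ == "__main__":
    sample = (
        "Unrecoverable error for result hadcm3ohc_1k23_05621010_1 ("
        "<file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_2.zip</file_name> "
        "<error_code>-161</error_code></file_xfer_error>"
        "<file_xfer_error> <file_name>hadcm3ohc_1k23_05621010_1_3.zip</file_name> "
        "<error_code>-161</error_code></file_xfer_error>"
    )
    for name, code in parse_xfer_errors(sample):
        print(name, code)
```

If every entry carries -161, the uploads most likely failed only because the zip files were never written, so the real cause has to be looked for in the stderr output instead.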

old_user15825
Joined: 9 Sep 04
Posts: 5
Credit: 101,373
RAC: 0
Message 25417 - Posted: 2 Dec 2006, 10:59:42 UTC
Last modified: 2 Dec 2006, 11:01:40 UTC

This is so frustrating because there are so many possible causes, and I don't have time to work through them all. I would be very grateful if someone could help.
I'm running Windows XP with SP2, plus Symantec's firewall and virus protection. I read that Symantec may cause trouble, but I didn't find a good answer on how to fix it. Apparently I have to exclude these files somehow, but how? I'm using the Omega driver 3.8.291 for an ATI 9600 Pro. Does this combination work well?

Les Bayliss wrote:
"As does suspending BOINC before running graphics-intensive programs:
games, video editing, audio/video streaming, Skype, etc."

I do suspend BOINC when I play games, but I don't, and won't, suspend it just to watch short streaming videos. If that's the problem, I just can't run BOINC anymore. Skype? Do you mean video calls?

It can't be a hardware problem, because this is a rather old machine that has never been unstable. Cooling works perfectly. AMD 2800+ (64).

Here are some parts of the stderr out log:
<core_client_version>5.4.11</core_client_version>
<stderr_txt>
Not a JPEG file: starts with 0x01 0xda
(null): cannot open input file dataout/atmos_restart.day
(null): cannot open input file dataout/ocean_restart.day
Not a JPEG file: starts with 0x01 0xda
Not a JPEG file: starts with 0x01 0xda
CPDN Monitor - Quit request from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error - Return code = 16
Error in converting file dataout/1k23fo.pjc2c10 to netcdf format.

MainError:	07:47:22 PM	No files match the supplied pattern.
Not a JPEG file: starts with 0x01 0xda

CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 32 sec - exiting

Model crashed: umshell1.f:  P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA
Fatal crash! :-(



Is any of this helpful to someone?

MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 25421 - Posted: 2 Dec 2006, 12:54:12 UTC


Model crashed: umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.

This is the key line in the log. If the crash happened in the later part of the model (2000 onwards), it probably indicates that the climate model went as far as it could and turned out not to be realistic. This is one of the main things the project is trying to discover: which models are realistic and which aren't. So it's actually quite a 'good' outcome from the scientists' viewpoint.

The alternative, which is more common (particularly in the earlier years of the model), is that a floating point calculation was incorrect and this caused the model to go off the rails. To test this theory, try running Prime95 for a day or so to check that all is well with the hardware.
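
Either way, the first step is to pull the "Model crashed" line out of the stderr text. A minimal Python sketch of that, assuming the line format seen in the stderr excerpt earlier in this thread; the pattern and the name `find_crash_reasons` are illustrative, not anything the project provides:

```python
import re

# Pattern for lines such as:
#   Model crashed: umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.
CRASH_RE = re.compile(
    r"Model crashed:\s*(?P<source>\S+?):\s*(?P<routine>\w+)\s*:\s*(?P<message>[A-Z ]+?)\."
)


def find_crash_reasons(stderr_text):
    """Return the distinct (source, routine, message) crash reports, in order."""
    seen = []
    for match in CRASH_RE.finditer(stderr_text):
        reason = (match["source"], match["routine"], match["message"].strip())
        if reason not in seen:
            seen.append(reason)
    return seen
```

Repeated copies of the same crash line collapse into a single entry, which keeps the result readable when stderr contains the message several times.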

We have no way of telling which of these alternatives is the correct one without being able to look at the model and host links, since you have them hidden.

In addition, the graphs don't seem to be working on this site, which makes it harder anyway. If they were working, we'd look at the shape of the model's temperature graph: a gradual temperature change would indicate that the climate was not viable, while an abrupt one would indicate that your CPU is introducing errors.
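
As a rough illustration of that gradual-versus-abrupt distinction, here is a small Python sketch. It assumes the temperatures are available as a plain list of numbers; the rule (largest step compared with the median step) and the default ratio are made up for the example and are not project criteria.

```python
from statistics import median


def classify_change(temperatures, ratio=5.0):
    """Label a temperature series 'abrupt' or 'gradual'.

    The series is called abrupt when its largest step between consecutive
    samples exceeds `ratio` times the median step. Both the rule and the
    default ratio are illustrative assumptions.
    """
    if len(temperatures) < 3:
        raise ValueError("need at least three samples to compare steps")
    steps = [abs(b - a) for a, b in zip(temperatures, temperatures[1:])]
    largest, typical = max(steps), median(steps)
    return "abrupt" if largest > ratio * typical else "gradual"
```

A series that cools by the same small amount every step comes out as gradual, while one that holds steady and then jumps by several degrees in a single step comes out as abrupt.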
I'm a volunteer and my views are my own.
News and Announcements and FAQ

old_user15825
Joined: 9 Sep 04
Posts: 5
Credit: 101,373
RAC: 0
Message 25435 - Posted: 2 Dec 2006, 20:06:23 UTC - in response to Message 25421.  


Model crashed: umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.

This is the key line in the log. If the crash happened in the later part of the model (2000 onwards), it probably indicates that the climate model went as far as it could and turned out not to be realistic. This is one of the main things the project is trying to discover: which models are realistic and which aren't. So it's actually quite a 'good' outcome from the scientists' viewpoint.


Of course. I totally forgot that possibility.


...since you have them hidden.


Should not be anymore.


When I built this machine I ran long memory tests (memtest86+) and CPU tests (SuperPI), and everything worked fine. It's even possible to overclock it quite a lot before it becomes unstable. So if the cause is the hardware, the only possibility is that my machine is starting to fail.
If you still think it's the hardware, I will run Prime95. ;)



MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 25438 - Posted: 3 Dec 2006, 1:05:27 UTC

Hi,

Thanks for that. It looks like the model got to around 1940 (i.e., 20 years into the model). This is a bit early for a typical 'model out of bounds' situation, but since the graphs aren't working at the moment it's hard to say. On the other hand, the s/ts times are very stable, whereas on an unstable machine they tend to increase (because days, months and years have to be re-run whenever a calculation error is received).

So I can't say for sure, but on balance I'd lean towards it being a model going out of bounds. If it happens again, however, I'd recommend running Prime95's torture test for 24h or so (I think it's similar in concept to SuperPI).
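
The s/ts comparison mentioned above can be sketched in a few lines of Python. This assumes the s/ts values are available as a list of numbers in run order; the function names and the 10% threshold are assumptions chosen for illustration, not values used by the project.

```python
from statistics import mean


def relative_drift(sec_per_timestep):
    """Compare the mean of the last third of the samples with the first third.

    Returns (late_mean - early_mean) / early_mean, so 0.0 means no drift and
    0.25 means the later timesteps took 25% longer on average.
    """
    third = len(sec_per_timestep) // 3
    if third == 0:
        raise ValueError("need at least three samples")
    early = mean(sec_per_timestep[:third])
    late = mean(sec_per_timestep[-third:])
    return (late - early) / early


def looks_unstable(sec_per_timestep, threshold=0.10):
    """Flag a run whose s/ts has risen by more than `threshold` (assumed 10%)."""
    return relative_drift(sec_per_timestep) > threshold
```

A flat series gives a drift near zero; a steadily rising one is flagged. This is only a hint, since other programs competing for the CPU would also raise s/ts.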

old_user15825
Joined: 9 Sep 04
Posts: 5
Credit: 101,373
RAC: 0
Message 25458 - Posted: 5 Dec 2006, 3:38:39 UTC

I tried Prime95.
Self-test 1792K passed!
About 30 h of counting for that project, ~9 h of torture testing with large FFTs, and over 7 h of the Blend torture test.
Everything should be okay with my PC, because no errors were reported.

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 25460 - Posted: 5 Dec 2006, 5:40:37 UTC
Last modified: 5 Dec 2006, 5:41:33 UTC

Symantec is Norton, and Norton's virus checker is a known troublemaker.
(It locks files before checking them, and Fortran doesn't like being kept waiting for
one of its files to become available.)

There's a How To for the firewall here.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 25523 - Posted: 8 Dec 2006, 17:40:28 UTC

I joined the club. Model crashed about 49%; now re-running from 3-day-old backup.

This follows a l-o-n-g list of "Not a JPEG file" messages:
Model crashed: umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Model crashed: umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Model crashed: umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Model crashed: umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Fatal crash! :-(</stderr_txt>

Our old favorite for the next decadal file to have been uploaded:
<file_name>hadcm3ohc_1f8l_05614764_0_8.zip</file_name>
<error_code>-161</error_code>


The other model on the box was okay. So, now I wait three days to see whether the problem goes away, or whether the three days were "wasted"...

The entire folder is saved in case anyone would like to see more diagnostic/prescriptive data.

"We have met the enemy and he is us." -- Pogo
Greetings from coastal Washington state, the scenic US Pacific Northwest.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 25556 - Posted: 11 Dec 2006, 16:45:29 UTC
Last modified: 11 Dec 2006, 19:26:01 UTC

Follow-up to previous post:
Attempted to demonstrate whether this negative pressure crash could be attributed to a machine fluke or to the Model,
for whatever value that might be to Carl and/or the researchers. Pity the thing couldn't have held on until after 50% complete.

After three days of rerun, same error same place:

... Not a JPEG file: starts with 0x01 0xda Not a JPEG file: starts with 0x01
0xda Not a JPEG file: starts with 0x01 0xda Suspended CPDN Monitor - Quit
request from BOINC... Not a JPEG file: starts with 0x01 0xda Model crashed:
umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Model crashed:
umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Model crashed:
umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Model crashed:
umshell1.f: P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED. GA Fatal crash! :-(</stderr_txt>

12/11/2006 5:09:52 AM|climateprediction.net|Unrecoverable error for result hadcm3ohc_1f8l_05614764_0 (<file_xfer_error>
<file_name>hadcm3ohc_1f8l_05614764_0_8.zip</file_name> <error_code>-161...

<file_name>hadcm3ohc_1f8l_05614764_0_15.zip</file_name>

Edited to format the error output. (Lines were too long.)
Edit2: Other than being quite cold, and leveling out still cold, it doesn't look all that different from a few others running on my machines.
Hmm, it used one of my Spinups... (Nice to have graphics on "ohc" Runs; thanks Carl & Tolu.)
http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6035197

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 25607 - Posted: 14 Dec 2006, 19:26:11 UTC
Last modified: 14 Dec 2006, 19:43:35 UTC

AArrrrgggggh! (And other manifestations of the primal scream.)

Second negative pressure generated in a week. Different series of Model. Different machine. Different TYPE of machine.
(Previous error was on an A64 X2 4400+ running WinXP-64, this one is on Pentium-D 940, with 32-bit XP.) Both on 5.4.9.

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=5557168

Have a backup a bit less than two days old but, given the experience trying to push the earlier Run beyond the error point,
I didn't bother to try.

By the way, this one died ~97%. I'm not pleased --> and on top of that, a storm arrives in a few hours
packing gusts to 100 mph and seas to 42 feet. Bad day on the coast.


Edit: That would have been #14 but leaves the count at 13 and an "almost". (As the old expression holds,
"almost" counts only in horseshoes, hand grenades, and nukes.)

Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 25609 - Posted: 14 Dec 2006, 20:18:09 UTC

Commiserations on the model.
Hope the storm damage isn't too great.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 25614 - Posted: 14 Dec 2006, 23:11:17 UTC
Last modified: 14 Dec 2006, 23:11:39 UTC

Thanks, Les, on both counts.

Storm is already in progress and, per the forecast, only eleven more hours to go until the heavy winds begin to subside...
The UPS units beep all too often and it's probably only a matter of time until line power goes down. (And max wind isn't due for about five hours.)

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 25658 - Posted: 19 Dec 2006, 9:03:49 UTC

Have you been able to recover the one that died ~97% ?

I also just had a -161 (at just above 6%), and I am currently re-running from a 2-day old backup.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 25670 - Posted: 19 Dec 2006, 18:06:36 UTC
Last modified: 19 Dec 2006, 18:11:04 UTC

No, didn't try, even though I had a backup less than two days old.

Had one fail days before, also with negative pressure; reran it from three-day-old backup only to have it fail the same way at the same place
(~49% complete). Guessed it was a consequence of the parameter mix. Made the same assumption for the 97% Run, though it was a different
series Model on a different machine. (Figured it would be throwing good electrons after bad to retry.) So, my machines now have
sixteen completed Runs -- and an "almost".

Best of luck with your rerun.

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 25849 - Posted: 6 Jan 2007, 1:23:34 UTC
Last modified: 6 Jan 2007, 1:25:04 UTC

Another Negative Pressure failure, third one in less than a month.
Three different machines, none overclocked.
This Model was on my newest machine, an Intel E6600 Core 2 Duo --> a machine that
already finished a pair of CPDN Runs with no other failures.

Reran from a two-day-old backup. Failed in exactly the same place.
That puts the onus on the Model. Again.

As far as I'm concerned, we have a dodgy batch of parameter mixes --
and I won't be bothered to try to recover another Run which fails with Negative Pressure.

This malignant animal was ~29%, earliest of my sickly trio of failures.

old_user17289
Joined: 13 Sep 04
Posts: 228
Credit: 354,979
RAC: 0
Message 25902 - Posted: 9 Jan 2007, 4:16:48 UTC

I tried restore 4 times, and it failed each time near the same area (although not always the same error; sometimes it was 0xFFFFFFFF).

The latest download also just ended with 0xFFFFFFFF; this time I'll retry only once...

Jack Shaftoe
Joined: 8 Oct 06
Posts: 43
Credit: 3,310,951
RAC: 0
Message 25933 - Posted: 10 Jan 2007, 13:18:42 UTC
Last modified: 10 Jan 2007, 13:22:03 UTC

Rats, just had the same problem. Fortunately it happened on a relatively new model, only 60 hours into it.
Spotted this thread on the forum, and realize now the importance of backing up my models (I have a few approaching 80% complete).

Here is the result

1/10/2007 3:59:22 AM|climateprediction.net|Unrecoverable error for result
hadcm3ohe_27in_05751393_1 (<file_xfer_error>
<file_name>hadcm3ohe_27in_05751393_1_1.zip</file_name>
<error_code>-161</error_code></file_xfer_error><file_xfer_error> <file_name>hadcm3ohe_27in_05751393_1_2.zip</file_name>
<error_code>-161</error_code></file_xfer_error><file_xfer_error>
<file_name>hadcm3ohe_27in_05751393_1_3.zip</file_name>
<error_code>-161</error_code></file_xfer_error><file_xfer_error>
<file_name>hadcm3ohe_27in_05751393_1_4.zip</file_name>
<error_code>-161</error_code></file_xfer_error><file_xfer_error>
<file_name>hadcm3ohe_27in_05751393_1_5.zip</file_name>
<error_code>-161</error_code></file_xfer_error><file_xfer_error>
<file_name>hadcm3ohe_27in_05751393_1_6.zip</file_name>
<error_code>-161</error_code></file_xfer_error><file_xfer_error>
<file_name>hadcm3ohe_27in_05751393_1_7.zip</file_name>
<error_code>-161</error_code></file_xfer_error><file_xfer_error>
<file_name>hadcm3ohe_27in_05751393_1_8.zip</file_name> <error>

astroWX
Volunteer moderator
Joined: 5 Aug 04
Posts: 1496
Credit: 95,522,203
RAC: 0
Message 25944 - Posted: 10 Jan 2007, 18:09:32 UTC
Last modified: 10 Jan 2007, 18:10:46 UTC

This is the error from stderr_out:
Model crashed: umshell1.f: ATM_DYN : NEGATIVE THETA DETECTED.


Had three negative pressures in the last month, as far along as 97%. Reruns failed same way, same place, so I don't recommend rerunning
them. Though it's possible a machine error could cause a bad calculation, chances are it resulted from an unstable parameter mix. (A guess
on my part, but there have been too many recently not to suspect parameters. No biggie, though, the scientists also have to see what *doesn't*
work. [I know, small consolation.])

Jack Shaftoe
Joined: 8 Oct 06
Posts: 43
Credit: 3,310,951
RAC: 0
Message 25969 - Posted: 12 Jan 2007, 2:29:13 UTC - in response to Message 25944.  
Last modified: 12 Jan 2007, 2:29:46 UTC

chances are it resulted from an unstable parameter mix. (A guess
on my part, but there have been too many recently not to suspect parameters. No biggie, though, the scientists also have to see what *doesn't*
work. [I know, small consolation.])


And another this morning. 2 in 2 days. :( Yuck!
Team Starfire World BOINC