Unrecoverable error for result sulphur_hska_000830170_0 ...

old_user143821

Joined: 26 Dec 05
Posts: 2
Credit: 251,588
RAC: 0
Message 20052 - Posted: 8 Feb 2006, 19:30:30 UTC
Last modified: 8 Feb 2006, 19:32:32 UTC

Hi,

I got the following error after about 51% of the WU was done, and the climateprediction.net job has disappeared from the Work tab of BOINC Manager. What can I do to solve this problem?

Excerpt from stderrdae.txt:
======================================================
2006-02-06 18:06:30 [climateprediction.net] Restarting result sulphur_hska_000830170_0 using sulphur_cycle version 422
2006-02-06 18:06:35 [climateprediction.net] Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2006-02-06 18:07:08 [climateprediction.net] Sending scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi
2006-02-06 18:07:08 [climateprediction.net] Reason: To send trickle-up message
2006-02-06 18:07:08 [climateprediction.net] Note: not requesting new work or reporting results
2006-02-06 18:07:36 [climateprediction.net] Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded
2006-02-07 11:34:31 [---] request_reschedule_cpus: process exited
2006-02-07 11:34:31 [climateprediction.net] Computation for result sulphur_hska_000830170_0 finished
2006-02-07 11:34:32 [climateprediction.net] Unrecoverable error for result sulphur_hska_000830170_0 (<file_xfer_error>
<file_name>sulphur_hska_000830170_0_3.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hska_000830170_0_4.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hska_000830170_0_5.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
)
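
Editorial sketch: the `<file_xfer_error>` blocks in the excerpt above can be summarized mechanically, which makes it easier to compare reports from different users. The snippet below is a minimal illustration in Python, not part of BOINC; the function name `summarize_xfer_errors` and the sample constant are hypothetical, and the sample text is copied from the log above.

```python
import re

# Sample text copied from the stderrdae.txt excerpt in this post.
LOG_EXCERPT = """Unrecoverable error for result sulphur_hska_000830170_0 (<file_xfer_error>
<file_name>sulphur_hska_000830170_0_3.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hska_000830170_0_4.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_hska_000830170_0_5.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
)"""

# One match per <file_xfer_error> block; whitespace between tags is optional.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]*)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"<error_message>(?P<msg>[^<]*)</error_message>\s*"
    r"</file_xfer_error>"
)


def summarize_xfer_errors(text):
    """Return a list of (file name, error code, error message) tuples."""
    return [(m["name"], int(m["code"]), m["msg"]) for m in XFER_ERROR.finditer(text)]


for name, code, msg in summarize_xfer_errors(LOG_EXCERPT):
    print(f"{name}: error {code} {msg!r}")
```

Run against the excerpt, this lists the three upload files with the same error code and an empty message each.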
ID: 20052
geophi
Volunteer moderator
Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 20053 - Posted: 8 Feb 2006, 19:47:22 UTC

Unless there is a recent backup, there may not be anything you can do to recover this WU. Just as a diagnostic, could you post the last 30 lines of the yabsd.out file, which may be found in the

/projects/climateprediction.net/"experimentname" or
/projects/climateprediction.net/"experimentname"/dataout folder.

It may be zipped, but once unzipped it can be opened in WordPad.

One other thing: using Windows 2003 on a PC with 256 MB of memory may be straining the system when BOINC climateprediction.net is running. I'm not saying that was the cause of the failure, but it may be contributing to it.
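
Editorial sketch: for anyone who prefers a script to WordPad, the last 30 lines of yabsd.out can be pulled out as below, whether the file is plain or zipped. This is only a minimal illustration under assumptions: the helper name `tail_lines` is hypothetical, the archive is assumed to hold the log under a name ending in `yabsd.out`, and the example path is a placeholder rather than a real location.

```python
import io
import zipfile
from collections import deque
from pathlib import Path


def tail_lines(path, n=30, member="yabsd.out"):
    """Return the last n lines of a text file, or of `member` inside a zip archive."""
    path = Path(path)
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            # Assumption: the log is stored under a name ending in `member`.
            # Raises StopIteration if no such entry exists.
            name = next(x for x in archive.namelist() if x.endswith(member))
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="latin-1")
                # A bounded deque keeps only the final n lines in memory.
                return [line.rstrip("\r\n") for line in deque(text, maxlen=n)]
    with path.open(encoding="latin-1") as text:
        return [line.rstrip("\r\n") for line in deque(text, maxlen=n)]


if __name__ == "__main__":
    # Placeholder path: substitute the real location of your yabsd.out or its zip.
    example = Path("yabsd.out")
    if example.exists():
        print("\n".join(tail_lines(example)))
```

The bounded `deque` is a deliberate choice: the log can be large, and this avoids loading the whole file just to read its end.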
ID: 20053
LMEE
Joined: 4 Sep 04
Posts: 7
Credit: 41,953,885
RAC: 296
Message 20061 - Posted: 9 Feb 2006, 17:23:42 UTC

Same problem here.

Recently a model calculation failed with an unrecoverable error. When I enabled network activity, an upload of an 8 MB file began. I use a dial-up connection and aborted the upload, expecting to do it later. The results seem to have been lost. Is there any way I can recover the work for the project?

Here is a series of messages I received.

2/8/2006 8:41:21 PM|climateprediction.net|Note: not requesting new work or reporting results

2/8/2006 8:41:21 PM|climateprediction.net|Started upload of sulphur_in0v_100869647_0_1.zip

2/8/2006 8:41:28 PM|climateprediction.net|Scheduler request to http://climateapps2.oucs.ox.ac.uk/cpdnboinc_cgi/cgi succeeded

2/8/2006 8:41:41 PM|climateprediction.net|Unrecoverable error for result sulphur_in0v_100869647_0 (<file_xfer_error> <file_name>sulphur_in0v_100869647_0_1.zip</file_name> <error_code>-115</error_code> <error_message>user requested transfer abort</error_message></file_xfer_error>)



Here are the last lines from the yabsd.out file. Are the results recoverable?

SLAB TIMESTEP 177
3395537 words long
MODEL DUMP SUCCESSFULLY WRITTEN - 3434914 WORDS TO UNIT 22

Number of Words Written to Disk was 3436498
im,sm,ngroup,new_im,new_sm 1 1 48 T F
FINAL TOTAL ENERGY = 0.45466E+27 J/
INITIAL TOTAL ENERGY = 0.45455E+27 J/
CHG IN TOTAL ENERGY OVER DAY = 0.11824E+24 J/
FLUXES INTO ATM OVER DAY = 0.16368E+24 J/
ERROR IN ENERGY BUDGET = 0.45436E+23 J/
TEMP CORRECTION OVER DAY = 0.25144E-01 K
TEMPERATURE CORRECTION RATE = 0.29102E-06 K/S
FLUX CORRECTION (ATM) = 0.29441E+01 W/M2
FINAL ATM MASS = 0.17980E+22 KG
INITIAL ATM MASS = 0.17980E+22 KG
CORRECTION FACTOR FOR PSTAR = 0.10000E+01
im,sm,ngroup,new_im,new_sm 3 1 1 T F
NOCNINDX Namelist is
$NOCNINDX
J_1 = 1
J_2 = 2
J_3 = 3
J_JMT = 73
J_JMTM1 = 72
J_JMTM2 = 71
J_JMTP1 = 74
JST = 1
JFIN = 73
J_FROM_LOC = 0
J_TO_LOC = 0
JMT_GLOBAL = 73
JMTM1_GLOBAL = 72
JMTM2_GLOBAL = 71
JMTP1_GLOBAL = 74
J_OFFSET = 0
O_MYPE = 0
O_EW_HALO = 0
O_NS_HALO = 0
J_PE_JSTM1 = -1
J_PE_JSTM2 = -1
J_PE_JFINP1 = -1
J_PE_JFINP2 = -1
O_NPROC = 1
IMOUT = 4*0
JMOUT = 4*0
J_PE_IND_MED = 4*0
NMEDLEV = 0
$END
SLAB TIMESTEP 178
im,sm,ngroup,new_im,new_sm 1 1 48 T F


Thanks,

LMEE


ID: 20061
DebT
Joined: 1 Dec 05
Posts: 1
Credit: 1,778,788
RAC: 564
Message 20077 - Posted: 10 Feb 2006, 0:38:03 UTC

Same problem here. Also, since the error, no new work has downloaded.

Error msg:
2006-02-05 08:49:36 [climateprediction.net] Unrecoverable error for result sulphur_e04r_000653355_0 (<file_xfer_error>
<file_name>sulphur_e04r_000653355_0_2.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_e04r_000653355_0_3.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_e04r_000653355_0_4.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
<file_xfer_error>
<file_name>sulphur_e04r_000653355_0_5.zip</file_name>
<error_code>-161</error_code>
<error_message></error_message>
</file_xfer_error>
)


Last few lines of yabsd.out are:

REPLANCA: UPDATE REQUIRED FOR FIELD 76
REPLANCA - time interpolation for field 76
time,time1,time2 7620.000 7200.000 7920.000
hours,int,period 7620 720 8640
Information used in checking ancillary data set:
position of lookup table in dataset: 818
Position of first lookup table referring to data type 58
Interval between lookup tables referring to data type 76
Number of steps 10
STASH code in dataset 125 STASH code requested 125
'Start' position of lookup tables for dataset in overall lookup array
368
REPLANCA: UPDATE REQUIRED FOR FIELD 77
REPLANCA - time interpolation for field 77
time,time1,time2 7620.000 7200.000 7920.000
hours,int,period 7620 720 8640
Information used in checking ancillary data set:
position of lookup table in dataset: 44
Position of first lookup table referring to data type 4
Interval between lookup tables referring to data type 4
Number of steps 10
STASH code in dataset 126 STASH code requested 126
'Start' position of lookup tables for dataset in overall lookup array
301
PPCTL: Opening new file e04rba.pa29c10 on unit 60
PPCTL: Initialising new file on unit 60
PPCTL: Opening new file e04rba.pb29c10 on unit 61
PPCTL: Initialising new file on unit 61
PPCTL: Opening new file e04rba.pd29c10 on unit 63
PPCTL: Initialising new file on unit 63
PPCTL: Opening new file e04rba.pe29c10 on unit 64
PPCTL: Initialising new file on unit 64
PPCTL: Opening new file e04rba.pf29c10 on unit 65
PPCTL: Initialising new file on unit 65
PPCTL: Opening new file e04rba.pg28dec on unit 66
PPCTL: Initialising new file on unit 66
PPCTL: Opening new file e04rba.ph28dec on unit 67
PPCTL: Initialising new file on unit 67
PPCTL: Opening new file e04rba.pi28dec on unit 68
PPCTL: Initialising new file on unit 68
ID: 20077
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 20082 - Posted: 10 Feb 2006, 5:31:44 UTC

It looks like there is a new type of error starting to occur.
If the two programmers weren't so tied up with last-minute problems in getting the coupled model (experiment 2) ready for launch, they'd be right onto it.
There\'s not much I can say, except hang in there. You could continue with sulphur, in case it works better next time.

Deb
At least you got phase one uploaded. In sulphur, this contains extra data not included in phase one of slab models, and this will help the researchers a lot.

LMEE
It looks as though you uploaded phase one from one computer as well. I'm not sure if this was the one about which you posted.

ID: 20082
tron
Joined: 1 Dec 05
Posts: 1
Credit: 2,905,140
RAC: 27,842
Message 20084 - Posted: 10 Feb 2006, 6:56:32 UTC

Same problem here.

This is the fourth model that has failed with an unrecoverable error. :-(

Here is the message I received.

10.02.2006 7:44:43 |climateprediction.net|Unrecoverable error for result sulphur_hc5e_100808898_0 (-exit code -1073741819(0xc0000005))

Last few lines of yabsd.out are:

Number of Words Written to Disk was 3436498
im,sm,ngroup,new_im,new_sm 1 1 48 T F
FINAL TOTAL ENERGY = 0.45364E+27 J/
INITIAL TOTAL ENERGY = 0.45363E+27 J/
CHG IN TOTAL ENERGY OVER DAY = 0.11511E+23 J/
FLUXES INTO ATM OVER DAY = 0.50202E+23 J/
ERROR IN ENERGY BUDGET = 0.38691E+23 J/
TEMP CORRECTION OVER DAY = 0.21412E-01 K
TEMPERATURE CORRECTION RATE = 0.24782E-06 K/S
FLUX CORRECTION (ATM) = 0.25071E+01 W/M2
FINAL ATM MASS = 0.17980E+22 KG
INITIAL ATM MASS = 0.17980E+22 KG
CORRECTION FACTOR FOR PSTAR = 0.99999E+00
im,sm,ngroup,new_im,new_sm 3 1 1 T F
NOCNINDX Namelist is
$NOCNINDX
J_1 = 1
J_2 = 2
J_3 = 3
J_JMT = 73
J_JMTM1 = 72
J_JMTM2 = 71
J_JMTP1 = 74
JST = 1
JFIN = 73
J_FROM_LOC = 0
J_TO_LOC = 0
JMT_GLOBAL = 73
JMTM1_GLOBAL = 72
JMTM2_GLOBAL = 71
JMTP1_GLOBAL = 74
J_OFFSET = 0
O_MYPE = 0
O_EW_HALO = 0
O_NS_HALO = 0
J_PE_JSTM1 = -1
J_PE_JSTM2 = -1
J_PE_JFINP1 = -1
J_PE_JFINP2 = -1
O_NPROC = 1
IMOUT = 4*0
JMOUT = 4*0
J_PE_IND_MED = 4*0
NMEDLEV = 0
$END
SLAB TIMESTEP 502
im,sm,ngroup,new_im,new_sm 1 1 48 T F
FINAL TOTAL ENERGY = 0.45372E+27 J/
INITIAL TOTAL ENERGY = 0.45364E+27 J/
CHG IN TOTAL ENERGY OVER DAY = 0.79432E+23 J/
FLUXES INTO ATM OVER DAY = 0.12826E+24 J/
ERROR IN ENERGY BUDGET = 0.48826E+23 J/
TEMP CORRECTION OVER DAY = 0.27020E-01 K
TEMPERATURE CORRECTION RATE = 0.31273E-06 K/S
FLUX CORRECTION (ATM) = 0.31638E+01 W/M2
FINAL ATM MASS = 0.17980E+22 KG
INITIAL ATM MASS = 0.17980E+22 KG
CORRECTION FACTOR FOR PSTAR = 0.10000E+01
im,sm,ngroup,new_im,new_sm 3 1 1 T F
NOCNINDX Namelist is
$NOCNINDX
J_1 = 1
J_2 = 2
J_3 = 3
J_JMT = 73
J_JMTM1 = 72
J_JMTM2 = 71
J_JMTP1 = 74
JST = 1
JFIN = 73
J_FROM_LOC = 0
J_TO_LOC = 0
JMT_GLOBAL = 73
JMTM1_GLOBAL = 72
JMTM2_GLOBAL = 71
JMTP1_GLOBAL = 74
J_OFFSET = 0
O_MYPE = 0
O_EW_HALO = 0
O_NS_HALO = 0
J_PE_JSTM1 = -1
J_PE_JSTM2 = -1
J_PE_JFINP1 = -1
J_PE_JFINP2 = -1
O_NPROC = 1
IMOUT = 4*0
JMOUT = 4*0
J_PE_IND_MED = 4*0
NMEDLEV = 0
$END
SLAB TIMESTEP 503
im,sm,ngroup,new_im,new_sm 1 1 48 T F
FINAL TOTAL ENERGY = 0.45381E+27 J/
INITIAL TOTAL ENERGY = 0.45372E+27 J/
CHG IN TOTAL ENERGY OVER DAY = 0.83084E+23 J/
FLUXES INTO ATM OVER DAY = 0.12463E+24 J/
ERROR IN ENERGY BUDGET = 0.41550E+23 J/
TEMP CORRECTION OVER DAY = 0.22994E-01 K
TEMPERATURE CORRECTION RATE = 0.26613E-06 K/S
FLUX CORRECTION (ATM) = 0.26923E+01 W/M2
FINAL ATM MASS = 0.17980E+22 KG
INITIAL ATM MASS = 0.17980E+22 KG
CORRECTION FACTOR FOR PSTAR = 0.99999E+00
im,sm,ngroup,new_im,new_sm 3 1 1 T F
NOCNINDX Namelist is
$NOCNINDX
J_1 = 1
J_2 = 2
J_3 = 3
J_JMT = 73
J_JMTM1 = 72
J_JMTM2 = 71
J_JMTP1 = 74
JST = 1
JFIN = 73
J_FROM_LOC = 0
J_TO_LOC = 0
JMT_GLOBAL = 73
JMTM1_GLOBAL = 72
JMTM2_GLOBAL = 71
JMTP1_GLOBAL = 74
J_OFFSET = 0
O_MYPE = 0
O_EW_HALO = 0
O_NS_HALO = 0
J_PE_JSTM1 = -1
J_PE_JSTM2 = -1
J_PE_JFINP1 = -1
J_PE_JFINP2 = -1
O_NPROC = 1
IMOUT = 4*0
JMOUT = 4*0
J_PE_IND_MED = 4*0
NMEDLEV = 0
$END
SLAB TIMESTEP 504
3395537 words long
MODEL DUMP SUCCESSFULLY WRITTEN - 3434914 WORDS TO UNIT 22

Number of Words Written to Disk was 3436498
im,sm,ngroup,new_im,new_sm 1 1 48 T F


Any idea what's going wrong???

Thx
Himmelsjaeger
ID: 20084
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 20087 - Posted: 10 Feb 2006, 8:28:52 UTC

Ah. Now this is a different problem, tron.
Error code -1073741819 (0xC0000005) is the Windows access violation error, and is most likely to be a problem with your graphics card drivers.
Try updating them, and see if that helps.

If not, post back here again, and we'll have another look.
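
Editorial sketch: the two forms of the exit code in tron's log line, -1073741819 and 0xc0000005, are the same 32-bit value read as signed and unsigned. A minimal Python illustration of the conversion follows; the function name `to_unsigned_hex` is hypothetical.

```python
def to_unsigned_hex(exit_code):
    """Reinterpret a signed 32-bit exit code as unsigned and format it as hex."""
    # Masking with 0xFFFFFFFF adds 2**32 to negative values.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_unsigned_hex(-1073741819))  # 0xC0000005
```

Searching for the hexadecimal form is usually more productive than searching for the negative decimal number, since Windows documents its status codes in hex.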

ID: 20087
LMEE
Joined: 4 Sep 04
Posts: 7
Credit: 41,953,885
RAC: 296
Message 20118 - Posted: 10 Feb 2006, 23:02:45 UTC - in response to Message 20082.  


Les,

The WU in question, "IN0V", was running on computer ID 20344 when I lost the 8 MB upload file. Does this answer your question? Is the large result file lost for good?

Thanks,

LMEE
ID: 20118
Les Bayliss
Volunteer moderator
Joined: 5 Sep 04
Posts: 7629
Credit: 24,240,330
RAC: 0
Message 20122 - Posted: 10 Feb 2006, 23:59:32 UTC

It's there OK. Your model page is at http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=1730316 (direct link). Click on P1 (phase 1) at the bottom to see the graphs. Keep in mind that this is 'data to give the user something for his effort', not the data which the researchers use, which is more extensive.

ID: 20122
old_user143821
Joined: 26 Dec 05
Posts: 2
Credit: 251,588
RAC: 0
Message 20127 - Posted: 11 Feb 2006, 7:40:20 UTC - in response to Message 20053.  


Unfortunately I don't have a backup, and I hadn't thought of making one. It's a great pity that a whole month of work was lost.

Hmm, it's actually an AMD Athlon 64 3000+ with 256 MB under Windows XP 64 Pro. I'll add an extra memory module.

REPLANCA: UPDATE REQUIRED FOR FIELD 76
REPLANCA - time interpolation for field 76
time,time1,time2 6900.000 6480.000 7200.000
hours,int,period 6900 720 8640
Information used in checking ancillary data set:
position of lookup table in dataset: 742
Position of first lookup table referring to data type 58
Interval between lookup tables referring to data type 76
Number of steps 9
STASH code in dataset 125 STASH code requested 125
'Start' position of lookup tables for dataset in overall lookup array
332
REPLANCA: UPDATE REQUIRED FOR FIELD 77
REPLANCA - time interpolation for field 77
time,time1,time2 6900.000 6480.000 7200.000
hours,int,period 6900 720 8640
Information used in checking ancillary data set:
position of lookup table in dataset: 40
Position of first lookup table referring to data type 4
Interval between lookup tables referring to data type 4
Number of steps 9
STASH code in dataset 126 STASH code requested 126
'Start' position of lookup tables for dataset in overall lookup array
265
im,sm,ngroup,new_im,new_sm 1 1 48 T F
PPCTL: Opening new file hskaca.pg48nov on unit 66
PPCTL: Initialising new file on unit 66
PPCTL: Opening new file hskaca.ph48nov on unit 67
PPCTL: Initialising new file on unit 67
PPCTL: Opening new file hskaca.pi48nov on unit 68
PPCTL: Initialising new file on unit 68
NEGATIVE PRESSURE AT POINT 193
NEGATIVE PRESSURE AT POINT 194
... skip from 195 to 478 ...
NEGATIVE PRESSURE AT POINT 479
NEGATIVE PRESSURE AT POINT 480
*********************************************************************************
Model aborted with error code - 1 Routine and message:-
P_TH_ADJ : NEGATIVE PRESSURE VALUE CREATED.
*********************************************************************************

ID: 20127
old_user48532
Joined: 31 Jan 05
Posts: 1
Credit: 88,947
RAC: 0
Message 20362 - Posted: 16 Feb 2006, 21:21:21 UTC

I also have this kind of problem: the same unrecoverable error. It was better before I began using version 5.2.13.
In version 4.45 the models didn't crash, but I couldn't run the graphics and had to disable the screensaver. I even had to change the virus scanner to use version 4.45, because AntiVir was causing the model to crash. This seemed to be a problem common to many Athlon systems.

Earlier versions didn't cause any problems at all, and all other applications under BOINC 5.2.13 run smoothly. I'm sure it's all down to a graphics driver problem, maybe ATI or XP...

My system:

Athlon XP 2600+
RAM 1024 MB
Asus A7N8X-E
Radeon 9600 with ATI-Drivers
Windows XP Sp1
ID: 20362