Task 14443513

Name hadam3p_pnw_bebq_1965_1_007904869_0
Workunit 8059981
Created 17 Apr 2012, 17:47:28 UTC
Sent 12 May 2012, 14:08:15 UTC
Report deadline 24 Apr 2013, 19:28:15 UTC
Received 21 Jul 2012, 11:33:05 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1178365
Run time 3 days 14 hours 40 min 21 sec
CPU time 3 days 4 hours 22 min 4 sec
Validate state Invalid
Credit 1,754.30
Device peak FLOPS 2.73 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09
windows_intelx86
Stderr
<core_client_version>7.0.25</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5400, selfPID=5688, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=156, selfPID=3964, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3788, selfPID=3812, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3688, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3396, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1956, selfPID=3360, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3288, selfPID=3848, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2924, selfPID=3796, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3056, selfPID=3796, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3940, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2924, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2860, selfPID=2356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3268, selfPID=3660, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2884, selfPID=3784, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3160, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3292, selfPID=3800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3052, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4708, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=348, selfPID=4084, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4004, selfPID=3940, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=600, selfPID=3700, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3732, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3944, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4732, selfPID=3700, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5032, selfPID=4908, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1804, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4916, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4924, selfPID=4044, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4928, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5384, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2384, selfPID=3492, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2204, selfPID=3528, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3256, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 7
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_bebq_1965_1_007904869/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_bebq_1965_1_007904869/dataout/region_restart.day after 11 attempts

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakg.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_pnw_bebq_1965_1_007904869_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_bebq_1965_1_007904869_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_bebq_1965_1_007904869_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_bebq_1965_1_007904869_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_bebq_1965_1_007904869_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
21 Jul 2012 10:36:38  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  80,736    274,090         3.3949
16 Jun 2012 19:52:23  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  69,227    235,273         3.3986
16 Jun 2012 18:50:51  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  69,218    234,765         3.3917
16 Jun 2012 12:29:10  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  69,216    234,254         3.3844
15 Jun 2012 17:00:24  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  57,696    194,625         3.3733
09 Jun 2012 17:07:19  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  46,176    155,398         3.3653
06 Jun 2012 18:39:50  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  34,656    116,566         3.3635
02 Jun 2012 07:05:23  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  23,136    77,921          3.3680
31 May 2012 18:38:14  1178365  14443513   hadam3p_pnw_bebq_1965_1_007904869_0  11,616    39,500          3.4005
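
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That reading is an assumption based on the figures in the table, not something the page states. A minimal Python sketch that checks it against three of the rows above (the function name avg_sec_per_timestep is hypothetical, chosen for illustration):

```python
def avg_sec_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this mirrors how the Average (sec/TS) column is derived.
    """
    if timesteps <= 0:
        raise ValueError("timesteps must be positive")
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, reported average) copied from the trickle table
rows = [
    (80736, 274090, 3.3949),
    (69227, 235273, 3.3986),
    (11616, 39500, 3.4005),
]

for timesteps, cpu_seconds, reported in rows:
    computed = avg_sec_per_timestep(cpu_seconds, timesteps)
    print(f"{timesteps:>6} timesteps: computed {computed:.4f}, reported {reported:.4f}")
```

Under that assumption the computed values agree with the reported column to four decimal places for these rows.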