Task 13497334

Name hadam3p_eu_6swm_2003_1_007504731_0
Workunit 7702206
Created 14 Oct 2011, 20:54:18 UTC
Sent 15 Oct 2011, 2:27:48 UTC
Report deadline 26 Sep 2012, 7:47:48 UTC
Received 30 Nov 2011, 1:31:25 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1175098
Run time 4 days 0 hours 41 min 12 sec
CPU time 3 days 5 hours 35 min 12 sec
Validate state Invalid
Credit 1,194.02
Device peak FLOPS 1.73 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5644, selfPID=4560, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1756, selfPID=4736, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3956, selfPID=4708, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6192, selfPID=4528, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7412, selfPID=4380, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4316, selfPID=4544, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6268, selfPID=5432, iMonCtr=1
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=0, iMonCtr=0
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=984, selfPID=4636, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6920, selfPID=4476, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2116, selfPID=1748, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4348, selfPID=4324, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5900, selfPID=4464, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6916, selfPID=5292, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6496, selfPID=4792, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6784, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4804, selfPID=4180, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4172, selfPID=3916, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3496, selfPID=3128, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4836, selfPID=3876, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4512, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4052, selfPID=3708, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6612, selfPID=3312, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6904, selfPID=3716, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2476, selfPID=3640, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4308, selfPID=5340, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6064, selfPID=4272, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5992, selfPID=5092, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
21:45:39 (4156): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3596, selfPID=3520, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6304, selfPID=4512, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1324, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5504, selfPID=2000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1728, selfPID=3756, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5032, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5040, selfPID=4600, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4920, selfPID=3976, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5960, selfPID=4452, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3700, selfPID=4004, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4856, selfPID=3892, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_6swm_2003_1_007504731\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_eu_um_6.0  012CA39A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  01272CD0  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  01271E9A  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  01252819  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  01152287  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  011EE7B2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  011EF2DA  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  00F69BD2  Unknown               Unknown  Unknown
hadam3p_eu_um_6.0  012AE638  Unknown               Unknown  Unknown
kernel32.dll       766E339A  Unknown               Unknown  Unknown
ntdll.dll          77969ED2  Unknown               Unknown  Unknown
ntdll.dll          77969EA5  Unknown               Unknown  Unknown
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_6swm_2003_1_007504731\tmp\xaakg.namelists

Image              PC        Routine            Line        Source             
hadrm3p_eu_um_6.0  0155C52A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01504460  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  0150362A  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  014E2469  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  013E66EB  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01482AE2  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  014835AF  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01229860  Unknown               Unknown  Unknown
hadrm3p_eu_um_6.0  01540893  Unknown               Unknown  Unknown
kernel32.dll       766E339A  Unknown               Unknown  Unknown
ntdll.dll          77969ED2  Unknown               Unknown  Unknown
ntdll.dll          77969EA5  Unknown               Unknown  Unknown

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_eu_6swm_2003_1_007504731_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6swm_2003_1_007504731_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6swm_2003_1_007504731_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6swm_2003_1_007504731_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6swm_2003_1_007504731_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_6swm_2003_1_007504731_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
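The <message> block above lists one <file_xfer_error> entry per failed upload. A minimal sketch of pulling the file names and error codes out of such a message is shown here, assuming the text has been copied into a string. The names build_message and parse_xfer_errors are hypothetical helpers, not part of BOINC, and the page itself does not say what error code -161 means.

```python
import re

# Hypothetical helper: rebuild the message text shown in the task page
# (zip files 7 through 12, each with error code -161).
def build_message():
    entries = []
    for index in range(7, 13):
        entries.append(
            "<file_xfer_error>\n"
            f"  <file_name>hadam3p_eu_6swm_2003_1_007504731_0_{index}.zip</file_name>\n"
            "  <error_code>-161</error_code>\n"
            "</file_xfer_error>\n"
        )
    return "upload failure: " + "".join(entries)

# The message is not a single well-formed XML document (it has a text
# prefix and several sibling elements), so a regular expression is used
# here instead of an XML parser.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(.*?)</file_name>\s*"
    r"<error_code>(-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)

def parse_xfer_errors(text):
    """Return a list of (file_name, error_code) tuples found in text."""
    return [(name, int(code)) for name, code in XFER_ERROR.findall(text)]

errors = parse_xfer_errors(build_message())
for name, code in errors:
    print(f"{name}: {code}")
```

Run against the message above, this lists the six zip files (_7 through _12), all with the same error code.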
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
22 Nov 2011 05:38:00  1175098  13497334   hadam3p_eu_6swm_2003_1_007504731_0  69,216    250,168         3.6143
18 Nov 2011 12:25:10  1175098  13497334   hadam3p_eu_6swm_2003_1_007504731_0  57,696    208,573         3.6150
16 Nov 2011 01:31:41  1175098  13497334   hadam3p_eu_6swm_2003_1_007504731_0  46,176    167,242         3.6218
08 Nov 2011 11:27:05  1175098  13497334   hadam3p_eu_6swm_2003_1_007504731_0  34,656    126,315         3.6448
03 Nov 2011 02:42:50  1175098  13497334   hadam3p_eu_6swm_2003_1_007504731_0  23,136    84,594          3.6564
31 Oct 2011 17:26:02  1175098  13497334   hadam3p_eu_6swm_2003_1_007504731_0  11,616    43,517          3.7463
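
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the six trickle rows; the variable and function names are hypothetical, and the interpretation is an assumption drawn from the numbers rather than something the page states.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS).
TRICKLES = [
    (69216, 250168, 3.6143),
    (57696, 208573, 3.6150),
    (46176, 167242, 3.6218),
    (34656, 126315, 3.6448),
    (23136, 84594, 3.6564),
    (11616, 43517, 3.7463),
]

def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep

for timestep, cpu_time_sec, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under this assumption the computed values agree with the reported column to four decimal places. Consecutive trickles are also spaced 11,520 timesteps apart.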