Task 15111071

Name hadam3p_pnw_2v7i_1960_1_008139767_0
Workunit 8294881
Created 13 Aug 2012, 17:34:56 UTC
Sent 13 Aug 2012, 17:52:08 UTC
Report deadline 26 Jul 2013, 23:12:08 UTC
Received 6 Sep 2012, 17:03:39 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1120776
Run time 2 days 20 hours 33 min
CPU time 2 days 16 hours 34 min 23 sec
Validate state Invalid
Credit 1,503.98
Device peak FLOPS 3.17 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 (windows_intelx86)
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3836, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2620, selfPID=4280, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 0
Suspended CPDN Monitor - Suspend request from BOINC...
15:32:33 (4336): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:32:34 (4336): No heartbeat from core client for 30 sec - exiting
15:32:35 (4336): No heartbeat from core client for 30 sec - exiting
15:32:36 (4336): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8048, selfPID=4588, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5564, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2896, selfPID=4288, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 1
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:19:33 (4536): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:19:35 (4536): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4516, selfPID=3964, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4476, selfPID=5352, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4344, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 3
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3252, selfPID=1208, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 3
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3184, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 3
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5472, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 3
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4512, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5228, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3900, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3656, selfPID=4108, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 4
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4312, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4448, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5616, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3796, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 5
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4796, selfPID=4292, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6016, selfPID=5608, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3876, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2224, iMonCtr=2
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_pnw_2v7i_1960_1_008139767\tmp\xaakg.namelists

Image              PC        Routine            Line        Source             
hadrm3p_pnw_um_6.  004AC52A  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  00454460  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  0045362A  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  00432469  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  003366EB  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  003D2AE2  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  003D35AF  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  00179860  Unknown               Unknown  Unknown
hadrm3p_pnw_um_6.  00490893  Unknown               Unknown  Unknown
kernel32.dll       7678ED6C  Unknown               Unknown  Unknown
ntdll.dll          771B377B  Unknown               Unknown  Unknown
ntdll.dll          771B374E  Unknown               Unknown  Unknown
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_pnw_2v7i_1960_1_008139767\tmp\xaakm.namelists

Image              PC        Routine            Line        Source             
hadam3p_pnw_um_6.  00CDA39A  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00C82CD0  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00C81E9A  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00C62819  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00B62287  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00BFE7B2  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00BFF2DA  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00979BD2  Unknown               Unknown  Unknown
hadam3p_pnw_um_6.  00CBE638  Unknown               Unknown  Unknown
kernel32.dll       7678ED6C  Unknown               Unknown  Unknown
ntdll.dll          771B377B  Unknown               Unknown  Unknown
ntdll.dll          771B374E  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2372, selfPID=728, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
Called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_pnw_2v7i_1960_1_008139767_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_2v7i_1960_1_008139767_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_2v7i_1960_1_008139767_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_2v7i_1960_1_008139767_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_2v7i_1960_1_008139767_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_pnw_2v7i_1960_1_008139767_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
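The stderr output above repeats a small set of messages many times, so a tally is easier to read than the raw log. The following is a minimal sketch, not part of the CPDN or BOINC tooling: the function name `summarize_stderr`, the dictionary keys, and the `SAMPLE` excerpt are hypothetical and chosen only for illustration. It assumes the message wording is exactly as it appears in this log.

```python
import re

# Matches the progress line printed after each restart attempt, e.g.
# "Regional yearly means requires 12 input files got 6".
GOT_RE = re.compile(r"Regional yearly means requires (\d+) input files got (\d+)")


def summarize_stderr(text):
    """Count the recurring events in a CPDN stderr dump (hypothetical helper)."""
    lines = text.splitlines()
    got = [int(m.group(2)) for m in map(GOT_RE.search, lines) if m]
    return {
        # Each "Model crash detected" line marks one restart attempt.
        "restart_attempts": sum("Model crash detected" in ln for ln in lines),
        # Lines where the science app lost contact with the BOINC client.
        "heartbeat_losses": sum("No heartbeat from core client" in ln for ln in lines),
        # Intel Fortran runtime error 24: end-of-file during read.
        "fortran_eof_errors": sum(ln.startswith("forrtl: severe (24)") for ln in lines),
        # Highest "got N" value reported, or None if the line never appears.
        "max_input_files": max(got) if got else None,
    }


# Short excerpt in the same wording as the log above (the forrtl line is cut before the path).
SAMPLE = """Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4796, selfPID=4292, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Regional yearly means requires 12 input files got 6
16:19:33 (4536): No heartbeat from core client for 30 sec - exiting
forrtl: severe (24): end-of-file during read, unit 9
"""

if __name__ == "__main__":
    print(summarize_stderr(SAMPLE))
```

Run against the full stderr text, the same function would show how many restarts were attempted and that the input-file count never went above 6 of the 12 required.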
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
31 Aug 2012 21:57:57  1120776  15111071   hadam3p_pnw_2v7i_1960_1_008139767_0  69,216    202,822         2.9303
29 Aug 2012 19:00:25  1120776  15111071   hadam3p_pnw_2v7i_1960_1_008139767_0  57,696    169,367         2.9355
27 Aug 2012 19:05:17  1120776  15111071   hadam3p_pnw_2v7i_1960_1_008139767_0  46,176    135,796         2.9408
22 Aug 2012 16:49:16  1120776  15111071   hadam3p_pnw_2v7i_1960_1_008139767_0  34,656    101,891         2.9401
20 Aug 2012 19:04:07  1120776  15111071   hadam3p_pnw_2v7i_1960_1_008139767_0  23,149    68,523          2.9601
20 Aug 2012 18:10:47  1120776  15111071   hadam3p_pnw_2v7i_1960_1_008139767_0  23,136    68,063          2.9419
15 Aug 2012 19:05:47  1120776  15111071   hadam3p_pnw_2v7i_1960_1_008139767_0  11,616    34,501          2.9701
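
The Average (sec/TS) column in the trickle rows appears to be CPU Time (sec) divided by Timestep, rounded to four decimal places. This is inferred from the numbers, not from any project documentation. The sketch below checks that assumption against the seven rows listed here. The names `TRICKLES`, `average_sec_per_timestep`, and `mismatches` are hypothetical and exist only for this example.

```python
# Trickle rows copied from the table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
TRICKLES = [
    (69216, 202822, 2.9303),
    (57696, 169367, 2.9355),
    (46176, 135796, 2.9408),
    (34656, 101891, 2.9401),
    (23149, 68523, 2.9601),
    (23136, 68063, 2.9419),
    (11616, 34501, 2.9701),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_time_sec / timestep


def mismatches(rows, tolerance=1e-4):
    """Return the rows whose reported average differs from the computed one."""
    return [
        row
        for row in rows
        if abs(average_sec_per_timestep(row[1], row[0]) - row[2]) > tolerance
    ]


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        computed = average_sec_per_timestep(cpu, timestep)
        print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
    print("rows that disagree:", mismatches(TRICKLES))
```

Under this assumption every row agrees with the reported value to within rounding, which suggests the host sustained roughly 2.93 to 2.97 CPU seconds per timestep across all received trickles.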

