Task 12326076

Name hadam3p_saf_259h_1964_1_007038909_0
Workunit 7242225
Created 25 Nov 2010, 11:08:05 UTC
Sent 5 Dec 2010, 18:50:11 UTC
Report deadline 18 Nov 2011, 0:10:11 UTC
Received 16 Feb 2011, 22:22:26 UTC
Server state Over
Outcome No reply
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1026864
Run time 2 days 5 hours 53 min 10 sec
CPU time 2 days 2 hours 34 min 57 sec
Validate state Invalid
Credit 1,122.82
Device peak FLOPS 2.56 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Southern Africa v6.08 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5008, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1156, selfPID=1156, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2468, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1532, selfPID=1532, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4612, selfPID=3304, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3312, selfPID=3312, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5476, selfPID=5476, iMonCtr=2
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4616, selfPID=3100, iMonCtr=1
Model crash detected, will try to restart...
20:32:24 (3196): No heartbeat from core client for 30 sec - exiting
20:32:25 (3196): No heartbeat from core client for 30 sec - exiting
20:32:27 (3196): No heartbeat from core client for 30 sec - exiting
20:32:28 (3196): No heartbeat from core client for 30 sec - exiting
20:32:29 (3196): No heartbeat from core client for 30 sec - exiting
20:32:30 (3196): No heartbeat from core client for 30 sec - exiting
20:32:31 (3196): No heartbeat from core client for 30 sec - exiting
20:32:32 (3196): No heartbeat from core client for 30 sec - exiting
20:32:33 (3196): No heartbeat from core client for 30 sec - exiting
20:32:34 (3196): No heartbeat from core client for 30 sec - exiting
20:32:35 (3196): No heartbeat from core client for 30 sec - exiting
20:32:36 (3196): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:29:56 (3076): No heartbeat from core client for 30 sec - exiting
18:29:57 (3076): No heartbeat from core client for 30 sec - exiting
18:29:58 (3076): No heartbeat from core client for 30 sec - exiting
18:29:59 (3076): No heartbeat from core client for 30 sec - exiting
18:30:00 (3076): No heartbeat from core client for 30 sec - exiting
18:30:01 (3076): No heartbeat from core client for 30 sec - exiting
18:30:02 (3076): No heartbeat from core client for 30 sec - exiting
18:30:03 (3076): No heartbeat from core client for 30 sec - exiting
18:30:05 (3076): No heartbeat from core client for 30 sec - exiting
18:30:06 (3076): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:30:07 (3076): No heartbeat from core client for 30 sec - exiting

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO                                                                                                                                                                                           tmp/xaakm.pipe_dummy                                                            2048    
20:12:05 (3092): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO                                                                                                                                                                                           tmp/xaakm.pipe_dummy                                                            2048    
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6112, selfPID=5896, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
20:12:44 (5896): called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_saf_259h_1964_1_007038909_0_7.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_259h_1964_1_007038909_0_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_259h_1964_1_007038909_0_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_259h_1964_1_007038909_0_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_259h_1964_1_007038909_0_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_saf_259h_1964_1_007038909_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
16 Feb 2011 21:56:05  1026864  12326076   hadam3p_saf_259h_1964_1_007038909_0  69,216    181,490         2.6221
15 Feb 2011 00:09:37  1026864  12326076   hadam3p_saf_259h_1964_1_007038909_0  57,696    150,711         2.6122
05 Feb 2011 23:25:21  1026864  12326076   hadam3p_saf_259h_1964_1_007038909_0  46,176    120,730         2.6146
22 Jan 2011 20:49:36  1026864  12326076   hadam3p_saf_259h_1964_1_007038909_0  34,656    90,736          2.6182
06 Jan 2011 23:27:15  1026864  12326076   hadam3p_saf_259h_1964_1_007038909_0  23,142    60,496          2.6141
06 Jan 2011 22:19:27  1026864  12326076   hadam3p_saf_259h_1964_1_007038909_0  23,136    60,128          2.5989
07 Dec 2010 20:45:12  1026864  12326076   hadam3p_saf_259h_1964_1_007038909_0  11,616    30,928          2.6625
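
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The page does not state this definition, so the following is a minimal Python sketch under that assumption; the names TRICKLES and average_sec_per_timestep are illustrative and not part of the site or of BOINC. The figures are copied from the trickle rows above.

```python
# Trickle rows copied from the table above:
# (timestep, CPU time in seconds, reported average in sec/TS).
TRICKLES = [
    (69216, 181490, 2.6221),
    (57696, 150711, 2.6122),
    (46176, 120730, 2.6146),
    (34656, 90736, 2.6182),
    (23142, 60496, 2.6141),
    (23136, 60128, 2.5989),
    (11616, 30928, 2.6625),
]


def average_sec_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per cumulative timestep,
    rounded to four decimals as displayed in the table."""
    return round(cpu_seconds / timestep, 4)


for timestep, cpu_seconds, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_seconds, timestep)
    # Under the assumed definition, each row should reproduce the reported value.
    assert computed == reported, (timestep, computed, reported)
```

If the assumption holds, every row passes the check, which suggests the host ran at a steady rate of roughly 2.6 CPU seconds per timestep across the whole run.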


©2024 climateprediction.net