Task 12708594

Name hadam3p_eu_wdl7_2002_1_006822563_1
Workunit 7025879
Created 24 Mar 2011, 6:40:14 UTC
Sent 24 Mar 2011, 6:45:17 UTC
Report deadline 5 Mar 2012, 12:05:17 UTC
Received 15 Jun 2011, 12:32:00 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1140727
Run time 7 days 20 hours 33 min 14 sec
CPU time 6 days 8 hours 22 min 26 sec
Validate state Invalid
Credit 1,392.75
Device peak FLOPS 2.07 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Europe v6.09
windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1072, selfPID=432, iMonCtr=1
Model crash detected, will try to restart...
19:04:21 (3100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1160, selfPID=2300, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
12:00:20 (4292): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4616, selfPID=4616, iMonCtr=2
12:58:57 (2804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:57:54 (4284): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:57:55 (4284): No heartbeat from core client for 30 sec - exiting
13:57:56 (4284): No heartbeat from core client for 30 sec - exiting
13:57:57 (4284): No heartbeat from core client for 30 sec - exiting
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3992, selfPID=3992, iMonCtr=2
14:56:40 (532): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:55:36 (5840): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:55:37 (5840): No heartbeat from core client for 30 sec - exiting
15:55:38 (5840): No heartbeat from core client for 30 sec - exiting
15:55:39 (5840): No heartbeat from core client for 30 sec - exiting
15:55:40 (5840): No heartbeat from core client for 30 sec - exiting
15:55:41 (5840): No heartbeat from core client for 30 sec - exiting
16:54:21 (4324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:53:06 (4376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4996, selfPID=4996, iMonCtr=2
18:51:50 (1044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:50:51 (5876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:39:47 (3628): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3592, selfPID=4052, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=780, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5336, selfPID=4968, iMonCtr=1
Model crash detected, will try to restart...
20:40:31 (2288): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:59:56 (2460): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2632, selfPID=2632, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6008, selfPID=3760, iMonCtr=1
Model crash detected, will try to restart...
19:55:11 (2408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:58:21 (3548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3728, selfPID=3728, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...

Model crashed: ATM_DYN : NEGATIVE THETA DETECTED. tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
Called boinc_finish

</stderr_txt>
<message>
<file_xfer_error>
  <file_name>hadam3p_eu_wdl7_2002_1_006822563_1_8.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wdl7_2002_1_006822563_1_9.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wdl7_2002_1_006822563_1_10.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wdl7_2002_1_006822563_1_11.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>hadam3p_eu_wdl7_2002_1_006822563_1_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
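The stderr output above repeats a small number of message types. As a rough aid for reading it, the following sketch tallies lines by type. The category names and regular expressions are assumptions made for illustration, not part of BOINC or the CPDN application, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Hypothetical categories; the names and patterns are illustrative assumptions.
PATTERNS = {
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "process_not_running": re.compile(r"CPDN process is not running"),
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "model_crash": re.compile(r"Model crash"),
}


def tally(lines):
    """Count log lines per category; a line is counted under the first pattern it matches."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break
    return counts


# A few lines copied from the stderr output, one of each kind.
SAMPLE = [
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1072, selfPID=432, iMonCtr=1",
    "Model crash detected, will try to restart...",
    "19:04:21 (3100): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - No 'heartbeat' from BOINC...",
    "CPDN Monitor - Quit request from BOINC...",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "Model crashed: ATM_DYN : NEGATIVE THETA DETECTED.",
]

if __name__ == "__main__":
    for name, count in sorted(tally(SAMPLE).items()):
        print(f"{name}: {count}")
```

The "CPDN Monitor - No 'heartbeat' from BOINC..." lines are deliberately left uncounted here, since each one appears to echo a preceding "No heartbeat from core client" line.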
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                         Timestep  CPU Time (sec)  Average (sec/TS)
14 Jun 2011 11:27:22  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    80,736         501,292            6.2090
09 Jun 2011 17:36:10  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    69,216         429,394            6.2037
31 May 2011 18:43:11  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    57,696         355,408            6.1600
28 May 2011 10:15:03  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    46,176         284,305            6.1570
06 May 2011 10:41:36  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    34,656         214,582            6.1918
02 May 2011 15:09:29  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    23,136         143,702            6.2112
22 Apr 2011 00:43:02  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    11,617          71,809            6.1814
22 Apr 2011 00:43:02  1140727  12708594   hadam3p_eu_wdl7_2002_1_006822563_1    11,616          70,775            6.0929
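The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the rows above. The function names and the rounding tolerance are assumptions chosen for illustration; the figures are copied from the table.

```python
# (timestep, cpu_time_sec, reported_average_sec_per_ts), copied from the trickle table.
TRICKLES = [
    (80736, 501292, 6.2090),
    (69216, 429394, 6.2037),
    (57696, 355408, 6.1600),
    (46176, 284305, 6.1570),
    (34656, 214582, 6.1918),
    (23136, 143702, 6.2112),
    (11617, 71809, 6.1814),
    (11616, 70775, 6.0929),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by the number of timesteps completed."""
    return cpu_time_sec / timestep


def mismatches(rows, tolerance=5.1e-5):
    """Return rows whose reported average differs from cpu/timestep by more than
    the tolerance (assumed: the table rounds to four decimal places)."""
    return [
        row for row in rows
        if abs(average_sec_per_timestep(row[1], row[0]) - row[2]) > tolerance
    ]


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        computed = average_sec_per_timestep(cpu, timestep)
        print(f"{timestep:>6}  computed {computed:.4f}  reported {reported:.4f}")
    print("rows that disagree:", mismatches(TRICKLES))
```

Under that assumption every row agrees to four decimal places, so the column carries no information beyond the two columns to its left.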

