Task 12972236

Name hadcm3n_o4oy_1940_40_007266050_2
Workunit 7464290
Created 11 Jun 2011, 13:49:35 UTC
Sent 11 Jun 2011, 13:49:42 UTC
Report deadline 10 Sep 2011, 21:16:53 UTC
Received 18 Sep 2011, 10:28:55 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1012620
Run time 19 days 1 hours 31 min 16 sec
CPU time 13 days 9 hours 1 min 19 sec
Validate state Invalid
Credit 6,220.80
Device peak FLOPS 2.26 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4300, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
13:16:31 (5916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:23:53 (4512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:58:15 (4352): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
Suspended CPDN Monitor - Suspend request from BOINC...
Ocean Restart file copy failed on o4oyko.dae73f0
Suspended CPDN Monitor - Suspend request from BOINC...
20:16:18 (4568): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
12:03:56 (1900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:03:57 (1900): No heartbeat from core client for 30 sec - exiting
12:03:58 (1900): No heartbeat from core client for 30 sec - exiting
12:03:59 (1900): No heartbeat from core client for 30 sec - exiting
12:04:00 (1900): No heartbeat from core client for 30 sec - exiting
12:04:01 (1900): No heartbeat from core client for 30 sec - exiting
12:04:02 (1900): No heartbeat from core client for 30 sec - exiting
12:04:03 (1900): No heartbeat from core client for 30 sec - exiting
12:04:04 (1900): No heartbeat from core client for 30 sec - exiting
12:04:05 (1900): No heartbeat from core client for 30 sec - exiting
12:04:07 (1900): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4516, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6040, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5364, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5880, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2716, iMonCtr=1
Model crash detected, will try to restart...
11:24:52 (4652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:05:02 (5168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:05:03 (5168): No heartbeat from core client for 30 sec - exiting
17:05:04 (5168): No heartbeat from core client for 30 sec - exiting
17:05:05 (5168): No heartbeat from core client for 30 sec - exiting
17:05:06 (5168): No heartbeat from core client for 30 sec - exiting
17:05:07 (5168): No heartbeat from core client for 30 sec - exiting
17:05:08 (5168): No heartbeat from core client for 30 sec - exiting
17:05:09 (5168): No heartbeat from core client for 30 sec - exiting
17:05:10 (5168): No heartbeat from core client for 30 sec - exiting
17:05:11 (5168): No heartbeat from core client for 30 sec - exiting
17:05:12 (5168): No heartbeat from core client for 30 sec - exiting
17:05:13 (5168): No heartbeat from core client for 30 sec - exiting
Ocean Restart file copy failed on o4oyko.daf19g0
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2344, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4924, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITHEAD: I/O error tmp/pipe_dummy 2048
BUFFOUT: C I/O Error - Return code = 32

Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1208, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4104, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4448, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5572, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5412, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5544, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4352, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5164, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2824, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2824, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
18 Sep 2011 10:28:35  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   518,400       1,155,656            2.2293
16 Sep 2011 07:14:29  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   492,480       1,093,383            2.2202
10 Sep 2011 08:00:13  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   466,560       1,028,324            2.2041
05 Sep 2011 12:01:44  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   440,640         965,067            2.1901
02 Sep 2011 15:15:03  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   414,720         898,846            2.1674
29 Aug 2011 13:51:18  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   388,800         834,760            2.1470
25 Aug 2011 08:57:10  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   362,880         770,704            2.1239
24 Aug 2011 16:33:12  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   336,960         742,768            2.2043
20 Aug 2011 14:06:15  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   311,040         678,726            2.1821
13 Aug 2011 11:06:21  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   285,120         614,939            2.1568
06 Aug 2011 10:31:09  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   259,200         552,502            2.1316
04 Aug 2011 05:58:58  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   233,280         489,179            2.0970
02 Aug 2011 14:53:56  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   207,360         428,052            2.0643
02 Aug 2011 14:53:56  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   181,440         364,373            2.0082
25 Jul 2011 15:27:19  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   155,520         300,880            1.9347
09 Jul 2011 07:45:02  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   129,600         243,181            1.8764
05 Jul 2011 07:33:24  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2   103,680         191,627            1.8483
05 Jul 2011 07:33:24  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2    77,760         180,202            2.3174
17 Jun 2011 15:13:03  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2    51,840         117,836            2.2731
15 Jun 2011 10:12:10  1012620  12972236   hadcm3n_o4oy_1940_40_007266050_2    25,920          60,161            2.3210
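
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached. This is inferred from the figures in the table, not from any project documentation. A minimal Python sketch that checks the inference against the first and last trickles; the function name and the `rows` list are hypothetical, introduced only for illustration:

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)

# (timestep, cpu_time_sec) copied from the trickle table:
# the latest trickle (18 Sep 2011) and the earliest (15 Jun 2011).
rows = [(518_400, 1_155_656), (25_920, 60_161)]

averages = [average_sec_per_timestep(cpu, ts) for ts, cpu in rows]
print(averages)
```

Both values agree with the table (2.2293 and 2.3210), which supports the assumed definition for these two rows.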

