Task 15604986

Name hadcm3n_4gnq_1940_40_008311132_1
Workunit 8462267
Created 11 Feb 2013, 15:54:16 UTC
Sent 11 Feb 2013, 15:54:29 UTC
Report deadline 13 May 2013, 23:21:40 UTC
Received 11 Mar 2013, 17:35:01 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1077037
Run time 21 days 5 hours 20 min 28 sec
CPU time 19 days 2 hours 19 min 11 sec
Validate state Invalid
Credit 5,598.72
Device peak FLOPS 2.26 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
09:45:22 (9796): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6780, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:26:18 (5140): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:14:32 (6708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=37468, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
02:40:59 (3364): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
08:34:18 (7776): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:36:18 (18756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:54:46 (55004): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:58:19 (38664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=31072, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:53:10 (5548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
04:12:15 (22040): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
00:02:36 (6788): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2012, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
11 Mar 2013 03:30:50 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 466,560 1,612,032 3.4551
09 Mar 2013 21:12:59 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 440,640 1,511,942 3.4312
08 Mar 2013 16:00:48 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 414,720 1,412,959 3.4070
07 Mar 2013 09:46:11 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 388,800 1,317,936 3.3898
06 Mar 2013 08:10:41 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 362,880 1,227,170 3.3818
05 Mar 2013 05:24:54 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 336,960 1,134,123 3.3657
04 Mar 2013 01:56:33 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 311,040 1,040,249 3.3444
02 Mar 2013 20:15:18 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 285,120 949,960 3.3318
01 Mar 2013 15:56:00 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 259,200 860,403 3.3195
28 Feb 2013 16:00:35 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 233,280 764,138 3.2756
26 Feb 2013 22:19:03 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 207,360 663,315 3.1989
25 Feb 2013 17:57:45 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 181,440 565,322 3.1158
24 Feb 2013 01:32:28 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 155,520 466,666 3.0007
22 Feb 2013 13:35:15 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 129,600 382,664 2.9527
20 Feb 2013 02:11:47 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 103,680 291,467 2.8112
15 Feb 2013 05:56:34 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 77,760 204,699 2.6324
13 Feb 2013 02:35:03 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 51,840 114,204 2.2030
12 Feb 2013 09:14:42 1077037 15604986 hadcm3n_4gnq_1940_40_008311132_1 25,920 56,391 2.1756
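
The "Average (sec/TS)" column in the table above appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the figures, not stated on the page, so the following is a minimal sketch under that assumption. The function name is hypothetical, and the two rows are copied from the newest and oldest trickles listed.

```python
# Assumption: Average (sec/TS) = cumulative CPU time (sec) / timestep,
# rounded to four decimal places. This is inferred from the table values.

def average_sec_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Return CPU seconds spent per model timestep, rounded to 4 places."""
    return round(cpu_seconds / timestep, 4)

# (timestep, cpu_seconds, average reported in the table)
rows = [
    (466_560, 1_612_032, 3.4551),  # 11 Mar 2013 03:30:50
    (25_920, 56_391, 2.1756),      # 12 Feb 2013 09:14:42
]

for timestep, cpu_seconds, reported in rows:
    computed = average_sec_per_timestep(timestep, cpu_seconds)
    print(f"timestep={timestep} computed={computed:.4f} reported={reported:.4f}")
```

Read this way, the cost per timestep rose from roughly 2.18 sec to roughly 3.46 sec over the run, since the average is cumulative and climbs in every successive trickle.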


©2024 climateprediction.net