Task 16161395

Name hadcm3n_n0jw_1880_40_008409521_3
Workunit 8560377
Created 26 Dec 2013, 9:55:51 UTC
Sent 26 Dec 2013, 9:56:50 UTC
Report deadline 27 Mar 2014, 17:24:01 UTC
Received 30 Jan 2014, 23:00:29 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1376550
Run time 34 days 7 hours 37 min 49 sec
CPU time 31 days 1 hours 25 min 38 sec
Validate state Invalid
Credit 10,575.36
Device peak FLOPS 1.76 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>7.2.33</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
21:50:16 (2176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13184, iMonCtr=1
Model crash detected, will try to restart...
18:06:03 (1900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:19:52 (5748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:19:57 (5748): No heartbeat from core client for 30 sec - exiting
03:19:58 (5748): No heartbeat from core client for 30 sec - exiting
03:19:59 (5748): No heartbeat from core client for 30 sec - exiting
03:20:00 (5748): No heartbeat from core client for 30 sec - exiting
03:20:01 (5748): No heartbeat from core client for 30 sec - exiting
03:20:02 (5748): No heartbeat from core client for 30 sec - exiting
03:21:58 (6664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:32:05 (11372): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6132, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
03:15:32 (6060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
03:15:52 (6060): No heartbeat from core client for 30 sec - exiting
03:15:53 (6060): No heartbeat from core client for 30 sec - exiting
03:15:54 (6060): No heartbeat from core client for 30 sec - exiting
03:15:55 (6060): No heartbeat from core client for 30 sec - exiting
03:15:56 (6060): No heartbeat from core client for 30 sec - exiting
03:15:57 (6060): No heartbeat from core client for 30 sec - exiting
03:15:58 (6060): No heartbeat from core client for 30 sec - exiting
03:15:59 (6060): No heartbeat from core client for 30 sec - exiting
03:16:00 (6060): No heartbeat from core client for 30 sec - exiting
03:16:01 (6060): No heartbeat from core client for 30 sec - exiting
19:59:14 (5900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:59:43 (5900): No heartbeat from core client for 30 sec - exiting
19:59:44 (5900): No heartbeat from core client for 30 sec - exiting
19:59:45 (5900): No heartbeat from core client for 30 sec - exiting
19:59:46 (5900): No heartbeat from core client for 30 sec - exiting
19:59:47 (5900): No heartbeat from core client for 30 sec - exiting
19:59:48 (5900): No heartbeat from core client for 30 sec - exiting
19:59:49 (5900): No heartbeat from core client for 30 sec - exiting
19:59:50 (5900): No heartbeat from core client for 30 sec - exiting
19:59:52 (5900): No heartbeat from core client for 30 sec - exiting
19:59:53 (5900): No heartbeat from core client for 30 sec - exiting
Atmos Hold Restart file rename failed on atmos_restart.hold
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
30 Jan 2014 23:00:34 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 881,280 | 2,651,830 | 3.0091
27 Jan 2014 23:34:25 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 855,360 | 2,576,186 | 3.0118
27 Jan 2014 02:16:30 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 829,440 | 2,500,634 | 3.0148
26 Jan 2014 05:11:27 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 803,520 | 2,425,482 | 3.0186
25 Jan 2014 07:06:51 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 777,600 | 2,349,836 | 3.0219
24 Jan 2014 11:21:29 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 751,680 | 2,273,706 | 3.0248
23 Jan 2014 07:45:06 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 725,760 | 2,195,060 | 3.0245
22 Jan 2014 05:57:53 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 699,840 | 2,114,151 | 3.0209
21 Jan 2014 05:39:19 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 673,920 | 2,033,520 | 3.0175
20 Jan 2014 04:49:29 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 648,000 | 1,950,217 | 3.0096
19 Jan 2014 04:19:12 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 622,080 | 1,867,668 | 3.0023
18 Jan 2014 04:26:05 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 596,160 | 1,789,917 | 3.0024
17 Jan 2014 04:21:06 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 570,240 | 1,712,885 | 3.0038
16 Jan 2014 02:22:00 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 544,320 | 1,633,444 | 3.0009
14 Jan 2014 23:11:31 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 518,400 | 1,547,222 | 2.9846
14 Jan 2014 00:56:42 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 492,480 | 1,465,273 | 2.9753
12 Jan 2014 20:55:03 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 466,560 | 1,380,192 | 2.9582
11 Jan 2014 21:20:44 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 440,640 | 1,296,850 | 2.9431
10 Jan 2014 21:31:22 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 414,720 | 1,213,517 | 2.9261
10 Jan 2014 00:04:47 | 1274848 | 16161395 | hadcm3n_n0jw_1880_40_008409521_3 | 388,800 | 1,137,689 | 2.9262
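
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the table. The function name `sec_per_timestep` and the four-decimal rounding are assumptions made for illustration, not part of the CPDN or BOINC software.

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
rows = [
    (881_280, 2_651_830, 3.0091),
    (855_360, 2_576_186, 3.0118),
    (388_800, 1_137_689, 2.9262),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Hypothetical helper: assumes the page's average is a simple ratio.
    """
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the computed ratio next to the value shown on the page.
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

If the assumption holds, the computed and reported values agree to the displayed precision for these rows. Consecutive trickles in the table are also 25,920 timesteps apart.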

