Task 15896579


Name hadcm3n_3i2y_1980_40_008400578_1
Workunit 8551434
Created 18 Jul 2013, 20:06:23 UTC
Sent 18 Jul 2013, 20:15:59 UTC
Report deadline 18 Oct 2013, 3:43:10 UTC
Received 23 Aug 2013, 13:46:43 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1251442
Run time 16 days 16 hours 42 min 43 sec
CPU time 12 days 1 hours 1 min 38 sec
Validate state Invalid
Credit 4,665.60
Device peak FLOPS 2.28 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7032, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1
Model crash detected, will try to restart...
10:59:21 (5180): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6952, iMonCtr=1
Model crash detected, will try to restart...
12:44:31 (5200): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7556, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4596, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6836, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6308, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6092, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6092, iMonCtr=1
Model crash detected, will try to restart...
12:02:03 (2948): No heartbeat from core client for 30 sec - exiting
12:02:05 (2948): No heartbeat from core client for 30 sec - exiting
12:02:06 (2948): No heartbeat from core client for 30 sec - exiting
12:02:07 (2948): No heartbeat from core client for 30 sec - exiting
12:02:08 (2948): No heartbeat from core client for 30 sec - exiting
12:02:09 (2948): No heartbeat from core client for 30 sec - exiting
12:02:10 (2948): No heartbeat from core client for 30 sec - exiting
12:02:11 (2948): No heartbeat from core client for 30 sec - exiting
12:02:12 (2948): No heartbeat from core client for 30 sec - exiting
12:02:13 (2948): No heartbeat from core client for 30 sec - exiting
12:02:14 (2948): No heartbeat from core client for 30 sec - exiting
12:02:16 (2948): No heartbeat from core client for 30 sec - exiting
12:02:17 (2948): No heartbeat from core client for 30 sec - exiting
12:02:18 (2948): No heartbeat from core client for 30 sec - exiting
12:02:19 (2948): No heartbeat from core client for 30 sec - exiting
12:02:20 (2948): No heartbeat from core client for 30 sec - exiting
12:02:21 (2948): No heartbeat from core client for 30 sec - exiting
12:02:22 (2948): No heartbeat from core client for 30 sec - exiting
12:02:23 (2948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5768, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
11:12:25 (1668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1452, iMonCtr=1
Model crash detected, will try to restart...
10:24:50 (4620): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:24:51 (4620): No heartbeat from core client for 30 sec - exiting
10:24:52 (4620): No heartbeat from core client for 30 sec - exiting
10:24:53 (4620): No heartbeat from core client for 30 sec - exiting
10:24:54 (4620): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6004, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6004, iMonCtr=1
Model crash detected, will try to restart...
13:32:20 (6064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2192, iMonCtr=1
Model crash detected, will try to restart...
15:38:11 (5476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7784, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4604, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5564, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5564, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8068, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4692, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6536, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7124, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2936, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6156, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6156, iMonCtr=1
Model crash detected, will try to restart...
11:54:36 (6176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5984, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
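The stderr output above repeats a small set of message types. A minimal Python sketch for tallying them is given here as an aid to reading the log. The category labels and the `count_events` helper are illustrative assumptions, not part of BOINC or the CPDN application, and the sample is six lines copied from the log rather than the full text.

```python
from collections import Counter

# Substrings taken verbatim from the stderr output; the labels are hypothetical.
PATTERNS = {
    "quit_request": "Quit request from BOINC",
    "suspend_request": "Suspend request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "restart_attempt": "Model crash detected, will try to restart",
    "gave_up": "too many model crashes",
}

def count_events(stderr_text):
    """Count how many lines contain each known message substring."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts

# Six consecutive lines copied from the log above.
sample = """\
10:59:21 (5180): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6952, iMonCtr=1
Model crash detected, will try to restart...
"""

counts = count_events(sample)
for label, n in sorted(counts.items()):
    print(label, n)
```

Run over the whole stderr text, the same tally would show how many restart attempts preceded the final "too many model crashes" message.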
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
21 Aug 2013 16:38:30  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  388,800   1,007,367       2.5910
19 Aug 2013 17:11:33  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  362,880   939,609         2.5893
16 Aug 2013 20:42:05  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  336,960   874,784         2.5961
14 Aug 2013 17:49:28  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  311,040   805,430         2.5895
14 Aug 2013 17:49:28  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  285,120   737,148         2.5854
14 Aug 2013 17:49:28  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  259,200   665,222         2.5664
14 Aug 2013 17:49:28  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  233,280   594,523         2.5485
14 Aug 2013 17:49:28  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  207,360   526,614         2.5396
14 Aug 2013 17:49:28  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  181,440   459,405         2.5320
14 Aug 2013 17:49:28  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  155,520   393,303         2.5290
29 Jul 2013 14:34:52  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  129,600   330,867         2.5530
29 Jul 2013 14:34:52  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  103,680   266,808         2.5734
25 Jul 2013 02:06:57  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  77,760    201,289         2.5886
23 Jul 2013 22:13:56  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  51,840    135,558         2.6149
23 Jul 2013 20:38:38  1251442  15896579   hadcm3n_3i2y_1980_40_008400578_1  25,920    67,677          2.6110
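
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The Python sketch below checks that reading against three rows copied from the table. The helper name `sec_per_timestep` is illustrative, and the relationship is inferred from the numbers rather than stated by the page.

```python
# (timestep, cumulative CPU time in seconds, reported average) copied from the table.
trickles = [
    (388_800, 1_007_367, 2.5910),
    (362_880, 939_609, 2.5893),
    (25_920, 67_677, 2.6110),
]

def sec_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals."""
    return round(cpu_seconds / timestep, 4)

# Compare the recomputed value with the reported one for each row.
results = []
for timestep, cpu_seconds, reported in trickles:
    computed = sec_per_timestep(timestep, cpu_seconds)
    results.append((timestep, computed, reported))
    print(timestep, computed, reported)
```

For these three rows the recomputed value matches the reported average to four decimal places.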

