climateprediction.net home page
Task 15791174

Task 15791174

Name hadcm3n_zmsq_1960_40_008370019_0
Workunit 8520878
Created 20 May 2013, 16:23:30 UTC
Sent 20 May 2013, 16:23:51 UTC
Report deadline 19 Aug 2013, 23:51:02 UTC
Received 16 Jun 2013, 0:24:56 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1100141
Run time 9 days 0 hours 21 min 22 sec
CPU time 5 days 21 hours 26 min 55 sec
Validate state Invalid
Credit 4,043.52
Device peak FLOPS 2.76 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8000, iMonCtr=1
Model crash detected, will try to restart...
20:15:50 (8068): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:38:16 (8076): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2552, iMonCtr=1
Model crash detected, will try to restart...
15:16:12 (4856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11704, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7404, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12360, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12360, iMonCtr=1
Model crash detected, will try to restart...
18:57:02 (1720): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:57:03 (1720): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13496, iMonCtr=1
Model crash detected, will try to restart...
21:49:36 (11820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:49:38 (11820): No heartbeat from core client for 30 sec - exiting
21:49:39 (11820): No heartbeat from core client for 30 sec - exiting
21:49:40 (11820): No heartbeat from core client for 30 sec - exiting
21:49:41 (11820): No heartbeat from core client for 30 sec - exiting
21:49:42 (11820): No heartbeat from core client for 30 sec - exiting
21:49:44 (11820): No heartbeat from core client for 30 sec - exiting
21:49:45 (11820): No heartbeat from core client for 30 sec - exiting
21:49:46 (11820): No heartbeat from core client for 30 sec - exiting
21:49:47 (11820): No heartbeat from core client for 30 sec - exiting
21:49:48 (11820): No heartbeat from core client for 30 sec - exiting
22:36:26 (3948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:01:47 (6324): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:01:48 (6324): No heartbeat from core client for 30 sec - exiting
23:01:49 (6324): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5508, iMonCtr=1
Model crash detected, will try to restart...
06:16:38 (1744): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
23:16:11 (1304): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:40:10 (11848): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:44:09 (11532): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
10:51:43 (7540): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
05:44:59 (5784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
05:45:00 (5784): No heartbeat from core client for 30 sec - exiting
05:45:01 (5784): No heartbeat from core client for 30 sec - exiting
05:45:02 (5784): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:32:43 (14092): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:32:45 (14092): No heartbeat from core client for 30 sec - exiting
20:33:23 (1944): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9224, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1636, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10828, selfPID=10828, iMonCtr=1
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:30:01 (10884): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:30:03 (10884): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7192, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7192, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7192, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7192, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7192, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7264, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
14 Jun 2013 01:59:17 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 336,960 480,845 1.4270
12 Jun 2013 13:45:55 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 311,040 444,164 1.4280
11 Jun 2013 04:26:25 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 285,120 407,237 1.4283
09 Jun 2013 16:07:48 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 259,200 369,336 1.4249
07 Jun 2013 22:18:07 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 233,280 332,170 1.4239
05 Jun 2013 02:02:27 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 207,360 296,350 1.4292
01 Jun 2013 11:40:27 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 181,440 260,420 1.4353
29 May 2013 21:15:28 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 155,520 223,402 1.4365
26 May 2013 17:33:00 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 129,600 186,946 1.4425
25 May 2013 16:04:42 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 103,680 149,365 1.4406
24 May 2013 09:28:07 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 77,760 112,003 1.4404
22 May 2013 20:57:48 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 51,840 75,549 1.4573
21 May 2013 12:49:30 1100141 15791174 hadcm3n_zmsq_1960_40_008370019_0 25,920 37,267 1.4378


©2024 cpdn.org