climateprediction.net (CPDN) home page
Task 15942291

Task 15942291

Name hadcm3n_n363_1920_40_008402595_1
Workunit 8553451
Created 28 Aug 2013, 18:10:56 UTC
Sent 28 Aug 2013, 18:11:03 UTC
Report deadline 28 Nov 2013, 1:38:14 UTC
Received 10 Nov 2013, 19:52:52 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1258506
Run time 5 days 14 hours 27 min 26 sec
CPU time 5 days 9 hours 49 min 22 sec
Validate state Invalid
Credit 7,464.96
Device peak FLOPS 4.51 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4904, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5100, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5100, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4832, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4576, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1332, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1332, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1332, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1332, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1332, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1332, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5396, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1748, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4992, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4472, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4472, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4152, iMonCtr=1
Model crash detected, will try to restart...
11:56:39 (1892): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:02:48 (6176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:41:20 (4900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:50:53 (5672): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4268, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4268, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4468, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4728, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4676, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4976, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4580, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4580, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4676, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4676, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4468, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1468, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5784, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4944, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5092, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4756, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4408, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3888, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4352, iMonCtr=1
Model crash detected, will try to restart...
21:38:11 (4568): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 110 - Return code = 16

Model crashed: REPLANCA :I/O ERROR                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
07 Nov 2013 20:18:42 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 622,080 455,952 0.7329
06 Nov 2013 19:36:47 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 596,160 441,809 0.7411
05 Nov 2013 17:47:07 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 570,240 427,748 0.7501
04 Nov 2013 19:48:12 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 544,320 413,628 0.7599
01 Nov 2013 16:25:17 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 518,400 398,475 0.7687
30 Oct 2013 21:44:30 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 492,480 383,053 0.7778
30 Oct 2013 14:00:46 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 466,560 367,884 0.7885
27 Oct 2013 17:00:19 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 440,640 352,281 0.7995
26 Oct 2013 14:37:04 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 414,720 336,678 0.8118
24 Oct 2013 19:41:17 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 388,800 320,688 0.8248
20 Oct 2013 18:07:56 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 362,880 305,226 0.8411
20 Oct 2013 12:56:40 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 336,960 289,646 0.8596
19 Oct 2013 15:40:31 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 311,040 274,299 0.8819
18 Oct 2013 17:59:39 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 285,120 258,974 0.9083
17 Oct 2013 20:10:00 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 259,200 244,731 0.9442
12 Oct 2013 07:19:14 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 233,280 228,692 0.9803
07 Oct 2013 14:49:55 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 207,360 213,519 1.0297
03 Oct 2013 20:45:06 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 181,440 198,288 1.0929
29 Sep 2013 17:43:02 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 155,520 182,297 1.1722
07 Sep 2013 21:43:02 1258506 15942291 hadcm3n_n363_1920_40_008402595_1 129,600 76,976 0.5940


©2024 cpdn.org