climateprediction.net home page
Task 15476485

Name: hadcm3n_zd38_1920_40_008255705_1
Workunit: 8410829
Created: 14 Dec 2012, 3:19:24 UTC
Sent: 14 Dec 2012, 3:20:52 UTC
Report deadline: 15 Mar 2013, 10:48:03 UTC
Received: 10 Feb 2013, 5:50:46 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 22 (0x00000016) Unknown error code
Computer ID: 1251637
Run time: 17 days 8 hours 13 min 29 sec
CPU time: 15 days 13 hours 1 min 57 sec
Validate state: Invalid
Credit: 7,153.92
Device peak FLOPS: 2.51 GFLOPS
Application version: UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5476, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CSuspended CPDN Monitor - Suspend request from BOINC...
14:21:54 (2260): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:21:56 (2260): No heartbeat from core client for 30 sec - exiting
14:21:57 (2260): No heartbeat from core client for 30 sec - exiting
14:21:58 (2260): No heartbeat from core client for 30 sec - exiting
14:21:59 (2260): No heartbeat from core client for 30 sec - exiting
14:22:00 (2260): No heartbeat from core client for 30 sec - exiting
14:22:01 (2260): No heartbeat from core client for 30 sec - exiting
14:22:02 (2260): No heartbeat from core client for 30 sec - exiting
14:22:03 (2260): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1180, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5692, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5436, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4840, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5624, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3668, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5808, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5620, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
cpdnmonitor: error reading file dataout/ocean_restart.day
cpdnmonitor: error reading file dataout/atmos_restart.hold

Model crashed: TEMPHIST: Write ERROR on history file for namelistNLIHISTO                                                                                                                                                                                                      tmp/pipe_dummy                                                                  2048    
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
08 Feb 2013 04:35:29  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  596,160   1,342,689       2.2522
07 Feb 2013 10:48:37  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  570,240   1,285,305       2.2540
06 Feb 2013 01:26:31  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  544,320   1,228,568       2.2571
03 Feb 2013 22:35:08  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  518,400   1,170,979       2.2588
03 Feb 2013 22:35:08  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  492,480   1,113,453       2.2609
29 Jan 2013 09:34:38  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  466,560   1,055,738       2.2628
23 Jan 2013 01:06:48  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  440,640   998,144         2.2652
20 Jan 2013 09:09:42  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  414,720   940,415         2.2676
19 Jan 2013 09:05:15  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  388,800   883,198         2.2716
18 Jan 2013 15:36:37  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  362,880   825,411         2.2746
15 Jan 2013 16:00:59  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  336,960   767,965         2.2791
09 Jan 2013 21:10:56  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  311,040   710,273         2.2835
04 Jan 2013 18:14:54  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  285,120   630,611         2.2117
03 Jan 2013 23:33:00  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  259,200   572,476         2.2086
03 Jan 2013 06:10:21  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  233,280   515,375         2.2093
30 Dec 2012 19:19:14  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  207,360   456,231         2.2002
30 Dec 2012 00:52:23  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  181,440   396,997         2.1880
29 Dec 2012 06:24:47  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  155,520   338,452         2.1763
23 Dec 2012 07:12:42  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  129,600   282,222         2.1776
23 Dec 2012 01:56:37  1251637  15476485   hadcm3n_zd38_1920_40_008255705_1  103,680   226,300         2.1827

