Task 15890617

Name	hadcm3n_3f5k_1980_40_008335019_3
Workunit	8485880
Created	10 Jul 2013, 13:01:06 UTC
Sent	10 Jul 2013, 13:44:55 UTC
Report deadline	9 Oct 2013, 21:12:06 UTC
Received	30 Jul 2013, 10:03:24 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1233671
Run time	12 days 18 hours 9 min 56 sec
CPU time	12 days 7 hours 26 min 54 sec
Validate state	Invalid
Credit	10,264.32
Device peak FLOPS	2.81 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
18:36:33 (6748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1816, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1816, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1816, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
01:00:12 (6008): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
01:00:01 (9700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2864, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
29 Jul 2013 13:28:34	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	855,360	1,038,412	1.2140
29 Jul 2013 13:28:34	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	829,440	1,003,037	1.2093
29 Jul 2013 13:28:34	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	803,520	967,723	1.2044
29 Jul 2013 13:28:34	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	777,600	932,550	1.1993
29 Jul 2013 13:28:34	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	751,680	897,322	1.1938
29 Jul 2013 13:28:34	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	725,760	862,043	1.1878
26 Jul 2013 18:25:10	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	699,840	826,834	1.1815
26 Jul 2013 11:33:14	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	673,920	802,390	1.1906
26 Jul 2013 04:56:02	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	648,000	778,438	1.2013
25 Jul 2013 22:17:58	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	622,080	754,648	1.2131
25 Jul 2013 15:43:12	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	596,160	730,597	1.2255
25 Jul 2013 09:00:03	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	570,240	706,708	1.2393
25 Jul 2013 02:22:03	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	544,320	682,828	1.2545
24 Jul 2013 19:41:45	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	518,400	658,934	1.2711
24 Jul 2013 13:06:55	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	492,480	635,159	1.2897
24 Jul 2013 06:27:20	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	466,560	611,396	1.3104
23 Jul 2013 23:49:56	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	440,640	587,548	1.3334
23 Jul 2013 22:14:09	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	414,720	563,549	1.3589
23 Jul 2013 22:04:13	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	388,800	529,352	1.3615
23 Jul 2013 21:54:31	1233671	15890617	hadcm3n_3f5k_1980_40_008335019_3	362,880	494,208	1.3619
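
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The page itself does not state this. It is an assumption inferred from the figures, and the short sketch below only checks it against three rows copied from the table. The function name `sec_per_timestep` is hypothetical and not part of BOINC or CPDN.

```python
# Rows copied from the trickle table above:
# (timestep, CPU time in seconds, reported Average in sec/TS).
rows = [
    (855_360, 1_038_412, 1.2140),
    (699_840, 826_834, 1.1815),
    (362_880, 494_208, 1.3619),
]


def sec_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in rows:
    computed = sec_per_timestep(timestep, cpu_seconds)
    # Print the computed value next to the value shown in the table.
    print(f"{timestep:>9,d}  computed={computed:.4f}  reported={reported:.4f}")
```

If the assumption holds, each computed value matches the reported one to four decimal places, which is the precision the table shows.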