Task 15828524

Name	hadcm3n_3gc2_2020_40_008390267_1
Workunit	8541126
Created	4 Jun 2013, 10:32:35 UTC
Sent	4 Jun 2013, 11:25:04 UTC
Report deadline	3 Sep 2013, 18:52:15 UTC
Received	1 Jul 2013, 22:28:08 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1050120
Run time	21 days 3 hours 1 min 33 sec
CPU time	21 days 3 hours 1 min 33 sec
Validate state	Invalid
Credit	5,598.72
Device peak FLOPS	2.54 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4084, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 12:31:20 (2000): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:39:19 (4956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:56:14 (5792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2684, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3380, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
02 Jul 2013 10:56:55	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	466,560	1,783,064	3.8217
02 Jul 2013 10:02:04	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	440,640	1,658,456	3.7637
27 Jun 2013 21:57:14	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	414,720	1,535,779	3.7032
26 Jun 2013 22:38:57	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	388,800	1,462,422	3.7614
26 Jun 2013 02:51:59	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	362,880	1,391,825	3.8355
25 Jun 2013 07:55:49	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	336,960	1,320,907	3.9201
24 Jun 2013 11:16:04	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	311,040	1,250,212	4.0195
23 Jun 2013 14:49:20	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	285,120	1,177,805	4.1309
22 Jun 2013 07:05:41	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	259,200	1,086,560	4.1920
22 Jun 2013 05:19:50	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	233,280	985,192	4.2232
22 Jun 2013 05:19:50	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	207,360	883,807	4.2622
22 Jun 2013 05:19:50	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	181,440	782,074	4.3104
22 Jun 2013 05:19:50	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	155,520	681,199	4.3801
22 Jun 2013 05:19:50	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	129,600	582,360	4.4935
22 Jun 2013 05:19:50	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	103,680	483,094	4.6595
13 Jun 2013 18:56:56	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	77,760	381,926	4.9116
12 Jun 2013 07:10:20	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	51,840	258,538	4.9872
10 Jun 2013 11:23:52	1050120	15828524	hadcm3n_3gc2_2020_40_008390267_1	25,920	108,959	4.2037