Task 13586268

Name	hadcm3n_y9uq_1900_40_007521311_2
Workunit	7718786
Created	2 Nov 2011, 19:05:57 UTC
Sent	2 Nov 2011, 19:21:15 UTC
Report deadline	2 Feb 2012, 2:48:26 UTC
Received	19 Nov 2011, 1:58:34 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	893745
Run time	8 days 11 hours 47 min 11 sec
CPU time	7 days 11 hours 46 min 17 sec
Validate state	Invalid
Credit	5,287.68
Device peak FLOPS	2.76 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5020, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5708, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5432, iMonCtr=1
Model crash detected, will try to restart...
19:02:56 (5176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4980, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5212, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5212, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5212, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5212, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4308, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4308, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4308, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
18 Nov 2011 10:54:44	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	440,640	641,481	1.4558
17 Nov 2011 21:55:58	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	414,720	603,885	1.4561
15 Nov 2011 17:40:15	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	388,800	566,341	1.4566
15 Nov 2011 17:40:15	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	362,880	529,002	1.4578
15 Nov 2011 17:40:15	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	336,960	491,332	1.4581
15 Nov 2011 17:40:15	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	311,040	453,130	1.4568
15 Nov 2011 17:40:15	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	285,120	415,373	1.4568
15 Nov 2011 17:40:14	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	259,200	377,396	1.4560
15 Nov 2011 17:40:14	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	233,280	339,610	1.4558
15 Nov 2011 17:40:14	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	207,360	301,842	1.4556
09 Nov 2011 19:32:03	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	181,440	263,893	1.4544
06 Nov 2011 16:12:39	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	155,520	225,912	1.4526
06 Nov 2011 04:06:48	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	129,600	187,833	1.4493
05 Nov 2011 17:04:24	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	103,680	149,832	1.4451
05 Nov 2011 05:34:08	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	77,760	112,122	1.4419
03 Nov 2011 20:46:20	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	51,840	74,717	1.4413
03 Nov 2011 16:01:37	893745	13586268	hadcm3n_y9uq_1900_40_007521311_2	25,920	37,072	1.4302
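
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is an inference from the rows above, not something the page states. A minimal Python sketch that checks it against a few of the trickles listed (the function and variable names are illustrative, not part of BOINC or CPDN):

```python
# Selected rows from the trickle table: (timestep, cpu_time_sec, reported_avg).
TRICKLES = [
    (440_640, 641_481, 1.4558),
    (414_720, 603_885, 1.4561),
    (181_440, 263_893, 1.4544),
    (25_920, 37_072, 1.4302),
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


def matches_reported(timestep: int, cpu_time_sec: int, reported: float,
                     tolerance: float = 1e-4) -> bool:
    """True when the recomputed average agrees with the reported one.

    The tolerance allows for the page showing only four decimal places.
    """
    return abs(average_sec_per_timestep(cpu_time_sec, timestep) - reported) < tolerance


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        computed = average_sec_per_timestep(cpu, timestep)
        print(f"{timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

Under this assumption the slow rise in the average across trickles (from about 1.43 to about 1.456 sec/TS) means later timesteps cost slightly more CPU time than early ones.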