Task 13108504

Name	hadcm3n_yf0j_1900_40_007352317_1
Workunit	7549747
Created	6 Jul 2011, 14:19:43 UTC
Sent	15 Jul 2011, 23:18:33 UTC
Report deadline	15 Oct 2011, 6:45:44 UTC
Received	24 Aug 2011, 0:12:48 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1001784
Run time	13 days 8 hours 9 min 37 sec
CPU time	12 days 7 hours 41 min 36 sec
Validate state	Invalid
Credit	4,043.52
Device peak FLOPS	1.86 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3532, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3400, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3036, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3224, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3164, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3260, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3492, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1668, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4112, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4112, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4616, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4616, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3080, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
12 Aug 2011 14:09:43	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	336,960	1,041,072	3.0896
11 Aug 2011 00:55:08	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	311,040	960,615	3.0884
09 Aug 2011 17:55:58	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	285,120	880,967	3.0898
02 Aug 2011 16:00:12	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	259,200	799,612	3.0849
31 Jul 2011 14:38:30	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	233,280	719,566	3.0846
30 Jul 2011 15:04:37	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	207,360	639,610	3.0845
29 Jul 2011 00:38:29	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	181,440	559,914	3.0859
27 Jul 2011 18:51:56	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	155,520	480,865	3.0920
25 Jul 2011 22:59:38	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	129,600	400,269	3.0885
25 Jul 2011 21:12:46	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	103,680	320,572	3.0919
25 Jul 2011 19:38:47	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	77,760	240,850	3.0974
25 Jul 2011 19:22:39	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	51,840	160,441	3.0949
25 Jul 2011 18:16:49	1001784	13108504	hadcm3n_yf0j_1900_40_007352317_1	25,920	80,194	3.0939
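
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table above. The function name and the `TRICKLES` list are illustrative only and are not part of BOINC or the CPDN software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
# TRICKLES and average_sec_per_timestep are hypothetical names for this sketch.
TRICKLES = [
    (336_960, 1_041_072, 3.0896),
    (181_440, 559_914, 3.0859),
    (25_920, 80_194, 3.0939),
]


def average_sec_per_timestep(cpu_seconds, timesteps):
    """Cumulative CPU seconds per cumulative timestep, rounded to 4 places."""
    return round(cpu_seconds / timesteps, 4)


for timestep, cpu_seconds, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_seconds, timestep)
    # Print the recomputed value next to the reported one, plus whether they agree.
    print(timestep, computed, reported, abs(computed - reported) < 1e-4)
```

If this reading is right, the near-constant average of about 3.09 sec/TS suggests the model ran at a steady rate up to the last trickle received.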