Task 13887666

Name	hadcm3n_t0ru_1940_40_007442764_4
Workunit	7640267
Created	8 Jan 2012, 3:12:26 UTC
Sent	8 Jan 2012, 3:12:56 UTC
Report deadline	8 Apr 2012, 10:40:07 UTC
Received	10 Feb 2012, 18:17:42 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	851990
Run time	6 days 18 hours 16 min 13 sec
CPU time	6 days 18 hours 16 min 13 sec
Validate state	Invalid
Credit	4,043.52
Device peak FLOPS	2.53 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Ocean Restart file copy failed on t0ruko.dae1ai0
Ocean Restart file copy failed on t0ruko.dae22d0
Ocean Restart file copy failed on t0ruko.dae22j0
Atmos Hold Restart file rename failed on atmos_restart.hold
20:36:27 (9820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:36:29 (9820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
03:44:56 (11684): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Ocean Restart file copy failed on t0ruko.dae76b0
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3464, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
06 Feb 2012 06:11:37	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	336,960	548,985	1.6292
04 Feb 2012 22:03:20	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	311,040	507,135	1.6304
03 Feb 2012 14:09:44	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	285,120	465,171	1.6315
02 Feb 2012 08:57:04	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	259,200	423,370	1.6334
31 Jan 2012 22:37:47	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	233,280	381,371	1.6348
29 Jan 2012 17:32:21	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	207,360	338,421	1.6320
27 Jan 2012 18:10:22	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	181,440	296,207	1.6325
27 Jan 2012 00:59:55	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	155,520	255,857	1.6452
25 Jan 2012 09:52:01	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	129,600	213,532	1.6476
23 Jan 2012 04:44:20	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	103,680	173,753	1.6759
22 Jan 2012 03:55:14	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	77,760	129,850	1.6699
21 Jan 2012 04:44:40	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	51,840	87,182	1.6818
19 Jan 2012 15:38:23	851990	13887666	hadcm3n_t0ru_1940_40_007442764_4	25,920	41,974	1.6194
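
The Average (sec/TS) column in the table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimals. The sketch below checks that reading against two rows of the table. The function names `parse_count` and `average_sec_per_timestep` are hypothetical helpers introduced only for illustration, and the rounding rule is an assumption inferred from the values shown, not something the page states.

```python
def parse_count(text: str) -> int:
    """Convert a thousands-separated figure such as '336,960' to an int."""
    return int(text.replace(",", ""))


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: the page derives its Average (sec/TS) column this way.
    """
    return round(cpu_time_sec / timestep, 4)


# (Timestep, CPU Time) pairs copied from the newest and oldest trickle rows.
rows = [("336,960", "548,985"), ("25,920", "41,974")]

for timestep_text, cpu_text in rows:
    timestep = parse_count(timestep_text)
    cpu_time = parse_count(cpu_text)
    print(timestep, cpu_time, average_sec_per_timestep(cpu_time, timestep))
```

If the assumption holds, the printed averages match the 1.6292 and 1.6194 values listed for those two rows.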