Task 15469380

Name	hadcm3n_zmcl_1880_40_008247831_4
Workunit	8402955
Created	3 Dec 2012, 2:18:02 UTC
Sent	3 Dec 2012, 2:18:13 UTC
Report deadline	4 Mar 2013, 9:45:24 UTC
Received	14 Feb 2013, 14:31:38 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	858731
Run time	15 days 6 hours 46 min 5 sec
CPU time	15 days 6 hours 46 min 5 sec
Validate state	Invalid
Credit	10,264.32
Device peak FLOPS	2.60 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:27:03 (3100): No heartbeat from core client for 30 sec - exiting
11:27:04 (3100): No heartbeat from core client for 30 sec - exiting
11:27:05 (3100): No heartbeat from core client for 30 sec - exiting
11:27:06 (3100): No heartbeat from core client for 30 sec - exiting
11:27:07 (3100): No heartbeat from core client for 30 sec - exiting
11:27:08 (3100): No heartbeat from core client for 30 sec - exiting
11:27:09 (3100): No heartbeat from core client for 30 sec - exiting
11:27:10 (3100): No heartbeat from core client for 30 sec - exiting
11:27:11 (3100): No heartbeat from core client for 30 sec - exiting
11:27:12 (3100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
13:24:47 (2616): No heartbeat from core client for 30 sec - exiting
13:24:49 (2616): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:45:12 (1780): No heartbeat from core client for 30 sec - exiting
15:45:13 (1780): No heartbeat from core client for 30 sec - exiting
15:45:14 (1780): No heartbeat from core client for 30 sec - exiting
15:45:15 (1780): No heartbeat from core client for 30 sec - exiting
15:45:16 (1780): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
18:08:00 (3408): No heartbeat from core client for 30 sec - exiting
18:08:01 (3408): No heartbeat from core client for 30 sec - exiting
18:08:02 (3408): No heartbeat from core client for 30 sec - exiting
18:08:03 (3408): No heartbeat from core client for 30 sec - exiting
18:08:04 (3408): No heartbeat from core client for 30 sec - exiting
18:08:06 (3408): No heartbeat from core client for 30 sec - exiting
18:08:07 (3408): No heartbeat from core client for 30 sec - exiting
18:08:08 (3408): No heartbeat from core client for 30 sec - exiting
18:08:09 (3408): No heartbeat from core client for 30 sec - exiting
18:08:10 (3408): No heartbeat from core client for 30 sec - exiting
18:08:11 (3408): No heartbeat from core client for 30 sec - exiting
18:08:12 (3408): No heartbeat from core client for 30 sec - exiting
18:08:13 (3408): No heartbeat from core client for 30 sec - exiting
18:08:14 (3408): No heartbeat from core client for 30 sec - exiting
18:08:15 (3408): No heartbeat from core client for 30 sec - exiting
18:08:17 (3408): No heartbeat from core client for 30 sec - exiting
18:08:18 (3408): No heartbeat from core client for 30 sec - exiting
18:08:19 (3408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:08:20 (3408): No heartbeat from core client for 30 sec - exiting
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=1
Model crash detected, will try to restart...
18:09:13 (3548): No heartbeat from core client for 30 sec - exiting
18:09:14 (3548): No heartbeat from core client for 30 sec - exiting
18:09:15 (3548): No heartbeat from core client for 30 sec - exiting
18:09:16 (3548): No heartbeat from core client for 30 sec - exiting
18:09:17 (3548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=552, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=552, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=552, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=552, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=16432, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=16432, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/ocean_restart.day after 11 attempts
CPDN Monitor - Quit request from BOINC...
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\Program Files\BOINC/projects/climateprediction.net/hadcm3n_zmcl_1880_40_008247831/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
10 Feb 2013 12:17:59	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	855,360	1,294,087	1.5129
10 Feb 2013 00:14:54	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	829,440	1,254,958	1.5130
07 Feb 2013 01:00:23	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	803,520	1,216,592	1.5141
05 Feb 2013 15:45:48	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	777,600	1,177,970	1.5149
05 Feb 2013 04:46:04	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	751,680	1,139,631	1.5161
04 Feb 2013 02:21:19	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	725,760	1,101,196	1.5173
31 Jan 2013 00:24:12	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	699,840	1,062,585	1.5183
29 Jan 2013 18:56:10	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	673,920	1,023,859	1.5193
26 Jan 2013 02:23:53	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	648,000	985,035	1.5201
25 Jan 2013 02:24:14	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	622,080	945,500	1.5199
23 Jan 2013 02:57:10	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	596,160	906,584	1.5207
21 Jan 2013 22:33:28	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	570,240	868,416	1.5229
20 Jan 2013 21:53:33	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	544,320	829,880	1.5246
20 Jan 2013 10:44:58	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	518,400	790,879	1.5256
19 Jan 2013 23:42:49	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	492,480	752,354	1.5277
19 Jan 2013 12:50:10	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	466,560	714,238	1.5309
19 Jan 2013 01:48:39	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	440,640	675,747	1.5336
18 Jan 2013 14:46:26	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	414,720	637,496	1.5372
14 Jan 2013 23:51:33	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	388,800	598,306	1.5389
14 Jan 2013 12:15:05	858731	15469380	hadcm3n_zmcl_1880_40_008247831_4	362,880	557,996	1.5377