Task 15479782

Name	hadcm3n_zk9d_1880_40_008246589_4
Workunit	8401713
Created	15 Dec 2012, 14:20:51 UTC
Sent	15 Dec 2012, 14:20:58 UTC
Report deadline	16 Mar 2013, 21:48:09 UTC
Received	25 Dec 2012, 19:03:48 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1177742
Run time	8 days 16 hours 11 min 1 sec
CPU time	7 days 0 hours 39 min 3 sec
Validate state	Invalid
Credit	1,244.16
Device peak FLOPS	0.96 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-pc-linux-gnu
Stderr	<core_client_version>6.10.17</core_client_version>
<![CDATA[
<message>
process exited with code 22 (0x16, -234)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6389, iMonCtr=1
Model crash detected, will try to restart...
07:42:27 (6389): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:42:52 (6389): No heartbeat from core client for 30 sec - exiting
07:43:07 (6389): No heartbeat from core client for 30 sec - exiting
08:06:40 (8112): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8130, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8130, iMonCtr=1
Model crash detected, will try to restart...
08:25:43 (8130): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8358, iMonCtr=1
Model crash detected, will try to restart...
07:59:10 (8358): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:36:34 (9907): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9951, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9951, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10011, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10011, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10011, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10011, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10011, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10011, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
22 Dec 2012 18:56:56	1177742	15479782	hadcm3n_zk9d_1880_40_008246589_4	103,680	575,940	5.5550
20 Dec 2012 20:13:53	1177742	15479782	hadcm3n_zk9d_1880_40_008246589_4	77,760	433,973	5.5809
19 Dec 2012 00:55:31	1177742	15479782	hadcm3n_zk9d_1880_40_008246589_4	51,840	291,372	5.6206
17 Dec 2012 07:19:24	1177742	15479782	hadcm3n_zk9d_1880_40_008246589_4	25,920	144,680	5.5818
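
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. The sketch below recomputes it from the four rows above as a sanity check. The names `TRICKLES` and `average_sec_per_timestep` are hypothetical and used only for illustration, and the formula is inferred from the table values, not taken from the project's server code.

```python
# Rows copied from the trickle table above:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS)
TRICKLES = [
    (103_680, 575_940, 5.5550),
    (77_760, 433_973, 5.5809),
    (51_840, 291_372, 5.6206),
    (25_920, 144_680, 5.5818),
]


def average_sec_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Assumed formula: cumulative CPU seconds / cumulative timesteps, 4 decimals."""
    return round(cpu_seconds / timesteps, 4)


# Print the recomputed value next to the value reported on the page.
for timesteps, cpu_seconds, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_seconds, timesteps)
    print(f"{timesteps:>7,} TS  computed={computed:.4f}  reported={reported:.4f}")
```

Under this assumption every recomputed value agrees with the reported column, which suggests the average is cumulative from the start of the run and not measured per trickle interval.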