Task 15633970

Name	hadcm3n_z8v2_1960_40_008319568_0
Workunit	8470703
Created	24 Feb 2013, 10:09:53 UTC
Sent	24 Feb 2013, 10:10:00 UTC
Report deadline	26 May 2013, 17:37:11 UTC
Received	15 May 2013, 0:58:48 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1125445
Run time	25 days 14 hours 58 min 15 sec
CPU time	23 days 2 hours 46 min 31 sec
Validate state	Invalid
Credit	12,441.60
Device peak FLOPS	2.64 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6964, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
08:41:19 (5188): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2768, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4140, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5008, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3616, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5940, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4736, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4740, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4232, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4996, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4544, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4828, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4808, iMonCtr=1
Model crash detected, will try to restart...
17:26:15 (4684): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:26:16 (4684): No heartbeat from core client for 30 sec - exiting
17:26:17 (4684): No heartbeat from core client for 30 sec - exiting
17:26:18 (4684): No heartbeat from core client for 30 sec - exiting
17:26:19 (4684): No heartbeat from core client for 30 sec - exiting
17:26:21 (4684): No heartbeat from core client for 30 sec - exiting
17:26:22 (4684): No heartbeat from core client for 30 sec - exiting
17:26:23 (4684): No heartbeat from core client for 30 sec - exiting
17:26:24 (4684): No heartbeat from core client for 30 sec - exiting
17:26:25 (4684): No heartbeat from core client for 30 sec - exiting
17:26:26 (4684): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5004, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4792, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>
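The stderr output above repeats a small set of message types, so a tally can make the failure pattern easier to read. The following is a minimal sketch, not part of BOINC or the CPDN application: the function name `tally_stderr`, the category labels, and the regular expressions are all assumptions chosen for illustration, and the sample string is a short excerpt of messages taken from the log rather than the full text.

```python
import re
from collections import Counter

# Hypothetical category labels mapped to patterns matching the message
# texts seen in the stderr output above (illustrative, not an official list).
PATTERNS = {
    "quit_request": r"Quit request from BOINC",
    "suspend_request": r"Suspend request from BOINC",
    "model_crash": r"Model crash detected",
    "no_heartbeat": r"No heartbeat from core client",
    "signal": r"Signal \d+ received",
}


def tally_stderr(text: str) -> Counter:
    """Count how often each message category occurs in a stderr dump."""
    return Counter(
        {name: len(re.findall(pattern, text)) for name, pattern in PATTERNS.items()}
    )


# Short excerpt assembled from messages in the log above, for demonstration only.
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=6964, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "08:41:19 (5188): No heartbeat from core client for 30 sec - exiting "
    "Signal 11 received, exiting... Called boinc_finish"
)

counts = tally_stderr(sample)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Because the patterns use plain substring matching, the same function works whether the log has one message per line or has been flattened onto a single line.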

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
15 May 2013 00:02:26	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	1,036,800	1,997,187	1.9263
14 May 2013 03:26:41	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	1,010,880	1,954,125	1.9331
12 May 2013 09:16:15	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	984,960	1,910,107	1.9393
11 May 2013 08:02:51	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	959,040	1,866,483	1.9462
09 May 2013 05:50:12	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	933,120	1,820,058	1.9505
05 May 2013 12:10:46	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	907,200	1,775,213	1.9568
04 May 2013 12:26:57	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	881,280	1,733,203	1.9667
03 May 2013 02:46:51	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	855,360	1,689,996	1.9758
01 May 2013 06:23:10	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	829,440	1,645,318	1.9836
29 Apr 2013 03:21:05	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	803,520	1,601,652	1.9933
28 Apr 2013 02:44:16	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	777,600	1,557,392	2.0028
27 Apr 2013 13:40:07	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	751,680	1,512,569	2.0123
25 Apr 2013 04:35:43	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	725,760	1,467,005	2.0213
22 Apr 2013 01:21:16	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	699,840	1,422,610	2.0328
21 Apr 2013 10:42:39	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	673,920	1,375,492	2.0410
20 Apr 2013 20:32:37	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	648,000	1,328,046	2.0495
18 Apr 2013 11:47:36	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	622,080	1,277,199	2.0531
15 Apr 2013 05:49:26	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	596,160	1,224,187	2.0535
14 Apr 2013 13:03:02	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	570,240	1,171,000	2.0535
13 Apr 2013 14:20:35	1125445	15633970	hadcm3n_z8v2_1960_40_008319568_0	544,320	1,117,431	2.0529
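
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table above; the function name `average_sec_per_timestep` is hypothetical, and the rounding rule is an assumption inferred from the displayed values rather than from any project documentation.

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Assumption: this mirrors how the "Average (sec/TS)" column is derived.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the table above.
rows = [
    (1_036_800, 1_997_187, 1.9263),
    (1_010_880, 1_954_125, 1.9331),
    (544_320, 1_117_431, 2.0529),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"timestep={timestep:>9,}  computed={computed:.4f}  reported={reported:.4f}")
```

Since the average is cumulative, its slow decline from 2.0529 to 1.9263 suggests that the most recent timesteps ran somewhat faster than the earlier ones.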