Task 15800624

Name	hadcm3n_4lqf_1940_40_008306309_3
Workunit	8457444
Created	29 May 2013, 12:54:41 UTC
Sent	29 May 2013, 12:55:09 UTC
Report deadline	28 Aug 2013, 20:22:20 UTC
Received	11 Jul 2013, 11:47:47 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	193 (0x000000C1) EXIT_SIGNAL
Computer ID	1127622
Run time	8 days 13 hours 21 min 2 sec
CPU time	8 days 11 hours 14 min 48 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	2.28 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6108, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5264, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5264, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
13:27:23 (3252): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
08:20:27 (5392): No heartbeat from core client for 30 sec - exiting
08:20:28 (5392): No heartbeat from core client for 30 sec - exiting
08:20:29 (5392): No heartbeat from core client for 30 sec - exiting
08:20:30 (5392): No heartbeat from core client for 30 sec - exiting
08:20:31 (5392): No heartbeat from core client for 30 sec - exiting
08:20:32 (5392): No heartbeat from core client for 30 sec - exiting
08:20:33 (5392): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:24:09 (8320): Can't acquire lockfile (32) - waiting 35s
11:24:31 (3472): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:55:27 (12764): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12292, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
10:00:15 (3124): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2432, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5068, iMonCtr=1
Model crash detected, will try to restart...
17:16:27 (2636): Can't acquire lockfile (32) - waiting 35s
17:16:28 (5068): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:02:52 (6724): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13912, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13912, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:38:58 (6376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:11:33 (7468): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7052, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
16:30:49 (6452): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:03:59 (5732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
11 Jul 2013 10:36:39	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	518,400	731,681	1.4114
09 Jul 2013 08:36:58	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	492,480	696,344	1.4140
04 Jul 2013 14:27:43	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	466,560	660,293	1.4152
02 Jul 2013 12:07:09	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	440,640	622,928	1.4137
26 Jun 2013 12:11:10	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	414,720	586,162	1.4134
25 Jun 2013 07:15:35	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	388,800	547,726	1.4088
17 Jun 2013 11:52:12	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	362,880	507,646	1.3989
14 Jun 2013 12:44:59	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	336,960	467,410	1.3871
12 Jun 2013 11:09:05	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	311,040	429,664	1.3814
11 Jun 2013 11:49:34	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	285,120	393,148	1.3789
10 Jun 2013 15:56:05	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	259,200	357,193	1.3781
08 Jun 2013 10:59:33	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	233,280	322,129	1.3809
07 Jun 2013 12:17:00	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	207,360	286,816	1.3832
06 Jun 2013 10:52:09	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	181,440	251,186	1.3844
05 Jun 2013 12:32:04	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	155,520	215,523	1.3858
04 Jun 2013 15:17:47	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	129,600	180,848	1.3954
03 Jun 2013 20:36:48	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	103,680	145,340	1.4018
03 Jun 2013 10:19:56	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	77,760	108,783	1.3990
31 May 2013 10:31:57	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	51,840	72,929	1.4068
30 May 2013 08:03:46	1127622	15800624	hadcm3n_4lqf_1940_40_008306309_3	25,920	36,423	1.4052