Task 12744782

Name	hadcm3n_o4zf_1900_40_007201790_0
Workunit	7400070
Created	28 Mar 2011, 14:12:08 UTC
Sent	30 Mar 2011, 2:42:32 UTC
Report deadline	29 Jun 2011, 10:09:43 UTC
Received	21 Jun 2011, 4:07:47 UTC
Server state	Over
Outcome	Success
Client state	Done
Exit status	0 (0x00000000)
Computer ID	1072494
Run time	17 days 20 hours 22 min 13 sec
CPU time	14 days 12 hours 50 min 2 sec
Validate state	Valid
Credit	12,441.60
Device peak FLOPS	3.30 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.60</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1
Model crash detected, will try to restart...
12:32:07 (5848): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:32:08 (5848): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
18:55:10 (6260): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
12:18:22 (5276): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6116, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6116, iMonCtr=1
Model crash detected, will try to restart...
18:43:04 (6096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:55:58 (2156): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1320, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4444, iMonCtr=1
Model crash detected, will try to restart...
20:38:19 (3296): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5764, iMonCtr=1
Model crash detected, will try to restart...
21:48:15 (5276): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
18:55:30 (6936): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:35:08 (6168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:53:35 (700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6844, iMonCtr=1
Model crash detected, will try to restart...
16:43:33 (644): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:43:34 (644): No heartbeat from core client for 30 sec - exiting
16:43:35 (644): No heartbeat from core client for 30 sec - exiting
16:43:36 (644): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6856, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6856, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
CCPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3300, iMonCtr=1
Model crash detected, will try to restart...
Atmos Hold Restart file rename failed on atmos_restart.hold
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:48:10 (5620): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=740, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=740, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5456, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5112, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3460, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5096, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6304, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6304, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:54:01 (4448): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2612, iMonCtr=1
Model crash detected, will try to restart...
17:57:57 (6064): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5048, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6460, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3280, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3280, iMonCtr=1
Model crash detected, will try to restart...
12:25:44 (6844): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3676, iMonCtr=1
Model crash detected, will try to restart...
Called boinc_finish
</stderr_txt>
]]>
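
The Run time and CPU time fields above are given as day/hour/minute/second strings, which makes them awkward to compare with the per-trickle CPU seconds further down. A minimal sketch of converting them to seconds follows. The helper name `duration_to_seconds` is hypothetical, and reading the ratio as the share of elapsed time spent on the CPU is an assumption, not something the page states.

```python
import re

# Seconds per unit, using the unit spellings that appear on this page.
UNIT_SECONDS = {"days": 86400, "hours": 3600, "min": 60, "sec": 1}


def duration_to_seconds(text):
    """Convert a string like '17 days 20 hours 22 min 13 sec' to seconds.

    Hypothetical helper for illustration; it only understands the four
    units listed in UNIT_SECONDS.
    """
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(days|hours|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


# Values copied from the Run time and CPU time fields of this task.
run_time = duration_to_seconds("17 days 20 hours 22 min 13 sec")
cpu_time = duration_to_seconds("14 days 12 hours 50 min 2 sec")

# Assumed interpretation: CPU seconds divided by elapsed seconds.
cpu_fraction = cpu_time / run_time

print(run_time, cpu_time, round(cpu_fraction, 4))
```

Under these assumptions, the CPU time comes to 1,255,802 seconds, slightly more than the 1,255,364 seconds reported by the last trickle listed below.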

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
20 Jun 2011 03:43:19	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	1,036,800	1,255,364	1.2108
19 Jun 2011 22:01:09	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	1,010,880	1,225,272	1.2121
16 Jun 2011 10:41:15	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	984,960	1,194,791	1.2130
14 Jun 2011 13:03:24	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	959,040	1,164,843	1.2146
10 Jun 2011 05:24:14	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	933,120	1,135,325	1.2167
08 Jun 2011 04:27:02	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	907,200	1,105,196	1.2182
07 Jun 2011 05:02:25	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	881,280	1,075,719	1.2206
05 Jun 2011 03:14:11	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	855,360	1,046,197	1.2231
02 Jun 2011 14:11:31	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	829,440	1,015,861	1.2248
30 May 2011 22:20:16	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	803,520	986,076	1.2272
29 May 2011 03:16:11	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	777,600	955,800	1.2292
28 May 2011 10:35:09	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	751,680	926,069	1.2320
26 May 2011 09:38:30	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	725,760	895,462	1.2338
25 May 2011 12:55:42	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	699,840	865,167	1.2362
23 May 2011 04:30:33	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	673,920	834,450	1.2382
21 May 2011 04:02:34	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	648,000	803,132	1.2394
20 May 2011 02:58:53	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	622,080	772,553	1.2419
17 May 2011 10:15:36	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	596,160	742,883	1.2461
15 May 2011 12:29:31	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	570,240	712,416	1.2493
14 May 2011 10:00:26	1072494	12744782	hadcm3n_o4zf_1900_40_007201790_0	544,320	681,153	1.2514
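
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the table (the newest two and the oldest). The variable names are illustrative, and the 1e-4 tolerance is an assumption chosen to match the four decimal places shown.

```python
# (time sent, timestep, cpu_time_sec, reported_average) copied from the table.
rows = [
    ("20 Jun 2011 03:43:19", 1_036_800, 1_255_364, 1.2108),
    ("19 Jun 2011 22:01:09", 1_010_880, 1_225_272, 1.2121),
    ("14 May 2011 10:00:26", 544_320, 681_153, 1.2514),
]

computed = []
for sent, timestep, cpu_sec, reported in rows:
    # Assumed definition: cumulative CPU seconds per cumulative timestep.
    average = cpu_sec / timestep
    computed.append(average)
    matches = abs(average - reported) < 1e-4
    print(f"{sent}  computed={average:.4f}  reported={reported:.4f}  match={matches}")

# Consecutive trickles in the table are spaced by a fixed number of timesteps.
trickle_interval = rows[0][1] - rows[1][1]
print("timesteps between the two newest trickles:", trickle_interval)
```

The same division can be applied to any other row; the listed rows are each 25,920 timesteps apart.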