Task 16420338

Name	hadam3p_anz_nbvf_2012_1_008603003_0
Workunit	8749515
Created	26 Mar 2014, 19:37:43 UTC
Sent	26 Mar 2014, 20:58:53 UTC
Report deadline	9 Mar 2015, 2:18:53 UTC
Received	23 Jun 2014, 16:55:33 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	0 (0x00000000)
Computer ID	1115566
Run time	16 days 20 hours 28 min 19 sec
CPU time	13 days 21 hours 43 min 57 sec
Validate state	Invalid
Credit	5,477.92
Device peak FLOPS	1.40 GFLOPS
Application version	UK Met Office HadAM3P-HadRM3P Australia New Zealand v6.10 windows_intelx86
Stderr	<core_client_version>7.2.42</core_client_version> <![CDATA[ <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5224, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6360, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5008, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2684, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... GCPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1348, selfPID=1348, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2900, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4704, selfPID=4704, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Regional Worker:CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=440, selfPID=440, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1596, selfPID=1596, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3208, selfPID=4284, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4740, selfPID=4380, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4288, iMonCtr=2 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5888, selfPID=4152, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4220, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5812, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4168, selfPID=3676, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5020, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6556, selfPID=5000, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4440, selfPID=4196, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5620, selfPID=1156, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4936, selfPID=4640, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4468, selfPID=1136, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5136, selfPID=4604, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5888, iMonCtr=2 Model crash detected, will try to restart... 
18:11:11 (5636): No heartbeat from core client for 30 sec - exiting
18:11:12 (5636): No heartbeat from core client for 30 sec - exiting
18:11:13 (5636): No heartbeat from core client for 30 sec - exiting
18:11:14 (5636): No heartbeat from core client for 30 sec - exiting
18:11:16 (5636): No heartbeat from core client for 30 sec - exiting
18:11:17 (5636): No heartbeat from core client for 30 sec - exiting
18:11:18 (5636): No heartbeat from core client for 30 sec - exiting
18:11:19 (5636): No heartbeat from core client for 30 sec - exiting
18:11:20 (5636): No heartbeat from core client for 30 sec - exiting
18:11:21 (5636): No heartbeat from core client for 30 sec - exiting
18:11:22 (5636): No heartbeat from core client for 30 sec - exiting
18:11:23 (5636): No heartbeat from core client for 30 sec - exiting
18:11:24 (5636): No heartbeat from core client for 30 sec - exiting
18:11:25 (5636): No heartbeat from core client for 30 sec - exiting
18:11:26 (5636): No heartbeat from core client for 30 sec - exiting
18:11:28 (5636): No heartbeat from core client for 30 sec - exiting
18:11:29 (5636): No heartbeat from core client for 30 sec - exiting
18:11:30 (5636): No heartbeat from core client for 30 sec - exiting
18:11:31 (5636): No heartbeat from core client for 30 sec - exiting
18:11:32 (5636): No heartbeat from core client for 30 sec - exiting
18:11:33 (5636): No heartbeat from core client for 30 sec - exiting
18:11:34 (5636): No heartbeat from core client for 30 sec - exiting
18:11:35 (5636): No heartbeat from core client for 30 sec - exiting
18:11:36 (5636): No heartbeat from core client for 30 sec - exiting
18:11:37 (5636): No heartbeat from core client for 30 sec - exiting
18:11:38 (5636): No heartbeat from core client for 30 sec - exiting
18:11:40 (5636): No heartbeat from core client for 30 sec - exiting
18:11:41 (5636): No heartbeat from core client for 30 sec - exiting
18:11:42 (5636): No heartbeat from core client for 30 sec - exiting
18:11:43 (5636): No heartbeat from core client for 30 sec - exiting
18:11:44 (5636): No heartbeat from core client for 30 sec - exiting
18:11:45 (5636): No heartbeat from core client for 30 sec - exiting
18:11:46 (5636): No heartbeat from core client for 30 sec - exiting
18:11:47 (5636): No heartbeat from core client for 30 sec - exiting
18:11:48 (5636): No heartbeat from core client for 30 sec - exiting
18:11:49 (5636): No heartbeat from core client for 30 sec - exiting
18:11:51 (5636): No heartbeat from core client for 30 sec - exiting
18:11:52 (5636): No heartbeat from core client for 30 sec - exiting
18:11:53 (5636): No heartbeat from core client for 30 sec - exiting
18:11:54 (5636): No heartbeat from core client for 30 sec - exiting
18:11:55 (5636): No heartbeat from core client for 30 sec - exiting
18:11:56 (5636): No heartbeat from core client for 30 sec - exiting
18:11:57 (5636): No heartbeat from core client for 30 sec - exiting
18:11:58 (5636): No heartbeat from core client for 30 sec - exiting
18:11:59 (5636): No heartbeat from core client for 30 sec - exiting
18:12:01 (5636): No heartbeat from core client for 30 sec - exiting
18:12:02 (5636): No heartbeat from core client for 30 sec - exiting
18:12:03 (5636): No heartbeat from core client for 30 sec - exiting
18:12:04 (5636): No heartbeat from core client for 30 sec - exiting
18:12:05 (5636): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5332, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3780, selfPID=4216, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5764, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1216, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10044, selfPID=10044, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4688, selfPID=4688, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=400, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4924, selfPID=4924, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3984, selfPID=3984, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1692, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=1744, iMonCtr=1 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3984, selfPID=1464, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_anz_nbvf_2012_1_008603003_0_12.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
17 Jun 2014 17:42:14	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	127,019	1,138,238	8.9612
05 Jun 2014 19:54:49	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	115,499	1,038,838	8.9943
25 May 2014 14:07:59	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	103,979	937,524	9.0165
15 May 2014 18:16:15	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	92,459	833,224	9.0118
12 May 2014 07:37:33	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	80,939	728,135	8.9961
05 May 2014 16:00:37	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	69,419	623,412	8.9804
27 Apr 2014 20:36:25	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	57,899	518,348	8.9526
20 Apr 2014 15:54:40	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	46,379	414,900	8.9459
17 Apr 2014 11:24:44	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	34,859	310,767	8.9150
06 Apr 2014 20:55:46	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	23,339	207,821	8.9045
01 Apr 2014 01:48:11	1115566	16420338	hadam3p_anz_nbvf_2012_1_008603003_0	11,819	103,989	8.7985
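
The Average (sec/TS) column above appears to be the cumulative CPU time divided by the timestep count. The Python sketch below checks that reading against the rows of the table; the function name `average_sec_per_timestep` is hypothetical, and the formula is inferred from the figures, not taken from project documentation.

```python
# Rows copied from the trickle table, oldest first:
# (timestep, cumulative CPU time in seconds, reported average in sec/TS).
trickles = [
    (11_819, 103_989, 8.7985),
    (23_339, 207_821, 8.9045),
    (34_859, 310_767, 8.9150),
    (46_379, 414_900, 8.9459),
    (57_899, 518_348, 8.9526),
    (69_419, 623_412, 8.9804),
    (80_939, 728_135, 8.9961),
    (92_459, 833_224, 9.0118),
    (103_979, 937_524, 9.0165),
    (115_499, 1_038_838, 8.9943),
    (127_019, 1_138_238, 8.9612),
]


def average_sec_per_timestep(cpu_seconds, timestep):
    """Assumed formula: cumulative CPU seconds divided by timesteps completed."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in trickles:
    computed = average_sec_per_timestep(cpu_seconds, timestep)
    # Print the computed value next to the reported one for comparison.
    print(f"{timestep:>7,}  computed {computed:.4f}  reported {reported:.4f}")
```

Under this assumption, each computed value agrees with the reported figure to four decimal places.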