Task 12278643

Name	hadam3p_eu_xoly_1981_1_006993902_0
Workunit	7197218
Created	24 Nov 2010, 10:18:17 UTC
Sent	23 Jan 2011, 11:21:42 UTC
Report deadline	5 Jan 2012, 16:41:42 UTC
Received	14 Feb 2011, 19:46:11 UTC
Server state	Over
Outcome	No reply
Client state	Compute error
Exit status	0 (0x00000000)
Computer ID	972664
Run time	6 days 19 hours 33 min 18 sec
CPU time	4 days 3 hours 53 min 12 sec
Validate state	Invalid
Credit	1,988.94
Device peak FLOPS	2.11 GFLOPS
Application version	UK Met Office HadAM3P-HadRM3P Europe v6.08 windows_intelx86
Stderr	<core_client_version>6.6.20</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6624, selfPID=5304, iMonCtr=1
Model crash detected, will try to restart...
09:24:21 (5540): No heartbeat from core client for 30 sec - exiting
09:24:22 (5540): No heartbeat from core client for 30 sec - exiting
09:24:23 (5540): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6740, selfPID=6740, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4708, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4280, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4148, selfPID=5756, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6072, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
15:37:11 (5604): No heartbeat from core client for 30 sec - exiting
15:37:12 (5604): No heartbeat from core client for 30 sec - exiting
15:37:13 (5604): No heartbeat from core client for 30 sec - exiting
15:37:14 (5604): No heartbeat from core client for 30 sec - exiting
15:37:15 (5604): No heartbeat from core client for 30 sec - exiting
15:37:16 (5604): No heartbeat from core client for 30 sec - exiting
15:37:17 (5604): No heartbeat from core client for 30 sec - exiting
15:37:18 (5604): No heartbeat from core client for 30 sec - exiting
15:37:19 (5604): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6692, selfPID=7052, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7984, selfPID=5672, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4824, selfPID=5952, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5796, selfPID=5868, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5904, selfPID=5280, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=11768, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5304, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6016, selfPID=5624, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6540, selfPID=5744, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5644, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
15:45:29 (5480): No heartbeat from core client for 30 sec - exiting
15:45:30 (5480): No heartbeat from core client for 30 sec - exiting
15:45:31 (5480): No heartbeat from core client for 30 sec - exiting
15:45:32 (5480): No heartbeat from core client for 30 sec - exiting
15:45:33 (5480): No heartbeat from core client for 30 sec - exiting
15:45:34 (5480): No heartbeat from core client for 30 sec - exiting
15:45:35 (5480): No heartbeat from core client for 30 sec - exiting
15:45:36 (5480): No heartbeat from core client for 30 sec - exiting
15:45:37 (5480): No heartbeat from core client for 30 sec - exiting
15:45:38 (5480): No heartbeat from core client for 30 sec - exiting
15:45:39 (5480): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2892, selfPID=6200, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5324, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5804, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7132, selfPID=5616, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1096, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_xoly_1981_1_006993902/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_xoly_1981_1_006993902/dataout/region_restart.day after 11 attempts
Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048
Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakg.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
20:36:27 (5632): called boinc_finish
</stderr_txt>
<message>
<file_xfer_error>
<file_name>hadam3p_eu_xoly_1981_1_006993902_0_11.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>hadam3p_eu_xoly_1981_1_006993902_0_12.zip</file_name>
<error_code>-161</error_code>
</file_xfer_error>
</message>
]]>

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
14 Feb 2011 00:11:00	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	115,296	344,075	2.9843
12 Feb 2011 16:21:05	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	103,776	311,795	3.0045
10 Feb 2011 19:54:34	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	92,256	278,696	3.0209
08 Feb 2011 15:32:22	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	80,736	244,787	3.0319
08 Feb 2011 04:04:20	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	69,216	212,036	3.0634
07 Feb 2011 16:10:57	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	57,696	178,368	3.0915
04 Feb 2011 18:57:49	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	46,176	144,737	3.1345
30 Jan 2011 21:03:25	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	34,656	111,111	3.2061
28 Jan 2011 19:42:58	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	23,136	78,338	3.3860
24 Jan 2011 13:17:35	972664	12278643	hadam3p_eu_xoly_1981_1_006993902_0	11,616	40,604	3.4955
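
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. The following is a minimal Python sketch that checks this reading against the rows above. The names `TRICKLES` and `seconds_per_timestep` are hypothetical, introduced only for illustration, and the formula is an assumption inferred from the table values, not taken from the project's own code.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
# Assumption: reported_avg = cpu_time_sec / timestep, rounded to 4 decimals.
TRICKLES = [
    (115_296, 344_075, 2.9843),
    (103_776, 311_795, 3.0045),
    (92_256, 278_696, 3.0209),
    (80_736, 244_787, 3.0319),
    (69_216, 212_036, 3.0634),
    (57_696, 178_368, 3.0915),
    (46_176, 144_737, 3.1345),
    (34_656, 111_111, 3.2061),
    (23_136, 78_338, 3.3860),
    (11_616, 40_604, 3.4955),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        computed = seconds_per_timestep(cpu_time_sec, timestep)
        # Compare the recomputed value with the figure shown in the table.
        print(f"{timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

Under this reading, the falling average (3.4955 down to 2.9843 sec/TS) means the later timesteps cost less CPU time than the early ones, since each row is a running total and not a per-interval figure.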