Name | hadam3p_eu_wr43_1962_1_006878491_1 |
Workunit | 7081807 |
Created | 24 Feb 2012, 16:46:13 UTC |
Sent | 24 Feb 2012, 16:52:52 UTC |
Report deadline | 5 Feb 2013, 22:12:52 UTC |
Received | 21 Mar 2012, 10:32:08 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1170519 |
Run time | 4 days 11 hours 11 min 24 sec |
CPU time | 3 days 22 hours 11 min 44 sec |
Validate state | Invalid |
Credit | 1,988.94 |
Device peak FLOPS | 2.49 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1516, selfPID=3684, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2800, selfPID=4968, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5352, iMonCtr=2 Model crash detected, will try to restart... G14:52:55 (5000): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:53:34 (5000): No heartbeat from core client for 30 sec - exiting 14:53:35 (5000): No heartbeat from core client for 30 sec - exiting 14:53:36 (5000): No heartbeat from core client for 30 sec - exiting 14:53:37 (5000): No heartbeat from core client for 30 sec - exiting 14:53:38 (5000): No heartbeat from core client for 30 sec - exiting 14:53:39 (5000): No heartbeat from core client for 30 sec - exiting 14:53:40 (5000): No heartbeat from core client for 30 sec - exiting 14:53:41 (5000): No heartbeat from core client for 30 sec - exiting 14:53:42 (5000): No heartbeat from core client for 30 sec - exiting 14:53:43 (5000): No heartbeat from core client for 30 sec - exiting 14:53:44 (5000): No heartbeat from core client for 30 sec - exiting 14:53:45 (5000): No heartbeat from core client for 30 sec - exiting 14:53:46 (5000): No heartbeat from core client for 30 sec - exiting 14:53:47 (5000): No heartbeat from core client for 30 sec - exiting 14:53:48 (5000): No heartbeat from core client for 30 sec - exiting 14:53:49 (5000): No heartbeat from core client for 30 sec - exiting 14:53:50 (5000): No heartbeat from core client for 30 sec - exiting 14:53:51 (5000): No heartbeat 
from core client for 30 sec - exiting 14:53:52 (5000): No heartbeat from core client for 30 sec - exiting 14:53:53 (5000): No heartbeat from core client for 30 sec - exiting 14:53:54 (5000): No heartbeat from core client for 30 sec - exiting 14:53:55 (5000): No heartbeat from core client for 30 sec - exiting 14:53:56 (5000): No heartbeat from core client for 30 sec - exiting 14:53:57 (5000): No heartbeat from core client for 30 sec - exiting 14:53:58 (5000): No heartbeat from core client for 30 sec - exiting 14:53:59 (5000): No heartbeat from core client for 30 sec - exiting 14:54:00 (5000): No heartbeat from core client for 30 sec - exiting 14:54:01 (5000): No heartbeat from core client for 30 sec - exiting 14:54:02 (5000): No heartbeat from core client for 30 sec - exiting 14:54:03 (5000): No heartbeat from core client for 30 sec - exiting 14:54:04 (5000): No heartbeat from core client for 30 sec - exiting 14:54:05 (5000): No heartbeat from core client for 30 sec - exiting 14:54:06 (5000): No heartbeat from core client for 30 sec - exiting 14:54:07 (5000): No heartbeat from core client for 30 sec - exiting 14:54:08 (5000): No heartbeat from core client for 30 sec - exiting 14:54:09 (5000): No heartbeat from core client for 30 sec - exiting 14:54:10 (5000): No heartbeat from core client for 30 sec - exiting 14:54:11 (5000): No heartbeat from core client for 30 sec - exiting 14:54:12 (5000): No heartbeat from core client for 30 sec - exiting 14:54:13 (5000): No heartbeat from core client for 30 sec - exiting 14:54:14 (5000): No heartbeat from core client for 30 sec - exiting 14:54:15 (5000): No heartbeat from core client for 30 sec - exiting 14:54:16 (5000): No heartbeat from core client for 30 sec - exiting 14:54:17 (5000): No heartbeat from core client for 30 sec - exiting 14:54:18 (5000): No heartbeat from core client for 30 sec - exiting 14:54:19 (5000): No heartbeat from core client for 30 sec - exiting 14:54:20 (5000): No heartbeat from core client for 30 sec 
- exiting 14:54:21 (5000): No heartbeat from core client for 30 sec - exiting 14:54:22 (5000): No heartbeat from core client for 30 sec - exiting 14:54:23 (5000): No heartbeat from core client for 30 sec - exiting 14:54:24 (5000): No heartbeat from core client for 30 sec - exiting 14:54:25 (5000): No heartbeat from core client for 30 sec - exiting 14:54:26 (5000): No heartbeat from core client for 30 sec - exiting 14:54:27 (5000): No heartbeat from core client for 30 sec - exiting 14:54:28 (5000): No heartbeat from core client for 30 sec - exiting 14:54:29 (5000): No heartbeat from core client for 30 sec - exiting 14:54:30 (5000): No heartbeat from core client for 30 sec - exiting 14:54:31 (5000): No heartbeat from core client for 30 sec - exiting 14:54:32 (5000): No heartbeat from core client for 30 sec - exiting 14:54:33 (5000): No heartbeat from core client for 30 sec - exiting 14:54:34 (5000): No heartbeat from core client for 30 sec - exiting 14:54:35 (5000): No heartbeat from core client for 30 sec - exiting 14:54:36 (5000): No heartbeat from core client for 30 sec - exiting 14:54:37 (5000): No heartbeat from core client for 30 sec - exiting 14:54:38 (5000): No heartbeat from core client for 30 sec - exiting 14:54:39 (5000): No heartbeat from core client for 30 sec - exiting 14:54:40 (5000): No heartbeat from core client for 30 sec - exiting 14:54:41 (5000): No heartbeat from core client for 30 sec - exiting 14:54:42 (5000): No heartbeat from core client for 30 sec - exiting 14:54:43 (5000): No heartbeat from core client for 30 sec - exiting 14:54:44 (5000): No heartbeat from core client for 30 sec - exiting 14:54:45 (5000): No heartbeat from core client for 30 sec - exiting 14:54:46 (5000): No heartbeat from core client for 30 sec - exiting 14:54:47 (5000): No heartbeat from core client for 30 sec - exiting 14:54:48 (5000): No heartbeat from core client for 30 sec - exiting 15:07:24 (1448): No heartbeat from core client for 30 sec - exiting 15:07:25 (1448): 
No heartbeat from core client for 30 sec - exiting 15:07:26 (1448): No heartbeat from core client for 30 sec - exiting 15:07:27 (1448): No heartbeat from core client for 30 sec - exiting 15:07:28 (1448): No heartbeat from core client for 30 sec - exiting 15:07:29 (1448): No heartbeat from core client for 30 sec - exiting 15:07:30 (1448): No heartbeat from core client for 30 sec - exiting 15:07:31 (1448): No heartbeat from core client for 30 sec - exiting 15:07:32 (1448): No heartbeat from core client for 30 sec - exiting 15:07:33 (1448): No heartbeat from core client for 30 sec - exiting 15:07:34 (1448): No heartbeat from core client for 30 sec - exiting 15:07:35 (1448): No heartbeat from core client for 30 sec - exiting 15:07:36 (1448): No heartbeat from core client for 30 sec - exiting 15:07:37 (1448): No heartbeat from core client for 30 sec - exiting 15:07:38 (1448): No heartbeat from core client for 30 sec - exiting 15:07:39 (1448): No heartbeat from core client for 30 sec - exiting 15:07:40 (1448): No heartbeat from core client for 30 sec - exiting 15:07:41 (1448): No heartbeat from core client for 30 sec - exiting 15:07:42 (1448): No heartbeat from core client for 30 sec - exiting 15:07:43 (1448): No heartbeat from core client for 30 sec - exiting 15:07:44 (1448): No heartbeat from core client for 30 sec - exiting 15:07:45 (1448): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:07:46 (1448): No heartbeat from core client for 30 sec - exiting 15:07:47 (1448): No heartbeat from core client for 30 sec - exiting 15:07:48 (1448): No heartbeat from core client for 30 sec - exiting 15:07:49 (1448): No heartbeat from core client for 30 sec - exiting 15:21:04 (1920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
15:21:08 (1920): No heartbeat from core client for 30 sec - exiting 15:21:12 (1920): No heartbeat from core client for 30 sec - exiting 15:21:13 (1920): No heartbeat from core client for 30 sec - exiting 15:21:14 (1920): No heartbeat from core client for 30 sec - exiting 15:21:15 (1920): No heartbeat from core client for 30 sec - exiting 15:21:16 (1920): No heartbeat from core client for 30 sec - exiting 15:21:17 (1920): No heartbeat from core client for 30 sec - exiting 15:21:18 (1920): No heartbeat from core client for 30 sec - exiting 15:21:19 (1920): No heartbeat from core client for 30 sec - exiting 15:21:20 (1920): No heartbeat from core client for 30 sec - exiting 15:21:21 (1920): No heartbeat from core client for 30 sec - exiting 15:21:22 (1920): No heartbeat from core client for 30 sec - exiting 15:21:23 (1920): No heartbeat from core client for 30 sec - exiting 15:21:24 (1920): No heartbeat from core client for 30 sec - exiting 15:21:25 (1920): No heartbeat from core client for 30 sec - exiting 15:21:26 (1920): No heartbeat from core client for 30 sec - exiting 15:21:27 (1920): No heartbeat from core client for 30 sec - exiting 15:21:28 (1920): No heartbeat from core client for 30 sec - exiting 15:21:29 (1920): No heartbeat from core client for 30 sec - exiting 15:21:30 (1920): No heartbeat from core client for 30 sec - exiting 15:21:31 (1920): No heartbeat from core client for 30 sec - exiting 15:21:32 (1920): No heartbeat from core client for 30 sec - exiting 15:21:33 (1920): No heartbeat from core client for 30 sec - exiting 15:21:34 (1920): No heartbeat from core client for 30 sec - exiting 15:21:35 (1920): No heartbeat from core client for 30 sec - exiting 15:21:36 (1920): No heartbeat from core client for 30 sec - exiting 15:21:37 (1920): No heartbeat from core client for 30 sec - exiting 15:21:38 (1920): No heartbeat from core client for 30 sec - exiting 15:21:39 (1920): No heartbeat from core client for 30 sec - exiting 15:21:40 (1920): No 
heartbeat from core client for 30 sec - exiting 15:21:41 (1920): No heartbeat from core client for 30 sec - exiting 15:21:42 (1920): No heartbeat from core client for 30 sec - exiting 15:21:43 (1920): No heartbeat from core client for 30 sec - exiting 15:27:24 (1444): No heartbeat from core client for 30 sec - exiting 15:27:25 (1444): No heartbeat from core client for 30 sec - exiting 15:27:26 (1444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2252, selfPID=4012, iMonCtr=1 Model crash detected, will try to restart... 20:16:20 (4688): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Colobal Worker::: :PDNCPDN process is running, exitixiting, tVRetVal1, checcheckPID=0,lfPIfPID=42144,onCtr=2 r=2 del crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2748, selfPID=3744, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3756, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3764, selfPID=3516, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1664, selfPID=1664, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4212, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3780, selfPID=3960, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4760, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4824, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1844, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3264, iMonCtr=2 Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2932, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2940, selfPID=2620, iMonCtr=1 Model crash detected, will try to restart... 06:59:36 (2972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1196, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2884, selfPID=2728, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4708, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4564, selfPID=2732, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_wr43_1962_1_006878491/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_wr43_1962_1_006878491/dataout/region_restart.day after 11 attempts forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_wr43_1962_1_006878491\tmp\xaakm.namelists Image PC Routine Line Source hadam3p_eu_um_6.0 0143A39A Unknown Unknown Unknown hadam3p_eu_um_6.0 013E2CD0 Unknown Unknown Unknown hadam3p_eu_um_6.0 013E1E9A Unknown Unknown Unknown hadam3p_eu_um_6.0 013C2819 Unknown Unknown Unknown hadam3p_eu_um_6.0 012C2287 Unknown Unknown Unknown hadam3p_eu_um_6.0 0135E7B2 Unknown Unknown Unknown hadam3p_eu_um_6.0 0135F2DA Unknown Unknown Unknown hadam3p_eu_um_6.0 010D9BD2 Unknown Unknown Unknown hadam3p_eu_um_6.0 0141E638 Unknown Unknown Unknown kernel32.dll 7545339A Unknown Unknown Unknown ntdll.dll 77C59EF2 Unknown Unknown Unknown ntdll.dll 77C59EC5 Unknown Unknown Unknown forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_wr43_1962_1_006878491\tmp\xaakg.namelists Image PC Routine Line Source hadrm3p_eu_um_6.0 00C6C52A Unknown Unknown Unknown hadrm3p_eu_um_6.0 00C14460 Unknown Unknown Unknown hadrm3p_eu_um_6.0 00C1362A Unknown Unknown Unknown hadrm3p_eu_um_6.0 00BF2469 Unknown Unknown Unknown hadrm3p_eu_um_6.0 00AF66EB Unknown Unknown Unknown hadrm3p_eu_um_6.0 00B92AE2 Unknown Unknown Unknown hadrm3p_eu_um_6.0 00B935AF Unknown Unknown Unknown hadrm3p_eu_um_6.0 00939860 Unknown Unknown Unknown hadrm3p_eu_um_6.0 00C50893 Unknown Unknown Unknown kernel32.dll 7545339A Unknown Unknown Unknown ntdll.dll 77C59EF2 Unknown Unknown Unknown ntdll.dll 77C59EC5 Unknown Unknown Unknown Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3348, selfPID=3596, 
iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_wr43_1962_1_006878491_1_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_wr43_1962_1_006878491_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
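
The stderr output above is dominated by repeated "No heartbeat from core client" lines. A minimal sketch for tallying them, assuming the number in parentheses is the ID of the reporting process (the log itself does not label it); the function name `count_heartbeat_timeouts` is hypothetical and the sample text is copied from the log above:

```python
import re
from collections import Counter

# Matches lines such as:
#   14:53:34 (5000): No heartbeat from core client for 30 sec - exiting
# The number in parentheses is assumed to be a process ID.
HEARTBEAT = re.compile(
    r"\d{2}:\d{2}:\d{2} \((\d+)\): "
    r"No heartbeat from core client for 30 sec - exiting"
)


def count_heartbeat_timeouts(stderr_text):
    """Count 'No heartbeat' messages per parenthesised ID."""
    return Counter(m.group(1) for m in HEARTBEAT.finditer(stderr_text))


# Three messages taken verbatim from the stderr text above.
sample = (
    "14:53:34 (5000): No heartbeat from core client for 30 sec - exiting "
    "14:53:35 (5000): No heartbeat from core client for 30 sec - exiting "
    "15:07:24 (1448): No heartbeat from core client for 30 sec - exiting"
)
counts = count_heartbeat_timeouts(sample)
for pid, n in counts.items():
    print(pid, n)
```

Run against the full stderr text, this would give one count per ID (5000, 1448, 1920, and so on), which is easier to scan than the raw repetition.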
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
20 Mar 2012 16:22:08 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 115,296 | 326,604 | 2.8327 |
19 Mar 2012 14:53:50 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 103,776 | 293,647 | 2.8296 |
17 Mar 2012 01:41:44 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 92,256 | 261,309 | 2.8324 |
14 Mar 2012 19:31:33 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 80,736 | 229,977 | 2.8485 |
13 Mar 2012 17:09:56 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 69,219 | 198,026 | 2.8609 |
13 Mar 2012 15:21:36 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 69,216 | 197,522 | 2.8537 |
11 Mar 2012 15:37:49 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 57,696 | 164,244 | 2.8467 |
10 Mar 2012 14:13:55 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 46,179 | 131,362 | 2.8446 |
10 Mar 2012 12:10:47 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 46,176 | 130,866 | 2.8341 |
04 Mar 2012 18:52:59 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 34,656 | 98,146 | 2.8320 |
03 Mar 2012 16:53:44 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 23,136 | 65,648 | 2.8375 |
28 Feb 2012 15:31:28 | 1170519 | 14188402 | hadam3p_eu_wr43_1962_1_006878491_1 | 11,616 | 33,072 | 2.8471 |
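
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. The page does not state this; it is an inference from the figures. A short sketch that checks it against the three most recent trickles, with the values copied from the table above and a hypothetical helper name `seconds_per_timestep`:

```python
# (timestep, cpu_time_sec, reported_average) from the three latest trickles.
trickles = [
    (115296, 326604, 2.8327),
    (103776, 293647, 2.8296),
    (92256, 261309, 2.8324),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by the number of timesteps completed."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Print the computed value to four decimals next to the reported one.
    print(f"{timestep}: computed {computed:.4f}, reported {reported:.4f}")
```

The computed values agree with the reported column to four decimal places for these rows.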