Name | hadam3p_eu_k2j7_2013_1_008554525_1 |
Workunit | 8702037 |
Created | 5 Mar 2014, 19:24:41 UTC |
Sent | 5 Mar 2014, 19:27:16 UTC |
Report deadline | 16 Feb 2015, 0:47:16 UTC |
Received | 5 Apr 2014, 15:50:11 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1312373 |
Run time | 3 days 5 hours 8 min 52 sec |
CPU time | 23 hours 15 min 50 sec |
Validate state | Invalid |
Credit | 1,194.02 |
Device peak FLOPS | 2.72 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4228, selfPID=4876, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3084, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4452, selfPID=4452, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4780, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2752, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1656, selfPID=4936, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
23:22:22 (3496): No heartbeat from core client for 30 sec - exiting 23:22:23 (3496): No heartbeat from core client for 30 sec - exiting 23:22:24 (3496): No heartbeat from core client for 30 sec - exiting 23:22:25 (3496): No heartbeat from core client for 30 sec - exiting 23:22:26 (3496): No heartbeat from core client for 30 sec - exiting 23:22:28 (3496): No heartbeat from core client for 30 sec - exiting 23:22:29 (3496): No heartbeat from core client for 30 sec - exiting 23:22:30 (3496): No heartbeat from core client for 30 sec - exiting 23:22:31 (3496): No heartbeat from core client for 30 sec - exiting 23:22:32 (3496): No heartbeat from core client for 30 sec - exiting 23:22:33 (3496): No heartbeat from core client for 30 sec - exiting 23:22:34 (3496): No heartbeat from core client for 30 sec - exiting 23:22:35 (3496): No heartbeat from core client for 30 sec - exiting 23:22:36 (3496): No heartbeat from core client for 30 sec - exiting 23:22:37 (3496): No heartbeat from core client for 30 sec - exiting 23:22:39 (3496): No heartbeat from core client for 30 sec - exiting 23:22:40 (3496): No heartbeat from core client for 30 sec - exiting 23:22:41 (3496): No heartbeat from core client for 30 sec - exiting 23:22:42 (3496): No heartbeat from core client for 30 sec - exiting 23:22:43 (3496): No heartbeat from core client for 30 sec - exiting 23:22:44 (3496): No heartbeat from core client for 30 sec - exiting 23:22:45 (3496): No heartbeat from core client for 30 sec - exiting 23:22:46 (3496): No heartbeat from core client for 30 sec - exiting 23:22:47 (3496): No heartbeat from core client for 30 sec - exiting 23:22:48 (3496): No heartbeat from core client for 30 sec - exiting 23:22:49 (3496): No heartbeat from core client for 30 sec - exiting 23:22:51 (3496): No heartbeat from core client for 30 sec - exiting 23:22:52 (3496): No heartbeat from core client for 30 sec - exiting 23:22:53 (3496): No heartbeat from core client for 30 sec - exiting 23:22:54 (3496): No 
heartbeat from core client for 30 sec - exiting 23:22:55 (3496): No heartbeat from core client for 30 sec - exiting 23:22:56 (3496): No heartbeat from core client for 30 sec - exiting 23:22:57 (3496): No heartbeat from core client for 30 sec - exiting 23:22:58 (3496): No heartbeat from core client for 30 sec - exiting 23:22:59 (3496): No heartbeat from core client for 30 sec - exiting 23:23:00 (3496): No heartbeat from core client for 30 sec - exiting 23:23:01 (3496): No heartbeat from core client for 30 sec - exiting 23:23:03 (3496): No heartbeat from core client for 30 sec - exiting 23:23:04 (3496): No heartbeat from core client for 30 sec - exiting 23:23:05 (3496): No heartbeat from core client for 30 sec - exiting 23:23:06 (3496): No heartbeat from core client for 30 sec - exiting 23:23:07 (3496): No heartbeat from core client for 30 sec - exiting 23:23:08 (3496): No heartbeat from core client for 30 sec - exiting 23:23:09 (3496): No heartbeat from core client for 30 sec - exiting 23:23:10 (3496): No heartbeat from core client for 30 sec - exiting 23:23:11 (3496): No heartbeat from core client for 30 sec - exiting 23:23:12 (3496): No heartbeat from core client for 30 sec - exiting 23:23:13 (3496): No heartbeat from core client for 30 sec - exiting 23:23:15 (3496): No heartbeat from core client for 30 sec - exiting 23:23:16 (3496): No heartbeat from core client for 30 sec - exiting 23:23:17 (3496): No heartbeat from core client for 30 sec - exiting 23:23:18 (3496): No heartbeat from core client for 30 sec - exiting 23:23:19 (3496): No heartbeat from core client for 30 sec - exiting 23:23:20 (3496): No heartbeat from core client for 30 sec - exiting 23:23:21 (3496): No heartbeat from core client for 30 sec - exiting 23:23:22 (3496): No heartbeat from core client for 30 sec - exiting 23:23:23 (3496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
23:23:24 (3496): No heartbeat from core client for 30 sec - exiting GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3068, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 11:46:26 (4064): No heartbeat from core client for 30 sec - exiting 11:46:27 (4064): No heartbeat from core client for 30 sec - exiting 11:46:28 (4064): No heartbeat from core client for 30 sec - exiting 11:46:29 (4064): No heartbeat from core client for 30 sec - exiting 11:46:30 (4064): No heartbeat from core client for 30 sec - exiting 11:46:31 (4064): No heartbeat from core client for 30 sec - exiting 11:46:32 (4064): No heartbeat from core client for 30 sec - exiting 11:46:33 (4064): No heartbeat from core client for 30 sec - exiting 11:46:35 (4064): No heartbeat from core client for 30 sec - exiting 11:46:36 (4064): No heartbeat from core client for 30 sec - exiting 11:46:37 (4064): No heartbeat from core client for 30 sec - exiting 11:46:38 (4064): No heartbeat from core client for 30 sec - exiting 11:46:39 (4064): No heartbeat from core client for 30 sec - exiting 11:46:40 (4064): No heartbeat from core client for 30 sec - exiting 11:46:41 (4064): No heartbeat from core client for 30 sec - exiting 11:46:42 (4064): No heartbeat from core client for 30 sec - exiting 11:46:43 (4064): No heartbeat from core client for 30 sec - exiting 11:46:44 (4064): No heartbeat from core client for 30 sec - exiting 11:46:45 (4064): No heartbeat from core client for 30 sec - exiting 11:46:47 (4064): No heartbeat from core client for 30 sec - exiting 11:46:48 (4064): No heartbeat from core client for 30 sec - exiting 11:46:49 (4064): No heartbeat from core client for 30 sec - exiting 11:46:50 (4064): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1412, selfPID=5540, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1724, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=396, selfPID=2076, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=608, selfPID=608, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 21:18:52 (4344): No heartbeat from core client for 30 sec - exiting 21:18:53 (4344): No heartbeat from core client for 30 sec - exiting 21:18:54 (4344): No heartbeat from core client for 30 sec - exiting 21:18:55 (4344): No heartbeat from core client for 30 sec - exiting 21:18:56 (4344): No heartbeat from core client for 30 sec - exiting 21:18:57 (4344): No heartbeat from core client for 30 sec - exiting 21:18:58 (4344): No heartbeat from core client for 30 sec - exiting 21:18:59 (4344): No heartbeat from core client for 30 sec - exiting 21:19:00 (4344): No heartbeat from core client for 30 sec - exiting 21:19:02 (4344): No heartbeat from core client for 30 sec - exiting 21:19:03 (4344): No heartbeat from core client for 30 sec - exiting 21:19:04 (4344): No heartbeat from core client for 30 sec - exiting 21:19:05 (4344): No heartbeat from core client for 30 sec - exiting 21:19:06 (4344): No heartbeat from core client for 30 sec - exiting 21:19:07 (4344): No heartbeat from core client for 30 sec - exiting 21:19:08 (4344): No heartbeat from core client for 30 sec - exiting 21:19:09 (4344): No heartbeat from core client for 30 sec - exiting 21:19:10 (4344): No heartbeat from core client for 30 sec - exiting 21:19:11 
(4344): No heartbeat from core client for 30 sec - exiting 21:19:12 (4344): No heartbeat from core client for 30 sec - exiting 21:19:13 (4344): No heartbeat from core client for 30 sec - exiting 21:19:14 (4344): No heartbeat from core client for 30 sec - exiting 21:19:15 (4344): No heartbeat from core client for 30 sec - exiting 21:19:16 (4344): No heartbeat from core client for 30 sec - exiting 21:19:18 (4344): No heartbeat from core client for 30 sec - exiting 21:19:19 (4344): No heartbeat from core client for 30 sec - exiting 21:19:20 (4344): No heartbeat from core client for 30 sec - exiting 21:19:21 (4344): No heartbeat from core client for 30 sec - exiting 21:19:22 (4344): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5560, selfPID=3860, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 14:52:18 (3808): No heartbeat from core client for 30 sec - exiting cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_k2j7_2013_1_008554525/dataout/atmos_restart.day after 11 attempts 14:52:19 (3808): No heartbeat from core client for 30 sec - exiting 14:52:21 (3808): No heartbeat from core client for 30 sec - exiting 14:52:22 (3808): No heartbeat from core client for 30 sec - exiting 14:52:23 (3808): No heartbeat from core client for 30 sec - exiting 14:52:24 (3808): No heartbeat from core client for 30 sec - exiting 14:52:25 (3808): No heartbeat from core client for 30 sec - exiting 14:52:26 (3808): No heartbeat from core client for 30 sec - exiting 14:52:27 (3808): No heartbeat from core client for 30 sec - exiting 14:52:28 (3808): No heartbeat from core client for 30 sec - exiting 14:52:29 (3808): No heartbeat from core client for 30 sec - exiting 14:52:30 (3808): No heartbeat from core client for 30 sec - exiting 14:52:31 (3808): No heartbeat from core client for 30 sec 
- exiting 14:52:33 (3808): No heartbeat from core client for 30 sec - exiting 14:52:34 (3808): No heartbeat from core client for 30 sec - exiting 14:52:35 (3808): No heartbeat from core client for 30 sec - exiting 14:52:36 (3808): No heartbeat from core client for 30 sec - exiting 14:52:37 (3808): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_k2j7_2013_1_008554525/dataout/atmos_restart.day after 11 attempts forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_k2j7_2013_1_008554525\tmp\xaakm.namelists Image PC Routine Line Source hadam3p_eu_um_6.0 0150A39A Unknown Unknown Unknown hadam3p_eu_um_6.0 014B2CD0 Unknown Unknown Unknown hadam3p_eu_um_6.0 014B1E9A Unknown Unknown Unknown hadam3p_eu_um_6.0 01492819 Unknown Unknown Unknown hadam3p_eu_um_6.0 01392287 Unknown Unknown Unknown hadam3p_eu_um_6.0 0142E7B2 Unknown Unknown Unknown hadam3p_eu_um_6.0 0142F2DA Unknown Unknown Unknown hadam3p_eu_um_6.0 011A9BD2 Unknown Unknown Unknown hadam3p_eu_um_6.0 014EE638 Unknown Unknown Unknown kernel32.dll 769A336A Unknown Unknown Unknown ntdll.dll 776E9F72 Unknown Unknown Unknown ntdll.dll 776E9F45 Unknown Unknown Unknown forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_k2j7_2013_1_008554525\tmp\xaakg.namelists Image PC Routine Line Source hadrm3p_eu_um_6.0 0133C52A Unknown Unknown Unknown hadrm3p_eu_um_6.0 012E4460 Unknown Unknown Unknown hadrm3p_eu_um_6.0 012E362A Unknown Unknown Unknown hadrm3p_eu_um_6.0 012C2469 Unknown Unknown Unknown hadrm3p_eu_um_6.0 011C66EB Unknown Unknown Unknown hadrm3p_eu_um_6.0 01262AE2 Unknown Unknown Unknown hadrm3p_eu_um_6.0 012635AF Unknown Unknown Unknown hadrm3p_eu_um_6.0 01009860 Unknown Unknown Unknown hadrm3p_eu_um_6.0 01320893 Unknown Unknown Unknown 
kernel32.dll 769A336A Unknown Unknown Unknown ntdll.dll 776E9F72 Unknown Unknown Unknown ntdll.dll 776E9F45 Unknown Unknown Unknown Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2452, selfPID=3152, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_7.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_k2j7_2013_1_008554525_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
23 Mar 2014 12:39:27 | 1312373 | 16344601 | hadam3p_eu_k2j7_2013_1_008554525_1 | 69,216 | 134,655 | 1.9454 |
22 Mar 2014 18:16:36 | 1312373 | 16344601 | hadam3p_eu_k2j7_2013_1_008554525_1 | 57,696 | 112,404 | 1.9482 |
19 Mar 2014 21:58:47 | 1312373 | 16344601 | hadam3p_eu_k2j7_2013_1_008554525_1 | 46,176 | 89,626 | 1.9410 |
16 Mar 2014 12:02:30 | 1312373 | 16344601 | hadam3p_eu_k2j7_2013_1_008554525_1 | 34,656 | 67,414 | 1.9452 |
15 Mar 2014 17:23:04 | 1312373 | 16344601 | hadam3p_eu_k2j7_2013_1_008554525_1 | 23,136 | 44,841 | 1.9381 |
14 Mar 2014 21:33:26 | 1312373 | 16344601 | hadam3p_eu_k2j7_2013_1_008554525_1 | 11,631 | 23,023 | 1.9795 |
09 Mar 2014 13:42:57 | 1312373 | 16344601 | hadam3p_eu_k2j7_2013_1_008554525_1 | 11,616 | 22,712 | 1.9552 |
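
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The sketch below recomputes that ratio for each row as a consistency check. It assumes that definition (the page does not state it), and the names `TRICKLES`, `seconds_per_timestep`, and `mismatched_rows` are hypothetical, introduced only for illustration.

```python
# Trickle rows copied from the table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
TRICKLES = [
    (69216, 134655, 1.9454),
    (57696, 112404, 1.9482),
    (46176, 89626, 1.9410),
    (34656, 67414, 1.9452),
    (23136, 44841, 1.9381),
    (11631, 23023, 1.9795),
    (11616, 22712, 1.9552),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """CPU seconds per model timestep (assumed meaning of 'Average (sec/TS)')."""
    return cpu_time_sec / timestep


def mismatched_rows(rows, tolerance=1e-4):
    """Return rows whose recomputed average differs from the reported value
    by more than `tolerance` (the reported values are rounded to 4 decimals)."""
    return [
        row for row in rows
        if abs(seconds_per_timestep(row[1], row[0]) - row[2]) > tolerance
    ]


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        recomputed = seconds_per_timestep(cpu_time_sec, timestep)
        print(f"{timestep:>6} timesteps: recomputed {recomputed:.4f}, reported {reported:.4f}")
    print("rows outside tolerance:", mismatched_rows(TRICKLES))
```

Under that assumption every row agrees with its reported average to within rounding, so the per-timestep cost stayed close to 1.94 to 1.98 CPU seconds for the whole run.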
©2024 cpdn.org