Name | hadam3p_eu_h3nv_2013_1_008858572_1 |
Workunit | 9004501 |
Created | 16 Oct 2014, 10:19:47 UTC |
Sent | 16 Oct 2014, 10:25:20 UTC |
Report deadline | 28 Sep 2015, 15:45:20 UTC |
Received | 4 Nov 2014, 13:41:28 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1322479 |
Run time | 2 days 18 hours 51 min 33 sec |
CPU time | 1 day 12 hours 40 min 8 sec |
Validate state | Invalid |
Credit | 1,988.94 |
Device peak FLOPS | 3.50 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>7.2.42</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3000, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5412, iMonCtr=2 Model crash detected, will try to restart... 13:03:04 (2628): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7528, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7536, selfPID=6876, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 17:40:45 (6816): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4992, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=2 Model crash detected, will try to restart... 15:54:03 (5296): No heartbeat from core client for 30 sec - exiting 15:54:04 (5296): No heartbeat from core client for 30 sec - exiting 15:54:05 (5296): No heartbeat from core client for 30 sec - exiting 15:54:06 (5296): No heartbeat from core client for 30 sec - exiting 15:54:07 (5296): No heartbeat from core client for 30 sec - exiting 15:54:08 (5296): No heartbeat from core client for 30 sec - exiting 15:54:09 (5296): No heartbeat from core client for 30 sec - exiting 15:54:10 (5296): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5988, iMonCtr=2 19:56:41 (4316): No heartbeat from core client for 30 sec - exiting 19:56:42 (4316): No heartbeat from core client for 30 sec - exiting 19:56:43 (4316): No heartbeat from core client for 30 sec - exiting 19:56:44 (4316): No heartbeat from core client for 30 sec - exiting 19:56:45 (4316): No heartbeat from core client for 30 sec - exiting 19:56:46 (4316): No heartbeat from core client for 30 sec - exiting 19:56:47 (4316): No heartbeat from core client for 30 sec - exiting 19:56:48 (4316): No heartbeat from core client for 30 sec - exiting 19:56:50 (4316): No heartbeat from core client for 30 sec - exiting 19:56:51 (4316): No heartbeat from core client for 30 sec - exiting 19:56:52 (4316): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1524, selfPID=6040, iMonCtr=1 Model crash detected, will try to restart... 
20:23:03 (4488): No heartbeat from core client for 30 sec - exiting 20:23:04 (4488): No heartbeat from core client for 30 sec - exiting 20:23:05 (4488): No heartbeat from core client for 30 sec - exiting 20:23:06 (4488): No heartbeat from core client for 30 sec - exiting 20:23:07 (4488): No heartbeat from core client for 30 sec - exiting 20:23:08 (4488): No heartbeat from core client for 30 sec - exiting 20:23:09 (4488): No heartbeat from core client for 30 sec - exiting 20:23:10 (4488): No heartbeat from core client for 30 sec - exiting 20:23:11 (4488): No heartbeat from core client for 30 sec - exiting 20:23:12 (4488): No heartbeat from core client for 30 sec - exiting 20:23:14 (4488): No heartbeat from core client for 30 sec - exiting 20:23:15 (4488): No heartbeat from core client for 30 sec - exiting 20:23:16 (4488): No heartbeat from core client for 30 sec - exiting 20:23:17 (4488): No heartbeat from core client for 30 sec - exiting 20:23:18 (4488): No heartbeat from core client for 30 sec - exiting 20:23:19 (4488): No heartbeat from core client for 30 sec - exiting 20:23:20 (4488): No heartbeat from core client for 30 sec - exiting 20:23:21 (4488): No heartbeat from core client for 30 sec - exiting 20:23:22 (4488): No heartbeat from core client for 30 sec - exiting 20:23:23 (4488): No heartbeat from core client for 30 sec - exiting 20:23:25 (4488): No heartbeat from core client for 30 sec - exiting 20:23:26 (4488): No heartbeat from core client for 30 sec - exiting 20:23:27 (4488): No heartbeat from core client for 30 sec - exiting 20:23:28 (4488): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... zip error: Could not create output file (was replacing the original zip file) Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4748, iMonCtr=2 Model crash detected, will try to restart... 
23:04:50 (5040): No heartbeat from core client for 30 sec - exiting 23:04:51 (5040): No heartbeat from core client for 30 sec - exiting 23:04:52 (5040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:41:13 (2296): No heartbeat from core client for 30 sec - exiting 20:41:14 (2296): No heartbeat from core client for 30 sec - exiting 20:41:16 (2296): No heartbeat from core client for 30 sec - exiting 20:41:17 (2296): No heartbeat from core client for 30 sec - exiting 20:41:18 (2296): No heartbeat from core client for 30 sec - exiting 20:41:19 (2296): No heartbeat from core client for 30 sec - exiting 20:41:20 (2296): No heartbeat from core client for 30 sec - exiting 20:41:21 (2296): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:14:10 (7656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:25:45 (4680): No heartbeat from core client for 30 sec - exiting 12:25:46 (4680): No heartbeat from core client for 30 sec - exiting 12:25:47 (4680): No heartbeat from core client for 30 sec - exiting 12:25:48 (4680): No heartbeat from core client for 30 sec - exiting 12:25:49 (4680): No heartbeat from core client for 30 sec - exiting 12:25:50 (4680): No heartbeat from core client for 30 sec - exiting 12:25:51 (4680): No heartbeat from core client for 30 sec - exiting 12:25:52 (4680): No heartbeat from core client for 30 sec - exiting 12:25:53 (4680): No heartbeat from core client for 30 sec - exiting 12:25:54 (4680): No heartbeat from core client for 30 sec - exiting 12:25:55 (4680): No heartbeat from core client for 30 sec - exiting 12:25:56 (4680): No heartbeat from core client for 30 sec - exiting 12:25:57 (4680): No heartbeat from core client for 30 sec - exiting 12:25:58 (4680): No heartbeat from core client for 30 sec - exiting 12:25:59 (4680): No heartbeat from core client for 30 sec - exiting 12:26:00 (4680): No heartbeat 
from core client for 30 sec - exiting 12:26:01 (4680): No heartbeat from core client for 30 sec - exiting 12:26:02 (4680): No heartbeat from core client for 30 sec - exiting 12:26:03 (4680): No heartbeat from core client for 30 sec - exiting 12:26:04 (4680): No heartbeat from core client for 30 sec - exiting 12:26:05 (4680): No heartbeat from core client for 30 sec - exiting 12:26:06 (4680): No heartbeat from core client for 30 sec - exiting 12:26:07 (4680): No heartbeat from core client for 30 sec - exiting 12:26:08 (4680): No heartbeat from core client for 30 sec - exiting 12:26:09 (4680): No heartbeat from core client for 30 sec - exiting 12:26:10 (4680): No heartbeat from core client for 30 sec - exiting 12:26:11 (4680): No heartbeat from core client for 30 sec - exiting 12:26:12 (4680): No heartbeat from core client for 30 sec - exiting 12:26:13 (4680): No heartbeat from core client for 30 sec - exiting 12:26:14 (4680): No heartbeat from core client for 30 sec - exiting 12:26:15 (4680): No heartbeat from core client for 30 sec - exiting 12:26:16 (4680): No heartbeat from core client for 30 sec - exiting 12:26:17 (4680): No heartbeat from core client for 30 sec - exiting 12:26:18 (4680): No heartbeat from core client for 30 sec - exiting 12:26:19 (4680): No heartbeat from core client for 30 sec - exiting 12:26:20 (4680): No heartbeat from core client for 30 sec - exiting 12:26:21 (4680): No heartbeat from core client for 30 sec - exiting 12:26:22 (4680): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4164, iMonCtr=2 Model crash detected, will try to restart... 
16:09:38 (3816): No heartbeat from core client for 30 sec - exiting 16:09:39 (3816): No heartbeat from core client for 30 sec - exiting 16:09:40 (3816): No heartbeat from core client for 30 sec - exiting 16:09:42 (3816): No heartbeat from core client for 30 sec - exiting 16:09:43 (3816): No heartbeat from core client for 30 sec - exiting 16:09:44 (3816): No heartbeat from core client for 30 sec - exiting 16:09:45 (3816): No heartbeat from core client for 30 sec - exiting 16:09:46 (3816): No heartbeat from core client for 30 sec - exiting 16:09:47 (3816): No heartbeat from core client for 30 sec - exiting 16:09:48 (3816): No heartbeat from core client for 30 sec - exiting 16:09:49 (3816): No heartbeat from core client for 30 sec - exiting 16:09:50 (3816): No heartbeat from core client for 30 sec - exiting 16:09:51 (3816): No heartbeat from core client for 30 sec - exiting 16:09:52 (3816): No heartbeat from core client for 30 sec - exiting 16:09:54 (3816): No heartbeat from core client for 30 sec - exiting 16:09:55 (3816): No heartbeat from core client for 30 sec - exiting 16:09:56 (3816): No heartbeat from core client for 30 sec - exiting 16:09:57 (3816): No heartbeat from core client for 30 sec - exiting 16:09:58 (3816): No heartbeat from core client for 30 sec - exiting 16:09:59 (3816): No heartbeat from core client for 30 sec - exiting 16:10:00 (3816): No heartbeat from core client for 30 sec - exiting 16:10:01 (3816): No heartbeat from core client for 30 sec - exiting 16:10:02 (3816): No heartbeat from core client for 30 sec - exiting 16:10:03 (3816): No heartbeat from core client for 30 sec - exiting 16:10:04 (3816): No heartbeat from core client for 30 sec - exiting 16:10:06 (3816): No heartbeat from core client for 30 sec - exiting 16:10:07 (3816): No heartbeat from core client for 30 sec - exiting 16:10:08 (3816): No heartbeat from core client for 30 sec - exiting 16:10:09 (3816): No heartbeat from core client for 30 sec - exiting 16:10:10 (3816): No 
heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3900, selfPID=5976, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1784, selfPID=3632, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 22:43:15 (4024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2068, selfPID=4500, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:40:01 (5232): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4588, selfPID=4624, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5316, selfPID=3008, iMonCtr=1 Model crash detected, will try to restart... 
21:27:12 (4500): No heartbeat from core client for 30 sec - exiting 21:27:14 (4500): No heartbeat from core client for 30 sec - exiting 21:27:15 (4500): No heartbeat from core client for 30 sec - exiting 21:27:16 (4500): No heartbeat from core client for 30 sec - exiting 21:27:17 (4500): No heartbeat from core client for 30 sec - exiting 21:27:18 (4500): No heartbeat from core client for 30 sec - exiting 21:27:19 (4500): No heartbeat from core client for 30 sec - exiting 21:27:20 (4500): No heartbeat from core client for 30 sec - exiting 21:27:21 (4500): No heartbeat from core client for 30 sec - exiting 21:27:22 (4500): No heartbeat from core client for 30 sec - exiting 21:27:23 (4500): No heartbeat from core client for 30 sec - exiting 21:27:25 (4500): No heartbeat from core client for 30 sec - exiting 21:27:26 (4500): No heartbeat from core client for 30 sec - exiting 21:27:27 (4500): No heartbeat from core client for 30 sec - exiting 21:27:28 (4500): No heartbeat from core client for 30 sec - exiting 21:27:29 (4500): No heartbeat from core client for 30 sec - exiting 21:27:30 (4500): No heartbeat from core client for 30 sec - exiting 21:27:31 (4500): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5212, selfPID=4008, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4264, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1676, selfPID=4752, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2240, selfPID=3576, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4824, selfPID=2136, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9816, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2716, selfPID=3292, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_eu_h3nv_2013_1_008858572/dataout/atmos_restart.day after 11 attempts forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_h3nv_2013_1_008858572\tmp\xaakg.namelists Image PC Routine Line Source hadrm3p_eu_um_6.0 006EC52A Unknown Unknown Unknown hadrm3p_eu_um_6.0 00694460 Unknown Unknown Unknown hadrm3p_eu_um_6.0 0069362A Unknown Unknown Unknown hadrm3p_eu_um_6.0 00672469 Unknown Unknown Unknown hadrm3p_eu_um_6.0 005766EB Unknown Unknown Unknown hadrm3p_eu_um_6.0 00612AE2 Unknown Unknown Unknown hadrm3p_eu_um_6.0 006135AF Unknown Unknown Unknown hadrm3p_eu_um_6.0 003B9860 Unknown Unknown Unknown hadrm3p_eu_um_6.0 006D0893 Unknown Unknown Unknown kernel32.dll 7591EE1C Unknown Unknown Unknown ntdll.dll 773537EB Unknown Unknown Unknown ntdll.dll 773537BE Unknown Unknown Unknown forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_h3nv_2013_1_008858572\tmp\xaakm.namelists Image PC Routine Line Source hadam3p_eu_um_6.0 0125A39A Unknown Unknown Unknown hadam3p_eu_um_6.0 01202CD0 Unknown Unknown Unknown hadam3p_eu_um_6.0 01201E9A Unknown Unknown Unknown hadam3p_eu_um_6.0 011E2819 Unknown Unknown Unknown hadam3p_eu_um_6.0 010E2287 Unknown Unknown Unknown hadam3p_eu_um_6.0 0117E7B2 Unknown Unknown Unknown hadam3p_eu_um_6.0 0117F2DA Unknown Unknown Unknown hadam3p_eu_um_6.0 00EF9BD2 Unknown Unknown Unknown hadam3p_eu_um_6.0 0123E638 Unknown Unknown Unknown kernel32.dll 7591EE1C Unknown Unknown Unknown ntdll.dll 773537EB Unknown 
Unknown Unknown ntdll.dll 773537BE Unknown Unknown Unknown Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4924, selfPID=2516, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Called boinc_finish 22:36:34 (2516): No heartbeat from core client for 30 sec - exiting 22:36:35 (2516): No heartbeat from core client for 30 sec - exiting </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>hadam3p_eu_h3nv_2013_1_008858572_1_11.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_h3nv_2013_1_008858572_1_12.zip</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> </message> ]]> |
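
The stderr output above repeats a handful of messages many times, and some of them are wrapped across line breaks. A minimal sketch of how such a log could be tallied is shown here. The pattern names (`heartbeat_lost`, `model_restart`, and so on) and the `tally` helper are hypothetical labels introduced for illustration; the matched strings are copied from the log itself, and the line break inside `SAMPLE` was added only to mimic the wrapping.

```python
import re
from collections import Counter

# Message fragments copied verbatim from the stderr text above.
# The dictionary keys are hypothetical labels, not BOINC or CPDN terminology.
PATTERNS = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client for 30 sec - exiting"),
    "model_restart": re.compile(r"Model crash detected, will try to restart\.\.\."),
    "fortran_eof": re.compile(r"forrtl: severe \(24\): end-of-file during read"),
    "upload_not_found": re.compile(r"<error_code>-161 \(not found\)</error_code>"),
}


def tally(stderr_text):
    """Count how often each known message occurs in a stderr dump."""
    # Collapse all whitespace runs so a message split across lines still matches.
    normalized = " ".join(stderr_text.split())
    return Counter({name: len(p.findall(normalized)) for name, p in PATTERNS.items()})


# Short excerpt from the log; the newline is artificial (see lead-in).
SAMPLE = """Model crash detected, will try to restart... 13:03:04 (2628): No heartbeat
from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC..."""

if __name__ == "__main__":
    for name, count in tally(SAMPLE).items():
        print(f"{name}: {count}")
```

Normalizing whitespace before matching is a deliberate choice: without it, any message that happens to straddle a wrapped line would be missed.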
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
03 Nov 2014 15:44:49 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 115,296 | 123,173 | 1.0683 |
03 Nov 2014 07:00:54 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 103,776 | 110,933 | 1.0690 |
02 Nov 2014 11:04:34 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 92,256 | 98,634 | 1.0691 |
02 Nov 2014 07:06:26 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 80,736 | 86,518 | 1.0716 |
31 Oct 2014 12:32:24 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 69,240 | 74,476 | 1.0756 |
30 Oct 2014 15:05:02 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 69,216 | 74,267 | 1.0730 |
26 Oct 2014 16:49:08 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 57,696 | 62,168 | 1.0775 |
25 Oct 2014 02:50:10 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 46,176 | 49,780 | 1.0780 |
23 Oct 2014 11:33:36 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 34,670 | 37,391 | 1.0785 |
21 Oct 2014 15:30:10 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 34,656 | 37,208 | 1.0736 |
21 Oct 2014 07:40:18 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 23,136 | 24,730 | 1.0689 |
19 Oct 2014 05:23:43 | 1322479 | 17213245 | hadam3p_eu_h3nv_2013_1_008858572_1 | 11,616 | 12,529 | 1.0786 |
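
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the column header and the figures, not stated by the site, so the sketch below is only a consistency check under that assumption. The names `TRICKLES`, `seconds_per_timestep`, and `max_deviation` are hypothetical; the numbers are copied from the table rows above.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table above.
TRICKLES = [
    (115_296, 123_173, 1.0683),
    (103_776, 110_933, 1.0690),
    (92_256, 98_634, 1.0691),
    (80_736, 86_518, 1.0716),
    (69_240, 74_476, 1.0756),
    (69_216, 74_267, 1.0730),
    (57_696, 62_168, 1.0775),
    (46_176, 49_780, 1.0780),
    (34_670, 37_391, 1.0785),
    (34_656, 37_208, 1.0736),
    (23_136, 24_730, 1.0689),
    (11_616, 12_529, 1.0786),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


def max_deviation(rows):
    """Largest gap between the recomputed and the reported average."""
    return max(
        abs(seconds_per_timestep(cpu, ts) - reported)
        for ts, cpu, reported in rows
    )


if __name__ == "__main__":
    for ts, cpu, reported in TRICKLES:
        print(f"{ts:>7} steps: {seconds_per_timestep(cpu, ts):.4f} (reported {reported:.4f})")
    print(f"max deviation: {max_deviation(TRICKLES):.6f}")
```

Under this assumption every row agrees with the reported value to within rounding at four decimal places.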
©2024 climateprediction.net