Name | hadam3p_eu_2kgx_1991_1_007380896_0 |
Workunit | 7578326 |
Created | 31 Jul 2011, 15:48:01 UTC |
Sent | 3 Aug 2011, 21:09:42 UTC |
Report deadline | 16 Jul 2012, 2:29:42 UTC |
Received | 17 Sep 2011, 7:48:38 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 725427 |
Run time | 6 days 19 hours 8 min 18 sec |
CPU time | 4 days 11 hours 2 min 46 sec |
Validate state | Invalid |
Credit | 1,392.75 |
Device peak FLOPS | 2.06 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <stderr_txt> Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6332, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7540, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 10:12:07 (5396): No heartbeat from core client for 30 sec - exiting 10:12:08 (5396): No heartbeat from core client for 30 sec - exiting 10:12:09 (5396): No heartbeat from core client for 30 sec - exiting 10:12:10 (5396): No heartbeat from core client for 30 sec - exiting 10:12:11 (5396): No heartbeat from core client for 30 sec - exiting 10:12:12 (5396): No heartbeat from core client for 30 sec - exiting 10:12:13 (5396): No heartbeat from core client for 30 sec - exiting 10:12:14 (5396): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1368, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9108, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5020, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7052, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7432, selfPID=8232, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4648, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=2 Model crash detected, will try to restart... 
CGntroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7004, iMonCtr=2 Model crash detected, will try to restart... lobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10152, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5472, selfPID=5940, iMonCtr=1 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9600, iMonCtr=2 Model crash detected, will try to restart... lobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7816, iMonCtr=2 Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3224, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2448, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5424, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8836, selfPID=10128, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7232, selfPID=8424, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7660, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7572, iMonCtr=2 Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8876, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6232, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6692, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9268, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4092, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=156, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 08:58:46 (4688): No heartbeat from core client for 30 sec - exiting 08:58:47 (4688): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6696, iMonCtr=2 Model crash detected, will try to restart... 08:01:15 (1468): No heartbeat from core client for 30 sec - exiting 08:01:16 (1468): No heartbeat from core client for 30 sec - exiting 08:01:17 (1468): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3652, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3996, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=816, selfPID=3988, iMonCtr=1 Model crash detected, will try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7080, selfPID=7704, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7004, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10088, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=9688, selfPID=9628, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6020, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6528, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2804, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5760, iMonCtr=2 Model crash detected, will try to restart... 
23:21:11 (4612): No heartbeat from core client for 30 sec - exiting 23:21:12 (4612): No heartbeat from core client for 30 sec - exiting 23:21:13 (4612): No heartbeat from core client for 30 sec - exiting 23:21:14 (4612): No heartbeat from core client for 30 sec - exiting 23:21:15 (4612): No heartbeat from core client for 30 sec - exiting 23:21:16 (4612): No heartbeat from core client for 30 sec - exiting 23:21:17 (4612): No heartbeat from core client for 30 sec - exiting 23:21:18 (4612): No heartbeat from core client for 30 sec - exiting 23:21:19 (4612): No heartbeat from core client for 30 sec - exiting 23:21:20 (4612): No heartbeat from core client for 30 sec - exiting 23:21:21 (4612): No heartbeat from core client for 30 sec - exiting 23:21:22 (4612): No heartbeat from core client for 30 sec - exiting 23:21:23 (4612): No heartbeat from core client for 30 sec - exiting 23:21:24 (4612): No heartbeat from core client for 30 sec - exiting 23:21:25 (4612): No heartbeat from core client for 30 sec - exiting 23:21:27 (4612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7756, iMonCtr=2 Model crash detected, will try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6636, iMonCtr=2 08:22:35 (5996): No heartbeat from core client for 30 sec - exiting 08:22:38 (5996): No heartbeat from core client for 30 sec - exiting 08:22:40 (5996): No heartbeat from core client for 30 sec - exiting 08:22:41 (5996): No heartbeat from core client for 30 sec - exiting 08:22:42 (5996): No heartbeat from core client for 30 sec - exiting 08:22:43 (5996): No heartbeat from core client for 30 sec - exiting 08:22:44 (5996): No heartbeat from core client for 30 sec - exiting 08:22:45 (5996): No heartbeat from core client for 30 sec - exiting 08:22:46 (5996): No heartbeat from core client for 30 sec - exiting 08:22:48 (5996): No heartbeat from core client for 30 sec - exiting 08:22:50 (5996): No heartbeat from core client for 30 sec - exiting 08:22:51 (5996): No heartbeat from core client for 30 sec - exiting 08:22:54 (5996): No heartbeat from core client for 30 sec - exiting 08:22:55 (5996): No heartbeat from core client for 30 sec - exiting 08:22:56 (5996): No heartbeat from core client for 30 sec - exiting 08:22:57 (5996): No heartbeat from core client for 30 sec - exiting 08:22:58 (5996): No heartbeat from core client for 30 sec - exiting 08:22:59 (5996): No heartbeat from core client for 30 sec - exiting 08:23:00 (5996): No heartbeat from core client for 30 sec - exiting 08:23:01 (5996): No heartbeat from core client for 30 sec - exiting 08:23:02 (5996): No heartbeat from core client for 30 sec - exiting 08:23:03 (5996): No heartbeat from core client for 30 sec - exiting 08:23:04 (5996): No heartbeat from core client for 30 sec - exiting 08:23:05 (5996): No heartbeat from core client for 30 sec - exiting 08:23:07 (5996): No heartbeat from core client for 30 sec - exiting 08:23:08 (5996): No heartbeat from core client for 30 sec - exiting 08:23:09 (5996): No heartbeat from core client for 30 sec - exiting 08:23:10 (5996): No heartbeat from core client for 30 sec - 
exiting 08:23:11 (5996): No heartbeat from core client for 30 sec - exiting 08:23:12 (5996): No heartbeat from core client for 30 sec - exiting 08:23:13 (5996): No heartbeat from core client for 30 sec - exiting 08:23:14 (5996): No heartbeat from core client for 30 sec - exiting 08:23:15 (5996): No heartbeat from core client for 30 sec - exiting 08:23:16 (5996): No heartbeat from core client for 30 sec - exiting 08:23:17 (5996): No heartbeat from core client for 30 sec - exiting 08:23:18 (5996): No heartbeat from core client for 30 sec - exiting 08:23:19 (5996): No heartbeat from core client for 30 sec - exiting 08:23:20 (5996): No heartbeat from core client for 30 sec - exiting 08:23:21 (5996): No heartbeat from core client for 30 sec - exiting 08:23:22 (5996): No heartbeat from core client for 30 sec - exiting 08:23:23 (5996): No heartbeat from core client for 30 sec - exiting 08:23:24 (5996): No heartbeat from core client for 30 sec - exiting 08:23:25 (5996): No heartbeat from core client for 30 sec - exiting 08:23:26 (5996): No heartbeat from core client for 30 sec - exiting 08:23:27 (5996): No heartbeat from core client for 30 sec - exiting 08:23:28 (5996): No heartbeat from core client for 30 sec - exiting 08:23:29 (5996): No heartbeat from core client for 30 sec - exiting 08:23:30 (5996): No heartbeat from core client for 30 sec - exiting 08:23:31 (5996): No heartbeat from core client for 30 sec - exiting 08:23:32 (5996): No heartbeat from core client for 30 sec - exiting 08:23:33 (5996): No heartbeat from core client for 30 sec - exiting 08:23:34 (5996): No heartbeat from core client for 30 sec - exiting 08:23:35 (5996): No heartbeat from core client for 30 sec - exiting 08:23:36 (5996): No heartbeat from core client for 30 sec - exiting 08:23:37 (5996): No heartbeat from core client for 30 sec - exiting 08:23:38 (5996): No heartbeat from core client for 30 sec - exiting 08:23:39 (5996): No heartbeat from core client for 30 sec - exiting 08:23:41 (5996): No 
heartbeat from core client for 30 sec - exiting 08:23:42 (5996): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4940, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7512, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2960, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9012, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7272, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6780, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6288, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5112, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7128, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5712, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_2kgx_1991_1_007380896\tmp\xaakm.namelists Image PC Routine Line Source hadam3p_eu_um_6.0 00DEA39A Unknown Unknown Unknown hadam3p_eu_um_6.0 00D92CD0 Unknown Unknown Unknown hadam3p_eu_um_6.0 00D91E9A Unknown Unknown Unknown hadam3p_eu_um_6.0 00D72819 Unknown Unknown Unknown hadam3p_eu_um_6.0 00C72287 Unknown Unknown Unknown hadam3p_eu_um_6.0 00D0E7B2 Unknown Unknown Unknown hadam3p_eu_um_6.0 00D0F2DA Unknown Unknown Unknown hadam3p_eu_um_6.0 00A89BD2 Unknown Unknown Unknown hadam3p_eu_um_6.0 00DCE638 Unknown Unknown Unknown kernel32.dll 76634B29 Unknown Unknown Unknown ntdll.dll 77DAE1C6 Unknown Unknown Unknown ntdll.dll 77DAE199 Unknown Unknown Unknown forrtl: severe (24): end-of-file during read, unit 9, file C:\ProgramData\BOINC\projects\climateprediction.net\hadam3p_eu_2kgx_1991_1_007380896\tmp\xaakg.namelists Image PC Routine Line Source hadrm3p_eu_um_6.0 0064C52A Unknown Unknown Unknown hadrm3p_eu_um_6.0 005F4460 Unknown Unknown Unknown hadrm3p_eu_um_6.0 005F362A Unknown Unknown Unknown hadrm3p_eu_um_6.0 005D2469 Unknown Unknown Unknown hadrm3p_eu_um_6.0 004D66EB Unknown Unknown Unknown hadrm3p_eu_um_6.0 00572AE2 Unknown Unknown Unknown hadrm3p_eu_um_6.0 005735AF Unknown Unknown Unknown hadrm3p_eu_um_6.0 00319860 Unknown Unknown Unknown hadrm3p_eu_um_6.0 00630893 Unknown Unknown Unknown kernel32.dll 76634B29 Unknown Unknown Unknown ntdll.dll 77DAE1C6 Unknown Unknown Unknown ntdll.dll 77DAE199 Unknown Unknown Unknown Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6496, selfPID=7296, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_8.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2kgx_1991_1_007380896_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
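The Stderr field above is long and repetitive, so a tally of its message types can make it easier to read. The following is a minimal sketch, not part of the CPDN or BOINC tooling: the function name `tally_stderr`, the category labels, and the abbreviated `sample` string are all hypothetical, and the patterns are simply copied from the message text in the log.

```python
import re
from collections import Counter

# Patterns copied from message text that appears in the Stderr field.
# The category names on the left are illustrative labels, not BOINC terms.
PATTERNS = {
    "heartbeat_lost": re.compile(r"No heartbeat from core client for \d+ sec"),
    "model_restart": re.compile(r"Model crash detected, will try to restart"),
    "fortran_eof": re.compile(r"forrtl: severe \(24\): end-of-file during read"),
    "upload_error": re.compile(r"<error_code>-?\d+</error_code>"),
}


def tally_stderr(text: str) -> Counter:
    """Count how often each known message type occurs in a stderr dump."""
    counts = Counter()
    for name, pattern in PATTERNS.items():
        counts[name] = len(pattern.findall(text))
    return counts


# Abbreviated sample assembled from lines in the log above (assumption:
# the real input would be the full contents of the Stderr field).
sample = (
    "10:12:07 (5396): No heartbeat from core client for 30 sec - exiting "
    "10:12:08 (5396): No heartbeat from core client for 30 sec - exiting "
    "Model crash detected, will try to restart... "
    "forrtl: severe (24): end-of-file during read, unit 9 "
    "<error_code>-161</error_code>"
)

if __name__ == "__main__":
    for name, count in tally_stderr(sample).items():
        print(f"{name}: {count}")
```

Run against the full Stderr text, a tally like this would separate the three kinds of event visible in the log: lost heartbeats from the core client, model restarts, and the two Fortran end-of-file failures that precede `Called boinc_finish`.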
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
14 Sep 2011 17:14:14 | 725427 | 13180153 | hadam3p_eu_2kgx_1991_1_007380896_0 | 80,736 | 368,204 | 4.5606 |
11 Sep 2011 12:01:28 | 725427 | 13180153 | hadam3p_eu_2kgx_1991_1_007380896_0 | 69,216 | 316,581 | 4.5738 |
29 Aug 2011 19:54:56 | 725427 | 13180153 | hadam3p_eu_2kgx_1991_1_007380896_0 | 57,696 | 264,731 | 4.5884 |
28 Aug 2011 17:22:05 | 725427 | 13180153 | hadam3p_eu_2kgx_1991_1_007380896_0 | 46,176 | 212,307 | 4.5978 |
27 Aug 2011 15:01:11 | 725427 | 13180153 | hadam3p_eu_2kgx_1991_1_007380896_0 | 34,656 | 161,159 | 4.6502 |
25 Aug 2011 15:03:39 | 725427 | 13180153 | hadam3p_eu_2kgx_1991_1_007380896_0 | 23,136 | 108,722 | 4.6993 |
07 Aug 2011 09:48:50 | 725427 | 13180153 | hadam3p_eu_2kgx_1991_1_007380896_0 | 11,616 | 56,202 | 4.8383 |
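The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the seven trickle rows. The relationship is inferred from the numbers, not taken from CPDN documentation, and the names `TRICKLES` and `sec_per_timestep` are hypothetical.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table,
# newest row first.
TRICKLES = [
    (80_736, 368_204, 4.5606),
    (69_216, 316_581, 4.5738),
    (57_696, 264_731, 4.5884),
    (46_176, 212_307, 4.5978),
    (34_656, 161_159, 4.6502),
    (23_136, 108_722, 4.6993),
    (11_616, 56_202, 4.8383),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep (assumed definition)."""
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time, reported in TRICKLES:
        computed = sec_per_timestep(cpu_time, timestep)
        print(f"{timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under that assumption, each computed value agrees with the reported average to four decimal places, and the falling average shows the per-timestep cost dropping slightly as the run progressed.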
©2024 cpdn.org