Name | hadam3p_pnw_6xkx_2000_1_007592105_1 |
Workunit | 7770235 |
Created | 23 Dec 2011, 11:54:20 UTC |
Sent | 23 Dec 2011, 12:03:43 UTC |
Report deadline | 4 Dec 2012, 17:23:43 UTC |
Received | 12 Feb 2012, 9:21:52 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 725427 |
Run time | 10 days 14 hours 33 min 3 sec |
CPU time | 8 days 1 hours 52 min 12 sec |
Validate state | Invalid |
Credit | 2,755.56 |
Device peak FLOPS | 2.16 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <stderr_txt> 13:05:01 (7096): No heartbeat from core client for 30 sec - exiting 13:05:02 (7096): No heartbeat from core client for 30 sec - exiting 13:05:03 (7096): No heartbeat from core client for 30 sec - exiting 13:05:04 (7096): No heartbeat from core client for 30 sec - exiting 13:05:05 (7096): No heartbeat from core client for 30 sec - exiting 13:05:06 (7096): No heartbeat from core client for 30 sec - exiting 13:05:07 (7096): No heartbeat from core client for 30 sec - exiting 13:05:08 (7096): No heartbeat from core client for 30 sec - exiting 13:05:09 (7096): No heartbeat from core client for 30 sec - exiting 13:05:10 (7096): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10164, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9292, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 0 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10172, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9400, iMonCtr=2 Leaving CPDN_Main::Monitor... 21:48:33 (3408): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1264, selfPID=1264, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6208, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4660, iMonCtr=2 Model crash detected, will try to restart... 
08:45:12 (4508): No heartbeat from core client for 30 sec - exiting 08:45:13 (4508): No heartbeat from core client for 30 sec - exiting 08:45:14 (4508): No heartbeat from core client for 30 sec - exiting 08:45:15 (4508): No heartbeat from core client for 30 sec - exiting 08:45:16 (4508): No heartbeat from core client for 30 sec - exiting 08:45:17 (4508): No heartbeat from core client for 30 sec - exiting 08:45:18 (4508): No heartbeat from core client for 30 sec - exiting 08:45:19 (4508): No heartbeat from core client for 30 sec - exiting 08:45:20 (4508): No heartbeat from core client for 30 sec - exiting 08:45:21 (4508): No heartbeat from core client for 30 sec - exiting 08:45:22 (4508): No heartbeat from core client for 30 sec - exiting 08:45:23 (4508): No heartbeat from core client for 30 sec - exiting 08:45:24 (4508): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2564, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3556, selfPID=6852, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8936, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=10004, iMonCtr=2 Model crash detected, will try to restart... GCobal Wooketroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6132, iMonCtr=2 Model crash detected, will try to restart... r:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=2 Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 1 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9268, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Regional yearly means requires 12 input files got 1 CGntroller:: CPDN procels bas not running, exiting, bRetVal = 1, checkPID=0, selfPID=6752, iMonCtr=2 Model crash detected, will try to restart... l Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7236, iMonCtr=2 Leaving CPDN_Main::Monitor... 00:27:31 (6656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5140, selfPID=5140, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6464, selfPID=3468, iMonCtr=1 Model crash detected, will try to restart... 17:22:55 (5668): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:22:56 (5668): No heartbeat from core client for 30 sec - exiting 17:22:57 (5668): No heartbeat from core client for 30 sec - exiting 17:22:59 (5668): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3700, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3096, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6724, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6600, selfPID=6600, iMonCtr=2 GCobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=944, iMonCtr=2 ontroller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6192, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9332, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 5 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6604, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7060, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4508, selfPID=9560, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 5 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5896, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6836, selfPID=5668, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5664, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7068, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 5 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6740, iMonCtr=2 Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 6 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4580, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7764, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 6 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1056, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7140, selfPID=6776, iMonCtr=1 Model crash detected, will try to restart... Colobal Workerro:llerN prCceDNs is noss runnint , exiting ,ebRetVal b,RceecVal = 01 s clfhID=6876=0 iMonCfPr=2 3452, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=912, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7920, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 7 22:58:11 (3784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7648, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8596, selfPID=7432, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8452, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8240, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 8 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8108, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2592, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3716, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5356, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3552, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 9 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3288, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1144, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 9 09:05:05 (5756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7064, iMonCtr=2 10:14:46 (2412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=2 Mode l crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 10 23:17:39 (2988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6148, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Coltrolll WorCPDN p:: CsP DN not rcessinis not runn,ing, exiting,, bRcePID=l = selfPheckPI4D =0, selfPI Dodel cr iMondCtr=2ed , will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2412, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 11 cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_6xkx_2000_1_007592105/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadam3p_pnw_6xkx_2000_1_007592105/dataout/region_restart.day after 11 attempts Model crashed: READHIST: End of file in READ from history file for namelist NLIHISTO tmp/xaakm.pipe_dummy 2048 Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 0 Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_pnw_6xkx_2000_1_007592105_1_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
11 Feb 2012 19:18:21 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 126,816 | 682,608 | 5.3827 |
29 Jan 2012 12:31:51 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 115,296 | 620,773 | 5.3842 |
23 Jan 2012 13:06:17 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 103,776 | 559,699 | 5.3933 |
21 Jan 2012 00:03:25 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 92,266 | 498,783 | 5.4059 |
20 Jan 2012 23:03:06 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 92,256 | 497,955 | 5.3975 |
14 Jan 2012 20:22:23 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 80,736 | 436,394 | 5.4052 |
08 Jan 2012 17:37:04 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 69,216 | 374,252 | 5.4070 |
02 Jan 2012 16:31:36 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 57,696 | 311,128 | 5.3925 |
31 Dec 2011 19:11:58 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 46,176 | 250,536 | 5.4257 |
30 Dec 2011 09:36:20 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 34,661 | 190,319 | 5.4909 |
30 Dec 2011 00:32:30 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 34,656 | 189,496 | 5.4679 |
28 Dec 2011 14:41:12 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 23,136 | 126,939 | 5.4866 |
26 Dec 2011 09:05:12 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 11,618 | 64,255 | 5.5306 |
25 Dec 2011 23:53:23 | 725427 | 13812763 | hadam3p_pnw_6xkx_2000_1_007592105_1 | 11,616 | 63,507 | 5.4672 |
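
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The page does not state this; it is inferred from the numbers. A minimal sketch that checks the assumption against three rows copied from the table (the function name `sec_per_timestep` is illustrative, not part of any CPDN or BOINC tool):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
# Assumption: Average (sec/TS) = CPU Time (sec) / Timestep.
trickles = [
    (126_816, 682_608, 5.3827),
    (115_296, 620_773, 5.3842),
    (11_616, 63_507, 5.4672),
]


def sec_per_timestep(cpu_sec: float, timestep: int) -> float:
    """Return average CPU seconds spent per model timestep."""
    return cpu_sec / timestep


for timestep, cpu_sec, reported in trickles:
    computed = round(sec_per_timestep(cpu_sec, timestep), 4)
    # Print the computed value next to the one shown on the page.
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

For these three rows the computed value agrees with the reported one to four decimal places, which is consistent with the assumption but does not prove it holds for every trickle.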