Name | hadam3p_eu_2qvl_1976_1_007296681_0 |
Workunit | 7494105 |
Created | 15 Jun 2011, 19:23:15 UTC |
Sent | 15 Jun 2011, 19:23:25 UTC |
Report deadline | 28 May 2012, 0:43:25 UTC |
Received | 17 Oct 2011, 4:22:17 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT |
Computer ID | 1145250 |
Run time | 6 days 1 hours 32 min 42 sec |
CPU time | 2 days 13 hours 23 min 9 sec |
Validate state | Invalid |
Credit | 1,194.02 |
Device peak FLOPS | 2.92 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.60</core_client_version> <![CDATA[ <message> Got ack for job that's till active </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2852, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=436, selfPID=3788, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1040, selfPID=1040, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5284, selfPID=4228, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4728, selfPID=4728, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 03:01:26 (3588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:01:29 (3588): No heartbeat from core client for 30 sec - exiting 03:01:30 (3588): No heartbeat from core client for 30 sec - exiting 03:01:31 (3588): No heartbeat from core client for 30 sec - exiting 03:01:32 (3588): No heartbeat from core client for 30 sec - exiting 03:01:33 (3588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4224, selfPID=4868, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7104, iMonCtr=2 01:14:08 (4664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:14:09 (4664): No heartbeat from core client for 30 sec - exiting 01:14:10 (4664): No heartbeat from core client for 30 sec - exiting 01:14:11 (4664): No heartbeat from core client for 30 sec - exiting 01:14:12 (4664): No heartbeat from core client for 30 sec - exiting 01:14:13 (4664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4824, selfPID=4824, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3932, selfPID=3932, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2828, selfPID=2828, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 12:31:57 (3596): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:31:59 (3596): No heartbeat from core client for 30 sec - exiting 12:32:00 (3596): No heartbeat from core client for 30 sec - exiting 12:32:01 (3596): No heartbeat from core client for 30 sec - exiting 12:32:57 (3600): No heartbeat from core client for 30 sec - exiting 12:32:58 (3600): No heartbeat from core client for 30 sec - exiting 12:32:59 (3600): No heartbeat from core client for 30 sec - exiting 12:33:00 (3600): No heartbeat from core client for 30 sec - exiting 12:33:01 (3600): No heartbeat from core client for 30 sec - exiting 12:33:02 (3600): No heartbeat from core client for 30 sec - exiting 12:33:03 (3600): No heartbeat from core client for 30 sec - exiting 12:33:04 (3600): No heartbeat from core client for 30 sec - exiting 12:33:05 (3600): No heartbeat from core client for 30 sec - exiting 12:33:06 (3600): No heartbeat from core client for 30 sec - exiting 12:33:07 (3600): No heartbeat from core client for 30 sec - exiting 12:33:08 (3600): No heartbeat from core client for 30 sec - exiting 12:33:09 (3600): No heartbeat from core client for 30 sec - exiting 12:33:10 (3600): No heartbeat from core client for 30 sec - exiting 12:33:11 (3600): No heartbeat from core client for 30 sec - exiting 12:33:12 (3600): No heartbeat from core client for 30 sec - exiting 12:33:13 (3600): No heartbeat from core client for 30 sec - exiting 12:33:14 (3600): No heartbeat from core client for 30 sec - exiting 12:33:15 (3600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:34:03 (5960): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
13:34:05 (5960): No heartbeat from core client for 30 sec - exiting 13:34:06 (5960): No heartbeat from core client for 30 sec - exiting 13:34:07 (5960): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7720, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3676, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9028, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8988, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2620, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4360, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2160, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=2 Model crash detected, will try to restart... 17:54:51 (3648): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:55:06 (3012): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:55:07 (3012): No heartbeat from core client for 30 sec - exiting Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6232, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=15272, selfPID=15272, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5500, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1264, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=8708, selfPID=8708, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=10320, selfPID=10448, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2364, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4456, selfPID=4456, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4328, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2612, selfPID=2612, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4260, selfPID=4260, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=532, selfPID=4340, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... GCobal Worker:: CPDN process is not ronning, exi:in: CPDN process, c eis not r selfPID=4344, iMonCtr=2 = 1, checkPID=0, selfPID=4424, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4172, selfPID=4172, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3600, iMonCtr=2 Model crash detected, will try to restart... 00:35:09 (4036): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:35:10 (4036): No heartbeat from core client for 30 sec - exiting 00:35:11 (4036): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4836, selfPID=4836, iMonCtr=2 CPDN Monitor - Quit request from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4840, selfPID=3828, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3116, selfPID=4672, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=564, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4464, selfPID=1192, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4220, selfPID=4220, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4976, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3608, selfPID=3608, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3460, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4792, iMonCtr= 2 del crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4488, selfPID=4772, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=704, selfPID=704, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... ContrCPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 04:51:17 (2120): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:16:45 (3592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:48:11 (4480): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 20:26:27 (4824): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
20:26:32 (4824): No heartbeat from core client for 30 sec - exiting 20:26:33 (4824): No heartbeat from core client for 30 sec - exiting 20:26:34 (4824): No heartbeat from core client for 30 sec - exiting 20:26:35 (4824): No heartbeat from core client for 30 sec - exiting 20:26:36 (4824): No heartbeat from core client for 30 sec - exiting 20:26:37 (4824): No heartbeat from core client for 30 sec - exiting 20:26:38 (4824): No heartbeat from core client for 30 sec - exiting 20:26:39 (4824): No heartbeat from core client for 30 sec - exiting 20:28:59 (2376): No heartbeat from core client for 30 sec - exiting 20:29:00 (2376): No heartbeat from core client for 30 sec - exiting 20:29:01 (2376): No heartbeat from core client for 30 sec - exiting 20:29:02 (2376): No heartbeat from core client for 30 sec - exiting 20:29:03 (2376): No heartbeat from core client for 30 sec - exiting 20:29:04 (2376): No heartbeat from core client for 30 sec - exiting 20:29:05 (2376): No heartbeat from core client for 30 sec - exiting 20:29:06 (2376): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4332, selfPID=3348, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Signal 11 received, exiting... Called boinc_finish Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2384, selfPID=2384, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2384, selfPID=1380, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... zip error: Nothing to do! (../_1.zip) Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
10 Oct 2011 04:24:24 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 69,216 | 213,402 | 3.0831 |
24 Sep 2011 02:47:58 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 57,701 | 178,426 | 3.0923 |
23 Sep 2011 21:23:31 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 57,697 | 177,886 | 3.0831 |
23 Sep 2011 05:08:48 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 57,696 | 177,427 | 3.0752 |
07 Sep 2011 10:13:57 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 46,176 | 142,814 | 3.0928 |
27 Aug 2011 17:33:05 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 34,656 | 107,985 | 3.1159 |
08 Aug 2011 22:51:52 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 23,155 | 73,009 | 3.1531 |
08 Aug 2011 21:49:08 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 23,152 | 72,511 | 3.1320 |
08 Aug 2011 10:10:36 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 23,148 | 72,019 | 3.1112 |
08 Aug 2011 05:53:24 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 23,139 | 71,487 | 3.0895 |
07 Aug 2011 13:29:19 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 23,136 | 70,995 | 3.0686 |
25 Jul 2011 19:15:09 | 1145250 | 12982243 | hadam3p_eu_2qvl_1976_1_007296681_0 | 11,616 | 36,115 | 3.1091 |
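
The "Average (sec/TS)" column in the trickle table appears to equal the cumulative CPU time divided by the timestep count. That relationship is inferred from the numbers shown, not taken from project documentation. A minimal Python sketch that checks it against three rows copied from the table (the function name `avg_sec_per_timestep` is hypothetical):

```python
# Rows copied from the trickle table:
# (timestep, cumulative CPU seconds, average reported on the page)
trickles = [
    (69216, 213402, 3.0831),
    (57701, 178426, 3.0923),
    (11616, 36115, 3.1091),
]


def avg_sec_per_timestep(cpu_seconds, timestep):
    """Assumed formula: cumulative CPU seconds per model timestep, to 4 decimals."""
    return round(cpu_seconds / timestep, 4)


for timestep, cpu_seconds, reported in trickles:
    computed = avg_sec_per_timestep(cpu_seconds, timestep)
    # Print the recomputed value next to the value shown on the page.
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

For these rows the recomputed values match the reported column to four decimal places, which suggests the average is cumulative over the whole run rather than per trickle interval.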