Name | hadam3p_eu_xqa8_1991_1_006996072_0 |
Workunit | 7199388 |
Created | 24 Nov 2010, 10:40:28 UTC |
Sent | 20 Jan 2011, 10:25:44 UTC |
Report deadline | 2 Jan 2012, 15:45:44 UTC |
Received | 26 Jan 2011, 9:34:06 UTC |
Server state | Over |
Outcome | No reply |
Client state | Compute error |
Exit status | 194 (0x000000C2) EXIT_ABORTED_BY_CLIENT |
Computer ID | 1118002 |
Run time | 5 days 10 hours 19 min 21 sec |
CPU time | 3 days 23 hours 5 min 27 sec |
Validate state | Invalid |
Credit | 1,392.75 |
Device peak FLOPS | 1.97 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.08 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
Got ack for job that's till active
</message>
<stderr_txt>
11:33:45 (636): No heartbeat from core client for 30 sec - exiting
11:33:46 (636): No heartbeat from core client for 30 sec - exiting
11:33:47 (636): No heartbeat from core client for 30 sec - exiting
11:33:48 (636): No heartbeat from core client for 30 sec - exiting
11:33:49 (636): No heartbeat from core client for 30 sec - exiting
11:33:50 (636): No heartbeat from core client for 30 sec - exiting
11:33:51 (636): No heartbeat from core client for 30 sec - exiting
11:33:52 (636): No heartbeat from core client for 30 sec - exiting
11:33:54 (636): No heartbeat from core client for 30 sec - exiting
11:33:55 (636): No heartbeat from core client for 30 sec - exiting
11:33:56 (636): No heartbeat from core client for 30 sec - exiting
11:33:57 (636): No heartbeat from core client for 30 sec - exiting
11:33:58 (636): No heartbeat from core client for 30 sec - exiting
11:33:59 (636): No heartbeat from core client for 30 sec - exiting
11:34:00 (636): No heartbeat from core client for 30 sec - exiting
11:34:01 (636): No heartbeat from core client for 30 sec - exiting
11:34:02 (636): No heartbeat from core client for 30 sec - exiting
11:34:03 (636): No heartbeat from core client for 30 sec - exiting
11:34:05 (636): No heartbeat from core client for 30 sec - exiting
11:34:06 (636): No heartbeat from core client for 30 sec - exiting
11:34:07 (636): No heartbeat from core client for 30 sec - exiting
11:34:08 (636): No heartbeat from core client for 30 sec - exiting
11:34:09 (636): No heartbeat from core client for 30 sec - exiting
11:34:10 (636): No heartbeat from core client for 30 sec - exiting
11:34:11 (636): No heartbeat from core client for 30 sec - exiting
11:34:12 (636): No heartbeat from core client for 30 sec - exiting
11:34:13 (636): No heartbeat from core client for 30 sec - exiting
11:34:14 (636): No heartbeat from core client for 30 sec - exiting
11:34:15 (636): No heartbeat from core client for 30 sec - exiting
11:34:17 (636): No heartbeat from core client for 30 sec - exiting
11:34:18 (636): No heartbeat from core client for 30 sec - exiting
11:34:19 (636): No heartbeat from core client for 30 sec - exiting
11:34:20 (636): No heartbeat from core client for 30 sec - exiting
11:34:21 (636): No heartbeat from core client for 30 sec - exiting
11:34:22 (636): No heartbeat from core client for 30 sec - exiting
11:34:23 (636): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2668, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4028, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7908, selfPID=5792, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6412, selfPID=5920, iMonCtr=1
Model crash detected, will try to restart...
18:12:35 (3072): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:12:38 (3072): No heartbeat from core client for 30 sec - exiting
18:12:39 (3072): No heartbeat from core client for 30 sec - exiting
18:12:41 (3072): No heartbeat from core client for 30 sec - exiting
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7200, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7632, selfPID=6960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6204, selfPID=5696, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6700, selfPID=5652, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3952, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5948, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4740, selfPID=4540, iMonCtr=1
Model crash detected, will try to restart...
BUFFOUT: C I/O Error - Return code = 32
Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/xaakm.pipe_dummy 2048
Leaving CPDN_Main::Monitor...
zip error: Output file write failure (write error on zip file)
04:33:10 (6100): called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 Jan 2011 18:18:06 | 1118002 | 12280835 | hadam3p_eu_xqa8_1991_1_006996072_0 | 80,736 | 308,957 | 3.8268 |
25 Jan 2011 00:02:10 | 1118002 | 12280835 | hadam3p_eu_xqa8_1991_1_006996072_0 | 69,216 | 264,601 | 3.8228 |
24 Jan 2011 06:48:47 | 1118002 | 12280835 | hadam3p_eu_xqa8_1991_1_006996072_0 | 57,696 | 220,810 | 3.8271 |
23 Jan 2011 15:01:49 | 1118002 | 12280835 | hadam3p_eu_xqa8_1991_1_006996072_0 | 46,176 | 176,661 | 3.8258 |
22 Jan 2011 20:54:44 | 1118002 | 12280835 | hadam3p_eu_xqa8_1991_1_006996072_0 | 34,656 | 133,074 | 3.8399 |
22 Jan 2011 05:05:17 | 1118002 | 12280835 | hadam3p_eu_xqa8_1991_1_006996072_0 | 23,136 | 89,227 | 3.8566 |
21 Jan 2011 14:43:45 | 1118002 | 12280835 | hadam3p_eu_xqa8_1991_1_006996072_0 | 11,616 | 44,840 | 3.8602 |
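
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that assumption against the rows in the table; the function and variable names are illustrative and not part of any CPDN or BOINC tooling.

```python
# Trickle rows copied from the table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
TRICKLES = [
    (80736, 308957, 3.8268),
    (69216, 264601, 3.8228),
    (57696, 220810, 3.8271),
    (46176, 176661, 3.8258),
    (34656, 133074, 3.8399),
    (23136, 89227, 3.8566),
    (11616, 44840, 3.8602),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per cumulative timestep."""
    return cpu_time_sec / timestep


def max_deviation(rows) -> float:
    """Largest absolute gap between the computed and the reported average."""
    return max(abs(sec_per_timestep(cpu, ts) - avg) for ts, cpu, avg in rows)


if __name__ == "__main__":
    for ts, cpu, avg in TRICKLES:
        print(f"timestep {ts:>6}: computed {sec_per_timestep(cpu, ts):.4f}, reported {avg:.4f}")
    print(f"max deviation: {max_deviation(TRICKLES):.6f}")
```

If the assumption holds, every computed value should agree with the reported one to within the four-decimal rounding used in the table.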