Name | hadam3p_eu_2rio_1993_1_008208687_0 |
Workunit | 8363811 |
Created | 3 Oct 2012, 17:09:39 UTC |
Sent | 3 Oct 2012, 17:09:55 UTC |
Report deadline | 15 Sep 2013, 22:29:55 UTC |
Received | 7 Nov 2012, 21:40:30 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1092573 |
Run time | 3 days 10 hours 43 min 24 sec |
CPU time | 2 days 15 hours 56 min 39 sec |
Validate state | Invalid |
Credit | 1,591.55 |
Device peak FLOPS | 2.81 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5484, selfPID=5016, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3444, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5296, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... 21:06:09 (4440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:06:10 (4440): No heartbeat from core client for 30 sec - exiting 21:06:11 (4440): No heartbeat from core client for 30 sec - exiting 21:06:12 (4440): No heartbeat from core client for 30 sec - exiting 21:06:13 (4440): No heartbeat from core client for 30 sec - exiting 21:01:08 (2172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1924, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4100, selfPID=4696, iMonCtr=1 Model crash detected, will try to restart... 07:30:57 (5212): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:01:04 (4368): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:43:45 (5008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1584, selfPID=4644, iMonCtr=1 Model crash detected, will try to restart... 
21:01:31 (4880): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:06:34 (7172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1640, selfPID=1640, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7160, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 07:44:31 (5188): No heartbeat from core client for 30 sec - exiting 07:44:32 (5188): No heartbeat from core client for 30 sec - exiting 07:44:33 (5188): No heartbeat from core client for 30 sec - exiting 07:44:34 (5188): No heartbeat from core client for 30 sec - exiting 07:44:35 (5188): No heartbeat from core client for 30 sec - exiting 07:44:37 (5188): No heartbeat from core client for 30 sec - exiting 07:44:38 (5188): No heartbeat from core client for 30 sec - exiting 07:44:39 (5188): No heartbeat from core client for 30 sec - exiting 07:44:40 (5188): No heartbeat from core client for 30 sec - exiting 07:44:41 (5188): No heartbeat from core client for 30 sec - exiting 07:44:42 (5188): No heartbeat from core client for 30 sec - exiting 07:44:43 (5188): No heartbeat from core client for 30 sec - exiting 07:44:44 (5188): No heartbeat from core client for 30 sec - exiting 07:44:45 (5188): No heartbeat from core client for 30 sec - exiting 07:44:46 (5188): No heartbeat from core client for 30 sec - exiting 07:44:47 (5188): No heartbeat from core client for 30 sec - exiting 07:44:49 (5188): No heartbeat from core client for 30 sec - exiting 07:44:50 (5188): No heartbeat from core client for 30 sec - exiting 07:44:51 (5188): No heartbeat from core client for 30 sec - exiting 07:44:52 (5188): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2272, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5100, selfPID=3308, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5004, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5968, selfPID=928, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3304, selfPID=5328, iMonCtr=1 Model crash detected, will try to restart... 10:32:34 (4352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1092, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4400, selfPID=2832, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4604, iMonCtr=2 Model crash detected, will try to restart... 21:01:01 (4628): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:37:22 (4948): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6068, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2440, iMonCtr=2 Model crash detected, will try to restart... 
</stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_9.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_10.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
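The `<message>` block at the end of the stderr output lists four upload failures, one per `<file_xfer_error>` element. A minimal sketch of pulling the file names and error codes out of that text follows. The function name `parse_xfer_errors` and the regular expression are illustrative assumptions, not part of BOINC or CPDN tooling. The code -161 is believed to correspond to `ERR_NOT_FOUND` in BOINC's error numbering, but that mapping should be checked against the BOINC sources before relying on it.

```python
import re

# Text copied from the <message> block of the stderr output above.
message = (
    "<file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_9.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "<file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_10.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "<file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_11.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "<file_xfer_error> <file_name>hadam3p_eu_2rio_1993_1_008208687_0_12.zip</file_name> "
    "<error_code>-161</error_code> </file_xfer_error>"
)

# Hypothetical pattern: one match per <file_xfer_error> element,
# capturing the file name and the signed integer error code.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(.*?)</file_name>\s*"
    r"<error_code>(-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)

def parse_xfer_errors(text):
    """Return a list of (file_name, error_code) tuples found in text."""
    return [(name, int(code)) for name, code in XFER_ERROR.findall(text)]

errors = parse_xfer_errors(message)
for name, code in errors:
    print(f"{name}: error {code}")
```

All four entries carry the same code, which suggests a single cause for the failed uploads rather than four independent faults.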
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
06 Nov 2012 12:01:04 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 92,260 | 215,300 | 2.3336 |
06 Nov 2012 07:05:27 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 92,256 | 214,942 | 2.3298 |
03 Nov 2012 21:18:24 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 80,736 | 188,169 | 2.3307 |
03 Nov 2012 12:40:19 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 69,216 | 161,680 | 2.3359 |
01 Nov 2012 09:56:44 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 57,696 | 134,419 | 2.3298 |
29 Oct 2012 20:28:34 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 46,176 | 107,572 | 2.3296 |
27 Oct 2012 22:32:52 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 34,656 | 80,774 | 2.3307 |
26 Oct 2012 18:44:41 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 23,136 | 54,115 | 2.3390 |
25 Oct 2012 19:26:35 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 11,619 | 27,771 | 2.3901 |
25 Oct 2012 09:39:55 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 11,617 | 27,413 | 2.3597 |
25 Oct 2012 06:54:38 | 1092573 | 15324458 | hadam3p_eu_2rio_1993_1_008208687_0 | 11,616 | 27,071 | 2.3305 |
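
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep reached. The sketch below checks that reading against a few rows copied from the table. The helper name `sec_per_timestep` is an assumption made for illustration, and the rounding to four decimal places is inferred from the displayed values.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
trickles = [
    (92260, 215300, 2.3336),
    (92256, 214942, 2.3298),
    (11619, 27771, 2.3901),
    (11616, 27071, 2.3305),
]

def sec_per_timestep(cpu_time_sec, timestep):
    """Average CPU seconds spent per model timestep so far."""
    return cpu_time_sec / timestep

for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Compare at the table's apparent precision of four decimal places.
    print(f"timestep {timestep}: computed {computed:.4f}, reported {reported:.4f}")
```

For the sampled rows the computed value agrees with the reported one at four decimal places, which supports the division reading of the column.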