Name | hadam3p_pnw_q7u2_2035_1_008368814_0 |
Workunit | 8519673 |
Created | 14 May 2013, 12:32:12 UTC |
Sent | 14 May 2013, 12:33:24 UTC |
Report deadline | 26 Apr 2014, 17:53:24 UTC |
Received | 11 Jun 2013, 8:28:33 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1115566 |
Run time | 4 days 12 hours 17 min 21 sec |
CPU time | 2 days 23 hours 33 min 4 sec |
Validate state | Invalid |
Credit | 753.03 |
Device peak FLOPS | 0.93 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.64</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> 08:06:22 (3984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:06:24 (3984): No heartbeat from core client for 30 sec - exiting 09:16:22 (5472): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:25:25 (2804): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:25:27 (2804): No heartbeat from core client for 30 sec - exiting 10:25:28 (2804): No heartbeat from core client for 30 sec - exiting 10:25:29 (2804): No heartbeat from core client for 30 sec - exiting 10:25:30 (2804): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 10:59:46 (1664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:07:53 (588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:07:54 (588): No heartbeat from core client for 30 sec - exiting 12:07:55 (588): No heartbeat from core client for 30 sec - exiting 13:16:08 (2832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:16:09 (2832): No heartbeat from core client for 30 sec - exiting 13:16:10 (2832): No heartbeat from core client for 30 sec - exiting 13:16:11 (2832): No heartbeat from core client for 30 sec - exiting 13:16:12 (2832): No heartbeat from core client for 30 sec - exiting 13:16:13 (2832): No heartbeat from core client for 30 sec - exiting 13:50:01 (460): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:23:58 (5920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:57:48 (5312): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5268, selfPID=5268, iMonCtr=2 15:31:39 (5916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:45:15 (2600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:19:07 (4788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:00:16 (5988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:35:53 (3732): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:46:00 (3276): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2092, selfPID=5032, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 0 10:21:59 (1036): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:22:01 (1036): No heartbeat from core client for 30 sec - exiting 11:35:24 (4248): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1224, selfPID=1224, iMonCtr=2 12:49:53 (6060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:24:50 (5868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:33:22 (396): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3652, selfPID=3652, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 
17:01:14 (5744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:53:28 (5968): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:45:07 (5516): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:45:09 (5516): No heartbeat from core client for 30 sec - exiting 18:45:10 (5516): No heartbeat from core client for 30 sec - exiting 18:45:11 (5516): No heartbeat from core client for 30 sec - exiting 18:45:12 (5516): No heartbeat from core client for 30 sec - exiting 19:37:05 (3632): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:29:05 (2288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:06:16 (5844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6132, selfPID=6132, iMonCtr=2 21:55:53 (7892): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1084, selfPID=1084, iMonCtr=2 22:43:48 (4204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:45:25 (4644): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:34:13 (5744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:10:45 (5796): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:55:20 (5664): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6096, selfPID=6096, iMonCtr=2 20:36:24 (5656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:13:40 (5376): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:50:00 (4020): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 09:06:10 (4628): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:31:54 (4972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5304, selfPID=5304, iMonCtr=2 12:40:35 (4764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:23:32 (5468): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:06:39 (7132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:49:44 (2404): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5776, selfPID=5776, iMonCtr=2 15:33:20 (5836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:16:21 (6376): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:42:39 (7152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:18:01 (6212): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4520, selfPID=4520, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 18:42:50 (5212): No heartbeat from core client for 30 sec - exiting 18:42:51 (5212): No heartbeat from core client for 30 sec - exiting 18:42:52 (5212): No heartbeat from core client for 30 sec - exiting 18:42:53 (5212): No heartbeat from core client for 30 sec - exiting 18:42:54 (5212): No heartbeat from core client for 30 sec - exiting 18:42:56 (5212): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:49:34 (4128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:54:50 (5444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:59:59 (5724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:05:11 (3872): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:05:12 (3872): No heartbeat from core client for 30 sec - exiting 00:37:46 (684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:02:28 (7044): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:36:19 (8056): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:00:25 (5176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:32:54 (6908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:38:14 (6496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=7416, selfPID=7416, iMonCtr=2 18:10:53 (2644): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:54:21 (6268): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:59:30 (6908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3436, selfPID=3436, iMonCtr=2 22:32:32 (6840): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:05:15 (7016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:37:40 (6668): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:10:28 (1152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 10:54:49 (5868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:54:50 (5868): No heartbeat from core client for 30 sec - exiting 10:54:51 (5868): No heartbeat from core client for 30 sec - exiting 10:54:52 (5868): No heartbeat from core client for 30 sec - exiting 10:54:54 (5868): No heartbeat from core client for 30 sec - exiting 10:54:55 (5868): No heartbeat from core client for 30 sec - exiting 10:54:56 (5868): No heartbeat from core client for 30 sec - exiting 13:10:29 (7460): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:20:08 (5040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:02:12 (3908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
16:35:13 (7708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5356, selfPID=5356, iMonCtr=2 17:08:53 (5908): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:42:08 (7700): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:15:28 (6440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:56:01 (7604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:29:24 (8112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5912, selfPID=1996, iMonCtr=1 Model crash detected, will try to restart... 18:11:11 (4640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:46:31 (3656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:31:50 (2896): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 16:26:55 (5660): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:15:56 (5248): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:17:46 (6828): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:17:47 (6828): No heartbeat from core client for 30 sec - exiting </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
08 Jun 2013 15:41:36 | 1115566 | 15783597 | hadam3p_pnw_q7u2_2035_1_008368814_0 | 34,656 | 202,576 | 5.8453 |
01 Jun 2013 16:07:37 | 1115566 | 15783597 | hadam3p_pnw_q7u2_2035_1_008368814_0 | 23,142 | 131,012 | 5.6612 |
01 Jun 2013 15:52:11 | 1115566 | 15783597 | hadam3p_pnw_q7u2_2035_1_008368814_0 | 23,137 | 129,975 | 5.6176 |
01 Jun 2013 15:16:58 | 1115566 | 15783597 | hadam3p_pnw_q7u2_2035_1_008368814_0 | 23,136 | 129,012 | 5.5762 |
19 May 2013 20:24:52 | 1115566 | 15783597 | hadam3p_pnw_q7u2_2035_1_008368814_0 | 11,619 | 64,580 | 5.5581 |
19 May 2013 19:39:08 | 1115566 | 15783597 | hadam3p_pnw_q7u2_2035_1_008368814_0 | 11,617 | 63,798 | 5.4918 |
19 May 2013 18:47:46 | 1115566 | 15783597 | hadam3p_pnw_q7u2_2035_1_008368814_0 | 11,616 | 63,104 | 5.4325 |
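
The Average (sec/TS) column in the trickle table above appears to be the CPU Time (sec) divided by the Timestep count. The page does not state this, so it is an assumption inferred from the numbers. The sketch below recomputes the ratio for each row; the names `TRICKLES`, `average_sec_per_timestep`, and `check_rows` are hypothetical and used only for illustration.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
TRICKLES = [
    (34656, 202576, 5.8453),
    (23142, 131012, 5.6612),
    (23137, 129975, 5.6176),
    (23136, 129012, 5.5762),
    (11619, 64580, 5.5581),
    (11617, 63798, 5.4918),
    (11616, 63104, 5.4325),
]


def average_sec_per_timestep(cpu_time_sec, timestep):
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_time_sec / timestep


def check_rows(rows, tolerance=1e-4):
    """Return one boolean per row: does the recomputed ratio match the reported value?"""
    return [
        abs(average_sec_per_timestep(cpu, ts) - reported) <= tolerance
        for ts, cpu, reported in rows
    ]


if __name__ == "__main__":
    for (ts, cpu, reported), ok in zip(TRICKLES, check_rows(TRICKLES)):
        print(f"timestep={ts} cpu={cpu} reported={reported} matches={ok}")
```

Under this assumption every row agrees with the reported value to four decimal places, and the average rises from about 5.43 to about 5.85 sec/TS between the earliest and latest trickle.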