Name | hadcm3n_z9uj_1880_40_008249839_2 |
Workunit | 8404963 |
Created | 22 Nov 2012, 2:40:58 UTC |
Sent | 22 Nov 2012, 2:42:16 UTC |
Report deadline | 21 Feb 2013, 10:09:27 UTC |
Received | 10 Jan 2013, 21:36:07 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1183189 |
Run time | 9 days 2 hours 45 min 16 sec |
CPU time | 8 days 18 hours 23 min 1 sec |
Validate state | Invalid |
Credit | 4,354.56 |
Device peak FLOPS | 2.35 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:16:42 (9724): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:27:09 (18892): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=19296, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5804, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12816, iMonCtr=1 Model crash detected, will try to restart... 23:04:43 (4708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 05:03:34 (9012): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:02:23 (12472): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:01:16 (17140): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=23256, iMonCtr=1 Model crash detected, will try to restart... 
20:52:20 (4384): No heartbeat from core client for 30 sec - exiting 20:52:21 (4384): No heartbeat from core client for 30 sec - exiting 20:52:23 (4384): No heartbeat from core client for 30 sec - exiting 20:52:24 (4384): No heartbeat from core client for 30 sec - exiting 20:52:25 (4384): No heartbeat from core client for 30 sec - exiting 20:52:26 (4384): No heartbeat from core client for 30 sec - exiting 20:52:27 (4384): No heartbeat from core client for 30 sec - exiting 20:52:28 (4384): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5772, iMonCtr=1 Model crash detected, will try to restart... 15:51:12 (8764): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:38:33 (9068): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13088, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7752, iMonCtr=1 Model crash detected, will try to restart... 21:03:38 (20200): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:02:29 (21136): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:01:25 (25720): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:00:15 (30788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:59:18 (36712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:58:08 (42596): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:21:27 (7372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 18:20:23 (10612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:19:19 (14164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:58:46 (14016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:57:46 (11228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:42:46 (5180): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:41:41 (11328): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:40:39 (15356): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=23136, iMonCtr=1 Model crash detected, will try to restart... 00:36:11 (3972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:35:06 (10604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:33:59 (10788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:32:58 (20316): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:31:57 (25752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9796, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5208, iMonCtr=1 Model crash detected, will try to restart... 03:37:47 (5812): No heartbeat from core client for 30 sec - exiting 03:37:48 (5812): No heartbeat from core client for 30 sec - exiting 03:37:49 (5812): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:43:18 (2500): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... </stderr_txt> ]]> |
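
The exit status above is shown twice, as the signed value -226 and as the 32-bit hex value 0xFFFFFF1E. A minimal sketch of how the two forms relate, assuming the hex form is simply the 32-bit two's-complement encoding of the signed code (the helper names are hypothetical and not part of BOINC):

```python
# Hypothetical helpers, not BOINC code: they assume the hex value on the page
# is the 32-bit two's-complement form of the signed exit status.

def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def to_unsigned32_hex(value: int) -> str:
    """Format a signed exit status as an 8-digit 32-bit hex string."""
    return f"0x{value & 0xFFFFFFFF:08X}"


if __name__ == "__main__":
    print(to_signed32(0xFFFFFF1E))   # -226
    print(to_unsigned32_hex(-226))   # 0xFFFFFF1E
```

Both directions agree with the pair reported in the Exit status row.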
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
09 Jan 2013 23:26:58 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 362,880 | 743,616 | 2.0492 |
20 Dec 2012 01:49:41 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 336,960 | 691,815 | 2.0531 |
19 Dec 2012 11:07:24 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 311,040 | 637,614 | 2.0499 |
17 Dec 2012 23:35:04 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 285,120 | 583,939 | 2.0480 |
17 Dec 2012 09:44:53 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 259,200 | 533,267 | 2.0574 |
16 Dec 2012 03:03:22 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 233,280 | 482,660 | 2.0690 |
14 Dec 2012 00:23:36 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 207,360 | 432,258 | 2.0846 |
13 Dec 2012 17:35:07 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 181,440 | 379,745 | 2.0930 |
13 Dec 2012 17:35:07 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 155,520 | 329,447 | 2.1184 |
13 Dec 2012 17:35:07 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 129,600 | 279,943 | 2.1601 |
13 Dec 2012 17:35:07 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 103,680 | 223,863 | 2.1592 |
29 Nov 2012 02:31:09 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 77,760 | 167,979 | 2.1602 |
28 Nov 2012 10:07:04 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 51,840 | 106,987 | 2.0638 |
26 Nov 2012 00:08:17 | 1183189 | 15450651 | hadcm3n_z9uj_1880_40_008249839_2 | 25,920 | 50,249 | 1.9386 |
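
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. A minimal sketch that recomputes it for three rows copied from the table, assuming that definition (the function name and row selection are illustrative only):

```python
# Assumption: Average (sec/TS) = cumulative CPU time (sec) / cumulative timesteps.
# The rows below are copied from the trickle table: (timestep, cpu_seconds, reported_average).
ROWS = [
    (362_880, 743_616, 2.0492),
    (336_960, 691_815, 2.0531),
    (25_920, 50_249, 1.9386),
]


def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Return the average CPU seconds spent per model timestep."""
    return cpu_seconds / timestep


if __name__ == "__main__":
    for timestep, cpu_seconds, reported in ROWS:
        computed = seconds_per_timestep(cpu_seconds, timestep)
        print(f"{timestep:>7} TS  computed {computed:.4f}  reported {reported:.4f}")
```

The timestep column also advances by 25,920 per trickle, so the last trickle at 362,880 timesteps is the fourteenth, matching the fourteen rows listed.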