Name | hadsm3dhet2_jrm3_006598781_4 |
Workunit | 6802154 |
Created | 15 Mar 2010, 12:05:36 UTC |
Sent | 25 Jun 2010, 16:08:17 UTC |
Report deadline | 7 Jun 2011, 21:28:17 UTC |
Received | 21 Jul 2010, 21:05:25 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1051426 |
Run time | 6 days 17 hours 33 min 42 sec |
CPU time | 5 days 12 hours 59 min 9 sec |
Validate state | Invalid |
Credit | 1,488.65 |
Device peak FLOPS | 1.33 GFLOPS |
Application version | UK Met Office HadSM3 Slab Model v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1176, iMonCtr=1 Model crash detected, will try to restart... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5380, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5380, iMonCtr=1 Model crash detected, will try to restart... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7812, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7812, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7812, iMonCtr=1 Model crash detected, will try to restart... </stderr_txt> ]]> |
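The exit status row gives the same code twice: as the signed integer -226 and as the hexadecimal value 0xFFFFFF1E. A minimal sketch of that relationship, assuming the hexadecimal form is the 32-bit two's-complement representation of the signed code (the helper names `to_unsigned32_hex` and `to_signed32` are illustrative and not part of BOINC):

```python
def to_unsigned32_hex(code: int) -> str:
    """Format a signed exit code as an 8-digit, 32-bit two's-complement hex string."""
    return f"0x{code & 0xFFFFFFFF:08X}"


def to_signed32(value: int) -> int:
    """Interpret a 32-bit unsigned value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(to_unsigned32_hex(-226))   # 0xFFFFFF1E
print(to_signed32(0xFFFFFF1E))   # -226
```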
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
17 Jul 2010 10:05:03 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 162,030 | 459,345 | 2.8349 |
16 Jul 2010 01:27:44 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 151,228 | 417,912 | 2.7635 |
14 Jul 2010 08:57:41 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 140,426 | 377,840 | 2.6907 |
12 Jul 2010 06:46:48 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 129,624 | 336,795 | 2.5982 |
09 Jul 2010 01:17:50 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 118,822 | 296,450 | 2.4949 |
08 Jul 2010 06:13:34 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 108,020 | 255,373 | 2.3641 |
07 Jul 2010 08:15:33 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 97,218 | 213,957 | 2.2008 |
06 Jul 2010 03:34:14 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 86,416 | 186,520 | 2.1584 |
02 Jul 2010 10:44:18 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 75,614 | 163,731 | 2.1654 |
02 Jul 2010 04:51:03 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 64,812 | 141,998 | 2.1909 |
01 Jul 2010 13:49:08 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 54,010 | 121,126 | 2.2427 |
01 Jul 2010 06:39:37 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 43,208 | 97,794 | 2.2633 |
01 Jul 2010 00:08:51 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 32,406 | 75,884 | 2.3417 |
30 Jun 2010 09:35:54 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 21,604 | 51,432 | 2.3807 |
27 Jun 2010 13:54:06 | 1051426 | 11049542 | hadsm3dhet2_jrm3_006598781_4 | 10,802 | 25,869 | 2.3948 |
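The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That is an inference from the values in the table, not something the page states. A short sketch that recomputes it for three rows copied from the table (the function name `sec_per_timestep` is illustrative):

```python
# (timestep, cpu_time_sec, reported_average) copied from three rows of the trickle table
rows = [
    (10_802, 25_869, 2.3948),
    (21_604, 51_432, 2.3807),
    (162_030, 459_345, 2.8349),
]


def sec_per_timestep(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = sec_per_timestep(timestep, cpu_time_sec)
    print(f"{timestep:>7} timesteps: {computed:.4f} sec/TS (reported {reported})")
```

Read this way, the rise from about 2.39 to about 2.83 sec/TS over the run means that the later timesteps cost more CPU time each than the earlier ones.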