Name | hadam3p_saf_12ar_1962_1_006902011_1 |
Workunit | 7105327 |
Created | 26 Mar 2011, 8:59:16 UTC |
Sent | 26 Mar 2011, 9:39:49 UTC |
Report deadline | 7 Mar 2012, 14:59:49 UTC |
Received | 15 Jul 2011, 16:53:50 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 1141103 |
Run time | 4 days 11 hours 36 min 21 sec |
CPU time | 3 days 9 hours 56 min 48 sec |
Validate state | Invalid |
Credit | 1,683.45 |
Device peak FLOPS | 2.28 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1596, selfPID=1596, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4056, selfPID=2856, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5920, selfPID=5920, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1384, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5052, selfPID=5052, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5360, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5484, selfPID=5972, iMonCtr=1 Model crash detected, will try to restart... CCPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2680, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3776, selfPID=3584, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2904, iMonCtr=2 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4668, selfPID=3596, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5272, iMonCtr=2 Model crash detected, will try to restart... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3248, selfPID=2992, iMonCtr=1 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3600, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3988, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 16:54:32 (2624): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3020, selfPID=3020, iMonCtr=2 CGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5008, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2756, selfPID=2756, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3256, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5772, selfPID=5772, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3100, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4716, selfPID=5448, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1048, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... ControlCPDN Monitor - Quit request from BOINC... No Process Handle Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5200, selfPID=5200, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6044, selfPID=6044, iMonCtr=2 GCPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5652, selfPID=5192, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5400, selfPID=5400, iMonCtr=2 CPDN Monitor - Quit request from BOINC... C </stderr_txt> ]]> |
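
The exit status above is reported both as a signed decimal (-226) and as a hexadecimal value (0xFFFFFF1E). The two forms appear to be the same 32-bit value read as signed and unsigned. A minimal sketch of that conversion follows, assuming a 32-bit two's complement representation; the helper names `to_unsigned32_hex` and `to_signed32` are illustrative and not part of BOINC.

```python
def to_unsigned32_hex(code: int) -> str:
    """Format a signed exit status as 8-digit uppercase hex (assumes 32-bit width)."""
    return f"0x{code & 0xFFFFFFFF:08X}"


def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed two's complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


if __name__ == "__main__":
    # Values taken from the "Exit status" row of this result page.
    print(to_unsigned32_hex(-226))
    print(to_signed32(0xFFFFFF1E))
```

Running either helper on the value from the "Exit status" row should reproduce the other form shown there.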
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
25 Jul 2011 14:22:42 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 103,776 | 292,186 | 2.8155 |
08 Jul 2011 14:40:08 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 92,256 | 261,391 | 2.8333 |
19 Jun 2011 22:01:03 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 80,736 | 230,721 | 2.8577 |
10 Jun 2011 15:42:26 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 69,216 | 198,501 | 2.8678 |
03 Jun 2011 12:54:51 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 57,696 | 167,698 | 2.9066 |
29 May 2011 16:57:33 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 46,197 | 136,574 | 2.9563 |
29 May 2011 13:54:28 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 46,188 | 136,116 | 2.9470 |
27 May 2011 16:57:25 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 46,176 | 135,667 | 2.9380 |
20 May 2011 17:02:21 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 34,656 | 102,339 | 2.9530 |
11 May 2011 20:26:09 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 23,136 | 69,712 | 3.0131 |
22 Apr 2011 07:03:09 | 1141103 | 12725665 | hadam3p_saf_12ar_1962_1_006902011_1 | 11,616 | 37,079 | 3.1921 |
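
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table; the function name `sec_per_timestep` is hypothetical, and the rounding rule is an assumption inferred from the displayed values.

```python
def sec_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimals (assumed rule)."""
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, average reported in the table)
rows = [
    (103_776, 292_186, 2.8155),  # 25 Jul 2011 trickle
    (92_256, 261_391, 2.8333),   # 08 Jul 2011 trickle
    (11_616, 37_079, 3.1921),    # 22 Apr 2011 trickle
]

if __name__ == "__main__":
    for timesteps, cpu_seconds, reported in rows:
        computed = sec_per_timestep(cpu_seconds, timesteps)
        print(f"{timesteps:>7} steps: computed {computed:.4f}, reported {reported:.4f}")
```

Under this reading, the falling average from 3.1921 to 2.8155 means the host spent less CPU time per timestep as the run progressed.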