Name | hadam3p_saf_0npu_1977_1_006844714_1 |
Workunit | 7048030 |
Created | 22 Apr 2012, 13:18:53 UTC |
Sent | 22 Apr 2012, 18:25:26 UTC |
Report deadline | 4 Apr 2013, 23:45:26 UTC |
Received | 24 May 2012, 14:01:33 UTC |
Server state | Over |
Outcome | Success |
Client state | Done |
Exit status | 0 (0x00000000) |
Computer ID | 1170519 |
Run time | 4 days 18 hours 20 min 59 sec |
CPU time | 4 days 10 hours 46 min 23 sec |
Validate state | Workunit error - check skipped |
Credit | 2,244.09 |
Device peak FLOPS | 2.46 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Southern Africa v6.09 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version> <![CDATA[ <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2220, selfPID=4084, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2644, selfPID=1888, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3532, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2176, selfPID=3964, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1392, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2248, selfPID=3496, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1296, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3952, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3192, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=952, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3688, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3628, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2848, selfPID=3616, iMonCtr=1 Model crash detected, will try to restart... 
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3720, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=2 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3512, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=684, selfPID=2240, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3696, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3988, selfPID=3248, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3208, selfPID=1728, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2116, selfPID=4084, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2244, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1048, selfPID=3340, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4088, iMonCtr=2 Model crash detected, will try to restart... 12:57:11 (3528): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1280, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1604, selfPID=3376, iMonCtr=1 Model crash detected, will try to restart... 
20:13:18 (3692): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3200, selfPID=1976, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2744, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2904, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4040, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2792, selfPID=3120, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=716, selfPID=3868, iMonCtr=1 Model crash detected, will try to restart... CGlobal Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3204, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1932, selfPID=3260, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3092, selfPID=3140, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2736, iMonCtr=2 Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2276, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2884, selfPID=3704, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3168, selfPID=3464, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3468, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2768, selfPID=3068, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3476, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3296, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Leaving CPDN_Main::Monitor... Called boinc_finish </stderr_txt> ]]> |
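
The Stderr field above is one long run of repeated monitor messages, so a tally by message type is easier to read than the raw text. The following is a minimal sketch, not part of the original page: the excerpt is a few messages copied from the log above, and the function name `tally_messages` and the pattern labels are hypothetical names chosen for illustration. It only counts substrings and makes no claim about what each message means inside the CPDN wrapper.

```python
import re

# Short excerpt copied from the Stderr field above; the full log is much longer.
STDERR_EXCERPT = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=2220, selfPID=4084, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Global Worker:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=1392, iMonCtr=2 "
    "Leaving CPDN_Main::Monitor... "
    "Called boinc_finish"
)

# Labels are hypothetical; the patterns are literal fragments seen in the log.
PATTERNS = {
    "controller_exit": r"Controller:: CPDN process is not running",
    "worker_exit": r"Global Worker:: CPDN process is not running",
    "restart_attempt": r"Model crash detected, will try to restart",
    "no_heartbeat": r"No heartbeat from core client",
    "suspend_request": r"Suspend request from BOINC",
    "finished": r"Called boinc_finish",
}


def tally_messages(text):
    """Count how often each known message fragment occurs in the log text."""
    return {label: len(re.findall(pattern, text))
            for label, pattern in PATTERNS.items()}


counts = tally_messages(STDERR_EXCERPT)
for label, n in counts.items():
    print(f"{label}: {n}")
```

Pasting the full Stderr text into `STDERR_EXCERPT` would give counts for the whole task; the excerpt here is kept short so the sketch stays readable.
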
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
24 May 2012 11:58:40 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 138,336 | 383,755 | 2.7741 |
23 May 2012 15:51:47 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 126,816 | 350,922 | 2.7672 |
22 May 2012 06:30:52 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 115,296 | 318,611 | 2.7634 |
21 May 2012 06:40:35 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 103,776 | 286,793 | 2.7636 |
19 May 2012 09:40:45 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 92,256 | 255,336 | 2.7677 |
14 May 2012 18:32:57 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 80,736 | 224,656 | 2.7826 |
14 May 2012 05:49:15 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 69,216 | 193,483 | 2.7954 |
09 May 2012 12:03:08 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 57,696 | 162,308 | 2.8132 |
08 May 2012 11:00:45 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 46,176 | 130,049 | 2.8164 |
07 May 2012 15:31:09 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 34,656 | 97,943 | 2.8261 |
06 May 2012 11:26:27 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 23,136 | 65,394 | 2.8265 |
29 Apr 2012 12:16:05 | 1170519 | 14561567 | hadam3p_saf_0npu_1977_1_006844714_1 | 11,616 | 33,362 | 2.8721 |
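
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against the rows of the trickle table and converts the Run time and CPU time fields to seconds for comparison. The helper names `sec_per_timestep` and `duration_to_seconds` are hypothetical and do not come from the CPDN site; the numbers are copied from the tables above.

```python
import re

# (timestep, cpu_seconds, reported_average) copied from the trickle table above.
TRICKLES = [
    (138336, 383755, 2.7741),
    (126816, 350922, 2.7672),
    (115296, 318611, 2.7634),
    (103776, 286793, 2.7636),
    (92256, 255336, 2.7677),
    (80736, 224656, 2.7826),
    (69216, 193483, 2.7954),
    (57696, 162308, 2.8132),
    (46176, 130049, 2.8164),
    (34656, 97943, 2.8261),
    (23136, 65394, 2.8265),
    (11616, 33362, 2.8721),
]

UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}


def sec_per_timestep(cpu_seconds, timestep):
    """Assumed definition of the Average column: CPU seconds per timestep."""
    return cpu_seconds / timestep


def duration_to_seconds(text):
    """Convert a string like '4 days 10 hours 46 min 23 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(day|hour|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total


for timestep, cpu, reported in TRICKLES:
    computed = sec_per_timestep(cpu, timestep)
    print(f"{timestep:>7,}  computed {computed:.4f}  reported {reported:.4f}")

run_seconds = duration_to_seconds("4 days 18 hours 20 min 59 sec")
cpu_seconds = duration_to_seconds("4 days 10 hours 46 min 23 sec")
print(f"run time {run_seconds} s, CPU time {cpu_seconds} s, "
      f"ratio {cpu_seconds / run_seconds:.3f}")
```

Under this reading every row agrees with the reported average to four decimal places, and consecutive trickles are 11,520 timesteps apart. The CPU time field converts to 384,383 seconds, 628 seconds more than the CPU time in the last trickle, and is about 93% of the run time.
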