Name | hadam3p_pnw_q7nq_2031_1_008353327_0 |
Workunit | 8504186 |
Created | 19 Apr 2013, 16:24:57 UTC |
Sent | 19 Apr 2013, 16:28:51 UTC |
Report deadline | 1 Apr 2014, 21:48:51 UTC |
Received | 1 Jul 2013, 12:10:37 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1097161 |
Run time | 6 days 6 hours 13 min 42 sec |
CPU time | 5 days 17 hours 29 min 8 sec |
Validate state | Invalid |
Credit | 2,505.24 |
Device peak FLOPS | 1.65 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Pacific North West v6.09 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:19:31 (4708): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5836, selfPID=2240, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5700, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5464, selfPID=3684, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5372, selfPID=4948, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5332, selfPID=5108, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5440, selfPID=3556, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3156, selfPID=4628, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5320, selfPID=3624, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2036, selfPID=4200, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 4 Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4320, selfPID=4960, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 4 Suspended CPDN Monitor - Suspend request from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6184, selfPID=2276, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6008, selfPID=5128, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3220, selfPID=5896, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=168, selfPID=3240, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3872, selfPID=4820, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4052, selfPID=5224, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6068, selfPID=4060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6932, selfPID=5784, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6064, selfPID=5512, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5264, selfPID=6636, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3612, selfPID=5148, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1124, selfPID=5276, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5116, selfPID=5248, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5308, selfPID=3944, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4964, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5280, iMonCtr=2 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1940, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=780, selfPID=3648, iMonCtr=1 Model crash detected, will try to restart... CGCController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4124, selfPID=5296, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 8 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4540, selfPID=5096, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6244, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3236, selfPID=4000, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4944, selfPID=5272, iMonCtr=1 Model crash detected, will try to restart... 10:25:22 (5172): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2172, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Model crashed: Leaving CPDN_Main::Monitor... Regional yearly means requires 12 input files got 10 Called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_pnw_q7nq_2031_1_008353327_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_pnw_q7nq_2031_1_008353327_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
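The stderr text above is dominated by a handful of repeated messages (suspend requests, model-crash restarts, lost heartbeats, and the "Regional yearly means" input-file shortfall). A minimal sketch for tallying them follows, assuming the log has been saved as a plain string. The function name `tally_events` and the pattern labels are illustrative and not part of any BOINC or CPDN tool; `SAMPLE` is a short excerpt built from messages in the log, not the full text.

```python
import re
from collections import Counter

# Illustrative labels mapped to message fragments that appear in the stderr text.
PATTERNS = {
    "suspend_request": r"Suspend request from BOINC",
    "model_crash": r"Model crash detected, will try to restart",
    "no_heartbeat": r"No heartbeat from core client",
    "missing_inputs": r"Regional yearly means requires \d+ input files got \d+",
}

def tally_events(stderr_text):
    """Count non-overlapping occurrences of each known message in the log text."""
    return Counter(
        {label: len(re.findall(pattern, stderr_text))
         for label, pattern in PATTERNS.items()}
    )

# Short excerpt assembled from messages in the log above (not the whole log).
SAMPLE = (
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "14:19:31 (4708): No heartbeat from core client for 30 sec - exiting "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=5836, selfPID=2240, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Leaving CPDN_Main::Monitor... "
    "Regional yearly means requires 12 input files got 4 "
)

if __name__ == "__main__":
    for label, count in tally_events(SAMPLE).items():
        print(f"{label}: {count}")
```

Running the same function over the full stderr text would give the totals for this task. The patterns match on message fragments, so they also catch lines where prefixes from different processes are interleaved.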
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
02 Jul 2013 10:55:14 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 115,296 | 485,201 | 4.2083 |
27 Jun 2013 17:43:39 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 103,776 | 436,573 | 4.2069 |
22 Jun 2013 20:09:38 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 92,256 | 387,482 | 4.2001 |
18 Jun 2013 20:03:22 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 80,736 | 340,911 | 4.2225 |
05 Jun 2013 19:32:48 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 69,216 | 293,408 | 4.2390 |
25 May 2013 15:54:32 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 57,696 | 246,449 | 4.2715 |
18 May 2013 16:37:03 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 46,176 | 197,171 | 4.2700 |
05 May 2013 19:58:23 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 34,656 | 149,411 | 4.3113 |
03 May 2013 07:13:11 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 23,136 | 99,606 | 4.3052 |
28 Apr 2013 11:52:06 | 1097161 | 15737089 | hadam3p_pnw_q7nq_2031_1_008353327_0 | 11,616 | 49,248 | 4.2397 |
©2024 cpdn.org