Task 16281758

Name hadam3p_pnw_uaiv_2007_1_008506691_0
Workunit 8656498
Created 7 Feb 2014, 11:41:46 UTC
Sent 7 Feb 2014, 17:38:29 UTC
Report deadline 20 Jan 2015, 22:58:29 UTC
Received 25 Jul 2014, 16:12:00 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 0 (0x00000000)
Computer ID 1272486
Run time 4 days 16 hours 48 min 24 sec
CPU time 4 days 7 hours 9 min 52 sec
Validate state Invalid
Credit 2,759.97
Device peak FLOPS 2.91 GFLOPS
Application version UK Met Office HadAM3P-HadRM3P Pacific North West v7.22 windows_intelx86
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4884, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4924, selfPID=3700, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3236, iMonCtr=2
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3628, iMonCtr=2
Model crash detected, will try to restart...
18:02:51 (3492): No heartbeat from client for 30 sec - exiting
18:02:51 (3492): timer handler: client dead, exiting
18:02:52 (3492): No heartbeat from client for 30 sec - exiting
18:02:52 (3492): timer handler: client dead, exiting
18:02:54 (3492): No heartbeat from client for 30 sec - exiting
18:02:54 (3492): timer handler: client dead, exiting
18:02:55 (3492): No heartbeat from client for 30 sec - exiting
18:02:55 (3492): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1616, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3640, selfPID=3160, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3772, selfPID=3412, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2520, iMonCtr=2
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5000, selfPID=4660, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3680, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3904, iMonCtr=2
GCPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
GCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3792, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1616, selfPID=3452, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3920, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4824, selfPID=3828, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3832, selfPID=3508, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3488, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3432, selfPID=3128, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=944, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4208, selfPID=3732, iMonCtr=1
Model crash detected, will try to restart...
G19:47:37 (3204): No heartbeat from client for 30 sec - exiting
19:47:37 (3204): timer handler: client dead, exiting
19:47:38 (3204): No heartbeat from client for 30 sec - exiting
19:47:38 (3204): timer handler: client dead, exiting
19:47:39 (3204): No heartbeat from client for 30 sec - exiting
19:47:39 (3204): timer handler: client dead, exiting
19:47:40 (3204): No heartbeat from client for 30 sec - exiting
19:47:40 (3204): timer handler: client dead, exiting
19:47:41 (3204): No heartbeat from client for 30 sec - exiting
19:47:41 (3204): timer handler: client dead, exiting
19:47:42 (3204): No heartbeat from client for 30 sec - exiting
19:47:42 (3204): timer handler: client dead, exiting
19:47:44 (3204): No heartbeat from client for 30 sec - exiting
19:47:44 (3204): timer handler: client dead, exiting
19:47:45 (3204): No heartbeat from client for 30 sec - exiting
19:47:45 (3204): timer handler: client dead, exiting
19:47:46 (3204): No heartbeat from client for 30 sec - exiting
19:47:46 (3204): timer handler: client dead, exiting
19:47:47 (3204): No heartbeat from client for 30 sec - exiting
19:47:47 (3204): timer handler: client dead, exiting
19:47:48 (3204): No heartbeat from client for 30 sec - exiting
19:47:48 (3204): timer handler: client dead, exiting
19:47:49 (3204): No heartbeat from client for 30 sec - exiting
19:47:49 (3204): timer handler: client dead, exiting
19:47:50 (3204): No heartbeat from client for 30 sec - exiting
19:47:50 (3204): timer handler: client dead, exiting
19:47:51 (3204): No heartbeat from client for 30 sec - exiting
19:47:51 (3204): timer handler: client dead, exiting
19:47:52 (3204): No heartbeat from client for 30 sec - exiting
19:47:52 (3204): timer handler: client dead, exiting
19:47:53 (3204): No heartbeat from client for 30 sec - exiting
19:47:53 (3204): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=932, iMonCtr=2
Model crash detected, will try to restart...
19:54:42 (3388): No heartbeat from client for 30 sec - exiting
19:54:42 (3388): timer handler: client dead, exiting
19:54:43 (3388): No heartbeat from client for 30 sec - exiting
19:54:43 (3388): timer handler: client dead, exiting
19:54:44 (3388): No heartbeat from client for 30 sec - exiting
19:54:44 (3388): timer handler: client dead, exiting
19:54:45 (3388): No heartbeat from client for 30 sec - exiting
19:54:45 (3388): timer handler: client dead, exiting
19:54:46 (3388): No heartbeat from client for 30 sec - exiting
19:54:46 (3388): timer handler: client dead, exiting
19:54:47 (3388): No heartbeat from client for 30 sec - exiting
19:54:47 (3388): timer handler: client dead, exiting
19:54:48 (3388): No heartbeat from client for 30 sec - exiting
19:54:48 (3388): timer handler: client dead, exiting
19:54:50 (3388): No heartbeat from client for 30 sec - exiting
19:54:50 (3388): timer handler: client dead, exiting
19:54:51 (3388): No heartbeat from client for 30 sec - exiting
19:54:51 (3388): timer handler: client dead, exiting
19:54:52 (3388): No heartbeat from client for 30 sec - exiting
19:54:52 (3388): timer handler: client dead, exiting
19:54:53 (3388): No heartbeat from client for 30 sec - exiting
19:54:53 (3388): timer handler: client dead, exiting
19:54:54 (3388): No heartbeat from client for 30 sec - exiting
19:54:54 (3388): timer handler: client dead, exiting
19:54:55 (3388): No heartbeat from client for 30 sec - exiting
19:54:55 (3388): timer handler: client dead, exiting
19:54:56 (3388): No heartbeat from client for 30 sec - exiting
19:54:56 (3388): timer handler: client dead, exiting
19:54:57 (3388): No heartbeat from client for 30 sec - exiting
19:54:57 (3388): timer handler: client dead, exiting
19:54:58 (3388): No heartbeat from client for 30 sec - exiting
19:54:58 (3388): timer handler: client dead, exiting
19:54:59 (3388): No heartbeat from client for 30 sec - exiting
19:54:59 (3388): timer handler: client dead, exiting
19:55:00 (3388): No heartbeat from client for 30 sec - exiting
19:55:00 (3388): timer handler: client dead, exiting
19:55:01 (3388): No heartbeat from client for 30 sec - exiting
19:55:01 (3388): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:55:03 (3388): No heartbeat from client for 30 sec - exiting
19:55:03 (3388): timer handler: client dead, exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4208, selfPID=3616, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3216, selfPID=3608, iMonCtr=1
Model crash detected, will try to restart...
GSuspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3976, selfPID=3784, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4556, selfPID=4196, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=964, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3828, iMonCtr=2
Suspended CPDN Monitor - Suspend request from BOINC...
20:20:29 (3160): No heartbeat from client for 30 sec - exiting
20:20:29 (3160): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3568, selfPID=3568, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3452, selfPID=3028, iMonCtr=1
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3176, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5516, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1620, selfPID=3844, iMonCtr=1
Model crash detected, will try to restart...
G20:18:59 (2840): No heartbeat from client for 30 sec - exiting
20:18:59 (2840): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5340, selfPID=3216, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4268, selfPID=2784, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4904, selfPID=4904, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4564, selfPID=4352, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4252, iMonCtr=2
Model crash detected, will try to restart...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2968, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2468, selfPID=3352, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5060, selfPID=4228, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5560, selfPID=5316, iMonCtr=1
Model crash detected, will try to restart...
20:45:01 (3496): No heartbeat from client for 30 sec - exiting
20:45:01 (3496): timer handler: client dead, exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4332, selfPID=4332, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4520, selfPID=1036, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4328, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4632, selfPID=3044, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4016, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2268, selfPID=3780, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4156, iMonCtr=2
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5000, selfPID=4236, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4268, selfPID=3756, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3452, selfPID=3520, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4640, selfPID=3632, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5268, selfPID=3456, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4688, selfPID=4688, iMonCtr=2
Global Worker:: CPDN process is not running, exiting, bRetVal = 0, checkPID=0, selfPID=3972, iMonCtr=1
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4028, selfPID=4028, iMonCtr=2
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4028, selfPID=3340, iMonCtr=1
Model crash detected, will try to restart...
Leaving CPDN_Main::Monitor...
17:33:11 (3340): called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
  <file_name>hadam3p_pnw_uaiv_2007_1_008506691_0_12.zip</file_name>
  <error_code>-161</error_code>
</file_xfer_error>

</message>
]]>
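A stderr log of this kind is easier to read once its lines are grouped by event type. The following is a minimal sketch, assuming the log text has been saved locally and split into lines; the labels, the `classify` and `tally` helpers, and the `SAMPLE` list are hypothetical names introduced for illustration and are not part of BOINC or the CPDN application. The patterns are taken only from message fragments that appear in the log above.

```python
import re
from collections import Counter

# Hypothetical labels mapped to message fragments seen in the stderr log above.
PATTERNS = [
    ("model_crash", re.compile(r"Model crash detected")),
    ("process_not_running", re.compile(r"CPDN process is not running")),
    ("no_heartbeat", re.compile(r"No heartbeat from client")),
    ("suspend_request", re.compile(r"Suspend request from BOINC")),
    ("quit_request", re.compile(r"Quit request from BOINC")),
]


def classify(line):
    """Return the first matching label for a log line, or 'other'."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return "other"


def tally(lines):
    """Count non-blank log lines per label."""
    return Counter(classify(line) for line in lines if line.strip())


# A few lines copied verbatim from the log, used as sample input.
SAMPLE = [
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5504, iMonCtr=2",
    "Model crash detected, will try to restart...",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    "18:02:51 (3492): No heartbeat from client for 30 sec - exiting",
    "18:02:51 (3492): timer handler: client dead, exiting",
    "CPDN Monitor - Quit request from BOINC...",
]

if __name__ == "__main__":
    for label, count in sorted(tally(SAMPLE).items()):
        print(f"{label}: {count}")
```

Running `tally` over the full stderr text instead of `SAMPLE` would give a per-event count for the whole task; lines that match none of the fragments fall into `other`.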
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                          Timestep  CPU Time (sec)  Average (sec/TS)
15 Jul 2014 18:55:56  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  127,019   354,960         2.7945
30 Jun 2014 07:50:29  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  115,499   323,149         2.7979
03 Jun 2014 19:48:40  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  103,979   291,260         2.8011
20 May 2014 18:04:37  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  92,459    260,240         2.8147
06 May 2014 18:25:30  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  80,939    228,024         2.8172
20 Apr 2014 08:23:03  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  69,419    196,091         2.8247
06 Apr 2014 18:45:13  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  57,899    163,361         2.8215
20 Mar 2014 18:22:41  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  46,379    130,464         2.8130
13 Mar 2014 17:19:05  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  34,859    98,179          2.8165
03 Mar 2014 17:05:24  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  23,339    65,748          2.8171
19 Feb 2014 17:23:41  1272486  16281758   hadam3p_pnw_uaiv_2007_1_008506691_0  11,819    33,924          2.8703
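
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the rows above and also reports the spacing between successive trickles. It assumes the displayed CPU times are rounded to whole seconds, so it compares with a small tolerance rather than exactly; the function names are illustrative only.

```python
# Rows copied from the trickle table, newest first:
# (timestep, cpu_time_sec, reported_average_sec_per_timestep)
TRICKLES = [
    (127_019, 354_960, 2.7945),
    (115_499, 323_149, 2.7979),
    (103_979, 291_260, 2.8011),
    (92_459, 260_240, 2.8147),
    (80_939, 228_024, 2.8172),
    (69_419, 196_091, 2.8247),
    (57_899, 163_361, 2.8215),
    (46_379, 130_464, 2.8130),
    (34_859, 98_179, 2.8165),
    (23_339, 65_748, 2.8171),
    (11_819, 33_924, 2.8703),
]


def avg_sec_per_timestep(timestep, cpu_time_sec):
    """Cumulative CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


def averages_match(rows, tol=1e-3):
    """True if every reported average agrees with cpu_time / timestep within tol."""
    return all(
        abs(avg_sec_per_timestep(ts, cpu) - reported) < tol
        for ts, cpu, reported in rows
    )


def timestep_intervals(rows):
    """Timestep difference between each trickle and the next older one."""
    steps = [ts for ts, _, _ in rows]
    return [newer - older for newer, older in zip(steps, steps[1:])]


if __name__ == "__main__":
    print("averages consistent:", averages_match(TRICKLES))
    print("distinct trickle intervals:", sorted(set(timestep_intervals(TRICKLES))))
```

With the rows listed, every reported average agrees with the quotient to within the tolerance, and successive trickles are a constant 11,520 timesteps apart.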