climateprediction.net home page
Task 15520370

Name hadcm3n_zh9u_1920_40_008280931_0
Workunit 8432066
Created 1 Jan 2013, 23:13:02 UTC
Sent 1 Jan 2013, 23:13:19 UTC
Report deadline 3 Apr 2013, 6:40:30 UTC
Received 10 Jul 2013, 5:43:39 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 849069
Run time 32 days 21 hours 37 min 31 sec
CPU time 28 days 6 hours 24 min 54 sec
Validate state Invalid
Credit 9,331.20
Device peak FLOPS 2.60 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>6.6.28</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5996, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2760, iMonCtr=1
Model crash detected, will try to restart...
18:30:42 (5912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:16:40 (5944): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=736, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5664, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2780, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/zh9uko.pjc6c10
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1528, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6036, iMonCtr=1
Model crash detected, will try to restart...
20:48:01 (5964): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5436, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5200, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5704, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1148, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2972, iMonCtr=1
Model crash detected, will try to restart...
21:20:36 (5060): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:20:55 (5060): No heartbeat from core client for 30 sec - exiting
22:38:56 (5672): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:38:58 (5672): No heartbeat from core client for 30 sec - exiting
22:40:38 (4912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:40:41 (4912): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5052, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5532, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2608, iMonCtr=1
Model crash detected, will try to restart...
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4456, selfPID=4456, iMonCtr=1
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/zh9uko.pjd7c10
Error converting file to netcdf: dataout/zh9uko.pid7c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5200, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5716, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6000, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1428, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3784, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5456, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5076, iMonCtr=1
Model crash detected, will try to restart...
12:04:09 (2044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:27:16 (5044): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:27:18 (5044): No heartbeat from core client for 30 sec - exiting
19:31:16 (5468): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:50:27 (4884): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:50:28 (4884): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
15:08:38 (6516): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:08:42 (6516): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5444, iMonCtr=1
Model crash detected, will try to restart...
17:21:50 (5376): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5536, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4508, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3860, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4936, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4668, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4600, iMonCtr=1
Model crash detected, will try to restart...
18:26:32 (4372): No heartbeat from core client for 30 sec - exiting
18:26:33 (4372): No heartbeat from core client for 30 sec - exiting
18:26:34 (4372): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
10 Jul 2013 04:44:37 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 777,600 | 2,442,285 | 3.1408
26 Jun 2013 05:58:21 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 751,680 | 2,362,955 | 3.1436
20 Jun 2013 19:18:33 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 725,760 | 2,277,188 | 3.1377
15 Jun 2013 19:07:01 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 699,840 | 2,194,651 | 3.1359
10 Jun 2013 04:15:55 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 673,920 | 2,115,028 | 3.1384
07 Jun 2013 15:25:15 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 648,000 | 2,036,671 | 3.1430
18 May 2013 05:12:18 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 622,080 | 1,957,659 | 3.1470
12 May 2013 16:51:16 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 596,160 | 1,875,745 | 3.1464
05 May 2013 23:10:42 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 570,240 | 1,793,569 | 3.1453
29 Apr 2013 21:31:40 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 544,320 | 1,711,462 | 3.1442
26 Apr 2013 04:16:33 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 518,400 | 1,633,174 | 3.1504
20 Apr 2013 23:34:18 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 492,480 | 1,550,717 | 3.1488
15 Apr 2013 06:19:35 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 466,560 | 1,464,592 | 3.1391
11 Apr 2013 17:32:20 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 440,640 | 1,379,368 | 3.1304
06 Apr 2013 18:39:50 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 414,720 | 1,295,174 | 3.1230
30 Mar 2013 19:58:34 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 388,800 | 1,213,060 | 3.1200
25 Mar 2013 01:00:15 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 362,880 | 1,130,510 | 3.1154
18 Mar 2013 04:17:30 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 336,960 | 1,050,501 | 3.1176
12 Mar 2013 05:55:44 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 311,040 | 970,419 | 3.1199
08 Mar 2013 03:24:48 | 849069 | 15520370 | hadcm3n_zh9u_1920_40_008280931_0 | 285,120 | 887,863 | 3.1140
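
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. This relationship is inferred from the numbers and is not stated on the page. A minimal Python sketch that checks it against three rows copied from the table; the function name `average_sec_per_timestep` is hypothetical and not part of any BOINC or CPDN tool:

```python
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by completed timesteps, to 4 decimals.

    Assumption: this is how the 'Average (sec/TS)' column is derived.
    """
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return round(cpu_time_sec / timestep, 4)


# Rows copied from the trickle table:
# (timestep, CPU time in seconds, average reported on the page)
rows = [
    (777_600, 2_442_285, 3.1408),
    (751_680, 2_362_955, 3.1436),
    (285_120, 887_863, 3.1140),
]

for timestep, cpu_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_sec, timestep)
    # Print the computed value next to the reported one for comparison.
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

The same division can be applied to any other row of the table to spot-check it.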

