Task 15584912

Name hadcm3n_o6el_2140_40_008270186_3
Workunit 8425310
Created 5 Feb 2013, 18:41:58 UTC
Sent 5 Feb 2013, 18:42:04 UTC
Report deadline 8 May 2013, 2:09:15 UTC
Received 23 Apr 2013, 21:33:41 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 25 (0x00000019) Unknown error code
Computer ID 1019939
Run time 12 days 19 hours 44 min 47 sec
CPU time 11 days 11 hours 8 min 11 sec
Validate state Invalid
Credit 5,287.68
Device peak FLOPS 2.89 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
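
The run time and CPU time above are easier to compare in seconds. The following is a minimal sketch, assuming the durations are always spelled with the units "days", "hours", "min" and "sec" as on this page; the helper name `duration_to_seconds` is hypothetical and not part of BOINC or CPDN.

```python
import re

# Seconds per unit, using the unit spellings shown on the task page.
UNIT_SECONDS = {"days": 86400, "hours": 3600, "min": 60, "sec": 1}

def duration_to_seconds(text: str) -> int:
    """Convert e.g. '11 days 11 hours 8 min 11 sec' to whole seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(days|hours|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

run_time = duration_to_seconds("12 days 19 hours 44 min 47 sec")
cpu_time = duration_to_seconds("11 days 11 hours 8 min 11 sec")

# Share of the reported run time that was spent on the CPU.
cpu_fraction = cpu_time / run_time
print(run_time, cpu_time, round(cpu_fraction, 3))
```

With the values from this task, the CPU time is roughly 89% of the run time.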
Stderr
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4612, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
00:46:25 (4700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:46:27 (4700): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4572, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4704, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/o6elko.pjy4c10
Error converting file to netcdf: dataout/o6elko.piy4c10
Error converting file to netcdf: dataout/o6elko.pfy4c10
Error converting file to netcdf: dataout/o6elka.phy4c10
Error converting file to netcdf: dataout/o6elka.pgy4c10
Error converting file to netcdf: dataout/o6elka.pey4c10
Error converting file to netcdf: dataout/o6elka.pdy4c10
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3908, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4672, iMonCtr=1
Model crash detected, will try to restart...
CSuspended CPDN Monitor - Suspend request from BOINC...
00:46:45 (1576): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:46:46 (1576): No heartbeat from core client for 30 sec - exiting
18:07:48 (4660): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:25:33 (4624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4360, iMonCtr=1
Model crash detected, will try to restart...
19:13:38 (4276): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:14:24 (5732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=976, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4648, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4284, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4424, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4032, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4648, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3452, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4040, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4112, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4224, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4476, iMonCtr=1
Model crash detected, will try to restart...
10:47:45 (1680): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
C19:42:36 (2856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:43:50 (4808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:54:37 (4384): No heartbeat from core client for 30 sec - exiting
19:54:38 (4384): No heartbeat from core client for 30 sec - exiting
19:54:39 (4384): No heartbeat from core client for 30 sec - exiting
19:54:40 (4384): No heartbeat from core client for 30 sec - exiting
19:54:41 (4384): No heartbeat from core client for 30 sec - exiting
19:54:42 (4384): No heartbeat from core client for 30 sec - exiting
19:54:44 (4384): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:54:45 (4384): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
20:18:26 (4312): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:33:53 (4596): No heartbeat from core client for 30 sec - exiting
07:33:54 (4596): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:33:55 (4596): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4180, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4300, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4444, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4840, iMonCtr=1
Model crash detected, will try to restart...
19:11:56 (5052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Called boinc_finish

</stderr_txt>
]]>
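
A long stderr dump like the one above is easier to read once the recurring messages are counted. The following is a rough sketch, assuming the log text is available as a plain string; the labels in `PATTERNS` and the function name `tally` are made up for illustration, and the substrings are copied from the log lines above.

```python
from collections import Counter

# Hypothetical labels mapped to substrings that occur in the stderr text.
PATTERNS = {
    "model_crash_restart": "Model crash detected",
    "no_heartbeat": "No heartbeat from core client",
    "suspend_request": "Suspend request from BOINC",
    "quit_request": "Quit request from BOINC",
    "buffin_io_error": "BUFFIN: C I/O Error",
    "netcdf_conversion_error": "Error converting file to netcdf",
}

def tally(stderr_text: str) -> Counter:
    """Count how many lines contain each known substring."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts

# A few lines taken from the log above, used as sample input.
sample = """\
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
00:46:25 (4700): No heartbeat from core client for 30 sec - exiting
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
Error converting file to netcdf: dataout/o6elko.pjy4c10
"""
counts = tally(sample)
for label, n in counts.most_common():
    print(label, n)
```

Matching on substrings rather than whole lines keeps the tally tolerant of the timestamps and process IDs that prefix some messages.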
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
21 Apr 2013 18:04:33  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  440,640   980,866         2.2260
09 Apr 2013 18:37:40  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  414,720   906,469         2.1857
06 Apr 2013 12:40:03  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  388,800   849,433         2.1848
01 Apr 2013 14:35:38  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  362,880   785,125         2.1636
28 Mar 2013 21:08:22  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  336,960   730,835         2.1689
21 Mar 2013 21:39:43  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  311,040   667,153         2.1449
12 Mar 2013 19:07:08  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  285,120   581,318         2.0389
05 Mar 2013 19:40:25  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  259,200   509,516         1.9657
02 Mar 2013 18:58:38  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  233,280   451,001         1.9333
28 Feb 2013 20:07:26  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  207,360   393,110         1.8958
24 Feb 2013 08:40:17  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  181,440   342,417         1.8872
23 Feb 2013 22:15:51  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  155,520   305,433         1.9639
21 Feb 2013 21:54:19  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  129,600   258,568         1.9951
16 Feb 2013 18:01:38  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  103,680   198,365         1.9132
11 Feb 2013 22:11:52  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  77,760    130,606         1.6796
09 Feb 2013 19:14:54  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  51,840    86,107          1.6610
07 Feb 2013 21:26:43  1019939  15584912   hadcm3n_o6el_2140_40_008270186_3  25,920    42,819          1.6520
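
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the trickle table; the function name `seconds_per_timestep` is hypothetical, and the interpretation is an assumption based only on the numbers shown here.

```python
# (timestep, cumulative CPU seconds, reported average) copied from three rows.
rows = [
    (440_640, 980_866, 2.2260),
    (233_280, 451_001, 1.9333),
    (25_920, 42_819, 1.6520),
]

def seconds_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Cumulative CPU seconds divided by cumulative timesteps."""
    return cpu_seconds / timestep

computed = [round(seconds_per_timestep(ts, cpu), 4) for ts, cpu, _ in rows]

for (ts, cpu, reported), value in zip(rows, computed):
    print(f"timestep {ts}: computed {value:.4f}, reported {reported:.4f}")
```

For these rows the computed values agree with the reported column to four decimal places.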

