climateprediction.net home page
Task 11525374

Name famous_uh0v_1999_200_006654906_3
Workunit 6858278
Created 10 Jun 2010, 14:12:57 UTC
Sent 23 Aug 2010, 21:54:09 UTC
Report deadline 23 Nov 2010, 5:21:20 UTC
Received 22 Oct 2010, 13:13:33 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1026045
Run time 9 days 16 hours 55 min 32 sec
CPU time 8 days 6 hours 31 min 42 sec
Validate state Invalid
Credit 2,686.79
Device peak FLOPS 1.53 GFLOPS
Application version UK Met Office FAMOUS v6.11 (windows_intelx86)
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2704, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
22:31:10 (1484): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1236, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2956, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=1
Model crash detected, will try to restart...
16:45:18 (2860): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:45:19 (2860): No heartbeat from core client for 30 sec - exiting
16:45:20 (2860): No heartbeat from core client for 30 sec - exiting
16:45:21 (2860): No heartbeat from core client for 30 sec - exiting
16:45:22 (2860): No heartbeat from core client for 30 sec - exiting
16:48:48 (5500): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3772, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2544, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=920, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2264, iMonCtr=1
Model crash detected, will try to restart...
19:16:27 (2240): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:16:28 (2240): No heartbeat from core client for 30 sec - exiting
19:16:29 (2240): No heartbeat from core client for 30 sec - exiting
19:16:30 (2240): No heartbeat from core client for 30 sec - exiting
19:16:31 (2240): No heartbeat from core client for 30 sec - exiting
19:16:32 (2240): No heartbeat from core client for 30 sec - exiting
19:16:33 (2240): No heartbeat from core client for 30 sec - exiting
19:16:34 (2240): No heartbeat from core client for 30 sec - exiting
19:16:35 (2240): No heartbeat from core client for 30 sec - exiting
19:16:36 (2240): No heartbeat from core client for 30 sec - exiting
19:16:37 (2240): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=340, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
00:47:17 (340): called boinc_finish

</stderr_txt>
]]>
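The stderr output above is long and repetitive, so a tally of message types can make the failure pattern easier to see. The following is a minimal sketch, not part of the CPDN or BOINC tooling: the category names and the `tally_events` helper are hypothetical, and the patterns are simply substrings copied from the log lines above.

```python
import re
from collections import Counter

# Substrings copied from the stderr messages above. The category names
# (dictionary keys) are labels invented for this sketch.
PATTERNS = {
    "suspend_request": re.compile(r"Suspend request from BOINC"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "model_crash": re.compile(r"Model crash detected"),
    "buffin_read_failed": re.compile(r"BUFFIN: Read Failed"),
}


def tally_events(stderr_text):
    """Count how many lines fall into each category (first match wins)."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break
    return counts


# A short excerpt of the log above, used only to exercise the function.
SAMPLE = """\
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2704, iMonCtr=1
Model crash detected, will try to restart...
22:31:10 (1484): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16
"""

if __name__ == "__main__":
    for name, count in tally_events(SAMPLE).items():
        print(name, count)
```

Pasting the full `<stderr_txt>` contents into `tally_events` in place of `SAMPLE` would give the per-category totals for this task.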
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
21 Oct 2010 21:53:18  1026045  11525374   famous_uh0v_1999_200_006654906_3  814,346   708,368         0.8699
21 Oct 2010 19:22:16  1026045  11525374   famous_uh0v_1999_200_006654906_3  804,986   700,100         0.8697
21 Oct 2010 16:49:46  1026045  11525374   famous_uh0v_1999_200_006654906_3  795,626   692,042         0.8698
20 Oct 2010 19:29:33  1026045  11525374   famous_uh0v_1999_200_006654906_3  786,266   684,048         0.8700
20 Oct 2010 17:01:31  1026045  11525374   famous_uh0v_1999_200_006654906_3  776,906   676,102         0.8702
19 Oct 2010 21:42:37  1026045  11525374   famous_uh0v_1999_200_006654906_3  767,546   668,042         0.8704
19 Oct 2010 15:41:41  1026045  11525374   famous_uh0v_1999_200_006654906_3  758,186   659,842         0.8703
18 Oct 2010 19:23:30  1026045  11525374   famous_uh0v_1999_200_006654906_3  748,826   651,243         0.8697
18 Oct 2010 16:41:44  1026045  11525374   famous_uh0v_1999_200_006654906_3  739,466   643,231         0.8699
17 Oct 2010 22:01:07  1026045  11525374   famous_uh0v_1999_200_006654906_3  730,106   635,132         0.8699
17 Oct 2010 19:49:10  1026045  11525374   famous_uh0v_1999_200_006654906_3  720,746   627,293         0.8703
17 Oct 2010 17:37:07  1026045  11525374   famous_uh0v_1999_200_006654906_3  711,386   619,454         0.8708
17 Oct 2010 14:53:36  1026045  11525374   famous_uh0v_1999_200_006654906_3  702,026   610,960         0.8703
17 Oct 2010 12:40:26  1026045  11525374   famous_uh0v_1999_200_006654906_3  692,666   603,097         0.8707
16 Oct 2010 22:35:12  1026045  11525374   famous_uh0v_1999_200_006654906_3  683,306   594,838         0.8705
15 Oct 2010 22:48:32  1026045  11525374   famous_uh0v_1999_200_006654906_3  673,946   585,906         0.8694
15 Oct 2010 20:22:45  1026045  11525374   famous_uh0v_1999_200_006654906_3  664,586   577,650         0.8692
15 Oct 2010 14:59:19  1026045  11525374   famous_uh0v_1999_200_006654906_3  655,226   569,462         0.8691
14 Oct 2010 20:41:44  1026045  11525374   famous_uh0v_1999_200_006654906_3  645,866   561,348         0.8691
14 Oct 2010 16:55:47  1026045  11525374   famous_uh0v_1999_200_006654906_3  636,506   553,033         0.8689
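
The Average (sec/TS) column in the trickle table above appears to be the cumulative CPU time divided by the timestep count. That relationship is inferred from the numbers and is not stated on the page. A minimal sketch that checks it against three of the rows, with `seconds_per_timestep` as a hypothetical helper name:

```python
def seconds_per_timestep(cpu_time_sec, timestep):
    """Average CPU seconds per model timestep (assumed definition)."""
    return cpu_time_sec / timestep


# (timestep, CPU time in seconds, reported average) copied from the
# first two rows and the last row of the trickle table.
ROWS = [
    (814346, 708368, 0.8699),
    (804986, 700100, 0.8697),
    (636506, 553033, 0.8689),
]

if __name__ == "__main__":
    for timestep, cpu_time, reported in ROWS:
        computed = seconds_per_timestep(cpu_time, timestep)
        print(f"{timestep}: computed {computed:.4f}, reported {reported:.4f}")
```

For these rows the computed ratio agrees with the reported value to four decimal places, which suggests the per-timestep cost stayed close to 0.87 seconds throughout the period shown.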