Task 11876052

Name famous_v4wj_1799_200_006692121_5
Workunit 6895374
Created 8 Sep 2010, 9:20:49 UTC
Sent 11 Sep 2010, 13:21:28 UTC
Report deadline 11 Dec 2010, 20:48:39 UTC
Received 28 Jan 2011, 3:51:10 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1050356
Run time 19 days 4 hours 20 min 32 sec
CPU time 18 days 5 hours 51 min 33 sec
Validate state Invalid
Credit 5,651.43
Device peak FLOPS 1.14 GFLOPS
Application version UK Met Office FAMOUS v6.11 windows_intelx86
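The run time and CPU time above are reported as day/hour/minute/second strings, while the trickle table further down reports CPU time in plain seconds. A minimal sketch for converting between the two, so the figures can be compared; the helper name `to_seconds` is hypothetical and not part of BOINC or CPDN:

```python
import re

# Seconds per unit, for the unit words used on this page.
UNIT_SECONDS = {"days": 86400, "hours": 3600, "min": 60, "sec": 1}

def to_seconds(text):
    """Convert a string such as '19 days 4 hours 20 min 32 sec' to seconds."""
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(days|hours|min|sec)", text):
        total += int(value) * UNIT_SECONDS[unit]
    return total

run_time = to_seconds("19 days 4 hours 20 min 32 sec")
cpu_time = to_seconds("18 days 5 hours 51 min 33 sec")

print(run_time, cpu_time)
# Fraction of wall-clock run time spent on the CPU.
print(round(cpu_time / run_time, 4))
```

The CPU time comes out at 1,576,293 seconds, slightly above the 1,574,414 seconds in the most recent trickle, which would be consistent with some further computation after the last trickle was recorded.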
Stderr
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4588, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4508, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4304, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2500, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2228, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1712, iMonCtr=1
Model crash detected, will try to restart...
18:06:35 (3960): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2688, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1820, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2056, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5192, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4316, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2468, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=176, iMonCtr=1
Model crash detected, will try to restart...
08:22:49 (3492): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:22:57 (3492): No heartbeat from core client for 30 sec - exiting
10:41:54 (3160): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5332, iMonCtr=1
Model crash detected, will try to restart...
16:04:09 (4412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5344, iMonCtr=1
Model crash detected, will try to restart...
17:24:05 (1832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3892, iMonCtr=1
Model crash detected, will try to restart...
22:01:43 (3700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:03:16 (2624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
12:30:43 (2668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:30:44 (2668): No heartbeat from core client for 30 sec - exiting
19:27:37 (3464): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:27:45 (3464): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2832, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:34:08 (2604): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:34:09 (2604): No heartbeat from core client for 30 sec - exiting

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
06:20:00 (2908): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: Result too large
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - Quit request from BOINC...
20:43:58 (2476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1336, iMonCtr=1
Model crash detected, will try to restart...
11:59:00 (2056): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:59:11 (2056): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2492, iMonCtr=1
Model crash detected, will try to restart...
19:23:49 (3816): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3412, iMonCtr=1
Model crash detected, will try to restart...
08:52:08 (1020): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
10:08:28 (5624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1416, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 60 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 61 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16

BUFFIN: Read Failed: No such file or directory
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2824, iMonCtr=1
Model crash detected, will try to restart...
08:27:10 (3160): No heartbeat from core client for 30 sec - exiting
08:27:22 (3160): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:27:34 (3160): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=952, iMonCtr=1
Model crash detected, will try to restart...
07:55:34 (4264): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:41:49 (2412): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1
Model crash detected, will try to restart...
19:48:21 (5896): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
10:32:17 (2968): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1
Model crash detected, will try to restart...
09:34:44 (4084): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=296, iMonCtr=1
Model crash detected, will try to restart...
09:21:16 (3172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
08:51:46 (1408): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3796, iMonCtr=1
Model crash detected, will try to restart...
14:42:32 (1284): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:42:47 (1284): No heartbeat from core client for 30 sec - exiting
09:15:21 (2704): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:15:32 (2704): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1
Model crash detected, will try to restart...
09:27:58 (4512): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4092, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1120, iMonCtr=1
Model crash detected, will try to restart...
08:00:52 (2808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:00:59 (2808): No heartbeat from core client for 30 sec - exiting
08:18:17 (2172): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:22:36 (2172): No heartbeat from core client for 30 sec - exiting

Model crashed: READHIST: End of file in READ from history file for namelist NLCHISTO tmp/pipe_dummy

Model crashed: READHIST: End of file in READ from history file for namelist NLCHISTO tmp/pipe_dummy

Model crashed: READHIST: End of file in READ from history file for namelist NLCHISTO tmp/pipe_dummy

Model crashed: READHIST: End of file in READ from history file for namelist NLCHISTO tmp/pipe_dummy

Model crashed: READHIST: End of file in READ from history file for namelist NLCHISTO tmp/pipe_dummy

Model crashed: READHIST: End of file in READ from history file for namelist NLCHISTO tmp/pipe_dummy
Sorry, too many model crashes! :-(
08:34:52 (480): called boinc_finish

</stderr_txt>
]]>
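The stderr output above repeats a handful of message types many times. A rough sketch of how such a log could be tallied by message type is shown here; the category labels and the `tally` helper are assumptions made for illustration, not anything defined by BOINC or CPDN, and the sample lines are copied from the log above:

```python
import re
from collections import Counter

# Hypothetical category labels mapped to substrings seen in the log above.
PATTERNS = {
    "model_restart": re.compile(r"Model crash detected, will try to restart"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "buffin_read_failed": re.compile(r"BUFFIN: Read Failed"),
    "readhist_eof": re.compile(r"Model crashed: READHIST"),
}

def tally(lines):
    """Count how many lines match each category (first match wins)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts

sample = [
    "CPDN Monitor - Quit request from BOINC...",
    "Model crash detected, will try to restart...",
    "BUFFIN: Read Failed: No such file or directory",
    "18:06:35 (3960): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - No 'heartbeat' from BOINC...",
    "Sorry, too many model crashes! :-(",
]

counts = tally(sample)
for label, n in sorted(counts.items()):
    print(label, n)
```

The monitor's own `No 'heartbeat' from BOINC` line is deliberately left uncounted so that each heartbeat timeout is not counted twice; add a pattern for it if both lines should be tracked.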
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep   CPU Time (sec)  Average (sec/TS)
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,712,906  1,574,414       0.9191
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,703,546  1,565,469       0.9189
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,694,186  1,556,644       0.9188
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,684,826  1,547,874       0.9187
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,675,466  1,539,098       0.9186
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,666,106  1,530,340       0.9185
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,656,746  1,521,903       0.9186
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,647,386  1,513,308       0.9186
28 Jan 2011 03:54:38  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,638,026  1,504,457       0.9185
22 Jan 2011 16:10:09  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,628,666  1,495,639       0.9183
22 Jan 2011 13:34:41  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,619,306  1,487,025       0.9183
22 Jan 2011 12:01:51  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,609,946  1,478,392       0.9183
22 Jan 2011 12:01:51  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,600,586  1,469,528       0.9181
22 Jan 2011 05:51:16  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,591,226  1,460,679       0.9180
21 Jan 2011 15:47:25  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,581,866  1,451,752       0.9177
21 Jan 2011 14:38:36  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,572,506  1,442,833       0.9175
21 Jan 2011 14:38:36  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,563,146  1,434,049       0.9174
21 Jan 2011 14:38:36  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,553,786  1,425,401       0.9174
21 Jan 2011 14:38:36  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,544,426  1,416,729       0.9173
21 Jan 2011 14:38:36  1050356  11876052   famous_v4wj_1799_200_006692121_5  1,535,066  1,408,023       0.9172
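
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The page does not state this; it is inferred from the numbers. A small sketch that checks the inference against the three most recent trickles (values copied from the table above; `avg_sec_per_ts` is a hypothetical helper name):

```python
# (timestep, cpu_time_sec) for the three most recent trickles in the table.
rows = [
    (1_712_906, 1_574_414),
    (1_703_546, 1_565_469),
    (1_694_186, 1_556_644),
]

def avg_sec_per_ts(timestep, cpu_sec):
    """Assumed definition: cumulative CPU seconds per model timestep."""
    return round(cpu_sec / timestep, 4)

averages = [avg_sec_per_ts(ts, cpu) for ts, cpu in rows]
print(averages)

# Timesteps between consecutive trickles.
intervals = [rows[i][0] - rows[i + 1][0] for i in range(len(rows) - 1)]
print(intervals)
```

For these rows the computed averages match the table values (0.9191, 0.9189, 0.9188), and consecutive trickles are 9,360 timesteps apart.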