Name | famous_ubou_599_200_006647993_2 |
Workunit | 6851365 |
Created | 10 Jun 2010, 13:12:39 UTC |
Sent | 12 Aug 2010, 6:46:00 UTC |
Report deadline | 11 Nov 2010, 14:13:11 UTC |
Received | 9 Oct 2010, 22:56:28 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -226 (0xFFFFFF1E) ERR_TOO_MANY_EXITS |
Computer ID | 960056 |
Run time | 17 days 7 hours 37 min 15 sec |
CPU time | 16 days 0 hours 21 min 31 sec |
Validate state | Invalid |
Credit | 5,898.48 |
Device peak FLOPS | 2.30 GFLOPS |
Application version | UK Met Office FAMOUS v6.11 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... 01:53:40 (4456): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 32 Model crashed: STWORK : Error in PP_FILE tmp/pipe_dummy 01:57:43 (7852): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 19:09:56 (7728): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 60 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 61 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5628, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 32 Model crashed: STWORK : Error in PP_FILE tmp/pipe_dummy no start tag in app init data 19:23:24 (5272): Can't parse init data file - running in standalone mode 19:23:28 (7036): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 32 Model crashed: STWORK : Error in PP_FILE tmp/pipe_dummy CPDN Monitor - Quit request from BOINC... 11:37:50 (5944): No heartbeat from core client for 30 sec - exiting 11:37:51 (5944): No heartbeat from core client for 30 sec - exiting 11:37:52 (5944): No heartbeat from core client for 30 sec - exiting 11:37:53 (5944): No heartbeat from core client for 30 sec - exiting 11:37:54 (5944): No heartbeat from core client for 30 sec - exiting 11:37:55 (5944): No heartbeat from core client for 30 sec - exiting 11:37:56 (5944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5640, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 11:29:09 (6740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:29:10 (6740): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3824, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( 10:25:31 (3824): called boinc_finish CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 07:31:17 (7564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITHEAD: I/O error tmp/pipe_dummy no start tag in app init data 07:58:30 (4208): Can't parse init data file - running in standalone mode 08:00:11 (3844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITHEAD: I/O error tmp/pipe_dummy 08:04:21 (8160): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 60 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 61 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 CPDN Monitor - Quit request from BOINC... BUFFIN: Read Failed: Result too large BUFFIN: C I/O Error feof - Unit 60 - Return code = 16 BUFFIN: Read Failed: Result too large BUFFIN: C I/O Error feof - Unit 61 - Return code = 16 BUFFIN: Read Failed: Result too large BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: Read Failed: Result too large BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... </stderr_txt> ]]> |
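
The exit status above appears in two forms, -226 and 0xFFFFFF1E. These look like the same 32-bit value read as signed and as unsigned. The following is a minimal sketch of that conversion, assuming a 32-bit two's-complement exit code; the helper names `to_unsigned32` and `to_signed32` are illustrative and are not part of BOINC.

```python
def to_unsigned32(code: int) -> str:
    """Format a signed exit status as an 8-digit unsigned 32-bit hex string."""
    return f"0x{code & 0xFFFFFFFF:08X}"


def to_signed32(hex_text: str) -> int:
    """Interpret a 32-bit hex string as a signed two's-complement integer."""
    value = int(hex_text, 16)
    # If the top bit is set, the value is negative in two's complement.
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


if __name__ == "__main__":
    print(to_unsigned32(-226))        # hex form of the reported exit status
    print(to_signed32("0xFFFFFF1E"))  # signed form of the reported hex value
```
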
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 Sep 2010 09:37:00 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,787,786 | 799,217 | 0.4470 |
11 Sep 2010 17:34:38 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,778,426 | 795,052 | 0.4471 |
11 Sep 2010 16:20:07 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,769,066 | 790,933 | 0.4471 |
11 Sep 2010 13:51:39 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,759,706 | 786,669 | 0.4470 |
11 Sep 2010 12:36:21 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,750,346 | 782,364 | 0.4470 |
11 Sep 2010 11:21:06 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,740,986 | 778,060 | 0.4469 |
10 Sep 2010 21:37:29 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,731,626 | 773,770 | 0.4468 |
10 Sep 2010 17:29:24 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,722,266 | 769,420 | 0.4467 |
10 Sep 2010 06:16:15 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,712,906 | 765,175 | 0.4467 |
10 Sep 2010 05:02:39 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,703,546 | 760,875 | 0.4466 |
10 Sep 2010 03:48:08 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,694,186 | 756,617 | 0.4466 |
09 Sep 2010 17:48:07 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,684,826 | 752,390 | 0.4466 |
09 Sep 2010 06:51:41 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,675,466 | 748,203 | 0.4466 |
09 Sep 2010 05:42:44 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,666,106 | 744,059 | 0.4466 |
09 Sep 2010 02:06:22 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,656,746 | 739,925 | 0.4466 |
09 Sep 2010 01:03:06 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,647,386 | 735,755 | 0.4466 |
08 Sep 2010 23:48:01 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,638,026 | 731,592 | 0.4466 |
08 Sep 2010 21:35:24 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,628,666 | 727,319 | 0.4466 |
08 Sep 2010 19:45:10 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,619,306 | 723,029 | 0.4465 |
08 Sep 2010 17:53:28 | 960056 | 11490798 | famous_ubou_599_200_006647993_2 | 1,609,946 | 718,802 | 0.4465 |
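
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That reading is inferred from the figures in the table, not stated by the page. A minimal sketch that checks it against three of the rows above (the function name `seconds_per_timestep` is illustrative):

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds, reported average) copied from the table above.
rows = [
    (1_787_786, 799_217, 0.4470),
    (1_778_426, 795_052, 0.4471),
    (1_609_946, 718_802, 0.4465),
]

if __name__ == "__main__":
    for timesteps, cpu_seconds, reported in rows:
        computed = seconds_per_timestep(cpu_seconds, timesteps)
        print(timesteps, cpu_seconds, computed, computed == reported)
```

Under this assumption the computed values agree with the reported column for the sampled rows.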
©2024 climateprediction.net