Name | famous_uner_1999_200_006663182_0 |
Workunit | 6866554 |
Created | 10 Jun 2010, 15:25:56 UTC |
Sent | 5 Jul 2010, 9:01:20 UTC |
Report deadline | 4 Oct 2010, 16:28:31 UTC |
Received | 22 Jul 2010, 15:15:42 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1297050 |
Run time | 4 days 19 hours 0 min 11 sec |
CPU time | 4 days 14 hours 35 min 9 sec |
Validate state | Invalid |
Credit | 1,945.63 |
Device peak FLOPS | 1.74 GFLOPS |
Application version | UK Met Office FAMOUS v6.11 i686-pc-linux-gnu |
Stderr | <core_client_version>6.10.56</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63) </message> <stderr_txt> (20088): Can't acquire lockfile (-154) - waiting 35s (19506): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... (20088): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=16091, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=16091, iMonCtr=1 Model crash detected, will try to restart... (28129): Can't acquire lockfile (-154) - waiting 35s (16091): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=28129, iMonCtr=1 Model crash detected, will try to restart... BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 60 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 61 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 68 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 69 - Return code = 1 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25456, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25456, iMonCtr=1 Model crash detected, will try to restart... (25456): No heartbeat from core client for 30 sec - exiting (6437): Can't acquire lockfile (-154) - waiting 35s CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6437, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6437, iMonCtr=1 Model crash detected, will try to restart... (6437): No heartbeat from core client for 30 sec - exiting (19810): Can't acquire lockfile (-154) - waiting 35s CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25966, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=25966, iMonCtr=1 Model crash detected, will try to restart... (26931): Can't acquire lockfile (-154) - waiting 35s (25966): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... (29701): Can't acquire lockfile (-154) - waiting 35s (28121): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... (31029): Can't acquire lockfile (-154) - waiting 35s (31029): Can't acquire lockfile (-154) - exiting (29607): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
(10730): Can't acquire lockfile (-154) - waiting 35s (10730): Can't acquire lockfile (-154) - exiting (9337): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... (21395): Can't acquire lockfile (-154) - waiting 35s (21395): Can't acquire lockfile (-154) - exiting (10820): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=32242, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=27963, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=27963, iMonCtr=1 Model crash detected, will try to restart... (7579): Can't acquire lockfile (-154) - waiting 35s (27963): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=19776, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=19776, iMonCtr=1 Model crash detected, will try to restart... (31804): Can't acquire lockfile (-154) - waiting 35s (19776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 15 received, exiting... (14442): called boinc_finish SIGSEGV: segmentation violation Stack trace (7 frames): ../../projects/climateprediction.net/famous_6.11_i686-pc-linux-gnu(boinc_catch_signal+0x58)[0x809e59c] [0xffffe400] ../../projects/climateprediction.net/famous_6.11_i686-pc-linux-gnu[0x804f906] ../../projects/climateprediction.net/famous_6.11_i686-pc-linux-gnu[0x805085a] ../../projects/climateprediction.net/famous_6.11_i686-pc-linux-gnu[0x8050ad6] /lib32/libc.so.6(__libc_start_main+0xe5)[0xf749542d] ../../projects/climateprediction.net/famous_6.11_i686-pc-linux-gnu(__gxx_personality_v0+0xe1)[0x804c449] Exiting... </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
22 Jul 2010 06:04:24 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 589,706 | 391,897 | 0.6646 |
22 Jul 2010 04:22:09 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 580,346 | 385,658 | 0.6645 |
22 Jul 2010 02:09:39 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 570,986 | 379,440 | 0.6645 |
22 Jul 2010 01:33:36 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 561,626 | 373,267 | 0.6646 |
21 Jul 2010 22:06:07 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 552,266 | 367,076 | 0.6647 |
21 Jul 2010 20:21:31 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 542,906 | 360,893 | 0.6647 |
21 Jul 2010 18:36:29 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 533,546 | 354,636 | 0.6647 |
21 Jul 2010 16:23:09 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 524,186 | 348,412 | 0.6647 |
21 Jul 2010 15:35:15 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 514,826 | 342,201 | 0.6647 |
21 Jul 2010 13:28:59 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 505,466 | 335,986 | 0.6647 |
21 Jul 2010 10:38:06 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 496,106 | 329,797 | 0.6648 |
21 Jul 2010 09:43:53 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 486,746 | 323,610 | 0.6648 |
21 Jul 2010 06:58:41 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 477,386 | 317,333 | 0.6647 |
21 Jul 2010 05:55:06 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 468,026 | 311,125 | 0.6648 |
21 Jul 2010 02:38:10 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 458,666 | 304,929 | 0.6648 |
21 Jul 2010 01:26:07 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 449,306 | 298,686 | 0.6648 |
20 Jul 2010 23:59:16 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 439,946 | 292,472 | 0.6648 |
20 Jul 2010 22:20:02 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 430,586 | 286,276 | 0.6649 |
20 Jul 2010 19:05:16 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 421,226 | 280,077 | 0.6649 |
20 Jul 2010 16:46:38 | 779807 | 11566768 | famous_uner_1999_200_006663182_0 | 411,866 | 273,896 | 0.6650 |
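
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. That reading is inferred from the figures alone, not from any CPDN documentation, so treat it as an assumption. The sketch below is a minimal check of that assumption against a few rows copied from the table; the names `TRICKLES`, `seconds_per_timestep`, and `matches_reported` are hypothetical and exist only for this illustration.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
# Only a sample of rows is included; the remaining rows can be added the same way.
TRICKLES = [
    (589_706, 391_897, 0.6646),
    (580_346, 385_658, 0.6645),
    (570_986, 379_440, 0.6645),
    (411_866, 273_896, 0.6650),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative timesteps (assumed formula)."""
    return cpu_time_sec / timestep


def matches_reported(cpu_time_sec: int, timestep: int, reported: float,
                     tol: float = 1e-4) -> bool:
    """True if the recomputed average agrees with the table to four decimals.

    The tolerance is deliberately loose because it is not known whether the
    page rounds or truncates the displayed value.
    """
    return abs(seconds_per_timestep(cpu_time_sec, timestep) - reported) <= tol


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        computed = seconds_per_timestep(cpu, timestep)
        status = "ok" if matches_reported(cpu, timestep, reported) else "MISMATCH"
        print(f"{timestep:>9,d} TS  {cpu:>9,d} s  "
              f"computed={computed:.4f}  reported={reported:.4f}  {status}")
```

Under this assumption the sampled rows agree with the reported values to four decimal places, which suggests the per-timestep cost stayed close to 0.665 seconds across the trickles sampled.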
©2024 cpdn.org