Name | hadcm3n_8574_1980_40_008464692_0 |
Workunit | 8615531 |
Created | 19 Sep 2013, 14:44:05 UTC |
Sent | 20 Sep 2013, 14:17:02 UTC |
Report deadline | 20 Dec 2013, 21:44:13 UTC |
Received | 12 Feb 2014, 13:46:57 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 25 (0x00000019) Unknown error code |
Computer ID | 1154140 |
Run time | 15 days 8 hours 55 min 16 sec |
CPU time | 14 days 17 hours 58 min 13 sec |
Validate state | Invalid |
Credit | 9,953.28 |
Device peak FLOPS | 3.23 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.26</core_client_version> <![CDATA[ <message> The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5808, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5080, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5880, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5168, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6112, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2616, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2804, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=440, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... 09:19:20 (2228): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:21:03 (4516): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5340, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4756, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4488, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3748, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3708, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3624, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3772, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3812, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2452, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2556, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2956, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1300, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2536, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3148, iMonCtr=1 Model crash detected, will try to restart... 09:53:21 (2984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5860, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4616, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2084, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=900, iMonCtr=1 Model crash detected, will try to restart... 12:45:39 (4120): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5892, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5208, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3668, iMonCtr=1 Model crash detected, will try to restart... 12:03:52 (1276): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1800, iMonCtr=1 Model crash detected, will try to restart... 10:00:40 (4148): No heartbeat from core client for 30 sec - exiting 10:00:41 (4148): No heartbeat from core client for 30 sec - exiting 10:00:42 (4148): No heartbeat from core client for 30 sec - exiting 10:00:43 (4148): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:01:24 (4916): No heartbeat from core client for 30 sec - exiting 10:01:25 (4916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5740, iMonCtr=1 Model crash detected, will try to restart... 10:35:10 (2404): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:00:38 (3060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:21:23 (4100): No heartbeat from core client for 30 sec - exiting 12:21:24 (4100): No heartbeat from core client for 30 sec - exiting 12:21:25 (4100): No heartbeat from core client for 30 sec - exiting 12:21:26 (4100): No heartbeat from core client for 30 sec - exiting 12:21:27 (4100): No heartbeat from core client for 30 sec - exiting 12:21:28 (4100): No heartbeat from core client for 30 sec - exiting 12:21:29 (4100): No heartbeat from core client for 30 sec - exiting 12:21:30 (4100): No heartbeat from core client for 30 sec - exiting 12:21:32 (4100): No heartbeat from core client for 30 sec - exiting 12:21:33 (4100): No heartbeat from core client for 30 sec - exiting 12:21:34 (4100): No heartbeat from core client for 30 sec - exiting 12:21:35 (4100): No heartbeat from core client for 30 sec - exiting 12:21:36 (4100): No heartbeat from core client for 30 sec - exiting 12:21:37 (4100): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:21:38 (4100): No heartbeat from core client for 30 sec - exiting 12:21:39 (4100): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 10:09:27 (2640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2808, iMonCtr=1 Model crash detected, will try to restart... Called boinc_finish </stderr_txt> ]]> |
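The stderr output above repeats three kinds of message many times over, so a tally is easier to read than the raw text. The following is a minimal sketch, not part of BOINC or CPDN: the names `PATTERNS` and `count_events` are hypothetical, and the patterns are simply substrings copied from the log. The sample string is a short excerpt from the log, not the whole of it.

```python
import re
from collections import Counter

# Substrings copied from the stderr text; the category names are our own labels.
PATTERNS = {
    "suspend": r"Suspend request from BOINC",
    "crash": r"Model crash detected",
    "no_heartbeat": r"No heartbeat from core client",
}


def count_events(stderr_text: str) -> Counter:
    """Count how often each known message appears in a stderr dump."""
    return Counter(
        {name: len(re.findall(pattern, stderr_text)) for name, pattern in PATTERNS.items()}
    )


# Short excerpt assembled from messages that appear in the log above.
sample = (
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=5808, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "09:19:20 (2228): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC..."
)

counts = count_events(sample)
print(dict(counts))
```

Passing the full contents of the `<stderr_txt>` element instead of `sample` would give the totals for this result.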
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
11 Feb 2014 11:11:22 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 829,440 | 1,249,976 | 1.5070 |
07 Feb 2014 13:41:18 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 803,520 | 1,209,515 | 1.5053 |
04 Feb 2014 12:34:42 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 777,600 | 1,170,971 | 1.5059 |
30 Jan 2014 15:16:10 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 751,680 | 1,130,568 | 1.5041 |
28 Jan 2014 11:52:00 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 725,760 | 1,092,486 | 1.5053 |
27 Jan 2014 09:57:50 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 699,840 | 1,055,907 | 1.5088 |
27 Jan 2014 09:57:50 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 673,920 | 1,021,769 | 1.5162 |
27 Jan 2014 09:57:50 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 648,000 | 987,606 | 1.5241 |
27 Jan 2014 09:57:50 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 622,080 | 953,432 | 1.5327 |
27 Jan 2014 09:57:50 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 596,160 | 919,206 | 1.5419 |
27 Jan 2014 09:57:50 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 570,240 | 877,154 | 1.5382 |
27 Jan 2014 09:57:50 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 544,320 | 839,082 | 1.5415 |
22 Jan 2014 12:24:09 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 518,400 | 800,229 | 1.5437 |
20 Jan 2014 14:11:40 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 492,480 | 757,672 | 1.5385 |
16 Jan 2014 12:44:19 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 466,560 | 717,556 | 1.5380 |
14 Jan 2014 17:14:38 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 440,640 | 680,978 | 1.5454 |
13 Jan 2014 09:27:57 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 414,720 | 643,074 | 1.5506 |
07 Jan 2014 14:00:51 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 388,800 | 604,512 | 1.5548 |
03 Jan 2014 15:22:45 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 362,880 | 565,579 | 1.5586 |
23 Dec 2013 13:47:27 | 1154140 | 16026036 | hadcm3n_8574_1980_40_008464692_0 | 336,960 | 526,804 | 1.5634 |
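The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That reading is inferred from the column name and the figures in the table, not from any CPDN documentation. The sketch below checks it against three rows and also shows the spacing between consecutive trickles; the function name `seconds_per_timestep` is hypothetical.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by timesteps completed, to 4 decimals."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, cpu_time_sec) taken from the three most recent rows of the table.
rows = [
    (829_440, 1_249_976),
    (803_520, 1_209_515),
    (777_600, 1_170_971),
]

averages = [seconds_per_timestep(cpu, ts) for ts, cpu in rows]

# Timesteps completed between consecutive trickles (newest first).
deltas = [newer[0] - older[0] for newer, older in zip(rows, rows[1:])]

print(averages)
print(deltas)
```

For these rows the computed values match the table's 1.5070, 1.5053 and 1.5059, and each trickle is 25,920 timesteps after the previous one.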