Name | hadcm3n_88oh_1980_40_008720668_0 |
Workunit | 8866646 |
Created | 23 Apr 2014, 12:26:01 UTC |
Sent | 5 May 2014, 18:45:21 UTC |
Report deadline | 5 Aug 2014, 2:12:32 UTC |
Received | 27 Jun 2014, 22:18:08 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 25 (0x00000019) Unknown error code |
Computer ID | 1184169 |
Run time | 9 days 14 hours 26 min 31 sec |
CPU time | 9 days 0 hours 50 min 9 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 3.71 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.2.42</core_client_version> <![CDATA[ <message> The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:41:47 (23948): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 00:14:23 (27184): No heartbeat from core client for 30 sec - exiting 00:14:24 (27184): No heartbeat from core client for 30 sec - exiting 00:14:25 (27184): No heartbeat from core client for 30 sec - exiting 00:14:26 (27184): No heartbeat from core client for 30 sec - exiting 00:14:27 (27184): No heartbeat from core client for 30 sec - exiting 00:14:28 (27184): No heartbeat from core client for 30 sec - exiting 00:14:29 (27184): No heartbeat from core client for 30 sec - exiting 00:14:30 (27184): No heartbeat from core client for 30 sec - exiting 00:14:31 (27184): No heartbeat from core client for 30 sec - exiting 00:14:32 (27184): No heartbeat from core client for 30 sec - exiting 00:14:33 (27184): No heartbeat from core client for 30 sec - exiting 00:14:34 (27184): No heartbeat from core client for 30 sec - exiting 00:14:35 (27184): No heartbeat from core client for 30 sec - exiting 00:14:36 (27184): No heartbeat from core client for 30 sec - exiting 00:14:37 (27184): No heartbeat from core client for 30 sec - exiting CPDN Monitor 
- No 'heartbeat' from BOINC... 18:48:17 (3956): No heartbeat from core client for 30 sec - exiting 18:48:18 (3956): No heartbeat from core client for 30 sec - exiting 18:48:19 (3956): No heartbeat from core client for 30 sec - exiting 18:48:20 (3956): No heartbeat from core client for 30 sec - exiting 18:48:21 (3956): No heartbeat from core client for 30 sec - exiting 18:48:22 (3956): No heartbeat from core client for 30 sec - exiting 18:48:23 (3956): No heartbeat from core client for 30 sec - exiting 18:48:24 (3956): No heartbeat from core client for 30 sec - exiting 18:48:25 (3956): No heartbeat from core client for 30 sec - exiting 18:48:26 (3956): No heartbeat from core client for 30 sec - exiting 18:48:27 (3956): No heartbeat from core client for 30 sec - exiting 18:48:28 (3956): No heartbeat from core client for 30 sec - exiting 18:48:29 (3956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 15:15:11 (1752): No heartbeat from core client for 30 sec - exiting 15:15:12 (1752): No heartbeat from core client for 30 sec - exiting 15:15:13 (1752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 00:53:33 (203956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 23:37:22 (75948): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=74952, iMonCtr=1 Model crash detected, will try to restart... 19:38:39 (3612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 21:54:29 (11492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12192, iMonCtr=1 Model crash detected, will try to restart... 17:24:26 (9016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 00:17:02 (99200): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:41:15 (2528): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 00:50:30 (180428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:10:33 (8756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 20:52:27 (12284): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/88ohko.pjj2c10 Error converting file to netcdf: dataout/88ohko.pij2c10 Error converting file to netcdf: dataout/88ohko.pfj2c10 Error converting file to netcdf: dataout/88ohka.phj2c10 Error converting file to netcdf: dataout/88ohka.pgj2c10 Error converting file to netcdf: dataout/88ohka.pej2c10 Error converting file to netcdf: dataout/88ohka.pdj2c10 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:13:49 (9008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 20:12:10 (6824): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 16:52:20 (1992): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9552, iMonCtr=1 Model crash detected, will try to restart... 08:57:54 (3876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:19:58 (3492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=42612, iMonCtr=1 Model crash detected, will try to restart... 
12:28:23 (8844): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish Suspended CPDN Monitor - Suspend request from BOINC... 18:43:50 (5128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 22:08:58 (1252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:08:51 (3044): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:02:35 (8436): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:26:33 (9532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 02:08:31 (49504): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:47:20 (8256): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 22:09:52 (8684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:22:16 (1660): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 
19:56:48 (10580): No heartbeat from core client for 30 sec - exiting 19:56:49 (10580): No heartbeat from core client for 30 sec - exiting 19:56:50 (10580): No heartbeat from core client for 30 sec - exiting 19:56:51 (10580): No heartbeat from core client for 30 sec - exiting 19:56:52 (10580): No heartbeat from core client for 30 sec - exiting 19:56:53 (10580): No heartbeat from core client for 30 sec - exiting 19:56:54 (10580): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:56:55 (10580): No heartbeat from core client for 30 sec - exiting 22:06:06 (11556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 23:06:22 (12264): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12832, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 23:11:27 (134132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:11:59 (122916): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 13:37:24 (4568): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 23:33:50 (1156): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:33:51 (1156): No heartbeat from core client for 30 sec - exiting 19:16:33 (8588): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 
22:36:15 (23300): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 19:53:53 (4384): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:42:41 (12976): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/88ohko.pji9c10 Error converting file to netcdf: dataout/88ohko.pii9c10 Error converting file to netcdf: dataout/88ohko.pfi9c10 Error converting file to netcdf: dataout/88ohka.phi9c10 Error converting file to netcdf: dataout/88ohka.pgi9c10 Error converting file to netcdf: dataout/88ohka.pei9c10 Called boinc_finish </stderr_txt> ]]> |
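
The stderr output above is a single unbroken run of messages, so it is hard to see how often each kind occurs. A minimal sketch of a tally, assuming the message texts match the ones copied from this log; the category names and the `tally_stderr` helper are hypothetical and not part of BOINC or CPDN:

```python
import re
from collections import Counter

# Category names are made up for illustration; the message texts are
# copied from the stderr output of this result.
PATTERNS = {
    "no_heartbeat": re.compile(r"\(\d+\): No heartbeat from core client for 30 sec - exiting"),
    "suspend_request": re.compile(r"CPDN Monitor - Suspend request from BOINC"),
    "quit_request": re.compile(r"CPDN Monitor - Quit request from BOINC"),
    "restart_attempt": re.compile(r"Model crash detected, will try to restart"),
    "buffin_io_error": re.compile(r"BUFFIN: C I/O Error feof - Unit \d+"),
    "netcdf_error": re.compile(r"Error converting file to netcdf: \S+"),
    "initdump_crash": re.compile(r"Model crashed: INITDUMP"),
}


def tally_stderr(text: str) -> Counter:
    """Count non-overlapping occurrences of each known message type."""
    return Counter({name: len(p.findall(text)) for name, p in PATTERNS.items()})


# A few messages copied from the log, joined the same way (space-separated).
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "20:52:27 (12284): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 "
    "BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 "
    "Error converting file to netcdf: dataout/88ohko.pjj2c10"
)

counts = tally_stderr(sample)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Passing the full `<stderr_txt>` content to `tally_stderr` instead of `sample` gives the counts for the whole run.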
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
10 Jun 2014 09:02:06 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 518,400 | 525,097 | 1.0129 |
10 Jun 2014 09:00:47 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 492,480 | 498,698 | 1.0126 |
10 Jun 2014 00:31:26 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 466,560 | 472,618 | 1.0130 |
07 Jun 2014 00:02:16 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 440,640 | 446,386 | 1.0130 |
06 Jun 2014 11:08:08 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 414,720 | 420,421 | 1.0137 |
05 Jun 2014 19:04:42 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 388,800 | 394,564 | 1.0148 |
03 Jun 2014 20:28:47 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 362,880 | 368,514 | 1.0155 |
03 Jun 2014 12:42:26 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 336,960 | 342,552 | 1.0166 |
02 Jun 2014 18:58:50 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 311,040 | 316,619 | 1.0179 |
01 Jun 2014 12:28:42 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 285,120 | 290,417 | 1.0186 |
31 May 2014 13:30:05 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 259,200 | 264,213 | 1.0193 |
30 May 2014 20:13:38 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 233,280 | 238,122 | 1.0208 |
29 May 2014 19:59:19 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 207,360 | 211,833 | 1.0216 |
25 May 2014 21:40:34 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 181,440 | 185,775 | 1.0239 |
24 May 2014 17:12:45 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 155,520 | 159,680 | 1.0267 |
23 May 2014 17:22:22 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 129,600 | 133,439 | 1.0296 |
18 May 2014 13:02:50 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 103,680 | 107,276 | 1.0347 |
17 May 2014 15:58:58 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 77,760 | 80,869 | 1.0400 |
14 May 2014 19:56:15 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 51,840 | 54,158 | 1.0447 |
10 May 2014 19:23:08 | 1184169 | 16585455 | hadcm3n_88oh_1980_40_008720668_0 | 25,920 | 27,296 | 1.0531 |
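
The Average (sec/TS) column appears to be the cumulative CPU Time divided by the cumulative Timestep. A minimal sketch that checks this reading against three rows of the table; the function name is hypothetical and the definition is an assumption inferred from the numbers, not taken from CPDN documentation:

```python
# (timestep, cpu_time_sec, reported_average) copied from three rows of the table.
rows = [
    (518_400, 525_097, 1.0129),
    (492_480, 498_698, 1.0126),
    (25_920, 27_296, 1.0531),
]


def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition: cumulative CPU seconds per cumulative timestep."""
    return cpu_time_sec / timestep


# Round to four decimals, the precision shown in the table.
computed = [round(average_sec_per_timestep(cpu, ts), 4) for ts, cpu, _ in rows]
reported = [avg for _, _, avg in rows]

for (ts, cpu, avg), value in zip(rows, computed):
    print(f"timestep={ts} cpu={cpu} computed={value} reported={avg}")
```

For these rows the computed values agree with the reported column to four decimal places, which is consistent with the assumed definition.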