Name | hadcm3n_o5kd_2140_40_008282718_1 |
Workunit | 8433853 |
Created | 1 Feb 2013, 13:05:45 UTC |
Sent | 1 Feb 2013, 13:06:02 UTC |
Report deadline | 3 May 2013, 20:33:13 UTC |
Received | 18 Feb 2013, 14:42:50 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1100639 |
Run time | 15 days 13 hours 1 min 7 sec |
CPU time | 14 days 22 hours 4 min 16 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 2.68 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 22:40:00 (2893116): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 00:06:07 (3520452): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:06:08 (3520452): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 09:35:45 (4183304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 13:57:11 (4486632): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 17:20:52 (176444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:21:24 (177752): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 21:57:43 (996492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:46:51 (477540): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:11:18 (661252): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... 16:11:06 (672552): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:58:59 (684324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 13:58:03 (727188): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 20:41:58 (2171120): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:41:59 (2171120): No heartbeat from core client for 30 sec - exiting 20:42:00 (2171120): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 20:44:23 (2326240): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:46:59 (2328876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
CPDN Monitor - Quit request from BOINC... 15:56:45 (652428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:56:46 (652428): No heartbeat from core client for 30 sec - exiting 15:56:47 (652428): No heartbeat from core client for 30 sec - exiting 15:56:48 (652428): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 11:46:55 (5350372): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:46:56 (5350372): No heartbeat from core client for 30 sec - exiting 11:46:57 (5350372): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 11:54:00 (5393200): No heartbeat from core client for 30 sec - exiting 11:54:01 (5393200): No heartbeat from core client for 30 sec - exiting 11:54:02 (5393200): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:07:25 (5407016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:09:03 (5408300): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 12:13:27 (5409860): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:13:28 (5409860): No heartbeat from core client for 30 sec - exiting 12:13:29 (5409860): No heartbeat from core client for 30 sec - exiting 12:13:30 (5409860): No heartbeat from core client for 30 sec - exiting 12:13:31 (5409860): No heartbeat from core client for 30 sec - exiting 12:13:32 (5409860): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 12:15:03 (5412176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:15:04 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:05 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:06 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:07 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:08 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:09 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:10 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:11 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:12 (5412176): No heartbeat from core client for 30 sec - exiting 12:15:13 (5412176): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 21:15:15 (5967564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... MainError: 12:09:15 AM No files match the supplied pattern. MainError: 12:09:15 AM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 12:28:01 AM No files match the supplied pattern. MainError: 12:28:01 AM No files match the supplied pattern. MainError: 12:54:18 AM No files match the supplied pattern. MainError: 12:54:18 AM No files match the supplied pattern. 01:32:45 (2709772): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 01:37:20 (1345460): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:37:21 (1345460): No heartbeat from core client for 30 sec - exiting 01:37:22 (1345460): No heartbeat from core client for 30 sec - exiting 01:37:23 (1345460): No heartbeat from core client for 30 sec - exiting 01:37:24 (1345460): No heartbeat from core client for 30 sec - exiting 01:37:25 (1345460): No heartbeat from core client for 30 sec - exiting 01:37:27 (1345460): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 01:30:53 PM No files match the supplied pattern. 
MainError: 01:30:53 PM No files match the supplied pattern. 11:21:04 (1349080): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 01:57:41 AM No files match the supplied pattern. MainError: 01:57:41 AM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 02:22:37 PM No files match the supplied pattern. MainError: 02:22:37 PM No files match the supplied pattern. MainError: 02:54:22 AM No files match the supplied pattern. MainError: 02:54:22 AM No files match the supplied pattern. CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 03:08:10 PM No files match the supplied pattern. MainError: 03:08:10 PM No files match the supplied pattern. CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... MainError: 04:36:06 AM No files match the supplied pattern. MainError: 04:36:06 AM No files match the supplied pattern. Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... MainError: 05:11:49 PM No files match the supplied pattern. MainError: 05:11:49 PM No files match the supplied pattern. CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Error converting file to netcdf: dataout/o5kdka.ph11c10 Error converting file to netcdf: dataout/o5kdka.pg11c10 Error converting file to netcdf: dataout/o5kdka.pe11c10 MainError: 09:12:37 AM No files match the supplied pattern. MainError: 09:12:37 AM No files match the supplied pattern. BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - 
Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
18 Feb 2013 09:53:32 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 777,600 | 1,306,877 | 1.6807 |
17 Feb 2013 17:16:07 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 751,680 | 1,262,456 | 1.6795 |
17 Feb 2013 04:41:09 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 725,760 | 1,219,113 | 1.6798 |
16 Feb 2013 15:10:37 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 699,840 | 1,176,056 | 1.6805 |
16 Feb 2013 02:57:45 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 673,920 | 1,133,502 | 1.6820 |
15 Feb 2013 14:50:29 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 648,000 | 1,090,200 | 1.6824 |
15 Feb 2013 02:00:44 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 622,080 | 1,047,167 | 1.6833 |
14 Feb 2013 13:35:16 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 596,160 | 1,004,095 | 1.6843 |
14 Feb 2013 00:58:56 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 570,240 | 960,936 | 1.6851 |
13 Feb 2013 12:30:01 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 544,320 | 917,989 | 1.6865 |
13 Feb 2013 00:14:31 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 518,400 | 874,684 | 1.6873 |
12 Feb 2013 12:30:02 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 492,480 | 831,325 | 1.6880 |
11 Feb 2013 21:05:40 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 466,560 | 788,148 | 1.6893 |
11 Feb 2013 08:56:13 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 440,640 | 745,032 | 1.6908 |
10 Feb 2013 16:31:03 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 414,720 | 702,569 | 1.6941 |
10 Feb 2013 03:35:44 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 388,800 | 659,338 | 1.6958 |
09 Feb 2013 13:30:14 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 362,880 | 616,115 | 1.6978 |
09 Feb 2013 00:39:46 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 336,960 | 573,059 | 1.7007 |
08 Feb 2013 11:12:55 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 311,040 | 530,008 | 1.7040 |
07 Feb 2013 22:52:59 | 1100639 | 15578252 | hadcm3n_o5kd_2140_40_008282718_1 | 285,120 | 487,651 | 1.7103 |
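
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table above. The names `TRICKLES`, `seconds_per_timestep`, and `check_rows` are hypothetical and are not part of BOINC or the CPDN software.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
# The relationship tested here is an assumption inferred from the numbers,
# not something the page itself documents.
TRICKLES = [
    (777_600, 1_306_877, 1.6807),
    (751_680, 1_262_456, 1.6795),
    (285_120, 487_651, 1.7103),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return cumulative CPU seconds per model timestep, rounded to 4 places."""
    return round(cpu_time_sec / timestep, 4)


def check_rows(rows):
    """Pair each reported average with the value recomputed from its row."""
    return [
        (timestep, seconds_per_timestep(cpu, timestep), reported)
        for timestep, cpu, reported in rows
    ]


if __name__ == "__main__":
    for timestep, computed, reported in check_rows(TRICKLES):
        status = "match" if abs(computed - reported) < 1e-9 else "differs"
        print(f"timestep {timestep}: computed {computed}, reported {reported} ({status})")
```

Successive rows in the table differ by 25,920 timesteps, so the same helper can be applied to the difference between two rows to estimate the rate over a single trickle interval rather than the whole run.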