Name | hadcm3n_z8kq_1880_40_008200461_2 |
Workunit | 8355585 |
Created | 16 Sep 2012, 14:46:30 UTC |
Sent | 16 Sep 2012, 14:47:17 UTC |
Report deadline | 16 Dec 2012, 22:14:28 UTC |
Received | 4 Nov 2012, 1:47:43 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 775427 |
Run time | 23 days 7 hours 37 min 22 sec |
CPU time | 21 days 15 hours 4 min 57 sec |
Validate state | Invalid |
Credit | 11,819.52 |
Device peak FLOPS | 2.30 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 14:50:18 (1564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 18:50:56 (7692): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9428, iMonCtr=1 Model crash detected, will try to restart... 16:22:51 (5216): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:24:29 (7400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:55:48 (4092): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:01:46 (4016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:26:21 (7952): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 15:27:56 (7776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:21:38 (8632): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 11:22:50 (4932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:25:31 (4592): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 21:34:11 (5192): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
21:34:12 (5192): No heartbeat from core client for 30 sec - exiting 11:20:21 (4232): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:11:22 (4812): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:34:47 (8788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8580, iMonCtr=1 Model crash detected, will try to restart... 19:44:45 (3860): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:44:46 (3860): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 20:46:05 (8268): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7240, iMonCtr=1 Model crash detected, will try to restart... 22:47:51 (3988): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:47:52 (3988): No heartbeat from core client for 30 sec - exiting 23:49:43 (6288): No heartbeat from core client for 30 sec - exiting 23:49:44 (6288): No heartbeat from core client for 30 sec - exiting 23:49:45 (6288): No heartbeat from core client for 30 sec - exiting 23:49:46 (6288): No heartbeat from core client for 30 sec - exiting 23:49:47 (6288): No heartbeat from core client for 30 sec - exiting 23:49:48 (6288): No heartbeat from core client for 30 sec - exiting 23:49:49 (6288): No heartbeat from core client for 30 sec - exiting 23:49:50 (6288): No heartbeat from core client for 30 sec - exiting 23:49:51 (6288): No heartbeat from core client for 30 sec - exiting 23:49:52 (6288): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/z8kqko.pj89c10 Error converting file to netcdf: dataout/z8kqko.pi89c10 Error converting file to netcdf: dataout/z8kqko.pf89c10 Error converting file to netcdf: dataout/z8kqka.ph89c10 Error converting file to netcdf: dataout/z8kqka.pg89c10 Error converting file to netcdf: dataout/z8kqka.pe89c10 Error converting file to netcdf: dataout/z8kqka.pd89c10 12:32:04 (5744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:32:05 (5744): No heartbeat from core client for 30 sec - exiting 20:03:39 (4204): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:12:30 (7776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7156, iMonCtr=1 Model crash detected, will try to restart... 08:14:33 (3612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:12:30 (2432): No heartbeat from core client for 30 sec - exiting 12:12:32 (2432): No heartbeat from core client for 30 sec - exiting 12:12:33 (2432): No heartbeat from core client for 30 sec - exiting 12:12:34 (2432): No heartbeat from core client for 30 sec - exiting 12:12:35 (2432): No heartbeat from core client for 30 sec - exiting 12:12:36 (2432): No heartbeat from core client for 30 sec - exiting 12:12:37 (2432): No heartbeat from core client for 30 sec - exiting 12:12:38 (2432): No heartbeat from core client for 30 sec - exiting 12:12:39 (2432): No heartbeat from core client for 30 sec - exiting 12:12:40 (2432): No heartbeat from core client for 30 sec - exiting 12:12:41 (2432): No heartbeat from core client for 30 sec - exiting 12:12:42 (2432): No heartbeat from core client for 30 sec - exiting 12:12:43 (2432): No heartbeat from core client for 30 sec - exiting 12:12:44 (2432): No heartbeat from core client for 30 sec - exiting 12:12:45 (2432): No heartbeat from core client for 30 sec - exiting 12:12:46 (2432): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:56:17 (4488): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:34:40 (8156): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:23:06 (1504): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:24:35 (4352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 15:47:52 (7656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:27:08 (4512): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1156, iMonCtr=1 Model crash detected, will try to restart... 08:07:09 (4948): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 08:08:48 (3744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:13:32 (5780): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:19:39 (4492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... C16:45:54 (4828): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 18:37:34 (5404): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:27:07 (5712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=1 Model crash detected, will try to restart... 22:28:52 (4660): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:28:53 (4660): No heartbeat from core client for 30 sec - exiting 15:57:37 (6272): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:57:38 (6272): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 03:40:27 (5604): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
03:40:28 (5604): No heartbeat from core client for 30 sec - exiting 05:57:57 (7348): No heartbeat from core client for 30 sec - exiting 05:57:58 (7348): No heartbeat from core client for 30 sec - exiting 05:57:59 (7348): No heartbeat from core client for 30 sec - exiting 05:58:00 (7348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... zip error: Could not create output file (was replacing the original zip file) Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 19:48:15 (5656): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:59:32 (2624): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2460, iMonCtr=1 Model crash detected, will try to restart... 18:09:02 (1600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 14:16:08 (6072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:58:00 (7856): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:58:01 (7856): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 22:42:27 (4756): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
11:48:42 (5404): No heartbeat from core client for 30 sec - exiting 11:48:43 (5404): No heartbeat from core client for 30 sec - exiting 11:48:44 (5404): No heartbeat from core client for 30 sec - exiting 11:48:45 (5404): No heartbeat from core client for 30 sec - exiting 11:48:46 (5404): No heartbeat from core client for 30 sec - exiting 11:48:47 (5404): No heartbeat from core client for 30 sec - exiting 11:48:48 (5404): No heartbeat from core client for 30 sec - exiting 11:48:49 (5404): No heartbeat from core client for 30 sec - exiting 11:48:50 (5404): No heartbeat from core client for 30 sec - exiting 11:48:51 (5404): No heartbeat from core client for 30 sec - exiting 11:48:52 (5404): No heartbeat from core client for 30 sec - exiting 11:48:53 (5404): No heartbeat from core client for 30 sec - exiting 11:48:54 (5404): No heartbeat from core client for 30 sec - exiting 11:48:55 (5404): No heartbeat from core client for 30 sec - exiting 11:48:56 (5404): No heartbeat from core client for 30 sec - exiting 11:48:57 (5404): No heartbeat from core client for 30 sec - exiting 11:48:58 (5404): No heartbeat from core client for 30 sec - exiting 11:48:59 (5404): No heartbeat from core client for 30 sec - exiting 11:49:00 (5404): No heartbeat from core client for 30 sec - exiting 11:49:01 (5404): No heartbeat from core client for 30 sec - exiting 11:49:02 (5404): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... MainError: 05:49:09 PM No files match the supplied pattern. MainError: 05:49:09 PM No files match the supplied pattern. 19:12:12 (6640): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:40:03 (5044): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:06:23 (6476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
17:07:56 (7112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 14:15:06 (5864): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:23:05 (7548): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:18:41 (7784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:20:08 (4912): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:34:00 (4632): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:21:27 (4992): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:55:22 (6264): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:56:40 (6412): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:07:22 (7564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 04:03:49 (5312): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:24:03 (6428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:27:42 (7832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1 Model crash detected, will try to restart... 13:40:17 (5268): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
18:22:55 (7400): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 
- Return code = 16 Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
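
The stderr text above repeats a handful of message types many times over. As a rough aid for reading it, the sketch below counts occurrences of four of those messages in a block of text. The pattern names and the `tally` helper are hypothetical and are not part of BOINC or CPDN tooling. The sample string is a short excerpt assembled from messages in the log, not the full output.

```python
import re
from collections import Counter

# Hypothetical labels mapped to message fragments copied from the stderr text.
PATTERNS = {
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "restart_after_crash": re.compile(r"Model crash detected, will try to restart"),
    "stwork_crash": re.compile(r"Model crashed: STWORK"),
    "buffin_io_error": re.compile(r"BUFFIN: C I/O Error feof"),
}


def tally(stderr_text):
    """Count how often each known message fragment occurs in the text."""
    return Counter(
        {name: len(pattern.findall(stderr_text)) for name, pattern in PATTERNS.items()}
    )


# Short excerpt built from messages that appear in the log above.
sample = (
    "14:50:18 (1564): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Model crash detected, will try to restart... "
    "BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 "
    "Model crashed: STWORK : I/O error - PP fixed length header tmp/pipe_dummy 2048"
)

counts = tally(sample)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Feeding the full stderr text to `tally` instead of `sample` would give per-message totals for the whole run.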
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
04 Nov 2012 00:24:24 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 984,960 | 1,868,671 | 1.8972 |
02 Nov 2012 23:28:28 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 959,040 | 1,818,946 | 1.8966 |
02 Nov 2012 08:14:51 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 933,120 | 1,770,493 | 1.8974 |
01 Nov 2012 16:32:46 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 907,200 | 1,719,921 | 1.8959 |
01 Nov 2012 01:25:16 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 881,280 | 1,666,972 | 1.8915 |
01 Nov 2012 01:25:16 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 855,360 | 1,614,461 | 1.8875 |
30 Oct 2012 10:20:10 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 829,440 | 1,563,111 | 1.8845 |
29 Oct 2012 18:43:06 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 803,520 | 1,510,090 | 1.8793 |
28 Oct 2012 17:49:55 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 777,600 | 1,459,024 | 1.8763 |
27 Oct 2012 18:52:26 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 751,680 | 1,409,701 | 1.8754 |
26 Oct 2012 18:14:34 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 725,760 | 1,360,469 | 1.8745 |
25 Oct 2012 18:56:25 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 699,840 | 1,313,313 | 1.8766 |
24 Oct 2012 18:43:05 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 673,920 | 1,266,627 | 1.8795 |
23 Oct 2012 19:48:38 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 648,000 | 1,220,951 | 1.8842 |
22 Oct 2012 20:55:09 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 622,080 | 1,175,311 | 1.8893 |
21 Oct 2012 23:31:11 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 596,160 | 1,129,181 | 1.8941 |
21 Oct 2012 00:18:45 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 570,240 | 1,083,523 | 1.9001 |
20 Oct 2012 01:05:30 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 544,320 | 1,038,093 | 1.9071 |
19 Oct 2012 12:01:56 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 518,400 | 993,170 | 1.9158 |
18 Oct 2012 22:14:39 | 775427 | 15288830 | hadcm3n_z8kq_1880_40_008200461_2 | 492,480 | 947,620 | 1.9242 |
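
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. This is an inference from the numbers, not something the page states. The sketch below checks that reading against three rows copied from the table. The function name `seconds_per_timestep` is hypothetical.

```python
# (timestep, cpu_time_sec, reported_average) copied from three table rows.
rows = [
    (984_960, 1_868_671, 1.8972),
    (959_040, 1_818_946, 1.8966),
    (492_480, 947_620, 1.9242),
]


def seconds_per_timestep(cpu_seconds, timestep):
    """Cumulative CPU seconds divided by the number of timesteps completed."""
    return cpu_seconds / timestep


for timestep, cpu_seconds, reported in rows:
    computed = seconds_per_timestep(cpu_seconds, timestep)
    # Assumption: the page rounds to four decimals, so allow half a unit
    # in the last reported digit when comparing.
    matches = abs(computed - reported) < 5e-5
    print(f"timestep {timestep}: computed {computed:.4f}, reported {reported}, match={matches}")
```

Under that assumption all three rows agree with the reported values to four decimal places.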
©2024 cpdn.org