Name | hadcm3n_yl3z_1900_40_007360217_0 |
Workunit | 7557647 |
Created | 6 Jul 2011, 15:11:17 UTC |
Sent | 7 Jul 2011, 19:38:44 UTC |
Report deadline | 7 Oct 2011, 3:05:55 UTC |
Received | 31 Aug 2011, 8:43:49 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1007184 |
Run time | 28 days 10 hours 54 min 45 sec |
CPU time | 22 days 12 hours 20 min 29 sec |
Validate state | Invalid |
Credit | 10,575.36 |
Device peak FLOPS | 1.66 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... 12:13:35 (3056): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:13:36 (3056): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 13:38:29 (4044): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:38:31 (4044): No heartbeat from core client for 30 sec - exiting 13:38:32 (4044): No heartbeat from core client for 30 sec - exiting 13:38:33 (4044): No heartbeat from core client for 30 sec - exiting 13:38:34 (4044): No heartbeat from core client for 30 sec - exiting 13:38:35 (4044): No heartbeat from core client for 30 sec - exiting 13:38:36 (4044): No heartbeat from core client for 30 sec - exiting 13:38:37 (4044): No heartbeat from core client for 30 sec - exiting 13:38:38 (4044): No heartbeat from core client for 30 sec - exiting 13:38:39 (4044): No heartbeat from core client for 30 sec - exiting 13:38:40 (4044): No heartbeat from core client for 30 sec - exiting 13:38:41 (4044): No heartbeat from core client for 30 sec - exiting 13:38:42 (4044): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 11:33:29 (1072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
11:33:31 (1072): No heartbeat from core client for 30 sec - exiting 11:33:32 (1072): No heartbeat from core client for 30 sec - exiting 11:33:33 (1072): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 12:39:17 (2240): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:39:18 (2240): No heartbeat from core client for 30 sec - exiting 12:39:19 (2240): No heartbeat from core client for 30 sec - exiting 12:40:45 (3552): No heartbeat from core client for 30 sec - exiting 12:40:47 (3552): No heartbeat from core client for 30 sec - exiting 12:40:48 (3552): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:40:49 (3552): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
No Process Handle Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3068, selfPID=3068, iMonCtr=1 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/yl3zko.pjb1c10 Error converting file to netcdf: dataout/yl3zko.pib1c10 Error converting file to netcdf: dataout/yl3zko.pfb1c10 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 13:50:18 (2748): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
13:50:20 (2748): No heartbeat from core client for 30 sec - exiting 13:50:21 (2748): No heartbeat from core client for 30 sec - exiting 13:50:22 (2748): No heartbeat from core client for 30 sec - exiting 13:50:23 (2748): No heartbeat from core client for 30 sec - exiting 13:50:24 (2748): No heartbeat from core client for 30 sec - exiting 13:50:25 (2748): No heartbeat from core client for 30 sec - exiting 13:50:26 (2748): No heartbeat from core client for 30 sec - exiting 13:50:27 (2748): No heartbeat from core client for 30 sec - exiting 13:50:28 (2748): No heartbeat from core client for 30 sec - exiting 13:50:29 (2748): No heartbeat from core client for 30 sec - exiting 13:50:30 (2748): No heartbeat from core client for 30 sec - exiting 13:50:31 (2748): No heartbeat from core client for 30 sec - exiting 13:50:32 (2748): No heartbeat from core client for 30 sec - exiting 13:50:33 (2748): No heartbeat from core client for 30 sec - exiting 13:50:34 (2748): No heartbeat from core client for 30 sec - exiting 13:52:53 (2444): No heartbeat from core client for 30 sec - exiting 13:52:55 (2444): No heartbeat from core client for 30 sec - exiting 13:52:56 (2444): No heartbeat from core client for 30 sec - exiting 13:52:57 (2444): No heartbeat from core client for 30 sec - exiting 13:52:58 (2444): No heartbeat from core client for 30 sec - exiting 13:52:59 (2444): No heartbeat from core client for 30 sec - exiting 13:53:00 (2444): No heartbeat from core client for 30 sec - exiting 13:53:01 (2444): No heartbeat from core client for 30 sec - exiting 13:53:02 (2444): No heartbeat from core client for 30 sec - exiting 13:53:03 (2444): No heartbeat from core client for 30 sec - exiting 13:53:04 (2444): No heartbeat from core client for 30 sec - exiting 13:53:05 (2444): No heartbeat from core client for 30 sec - exiting 13:53:06 (2444): No heartbeat from core client for 30 sec - exiting 13:53:07 (2444): No heartbeat from core client for 30 sec - exiting 13:53:08 (2444): No heartbeat from core client for 30 sec - exiting 
13:53:09 (2444): No heartbeat from core client for 30 sec - exiting 13:53:10 (2444): No heartbeat from core client for 30 sec - exiting 13:53:11 (2444): No heartbeat from core client for 30 sec - exiting 13:53:12 (2444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 13:53:18 (2444): No heartbeat from core client for 30 sec - exiting 13:53:23 (2444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 11:19:47 (2236): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:19:48 (2236): No heartbeat from core client for 30 sec - exiting 11:19:49 (2236): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=304, iMonCtr=1 Model crash detected, will try to restart... 11:21:53 (304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:21:55 (304): No heartbeat from core client for 30 sec - exiting 11:21:56 (304): No heartbeat from core client for 30 sec - exiting 11:21:57 (304): No heartbeat from core client for 30 sec - exiting 11:26:54 (2556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:54:43 (2920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:54:45 (2920): No heartbeat from core client for 30 sec - exiting 09:54:46 (2920): No heartbeat from core client for 30 sec - exiting 09:54:47 (2920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Model crashed: ATM_DYN : INVALID THETA DETECTED. 
tmp/pipe_dummy 2048 Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048 Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048 Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048 Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048 Model crashed: ATM_DYN : INVALID THETA DETECTED. tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
30 Aug 2011 20:33:22 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 881,280 | 2,300,978 | 2.6109 |
30 Aug 2011 00:57:24 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 855,360 | 2,232,931 | 2.6105 |
29 Aug 2011 04:29:18 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 829,440 | 2,162,171 | 2.6068 |
28 Aug 2011 08:01:01 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 803,520 | 2,091,500 | 2.6029 |
27 Aug 2011 12:01:59 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 777,600 | 2,020,984 | 2.5990 |
26 Aug 2011 15:24:35 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 751,680 | 1,949,954 | 2.5941 |
25 Aug 2011 19:39:54 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 725,760 | 1,879,801 | 2.5901 |
24 Aug 2011 22:58:28 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 699,840 | 1,809,080 | 2.5850 |
24 Aug 2011 02:47:58 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 673,920 | 1,738,633 | 2.5799 |
23 Aug 2011 06:00:28 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 648,000 | 1,668,410 | 2.5747 |
22 Aug 2011 10:38:48 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 622,080 | 1,599,248 | 2.5708 |
21 Aug 2011 14:31:20 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 596,160 | 1,530,852 | 2.5679 |
20 Aug 2011 19:15:49 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 570,240 | 1,462,932 | 2.5655 |
20 Aug 2011 00:02:22 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 544,320 | 1,395,194 | 2.5632 |
19 Aug 2011 02:39:08 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 518,400 | 1,361,180 | 2.6257 |
18 Aug 2011 03:09:38 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 492,480 | 1,293,778 | 2.6271 |
17 Aug 2011 08:07:21 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 466,560 | 1,226,957 | 2.6298 |
16 Aug 2011 13:11:04 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 440,640 | 1,160,144 | 2.6329 |
15 Aug 2011 13:26:43 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 414,720 | 1,091,894 | 2.6328 |
14 Aug 2011 01:01:16 | 1007184 | 13124305 | hadcm3n_yl3z_1900_40_007360217_0 | 388,800 | 1,023,937 | 2.6336 |
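
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count; the page itself does not state this, so treat it as an inference from the numbers. A minimal Python sketch that re-derives the figure for the three most recent trickles, with values copied from the table (the function and variable names are illustrative, not part of any BOINC or CPDN tool):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
# Newest trickle first, matching the table order.
TRICKLES = [
    (881_280, 2_300_978, 2.6109),
    (855_360, 2_232_931, 2.6105),
    (829_440, 2_162_171, 2.6068),
]


def seconds_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by cumulative timesteps (assumed formula)."""
    return cpu_time_sec / timestep


def timesteps_between(rows):
    """Timestep difference between each trickle and the one reported before it."""
    return [newer[0] - older[0] for newer, older in zip(rows, rows[1:])]


if __name__ == "__main__":
    for timestep, cpu_time_sec, reported in TRICKLES:
        computed = seconds_per_timestep(cpu_time_sec, timestep)
        # Allow for the table's rounding to four decimal places.
        matches = abs(computed - reported) < 1e-4
        print(f"{timestep:>9,} TS  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
    print("timesteps per trickle interval:", timesteps_between(TRICKLES))
```

Because the average is cumulative, it moves slowly from row to row, and the spacing between consecutive trickles in these rows is a constant number of timesteps.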