Name | hadcm3n_o2vt_1980_40_008385076_4 |
Workunit | 8535935 |
Created | 14 Sep 2013, 20:40:48 UTC |
Sent | 14 Sep 2013, 21:21:59 UTC |
Report deadline | 15 Dec 2013, 4:49:10 UTC |
Received | 16 Nov 2013, 19:06:48 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1213041 |
Run time | 16 days 3 hours 23 min 53 sec |
CPU time | 13 days 2 hours 14 min 39 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 2.41 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr |
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 193 (0xc1)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4344, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:52:43 (5940): No heartbeat from core client for 30 sec - exiting
16:52:44 (5940): No heartbeat from core client for 30 sec - exiting
16:52:45 (5940): No heartbeat from core client for 30 sec - exiting
16:52:46 (5940): No heartbeat from core client for 30 sec - exiting
16:52:47 (5940): No heartbeat from core client for 30 sec - exiting
16:52:48 (5940): No heartbeat from core client for 30 sec - exiting
16:52:49 (5940): No heartbeat from core client for 30 sec - exiting
16:52:50 (5940): No heartbeat from core client for 30 sec - exiting
16:52:51 (5940): No heartbeat from core client for 30 sec - exiting
16:52:52 (5940): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:26:16 (4664): No heartbeat from core client for 30 sec - exiting
16:26:17 (4664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5340, iMonCtr=1
Model crash detected, will try to restart...
16:35:27 (5548): No heartbeat from core client for 30 sec - exiting
16:35:28 (5548): No heartbeat from core client for 30 sec - exiting
16:35:29 (5548): No heartbeat from core client for 30 sec - exiting
16:35:30 (5548): No heartbeat from core client for 30 sec - exiting
16:35:31 (5548): No heartbeat from core client for 30 sec - exiting
16:35:32 (5548): No heartbeat from core client for 30 sec - exiting
16:35:33 (5548): No heartbeat from core client for 30 sec - exiting
16:35:34 (5548): No heartbeat from core client for 30 sec - exiting
16:35:35 (5548): No heartbeat from core client for 30 sec - exiting
16:35:36 (5548): No heartbeat from core client for 30 sec - exiting
16:35:37 (5548): No heartbeat from core client for 30 sec - exiting
16:35:38 (5548): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
09:53:01 (5564): No heartbeat from core client for 30 sec - exiting
09:53:02 (5564): No heartbeat from core client for 30 sec - exiting
09:53:03 (5564): No heartbeat from core client for 30 sec - exiting
09:53:04 (5564): No heartbeat from core client for 30 sec - exiting
09:53:05 (5564): No heartbeat from core client for 30 sec - exiting
09:53:06 (5564): No heartbeat from core client for 30 sec - exiting
09:53:08 (5564): No heartbeat from core client for 30 sec - exiting
09:53:09 (5564): No heartbeat from core client for 30 sec - exiting
09:53:10 (5564): No heartbeat from core client for 30 sec - exiting
09:53:11 (5564): No heartbeat from core client for 30 sec - exiting
09:53:12 (5564): No heartbeat from core client for 30 sec - exiting
09:53:13 (5564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:53:14 (5564): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
17:55:16 (6032): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3168, iMonCtr=1
Model crash detected, will try to restart...
17:59:57 (5624): No heartbeat from core client for 30 sec - exiting
17:59:58 (5624): No heartbeat from core client for 30 sec - exiting
17:59:59 (5624): No heartbeat from core client for 30 sec - exiting
18:00:00 (5624): No heartbeat from core client for 30 sec - exiting
18:00:01 (5624): No heartbeat from core client for 30 sec - exiting
18:00:02 (5624): No heartbeat from core client for 30 sec - exiting
18:00:03 (5624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:09:30 (5180): No heartbeat from core client for 30 sec - exiting
18:09:31 (5180): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3080, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3080, iMonCtr=1
Model crash detected, will try to restart...
12:38:21 (5356): No heartbeat from core client for 30 sec - exiting
12:38:22 (5356): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:43:58 (5356): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:48:41 (6132): No heartbeat from core client for 30 sec - exiting
17:48:42 (6132): No heartbeat from core client for 30 sec - exiting
17:48:43 (6132): No heartbeat from core client for 30 sec - exiting
17:48:44 (6132): No heartbeat from core client for 30 sec - exiting
17:48:45 (6132): No heartbeat from core client for 30 sec - exiting
17:48:46 (6132): No heartbeat from core client for 30 sec - exiting
17:48:48 (6132): No heartbeat from core client for 30 sec - exiting
17:48:49 (6132): No heartbeat from core client for 30 sec - exiting
17:48:50 (6132): No heartbeat from core client for 30 sec - exiting
17:48:51 (6132): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
C18:44:41 (5404): No heartbeat from core client for 30 sec - exiting
18:44:42 (5404): No heartbeat from core client for 30 sec - exiting
18:44:43 (5404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4920, iMonCtr=1
Model crash detected, will try to restart...
16:37:02 (1672): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:21:13 (6500): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:19:06 (3708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
11:43:40 (5832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:20:18 (1292): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
16:44:18 (5916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
17:16:06 (6140): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=1
Model crash detected, will try to restart...
16:39:07 (6420): No heartbeat from core client for 30 sec - exiting
16:39:08 (6420): No heartbeat from core client for 30 sec - exiting
16:39:09 (6420): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/o2vtko.pjj9c10
Error converting file to netcdf: dataout/o2vtko.pij9c10
Error converting file to netcdf: dataout/o2vtko.pfj9c10
Error converting file to netcdf: dataout/o2vtka.phj9c10
Error converting file to netcdf: dataout/o2vtka.pgj9c10
Error converting file to netcdf: dataout/o2vtka.pej9c10
Error converting file to netcdf: dataout/o2vtka.pdj9c10
16:28:57 (5488): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 11 received, exiting...
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
16 Nov 2013 19:10:53 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 518,400 | 1,131,270 | 2.1822 |
14 Nov 2013 17:42:01 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 492,480 | 1,076,057 | 2.1850 |
11 Nov 2013 17:22:28 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 466,560 | 1,020,516 | 2.1873 |
08 Nov 2013 23:14:00 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 440,640 | 965,860 | 2.1919 |
06 Nov 2013 20:27:02 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 414,720 | 910,895 | 2.1964 |
04 Nov 2013 21:13:37 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 388,800 | 855,640 | 2.2007 |
02 Nov 2013 21:49:10 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 362,880 | 800,048 | 2.2047 |
27 Oct 2013 18:47:17 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 336,960 | 743,380 | 2.2061 |
25 Oct 2013 19:35:33 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 311,040 | 685,911 | 2.2052 |
21 Oct 2013 18:37:07 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 285,120 | 628,712 | 2.2051 |
18 Oct 2013 21:56:10 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 259,200 | 571,174 | 2.2036 |
14 Oct 2013 21:13:46 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 233,280 | 514,414 | 2.2051 |
12 Oct 2013 21:59:37 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 207,360 | 457,575 | 2.2067 |
07 Oct 2013 19:35:35 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 181,440 | 400,859 | 2.2093 |
05 Oct 2013 17:26:34 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 155,520 | 343,471 | 2.2085 |
03 Oct 2013 18:23:21 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 129,600 | 285,925 | 2.2062 |
29 Sep 2013 11:34:49 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 103,680 | 228,764 | 2.2064 |
23 Sep 2013 15:29:05 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 77,760 | 170,869 | 2.1974 |
21 Sep 2013 12:51:48 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 51,840 | 113,283 | 2.1852 |
17 Sep 2013 16:57:49 | 1213041 | 16016764 | hadcm3n_o2vt_1980_40_008385076_4 | 25,920 | 56,782 | 2.1907 |
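
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is inferred from the figures in the table, not stated by the page, and the function name below is hypothetical. A minimal sketch that recomputes the column for a few rows copied from the table:

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 places.

    Assumption: the page's 'Average (sec/TS)' is cumulative CPU time
    divided by cumulative timesteps. This is inferred from the table.
    """
    if timesteps <= 0:
        raise ValueError("timesteps must be positive")
    return round(cpu_seconds / timesteps, 4)


# (timestep, CPU time in seconds) pairs copied from the trickle table.
rows = [
    (518_400, 1_131_270),  # 16 Nov 2013, last trickle
    (51_840, 113_283),     # 21 Sep 2013
    (25_920, 56_782),      # 17 Sep 2013, first trickle
]

for timesteps, cpu_seconds in rows:
    avg = seconds_per_timestep(cpu_seconds, timesteps)
    print(f"{timesteps:>9,} TS  {cpu_seconds:>11,} s  {avg:.4f} sec/TS")
```

Successive trickles in the table are 25,920 timesteps apart, so the same helper can be applied to the difference between two adjacent rows to estimate the per-interval rate instead of the running average.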