Name | hadcm3n_4bce_1940_40_008307646_1 |
Workunit | 8458781 |
Created | 10 Feb 2013, 14:03:32 UTC |
Sent | 10 Feb 2013, 14:03:44 UTC |
Report deadline | 12 May 2013, 21:30:55 UTC |
Received | 3 Apr 2013, 13:34:18 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1077037 |
Run time | 45 days 20 hours 3 min 8 sec |
CPU time | 42 days 6 hours 24 min 17 sec |
Validate state | Invalid |
Credit | 12,441.60 |
Device peak FLOPS | 2.26 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> 10:12:30 (8164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:12:31 (8164): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 09:45:22 (4804): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 11:26:17 (5132): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 20:14:31 (8168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=22168, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 02:40:59 (1352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 08:34:18 (6168): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:34:20 (6168): No heartbeat from core client for 30 sec - exiting 08:36:17 (32164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:54:46 (47440): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:58:18 (49492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=45776, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/4bceko.pjf4c10 Error converting file to netcdf: dataout/4bceko.pif4c10 Error converting file to netcdf: dataout/4bceko.pff4c10 Error converting file to netcdf: dataout/4bceka.phf4c10 Error converting file to netcdf: dataout/4bceka.pgf4c10 Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 17:53:10 (5540): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 04:12:15 (26388): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 00:02:38 (11008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 00:02:40 (11008): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 04:15:36 (3564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 04:26:59 (57528): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 05:34:35 (7492): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
05:43:15 (58572): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Quit request from BOINC... 07:24:29 (6456): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 10:22:41 (8444): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... 07:54:02 (9580): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:56:20 (8156): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:02:55 (14984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:06:46 (10056): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:09:57 (4688): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:13:05 (22164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
03 Apr 2013 13:35:04 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 1,036,800 | 3,651,854 | 3.5222 |
02 Apr 2013 14:10:28 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 1,010,880 | 3,571,912 | 3.5335 |
01 Apr 2013 16:46:24 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 984,960 | 3,493,850 | 3.5472 |
31 Mar 2013 14:25:27 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 959,040 | 3,404,087 | 3.5495 |
30 Mar 2013 12:30:35 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 933,120 | 3,308,977 | 3.5461 |
28 Mar 2013 20:51:49 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 907,200 | 3,206,707 | 3.5347 |
27 Mar 2013 14:30:27 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 881,280 | 3,108,405 | 3.5271 |
26 Mar 2013 15:20:25 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 855,360 | 3,010,603 | 3.5197 |
25 Mar 2013 13:14:44 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 829,440 | 2,914,829 | 3.5142 |
23 Mar 2013 23:08:37 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 803,520 | 2,817,119 | 3.5060 |
22 Mar 2013 17:42:24 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 777,600 | 2,720,255 | 3.4983 |
21 Mar 2013 13:25:22 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 751,680 | 2,624,014 | 3.4909 |
20 Mar 2013 13:35:07 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 725,760 | 2,525,176 | 3.4794 |
19 Mar 2013 05:03:09 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 699,840 | 2,428,181 | 3.4696 |
18 Mar 2013 02:37:05 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 673,920 | 2,335,827 | 3.4660 |
16 Mar 2013 23:16:37 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 648,000 | 2,241,354 | 3.4589 |
15 Mar 2013 19:46:29 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 622,080 | 2,142,805 | 3.4446 |
14 Mar 2013 15:25:35 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 596,160 | 2,045,751 | 3.4315 |
13 Mar 2013 10:55:28 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 570,240 | 1,949,057 | 3.4180 |
12 Mar 2013 05:10:31 | 1077037 | 15603781 | hadcm3n_4bce_1940_40_008307646_1 | 544,320 | 1,848,045 | 3.3951 |
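
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimals. The following is a minimal Python sketch under that assumption; the helper names `seconds_per_timestep` and `duration_to_seconds` are illustrative only and are not part of any CPDN or BOINC tooling. It recomputes the average for three rows copied from the table and converts the reported CPU time (42 days 6 hours 24 min 17 sec) to seconds for comparison with the last trickle.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Cumulative CPU seconds divided by timesteps, rounded to 4 decimals."""
    return round(cpu_seconds / timesteps, 4)


def duration_to_seconds(days: int, hours: int, minutes: int, seconds: int) -> int:
    """Convert a days/hours/minutes/seconds duration to whole seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# (timestep, CPU time in seconds, reported average) copied from the table above
rows = [
    (1_036_800, 3_651_854, 3.5222),
    (1_010_880, 3_571_912, 3.5335),
    (544_320, 1_848_045, 3.3951),
]

for timesteps, cpu_seconds, reported in rows:
    computed = seconds_per_timestep(cpu_seconds, timesteps)
    print(f"{timesteps:>9,} steps: computed {computed:.4f}, reported {reported:.4f}")

# CPU time field from the result summary: 42 days 6 hours 24 min 17 sec
total_cpu = duration_to_seconds(42, 6, 24, 17)
print(f"Reported CPU time in seconds: {total_cpu:,}")
```

Under this assumption the recomputed averages agree with the reported column, and the summary CPU time converts to 3,651,857 seconds, within a few seconds of the 3,651,854 recorded in the final trickle.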
©2024 climateprediction.net