Task 15800525

Name	hadcm3n_3l6a_1940_40_008266050_3
Workunit	8421174
Created	29 May 2013, 8:52:19 UTC
Sent	29 May 2013, 8:52:25 UTC
Report deadline	28 Aug 2013, 16:19:36 UTC
Received	5 Jul 2013, 8:37:14 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	1127622
Run time	7 days 23 hours 21 min 33 sec
CPU time	7 days 21 hours 20 min 50 sec
Validate state	Invalid
Credit	5,598.72
Device peak FLOPS	2.28 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6068, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5616, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5188, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5188, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
13:27:23 (7212): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
08:20:27 (5284): No heartbeat from core client for 30 sec - exiting
08:20:28 (5284): No heartbeat from core client for 30 sec - exiting
08:20:29 (5284): No heartbeat from core client for 30 sec - exiting
08:20:30 (5284): No heartbeat from core client for 30 sec - exiting
08:20:31 (5284): No heartbeat from core client for 30 sec - exiting
08:20:32 (5284): No heartbeat from core client for 30 sec - exiting
08:20:33 (5284): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
11:24:09 (7940): Can't acquire lockfile (32) - waiting 35s
11:24:31 (8348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:55:28 (10772): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9604, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
10:00:15 (5224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CCPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7876, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7876, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7876, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7876, iMonCtr=1
Model crash detected, will try to restart...
17:16:27 (7876): No heartbeat from core client for 30 sec - exiting
17:16:27 (1632): Can't acquire lockfile (32) - waiting 35s
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
15:02:52 (6668): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6820, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6820, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:38:58 (6332): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:11:34 (9452): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8092, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Called boinc_finish
</stderr_txt>
]]>
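
The stderr text above interleaves several kinds of messages (quit and suspend requests, missed heartbeats, model crash restarts, lockfile waits). As a rough aid for reading it, the sketch below tallies each kind by substring match. The names `PATTERNS` and `tally_messages` are hypothetical helpers introduced here, not part of BOINC or CPDN, and the sample string is only a short excerpt from the start of the log, not the full output.

```python
# Substrings identifying each message type, copied from the stderr text.
# The dictionary keys are labels made up for this sketch.
PATTERNS = {
    "quit_request": "Monitor - Quit request from BOINC",
    "suspend_request": "Monitor - Suspend request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "model_crash": "Model crash detected",
    "lockfile_wait": "Can't acquire lockfile",
}


def tally_messages(stderr_text):
    """Count non-overlapping occurrences of each known message substring."""
    return {name: stderr_text.count(needle) for name, needle in PATTERNS.items()}


# Short excerpt from the beginning of the stderr output (not the whole log).
sample = (
    "CPDN Monitor - Quit request from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=6068, iMonCtr=1 "
    "Model crash detected, will try to restart... "
)

for name, count in tally_messages(sample).items():
    print(f"{name}: {count}")
```

Because it counts substrings rather than lines, the same function works on the log whether or not its line breaks survived copying.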

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
04 Jul 2013 14:26:50	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	466,560	671,661	1.4396
02 Jul 2013 12:06:31	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	440,640	633,984	1.4388
26 Jun 2013 11:10:43	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	414,720	596,757	1.4389
24 Jun 2013 14:57:42	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	388,800	557,992	1.4352
17 Jun 2013 10:06:33	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	362,880	515,752	1.4213
14 Jun 2013 10:48:27	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	336,960	474,887	1.4093
12 Jun 2013 09:06:12	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	311,040	436,868	1.4045
11 Jun 2013 09:48:43	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	285,120	400,550	1.4048
10 Jun 2013 13:49:49	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	259,200	364,254	1.4053
07 Jun 2013 20:00:41	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	233,280	328,738	1.4092
07 Jun 2013 09:55:59	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	207,360	292,814	1.4121
06 Jun 2013 08:16:07	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	181,440	256,389	1.4131
05 Jun 2013 10:00:56	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	155,520	221,019	1.4212
04 Jun 2013 12:41:35	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	129,600	185,587	1.4320
03 Jun 2013 17:45:51	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	103,680	149,295	1.4400
03 Jun 2013 07:27:27	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	77,760	112,738	1.4498
30 May 2013 15:03:06	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	51,840	76,474	1.4752
29 May 2013 19:58:28	1127622	15800525	hadcm3n_3l6a_1940_40_008266050_3	25,920	39,418	1.5208
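
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. The sketch below checks that reading against three rows copied from the table; `avg_sec_per_timestep` is a hypothetical helper name, and the formula is inferred from the figures rather than taken from project documentation.

```python
# (timestep, cpu_time_sec, reported_average) copied from three table rows.
rows = [
    (466_560, 671_661, 1.4396),  # 04 Jul 2013 trickle
    (440_640, 633_984, 1.4388),  # 02 Jul 2013 trickle
    (25_920, 39_418, 1.5208),    # 29 May 2013 trickle (first one)
]


def avg_sec_per_timestep(timestep, cpu_time_sec):
    """Cumulative CPU seconds per model timestep, rounded like the table."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in rows:
    computed = avg_sec_per_timestep(timestep, cpu_time_sec)
    print(f"{timestep:>7} TS  computed={computed:.4f}  reported={reported:.4f}")
```

Since the column is a running average over the whole run, its drift from 1.5208 down to about 1.40 and back up to 1.4396 reflects changes in per-timestep speed over the run, not the speed of any single 25,920-timestep interval.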