Task 12989555

Name	hadcm3n_s3mq_1940_40_007299534_0
Workunit	7496958
Created	20 Jun 2011, 19:08:58 UTC
Sent	20 Jun 2011, 19:09:09 UTC
Report deadline	20 Sep 2011, 2:36:20 UTC
Received	9 Jul 2011, 0:18:23 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	25 (0x00000019) Unknown error code
Computer ID	981127
Run time	7 days 1 hours 28 min 51 sec
CPU time	6 days 11 hours 28 min 24 sec
Validate state	Invalid
Credit	2,488.32
Device peak FLOPS	2.61 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.6.38</core_client_version>
<![CDATA[
<message>
The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6024, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4156, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6008, iMonCtr=1
Model crash detected, will try to restart...
10:04:20 (5112): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
10:04:21 (5112): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5320, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5520, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4836, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5316, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5056, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4960, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4652, iMonCtr=1
Model crash detected, will try to restart...
C20:06:29 (4956): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4808, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5072, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4824, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5212, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Called boinc_finish
</stderr_txt>
]]>
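The stderr output above repeats a small set of messages, so it can be tallied mechanically. The following is a minimal sketch of such a tally, not part of BOINC or the CPDN application: the function name `summarize_stderr` and the `sample` string are hypothetical, and the match strings are simply copied from the log text above.

```python
import re

def summarize_stderr(text):
    """Count the recurring messages in a CPDN stderr dump (illustrative only)."""
    return {
        # One "Model crash detected" line is logged per restart attempt.
        "model_restarts": text.count("Model crash detected, will try to restart"),
        # Lines logged when the app lost contact with the BOINC core client.
        "heartbeat_losses": len(re.findall(r"No heartbeat from core client", text)),
        # PIDs reported by the controller, in order of appearance.
        "controller_pids": re.findall(r"selfPID=(\d+)", text),
    }

# A short excerpt in the same shape as the log above (not the full log).
sample = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=4908, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "10:04:20 (5112): No heartbeat from core client for 30 sec - exiting "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=5320, iMonCtr=1 "
    "Model crash detected, will try to restart... "
)

summary = summarize_stderr(sample)
print(summary)
```

Because the function only does substring and regex matching, it works the same whether the log has its line breaks or has been collapsed onto one line.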

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
08 Jul 2011 14:24:08	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	207,360	555,167	2.6773
07 Jul 2011 19:23:00	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	181,440	484,517	2.6704
04 Jul 2011 21:47:50	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	155,520	413,797	2.6607
01 Jul 2011 21:33:22	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	129,600	342,653	2.6439
29 Jun 2011 15:37:34	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	103,680	272,451	2.6278
27 Jun 2011 23:47:44	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	77,760	203,654	2.6190
23 Jun 2011 20:43:36	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	51,840	134,744	2.5992
22 Jun 2011 13:26:55	981127	12989555	hadcm3n_s3mq_1940_40_007299534_0	25,920	67,000	2.5849
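
The "Average (sec/TS)" column is consistent with cumulative CPU time divided by the timestep count. The sketch below checks that arithmetic against the rows of the table; the variable and function names are illustrative, and the assumption that the server computes the column this way is inferred from the numbers, not confirmed by the page.

```python
# (timestep, cumulative CPU seconds) pairs copied from the trickle table,
# oldest first.
trickles = [
    (25_920, 67_000),
    (51_840, 134_744),
    (77_760, 203_654),
    (103_680, 272_451),
    (129_600, 342_653),
    (155_520, 413_797),
    (181_440, 484_517),
    (207_360, 555_167),
]

def seconds_per_timestep(timestep, cpu_seconds):
    """Average CPU seconds spent per model timestep so far."""
    return cpu_seconds / timestep

# Round to four decimals to match the precision shown in the table.
averages = [round(seconds_per_timestep(ts, cpu), 4) for ts, cpu in trickles]

for (ts, cpu), avg in zip(trickles, averages):
    print(f"{ts:>7,}  {cpu:>7,}  {avg:.4f}")
```

Read this way, the average rises from 2.5849 to 2.6773 sec/TS across the eight trickles, so the model was running slightly slower per timestep toward the end of the run.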