Task 16162008

Name	hadcm3n_odau_1900_40_008472361_1
Workunit	8623200
Created	27 Dec 2013, 17:40:52 UTC
Sent	27 Dec 2013, 17:40:55 UTC
Report deadline	29 Mar 2014, 1:08:06 UTC
Received	31 Jan 2014, 11:03:47 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1296679
Run time	10 days 21 hours 2 min 44 sec
CPU time	8 days 18 hours 29 min 58 sec
Validate state	Invalid
Credit	6,842.88
Device peak FLOPS	3.11 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>7.2.33</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5124, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3020, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2792, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4136, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1364, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2040, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/odauko.pja6c10 Error converting file to netcdf: dataout/odauko.pia6c10 Error converting file to netcdf: dataout/odauko.pfa6c10 Error converting file to netcdf: dataout/odauka.pha6c10 Error converting file to netcdf: dataout/odauka.pga6c10 Error converting file to netcdf: dataout/odauka.pea6c10 Error converting file to netcdf: dataout/odauka.pda6c10 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CCPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1428, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3900, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CCPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 01:01:44 (2868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:14:49 (5140): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:06:29 (5120): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... C10:06:58 (4124): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:18:17 (5160): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:27:53 (4352): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:46:51 (5128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5248, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4076, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]>
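
The stderr text above is a long run of repeated messages, so a per-message tally can make it easier to read. The following is a minimal sketch, not part of the BOINC or CPDN tooling: the pattern names, the `tally` helper, and the `SAMPLE` string (a short excerpt assembled from messages in the log above) are assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical labels mapped to message fragments that appear in the stderr text.
PATTERNS = {
    "model_crash": r"Model crash detected",
    "quit_request": r"Quit request from BOINC",
    "suspend_request": r"Suspend request from BOINC",
    "no_heartbeat": r"No heartbeat from core client",
    "netcdf_error": r"Error converting file to netcdf",
}


def tally(stderr_text):
    """Count how often each known message fragment occurs in the text."""
    return Counter(
        {name: len(re.findall(pattern, stderr_text))
         for name, pattern in PATTERNS.items()}
    )


# Short excerpt built from messages in the log, used only to show the call.
SAMPLE = (
    "Model crash detected, will try to restart... "
    "CPDN Monitor - Quit request from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
    "01:01:44 (2868): No heartbeat from core client for 30 sec - exiting "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

if __name__ == "__main__":
    for name, count in tally(SAMPLE).items():
        print(f"{name}: {count}")
```

To apply it to the full log, paste the text between `<stderr_txt>` and `</stderr_txt>` into a string and pass it to `tally`.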

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
31 Jan 2014 01:01:20	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	570,240	754,609	1.3233
30 Jan 2014 14:30:23	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	544,320	720,279	1.3233
30 Jan 2014 14:30:23	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	518,400	686,085	1.3235
27 Jan 2014 20:58:51	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	492,480	652,046	1.3240
26 Jan 2014 22:20:40	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	466,560	617,666	1.3239
25 Jan 2014 15:46:24	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	440,640	583,410	1.3240
24 Jan 2014 14:21:16	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	414,720	548,892	1.3235
22 Jan 2014 21:07:55	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	388,800	514,731	1.3239
21 Jan 2014 02:48:46	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	362,880	480,078	1.3230
19 Jan 2014 19:37:22	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	336,960	444,975	1.3206
18 Jan 2014 16:00:08	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	311,040	411,060	1.3216
14 Jan 2014 16:54:27	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	285,120	377,375	1.3236
12 Jan 2014 18:14:05	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	259,200	343,293	1.3244
10 Jan 2014 02:05:27	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	233,280	308,827	1.3238
08 Jan 2014 10:01:10	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	207,360	274,565	1.3241
06 Jan 2014 00:51:26	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	181,440	239,794	1.3216
04 Jan 2014 11:42:57	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	155,520	206,268	1.3263
02 Jan 2014 12:41:37	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	129,600	171,562	1.3238
31 Dec 2013 20:06:09	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	103,680	136,871	1.3201
30 Dec 2013 20:01:54	1296679	16162008	hadcm3n_odau_1900_40_008472361_1	77,760	102,681	1.3205
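
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimals. A minimal sketch that checks this against three rows copied from the table above; the function names and the `TRICKLES` list are illustrative assumptions, not anything the project provides.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_avg).
TRICKLES = [
    (570_240, 754_609, 1.3233),
    (544_320, 720_279, 1.3233),
    (77_760, 102_681, 1.3205),
]


def sec_per_timestep(timestep, cpu_time_sec):
    """Cumulative CPU seconds per model timestep, rounded like the table."""
    return round(cpu_time_sec / timestep, 4)


def mismatches(rows, tol=1e-6):
    """Return rows whose reported average differs from the recomputed one."""
    return [
        row for row in rows
        if abs(sec_per_timestep(row[0], row[1]) - row[2]) > tol
    ]


if __name__ == "__main__":
    for timestep, cpu, reported in TRICKLES:
        print(timestep, sec_per_timestep(timestep, cpu), reported)
    print("mismatches:", mismatches(TRICKLES))
```

Under that assumption the recomputed values agree with the reported column for the rows checked here.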