Task 13027738

Name	hadcm3n_t721_1940_40_007316937_1
Workunit	7514367
Created	29 Jun 2011, 5:35:15 UTC
Sent	29 Jun 2011, 5:59:24 UTC
Report deadline	28 Sep 2011, 13:26:35 UTC
Received	1 Aug 2011, 9:13:41 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	853181
Run time	26 days 9 hours 40 min 7 sec
CPU time	22 days 6 hours 58 min 19 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	1.65 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr	<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3308, iMonCtr=1
Model crash detected, will try to restart...
03:33:15 (4976): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4108, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3108, iMonCtr=1
Model crash detected, will try to restart...
18:23:38 (5488): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:29:33 (4592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3972, iMonCtr=1
Model crash detected, will try to restart...
06:59:50 (4768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:30:44 (4576): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:39:55 (5804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:39:57 (5804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7076, iMonCtr=1
Model crash detected, will try to restart...
12:59:54 (4176): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:59:55 (4176): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
No Process Handle
Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6016, selfPID=6016, iMonCtr=1
zip error: Could not create output file (was replacing the original zip file)
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_t721_1940_40_007316937/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
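The stderr output above ends with "Sorry, too many model crashes!", so it can be useful to tally how often each failure message occurs. The following is a minimal sketch, not part of BOINC or the CPDN client: the function name `tally_crash_markers` is hypothetical, and the marker strings are simply copied from the log above. It counts substrings, so it works whether or not the log has line breaks.

```python
# Hypothetical helper: counts selected failure messages in a stderr dump.
# The marker strings are taken verbatim from the log above; nothing here
# reflects how the CPDN client itself decides that there were "too many" crashes.
MARKERS = (
    "Model crash detected, will try to restart",
    "Model crashed:",
    "No heartbeat from core client",
)


def tally_crash_markers(stderr_text: str) -> dict:
    """Return a mapping of marker string -> number of occurrences."""
    return {marker: stderr_text.count(marker) for marker in MARKERS}


if __name__ == "__main__":
    # Short excerpt from the log above, used only as sample input.
    sample = (
        "Model crash detected, will try to restart... "
        "03:33:15 (4976): No heartbeat from core client for 30 sec - exiting "
        "Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 "
        "Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 "
        "Sorry, too many model crashes! :-("
    )
    for marker, count in tally_crash_markers(sample).items():
        print(f"{count:3d}  {marker}")
```

To tally the full log, paste the text between `<stderr_txt>` and `</stderr_txt>` into a string and pass it to the same function.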

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
31 Jul 2011 09:10:13	853181	13027738	hadcm3n_t721_1940_40_007316937_1	518,400	1,925,833	3.7150
30 Jul 2011 02:13:27	853181	13027738	hadcm3n_t721_1940_40_007316937_1	492,480	1,831,900	3.7197
28 Jul 2011 18:02:01	853181	13027738	hadcm3n_t721_1940_40_007316937_1	466,560	1,736,942	3.7229
27 Jul 2011 09:16:43	853181	13027738	hadcm3n_t721_1940_40_007316937_1	440,640	1,641,226	3.7246
26 Jul 2011 00:39:17	853181	13027738	hadcm3n_t721_1940_40_007316937_1	414,720	1,544,993	3.7254
25 Jul 2011 22:08:54	853181	13027738	hadcm3n_t721_1940_40_007316937_1	388,800	1,448,895	3.7266
25 Jul 2011 20:29:20	853181	13027738	hadcm3n_t721_1940_40_007316937_1	362,880	1,352,573	3.7273
25 Jul 2011 19:29:04	853181	13027738	hadcm3n_t721_1940_40_007316937_1	336,960	1,256,299	3.7283
25 Jul 2011 18:55:23	853181	13027738	hadcm3n_t721_1940_40_007316937_1	311,040	1,160,236	3.7302
25 Jul 2011 17:45:15	853181	13027738	hadcm3n_t721_1940_40_007316937_1	285,120	1,063,461	3.7299
25 Jul 2011 15:43:06	853181	13027738	hadcm3n_t721_1940_40_007316937_1	259,200	962,289	3.7125
10 Jul 2011 14:59:12	853181	13027738	hadcm3n_t721_1940_40_007316937_1	233,280	864,298	3.7050
09 Jul 2011 08:35:18	853181	13027738	hadcm3n_t721_1940_40_007316937_1	207,360	767,809	3.7028
08 Jul 2011 02:11:23	853181	13027738	hadcm3n_t721_1940_40_007316937_1	181,440	671,640	3.7017
07 Jul 2011 15:42:20	853181	13027738	hadcm3n_t721_1940_40_007316937_1	155,520	575,968	3.7035
05 Jul 2011 13:20:54	853181	13027738	hadcm3n_t721_1940_40_007316937_1	129,600	479,618	3.7008
04 Jul 2011 06:52:06	853181	13027738	hadcm3n_t721_1940_40_007316937_1	103,680	382,984	3.6939
03 Jul 2011 00:51:10	853181	13027738	hadcm3n_t721_1940_40_007316937_1	77,760	287,405	3.6961
01 Jul 2011 19:11:33	853181	13027738	hadcm3n_t721_1940_40_007316937_1	51,840	191,852	3.7008
30 Jun 2011 12:27:13	853181	13027738	hadcm3n_t721_1940_40_007316937_1	25,920	95,565	3.6869
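The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimal places. The sketch below checks that reading against three rows copied from the table above; the function name `seconds_per_timestep` is hypothetical, and the rounding rule is an assumption inferred from the displayed values, not a documented project formula.

```python
# Hypothetical helper: reproduces the "Average (sec/TS)" column under the
# assumption that it equals CPU time (sec) / Timestep, rounded to 4 decimals.
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds spent per model timestep."""
    return round(cpu_seconds / timesteps, 4)


if __name__ == "__main__":
    # (Timestep, CPU Time in sec, Average as shown) copied from the table above.
    rows = [
        (518_400, 1_925_833, 3.7150),
        (259_200, 962_289, 3.7125),
        (25_920, 95_565, 3.6869),
    ]
    for timesteps, cpu_seconds, shown in rows:
        computed = seconds_per_timestep(cpu_seconds, timesteps)
        print(f"{timesteps:>8,d} timesteps: computed {computed:.4f}, table shows {shown:.4f}")
```

The timestep column also shows that each trickle in this table is sent after a further 25,920 timesteps.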