Task 17334908

Name	hadcm3n_s0z1_1940_40_009094422_4
Workunit	9224758
Created	1 Nov 2014, 3:51:47 UTC
Sent	1 Nov 2014, 3:55:44 UTC
Report deadline	31 Jan 2015, 11:22:55 UTC
Received	12 Nov 2014, 20:20:16 UTC
Server state	Over
Outcome	Computation error
Client state	Compute error
Exit status	22 (0x00000016) Unknown error code
Computer ID	1289368
Run time	4 days 17 hours 0 min 41 sec
CPU time	3 days 20 hours 33 min 7 sec
Validate state	Invalid
Credit	6,220.80
Device peak FLOPS	4.24 GFLOPS
Application version	UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-apple-darwin
Stderr	<core_client_version>7.4.26</core_client_version>
<![CDATA[
<message>
process exited with code 22 (0x16, -234)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
07:38:24 (72336): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:38:25 (72336): No heartbeat from core client for 30 sec - exiting
07:40:06 (72401): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:40:07 (72401): No heartbeat from core client for 30 sec - exiting
07:40:08 (72401): No heartbeat from core client for 30 sec - exiting
07:41:30 (72494): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:43:15 (72568): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:42:47 (22630): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
16:29:27 (35255): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:29:28 (35255): No heartbeat from core client for 30 sec - exiting
16:37:34 (35422): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:37:35 (35422): No heartbeat from core client for 30 sec - exiting
16:37:36 (35422): No heartbeat from core client for 30 sec - exiting
16:37:37 (35422): No heartbeat from core client for 30 sec - exiting
16:37:38 (35422): No heartbeat from core client for 30 sec - exiting
16:37:39 (35422): No heartbeat from core client for 30 sec - exiting
16:37:40 (35422): No heartbeat from core client for 30 sec - exiting
16:37:41 (35422): No heartbeat from core client for 30 sec - exiting
16:37:43 (35422): No heartbeat from core client for 30 sec - exiting
16:37:44 (35422): No heartbeat from core client for 30 sec - exiting
16:37:45 (35422): No heartbeat from core client for 30 sec - exiting
16:37:46 (35422): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
19:03:50 (59533): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
19:24:12 (60972): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:24:13 (60972): No heartbeat from core client for 30 sec - exiting
19:24:14 (60972): No heartbeat from core client for 30 sec - exiting
19:24:15 (60972): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:00:28 (1708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:00:29 (1708): No heartbeat from core client for 30 sec - exiting
20:00:30 (1708): No heartbeat from core client for 30 sec - exiting
20:00:31 (1708): No heartbeat from core client for 30 sec - exiting
SIGSEGV: segmentation violation
20:08:12 (3595): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:26:53 (86311): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
14:42:05 (97088): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:24:34 (21475): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:54:32 (32201): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x818804: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x818800: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e04: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e00: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e04: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e00: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e04: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e00: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e04: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x2800e00: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x1010e04: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
hadcm3n_6.07_i686-apple-darwin(36093,0xa0e901d4) malloc: * error for object 0x1010e00: incorrect checksum for freed object - object was probably modified after being freed.
* set a breakpoint in malloc_error_break to debug
20:19:26 (36093): No heartbeat from core client for 30 sec - exiting
20:19:27 (36093): No heartbeat from core client for 30 sec - exiting
20:19:28 (36093): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file /Library/Application Support/BOINC Data/projects/climateprediction.net/hadcm3n_s0z1_1940_40_009094422/dataout/ocean_restart.day after 11 attempts
Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]>
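A stderr dump like this one is easier to read as a tally of event types. The following is a minimal sketch, not part of the BOINC or CPDN tooling: the category names, the `PATTERNS` table, and the `tally_events` helper are all hypothetical, and the sample string is a short excerpt pasted from the log above. Because it matches substrings rather than whole lines, it should work whether the log entries are on separate lines or run together.

```python
import re
from collections import Counter

# Hypothetical category names mapped to substrings that appear in the stderr above.
PATTERNS = {
    "heartbeat_lost": r"No heartbeat from core client",
    "quit_request": r"Quit request from BOINC",
    "model_crash": r"Model crashed:",
    "segfault": r"SIGSEGV",
}

def tally_events(stderr_text):
    """Count how often each known message type occurs in the stderr text."""
    return Counter(
        {name: len(re.findall(pattern, stderr_text)) for name, pattern in PATTERNS.items()}
    )

# Short excerpt copied from the log above (not the full stderr).
SAMPLE = (
    "CPDN Monitor - Quit request from BOINC... "
    "07:38:24 (72336): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "07:38:25 (72336): No heartbeat from core client for 30 sec - exiting "
    "SIGSEGV: segmentation violation "
    "Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048"
)

if __name__ == "__main__":
    for name, count in tally_events(SAMPLE).items():
        print(f"{name}: {count}")
```

Running the same function over the full stderr text would give the totals for this task; the patterns can be extended (for example with the `malloc:` messages) as needed.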

Latest Trickles Received
Time Sent (UTC)	Host ID	Result ID	Result Name	Timestep	CPU Time (sec)	Average (sec/TS)
12 Nov 2014 19:20:11	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	518,400	333,162	0.6427
12 Nov 2014 13:27:03	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	492,480	316,287	0.6422
12 Nov 2014 07:14:15	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	466,560	300,076	0.6432
11 Nov 2014 08:16:46	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	440,640	284,151	0.6449
09 Nov 2014 10:30:05	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	414,720	267,310	0.6446
06 Nov 2014 14:15:51	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	388,800	250,684	0.6448
06 Nov 2014 07:51:37	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	362,880	234,209	0.6454
06 Nov 2014 02:19:37	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	336,960	217,541	0.6456
05 Nov 2014 21:03:17	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	311,040	201,107	0.6466
05 Nov 2014 14:08:33	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	285,120	183,664	0.6442
05 Nov 2014 07:27:11	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	259,200	166,296	0.6416
05 Nov 2014 02:11:08	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	233,280	149,936	0.6427
04 Nov 2014 20:34:12	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	207,360	133,323	0.6430
04 Nov 2014 14:46:39	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	181,440	116,364	0.6413
04 Nov 2014 09:09:12	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	155,520	99,044	0.6369
03 Nov 2014 23:26:46	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	129,600	82,309	0.6351
03 Nov 2014 04:41:06	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	103,680	65,714	0.6338
02 Nov 2014 23:04:17	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	77,760	49,323	0.6343
02 Nov 2014 17:32:53	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	51,840	32,896	0.6346
02 Nov 2014 11:25:17	1289368	17334908	hadcm3n_s0z1_1940_40_009094422_4	25,920	16,513	0.6371
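
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimals. The sketch below checks that reading against three rows copied from the table above; the function names are hypothetical and the formula is inferred from the numbers, not taken from CPDN documentation.

```python
# (timestep, cumulative CPU seconds, reported average) copied from the trickle table.
TRICKLES = [
    (25_920, 16_513, 0.6371),
    (492_480, 316_287, 0.6422),
    (518_400, 333_162, 0.6427),
]

def seconds_per_timestep(timestep, cpu_seconds):
    """Cumulative CPU seconds per model timestep, rounded like the table."""
    return round(cpu_seconds / timestep, 4)

def matches_reported(rows, tolerance=5e-5):
    """Return True for each row whose recomputed average agrees with the table."""
    return [
        abs(seconds_per_timestep(ts, cpu) - reported) < tolerance
        for ts, cpu, reported in rows
    ]

if __name__ == "__main__":
    for (ts, cpu, reported), ok in zip(TRICKLES, matches_reported(TRICKLES)):
        print(ts, cpu, seconds_per_timestep(ts, cpu), reported, "ok" if ok else "mismatch")
```

The same table also shows that consecutive trickles are 25,920 timesteps apart (for example 518,400 - 492,480), so each row marks one fixed-size block of model progress.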