Name | hadcm3n_p4ei_1900_40_007223290_1 |
Workunit | 7421530 |
Created | 26 Apr 2011, 15:28:59 UTC |
Sent | 29 Apr 2011, 17:09:57 UTC |
Report deadline | 30 Jul 2011, 0:37:08 UTC |
Received | 5 Jun 2011, 18:35:46 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 871946 |
Run time | |
CPU time | 26 days 19 hours 33 min 7 sec |
Validate state | Invalid |
Credit | 6,220.80 |
Device peak FLOPS | 0.81 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-pc-linux-gnu |
Stderr | <core_client_version>6.2.14</core_client_version> <![CDATA[ <message> process exited with code 22 (0x16, -234) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3205, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=22817, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 15:11:17 (27029): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 63 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 64 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 65 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 66 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 67 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 68 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 69 - Return code = 1 15:16:58 (28531): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFIN: Read Failed: Inappropriate ioctl for device BUFFIN: C I/O Error feof - Unit 63 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 64 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 65 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 66 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 67 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 68 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 69 - Return code = 1 15:18:30 (28535): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFIN: Read Failed: Inappropriate ioctl for device BUFFIN: C I/O Error feof - Unit 63 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 64 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 65 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 66 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 67 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 68 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 69 - Return code = 1 CPDN Monitor - Quit request from BOINC... 18:46:57 (2868): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 63 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 64 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 65 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 66 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 67 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 68 - Return code = 1 BUFFIN: Read Failed: No such file or directory BUFFIN: C I/O Error feof - Unit 69 - Return code = 1 18:48:58 (4435): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFIN: Read Failed: Inappropriate ioctl for device BUFFIN: C I/O Error feof - Unit 63 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 64 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 65 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 66 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 67 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 68 - Return code = 1 BUFFIN: Read Failed: Invalid argument BUFFIN: C I/O Error feof - Unit 69 - Return code = 1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
19:36:07 (7663): No heartbeat from core client for 30 sec - exiting 19:36:08 (7663): No heartbeat from core client for 30 sec - exiting 19:36:09 (7663): No heartbeat from core client for 30 sec - exiting 19:36:10 (7663): No heartbeat from core client for 30 sec - exiting 19:36:11 (7663): No heartbeat from core client for 30 sec - exiting 19:36:13 (7663): No heartbeat from core client for 30 sec - exiting 19:36:14 (7663): No heartbeat from core client for 30 sec - exiting 19:36:15 (7663): No heartbeat from core client for 30 sec - exiting 19:36:16 (7663): No heartbeat from core client for 30 sec - exiting 19:36:17 (7663): No heartbeat from core client for 30 sec - exiting 19:36:18 (7663): No heartbeat from core client for 30 sec - exiting 19:36:19 (7663): No heartbeat from core client for 30 sec - exiting 19:36:20 (7663): No heartbeat from core client for 30 sec - exiting 19:36:21 (7663): No heartbeat from core client for 30 sec - exiting 19:36:22 (7663): No heartbeat from core client for 30 sec - exiting 19:36:23 (7663): No heartbeat from core client for 30 sec - exiting 19:36:24 (7663): No heartbeat from core client for 30 sec - exiting 19:36:25 (7663): No heartbeat from core client for 30 sec - exiting 19:36:26 (7663): No heartbeat from core client for 30 sec - exiting 19:36:27 (7663): No heartbeat from core client for 30 sec - exiting 19:36:28 (7663): No heartbeat from core client for 30 sec - exiting 19:36:29 (7663): No heartbeat from core client for 30 sec - exiting 19:36:30 (7663): No heartbeat from core client for 30 sec - exiting 19:36:31 (7663): No heartbeat from core client for 30 sec - exiting 19:36:32 (7663): No heartbeat from core client for 30 sec - exiting 19:36:33 (7663): No heartbeat from core client for 30 sec - exiting 19:36:34 (7663): No heartbeat from core client for 30 sec - exiting 19:36:35 (7663): No heartbeat from core client for 30 sec - exiting 19:36:36 (7663): No heartbeat from core client for 30 sec - exiting 19:36:37 (7663): No 
heartbeat from core client for 30 sec - exiting 19:36:38 (7663): No heartbeat from core client for 30 sec - exiting 19:36:39 (7663): No heartbeat from core client for 30 sec - exiting 19:36:40 (7663): No heartbeat from core client for 30 sec - exiting 19:36:41 (7663): No heartbeat from core client for 30 sec - exiting 19:36:42 (7663): No heartbeat from core client for 30 sec - exiting 19:36:43 (7663): No heartbeat from core client for 30 sec - exiting 19:36:44 (7663): No heartbeat from core client for 30 sec - exiting 19:36:45 (7663): No heartbeat from core client for 30 sec - exiting 19:36:46 (7663): No heartbeat from core client for 30 sec - exiting 19:36:47 (7663): No heartbeat from core client for 30 sec - exiting 19:36:48 (7663): No heartbeat from core client for 30 sec - exiting 19:36:49 (7663): No heartbeat from core client for 30 sec - exiting 19:36:50 (7663): No heartbeat from core client for 30 sec - exiting 19:36:51 (7663): No heartbeat from core client for 30 sec - exiting 19:36:52 (7663): No heartbeat from core client for 30 sec - exiting 19:36:53 (7663): No heartbeat from core client for 30 sec - exiting 19:36:54 (7663): No heartbeat from core client for 30 sec - exiting 19:36:55 (7663): No heartbeat from core client for 30 sec - exiting 19:36:56 (7663): No heartbeat from core client for 30 sec - exiting 19:36:57 (7663): No heartbeat from core client for 30 sec - exiting 19:37:30 (7663): No heartbeat from core client for 30 sec - exiting 19:37:31 (7663): No heartbeat from core client for 30 sec - exiting 19:37:32 (7663): No heartbeat from core client for 30 sec - exiting 19:37:33 (7663): No heartbeat from core client for 30 sec - exiting 19:37:34 (7663): No heartbeat from core client for 30 sec - exiting 19:37:35 (7663): No heartbeat from core client for 30 sec - exiting 19:37:36 (7663): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 cpdnmonitor: cannot open input file 
/var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/atmos_restart.day after 11 attempts cpdnmonitor: cannot open input file /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_p4ei_1900_40_007223290/dataout/ocean_restart.day after 11 attempts Model crashed: READ_FLH: I/O error tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
05 Jun 2011 18:38:15 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 518,400 | 2,316,835 | 4.4692 |
03 Jun 2011 20:03:36 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 492,480 | 2,172,699 | 4.4118 |
01 Jun 2011 23:58:11 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 466,560 | 2,033,812 | 4.3592 |
31 May 2011 00:36:04 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 440,640 | 1,895,884 | 4.3026 |
29 May 2011 03:51:24 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 414,720 | 1,756,796 | 4.2361 |
28 May 2011 17:18:28 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 388,800 | 1,616,364 | 4.1573 |
25 May 2011 14:21:17 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 362,880 | 1,531,662 | 4.2208 |
24 May 2011 19:41:25 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 336,960 | 1,433,930 | 4.2555 |
22 May 2011 13:56:45 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 311,040 | 1,286,576 | 4.1364 |
20 May 2011 17:40:06 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 285,120 | 1,148,715 | 4.0289 |
18 May 2011 21:43:52 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 259,200 | 1,156,099 | 4.4603 |
17 May 2011 03:16:59 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 233,280 | 1,011,472 | 4.3359 |
15 May 2011 08:22:01 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 207,360 | 872,437 | 4.2074 |
13 May 2011 04:54:09 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 181,440 | 736,401 | 4.0586 |
10 May 2011 23:53:04 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 155,520 | 601,069 | 3.8649 |
08 May 2011 23:09:02 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 129,600 | 465,055 | 3.5884 |
07 May 2011 00:51:38 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 103,680 | 327,368 | 3.1575 |
05 May 2011 04:37:57 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 77,760 | 195,697 | 2.5167 |
03 May 2011 08:54:20 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 51,840 | 56,786 | 1.0954 |
01 May 2011 12:02:41 | 871946 | 12826505 | hadcm3n_p4ei_1900_40_007223290_1 | 25,920 | 145,800 | 5.6250 |
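
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. That relation is inferred from the listed values, not from any project documentation. A minimal sketch that checks it against three rows copied from the table (the function name `sec_per_timestep` is hypothetical):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, listed_average).
ROWS = [
    (518_400, 2_316_835, 4.4692),
    (51_840, 56_786, 1.0954),
    (25_920, 145_800, 5.6250),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Assumed definition of the Average column: CPU seconds per model timestep."""
    return cpu_time_sec / timestep


if __name__ == "__main__":
    for timestep, cpu_time_sec, listed in ROWS:
        computed = sec_per_timestep(cpu_time_sec, timestep)
        # The table shows four decimal places, so compare at that precision.
        matches = abs(computed - listed) < 5e-5
        print(f"{timestep:>7} TS  computed={computed:.4f}  listed={listed:.4f}  match={matches}")
```

All three sampled rows agree with the listed averages to four decimal places under this assumed definition.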
©2024 climateprediction.net