Task 13771331

Name hadcm3n_yc2j_1940_40_007615449_0
Workunit 7793579
Created 12 Dec 2011, 23:56:11 UTC
Sent 13 Dec 2011, 5:21:20 UTC
Report deadline 13 Mar 2012, 12:48:31 UTC
Received 15 Jan 2012, 22:25:44 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1184704
Run time 10 days 10 hours 8 min 41 sec
CPU time 9 days 23 hours 53 min 20 sec
Validate state Invalid
Credit 3,421.44
Device peak FLOPS 2.29 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
windows_intelx86
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
14:40:52 (5192): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5160, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5160, iMonCtr=1
Model crash detected, will try to restart...
11:17:35 (1152): No heartbeat from core client for 30 sec - exiting
11:17:37 (1152): No heartbeat from core client for 30 sec - exiting
11:17:38 (1152): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:17:39 (1152): No heartbeat from core client for 30 sec - exiting
11:17:40 (1152): No heartbeat from core client for 30 sec - exiting
11:17:42 (1152): No heartbeat from core client for 30 sec - exiting
11:29:37 (5792): No heartbeat from core client for 30 sec - exiting
11:29:38 (5792): No heartbeat from core client for 30 sec - exiting
11:29:39 (5792): No heartbeat from core client for 30 sec - exiting
11:29:40 (5792): No heartbeat from core client for 30 sec - exiting
11:29:41 (5792): No heartbeat from core client for 30 sec - exiting
11:29:43 (5792): No heartbeat from core client for 30 sec - exiting
11:29:44 (5792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:30:17 (5792): No heartbeat from core client for 30 sec - exiting
11:31:25 (8864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:31:27 (8864): No heartbeat from core client for 30 sec - exiting
11:31:28 (8864): No heartbeat from core client for 30 sec - exiting
11:31:29 (8864): No heartbeat from core client for 30 sec - exiting
11:31:30 (8864): No heartbeat from core client for 30 sec - exiting
11:34:00 (8368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
11:36:41 (6732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
11:54:51 (5508): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6812, iMonCtr=1
Model crash detected, will try to restart...
12:21:17 (5580): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:21:19 (5580): No heartbeat from core client for 30 sec - exiting
12:21:20 (5580): No heartbeat from core client for 30 sec - exiting
12:21:21 (5580): No heartbeat from core client for 30 sec - exiting
12:21:22 (5580): No heartbeat from core client for 30 sec - exiting
12:21:23 (5580): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5232, iMonCtr=1
Model crash detected, will try to restart...
12:39:11 (5924): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:41:01 (4796): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8040, iMonCtr=1
Model crash detected, will try to restart...
13:06:58 (5804): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6452, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6452, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6452, iMonCtr=1
Model crash detected, will try to restart...
14:02:03 (5580): No heartbeat from core client for 30 sec - exiting
14:02:04 (5580): No heartbeat from core client for 30 sec - exiting
14:02:05 (5580): No heartbeat from core client for 30 sec - exiting
14:02:06 (5580): No heartbeat from core client for 30 sec - exiting
14:02:07 (5580): No heartbeat from core client for 30 sec - exiting
14:02:08 (5580): No heartbeat from core client for 30 sec - exiting
14:02:10 (5580): No heartbeat from core client for 30 sec - exiting
14:02:11 (5580): No heartbeat from core client for 30 sec - exiting
14:02:12 (5580): No heartbeat from core client for 30 sec - exiting
14:02:13 (5580): No heartbeat from core client for 30 sec - exiting
14:02:14 (5580): No heartbeat from core client for 30 sec - exiting
14:02:15 (5580): No heartbeat from core client for 30 sec - exiting
14:02:16 (5580): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:02:17 (5580): No heartbeat from core client for 30 sec - exiting
14:04:20 (7572): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:05:58 (5912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:18:58 (5352): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7396, iMonCtr=1
Model crash detected, will try to restart...
15:14:01 (5136): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=816, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1860, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1860, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1860, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1860, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1860, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1860, iMonCtr=1
Model crash detected, will try to restart...
10:36:05 (1860): No heartbeat from core client for 30 sec - exiting
10:36:07 (1860): No heartbeat from core client for 30 sec - exiting
10:36:09 (1860): No heartbeat from core client for 30 sec - exiting
10:36:10 (1860): No heartbeat from core client for 30 sec - exiting
10:36:11 (1860): No heartbeat from core client for 30 sec - exiting
10:36:12 (1860): No heartbeat from core client for 30 sec - exiting
10:36:14 (1860): No heartbeat from core client for 30 sec - exiting
10:36:15 (1860): No heartbeat from core client for 30 sec - exiting
10:36:17 (1860): No heartbeat from core client for 30 sec - exiting
Sorry, too many model crashes! :-(
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/ocean_restart.day after 11 attempts
16:53:05 (5484): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file D:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yc2j_1940_40_007615449/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
17:00:24 (10664): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:02:36 (10724): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 30 - Return code = 16

Model crashed: READHEAD: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 30 - Return code = 16

Model crashed: READHEAD: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 30 - Return code = 16

Model crashed: READHEAD: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
17:03:53 (6184): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 30 - Return code = 16

Model crashed: READHEAD: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
BUFFIN: C I/O Error feof - Unit 30 - Return code = 16

Model crashed: READHEAD: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
17:04:33 (8428): No heartbeat from core client for 30 sec - exiting
17:04:34 (8428): No heartbeat from core client for 30 sec - exiting
17:04:35 (8428): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 30 - Return code = 16

Model crashed: READHEAD: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
17:06:53 (6508): No heartbeat from core client for 30 sec - exiting
Called boinc_finish
17:06:55 (6508): No heartbeat from core client for 30 sec - exiting
17:06:56 (6508): No heartbeat from core client for 30 sec - exiting

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
05 Jan 2012 01:07:35  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  285,120   800,337         2.8070
01 Jan 2012 20:41:39  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  259,200   721,439         2.7833
31 Dec 2011 18:43:29  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  233,280   632,512         2.7114
30 Dec 2011 22:08:19  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  207,360   558,454         2.6932
30 Dec 2011 22:08:19  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  181,440   514,082         2.8333
28 Dec 2011 12:05:32  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  155,520   454,493         2.9224
27 Dec 2011 09:42:49  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  129,600   363,538         2.8051
26 Dec 2011 05:24:31  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  103,680   264,356         2.5497
25 Dec 2011 04:07:49  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  77,760    174,642         2.2459
23 Dec 2011 02:59:43  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  51,840    100,820         1.9448
22 Dec 2011 07:38:09  1184704  13771331   hadcm3n_yc2j_1940_40_007615449_0  25,920    45,810          1.7674
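
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep reached; every row above is consistent with that reading. The following is a minimal Python sketch for checking it, assuming each trickle row ends with the timestep, CPU time, and average fields in that order. The function names `parse_trickle` and `sec_per_timestep` are hypothetical helpers for illustration and are not part of BOINC or the CPDN software.

```python
def parse_trickle(row: str) -> tuple[int, int, float]:
    """Return (timestep, cpu_seconds, reported_average) from one trickle row.

    Assumes the last three whitespace-separated fields are the timestep,
    the cumulative CPU time in seconds, and the average in sec/TS.
    """
    *_, timestep, cpu_sec, average = row.split()
    # Strip the thousands separators used on the results page.
    return (
        int(timestep.replace(",", "")),
        int(cpu_sec.replace(",", "")),
        float(average),
    )


def sec_per_timestep(timestep: int, cpu_sec: int) -> float:
    """Cumulative CPU seconds per model timestep."""
    return cpu_sec / timestep


if __name__ == "__main__":
    # Latest trickle row, copied from the table above.
    row = ("05 Jan 2012 01:07:35 1184704 13771331 "
           "hadcm3n_yc2j_1940_40_007615449_0 285,120 800,337 2.8070")
    timestep, cpu_sec, reported = parse_trickle(row)
    computed = sec_per_timestep(timestep, cpu_sec)
    print(f"computed {computed:.4f} sec/TS, reported {reported:.4f} sec/TS")
```

Because the parser splits on any run of whitespace, it should work on both space-separated and column-aligned copies of the table.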

