Task 18120337

Name hadcm3n_xblf_1940_40_009151321_2
Workunit 9281657
Created 16 Mar 2015, 2:17:30 UTC
Sent 16 Mar 2015, 2:17:40 UTC
Report deadline 15 Jun 2015, 9:44:51 UTC
Received 12 May 2015, 17:39:37 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1327756
Run time 14 days 23 hours 56 min 21 sec
CPU time 14 days 2 hours 14 min 16 sec
Validate state Invalid
Credit 8,398.08
Device peak FLOPS 2.80 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>7.4.42</core_client_version>
<![CDATA[
<message>
The device does not recognize the command.
 (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2584, iMonCtr=1
Model crash detected, will try to restart...
16:11:20 (3648): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5556, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3384, iMonCtr=1
Model crash detected, will try to restart...
18:10:12 (5224): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:10:13 (5224): No heartbeat from core client for 30 sec - exiting
11:36:51 (2708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1028, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5908, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5924, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
15:06:38 (4756): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5808, iMonCtr=1
Model crash detected, will try to restart...
19:03:28 (6120): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:31:58 (5072): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
23:59:51 (6968): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
16:25:35 (4856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3696, iMonCtr=1
Model crash detected, will try to restart...
10:24:33 (5584): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
00:35:17 (1236): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:18:28 (7808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
21:31:37 (6624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3412, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3892, iMonCtr=1
Model crash detected, will try to restart...
11:35:02 (5484): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
11:47:51 (5056): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
17:53:48 (3732): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:19:26 (4024): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5364, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5364, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
04:13:30 (3440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
10:25:17 (5180): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6008, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
08:58:50 (7276): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:58:51 (7276): No heartbeat from core client for 30 sec - exiting
08:58:52 (7276): No heartbeat from core client for 30 sec - exiting
08:58:53 (7276): No heartbeat from core client for 30 sec - exiting
08:58:54 (7276): No heartbeat from core client for 30 sec - exiting
08:58:55 (7276): No heartbeat from core client for 30 sec - exiting
08:58:56 (7276): No heartbeat from core client for 30 sec - exiting
08:58:57 (7276): No heartbeat from core client for 30 sec - exiting
08:58:58 (7276): No heartbeat from core client for 30 sec - exiting
08:58:59 (7276): No heartbeat from core client for 30 sec - exiting
08:59:00 (7276): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6012, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6012, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6012, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6012, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/ocean_restart.day after 11 attempts
CPDN Monitor - Quit request from BOINC...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4480, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_xblf_1940_40_009151321/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4480, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
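The stderr output above repeats a handful of message types many times. A minimal sketch of how such a log could be tallied by message type, so the pattern (heartbeat losses, quit requests, model crashes, missing restart files) is easier to see. The category labels and the `tally` helper are illustrative names, not part of BOINC or the CPDN application, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Hypothetical category labels mapped to substrings seen in the stderr text.
PATTERNS = {
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "quit_request": re.compile(r"Quit request from BOINC"),
    "model_crash": re.compile(r"Model crash detected"),
    "restart_file_missing": re.compile(r"cannot open input file"),
    "signal_22": re.compile(r"Signal 22 received"),
    "too_many_crashes": re.compile(r"too many model crashes"),
}


def tally(lines):
    """Count how many lines fall into each category (first match wins)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts


# A few lines copied from the stderr output above, used as sample input.
sample = [
    "16:11:20 (3648): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - Quit request from BOINC...",
    "Model crash detected, will try to restart...",
    "Signal 22 received, exiting...",
    "Sorry, too many model crashes! :-(",
    "Model crash detected, will try to restart...",
]

counts = tally(sample)
for label, n in sorted(counts.items()):
    print(f"{label}: {n}")
```

To run it on the full log, pass the lines between `<stderr_txt>` and `</stderr_txt>` to `tally` instead of `sample`.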
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
08 May 2015 19:28:03  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  699,840   1,213,006       1.7333
08 May 2015 19:24:28  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  673,920   1,167,895       1.7330
08 May 2015 19:10:29  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  648,000   1,123,024       1.7331
27 Apr 2015 06:22:55  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  622,080   1,077,721       1.7324
26 Apr 2015 00:06:37  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  596,160   1,032,648       1.7322
24 Apr 2015 21:10:03  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  570,240   987,405         1.7316
20 Apr 2015 04:57:03  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  544,320   942,092         1.7308
18 Apr 2015 06:43:46  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  518,400   896,904         1.7301
16 Apr 2015 20:56:56  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  492,480   851,993         1.7300
15 Apr 2015 05:19:18  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  466,560   806,731         1.7291
13 Apr 2015 20:48:56  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  440,640   761,665         1.7285
10 Apr 2015 04:42:27  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  414,720   717,364         1.7298
07 Apr 2015 02:49:48  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  388,800   673,540         1.7324
05 Apr 2015 03:19:07  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  362,880   629,552         1.7349
04 Apr 2015 03:19:36  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  336,960   585,380         1.7372
02 Apr 2015 01:52:11  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  311,040   539,862         1.7357
31 Mar 2015 22:44:16  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  285,120   494,210         1.7333
29 Mar 2015 22:22:34  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  259,200   448,095         1.7288
28 Mar 2015 20:43:22  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  233,280   403,982         1.7317
27 Mar 2015 03:47:33  1327756  18120337   hadcm3n_xblf_1940_40_009151321_2  207,360   359,882         1.7355
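
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. A short sketch that checks this reading against three rows of the trickle table; the function name `average_sec_per_timestep` is illustrative, and the relationship is inferred from the numbers rather than taken from any project documentation.

```python
def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by the number of completed timesteps."""
    return cpu_time_sec / timestep


# (timestep, CPU time in seconds, reported average) copied from the table above.
rows = [
    (699_840, 1_213_006, 1.7333),
    (673_920, 1_167_895, 1.7330),
    (207_360, 359_882, 1.7355),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"timestep={timestep}: computed={computed:.4f}, reported={reported:.4f}")
```

For these rows the computed value rounds to the reported one at four decimal places, which is consistent with the column being a running average over the whole task.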