Task 15696005

Name hadcm3n_n256_1920_40_008334587_3
Workunit 8485448
Created 30 Mar 2013, 19:26:21 UTC
Sent 30 Mar 2013, 19:26:27 UTC
Report deadline 30 Jun 2013, 2:53:38 UTC
Received 4 Jun 2013, 19:11:29 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1096206
Run time 8 days 18 hours 57 min 2 sec
CPU time 8 days 15 hours 29 min 7 sec
Validate state Invalid
Credit 6,531.84
Device peak FLOPS 2.66 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
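The run time and CPU time above are easier to compare as plain seconds. The following is a minimal sketch, not part of the project's tooling; the helper name `to_seconds` and the unit table are assumptions made for illustration.

```python
import re

# Seconds per unit for the duration strings shown in the task summary.
UNIT_SECONDS = {"day": 86_400, "hour": 3_600, "min": 60, "sec": 1}

def to_seconds(text: str) -> int:
    """Convert a string like '8 days 18 hours 57 min 2 sec' to seconds."""
    total = 0
    # Each match is (number, unit stem); plural endings are simply skipped.
    for amount, unit in re.findall(r"(\d+)\s*(day|hour|min|sec)", text):
        total += int(amount) * UNIT_SECONDS[unit]
    return total

# Values copied from the "Run time" and "CPU time" fields above.
run_time = to_seconds("8 days 18 hours 57 min 2 sec")
cpu_time = to_seconds("8 days 15 hours 29 min 7 sec")

print("run time (s):", run_time)
print("CPU time (s):", cpu_time)
print("CPU/run ratio:", round(cpu_time / run_time, 3))
```

With the values on this page, the run time works out to 759,422 seconds and the CPU time to 746,947 seconds.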
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6472, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6472, iMonCtr=1
Model crash detected, will try to restart...
23:33:25 (2592): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:33:26 (2592): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9504, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9504, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=9504, iMonCtr=1
Model crash detected, will try to restart...
Atmos Hold Restart file rename failed on atmos_restart.hold
Atmos Hold Restart file rename failed on atmos_restart.hold
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=12188, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8652, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3144, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6516, iMonCtr=1
Model crash detected, will try to restart...
22:30:41 (9896): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:20:21 (7624): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:20:24 (7624): No heartbeat from core client for 30 sec - exiting
00:20:25 (7624): No heartbeat from core client for 30 sec - exiting
00:20:26 (7624): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=13916, iMonCtr=1
Model crash detected, will try to restart...
00:31:36 (13916): No heartbeat from core client for 30 sec - exiting
00:31:37 (13916): No heartbeat from core client for 30 sec - exiting
00:31:38 (13916): No heartbeat from core client for 30 sec - exiting
00:31:39 (13916): No heartbeat from core client for 30 sec - exiting
00:31:40 (13916): No heartbeat from core client for 30 sec - exiting
00:31:41 (13916): No heartbeat from core client for 30 sec - exiting
00:31:42 (13916): No heartbeat from core client for 30 sec - exiting
00:31:43 (13916): No heartbeat from core client for 30 sec - exiting
00:31:44 (13916): No heartbeat from core client for 30 sec - exiting
00:31:45 (13916): No heartbeat from core client for 30 sec - exiting
00:31:46 (13916): No heartbeat from core client for 30 sec - exiting
00:31:47 (13916): No heartbeat from core client for 30 sec - exiting
00:31:48 (13916): No heartbeat from core client for 30 sec - exiting
00:31:49 (13916): No heartbeat from core client for 30 sec - exiting
00:31:50 (13916): No heartbeat from core client for 30 sec - exiting
00:31:51 (13916): No heartbeat from core client for 30 sec - exiting
00:31:52 (13916): No heartbeat from core client for 30 sec - exiting
00:31:53 (13916): No heartbeat from core client for 30 sec - exiting
00:31:54 (13916): No heartbeat from core client for 30 sec - exiting
00:31:55 (13916): No heartbeat from core client for 30 sec - exiting
00:31:56 (13916): No heartbeat from core client for 30 sec - exiting
00:31:57 (13916): No heartbeat from core client for 30 sec - exiting
00:31:58 (13916): No heartbeat from core client for 30 sec - exiting
00:31:59 (13916): No heartbeat from core client for 30 sec - exiting
00:32:00 (13916): No heartbeat from core client for 30 sec - exiting
00:32:01 (13916): No heartbeat from core client for 30 sec - exiting
00:32:02 (13916): No heartbeat from core client for 30 sec - exiting
00:32:03 (13916): No heartbeat from core client for 30 sec - exiting
00:32:04 (13916): No heartbeat from core client for 30 sec - exiting
00:32:05 (13916): No heartbeat from core client for 30 sec - exiting
00:32:06 (13916): No heartbeat from core client for 30 sec - exiting
00:32:07 (13916): No heartbeat from core client for 30 sec - exiting
00:32:08 (13916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:32:09 (13916): No heartbeat from core client for 30 sec - exiting
00:32:10 (13916): No heartbeat from core client for 30 sec - exiting
15:15:57 (13700): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:24:07 (1916): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:24:10 (1916): No heartbeat from core client for 30 sec - exiting
18:24:11 (1916): No heartbeat from core client for 30 sec - exiting
18:26:07 (5132): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:58:42 (5948): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:01:09 (6216): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:01:11 (6216): No heartbeat from core client for 30 sec - exiting
21:01:12 (6216): No heartbeat from core client for 30 sec - exiting
21:11:20 (11848): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:11:23 (11848): No heartbeat from core client for 30 sec - exiting
21:11:24 (11848): No heartbeat from core client for 30 sec - exiting
21:11:25 (11848): No heartbeat from core client for 30 sec - exiting
21:13:02 (12612): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:13:03 (12612): No heartbeat from core client for 30 sec - exiting
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8464, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8464, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8464, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8464, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8464, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8464, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2776, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2776, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2776, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2776, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2776, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/ocean_restart.day after 11 attempts
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2776, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
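The stderr text above repeats a small number of message types many times. As a rough sketch of how such output could be tallied, the snippet below counts lines per message type. The label names and the `summarize` helper are hypothetical and chosen only for this illustration; the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Hypothetical labels mapped to patterns for messages seen in the stderr text.
PATTERNS = {
    "model_crash": re.compile(r"^Model crash detected"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "suspend": re.compile(r"^Suspended CPDN Monitor"),
    "restart_file_unreadable": re.compile(r"cannot open input file .*_restart\.day"),
    "too_many_crashes": re.compile(r"^Sorry, too many model crashes"),
}

def summarize(lines):
    """Count how many lines match each labelled pattern (first match wins)."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts

# A few lines copied from the log above; a real run would read the full text.
sample = [
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=868, iMonCtr=1",
    "Model crash detected, will try to restart...",
    "23:33:25 (2592): No heartbeat from core client for 30 sec - exiting",
    "Suspended CPDN Monitor - Suspend request from BOINC...",
    r"cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_n256_1920_40_008334587/dataout/atmos_restart.day after 11 attempts",
    "Sorry, too many model crashes! :-(",
]

counts = summarize(sample)
for label, n in counts.items():
    print(label, n)
```

Lines that match none of the patterns, such as the `Controller::` line, are left uncounted.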
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
01 Jun 2013 22:39:50 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 544,320 | 745,785 | 1.3701
01 Jun 2013 11:45:29 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 518,400 | 707,299 | 1.3644
27 May 2013 15:56:43 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 492,480 | 669,543 | 1.3595
23 May 2013 22:13:55 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 466,560 | 630,709 | 1.3518
19 May 2013 13:53:53 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 440,640 | 592,562 | 1.3448
18 May 2013 16:32:00 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 414,720 | 557,217 | 1.3436
16 May 2013 15:14:43 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 388,800 | 522,401 | 1.3436
08 May 2013 19:46:06 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 362,880 | 488,555 | 1.3463
02 May 2013 07:55:33 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 336,960 | 455,408 | 1.3515
01 May 2013 23:02:36 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 311,040 | 423,515 | 1.3616
28 Apr 2013 15:55:58 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 285,120 | 389,396 | 1.3657
28 Apr 2013 04:49:50 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 259,200 | 355,580 | 1.3718
27 Apr 2013 19:34:54 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 233,280 | 322,572 | 1.3828
25 Apr 2013 20:27:44 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 207,360 | 287,554 | 1.3867
20 Apr 2013 20:47:47 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 181,440 | 253,556 | 1.3975
16 Apr 2013 19:40:46 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 155,520 | 219,596 | 1.4120
15 Apr 2013 22:48:14 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 129,600 | 187,690 | 1.4482
13 Apr 2013 18:05:09 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 103,680 | 154,359 | 1.4888
10 Apr 2013 21:30:08 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 77,760 | 119,278 | 1.5339
06 Apr 2013 14:55:33 | 1096206 | 15696005 | hadcm3n_n256_1920_40_008334587_3 | 51,840 | 82,063 | 1.5830
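The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against three rows copied from the table; the function name `sec_per_timestep` is an assumption for illustration, and the relationship is inferred from the numbers rather than stated by the project.

```python
# Rows copied from the trickle table: (timestep, CPU time in seconds, reported average).
rows = [
    (544_320, 745_785, 1.3701),
    (518_400, 707_299, 1.3644),
    (51_840, 82_063, 1.5830),
]

def sec_per_timestep(timestep: int, cpu_seconds: int) -> float:
    """Cumulative CPU seconds per model timestep, to four decimal places."""
    return round(cpu_seconds / timestep, 4)

for timestep, cpu_seconds, reported in rows:
    computed = sec_per_timestep(timestep, cpu_seconds)
    # Allow for rounding in the displayed value.
    agrees = abs(computed - reported) < 5e-5
    print(timestep, computed, reported, "match" if agrees else "differs")
```

The same rows also show successive trickles spaced 25,920 timesteps apart (for example 544,320 minus 518,400).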

