Task 12750081

Name hadcm3n_o70n_1900_40_007204426_1
Workunit 7402706
Created 28 Mar 2011, 14:18:57 UTC
Sent 28 Mar 2011, 20:55:32 UTC
Report deadline 28 Jun 2011, 4:22:43 UTC
Received 4 Jul 2011, 8:57:32 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1123850
Run time 17 days 10 hours 14 min 15 sec
CPU time 12 days 16 hours 18 min 44 sec
Validate state Invalid
Credit 6,220.80
Device peak FLOPS 2.13 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.10.58</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2384, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4220, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5168, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5168, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5488, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4440, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3320, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4984, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4456, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4012, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5780, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5272, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4252, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4252, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4252, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4248, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3620, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5468, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4484, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5472, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3416, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5184, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5252, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5252, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5252, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4912, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4912, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4760, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3780, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3780, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4592, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5896, iMonCtr=1
Model crash detected, will try to restart...
20:39:37 (476): No heartbeat from core client for 30 sec - exiting
20:39:38 (476): No heartbeat from core client for 30 sec - exiting
20:39:39 (476): No heartbeat from core client for 30 sec - exiting
20:39:40 (476): No heartbeat from core client for 30 sec - exiting
20:39:41 (476): No heartbeat from core client for 30 sec - exiting
20:39:43 (476): No heartbeat from core client for 30 sec - exiting
20:39:44 (476): No heartbeat from core client for 30 sec - exiting
20:39:45 (476): No heartbeat from core client for 30 sec - exiting
20:39:46 (476): No heartbeat from core client for 30 sec - exiting
20:39:47 (476): No heartbeat from core client for 30 sec - exiting
20:39:48 (476): No heartbeat from core client for 30 sec - exiting
20:39:49 (476): No heartbeat from core client for 30 sec - exiting
20:39:51 (476): No heartbeat from core client for 30 sec - exiting
20:39:52 (476): No heartbeat from core client for 30 sec - exiting
20:39:53 (476): No heartbeat from core client for 30 sec - exiting
20:39:54 (476): No heartbeat from core client for 30 sec - exiting
20:39:55 (476): No heartbeat from core client for 30 sec - exiting
20:39:56 (476): No heartbeat from core client for 30 sec - exiting
20:39:57 (476): No heartbeat from core client for 30 sec - exiting
20:39:58 (476): No heartbeat from core client for 30 sec - exiting
20:39:59 (476): No heartbeat from core client for 30 sec - exiting
20:40:00 (476): No heartbeat from core client for 30 sec - exiting
20:40:02 (476): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:55:58 (5748): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1192, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1192, iMonCtr=1
Model crash detected, will try to restart...
07:09:57 (5000): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2784, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3592, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4272, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5108, iMonCtr=1
Model crash detected, will try to restart...
19:12:13 (1152): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
21:23:36 (5764): No heartbeat from core client for 30 sec - exiting
21:23:37 (5764): No heartbeat from core client for 30 sec - exiting
21:23:38 (5764): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=800, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6720, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
14:42:08 (4760): No heartbeat from core client for 30 sec - exiting
14:42:10 (4760): No heartbeat from core client for 30 sec - exiting
14:42:11 (4760): No heartbeat from core client for 30 sec - exiting
14:42:12 (4760): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4072, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4792, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4928, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4928, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
07:42:56 (6096): No heartbeat from core client for 30 sec - exiting
07:42:57 (6096): No heartbeat from core client for 30 sec - exiting
07:42:58 (6096): No heartbeat from core client for 30 sec - exiting
07:42:59 (6096): No heartbeat from core client for 30 sec - exiting
07:43:00 (6096): No heartbeat from core client for 30 sec - exiting
07:43:01 (6096): No heartbeat from core client for 30 sec - exiting
07:43:02 (6096): No heartbeat from core client for 30 sec - exiting
07:43:03 (6096): No heartbeat from core client for 30 sec - exiting
07:43:04 (6096): No heartbeat from core client for 30 sec - exiting
07:43:05 (6096): No heartbeat from core client for 30 sec - exiting
07:43:07 (6096): No heartbeat from core client for 30 sec - exiting
07:43:08 (6096): No heartbeat from core client for 30 sec - exiting
07:43:09 (6096): No heartbeat from core client for 30 sec - exiting
07:43:10 (6096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:43:11 (6096): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - Quit request from BOINC...
08:48:12 (5832): No heartbeat from core client for 30 sec - exiting
08:48:13 (5832): No heartbeat from core client for 30 sec - exiting
08:48:14 (5832): No heartbeat from core client for 30 sec - exiting
08:48:15 (5832): No heartbeat from core client for 30 sec - exiting
08:48:16 (5832): No heartbeat from core client for 30 sec - exiting
08:48:17 (5832): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
23:54:27 (3808): No heartbeat from core client for 30 sec - exiting
23:54:28 (3808): No heartbeat from core client for 30 sec - exiting
23:54:29 (3808): No heartbeat from core client for 30 sec - exiting
23:54:30 (3808): No heartbeat from core client for 30 sec - exiting
23:54:31 (3808): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:22:14 (3516): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
12:52:33 (6708): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
15:33:19 (6596): No heartbeat from core client for 30 sec - exiting
15:33:20 (6596): No heartbeat from core client for 30 sec - exiting
15:33:21 (6596): No heartbeat from core client for 30 sec - exiting
15:33:22 (6596): No heartbeat from core client for 30 sec - exiting
15:33:23 (6596): No heartbeat from core client for 30 sec - exiting
15:33:24 (6596): No heartbeat from core client for 30 sec - exiting
15:33:25 (6596): No heartbeat from core client for 30 sec - exiting
15:33:26 (6596): No heartbeat from core client for 30 sec - exiting
15:33:27 (6596): No heartbeat from core client for 30 sec - exiting
15:33:28 (6596): No heartbeat from core client for 30 sec - exiting
15:33:29 (6596): No heartbeat from core client for 30 sec - exiting
15:33:30 (6596): No heartbeat from core client for 30 sec - exiting
15:33:31 (6596): No heartbeat from core client for 30 sec - exiting
15:33:32 (6596): No heartbeat from core client for 30 sec - exiting
15:33:33 (6596): No heartbeat from core client for 30 sec - exiting
15:33:34 (6596): No heartbeat from core client for 30 sec - exiting
15:33:35 (6596): No heartbeat from core client for 30 sec - exiting
15:33:36 (6596): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/ocean_restart.day after 11 attempts

Model crashed: READ_FLH: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/ocean_restart.day after 11 attempts

Model crashed: READ_FLH: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/ocean_restart.day after 11 attempts

Model crashed: READ_FLH: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/ocean_restart.day after 11 attempts

Model crashed: READ_FLH: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/ocean_restart.day after 11 attempts

Model crashed: READ_FLH: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/atmos_restart.day after 11 attempts
cpdnmonitor: cannot open input file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_o70n_1900_40_007204426/dataout/ocean_restart.day after 11 attempts

Model crashed: READ_FLH: I/O error                                                                                                                                                                                                                                             tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
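The stderr text above is long and repetitive, so a tally per message type can make it easier to read. The following is a minimal Python sketch, not part of the CPDN or BOINC tooling. The substrings are copied from the log lines above, while the category names and the `tally` helper are hypothetical and chosen only for this illustration.

```python
from collections import Counter

# Substrings copied from the stderr text above.
# The category names on the left are made up for this sketch.
PATTERNS = {
    "crash_restart": "Model crash detected, will try to restart",
    "no_heartbeat": "No heartbeat from core client",
    "quit_request": "Quit request from BOINC",
    "suspend_request": "Suspend request from BOINC",
    "restart_file_unreadable": "cannot open input file",
    "read_flh_error": "READ_FLH: I/O error",
}


def tally(lines):
    """Count how many lines match each known message type (first match wins)."""
    counts = Counter()
    for line in lines:
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
                break
    return counts


# A few lines taken verbatim from the log, used as sample input.
sample = [
    "CPDN Monitor - Quit request from BOINC...",
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2384, iMonCtr=1",
    "Model crash detected, will try to restart...",
    "20:39:37 (476): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - No 'heartbeat' from BOINC...",
    "Model crashed: READ_FLH: I/O error",
]

counts = tally(sample)
for name, n in sorted(counts.items()):
    print(f"{name}: {n}")
```

Lines that match none of the substrings (for example the `Controller::` lines) are simply skipped. To apply the sketch to the full log, pass the saved stderr text split into lines.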
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
02 Jul 2011 21:33:47  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  518,400   1,095,585       2.1134
28 Jun 2011 15:38:18  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  492,480   1,040,831       2.1134
26 Jun 2011 19:59:26  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  466,560   986,176         2.1137
23 Jun 2011 07:46:49  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  440,640   930,455         2.1116
21 Jun 2011 14:23:27  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  414,720   871,332         2.1010
20 Jun 2011 07:34:39  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  388,800   816,444         2.0999
13 Jun 2011 20:41:58  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  362,880   761,756         2.0992
10 Jun 2011 22:44:28  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  336,960   709,140         2.1045
09 Jun 2011 09:45:59  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  311,040   654,894         2.1055
05 Jun 2011 16:56:43  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  285,120   599,948         2.1042
03 Jun 2011 18:11:24  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  259,200   544,500         2.1007
27 May 2011 19:54:54  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  233,280   489,920         2.1001
25 May 2011 17:00:58  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  207,360   436,242         2.1038
17 May 2011 18:18:04  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  181,440   382,853         2.1101
08 May 2011 14:33:46  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  155,520   328,375         2.1115
03 May 2011 11:41:27  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  129,600   273,181         2.1079
25 Apr 2011 17:29:31  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  103,680   218,299         2.1055
20 Apr 2011 19:46:41  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  77,760    163,504         2.1027
20 Apr 2011 19:46:41  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  51,840    109,228         2.1070
31 Mar 2011 20:10:16  1123850  12750081   hadcm3n_o70n_1900_40_007204426_1  25,920    54,652          2.1085
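
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. That relationship is an inference from the figures, not something the page states. The short Python sketch below checks it against two rows; the function name `seconds_per_timestep` is hypothetical, and rounding to four decimal places is assumed from the way the table displays the values.

```python
def seconds_per_timestep(cpu_seconds: int, timestep: int) -> float:
    """Average CPU seconds per model timestep, rounded to 4 places.

    Assumption: the table's Average column is CPU time / timestep.
    """
    return round(cpu_seconds / timestep, 4)


# (timestep, cpu_seconds) pairs copied from the first and last table rows.
rows = [
    (518_400, 1_095_585),  # trickle of 02 Jul 2011
    (25_920, 54_652),      # trickle of 31 Mar 2011
]

for timestep, cpu_seconds in rows:
    print(timestep, seconds_per_timestep(cpu_seconds, timestep))
```

Both results agree with the table (2.1134 and 2.1085), which is consistent with the assumed formula.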

