Task 15471133

Name hadcm3n_o0fp_2060_40_008242216_3
Workunit 8397340
Created 4 Dec 2012, 21:54:18 UTC
Sent 4 Dec 2012, 21:54:23 UTC
Report deadline 6 Mar 2013, 5:21:34 UTC
Received 19 Dec 2012, 9:24:47 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1343109
Run time 12 days 1 hours 42 min 52 sec
CPU time 7 hours 11 min 28 sec
Validate state Invalid
Credit 4,976.64
Device peak FLOPS 2.00 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07
i686-pc-linux-gnu
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
process exited with code 22 (0x16, -234)
</message>
<stderr_txt>
03:04:59 (367): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...

Model crashed: SETPOS: Unit 42 to Word Address 9058304 Failed with Error Code -1
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F7606BD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F75E9BD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F761BBD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...

Model crashed: SETPOS: Unit 42 to Word Address 3811328 Failed with Error Code -1
Suspended CPDN Monitor - Suspend request from BOINC...
Atmos Hold Restart file rename failed on atmos_restart.hold
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F7598BD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
forrtl: severe (28): CLOSE error, unit 6, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  084213A0  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F8A6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F7598BD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F75EBBD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Atmos Restart file copy failed on o0fpka.daq69s0
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
cpdnmonitor: error closing file dataout/ocean_restart.day
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Atmos Restart file copy failed on o0fpka.dar08i0
forrtl: severe (28): CLOSE error, unit 6, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  084213A0  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F8A6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F7536BD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
forrtl: severe (28): CLOSE error, unit 6, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  084213A0  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F8A6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F762EBD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...

Model crashed: TEMPHIST: Failed in OPEN of history file                                                                                                                                                                                                                        tmp/pipe_dummy                                                                  2048    
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F75CBBD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
forrtl: severe (28): CLOSE error, unit 6, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  084213A0  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F8A6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F75CBBD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
cpdnmonitor: error closing file dataout/ocean_restart.day
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: error closing file dataout/ocean_restart.day
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F7605BD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=663, iMonCtr=1
Model crash detected, will try to restart...
cpdnmonitor: error closing file dataout/ocean_restart.day
forrtl: severe (28): CLOSE error, unit 8, file "Unknown"
Image              PC        Routine            Line        Source             
hadcm3n_um_6.07_i  0848EB7D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0848D975  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0845F3CF  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F90D  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841F257  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0841DD18  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  08121427  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  083907E5  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0838F8B7  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0839BDF8  Unknown               Unknown  Unknown
libc.so.6          F75E2BD6  Unknown               Unknown  Unknown
hadcm3n_um_6.07_i  0804CB11  Unknown               Unknown  Unknown
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2586, iMonCtr=1
Model crash detected, will try to restart...

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 41 - Return code = 1

Model crashed: READDUMP: BAD BUFFIN OF DATA                                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 41 - Return code = 1

Model crashed: READDUMP: BAD BUFFIN OF DATA                                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 41 - Return code = 1

Model crashed: READDUMP: BAD BUFFIN OF DATA                                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 41 - Return code = 1

Model crashed: READDUMP: BAD BUFFIN OF DATA                                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    

BUFFIN: Read Failed: Inappropriate ioctl for device
BUFFIN: C I/O Error feof - Unit 41 - Return code = 1

Model crashed: READDUMP: BAD BUFFIN OF DATA                                                                                                                                                                                                                                    tmp/pipe_dummy                                                                  2048    
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
18 Dec 2012 18:46:32 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 414,720 38,791 0.0935
18 Dec 2012 01:30:25 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 388,800 26,034 0.0670
16 Dec 2012 19:03:24 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 362,880 9,105 0.0251
15 Dec 2012 15:00:42 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 336,960 49,803 0.1478
14 Dec 2012 10:16:21 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 311,040 30,915 0.0994
13 Dec 2012 17:35:08 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 285,120 12,770 0.0448
13 Dec 2012 17:35:08 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 259,200 31,186 0.1203
13 Dec 2012 17:35:08 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 233,280 77,205 0.3310
13 Dec 2012 17:35:08 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 207,360 24,317 0.1173
13 Dec 2012 17:35:08 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 181,440 35,155 0.1938
13 Dec 2012 17:35:08 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 155,520 24,602 0.1582
13 Dec 2012 17:35:08 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 129,600 31,717 0.2447
07 Dec 2012 19:31:44 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 103,680 73,734 0.7112
07 Dec 2012 02:35:58 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 77,760 18,791 0.2417
06 Dec 2012 09:43:32 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 51,840 39,752 0.7668
05 Dec 2012 18:07:37 1180506 15471133 hadcm3n_o0fp_2060_40_008242216_3 25,920 10,862 0.4191
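
In the trickle rows above, the Average (sec/TS) column appears to be the reported CPU Time (sec) divided by the Timestep count, rounded to four decimal places. This is an inference from the listed numbers, not documented server behaviour. The sketch below checks it against three rows copied from the table; the function name `seconds_per_timestep` is hypothetical and exists only for illustration.

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return CPU seconds per model timestep (assumed meaning of the Average column)."""
    if timestep <= 0:
        raise ValueError("timestep must be positive")
    return cpu_time_sec / timestep


# (timestep, cpu_time_sec, listed_average) copied from the trickle table above.
rows = [
    (414_720, 38_791, 0.0935),
    (362_880, 9_105, 0.0251),
    (25_920, 10_862, 0.4191),
]

for timestep, cpu_time_sec, listed in rows:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Compare at the table's four-decimal precision.
    matches = abs(computed - listed) < 5e-5
    print(f"{timestep:>7} {computed:.4f} listed={listed:.4f} match={matches}")
```

All three rows agree with the listed averages to four decimal places under this assumption.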