Task 13016403

Name hadcm3n_t3z2_1940_40_007312484_0
Workunit 7509914
Created 28 Jun 2011, 3:36:00 UTC
Sent 28 Jun 2011, 3:36:21 UTC
Report deadline 27 Sep 2011, 11:03:32 UTC
Received 4 Jul 2011, 8:11:41 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1150994
Run time 2 days 11 hours 11 min 41 sec
CPU time 2 days 10 hours 29 min 53 sec
Validate state Invalid
Credit 1,866.24
Device peak FLOPS 3.34 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>6.12.26</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3872, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3872, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4716, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3856, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3764, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
09:39:20 (4500): No heartbeat from core client for 30 sec - exiting
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/t3z2ko.pje6c10
Error converting file to netcdf: dataout/t3z2ko.pie6c10
Error converting file to netcdf: dataout/t3z2ko.pfe6c10
Error converting file to netcdf: dataout/t3z2ka.phe6c10
Error converting file to netcdf: dataout/t3z2ka.pge6c10
Error converting file to netcdf: dataout/t3z2ka.pee6c10
Error converting file to netcdf: dataout/t3z2ka.pde6c10
CPDN Monitor - Quit request from BOINC...
14:59:27 (544): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5000, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
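The stderr output above repeats a handful of message types many times. A minimal sketch of how such a log could be tallied by message type follows. The function name `tally_stderr`, the category labels, and the regular expressions are assumptions made for illustration; they are not part of BOINC or the CPDN application.

```python
import re
from collections import Counter

# Hypothetical categories, keyed on message text seen in the stderr above.
PATTERNS = {
    "model_crash": re.compile(r"^Model crash detected"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "buffin_io_error": re.compile(r"^BUFFIN: C I/O Error"),
    "netcdf_error": re.compile(r"^Error converting file to netcdf"),
    "signal_22": re.compile(r"^Signal 22 received"),
}


def tally_stderr(text):
    """Count how many lines of `text` match each hypothetical category."""
    counts = Counter()
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                break  # a line is counted under at most one category
    return counts


# A short excerpt copied from the log above, used as sample input.
sample = """Model crash detected, will try to restart...
09:39:20 (4500): No heartbeat from core client for 30 sec - exiting
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
Error converting file to netcdf: dataout/t3z2ko.pje6c10
Signal 22 received, exiting...
Model crash detected, will try to restart...
"""

if __name__ == "__main__":
    for name, count in sorted(tally_stderr(sample).items()):
        print(f"{name}: {count}")
```

Pasting the full contents of the stderr_txt element in place of `sample` would give per-category totals for the whole task.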
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
02 Jul 2011 21:43:53  1150994  13016403   hadcm3n_t3z2_1940_40_007312484_0  155,520   174,299         1.1207
02 Jul 2011 00:35:30  1150994  13016403   hadcm3n_t3z2_1940_40_007312484_0  129,600   144,805         1.1173
30 Jun 2011 08:54:35  1150994  13016403   hadcm3n_t3z2_1940_40_007312484_0  103,680   116,940         1.1279
29 Jun 2011 04:27:17  1150994  13016403   hadcm3n_t3z2_1940_40_007312484_0  77,760    88,763          1.1415
28 Jun 2011 20:21:33  1150994  13016403   hadcm3n_t3z2_1940_40_007312484_0  51,840    59,692          1.1515
28 Jun 2011 12:00:24  1150994  13016403   hadcm3n_t3z2_1940_40_007312484_0  25,920    29,800          1.1497
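
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count, rounded to four decimal places. That relationship is inferred from the numbers in the table rather than taken from project documentation. A small sketch that recomputes the column, using a hypothetical helper named `avg_sec_per_timestep`:

```python
# (timestep, cumulative CPU seconds) pairs copied from the trickle table.
TRICKLES = [
    (155_520, 174_299),
    (129_600, 144_805),
    (103_680, 116_940),
    (77_760, 88_763),
    (51_840, 59_692),
    (25_920, 29_800),
]


def avg_sec_per_timestep(timestep, cpu_sec):
    """Return CPU seconds per timestep, rounded to four decimal places."""
    return round(cpu_sec / timestep, 4)


if __name__ == "__main__":
    for timestep, cpu_sec in TRICKLES:
        print(f"{timestep:>7,}  {cpu_sec:>7,}  {avg_sec_per_timestep(timestep, cpu_sec):.4f}")
```

The recomputed values can be compared against the Average (sec/TS) column as a consistency check on the reported trickles.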

