Task 13296789

Name hadcm3n_y92l_1940_40_007423694_1
Workunit 7621329
Created 25 Aug 2011, 15:27:24 UTC
Sent 25 Aug 2011, 15:35:59 UTC
Report deadline 24 Nov 2011, 23:03:10 UTC
Received 12 Oct 2011, 13:00:56 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1040873
Run time 8 days 20 hours 46 min 49 sec
CPU time 8 days 1 hours 30 min 57 sec
Validate state Invalid
Credit 6,220.80
Device peak FLOPS 2.93 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4304, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6060, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5980, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2352, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4500, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5988, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5376, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4120, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1676, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4788, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5880, iMonCtr=1
Model crash detected, will try to restart...
20:50:33 (6084): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:15:11 (5880): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5832, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5200, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6208, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5776, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4332, iMonCtr=1
Model crash detected, will try to restart...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/y92lko.pjf5c10
Error converting file to netcdf: dataout/y92lko.pif5c10
Error converting file to netcdf: dataout/y92lko.pff5c10
Error converting file to netcdf: dataout/y92lka.phf5c10
Error converting file to netcdf: dataout/y92lka.pgf5c10
Error converting file to netcdf: dataout/y92lka.pef5c10
Error converting file to netcdf: dataout/y92lka.pdf5c10
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5596, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5576, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5704, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6120, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5476, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77146E0F read attempt to address 0x409FF659

Engaging BOINC Windows Runtime Debugger...

Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_y92l_1940_40_007423694/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
12 Oct 2011 13:04:12 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 518,400 696,651 1.3438
12 Oct 2011 13:04:12 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 492,480 662,752 1.3457
12 Oct 2011 13:04:12 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 466,560 629,429 1.3491
12 Oct 2011 13:04:12 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 440,640 593,854 1.3477
01 Oct 2011 03:48:47 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 414,720 556,865 1.3427
26 Sep 2011 14:53:49 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 388,800 519,012 1.3349
18 Sep 2011 11:49:59 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 362,880 483,067 1.3312
16 Sep 2011 19:21:59 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 336,960 450,212 1.3361
14 Sep 2011 02:50:27 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 311,040 414,667 1.3332
11 Sep 2011 17:21:44 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 285,120 377,391 1.3236
07 Sep 2011 22:44:38 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 259,200 341,497 1.3175
06 Sep 2011 11:20:02 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 233,280 305,164 1.3081
06 Sep 2011 01:38:00 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 207,360 271,518 1.3094
05 Sep 2011 16:34:36 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 181,440 239,343 1.3191
03 Sep 2011 15:42:02 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 155,520 205,700 1.3227
03 Sep 2011 06:09:46 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 129,600 172,617 1.3319
02 Sep 2011 15:25:11 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 103,680 138,136 1.3323
31 Aug 2011 20:38:17 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 77,760 103,971 1.3371
30 Aug 2011 20:28:21 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 51,840 68,743 1.3261
29 Aug 2011 17:39:09 1040873 13296789 hadcm3n_y92l_1940_40_007423694_1 25,920 33,712 1.3006
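
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. This is inferred from the figures in the table, not from any project documentation. A minimal Python sketch that recomputes it for a few rows; the helper name `average_sec_per_timestep` is hypothetical and not part of BOINC or CPDN:

```python
# Hypothetical helper (not a BOINC/CPDN API): recompute "Average (sec/TS)"
# on the assumption that it is cumulative CPU seconds / timesteps completed.
def average_sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    return round(cpu_time_sec / timestep, 4)

# (timestep, cpu_time_sec) pairs copied from the trickle table above.
trickles = [
    (518_400, 696_651),  # last trickle received, 12 Oct 2011
    (414_720, 556_865),  # 01 Oct 2011
    (25_920, 33_712),    # first trickle, 29 Aug 2011
]

for timestep, cpu_time_sec in trickles:
    print(timestep, average_sec_per_timestep(cpu_time_sec, timestep))
```

The recomputed values can be compared against the last column of the corresponding rows. Each trickle in the table is 25,920 timesteps after the previous one.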

