Task 13126161

Name hadcm3n_yltr_1900_40_007361145_0
Workunit 7558575
Created 6 Jul 2011, 15:17:19 UTC
Sent 7 Jul 2011, 16:06:39 UTC
Report deadline 6 Oct 2011, 23:33:50 UTC
Received 15 Nov 2011, 19:36:05 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 193 (0x000000C1) EXIT_SIGNAL
Computer ID 1020632
Run time 21 days 10 hours 16 min 26 sec
CPU time 18 days 23 hours 5 min 52 sec
Validate state Invalid
Credit 6,220.80
Device peak FLOPS 1.24 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86
Stderr
<core_client_version>6.12.34</core_client_version>
<![CDATA[
<message>
 - exit code 193 (0xc1)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4260, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
16:44:35 (4420): No heartbeat from core client for 30 sec - exiting
16:44:36 (4420): No heartbeat from core client for 30 sec - exiting
16:44:37 (4420): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5676, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CBUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/yltrko.pja3c10
Error converting file to netcdf: dataout/yltrko.pia3c10
Error converting file to netcdf: dataout/yltrko.pfa3c10
Error converting file to netcdf: dataout/yltrka.pha3c10
Error converting file to netcdf: dataout/yltrka.pga3c10
Error converting file to netcdf: dataout/yltrka.pea3c10
Error converting file to netcdf: dataout/yltrka.pda3c10
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
20:13:28 (5152): No heartbeat from core client for 30 sec - exiting
20:13:29 (5152): No heartbeat from core client for 30 sec - exiting
20:13:30 (5152): No heartbeat from core client for 30 sec - exiting
20:13:31 (5152): No heartbeat from core client for 30 sec - exiting
20:13:32 (5152): No heartbeat from core client for 30 sec - exiting
20:13:33 (5152): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5464, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5284, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=1
Model crash detected, will try to restart...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5516, iMonCtr=1
Model crash detected, will try to restart...
18:34:53 (4864): No heartbeat from core client for 30 sec - exiting
18:34:54 (4864): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4372, iMonCtr=1
Model crash detected, will try to restart...
CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5280, iMonCtr=1
Model crash detected, will try to restart...
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5848, iMonCtr=1
Model crash detected, will try to restart...
C10:19:57 (5876): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x777C8801 read attempt to address 0x409675DA

Engaging BOINC Windows Runtime Debugger...



Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77328801 read attempt to address 0xFFFFFFF8

Engaging BOINC Windows Runtime Debugger...

Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yltr_1900_40_007361145/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77328801 read attempt to address 0xFFFFFFF8

Engaging BOINC Windows Runtime Debugger...

Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yltr_1900_40_007361145/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x77328801 read attempt to address 0xFFFFFFF8

Engaging BOINC Windows Runtime Debugger...

Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_yltr_1900_40_007361145/dataout/shmem_restart.day
Signal 11 received, exiting...
Called boinc_finish

</stderr_txt>
]]>
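The stderr log above repeats a small set of message types (suspend and quit requests, missed heartbeats, model restarts, I/O errors, access violations). A tally per type makes the failure pattern easier to read than the raw dump. The following is a minimal sketch, assuming the text between the stderr_txt tags has been saved to a string. The category names and regular expressions are illustrative choices and are not part of any BOINC or CPDN tool.

```python
import re
from collections import Counter

# Hypothetical category names mapped to substrings seen in the log above.
PATTERNS = {
    "suspend": re.compile(r"Suspend request from BOINC"),
    "quit": re.compile(r"Quit request from BOINC"),
    "no_heartbeat": re.compile(r"No heartbeat from core client"),
    "model_restart": re.compile(r"Model crash detected"),
    "io_error": re.compile(r"C I/O Error"),
    "netcdf_error": re.compile(r"Error converting file to netcdf"),
    "access_violation": re.compile(r"Access Violation \(0xc0000005\)"),
    "signal_11": re.compile(r"Signal 11 received"),
}


def tally_events(stderr_text: str) -> Counter:
    """Count how many log lines match each category."""
    counts = Counter()
    for line in stderr_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# A few lines copied from the log above, used only as a small sample.
sample = """\
Suspended CPDN Monitor - Suspend request from BOINC...
CPDN Monitor - Quit request from BOINC...
Model crash detected, will try to restart...
16:44:35 (4420): No heartbeat from core client for 30 sec - exiting
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
Error converting file to netcdf: dataout/yltrko.pja3c10
Reason: Access Violation (0xc0000005) at address 0x77328801 read attempt to address 0xFFFFFFF8
Signal 11 received, exiting...
"""

for name, count in sorted(tally_events(sample).items()):
    print(f"{name}: {count}")
```

Running the same function over the full log would show how often the model was restarted before the final access violations. The counts are only as reliable as the patterns chosen here.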
Latest Trickles Received
Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS)
15 Nov 2011 19:41:05 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 518,400 | 1,618,366 | 3.1218
15 Nov 2011 19:41:04 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 492,480 | 1,537,672 | 3.1223
08 Nov 2011 07:07:45 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 466,560 | 1,456,751 | 3.1223
04 Nov 2011 12:14:27 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 440,640 | 1,376,172 | 3.1231
02 Nov 2011 16:01:52 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 414,720 | 1,295,838 | 3.1246
31 Oct 2011 15:11:48 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 388,800 | 1,214,250 | 3.1231
10 Oct 2011 22:40:23 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 362,880 | 1,133,569 | 3.1238
04 Oct 2011 12:15:00 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 336,960 | 1,053,063 | 3.1252
01 Oct 2011 11:30:48 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 311,040 | 972,335 | 3.1261
20 Sep 2011 18:30:20 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 285,120 | 892,213 | 3.1293
12 Sep 2011 17:33:13 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 259,200 | 811,508 | 3.1308
10 Sep 2011 16:13:02 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 233,280 | 731,822 | 3.1371
07 Sep 2011 02:01:55 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 207,360 | 651,327 | 3.1410
04 Sep 2011 22:01:44 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 181,440 | 570,377 | 3.1436
01 Sep 2011 16:01:48 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 155,520 | 490,005 | 3.1508
26 Aug 2011 18:05:14 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 129,600 | 409,481 | 3.1596
18 Aug 2011 14:10:33 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 103,680 | 326,928 | 3.1532
13 Aug 2011 02:24:01 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 77,760 | 245,982 | 3.1633
30 Jul 2011 13:56:19 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 51,840 | 163,292 | 3.1499
29 Jul 2011 14:12:21 | 1020632 | 13126161 | hadcm3n_yltr_1900_40_007361145_0 | 25,920 | 82,404 | 3.1792
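
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count reached at that trickle. This is an inference from the numbers, not something the page states. A minimal sketch that recomputes the value for the first and last trickles, where the function name is a hypothetical helper:

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by model timesteps completed."""
    return cpu_time_sec / timestep


# (timestep, cumulative CPU time in seconds) copied from the table above:
# the earliest trickle (29 Jul 2011) and the latest (15 Nov 2011).
rows = [
    (25_920, 82_404),
    (518_400, 1_618_366),
]

for timestep, cpu_time in rows:
    avg = seconds_per_timestep(cpu_time, timestep)
    print(f"{timestep:>7,} timesteps: {avg:.4f} sec/TS")
```

Both recomputed values round to the figures in the table (3.1792 and 3.1218), which supports the reading above. Successive trickles are 25,920 timesteps apart.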

