Name | hadcm3n_001x_1940_40_007958986_0 |
Workunit | 8114098 |
Created | 14 May 2012, 14:52:46 UTC |
Sent | 14 May 2012, 14:53:10 UTC |
Report deadline | 13 Aug 2012, 22:20:21 UTC |
Received | 13 Jul 2012, 10:48:22 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -1073741819 (0xC0000005) STATUS_ACCESS_VIOLATION |
Computer ID | 991581 |
Run time | 15 days 11 hours 47 min 39 sec |
CPU time | 10 days 6 hours 28 min 21 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 1.21 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <message> - exit code -1073741819 (0xc0000005) </message> <stderr_txt> 11:55:12 (7348): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7636, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7812, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=66Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3984, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8152, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7932, iMonCtr=1 Model crash detected, will try to restart... CCPDN Monitor - Quit request from BOINC... 10:12:55 (7364): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16 BUFFIN: C I/O Error feof - Unit 64 - Return code = 16 BUFFIN: C I/O Error feof - Unit 65 - Return code = 16 BUFFIN: C I/O Error feof - Unit 66 - Return code = 16 BUFFIN: C I/O Error feof - Unit 67 - Return code = 16 BUFFIN: C I/O Error feof - Unit 68 - Return code = 16 BUFFIN: C I/O Error feof - Unit 69 - Return code = 16 Error converting file to netcdf: dataout/001xko.pje4c10 Error converting file to netcdf: dataout/001xko.pie4c10 Error converting file to netcdf: dataout/001xko.pfe4c10 Error converting file to netcdf: dataout/001xka.phe4c10 Error converting file to netcdf: dataout/001xka.pge4c10 Error converting file to netcdf: dataout/001xka.pee4c10 Error converting file to netcdf: dataout/001xka.pde4c10 CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1296, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPController:: CPDN process is not running, exiting, bRetVal = 1, cheCPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7544, iMonCtr=1 Model crash detected, will try to restart... 18:04:23 (8036): No heartbeat from core client for 30 sec - exiting 18:04:24 (8036): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:28:45 (5896): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6836, iMonCtr=1 Model crash detected, will try to restart... C09:52:04 (7904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:24:27 (7732): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 07:53:34 (7380): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
16:30:58 (6296): No heartbeat from core client for 30 sec - exiting 16:31:00 (6296): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:39:24 (7616): No heartbeat from core client for 30 sec - exiting 08:39:26 (7616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3832, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7932, iMonCtr=1 Model crash detected, will try to restart... 13:57:35 (6832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:04:05 (5744): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7372, iMonCtr=1 Model crash detected, will try to restart... C22:31:50 (7196): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 22:31:51 (7196): No heartbeat from core client for 30 sec - exiting 09:19:48 (7392): No heartbeat from core client for 30 sec - exiting 09:19:50 (7392): No heartbeat from core client for 30 sec - exiting 09:19:51 (7392): No heartbeat from core client for 30 sec - exiting 09:19:52 (7392): No heartbeat from core client for 30 sec - exiting 09:19:53 (7392): No heartbeat from core client for 30 sec - exiting 09:19:54 (7392): No heartbeat from core client for 30 sec - exiting 09:19:55 (7392): No heartbeat from core client for 30 sec - exiting 09:19:56 (7392): No heartbeat from core client for 30 sec - exiting 09:19:57 (7392): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
09:20:47 (7796): Can't acquire lockfile (32) - waiting 35s Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6200, iMonCtr=1 Model crash detected, will try to restart... 19:49:05 (7428): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:19:47 (7256): No heartbeat from core client for 30 sec - exiting 09:19:48 (7256): No heartbeat from core client for 30 sec - exiting 09:19:49 (7256): No heartbeat from core client for 30 sec - exiting 09:19:50 (7256): No heartbeat from core client for 30 sec - exiting 09:19:51 (7256): No heartbeat from core client for 30 sec - exiting 09:19:52 (7256): No heartbeat from core client for 30 sec - exiting 09:19:53 (7256): No heartbeat from core client for 30 sec - exiting 09:19:55 (7256): No heartbeat from core client for 30 sec - exiting 09:19:56 (7256): No heartbeat from core client for 30 sec - exiting 09:19:57 (7256): No heartbeat from core client for 30 sec - exiting 09:19:58 (7256): No heartbeat from core client for 30 sec - exiting 09:19:59 (7256): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 09:20:00 (7256): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77896E5F read attempt to address 0x40C7729D Engaging BOINC Windows Runtime Debugger... Signal 11 received, exiting... Called boinc_finish ERROR: Invalid parameter detected in function (null). File: (null) Line: 0 ERROR: Expression: (null) </stderr_txt> ]]> |
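The exit status above appears twice, as the signed decimal -1073741819 and as the hexadecimal 0xC0000005. Both are the same 32-bit value: the first is its signed two's-complement reading, the second its unsigned reading. A minimal sketch of the conversion follows; the function name `to_ntstatus_hex` is hypothetical and not part of BOINC or CPDN.

```python
def to_ntstatus_hex(exit_code: int) -> str:
    """Render a signed 32-bit exit code as an unsigned 8-digit hex string."""
    # Masking with 0xFFFFFFFF reinterprets the negative value as unsigned 32-bit.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


if __name__ == "__main__":
    # Exit status reported for this result.
    print(to_ntstatus_hex(-1073741819))  # 0xC0000005
```

0xC0000005 is the Windows STATUS_ACCESS_VIOLATION code, which matches the "Access Violation (0xc0000005)" reason recorded in the exception record at the end of the stderr output.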
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 Jul 2012 19:04:05 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 259,200 | 879,066 | 3.3915 |
08 Jul 2012 18:19:55 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 233,280 | 789,225 | 3.3832 |
08 Jul 2012 18:19:55 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 207,360 | 701,052 | 3.3808 |
02 Jul 2012 18:08:26 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 181,440 | 612,688 | 3.3768 |
24 Jun 2012 01:47:30 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 155,520 | 522,707 | 3.3610 |
15 Jun 2012 21:23:49 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 129,600 | 432,758 | 3.3392 |
09 Jun 2012 09:17:16 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 103,680 | 345,700 | 3.3343 |
01 Jun 2012 12:17:57 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 77,760 | 259,835 | 3.3415 |
28 May 2012 08:49:33 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 51,840 | 174,404 | 3.3643 |
19 May 2012 18:46:08 | 991581 | 14665699 | hadcm3n_001x_1940_40_007958986_0 | 25,920 | 87,620 | 3.3804 |
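The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. This is an assumption inferred from the listed values, not something the page states. A short sketch that checks it against three of the rows above (the helper name `seconds_per_timestep` is hypothetical):

```python
def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative model timesteps."""
    return cpu_time_sec / timestep


# (timestep, CPU time in seconds, listed average) copied from the trickle table.
rows = [
    (259_200, 879_066, 3.3915),
    (233_280, 789_225, 3.3832),
    (25_920, 87_620, 3.3804),
]

for timestep, cpu_sec, listed in rows:
    computed = seconds_per_timestep(cpu_sec, timestep)
    # The listed value has four decimals, so allow half a unit in the last place.
    matches = abs(computed - listed) <= 5e-5
    print(f"{timestep:>7} steps: computed {computed:.4f}, listed {listed:.4f}, match={matches}")
```

Under this reading, the host held a steady rate of roughly 3.34 to 3.39 CPU seconds per timestep across all ten trickles.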