Name | hadcm3n_u0ss_2020_40_008338147_1 |
Workunit | 8489008 |
Created | 5 Apr 2013, 23:25:17 UTC |
Sent | 7 Apr 2013, 6:17:05 UTC |
Report deadline | 7 Jul 2013, 13:44:16 UTC |
Received | 9 Jun 2013, 1:56:58 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | -529697949 (0xE06D7363) Unknown error code |
Computer ID | 1167459 |
Run time | 15 days 0 hours 8 min 22 sec |
CPU time | 14 days 21 hours 55 min 17 sec |
Validate state | Invalid |
Credit | 12,441.60 |
Device peak FLOPS | 3.18 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> - exit code -529697949 (0xe06d7363) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5148, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2856, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2856, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2856, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4520, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4968, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4968, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3548, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2748, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2748, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4488, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4488, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4488, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5472, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5064, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4320, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=964, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2940, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4908, iMonCtr=1 Model crash detected, will try to restart... 18:58:28 (4180): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4356, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3524, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3524, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4228, iMonCtr=1 Model crash detected, will try to restart... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77B53AB3 read attempt to address 0x40E45ECD Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x76FC3AB3 read attempt to address 0x40E45ECD Engaging BOINC Windows Runtime Debugger... 
Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_u0ss_2020_40_008338147/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x76FC3792 read attempt to address 0x40E45ECD Engaging BOINC Windows Runtime Debugger... </stderr_txt> ]]> |
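
The exit status above is shown both as a signed decimal (-529697949) and as hex (0xE06D7363). The sketch below checks that these are the same 32-bit value. The helper names `to_unsigned32` and `to_signed32` are illustrative and not part of BOINC. As an aside, 0xE06D7363 is widely documented as the exception code the Microsoft Visual C++ runtime raises for an unhandled C++ exception (its low three bytes spell "msc"). Treat that as a hint about the crash, not a diagnosis.

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a signed 32-bit integer as its unsigned bit pattern."""
    return code & 0xFFFFFFFF


def to_signed32(code: int) -> int:
    """Reinterpret an unsigned 32-bit pattern as a signed integer."""
    code &= 0xFFFFFFFF
    return code - 0x1_0000_0000 if code >= 0x8000_0000 else code


# Values copied from the "Exit status" row.
signed_status = -529697949
hex_status = 0xE06D7363

print(f"0x{to_unsigned32(signed_status):08X}")
print(to_signed32(hex_status))

# The low three bytes decode to ASCII letters.
print(hex_status.to_bytes(4, "big")[1:].decode("ascii"))
```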
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
08 Jun 2013 12:45:21 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 1,036,800 | 1,288,023 | 1.2423 |
07 Jun 2013 16:52:26 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 1,010,880 | 1,253,184 | 1.2397 |
06 Jun 2013 12:29:26 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 984,960 | 1,218,877 | 1.2375 |
04 Jun 2013 12:06:19 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 959,040 | 1,187,105 | 1.2378 |
03 Jun 2013 07:47:47 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 933,120 | 1,155,501 | 1.2383 |
01 Jun 2013 13:46:19 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 907,200 | 1,122,260 | 1.2371 |
01 Jun 2013 04:25:41 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 881,280 | 1,087,658 | 1.2342 |
31 May 2013 10:31:57 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 855,360 | 1,051,448 | 1.2292 |
30 May 2013 11:00:12 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 829,440 | 1,018,733 | 1.2282 |
28 May 2013 12:40:07 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 803,520 | 986,611 | 1.2279 |
27 May 2013 11:29:52 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 777,600 | 955,788 | 1.2292 |
26 May 2013 08:09:53 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 751,680 | 925,127 | 1.2307 |
25 May 2013 10:12:35 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 725,760 | 894,458 | 1.2324 |
25 May 2013 00:54:20 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 699,840 | 862,845 | 1.2329 |
22 May 2013 14:40:32 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 673,920 | 830,270 | 1.2320 |
18 May 2013 13:50:33 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 648,000 | 796,078 | 1.2285 |
18 May 2013 05:32:25 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 622,080 | 765,322 | 1.2303 |
16 May 2013 13:24:01 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 596,160 | 732,874 | 1.2293 |
14 May 2013 14:00:36 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 570,240 | 699,329 | 1.2264 |
12 May 2013 10:31:38 | 1167459 | 15710709 | hadcm3n_u0ss_2020_40_008338147_1 | 544,320 | 667,816 | 1.2269 |
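
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the cumulative timestep count. That is an inference from the numbers in the table; the page itself does not define the column. The sketch below recomputes it for the three most recent trickles. The function name `seconds_per_timestep` is illustrative.

```python
# (timestep, cpu_time_sec, reported_average) copied from the three most
# recent trickle rows in the table above.
TRICKLES = [
    (1_036_800, 1_288_023, 1.2423),
    (1_010_880, 1_253_184, 1.2397),
    (984_960, 1_218_877, 1.2375),
]


def seconds_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = seconds_per_timestep(cpu_time_sec, timestep)
    # Compare against the reported value at the table's 4-decimal precision.
    matches = abs(computed - reported) < 5e-5
    print(f"{timestep:>9,d}  computed={computed:.4f}  reported={reported:.4f}  match={matches}")
```

The same division can be applied to any other row as a spot check.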