Name | hadcm3n_849h_1980_40_008463481_2 |
Workunit | 8614320 |
Created | 11 Nov 2013, 2:30:52 UTC |
Sent | 11 Nov 2013, 2:30:59 UTC |
Report deadline | 10 Feb 2014, 9:58:10 UTC |
Received | 7 Feb 2014, 1:00:50 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1233028 |
Run time | 15 days 13 hours 43 min 1 sec |
CPU time | 13 days 8 hours 30 min 46 sec |
Validate state | Invalid |
Credit | 3,732.48 |
Device peak FLOPS | 2.50 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.2.28</core_client_version> <![CDATA[ <message> (unknown error) - exit code 193 (0xc1) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish 19:19:03 (4728): No heartbeat from core client for 30 sec - exiting 19:19:04 (4728): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4272, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 23:34:14 (5404): No heartbeat from core client for 30 sec - exiting 23:34:15 (5404): No heartbeat from core client for 30 sec - exiting 23:34:16 (5404): No heartbeat from core client for 30 sec - exiting 23:34:17 (5404): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
14:52:10 (5524): No heartbeat from core client for 30 sec - exiting 14:52:11 (5524): No heartbeat from core client for 30 sec - exiting 14:52:12 (5524): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4348, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1456, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5088, iMonCtr=1 Model crash detected, will try to restart... 18:34:09 (5112): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4276, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4996, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3672, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5328, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5852, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5692, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=680, iMonCtr=1 Model crash detected, will try to restart... 
21:02:52 (3672): No heartbeat from core client for 30 sec - exiting 21:02:54 (3672): No heartbeat from core client for 30 sec - exiting 21:02:55 (3672): No heartbeat from core client for 30 sec - exiting 21:02:56 (3672): No heartbeat from core client for 30 sec - exiting 21:02:57 (3672): No heartbeat from core client for 30 sec - exiting 21:02:58 (3672): No heartbeat from core client for 30 sec - exiting 21:02:59 (3672): No heartbeat from core client for 30 sec - exiting 21:03:00 (3672): No heartbeat from core client for 30 sec - exiting 21:03:01 (3672): No heartbeat from core client for 30 sec - exiting 21:03:02 (3672): No heartbeat from core client for 30 sec - exiting 21:03:03 (3672): No heartbeat from core client for 30 sec - exiting 21:03:04 (3672): No heartbeat from core client for 30 sec - exiting 21:03:05 (3672): No heartbeat from core client for 30 sec - exiting 21:03:06 (3672): No heartbeat from core client for 30 sec - exiting 21:03:07 (3672): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... C13:49:23 (4928): No heartbeat from core client for 30 sec - exiting 13:49:25 (4928): No heartbeat from core client for 30 sec - exiting 13:49:26 (4928): No heartbeat from core client for 30 sec - exiting 13:49:27 (4928): No heartbeat from core client for 30 sec - exiting 13:49:28 (4928): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4380, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3096, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5292, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5292, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4464, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5844, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4796, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4796, iMonCtr=1 Model crash detected, will try to restart... 10:23:49 (4384): No heartbeat from core client for 30 sec - exiting 10:23:50 (4384): No heartbeat from core client for 30 sec - exiting 10:23:51 (4384): No heartbeat from core client for 30 sec - exiting 10:23:53 (4384): No heartbeat from core client for 30 sec - exiting 10:23:54 (4384): No heartbeat from core client for 30 sec - exiting 10:23:55 (4384): No heartbeat from core client for 30 sec - exiting 10:23:56 (4384): No heartbeat from core client for 30 sec - exiting 10:23:57 (4384): No heartbeat from core client for 30 sec - exiting 10:23:58 (4384): No heartbeat from core client for 30 sec - exiting 10:23:59 (4384): No heartbeat from core client for 30 sec - exiting 10:24:00 (4384): No heartbeat from core client for 30 sec - exiting 10:24:01 (4384): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5648, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! 
:-( Called boinc_finish 10:23:23 (4972): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 10:27:51 (5576): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:30:12 (5416): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1944, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4820, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5316, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5068, iMonCtr=1 Model crash detected, will try to restart... 16:56:31 (3876): No heartbeat from core client for 30 sec - exiting 16:56:32 (3876): No heartbeat from core client for 30 sec - exiting 16:56:33 (3876): No heartbeat from core client for 30 sec - exiting 16:56:35 (3876): No heartbeat from core client for 30 sec - exiting 16:56:36 (3876): No heartbeat from core client for 30 sec - exiting 16:56:37 (3876): No heartbeat from core client for 30 sec - exiting 16:56:38 (3876): No heartbeat from core client for 30 sec - exiting 16:56:39 (3876): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5176, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... 13:24:30 (4668): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1120, iMonCtr=1 Model crash detected, will try to restart... 21:03:37 (5544): No heartbeat from core client for 30 sec - exiting 21:03:38 (5544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:03:39 (5544): No heartbeat from core client for 30 sec - exiting 21:03:40 (5544): No heartbeat from core client for 30 sec - exiting 21:03:41 (5544): No heartbeat from core client for 30 sec - exiting 21:03:42 (5544): No heartbeat from core client for 30 sec - exiting 21:03:44 (5544): No heartbeat from core client for 30 sec - exiting 21:03:45 (5544): No heartbeat from core client for 30 sec - exiting 21:03:46 (5544): No heartbeat from core client for 30 sec - exiting 21:03:47 (5544): No heartbeat from core client for 30 sec - exiting 21:03:48 (5544): No heartbeat from core client for 30 sec - exiting 21:03:49 (5544): No heartbeat from core client for 30 sec - exiting 21:07:41 (5956): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:32:03 (4180): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
17:19:00 (5328): No heartbeat from core client for 30 sec - exiting 17:19:01 (5328): No heartbeat from core client for 30 sec - exiting 17:19:02 (5328): No heartbeat from core client for 30 sec - exiting 17:19:04 (5328): No heartbeat from core client for 30 sec - exiting 17:19:05 (5328): No heartbeat from core client for 30 sec - exiting 17:19:06 (5328): No heartbeat from core client for 30 sec - exiting 17:19:07 (5328): No heartbeat from core client for 30 sec - exiting 17:19:08 (5328): No heartbeat from core client for 30 sec - exiting 17:19:09 (5328): No heartbeat from core client for 30 sec - exiting 17:19:10 (5328): No heartbeat from core client for 30 sec - exiting 17:19:11 (5328): No heartbeat from core client for 30 sec - exiting 17:19:12 (5328): No heartbeat from core client for 30 sec - exiting 17:19:13 (5328): No heartbeat from core client for 30 sec - exiting 17:19:14 (5328): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:21:24 (4888): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3448, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4184, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4620, iMonCtr=1 Model crash detected, will try to restart... 
20:49:57 (5028): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 23:42:45 (6128): No heartbeat from core client for 30 sec - exiting 23:42:46 (6128): No heartbeat from core client for 30 sec - exiting 23:42:47 (6128): No heartbeat from core client for 30 sec - exiting 23:42:48 (6128): No heartbeat from core client for 30 sec - exiting 23:42:49 (6128): No heartbeat from core client for 30 sec - exiting 23:42:50 (6128): No heartbeat from core client for 30 sec - exiting 23:42:51 (6128): No heartbeat from core client for 30 sec - exiting 23:42:52 (6128): No heartbeat from core client for 30 sec - exiting 23:42:53 (6128): No heartbeat from core client for 30 sec - exiting 23:42:54 (6128): No heartbeat from core client for 30 sec - exiting 23:42:55 (6128): No heartbeat from core client for 30 sec - exiting 23:42:56 (6128): No heartbeat from core client for 30 sec - exiting 23:42:57 (6128): No heartbeat from core client for 30 sec - exiting 23:42:58 (6128): No heartbeat from core client for 30 sec - exiting 23:42:59 (6128): No heartbeat from core client for 30 sec - exiting 23:43:00 (6128): No heartbeat from core client for 30 sec - exiting 23:43:01 (6128): No heartbeat from core client for 30 sec - exiting 23:43:02 (6128): No heartbeat from core client for 30 sec - exiting 23:43:03 (6128): No heartbeat from core client for 30 sec - exiting 23:43:04 (6128): No heartbeat from core client for 30 sec - exiting 23:43:05 (6128): No heartbeat from core client for 30 sec - exiting 23:43:06 (6128): No heartbeat from core client for 30 sec - exiting 23:43:07 (6128): No heartbeat from core client for 30 sec - exiting 23:43:08 (6128): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
23:43:09 (6128): No heartbeat from core client for 30 sec - exiting 23:43:10 (6128): No heartbeat from core client for 30 sec - exiting 23:43:11 (6128): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5028, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5028, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5028, iMonCtr=1 Model crash detected, will try to restart... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77987383 read attempt to address 0x4094630B Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77D47383 read attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_849h_1980_40_008463481/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
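The stderr output above is a long run of repeated monitor messages, so a tally per message type is easier to read than the raw text. The following is a minimal sketch, not part of BOINC or the CPDN application: the function name `tally_events`, the event labels, and the short `sample` string (assembled from messages that appear in the log) are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical labels mapped to substrings that appear verbatim in the stderr text.
EVENT_PATTERNS = {
    "model_crash": r"Model crash detected, will try to restart",
    "no_heartbeat": r"No heartbeat from core client for 30 sec",
    "suspend_request": r"Suspend request from BOINC",
    "quit_request": r"Quit request from BOINC",
    "access_violation": r"Access Violation \(0xc0000005\)",
    "too_many_crashes": r"Sorry, too many model crashes!",
}


def tally_events(stderr_text: str) -> Counter:
    """Count how often each known message occurs in a stderr dump."""
    counts = Counter()
    for label, pattern in EVENT_PATTERNS.items():
        counts[label] = len(re.findall(pattern, stderr_text))
    return counts


# Short illustrative excerpt built from messages seen in the log, not the full dump.
sample = (
    "Suspended CPDN Monitor - Suspend request from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... "
    "19:19:03 (4728): No heartbeat from core client for 30 sec - exiting "
    "19:19:04 (4728): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "CPDN Monitor - Quit request from BOINC... "
)

counts = tally_events(sample)
for label, n in counts.items():
    print(f"{label}: {n}")
```

Pasting the full contents of the Stderr field in place of `sample` would give the totals for this task. The patterns match fixed substrings only, so any message with different wording would need its own entry.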
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
27 Dec 2013 02:25:19 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 311,040 | 611,306 | 1.9654 |
24 Dec 2013 00:11:30 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 285,120 | 560,718 | 1.9666 |
22 Dec 2013 00:42:26 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 259,200 | 509,342 | 1.9651 |
15 Dec 2013 20:35:28 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 233,280 | 458,423 | 1.9651 |
12 Dec 2013 01:32:58 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 207,360 | 407,581 | 1.9656 |
08 Dec 2013 15:58:39 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 181,440 | 357,411 | 1.9699 |
05 Dec 2013 01:55:37 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 155,520 | 307,552 | 1.9776 |
01 Dec 2013 16:34:00 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 129,600 | 256,131 | 1.9763 |
30 Nov 2013 12:12:34 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 103,680 | 205,683 | 1.9838 |
28 Nov 2013 03:21:01 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 77,760 | 155,152 | 1.9953 |
24 Nov 2013 03:10:28 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 51,840 | 104,151 | 2.0091 |
18 Nov 2013 00:07:41 | 1233028 | 16079390 | hadcm3n_849h_1980_40_008463481_2 | 25,920 | 53,933 | 2.0807 |
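
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, shown to four decimals; the rows above are consistent with that reading. A small sketch that checks it on three rows copied from the table follows. The names `TRICKLES` and `average_sec_per_timestep` are illustrative, and the formula is inferred from the numbers, not taken from the server code.

```python
# (timestep, cpu_time_sec, reported_average) copied from three rows of the table.
TRICKLES = [
    (311_040, 611_306, 1.9654),
    (285_120, 560_718, 1.9666),
    (25_920, 53_933, 2.0807),
]


def average_sec_per_timestep(cpu_time_sec: float, timestep: int) -> float:
    """Assumed formula: cumulative CPU seconds per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"{timestep:>7,} steps: computed {computed:.4f}, reported {reported:.4f}")
```

The timestep column also advances by 25,920 per trickle, so the last trickle at 311,040 timesteps is the twelfth one reported for this task.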