Name | hadcm3n_4fg2_1940_40_008305437_0 |
Workunit | 8456572 |
Created | 7 Feb 2013, 6:10:35 UTC |
Sent | 7 Feb 2013, 6:13:07 UTC |
Report deadline | 9 May 2013, 13:40:18 UTC |
Received | 30 Mar 2013, 23:16:20 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1103498 |
Run time | 9 days 18 hours 3 min |
CPU time | 6 days 22 hours 25 min 25 sec |
Validate state | Invalid |
Credit | 3,110.40 |
Device peak FLOPS | 2.39 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6256, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5816, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3428, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1728, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
C18:40:59 (2688): No heartbeat from core client for 30 sec - exiting 18:41:00 (2688): No heartbeat from core client for 30 sec - exiting 18:41:01 (2688): No heartbeat from core client for 30 sec - exiting 18:41:02 (2688): No heartbeat from core client for 30 sec - exiting 18:41:03 (2688): No heartbeat from core client for 30 sec - exiting 18:41:04 (2688): No heartbeat from core client for 30 sec - exiting 18:41:05 (2688): No heartbeat from core client for 30 sec - exiting 18:41:06 (2688): No heartbeat from core client for 30 sec - exiting 18:41:08 (2688): No heartbeat from core client for 30 sec - exiting 18:41:09 (2688): No heartbeat from core client for 30 sec - exiting 18:41:10 (2688): No heartbeat from core client for 30 sec - exiting 18:41:11 (2688): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1072, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4520, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4532, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6128, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... C10:29:23 (856): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
20:42:41 (3932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5876, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3172, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 08:26:57 (4324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6652, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... C11:51:25 (4836): No heartbeat from core client for 30 sec - exiting 11:51:26 (4836): No heartbeat from core client for 30 sec - exiting 11:51:27 (4836): No heartbeat from core client for 30 sec - exiting 11:51:28 (4836): No heartbeat from core client for 30 sec - exiting 11:51:29 (4836): No heartbeat from core client for 30 sec - exiting 11:51:30 (4836): No heartbeat from core client for 30 sec - exiting 11:51:31 (4836): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 13:45:49 (2696): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 
18:43:19 (6684): No heartbeat from core client for 30 sec - exiting 18:43:20 (6684): No heartbeat from core client for 30 sec - exiting 18:43:21 (6684): No heartbeat from core client for 30 sec - exiting 18:43:22 (6684): No heartbeat from core client for 30 sec - exiting 18:43:23 (6684): No heartbeat from core client for 30 sec - exiting 18:43:24 (6684): No heartbeat from core client for 30 sec - exiting 18:43:25 (6684): No heartbeat from core client for 30 sec - exiting 18:43:26 (6684): No heartbeat from core client for 30 sec - exiting 18:43:27 (6684): No heartbeat from core client for 30 sec - exiting 18:43:28 (6684): No heartbeat from core client for 30 sec - exiting 18:43:29 (6684): No heartbeat from core client for 30 sec - exiting 18:43:30 (6684): No heartbeat from core client for 30 sec - exiting 18:43:31 (6684): No heartbeat from core client for 30 sec - exiting 18:43:32 (6684): No heartbeat from core client for 30 sec - exiting 18:43:33 (6684): No heartbeat from core client for 30 sec - exiting 18:43:34 (6684): No heartbeat from core client for 30 sec - exiting 18:43:35 (6684): No heartbeat from core client for 30 sec - exiting 18:43:36 (6684): No heartbeat from core client for 30 sec - exiting 18:43:37 (6684): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:12:32 (4260): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5800, iMonCtr=1 Model crash detected, will try to restart... 10:39:55 (3564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1764, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1012, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4176, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6060, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1616, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3776, iMonCtr=1 Model crash detected, will try to restart... 11:11:11 (4996): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 11:18:22 (7152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 12:02:34 (6984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
12:02:35 (6984): No heartbeat from core client for 30 sec - exiting 12:02:36 (6984): No heartbeat from core client for 30 sec - exiting 12:02:37 (6984): No heartbeat from core client for 30 sec - exiting 12:02:38 (6984): No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6696, iMonCtr=1 Model crash detected, will try to restart... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x77583AB3 read attempt to address 0x40EEC668 Engaging BOINC Windows Runtime Debugger... Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x76FB7373 read attempt to address 0xFFFFFFF8 Engaging BOINC Windows Runtime Debugger... Cannot serialize file C:\ProgramData\BOINC/projects/climateprediction.net/hadcm3n_4fg2_1940_40_008305437/dataout/shmem_restart.day Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
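The stderr text above repeats a handful of message types many times, so a tally can make the run's history easier to read. The following is a minimal Python sketch, assuming plain substring matching is good enough; the `EVENTS` labels and the `tally_events` helper are hypothetical names for illustration, not part of BOINC or CPDN tooling, and the sample string is a short excerpt assembled from messages in the log rather than the full output.

```python
# Hypothetical labels mapped to message fragments that appear in the stderr text.
EVENTS = {
    "model_restart": "Model crash detected, will try to restart",
    "quit_request": "Quit request from BOINC",
    "suspend_request": "Suspend request from BOINC",
    "no_heartbeat": "No heartbeat from core client",
    "access_violation": "Access Violation (0xc0000005)",
}


def tally_events(stderr_text: str) -> dict:
    """Count non-overlapping occurrences of each known message fragment."""
    return {name: stderr_text.count(marker) for name, marker in EVENTS.items()}


# Short excerpt built from messages in the log above (not the full stderr).
sample = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=6256, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "CPDN Monitor - Quit request from BOINC... "
    "10:29:23 (856): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Suspended CPDN Monitor - Suspend request from BOINC... "
)

for name, count in tally_events(sample).items():
    print(f"{name}: {count}")
```

Running the same function on the complete stderr text would give the per-type counts for the whole task; the excerpt here only demonstrates the matching.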
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
29 Mar 2013 02:22:32 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 259,200 | 586,853 | 2.2641 |
24 Mar 2013 06:11:21 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 233,280 | 531,277 | 2.2774 |
20 Mar 2013 18:06:03 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 207,360 | 475,292 | 2.2921 |
17 Mar 2013 04:02:56 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 181,440 | 420,478 | 2.3174 |
13 Mar 2013 03:33:36 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 155,520 | 365,632 | 2.3510 |
08 Mar 2013 03:14:45 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 129,600 | 307,674 | 2.3740 |
05 Mar 2013 03:29:24 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 103,680 | 247,701 | 2.3891 |
27 Feb 2013 02:40:25 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 77,760 | 184,477 | 2.3724 |
18 Feb 2013 03:57:14 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 51,840 | 125,940 | 2.4294 |
13 Feb 2013 03:10:10 | 1103498 | 15591536 | hadcm3n_4fg2_1940_40_008305437_0 | 25,920 | 62,497 | 2.4111 |
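
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. A minimal Python sketch, using three rows copied from the table above, checks that reading; the function name `average_sec_per_ts` is hypothetical and the four-decimal rounding is an assumption based on how the values are displayed.

```python
def average_sec_per_ts(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimal places."""
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, reported average) copied from the trickle table.
rows = [
    (259_200, 586_853, 2.2641),
    (129_600, 307_674, 2.3740),
    (25_920, 62_497, 2.4111),
]

for timestep, cpu_time, reported in rows:
    computed = average_sec_per_ts(cpu_time, timestep)
    print(f"{timestep:>7} {cpu_time:>7} computed={computed:.4f} reported={reported:.4f}")
```

For these rows the computed and reported values agree, which suggests the average is taken over the whole run so far rather than over the interval since the previous trickle.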