Name | hadcm3n_3n4g_1940_40_008260810_0 |
Workunit | 8415934 |
Created | 20 Dec 2012, 19:40:14 UTC |
Sent | 20 Dec 2012, 19:46:20 UTC |
Report deadline | 22 Mar 2013, 3:13:31 UTC |
Received | 22 Mar 2013, 18:05:21 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1164195 |
Run time | 17 days 6 hours 28 min 28 sec |
CPU time | 14 days 2 hours 12 min 43 sec |
Validate state | Invalid |
Credit | 5,598.72 |
Device peak FLOPS | 2.42 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.25</core_client_version> <![CDATA[ <message> - exit code 193 (0xc1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4880, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exitiCPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3180, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 20:41:08 (4420): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3700, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3500, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4148, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1896, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3380, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3380, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3668, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2968, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3880, iMonCtr=1 Model crash detected, will try to restart... 06:26:36 (3152): No heartbeat from core client for 30 sec - exiting 06:26:37 (3152): No heartbeat from core client for 30 sec - exiting 06:26:38 (3152): No heartbeat from core client for 30 sec - exiting 06:26:39 (3152): No heartbeat from core client for 30 sec - exiting 06:26:40 (3152): No heartbeat from core client for 30 sec - exiting 06:26:41 (3152): No heartbeat from core client for 30 sec - exiting 06:26:42 (3152): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1188, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3372, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4672, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4132, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=908, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3552, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3876, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2788, iMonCtr=1 Model crash detected, will try to restart... 06:25:41 (4008): No heartbeat from core client for 30 sec - exiting 06:25:42 (4008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=212, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2748, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2976, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3480, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1188, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5024, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2648, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3696, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2888, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1176, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4384, iMonCtr=1 Model crash detected, will try to restart... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4280, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=720, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2184, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2960, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3360, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2704, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3272, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=960, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5008, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1356, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2564, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... 16:39:38 (3444): No heartbeat from core client for 30 sec - exiting 16:39:39 (3444): No heartbeat from core client for 30 sec - exiting 16:39:40 (3444): No heartbeat from core client for 30 sec - exiting 16:39:41 (3444): No heartbeat from core client for 30 sec - exiting 16:39:42 (3444): No heartbeat from core client for 30 sec - exiting 16:39:43 (3444): No heartbeat from core client for 30 sec - exiting 16:39:44 (3444): No heartbeat from core client for 30 sec - exiting 16:39:45 (3444): No heartbeat from core client for 30 sec - exiting 16:39:46 (3444): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3984, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2996, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2184, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2868, iMonCtr=1 Model crash detected, will try to restart... 
CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1768, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1768, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3436, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3436, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 19:06:13 (2040): No heartbeat from core client for 30 sec - exiting 19:06:14 (2040): No heartbeat from core client for 30 sec - exiting 19:06:15 (2040): No heartbeat from core client for 30 sec - exiting 19:06:16 (2040): No heartbeat from core client for 30 sec - exiting 19:06:17 (2040): No heartbeat from core client for 30 sec - exiting 19:06:18 (2040): No heartbeat from core client for 30 sec - exiting 19:06:19 (2040): No heartbeat from core client for 30 sec - exiting 19:06:20 (2040): No heartbeat from core client for 30 sec - exiting 19:06:21 (2040): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 07:43:12 (1776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 06:28:16 (3008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1640, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2748, iMonCtr=1 Model crash detected, will try to restart... Signal 11 received, exiting... Called boinc_finish </stderr_txt> ]]> |
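The stderr text above repeats a small set of messages many times, so a tally by message type can make the failure pattern easier to read. The following is a minimal Python sketch of such a tally, not part of the CPDN or BOINC tooling. The label names in `PATTERNS` and the function `tally_events` are hypothetical, and the sample string joins a few fragments copied from the log for illustration (they are not contiguous in the original).

```python
import re
from collections import Counter

# Fragments copied from the stderr text above and joined for illustration;
# they are NOT contiguous in the original log, which is much longer.
STDERR_EXCERPT = (
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=2748, iMonCtr=1 Model crash detected, will try to restart... "
    "CPDN Monitor - Quit request from BOINC... "
    "06:28:16 (3008): No heartbeat from core client for 30 sec - exiting "
    "Signal 11 received, exiting... Called boinc_finish"
)

# Hypothetical labels mapped to regular expressions for substrings seen in the log.
PATTERNS = {
    "model_crash": r"Model crash detected",
    "quit_request": r"Quit request from BOINC",
    "no_heartbeat": r"No heartbeat from core client",
    "signal": r"Signal \d+ received",
}


def tally_events(text: str) -> Counter:
    """Count non-overlapping matches of each pattern in the given log text."""
    return Counter({label: len(re.findall(p, text)) for label, p in PATTERNS.items()})


counts = tally_events(STDERR_EXCERPT)
print(counts)
```

Running the same function over the full stderr text would give the per-type counts for this task; the sample here only shows the mechanics.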
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
12 Feb 2013 19:42:24 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 466,560 | 752,658 | 1.6132 |
11 Feb 2013 21:41:24 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 440,640 | 712,190 | 1.6163 |
10 Feb 2013 22:38:29 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 414,720 | 670,950 | 1.6178 |
07 Feb 2013 07:57:36 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 388,800 | 627,877 | 1.6149 |
03 Feb 2013 23:30:36 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 362,880 | 585,179 | 1.6126 |
30 Jan 2013 06:47:53 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 336,960 | 543,456 | 1.6128 |
26 Jan 2013 05:34:24 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 311,040 | 501,501 | 1.6123 |
22 Jan 2013 22:51:18 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 285,120 | 458,943 | 1.6096 |
20 Jan 2013 16:31:05 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 259,200 | 416,281 | 1.6060 |
16 Jan 2013 22:27:16 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 233,280 | 373,987 | 1.6032 |
13 Jan 2013 18:56:20 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 207,360 | 331,544 | 1.5989 |
08 Jan 2013 21:27:08 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 181,440 | 288,711 | 1.5912 |
06 Jan 2013 15:22:35 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 155,520 | 248,081 | 1.5952 |
02 Jan 2013 19:34:41 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 129,600 | 207,551 | 1.6015 |
28 Dec 2012 14:10:43 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 103,680 | 165,796 | 1.5991 |
27 Dec 2012 13:20:22 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 77,760 | 123,986 | 1.5945 |
26 Dec 2012 14:15:09 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 51,840 | 82,951 | 1.6001 |
25 Dec 2012 11:18:04 | 1164195 | 15488868 | hadcm3n_3n4g_1940_40_008260810_0 | 25,920 | 42,513 | 1.6402 |
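The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count. That reading is an inference from the figures, not something the page states. A minimal Python sketch checking it against three rows copied from the table above; the names `TRICKLES` and `sec_per_timestep` are hypothetical.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
TRICKLES = [
    (466_560, 752_658, 1.6132),
    (259_200, 416_281, 1.6060),
    (25_920, 42_513, 1.6402),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Cumulative CPU seconds divided by cumulative timesteps, rounded to 4 places."""
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in TRICKLES:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Print the computed value next to the value shown on the page.
    print(timestep, computed, reported)
```

For these three rows the computed values agree with the reported ones to four decimal places, which is consistent with the assumed formula.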
©2024 cpdn.org