Name | hadcm3n_yey3_1980_40_007683949_0 |
Workunit | 7839036 |
Created | 16 Jan 2012, 8:48:29 UTC |
Sent | 16 Jan 2012, 8:50:18 UTC |
Report deadline | 16 Apr 2012, 16:17:29 UTC |
Received | 4 Feb 2012, 23:23:45 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 25 (0x00000019) Unknown error code |
Computer ID | 1190099 |
Run time | 15 days 4 hours 12 min 52 sec |
CPU time | 10 days 18 hours 49 min 49 sec |
Validate state | Invalid |
Credit | 8,398.08 |
Device peak FLOPS | 3.04 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> The drive cannot locate a specific area or track on the disk. (0x19) - exit code 25 (0x19) </message> <stderr_txt> Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7476, iMonCtr=1 Model crash detected, will try to restart... 
14:39:24 (6164): No heartbeat from core client for 30 sec - exiting 14:39:26 (6164): No heartbeat from core client for 30 sec - exiting 14:39:27 (6164): No heartbeat from core client for 30 sec - exiting 14:39:28 (6164): No heartbeat from core client for 30 sec - exiting 14:39:29 (6164): No heartbeat from core client for 30 sec - exiting 14:39:30 (6164): No heartbeat from core client for 30 sec - exiting 14:39:31 (6164): No heartbeat from core client for 30 sec - exiting 14:39:32 (6164): No heartbeat from core client for 30 sec - exiting 14:39:33 (6164): No heartbeat from core client for 30 sec - exiting 14:39:34 (6164): No heartbeat from core client for 30 sec - exiting 14:39:35 (6164): No heartbeat from core client for 30 sec - exiting 14:39:36 (6164): No heartbeat from core client for 30 sec - exiting 14:39:37 (6164): No heartbeat from core client for 30 sec - exiting 14:39:38 (6164): No heartbeat from core client for 30 sec - exiting 14:39:39 (6164): No heartbeat from core client for 30 sec - exiting 14:39:40 (6164): No heartbeat from core client for 30 sec - exiting 14:39:41 (6164): No heartbeat from core client for 30 sec - exiting 14:39:42 (6164): No heartbeat from core client for 30 sec - exiting 14:39:43 (6164): No heartbeat from core client for 30 sec - exiting 14:39:44 (6164): No heartbeat from core client for 30 sec - exiting 14:39:45 (6164): No heartbeat from core client for 30 sec - exiting 14:39:46 (6164): No heartbeat from core client for 30 sec - exiting 14:39:47 (6164): No heartbeat from core client for 30 sec - exiting 14:39:48 (6164): No heartbeat from core client for 30 sec - exiting 14:39:49 (6164): No heartbeat from core client for 30 sec - exiting 14:39:50 (6164): No heartbeat from core client for 30 sec - exiting 14:39:51 (6164): No heartbeat from core client for 30 sec - exiting 14:39:52 (6164): No heartbeat from core client for 30 sec - exiting 14:39:53 (6164): No heartbeat from core client for 30 sec - exiting 14:39:54 (6164): No 
heartbeat from core client for 30 sec - exiting 14:39:55 (6164): No heartbeat from core client for 30 sec - exiting 14:39:56 (6164): No heartbeat from core client for 30 sec - exiting 14:39:57 (6164): No heartbeat from core client for 30 sec - exiting 14:39:58 (6164): No heartbeat from core client for 30 sec - exiting 14:39:59 (6164): No heartbeat from core client for 30 sec - exiting 14:40:00 (6164): No heartbeat from core client for 30 sec - exiting 14:40:01 (6164): No heartbeat from core client for 30 sec - exiting 14:40:02 (6164): No heartbeat from core client for 30 sec - exiting 14:40:03 (6164): No heartbeat from core client for 30 sec - exiting 14:40:04 (6164): No heartbeat from core client for 30 sec - exiting 14:40:05 (6164): No heartbeat from core client for 30 sec - exiting 14:40:06 (6164): No heartbeat from core client for 30 sec - exiting 14:40:07 (6164): No heartbeat from core client for 30 sec - exiting 14:40:08 (6164): No heartbeat from core client for 30 sec - exiting 14:40:09 (6164): No heartbeat from core client for 30 sec - exiting 14:40:10 (6164): No heartbeat from core client for 30 sec - exiting 14:40:11 (6164): No heartbeat from core client for 30 sec - exiting 14:40:12 (6164): No heartbeat from core client for 30 sec - exiting 14:40:13 (6164): No heartbeat from core client for 30 sec - exiting 14:40:14 (6164): No heartbeat from core client for 30 sec - exiting 14:40:15 (6164): No heartbeat from core client for 30 sec - exiting 14:40:16 (6164): No heartbeat from core client for 30 sec - exiting 14:40:17 (6164): No heartbeat from core client for 30 sec - exiting 14:40:18 (6164): No heartbeat from core client for 30 sec - exiting 14:40:19 (6164): No heartbeat from core client for 30 sec - exiting 14:40:20 (6164): No heartbeat from core client for 30 sec - exiting 14:40:21 (6164): No heartbeat from core client for 30 sec - exiting 14:40:22 (6164): No heartbeat from core client for 30 sec - exiting 14:40:23 (6164): No heartbeat from core client 
for 30 sec - exiting 14:40:24 (6164): No heartbeat from core client for 30 sec - exiting 14:40:25 (6164): No heartbeat from core client for 30 sec - exiting 14:40:26 (6164): No heartbeat from core client for 30 sec - exiting 14:40:27 (6164): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 01:22:05 (10496): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 01:22:07 (10496): No heartbeat from core client for 30 sec - exiting 01:22:08 (10496): No heartbeat from core client for 30 sec - exiting 01:22:09 (10496): No heartbeat from core client for 30 sec - exiting 01:22:10 (10496): No heartbeat from core client for 30 sec - exiting 01:22:11 (10496): No heartbeat from core client for 30 sec - exiting 01:22:12 (10496): No heartbeat from core client for 30 sec - exiting 01:22:13 (10496): No heartbeat from core client for 30 sec - exiting 01:22:14 (10496): No heartbeat from core client for 30 sec - exiting 01:22:15 (10496): No heartbeat from core client for 30 sec - exiting 01:22:16 (10496): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=8588, iMonCtr=1 Model crash detected, will try to restart... 
12:45:32 (8924): No heartbeat from core client for 30 sec - exiting 12:45:33 (8924): No heartbeat from core client for 30 sec - exiting 12:45:34 (8924): No heartbeat from core client for 30 sec - exiting 12:45:35 (8924): No heartbeat from core client for 30 sec - exiting 12:45:36 (8924): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4024, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4024, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7796, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
04 Feb 2012 06:38:54 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 699,840 | 908,675 | 1.2984 |
03 Feb 2012 13:59:36 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 673,920 | 874,651 | 1.2979 |
02 Feb 2012 23:48:20 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 648,000 | 840,656 | 1.2973 |
02 Feb 2012 10:49:03 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 622,080 | 806,929 | 1.2971 |
01 Feb 2012 21:40:41 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 596,160 | 773,168 | 1.2969 |
01 Feb 2012 00:20:51 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 570,240 | 738,786 | 1.2956 |
31 Jan 2012 08:42:10 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 544,320 | 704,956 | 1.2951 |
30 Jan 2012 15:15:39 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 518,400 | 670,851 | 1.2941 |
30 Jan 2012 01:40:52 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 492,480 | 636,448 | 1.2923 |
29 Jan 2012 13:48:08 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 466,560 | 602,624 | 1.2916 |
29 Jan 2012 01:45:14 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 440,640 | 568,720 | 1.2907 |
27 Jan 2012 17:44:45 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 414,720 | 535,272 | 1.2907 |
27 Jan 2012 01:45:46 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 388,800 | 501,898 | 1.2909 |
26 Jan 2012 12:11:39 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 362,880 | 468,331 | 1.2906 |
25 Jan 2012 23:12:24 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 336,960 | 434,534 | 1.2896 |
25 Jan 2012 03:30:52 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 311,040 | 400,781 | 1.2885 |
24 Jan 2012 13:43:44 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 285,120 | 366,851 | 1.2867 |
23 Jan 2012 20:40:25 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 259,200 | 332,214 | 1.2817 |
23 Jan 2012 05:49:34 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 233,280 | 298,497 | 1.2796 |
22 Jan 2012 12:06:50 | 1190099 | 13928341 | hadcm3n_yey3_1980_40_007683949_0 | 207,360 | 263,984 | 1.2731 |
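
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count; that reading is an assumption, but it matches the rows shown above. A minimal Python sketch that recomputes the figure for a few rows (the helper names `parse_count` and `seconds_per_timestep` are hypothetical, chosen for illustration only):

```python
def parse_count(text: str) -> int:
    """Convert a comma-grouped figure such as '699,840' to an int."""
    return int(text.replace(",", ""))


def seconds_per_timestep(cpu_time_sec: str, timestep: str) -> float:
    """Cumulative CPU seconds divided by timesteps, rounded to 4 places.

    Assumption: this is how the Average (sec/TS) column is derived.
    """
    return round(parse_count(cpu_time_sec) / parse_count(timestep), 4)


# (timestep, CPU time in seconds, listed average) copied from the table above
rows = [
    ("699,840", "908,675", 1.2984),
    ("673,920", "874,651", 1.2979),
    ("207,360", "263,984", 1.2731),
]

for timestep, cpu_time, listed in rows:
    computed = seconds_per_timestep(cpu_time, timestep)
    print(f"{timestep:>8} TS: computed {computed:.4f}, listed {listed:.4f}")
```

Consecutive trickles in the table are 25,920 timesteps apart, so the same division can be applied to any row to check the listed average.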