Name | hadsm3dhet2_jl5k_006590410_8 |
Workunit | 6793783 |
Created | 15 Mar 2010, 11:53:37 UTC |
Sent | 21 Oct 2010, 3:55:20 UTC |
Report deadline | 3 Oct 2011, 9:15:20 UTC |
Received | 20 Jan 2013, 1:07:12 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1108548 |
Run time | 12 days 3 hours 19 min 20 sec |
CPU time | 8 days 22 hours 40 min 20 sec |
Validate state | Invalid |
Credit | 2,778.81 |
Device peak FLOPS | 0.65 GFLOPS |
Application version | UK Met Office HadSM3 Slab Model v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - No 'heartbeat' from BOINC... No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4476, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4476, iMonCtr=1 Model crash detected, will try to restart... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5220, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1064, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5404, iMonCtr=1 Model crash detected, will try to restart... No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting No heartbeat from core client for 30 sec - exiting MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. 
MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. MainError: 03:28:24 AM No files match the supplied pattern. 
CPDN MoNo heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... NoCPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1 Model crash detected, will try to restart... CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4400, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( called boinc_finish </stderr_txt> ]]> |
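
The timestamps above show the result came back well after its report deadline. As a rough illustration only, the gap can be computed from the three dates listed in the table. The format string and the helper name `parse_utc` are assumptions made for this sketch, not part of BOINC or the CPDN site.

```python
from datetime import datetime

# Assumed layout of the timestamps as they appear on this page.
# Note: %b matches English month abbreviations in the default C locale.
FMT = "%d %b %Y, %H:%M:%S UTC"

def parse_utc(text):
    """Hypothetical helper: parse a timestamp copied from the result table."""
    return datetime.strptime(text, FMT)

sent = parse_utc("21 Oct 2010, 3:55:20 UTC")
deadline = parse_utc("3 Oct 2011, 9:15:20 UTC")
received = parse_utc("20 Jan 2013, 1:07:12 UTC")

overdue = received - deadline      # how long after the deadline the result arrived
turnaround = received - sent       # total time between send and report

print("Overdue by:", overdue)
print("Turnaround:", turnaround)
```
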
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
19 Jan 2013 07:24:50 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 43,208 | 763,330 | 2.5238 |
22 Sep 2012 08:27:53 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 32,406 | 737,641 | 2.5292 |
21 Sep 2012 22:50:38 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 21,604 | 711,725 | 2.5342 |
21 Sep 2012 12:28:39 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 10,802 | 685,821 | 2.5396 |
21 Sep 2012 03:31:27 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 259,248 | 660,165 | 2.5465 |
20 Sep 2012 18:54:40 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 248,446 | 634,396 | 2.5535 |
20 Sep 2012 07:36:27 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 237,644 | 608,405 | 2.5602 |
19 Sep 2012 23:34:12 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 226,842 | 582,546 | 2.5681 |
19 Sep 2012 12:29:02 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 216,040 | 556,562 | 2.5762 |
19 Sep 2012 05:12:16 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 205,238 | 531,561 | 2.5900 |
18 Sep 2012 21:59:26 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 194,436 | 506,343 | 2.6042 |
18 Sep 2012 07:55:18 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 183,634 | 481,082 | 2.6198 |
17 Sep 2012 19:29:58 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 172,832 | 454,959 | 2.6324 |
17 Sep 2012 06:54:46 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 162,030 | 428,758 | 2.6462 |
16 Sep 2012 16:01:22 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 151,228 | 394,522 | 2.6088 |
16 Sep 2012 04:13:29 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 140,426 | 368,395 | 2.6234 |
15 Sep 2012 14:48:55 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 129,624 | 341,772 | 2.6366 |
15 Sep 2012 03:26:21 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 118,822 | 315,794 | 2.6577 |
07 Sep 2012 07:19:35 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 108,020 | 290,155 | 2.6861 |
07 Sep 2012 00:22:31 | 1108548 | 10965835 | hadsm3dhet2_jl5k_006590410_8 | 97,218 | 264,517 | 2.7209 |
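
The Average (sec/TS) column is not simply CPU time divided by the Timestep shown in the same row once the Timestep counter drops from 259,248 back to 10,802. The listed averages are reproduced if the counter is assumed to restart after 259,248 steps while CPU time keeps accumulating. That restart interpretation is inferred from the numbers in this table, not taken from CPDN documentation, and the constant and function names below are made up for the sketch.

```python
# Assumption inferred from the table: the Timestep counter restarts after this value.
STEPS_BEFORE_RESET = 259_248

# (timestep, cpu_time_sec, listed_average, completed_resets) copied from the rows above.
rows = [
    (43_208, 763_330, 2.5238, 1),
    (32_406, 737_641, 2.5292, 1),
    (259_248, 660_165, 2.5465, 0),
    (248_446, 634_396, 2.5535, 0),
    (97_218, 264_517, 2.7209, 0),
]

def average_sec_per_ts(timestep, cpu_sec, completed_resets):
    """CPU seconds per timestep, counting steps completed before each counter restart."""
    cumulative_steps = completed_resets * STEPS_BEFORE_RESET + timestep
    return cpu_sec / cumulative_steps

for timestep, cpu_sec, listed, resets in rows:
    computed = average_sec_per_ts(timestep, cpu_sec, resets)
    print(f"{timestep:>7} steps: computed {computed:.4f}, listed {listed:.4f}")
```

Under this assumption the most recent trickle corresponds to 259,248 + 43,208 = 302,456 cumulative timesteps.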
©2024 cpdn.org