Name | hadcm3n_ldit_198012_480_010219793_0 |
Workunit | 10219793 |
Created | 4 Dec 2015, 14:39:32 UTC |
Sent | 4 Dec 2015, 14:43:36 UTC |
Report deadline | 15 Nov 2016, 20:03:36 UTC |
Received | 18 Jan 2016, 12:27:38 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1364207 |
Run time | 13 days 15 hours 14 min 58 sec |
CPU time | 13 days 7 hours 34 min 3 sec |
Validate state | Invalid |
Credit | 11,508.48 |
Device peak FLOPS | 3.65 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr |
<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
14:43:05 (10052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:48:05 (11764): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:53:14 (1696): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:58:21 (12364): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:04:42 (10904): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:08:27 (6840): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:18:38 (5800): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:23:43 (6368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:28:50 (9368): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:34:02 (13564): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:20:06 (912): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:50:18 (14232): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:58:02 (11520): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:00:27 (5100): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:46:35 (6860): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
00:57:47 (19348): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
02:53:54 (18856): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
04:09:08 (21256): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
04:20:27 (7232): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
04:24:14 (21116): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17368, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17368, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17368, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17368, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17368, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=17368, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish
</stderr_txt>
]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
18 Jan 2016 05:28:49 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 959,040 | 1,124,863 | 1.1729 |
17 Jan 2016 20:22:13 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 933,120 | 1,094,009 | 1.1724 |
17 Jan 2016 11:10:31 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 907,200 | 1,063,082 | 1.1718 |
17 Jan 2016 02:06:03 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 881,280 | 1,032,268 | 1.1713 |
16 Jan 2016 16:49:08 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 855,360 | 1,001,314 | 1.1706 |
16 Jan 2016 08:06:39 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 829,440 | 970,349 | 1.1699 |
15 Jan 2016 23:23:58 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 803,520 | 939,521 | 1.1693 |
15 Jan 2016 14:26:43 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 777,600 | 908,779 | 1.1687 |
15 Jan 2016 05:18:28 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 751,680 | 877,526 | 1.1674 |
14 Jan 2016 20:14:23 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 725,760 | 845,516 | 1.1650 |
14 Jan 2016 11:05:52 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 699,840 | 813,671 | 1.1627 |
14 Jan 2016 01:41:34 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 673,920 | 781,949 | 1.1603 |
13 Jan 2016 16:53:18 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 648,000 | 749,652 | 1.1569 |
13 Jan 2016 07:44:16 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 622,080 | 717,763 | 1.1538 |
12 Jan 2016 22:37:40 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 596,160 | 685,233 | 1.1494 |
12 Jan 2016 13:30:53 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 570,240 | 652,693 | 1.1446 |
12 Jan 2016 04:23:31 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 544,320 | 619,892 | 1.1388 |
11 Jan 2016 18:59:28 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 518,400 | 589,589 | 1.1373 |
11 Jan 2016 11:33:20 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 492,480 | 561,266 | 1.1397 |
11 Jan 2016 04:26:13 | 1364207 | 19130795 | hadcm3n_ldit_198012_480_010219793_0 | 466,560 | 535,470 | 1.1477 |
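
The Average (sec/TS) column appears to be the cumulative CPU time divided by the cumulative timestep count, rounded to four decimals. This is inferred from the figures in the table, not from CPDN documentation. A minimal Python sketch of that check follows; the function name `sec_per_timestep` is hypothetical, and the three sample rows are copied from the table above.

```python
def sec_per_timestep(cpu_time_sec: str, timestep: str) -> float:
    """Return CPU seconds per timestep, rounded to 4 decimals.

    Assumption: the reported average is cumulative CPU time / cumulative
    timestep. Inputs are the table's strings, including thousands separators.
    """
    cpu = int(cpu_time_sec.replace(",", ""))
    ts = int(timestep.replace(",", ""))
    return round(cpu / ts, 4)


# (Timestep, CPU Time (sec), reported Average (sec/TS)) taken from the table.
rows = [
    ("959,040", "1,124,863", 1.1729),
    ("518,400", "589,589", 1.1373),
    ("466,560", "535,470", 1.1477),
]

for timestep, cpu_time, reported in rows:
    computed = sec_per_timestep(cpu_time, timestep)
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

Under that assumption, the slow rise of the average from about 1.14 to 1.17 sec/TS reflects the later timesteps costing slightly more CPU time than the earlier ones.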