Name | hadcm3n_ofpq_1900_40_008475489_2 |
Workunit | 8626328 |
Created | 22 Jan 2014, 0:41:01 UTC |
Sent | 22 Jan 2014, 0:41:25 UTC |
Report deadline | 23 Apr 2014, 8:08:36 UTC |
Received | 16 Jun 2014, 22:20:46 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1227663 |
Run time | 15 days 5 hours 38 min 43 sec |
CPU time | 14 days 21 hours 44 min |
Validate state | Invalid |
Credit | 9,020.16 |
Device peak FLOPS | 2.68 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.0.28</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 21:16:53 (3932): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3772, iMonCtr=1 Model crash detected, will try to restart... 21:18:17 (3532): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:29:18 (788): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:35:44 (5736): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5132, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1 Model crash detected, will try to restart... 18:25:27 (5476): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1648, iMonCtr=1 Model crash detected, will try to restart... 21:07:08 (5652): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3688, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3688, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4852, iMonCtr=1 Model crash detected, will try to restart... 
19:58:13 (4904): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:12:27 (5600): No heartbeat from core client for 30 sec - exiting 15:12:28 (5600): No heartbeat from core client for 30 sec - exiting 15:12:29 (5600): No heartbeat from core client for 30 sec - exiting 15:12:30 (5600): No heartbeat from core client for 30 sec - exiting 15:12:31 (5600): No heartbeat from core client for 30 sec - exiting 15:12:32 (5600): No heartbeat from core client for 30 sec - exiting 15:12:33 (5600): No heartbeat from core client for 30 sec - exiting 15:12:34 (5600): No heartbeat from core client for 30 sec - exiting 15:12:35 (5600): No heartbeat from core client for 30 sec - exiting 15:12:36 (5600): No heartbeat from core client for 30 sec - exiting 15:12:37 (5600): No heartbeat from core client for 30 sec - exiting 15:12:38 (5600): No heartbeat from core client for 30 sec - exiting 15:12:39 (5600): No heartbeat from core client for 30 sec - exiting 15:12:40 (5600): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3772, iMonCtr=1 Model crash detected, will try to restart... 
15:35:58 (4004): No heartbeat from core client for 30 sec - exiting 15:36:00 (4004): No heartbeat from core client for 30 sec - exiting 15:36:01 (4004): No heartbeat from core client for 30 sec - exiting 15:36:02 (4004): No heartbeat from core client for 30 sec - exiting 15:36:03 (4004): No heartbeat from core client for 30 sec - exiting 15:36:04 (4004): No heartbeat from core client for 30 sec - exiting 15:36:05 (4004): No heartbeat from core client for 30 sec - exiting 15:36:06 (4004): No heartbeat from core client for 30 sec - exiting 15:36:07 (4004): No heartbeat from core client for 30 sec - exiting 15:36:08 (4004): No heartbeat from core client for 30 sec - exiting 15:36:09 (4004): No heartbeat from core client for 30 sec - exiting 15:36:10 (4004): No heartbeat from core client for 30 sec - exiting 15:36:11 (4004): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 19:47:30 (3356): No heartbeat from core client for 30 sec - exiting 19:47:31 (3356): No heartbeat from core client for 30 sec - exiting 19:47:32 (3356): No heartbeat from core client for 30 sec - exiting 19:47:33 (3356): No heartbeat from core client for 30 sec - exiting 19:47:34 (3356): No heartbeat from core client for 30 sec - exiting 19:47:35 (3356): No heartbeat from core client for 30 sec - exiting 19:47:36 (3356): No heartbeat from core client for 30 sec - exiting 19:47:37 (3356): No heartbeat from core client for 30 sec - exiting 19:47:38 (3356): No heartbeat from core client for 30 sec - exiting 19:47:39 (3356): No heartbeat from core client for 30 sec - exiting 19:47:40 (3356): No heartbeat from core client for 30 sec - exiting 19:47:41 (3356): No heartbeat from core client for 30 sec - exiting 19:47:42 (3356): No heartbeat from core client for 30 sec - exiting 19:47:43 (3356): No heartbeat from core client for 30 sec - exiting 19:47:44 (3356): No heartbeat from core client for 30 sec - exiting 19:47:45 (3356): No heartbeat from core client for 
30 sec - exiting 19:47:46 (3356): No heartbeat from core client for 30 sec - exiting 19:47:47 (3356): No heartbeat from core client for 30 sec - exiting 19:47:48 (3356): No heartbeat from core client for 30 sec - exiting 19:47:49 (3356): No heartbeat from core client for 30 sec - exiting 19:47:50 (3356): No heartbeat from core client for 30 sec - exiting 19:47:51 (3356): No heartbeat from core client for 30 sec - exiting 19:47:52 (3356): No heartbeat from core client for 30 sec - exiting 19:47:53 (3356): No heartbeat from core client for 30 sec - exiting 19:47:54 (3356): No heartbeat from core client for 30 sec - exiting 19:47:55 (3356): No heartbeat from core client for 30 sec - exiting 19:47:56 (3356): No heartbeat from core client for 30 sec - exiting 19:47:57 (3356): No heartbeat from core client for 30 sec - exiting 19:47:58 (3356): No heartbeat from core client for 30 sec - exiting 19:47:59 (3356): No heartbeat from core client for 30 sec - exiting 19:48:00 (3356): No heartbeat from core client for 30 sec - exiting 19:48:01 (3356): No heartbeat from core client for 30 sec - exiting 19:48:02 (3356): No heartbeat from core client for 30 sec - exiting 19:48:03 (3356): No heartbeat from core client for 30 sec - exiting 19:48:04 (3356): No heartbeat from core client for 30 sec - exiting 19:48:05 (3356): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4264, iMonCtr=1 Model crash detected, will try to restart... 
C19:51:35 (4324): No heartbeat from core client for 30 sec - exiting 19:51:37 (4324): No heartbeat from core client for 30 sec - exiting 19:51:38 (4324): No heartbeat from core client for 30 sec - exiting 19:51:39 (4324): No heartbeat from core client for 30 sec - exiting 19:51:40 (4324): No heartbeat from core client for 30 sec - exiting 19:51:41 (4324): No heartbeat from core client for 30 sec - exiting 19:51:42 (4324): No heartbeat from core client for 30 sec - exiting 19:51:43 (4324): No heartbeat from core client for 30 sec - exiting 19:51:44 (4324): No heartbeat from core client for 30 sec - exiting 19:51:45 (4324): No heartbeat from core client for 30 sec - exiting 19:51:46 (4324): No heartbeat from core client for 30 sec - exiting 19:51:47 (4324): No heartbeat from core client for 30 sec - exiting 19:51:48 (4324): No heartbeat from core client for 30 sec - exiting 19:51:49 (4324): No heartbeat from core client for 30 sec - exiting 19:51:50 (4324): No heartbeat from core client for 30 sec - exiting 19:51:51 (4324): No heartbeat from core client for 30 sec - exiting 19:51:52 (4324): No heartbeat from core client for 30 sec - exiting 19:51:53 (4324): No heartbeat from core client for 30 sec - exiting 19:51:54 (4324): No heartbeat from core client for 30 sec - exiting 19:51:55 (4324): No heartbeat from core client for 30 sec - exiting 19:51:56 (4324): No heartbeat from core client for 30 sec - exiting 19:51:57 (4324): No heartbeat from core client for 30 sec - exiting 19:51:58 (4324): No heartbeat from core client for 30 sec - exiting 19:51:59 (4324): No heartbeat from core client for 30 sec - exiting 19:52:00 (4324): No heartbeat from core client for 30 sec - exiting 19:52:01 (4324): No heartbeat from core client for 30 sec - exiting 19:52:02 (4324): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
19:52:03 (4324): No heartbeat from core client for 30 sec - exiting Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6004, iMonCtr=1 Model crash detected, will try to restart... 20:46:48 (1776): No heartbeat from core client for 30 sec - exiting 20:46:49 (1776): No heartbeat from core client for 30 sec - exiting 20:46:50 (1776): No heartbeat from core client for 30 sec - exiting 20:46:51 (1776): No heartbeat from core client for 30 sec - exiting 20:46:52 (1776): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:51:53 (4564): No heartbeat from core client for 30 sec - exiting 18:51:54 (4564): No heartbeat from core client for 30 sec - exiting 18:51:55 (4564): No heartbeat from core client for 30 sec - exiting 18:51:56 (4564): No heartbeat from core client for 30 sec - exiting 18:51:57 (4564): No heartbeat from core client for 30 sec - exiting 18:51:58 (4564): No heartbeat from core client for 30 sec - exiting 18:51:59 (4564): No heartbeat from core client for 30 sec - exiting 18:52:00 (4564): No heartbeat from core client for 30 sec - exiting 18:52:01 (4564): No heartbeat from core client for 30 sec - exiting 18:52:02 (4564): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4204, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4624, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=116, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4784, iMonCtr=1 Model crash detected, will try to restart... 18:31:07 (3712): No heartbeat from core client for 30 sec - exiting 18:31:09 (3712): No heartbeat from core client for 30 sec - exiting 18:31:10 (3712): No heartbeat from core client for 30 sec - exiting 18:31:11 (3712): No heartbeat from core client for 30 sec - exiting 18:31:12 (3712): No heartbeat from core client for 30 sec - exiting 18:31:13 (3712): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 18:29:14 (4024): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 21:19:49 (5304): No heartbeat from core client for 30 sec - exiting 21:19:51 (5304): No heartbeat from core client for 30 sec - exiting 21:19:52 (5304): No heartbeat from core client for 30 sec - exiting 21:19:53 (5304): No heartbeat from core client for 30 sec - exiting 21:19:54 (5304): No heartbeat from core client for 30 sec - exiting 21:19:55 (5304): No heartbeat from core client for 30 sec - exiting 21:19:56 (5304): No heartbeat from core client for 30 sec - exiting 21:19:57 (5304): No heartbeat from core client for 30 sec - exiting 21:19:58 (5304): No heartbeat from core client for 30 sec - exiting 21:19:59 (5304): No heartbeat from core client for 30 sec - exiting 21:20:00 (5304): No heartbeat from core client for 30 sec - exiting 21:20:01 (5304): No heartbeat from core client for 30 sec - exiting 21:20:02 (5304): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4460, iMonCtr=1 Model crash detected, will try to restart... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4500, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=2444, iMonCtr=1 Model crash detected, will try to restart... Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
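
The Stderr field above repeats a small number of message types many times, so a tally can make the sequence easier to read. The following is a minimal sketch, not part of the CPDN or BOINC tooling: the label names in `PATTERNS` and the `tally` helper are hypothetical, and the `excerpt` string is a short sample copied from the log above rather than the full text.

```python
import re
from collections import Counter

# Short excerpt copied from the Stderr field above; the real log is much longer.
excerpt = (
    "21:16:53 (3932): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "Controller:: CPDN process is not running, exiting, bRetVal = 1, "
    "checkPID=0, selfPID=3772, iMonCtr=1 "
    "Model crash detected, will try to restart... "
    "Model crashed: INITDUMP: Wrong no of ocean prognostic fields tmp/pipe_dummy 2048 "
    "Sorry, too many model crashes! :-( Called boinc_finish"
)

# Hypothetical labels (assumed for illustration) mapped to literal
# substrings that appear in the log text.
PATTERNS = {
    "heartbeat_lost": r"No heartbeat from core client",
    "restart_attempt": r"Model crash detected, will try to restart",
    "initdump_error": r"INITDUMP: Wrong no of ocean prognostic fields",
}


def tally(log_text):
    """Count non-overlapping occurrences of each known message type."""
    return Counter(
        {name: len(re.findall(pattern, log_text)) for name, pattern in PATTERNS.items()}
    )


counts = tally(excerpt)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Pasting the full Stderr text into `excerpt` would give counts for the whole run; the sketch only counts messages and does not attempt to explain why they occurred.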
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
11 Jun 2014 00:51:13 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 751,680 | 1,278,217 | 1.7005 |
10 Jun 2014 09:02:11 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 725,760 | 1,238,593 | 1.7066 |
10 Jun 2014 06:02:01 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 699,840 | 1,194,635 | 1.7070 |
02 Jun 2014 21:44:24 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 673,920 | 1,149,400 | 1.7055 |
23 May 2014 02:39:19 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 648,000 | 1,102,720 | 1.7017 |
19 May 2014 21:49:09 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 622,080 | 1,056,530 | 1.6984 |
17 May 2014 20:31:15 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 596,160 | 1,011,538 | 1.6968 |
07 May 2014 23:24:42 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 570,240 | 966,613 | 1.6951 |
28 Apr 2014 23:49:05 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 544,320 | 922,170 | 1.6942 |
15 Apr 2014 01:47:46 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 518,400 | 883,706 | 1.7047 |
11 Apr 2014 00:16:37 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 492,480 | 840,478 | 1.7066 |
07 Apr 2014 23:33:51 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 466,560 | 794,427 | 1.7027 |
01 Apr 2014 04:49:16 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 440,640 | 750,016 | 1.7021 |
30 Mar 2014 21:13:32 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 414,720 | 706,123 | 1.7026 |
30 Mar 2014 09:22:05 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 388,800 | 663,785 | 1.7073 |
29 Mar 2014 21:30:04 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 362,880 | 621,237 | 1.7120 |
29 Mar 2014 09:21:05 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 336,960 | 577,801 | 1.7147 |
28 Mar 2014 00:51:00 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 311,040 | 533,606 | 1.7156 |
19 Mar 2014 23:54:05 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 285,120 | 487,755 | 1.7107 |
11 Mar 2014 00:32:04 | 1227663 | 16274831 | hadcm3n_ofpq_1900_40_008475489_2 | 259,200 | 442,589 | 1.7075 |
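
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the cumulative timestep count. The sketch below checks that reading against three rows copied from the table; the function name `sec_per_timestep` is an assumption made for illustration, not something taken from the CPDN site.

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average)
rows = [
    (751_680, 1_278_217, 1.7005),
    (725_760, 1_238_593, 1.7066),
    (259_200, 442_589, 1.7075),
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by cumulative timesteps."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in rows:
    computed = round(sec_per_timestep(cpu_time_sec, timestep), 4)
    print(f"timestep={timestep} computed={computed} reported={reported}")
```

For these three rows the computed value, rounded to four decimal places, agrees with the reported average. Consecutive trickles in the table are also 25,920 timesteps apart.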
©2024 cpdn.org