Name | hadcm3n_o4b6_1980_40_008384955_4 |
Workunit | 8535814 |
Created | 7 Jan 2014, 23:33:02 UTC |
Sent | 7 Jan 2014, 23:33:15 UTC |
Report deadline | 9 Apr 2014, 7:00:26 UTC |
Received | 9 Jan 2014, 6:14:07 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1244729 |
Run time | 1 day 4 hours 47 min 10 sec |
CPU time | 1 day 2 hours 21 min 16 sec |
Validate state | Invalid |
Credit | 933.12 |
Device peak FLOPS | 3.30 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>7.2.33</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 23:40:19 (14964): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 23:40:21 (14964): No heartbeat from core client for 30 sec - exiting 23:40:22 (14964): No heartbeat from core client for 30 sec - exiting 02:46:01 (15436): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 02:46:02 (15436): No heartbeat from core client for 30 sec - exiting 02:46:03 (15436): No heartbeat from core client for 30 sec - exiting 02:46:04 (15436): No heartbeat from core client for 30 sec - exiting 02:46:05 (15436): No heartbeat from core client for 30 sec - exiting 02:46:06 (15436): No heartbeat from core client for 30 sec - exiting 02:46:07 (15436): No heartbeat from core client for 30 sec - exiting 02:46:08 (15436): No heartbeat from core client for 30 sec - exiting 02:46:09 (15436): No heartbeat from core client for 30 sec - exiting 02:46:10 (15436): No heartbeat from core client for 30 sec - exiting 02:46:11 (15436): No heartbeat from core client for 30 sec - exiting 03:12:23 (19380): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
03:12:24 (19380): No heartbeat from core client for 30 sec - exiting 03:12:25 (19380): No heartbeat from core client for 30 sec - exiting 03:12:26 (19380): No heartbeat from core client for 30 sec - exiting 03:12:27 (19380): No heartbeat from core client for 30 sec - exiting 03:12:28 (19380): No heartbeat from core client for 30 sec - exiting 03:12:29 (19380): No heartbeat from core client for 30 sec - exiting 03:12:30 (19380): No heartbeat from core client for 30 sec - exiting 03:12:31 (19380): No heartbeat from core client for 30 sec - exiting 03:12:32 (19380): No heartbeat from core client for 30 sec - exiting 03:12:33 (19380): No heartbeat from core client for 30 sec - exiting 03:42:01 (5784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 03:42:06 (5784): No heartbeat from core client for 30 sec - exiting 03:42:07 (5784): No heartbeat from core client for 30 sec - exiting 03:42:08 (5784): No heartbeat from core client for 30 sec - exiting 03:42:09 (5784): No heartbeat from core client for 30 sec - exiting 03:42:10 (5784): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on atmos_restart.hold 03:46:20 (19376): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
03:46:21 (19376): No heartbeat from core client for 30 sec - exiting 03:46:22 (19376): No heartbeat from core client for 30 sec - exiting 03:46:23 (19376): No heartbeat from core client for 30 sec - exiting 03:46:24 (19376): No heartbeat from core client for 30 sec - exiting 03:46:25 (19376): No heartbeat from core client for 30 sec - exiting 03:46:26 (19376): No heartbeat from core client for 30 sec - exiting 03:46:27 (19376): No heartbeat from core client for 30 sec - exiting 03:46:28 (19376): No heartbeat from core client for 30 sec - exiting 03:46:29 (19376): No heartbeat from core client for 30 sec - exiting 03:46:30 (19376): No heartbeat from core client for 30 sec - exiting 08:47:34 (12832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 08:47:36 (12832): No heartbeat from core client for 30 sec - exiting 08:47:37 (12832): No heartbeat from core client for 30 sec - exiting 08:47:38 (12832): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 09:47:14 (8176): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
09:47:24 (8176): No heartbeat from core client for 30 sec - exiting 09:47:25 (8176): No heartbeat from core client for 30 sec - exiting 09:47:26 (8176): No heartbeat from core client for 30 sec - exiting 09:47:27 (8176): No heartbeat from core client for 30 sec - exiting 09:47:28 (8176): No heartbeat from core client for 30 sec - exiting 09:47:29 (8176): No heartbeat from core client for 30 sec - exiting 09:47:30 (8176): No heartbeat from core client for 30 sec - exiting 09:47:31 (8176): No heartbeat from core client for 30 sec - exiting 09:47:32 (8176): No heartbeat from core client for 30 sec - exiting 09:47:33 (8176): No heartbeat from core client for 30 sec - exiting 09:47:34 (8176): No heartbeat from core client for 30 sec - exiting 09:47:35 (8176): No heartbeat from core client for 30 sec - exiting 09:47:36 (8176): No heartbeat from core client for 30 sec - exiting 09:47:37 (8176): No heartbeat from core client for 30 sec - exiting 09:47:38 (8176): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 09:49:31 (16052): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Atmos Hold Restart file rename failed on atmos_restart.hold 16:53:55 (15612): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 19:28:43 (21556): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
19:28:44 (21556): No heartbeat from core client for 30 sec - exiting 19:28:45 (21556): No heartbeat from core client for 30 sec - exiting 19:28:46 (21556): No heartbeat from core client for 30 sec - exiting 19:28:47 (21556): No heartbeat from core client for 30 sec - exiting 19:28:48 (21556): No heartbeat from core client for 30 sec - exiting 19:28:49 (21556): No heartbeat from core client for 30 sec - exiting 19:28:50 (21556): No heartbeat from core client for 30 sec - exiting 19:28:51 (21556): No heartbeat from core client for 30 sec - exiting 19:28:52 (21556): No heartbeat from core client for 30 sec - exiting 19:28:53 (21556): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on atmos_restart.hold Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... 
Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5744, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
08 Jan 2014 22:54:20 | 1244729 | 16203211 | hadcm3n_o4b6_1980_40_008384955_4 | 77,760 | 75,055 | 0.9652 |
08 Jan 2014 13:47:50 | 1244729 | 16203211 | hadcm3n_o4b6_1980_40_008384955_4 | 51,840 | 47,888 | 0.9238 |
08 Jan 2014 06:40:12 | 1244729 | 16203211 | hadcm3n_o4b6_1980_40_008384955_4 | 25,920 | 22,834 | 0.8809 |
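
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count at each trickle. The sketch below checks that reading against the three rows listed; the function name `sec_per_timestep` and the 4-decimal rounding are assumptions made for illustration, not part of the CPDN server code.

```python
# Rows copied from the trickle table above:
# (timestep, cpu_time_sec, reported_average_sec_per_ts)
trickles = [
    (77_760, 75_055, 0.9652),
    (51_840, 47_888, 0.9238),
    (25_920, 22_834, 0.8809),
]


def sec_per_timestep(timestep: int, cpu_time_sec: int) -> float:
    """Cumulative CPU seconds per model timestep, rounded to 4 decimals.

    Hypothetical helper: the rounding precision is assumed from the
    number of digits shown in the table.
    """
    return round(cpu_time_sec / timestep, 4)


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(timestep, cpu_time_sec)
    # Print the computed value next to the reported one for comparison.
    print(f"timestep={timestep:>6}  computed={computed:.4f}  reported={reported:.4f}")
```

Under this reading, the rising average (0.8809, then 0.9238, then 0.9652) means that each successive block of 25,920 timesteps used more CPU time than the one before it.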