Name | hadcm3n_y7vm_1900_40_007522966_0 |
Workunit | 7720441 |
Created | 28 Oct 2011, 13:20:27 UTC |
Sent | 31 Oct 2011, 21:17:13 UTC |
Report deadline | 31 Jan 2012, 4:44:24 UTC |
Received | 15 Nov 2011, 16:40:08 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1122757 |
Run time | 5 days 4 hours 8 min 42 sec |
CPU time | 5 days 1 hours 58 min 37 sec |
Validate state | Invalid |
Credit | 1,555.20 |
Device peak FLOPS | 1.70 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.12.34</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 20:15:37 (5008): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:30:46 (5436): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:58:11 (5000): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:58:19 (5000): No heartbeat from core client for 30 sec - exiting 20:58:20 (5000): No heartbeat from core client for 30 sec - exiting 20:58:21 (5000): No heartbeat from core client for 30 sec - exiting 20:58:22 (5000): No heartbeat from core client for 30 sec - exiting 20:58:23 (5000): No heartbeat from core client for 30 sec - exiting 20:58:24 (5000): No heartbeat from core client for 30 sec - exiting 20:58:26 (5000): No heartbeat from core client for 30 sec - exiting 20:58:27 (5000): No heartbeat from core client for 30 sec - exiting 20:58:28 (5000): No heartbeat from core client for 30 sec - exiting 20:58:29 (5000): No heartbeat from core client for 30 sec - exiting 23:03:18 (4792): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
23:03:51 (4792): No heartbeat from core client for 30 sec - exiting 23:03:52 (4792): No heartbeat from core client for 30 sec - exiting 23:03:54 (4792): No heartbeat from core client for 30 sec - exiting 23:03:55 (4792): No heartbeat from core client for 30 sec - exiting 23:03:56 (4792): No heartbeat from core client for 30 sec - exiting 23:03:57 (4792): No heartbeat from core client for 30 sec - exiting 23:03:58 (4792): No heartbeat from core client for 30 sec - exiting 23:03:59 (4792): No heartbeat from core client for 30 sec - exiting 23:04:00 (4792): No heartbeat from core client for 30 sec - exiting 23:04:01 (4792): No heartbeat from core client for 30 sec - exiting 23:04:02 (4792): No heartbeat from core client for 30 sec - exiting 23:04:03 (4792): No heartbeat from core client for 30 sec - exiting 23:04:04 (4792): No heartbeat from core client for 30 sec - exiting 23:04:06 (4792): No heartbeat from core client for 30 sec - exiting 23:04:07 (4792): No heartbeat from core client for 30 sec - exiting 23:04:08 (4792): No heartbeat from core client for 30 sec - exiting 23:04:09 (4792): No heartbeat from core client for 30 sec - exiting 23:04:10 (4792): No heartbeat from core client for 30 sec - exiting 23:04:11 (4792): No heartbeat from core client for 30 sec - exiting 23:04:12 (4792): No heartbeat from core client for 30 sec - exiting 23:04:13 (4792): No heartbeat from core client for 30 sec - exiting 23:04:14 (4792): No heartbeat from core client for 30 sec - exiting 23:04:15 (4792): No heartbeat from core client for 30 sec - exiting 23:04:16 (4792): No heartbeat from core client for 30 sec - exiting 23:04:18 (4792): No heartbeat from core client for 30 sec - exiting 23:04:19 (4792): No heartbeat from core client for 30 sec - exiting 23:04:20 (4792): No heartbeat from core client for 30 sec - exiting 23:04:21 (4792): No heartbeat from core client for 30 sec - exiting 23:04:22 (4792): No heartbeat from core client for 30 sec - exiting 23:04:23 (4792): No 
heartbeat from core client for 30 sec - exiting 23:04:24 (4792): No heartbeat from core client for 30 sec - exiting 16:35:33 (3680): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 16:35:48 (3680): No heartbeat from core client for 30 sec - exiting 16:35:49 (3680): No heartbeat from core client for 30 sec - exiting 16:35:50 (3680): No heartbeat from core client for 30 sec - exiting 16:35:51 (3680): No heartbeat from core client for 30 sec - exiting 16:35:52 (3680): No heartbeat from core client for 30 sec - exiting 16:35:54 (3680): No heartbeat from core client for 30 sec - exiting 16:35:55 (3680): No heartbeat from core client for 30 sec - exiting 16:35:56 (3680): No heartbeat from core client for 30 sec - exiting 16:35:57 (3680): No heartbeat from core client for 30 sec - exiting 16:35:58 (3680): No heartbeat from core client for 30 sec - exiting 16:35:59 (3680): No heartbeat from core client for 30 sec - exiting 16:36:00 (3680): No heartbeat from core client for 30 sec - exiting 16:36:05 (3680): No heartbeat from core client for 30 sec - exiting 16:36:06 (3680): No heartbeat from core client for 30 sec - exiting 16:36:07 (3680): No heartbeat from core client for 30 sec - exiting 16:36:09 (3680): No heartbeat from core client for 30 sec - exiting 16:36:10 (3680): No heartbeat from core client for 30 sec - exiting 16:36:11 (3680): No heartbeat from core client for 30 sec - exiting 16:36:12 (3680): No heartbeat from core client for 30 sec - exiting 16:36:13 (3680): No heartbeat from core client for 30 sec - exiting 16:36:14 (3680): No heartbeat from core client for 30 sec - exiting 16:36:15 (3680): No heartbeat from core client for 30 sec - exiting 16:36:16 (3680): No heartbeat from core client for 30 sec - exiting 16:36:17 (3680): No heartbeat from core client for 30 sec - exiting 16:36:18 (3680): No heartbeat from core client for 30 sec - exiting 16:36:19 (3680): No heartbeat from core client for 30 sec - exiting 
16:36:21 (3680): No heartbeat from core client for 30 sec - exiting 16:36:22 (3680): No heartbeat from core client for 30 sec - exiting 16:36:23 (3680): No heartbeat from core client for 30 sec - exiting 16:36:24 (3680): No heartbeat from core client for 30 sec - exiting 16:36:25 (3680): No heartbeat from core client for 30 sec - exiting 16:36:26 (3680): No heartbeat from core client for 30 sec - exiting 16:36:27 (3680): No heartbeat from core client for 30 sec - exiting 16:36:28 (3680): No heartbeat from core client for 30 sec - exiting 16:36:29 (3680): No heartbeat from core client for 30 sec - exiting 16:36:30 (3680): No heartbeat from core client for 30 sec - exiting 16:36:31 (3680): No heartbeat from core client for 30 sec - exiting 16:36:33 (3680): No heartbeat from core client for 30 sec - exiting 16:36:34 (3680): No heartbeat from core client for 30 sec - exiting 16:36:35 (3680): No heartbeat from core client for 30 sec - exiting 16:36:36 (3680): No heartbeat from core client for 30 sec - exiting Atmos Hold Restart file rename failed on atmos_restart.hold 02:30:41 (4544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7136, iMonCtr=1 Model crash detected, will try to restart... 
02:37:45 (7136): No heartbeat from core client for 30 sec - exiting 02:37:46 (7136): No heartbeat from core client for 30 sec - exiting 02:37:47 (7136): No heartbeat from core client for 30 sec - exiting 02:37:48 (7136): No heartbeat from core client for 30 sec - exiting 02:37:49 (7136): No heartbeat from core client for 30 sec - exiting 02:37:50 (7136): No heartbeat from core client for 30 sec - exiting 02:37:51 (7136): No heartbeat from core client for 30 sec - exiting 02:37:52 (7136): No heartbeat from core client for 30 sec - exiting 02:37:53 (7136): No heartbeat from core client for 30 sec - exiting 02:37:54 (7136): No heartbeat from core client for 30 sec - exiting Signal 22 received, exiting... Called boinc_finish CPDN Monitor - No 'heartbeat' from BOINC... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6472, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6472, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6472, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6472, iMonCtr=1 Model crash detected, will try to restart... Signal 22 received, exiting... Called boinc_finish Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=6472, iMonCtr=1 Model crash detected, will try to restart... Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |
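The stderr output above is dominated by repeated "No heartbeat from core client" messages from a handful of process IDs. A minimal sketch of how such a log could be summarised per process follows. The function name `count_heartbeat_timeouts` and the `sample` string are illustrative assumptions; the sample is a short excerpt copied from the log above, not the full output.

```python
import re
from collections import Counter

# Matches entries such as:
#   20:15:37 (5008): No heartbeat from core client for 30 sec - exiting
HEARTBEAT_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}) \((\d+)\): No heartbeat from core client for 30 sec - exiting"
)

def count_heartbeat_timeouts(stderr_text):
    """Return a Counter mapping process ID -> number of heartbeat timeout messages."""
    return Counter(int(pid) for _time, pid in HEARTBEAT_RE.findall(stderr_text))

# Abbreviated excerpt of the stderr text above (illustrative, not the full log).
sample = (
    "20:15:37 (5008): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "20:30:46 (5436): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "20:58:11 (5000): No heartbeat from core client for 30 sec - exiting "
    "CPDN Monitor - No 'heartbeat' from BOINC... "
    "20:58:19 (5000): No heartbeat from core client for 30 sec - exiting "
)

counts = count_heartbeat_timeouts(sample)
for pid, n in sorted(counts.items()):
    print(f"PID {pid}: {n} heartbeat timeout message(s)")

# The reported exit status pairs decimal 22 with hexadecimal 0x00000016.
exit_status = int("0x00000016", 16)
```

Run against the complete stderr text, the same function would show which process IDs logged the longest bursts of timeouts before the final "too many model crashes" exit.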
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
15 Nov 2011 16:50:22 | 1122757 | 13551626 | hadcm3n_y7vm_1900_40_007522966_0 | 129,600 | 393,003 | 3.0324 |
15 Nov 2011 16:50:22 | 1122757 | 13551626 | hadcm3n_y7vm_1900_40_007522966_0 | 103,680 | 313,246 | 3.0213 |
15 Nov 2011 16:50:22 | 1122757 | 13551626 | hadcm3n_y7vm_1900_40_007522966_0 | 77,760 | 234,800 | 3.0195 |
10 Nov 2011 02:07:03 | 1122757 | 13551626 | hadcm3n_y7vm_1900_40_007522966_0 | 51,840 | 156,879 | 3.0262 |
09 Nov 2011 03:15:10 | 1122757 | 13551626 | hadcm3n_y7vm_1900_40_007522966_0 | 25,920 | 78,760 | 3.0386 |
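
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that assumption against the five rows of the trickle table; the variable and function names are illustrative, and the values are copied from the table.

```python
# (timestep, cpu_time_sec, reported_average) copied from the trickle table.
trickles = [
    (129_600, 393_003, 3.0324),
    (103_680, 313_246, 3.0213),
    (77_760, 234_800, 3.0195),
    (51_840, 156_879, 3.0262),
    (25_920, 78_760, 3.0386),
]

def sec_per_timestep(cpu_time_sec, timestep):
    """Average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep

# Recompute the average for each row and compare it with the reported value.
computed = [sec_per_timestep(cpu, ts) for ts, cpu, _reported in trickles]

for (ts, cpu, reported), value in zip(trickles, computed):
    print(f"timestep {ts:>7,d}: computed {value:.4f}, reported {reported:.4f}")
```

Under this assumption every recomputed value agrees with the reported figure to four decimal places, which suggests the host sustained a steady rate of roughly 3.02 to 3.04 CPU seconds per timestep up to the last trickle.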