Name | hadcm3n_t4sr_1940_40_007448801_2 |
Workunit | 7646304 |
Created | 18 Sep 2011, 7:25:35 UTC |
Sent | 18 Sep 2011, 7:30:01 UTC |
Report deadline | 18 Dec 2011, 14:57:12 UTC |
Received | 13 Oct 2011, 16:47:57 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 22 (0x00000016) Unknown error code |
Computer ID | 1040277 |
Run time | 3 days 0 hours 15 min 52 sec |
CPU time | 2 days 9 hours 35 min 42 sec |
Validate state | Invalid |
Credit | 622.08 |
Device peak FLOPS | 1.77 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 windows_intelx86 |
Stderr | <core_client_version>6.10.18</core_client_version> <![CDATA[ <message> The device does not recognize the command. (0x16) - exit code 22 (0x16) </message> <stderr_txt> 14:07:27 (4944): No heartbeat from core client for 30 sec - exiting 14:07:28 (4944): No heartbeat from core client for 30 sec - exiting 14:07:29 (4944): No heartbeat from core client for 30 sec - exiting 14:07:30 (4944): No heartbeat from core client for 30 sec - exiting 14:07:31 (4944): No heartbeat from core client for 30 sec - exiting 14:07:33 (4944): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:49:47 (5016): No heartbeat from core client for 30 sec - exiting 10:49:50 (5016): No heartbeat from core client for 30 sec - exiting 10:49:51 (5016): No heartbeat from core client for 30 sec - exiting 10:49:53 (5016): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4796, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4796, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
C09:52:29 (4832): No heartbeat from core client for 30 sec - exiting 09:52:30 (4832): No heartbeat from core client for 30 sec - exiting 09:52:31 (4832): No heartbeat from core client for 30 sec - exiting 09:52:33 (4832): No heartbeat from core client for 30 sec - exiting 09:52:34 (4832): No heartbeat from core client for 30 sec - exiting 09:52:37 (4832): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5412, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 11:16:48 (4812): No heartbeat from core client for 30 sec - exiting 11:16:49 (4812): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5808, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:09:54 (5104): No heartbeat from core client for 30 sec - exiting 10:09:59 (5104): No heartbeat from core client for 30 sec - exiting 10:10:00 (5104): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
18:10:36 (4920): No heartbeat from core client for 30 sec - exiting 18:10:40 (4920): No heartbeat from core client for 30 sec - exiting 18:10:41 (4920): No heartbeat from core client for 30 sec - exiting 18:10:42 (4920): No heartbeat from core client for 30 sec - exiting 18:10:43 (4920): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 20:36:47 (4768): No heartbeat from core client for 30 sec - exiting 20:36:54 (4768): No heartbeat from core client for 30 sec - exiting 20:36:57 (4768): No heartbeat from core client for 30 sec - exiting 20:36:58 (4768): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 17:50:57 (5028): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 15:24:26 (4784): No heartbeat from core client for 30 sec - exiting 15:24:28 (4784): No heartbeat from core client for 30 sec - exiting 15:24:29 (4784): No heartbeat from core client for 30 sec - exiting 15:24:31 (4784): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITHEAD: I/O error tmp/pipe_dummy 2048 forrtl: There is not enough space on the disk. Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5928, iMonCtr=1 Model crash detected, will try to restart... 
BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITHEAD: I/O error tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: C I/O Error - Return code = 32 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 Sorry, too many model crashes! :-( Called boinc_finish </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
08 Oct 2011 20:09:57 | 1040277 | 13395555 | hadcm3n_t4sr_1940_40_007448801_2 | 51,840 | 168,697 | 3.2542 |
29 Sep 2011 10:05:13 | 1040277 | 13395555 | hadcm3n_t4sr_1940_40_007448801_2 | 25,920 | 84,531 | 3.2612 |
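
The "Average (sec/TS)" column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that reading against the two trickle rows above. The function name `seconds_per_timestep` and the four-decimal rounding are assumptions made for illustration, not taken from the CPDN server code.

```python
def seconds_per_timestep(cpu_seconds: int, timesteps: int) -> float:
    """Average CPU seconds spent per model timestep, rounded to 4 decimals.

    Hypothetical helper: assumes the table's average is simply
    CPU time / timestep, which matches the values shown.
    """
    return round(cpu_seconds / timesteps, 4)


# (timestep, cpu_time_sec) pairs copied from the trickle table
trickles = [
    (51840, 168697),  # 08 Oct 2011 trickle
    (25920, 84531),   # 29 Sep 2011 trickle
]

averages = [seconds_per_timestep(cpu, ts) for ts, cpu in trickles]

for (ts, cpu), avg in zip(trickles, averages):
    print(f"timestep {ts:>6,}  cpu {cpu:>7,} s  average {avg} s/TS")
```

Both computed values agree with the table (3.2542 and 3.2612 sec/TS), so the per-timestep cost stayed essentially constant between the two trickles.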