climateprediction.net home page
Task 10999038

Task 10999038

Name hadsm3dhet2_jnpt_006593731_1
Workunit 6797104
Created 15 Mar 2010, 11:59:09 UTC
Sent 12 Oct 2010, 7:33:13 UTC
Report deadline 24 Sep 2011, 12:53:13 UTC
Received 26 Oct 2010, 10:30:05 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1102538
Run time 7 days 1 hours 45 min 13 sec
CPU time 3 days 21 hours 10 min 21 sec
Validate state Invalid
Credit 793.95
Device peak FLOPS 1.15 GFLOPS
Application version UK Met Office HadSM3 Slab Model v6.07
windows_intelx86
Stderr
<core_client_version>6.10.43</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1316, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3648, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=3984, iMonCtr=1
Model crash detected, will try to restart...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
No heartbeat from core client for 30 sec - exiting
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1664, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1664, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1664, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1664, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1664, iMonCtr=1
Model crash detected, will try to restart...
CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1664, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
called boinc_finish

</stderr_txt>
]]>
Latest Trickles Received
Time Sent (UTC) Host ID Result ID Result Name Timestep CPU Time (sec) Average (sec/TS)
26 Oct 2010 03:26:20 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 86,416 313,495 3.6277
22 Oct 2010 19:00:25 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 75,614 275,442 3.6427
21 Oct 2010 15:33:32 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 64,812 238,311 3.6770
20 Oct 2010 13:37:59 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 54,010 196,884 3.6453
19 Oct 2010 06:08:59 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 43,208 156,162 3.6142
18 Oct 2010 17:03:28 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 32,406 117,166 3.6156
13 Oct 2010 14:46:39 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 21,604 77,271 3.5767
12 Oct 2010 22:34:41 1102538 10999038 hadsm3dhet2_jnpt_006593731_1 10,802 38,225 3.5387


©2024 climateprediction.net