Name | hadam3p_eu_xcm6_1990_1_007154428_1 |
Workunit | 7339208 |
Created | 3 Sep 2012, 17:01:37 UTC |
Sent | 3 Sep 2012, 18:29:09 UTC |
Report deadline | 16 Aug 2013, 23:49:09 UTC |
Received | 4 May 2013, 18:16:06 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 1 (0x00000001) Unknown error code |
Computer ID | 1023437 |
Run time | 5 days 19 hours 48 min 33 sec |
CPU time | 5 days 15 hours 6 min 12 sec |
Validate state | Invalid |
Credit | 1,988.94 |
Device peak FLOPS | 2.13 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.09 windows_intelx86 |
Stderr | <core_client_version>6.6.36</core_client_version> <![CDATA[ <message> Incorrect function. (0x1) - exit code 1 (0x1) </message> <stderr_txt> Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4856, selfPID=5700, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5308, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4692, selfPID=1716, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2896, selfPID=2148, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6200, selfPID=3604, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5924, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4740, selfPID=5572, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5612, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5664, selfPID=4656, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5608, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4632, selfPID=5008, iMonCtr=1 Model crash detected, will try to restart...
19:19:40 (4060): No heartbeat from core client for 30 sec - exiting 19:19:41 (4060): No heartbeat from core client for 30 sec - exiting 19:19:42 (4060): No heartbeat from core client for 30 sec - exiting 19:19:43 (4060): No heartbeat from core client for 30 sec - exiting 19:19:44 (4060): No heartbeat from core client for 30 sec - exiting 19:19:46 (4060): No heartbeat from core client for 30 sec - exiting 19:19:47 (4060): No heartbeat from core client for 30 sec - exiting 19:19:48 (4060): No heartbeat from core client for 30 sec - exiting 19:19:49 (4060): No heartbeat from core client for 30 sec - exiting 19:19:50 (4060): No heartbeat from core client for 30 sec - exiting 19:19:51 (4060): No heartbeat from core client for 30 sec - exiting 19:19:52 (4060): No heartbeat from core client for 30 sec - exiting 19:19:53 (4060): No heartbeat from core client for 30 sec - exiting 19:19:54 (4060): No heartbeat from core client for 30 sec - exiting 19:19:55 (4060): No heartbeat from core client for 30 sec - exiting 19:19:56 (4060): No heartbeat from core client for 30 sec - exiting 19:19:58 (4060): No heartbeat from core client for 30 sec - exiting 19:19:59 (4060): No heartbeat from core client for 30 sec - exiting 19:20:00 (4060): No heartbeat from core client for 30 sec - exiting 19:20:01 (4060): No heartbeat from core client for 30 sec - exiting 19:20:02 (4060): No heartbeat from core client for 30 sec - exiting 19:20:03 (4060): No heartbeat from core client for 30 sec - exiting 19:20:04 (4060): No heartbeat from core client for 30 sec - exiting 19:20:05 (4060): No heartbeat from core client for 30 sec - exiting 19:20:06 (4060): No heartbeat from core client for 30 sec - exiting 19:20:07 (4060): No heartbeat from core client for 30 sec - exiting 19:20:08 (4060): No heartbeat from core client for 30 sec - exiting 19:20:10 (4060): No heartbeat from core client for 30 sec - exiting 19:20:11 (4060): No heartbeat from core client for 30 sec - exiting 19:20:12 (4060): No 
heartbeat from core client for 30 sec - exiting 19:20:13 (4060): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5484, selfPID=4168, iMonCtr=1 Model crash detected, will try to restart... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4916, selfPID=2508, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4188, selfPID=2104, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5868, selfPID=4596, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4796, selfPID=2636, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4652, selfPID=4100, iMonCtr=1 Model crash detected, will try to restart... 08:38:27 (4268): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=7396, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3328, selfPID=2996, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4508, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4796, selfPID=3588, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2144, selfPID=4876, iMonCtr=1 Model crash detected, will try to restart... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3672, selfPID=3928, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5020, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4736, selfPID=5104, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2840, selfPID=4752, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5844, selfPID=5828, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5656, selfPID=4908, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4064, iMonCtr=2 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4600, selfPID=4776, iMonCtr=1 Model crash detected, will try to restart... 05:54:24 (5216): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3272, selfPID=4060, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5612, selfPID=4604, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5884, selfPID=5100, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4496, selfPID=3604, iMonCtr=1 Model crash detected, will try to restart... 
GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3180, selfPID=3816, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CGController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3880, selfPID=6112, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4948, selfPID=3808, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5464, selfPID=4624, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5740, iMonCtr=2 Model crash detected, will try to restart... CCPDN Monitor - Quit request from BOINC... CController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5128, selfPID=4772, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6140, selfPID=6140, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4040, selfPID=4712, iMonCtr=1 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6140, selfPID=4496, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5452, selfPID=4836, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3340, selfPID=4452, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4080, iMonCtr=2 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... 
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5140, selfPID=4152, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4288, selfPID=4356, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2740, selfPID=5948, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3588, selfPID=5280, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5408, selfPID=4924, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4440, selfPID=3708, iMonCtr=1 Model crash detected, will try to restart... 18:18:30 (5076): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3260, selfPID=3260, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5408, selfPID=4304, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4344, selfPID=1856, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2884, selfPID=4328, iMonCtr=1 Model crash detected, will try to restart... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4584, iMonCtr=2 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=6112, selfPID=6112, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5916, selfPID=4140, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4932, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4716, selfPID=4608, iMonCtr=1 Model crash detected, will try to restart... GController:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4060, selfPID=4652, iMonCtr=1 Model crash detected, will try to restart... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4056, iMonCtr=2 Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3916, selfPID=3916, iMonCtr=2 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3916, selfPID=984, iMonCtr=1 Model crash detected, will try to restart... Leaving CPDN_Main::Monitor... </stderr_txt> ]]> |

Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
21 Apr 2013 16:51:11 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 115,296 | 464,860 | 4.0319 |
13 Mar 2013 10:55:28 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 103,776 | 420,085 | 4.0480 |
26 Jan 2013 17:26:02 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 92,256 | 373,835 | 4.0521 |
27 Dec 2012 02:43:06 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 80,736 | 328,116 | 4.0641 |
15 Dec 2012 23:52:45 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 69,220 | 282,347 | 4.0790 |
15 Dec 2012 22:52:30 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 69,217 | 281,742 | 4.0704 |
15 Dec 2012 13:20:21 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 69,216 | 281,135 | 4.0617 |
29 Nov 2012 00:25:43 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 57,696 | 234,437 | 4.0633 |
15 Nov 2012 22:04:05 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 46,179 | 187,293 | 4.0558 |
15 Nov 2012 12:34:47 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 46,176 | 186,712 | 4.0435 |
20 Oct 2012 12:47:02 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 34,656 | 140,838 | 4.0639 |
29 Sep 2012 17:57:02 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 23,136 | 94,446 | 4.0822 |
20 Sep 2012 23:14:55 | 1023437 | 15229666 | hadam3p_eu_xcm6_1990_1_007154428_1 | 11,616 | 47,474 | 4.0869 |
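
The Average (sec/TS) column in the trickle table appears to be CPU Time (sec) divided by Timestep. The page does not state this, so treat it as an assumption. A minimal Python sketch that checks it against a few rows copied from the table (the function name `sec_per_timestep` is hypothetical, chosen for illustration):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
# Assumption: Average (sec/TS) = CPU Time (sec) / Timestep.
trickles = [
    (115296, 464860, 4.0319),
    (103776, 420085, 4.0480),
    (92256, 373835, 4.0521),
    (23136, 94446, 4.0822),
    (11616, 47474, 4.0869),
]


def sec_per_timestep(cpu_time_sec: int, timestep: int) -> float:
    """Return average CPU seconds spent per model timestep."""
    return cpu_time_sec / timestep


for timestep, cpu_time_sec, reported in trickles:
    computed = sec_per_timestep(cpu_time_sec, timestep)
    # Compare the computed ratio with the value shown on the page.
    print(f"timestep={timestep:>7}  computed={computed:.4f}  reported={reported:.4f}")
```

For the sampled rows the computed ratio agrees with the reported value to four decimal places, which is consistent with the assumed definition.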
©2024 cpdn.org