Task 13060187

Name hadcm3n_t1g3_1940_40_007311209_2
Workunit 7508639
Created 4 Jul 2011, 11:06:52 UTC
Sent 4 Jul 2011, 12:05:40 UTC
Report deadline 3 Oct 2011, 19:32:51 UTC
Received 29 Aug 2011, 9:46:19 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 22 (0x00000016) Unknown error code
Computer ID 1404986
Run time 38 days 9 hours 5 min 38 sec
CPU time 26 days 9 hours 33 min 41 sec
Validate state Invalid
Credit 4,354.56
Device peak FLOPS 1.33 GFLOPS
Application version UK Met Office Coupled Model Full Resolution Ocean v6.07 (windows_intelx86)
Stderr
<core_client_version>6.6.20</core_client_version>
<![CDATA[
<message>
The device does not recognize the command. (0x16) - exit code 22 (0x16)
</message>
<stderr_txt>
Suspended CPDN Monitor - Suspend request from BOINC...
09:44:42 (1488): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:44:43 (1488): No heartbeat from core client for 30 sec - exiting
09:44:44 (1488): No heartbeat from core client for 30 sec - exiting
12:09:58 (2404): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:09:59 (2404): No heartbeat from core client for 30 sec - exiting
12:10:00 (2404): No heartbeat from core client for 30 sec - exiting
12:45:01 (3340): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:45:02 (3340): No heartbeat from core client for 30 sec - exiting
19:39:18 (1692): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
19:39:20 (1692): No heartbeat from core client for 30 sec - exiting
19:39:21 (1692): No heartbeat from core client for 30 sec - exiting
20:14:15 (4020): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
20:14:16 (4020): No heartbeat from core client for 30 sec - exiting
20:14:17 (4020): No heartbeat from core client for 30 sec - exiting
20:14:18 (4020): No heartbeat from core client for 30 sec - exiting
20:14:19 (4020): No heartbeat from core client for 30 sec - exiting
20:14:20 (4020): No heartbeat from core client for 30 sec - exiting
20:14:21 (4020): No heartbeat from core client for 30 sec - exiting
21:00:47 (2552): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
21:00:48 (2552): No heartbeat from core client for 30 sec - exiting
22:05:53 (1084): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
22:05:55 (1084): No heartbeat from core client for 30 sec - exiting
22:05:56 (1084): No heartbeat from core client for 30 sec - exiting
06:17:53 (1820): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:17:55 (1820): No heartbeat from core client for 30 sec - exiting
06:32:55 (2184): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:32:57 (2184): No heartbeat from core client for 30 sec - exiting
06:32:58 (2184): No heartbeat from core client for 30 sec - exiting
06:32:59 (2184): No heartbeat from core client for 30 sec - exiting
07:46:55 (1464): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
07:46:56 (1464): No heartbeat from core client for 30 sec - exiting
11:42:23 (520): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
12:48:25 (1016): No heartbeat from core client for 30 sec - exiting
12:48:27 (1016): No heartbeat from core client for 30 sec - exiting
12:48:28 (1016): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
BUFFIN: C I/O Error feof - Unit 63 - Return code = 16
BUFFIN: C I/O Error feof - Unit 64 - Return code = 16
BUFFIN: C I/O Error feof - Unit 65 - Return code = 16
BUFFIN: C I/O Error feof - Unit 66 - Return code = 16
BUFFIN: C I/O Error feof - Unit 67 - Return code = 16
BUFFIN: C I/O Error feof - Unit 68 - Return code = 16
BUFFIN: C I/O Error feof - Unit 69 - Return code = 16
Error converting file to netcdf: dataout/t1g3ko.pje1c10
Error converting file to netcdf: dataout/t1g3ko.pie1c10
Error converting file to netcdf: dataout/t1g3ko.pfe1c10
Error converting file to netcdf: dataout/t1g3ka.phe1c10
Error converting file to netcdf: dataout/t1g3ka.pge1c10
Error converting file to netcdf: dataout/t1g3ka.pee1c10
Error converting file to netcdf: dataout/t1g3ka.pde1c10
13:28:28 (3204): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:28:29 (3204): No heartbeat from core client for 30 sec - exiting
13:28:31 (3204): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
16:52:44 (2440): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:13:43 (2640): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:58:38 (2340): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
14:58:40 (2340): No heartbeat from core client for 30 sec - exiting
14:58:41 (2340): No heartbeat from core client for 30 sec - exiting
23:00:45 (1768): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
00:57:19 (3652): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
Suspended CPDN Monitor - Suspend request from BOINC...
15:36:56 (3992): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
15:36:58 (3992): No heartbeat from core client for 30 sec - exiting
15:36:59 (3992): No heartbeat from core client for 30 sec - exiting
15:37:00 (3992): No heartbeat from core client for 30 sec - exiting
15:37:01 (3992): No heartbeat from core client for 30 sec - exiting
15:37:02 (3992): No heartbeat from core client for 30 sec - exiting
15:37:03 (3992): No heartbeat from core client for 30 sec - exiting
15:37:04 (3992): No heartbeat from core client for 30 sec - exiting
15:37:05 (3992): No heartbeat from core client for 30 sec - exiting
15:37:06 (3992): No heartbeat from core client for 30 sec - exiting
15:37:07 (3992): No heartbeat from core client for 30 sec - exiting
15:37:08 (3992): No heartbeat from core client for 30 sec - exiting
15:37:09 (3992): No heartbeat from core client for 30 sec - exiting
15:37:10 (3992): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
18:02:31 (3588): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:02:33 (3588): No heartbeat from core client for 30 sec - exiting
18:02:34 (3588): No heartbeat from core client for 30 sec - exiting
18:02:35 (3588): No heartbeat from core client for 30 sec - exiting
18:02:36 (3588): No heartbeat from core client for 30 sec - exiting
18:02:37 (3588): No heartbeat from core client for 30 sec - exiting
18:02:38 (3588): No heartbeat from core client for 30 sec - exiting
18:02:39 (3588): No heartbeat from core client for 30 sec - exiting
18:02:40 (3588): No heartbeat from core client for 30 sec - exiting
18:16:40 (2900): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:16:42 (2900): No heartbeat from core client for 30 sec - exiting
18:26:47 (2056): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
18:26:49 (2056): No heartbeat from core client for 30 sec - exiting
18:26:50 (2056): No heartbeat from core client for 30 sec - exiting
18:26:51 (2056): No heartbeat from core client for 30 sec - exiting
18:26:52 (2056): No heartbeat from core client for 30 sec - exiting
Suspended CPDN Monitor - Suspend request from BOINC...
01:11:05 (1168): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
01:11:07 (1168): No heartbeat from core client for 30 sec - exiting
01:11:08 (1168): No heartbeat from core client for 30 sec - exiting
01:39:25 (1792): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
01:39:27 (1792): No heartbeat from core client for 30 sec - exiting
01:39:28 (1792): No heartbeat from core client for 30 sec - exiting
01:39:29 (1792): No heartbeat from core client for 30 sec - exiting
06:11:22 (1052): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
06:11:25 (1052): No heartbeat from core client for 30 sec - exiting
06:11:26 (1052): No heartbeat from core client for 30 sec - exiting
06:11:27 (1052): No heartbeat from core client for 30 sec - exiting
06:11:28 (1052): No heartbeat from core client for 30 sec - exiting
08:07:02 (3236): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
08:07:05 (3236): No heartbeat from core client for 30 sec - exiting
08:07:06 (3236): No heartbeat from core client for 30 sec - exiting
08:07:07 (3236): No heartbeat from core client for 30 sec - exiting
08:07:08 (3236): No heartbeat from core client for 30 sec - exiting
08:07:09 (3236): No heartbeat from core client for 30 sec - exiting
08:07:10 (3236): No heartbeat from core client for 30 sec - exiting
13:07:00 (2740): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
13:07:02 (2740): No heartbeat from core client for 30 sec - exiting
13:07:03 (2740): No heartbeat from core client for 30 sec - exiting
13:07:04 (2740): No heartbeat from core client for 30 sec - exiting
13:07:05 (2740): No heartbeat from core client for 30 sec - exiting
13:07:06 (2740): No heartbeat from core client for 30 sec - exiting
13:07:07 (2740): No heartbeat from core client for 30 sec - exiting
13:07:08 (2740): No heartbeat from core client for 30 sec - exiting
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1
Model crash detected, will try to restart...
Signal 22 received, exiting...
Called boinc_finish
Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4020, iMonCtr=1
Model crash detected, will try to restart...
Sorry, too many model crashes! :-(
Called boinc_finish

</stderr_txt>
]]>
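The stderr output above repeats the same "No heartbeat" message many times. As a rough aid for reading it, the following sketch counts those messages per process. It assumes the number in parentheses is a process ID; the function name `count_heartbeat_timeouts` and the regular expression are illustrative and not part of BOINC or CPDN. The sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches lines such as:
# 09:44:42 (1488): No heartbeat from core client for 30 sec - exiting
# Assumption: the number in parentheses is the process ID.
HEARTBEAT_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}) \((\d+)\): "
    r"No heartbeat from core client for 30 sec - exiting$"
)


def count_heartbeat_timeouts(stderr_text):
    """Return a Counter mapping process ID to number of heartbeat timeouts."""
    counts = Counter()
    for line in stderr_text.splitlines():
        match = HEARTBEAT_RE.match(line.strip())
        if match:
            counts[int(match.group(2))] += 1
    return counts


# A few lines taken verbatim from the stderr output.
sample = """\
09:44:42 (1488): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
09:44:43 (1488): No heartbeat from core client for 30 sec - exiting
09:44:44 (1488): No heartbeat from core client for 30 sec - exiting
12:09:58 (2404): No heartbeat from core client for 30 sec - exiting
"""

timeouts = count_heartbeat_timeouts(sample)
for pid, n in sorted(timeouts.items()):
    print(f"process {pid}: {n} heartbeat timeout message(s)")
```

Lines that do not match the pattern, such as the "CPDN Monitor" notices, are ignored rather than counted.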
Latest Trickles Received
Time Sent (UTC)       Host ID  Result ID  Result Name                       Timestep  CPU Time (sec)  Average (sec/TS)
20 Aug 2011 02:48:34  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   362,880       2,205,474            6.0777
17 Aug 2011 11:53:12  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   336,960       2,038,797            6.0506
14 Aug 2011 23:26:49  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   311,040       1,868,102            6.0060
12 Aug 2011 10:13:10  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   285,120       1,704,263            5.9774
09 Aug 2011 05:02:34  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   259,200       1,529,980            5.9027
06 Aug 2011 14:05:40  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   233,280       1,391,663            5.9656
03 Aug 2011 16:06:11  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   207,360       1,217,017            5.8691
01 Aug 2011 05:45:53  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   181,440       1,081,242            5.9592
27 Jul 2011 21:25:41  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   155,520         919,045            5.9095
26 Jul 2011 00:03:38  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   129,600         781,295            6.0285
25 Jul 2011 21:59:07  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2   103,680         634,331            6.1182
25 Jul 2011 17:24:06  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2    77,760         486,328            6.2542
25 Jul 2011 16:03:25  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2    51,840         331,915            6.4027
09 Jul 2011 05:14:03  1008295  13060187   hadcm3n_t1g3_1940_40_007311209_2    25,920         192,417            7.4235
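
The "Average (sec/TS)" column in the trickle table appears to be the cumulative CPU time divided by the timestep count. A minimal sketch of that arithmetic follows, using three rows from the table. The helper name `average_sec_per_timestep` is hypothetical, and rounding to four decimal places is an assumption based on how the values are displayed.

```python
def average_sec_per_timestep(cpu_time_sec, timestep):
    """Cumulative CPU seconds divided by timesteps completed.

    Rounding to 4 decimal places is an assumption that mirrors the
    precision shown in the trickle table.
    """
    return round(cpu_time_sec / timestep, 4)


# (timestep, CPU time in seconds, average reported in the table)
rows = [
    (362_880, 2_205_474, 6.0777),
    (336_960, 2_038_797, 6.0506),
    (25_920, 192_417, 7.4235),
]

for timestep, cpu_time_sec, reported in rows:
    computed = average_sec_per_timestep(cpu_time_sec, timestep)
    print(f"timestep {timestep:>7,}: computed {computed:.4f}, reported {reported:.4f}")
```

Because the CPU time is cumulative, each average covers the whole run up to that trickle, not only the most recent 25,920 timesteps.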