Name | hadcm3n_4fn9_1940_40_008311602_1 |
Workunit | 8462737 |
Created | 19 Mar 2013, 23:49:34 UTC |
Sent | 19 Mar 2013, 23:50:16 UTC |
Report deadline | 19 Jun 2013, 7:17:27 UTC |
Received | 9 Apr 2013, 20:29:31 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 193 (0x000000C1) EXIT_SIGNAL |
Computer ID | 1264094 |
Run time | 10 days 2 hours 9 min 26 sec |
CPU time | 3 days 10 hours 48 min 30 sec |
Validate state | Invalid |
Credit | 9,331.20 |
Device peak FLOPS | 3.64 GFLOPS |
Application version | UK Met Office Coupled Model Full Resolution Ocean v6.07 i686-pc-linux-gnu |
Stderr | <core_client_version>7.0.27</core_client_version> <![CDATA[ <message> process exited with code 193 (0xc1, -63) </message> <stderr_txt> CPDN Monitor - Quit request from BOINC... 17:40:12 (3874): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... 
Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 BUFFOUT: Write FaSuspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... BUFFOUT: Write Failed: No space left on device BUFFOUT: C I/O Error - Return code = 1 Model crashed: WRITDUMP: BAD BUFFOUT OF DATA tmp/pipe_dummy 2048 00:46:05 (14544): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:32:09 (4313): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:34:11 (5035): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:35:30 (5175): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 15:39:06 (5616): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... 
15:39:07 (5616): No heartbeat from core client for 30 sec - exiting 15:39:08 (5616): No heartbeat from core client for 30 sec - exiting 15:39:09 (5616): No heartbeat from core client for 30 sec - exiting 15:39:10 (5616): No heartbeat from core client for 30 sec - exiting 15:39:11 (5616): No heartbeat from core client for 30 sec - exiting 15:39:12 (5616): No heartbeat from core client for 30 sec - exiting 15:39:13 (5616): No heartbeat from core client for 30 sec - exiting 15:39:14 (5616): No heartbeat from core client for 30 sec - exiting 15:39:15 (5616): No heartbeat from core client for 30 sec - exiting 15:39:16 (5616): No heartbeat from core client for 30 sec - exiting 15:39:17 (5616): No heartbeat from core client for 30 sec - exiting 15:39:18 (5616): No heartbeat from core client for 30 sec - exiting 15:39:19 (5616): No heartbeat from core client for 30 sec - exiting 15:39:20 (5616): No heartbeat from core client for 30 sec - exiting 15:39:21 (5616): No heartbeat from core client for 30 sec - exiting 15:39:22 (5616): No heartbeat from core client for 30 sec - exiting 15:39:23 (5616): No heartbeat from core client for 30 sec - exiting 15:39:24 (5616): No heartbeat from core client for 30 sec - exiting 15:39:25 (5616): No heartbeat from core client for 30 sec - exiting 15:39:26 (5616): No heartbeat from core client for 30 sec - exiting 15:39:27 (5616): No heartbeat from core client for 30 sec - exiting 15:39:28 (5616): No heartbeat from core client for 30 sec - exiting 15:39:29 (5616): No heartbeat from core client for 30 sec - exiting 15:39:30 (5616): No heartbeat from core client for 30 sec - exiting 15:39:31 (5616): No heartbeat from core client for 30 sec - exiting 15:39:32 (5616): No heartbeat from core client for 30 sec - exiting 15:39:33 (5616): No heartbeat from core client for 30 sec - exiting 15:39:34 (5616): No heartbeat from core client for 30 sec - exiting 15:39:35 (5616): No heartbeat from core client for 30 sec - exiting 15:39:36 (5616): No 
heartbeat from core client for 30 sec - exiting 15:39:37 (5616): No heartbeat from core client for 30 sec - exiting 15:39:38 (5616): No heartbeat from core client for 30 sec - exiting 15:39:39 (5616): No heartbeat from core client for 30 sec - exiting 15:39:40 (5616): No heartbeat from core client for 30 sec - exiting Suspended CPDN Monitor - Suspend request from BOINC... 15:41:43 (6011): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... *** glibc detected *** ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu: double free or corruption (out): 0x09259000 *** ======= Backtrace: ========= /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x70f01)[0xf7549f01] /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x72768)[0xf754b768] /lib/i386-linux-gnu/i686/cmov/libc.so.6(cfree+0x6d)[0xf754e81d] /usr/lib/i386-linux-gnu/libstdc++.so.6(_ZdlPv+0x1f)[0xf76cc4bf] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x8057bc4] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x804f232] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x8050491] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x805112c] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x805137a] /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf74efe46] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu(__gxx_personality_v0+0x169)[0x804cb51] ======= Memory map: ======== 08048000-080e3000 r-xp 00000000 fe:00 443621 /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu 080e3000-080e4000 rw-p 0009b000 fe:00 443621 /var/lib/boinc-client/projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu 080e4000-0813b000 rw-p 00000000 00:00 0 09203000-09267000 rw-p 00000000 00:00 0 [heap] f6f00000-f6f21000 rw-p 
00000000 00:00 0 f6f21000-f7000000 ---p 00000000 00:00 0 f705b000-f74d6000 rw-s 00000000 fe:00 469482 /var/lib/boinc-client/slots/2/132420 f74d6000-f74d9000 rw-p 00000000 00:00 0 f74d9000-f7635000 r-xp 00000000 fe:00 156090 /lib/i386-linux-gnu/i686/cmov/libc-2.13.so f7635000-f7636000 ---p 0015c000 fe:00 156090 /lib/i386-linux-gnu/i686/cmov/libc-2.13.so f7636000-f7638000 r--p 0015c000 fe:00 156090 /lib/i386-linux-gnu/i686/cmov/libc-2.13.so f7638000-f7639000 rw-p 0015e000 fe:00 156090 /lib/i386-linux-gnu/i686/cmov/libc-2.13.so f7639000-f763c000 rw-p 00000000 00:00 0 f763c000-f7658000 r-xp 00000000 fe:00 154411 /lib/i386-linux-gnu/libgcc_s.so.1 f7658000-f7659000 rw-p 0001b000 fe:00 154411 /lib/i386-linux-gnu/libgcc_s.so.1 f7659000-f767d000 r-xp 00000000 fe:00 156087 /lib/i386-linux-gnu/i686/cmov/libm-2.13.so f767d000-f767e000 r--p 00023000 fe:00 156087 /lib/i386-linux-gnu/i686/cmov/libm-2.13.so f767e000-f767f000 rw-p 00024000 fe:00 156087 /lib/i386-linux-gnu/i686/cmov/libm-2.13.so f767f000-f775f000 r-xp 00000000 fe:00 60222 /usr/lib/i386-linux-gnu/libstdc++.so.6.0.17 f775f000-f7763000 r--p 000e0000 fe:00 60222 /usr/lib/i386-linux-gnu/libstdc++.so.6.0.17 f7763000-f7764000 rw-p 000e4000 fe:00 60222 /usr/lib/i386-linux-gnu/libstdc++.so.6.0.17 f7764000-f776c000 rw-p 00000000 00:00 0 f776c000-f776e000 r-xp 00000000 fe:00 156083 /lib/i386-linux-gnu/i686/cmov/libdl-2.13.so f776e000-f776f000 r--p 00001000 fe:00 156083 /lib/i386-linux-gnu/i686/cmov/libdl-2.13.so f776f000-f7770000 rw-p 00002000 fe:00 156083 /lib/i386-linux-gnu/i686/cmov/libdl-2.13.so f7770000-f7785000 r-xp 00000000 fe:00 156080 /lib/i386-linux-gnu/i686/cmov/libpthread-2.13.so f7785000-f7786000 r--p 00014000 fe:00 156080 /lib/i386-linux-gnu/i686/cmov/libpthread-2.13.so f7786000-f7787000 rw-p 00015000 fe:00 156080 /lib/i386-linux-gnu/i686/cmov/libpthread-2.13.so f7787000-f7789000 rw-p 00000000 00:00 0 f77a8000-f77a9000 rw-p 00000000 00:00 0 f77a9000-f77aa000 ---p 00000000 00:00 0 f77aa000-f77ad000 rw-p 00000000 
00:00 0 f77ad000-f77af000 rw-s 00000000 fe:00 469402 /var/lib/boinc-client/slots/2/boinc_mmap_file f77af000-f77b1000 rw-p 00000000 00:00 0 f77b1000-f77b2000 r-xp 00000000 00:00 0 [vdso] f77b2000-f77ce000 r-xp 00000000 fe:00 154435 /lib/i386-linux-gnu/ld-2.13.so f77ce000-f77cf000 r--p 0001b000 fe:00 154435 /lib/i386-linux-gnu/ld-2.13.so f77cf000-f77d0000 rw-p 0001c000 fe:00 154435 /lib/i386-linux-gnu/ld-2.13.so ff7f8000-ff867000 rw-p 00000000 00:00 0 [stack] SIGABRT: abort called Stack trace (17 frames): ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu(boinc_catch_signal+0x6f)[0x80b80df] [0xf77b1400] [0xf77b1430] /lib/i386-linux-gnu/i686/cmov/libc.so.6(gsignal+0x51)[0xf7503941] /lib/i386-linux-gnu/i686/cmov/libc.so.6(abort+0x182)[0xf7506d72] /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x66e15)[0xf753fe15] /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x70f01)[0xf7549f01] /lib/i386-linux-gnu/i686/cmov/libc.so.6(+0x72768)[0xf754b768] /lib/i386-linux-gnu/i686/cmov/libc.so.6(cfree+0x6d)[0xf754e81d] /usr/lib/i386-linux-gnu/libstdc++.so.6(_ZdlPv+0x1f)[0xf76cc4bf] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x8057bc4] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x804f232] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x8050491] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x805112c] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu[0x805137a] /lib/i386-linux-gnu/i686/cmov/libc.so.6(__libc_start_main+0xe6)[0xf74efe46] ../../projects/climateprediction.net/hadcm3n_6.07_i686-pc-linux-gnu(__gxx_personality_v0+0x169)[0x804cb51] Exiting... </stderr_txt> ]]> |
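
The exit status appears above in three forms: 193, 0x000000C1, and (in the stderr message) -63. A minimal sketch of how these relate, assuming the -63 is simply the low byte of the code read as a signed 8-bit value. The helper name `to_signed_byte` is hypothetical and not part of BOINC:

```python
def to_signed_byte(code: int) -> int:
    """Reinterpret the low 8 bits of an exit code as a signed value."""
    low = code & 0xFF
    return low - 256 if low >= 128 else low


exit_status = 193  # value copied from the "Exit status" row

# Print the hexadecimal form and the signed 8-bit reading of the same code.
print(hex(exit_status), to_signed_byte(exit_status))
```

With the value from this task, the hexadecimal form is 0xc1 and the signed reading is -63, which matches the `(0xc1, -63)` shown in the `<message>` element.
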
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
09 Apr 2013 20:32:31 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 777,600 | 298,108 | 0.3834 |
09 Apr 2013 12:31:22 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 751,680 | 270,931 | 0.3604 |
09 Apr 2013 04:38:32 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 725,760 | 243,619 | 0.3357 |
08 Apr 2013 20:39:26 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 699,840 | 216,597 | 0.3095 |
28 Mar 2013 21:49:12 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 673,920 | 189,472 | 0.2811 |
28 Mar 2013 13:40:11 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 648,000 | 162,092 | 0.2501 |
28 Mar 2013 05:37:35 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 622,080 | 134,623 | 0.2164 |
27 Mar 2013 21:28:34 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 596,160 | 627,925 | 1.0533 |
27 Mar 2013 12:48:56 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 570,240 | 600,553 | 1.0532 |
27 Mar 2013 04:40:51 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 544,320 | 573,174 | 1.0530 |
26 Mar 2013 20:28:37 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 518,400 | 545,830 | 1.0529 |
26 Mar 2013 12:02:15 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 492,480 | 518,463 | 1.0528 |
26 Mar 2013 03:54:15 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 466,560 | 491,115 | 1.0526 |
25 Mar 2013 19:43:15 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 440,640 | 463,760 | 1.0525 |
25 Mar 2013 11:07:48 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 414,720 | 436,429 | 1.0523 |
25 Mar 2013 02:50:47 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 388,800 | 409,097 | 1.0522 |
24 Mar 2013 18:47:38 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 362,880 | 381,747 | 1.0520 |
24 Mar 2013 10:32:36 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 336,960 | 354,396 | 1.0517 |
24 Mar 2013 02:30:00 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 311,040 | 327,043 | 1.0514 |
23 Mar 2013 18:24:06 | 1264094 | 15673459 | hadcm3n_4fn9_1940_40_008311602_1 | 285,120 | 299,683 | 1.0511 |
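
The Average (sec/TS) column appears to be the cumulative CPU time divided by the timestep count. The sketch below checks that assumption against a few rows copied from the table, and also computes the rate over the interval between two trickles. The reported CPU time falls between the 27 Mar 21:28 and 28 Mar 05:37 trickles; the page does not say why, so the interval helper only flags that case rather than interpreting it. The function names are hypothetical.

```python
# (timestep, cumulative CPU seconds, reported average) copied from table rows
trickles = [
    (777_600, 298_108, 0.3834),
    (751_680, 270_931, 0.3604),
    (622_080, 134_623, 0.2164),
    (596_160, 627_925, 1.0533),
    (570_240, 600_553, 1.0532),
]


def average_sec_per_ts(timestep: int, cpu_sec: int) -> float:
    """Cumulative CPU seconds per timestep, rounded like the table."""
    return round(cpu_sec / timestep, 4)


def interval_sec_per_ts(newer, older):
    """CPU seconds per timestep between two trickles.

    Returns None when the CPU counter went backwards, since the
    difference is then not a meaningful duration.
    """
    d_ts = newer[0] - older[0]
    d_cpu = newer[1] - older[1]
    if d_ts <= 0 or d_cpu < 0:
        return None
    return d_cpu / d_ts


for ts, cpu, reported in trickles:
    print(ts, average_sec_per_ts(ts, cpu), reported)

# Rate between the two most recent trickles, and across the counter drop.
print(interval_sec_per_ts(trickles[0], trickles[1]))
print(interval_sec_per_ts(trickles[2], trickles[3]))
```

For these rows the recomputed averages agree with the table to four decimal places. The interval rate between the two latest trickles stays close to 1.05 sec/TS even though the Average column shows about 0.38, which suggests the lower averages reflect the smaller CPU time total rather than a faster model.
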