Name | hadam3p_eu_xl1h_1964_1_007008477_0 |
Workunit | 7211793 |
Created | 24 Nov 2010, 13:36:06 UTC |
Sent | 15 Jan 2011, 13:07:05 UTC |
Report deadline | 28 Dec 2011, 18:27:05 UTC |
Received | 29 Jan 2011, 18:01:26 UTC |
Server state | Over |
Outcome | No reply |
Client state | Compute error |
Exit status | 0 (0x00000000) |
Computer ID | 1121004 |
Run time | 3 days 12 hours 47 min |
CPU time | 3 days 9 hours 8 min 34 sec |
Validate state | Invalid |
Credit | 2,061.80 |
Device peak FLOPS | 3.43 GFLOPS |
Application version | UK Met Office HadAM3P-HadRM3P Europe v6.08 windows_intelx86 |
Stderr | <core_client_version>6.10.58</core_client_version> <![CDATA[ <stderr_txt> CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1476, selfPID=1476, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4352, selfPID=4352, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4624, selfPID=4624, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3904, selfPID=3904, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=5888, iMonCtr=2 21:10:57 (5256): Can't acquire lockfile (32) - waiting 35s CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2300, selfPID=2300, iMonCtr=2 18:14:20 (4356): Can't acquire lockfile (32) - waiting 35s CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3000, selfPID=4012, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4572, selfPID=4572, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Suspended CPDN Monitor - Suspend request from BOINC... Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=1224, selfPID=2908, iMonCtr=1 Model crash detected, will try to restart... 12:43:59 (4832): Can't acquire lockfile (32) - waiting 35s CPDN Monitor - Quit request from BOINC... 14:58:06 (3752): Can't acquire lockfile (32) - waiting 35s CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Precis Restart file copy #1 failed on xl1hga.dag5720 Precis Restart file copy #1 failed on xl1hga.dag5730 Precis Restart file copy #1 failed on xl1hga.dag5740 Precis Restart file copy #1 failed on xl1hga.dag5750 Precis Restart file copy #1 failed on xl1hga.dag5760 Precis Restart file copy #1 failed on xl1hga.dag5770 Precis Restart file copy #1 failed on xl1hga.dag5780 Precis Restart file copy #1 failed on xl1hga.dag5790 Precis Restart file copy #1 failed on xl1hga.dag57a0 Precis Restart file copy #1 failed on xl1hga.dag57b0 CPDN Monitor - Quit request from BOINC... Precis Restart file copy #1 failed on xl1hga.dag5720 Precis Restart file copy #1 failed on xl1hga.dag5730 Precis Restart file copy #1 failed on xl1hga.dag5740 Precis Restart file copy #1 failed on xl1hga.dag5750 Precis Restart file copy #1 failed on xl1hga.dag5760 Precis Restart file copy #1 failed on xl1hga.dag5770 Precis Restart file copy #1 failed on xl1hga.dag5780 Precis Restart file copy #1 failed on xl1hga.dag5790 Precis Restart file copy #1 failed on xl1hga.dag57a0 Precis Restart file copy #1 failed on xl1hga.dag57b0 Precis Restart file copy #1 failed on xl1hga.dag57c0 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5320, selfPID=5320, iMonCtr=2 Precis Restart file copy #1 failed on xl1hga.dag5720 Precis Restart file copy #1 failed on xl1hga.dag5730 Precis Restart file copy #1 failed on xl1hga.dag5740 Precis Restart file copy #1 failed on xl1hga.dag5750 Precis Restart file copy #1 failed on xl1hga.dag5760 Precis Restart file copy #1 failed on xl1hga.dag5770 Precis Restart file copy #1 failed on xl1hga.dag5780 Precis Restart file copy #1 failed on xl1hga.dag5790 Precis Restart file copy #1 failed on xl1hga.dag57a0 Precis Restart file copy #1 failed on xl1hga.dag57b0 Precis Restart file copy #1 failed on xl1hga.dag57c0 Precis Restart file copy #1 failed on xl1hga.dag57d0 Precis Restart file copy #1 failed on xl1hga.dag57e0 Precis Restart file copy #1 failed on xl1hga.dag57f0 Precis Restart file copy #1 failed on xl1hga.dag57g0 Precis Restart file copy #1 failed on xl1hga.dag57h0 Precis Restart file copy #1 failed on xl1hga.dag57i0 Precis Restart file copy #1 failed on xl1hga.dag57j0 Precis Restart file copy #1 failed on xl1hga.dag57k0 Precis Restart file copy #1 failed on xl1hga.dag57l0 Precis Restart file copy #1 failed on xl1hga.dag57m0 Precis Restart file copy #1 failed on xl1hga.dag57n0 Precis Restart file copy #1 failed on xl1hga.dag57o0 Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3232, selfPID=5380, iMonCtr=1 Model crash detected, will try to restart... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=448, selfPID=448, iMonCtr=2 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 
CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 17:29:01 (1612): Can't acquire lockfile (32) - waiting 35s Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 10:29:36 (2860): Can't acquire lockfile (32) - waiting 35s CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 19:29:11 (4888): Can't acquire lockfile (32) - waiting 35s CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 13:50:34 (1204): Can't acquire lockfile (32) - waiting 35s Controller:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=4656, iMonCtr=2 Model crash detected, will try to restart... 07:49:20 (4396): Can't acquire lockfile (32) - waiting 35s 07:50:53 (3484): Can't set up shared mem: -1. Will run in standalone mode. No Process Handle Global Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=0, iMonCtr=1 CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=4832, selfPID=4832, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 
21:09:31 (4180): No heartbeat from core client for 30 sec - exiting 21:09:32 (4180): No heartbeat from core client for 30 sec - exiting 21:09:33 (4180): No heartbeat from core client for 30 sec - exiting 21:09:34 (4180): No heartbeat from core client for 30 sec - exiting 21:09:36 (4180): No heartbeat from core client for 30 sec - exiting 21:09:37 (4180): No heartbeat from core client for 30 sec - exiting 21:09:38 (4180): No heartbeat from core client for 30 sec - exiting 21:09:39 (4180): No heartbeat from core client for 30 sec - exiting 21:09:40 (4180): No heartbeat from core client for 30 sec - exiting 21:09:41 (4180): No heartbeat from core client for 30 sec - exiting 21:09:42 (4180): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... CPDN Monitor - Quit request from BOINC... 21:41:56 (4720): Can't acquire lockfile (32) - waiting 35s CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... 12:50:06 (4432): No heartbeat from core client for 30 sec - exiting 12:50:07 (4432): No heartbeat from core client for 30 sec - exiting 12:50:08 (4432): No heartbeat from core client for 30 sec - exiting 12:50:10 (4432): No heartbeat from core client for 30 sec - exiting 12:50:11 (4432): No heartbeat from core client for 30 sec - exiting 12:50:12 (4432): No heartbeat from core client for 30 sec - exiting 12:50:13 (4432): No heartbeat from core client for 30 sec - exiting 12:50:14 (4432): No heartbeat from core client for 30 sec - exiting 12:50:15 (4432): No heartbeat from core client for 30 sec - exiting 12:50:16 (4432): No heartbeat from core client for 30 sec - exiting 12:50:17 (4432): No heartbeat from core client for 30 sec - exiting CPDN Monitor - No 'heartbeat' from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=3196, selfPID=3196, iMonCtr=2 CPDN Monitor - Quit request from BOINC... 
Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=2804, selfPID=2804, iMonCtr=2 Suspended CPDN Monitor - Suspend request from BOINC... CPDN Monitor - Quit request from BOINC... CPDN Monitor - Quit request from BOINC... Regional Worker:: CPDN process is not running, exiting, bRetVal = 1, checkPID=5012, selfPID=5012, iMonCtr=2 CPDN Monitor - Quit request from BOINC... Leaving CPDN_Main::Monitor... 23:27:01 (2636): called boinc_finish </stderr_txt> <message> <file_xfer_error> <file_name>hadam3p_eu_xl1h_1964_1_007008477_0_11.zip</file_name> <error_code>-161</error_code> </file_xfer_error> <file_xfer_error> <file_name>hadam3p_eu_xl1h_1964_1_007008477_0_12.zip</file_name> <error_code>-161</error_code> </file_xfer_error> </message> ]]> |
Latest Trickles Received

Time Sent (UTC) | Host ID | Result ID | Result Name | Timestep | CPU Time (sec) | Average (sec/TS) |
---|---|---|---|---|---|---|
27 Jan 2011 21:33:03 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 119,520 | 278,410 | 2.3294 |
24 Jan 2011 19:07:49 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 108,000 | 253,805 | 2.3500 |
23 Jan 2011 18:16:10 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 96,480 | 229,470 | 2.3784 |
23 Jan 2011 11:33:09 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 91,584 | 208,115 | 2.2724 |
22 Jan 2011 18:41:27 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 85,920 | 193,438 | 2.2514 |
22 Jan 2011 14:42:02 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 84,576 | 183,984 | 2.1754 |
22 Jan 2011 10:24:00 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 80,740 | 170,018 | 2.1057 |
21 Jan 2011 21:41:37 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 80,736 | 169,654 | 2.1013 |
21 Jan 2011 14:54:31 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 69,216 | 145,811 | 2.1066 |
21 Jan 2011 11:42:06 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 57,696 | 122,097 | 2.1162 |
20 Jan 2011 15:03:48 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 46,176 | 97,805 | 2.1181 |
19 Jan 2011 18:07:55 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 34,656 | 73,366 | 2.1170 |
17 Jan 2011 19:05:17 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 23,136 | 48,995 | 2.1177 |
16 Jan 2011 12:30:33 | 1121004 | 12293552 | hadam3p_eu_xl1h_1964_1_007008477_0 | 11,616 | 24,698 | 2.1262 |
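
The Average (sec/TS) column in the trickle table appears to be the cumulative CPU time divided by the timestep count. The page does not state this formula, so it is an inference from the values. A minimal Python sketch that checks it against three rows copied from the table (the function name `sec_per_timestep` is made up for illustration):

```python
# Rows copied from the trickle table: (timestep, cpu_time_sec, reported_average).
rows = [
    (119520, 278410, 2.3294),  # 27 Jan 2011 21:33:03
    (108000, 253805, 2.3500),  # 24 Jan 2011 19:07:49
    (11616, 24698, 2.1262),    # 16 Jan 2011 12:30:33
]


def sec_per_timestep(cpu_time_sec, timestep):
    """Assumed formula: cumulative CPU seconds divided by timesteps completed."""
    return round(cpu_time_sec / timestep, 4)


# Recompute the average for each row and compare it with the reported value.
computed = [sec_per_timestep(cpu, ts) for ts, cpu, _ in rows]

for (ts, cpu, reported), value in zip(rows, computed):
    print(f"timestep={ts} cpu={cpu} reported={reported:.4f} computed={value:.4f}")
```

If the assumption holds, the computed and reported values agree to four decimal places for these rows.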
©2024 cpdn.org