Excessive checkpointing on new Linux hadcm3s tasks?

alanb1951 | Joined: 31 Aug 04 | Posts: 36 | Credit: 9,581,380 | RAC: 3,853
Message 61008 - Posted: 26 Sep 2019, 9:32:46 UTC

I recently landed a few of the new Linux tasks (batch 835). I don't usually turn on checkpoint debug in BOINC-Manager, but I had cause to do so on one of my machines and was surprised to see that these tasks were checkpointing about once a minute! I turned the logging on for my other machine that had some CPDN work and it was the same! (Turned logging off again!)

Now, on one machine I've got the checkpointing limit set to 600 seconds and on the other 240 seconds; it's obviously not respecting that!

As I said, I don't normally monitor this, so for all I know this could have been standard behaviour for as long as hadcm3s tasks have been available. Alternatively, it might only be doing this on my machines (though I suspect that's unlikely).

This is not exactly disc-I/O friendly if it's deliberate, so I wonder: has it always been like this, is this a side-effect of them trying to make Linux tasks more crash-proof, or is it a bug?

Any insight appreciated - Al.

Dave Jackson (Volunteer moderator) | Joined: 15 May 09 | Posts: 4532 | Credit: 18,836,565 | RAC: 21,339
Message 61010 - Posted: 26 Sep 2019, 9:54:12 UTC - in response to Message 61008.

Just checked on mine, where it is about every five minutes, but on a much slower machine. I think this is how it has always been for hadcm3s but I don't have any evidence to back that up.

Jim1348 | Joined: 15 Jan 06 | Posts: 637 | Credit: 26,751,529 | RAC: 653
Message 61012 - Posted: 26 Sep 2019, 11:08:17 UTC - in response to Message 61008.

> This is not exactly disc-I/O friendly if it's deliberate, so I wonder: has it always been like this, is this a side-effect of them trying to make Linux tasks more crash-proof, or is it a bug?

I am glad you mentioned writes. I routinely check them in order to properly set up a write cache to protect my SSD. On a Ryzen 2600, running hadcm3s work units on all 12 cores, the writes are about 350 GB/day. I set up a 4 GB write cache in Linux, with 30 minute latency (write-delay).

In case you are interested, for Ubuntu the commands are:

Set write cache to 4 GB/4.5 GB (for 16 GB main memory):

    sudo sysctl vm.dirty_background_bytes=4000000000    # 268435456 default
    sudo sysctl vm.dirty_bytes=4500000000               # 1073741824 x4 default
    sudo sysctl vm.dirty_writeback_centisecs=500        # checks the cache every 5 seconds
    sudo sysctl vm.dirty_expire_centisecs=180000        # page flush 30 min.; 3000 default

Whether this is really necessary to protect a typical SSD I don't know, but I prefer caution.

Jim1348 | Joined: 15 Jan 06 | Posts: 637 | Credit: 26,751,529 | RAC: 653
Message 61026 - Posted: 26 Sep 2019, 22:54:33 UTC - in response to Message 61012.

And I just checked the writes on my Ryzen 3700x, which is running hadcm3s on all 16 of the cores. Since the writes are rather variable, I used iostat 7200, which measures over a two-hour period. The writes were 500 GB/day, too much for me without a write-cache for protection. In fact, my limit is 70 GB/day.

Dave Jackson (Volunteer moderator) | Joined: 15 May 09 | Posts: 4532 | Credit: 18,836,565 | RAC: 21,339
Message 61027 - Posted: 27 Sep 2019, 6:04:51 UTC - in response to Message 61026.

This has been raised with the project. Checkpoints are written every model day. When this model type was introduced, computers were a lot slower and solid state disks either didn't exist or, if they did, cost an arm and a leg even for a 40 GB one. So the problem has sort of crept up on us. I don't know how quick a fix it would be to change checkpoints to every ten days, or even monthly, giving 12 checkpoints per zip file.

alanb1951 | Joined: 31 Aug 04 | Posts: 36 | Credit: 9,581,380 | RAC: 3,853
Message 61029 - Posted: 27 Sep 2019, 7:04:13 UTC - in response to Message 61027.

> This has been raised with the project. Checkpoints are written every model day. When this model type was introduced, computers were a lot slower and solid state disks either didn't exist or, if they did, cost an arm and a leg even for a 40 GB one. So the problem has sort of crept up on us. I don't know how quick a fix it would be to change checkpoints to every ten days, or even monthly, giving 12 checkpoints per zip file.

Dave,

Thanks for this! I hope they can (and do) change this, because I will not be running CPDN on my next system (Ryzen 3700X, I hope) if it's going to be hammering the discs like that whenever I get hadcm3s work units. (Thanks for the numbers and cache tuning stuff, Jim1348!)

I presume we aren't ever going to get back the facility to deselect certain applications; if that's the case, they ought to try to make details like checkpoint frequency as consistent as they can across all applications available on a given platform (at least as far as the most frequent checkpointing is concerned).

I can understand an application ignoring the user's checkpoint guidelines by not checkpointing as often as the user allows, but checkpointing more often ought to be a no-no, as this situation demonstrates! (It was theoretically possible to determine that limit and, if the limit was reasonable, to enable some code in the checkpoint logic saying "how long since the last one? If longer than limit, do another..." I don't think that has changed.)

By the way, do we know what the checkpoint behaviour of HadAM4 and OpenIFS is/will be?

Cheers - Al.
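The "how long since the last one?" check described in the post above can be sketched roughly as follows. This is only an illustration in Python, not CPDN's or BOINC's actual code: `CheckpointThrottle`, `run_model`, `step` and `write_checkpoint` are hypothetical names, and the model is assumed to reach a safe checkpoint opportunity once per model day. BOINC's own API exposes a comparable check (`boinc_time_to_checkpoint()`), which an application has to call for the user's limit to be honoured.

```python
import time


class CheckpointThrottle:
    """Tracks whether the user's minimum checkpoint interval has elapsed."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self.min_interval_s = min_interval_s
        self._clock = clock            # injectable so the logic can be tested
        self._last = clock()           # treat start-up as the last checkpoint

    def due(self):
        # "How long since the last one? If longer than limit, do another."
        return self._clock() - self._last >= self.min_interval_s

    def mark_done(self):
        self._last = self._clock()


def run_model(days, throttle, step, write_checkpoint):
    """Run `days` model days, checkpointing only when the throttle allows it.

    `step(day)` stands in for one model day of computation and
    `write_checkpoint(day)` for the (expensive) disk write.
    Returns the number of checkpoints actually written.
    """
    written = 0
    for day in range(1, days + 1):
        step(day)
        if throttle.due():
            write_checkpoint(day)
            throttle.mark_done()
            written += 1
    return written
```

With a 600-second limit and model days that take about a minute each, a loop like this would write roughly one checkpoint per ten model days instead of one per model day.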

Dave Jackson (Volunteer moderator) | Joined: 15 May 09 | Posts: 4532 | Credit: 18,836,565 | RAC: 21,339
Message 61030 - Posted: 27 Sep 2019, 8:24:00 UTC - in response to Message 61029.

George's email has been responded to, and Sarah is going to look at reducing the checkpointing for future hadcm3s work, though probably too late for the rest of the current batch. From memory, on my slow machine hadam4 checkpoints at intervals of just over 20 minutes, so about 4 minutes on a fast machine. I can't remember from testing about the IFS tasks.

Jean-David Beyer | Joined: 5 Aug 04 | Posts: 1120 | Credit: 17,202,087 | RAC: 2,202
Message 61036 - Posted: 27 Sep 2019, 12:37:09 UTC - in response to Message 61008.

> I don't usually turn on checkpoint debug in BOINC-Manager,

How does one do that?

Dave Jackson (Volunteer moderator) | Joined: 15 May 09 | Posts: 4532 | Credit: 18,836,565 | RAC: 21,339
Message 61037 - Posted: 27 Sep 2019, 13:54:34 UTC

> > I don't usually turn on checkpoint debug in BOINC-Manager,
> How does one do that?

Options > Event log options. It is one of 26 boxes you can check, some of which I actually understand!

geophi (Volunteer moderator) | Joined: 7 Aug 04 | Posts: 2184 | Credit: 64,822,615 | RAC: 5,275
Message 61044 - Posted: 27 Sep 2019, 18:15:45 UTC - in response to Message 61026. Last modified: 27 Sep 2019, 18:16:04 UTC

> And I just checked the writes on my Ryzen 3700x, which is running hadcm3s on all 16 of the cores. Since the writes are rather variable, I used iostat 7200, which measures over a two-hour period. The writes were 500 GB/day, too much for me without a write-cache for protection. In fact, my limit is 70 GB/day.

So with a command line of iostat -m 7200, does the following mean this PC is writing 40 GB every 2 hours?

    avg-cpu:   %user   %nice %system %iowait  %steal   %idle
                0.50   61.89    2.98    3.32    0.00   31.30

    Device         tps  MB_read/s  MB_wrtn/s  MB_read  MB_wrtn
    loop0         0.00       0.00       0.00        0        0
    loop1         0.00       0.00       0.00        0        0
    loop2         0.00       0.00       0.00        0        0
    loop3         0.00       0.00       0.00        0        0
    loop4         0.00       0.00       0.00        0        0
    loop5         0.00       0.00       0.00        0        0
    loop6         0.00       0.00       0.00        0        0
    sda          94.19       0.00       5.58        0    40171

Jim1348 | Joined: 15 Jan 06 | Posts: 637 | Credit: 26,751,529 | RAC: 653
Message 61046 - Posted: 28 Sep 2019, 2:14:28 UTC - in response to Message 61044.

> So with a command line of iostat -m 7200, does the following mean this PC is writing 40 GB every 2 hours?

Yes.
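For comparison with the GB/day figures quoted earlier in the thread, the two-hour reading in Message 61044 can be scaled to a daily rate. The following is a small sketch of that arithmetic; `gb_per_day` is a hypothetical helper, and 1 GB is treated as 1000 MB, as the thread does informally.

```python
SECONDS_PER_DAY = 86_400


def gb_per_day(mb_written, interval_s):
    """Scale the MB written during one iostat interval to GB per day."""
    mb_per_day = mb_written * SECONDS_PER_DAY / interval_s
    return mb_per_day / 1000  # assumption: 1 GB = 1000 MB


# sda line from Message 61044: 40171 MB written in a 7200-second interval
print(round(gb_per_day(40171, 7200)))  # 482
```

That is about 482 GB/day, in the same range as the roughly 500 GB/day reported for the Ryzen 3700x.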

bernard_ivo | Joined: 18 Jul 13 | Posts: 438 | Credit: 25,608,083 | RAC: 5,147
Message 61047 - Posted: 28 Sep 2019, 11:25:29 UTC. Last modified: 28 Sep 2019, 11:27:07 UTC

In my case it seems to be 705 GB in 2 hours (4 WUs only). However, if I use -m 3600 I get almost the same results (not half, as anticipated).

    avg-cpu:   %user   %nice %system %iowait  %steal   %idle
                0.55   49.62    0.37    0.78    0.00   48.67

    Device         tps  MB_read/s  MB_wrtn/s  MB_read  MB_wrtn
    loop0         0.00       0.00       0.00        2        0
    loop1         0.00       0.00       0.00        1        0
    loop2         0.00       0.00       0.00        0        0
    loop3         0.00       0.00       0.00        0        0
    loop4         0.00       0.00       0.00        2        0
    loop5         0.00       0.00       0.00        0        0
    loop6         0.10       0.00       0.00       43        0
    loop7         0.00       0.00       0.00        0        0
    sda          31.17       0.03       1.68    13406   705603
    sdb          71.77       1.15       0.00   483446     1304

Jim1348 | Joined: 15 Jan 06 | Posts: 637 | Credit: 26,751,529 | RAC: 653
Message 61048 - Posted: 28 Sep 2019, 14:45:30 UTC - in response to Message 61047.

I just did a 12-hour test on my Ryzen 3700X, and the result is 247 GB, or 494 GB/day, which is consistent with my earlier 2-hour test.

    jim@Ryzen3700X:~$ iostat -m 43200
    Linux 5.0.0-29-generic (Ryzen3700X)   09/27/2019   _x86_64_   (16 CPU)

    Device         tps  MB_read/s  MB_wrtn/s  MB_read  MB_wrtn
    loop0         0.00       0.00       0.00        0        0
    loop1         0.00       0.00       0.00        0        0
    loop2         0.00       0.00       0.00        0        0
    loop3         0.00       0.00       0.00        0        0
    loop4         0.00       0.00       0.00        0        0
    loop5         0.00       0.00       0.00        0        0
    loop6         0.00       0.00       0.00        0        0
    loop7         0.00       0.00       0.00        0        0
    sda         102.32       0.00       5.72       21   247023
    loop8         0.00       0.00       0.00        0        0
    loop9         0.00       0.00       0.00        0        0
    loop10        0.00       0.00       0.00        0        0
    loop11        0.00       0.00       0.00        0        0
    loop12        0.00       0.00       0.00        0        0
    loop13        0.00       0.00       0.00        0        0
    loop14        0.00       0.00       0.00        0        0

So, bernard_ivo, I don't know why your 1-hour test did not give the expected results. I assume the same number of cores were operating (and not on hold), but if there were any downloads occurring, that would add to the writes also. Try it again; I think it will work as expected eventually.

bernard_ivo | Joined: 18 Jul 13 | Posts: 438 | Credit: 25,608,083 | RAC: 5,147
Message 61050 - Posted: 28 Sep 2019, 15:53:36 UTC - in response to Message 61048.

Thanks Jim. Any downloads are on the sdb drive. I have just restarted the machine to test. Am I using iostat correctly, though? I suppose that to measure the last hour I need to let the machine run for at least an hour and then run the command with -m 3600. I read the man page but am still not sure whether I understand it correctly. And why, after executing the command, do I need to stop it to get back to the command line?

Jean-David Beyer | Joined: 5 Aug 04 | Posts: 1120 | Credit: 17,202,087 | RAC: 2,202
Message 61051 - Posted: 28 Sep 2019, 15:59:54 UTC - in response to Message 61037.

Does not exist for me. Running 7.2.33, which is the latest one supported for my Linux distro.

Jean-David Beyer | Joined: 5 Aug 04 | Posts: 1120 | Credit: 17,202,087 | RAC: 2,202
Message 61052 - Posted: 28 Sep 2019, 16:10:44 UTC - in response to Message 61044.

On my machine, /dev/sdd is the one with the boinc partition on it. There is also a partition there with some videos; I watch them sometimes, but rarely write to them. /dev/sde is a removable hard drive for backups that is currently plugged in and mounted.

I have a 4-core processor that has been running two CPDN tasks for about 3 days, plus one Rosetta and one WCG task.

    $ iostat -m 7200
    Linux 2.6.32-754.23.1.el6.x86_64 (DellT7600.localdomain)   09/28/2019   _x86_64_   (4 CPU)

    avg-cpu:   %user   %nice %system %iowait  %steal   %idle
                5.27   93.56    1.02    0.05    0.00    0.09

    Device:        tps  MB_read/s  MB_wrtn/s  MB_read  MB_wrtn
    sdd           9.10       0.01       0.38     5052   154425
    sdb           9.69       0.33       0.05   133111    19124
    sda           0.00       0.00       0.00        2        0
    sdc           0.02       0.00       0.00      538       79
    sde           3.09       0.00       0.31      205   126202

Dave Jackson (Volunteer moderator) | Joined: 15 May 09 | Posts: 4532 | Credit: 18,836,565 | RAC: 21,339
Message 61053 - Posted: 28 Sep 2019, 17:19:53 UTC - in response to Message 61051.

> Does not exist for me. Running 7.2.33, which is the latest one supported for my Linux distro.

The actual package is sysstat, which includes the iostat command.

Jim1348 | Joined: 15 Jan 06 | Posts: 637 | Credit: 26,751,529 | RAC: 653
Message 61054 - Posted: 28 Sep 2019, 17:24:36 UTC - in response to Message 61050. Last modified: 28 Sep 2019, 17:27:49 UTC

> Am I using iostat correctly, though? I suppose that to measure the last hour I need to let the machine run for at least an hour and then run the command with -m 3600. I read the man page but am still not sure whether I understand it correctly. And why, after executing the command, do I need to stop it to get back to the command line?

It looks good to me. If you want another command line, you can just open another window and let iostat keep running. And if you have read the manual, you know more about it than I do. I have just used it a lot, but am not an expert.

PS - Yes, I neglected to mention that you have to install sysstat first to use iostat.

    sudo apt install sysstat

bernard_ivo | Joined: 18 Jul 13 | Posts: 438 | Credit: 25,608,083 | RAC: 5,147
Message 61055 - Posted: 28 Sep 2019, 17:47:09 UTC - in response to Message 61054. Last modified: 28 Sep 2019, 17:47:32 UTC

> It looks good to me. If you want another command line, you can just open another window and let iostat keep running. And if you have read the manual, you know more about it than I do. I have just used it a lot, but am not an expert.

I just wondered why it does not exit after executing, unless there is a reason to keep it running. It seems that after each given interval (e.g. 10 for 10 s) it generates a report for that interval. So after letting it run with 3600 (1 h) I will get a second report with the data read/written within that hour. The first report (on executing the command) seems to be all data since boot, hence my over 700 GB was not for 2 h but since last boot.

Jim1348 | Joined: 15 Jan 06 | Posts: 637 | Credit: 26,751,529 | RAC: 653
Message 61056 - Posted: 28 Sep 2019, 18:16:35 UTC - in response to Message 61055.

> The first report (on executing the command) seems to be all data since boot, hence my over 700 GB was not for 2 h but since last boot.

That is my interpretation too. I ignore the first value, since I want the timed value. Thanks for pointing this out.
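One rough way to tell a timed report from the since-boot report, using only numbers already posted in this thread: in a timed report, MB_wrtn should be close to MB_wrtn/s multiplied by the interval. The sketch below applies that check; `looks_like_interval_report` and its 5% tolerance are assumptions made for illustration, not part of iostat, and a since-boot report could pass by coincidence if the uptime happens to be close to the interval.

```python
def looks_like_interval_report(mb_per_s, mb_total, interval_s, tolerance=0.05):
    """Return True if mb_total is roughly mb_per_s * interval_s.

    The tolerance absorbs the rounding of the MB_wrtn/s column to two decimals.
    """
    expected = mb_per_s * interval_s
    if expected == 0:
        return mb_total == 0
    return abs(mb_total - expected) / expected <= tolerance


# (MB_wrtn/s, MB_wrtn, interval in seconds) taken from the posts above
readings = {
    "61044 sda": (5.58, 40171, 7200),
    "61047 sda": (1.68, 705603, 7200),
    "61048 sda": (5.72, 247023, 43200),
    "61052 sdd": (0.38, 154425, 7200),
}

for label, (rate, total, interval) in readings.items():
    kind = "timed" if looks_like_interval_report(rate, total, interval) else "since boot?"
    print(label, kind)
```

By this check the readings in Messages 61044 and 61048 behave like timed reports, while those in Messages 61047 and 61052 look like since-boot totals, which agrees with the interpretation reached above.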