Thread 'High Disc IO'

Questions and Answers : Unix/Linux : High Disc IO

ruffieux

Joined: 21 Oct 07
Posts: 8
Credit: 12,864,838
RAC: 399
Message 67024 - Posted: 24 Dec 2022, 10:09:05 UTC
Last modified: 24 Dec 2022, 10:11:45 UTC

Hi,
I have been running climateprediction on my Ubuntu Linux box for many years without any issues.

However, for the past few months I have seen model.exe create such high I/O that it locks up the entire box, even though memory and CPU values are OK. With iotop I can clearly see that it is model.exe that creates the high disk I/O. This is confirmed by the disk LED on the computer itself, which is constantly on.
Pausing climateprediction immediately solves the problem.

The Linux box has 8 cores and runs a maximum of 7 climateprediction processes at a time. There is also plenty of disk space available.

Running Einstein or Rosetta units with similar package sizes does not create such problems.

Does anyone else have such problems?

Thanks

Heinz
ID: 67024
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 67025 - Posted: 24 Dec 2022, 11:37:07 UTC - in response to Message 67024.  
Last modified: 24 Dec 2022, 11:37:44 UTC

This has been noted in the OIFS discussion thread in Number Crunching. It might be worth looking at this post for details of how to reduce the checkpointing frequency. Unless you are on a very slow hard disk there will only be a marginal improvement in how quickly tasks finish, but it will reduce the load on the hard disk. It may also be helpful if you have an SSD, which will eventually suffer from the number of write cycles.
ID: 67025
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 67034 - Posted: 24 Dec 2022, 21:23:16 UTC - in response to Message 67024.  

I have been running climateprediction on my Ubuntu Linux box for many years without any issues.

However, for the past few months I have seen model.exe create such high I/O that it locks up the entire box, even though memory and CPU values are OK. With iotop I can clearly see that it is model.exe that creates the high disk I/O. This is confirmed by the disk LED on the computer itself, which is constantly on.


I am running climateprediction on my Red Hat Enterprise Linux 8.6 box. It is currently running five OIFS tasks at a time and does not seem to be overworking the spinning 5400 rpm SATA hard drive where the BOINC file system partition resides. For one thing, the disk light blinks briefly rather often, but it is by no means constantly on. iotop has this to say:
Total DISK READ :	0.00 B/s | Total DISK WRITE :       2.24 M/s
Actual DISK READ:	0.00 B/s | Actual DISK WRITE:      81.60 M/s
    TID  PRIO  USER     DISK READ DISK WRITE>    COMMAND                                                                                    
 760242 idle boinc	 0.00 B/s    2.21 M/s ../../projects/climateprediction.net/oifs_43r3~50100 hq0f 0257 971 12187901 123 oifs_43r3_ps 1
  19361 idle boinc	 0.00 B/s   17.55 K/s boinc
 767462 idle boinc	 0.00 B/s 1633.50 B/s oifs_43r3_model.exe
 760402 idle boinc	 0.00 B/s 1225.12 B/s oifs_43r3_model.exe
 729733 idle boinc	 0.00 B/s  816.75 B/s oifs_43r3_model.exe
 760499 idle boinc	 0.00 B/s  816.75 B/s oifs_43r3_model.exe
 729730 idle boinc	 0.00 B/s  408.37 B/s ../../projects/climateprediction.net/oifs_43r3~50100 hq0f 0438 967 12184082 123 oifs_43r3_ps 1
 760397 idle boinc	 0.00 B/s  408.37 B/s ../../projects/climateprediction.net/oifs_43r3~50100 hq0f 0675 971 12188319 123 oifs_43r3_ps 1
 760488 idle boinc	 0.00 B/s  408.37 B/s ../../projects/climateprediction.net/oifs_43r3~50100 hq0f 0514 975 12192158 123 oifs_43r3_ps 1
 767457 idle boinc	 0.00 B/s  408.37 B/s ../../projects/climateprediction.net/oifs_43r3~50100 hq0f 0519 976 12193163 123 oifs_43r3_ps 1
 771819 idle boinc	 0.00 B/s  408.37 B/s ../../projects/www.worldcommunitygrid.org/wcgr~193773_5349.txt -DatabaseFile dataset-sarc1.txt
 773164 idle boinc	 0.00 B/s  408.37 B/s ../../projects/einstein.phys.uwm.edu/hsgamma_F~match 0.15 --debug 0 --debugCommandLineMangling
 773277 idle boinc	 0.00 B/s  408.37 B/s ../../projects/milkyway.cs.rpi.edu_milkyway/mi~.709531 326.439 8.36517 0.119634 3.0548 6.23573


That does not look like a lot of disk output to me.

What bothers some users is the high rate of "trickles", which are about 14 megabytes each; my machine generates one every s seconds or so for each of these 5 processes. (And, for the last few hours, the server to which they are to be sent has not been responding. But that is another story and does not have much to do with disk writing speed.)
ID: 67034
AndreyOR

Joined: 12 Apr 21
Posts: 317
Credit: 14,915,528
RAC: 15,795
Message 67036 - Posted: 25 Dec 2022, 3:29:52 UTC - in response to Message 67024.  

ruffieux,
OIFS does have high I/O. However, it seems like one issue you might be having is running too many tasks for the amount of RAM you have. It looks like your PC has 32GB RAM and OIFS tasks require up to 7GB per task (although I believe the current release of tasks requires 5GB per task). With 7 tasks you'd need 35GB of RAM plus overhead, so I'd guess that your system locks up due to running out of RAM. Additionally, the majority of your tasks seem to fail, and Glenn has mentioned that pushing the system memory limits increases the chance of task failures. I'd suggest trying to run about 4 concurrent OIFS tasks and seeing if your system behaves better and your success rate goes up. I'm experimenting with this myself on an older system that only has 16GB RAM.
ID: 67036
Jean-David Beyer

Joined: 5 Aug 04
Posts: 1120
Credit: 17,202,915
RAC: 2,154
Message 67037 - Posted: 25 Dec 2022, 3:58:03 UTC - in response to Message 67036.  

It looks like your PC has 32GB RAM and OIFS tasks require up to 7GB per task (although I believe the current release of tasks requires 5GB per task). With 7 tasks you'd need 35GB of RAM plus overhead, so I'd guess that your system locks up due to running out of RAM.


Very good point. If you are short of RAM, the memory management function of the Linux kernel will page out the least recently used parts of RAM to disk to make room for a running process that needs more RAM. Then, when a process that has had something paged out to disk needs it again, the kernel has to page it back in from disk. That paging can certainly use a lot of disk I/O time. It can even make the system seem locked up when it is actually working correctly, just waiting for all that paging.
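
A quick way to check whether that paging is actually happening is to watch swap usage with the standard Linux tools while the tasks are running, for example:

free -h     # the Swap: line shows how much swap space is currently in use
vmstat 1    # non-zero values in the si/so columns mean pages are being swapped in/out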
ID: 67037
Landjunge

Joined: 17 Aug 07
Posts: 8
Credit: 37,254,652
RAC: 11,568
Message 67038 - Posted: 25 Dec 2022, 7:25:14 UTC - in response to Message 67024.  

Hello ruffieux,

the new OpenIFS tasks need 4.4GB per WU. As your host only has 32GB of RAM, I would try a maximum of 6 tasks in parallel; otherwise the machine will swap very often (disk IO).
ID: 67038
ruffieux

Joined: 21 Oct 07
Posts: 8
Credit: 12,864,838
RAC: 399
Message 67041 - Posted: 25 Dec 2022, 8:41:21 UTC - in response to Message 67037.  

Thanks guys,

Next time I get a batch of work units, I will check memory usage and swapping again. I set the memory usage to 50% when the machine is in use and 90% when it is not. Maybe the latter was too high, because I did indeed have most of the problems when logging in.

Let's see. I will report again.

Many thanks

Heinz
ID: 67041
wujj123456

Joined: 14 Sep 08
Posts: 127
Credit: 42,311,144
RAC: 73,813
Message 67165 - Posted: 31 Dec 2022, 9:22:15 UTC - in response to Message 67041.  
Last modified: 31 Dec 2022, 10:00:03 UTC

90% might be too much even when it's idling, but that does depend on what other services you have. I also had an Ubuntu 32GB machine and occasionally saw OOM kills when I set it to 90%. The same host has been doing fine for months after I reduced it to 85%. Another thing to double-check is whether you've unselected "Leave non-GPU tasks in memory while suspended". From my observation, if that option is on, then even if the BOINC client suspends tasks when the computer is in use, it won't free up memory from the suspended tasks. This honestly feels like a bug: if task suspension is due to memory usage, the BOINC client should free up the memory regardless of whether that option is selected. Betting on swap to handle them will likely cause exactly the bad experience BOINC's memory management is trying to avoid. However, that may not be a great option for CPDN, so you might have to use the same memory limit whether the system is idle or not.

Even if all the settings above are what you intended, I believe you will still have problems with this OpenIFS workload. Though the wiki here about BOINC memory management says "proposed", the behavior it describes matches what I observed on my system. The key part is that the BOINC client doesn't schedule tasks based on the worst-case memory bound specified by each application; that threshold is only used to abort tasks. Instead, it monitors the active working set every few seconds, smooths the measurement, and uses that for scheduling. That's a great optimization for workloads whose memory usage is stable per task but varies across tasks. However, if a task's memory usage fluctuates frequently and wildly throughout its lifetime, this algorithm will systematically underestimate the true memory demand. When spikes arrive and the available memory on the system is already at the edge, the system will start swapping or even get applications OOM-killed.

I've observed OpenIFS tasks' memory usage easily fluctuate by 1GB+. You can just watch top or any other monitoring application and see that happen multiple times a minute. You can also extract the smoothed working-set size through boinccmd --get_tasks; it is usually around 3.5GB. That's quite far from the 4.5GB peak I saw on almost all tasks, and they reach above 4GB pretty frequently. Thus BOINC will run too many tasks and regularly overshoot your specified memory usage.
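
If you want to see those smoothed figures yourself (assuming boinccmd is installed and can talk to the local client; the exact field label may differ between client versions), something like

boinccmd --get_tasks | grep -E "name:|working set size"

lists each task together with the working-set size the client is scheduling against.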

I only see two ways to counter this. 1) Use app_config to strictly limit the number of concurrent OpenIFS tasks, and plan on 4.5-5GB of memory per task when you calculate that number. 2) Dial down the memory BOINC is allowed to use by 20% (relative) or more compared to what you set now, so the actual memory usage ends up at what you intend to allocate. For example, if you are fine with BOINC taking up 50% of memory, the actual setting you put in should be less than 40%.

I ended up going with 1), because 2) restricts all other projects even if their memory usage is more stable and predictable by BOINC client.
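
For anyone going the app_config route, a minimal sketch looks like the following; it goes in the climateprediction.net project directory under the BOINC data directory (e.g. /var/lib/boinc/projects/climateprediction.net/app_config.xml). The app name below is an assumption based on the task names earlier in this thread - check the app entries in client_state.xml for the exact name your client uses, and pick max_concurrent to fit your RAM (e.g. 5 tasks at ~5GB each on a 32GB host leaves some headroom).

<app_config>
  <app>
    <name>oifs_43r3_ps</name>
    <max_concurrent>5</max_concurrent>
  </app>
</app_config>

After saving the file, restart the client or use boinccmd --read_cc_config (or "Options -> Read config files" in BOINC Manager) so it picks up the change.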
ID: 67165
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,551,831
RAC: 17,001
Message 67325 - Posted: 4 Jan 2023, 18:09:07 UTC - in response to Message 67041.  

Just seen this thread. I only have a few extra points to add to the great advice here already. It is pretty clear your machine started swapping memory to disk if it locked up, due to too many OpenIFS tasks running at once.

I have asked if CPDN can implement a 'max number of tasks' option on the Project Preferences page, which would avoid users having to create app_config.xml files. The problem is they have their hands full right now, so I can't say when they will get to it.

Check the value of 'rsc_memory_bound' in the client_state.xml file, which you can find in /var/lib/boinc (or wherever you installed it), e.g.
grep rsc_memory_bound /var/lib/boinc/client_state.xml

The value(s) returned are in bytes. Allow at least that much RAM for each OpenIFS task.
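
If you prefer that figure in gigabytes, a one-liner along these lines works, assuming the usual one-tag-per-line layout of client_state.xml:

grep rsc_memory_bound /var/lib/boinc/client_state.xml | awk -F'[<>]' '{printf "%.1f GB\n", $3/1024/1024/1024}'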

For best memory use and throughput I recommend running at most one fewer OpenIFS task than the number of cores (not threads) you have, e.g. on a 4c/8t machine, run 3 tasks max.
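
If you are not sure how many physical cores the machine has, lscpu (available on most Linux distributions) reports cores and threads separately:

lscpu | grep -E 'Socket|Core|Thread'

Multiply 'Core(s) per socket' by 'Socket(s)' to get the physical core count.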

Use the vmstat command to check if the machine is swapping at all:
vmstat 1   # that's a one, not an 'ell'

and look at the si & so columns (swap pages in & out respectively). Avoid swapping at all.

Regarding the I/O from the task(s), this will of course depend on how many you run at once. I have been looking at this in more detail; the checkpoint files are one aspect that has been mentioned, but they are only produced every 24 model steps. For future batches I will turn down the text logging output from the model, which we use to check errors - that will help.

ID: 67325
