climateprediction.net (CPDN) home page
Thread 'Had to abort -- computer almost frozen'

Marathon

Joined: 13 May 22
Posts: 5
Credit: 11,859
RAC: 809
Message 70936 - Posted: 7 Jun 2024, 16:31:07 UTC

Linux: Ubuntu 22.04 on desktop.

I joined CPDN months ago. Yesterday, I finally got some work units. Unfortunately, after they had been running for between 10 minutes and half an hour, my computer became unusable. Everything I did took a very, very long time: toggling the "Num Lock" or "Caps Lock" state took several minutes, and so did moving the mouse pointer even an inch.

This prevented me from accessing the BOINC client or any other running programs, including the system monitor. If I wanted to use my computer, the only thing I could do was force a reboot.

I really would like to contribute to CPDN. But, it looks like, for some reason, my computer can't handle it.
ID: 70936
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,571,754
RAC: 15,537
Message 70937 - Posted: 7 Jun 2024, 16:57:08 UTC - in response to Message 70936.  
Last modified: 7 Jun 2024, 17:01:59 UTC

Sounds like your PC is swapping. That tells me boinc has started up too many of the CPDN tasks. Unfortunately there is a bug in the boinc client code: it doesn't compute the total memory required by the tasks correctly, so it starts more tasks than the available memory can support. These tasks use up to about 3 GB of RAM each, so if too many get started, the PC runs out of RAM and starts swapping.
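To put rough numbers on that, here is a minimal Python sketch. The 3 GB per task figure is the one quoted above; the 16 GB of RAM and the task counts are made-up illustrative values, and `memory_shortfall_gb` is just a hypothetical helper name, not anything from boinc.

```python
def memory_shortfall_gb(running_tasks: int, ram_gb: float, per_task_gb: float = 3.0) -> float:
    """Return how many GB the running tasks need beyond physical RAM (0.0 if they fit)."""
    needed_gb = running_tasks * per_task_gb
    return max(0.0, needed_gb - ram_gb)


# Illustrative numbers only: 8 tasks started at once on a 16 GB machine.
print(memory_shortfall_gb(running_tasks=8, ram_gb=16))  # 8 * 3 - 16 = 8.0 GB short
print(memory_shortfall_gb(running_tasks=2, ram_gb=16))  # 2 * 3 = 6 GB fits, so 0.0
```

Anything above zero has to come out of swap, and this estimate ignores the memory used by the desktop and other programs, so the real shortfall would be larger.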

How much RAM does your machine have?

The workaround is to create a small file in the CPDN project directory. This will tell boinc the maximum number of tasks it can run for CPDN.

Assuming boinc is installed in the usual place: /var/lib/boinc.

1/ cd /var/lib/boinc/projects/climateprediction.net
2/ Using your favourite text editor, create a file called: app_config.xml with these contents:

<app_config>
   <project_max_concurrent>2</project_max_concurrent>
</app_config>

3/ Restart boinc for these changes to be read.

With the value '2', boinc will run at most two CPDN tasks at a time. You can change it to whatever suits the amount of memory you have.
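One possible way to pick that number, sketched in Python. This is only a rule of thumb under stated assumptions: the 3 GB per task comes from the figure above, while the 16 GB machine and the 8 GB set aside for the desktop and other projects are guesses you should replace with your own. Both function names are hypothetical.

```python
def tasks_that_fit(ram_gb: float, reserved_gb: float, per_task_gb: float = 3.0) -> int:
    """Whole number of tasks that fit once reserved_gb is set aside for everything else."""
    usable_gb = max(0.0, ram_gb - reserved_gb)
    return int(usable_gb // per_task_gb)


def render_app_config(max_concurrent: int) -> str:
    """Return the app_config.xml text from the steps above with the chosen limit filled in."""
    return (
        "<app_config>\n"
        f"   <project_max_concurrent>{max_concurrent}</project_max_concurrent>\n"
        "</app_config>\n"
    )


# Assumed example: 16 GB of RAM, 8 GB kept back for the desktop and other work.
limit = max(1, tasks_that_fit(ram_gb=16, reserved_gb=8))  # (16 - 8) // 3 = 2
print(render_app_config(limit))
```

The sketch only prints the text; saving it as app_config.xml in the project directory and restarting boinc is still done by hand as in the steps above.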

The app_config.xml file can do more than this. If you wanted, you could also set limits for each individual app or app version.
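As a rough illustration of per-app control, the Python sketch below builds an app_config.xml that uses an `<app>` block with `<name>` and `<max_concurrent>`, which are elements described in the BOINC client configuration documentation. The name "example_app" is a placeholder, not a real CPDN application name; the real name has to match what the project calls the app. `render_per_app_config` is a hypothetical helper.

```python
from typing import Mapping
from xml.sax.saxutils import escape


def render_per_app_config(limits: Mapping[str, int], project_max: int) -> str:
    """Build app_config.xml text: one <app> block per application plus a project-wide cap."""
    lines = ["<app_config>"]
    for app_name, max_concurrent in limits.items():
        lines += [
            "   <app>",
            f"      <name>{escape(app_name)}</name>",  # escape &, < and > in the name
            f"      <max_concurrent>{max_concurrent}</max_concurrent>",
            "   </app>",
        ]
    lines.append(f"   <project_max_concurrent>{project_max}</project_max_concurrent>")
    lines.append("</app_config>")
    return "\n".join(lines) + "\n"


# "example_app" is a placeholder; substitute the app name the project actually uses.
print(render_per_app_config({"example_app": 1}, project_max=2))
```

Here the placeholder app is limited to one running task while the project as a whole may still run two.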

There are a couple of boincmgr settings to check too.

Under 'Computing Preferences', 'Computing' tab, where it says 'Use at most:', make sure the percentage value is not higher than 50% (assuming a CPU with two threads per core). This is because boinc counts 'threads' as 'CPUs' rather than cores. But the CPDN apps do a lot of floating point calculation, and there is only one set of floating point units per core. Trying to use all the threads just slows the tasks down, as they have to share the floating point units.

Also, under 'General' on the same tab, make sure 'Leave non-GPU tasks in memory while suspended' is ticked. If this is left unticked, the tasks will restart from their last checkpoint every time they are suspended and resumed, which makes them take longer to finish.
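The 50% figure is simply cores divided by threads. A small Python sketch of that arithmetic, with a hypothetical function name and a 4-core, 8-thread CPU assumed for the example:

```python
def use_at_most_percent(physical_cores: int, logical_cpus: int) -> float:
    """Percentage of boinc 'CPUs' (threads) that gives one task per physical core."""
    if physical_cores <= 0 or logical_cpus < physical_cores:
        raise ValueError("expected 0 < physical_cores <= logical_cpus")
    return 100.0 * physical_cores / logical_cpus


# Assumed example: 4 physical cores exposed as 8 threads.
print(use_at_most_percent(4, 8))  # 50.0
```

On a CPU without hyperthreading/SMT the two counts are equal and the result is 100%, so the 50% rule only applies when there are two threads per core.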

Hope that helps.
---
CPDN Visiting Scientist
ID: 70937
Marathon

Joined: 13 May 22
Posts: 5
Credit: 11,859
RAC: 809
Message 70948 - Posted: 7 Jun 2024, 23:38:23 UTC - in response to Message 70937.  

Thanks! I'm giving it a try.

I'm also running World Community Grid. The changes needed to run CPDN will mean fewer WCG tasks running at a time.

To answer your question: My machine has 15 GB of memory and an "8"-core CPU (four physical cores, eight threads).
ID: 70948
Glenn Carver

Joined: 29 Oct 17
Posts: 1049
Credit: 16,571,754
RAC: 15,537
Message 70969 - Posted: 8 Jun 2024, 19:29:54 UTC - in response to Message 70948.  

I'm also running World Community Grid. The changes needed to run CPDN will mean fewer WCG tasks running at a time.
That may be, but it's the throughput that'll improve, i.e. how many tasks you can complete per day. Running one task per thread rather than one per core actually gives lower throughput for most (not all) projects.
---
CPDN Visiting Scientist
ID: 70969
Dave Jackson
Volunteer moderator

Joined: 15 May 09
Posts: 4540
Credit: 19,039,635
RAC: 18,944
Message 70972 - Posted: 9 Jun 2024, 7:12:50 UTC

Running one task per thread, not core, actually gives a lower throughput for most (not all) projects.


For what it is worth, in the past with CPDN I have had maximum throughput on an 8 core, 16 thread Ryzen using 7 real cores. With WCG that goes up to 8 real cores, unless the tasks are Africa Rain Forest, in which case it goes back to 7. With OIFS tasks, I would not run more than 2 with only 15 GB of RAM. I have never used an app_config file, but with one you can limit how many tasks a specific project will run, or even a specific type of task. Somewhere on the BOINC site there is a page giving all the options you can use.
ID: 70972


©2024 cpdn.org