Bringing down the error rate

Message boards : Number crunching : Bringing down the error rate

Jim1348

Joined: 15 Jan 06
Posts: 637
Credit: 26,751,529
RAC: 653
Message 51726 - Posted: 30 Mar 2015, 18:34:33 UTC

I am now using the above-noted write cache (PrimoCache by Romex Software) on my i7-4790 machine, and it is set to run the HadCM3 shorts and HadAM3P-HadRM3P Pacific North Wests on two cores. I tried them initially on another machine with a service install of BOINC, but they all errored out (or got stuck in a loop) after a few seconds, so it is back to a non-service install.

The advantage of a write cache over a ramdisk is that you don't have to worry so much about the amount of main memory. I have 16 GB of DDR3 on all my dedicated machines, and that is enough for an 11 or 12 GB ramdisk, but some of the projects (e.g., ATLAS, and even some of the large CEP2 jobs) can eat up even more. But a write cache can handle them with half that amount of memory. For CPDN, I think that using a 4 GB write cache with 8 GB of main memory is probably enough to support 6 to 8 cores with no problems. You could add a read-cache also, but that is not really necessary with an SSD due to its high read speed, and just uses up cache space.

At any rate, in case anyone is interested, my parameters for the PrimoCache (with 16 GB main memory) are:
Level-1 Cache: 8192 MB
Level-2 Cache: Disabled

Block Size: 4 KB (defaults to the SSD cluster size)
Strategy: Write-data Only
Defer-Write: Infinite

The "Infinite" deferred write simply means that the writes will go into the cache until it fills up, and then it does an "urgent write" to transfer the oldest data to the disk, which is an SSD in my case but could be a hard drive. You could set the deferred write delay to less than infinite; maybe an hour or two for a 4 GB cache. In that case, the writes would go initially to the cache, and then after an hour anything still left in the cache would be written to the disk drive in a "normal write". However, with a large enough cache size, the cache will not need to write to the disk at all, since the old data will be erased when a new job starts. That is my case, and so I normally never see any normal or urgent writes. Therefore, the cache essentially behaves the same way as a ramdisk, except that the possibility of overflowing to the disk drive is there if you need it, so you don't normally run out of space. I use it to protect my SSD from the high writes of my various BOINC projects, and the low error rates on CPDN are a bonus.
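For anyone who wants to see the deferred-write behaviour in concrete terms, here is a small toy model in Python. It is only a sketch of the idea described above, not PrimoCache's actual algorithm: the class name `DeferredWriteCache`, its parameters, and the `discard` step (standing in for old job data being erased) are all made up for illustration, and it counts blocks rather than bytes.

```python
from collections import OrderedDict


class DeferredWriteCache:
    """Toy write cache with deferred writes (illustrative only, not PrimoCache)."""

    def __init__(self, capacity_blocks, defer_seconds=None):
        self.capacity = capacity_blocks
        self.defer = defer_seconds      # None models the "Infinite" setting
        self.dirty = OrderedDict()      # block id -> time of last write, oldest first
        self.urgent_writes = 0          # flushes forced by a full cache
        self.normal_writes = 0          # flushes triggered by the defer delay

    def write(self, block, now):
        """Record a write to `block` at time `now` (seconds, non-decreasing)."""
        self._flush_expired(now)
        if block in self.dirty:
            # Rewriting a cached block just refreshes it; nothing reaches the disk.
            self.dirty.pop(block)
        elif len(self.dirty) >= self.capacity:
            # Cache is full: push the oldest block to disk as an "urgent write".
            self.dirty.popitem(last=False)
            self.urgent_writes += 1
        self.dirty[block] = now

    def discard(self, blocks):
        """Assumed model of old job data being erased before it was ever flushed."""
        for block in blocks:
            self.dirty.pop(block, None)

    def _flush_expired(self, now):
        """Flush blocks older than the defer delay as "normal writes"."""
        if self.defer is None:
            return
        while self.dirty:
            block, written_at = next(iter(self.dirty.items()))
            if now - written_at < self.defer:
                break
            self.dirty.popitem(last=False)
            self.normal_writes += 1


# A cache big enough for the whole job, with "Infinite" defer:
cache = DeferredWriteCache(capacity_blocks=4)
for t, block in enumerate(["a", "b", "c"]):
    cache.write(block, now=t)
cache.discard(["a", "b", "c"])          # job finished, its data is erased
print(cache.urgent_writes, cache.normal_writes)
```

With the cache larger than the job's working set, both counters stay at zero, which matches the "never see any normal or urgent writes" case. Shrinking `capacity_blocks` or passing a finite `defer_seconds` makes the urgent or normal writes show up.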

Here is what it looks like, with the PrimoCache being installed on 21 March:
http://climateapps2.oerc.ox.ac.uk/cpdnboinc/results.php?hostid=1349694

By the way, write caches are difficult to implement properly at the operating system level, since some operating system writes have to go to the disk drive immediately to prevent corruption in case of crashes or putting the machine to sleep, etc. I have found this one to be stable, but that will depend on your hardware, etc. It helps to have a backup power supply (UPS) here also, though recovery after an outage is usually easier than with a ramdisk.