climateprediction.net
"CPDN process is not running" error

Profile geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 30828 - Posted: 6 Oct 2007, 3:27:06 UTC
Last modified: 6 Oct 2007, 3:31:36 UTC

I'm running a hadcm3 5.40 model; it's in 2032 and the temperature trace looks unremarkable.

http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6447415

It's been trickling regularly every 12 hours, give or take a minute. I checked this evening and it trickled after 12 hrs and 10 minutes. Looking at stderr.txt in the slots directory for that model, it has 7 sequences of these two lines:

CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1160, iMonCtr=1
Model crash detected, will try to restart...

widely interspersed among generally benign logging. The latest one was right before the last trickle, indicating a short rewind and explaining the longer trickle duration. There was also another error a while back, apparently:

Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields

Anyway, other than those occasional restarts, it's been moving along fine.
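For anyone who wants to count these restart sequences rather than eyeball them, here is a minimal Python sketch. It assumes the two lines always appear back to back in exactly the form quoted above; the function name `find_restart_sequences` and the "benign logging" sample lines are made up for illustration and are not part of CPDN or BOINC.

```python
import re

# Pattern for the monitor line quoted in the post; the field names come from that line.
MONITOR_RE = re.compile(
    r"CPDN process is not running, exiting, "
    r"bRetVal = (?P<bRetVal>-?\d+), checkPID=(?P<checkPID>\d+), "
    r"selfPID=(?P<selfPID>\d+), iMonCtr=(?P<iMonCtr>\d+)"
)
RESTART_TEXT = "Model crash detected, will try to restart"


def find_restart_sequences(lines):
    """Return one dict per 'not running' line that is directly followed by a restart line."""
    events = []
    previous = None  # parsed fields of the preceding line, if it was a monitor line
    for number, line in enumerate(lines, start=1):
        if previous is not None and RESTART_TEXT in line:
            events.append(previous)
        match = MONITOR_RE.search(line)
        previous = None
        if match:
            previous = {key: int(value) for key, value in match.groupdict().items()}
            previous["line"] = number  # 1-based line number of the monitor line
    return events


# Hypothetical excerpt: only the two middle lines are taken from the post.
sample = [
    "some benign logging",
    "CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1160, iMonCtr=1",
    "Model crash detected, will try to restart...",
    "more benign logging",
]
events = find_restart_sequences(sample)
print(len(events), events[0]["selfPID"])  # prints: 1 1160
```

To run it against a real log, pass an open file object for the model's stderr.txt instead of `sample`; the recorded line numbers make it easy to see what was logged just before each restart.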

I'm wondering what the "CPDN process is not running" error really is. The computer was doing nothing other than crunching at the time.

You'll notice in the trickle listing for this model that it has also occasionally lost CPU time, so it is showing something like 0.79 s/TS now, even though it is really doing about 1.65 s/TS. I've seen this happen before, maybe once in 15 models, but it's done it several times on this one.
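As a rough check on how much CPU time has gone missing, here is a small Python sketch. It assumes the displayed s/TS figure is simply accumulated CPU seconds divided by timesteps completed, which the thread does not confirm; the function names are made up for illustration.

```python
def seconds_per_timestep(cpu_seconds, timesteps):
    """Assumed definition of s/TS: accumulated CPU seconds over timesteps completed."""
    return cpu_seconds / timesteps


def fraction_cpu_time_missing(reported_s_per_ts, actual_s_per_ts):
    """Share of the real CPU time that is absent from the reported total.

    Both rates cover the same number of timesteps, so the ratio of the rates
    equals the ratio of reported to actual CPU time.
    """
    return 1.0 - reported_s_per_ts / actual_s_per_ts


missing = fraction_cpu_time_missing(0.79, 1.65)
print(f"{missing:.0%}")  # prints: 52%
```

Under that assumption, roughly half of the CPU time actually spent on the model is not reflected in the trickle listing.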
Profile geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 30858 - Posted: 7 Oct 2007, 16:27:04 UTC
Last modified: 7 Oct 2007, 23:40:01 UTC

Well, I'll assume, until proven otherwise, that it is a memory instability. At the time of writing the first post, I had run memtest86+ for 4 hours without an error. But after posting, I ran Prime95 and it bombed out after 5 hours.

I changed the processor, I changed the memory, I changed the power supply, and the same thing would happen: Prime95 would bomb out in 30 minutes to 5 hours. I was about ready to give up and replace the motherboard when I thought, hmmm, why don't I switch memory slots? The two memory modules had been in DIMM slots 3 and 4, where they were stable when I put together this system about 6 months ago. Putting the modules in DIMM slots 1 and 2 solved the problem: it was Prime95 stable for 24 hours.

Lesson learned. Hopefully this model will now run to completion without any more oddities.
Profile MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 30867 - Posted: 7 Oct 2007, 21:47:17 UTC


Unusual that Prime95 found the memory error and MemTest86 didn't... perhaps it was also related to system temperatures or something?

I'm a volunteer and my views are my own.
Profile geophi
Volunteer moderator

Joined: 7 Aug 04
Posts: 2184
Credit: 64,822,615
RAC: 5,275
Message 30879 - Posted: 7 Oct 2007, 23:39:33 UTC - in response to Message 30867.  


Unusual that Prime95 found the memory error and MemTest86 didn't... perhaps it was also related to system temperatures or something?

Perhaps. It might also be some type of EM noise interference that is only there when the system is stressed. I was looking to enable Spread Spectrum, but it wasn't available in that mobo's BIOS (Abit AV8).

©2024 cpdn.org