"CPDN process is not running" error
Author | Message |
---|---|
Joined: 7 Aug 04 Posts: 2184 Credit: 64,822,615 RAC: 5,275 |
I'm running a hadcm3 5.40 model; it's in 2032 and the temperature trace looks unremarkable. http://climateapps2.oucs.ox.ac.uk/cpdnboinc/result.php?resultid=6447415 It's been trickling regularly every 12 hours, give or take a minute. I checked this evening and it trickled after 12 hrs and 10 minutes. Looking at stderr.txt in the slots directory for that model, it has 7 sequences of these two lines: "CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1160, iMonCtr=1" and "Model crash detected, will try to restart..." widely interspersed among generally benign logging. The latest one was right before the last trickle, indicating a short rewind and explaining the longer trickle duration. There was also another error a while back, apparently: Model crashed: umshell1.f: TRANSO2A: Missing data in ocean UV fields. Anyway, other than those occasional restarts, it's been moving along fine. I'm wondering what the "CPDN process is not running" error really is. The computer was doing nothing other than crunching at the time. You'll notice in the trickle listing for this model that it has also occasionally lost CPU time, so it is showing something like 0.79 s/TS now, even though it is really doing about 1.65 s/TS. I've seen this happen before, maybe once in 15 models, but it's done it several times on this one. |
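For anyone who wants to tally these restart sequences in their own stderr.txt, here is a minimal Python sketch. It is not part of CPDN or BOINC; it only matches the two messages quoted above. The function name count_restarts, the sample list, and the placeholder "benign" log line are all hypothetical, and it assumes the restart notice follows directly on the next non-empty line.

```python
import re

# First message as quoted in the post; the numeric fields are allowed to vary.
NOT_RUNNING = re.compile(
    r"CPDN process is not running, exiting, "
    r"bRetVal = \d+, checkPID=\d+, selfPID=\d+, iMonCtr=\d+"
)
# Second message as quoted in the post.
RESTART = "Model crash detected, will try to restart"


def count_restarts(lines):
    """Count 'not running' lines directly followed by the restart notice.

    Assumption: the two messages are adjacent once blank lines are dropped.
    """
    cleaned = [ln.strip() for ln in lines if ln.strip()]
    count = 0
    for current, following in zip(cleaned, cleaned[1:]):
        if NOT_RUNNING.search(current) and RESTART in following:
            count += 1
    return count


# Hypothetical sample: the first line is a placeholder, not a real CPDN log line.
sample = [
    "placeholder benign log line",
    "CPDN process is not running, exiting, bRetVal = 1, checkPID=0, selfPID=1160, iMonCtr=1",
    "Model crash detected, will try to restart...",
]

print(count_restarts(sample))
```

To run it on a real log, pass the open file instead of the sample, for example count_restarts(open("stderr.txt", errors="replace")), with the path adjusted to the slot directory of the model in question.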
Joined: 7 Aug 04 Posts: 2184 Credit: 64,822,615 RAC: 5,275 |
Well, I'll assume, until proven otherwise, that it is a memory instability. At the time of writing the first post, I had run memtest86+ for 4 hours without an error. But after posting, I ran Prime95 and it bombed out after 5 hours. I changed the processor, I changed the memory, I changed the power supply, and the same thing would happen: Prime95 would bomb out in 30 minutes to 5 hours. I was about ready to give up and replace the motherboard when I thought, hmmm, why don't I switch memory slots? The two memory modules had been in DIMM slots 3 and 4, where they were stable when I put this system together about 6 months ago. Putting the modules in DIMM slots 1 and 2 solved the problem: the system was Prime95 stable for 24 hours. Lesson learned. Hopefully this model will now run to completion without any more oddities. |
Joined: 13 Jan 06 Posts: 1498 Credit: 15,613,038 RAC: 0 |
Unusual that Prime95 found the memory error and MemTest86 didn't... perhaps it was also related to system temperatures or something? I'm a volunteer and my views are my own. |
Joined: 7 Aug 04 Posts: 2184 Credit: 64,822,615 RAC: 5,275 |
Perhaps. It might also be some type of EM noise interference that is only present when the system is stressed. I was looking to enable Spread Spectrum, but it wasn't available in that mobo's BIOS (Abit AV8). |