Another crash: "22 (0x16)"

Questions and Answers : Windows : Another crash: "22 (0x16)"
Combat Marmot
Joined: 4 Jun 07
Posts: 5
Credit: 965,283
RAC: 240
Message 30679 - Posted: 24 Sep 2007, 12:44:36 UTC
Last modified: 24 Sep 2007, 13:07:50 UTC

Hi there,
Today another model crashed on me with the exit status given as "22 (0x16)". Another 4 models have gone that way for me, with only one going with a -197 (0xffffff3b).

I run climate prediction on 2 computers. One has an AMD X2 4200 processor and the other (a laptop) has a T7400 Core 2 Duo processor. Both have 2 GB of RAM.

I always quit BOINC before running anti-virus/spyware, and generally before playing games. The model crashed this morning about 20 minutes after I started my computer. I don't run the screen saver and I don't run 24/7.

I do have a backup for this WU but I'm not sure whether to resume (I don't think I'd have enough time to complete the model, because my desktop will be off while I am away).

If anyone could tell me what I'm doing wrong, that'd be great.
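
For reference, each exit status quoted above is one number shown twice: once in decimal and once in hexadecimal. A minimal Python sketch of that notation follows, assuming the hex part is the 32-bit two's-complement form of the decimal value (an inference from the eight hex digits in 0xffffff3b, not something stated in this thread). The function names are made up for illustration.

```python
def format_exit_status(code: int) -> str:
    """Format a signed exit code as 'decimal (0xhex)'.

    Assumption: the hex part is the low 32 bits of the value, so a
    negative code appears in two's-complement form.
    """
    return f"{code} (0x{code & 0xFFFFFFFF:x})"


def to_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


print(format_exit_status(22))    # 22 (0x16)
print(format_exit_status(-197))  # -197 (0xffffff3b)
print(to_signed32(0xFFFFFF3B))   # -197
```

So "22 (0x16)" and "-197 (0xffffff3b)" are two different codes, each printed in both bases.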
MikeMarsUK
Volunteer moderator
Joined: 13 Jan 06
Posts: 1498
Credit: 15,613,038
RAC: 0
Message 30691 - Posted: 24 Sep 2007, 20:58:34 UTC


Hi,

The error code 22 doesn't mean much by itself; most errors show up on the website like that.

The important part is the 'error text', but nothing of interest actually shows up on your result page. The error text is truncated if it is too long, and I think the interesting part has been cut off.

All I can suggest is taking backups at intervals, just in case of an error. This will allow you to restore from the backup and continue running. The other items in the README link in my signature may also be worth reading.
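
As a rough illustration of taking a backup, here is a minimal Python sketch that copies a data folder into a new timestamped folder. It assumes BOINC has been shut down first so that no files change during the copy, and that the backup location is outside the folder being copied. The function name and the example paths are hypothetical and are not taken from the README.

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(data_dir: Path, backup_root: Path) -> Path:
    """Copy data_dir into a new timestamped folder under backup_root.

    Assumption: BOINC is not running, and backup_root is not inside
    data_dir.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{data_dir.name}-{stamp}"
    backup_root.mkdir(parents=True, exist_ok=True)
    shutil.copytree(data_dir, target)  # fails if target already exists
    return target


# Example call with hypothetical paths; adjust to your own installation:
# backup_data_dir(Path(r"C:\BOINC"), Path(r"D:\boinc-backups"))
```

Restoring would be the reverse copy, again with BOINC stopped.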

-Cheers,

Mike
I'm a volunteer and my views are my own.
News and Announcements and FAQ
DJStarfox
Joined: 27 Jan 07
Posts: 300
Credit: 3,288,263
RAC: 26,370
Message 30702 - Posted: 25 Sep 2007, 3:59:19 UTC - in response to Message 30679.  

I've seen several models crash with this error. The only thing you can check is the file and directory permissions on your BOINC project folders. (That solved one model for me.) Other than that, try restoring from backup once. If that fails, let it die and get a new model.
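
One way to run the permissions check mentioned above is sketched below in Python: it walks a folder and lists anything the current user cannot both read and write. The function name and the example path are hypothetical. Note that `os.access` only gives an approximation, and on Windows it may not reflect the full access-control list, so treat an empty result as a hint rather than proof.

```python
import os
from pathlib import Path


def find_inaccessible(root: Path) -> list[Path]:
    """Return paths under root that are not both readable and writable."""
    problems = []
    if not os.access(root, os.R_OK | os.W_OK):
        problems.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = Path(dirpath) / name
            if not os.access(path, os.R_OK | os.W_OK):
                problems.append(path)
    return problems


# Example call with a hypothetical path; adjust to your own installation:
# for p in find_inaccessible(Path(r"C:\BOINC\projects")):
#     print("check permissions on:", p)
```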
Combat Marmot
Joined: 4 Jun 07
Posts: 5
Credit: 965,283
RAC: 240
Message 30732 - Posted: 27 Sep 2007, 9:05:58 UTC

Hi,
Thank you for the replies. However, I'd like to know whether the uploaded results are still useful to the project if a model crashes. I mean this in the general sense, rather than the case where the crash was caused by an unstable computer (I found the answer to that in the FAQ).
mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 30733 - Posted: 27 Sep 2007, 9:38:37 UTC

We know, because Oxford has told us, that partial results are added into the HADCM coupled model results. So I assume this is also the case for HADSM slabs. But as far as I know, a model's most recent partial result will be the last zip file it sent to Oxford, not the most recent trickle. The zips are sent at the end of each phase for the slabs and at the end of each decade for the coupled models.

As you say, there's a procedure for weeding out results that indicate instability or processing errors on individual computers, and it doesn't just depend on what the error and exit codes are. They look, for example, for abnormal spikes in the graphs. This was certainly the case with the early Classic slabs, and I assume the same quality control is still applied. There was a post about it years ago on the independent forum by a programmer who helped set Classic up and who no longer works for CPDN. I think it was said then that about 2 or 3% of Classic models failed the quality control. They mentioned that a major cause of models failing quality control was over-enthusiastic overclocking.
Cpdn news