climateprediction.net (CPDN) home page
Thread 'Restore'

Questions and Answers : Getting started : Restore
BugG

Joined: 24 Aug 08
Posts: 12
Credit: 663,890
RAC: 0
Message 35634 - Posted: 3 Dec 2008, 9:35:01 UTC

Computation stopped at 75% progress. Below is a copy/paste from the messages section of BOINC.

02/12/2008 11:31:27|climateprediction.net|Deferring communication for 1 min 0 sec
02/12/2008 11:31:27|climateprediction.net|Reason: Unrecoverable error for result hadcm3istd_12yl_1920_160_15994682_7 (The device does not recognize the command. (0x16) - exit code 22 (0x16))
02/12/2008 11:31:34|climateprediction.net|Computation for task hadcm3istd_12yl_1920_160_15994682_7 finished
02/12/2008 11:31:34|climateprediction.net|Output file hadcm3istd_12yl_1920_160_15994682_7_12.zip for task hadcm3istd_12yl_1920_160_15994682_7 absent
02/12/2008 11:31:34|climateprediction.net|Output file hadcm3istd_12yl_1920_160_15994682_7_13.zip for task hadcm3istd_12yl_1920_160_15994682_7 absent
02/12/2008 11:31:34|climateprediction.net|Output file hadcm3istd_12yl_1920_160_15994682_7_14.zip for task hadcm3istd_12yl_1920_160_15994682_7 absent
02/12/2008 11:31:34|climateprediction.net|Output file hadcm3istd_12yl_1920_160_15994682_7_15.zip for task hadcm3istd_12yl_1920_160_15994682_7 absent
02/12/2008 11:31:34|climateprediction.net|Output file hadcm3istd_12yl_1920_160_15994682_7_16.zip for task hadcm3istd_12yl_1920_160_15994682_7 absent


I have a backup file that I made last week. Is it worth restoring after this error?
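
For anyone who wants to pull the key facts out of a pasted message log like the one above, here is a minimal Python sketch. It assumes the `date|project|message` layout seen in this post and a day/month/year date order (an assumption based on the post date). The function names `parse_line` and `summarize` are made up for illustration and are not part of BOINC.

```python
import re
from datetime import datetime

# A shortened copy of the messages pasted above.
LOG = """\
02/12/2008 11:31:27|climateprediction.net|Deferring communication for 1 min 0 sec
02/12/2008 11:31:27|climateprediction.net|Reason: Unrecoverable error for result hadcm3istd_12yl_1920_160_15994682_7 (The device does not recognize the command. (0x16) - exit code 22 (0x16))
02/12/2008 11:31:34|climateprediction.net|Computation for task hadcm3istd_12yl_1920_160_15994682_7 finished
02/12/2008 11:31:34|climateprediction.net|Output file hadcm3istd_12yl_1920_160_15994682_7_12.zip for task hadcm3istd_12yl_1920_160_15994682_7 absent
"""


def parse_line(line):
    """Split one 'date|project|message' line into its three fields."""
    stamp, project, message = line.split("|", 2)
    # Assumption: day/month/year order, so 02/12/2008 is 2 December 2008.
    when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
    return when, project, message


def summarize(log_text):
    """Return (task, exit_code) pairs for crashes and the names of absent output files."""
    errors, absent = [], []
    for line in log_text.splitlines():
        if not line.strip():
            continue
        _, _, message = parse_line(line)
        crash = re.search(r"Unrecoverable error for result (\S+) .*exit code (\d+)", message)
        if crash:
            errors.append((crash.group(1), int(crash.group(2))))
        missing = re.search(r"Output file (\S+) for task \S+ absent", message)
        if missing:
            absent.append(missing.group(1))
    return errors, absent


errors, absent = summarize(LOG)
for task, code in errors:
    # 22 in decimal and 0x16 in hexadecimal are the same number.
    print(f"{task} crashed with exit code {code} ({hex(code)})")
print("Absent output files:", absent)
```

The sketch only reads the text of the log; it does not touch the BOINC client or the model files.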

mo.v
Volunteer moderator
Joined: 29 Sep 04
Posts: 2363
Credit: 14,611,758
RAC: 0
Message 35635 - Posted: 3 Dec 2008, 11:56:06 UTC
Last modified: 3 Dec 2008, 11:57:04 UTC

I would say yes because the model's already completed about 70% of its 160 years. It was progressing very well. Congratulations on having a backup.

Unfortunately error code 22 gives us no clue about what caused the model crash, nor does the 'device does not recognize the command' message. Ignore that message.

Have a look at the project READMEs linked in my signature. Go to the collection about crashes and problems, and in that collection look at item #6 by MikeMars. You may find something you should do in future, or avoid doing, to reduce the risk of model crashes.

If you restore the model, let us know please whether it passes the point where it crashed and then progresses.
Cpdn news
BugG

Message 35648 - Posted: 5 Dec 2008, 2:40:46 UTC

I tried to restore two models, the crashed one and a normal one (the CPU is dual core).
The instructions I used for restoring were the “Simple guide to backups” in the README. “The Procedure for Restoring Individual Crashed Runs in a multi Project/model Setup” looked too complicated for me, as I am not an advanced user.
It seems the restore was successful and the two models are running normally.
But the data on “Tasks for user” look strange. Some of them are below.
(1) Crashed model
The data generated at the crash remain. Server state: Over. Outcome: Client error. Client state: Compute error, etc.
(2) Normal model
Server state: Over. Outcome: Client detached. Granted credit is shown.

I would like to know whether this restore is good or not and, if not, what to do next.
It will take a few days to reach the point where the model crashed.
Thank you very much for your help.
mo.v
Volunteer moderator
Message 35649 - Posted: 5 Dec 2008, 10:19:34 UTC
Last modified: 5 Dec 2008, 10:20:40 UTC

Hi again

This is normal. Here's your Tasks for computer page. If a healthy model is restored the server classifies it as 'detached'. If the server learns about a model crash it classifies the model as crashed (compute error) and gives it the code and messages for that crash. Forever. However, it still accepts more trickles and uploads and lets models complete. But a previously-crashed model still keeps the server's earlier classification! Of course this doesn't prevent models being used for the research.

So what matters isn't these now-irrelevant classifications, it's whether the models are progressing and trickling.
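
The behaviour described above can be pictured with a small Python sketch. This is only a toy model of what the post says, not the project's actual server code; the class `TaskRecord` and its method names are invented for illustration, and the outcome labels are the ones quoted earlier in this thread.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Toy stand-in for one row on the task list (hypothetical, not server code)."""
    outcome: str = "in progress"  # label shown for the task
    trickles: int = 0             # progress reports received so far

    def report_crash(self):
        # Once a crash is reported, the label sticks for good.
        self.outcome = "Client error"

    def report_restore(self):
        # A restored healthy model becomes 'detached';
        # a previously crashed model keeps its earlier label.
        if self.outcome != "Client error":
            self.outcome = "Client detached"

    def receive_trickle(self):
        # Trickles are still accepted whatever the label says.
        self.trickles += 1


crashed = TaskRecord()
crashed.report_crash()
crashed.report_restore()
crashed.receive_trickle()

normal = TaskRecord()
normal.report_restore()
normal.receive_trickle()

for name, task in (("crashed", crashed), ("normal", normal)):
    print(f"{name}: outcome={task.outcome!r}, trickles={task.trickles}")
```

In this picture the label never recovers, so the number to watch is the trickle count.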

Thanks for spending time on this. Completed models are naturally much more useful than unfinished ones.
Cpdn news
BugG

Message 35668 - Posted: 8 Dec 2008, 22:15:04 UTC

Hi mo.v
It is my pleasure to tell you that the crashed model has passed 75%.
I don’t know the exact point where it crashed, but it should be less than 75%.
The two restored models are progressing normally.

Thanks again.
old_user471381

Joined: 8 Sep 07
Posts: 1
Credit: 11,949
RAC: 0
Message 35947 - Posted: 16 Jan 2009, 23:58:46 UTC


http://climateapps2.oucs.ox.ac.uk/cpdnboinc/user_search.php

Fatal error: Cannot redeclare main() (previously declared in /var/www/boinc/projects/cpdnboinc/html/user/user_search.php:154) in /var/www/boinc/projects/cpdnboinc/html/inc/hadsm3_graph.inc on line 1110
