Posts by DJStarfox

1) Message boards : Number crunching : Download issues (Message 68680)
Posted 20 Apr 2023 by DJStarfox
Post:
I think this is the same issue that I posted about:
https://www.cpdn.org/cpdnboinc/forum_thread.php?id=9198

The web admin needs to install the certificate chain to the web server(s).

Perhaps the moderators can merge these threads or link them.
2) Message boards : Number crunching : SSL Cert for www.climateprediction.net (Message 68678)
Posted 19 Apr 2023 by DJStarfox
Post:
OK, it looks like the web admin forgot to install the intermediate certificate. The web server is supposed to send both its own certificate and the intermediate certificate to the client during TLS session setup.

See "certificate issues" here:
https://www.ssllabs.com/ssltest/analyze.html?d=www.climateprediction.net
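
A quick local check for the same problem can be sketched in Python. This is only a minimal sketch: the function name `check_tls_chain` is made up for illustration, and it assumes the machine's system trust store is the one you care about. When a server omits its intermediate certificate, verification typically fails because the issuer cannot be found locally.

```python
import socket
import ssl


def check_tls_chain(host, port=443, timeout=10.0):
    """Return (ok, detail) for a TLS handshake verified against the system trust store.

    Hypothetical helper for illustration; not part of BOINC or CPDN.
    """
    # The default context verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return True, "verified, negotiated " + str(tls.version())
    except ssl.SSLCertVerificationError as exc:
        # Raised when the chain cannot be built or validated.
        return False, "certificate verification failed: " + exc.verify_message
    except OSError as exc:
        # DNS failures, refused connections, timeouts and other TLS errors.
        return False, "connection failed: " + str(exc)
```

Calling `check_tls_chain("www.climateprediction.net")` returns a `(bool, str)` pair; a `False` result with a verification message points at the certificate chain rather than at the network.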
3) Message boards : Number crunching : SSL Cert for www.climateprediction.net (Message 68677)
Posted 19 Apr 2023 by DJStarfox
Post:
Looks like www.climateprediction.net replaced its TLS/SSL certificate with one from Let's Encrypt (valid for only 90 days at a time).

I'm getting an error saying, "The issuer certificate of a locally looked up certificate could not be found. No certificates could be verified." Can someone check that this cert is publicly trusted?
4) Message boards : Number crunching : The uploads are stuck (Message 67528)
Posted 11 Jan 2023 by DJStarfox
Post:
Might be a hung process taking over port 80. They'll have to stop the service, kill any orphaned processes, and restart the service.
5) Message boards : Number crunching : The uploads are stuck (Message 67504)
Posted 10 Jan 2023 by DJStarfox
Post:
Do you mean <max_file_xfers_per_project>? If your Internet connection can do more than one, why not do it?


Yes, that's what I mean:
        <max_file_xfers>4</max_file_xfers>
        <max_file_xfers_per_project>1</max_file_xfers_per_project>


For normal HTTPS traffic, yes, you want about 4 connections per server, and most browsers open 4 to 8 connections at a time anyway, because most big websites are server farms (multiple servers that can all work in parallel). However, file transfers are a different beast, and BOINC project servers especially so, since most projects are grant funded (i.e., they run on minimal hardware). Your 1 allowed file transfer will still download or upload at the maximum possible speed, limited by the project's internet connection. It does no good to hammer the same project file server with multiple connections if connection #2 runs at half speed, connection #3 at 1/3 speed, etc. In other words, it won't take longer for YOU, but it will help the project server, which then only needs to serve 1 connection per client across, say, 1000 active users.
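
The argument above can be put into numbers with a small sketch. It assumes a deliberately simple model in which one bottleneck link is shared equally by all of a client's connections and the data splits evenly across them; the function names and the example figures are made up for illustration.

```python
def transfer_minutes(total_mb, bottleneck_mb_per_min, connections):
    """Minutes to move total_mb when `connections` parallel transfers
    share one bottleneck equally (simplifying assumption)."""
    per_connection_rate = bottleneck_mb_per_min / connections
    per_connection_mb = total_mb / connections
    # All connections run in parallel and finish at the same time.
    return per_connection_mb / per_connection_rate


def server_connections(active_clients, connections_per_client):
    """Simultaneous connections the server has to hold open."""
    return active_clients * connections_per_client


one = transfer_minutes(224, 56, connections=1)
four = transfer_minutes(224, 56, connections=4)
print("minutes with 1 connection:", one)
print("minutes with 4 connections:", four)
print("server load at 1 per client:", server_connections(1000, 1))
print("server load at 4 per client:", server_connections(1000, 4))
```

Under this model the client's total time is identical either way, while the number of connections the server must handle grows fourfold.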
6) Message boards : Number crunching : The uploads are stuck (Message 67497)
Posted 10 Jan 2023 by DJStarfox
Post:
Upload server update 9/1/23 10:49GMT
From a meeting this morning with CPDN they do not expect the upload server to be available until 17:00GMT TOMORROW (10th) at the earliest. The server itself is running, but they have to move many TB of data and also want to monitor the newly configured server to check it is stable. As already said, these are issues caused by the cloud provider, not CPDN themselves.


Thanks for the update, Glenn.

FYI... I set my "max uploads per project" to 1 in the cc_config.xml, which is what I recommend for everyone.
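
For anyone who prefers to generate the file, here is a minimal sketch that builds a cc_config.xml containing only these two transfer options, using the values quoted in this thread. The function name `build_cc_config` is made up for illustration, and the sketch assumes you have no other settings in cc_config.xml that need preserving.

```python
import xml.etree.ElementTree as ET


def build_cc_config(max_file_xfers=4, max_file_xfers_per_project=1):
    """Return cc_config.xml text with just the two transfer limits.

    Hypothetical helper; it does not merge with an existing file.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    # Overall cap on simultaneous file transfers.
    ET.SubElement(options, "max_file_xfers").text = str(max_file_xfers)
    # Cap on simultaneous transfers to any single project.
    ET.SubElement(options, "max_file_xfers_per_project").text = str(
        max_file_xfers_per_project
    )
    return ET.tostring(root, encoding="unicode")


print(build_cc_config())
```

The resulting text goes into cc_config.xml in the BOINC data directory, and the client has to re-read its configuration (or be restarted) before the limits apply.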
7) Message boards : Number crunching : The uploads are stuck (Message 67496)
Posted 10 Jan 2023 by DJStarfox
Post:
Edit: Just realized if you can't write state file, any messing within BOINC might be hopeless. So have to find the space elsewhere from the system.


Indeed, that's what I've done. The loss of the state file has caused problems: presumably, the .old state file was accessed as the client downloaded some hadam files; it also couldn't locate some of the oifs files and so 20 or so were abandoned as errors, with the loss of 20 results.

My next move is to split the /boinc-client folder: I'm thinking to leave the boinc-client directory on the /var/lib partition but mount the /projects folder on a separate partition. At the moment, the whole of the boinc-client folder is on a separate partition. This arrangement would have meant that the state file could still have been written, much like mounting /var/log separately to /var. Any thoughts?


Under your account's computing preferences, you can set "Leave at least x GB free" (of disk space) to make sure there is enough left for uploads, etc.
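
To see how much headroom a partition actually has, a minimal sketch like the following can help. It is only a local check and does not change the BOINC preference; the function names are made up for illustration, and the path and threshold in the usage note are assumptions to adjust for your own system.

```python
import shutil


def free_gb(path):
    """Free space on the filesystem holding `path`, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_headroom(path, min_free_gb):
    """True if at least min_free_gb is still free on that filesystem."""
    return free_gb(path) >= min_free_gb
```

For example, `has_headroom("/var/lib/boinc-client", 5.0)` would report whether 5 GB remain on the partition holding that directory, assuming that is where your BOINC data lives.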
8) Message boards : Number crunching : The uploads are stuck (Message 67322)
Posted 4 Jan 2023 by DJStarfox
Post:
The admins should plan for enough infrastructure to handle:
* "Computers with recent credit" as per the server status page. Right now, that number is 968 computers.
* With a project backoff near 1 hour, that means an average of 16 uploads per minute.
* With a total upload size of 224 MB per model, the server needs to handle about 3.5 GB per minute during peak time.
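
The estimate above can be reproduced as a short calculation. This is a back-of-the-envelope sketch using only the figures from the post; it assumes every active computer uploads one full model per backoff interval, and the conversion to Mbit/s is an added convenience, not something from the server status page.

```python
# Figures taken from the post above.
active_computers = 968   # "Computers with recent credit"
backoff_minutes = 60     # project backoff near 1 hour
model_upload_mb = 224    # total upload size per model

# Assumption: each computer uploads one model per backoff interval.
uploads_per_minute = active_computers / backoff_minutes
peak_mb_per_minute = uploads_per_minute * model_upload_mb

# Convert to a sustained line rate (8 bits per byte, 60 seconds per minute).
sustained_mbit_per_s = peak_mb_per_minute * 8 / 60

print("uploads per minute:", round(uploads_per_minute, 1))
print("MB per minute:", round(peak_mb_per_minute))
print("sustained Mbit/s:", round(sustained_mbit_per_s))
```

Rounding to 16 uploads per minute gives the 3.5 GB per minute quoted in the post; the unrounded figure is slightly higher.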
9) Message boards : Number crunching : The uploads are stuck (Message 67216)
Posted 2 Jan 2023 by DJStarfox
Post:
Uploads are still stuck for me. Hopefully, the server can be fixed today or tomorrow.
10) Message boards : Number crunching : Big models (Message 63463)
Posted 2 Feb 2021 by DJStarfox
Post:
Ya know... rather than making each model so big (multiple GB per task), could the programmers simply have each task share more files in common? That way each model takes less space, if that makes sense.
11) Message boards : Number crunching : Please fix the deadlines! (Message 63095)
Posted 4 Dec 2020 by DJStarfox
Post:
If you have 1000 credit on project 1 and 1,000,000 credit on project 2, BOINC will try to have project 1 "catch up" to project 2.


If what you say is true, then that design is faulty. The resource share should try to equalize the recent average credit (RAC), not total credit.
12) Message boards : Number crunching : New work Discussion (Message 63018)
Posted 25 Nov 2020 by DJStarfox
Post:
I got five (5) of those HadAM4h tasks, but each one is consuming ~4GB of disk space, which far exceeds the 10G max I have in BOINC settings. Seems like a bug in the task scheduler...

I had to abort a few to avoid running out of disk space on the /var partition. :(
13) Message boards : Number crunching : Preferences - project options missing (Message 60943)
Posted 19 Sep 2019 by DJStarfox
Post:
I too am surprised the application selection feature is completely gone now.
https://www.cpdn.org/prefs.php?subset=project

Various old models would take up tens of gigabytes. I would periodically have to delete old models in order to download any new units. By selecting only a few models, this was less of a problem.

Also, in the past, the researchers were able to use different platforms as a way to validate a model. Since floating point calculations, etc. varied slightly, it helped them tune the models with a bigger variety of data points. If such an emergent result is no longer useful, then I would highly recommend they post the minimum requirements per platform. (For example, these could go on the CPDN homepage next to the how to join section.)
14) Message boards : Number crunching : What Happened ??? (Message 58153)
Posted 25 Apr 2018 by DJStarfox
Post:
The entire cpdn.org website was down for an extended time. I visited climateprediction.net several times but did not see any notice about downtime.
15) Message boards : Number crunching : What Happened ??? (Message 58148)
Posted 24 Apr 2018 by DJStarfox
Post:
When this site shuts down, people switch to the BOINC site, in particular the Projects section, top post, which is: News on Project Outages, where a message, usually from Andy, is posted.

In this case, a new thread was also created: CPDN project going offline this afternoon, which is full of posts from people, including one that I put there near the end of the thread, which I felt explains things well enough.


First of all, I (and likely many others) did not know there was a specific forum on some other site, somewhere, where a cryptic message about some maintenance was posted.... Furthermore, the site has been down for over a month! That sounds much worse than some routine maintenance, so I consider that post deceptive.

I'm glad the site is back, and I hope such a long outage does not happen again. In the future, a post on the HOME PAGE of climateprediction.net would be much more appropriate and visible to the community.
16) Message boards : Number crunching : Volunteer to work with the BOINC Release Manger on Linux Instructions? (Message 57862)
Posted 26 Feb 2018 by DJStarfox
Post:
I'm most familiar with RHEL/CentOS 6 and 7 and have been running BOINC on Red Hat based distributions for 8+ years. I'm sure I could dedicate an hour or two to write some documentation for BOINC. Please have Richard reach out to me if he needs anything.
17) Message boards : Number crunching : Site Back Up (Message 57824)
Posted 19 Feb 2018 by DJStarfox
Post:
Anyone know why the site was offline for so long?
18) Message boards : Number crunching : New work Discussion (Message 57684)
Posted 21 Jan 2018 by DJStarfox
Post:
Speaking of new work, I just got a single hadcm3s model after many months of nothing. Seems very random. Now, the server's queue is 27 but I have no idea which model these are.
19) Questions and Answers : Unix/Linux : Shutting down for re-boot. (Message 57337)
Posted 7 Nov 2017 by DJStarfox
Post:
Dave, I've had better luck with the following global compute preference checked:
"Leave non-GPU tasks in memory while suspended" = yes

If that is disabled on your account, try enabling it and see if that improves things.
20) Message boards : Number crunching : Credit Status (Message 57239)
Posted 28 Oct 2017 by DJStarfox
Post:
This thread is too long, and after reading it, I still don't know why so many people lost all their credits in the last day or so. I lost 1.1 million credits today.

What the hell is going on guys? This needs to get fixed ASAP.

And who programmed BOINC to be so volatile that a rogue process or sysadmin can just wipe out credits granted to every user? This has got to change if BOINC is to continue long-term.


©2024 climateprediction.net