quick server side update

soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1014543 - Posted: 11 Jul 2010, 2:43:37 UTC - in response to Message 1014532.  

Jeff,

I don't know much about the configuration of the SETI db, but right now I only get 10 WUs at a time, and when the servers shut down, that only leaves me a day and a half of work for my dual processor. I have to sit doing nothing for the other day and a half. Anything that I can do about that?

Allen


Allen:

To help prevent database/server crashes due to overload after the 3-day outage, there are TEMPORARY limits in place for downloads, based on how many WUs your computers already have in progress. Last time I checked, they were at 8 per CPU core, 48 per GPU, and 1000 per host. Since (per the public data on your S@H profile page) you have more than 8 WUs in progress on each of your CPUs, that's all you can get right now. You may get some more as you complete WUs, or as Jeff raises the limits.
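
As a rough illustration of how limits like these could combine, here is a minimal Python sketch. It assumes the per-host figure simply caps the combined CPU + GPU total; the function names and parameters are made up for this example and are not the actual scheduler code.

```python
def allowed_in_progress(cpu_cores, gpus, per_cpu=8, per_gpu=48, per_host=1000):
    """Return (cpu_allowance, gpu_allowance, host_total) for one host.

    Assumption: the per-host figure caps the combined CPU + GPU total.
    """
    cpu_allowance = cpu_cores * per_cpu
    gpu_allowance = gpus * per_gpu
    host_total = min(cpu_allowance + gpu_allowance, per_host)
    return cpu_allowance, gpu_allowance, host_total


def room_for_more(cpu_in_progress, cpu_cores, per_cpu=8):
    """How many more CPU tasks a host could be sent (never negative)."""
    return max(0, cpu_cores * per_cpu - cpu_in_progress)


if __name__ == "__main__":
    # Hypothetical dual-processor machine, no GPU, 18 CPU tasks in progress.
    print(allowed_in_progress(cpu_cores=2, gpus=0))
    print(room_for_more(cpu_in_progress=18, cpu_cores=2))
```

Under the 8-per-core figure, a two-core host with no GPU is allowed 16 CPU tasks in progress, so a host already holding more than that would get nothing new until it reports some results or the limits go up.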

Jeff has been raising the limits periodically since the servers came up on Friday morning, and he has said he will remove all temporary limits by Monday morning. Then you should be able to load up for the next 3-Day outage.

More current info is available in several threads running in the Number Crunching section of the Forum.


The problem people were having was from the combined limit. The separate CPU and GPU limits seem to have functioned admirably, and raising them as you see fit sounds like a great plan. It keeps demands manageable.


Janice
ID: 1014543
gomeyer
Volunteer tester
Joined: 21 May 99
Posts: 488
Credit: 50,370,425
RAC: 0
United States
Message 1014556 - Posted: 11 Jul 2010, 4:13:28 UTC

Jeff,

First, thanks for the hard work and for keeping us updated! All of my cores are now running hot, straight, and normal and so of course I'm content for now.

I would like to add my voice to two requests already made here and in other threads:

- It has been suggested that it should be possible to set up the scheduler so as to NOT assign VLAR work to GPU's. If so it would go a long way toward keeping the gears greased even with the current somewhat lower limits. Not having to reschedule VLAR's from GPU to CPU on the local client would be a big help keeping the GPU's running during the 3 day outages and should also keep the Application Details on your servers much more accurate.

- Would it not be reasonable to raise limits or drop them altogether SOONER than 24 hours before the outage? As I'm sure you noticed last week the bandwidth was totally saturated from Mon to Tues A.M. and many WU's that had been scheduled were never able to come down before the outage began. They did DL quickly after the outage but by that time some machines were out of work. If necessary to help with bandwidth limits (and if it doesn't mean an unreasonable extra workload for your team) you might increase limits in stages starting 48 hours or even more before Tuesday.

Thanks again.
ID: 1014556
Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1014644 - Posted: 11 Jul 2010, 13:33:25 UTC

Thanks Jeff
My main bitch all along has not been the outages, lack of work, download limits etc. etc. It has been the lack of information and updates from SAH HQ as to just WTF is going on down in the Berkeley bunker.

The information and feedback you have posted over the last 2 weekends goes a long way towards filling that void and has been greatly appreciated.

Thanks again.

Brodo
ID: 1014644
Jeff Cobb Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 122
Credit: 40,367
RAC: 0
United States
Message 1014791 - Posted: 11 Jul 2010, 21:17:38 UTC

I raised the limits a little bit ago to:

10/CPU
80/GPU
2000/total
ID: 1014791
SciManStev Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 20 Jun 99
Posts: 6651
Credit: 121,090,076
RAC: 0
United States
Message 1014792 - Posted: 11 Jul 2010, 21:19:27 UTC - in response to Message 1014791.  

I raised the limits a little bit ago to:

10/CPU
80/GPU
2000/total


Thank you indeed! That explains what I have been seeing. Very much appreciated! :)

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1014792
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1014796 - Posted: 11 Jul 2010, 21:31:28 UTC - in response to Message 1014791.  

Thanks Jeff,

Got a few more CUDA WUs to go with the Astropulse WUs I got over the weekend.

Claggy
ID: 1014796
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1014797 - Posted: 11 Jul 2010, 21:33:21 UTC

Thank you once again, Jeff.

For both the raise in limits and the communication.

Meow meow!
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1014797
kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1014814 - Posted: 11 Jul 2010, 22:38:06 UTC

Okee dokee....

Cricket graph has calmed down again.

Time for the next notch up?
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1014814
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1014820 - Posted: 11 Jul 2010, 22:51:59 UTC
Last modified: 11 Jul 2010, 22:52:24 UTC

Edit: thank you for the bump :D

The network/server just recovered, approximately 2 hours after the bump. Can you bump it again in another 6 hours?

These large bumps make it more fair in my opinion: some can only get a few hundred work units on Monday, while others could get 6,000!!!
ID: 1014820
rebest Project Donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 45,357,093
RAC: 0
United States
Message 1014835 - Posted: 11 Jul 2010, 23:36:46 UTC - in response to Message 1014791.  
Last modified: 11 Jul 2010, 23:57:20 UTC

I raised the limits a little bit ago to:

10/CPU
80/GPU
2000/total


Better. Thanks very, very much. It's been a hassle manually rescheduling little bits at a time. It got to the point that I would download 3 WU, reschedule, then download 3 more; over and over. The good news is I now have as many CUDA WU's as I did when the outage started last Tuesday. That's a definite improvement.

VLARs are back, so all 40 of my allowed CPU WU's are VLAR again. Although I'm allowed 320 GPU WU, I was only able to download and reschedule 240 before hitting the CPU threshold. Unfortunately, that's 80 WU's I'll have to fight for on the buffet line tomorrow. :(

Thanks again.

Join the PACK!
ID: 1014835
AllenIN
Volunteer tester
Joined: 5 Dec 00
Posts: 292
Credit: 58,297,005
RAC: 311
United States
Message 1014840 - Posted: 11 Jul 2010, 23:48:48 UTC - in response to Message 1014532.  

Thanks for the reply Don. I guess that means (by my figures) that I will be running out of work on this machine with 17 hours left to go before the servers are turned back on.
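
The arithmetic behind an estimate like this can be sketched in a few lines of Python. The inputs below are illustrative only: the 7.2 hours per task is just what "10 WUs lasting a day and a half on a dual processor" would imply, not a measured figure, and the 72-hour outage length and function names are assumptions for the example.

```python
def hours_of_work(tasks, hours_per_task, cores):
    """Wall-clock hours a cache lasts if every core stays busy."""
    return tasks * hours_per_task / cores


def idle_hours(tasks, hours_per_task, cores, outage_hours=72):
    """Hours the machine sits idle before the outage ends (never negative)."""
    return max(0.0, outage_hours - hours_of_work(tasks, hours_per_task, cores))


if __name__ == "__main__":
    # Illustrative inputs: 10 tasks, about 7.2 hours each, 2 cores.
    print(hours_of_work(tasks=10, hours_per_task=7.2, cores=2))
    print(idle_hours(tasks=10, hours_per_task=7.2, cores=2))
```

Plugging in a larger cache or a shorter per-task runtime shows how quickly the idle time shrinks once the limits are raised.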

Hopefully I will be able to get some additional WU's when the limits are off on Monday.

Thanks again!

Allen
ID: 1014840
Jeff Cobb Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 122
Credit: 40,367
RAC: 0
United States
Message 1014884 - Posted: 12 Jul 2010, 4:04:10 UTC

Limits are now:

20/CPU
160/GPU
4000/total

ID: 1014884
gizbar
Joined: 7 Jan 01
Posts: 586
Credit: 21,087,774
RAC: 0
United Kingdom
Message 1014886 - Posted: 12 Jul 2010, 4:04:55 UTC

Thanks very much for the information, Jeff!

Even if some people don't agree with the way that you and the crew have been handling things, the information that you have been providing has been priceless!

You may just end up being Seti's new PR guru!

(or at least help Matt out with a few posts here and there... :o) )

Giz.



A proud GPU User Server Donor!
ID: 1014886
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1014887 - Posted: 12 Jul 2010, 4:10:26 UTC - in response to Message 1014884.  

Limits are now:

20/CPU
160/GPU
4000/total

Thank you Jeff.. Seriously though.. the CPU/GPU limits are working great. I think it is totally safe to kill the "total" part.

But that should get me mostly out of tomorrow's rush except for a small top off.
Janice
ID: 1014887
Bearcat
Joined: 10 Sep 99
Posts: 106
Credit: 10,778,506
RAC: 0
United States
Message 1014889 - Posted: 12 Jul 2010, 4:23:48 UTC - in response to Message 1014884.  
Last modified: 12 Jul 2010, 4:27:59 UTC

Limits are now:

20/CPU
160/GPU
4000/total


I have to agree with gizbar, just a little info here and there goes a long way. Thank you Jeff, keep up the good work.
ID: 1014889
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1014896 - Posted: 12 Jul 2010, 4:48:19 UTC

Jeff,

any chance of bringing the Beta project up so we can report our completed work?

Claggy
ID: 1014896
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1014936 - Posted: 12 Jul 2010, 8:55:39 UTC - in response to Message 1014309.  
Last modified: 12 Jul 2010, 8:58:17 UTC

I think the cricket shape we are trying for is that of a bath tub. High during the Friday opening and the Monday queue filling but less than max on the weekend. How deep that tub is is where the tuning comes in.


A few thoughts (The Server Stats being down for a while means a fair bit more guesswork than usual).

The initial limits of
CPU 5
GPU 40
Total 140
resulted in about a 4 hour period of full bandwidth being used.

Changing them to
CPU 6
GPU 48
Total 150
resulted in another 4-5 hour burst.

The change of the total limit to 1,000 didn't appear to have much of an effect, about an hour or 2 (if that) of heavy traffic.

Changing them to
CPU 10
GPU 80
Total 2000
resulted in a 3 hour period of heavy traffic.

Changing them to
CPU 20
GPU 160
Total 4000
resulted in a 4-5 hour burst of heavy traffic (just dropping off now).


Depending when the servers are fired up, and whether the system for adjusting the limits can be automated, it should be possible to be at the present settings (CPU 20, GPU 160, Total 4000) by Friday night. Ideally the limits would be tweaked gradually over the weekend to top up caches in small bursts so come Monday when the limits are removed all caches (even the largest of the large) would be full by the time the next outage is due.
And as the change in total limit only from 150 to 1,000 had such a small effect it'd be worth trying things without it in place from the start after the next outage.
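
To put rough numbers on that, here is a small Python sketch that adds up the burst lengths listed above. The hour ranges are only the eyeballed estimates from this post ("about 4 hours" is entered as 4 to 4, "an hour or 2" as 1 to 2), and the names are made up for the example.

```python
# (label, min_hours, max_hours) of heavy traffic after each limit change,
# transcribed from the rough estimates in this post.
STAGES = [
    ("CPU 5 / GPU 40 / Total 140", 4, 4),
    ("CPU 6 / GPU 48 / Total 150", 4, 5),
    ("Total raised to 1,000", 1, 2),
    ("CPU 10 / GPU 80 / Total 2000", 3, 3),
    ("CPU 20 / GPU 160 / Total 4000", 4, 5),
]


def total_burst_hours(stages):
    """Return (low, high) total hours of heavy traffic across all stages."""
    low = sum(lo for _, lo, _ in stages)
    high = sum(hi for _, _, hi in stages)
    return low, high


def fits_in_window(stages, window_hours):
    """True if even the pessimistic total fits inside the window."""
    return total_burst_hours(stages)[1] <= window_hours


if __name__ == "__main__":
    print(total_burst_hours(STAGES))
    print(fits_in_window(STAGES, window_hours=48))
```

On these estimates the five steps add up to somewhere between 16 and 19 hours of heavy traffic in total, which is why spreading them across the weekend looks workable.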


The only thing I find odd is the AstroPulse behaviour. I can't remember who pointed it out during the last outage, but the short bursts of network traffic that occur after everyone has reached the limits set at that time are due to AP downloads (Scarecrow's graphs show the AP Ready to Send queue slowly building up, then draining rapidly at the same time as the bursts in network traffic).
Grant
Darwin NT
ID: 1014936
Cameron
Joined: 27 Nov 02
Posts: 110
Credit: 5,082,471
RAC: 17
Australia
Message 1014961 - Posted: 12 Jul 2010, 12:04:46 UTC

Thanks Jeff.

Your final/current settings seem perfect. I've been able to top up my cache enough to last the outage, before you even remove the limits in a few hours' time.
ID: 1014961
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 1014996 - Posted: 12 Jul 2010, 14:52:44 UTC - in response to Message 1014896.  

Jeff,

any chance of bringing the Beta project up so we can report our completed work?

Claggy


I second the motion! I've got 88 units beggin' to be reported...
.

Hello, from Albany, CA!...
ID: 1014996
Jeff Cobb Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 122
Credit: 40,367
RAC: 0
United States
Message 1015002 - Posted: 12 Jul 2010, 15:06:21 UTC

Limits are off.

Beta is up (sorry about that!).

Thank you all for the limits feedback. That will be helpful come the next server run.
ID: 1015002
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.