quick server side update



Message boards : Technical News : quick server side update

soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,631,148
RAC: 1
United States
Message 1014543 - Posted: 11 Jul 2010, 2:43:37 UTC - in response to Message 1014532.

Jeff,

I don't know much about the configuration of the SETI db, but right now I only get 10 WUs at a time, and when the servers shut down, that only leaves me a day and a half of work for my dual processor. I have to sit doing nothing for the other day and a half. Is there anything I can do about that?

Allen


Allen:

To help prevent database/server crashes due to overload after the 3-day outage, there are TEMPORARY limits in place for downloads, based on how many WUs your computers already have in progress. Last time I checked, they were at 8 per cpu core, 48 per gpu, and 1000 per host. Since (per the public data on your S@H profile page) you have more than 8 WUs in progress on each of your cpu's, that's all you can get right now. You may get some more as you complete WUs, or as Jeff raises the limits.
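To make the arithmetic concrete, here is a minimal sketch of how caps like these might bound a work request. It assumes each limit is a cap on tasks in progress (per CPU core, per GPU, and per host) and that CPU requests are filled before GPU requests when the host cap binds. The function name, its parameters, and the fill order are illustrative assumptions, not the actual scheduler code.

```python
def allowed_new_tasks(cpu_cores, gpus, cpu_in_progress, gpu_in_progress,
                      per_cpu=8, per_gpu=48, per_host=1000):
    """Return (cpu, gpu) counts of additional tasks a host could fetch.

    Hypothetical model only: each resource has its own in-progress cap,
    and the host-wide cap bounds the sum of both.
    """
    # Room left under each per-resource cap (never negative).
    cpu_room = max(0, cpu_cores * per_cpu - cpu_in_progress)
    gpu_room = max(0, gpus * per_gpu - gpu_in_progress)

    # Room left under the host-wide cap.
    host_room = max(0, per_host - (cpu_in_progress + gpu_in_progress))

    # Assumption: CPU work is granted first, GPU work gets what remains.
    cpu_new = min(cpu_room, host_room)
    gpu_new = min(gpu_room, host_room - cpu_new)
    return cpu_new, gpu_new


if __name__ == "__main__":
    # Dual-core host with 16 CPU tasks already in progress.
    print(allowed_new_tasks(cpu_cores=2, gpus=0,
                            cpu_in_progress=16, gpu_in_progress=0))
```

Under this model, a dual-core host that already holds 16 CPU tasks gets nothing more until it reports some work or the limits are raised.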

Jeff has been raising the limits periodically since the servers came up on Friday morning, and he has said he will remove all temporary limits by Monday morning. Then you should be able to load up for the next 3-Day outage.

More current info is available in several threads running in the Number Crunching section of the Forum.


The problem people were having was from the combined limit. The separate CPU and GPU limits seem to have functioned admirably, and raising them as you see fit sounds like a great plan. It keeps demands manageable.


____________

Janice

gomeyer
Volunteer tester
Joined: 21 May 99
Posts: 488
Credit: 50,157,953
RAC: 0
United States
Message 1014556 - Posted: 11 Jul 2010, 4:13:28 UTC

Jeff,

First, thanks for the hard work and for keeping us updated! All of my cores are now running hot, straight, and normal and so of course I'm content for now.

I would like to add my voice to two requests already made here and in other threads:

- It has been suggested that it should be possible to set up the scheduler so as to NOT assign VLAR work to GPU's. If so it would go a long way toward keeping the gears greased even with the current somewhat lower limits. Not having to reschedule VLAR's from GPU to CPU on the local client would be a big help keeping the GPU's running during the 3 day outages and should also keep the Application Details on your servers much more accurate.

- Would it not be reasonable to raise limits or drop them altogether SOONER than 24 hours before the outage? As I'm sure you noticed last week the bandwidth was totally saturated from Mon to Tues A.M. and many WU's that had been scheduled were never able to come down before the outage began. They did DL quickly after the outage but by that time some machines were out of work. If necessary to help with bandwidth limits (and if it doesn't mean an unreasonable extra workload for your team) you might increase limits in stages starting 48 hours or even more before Tuesday.

Thanks again.

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1725
Credit: 205,997,973
RAC: 28,402
Australia
Message 1014644 - Posted: 11 Jul 2010, 13:33:25 UTC

Thanks Jeff
My main bitch all along has not been the outages, lack of work, download limits etc. etc. It has been the lack of information and updates from SAH HQ as to just WTF is going on down in the Berkeley bunker.

The information and feedback you have posted over the last 2 weekends goes a long way towards filling that void and has been greatly appreciated.

Thanks again.

Brodo

Jeff Cobb
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 110
Credit: 40,367
RAC: 0
United States
Message 1014791 - Posted: 11 Jul 2010, 21:17:38 UTC

I raised the limits a little bit ago to:

10/CPU
80/GPU
2000/total
____________

SciManStev
Project donor
Volunteer tester
Joined: 20 Jun 99
Posts: 4880
Credit: 83,310,343
RAC: 37,731
United States
Message 1014792 - Posted: 11 Jul 2010, 21:19:27 UTC - in response to Message 1014791.

I raised the limits a little bit ago to:

10/CPU
80/GPU
2000/total


Thank you indeed! That explains what I have been seeing. Very much appreciated! :)

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4152
Credit: 33,848,432
RAC: 33,227
United Kingdom
Message 1014796 - Posted: 11 Jul 2010, 21:31:28 UTC - in response to Message 1014791.

Thanks Jeff,

Got a few more CUDA WUs to go with the Astropulse WUs I got over the weekend.

Claggy

RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,654,737
RAC: 0
United States
Message 1014820 - Posted: 11 Jul 2010, 22:51:59 UTC
Last modified: 11 Jul 2010, 22:52:24 UTC

edit: thank you for the bump:D

the network/server just recovered, approximately 2 hours after the bump, can you bump it again in another 6 hours?

these large bumps make it more fair in my opinion, some can only get a few hundred work units on Monday, while others could get 6,000!!!
____________

rebest
Project donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 32,950,681
RAC: 9,739
United States
Message 1014835 - Posted: 11 Jul 2010, 23:36:46 UTC - in response to Message 1014791.
Last modified: 11 Jul 2010, 23:57:20 UTC

I raised the limits a little bit ago to:

10/CPU
80/GPU
2000/total


Better. Thanks very, very much. It's been a hassle manually rescheduling little bits at a time. It got to the point that I would download 3 WU, reschedule, then download 3 more; over and over. The good news is I now have as many CUDA WU's as I did when the outage started last Tuesday. That's a definite improvement.

VLARs are back, so all 40 of my allowed CPU WU's are VLAR again. Although I'm allowed 320 GPU WU, I was only able to download and reschedule 240 before hitting the CPU threshold. Unfortunately, that's 80 WU's I'll have to fight for on the buffet line tomorrow. :(

Thanks again.
____________

Join the PACK!

AllenIN
Joined: 5 Dec 00
Posts: 159
Credit: 13,873,154
RAC: 12,874
United States
Message 1014840 - Posted: 11 Jul 2010, 23:48:48 UTC - in response to Message 1014532.

Thanks for the reply Don. I guess that means (by my figures) that I will be running out of work on this machine with 17 hours left to go before the servers are turned back on.

Hopefully I will be able to get some additional WU's when the limits are off on Monday.

Thanks again!

Allen
____________

Jeff Cobb
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 110
Credit: 40,367
RAC: 0
United States
Message 1014884 - Posted: 12 Jul 2010, 4:04:10 UTC

Limits are now:

20/CPU
160/GPU
4000/total

____________

gizbar
Joined: 7 Jan 01
Posts: 586
Credit: 21,087,774
RAC: 0
United Kingdom
Message 1014886 - Posted: 12 Jul 2010, 4:04:55 UTC

Thanks very much for the information, Jeff!

Even if some people don't agree with the way that you and the crew have been handling things, the information that you have been providing has been priceless!

You may just end up being Seti's new PR guru!

(or at least help Matt out with a few posts here and there... :o) )

Giz.

____________


A proud GPU User Server Donor!

soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,631,148
RAC: 1
United States
Message 1014887 - Posted: 12 Jul 2010, 4:10:26 UTC - in response to Message 1014884.

Limits are now:

20/CPU
160/GPU
4000/total

Thank you Jeff.. Seriously though.. the CPU/GPU limits are working great. I think it is totally safe to kill the "total" part.

But that should get me mostly out of tomorrow's rush except for a small top off.
____________

Janice

Bearcat
Joined: 10 Sep 99
Posts: 106
Credit: 10,778,506
RAC: 0
United States
Message 1014889 - Posted: 12 Jul 2010, 4:23:48 UTC - in response to Message 1014884.
Last modified: 12 Jul 2010, 4:27:59 UTC

Limits are now:

20/CPU
160/GPU
4000/total


I have to agree with gizbar, just a little info here and there goes a long way. Thank you Jeff, keep up the good work.
____________

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4152
Credit: 33,848,432
RAC: 33,227
United Kingdom
Message 1014896 - Posted: 12 Jul 2010, 4:48:19 UTC

Jeff,

any chance of bringing the Beta project up so we can report our completed work?

Claggy

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5872
Credit: 60,873,411
RAC: 47,423
Australia
Message 1014936 - Posted: 12 Jul 2010, 8:55:39 UTC - in response to Message 1014309.
Last modified: 12 Jul 2010, 8:58:17 UTC

I think the cricket shape we are trying for is that of a bath tub. High during the Friday opening and the Monday queue filling but less than max on the weekend. How deep that tub is is where the tuning comes in.


A few thoughts (The Server Stats being down for a while means a fair bit more guesswork than usual).

The initial limits of
CPU 5
GPU 40
Total 140
resulted in about a 4 hour period of full bandwidth being used.

changing them to
CPU 6
GPU 48
Total 150
resulted in another 4-5 hour burst.

The change of the total limit to 1,000 didn't appear to have much of an effect, about an hour or 2 (if that) of heavy traffic.

changing them to
CPU 10
GPU 80
Total 2000
resulted in a 3 hour period of heavy traffic.

changing them to
CPU 20
GPU 160
Total 4000
resulted in a 4-5 hour burst of heavy traffic (just dropping off now).
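The observations above can be tabulated in a small sketch so the bursts are easy to compare. The durations are the rough, eyeballed estimates from this post, stored as (low, high) hours. The data layout and helper names are illustrative assumptions, and the per-CPU and per-GPU values for the step where only the total changed are left unset because the post does not restate them.

```python
# Limit changes and the heavy-traffic burst each one was followed by,
# as summarized in the post. Hours are rough (low, high) estimates.
STEPS = [
    # (cpu, gpu, total, (low_hours, high_hours))
    (5, 40, 140, (4, 4)),
    (6, 48, 150, (4, 5)),
    (None, None, 1000, (1, 2)),  # only the total was reported as changing
    (10, 80, 2000, (3, 3)),
    (20, 160, 4000, (4, 5)),
]


def total_burst_hours(steps):
    """Sum the low and high burst estimates across all limit changes."""
    low = sum(hours[0] for *_, hours in steps)
    high = sum(hours[1] for *_, hours in steps)
    return low, high


def shortest_burst(steps):
    """Return the step whose upper burst estimate is the smallest."""
    return min(steps, key=lambda step: step[3][1])


if __name__ == "__main__":
    print("Total heavy-traffic hours (low, high):", total_burst_hours(STEPS))
    print("Total limit at the shortest burst:", shortest_burst(STEPS)[2])
```

On these estimates the five changes add up to roughly 16 to 19 hours of heavy traffic, and the shortest burst followed the change that touched only the total limit.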


Depending when the servers are fired up, and whether the system for adjusting the limits can be automated, it should be possible to be at the present settings (CPU 20, GPU 160, Total 4000) by Friday night. Ideally the limits would be tweaked gradually over the weekend to top up caches in small bursts so come Monday when the limits are removed all caches (even the largest of the large) would be full by the time the next outage is due.
And as the change in total limit only from 150 to 1,000 had such a small effect it'd be worth trying things without it in place from the start after the next outage.


The only thing I find odd is the AstroPulse behaviour. I can't remember who pointed it out during the last outage, but the short bursts of network traffic that occur after everyone has reached the limits set at that time are due to AP downloads (Scarecrow's graphs show the AP Ready to Send queue slowly building up, then draining rapidly at the same time as the bursts in network traffic).
____________
Grant
Darwin NT.

Cameron
Joined: 27 Nov 02
Posts: 69
Credit: 1,044,880
RAC: 275
Australia
Message 1014961 - Posted: 12 Jul 2010, 12:04:46 UTC

Thanks Jeff.

Your final/current Settings seem perfect. I've been able to top my cache up and last the outage before you actually remove the limits in a few hours time.

KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1965
Credit: 10,601,903
RAC: 14,286
United States
Message 1014996 - Posted: 12 Jul 2010, 14:52:44 UTC - in response to Message 1014896.

Jeff,

any chance of bringing the Beta project up so we can report our completed work?

Claggy


I second the motion! I've got 88 units beggin' to be reported...
____________
.

Jeff Cobb
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 110
Credit: 40,367
RAC: 0
United States
Message 1015002 - Posted: 12 Jul 2010, 15:06:21 UTC

Limits are off.

Beta is up (sorry about that!).

Thank you all for the limits feedback. That will be helpful come the next server run.
____________


Copyright © 2014 University of California