Maxed (Dec 16 2010)




Message boards : Technical News : Maxed (Dec 16 2010)

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1384
Credit: 74,079
RAC: 0
United States
Message 1056774 - Posted: 16 Dec 2010, 22:49:16 UTC

We're back to shoveling out workunits as fast as we can. I mentioned in another thread that the gigabit link project is still alive. In fact, the whole lab is interested in getting gigabit connectivity to the rest of campus, which makes the whole battle a lot easier (we'll still have to buy our own bits and get the hardware to keep them separate). Still, it's slow going due to campus staff cutbacks and higher priorities.

With the heavy load on oscar (splitting and assimilating full bore) I got some good i/o stats to determine how much we should reduce the stripe size on its database RAID partition. This will be enacted next week during the return of the 3-day weekly outage. It's unclear how regular these extended weekly outages will be - we'll figure that all out in the new year.
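For anyone curious what "reduce the stripe size" is about: the usual input to that decision is the size of the I/O requests the database actually issues - a RAID stripe much wider than the typical request wastes the extra width. A minimal sketch of the reasoning, with made-up numbers (the function and samples below are hypothetical illustrations, not oscar's real stats):

```python
# Sketch of the stripe-size reasoning: if most database I/O requests
# are smaller than the current stripe, each request touches a single
# disk anyway and the extra stripe width buys nothing. All numbers
# here are invented for illustration.

def suggest_stripe_kib(request_sizes_kib, candidates=(16, 32, 64, 128, 256)):
    """Pick the smallest candidate stripe that still covers the
    typical (95th percentile) request in one stripe unit."""
    sizes = sorted(request_sizes_kib)
    p95 = sizes[int(0.95 * (len(sizes) - 1))]
    for stripe in sorted(candidates):
        if stripe >= p95:
            return stripe
    return max(candidates)

# Mostly small random reads, as a busy database tends to produce:
samples = [8] * 80 + [16] * 15 + [64] * 5
print(suggest_stripe_kib(samples))  # -> 16
```

The design choice mirrored here is the same trade-off Matt describes: gather real request-size stats under heavy load first, then shrink the stripe to match.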

But back to oscar... we were pushing it pretty hard today - almost too much. It looked like we were about to run out of workunits for a minute there but I caught it just in time. We're still trying to figure some things out.

By the way, I think there was some general maintenance around the lab, which may have caused a temporary network "brown out."

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

soft^spirit
Joined: 18 May 99
Posts: 6374
Credit: 28,216,480
RAC: 183
United States
Message 1056781 - Posted: 16 Dec 2010, 23:01:41 UTC - in response to Message 1056774.

Matt, please consider the throttle-backs I mentioned through Eric and in NC: Panic 42, and back us off as needed. This should lighten the load on both Oscar and the link.
____________

Janice

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 6968
Credit: 57,056,365
RAC: 22,601
Germany
Message 1056810 - Posted: 17 Dec 2010, 0:22:01 UTC - in response to Message 1056774.
Last modified: 17 Dec 2010, 0:23:00 UTC

Matt, thanks for the news!


BTW, the current 'WUs in progress' limit is too low to bridge the next 3-day outage.. ;-)
____________
BR



>Das Deutsche Cafe. The German Cafe.<

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,925,774
RAC: 13,458
United Kingdom
Message 1056812 - Posted: 17 Dec 2010, 0:27:15 UTC - in response to Message 1056810.

Matt, thanks for the news!

BTW, the current 'WUs in progress' limit is too low to bridge the next 3-day outage.. ;-)

But it's too high to avoid maxing out the pipe ;-)

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 6968
Credit: 57,056,365
RAC: 22,601
Germany
Message 1056815 - Posted: 17 Dec 2010, 0:36:31 UTC - in response to Message 1056812.
Last modified: 17 Dec 2010, 0:38:23 UTC

Matt, thanks for the news!

BTW, the current 'WUs in progress' limit is too low to bridge the next 3-day outage.. ;-)

But it's too high to avoid maxing out the pipe ;-)


Yes..

But only one day (Monday to Tuesday morning) to fill up a 3-day WU cache is not enough.
At least if you need 2,700+ WUs for 3 days and you only have DSL light (384/64 kbit/s).
In the past it went like this: the limit was increased on Monday, downloads immediately backlogged in BOINC, and not enough WUs were downloaded to bridge the 3 days.

It would be wonderful to have a gigabit connection to the SETI@home server.

;-)
____________
BR



>Das Deutsche Cafe. The German Cafe.<

-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1056858 - Posted: 17 Dec 2010, 4:30:25 UTC - in response to Message 1056815.

Matt, thanks for the news!

BTW, the current 'WUs in progress' limit is too low to bridge the next 3-day outage.. ;-)

But it's too high to avoid maxing out the pipe ;-)


Yes..

But only one day (Monday to Tuesday morning) to fill up a 3-day WU cache is not enough.
At least if you need 2,700+ WUs for 3 days and you only have DSL light (384/64 kbit/s).
In the past it went like this: the limit was increased on Monday, downloads immediately backlogged in BOINC, and not enough WUs were downloaded to bridge the 3 days.

It would be wonderful to have a gigabit connection to the SETI@home server.

;-)


I think there will still be a bottleneck even with a gigabit link - it will just take a larger number of concurrent downloaders to hit it, or the log jam will simply be shorter. Too bad we don't have a scheduler for timed downloads, where one user would receive WUs in chunks of 5 or 10 and then let the next person go. I believe this is the idea behind the back-off period, but the number of users fighting over spots makes it more or less a dogfight for the first position that opens up.
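BeNt's chunked-scheduler idea can be sketched as a simple round-robin dispatcher: each waiting host gets a small chunk, then goes to the back of the line. This is purely illustrative (the function below is hypothetical, not how the actual download servers work):

```python
from collections import deque

def round_robin_dispatch(hosts, queue, chunk_size=5):
    """Serve waiting hosts in turn, chunk_size workunits at a time,
    so no single host monopolizes the pipe. 'queue' is the pool of
    workunits ready to send; returns {host: [workunits...]}."""
    waiting = deque(hosts)
    served = {h: [] for h in hosts}
    while queue and waiting:
        host = waiting.popleft()
        chunk, queue = queue[:chunk_size], queue[chunk_size:]
        served[host].extend(chunk)
        if queue:                # host rejoins the line for another turn
            waiting.append(host)
    return served

work = list(range(23))           # 23 workunits ready to send
out = round_robin_dispatch(["a", "b", "c"], work, chunk_size=5)
print([len(v) for v in out.values()])  # -> [10, 8, 5]
```

No host waits for another host's entire cache to fill before getting its first chunk, which is the fairness property being asked for.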

____________
Traveling through space at ~67,000mph!

bill
Joined: 16 Jun 99
Posts: 847
Credit: 20,550,935
RAC: 14,586
United States
Message 1056862 - Posted: 17 Dec 2010, 4:48:45 UTC - in response to Message 1056858.
Last modified: 17 Dec 2010, 4:56:45 UTC

So who do you want to alienate? A lot of small crunchers, by letting the large crunchers hog the bandwidth/time slots, or a few big crunchers, by letting everybody have an equal chunk of bandwidth/time slots?

Edited (note to self, don't forget to proof read before posting)

-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1056883 - Posted: 17 Dec 2010, 7:00:09 UTC - in response to Message 1056862.

So who do you want to alienate? A lot of small crunchers, by letting the large crunchers hog the bandwidth/time slots, or a few big crunchers, by letting everybody have an equal chunk of bandwidth/time slots?

Edited (note to self, don't forget to proof read before posting)


I suppose you are talking to me? I didn't know I had implied alienating anyone. I think the time and bandwidth would be best served by giving everyone equal bandwidth and time rather than letting one 'class' dominate the connections.
____________
Traveling through space at ~67,000mph!

RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,636,579
RAC: 13,037
United States
Message 1056886 - Posted: 17 Dec 2010, 7:19:59 UTC - in response to Message 1056883.

equal pay for everyone!!!
____________

Bill Walker
Joined: 4 Sep 99
Posts: 3246
Credit: 1,828,915
RAC: 228
Canada
Message 1056966 - Posted: 17 Dec 2010, 13:48:36 UTC - in response to Message 1056886.

equal pay for everyone!!!


From each according to their ability, to each according to their need. Isn't that the American way?
____________

Helli
Volunteer tester
Joined: 15 Dec 99
Posts: 697
Credit: 77,271,080
RAC: 75,129
Germany
Message 1056978 - Posted: 17 Dec 2010, 14:03:03 UTC - in response to Message 1056774.

... This will be enacted next week during the return of the 3-day weekly outage. It's unclear how regular these extended weekly outages will be - we'll figure that all out in the new year.
...
- Matt


Seriously?

That means nothing has changed for us, the crunchers, and we are at the
same point as before the big outage? Overloaded internet connection, a
3-day outage, and babysitting our rigs...

I had hoped for more (because of our generous donation)… :-(

Helli
____________

Chris S
Volunteer tester
Joined: 19 Nov 00
Posts: 29483
Credit: 8,897,705
RAC: 27,161
United Kingdom
Message 1056982 - Posted: 17 Dec 2010, 14:18:13 UTC

But back to oscar... we were pushing it pretty hard today - almost too much.


Whenever anyone gets a new car, the first thing they do is gun it up the local freeway/interstate/bypass to see what she'll do. Based on that, they then know what a sensible level of performance is for longevity. Same with servers, I guess! :-)

____________
Damsel Rescuer, Kitty Patron, Uli Fan, Julie Supporter, CAMRA
ES99 Admirer, Raccoon Friend, IFAW, PETA, 5% Badge


Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 2975
Credit: 4,811,001
RAC: 1,567
Canada
Message 1056987 - Posted: 17 Dec 2010, 14:29:48 UTC - in response to Message 1056978.
Last modified: 17 Dec 2010, 14:32:25 UTC

... This will be enacted next week during the return of the 3-day weekly outage. It's unclear how regular these extended weekly outages will be - we'll figure that all out in the new year.
...
- Matt


Seriously?

That means nothing has changed for us, the crunchers, and we are at the
same point as before the big outage? Overloaded internet connection, a
3-day outage, and babysitting our rigs...

I had hoped for more (because of our generous donation)… :-(

Helli

"I find my life is a lot easier the lower I keep my expectations."
–Calvin

"I’m not in this world to live up to your expectations and you’re not in this world to live up to mine."
— Bruce Lee

"Blessed is he who expects nothing, for he shall never be disappointed."
— Alexander Pope


My only hope is that Nitpicker will only take a year instead of the estimated 2.5 years to search through the current database.

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 21,567,117
RAC: 2,634
United States
Message 1057005 - Posted: 17 Dec 2010, 15:19:51 UTC

Part of the bandwidth/demand issue must be the limit of storage that Berkeley can provide to hold results-in-progress. This issue is a technical one, for sure, meaning that it can only be understood from experience and testing. However, I would think that if we ran with larger caches (especially the big crunchers), the delay times between retries were lengthened, and the download limit was increased at least a small amount, the issue would be resolved in large measure, save the occasional SETI crash or the (abhorrent) 3-day outages. We could hope that these actions would reduce the ghost issue as well.
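Lengthening the delay between retries is usually done with exponential backoff plus random jitter, so that hosts spread their retries out instead of stampeding the instant the servers come back. A generic sketch (not BOINC's actual backoff algorithm):

```python
import random

def backoff_delay(attempt, base=60.0, cap=3600.0, rng=random):
    """Exponential backoff with full jitter: the delay ceiling doubles
    with each failed attempt, capped at 'cap' seconds, and a random
    point below the ceiling is chosen so hosts don't all retry at the
    same instant."""
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0, ceiling)

# Ceilings for the first few attempts (jitter picks a point below each):
print([min(3600.0, 60.0 * 2 ** a) for a in range(7)])
# -> [60.0, 120.0, 240.0, 480.0, 960.0, 1920.0, 3600.0]
```

The jitter is the part that matters for the "dogfight" described earlier in the thread: without it, every backed-off host retries at the same moment and the pipe saturates again immediately.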

So, does anybody know how many results-in-progress seti can manage and are we close to that number yet?

In my case, on my main hosts, I have unwillingly relented and now run Einstein with a resource setting of zero. It reduces my contribution to SETI, but keeps the electrons oscillating when SETI goes burp.

msattler
Volunteer tester
Joined: 9 Jul 00
Posts: 37287
Credit: 498,390,064
RAC: 496,469
United States
Message 1057006 - Posted: 17 Dec 2010, 15:22:39 UTC - in response to Message 1057005.

Part of the bandwidth/demand issue must be the limit of storage that Berkeley can provide to hold results-in-progress. This issue is a technical one, for sure, meaning that it can only be understood from experience and testing. However, I would think that if we ran with larger caches (especially the big crunchers), the delay times between retries were lengthened, and the download limit was increased at least a small amount, the issue would be resolved in large measure, save the occasional SETI crash or the (abhorrent) 3-day outages. We could hope that these actions would reduce the ghost issue as well.

So, does anybody know how many results-in-progress seti can manage and are we close to that number yet?

In my case, on my main hosts, I have unwillingly relented and now run Einstein with a resource setting of zero. It reduces my contribution to SETI, but keeps the electrons oscillating when SETI goes burp.

I would think that with the new servers online, our 'in process' limit has been increased immensely.

Kitties go burp from digesting the new WUs sent recently....
____________
******************
Crunching Seti, loving all of God's kitties.

I have met a few friends in my life.
Most were cats.

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 21,567,117
RAC: 2,634
United States
Message 1057007 - Posted: 17 Dec 2010, 15:24:00 UTC - in response to Message 1056966.

equal pay for everyone!!!


From each according to their ability, to each according to their need. Isn't that the American way?


I can't resist a reply: this quote is simplistic liberal pap, because it leaves hanging who decides ability and need. Too many Americans fail to read between the lines.

N9JFE
Volunteer tester
Joined: 4 Oct 99
Posts: 9306
Credit: 11,905,639
RAC: 14,495
United States
Message 1057011 - Posted: 17 Dec 2010, 15:30:55 UTC - in response to Message 1056978.

... This will be enacted next week during the return of the 3-day weekly outage. It's unclear how regular these extended weekly outages will be - we'll figure that all out in the new year.
...
- Matt


Seriously?

That means nothing has changed for us, the crunchers, and we are at the
same point as before the big outage? Overloaded internet connection, a
3-day outage, and babysitting our rigs...

I had hoped for more (because of our generous donation)… :-(

Helli

Seriously? You only read the first sentence of what you quoted?

I was hoping they could cut the outages down to two days, but it sounds like they're leaning more toward not doing them every week.

I, for one, am fine with that. I just let BOINC run on its own, the way it was designed to, for weeks at a time without even opening the BOINC Manager, and months at a time without getting into my preferences on the project web sites. I'm taking more active control lately because running Einstein exclusively during the SETI outage was causing my computer to crash, but I'm hoping it will reach a new equilibrium so I can leave it alone again. If that means I occasionally go a day or two without any work to do, then so be it.

David
____________
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.


Helli
Volunteer tester
Joined: 15 Dec 99
Posts: 697
Credit: 77,271,080
RAC: 75,129
Germany
Message 1057017 - Posted: 17 Dec 2010, 15:42:45 UTC - in response to Message 1057011.

LOL

If I also had a RAC of only 118, I wouldn't be worried either. :D

Helli
____________

ACisnero210
Volunteer tester
Joined: 2 Aug 08
Posts: 20
Credit: 1,128,485
RAC: 0
United States
Message 1057031 - Posted: 17 Dec 2010, 16:21:13 UTC - in response to Message 1057007.

equal pay for everyone!!!


From each according to their ability, to each according to their need. Isn't that the American way?


I can't resist a reply: this quote is simplistic liberal pap, because it leaves hanging who decides ability and need. Too many Americans fail to read between the lines.

How about this? Too many Americans complicate the simplest idea!
____________

Profi
Volunteer tester
Joined: 8 Dec 00
Posts: 19
Credit: 12,974,281
RAC: 393
Poland
Message 1057036 - Posted: 17 Dec 2010, 16:26:37 UTC
Last modified: 17 Dec 2010, 16:46:38 UTC

[As of 17 Dec 2010 16:00:08 UTC]

Results ready to send: 198,960 <- BTW, shouldn't it be "Workunits" instead of "Results"? (Maybe I'm getting this wrong, but isn't it the amount of work-to-do created?)

Current result creation rate: 16.4682/sec <- same as above - WUs instead of Results? (Is this the work-to-do creation rate, or the rate at which results from users are being inserted into the database?)

Am I misunderstanding the definitions of "Workunit" and "Result"?

Is a "Workunit" a portion of raw telescope data "chopped" into units of approx. 380 KB (OK, skipping VLAR and other "derivatives") and sent to users?

Is a "Result" the outcome of a cruncher machine's calculation on a chunk of raw data, which is returned to the S@H server for cross-checking against results returned by other users (machines) working on the same chunk of data (plus other server-side operations like DB assimilation etc.)?

Correct me here if I'm wrong - thanks in advance...


Results received in last hour: 50,451

1 hr = 3,600 s

So from a simple "steady-state" calculation, at the current rate:

16.4682 × 3,600 = 59,285.52 WUs can be created each hour.

So right now the project is more or less balanced - the creation rate is slightly higher than WU demand. The current WU pool will satisfy about 4 hours of crunchers' demand. If you want a 3-day outage, then I'm afraid the project will run dry, and it will get worse after the outage (higher demand) - and as time goes by, crunching machines are not getting any slower...
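Profi's steady-state arithmetic, redone with the server-status numbers quoted above:

```python
# Steady-state check from the 17 Dec 2010 server-status figures
# quoted in this post.
ready_to_send = 198_960          # results ready to send
creation_rate = 16.4682          # results created per second
received_last_hour = 50_451      # results received in the last hour

created_per_hour = creation_rate * 3600
hours_of_buffer = ready_to_send / received_last_hour

print(round(created_per_hour))    # ~59,286 created/hour vs ~50,451 returned
print(round(hours_of_buffer, 1))  # the ready-to-send pool covers ~3.9 hours
```

So creation outpaces returns by roughly 9,000 results/hour, but the buffer itself is only about 4 hours deep - which is why a 3-day outage looks tight.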

But keep on pushing, folks!! :)

Profi

P.S. I suggested a 1 Gbit connection as a temporary solution a long time ago - hopefully it will be realized soon. But I think that sooner or later the "server side" of the project will have to be split somehow across a couple of sites spread around the world, with Berkeley remaining "capo di tutti capi" of the project - otherwise S@H may become a victim of its own success.
____________



Copyright © 2014 University of California