Busy Bytes (Jul 06 2009)

Message boards : Technical News : Busy Bytes (Jul 06 2009)


Profile Sacaripasa
Joined: 29 Dec 05
Posts: 13
Credit: 2,050,629
RAC: 1
United States
Message 915168 - Posted: 7 Jul 2009, 2:31:08 UTC - in response to Message 915119.  

There is no "prize" at the end of a fictitious line here, so if there is downtime, people should understand that, take a deep breath, and wait for it to clear. Being the #1 cruncher gets you what, bragging rights on a message board!? Matt is in an awful position here, having to obey orders even if he disagrees or has better ideas that are falling on deaf ears. Another $80k would be great, but with today's financial rout, that will be just as hard as having a problem-free SETI@home day. BTW, I'll be in the top 50 of my class tomorrow, first milestone!
ID: 915168
Profile Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 30591
Credit: 53,134,872
RAC: 32
United States
Message 915169 - Posted: 7 Jul 2009, 2:31:26 UTC - in response to Message 915126.  

Wrong answer: the problem only started within the last three weeks... before that it was not too bad... something changed, but what?

CUDA
No Seriously!
A lot of people have gone out and gotten a card to crunch more. It takes just one straw to break the camel's back. Someone bought that card and brought it online. Too much of a good thing ...

If everyone took 5% off their SETI percentage and gave it to their backup project(s), it would clear the logjam for now. The long-term answer is $$$ to get gigabit up the hill.

I've got an alternative suggestion for those who refuse to run a backup: turn your computer off for an hour or two a day and send in the $$ you don't spend on electricity as a donation.

ID: 915169
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 915171 - Posted: 7 Jul 2009, 2:36:01 UTC - in response to Message 915148.  

Re: Maxed out bandwidth.

This seems to cause the most anger among crunchers because the uploads are blocked, and when that happens some hosts run out of work and can't download any more WUs. Users then increase their cache size to try to hold enough work to tide them over, which just makes the problem worse.

Why not restrict the bandwidth of just the download servers to 80-85 Mbit/s?

This would obviously lengthen the time spent at maximum download bandwidth, but it would leave enough headroom for the uploads to get through. This in turn would:
1. Reduce storage requirements for 'in progress' work.
2. Allow crunchers to get new work because they can upload completed work (reducing frustration).
3. Reduce the need for very large caches.

Theoretically everybody would get some (enough) work, and large caches would slowly fill over time, faster as demand reduced.

Is there a flaw in my logic?


I made more or less the same suggestion in another thread. However, I thought that if they could prioritize the uploads, without a static restriction, that would help solve the problem. Nobody seemed to understand my suggestion, or perhaps it's just a bad one.

I don't think reducing cache size matters too much, but reducing storage requirements by making sure the uploads succeed does make sense, even at the cost of some inefficient bandwidth use.

A while ago, SETI was running on half the bandwidth and pegging it. Something happened, the bandwidth doubled, and overnight it was pegged again. Since the number of users didn't change that quickly, and this was before the baraCUDA, I think some back-office procedure must have changed to choke the bandwidth chicken.

From a different perspective: like any other 'free' resource, bandwidth will be used until it is used up, so the fact that it is pegged so frequently is no surprise. However, the BOINC system (including MySQL) just doesn't seem very solid, or at least not optimized. I'd rather see some change in procedures, or evolution of BOINC, before investing in yet more hardware and the like. It's a matter of trust, I suppose.
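The cap suggested above is classic traffic shaping. As an illustration only (the numbers and API are assumptions, not SETI's actual setup), a token bucket is one way a download server could enforce an 85 Mbit/s ceiling while leaving headroom for uploads:

```python
import time

class TokenBucket:
    """Cap average throughput at `rate` bytes/s while allowing short
    bursts of up to `burst` bytes."""

    def __init__(self, rate, burst):
        self.rate = float(rate)     # refill rate, bytes per second
        self.burst = float(burst)   # bucket capacity, bytes
        self.tokens = float(burst)  # start with a full bucket
        self.last = time.monotonic()

    def allow(self, nbytes, now=None):
        """True if `nbytes` may be sent now; if False, the caller defers."""
        if now is None:
            now = time.monotonic()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False

# An 85 Mbit/s ceiling for downloads would leave ~15 Mbit/s of the
# 100 Mbit/s link free for uploads (numbers from the suggestion above):
download_limiter = TokenBucket(rate=85e6 / 8, burst=2e6)
```

In practice this is what tools like Linux `tc` do at the kernel level; the sketch just shows the accounting.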
ID: 915171
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 915207 - Posted: 7 Jul 2009, 4:03:47 UTC - in response to Message 915126.  

Wrong answer: the problem only started within the last three weeks... before that it was not too bad... something changed, but what?

If you think it's never happened before, then you've not noticed, or you've forgotten.

The problem starts when there is some sort of interruption -- a server fails, or some bottleneck slows things down, and it doesn't get caught right away.

Over the next day or so, a backlog builds up, and then the BOINC clients start trying to "push through" all at once.
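That "push through all at once" effect is what randomized exponential backoff is meant to tame, and BOINC clients retry along these lines. A minimal illustrative sketch (not BOINC's actual code; the base and cap values are assumptions):

```python
import random

def retry_delay(attempt, base=60.0, cap=4 * 3600.0):
    """Randomized exponential backoff: each failed attempt doubles the
    delay (up to `cap`), and the jitter spreads clients out so a backlog
    doesn't slam the servers all at once when they come back."""
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(delay / 2, delay)

# attempt 0 -> 30..60 s, attempt 1 -> 60..120 s, attempt 8+ -> capped at 2..4 h
```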

It's particularly bad when there is a batch of "short" work like we've had lately, because that also increases the load.

The glitch that over-distributed Astropulse adds to that, since AP work units are long.

Faster processors (and CUDA) increase load as well, but that didn't happen in the last three weeks.

In other words, several things have pushed the load just a little beyond "normal" and it doesn't take much.

But it has all happened before, and at least so far it doesn't seem any worse than earlier glitches.
ID: 915207
Profile Jon Golding
Joined: 20 Apr 00
Posts: 105
Credit: 841,861
RAC: 0
United Kingdom
Message 915249 - Posted: 7 Jul 2009, 8:29:38 UTC

Are there any "spare" servers (down the hill, or elsewhere on or off campus) belonging to the Space Sciences Lab that could be used as a download mirror for distributing data to clients, to relieve pressure on the normal servers?
Nothing complicated: this would be only for downloads, from a large reservoir of pre-split WUs, at busy times. All processed data would be returned by clients to the normal servers up the hill. It could be one of the students' jobs to swap in a new drive of pre-split data every few days.
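For what it's worth, BOINC file descriptions can already list several download URLs per file, which gives clients exactly this kind of mirror failover. A minimal sketch of the idea (the `fetch` callable here is hypothetical):

```python
def download_with_fallback(urls, fetch):
    """Try each mirror URL in order and return the first successful result.

    `fetch` is whatever performs the actual HTTP transfer (hypothetical
    here) and is expected to raise on failure."""
    last_error = None
    for url in urls:
        try:
            return fetch(url)
        except Exception as err:
            last_error = err  # this mirror is down or overloaded; try the next
    raise RuntimeError("all %d mirrors failed" % len(urls)) from last_error
```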
ID: 915249
djmotiska

Joined: 26 Jul 01
Posts: 20
Credit: 29,378,647
RAC: 105
Finland
Message 915253 - Posted: 7 Jul 2009, 8:51:18 UTC - in response to Message 915249.  

I was just about to suggest using mirrors. It looks like Einstein@Home uses them.

The mirror server(s) could be anywhere; there's roughly 900 Mbit of unused capacity on that gigabit line for pushing split WUs out to the mirror(s). Alternatively, the WUs could be split at the mirror itself. The current SETI servers could handle the other jobs and maybe some of the download traffic.
ID: 915253
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 915256 - Posted: 7 Jul 2009, 9:24:11 UTC
Last modified: 7 Jul 2009, 9:26:39 UTC


Because of the traffic..

A new cable (down the hill), or whatever is needed..

~ US$ 80,000?


Isn't it possible to make it wireless?


I don't mean WLAN; I guess the distance would be too great. (WLAN tops out around 100 Mbit/s?)

How long is the distance, anyway?

In Germany, some towns and villages use wireless DSL to save the cost of digging up the streets and burying cables: one transmitter on a tall building (a church, the town hall, or whatever) and receivers in the houses.
I guess it works like radio.

I don't know what the bandwidth would be; maybe several transmitters and receivers would be needed to reach 1 Gbit/s?
Or maybe it would 'only' double the current rate, to 200 Mbit/s?

I don't know whether this would be cheaper and more practical than burying a big cable in the ground.

Also, don't laugh.. ;-) ..but what about satellite DSL? It would look very odd to have one or more satellite dishes on campus (depending on the bandwidth) that don't 'look' at the sky..
..but if it would help..

ID: 915256
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 915257 - Posted: 7 Jul 2009, 9:31:09 UTC
Last modified: 7 Jul 2009, 9:31:34 UTC


OTOH.. BTW.. maybe..

Are there 'WLAN repeaters' available for long distances?

An amplifier every 100 or 200 m, and there you go.

If each WLAN link carries 100 Mbit/s, two side by side would double the capacity.. ..or use both: the current cable plus one or two WLAN links.

ID: 915257
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 915259 - Posted: 7 Jul 2009, 9:37:13 UTC - in response to Message 915256.  

Isn't it possible to make it wireless?

It is.
Grant
Darwin NT
ID: 915259
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 915270 - Posted: 7 Jul 2009, 10:47:49 UTC - in response to Message 915259.  
Last modified: 7 Jul 2009, 10:48:45 UTC

Isn't it possible to make it wireless?

It is.


And.. why don't they do it, then? ;-)

I guess it would be a lot cheaper than cable..

Maybe there are people around here with knowledge of wireless equipment (WLAN, satellite, radio, or whatever) who would know how to do it?

ID: 915270
Profile S@NL - Eesger - www.knoop.nl
Joined: 7 Oct 01
Posts: 385
Credit: 50,200,038
RAC: 0
Netherlands
Message 915296 - Posted: 7 Jul 2009, 12:19:51 UTC

Another idea.. (maybe)

I built the "heart" of our team's statistics. I realize my "problems" are of a much smaller magnitude than Berkeley's, but I do think there are similarities.

One MySQL table, the user history, has been trouble for a while now, probably due to its size of almost 8 GB (things like: data corruption => complete restore from backup; SQL queries getting stuck => inactivity detection and restarting history generation, working out "where was I")..

To fix this I first tried another engine (InnoDB instead of MyISAM): no joy (the table got much larger and processing got much slower).

Next we'll try upgrading to a newer MySQL version, but I fear that won't (completely) fix the problem.

Now I am thinking of splitting the data into several identical tables, like this:

     >=         <      id's   active    %
          0    500000  152071   45589   30
     500000   2500000  132865   25980   20
    2500000   8250000  170345   20092   12
    8250000   8500000  154294   14652    9
    8500000   8750000  155417   12821    8
    8750000   9000000  164536   20945   13
    9000000   9999999   59176   24888   42

and for the 'read' queries use a MRG_MyISAM structure. That way I could direct each write to the right table based on an id 'greater than .. AND less than ..' check, and keep the reads simple through the merge table.
This might help with:
- easier backup and restore (smaller backup files, partial restores if needed)
- less fragmentation due to smaller tables => more stability
- faster queries due to smaller indexes

The potentially awkward part is that, as I discovered, altered data in an underlying table isn't pushed through to the merge table. In another table setup I used this to fix that:
ALTER TABLE `merge_table` UNION=(`table_1`,`table_2`....);

I haven't done any testing of what this query does on a huge table... (it worked like a charm on a couple of 50k-row tables, though :D)

Is there merit to this idea, and could "table splitting" be an option for the Berkeley tables as well?
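To make the write-routing side of the idea concrete, here is a minimal sketch (the shard names are made up; the ranges are the ones from the table in the post):

```python
# Illustrative id-range "table splitting": each write goes to the shard
# whose range contains the id; reads go through the MRG_MyISAM merge table.
SHARDS = [
    (0,         500_000,   "history_0"),
    (500_000,   2_500_000, "history_1"),
    (2_500_000, 8_250_000, "history_2"),
    (8_250_000, 8_500_000, "history_3"),
    (8_500_000, 8_750_000, "history_4"),
    (8_750_000, 9_000_000, "history_5"),
    (9_000_000, 9_999_999, "history_6"),
]

def shard_for(user_id):
    """Pick the table to write to, using '>= lo AND < hi' semantics."""
    for lo, hi, table in SHARDS:
        if lo <= user_id < hi:
            return table
    raise ValueError("id %d falls outside all shard ranges" % user_id)

# After changing the shard list, the merge table would be refreshed with e.g.:
#   ALTER TABLE `merge_table` UNION=(`history_0`, ..., `history_6`);
```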
ID: 915296
BMgoau

Joined: 8 Jan 07
Posts: 29
Credit: 1,562,200
RAC: 0
Australia
Message 915298 - Posted: 7 Jul 2009, 12:28:10 UTC

I think a picture really makes clear what the effect of the recent problems has been on the grid: SETI's growth has effectively stopped.

ID: 915298
Wolverine
Joined: 9 Jan 00
Posts: 35
Credit: 7,361,717
RAC: 0
Canada
Message 915305 - Posted: 7 Jul 2009, 13:06:09 UTC

How about sending out more of the AP 5.05 work and less of the SETI@home 6.03? More time out in the field processing, with fewer contacts to the server for more work: days' worth of work at a time instead of hours'.

Sure, the initial change might be a little rough, but I think it will smooth out once enough "larger" workunits are out in the field.

Once production of larger workunits is up to par, re-examine the workunit compression idea and see whether it would now be worthwhile, given the increased share of AP units.

Is it possible to produce workunits for an offsite mirror while the system is down for the weekly Tuesday maintenance? Stock up while things are cleaned up? This might reduce the startup stress on the servers when it all comes back online: let the offsite mirror take the load and give the main servers time to build up a reserve of workunits.


Just some things I wanted to throw out there for people to chew on.

- W
ID: 915305
Profile RandyC
Joined: 20 Oct 99
Posts: 714
Credit: 1,704,345
RAC: 0
United States
Message 915312 - Posted: 7 Jul 2009, 13:29:28 UTC - in response to Message 915305.  

How about sending out more of the AP 5.05 work and less of the SETI@home 6.03? More time out in the field processing, with fewer contacts to the server for more work: days' worth of work at a time instead of hours'.

Sure, the initial change might be a little rough, but I think it will smooth out once enough "larger" workunits are out in the field.


This is what many people (including myself) would like. The problem is that the AP and MB WUs are split from the same input datasets. To make a long story short, the AP splitters chew through the input files FASTER than MB WUs are being processed. Two or three weeks ago we had the situation where the AP splitters had processed every single one of about 100 input files while the MB splitters were busy working on the first 10 - 20 input files! The AP splitters were basically shut down for a couple of weeks to let MB catch up on the backlog.
ID: 915312
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 915319 - Posted: 7 Jul 2009, 13:53:44 UTC


Now I remember.. TV channels also have trucks for making live reports from remote locations.
IIRC, these small satellite-dish transmitters work with (a kind of) microwaves.
Of course video and audio need more bandwidth than radio alone.. but maybe this technology would suit internet traffic too.


Examples of these trucks:
http://de.wikipedia.org/wiki/%C3%9Cbertragungswagen

http://de.wikipedia.org/wiki/Satellite_News_Gathering
That article mentions 'only' 6 - 10 or 16 - 24 Mbit/s..
I don't know how old the report is (or the figures in it), but maybe the current technology is faster.. ;-)

ID: 915319
DJStarfox

Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 915323 - Posted: 7 Jul 2009, 14:02:14 UTC - in response to Message 915003.  
Last modified: 7 Jul 2009, 14:03:09 UTC

Matt,
If this is really a science/research project, then there are only two solutions to the bandwidth problem.

1. Get more funding
2. Deploy the NTPCkr

I HIGHLY favor these two over introducing some non-productive or artificial latency. From these two goals, other ideas become possible.

I would really like to install a server-side instance of BOINC on my machine at home, but I need to make time and upgrade drive space. It would really help me understand how this all works, and it sounds like something cool to get into.
ID: 915323
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 915332 - Posted: 7 Jul 2009, 14:25:54 UTC


How about outsourcing some of the Berkeley traffic?

If SETI@home has members with big internet connections and big HDDs, maybe the SETI@home scheduler could tell clients, at every work request, which other member's server to download new WUs from or upload results to.
If not for free, then maybe for some money toward the electricity bill.

These outsourced servers at members' homes would report/upload/download to Berkeley at staggered times.

Or maybe the scheduler could point to a different outsourced server each time, so those servers wouldn't run at full load either.

IIRC, Skype also uses the hardware and internet connections of many home users to improve voice quality.


..that's my last idea for today.. I don't want to SPAM the forum.. ;-D

ID: 915332
Profile Geek@Play
Volunteer tester
Joined: 31 Jul 01
Posts: 2467
Credit: 86,146,931
RAC: 0
United States
Message 915345 - Posted: 7 Jul 2009, 15:05:28 UTC
Last modified: 7 Jul 2009, 15:06:08 UTC

As a scientific project, the scientists in charge must keep extremely tight control of the data in order to preserve its validity. For that reason it's not likely they would move data to servers outside their direct control.

Even the data we are given is checked by a wingman before it's loaded into the database. Constant checks and controls have to be maintained to preserve the validity of the science.
ID: 915345
Profile Sutaru Tsureku
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 915360 - Posted: 7 Jul 2009, 15:56:28 UTC - in response to Message 915003.  

...
We could also increase the resolution of chirp rates that we process, thus lengthening the time it takes to process a workunit.
...


Is this 'useful' for the science?
What does it mean? Will the data be analysed more finely/accurately?


BTW, this isn't an idea.. it's a question.. ;-)

ID: 915360
Profile cgregb
Joined: 2 Apr 03
Posts: 515
Credit: 7,776,883
RAC: 362
United States
Message 915369 - Posted: 7 Jul 2009, 16:13:14 UTC - in response to Message 915003.  

Having read these forums for a while, I appreciate your pain.

I am not a techie, but a money and marketing guy.

A few thoughts:

Define the problem, not just the symptoms (scientific)

Define the best solution (scientific)

Define the resources needed (scientific, political, monetary)

Create a liaison team to assist with the non-scientific issues.

Delegate the tasks to responsible people (of whom there are thousands trying to help you)

Implement - Evaluate - Adjust - do it again.

I am sure you realize you are on the bleeding edge of technology. You will eventually write the book (if you are not writing it already) on distributed processing. I would think that several commercial organizations would be more than happy to review your work for a mere donation of money or equipment. You have ten-plus years of experience going where no one has gone before (pardon the pun).

You and the folks at Berkeley are making tremendous progress in computer science as well as astronomy.

Like any other not-for-profit (not non-profit) organization, you have grown to the point where you need additional infrastructure, in terms of people and skills, to allow you to focus on your goals.

We are fully behind you! Take the step. Dare to be great; don't settle for good enough.

Go get em, tiger!

Greg
Greg in Phoenix

ID: 915369



 
©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. Astropulse is funded in part by the NSF through grant AST-0307956.