Moving on... (Apr 08 2013)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
So! We made the big move to the colocation facility without too much pain and anguish. In fact, thanks to some precise planning and preparation, we were pretty much back online a day earlier than expected.
Were there any problems during the move? Nothing too crazy. Some expected confusion about the network/DNS configuration. A lot of expected struggle due to the frustrating non-standards regarding rack rails. And one unexpected nuisance: the power strips mounted in the back of the rack were blocking the external SATA ports on the JBOD which holds georgem/paddym's disks. However, if we moved the strip, it would block other ports on other servers. It was a bit of a puzzle, eventually solved.
It feels great knowing our servers are on real backup power for the first time ever, and on a functional KVM, and behind a more rigid firewall that we control ourselves. As well, we no longer have that 100 Mbit hardware limit in our way, so we can use the full gigabit of Hurricane Electric bandwidth. Jeff and I predicted, based on previous demand, that once things settled down we'd see an average bandwidth usage of 150 Mbits/second (as long as both multibeam and astropulse workunits were available). And in fact this is what we're seeing, though we are still tuning some throttle mechanisms to make sure we don't go much higher than that.
Why not go higher? At least three reasons for now. First, we don't really have the data or the ability to split workunits faster than that. Second, we eventually hope to move off Hurricane and get on the campus network (and wantonly grabbing all the bits we can for no clear scientific reason wouldn't be setting a good example that we are in control of our needs/traffic). Third, and perhaps most importantly, it seems that our result storage server can't handle a much higher load. Yes, that seems to be our big bottleneck at this point: that server can't write results to disk much faster than current demand requires. We expected as much. We'll look into improving the disk I/O on that system soon. And we'll see how we fare after tomorrow's outage...
What's next? We still have a couple more servers to bring down, perhaps next week, like the BOINC/CASPER web servers, and Eric's GALFA machines. None of these will have any impact on SETI@home.
Meanwhile there are lots of minor annoyances. Remember that a lot of our server issues stemmed from a crazy web of cross dependencies (mostly NFS). Well in advance, we started to untangle that web to get these servers on different subnets, but you can imagine we missed some pieces, and the resulting fallout: a decade's worth of scripts scattered around in a decade's worth of random locations, each expecting a mount to exist and not getting it. Nothing remotely tragic, and we may very well be beyond all that at this point.
- Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
SciManStev Joined: 20 Jun 99 Posts: 6657 Credit: 121,090,076 RAC: 0 |
Awesome Matt! This move has been a huge breath of fresh air for everyone! Crunching SETI just got a lot easier, and a lot more maintainable on your end! Well done! Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
shizaru Joined: 14 Jun 04 Posts: 1130 Credit: 1,967,904 RAC: 0 |
Matt, you sound... happy!:D Congrats!! And like Steve said, awesome! |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Good news to all: DL/UL are now fast and error-free. We just need a small increase in the GPU WU limit to be totally happy. |
Gary Charpentier Joined: 25 Dec 00 Posts: 30936 Credit: 53,134,872 RAC: 32 |
"(and wantonly grabbing all the bits we can for no clear scientific reason" Speaking of that, how are we doing on raw data collection vs. the speed at which we can crunch the data? I know the data page hasn't been updated in a coon's age as to what is collected. |
rebest Joined: 16 Apr 00 Posts: 1296 Credit: 45,357,093 RAC: 0 |
Thanks, Matt. I hope that this means that you, Eric and the guys will have time to actually be creative rather than fighting fires. Join the PACK! |
Thomas Joined: 9 Dec 11 Posts: 1499 Credit: 1,345,576 RAC: 0 |
Good Game Matt ! :) Congrats to all the roster of SETI@home ! Everyone has felt the success of the migration and the whole world is really happy. A new era for the project is open ! And what a relief for the team in terms of logistics... Thanks for the heads-up and thanks again for this big move. |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Matt... Thank you for both your news post and the dedication to the project you have shown during the transition. I should wish one bit of explanation though, regarding this statement..... "Why not go higher? At least three reasons for now. First, we don't really have the data or the ability to split workunits faster than that." It would appear that the ability of the splitters to keep up with demand and the improved distribution of work with the increased bandwidth has been doing rather well since coming back up. Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates? And if so, would more drives for data shuttle service help? Or is it a limit on the rate that you are able to record data or the time allowed to do so? This is an important question for some of us devoted souls. "Time is simply the mechanism that keeps everything from happening all at once." |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
"Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?" Right... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
ML1 Joined: 25 Nov 01 Posts: 21019 Credit: 7,508,002 RAC: 20 |
"... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have." Roll-on Real Time Processing :-) (Just a few GPUs needed?...) Next bit of research is how to sift through the huge database of results in parallel? ;-) Happy fast crunchin', Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
Gary Charpentier Joined: 25 Dec 00 Posts: 30936 Credit: 53,134,872 RAC: 32 |
"Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?" Thanks for the straight answer. So just how much does it cost to put a receiver at another telescope (ballpark) so we can use more of our big fat pipe? |
Cheopis Joined: 17 Sep 00 Posts: 156 Credit: 18,451,329 RAC: 0 |
"Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?" Any hints on where the team might be planning on going next? More sites for the same depth of analysis, or deeper analysis of the data that we already have a pipeline for? Or more background work shifted to the remote computers? RFID / NTPCKR / splitting? If the answer is that you guys aren't sure yet because you are still watching how things develop, that's fine too! |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
"Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?" Or how much would it cost to increase the bandwidth recorded at Arecibo? Dan Wertheimer noted that as a desirable change in a request for donations 2 or 3 years ago, and doubling the bandwidth doubles the amount of data per observation time. Joe |
Wolverine Joined: 9 Jan 00 Posts: 35 Credit: 7,361,717 RAC: 0 |
Nice work! All that and I didn't feel a thing. Time for another $$ drive to fix that bottleneck? What is needed to eliminate it? Hardware? Cost?? |
HAL9000 Joined: 25 Mar 13 Posts: 12 Credit: 259,905 RAC: 0 |
|
rob smith Joined: 7 Mar 03 Posts: 22460 Credit: 416,307,556 RAC: 380 |
"Nice work! All that and I didn't feel a thing." GPUUG has a funding drive running for replacement servers; take a look at this thread: http://setiathome.berkeley.edu/forum_thread.php?id=70511 Then make a donation. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Cheopis Joined: 17 Sep 00 Posts: 156 Credit: 18,451,329 RAC: 0 |
"Nice work! All that and I didn't feel a thing." Well, the two servers that are being collected for now are an upload and a download server. As it turns out, simply putting the servers into a better environment for upload and download processing seems to have drastically improved data upload/download handling. Sure, beefier upload/download servers could do more, but do we really need new servers for upload and download, or should we redirect funds towards a different goal? Maybe we really do need a new upload and download server, but after seeing what the old machines are doing now, perhaps the specifications for the new machines can be dropped significantly, while still leaving room for upgrades at a later time? At the very least I hope a reconsideration of upload and download server needs is in the works before any new hardware for those roles is purchased for use in the new server facility. Upload and download aren't ALL that the two new servers are slated to do, but it might be that a completely different server architecture is appropriate for best performance on the other work, leaving the current upload and download servers doing what they have now proven they can do quite nicely :) |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
"Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?" Hey, weren't we supposed to start seeing some data from somewhere else, Green Bank I think? Whatever happened with that? David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Filipe Joined: 12 Aug 00 Posts: 218 Credit: 21,281,677 RAC: 20 |
"Well, the two servers that are being collected for now are an upload and a download server. As it turns out, it seems that simply putting the servers into a better environment for upload and download processing seems to have drastically improved data upload/download handling. Sure, beefier upload/download servers could do more, but do we really need new servers for upload and download, or should we redirect funds towards a different goal? Maybe we really do need a new upload and download server, but after seeing what the old machines are doing now, perhaps the specifications for the new machines can be dropped significantly, while still leaving room for upgrades at a later time?" A dedicated "NTPCKR" server is the way to go. There is no point in having all the results sitting unchecked in the database. |
Gary Charpentier Joined: 25 Dec 00 Posts: 30936 Credit: 53,134,872 RAC: 32 |
"A 'NTPCKR' dedicated server, is the way to go." They aren't: http://setiathome.berkeley.edu/ntpckr.php It just isn't ready for prime time ... |
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.