Moving on... (Apr 08 2013)

Message boards : Technical News : Moving on... (Apr 08 2013)


Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1354823 - Posted: 8 Apr 2013, 22:10:38 UTC

So! We made the big move to the colocation facility without too much pain and anguish. In fact, thanks to some precise planning and preparation we were pretty much back on line a day earlier than expected.

Were there any problems during the move? Nothing too crazy. Some expected confusion about the network/DNS configuration. A lot of expected struggle due to the frustrating non-standards regarding rack rails. And one unexpected nuisance: the power strips mounted in the back of the rack were blocking the external SATA ports on the JBOD that holds georgem/paddym's disks. However, if we moved the strip, it would block other ports on other servers. It was a bit of a puzzle, eventually solved.

It feels great knowing our servers are on real backup power for the first time ever, on a functional KVM, and behind a more rigid firewall that we control ourselves. As well, we no longer have that 100 Mbit hardware limit in our way, so we can use the full gigabit of Hurricane Electric bandwidth.

Jeff and I predicted based on previous demand that we'd see, once things settled down, a bandwidth usage average of 150Mbits/second (as long as both multibeam and astropulse workunits were available). And in fact this is what we're seeing, though we are still tuning some throttle mechanisms to make sure we don't go much higher than that.

Why not go higher? At least three reasons for now. First, we don't really have the data or the ability to split workunits faster than that. Second, we eventually hope to move off Hurricane and get on the campus network (and wantonly grabbing all the bits we can for no clear scientific reason wouldn't set a good example that we are in control of our needs/traffic). Third, and perhaps most importantly, it seems that our result storage server can't handle a much higher load. Yes, that seems to be our big bottleneck at this point: that server's ability to write results to disk much faster than current demand requires. We expected as much. We'll look into improving the disk I/O on that system soon. And we'll see how we fare after tomorrow's outage...
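For the curious, the kind of throttle Matt describes is often implemented as a token bucket: tokens accrue at the target rate (say ~150 Mbit/s) and each transfer spends some, smoothing traffic toward the average. The sketch below is purely illustrative; the class, rates, and API are invented for this example and are not SETI@home's actual throttle code:

```python
import time

class TokenBucket:
    """Illustrative token-bucket rate limiter (not SETI@home's actual
    throttle code). Tokens accrue at `rate_bps` bits per second, up to a
    burst ceiling; a transfer is allowed only if enough tokens remain."""

    def __init__(self, rate_bps, burst_bits):
        self.rate = rate_bps        # sustained rate, e.g. 150e6 for ~150 Mbit/s
        self.capacity = burst_bits  # largest burst permitted
        self.tokens = burst_bits    # start with a full bucket
        self.last = time.monotonic()

    def allow(self, nbits):
        """Return True (and spend tokens) if nbits may be sent now."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= nbits:
            self.tokens -= nbits
            return True
        return False

bucket = TokenBucket(rate_bps=150e6, burst_bits=10e6)
print(bucket.allow(1e6))   # True: a small send fits within the initial burst
print(bucket.allow(20e6))  # False: larger than the bucket can ever hold
```

A real server-side throttle would sit in front of the upload/download handlers, but the bookkeeping is the same idea: bursts are tolerated while the long-run average stays pinned at the chosen rate.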

What's next? We still have a couple more servers to bring down, perhaps next week, like the BOINC/CASPER web servers and Eric's GALFA machines. None of these will have any impact on SETI@home. Meanwhile there are lots of minor annoyances. Remember that a lot of our server issues stemmed from a crazy web of cross dependencies (mostly NFS). Well in advance we started to untangle that web to get these servers on different subnets, but as you can imagine we missed some pieces, with the resulting fallout: a decade's worth of scripts scattered around in a decade's worth of random locations, each expecting a mount to exist and not finding it. Nothing remotely tragic, and we may very well be beyond all that at this point.

- Matt

-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1354823
Profile SciManStev Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Jun 99
Posts: 6657
Credit: 121,090,076
RAC: 0
United States
Message 1354826 - Posted: 8 Apr 2013, 22:27:51 UTC
Last modified: 8 Apr 2013, 22:28:22 UTC

Awesome Matt!
This move has been a huge breath of fresh air for everyone! Crunching SETI just got a lot easier, and a lot more maintainable on your end!

Well done!

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1354826
Profile shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1354827 - Posted: 8 Apr 2013, 22:43:36 UTC

Matt, you sound... happy! :D

Congrats!! And like Steve said, awesome!
ID: 1354827
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1354830 - Posted: 8 Apr 2013, 22:51:49 UTC

Good news for all: DL/UL are now fast and error-free. Now we just need a small increase in the GPU WU limit to be totally happy.
ID: 1354830
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30971
Credit: 53,134,872
RAC: 32
United States
Message 1354832 - Posted: 8 Apr 2013, 23:00:43 UTC - in response to Message 1354823.  

(and wantonly grabbing all the bits we can for no clear scientific reason

Speaking of that, how are we doing on raw data collection vs. the speed at which we can crunch the data? I know the data page hasn't been updated in ages as to what has been collected.

ID: 1354832
Profile rebest Project Donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 45,357,093
RAC: 0
United States
Message 1354834 - Posted: 8 Apr 2013, 23:01:33 UTC

Thanks, Matt.

I hope that this means that you, Eric and the guys will have time to actually be creative rather than fighting fires.

Join the PACK!
ID: 1354834
Thomas
Volunteer tester

Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1354925 - Posted: 9 Apr 2013, 6:38:10 UTC

Good game, Matt! :)
Congrats to the whole SETI@home roster!
Everyone has felt the success of the migration, and the whole world is really happy.
A new era for the project has opened!
And what a relief for the team in terms of logistics...
Thanks for the heads-up, and thanks again for this big move.
ID: 1354925
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1354931 - Posted: 9 Apr 2013, 7:15:28 UTC

Matt...
Thank you for both your news post and the dedication to the project you have shown during the transition.

I would like one bit of explanation, though, regarding this statement:

"Why not go higher? At least three reasons for now. First, we don't really have the data or the ability to split workunits faster than that."

It would appear that the splitters have been keeping up with demand rather well since coming back up, and that the increased bandwidth has improved the distribution of work.

Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?
And if so, would more drives for data shuttle service help?
Or is it a limit on the rate that you are able to record data or the time allowed to do so?

This is an important question for some of us devoted souls.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1354931
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1355085 - Posted: 9 Apr 2013, 21:07:45 UTC - in response to Message 1354931.  

Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?
And if so, would more drives for data shuttle service help?
Or is it a limit on the rate that you are able to record data or the time allowed to do so?


Right... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1355085
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21099
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1355097 - Posted: 9 Apr 2013, 21:44:20 UTC - in response to Message 1355085.  

... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have.

Roll-on Real Time Processing :-)


(Just a few GPUs needed?...)

Next bit of research is how to sift through the huge database of results in parallel? ;-)


Happy fast crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1355097
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30971
Credit: 53,134,872
RAC: 32
United States
Message 1355100 - Posted: 9 Apr 2013, 21:57:10 UTC - in response to Message 1355085.  

Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?
And if so, would more drives for data shuttle service help?
Or is it a limit on the rate that you are able to record data or the time allowed to do so?


Right... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have.

- Matt

Thanks for the straight answer.

So just how much does it cost to put a receiver at another telescope (ballpark) so we can use more of our big fat pipe?

ID: 1355100
Cheopis

Joined: 17 Sep 00
Posts: 156
Credit: 18,451,329
RAC: 0
United States
Message 1355112 - Posted: 9 Apr 2013, 23:00:46 UTC - in response to Message 1355085.  

Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?
And if so, would more drives for data shuttle service help?
Or is it a limit on the rate that you are able to record data or the time allowed to do so?


Right... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have.

- Matt


Any hints on where the team might be planning on going next? More sites for the same depth of analysis, or deeper analysis of the data that we already have a pipeline for? Or more background work shifted to the remote computers? RFI / NTPCKR / splitting?

If the answer is that you guys aren't sure yet because you are still watching how things develop, that's fine too!
ID: 1355112
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1355115 - Posted: 9 Apr 2013, 23:09:17 UTC - in response to Message 1355100.  

Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?
And if so, would more drives for data shuttle service help?
Or is it a limit on the rate that you are able to record data or the time allowed to do so?


Right... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have.

- Matt

Thanks for the straight answer.

So just how much does it cost to put a receiver at another telescope (ballpark) so we can use more of our big fat pipe?

Or how much would it cost to increase the bandwidth recorded at Arecibo? Dan Werthimer noted that as a desirable change in a request for donations 2 or 3 years ago, and doubling the bandwidth doubles the amount of data per unit of observation time.
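Josef's point that doubling the recorded bandwidth doubles the data volume follows directly from complex (I/Q) Nyquist sampling: a band of B Hz needs B complex samples per second per channel. A quick back-of-the-envelope sketch, using hypothetical channel counts and sample sizes rather than the recorder's real figures:

```python
def recorded_bytes_per_sec(bandwidth_hz, channels, bits_per_complex_sample=4):
    """Data rate for complex-sampled baseband recording: B Hz of bandwidth
    needs B complex samples/s per channel (Nyquist), each sample taking
    bits_per_complex_sample bits (e.g. 2-bit I + 2-bit Q)."""
    return bandwidth_hz * channels * bits_per_complex_sample / 8

# Hypothetical figures, for illustration only:
base = recorded_bytes_per_sec(2.5e6, channels=14)     # 17.5 MB/s
doubled = recorded_bytes_per_sec(5.0e6, channels=14)  # 35.0 MB/s
print(doubled / base)  # 2.0: twice the bandwidth, twice the data
```

The ratio is exactly 2 regardless of the channel count or sample size, which is why widening the recorded band scales the science data linearly with the same telescope time.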
                                                                  Joe
ID: 1355115
Wolverine
Joined: 9 Jan 00
Posts: 35
Credit: 7,361,717
RAC: 0
Canada
Message 1355127 - Posted: 9 Apr 2013, 23:48:39 UTC

Nice work! All that and I didn't feel a thing.

Time for another $$ drive to fix that bottleneck? What is needed to eliminate it?

Hardware?

Cost??

ID: 1355127
HAL9000
Volunteer tester
Joined: 25 Mar 13
Posts: 12
Credit: 259,905
RAC: 0
Canada
Message 1355147 - Posted: 10 Apr 2013, 1:05:51 UTC

w00t, great to hear all is well ;)

10.7 billion ops/sec
ID: 1355147
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22492
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1355204 - Posted: 10 Apr 2013, 5:55:39 UTC - in response to Message 1355127.  

Nice work! All that and I didn't feel a thing.

Time for another $$ drive to fix that bottleneck? What is needed to eliminate it?

Hardware?

Cost??


GPUUG has a funding drive running for replacement servers; take a look at this thread: http://setiathome.berkeley.edu/forum_thread.php?id=70511
Then make a donation.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1355204
Cheopis

Joined: 17 Sep 00
Posts: 156
Credit: 18,451,329
RAC: 0
United States
Message 1355296 - Posted: 10 Apr 2013, 13:13:48 UTC - in response to Message 1355204.  
Last modified: 10 Apr 2013, 13:24:20 UTC

Nice work! All that and I didn't feel a thing.

Time for another $$ drive to fix that bottleneck? What is needed to eliminate it?

Hardware?

Cost??


GPUUG has a funding drive for replacement servers running take a look at this thread: http://setiathome.berkeley.edu/forum_thread.php?id=70511
Then make a donation.


Well, the two servers being funded for now are an upload server and a download server. As it turns out, simply putting the existing servers into a better environment seems to have drastically improved upload/download handling. Sure, beefier upload/download servers could do more, but do we really need new servers for those roles, or should we redirect funds toward a different goal? Maybe we really do need them, but after seeing what the old machines are doing now, perhaps the specifications for the new machines can be dropped significantly, while still leaving room for upgrades at a later time?

At the very least I hope a reconsideration of upload and download server needs is in the works before any new hardware for those roles is purchased for use in the new server facility.

Upload and download aren't ALL that the two new servers are slated to do, but it may be that a completely different server architecture would be appropriate for best performance on the other work, leaving the current upload and download servers doing what they have now proven they can do quite nicely :)
ID: 1355296
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1355299 - Posted: 10 Apr 2013, 13:43:03 UTC - in response to Message 1355085.  

Are you saying that you are not able to acquire enough data from Arecibo to continue to distribute work at the improved rates?
And if so, would more drives for data shuttle service help?
Or is it a limit on the rate that you are able to record data or the time allowed to do so?


Right... our observation time is the bottleneck in this case. When we are able to use the telescope we can record at the rates we desire, and we are easily able to shuttle all the data back to UCB on the drives we have.

- Matt

Hey, weren't we supposed to start seeing some data from somewhere else, Green Bank I think? Whatever happened with that?

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1355299
Filipe

Joined: 12 Aug 00
Posts: 218
Credit: 21,281,677
RAC: 20
Portugal
Message 1355323 - Posted: 10 Apr 2013, 15:26:10 UTC

Well, the two servers being funded for now are an upload server and a download server. As it turns out, simply putting the existing servers into a better environment seems to have drastically improved upload/download handling. Sure, beefier upload/download servers could do more, but do we really need new servers for those roles, or should we redirect funds toward a different goal? Maybe we really do need them, but after seeing what the old machines are doing now, perhaps the specifications for the new machines can be dropped significantly, while still leaving room for upgrades at a later time?

At the very least I hope a reconsideration of upload and download server needs is in the works before any new hardware for those roles is purchased for use in the new server facility.

Upload and download aren't ALL that the two new servers are slated to do, but it may be that a completely different server architecture would be appropriate for best performance on the other work, leaving the current upload and download servers doing what they have now proven they can do quite nicely :)


A dedicated NTPCKR server is the way to go.

There is no point in having all the results sitting unchecked in the database.



ID: 1355323
Profile Gary Charpentier Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 25 Dec 00
Posts: 30971
Credit: 53,134,872
RAC: 32
United States
Message 1355389 - Posted: 10 Apr 2013, 17:34:32 UTC - in response to Message 1355323.  

A dedicated NTPCKR server is the way to go.

There is no point in having all the results sitting unchecked in the database.

They aren't:
http://setiathome.berkeley.edu/ntpckr.php
It just isn't ready for prime time ...

ID: 1355389



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.