Monolith (Jun 14 2011)
Jeff Mercer · Joined: 14 Aug 08 · Posts: 90 · Credit: 162,139 · RAC: 0
OK... just noticed things weren't uploading. Wasn't sure if it was me or a problem at the lab. Just plain forgot about the weekly outage. No problem... just concerned! :)
Donald L. Johnson · Joined: 5 Aug 02 · Posts: 8240 · Credit: 14,654,533 · RAC: 20
> Not just you (mine aren't going either)..... just checked, and the upload server has been labeled as "Disabled". SETI@Home is in the midst of the weekly backup/outage schedule, so I wouldn't be surprised if nothing goes back until sometime Friday.

No, the weekly outage ended Tuesday afternoon, Berkeley time. See Matt's message, #1 in this thread. We are having problems with some of the servers.

See the Cricket Graphs. Bookmark that link, and check it when you have trouble communicating with the SETI@Home servers. When the Green line (downloads) is above 90 Mbps and/or the Blue line (uploads & scheduler requests) is above 40 Mbps, the data link is maxed out and you will have trouble connecting. This is the normal case for at least the first full day after an outage. When either the Green or the Blue line is near the bottom of the graph (as the Blue line is now), we are having server problems.

Hope that is useful information. Around here lately, patience is not just a virtue, it is a requirement.

Donald
Infernal Optimist / Submariner, retired
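Donald's rule of thumb is mechanical enough to write down. A minimal Python sketch, assuming the Cricket readings are in Mbps; the function name and the 5 Mbps "near the bottom" cutoff are assumptions for illustration, not part of any real tool:

```python
def link_state(green_mbps: float, blue_mbps: float) -> str:
    """Classify the SETI@home data link from the two Cricket readings:
    green = downloads to clients, blue = uploads & scheduler requests."""
    if green_mbps > 90 or blue_mbps > 40:
        # Normal for at least the first full day after a weekly outage.
        return "link maxed out -- expect trouble connecting"
    if min(green_mbps, blue_mbps) < 5:  # "near the bottom of the graph"
        return "traffic flatlined -- likely server problems"
    return "link looks healthy"

# Example: downloads saturated right after the Tuesday outage ends.
print(link_state(green_mbps=92.0, blue_mbps=15.0))
```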
Jeff Mercer · Joined: 14 Aug 08 · Posts: 90 · Credit: 162,139 · RAC: 0
Thanks for the information. I read that the servers were throwing fits, but thought they had most of the problem figured out. I'm going to have to spend a little more time reading the news and updates. As for patience, it's not a problem for me!! ;) At my age, you just CAN'T hurry anymore! HA HA HA! Anyway, thanks for the information!!
Acrklor · Joined: 22 Oct 01 · Posts: 14 · Credit: 639,144 · RAC: 0
> Anybody have any theories about what is causing the ridiculously consistent heavy load? Free to speculate?

That may not be a good idea to say to me :P

The obvious theories:
- It seems to me there are more and more Lunatics-enabled "anonymous platform" crunchers, which would also lead to less crunch time and more load (however, I have no idea if they have reached a percentage where this would have an impact).
- A configuration change from a few weeks ago coming back to haunt you (never happened to me, of course :P)... even more cache can have a negative impact when nothing hits.
- Timeouts/events/semaphores... meaning the software is often waiting on something (which wouldn't cause CPU/IO load in particular) because of a completely different issue (see the sketch after this post).

The creative but unlikely theories:
- An outage of something cooling-related, but too small to trigger an alarm (for example, a single non-critical fan) -> higher temperature -> the system tries to compensate by reducing CPU cycles.
- An installed BBU (Battery Backup Unit) went awry, causing the RAID controller to disable its write cache.
- A malfunction in network equipment causing a lot of lost packets -> higher latency.

The fabricated theories:
- The BIOS went nuts and changed the CPU/RAM multiplier on a power cycle (<- no kidding, it happens... though I haven't seen it with server boards).
- I assume TCP/IP checksum offload is enabled. However, I have read a few times that there are configurations out there (with a lot of small packets) where disabling it improves throughput (which would suggest the CPU is faster than the network interface itself for this particular setup). I know, I know, it's unlikely at best, but who knows, and since we're speculating... ;)

I've got my fair share of experience with server systems, but of course still no clue how it looks backstage at seti@home/boinc, which means: just speculating like crazy. ^^

"Judging people you don't know for things you don't understand is just really stupid." - Ellen Page
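That third "obvious" theory has a telltale signature: long wall-clock times with almost no CPU use. A minimal, purely illustrative Python sketch of the symptom; none of this is SETI@home code, just eight threads queuing on one lock:

```python
import threading
import time

lock = threading.Lock()

def worker() -> None:
    # Every worker needs the shared lock, and the holder sleeps while
    # holding it, so the rest just queue up: little CPU, lots of waiting.
    with lock:
        time.sleep(0.5)  # stand-in for a slow call made under the lock

start = time.perf_counter()
threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 8 workers x 0.5 s, fully serialized: ~4 s of wall clock at near-zero
# CPU load -- the classic look of contention rather than overload.
print(f"elapsed: {time.perf_counter() - start:.1f} s")
```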
S@NL - XP_Freak · Joined: 10 Jul 99 · Posts: 99 · Credit: 6,248,265 · RAC: 0
@Acrklor: You can add the very high number of short-runtime WUs to the obvious theories.

Goodbye Seti Classic
halfempty · Joined: 2 Jun 99 · Posts: 97 · Credit: 35,236,901 · RAC: 114
If there is a way, might it be productive to limit the number of concurrent connections? Since the pipe can only handle so much bandwidth, limiting the number of clients that can connect at any one time to something the pipe can reasonably handle should improve efficiency. If the excess clients get immediately rejected and go into a timeout delay, it should reduce the thrashing on the servers and eliminate a bunch of errors/lost packets/re-sends. Don't know if it would make a noticeable difference, just a thought.
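halfempty's idea is straightforward to sketch. A hypothetical Python accept loop that caps concurrent clients with a semaphore and rejects the overflow immediately; the port, cap, and response are made up for illustration, not anything SETI@home actually runs:

```python
import socket
import threading

MAX_CONCURRENT = 100  # illustrative cap, not a real SETI@home setting
slots = threading.BoundedSemaphore(MAX_CONCURRENT)

def handle(conn: socket.socket) -> None:
    try:
        conn.sendall(b"...serve the upload/scheduler request here...\n")
    finally:
        conn.close()
        slots.release()  # free the slot for the next client

def serve(host: str = "0.0.0.0", port: int = 8080) -> None:
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(128)
    while True:
        conn, _addr = srv.accept()
        if slots.acquire(blocking=False):
            # A slot is free: serve this client on its own thread.
            threading.Thread(target=handle, args=(conn,), daemon=True).start()
        else:
            # Pipe already full: drop the client at once so it backs off
            # and retries later, instead of thrashing a saturated link.
            conn.close()

if __name__ == "__main__":
    serve()
```

The immediate close is the point of the design: a rejected client falls into its retry backoff instead of holding a half-working connection open on a saturated link.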
Invisible Man · Joined: 24 Jun 01 · Posts: 22 · Credit: 1,129,336 · RAC: 0
> I apologize but expect things to get worse as the music career will temporarily consume me. You may see rather significant periods of silence from me for the next... I dunno... 6 to 12 months? I'm sure the others will chime in as needed if I'm not around.

Let's face it folks, the writing is on the wall. It appears to me that Matt wants out, to further his music career. His last sentence is very telling; who exactly will "chime in"?

Remember, the list of threads in Technical News only shows 12 pages, going back to Feb 7th 2007. How many years before that? Most have been written by Matt. Even if his music career is of a temporary nature, we will surely miss him and his up-to-date writings.

Whatever you do, Matt, we all wish you the Very Best for the Future. "A man must do what a man has to do."

P.S. I really hope I am wrong!
Invisible Man · Joined: 24 Jun 01 · Posts: 22 · Credit: 1,129,336 · RAC: 0
Sorry. The first two lines of my previous message should have been in quotes.
David J. Moritz · Joined: 15 Aug 99 · Posts: 21 · Credit: 2,542,037 · RAC: 0
This is supposed to be the Technical News forum; unfortunately, the SETI staff never posts any news. It seems it is up to the users to update the status of the system. Once again the upload server is not functioning (Cricket graph), yet the server status on the site shows it as UP. Further, the server status page shows that the work units received count has not updated for 56 hours. A little bit of information posted on the site would allow volunteers to understand the status of the servers and the system. The staff needs to remember that people are supporting SETI.

David Moritz
John Clark · Joined: 29 Sep 99 · Posts: 16515 · Credit: 4,418,829 · RAC: 0
Wrong, David J Moritz. Look at who posted the first post of this thread; the admins/project scientists post as and when relevant or the need arises.

It's good to be back amongst friends and colleagues
OzzFan · Joined: 9 Apr 02 · Posts: 15691 · Credit: 84,761,841 · RAC: 28
> The staff needs to remember that people are supporting SETI.

I'll give Eric a call right away. I'm sure he'll be delighted to hear that there are in fact intelligent lifeforms installing BOINC and crunching data, and not the robotic pirate super squirrels he previously theorized.

... and the next time I see Matt, Jeff or Eric outside the lab actually living their lives, I'll be sure to lure them back into the lab and lock the doors. A trail of candy and a few pieces conveniently placed under a cardboard box held up at one end by a stick and triggered by a string are usually enough to catch all three of those Project Admins.

... and I'll be sure to train all three of them to give obvious status updates every 10 minutes, including detailed bathroom breaks. They will be appropriately rewarded for good behavior and properly shocked with a jolt of electricity for not following orders. I might even put up a web cam and start a pay-per-view service.
KWSN THE Holy Hand Grenade! · Joined: 20 Dec 05 · Posts: 3187 · Credit: 57,163,290 · RAC: 0
Thing(s) for the staff to look at when they get in Monday: on the "Server Status" page, "Current Result creation rate", "Result Turnaround Time", "Results Received in the last hour" and "Transitioner backlog" all show as "AS OF" 53 hours ago (currently...), suggesting that the same software bug is affecting all those stats (and that whatever broke, broke on Friday around 4 AM Berkeley time).

Hello, from Albany, CA!...
AllenIN · Joined: 5 Dec 00 · Posts: 292 · Credit: 58,297,005 · RAC: 311
Great sarcasm, Oz, but it does seem to me, and I'm sure to many other supporters, that this project has more problems with its systems than most other projects. Why do you think that is, especially since they have probably had the most time as a project?

Allen
Geek@Play · Joined: 31 Jul 01 · Posts: 2467 · Credit: 86,146,931 · RAC: 0
Perhaps the sheer number of users and CPUs and GPUs, all of which are far above any other project.

Boinc....Boinc....Boinc....Boinc....
perryjay · Joined: 20 Aug 02 · Posts: 3377 · Credit: 20,676,751 · RAC: 0
> I might even put up a web cam and start a pay-per-view service.

Oh boy, a webcam, where do I subscribe???

PROUD MEMBER OF Team Starfire World BOINC
Geek@Play · Joined: 31 Jul 01 · Posts: 2467 · Credit: 86,146,931 · RAC: 0
Would that include a new fibre cable to the SSL? If so, where do I sign up?

Boinc....Boinc....Boinc....Boinc....
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0
And that "most time as a project" combined with severely limited funds ensures that much of the hardware is just barely able to handle the load. A commercial data center trying to do what this project is doing would dedicate much more than three racks of equipment in a repurposed closet. Joe |
OzzFan · Joined: 9 Apr 02 · Posts: 15691 · Credit: 84,761,841 · RAC: 28
> Great sarcasm Oz, but it does seem to me and I'm sure many other supporters, that this project has more problems with their systems than most other projects.

My other response was too wordy, so I have edited it down to this response.

SETI@Home had over 500,000 user accounts the last time I checked. Most of these users have more than one system, and a good portion have a farm running. On an intranet, to avoid so much network activity, you would divide the users up into subnets and you'd have servers powerful enough to handle the load. But since this is the internet, running off of a WAN connection, go look at some of the largest internet sites that service 1,000,000+ connections like SETI@Home does. Look at Amazon.com and the number of transactions they process. Look at Youtube.com and the compressed video they serve to their viewers. Go look at their business models, and how they earn the money to afford the large datacenters they have and the staff required to keep things running smoothly and without a glitch.

Then go look at what SETI@Home is running. Look at SETI@Home's business model and how it earns its income to afford its "datacenter" servicing over 500,000 users, and the whole five part-time staff members employed to keep us happy. Then you tell me why SETI@Home is running as "poorly" as some users seem to think it is and how they can improve it, and you can explain to me why longevity has anything at all to do with reality in an ever-changing world.
AllenIN · Joined: 5 Dec 00 · Posts: 292 · Credit: 58,297,005 · RAC: 311
Thanks for taking the time to put some of this in the right perspective for me. I never quite thought of it the way you put it; many users with many, many machines certainly does make for a lot of connections. I don't know that it would be quite as massive as Amazon, but it's certainly massive on any scale. However, since you had a very good answer for me this time, might I ask why there is so much more trouble at this point in time and not so much, say, 7 years ago... before BOINC? I really don't remember having as much downtime back then, but I could have just forgotten the good old days.

Thanks, Allen
Geek@Play · Joined: 31 Jul 01 · Posts: 2467 · Credit: 86,146,931 · RAC: 0
In the days before BOINC, work units could take more than 24 hours to complete. Not like today, when GPUs can turn around a work unit in less than 5 minutes, with more in-depth analysis than before. It's simply better computers and more GPUs on the project, but bandwidth at Berkeley has remained the same (or seen little improvement).

Boinc....Boinc....Boinc....Boinc....
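Rough arithmetic on Geek@Play's point, using the task times he quotes (the one-transfer-per-result framing is an assumption for illustration):

```python
# Classic era: ~24 hours per work unit, so about one
# download/upload cycle per host per day.
classic_cycles_per_day = 24 * 60 / (24 * 60)   # = 1

# GPU era: a work unit can turn around in under 5 minutes.
gpu_cycles_per_day = 24 * 60 / 5               # = 288

print(f"{classic_cycles_per_day:.0f} vs {gpu_cycles_per_day:.0f} "
      "transfers per host per day")
# Roughly 288x the connection rate per host, against a Berkeley
# pipe that has barely changed.
```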