Message boards :
Technical News :
Comedy (Jun 17 2009)
Author | Message |
---|---|
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
(and not take their spare cycles away again) I should have singled this out in my other post. If I was running a BOINC project, my target user would not be someone who builds machines just for SETI, or just for DC. It would not be someone who hangs out in the forums. It'd be the guy who has a computer that is already paid for (or maybe a bunch of computers on a LAN), who will install BOINC and then relax. ... because if they are spare cycles (and not "manufactured" spare cycles) and they go to waste, then that's exactly what would have happened anyway and they won't actually care. Those of us who post here all the time, and talk about the "experience" are vastly different from the average user -- we're self-selected for our dedication, and we have exceedingly high expectations. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
(and not take their spare cycles away again) On the other hand, we have new members like ST (welcome!), who joined on 15 June and has just made his first ever forum post in Number Crunching (908809): "So what is the problem with upload? ...". Do we really want volunteers who attach a computer to the project and never do anything else? Never check to see if it's working properly? Aren't interested in the project's results? Never join the fun in the Cafe, or learn something about computing in Number Crunching? Never make any friends? Never make a donation? If that's the sort of volunteer you want to recruit, and expect to retain, BOINC could be a lot simpler...... |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
The volunteer that attaches and never posts or complains is certainly easier, and a typical computer that isn't working properly is going to drop to 1 WU/day and stay there, so it won't be a big impact. They don't complain that they've spent thousands of dollars building a farm only to have SETI "waste" their time and effort. They don't refuse to send cash on the grounds that they're already donating electricity. At the same time, that wouldn't be my choice. Those "stealth" crunchers aren't very interesting, aren't fun to talk to, etc. For all the good things about those who participate on the forums and go to great lengths to be involved and competitive, most of us also take this way too seriously. We expect 99.999% uptime, and we start threads like "Panic Mode On (17) Server problems" when in fact 90% uptime is probably much more than is needed. We complain when the instantaneous performance isn't what we'd expect from other sites, even though BOINC takes care of things pretty well when the servers aren't running at five-nines. |
Carsten Send message Joined: 19 Jul 00 Posts: 5 Credit: 2,045,200 RAC: 0 |
Hello to all. As you might see, this is my first post in this forum. As I understood Matt's first post, there is currently a massive load problem on the database (correct me if I'm wrong). So for me as a system administrator it is very simple: if your server gets under heavy load, then reduce the load. As I see it, there are only two options: Option 1: Keep the systems up and let the users fail sometimes (more often than succeed). Option 2: Shut off communication with the users and let them fail completely, but the server can proceed until it is in good health. Maybe Option 1 takes days or so, but I'm sure Option 2 takes less time. In my experience it always went well with Option 2. Mail servers recovered from massive mailbombing attacks within a few hours, while they were struggling for days before. Maybe this would be a possible solution for the current "problem". OK, there is still Option 3: add more power to the server. This might not be as simple as it sounds. Perhaps the BOINC developers should think about a "stop communication with project till" switch. This would rapidly reduce load :-) |
Johnney Guinness Send message Joined: 11 Sep 06 Posts: 3093 Credit: 2,652,287 RAC: 0 |
2) I hadn't realised quite how successful you'd been in recruiting active users recently: Richard Haselgrove, That graph is staggering. I knew that there had been a rise in recent numbers, but it's bigger than I thought. The total active BOINC combined user count fell to a low of 285,000 users about two months ago. Today the total is 327,000 users according to BoincStats here. That's a rise of over 40,000 new users. It's a staggering increase, and much of it is due to the media coverage of the SETI 10th anniversary. But WCG has also been drawing much larger numbers of new users recently. Some days I see WCG getting 500 or 600 new users with credit. It's good for all BOINC projects! John. The boinc combined image; |
DJStarfox Send message Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
Perhaps the BOINC developers should think about a "stop communication with project till" switch. This would rapidly reduce load :-) There is already a mechanism for this: next_rpc_delay. Unfortunately, it's not that simple. If the next_rpc_delay is increased, the feeder buffer must also be increased so more work can be issued during each scheduler request. The following calculations were taken from the wiki article on low-latency BOINC computing. Current values gathered from BOINCStats and a couple of tasks. # of active hosts: 287,268 total computing power: 646,514,300 MFLOPS avg task size: 15,846,450 MFLOPs (angle 0.431439) * * I hope other users here can improve the accuracy of this value. Let's calculate the average computing power per host: 6465143000/287268 = 2250.56 MFLOPS Let's calculate the average time to complete one task (for the population). 15846450/2250560000 = 0.00704 sec Interpretation: On average, 142 tasks/sec are completed by SETI@Home computers. That means, on average, the SETI@Home servers must issue 142 tasks/sec of new work to their users. Depending on users' cache sizes and the frequency of scheduler requests (unknown to me), the feeder may not be able to meet this demand in the majority of requests. If the feeder connects every 2 seconds to fill its buffer and the request itself takes at least 0.5 seconds, then the feeder buffer size needs to be at least 300, just to deal with typical load. Prognosis: Until the current 100 task limit is raised and the splitters can operate at those speeds, SETI@Home will continue to have work shortages, bottlenecks, etc. Please note that this was done for MULTIBEAM workunits exclusively. Because Astropulse workunits take considerably longer to process, the scheduler demand rates will be significantly lower. Also, does anyone know what the average number of CPUs per host SETI@Home has? I cannot find this statistic anywhere. |
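For anyone re-checking the arithmetic, here is the estimation method as a minimal Python sketch, using the figures quoted in the post. Note that the replies below dispute where the decimal point lands; the direct ratio of project throughput to task size works out to roughly 41 tasks/s, and Richard's measured figure later in the thread is another useful anchor.

```python
# Back-of-envelope feeder demand estimate, using the figures quoted above.
# All input values are as posted; only the arithmetic differs.
total_mflops = 646_514_300      # combined computing power of active hosts (MFLOPS)
active_hosts = 287_268
task_mflop = 15_846_450         # avg MULTIBEAM task size in MFLOPs (angle 0.431439)

per_host_mflops = total_mflops / active_hosts        # ~2250.56 MFLOPS per host
seconds_per_task = task_mflop / per_host_mflops      # ~7041 s for one host to finish a task
tasks_per_second = active_hosts / seconds_per_task   # project-wide completion rate

print(f"{per_host_mflops:.2f} MFLOPS per host")
print(f"{seconds_per_task:.0f} s per task per host")
print(f"{tasks_per_second:.1f} tasks/s project-wide")
```

The last figure is just total_mflops / task_mflop; keeping the units straight (MFLOPS is a rate, MFLOPs a work amount) is what the decimal-point debate below is really about.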
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Download pipe 100 MBits/second, practical limit somewhat less. Size of S@H Enhanced WU 3753xx bytes, 3 MBits plus a little. Maximum delivery of WUs ~ 32 per second. Joe |
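Joe's ceiling can be re-derived in a couple of lines. The exact workunit size is elided in the post ("3753xx bytes"), so 375,300 bytes below is a hypothetical stand-in from that range:

```python
# Sanity check on the download-pipe ceiling quoted above.
pipe_bits_per_sec = 100e6     # 100 MBit/s download pipe
wu_bytes = 375_300            # hypothetical value in the "3753xx" range
wu_bits = wu_bytes * 8        # ~3 MBit plus a little

max_wus_per_sec = pipe_bits_per_sec / wu_bits
print(f"theoretical max: {max_wus_per_sec:.1f} WUs/s")
```

Protocol overhead pushes the practical rate below this theoretical ~33/s, which is consistent with the quoted "~32 per second, practical limit somewhat less".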
C Send message Joined: 3 Apr 99 Posts: 240 Credit: 7,716,977 RAC: 0 |
... Too many zeros in the calculation - the TCP number has two zeros at the end, not three. Updating your numbers... 646514300/287268 = 225.056 MFLOPS Let's calculate the average time to complete one task (for the population). 15846450/225056000 = 0.0704 sec Interpretation: On average, 14.2 tasks/sec are completed by SETI@Home computers. That means, on average, the SETI@Home servers must issue 14.2 tasks/sec of new work to its users. ...that's still quite a lot of tasks... C Join Team MacNN |
DJStarfox Send message Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
C, Good catch. I thought it seemed too high. LOL Still, 14.2 tasks/sec is just what it's doing now.... if you increase this number, then the total FLOPS for the project will also increase until the workunit demand is met. But by knowing the current work throughput, we can estimate the load on the database, calculate speedup, etc once improvements are made. |
Carsten Send message Joined: 19 Jul 00 Posts: 5 Credit: 2,045,200 RAC: 0 |
Hello C Just checked your calculation. I think your calculator needs new batteries :-) Mine says 646514300 MFLOPS / 287268 active hosts = 2250.56 MFLOPS per host. So DJStarFox had one zero too many, but the result was correct. Checking my estimated CPU speed with BoincView gives 1.232 GFLOPS for an "old" Pentium M 1.6 GHz, so the rest of the calculation was absolutely correct. |
Neil Blaikie Send message Joined: 17 May 99 Posts: 143 Credit: 6,652,341 RAC: 0 |
Downloads seem to have just kicked into gear (at least for me anyway); I managed to get some new work and am awaiting only 3 out of 10 tasks to upload now. It's been a nice refreshing day for my computer, which usually crunches 24/7; it has had a day off enjoying a "computer vacation". The same can't be said for me, and to top it all off it is raining heavily here in Montreal! |
Peter Farrow Send message Joined: 15 Jun 99 Posts: 6 Credit: 5,974,136 RAC: 0 |
Hello Everyone, as you can see, despite having joined SETI in 1999, this is only my second post. What prompted me to write was the discussion here regarding users. Let me first state that I am at best a computer amateur; I use them but don't really understand the intricacies of the operating systems, mounters, splitters etc, so I can add nothing to the technical discussion. However, I am a slightly competitive kind of chap, and regularly review the manager and Boinc Stats to see how I am doing. Where I rank within my registration class is actually more important to me than I thought, and I have a special pure malt Scotch waiting to be opened when I reach a million credits, hopefully in the not too distant future. I can understand those that might load and forget BOINC, especially as its newness wears off, but believe me, there are an awful lot of us out here who know little of computers, but who do take a great interest in the project. Donating something that would otherwise go to waste is an act of selflessness on the part of individuals, but, and here is the real crunch, contributing to the science and developing a greater understanding of the project is a vital aspect of taking part. I suppose at the end of the day it really does not matter whether individuals just donate computer time, or take a greater interest in the project; what matters is that science and our understanding of it is increased and enhanced. Oh and by the way, whilst being a bit of a technophobe in the electronics and software development field, as a mechanical engineer I take great delight in reading the contributors' suggestions and identifying the logic used to troubleshoot the problem. Keep up the good work and best regards to all. Peter |
Berserker Send message Joined: 2 Jun 99 Posts: 105 Credit: 5,440,087 RAC: 0 |
What query are you talking about that must traverse all rows? That's a bad query in any context. Actually, I misspoke, apologies. You'd want to be doing those queries using the indexes only, if at all possible. The big problem SETI@home usually has is the results (or workunits, or whatever it's called) table, and common queries might include 'how many unsent workunits do I have'. In fact, that's the exact query that is going bad right now. Of course, there's a trade-off here too: the more indexes you have, the longer it takes to do record updates, and the more memory the indexes take. BOINC in general has had index issues in the past, but that was quite some time ago. The problem I was addressing relates to minimizing disk I/O. The table is obviously too big to fit completely into memory anyway (remember to include indexes). Again, I misspoke - you want enough RAM to keep the indexes in memory with some headroom for data. SETI@home generally works when that happens, but goes bad rather quickly when it doesn't. You'll note that most of SETI@home's servers are far from RAM challenged, and there's a good reason for that. So, reducing I/O should also reduce the amount of steady-state memory usage for that table. Having contiguous records in an index also helps it when searching, because the search algorithm goes from log(n)+(total_records / deleted_records) to log(n). By improving the index seek time, queries will average shorter. By optimizing the most common case, we're making the most impact on performance. All of this is indeed true, but my comment about the volume of records still stands. If the goal is to keep the indexes in memory (and I believe it is in this case), then you do need to keep the amount of data under more strict control. Random access I/O to memory is cheap. Random access I/O to disks is most certainly not.
There are several other things they could do, including lowering the value of query_cache_min_res_unit to be the data size of one row from the result table (default is 4K). But without actually being there to work on this and see the performance metrics (e.g., Qcache_lowmem_prunes, etc.), I can only make thoughtful, constructive suggestions and research them before saying anything. It's up to Matt, Jeff, and the rest of them to find the time for discussion, decision, and implementation. These sound like reasonable ideas. I might even look into using them myself with my own multi-GB databases. Performance vs storage space has always been a hallmark trade-off in computer science applications, but here I am simply presenting an alternative to status quo, that to the best of my professional experience, will accomplish what I said it would. There's an important balance between data volume and data fragmentation here. It's always necessary to keep both in mind (hence the weekly DB compression outage) but if the goal is to keep the indexes in memory, volume is the driver. If it's to achieve better disk I/O, fragmentation is the driver. Also, does anyone know what the average number of CPUs per host SETI@Home has? I cannot find this statistic anywhere. Let me ponder that a while. I have the data required to calculate that - I just need to figure out how to do the calculation. Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking. |
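The quoted seek-cost claim, log(n) + (total_records / deleted_records) for a fragmented index versus log(n) for a compacted one, can be illustrated numerically. The record counts below are invented purely for illustration and are not SETI@home figures:

```python
import math

# Illustrating the quoted index seek-cost claim with made-up numbers.
total_records = 5_000_000    # hypothetical rows in the result table
deleted_records = 500_000    # hypothetical deleted rows leaving gaps (10%)

# Fragmented index: tree descent plus a penalty term for skipping gaps.
fragmented_cost = math.log2(total_records) + total_records / deleted_records
# Compacted index (post-compression outage): tree descent only.
compact_cost = math.log2(total_records)

print(f"fragmented: ~{fragmented_cost:.1f} probes, compact: ~{compact_cost:.1f} probes")
```

With these numbers the fragmentation penalty dominates the log term, which is one way to read the rationale for the weekly database compression outage mentioned above.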
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Hello C I was trained (many years ago) always to check the detailed figures with an order-of-magnitude mental check, to ensure a decimal point hadn't slipped. The Server Status Page has a 'Results received in last hour' figure. For MB, that has averaged from 40,000 to 60,000 per hour for the last month, or 11 to 16 per second. So C was closer than DJS, by an alternative calculation. |
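Richard's order-of-magnitude check is a one-liner, converting the Server Status Page's per-hour figures to per-second rates:

```python
# 'Results received in last hour' from the Server Status Page,
# converted to per-second delivery rates.
low_per_hour, high_per_hour = 40_000, 60_000   # MB results/hour over the last month

low_rate = low_per_hour / 3600
high_rate = high_per_hour / 3600
print(f"{low_rate:.1f} to {high_rate:.1f} results/s")
```

That range of roughly 11 to 17 per second brackets C's corrected 14.2 tasks/sec, which is the point of the mental check.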
Neil Blaikie Send message Joined: 17 May 99 Posts: 143 Credit: 6,652,341 RAC: 0 |
Another excellent comment. Good to see new people posting to the forums, remember also that you can make good friends with people on here as well. I actually have a few friends on here that share my interest in online flight simulator gaming. Always nice to sometimes fly with them and have a good 'ol chat! While this is a science project, yes you can learn a lot about computers on here, I knew virtually nothing about UNIX/Linux systems until speaking with a few people on here. They have been a great help and while I have not taken the plunge fully yet, am learning more and more each day from those willing to add a bit extra time to help me out. Generally as well my knowledge of the inner workings of the project has increased since I first joined in '99, I used to be one of the "set and forget gang" but have a fair working knowledge of what actually goes on now. Those willing to help out and go that extra mile are what make A) posting here good, I enjoy reading the forums and seeing what suggestions people come up with, B) The fact that those who have spent their time in developing the optimized apps to speed crunching time up are also willing to follow-up with help on getting their apps or other people's optimized apps working correctly. They don't just make, publish and forget them. Grocery store awaits, car=dead=long walk to store in pouring rain :-( |
Carsten Send message Joined: 19 Jul 00 Posts: 5 Credit: 2,045,200 RAC: 0 |
Yes you are right, Richard. But this is what some people call the difference between theory and practice: In theory, things don't work and everyone knows why. In practice, things work and nobody knows why. Let's join them: things don't work and nobody knows why! ;-) |
Berserker Send message Joined: 2 Jun 99 Posts: 105 Credit: 5,440,087 RAC: 0 |
Also, does anyone know what the average number of CPUs per host SETI@Home has? I cannot find this statistic anywhere. After much playing about with XML, bash, and sed - As of this morning: Users - 981,487 Hosts - 2,347,596 CPUs - 3,729,045 Average CPUs/Host - 1.5885 Average Hosts/User - 2.3919 Average CPUs/User - 3.7994 Note - CPUs is BOINC's view of CPUs. It does not account for Dual/Quad Core or Hyperthreading. Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking. |
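The averages can be re-derived from the three totals with a couple of divisions (a minimal sketch; the totals are the ones quoted in the post):

```python
# Reproducing the per-host and per-user averages from the XML dump totals above.
users, hosts, cpus = 981_487, 2_347_596, 3_729_045

cpus_per_host = cpus / hosts    # BOINC's view of CPUs, so cores/HT inflate this
hosts_per_user = hosts / users
cpus_per_user = cpus / users

print(f"CPUs/host {cpus_per_host:.4f}, "
      f"hosts/user {hosts_per_user:.4f}, "
      f"CPUs/user {cpus_per_user:.4f}")   # matches the 1.5885 / 2.3919 / 3.7994 above
```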
DJStarfox Send message Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
So, reducing I/O should also reduce the amount of steady-state memory usage for that table. Having contiguous records in an index also helps it when searching, because the search algorithm goes from log(n)+(total_records / deleted_records) to log(n). By improving the index seek time, queries will average shorter. By optimizing the most common case, we're making the most impact on performance. I thought the goal was maximizing database throughput: transactions per second. But you might be right. Are most transactions (on the results table) cached, or are most random, requiring disk I/O? Without knowing the statistics or trying both ways, we cannot say for sure which approach would yield better throughput. I wish I had some real numbers either way, but that's the truth as I see it. Let's hope Matt, Jeff, and co. make the right decisions. Also, does anyone know what the average number of CPUs per host SETI@Home has? I cannot find this statistic anywhere. If you can calculate it, I can use that number to find the unmet demand for workunits by comparing it to the project's users' average RAC (need that number too) and the current FLOPS rate of the project. Knowing how much demand/capacity is out in the field for SETI workunits will give the project an idea of when they will be ahead of the curve on server throughput. |
DJStarfox Send message Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
Also, does anyone know what the average number of CPUs per host SETI@Home has? I cannot find this statistic anywhere. Any chance you could filter your query to include only active hosts? RAC >= 0.1. That's what my earlier numbers were based on: active hosts. |
Berserker Send message Joined: 2 Jun 99 Posts: 105 Credit: 5,440,087 RAC: 0 |
That's somewhat harder and will require a change in methodology. If I get time over the weekend I'll make a change to the code that updates my stats site and let it do the calculation. It has its own concept of an active host, based on when that host was last granted credit. Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.