Message boards : Number crunching : report the results immediately!
SETI User | Joined: 29 Jun 02 | Posts: 369 | Credit: 0 | RAC: 0

Hello, is the Truxoft BOINC V5.3.12.txXX the only client that can report results immediately? Or does the new BOINC V5.4.9 have this feature too? Greetings!
Joined: 1 Dec 03 | Posts: 33 | Credit: 8,919 | RAC: 0

A little off topic, sorry, but is there a Truxoft that works with enhanced?

SETI@HOME - Crunching 24/7, SETI Classic to SETI Enhanced. I was there. "I'm the Christian the Devil warned you about!"
Richard Haselgrove | Joined: 4 Jul 99 | Posts: 14690 | Credit: 200,643,578 | RAC: 874

> A little off topic, sorry, but is there a Truxoft that works with enhanced?

The existing Truxoft BOINC V5.3.12.tx36 works just fine with Enhanced, but I recommend you turn off the calibration for SETI (it is not used on Enhanced, and DCF changes mess up the calibration for standard). But it's still worth it if you do optimised Einstein.
Jack Gulley | Joined: 4 Mar 03 | Posts: 423 | Credit: 526,566 | RAC: 0

> A little off topic, sorry, but is there a Truxoft that works with enhanced?

His current client version works just fine with Enhanced. No changes required. The calibration of course only attempts to adjust what is claimed, but has no effect on what is actually granted, so you might as well not enable the Calibration feature for SETI. But all of the other options, such as Report_Results_Immediately, still work just like they did. Enhanced really only affects the science application, not the client manager.

No word from Trux yet on when he will put out a version of the new client with his remaining features that are still useful. I suspect he will wait until a few more problems are fixed in the recommended new client. And NO, he does not have a version of the Enhanced science application. No need for it, as Crunch3r has his optimized version out.
Joined: 1 Dec 03 | Posts: 33 | Credit: 8,919 | RAC: 0

OK thanks, now answer his question. lol Sorry, didn't mean to post-jack.

SETI@HOME - Crunching 24/7, SETI Classic to SETI Enhanced. I was there. "I'm the Christian the Devil warned you about!"
Joined: 4 Jul 99 | Posts: 1575 | Credit: 4,152,111 | RAC: 1

Hello, this is a bug, not a feature, and is the sole reason I do not endorse the Trux clients.

BOINC WIKI | BOINCing since 2002/12/8
Zap de Ridder | Joined: 9 Jan 00 | Posts: 227 | Credit: 1,468,844 | RAC: 1

Hello, it is a feature, and it can be turned off by removing this line from truxoft_prefs.xml: `<return_results_immediately/>`
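For reference, a minimal sketch of what that fragment of truxoft_prefs.xml might look like; only the `<return_results_immediately/>` line is confirmed above, and the name of the enclosing element is an assumption:

```xml
<!-- truxoft_prefs.xml (sketch; enclosing tag name assumed).
     Delete the line below to stop the client from reporting
     each result as soon as it finishes uploading. -->
<truxoft_prefs>
    <return_results_immediately/>
</truxoft_prefs>
```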
Joined: 21 Apr 00 | Posts: 1459 | Credit: 58,485 | RAC: 0

> Hello, this is a bug, not a feature, and is the sole reason I do not endorse the Trux clients.

> It is a feature and can be set off by removing this line: `<return_results_immediately/>`

It is a bug, because no client is meant to have that kind of behaviour. There are reasons the projects don't want clients reporting results immediately. Yes, I know it's nice for users, but that's not the only factor here.

People complain about how slow things are, how SETI is always having problems and outages; well, this is one of the things causing it, so stop reporting immediately.

Also, a feature like that should be off by default, but if a user has to "remove" a string from a config file, I'm guessing it's on by default. Again, this is bad.

Want to search the BOINC Wiki, BOINCstats, or various BOINC forums from within Firefox? Try the BOINC related Firefox Search Engines
WendyR | Joined: 1 Aug 05 | Posts: 44 | Credit: 1,962,140 | RAC: 0

I like the idea of returning results as soon as they are complete. Jack Gulley posted a well thought-out reason for wanting this feature in this thread. Basically, he uses his returned SETI results to monitor that his home network/computers are functioning correctly. He can do this from any internet connection, and does not require any unusual software or hardware to accomplish it. While it is not Berkeley's responsibility to provide a monitoring system into people's home networks, it is a nice side benefit they provide for their volunteers.

It gives me some comfort knowing that my results are "safely home" to Berkeley quickly. That means a lower probability of me or my machines screwing something up and losing a completed result. In addition, returning my results faster gives me a greater probability of being part of the quorum, and/or my unit being chosen as the canonical result. (Sometimes it seems that algorithm does include "if the machine is Wendy's, don't pick it", though... :) )

My "trivial but still annoying" reason: my "active" results don't fit into the real estate on my screen with those "extra" results there.

The argument against this "feature" has always been server load. I want to question that. Really, how much of a load is it? And just where is that load? According to the Wiki the process happens in two phases, and the "troublesome" part is the upload of the data file, which has always happened as soon as possible. The feature we are talking about is the contact with the scheduler. My experience is that the problems are in contacting the upload/download server, rather than the scheduler. In addition, the latest round of problems has been full disks on the upload/download server. Wouldn't letting the scheduler know that those results are there, ready for assimilation and deletion, help this situation? How many results get trashed while they are in that "Ready to report" status? Are those files still sitting out on the upload/download server? How long?

They seem to have lots of capacity on the assimilator and deleter pieces of the process, and need some help on the upload/download part of the process. Wouldn't allowing "return results immediately" help this? Just a few thoughts....
Joined: 7 May 05 | Posts: 217 | Credit: 10,386,105 | RAC: 12

Isn't it better to maintain a constant stream of a small amount of data rather than everyone surging at times with dozens of updates? I would rather send one file 10 to 12 times a day than wait and report 10 days' worth of data. Just because you don't like the "bug", it doesn't mean it is one.

Fear will keep the local systems in line. Fear of this battle station. - Grand Moff Tarkin
Astro | Joined: 16 Apr 02 | Posts: 8026 | Credit: 600,015 | RAC: 0

The result uploads go directly to the hard drive. They are 6-23 KB in size. The "reporting" isn't really much more than a few bytes, but it forces communication between the different servers inside their network. Each connection takes time, and the operations required after receiving the report take time. It's this which is slowing down the system. Reporting 1 WU takes about as much as reporting a dozen, so waiting and reporting multiple results at the same time saves on these open connections between servers.
Astro | Joined: 16 Apr 02 | Posts: 8026 | Credit: 600,015 | RAC: 0

For example, if they send out a million results a day, they get back roughly the same number. 1,000,000 divided by 86,400 (seconds in a day) = 11.57 operations per second. If users report 2 at a time, it only needs to do about 6 per second; if 3, then 4 per second; if 4, then 3 per second. You get the picture.
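Astro's arithmetic checks out; as a quick sanity check, here is the same calculation in Python (the million-results-a-day figure is his illustrative number, not a measured one):

```python
# Average scheduler contacts per second needed to report all results,
# assuming every host reports `batch_size` results per contact.
RESULTS_PER_DAY = 1_000_000
SECONDS_PER_DAY = 86_400

def contacts_per_second(batch_size: int) -> float:
    """Scheduler requests per second at a given reporting batch size."""
    return RESULTS_PER_DAY / batch_size / SECONDS_PER_DAY

for n in (1, 2, 3, 4):
    print(f"reporting {n} at a time: {contacts_per_second(n):.2f}/sec")
```

Doubling the batch size halves the scheduler contact rate, which is exactly the saving Astro describes.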
Joined: 21 Apr 00 | Posts: 1459 | Credit: 58,485 | RAC: 0

> I like the idea of returning results as soon as they are complete.

I do as well, and have been tempted sometimes; however, I'd rather respect the wishes of the project admins, as they know what's best for their project.

> Jack Gulley posted a well thought-out reason for wanting this feature in this thread.

I see his point, but there are much easier and better ways of monitoring. He could set up a dynamic DNS account so he can always access his site remotely, and run something like Nagios on a web server, which would give him a lot more detail. He could also set up secure remote access, so he can log into machines remotely and use them over the internet, to reboot them or whatever.

> It gives me some comfort knowing that my results are "safely home" to Berkeley quickly. That means a lower probability of me or my machines screwing something up and losing a completed result. In addition, returning my results faster gives me a greater probability of being part of the quorum, and/or my unit being chosen as the canonical result.

I see your point, and I've had trouble with crashes and glitches causing me to lose workunits in the past, so I see the temptation. However, it's not very good if, by doing that, we're overloading the servers more than we need to. SETI is big enough, with enough performance issues as it is, without us adding new ones.

> My "trivial but still annoying" reason -- my "active" results don't fit into the real estate on my screen with those "extra" results there.

A more appropriate solution is to reduce your cache, increase your screen resolution, or use something like BoincView, which can filter results based on status (so you only see active and paused if you want); there's a setup guide in the wiki.

> The argument against this "feature" has always been server load. I want to question that. Really, how much of a load is it? And just where is that load?

Quite a bit. In the context of reporting, the load is on the DB server; there are some more detailed figures below, with examples and explanations.

> According to the Wiki the process happens in two phases, and the "troublesome" part is the upload of the data file, which has always happened as soon as possible. The feature we are talking about is the contact with the scheduler.

Overall, yes, the data server takes quite a hammering. However, as you say, reporting has nothing to do with uploading, so uploading is irrelevant here. It actually makes no difference how or when you upload results: you've still got to send the same amount of data. The problem is the sheer amount of it, and the processing power needed to handle all those uploads, especially when lots are happening at the same time (like after an outage). There's not much that can be done to improve the efficiency of that; the only options are buying more bandwidth (which would be unused normally) and buying a higher-capacity server, both of which cost a fair bit of money, which SETI doesn't have.

Anyway, with regard to reporting: as most of you know, when you report results it's part of an "update" to that project (in this case SETI). When you do any kind of update, the request is sent to the scheduler, which in turn accesses the DB (database), and herein lies the problem: the load put on the DB by lots of updating/reporting. (The DB is the slowest part of the whole system, and the reason for the weekly outage.)

I know what you're thinking: how is reporting any different from uploading? Well, that's the subtle difference: there's a lot more overhead involved with reporting (or any kind of update to the scheduler). When you do a "blank" update, one that doesn't report any results and doesn't request new work, it takes about 4 DB queries to complete that request (to get your user data and a few other things). Then, for each result reported, it takes about 3 more queries. (My memory is foggy on these numbers, so please correct me if I'm wrong, but I think I'm close at least.)

So reporting 1 result takes 7 queries: 4 for user data and 3 for the result. Each additional result takes an additional 3 queries on top of the 7. So for 5 results, it would be the 4 for the user data plus 3*5=15, or 19 in total. However, if we were to report each of those results individually, the outcome would be quite different: each report would take the 4 for user data plus the 3 for the result, which is 7, and 7 * 5 results is 35, for the same 5 results that could have been reported all together using only 19. 35:19 -- that's quite a difference in my mind.

> My experience is that the problems are in contacting the upload/download server, rather than the scheduler. In addition, the latest round of problems has been in full disks on the upload/download server. Wouldn't letting the scheduler know that those results are there, ready for assimilation and deletion, help this situation? How many results get trashed when they are in that "Ready to report" status? Are those files still sitting out on the upload/download server? How long? They seem to have lots of capacity on the assimilator and deleter pieces of the process, and need some help on the upload/download part of the process. Wouldn't allowing "return results immediately" help this?

Many valid points; however, we have to weigh up the cost:benefit ratio and decide which is the best choice based on it. There are already measures in place to stop the disks getting full, and to manage really old uploads that aren't in the DB anymore (so that disks don't get clogged up). But in short, the additional load placed on the DB isn't worth the "gain", and I argue that the gain is non-existent, because "lost" results are simply resent and the other things aren't of much benefit. The load on the DB is the single biggest problem for this project.

Want to search the BOINC Wiki, BOINCstats, or various BOINC forums from within Firefox? Try the BOINC related Firefox Search Engines
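Lee's query counts (about 4 queries of fixed overhead per scheduler contact, plus about 3 per result, figures he himself flags as approximate) can be turned into a two-line model that reproduces his 19-versus-35 comparison:

```python
# Toy model of scheduler DB load, using Lee's approximate figures:
# ~4 fixed queries per scheduler contact, ~3 queries per result.
OVERHEAD_PER_CONTACT = 4
QUERIES_PER_RESULT = 3

def queries_batched(n_results: int) -> int:
    """All n results reported in a single scheduler contact."""
    return OVERHEAD_PER_CONTACT + QUERIES_PER_RESULT * n_results

def queries_individual(n_results: int) -> int:
    """One scheduler contact per result: the overhead repeats each time."""
    return n_results * (OVERHEAD_PER_CONTACT + QUERIES_PER_RESULT)

for n in (1, 5, 20):
    print(f"{n:2d} results: {queries_batched(n):3d} queries batched "
          f"vs {queries_individual(n):3d} individual")
```

With 5 results the model gives 19 batched versus 35 individual, matching the post; the gap only widens as the batch grows.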
Joined: 21 Apr 00 | Posts: 1459 | Credit: 58,485 | RAC: 0

> Isn't it better to maintain a constant stream of a small amount of data rather than everyone surging at times with dozens of updates? I would rather send one file 10 to 12 times a day than to wait and report 10 days worth of data.

For data uploads there's no difference, and it's actually better to have a more constant stream, so the data server is better able to handle things, rather than floods that come and go. But for reporting it's quite a different matter. Reporting doesn't need much bandwidth at all, so you can quite easily report 20 results without a problem, and reporting lots of results in the same update is kinder to the DB.

> Just because you don't like the "bug", it doesn't mean it is one.

Opinions on the matter are irrelevant; the fact remains that it's a bug, because it's not supposed to behave like that, so it's a bug by definition.

Want to search the BOINC Wiki, BOINCstats, or various BOINC forums from within Firefox? Try the BOINC related Firefox Search Engines
Joined: 9 Jun 99 | Posts: 15184 | Credit: 4,362,181 | RAC: 3

In addition to Lee's excellent explanation, also consider this: when uploading, all you do is write your result to a directory on the hard drive. That doesn't take much overhead. Yet for reporting each result, you use the server's CPU and memory. Not so much of a problem if only you, or Jack, were doing it. But then extrapolate to thousands of users doing this, many of them at the same time.
Joined: 25 Nov 01 | Posts: 21675 | Credit: 7,508,002 | RAC: 20

> In addition to Lee's excellent explanation, also consider this:

(My bold added to the quote.) The bottleneck for the entire system is the Berkeley database server. Sending back clusters of results all at once reduces the number of database-server transactions required and so gives a useful server-side performance boost. Multiply that (small) boost by a few hundred thousand hosts and you save yourself the need for a very expensive machine in your server closet.

Jack, have you got a spare super-server that you can donate? There are much better ways to monitor the health of the machines in your farm.

Regards, Martin

See new freedom: Mageia Linux | Take a look for yourself: Linux Format | The Future is what We all make IT (GPLv3)
Joined: 23 Oct 00 | Posts: 33 | Credit: 16,828,887 | RAC: 0

Actually, it IS off by default (at least it was when I used Trux's client, before 5.4.9 came out). Only Calibration was on by default; you had to change the config file to actually turn return_results_immediately ON.

Trux's client was much more useful than just the calibration and returning-results settings. It allowed automatic CPU Affinity (great for SMP and dual cores), Affinity by Project, and assigning Project Priorities specific to that computer. So I wouldn't disparage Truxoft for what is an optional, default-off "feature"/bug.
1mp0£173 | Joined: 3 Apr 99 | Posts: 8423 | Credit: 356,897 | RAC: 0

> The result uploads go directly to the hard drive. They are 6-23K in size. The "reporting" isn't really much more than a few bytes, but it forces communication between the different servers inside their network. Each connection takes time and the operations required after receiving the report take time. It's this which is slowing down the system. To report 1 wu takes about as much as reporting a dozen. So by waiting and reporting multiple results at the same time saves on these open connections between servers.

Reporting means invoking a program on the web server (which may spawn a thread and load an executable), connecting to the database server, finding the appropriate row in the table, and updating it. If you are reporting more than one work unit, all but the last two steps can be shared across the update.
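As a toy illustration of that sharing, compare one shared connection and transaction against a fresh connection per result, using a throwaway SQLite table (a sketch standing in for the real BOINC database, not SETI's actual server code or schema):

```python
import sqlite3
import tempfile

# Toy illustration: reporting N results in one contact shares the
# connection set-up and per-request work; one at a time repeats it N times.

def make_demo_db() -> str:
    """Create a throwaway results table standing in for the BOINC DB."""
    path = tempfile.NamedTemporaryFile(suffix=".db", delete=False).name
    with sqlite3.connect(path) as conn:
        conn.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, state TEXT)")
        conn.executemany("INSERT INTO result VALUES (?, 'in_progress')",
                         [(i,) for i in range(12)])
    return path

def report_batched(path: str, ids) -> None:
    """One connection, one transaction, all results marked together."""
    with sqlite3.connect(path) as conn:
        conn.executemany("UPDATE result SET state = 'reported' WHERE id = ?",
                         [(i,) for i in ids])

def report_one_at_a_time(path: str, ids) -> None:
    """A fresh connection -- and all its overhead -- for every result."""
    for i in ids:
        with sqlite3.connect(path) as conn:
            conn.execute("UPDATE result SET state = 'reported' WHERE id = ?",
                         (i,))
```

Both end in the same database state; the difference is purely how many times the per-contact overhead is paid, which is the point being made about the scheduler.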
1mp0£173 | Joined: 3 Apr 99 | Posts: 8423 | Credit: 356,897 | RAC: 0

Jack could get a hosting account at some cheezy ISP like 1dollarhosting.com and then put a little script on each machine that uploads a file periodically (every 10 minutes?). It could be a pretty small file... and there are at least a half-dozen ways to get those all combined into one HTML file.
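A minimal sketch of the kind of heartbeat script being suggested (the hostnames, filenames, and FTP host are all hypothetical placeholders, not anything from this thread):

```python
import socket
import time

def status_line(hostname=None, now=None) -> str:
    """One line of 'this machine was alive at time T' heartbeat data."""
    hostname = hostname or socket.gethostname()
    now = time.time() if now is None else now
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(now))
    return f"{hostname} alive at {stamp} UTC"

def write_heartbeat(path: str) -> None:
    """Write the small status file a cron job would upload every 10 minutes."""
    with open(path, "w") as f:
        f.write(status_line() + "\n")

# The upload step could be as simple as (untested stub, placeholder host):
#   from ftplib import FTP
#   with FTP("ftp.example-host.com", "user", "password") as ftp:
#       with open("status.txt", "rb") as f:
#           ftp.storbinary("STOR myhost.txt", f)
```

Checking any machine's file timestamp on the web host then tells you whether that box is still alive, with no dependence on SETI's servers at all.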
WendyR | Joined: 1 Aug 05 | Posts: 44 | Credit: 1,962,140 | RAC: 0

Lee, thank you for your thoughtful, polite and useful response to my questions. Please understand that I am not trying to be obnoxious here. I, too, am a software developer, fairly good at what might best be called "process optimization", and might put a little different "spin" on things.

> i do as well, and have been tempted sometimes, however, i'd rather respect the wishes of the project admins, as they know what's best for their project

Yes, it is their project, and they get to make the final decision. I just hope that rational input from the user community is considered.

> i see his point, but there are much easier and better ways of monitoring.

I agree, but it is nice to see some "side benefits" of this too.

> a more appropriate solution is to reduce your cache, increase your screen resolution, or use something like BoincView which can filter results based on status (so you only see active and paused if you want), there's a setup guide in the wiki

Already at 0.1 days, using BoincView, and monitor set at maximum resolution. (These 42-year-old eyes are starting to have issues with those little letters on the screen, too.)

> ...[useful explanation of report-results process and database operations deleted for brevity]... many valid points, however we have to weigh up the cost:benefit ratio and decide which is the best choice based on cost:benefit

Until the bold statement, certainly all valid points that really can't be argued with. I, too, would go with the decision to optimize the cost/benefit ratio. But when you optimize a process, you have to look at the whole picture. Sure, you can spend lots of time optimizing a small piece of the process, but even a factor-of-10 speedup on 1% of your total shaves less than 1% off the whole. You are better off looking at a more modest improvement on a larger portion of the whole. You also want to look for "low-hanging fruit": things that you can do quickly, easily, and with as little disruption to the existing process as possible.

So, just where are the log jams and problems in the current process? Have they "moved" or changed in character with recent hardware and software upgrades? Are "we" operating under old assumptions? Looking back over posts from the last few months, there were several outages related to specific hardware failures (disks, memory upgrades, UPS, power failures) -- talk about luck! I seldom see "large" buildups in one area on the status page anymore, and when I do, there is usually some process down, and it clears in a few hours. Clearly, the database server is able to handle day-to-day tasks without falling behind. It also appears to have additional capacity, so it can catch up fairly quickly after an outage. I found this little tidbit here; it is Tony (mmciastro) quoting Matt:

> Right now we are generating and sending out 250,000 results a day without taxing our database server.

Recent problems: looking back, there was a whole series of 403 and validation error reports starting around March 27, 2006, finally ending about a week later with "cranking up" the time to deletion. There are 403 and validation errors happening right now (May 15, 2006). My (fallible) memory seems to recall a couple of other spats with this type of problem in the last few months. I also seem to recall a couple of times last year when a full disk resulted in database sluggishness, malformed workunits being created, and upload/download issues. Each time, the solution seemed to be to clear up those directories on the upload/download server. Right now, this seems to be the most common reason for an unplanned outage.

So, the obvious questions are: what can be done to help this? What simple things are available? Faster clearing of completed results [report_results_immediately] would seem to be one way of improving the situation. It seems like it should be simple, since it was already in the software at one point in time.

Now, the whole "BOINC covers more than one project" argument should still be made, and, yes, my specific examples apply only to the current situation for the SETI project; perhaps other projects are database-bound. That does need to enter into the cost-benefit analysis too.

Am I making any sense? It sounds like I am blathering now. I will shut up for a while....
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.