Message boards : Number crunching : New Credit Adjustment?
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
"To keep the RAC constant" is not a very good reason to buy a new computer. If you read what Eric wrote carefully, he's saying that what they're doing is taking a sampling of machines, and they're comparing the benchmark * time vs. FLOPS, and adjust so that on average the credit will be the same. Back when we used the old benchmark * time method, credits did increase as machines got faster. With the new system, as the median moves up, so will credit, just as before. I'm done now.......... People will quit for many reasons. The statement that an old machine, by virtue of being old, will get a lower RAC is a false statement. If someone is looking for a reason to get mad and quit, it is as good a reason as any. |
OzzFan · Joined: 9 Apr 02 · Posts: 15692 · Credit: 84,761,841 · RAC: 28
> All I am saying is that there used to be updates on promising signals in SETI Classic, and what we get now is not much. I don't recall too many complaints about credit even though there was only one granted credit per work unit. I know, for one, that I cared less about the credits, as finding a signal back then was far more important to me.

OK, that interest I can understand. But I also know that they are very close to getting the Nitpicker running properly. That is still no reason to blame the obsession with credits on the projects, though.
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
> All I am saying is that there used to be updates on promising signals in SETI Classic, and what we get now is not much. I don't recall too many complaints about credit even though there was only one granted credit per work unit. I know, for one, that I cared less about the credits, as finding a signal back then was far more important to me.

There are a couple of reasons:

I think we (by which I mean Eric and Dan and the other scientists) have learned that they need a better filter than the one they used during Classic; there are lots of things that look like candidates but aren't really. SETI@Home is underfunded, so developments like the NTPCKR just take longer.

It also occurs to me that if the NTPCKR had been available on day 1, it wouldn't have done anything; the "p" is persistency, and you need to be able to compare multiple signals across time.

I'm not being critical (people make choices and they are entitled to do so), but there are people with thousands of dollars worth of dedicated computers and huge monthly power bills who have not sent a little cash to the project. It is unfortunate that we're not just the source of clock cycles but also the primary funding source. Creating that table of candidates for Classic cost money, and creating the NTPCKR costs money, and it seems that if what we want is some measure of science from the database, then some money can be invested in crunchers, and some in the scientists.

Maybe SETI@Home should sell credits?
Fred W · Joined: 13 Jun 99 · Posts: 2524 · Credit: 11,954,210 · RAC: 0
> All I am saying is that there used to be updates on promising signals in SETI Classic, and what we get now is not much. I don't recall too many complaints about credit even though there was only one granted credit per work unit. I know, for one, that I cared less about the credits, as finding a signal back then was far more important to me.

As an aside to a hugely entertaining (yawn) restatement of solidly entrenched, diametrically opposed viewpoints: isn't that what Mark did with his auction?

F.
Ingleside · Joined: 4 Feb 03 · Posts: 1546 · Credit: 15,832,022 · RAC: 13
> To answer the question of equal credit, why not the old system of 1999? 1 credit per WU worked okay then.

This worked fairly well at the start, since all WUs took roughly the same amount of time to crunch regardless of angle range. But with the release of v3.03, the most popular Windows command-line client somehow became significantly slower on VLAR, and VHAR was faster than "normal" AR. This led to many users deleting all VLAR work, and some even deleting "normal" AR, to crunch only high-angle-range work. If my recollection isn't too fuzzy, VHAR gave roughly a 25% advantage over "normal", and a 50% advantage over VLAR. For users running Win9x, VLAR was even worse...

Not sure, but even one of the most popular tools for Classic, SetiQueue, could AFAIK in at least one version easily let users delete WUs based on angle range. That was quickly removed again, but later versions of SetiQueue still included settings to wait longer than normal after grabbing a VLAR, to decrease the odds that the next one was also VLAR, and, in the case of VHAR, an option to immediately grab more work, to increase the odds of getting more VHAR. Also, if a host kept getting VHAR, it was possible to "overfill" the queue by... not sure if it was 25% or something...

Another problem was that v3.03 was significantly slower than earlier versions, so many users were angry with the "1 WU = 1 credit" system, and some left because of it. Interestingly, the same users had no problem with the "1 WU = 1 credit" system when v3.00 was released, since v3.00 was significantly faster than earlier clients... Oh, and until v3.00 was released, one very popular beta client was v2.70; even though it was reported that it wasn't returning enough info, many continued to use it, since it was faster...

So basically, history from SETI Classic taught Berkeley that a "1 WU = 1 credit" system doesn't work in practice, at least not if they want all WUs crunched and not only the VHAR ones, unless all WUs take roughly the same amount of time. With SETI Enhanced's at least 10x difference in crunch times it definitely wouldn't work, since even the 25-50% advantage in Classic was enough for some users to abort everything except VHAR... With Astropulse, the difference is much higher than 10x...

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
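Rough numbers make the incentive obvious. The toy sketch below invents the absolute runtimes; only the VHAR/normal/VLAR ratios come from the post above.

```python
# Why "1 WU = 1 credit" breaks once runtimes diverge. The hours below are
# invented; only the VHAR/normal/VLAR ratios follow the post above.
runtime_hours = {"VHAR": 8.0, "normal": 10.0, "VLAR": 12.0}

for ar, hours in runtime_hours.items():
    rate = 24 / hours  # WUs per day = credits per day at 1 credit each
    print(f"{ar:>6}: {rate:.2f} credits/day")

# VHAR pays ~25% more than "normal" and ~50% more than VLAR, so a
# credit-motivated user aborts everything except VHAR.
```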
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
What Mark did definitely benefits SETI financially, but Mark still has to crunch to get the credits. It's a secondary market (and very inventive).

...but what if SETI@Home simply sold credits? Send 'em $20, get 20,000 cobblestones. Want 1,000,000 cobblestones? Send 'em a kilobuck.

SETI needs money, and SETI needs clock cycles; why not reward both?
Brian Silvers · Joined: 11 Jun 99 · Posts: 1681 · Credit: 492,052 · RAC: 0
OK. Let me see if I understand the concepts here...

In the beginning, there was 1 credit for 1 WU. Crafty users soon found that certain WUs took less time to process than others, so 1:1 was deemed "bad".

At the dawn of a new age (BOINC), the measure was changed to a benchmark, and credits were awarded by computing benchmark * time. Crafty users soon found that benchmarks differed between various OSes, notably Linux and Windows, and also noticed that one could inflate benchmark scores to produce higher credit claims, so benchmark * time was deemed "bad".

Industrious developers with SETI and/or BOINC decided that instead of benchmark * time, a new method based on the number of floating-point operations performed would be used. This method seemed to provide stability across multiple OSes and all but eliminated the issues with benchmark * time. SETI project administrators refused to disallow the BOINC 3.x, 4.x, and 5.x clients that did not handle flop counting properly (or at all), which caused issues with some clients still claiming benchmark * time (or zero, in the case of the 3.x clients), but by and large, things were "good".

Various other projects stayed with the known-flawed benchmark * time methodology, while yet other projects started granting fixed credit amounts that could be larger than, equal to, or less than the claimed credits based on benchmark * time. A decree was issued that projects should calibrate their credit to within 20% (over or under) of SETI, or else a request would be made of various statistics sites to force-calibrate the stats so as to obfuscate which project was granting a higher amount of credit to the casual user. This should also have obfuscated the data about projects granting less, but the stated objective was in the context of projects granting "too much", not "too little". The prognostication was that if projects started competing for participants by offering ever-higher amounts of credit, this would cause a "credit war". Despite these dire prognostications, the "credit war" never seemed to materialize.

Now the claim is that the projects with a payout of less than SETI "wished to start a credit war" (paraphrasing) by increasing credits far beyond the 20% rule that came out in the decree (inferred as such from the wording of Eric's post). In response, SETI answers with credit deflation and an inferred switch back to the known-flawed benchmark * time method.

Questions:

1) Is benchmark * time now, or soon to become, the "official" BOINC-wide credit methodology?

2) Given the flaws of the benchmark, how does this improve on the idea of cross-project parity, given that benchmarks differ between operating systems, and considering application performance differences within a single project? Example: an AMD system running Linux at Einstein@Home is faster than the equivalent Windows-based host, not because of a difference in the benchmark, but because the science application performs better when compiled with GCC on Linux than with Visual Studio on Windows.
Fred W · Joined: 13 Jun 99 · Posts: 2524 · Credit: 11,954,210 · RAC: 0
We may have interpreted various postings differently, but I have not inferred from anything that I have read that the basis for credit here on S@H will revert from FLOPS to benchmark * time. Could you point me to the relevant statement, please?

F.
Brian Silvers · Joined: 11 Jun 99 · Posts: 1681 · Credit: 492,052 · RAC: 0
There are only two possible translations:

1) All other projects will code new applications to perform flop counting.

2) Flop-counting levels here will be reduced to match benchmark * time * whatever this new-fangled multiplier will be, essentially making benchmark * time (or benchmark * multiplier???) the "standard".

I'm willing to listen to a #3, if someone would like to 'splain it a bit better. However, ANY change here MUST now include addressing the BOINC 3.x, 4.x, and 5.x clients that underclaim. Additionally, someone needs to speak to the differences between identical processors running the same project under different OSes. The example for Einstein is very real. The gap was reduced greatly, but it still exists. As an AMD/Windows user, I currently have a disadvantage there versus what I would have if my system ran Linux. That does not translate to "equality" in my book. I have a concern as to how this "New Deal" would treat two systems in such a situation.
Mumps [MM] · Joined: 11 Feb 08 · Posts: 4454 · Credit: 100,893,853 · RAC: 30
Unless I misinterpreted it, doesn't the following quote from Eric state just that?

> Q. Does this multiplier fix the credit for that machine to a certain credit per day (say 100)?

I've read many a post here indicating that a given host's benchmarks (floating point and integer combined) are only a minor reflection of how much time it will actually take to crunch a WU. So I'm a bit concerned that this will cause AMD-based machines (as a potentially wildly incorrect example) to see a more significant change in their claimed credit than a Core 2 Intel machine.

[EDIT] And as another concern: a lot of the increase from the optimized apps has been achieved by being more efficient with the CPU. The benchmarks (floating + integer) don't change with an optimized app; a host just gets lots more done with the same number of them. So how does an optimized app get more credit when it won't be changing anything applicable to the credit claiming/granting formula? [/EDIT]
Jord · Joined: 9 Jun 99 · Posts: 15184 · Credit: 4,362,181 · RAC: 3
> 1) All other projects will code new applications to perform flop counting.

Apparently that is not possible for projects that use the wrapper, as it is either too difficult or impossible to make the wrapper "see" flop counting.
Brian Silvers · Joined: 11 Jun 99 · Posts: 1681 · Credit: 492,052 · RAC: 0
Thanks. I missed that point...

Also, I want to mention again that there are differences in science application performance within various projects. The example I am aware of is at Einstein, where the Linux application still performs better than the Windows application on very similar hardware. Benchmarks have also been altered in the past, and benchmarks have differed between OSes.
Brian Silvers · Joined: 11 Jun 99 · Posts: 1681 · Credit: 492,052 · RAC: 0
> 1) All other projects will code new applications to perform flop counting.

Why is it that certain individuals cannot leave well enough alone until they actually have a workable plan?
Blurf · Joined: 2 Sep 06 · Posts: 8964 · Credit: 12,678,685 · RAC: 0
Brian: Eric's post indicates this IS the "workable plan".
Brian Silvers · Joined: 11 Jun 99 · Posts: 1681 · Credit: 492,052 · RAC: 0
Pete,

Perhaps I should restate it as "a logically reviewed plan that ensures the goals are truly achievable, and thus is a real 'workable plan' for the masses"?

Verily, this vichyssoise of verbiage veers most verbose...
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
> OK. Let me see if I understand the concepts here...

Brian,

You're in trouble by the second sentence. In the beginning, all work units took the same amount of time, so just counting work units was fine. When work units vary from a few minutes to a few hours, one credit per work unit is not so good. That is an important distinction.

If we average benchmark * time over thousands of work units and thousands of computers, it is a reasonable measure of the average credit per WU. If counting flops returned 65,000 credits where benchmark * time would have given 60,000 credits, then the multiplier is set too high and needs to come down. The problem only comes up when you try to grant credit based on benchmark * time for a single work unit on a single CPU.

Using the average of 10,000 work units, selected at random, to "normalize" the credit multiplier is a long way from abandoning flop counts.

-- Ned
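The 65,000-vs-60,000 example reduces to a one-line correction. The numbers are from the post above; the direction of the fix is my reading of the scheme.

```python
# If flop counting granted 65,000 credits where benchmark * time would have
# granted 60,000, scale the multiplier down by the ratio of the two totals.
flops_based_total = 65_000      # credits granted via flop counting
benchmark_based_total = 60_000  # what benchmark * time would have granted
correction = benchmark_based_total / flops_based_total
print(f"scale the current multiplier by {correction:.3f}")  # -> 0.923
```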
Brian Silvers · Joined: 11 Jun 99 · Posts: 1681 · Credit: 492,052 · RAC: 0
> OK. Let me see if I understand the concepts here...

I do believe I said that, and I quote: "Crafty users soon found that certain WUs took less time to process than others, so 1:1 was deemed 'bad'." However, for maximum verbosity and crystal-clear clarity: SETI Classic work units in the very early days of SETI@Home took approximately the same time each. While SETI was still using the Classic software and credit schema, variable-runtime work units were introduced. Soon after that shift, various users found that certain WUs took less time to process than others, aborted those that took longer, and so 1:1 was deemed "bad".

A benchmark on my Pentium 4 2.40GHz system running Windows XP should be very similar to the benchmark on the same Pentium 4 2.40GHz system running Ubuntu, if it were a dual-boot system. As I understand things, this is not true with the benchmarks as they exist today. I do not mean that things must be exactly equal, but they should be within roughly 1-3%. My impression is that the delta is larger than that. If someone would like to point me to a dual-boot Windows/Linux system that shows otherwise, feel free to do so and I'll re-evaluate my thoughts on this aspect.

> Using the average of 10,000 work units, selected at random, to "normalize" the credit multiplier is a long way from abandoning flop counts.

The difference in benchmarks between OSes is one factor that corrupts that data set. Sure, you could (and I'm sure will) argue that the sample size is large enough to reduce the effects, but there are also individuals who have inflated benchmarks in the BOINC client, since it is open source. It doesn't have to be one of the known versions with inflated benchmarks, either. It could be someone who quietly compiles their own unique version, doesn't distribute it, and thus claims high on a consistent basis.

Flop counting is also not immune to tampering. As I understand it, the science application reports the count along with the BOINC client. I'll admit I don't know enough about how the process works, so this may not be within the realm of possibility, but if the value of the flop counter is accessible by looking at a result file, one could hex-edit the result file before uploading, or programmatically change the value by 0.1-0.25%. Don't laugh. According to some, vast hordes of people are just that crafty to where they'd do it...

To properly settle things "once and for all", all potential avenues of manipulating the benchmark results or the flop counter need to be removed. As such, some sort of hash would need to be created, and the value then encrypted with AES-level encryption, with only the individual project holding the private key. Each project would have its own unique public/private key pair, and each project would rotate pairs at random intervals, independently of all the other projects. If a project had an open-source science application, then the code that handles the hash and encryption would need to be a closed-source external DLL. This should take care of all but the most "hard core" hackers, as anything sitting on a user's box is obviously at risk of disassembly. As such, ONLY the public key would reside with the end user (us, the crunchers).

Now, can I do any of that? Nope, at least not in a hurry. I do know that if we're all so bent out of shape about "fair" or "equal" credits, then the whole process should be treated just like credit card data is supposed to be handled under the Payment Card Industry Security Standards Council's "Payment Card Industry Data Security Standard" (PCI DSS). Go Google it...
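For what it's worth, the hash-then-encrypt idea sketched above can be made concrete. Below is a minimal Python sketch using the third-party `cryptography` package, under the stated assumptions: only the project holds the private key, and the client only ever sees the public key. The token format, the example flop count, and all names are hypothetical, and, as conceded above, anything running on the client can still be disassembled and fed a fake count, so this raises the bar rather than removing it. Plain RSA-OAEP stands in for a hybrid AES scheme here, since the payload is tiny.

```python
# Hypothetical sketch of the hash-then-encrypt idea; not BOINC/SETI code.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Project side, done once: generate a key pair, ship only the public key.
project_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
project_public = project_private.public_key()

# Client side: hash the counted flops, then encrypt value + hash with the
# project's public key so only the project can read and check it.
flop_count = b"27865841223901"  # example value from the science app
digest = hashes.Hash(hashes.SHA256())
digest.update(flop_count)
token = project_public.encrypt(flop_count + digest.finalize(), OAEP)

# Project side: decrypt, split off the 32-byte SHA-256 digest, and verify
# it before granting credit; a hex-edited token simply fails the check.
plain = project_private.decrypt(token, OAEP)
value, received = plain[:-32], plain[-32:]
check = hashes.Hash(hashes.SHA256())
check.update(value)
assert check.finalize() == received
print(f"verified flop count: {int(value)}")
```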
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
> OK. Let me see if I understand the concepts here...

Brian,

I said "work units have wildly different times", which you turned into "people cheated, so we needed a more complex credit system". That isn't the same.

You said "my two machines under different OSes get different benchmarks", and I said "we aren't comparing individual machines, just how two methods of scoring compare."

Eric said the S@H developers chose the wrong multiplier. You turned that into "tampering" and the cryptographic means to prevent it. I understand PCI DSS; it is part of my job. It does not apply.

The proposal is trying to solve two interrelated problems:
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
Let me offer a different interpretation:

The current fleet contains a mix of processors, ranging from PowerPCs and early Pentiums through Core 2 Quads, Nehalems, etc. If you summed all the integer benchmarks and divided by the number of machines, you'd have the integer benchmark for the average machine. Do the same with the floating-point benchmarks to get the average floating-point benchmark. Multiply by the constant, and you'll have the number of credits per second that this middle-of-the-road machine should get.

Now, add up all the granted credits and divide by the number of actual CPU seconds over some interval (an hour, or a day, it doesn't matter), and that tells you how many credits per CPU second are actually being granted to that same middle-of-the-road machine.

The ratio between the two numbers, for that average machine, shows how much the current multiplier is off, and in which direction. If slow machines retire and newer, faster machines are added, both numbers move up. Either way, the definition of "middle of the road" represents a mix of Intel and AMD (and Motorola, and Sparc, and whatever else is out there). If the fleet doubles in speed, the number of credits per second for the middle of the road increases.
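Put as code, that check reads something like the toy sketch below. The mini-fleet and interval totals are invented, the Cobblestone constant is the commonly quoted definition, and using only the floating-point benchmark is a simplification; how the integer benchmark gets folded in would be a project policy decision.

```python
# Toy version of the fleet-wide sanity check described above.
# Assumes the common Cobblestone definition: 200 credits/day at 1 GFLOPS.
CREDITS_PER_GFLOPS_SEC = 200 / 86400

# (whetstone GFLOPS, dhrystone GIPS) per host; an invented mini-fleet.
fleet = [(0.9, 1.8), (2.4, 4.1), (3.6, 7.0), (1.5, 2.9)]

avg_fp = sum(fp for fp, _ in fleet) / len(fleet)   # average FP benchmark
avg_int = sum(ip for _, ip in fleet) / len(fleet)  # average integer benchmark

# What the average machine *should* earn per CPU second. Only the FP
# benchmark is used here; folding in avg_int is a project policy choice.
expected_rate = avg_fp * CREDITS_PER_GFLOPS_SEC

# What was *actually* granted per CPU second over some interval.
granted_credits = 1_250_000.0  # invented interval total
cpu_seconds = 6.0e8            # invented interval total
actual_rate = granted_credits / cpu_seconds

# Ratio > 1: the multiplier grants too much; < 1: too little.
print(f"multiplier is off by a factor of {actual_rate / expected_rate:.2f}")
```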
Brian Silvers · Joined: 11 Jun 99 · Posts: 1681 · Credit: 492,052 · RAC: 0
If you set aside the fears that have been presented, mainly by BOINC (housed at UCB) and its staff, as reasons for cross-project parity, one of which is cheating, then you are glossing over one of the things that brought us to this exact moment. It has been an ongoing theme, at varying volume levels, since people started dumping the longer-running tasks. It morphed into the epic debate about version 5.12 and the resulting team-only optimized version, after which Simon and others picked up the mantle. It is the underlying message behind equalization of credit, partially veiled by the noble idea that people should select projects for the science.

The statistics already show that the vast majority of the user base selects the project(s) they participate in based on their interest in the project. What other reason is there for the ultra-high-granting projects simply not attracting hordes of users? Rosetta is approaching 209,000 users; RieselSieve has around 8,500. Stated differently, Rosetta has roughly 24.59 times as many users as RieselSieve. What I'm driving at is that a large amount of negative energy is spent on this subject when it is, quite frankly, not the big monster that BOINC or the projects make it out to be. Sure, it would be nice if... but the reality is that there is no proof that users congregate in large numbers based on the amount of credit awarded or not awarded. That being the case, why is there always some big "need" to rush headlong into "fixing" things when one could take one's time and do it right? I've already been informed that even with this plan, there are some projects that can't implement it, due to technical limitations as well as their own funding and resource issues making it extremely cost-prohibitive. What happens to those projects? Do they weigh the average down? Do they raise it? Will action be taken against them by the BOINC administration for non-compliance?

As indicated, I'm taking a longer view of history than you appear willing to. I am also, quite frankly, sick and tired of this constant upheaval, and am offering ideas that would secure the calculations and lock them down, which was not done originally and appears not to be under consideration now, since we are back to using benchmarks that are known to have been both flawed and tampered with in the past.

This is where you have not paid close enough attention. Those are the technical reasons; the political reasons probably carry about the same weight here. I'm not saying the technical reasons are bogus, but I am saying this was an awfully convenient time to "earmark" the technical requirements with an additional agenda...