Average Credit Decreasing?
Author | Message |
---|---|
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
Ergo: CN = Flopcounter May I take slight issue with that? In my view, CN pretends to be a flopcounter: it presents its figures - yes, in cobblestones - as if they were flops which had been counted. But they're not, and they haven't been. One of the most er, dodgy, things that BOINC (centrally) does is to reconvert [*] arbitrary CN numbers back into what it claims are scientifically counted flops. Thank you for reminding me of that. Arguably, the best BOINC credit system so far was the second - Eric's attempt at true flopcounting, which worked pretty well here, from what I remember. [*] on the BOINC home page currently: "24-hour average: 11.562 PetaFLOPS" |
shizaru Send message Joined: 14 Jun 04 Posts: 1130 Credit: 1,967,904 RAC: 0 |
(EDIT: Richard, hadn't seen your post when I wrote whatever's below but since we're kinda on the same page now I don't think I'll add anything.) Unfortunately that tone is my baseline. It gets under people's skin I know. 3 parts INTJ and 1 part cultural differences. It's you, and Richard, and Raistmer (and all the other "above and beyond the call of duty" people) whose lives we need to make easier. Ironically that was the whole point of my previous post. It was my point back when I got the BoZ speech too. -If we multiply the estimates by rainbows (or call the new credits Andersens if it makes anyone feel better) then everybody can sit back, relax and enjoy themselves. Because once we do, we can define what those credits are. And it'll look something like this (just a lot better written): "We've tried to make a system that's FLOPS-based but unfortunately we can't come up with one (for now) that isn't vulnerable to cheating. CREDITS will henceforth be called Unicorns (and RAC will become Recent Average Unicorns) and be a daily token given by Boinc that CAN NOT and SHOULD NOT be compared with past or future tokens. And it should most certainly NOT be used as a measure of your PC's health or abilities. We'd love to give you a speedometer but are unable to create one ATM that doesn't compromise the science. Hopefully one day we will... But until then it's Unicorns." You can't say any of that though if you're calling your CREDITS "Cobblestones". How much time and energy would it have saved you guys (the gurus) if this was done when v7 rolled out? ----- PS Think convergence on +/- 10 % elapsed estimates on a given host, adapting to major changes within 10 tasks, would be enough of a goal ? Is that question directed at me? I'm guessing it's for Richard but not quite sure... |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
The kitties like to call their credits....kibblestones. "Time is simply the mechanism that keeps everything from happening all at once." |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
yes actually both correct the way I see it. The terms 'flopcounter', 'estimate' and 'controller' are, for the purposes of scheduling the tasks, and mildly stretching semantics, more or less interchangeable. Doesn't matter whether you estimate tasks in seconds of work, #fpOperations, Cobblestones, or bananas. It's the function of the estimate that's important, and the quality of its predictions with respect to requirements. Unfortunately there is a disconnect between user expectations/needs, and the CreditNew spec which has not specified any metrics to determine whether it's working properly or not. Something related ML1 touched on as well was the idea of 'fudge factors', and a reasonable aversion to them. The thing is, if it meets non-existent specifications/requirements then it isn't a fudge-factor, it's a control point. If it's a logically justifiable procedure for solving a problem, it's an algorithm. So 'real' first step IMO, is specifying the requirements/needs (based on our experience that the current situation isn't good enough). Debating semantics is what the Committee is for :-D "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Think convergence on +/- 10 % elapsed estimates on a given host, adapting to major changes within 10 tasks, would be enough of a goal ? Anyone :) as per my last post, the requirements spec for CN appears to be missing, so we get to make one :) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
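One way to make the proposed goal concrete (estimates within +/- 10 % on a given host, adapting to a major change within 10 tasks) is to simulate it. The sketch below is only an illustration, not anything taken from BOINC: it assumes a simple exponentially damped estimator, and the names `update_estimate` and `tasks_to_converge`, the gain of 0.2 and the 100 s to 200 s step change are all hypothetical.

```python
def update_estimate(estimate, elapsed, gain=0.2):
    """Move the running estimate a fraction `gain` of the way towards the latest elapsed time."""
    return estimate + gain * (elapsed - estimate)


def tasks_to_converge(start_estimate, new_elapsed, gain=0.2,
                      tolerance=0.10, max_tasks=1000):
    """Count the tasks needed after a step change until the estimate is within
    `tolerance` (relative) of the new elapsed time. Returns None if it never gets there."""
    estimate = start_estimate
    for n in range(1, max_tasks + 1):
        estimate = update_estimate(estimate, new_elapsed, gain)
        if abs(estimate - new_elapsed) <= tolerance * new_elapsed:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical host: tasks used to take 100 s and now take 200 s.
    n = tasks_to_converge(100.0, 200.0)
    print(f"within 10 % after {n} tasks; goal met: {n is not None and n <= 10}")
```

With these made-up numbers a gain of 0.2 meets the "within 10 tasks" goal, while a gain of 0.05 does not. A larger gain adapts faster but passes more task-to-task noise through to the estimate, which is the trade-off a written requirement would have to pin down.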
Mr. Kevvy Send message Joined: 15 May 99 Posts: 3797 Credit: 1,114,826,392 RAC: 3,319 |
|
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
The kitties like to call their credits....kibblestones. LOL....yes.... And whilst the flavor of the current kibbles seems to be acceptable to the kitties, the quantity has been falling, so they have to get the computers to crunch harder to keep the mean level of kibbles in their bowl at a proper level. "Time is simply the mechanism that keeps everything from happening all at once." |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
It's just a statement regarding the current state of things. Yes, it's something like that currently. But... if it will be acknowledged... then those unicorns will be useless (as current credits are :) ). Credits are used for: 1) (and most important) bragging/competition. 2) performance measurement tool. You propose to forget about 2, discard "FLOPS" and leave arbitrary "unicorns" instead. But all those problems that make current credit not suitable as 2) make it not suitable for fair 1) too. Hence, FLOPS or unicorns - people will not be able to make fair competition so cries remain... Regarding FLOPs-counting as V2 - good (really good!) in single project/single homogeneous app. Here we have 2 apps, one of them (MultiBeam) a highly non-homogeneous one (to such a degree that we have separate VLAR rules on the server). It's quite obvious that different parts of the work for such an app, being assigned some flops, will have absolutely different performance (measured, let's say, in tasks done for a particular AR) on different hardware and different ARs. So, we still have those "VHAR/VLAR storms" that will bias RAC, even in case credit is granted as in V2 (by pre-assigned FLOPS). Biased RAC w/o any change in both hardware _and_ software part of host setup. Just because of the current data stream (totally out of operator's control). So, no fair competition, cherry picking in case of a too competitive person and so on... the whole spectrum of negative issues remains. |
shizaru Send message Joined: 14 Jun 04 Posts: 1130 Credit: 1,967,904 RAC: 0 |
Wow, OK... I'm shocked. By some miracle we're all on the same page all of a sudden. Awesome. I thought you guys were gonna pick up from where we left off (almost 3yrs ago) and that I was gonna get another lecture on how credit is NOT a flopcounter. So now what? Well now... From where I'm standing it looks like you are trying to fix something that isn't fixable. Now don't get me wrong... Jason if you think you can crack the FLOPS problem and reel it into some semblance of reality then you're a miracle worker. Plain and simple. But for now, and for your own sanity, and Richard's, and Raistmer's... And all the other gurus. Oh and Eric's of course :) ... ... It might be a good idea to call credits anything but Cobblestones. Because as long as you do, you're all going to go crazy. OK so here's the deal with the CreditNew equation: -After the CN equation is done chasing its tail it spits out a number. -CreditNew calls this number "F" (after calling it a bunch of other things first) Now let's all call "F" something like "CN" for the sake of ALL our sanity. And again for all our sanity let's agree for a moment that CN is a random number generator (which is one of the kindest things people have called it over the years). Unfortunately (and out of the blue) we do this with it: CN*cobblestone_scale = CREDIT But if CN is arbitrary then no matter what you multiply it with, the result will also be arbitrary. Also, there's no real need to multiply it by anything (let alone Cobblestones). We can easily leave that part out. It won't break the equation or anything because it's not really PART of the equation. It could just as easily be: CN = Credit And "Credit" could be anything. Anything EXCEPT Cobblestones ;) |
Lionel Send message Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597 |
expect credit to go south again soon ... |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
True. It *appears* and is arbitrary (actually chaotic), because of the faults. Expanding the CN part: peak_flops*elapsedseconds*pfc_scale*cobblestone_scale - 1st term peak_flops: (Sticking point for later) - 2nd term elapsedseconds, measured, is the important one, because the whole purpose is to automatically adjust the second term to predict how long your next tasks will be. - 3rd term, pfc_scale, and its inverse 1/pfc_scale used for prediction, are misnamed, perhaps unintentionally; will come back to it (prediction is important for scheduling) - cobblestone_scale. Only a unitless conversion from one scale to another (measuring stick), could easily be banana_scale. So the only term here that is truly arbitrary/chaotic is the scaling term, which is supposed to compensate for peak_flops being all over the map, and natural variation in elapsedseconds. It could just as easily be: And "Credit" could be anything. Anything EXCEPT Cobblestones ;) Exactly, you can award Virtual-Tacos if you want, however we still need to be able to estimate how long tasks will take (obviously broken in various situations). Back to 'Is it fixable?'. I contend yes. Here is why: You can look at your task history and predict how long future tasks will take --> Humans are good at that, it's called 'estimate localisation'. pfc_scale, because normalisation to a reference application is a step, should have been named 'relative_efficiency'. With only 1 application, the credit claim boils down to claim*cobblestone_scale. We saw that on Albert@home tests, precisely as you say, the same amount as flopcounting. (There are natural instabilities in elapsed_seconds, OK, but not that much; noise can be smoothed with damping if you know what you're doing, humans also do that with their knees while riding a skateboard.) Let's put the damping/stability aside as very fixable.
So accepting (with a stretch) that projects using CreditNew understand their application well enough to put a reasonable #operations on their tasks to start with (There are formal techniques and it's not actually mutable), where's the problem? The main (scaling) problem is with how that peak_flops is yet another misnamed number, misused. For CPU it's Boinc_Whetstone (well disguised). For GPU it's marketing GFlops. Now comes the tricky bit. Normalisation to the reference application, when there is only one application, shows 'correct' credit and 'reasonable' estimates (ignoring that noise/damping issue), as our experiments on Albert indicated. On Windows/Linux-Intel/Mac-Intel Boinc_Whetstone is flat out WRONG, too low when you have SSE through AVX engaged. Now we have different hosts with different features being normalised to an increasing number of results claiming a fraction of Actual. The estimates come too short, are unstable depending on what hosts are returning work, bounds periodically get exceeded (indicating breakages in the system), and arguably least important you get a fraction of the credit intended. The fix? Well the server side knows your CPU features, the project hopefully knows for the reference application(s) what instruction sets they enabled. We then know a rule of thumb to determine how much Boinc Whetstone underclaims for a given instruction set ( Sisoft_SIMD_Whetstone/Boinc_Whetstone ). At least for 'reasonable' enough correction (as Humans do, and all we need for prediction/estimation purposes) you should find on your hosts, Sisoft SIMD singleThreaded Whetstone is approximately 2.5 to 5x the Boinc Whetstone figure. Funny part is, on the larger scale (many results/hosts), the 2x slop in that will even cancel out by the fact that some hosts would be on the high side, while others on the low.
The only reasons it won't self-correct without compensation, are that first FPU-only applications and hosts don't represent enough numbers of results (if any anymore) compared to the massive # of underclaiming SSE-AVX results. Address that discrepancy ('Optimising' Boinc Whetstone won't cut it), and add some damping to stabilise, and it will dial in as intended (i.e. estimates). Homework: A) Run Sisoft highest SIMD single threaded benchmark, and divide that by Boinc Whetstone on your host. Multiply your RAC by that, that's cobblestone_scale award +/-noise. Should be ~2.5-5x or so (Mine's 3.3x) B) Work out all the factors that affect reported elapsed_time in your results, and how quickly responding estimates need to adapt for practical purposes/change, while not bouncing around all over the shop giving meaningless noise. (not quite as subjective as it sounds) C) If you have a smartphone, open up Google Maps and marvel (lol) at how relativistic estimate localisation through sensor fusion and statistically based controllers is done right (with GPS and GLONASS). "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
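The four-term expansion and homework A from the post above can be turned into a small template to plug your own numbers into. This is a rough sketch rather than BOINC's actual code: the function names are hypothetical, the value used for `cobblestone_scale` (200 credits for one day at 1 GFLOPS) is an assumption about how the Cobblestone is defined and should be checked against the CreditNew documentation, and the benchmark figures in the example are invented.

```python
# Assumed definition: 200 credits for one day of work at 1 GFLOPS.
COBBLESTONE_SCALE = 200.0 / (86400.0 * 1e9)


def credit_claim(peak_flops, elapsed_seconds, pfc_scale=1.0,
                 cobblestone_scale=COBBLESTONE_SCALE):
    """Four-term product from the post:
    peak_flops * elapsedseconds * pfc_scale * cobblestone_scale."""
    return peak_flops * elapsed_seconds * pfc_scale * cobblestone_scale


def whetstone_correction(simd_whetstone, boinc_whetstone):
    """Homework A: ratio of a SIMD single-threaded Whetstone figure
    to the Boinc Whetstone figure for the same host."""
    if boinc_whetstone <= 0:
        raise ValueError("boinc_whetstone must be positive")
    return simd_whetstone / boinc_whetstone


def corrected_rac(rac, simd_whetstone, boinc_whetstone):
    """Multiply RAC by the correction ratio, as the homework suggests (+/- noise)."""
    return rac * whetstone_correction(simd_whetstone, boinc_whetstone)


if __name__ == "__main__":
    # Invented host: Boinc Whetstone 3.0 GFLOPS, SIMD Whetstone 9.9 GFLOPS, RAC 1000.
    ratio = whetstone_correction(9.9e9, 3.0e9)
    print(f"correction ratio: {ratio:.2f}x")
    print(f"corrected RAC:    {corrected_rac(1000.0, 9.9e9, 3.0e9):.0f}")
    # One day at 1 GFLOPS with pfc_scale = 1.
    print(f"credit claim:     {credit_claim(1e9, 86400.0):.1f}")
```

Because the claim is a plain product, halving `pfc_scale` has exactly the same effect as halving `peak_flops`, which is why an underclaiming benchmark and a drifting scale factor are hard to tell apart from the credit figure alone.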
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
expect credit to go south again soon ... Lol, gave me an idea for a principle: Credit is inversely proportional to the wonkiness of Boinc's estimates. (which get more wonky every time an old host dies or new more efficient one comes online) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
True. It *appears* and is arbitrary (actually chaotic), because of the faults. Hold that thought. "Chaotic" is a good scientific (mathematical) word. Just don't call it random in David's hearing (he's touchy about that). |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
True. It *appears* and is arbitrary (actually chaotic), because of the faults. Agreed. Chaotic is not the same as Random. It is NOT random. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14674 Credit: 200,643,578 RAC: 874 |
True. It *appears* and is arbitrary (actually chaotic), because of the faults. Om. Chant after me: "Chaos is not random". |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
True. It *appears* and is arbitrary (actually chaotic), because of the faults. A wise puppet, (Agro Vation, my Avatar), once said: "Beauty is only skin deep, but ugly goes right through to the bone." "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
William Send message Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0 |
True. It *appears* and is arbitrary (actually chaotic), because of the faults. 'Three is Chaos' No it's not random, but the code amounts to fairly unpredictable credit awards (though as for any chaotic system, there are patterns [*]) and to the layman unpredictable is pretty much the same as random. We say chaotic, when we know you could calculate it. 'random' is a word we use when there don't seem to be any rules. Very few things are truly random. Blame physics for that ;) [*] and if you were in possession of every single variable used in CN you could indeed calculate how much credit you'll be getting. CN IS a calculation after all. A person who won't read has no advantage over one who can't read. (Mark Twain) |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Well that's just the difference between transferring the (awkward) theory, from formulae to practice. Properly engineered solutions are a marriage of form and function, the synthetic (theory) and the aesthetic (simplicity, proportion and beauty), crafting an optimal synthetic-aesthetic, i.e. 'State of the Art', recognised since the renaissance. [Edit:] leading to the aerospace design rule of thumb --> if it looks wrong, it probably is wrong. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
shizaru Send message Joined: 14 Jun 04 Posts: 1130 Credit: 1,967,904 RAC: 0 |
BTW can anybody put some real numbers into the CreditNew equation? I'd like to play around these days but am loath to work with just letters and words. Feels like 10x the effort (at least for the way my own brain is wired). Anyway, can anybody create a template? Might even be a good idea to have links on the CN Wiki page directing to real world examples of a Seti task and other Boinc projects' tasks... dunno. William? Ageless? Jason? Anybody? Thanx in advance. |
jason_gee Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Got a spreadsheet somewhere on Google Docs for analysing Albert data [Edit: may have been seti-beta CPU, will see]. Will dig it out on the weekend. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.