Average Credit Decreasing?

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1786425 - Posted: 10 May 2016, 14:08:00 UTC - in response to Message 1786410.  

Ergo: CN = Flopcounter

May I take slight issue with that? In my view, CN pretends to be a flopcounter: it presents its figures - yes, in cobblestones - as if they were flops which had been counted. But they're not, and they haven't been.

One of the most er, dodgy, things that BOINC (centrally) does is to reconvert [*] arbitrary CN numbers back into what it claims are scientifically counted flops. Thank you for reminding me of that.

Arguably, the best BOINC credit system so far was the second - Eric's attempt at true flopcounting, which worked pretty well here, from what I remember.

[*] on the BOINC home page currently: "24-hour average: 11.562 PetaFLOPS"
ID: 1786425 · Report as offensive
Profile shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1786427 - Posted: 10 May 2016, 14:20:57 UTC - in response to Message 1786415.  
Last modified: 10 May 2016, 14:25:03 UTC

(EDIT: Richard, hadn't seen your post when I wrote whatever's below, but since we're kinda on the same page now I don't think I'll add anything.)

Unfortunately that tone is my baseline. It gets under people's skin I know. 3 parts INTJ and 1 part cultural differences.

It's you, and Richard, and Raistmer (and all the other "above and beyond the call of duty" people) whose lives we need to make easier. Ironically that was the whole point of my previous post. It was my point back when I got the BoZ speech too.

-If we multiply the estimates by rainbows (or call the new credits Andersens if it makes anyone feel better) then everybody can sit back, relax and enjoy themselves.

Because once we do, we can define what those credits are. And it'll look something like this (just a lot better written):

"We've tried to make a system that's FLOPS-based but unfortunately we can't come up with one (for now) that isn't vulnerable to cheating. CREDITS will henceforth be called Unicorns (and RAC will become Recent Average Unicorns) and be a daily token given by Boinc that CAN NOT and SHOULD NOT be compared with past or future tokens. And it should most certainly NOT be used as a measure of your PC's health or abilities. We'd love to give you a speedometer but are unable to create one ATM that doesn't compromise the science. Hopefully one day we will... But until then it's Unicorns."

You can't say any of that though if you're calling your CREDITS "Cobblestones".

How much time and energy would it have saved you guys (the gurus) if this was done when v7 rolled out?

-----

PS
Think convergence on +/- 10 % elapsed estimates on a given host, adapting to major changes within 10 tasks, would be enough of a goal ?


Is that question directed at me? I'm guessing it's for Richard but not quite sure...
ID: 1786427 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1786428 - Posted: 10 May 2016, 14:22:53 UTC

The kitties like to call their credits....kibblestones.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1786428 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786433 - Posted: 10 May 2016, 14:31:12 UTC - in response to Message 1786425.  

Yes, actually both are correct the way I see it. The terms 'flopcounter', 'estimate' and 'controller' are, for the purposes of scheduling the tasks and mildly stretching semantics, more or less interchangeable.

It doesn't matter whether you estimate tasks in seconds of work, #fpOperations, Cobblestones, or bananas. It's the function of the estimate that's important, and the quality of its predictions with respect to requirements. Unfortunately there is a disconnect between user expectations/needs and the CreditNew spec, which has not specified any metrics to determine whether it's working properly or not.

Something related that ML1 touched on as well was the idea of 'fudge factors', and a reasonable aversion to them. The thing is, if it meets non-existent specifications/requirements then it isn't a fudge factor, it's a control point. If it's a logically justifiable procedure for solving a problem, it's an algorithm.

So 'real' first step IMO, is specifying the requirements/needs (based on our experience that the current situation isn't good enough). Debating semantics is what the Committee is for :-D
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786433 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786436 - Posted: 10 May 2016, 14:33:16 UTC - in response to Message 1786427.  
Last modified: 10 May 2016, 14:33:44 UTC

Think convergence on +/- 10 % elapsed estimates on a given host, adapting to major changes within 10 tasks, would be enough of a goal ?


Is that question directed at me? I'm guessing it's for Richard but not quite sure...


Anyone :) as per my last post, the requirements spec for CN appears to be missing, so we get to make one :)
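
A goal like that is easy to turn into something testable. The sketch below is only one possible reading of "within +/- 10 % after at most 10 tasks": the names `meets_goal` and `simulate`, the step change from 100 s to 200 s, and the two adaptation rates are all made up for illustration, and none of it comes from the CreditNew code.

```python
def simulate(start, actual, alpha, n):
    """Hypothetical estimator: move a fraction alpha toward the actual time per task."""
    estimate, history = start, []
    for _ in range(n):
        history.append(estimate)              # estimate used for this task
        estimate += alpha * (actual - estimate)
    return history


def meets_goal(estimates, actuals, tolerance=0.10, window=10):
    """True if every estimate from task `window` onwards is within tolerance of actual."""
    pairs = list(zip(estimates, actuals))
    return all(abs(est - act) <= tolerance * act for est, act in pairs[window:])


# Major change: tasks that used to take 100 s now take 200 s.
actuals = [200.0] * 20
fast = simulate(100.0, 200.0, alpha=0.30, n=20)   # adapts quickly
slow = simulate(100.0, 200.0, alpha=0.05, n=20)   # adapts too slowly

fast_ok = meets_goal(fast, actuals)
slow_ok = meets_goal(slow, actuals)
```

With these made-up rates the quick estimator passes and the slow one does not, which is the kind of yes/no answer a requirements spec would need.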
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786436 · Report as offensive
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 1786437 - Posted: 10 May 2016, 14:34:51 UTC - in response to Message 1786428.  

The kitties like to call their credits....kibblestones.


CreditMew. :^D
ID: 1786437 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1786438 - Posted: 10 May 2016, 14:37:12 UTC - in response to Message 1786437.  

The kitties like to call their credits....kibblestones.


CreditMew. :^D

LOL....yes....
And whilst the flavor of the current kibbles seems to be acceptable to the kitties, the quantity has been falling, so they have to get the computers to crunch harder to keep the mean level of kibbles in their bowl at a proper level.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1786438 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1786443 - Posted: 10 May 2016, 14:58:48 UTC - in response to Message 1786427.  


"We've tried to make a system that's FLOPS-based but unfortunately we can't come up with one (for now) that isn't vulnerable to cheating. CREDITS will henceforth be called Unicorns (and RAC will become Recent Average Unicorns) and be a daily token given by Boinc that CAN NOT and SHOULD NOT be compared with past or future tokens. And it should most certainly NOT be used as a measure of your PC's health or abilities. We'd love to give you a speedometer but are unable to create one ATM that doesn't compromise the science. Hopefully one day we will... But until then it's Unicorns."

It's just a statement regarding the current state of things. Yes, it's something like that currently. But... if it is acknowledged... then those unicorns will be useless (as current credits are :) ).
Credits are used for:
1) (and most important) bragging/competition.
2) a performance measurement tool.

You propose to forget about 2), discard "FLOPS" and leave arbitrary "unicorns" instead.
But all the problems that make current credit unsuitable for 2) make it unsuitable for fair 1) too. Hence, FLOPS or unicorns, people will not be able to have a fair competition, so the cries remain...

Regarding FLOPs-counting as in V2: good (really good!) for a single project with a single homogeneous app.
Here we have 2 apps, one of them (MultiBeam) highly non-homogeneous (to such a degree that we have separate VLAR rules on the server).
It's quite obvious that different parts of the work for such an app, each assigned some flops, will show quite different performance (measured, let's say, in tasks done for a particular AR) on different hardware and at different ARs. So we still have those "VHAR/VLAR storms" that will bias RAC, even if credit is granted as in V2 (by pre-assigned FLOPS). RAC gets biased without any change in either the hardware _or_ the software part of the host setup, just because of the current data stream (totally out of the operator's control). So, no fair competition, cherry picking in the case of a too-competitive person, and so on... the whole spectrum of negative issues remains.
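
A toy illustration of that bias, in Python. Every number here is hypothetical (the per-task credit, the run times, the two host names); the only point is that with credit fixed per task type, a host whose relative speed differs between VLAR and VHAR work sees its daily credit move with the data stream alone.

```python
SECONDS_PER_DAY = 86400.0

# Hypothetical credit pre-assigned per task type (fixed, as in flop counting).
ASSIGNED_CREDIT = {"vlar": 120.0, "vhar": 40.0}

# Hypothetical run times in seconds per task type on two unchanged hosts.
RUNTIME = {
    "cpu_host": {"vlar": 6000.0, "vhar": 2000.0},
    "gpu_host": {"vlar": 3000.0, "vhar": 400.0},
}


def credit_per_day(host, mix):
    """Daily credit for a host crunching a given mix (share of tasks per type)."""
    credit = sum(share * ASSIGNED_CREDIT[kind] for kind, share in mix.items())
    seconds = sum(share * RUNTIME[host][kind] for kind, share in mix.items())
    return SECONDS_PER_DAY * credit / seconds


vlar_storm = {"vlar": 1.0, "vhar": 0.0}
vhar_storm = {"vlar": 0.0, "vhar": 1.0}

cpu_vlar = credit_per_day("cpu_host", vlar_storm)
cpu_vhar = credit_per_day("cpu_host", vhar_storm)
gpu_vlar = credit_per_day("gpu_host", vlar_storm)
gpu_vhar = credit_per_day("gpu_host", vhar_storm)
```

In this made-up case the CPU host earns the same either way, while the GPU host's daily credit more than doubles during a VHAR storm with no change to the host at all.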
ID: 1786443 · Report as offensive
Profile shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1786508 - Posted: 10 May 2016, 23:59:41 UTC

Wow, OK... I'm shocked. By some miracle we're all on the same page all of a sudden. Awesome.

I thought you guys were gonna pick up from where we left off (almost 3yrs ago) and that I was gonna get another lecture on how credit is NOT a flopcounter.

So now what? Well now...

From where I'm standing it looks like you are trying to fix something that isn't fixable. Now don't get me wrong... Jason if you think you can crack the FLOPS problem and reel it into some semblance of reality then you're a miracle worker. Plain and simple.

But for now, and for your own sanity, and Richard's, and Raistmer's... And all the other gurus. Oh and Eric's of course :) ...

... It might be a good idea to call credits anything but Cobblestones. Because as long as you do, you're all going to go crazy.

OK so here's the deal with the CreditNew equation:

-After the CN equation is done chasing its tail it spits out a number.
-CreditNew calls this number "F" (after calling it a bunch of other things first)

Now let's all call "F" something like "CN" for the sake of ALL our sanity. And again for all our sanity let's agree for a moment that CN is a random number generator (which is one of the kindest things people have called it over the years). Unfortunately (and out of the blue) we do this with it:

CN*cobblestone_scale = CREDIT

But if CN is arbitrary then no matter what you multiply it with, the result will also be arbitrary.

Also, there's no real need to multiply it by anything (let alone Cobblestones). We can easily leave that part out. It won't break the equation or anything because it's not really PART of the equation.

It could just as easily be:
CN = Credit

And "Credit" could be anything. Anything EXCEPT Cobblestones ;)
ID: 1786508 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1786588 - Posted: 11 May 2016, 5:33:15 UTC - in response to Message 1786508.  

expect credit to go south again soon ...
ID: 1786588 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786614 - Posted: 11 May 2016, 7:03:51 UTC - in response to Message 1786508.  


CN*cobblestone_scale = CREDIT

But if CN is arbitrary then no matter what you multiply it with, the result will also be arbitrary.
...


True. It *appears* and is arbitrary (actually chaotic), because of the faults.
Expanding the CN part:
peak_flops*elapsedseconds*pfc_scale*cobblestone_scale

- 1st term peak_flops: (Sticking point for later)
- 2nd term elapsedseconds, measured, is the important one, because the whole purpose is to automatically adjust the second term to predict how long your next tasks will be.
- 3rd term, pfc_scale, and its inverse 1/pfc_scale used for prediction, are misnamed perhaps unintentionally, will come back to it (prediction is important for scheduling)
- cobblestone_scale. Only a unitless conversion from one scale to another (measuring stick), could easily be banana_scale.

so the only term here that is truly arbitrary/chaotic, is the scaling term, that is supposed to compensate for peak_flops being all over the map, and natural variation in elapsedseconds
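
For anyone who would rather see numbers than letters, here is a minimal Python sketch of the product above. It is an illustration only, not the BOINC server code: the function name `claimed_credit`, the host figures and the `unit_scale` parameter are made up for the example, and the value of `COBBLESTONE_SCALE` assumes the usual definition of 200 cobblestones per day on a 1 GFLOPS reference host.

```python
# Assumption: 200 cobblestones per day of work on a 1 GFLOPS reference host.
COBBLESTONE_SCALE = 200.0 / (86400.0 * 1e9)  # credit per floating point operation


def claimed_credit(peak_flops, elapsed_seconds, pfc_scale,
                   unit_scale=COBBLESTONE_SCALE):
    """peak_flops * elapsedseconds * pfc_scale * cobblestone_scale."""
    return peak_flops * elapsed_seconds * pfc_scale * unit_scale


# Hypothetical host: 4 GFLOPS claimed peak, a 3 hour task, pfc_scale of 0.5.
credit = claimed_credit(4e9, 3 * 3600, 0.5)

# Swapping the last term only changes the unit of the answer.
bananas = claimed_credit(4e9, 3 * 3600, 0.5, unit_scale=1e-12)
```

With these made-up inputs the claim works out to about 50 cobblestones, or about 21.6 of the invented banana unit; the first three terms carry all the information.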

It could just as easily be:
CN = Credit


And "Credit" could be anything. Anything EXCEPT Cobblestones ;)


Exactly, you can award Virtual-Tacos if you want, however we still need to be able to estimate how long tasks will take (obviously broken in various situations)

Back to 'is it fixable?'. I contend yes. Here is why:
You can look at your task history and predict how long future tasks will take --> Humans are good at that, it's called 'estimate localisation'

pfc_scale, because normalisation to a reference application is a step, should have been named 'relative_efficiency'. With only 1 application, the credit claim boils down to

claim*cobblestone_scale

We saw that on Albert@home tests, precisely as you say, the same amount as flopcounting (There are natural instabilities in elapsed_seconds, OK, but not that much, noise can be smoothed with damping if you know what you're doing, humans also do that with their knees while riding a skateboard.) Let's put the damping/stability aside as very fixable.
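
As a rough picture of what damping means here, a plain exponential moving average is sketched below in Python. This is a stand-in technique chosen for illustration, not what CreditNew does; the name `damped`, the constant `alpha` and the sample run times are all hypothetical.

```python
def damped(samples, alpha=0.2):
    """Exponentially smooth a series of elapsed-time samples.

    alpha is a hypothetical damping constant: smaller values give a smoother
    but slower-reacting estimate.
    """
    estimate = None
    smoothed = []
    for x in samples:
        if estimate is None:
            estimate = x                      # seed with the first sample
        else:
            estimate += alpha * (x - estimate)
        smoothed.append(estimate)
    return smoothed


# Made-up noisy elapsed times (seconds) for tasks of the same kind.
noisy = [100.0, 130.0, 80.0, 110.0, 95.0, 120.0, 85.0, 105.0]
smooth = damped(noisy, alpha=0.2)

noisy_spread = max(noisy) - min(noisy)
smooth_spread = max(smooth) - min(smooth)
```

The trade-off is the usual one: a small `alpha` hides the noise but also reacts slowly when the host really does change.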

So accepting (with a stretch) that projects using creditNew understand their application well enough to put a reasonable #operations on their tasks to start with (There are formal techniques and it's not actually mutable), Where's the problem ?

The main (scaling) problem is with how that peak_flops is yet another misnamed number, misused.
For CPU it's Boinc_Whetstone (well disguised)
For GPU it's marketing GFlops

Now comes the tricky bit. Normalisation to the reference application, when there is only one application, shows 'correct' credit and 'reasonable' estimates (ignoring that noise/damping issue), as our experiments on Albert indicated.

On Windows/Linux-Intel/Mac-Intel, Boinc_Whetstone is flat out WRONG: too low when you have SSE through AVX engaged. Now we have different hosts with different features being normalised to an increasing number of results claiming a fraction of Actual.

The estimates come out too short, are unstable depending on which hosts are returning work, bounds periodically get exceeded (indicating breakages in the system), and, arguably least important, you get a fraction of the credit intended.

The fix? Well, the server side knows your CPU features, and the project hopefully knows which instruction sets it enabled for the reference application(s).

We then know a rule of thumb to determine how much Boinc Whetstone underclaims for a given instruction set (Sisoft_SIMD_Whetstone/Boinc_Whetstone).

At least for 'reasonable' enough correction (as Humans do, and all we need for prediction/estimation purposes) you should find on your hosts, Sisoft SIMD singleThreaded Whetstone is approximately 2.5 to 5x the Boinc Whetstone figure.

Funny part is, on the larger scale (many results/hosts), the 2x slop in that will even cancel out by the fact that some hosts would be on the high side, while others on the low.
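
The rule of thumb can be written down in a few lines of Python. The benchmark figures and the RAC below are hypothetical, picked so the ratio lands on 3.3x; `simd_correction` is just a name for the division, not an existing BOINC function.

```python
def simd_correction(simd_whetstone, boinc_whetstone):
    """Ratio of a SIMD-aware Whetstone score to the Boinc Whetstone score."""
    if boinc_whetstone <= 0:
        raise ValueError("Boinc Whetstone score must be positive")
    return simd_whetstone / boinc_whetstone


# Hypothetical single-threaded scores for one host, in the same units.
boinc_score = 3000.0
simd_score = 9900.0

factor = simd_correction(simd_score, boinc_score)

# Hypothetical RAC scaled by the correction factor.
corrected_rac = 1000.0 * factor

# Rough sanity band of 2.5x to 5x.
in_expected_band = 2.5 <= factor <= 5.0
```

A factor well outside that band would suggest the two benchmarks were not run in comparable modes (for example multi-threaded against single-threaded).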

The only reason it won't self-correct without compensation is that FPU-only applications and hosts don't represent enough results (if any anymore) compared to the massive number of underclaiming SSE-AVX results.

Address that discrepancy ('Optimising' Boinc Whetstone won't cut it), add some damping to stabilise, and it will dial in as intended (i.e. estimates).

Homework:
A) Run Sisoft highest SIMD single threaded benchmark, and divide that by Boinc Whetstone on your host, Multiply your RAC by that, that's cobblestone_scale award +/-noise. should be ~2.5-5x or so (Mine's 3.3x)

B) work out all the factors that affect reported elapsed_time in your results, and how quickly responding estimates need to adapt for practical purposes/change, while not bouncing around all over the shop giving meaningless noise. (not quite as subjective as it sounds)

C) if you have a smartphone, open up Googlemaps and marvel (lol) at how relativistic estimate localisation through sensor fusion and statistically based controllers is done right (with GPS and GLONASS).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786614 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786618 - Posted: 11 May 2016, 7:32:18 UTC - in response to Message 1786588.  

expect credit to go south again soon ...


Lol, gave me an idea for a principle:

Credit is inversely proportional to the wonkiness of Boinc's estimates.
(which get more wonky every time an old host dies or new more efficient one comes online)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786618 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1786627 - Posted: 11 May 2016, 8:42:48 UTC - in response to Message 1786614.  

True. It *appears* and is arbitrary (actually chaotic), because of the faults.

Hold that thought. "Chaotic" is a good scientific (mathematical) word. Just don't call it random in David's hearing (he's touchy about that).
ID: 1786627 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786628 - Posted: 11 May 2016, 8:44:24 UTC - in response to Message 1786627.  

True. It *appears* and is arbitrary (actually chaotic), because of the faults.

Hold that thought. "Chaotic" is a good scientific (mathematical) word. Just don't call it random in David's hearing (he's touchy about that).


Agreed. Chaotic is not the same as Random, It is NOT random.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786628 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1786636 - Posted: 11 May 2016, 9:06:53 UTC - in response to Message 1786628.  

True. It *appears* and is arbitrary (actually chaotic), because of the faults.

Hold that thought. "Chaotic" is a good scientific (mathematical) word. Just don't call it random in David's hearing (he's touchy about that).

Agreed. Chaotic is not the same as Random, It is NOT random.

Om. Chant after me: "Chaos is not random".
ID: 1786636 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786646 - Posted: 11 May 2016, 9:41:55 UTC - in response to Message 1786636.  
Last modified: 11 May 2016, 9:42:49 UTC

True. It *appears* and is arbitrary (actually chaotic), because of the faults.

Hold that thought. "Chaotic" is a good scientific (mathematical) word. Just don't call it random in David's hearing (he's touchy about that).

Agreed. Chaotic is not the same as Random, It is NOT random.

Om. Chant after me: "Chaos is not random".


A wise puppet, (Agro Vation, my Avatar), once said: "Beauty is only skin deep, but ugly goes right through to the bone."
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786646 · Report as offensive
Profile William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1786647 - Posted: 11 May 2016, 9:45:37 UTC - in response to Message 1786627.  

True. It *appears* and is arbitrary (actually chaotic), because of the faults.

Hold that thought. "Chaotic" is a good scientific (mathematical) word. Just don't call it random in David's hearing (he's touchy about that).

'Three is Chaos'

No it's not random, but the code amounts to fairly unpredictable credit awards (though as for any chaotic system, there are patterns [*]) and to the layman unpredictable is pretty much the same as random. We say chaotic, when we know you could calculate it.
'random' is a word we use when there don't seem to be any rules. Very few things are truly random. Blame physics for that ;)



[*] and if you were in possession of every single variable used in CN you could indeed calculate how much credit you'll be getting. CN IS a calculation after all.
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1786647 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786649 - Posted: 11 May 2016, 9:53:25 UTC - in response to Message 1786647.  
Last modified: 11 May 2016, 10:04:46 UTC

Well that's just the difference between transferring the (awkward) theory, from formulae to practice.

Properly engineered solutions are a marriage of form and function, the synthetic (theory) and the aesthetic (simplicity, proportion and beauty), crafting an optimal synthetic-aesthetic, i.e. 'State of the Art', recognised since the renaissance.

[Edit:] leading to the aerospace design rule of thumb --> if it looks wrong, it probably is wrong.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786649 · Report as offensive
Profile shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1786949 - Posted: 12 May 2016, 8:52:45 UTC

BTW can anybody put some real numbers into the CreditNew equation? I'd like to play around these days but am loath to work with just letters and words. Feels like 10x the effort (at least for the way my own brain is wired).

Anyway, can anybody create a template? Might even be a good idea to have links on the CN Wiki page directing to real world examples of a Seti task and other Boinc projects tasks... dunno.

William? Ageless? Jason? Anybody?

Thanx in advance.
ID: 1786949 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1786951 - Posted: 12 May 2016, 8:55:41 UTC - in response to Message 1786949.  
Last modified: 12 May 2016, 8:57:09 UTC

Got a spreadsheet somewhere on Googledocs for analysing Albert data [Edit: may have been seti-beta CPU, will see], Will dig it out on the weekend.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1786951 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.