Posts by Bart Barenbrug

21) Message boards : Number crunching : Are there any sites providing optimized clients? -- PART II (Message 352643)
Posted 30 Jun 2006 by Bart Barenbrug
Post:

Sorry I wasn't clear, the -nographics argument is only available when testing the setiathome app standalone (without BOINC):
path\setiathome_5.15_windows_intelx86.exe -nographics

To me that sounds like a useful BOINC feature: being able to spawn the app with the -nographics option (or any other, possibly project-specific, options). Would such a feature give everybody who uses it (I personally never use graphics) a 5-10% speedup?
22) Message boards : Number crunching : Pending Credit calculator/monitor (Message 155090)
Posted 22 Aug 2005 by Bart Barenbrug
Post:
But what does the Ø gc/wu reflect?
My guess is that when the script parses the result pages, it looks not only at the results that are still pending, but also at the credit for the results where credit has already been granted, and that the average of those is what you're looking at. I haven't looked at the source of the script, but that would be my guess...
23) Message boards : Number crunching : Pending Credit calculator/monitor (Message 155082)
Posted 22 Aug 2005 by Bart Barenbrug
Post:
Oops. Double post. Sorry...
24) Message boards : Number crunching : So, IS seti actually recovering from the outages or what? (Message 152660)
Posted 17 Aug 2005 by Bart Barenbrug
Post:
Yes, there will be 3 versions overall. PHP Standalone script (with PHP), PHP web page script, VB EXE. The URL is http://cjsoftuk.dyndns.org/boinc/

FYI: I just tried the VB exe, which seems to work fine and yields results similar to Andy's script (though it reports somewhat more pending credit, 3.8K vs 3.2K; I don't know which one's right). But the PHP web script gives me the following error:
Notice: Undefined index: PCMUser in /www/org/dyndns/cjsoftuk/domain_public/boinc/PCM.php on line 243

(both when I try for speed and when I try for accuracy). My host ID is 1114129 if you want to give it a shot yourself.
25) Message boards : Number crunching : Damn You People Using Boinc 4.13 (Message 139655)
Posted 20 Jul 2005 by Bart Barenbrug
Post:
I agree with Ananas. Here's another example of a WU with three successful results (so it could have been validated OK), but the WU got trashed because too many download errors came first (only two of the six are by pre-4.19 core clients, by the way, so not sending work to pre-4.19 clients wouldn't even have helped in this case).

I don't see why a result could not transition from "Ready to send" to "In progress" (to use the status page terminology) only *after* it has been successfully downloaded by a client. If an error occurs, fine: just keep it as "Ready to send" and hand it out to the next client that comes asking for more work. There's no need to keep track of the failed attempt in the database: the client will come back for more (just like a failed upload of a processed result is simply retried later), so there is no extra strain on the database (even less than in the current way of working). Am I missing something?
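The suggested transition rule is small enough to sketch; a minimal illustration (all names here are hypothetical, and the real BOINC transitioner is of course far more involved):

```python
from enum import Enum

class ResultState(Enum):
    READY_TO_SEND = "Ready to send"
    IN_PROGRESS = "In progress"

class Result:
    def __init__(self):
        self.state = ResultState.READY_TO_SEND

def on_download_attempt(result, download_ok):
    """Move a result to In progress only after a confirmed download.

    A failed download leaves the state untouched, so the scheduler can
    simply hand the result to the next client that asks for work; no
    error record for the failed attempt is written to the database.
    """
    if download_ok:
        result.state = ResultState.IN_PROGRESS
    return result.state
```

The point is that the failure path is a no-op: the result just stays eligible for the next scheduler request.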
26) Message boards : Number crunching : UPLOADING II --- I is TOOOOOO big (Message 138513)
Posted 18 Jul 2005 by Bart Barenbrug
Post:
So is the Philmor server being reconfigured to serve as either upload or download server (so Kryten's load is divided)? I assume some gateway or router can be configured to split the traffic appropriately? By the looks of it, the Kosh server could be another candidate for re-deployment if really need be (and validation can keep up on the remaining two validation processes). I can imagine that moving tasks from one machine to another is far from trivial in a running environment, with many scripts and programs running in concert. So to the people at seti: stay cool, despite the sometimes heated discussions here.
27) Message boards : Number crunching : Uploading (Message 138223)
Posted 18 Jul 2005 by Bart Barenbrug
Post:
Maybe let clients that work on the same work unit get in contact with each other (in a peer to peer manner) and handle validation (redundantly) amongst themselves, so only validated results would have to be reported back to the servers.

With that code on the clients, and with the program being open source, that means there is the possibility of someone changing the validation code ...


Of course. Which is why the validation would need to be done redundantly (maybe on all clients that computed the result, or even on different clients). So server-side there would still be some sort of validation required, but only of the validation process, not of the results. It might not even make a big change in validation effort (CPU-cycle wise), but if clients could forego reporting results that they know are bad, that could act as a separate filter, relieving some server-side pressure (connection-wise). But maybe it won't help much. It's just a not-worked-out idea along the lines of "could client CPUs be helpful if servers are CPU-bound?"
28) Message boards : Number crunching : Uploading (Message 138216)
Posted 18 Jul 2005 by Bart Barenbrug
Post:
The basic idea used in this is collision detection and back off like what is used in Ethernet. The trouble with this is that saturation starts to take place when the network starts to get above 60% of capacity... (others say it is efficient into the 90's but I think they are dreaming).

One of the things that makes it worse now is that every client with a reasonable queue is trying to connect every few minutes or so, because there's always one WU that reaches its "let's try again" time. Maybe that retry timer should be made global for all results of a client, so the servers would get less load. That small change is client-side only, so it could be implemented quite easily. To profit from it further, once a client does set up a connection, it could use that connection to its full extent by uploading all its finished results (with proper time-outs, of course; a smart client might upload in earliest-deadline-first order). That is: don't break the connection once one upload is done, but keep it open as long as more needs to be communicated (the client would then be in charge of breaking the connection, so the server would need a good time-out). Such a change would require server-side adaptations, I guess, so it would be harder to implement.
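The "one connection, earliest deadline first" idea can be sketched roughly like this (Result, Connection, and upload_batch are illustrative names, not BOINC's actual API):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Result:
    name: str
    deadline: datetime
    finished: bool

class Connection:
    """Stand-in for one persistent upload connection."""
    def __init__(self):
        self.sent = []
    def send(self, result):
        self.sent.append(result.name)
    def close(self):
        pass

def upload_batch(results):
    """Send every finished result over a single connection,
    earliest deadline first, instead of reconnecting per result."""
    pending = sorted((r for r in results if r.finished),
                     key=lambda r: r.deadline)
    conn = Connection()      # one connect attempt for the whole backlog
    try:
        for r in pending:
            conn.send(r)     # reuse the open connection
    finally:
        conn.close()         # the client, not the server, breaks the link
    return conn.sent
```

Sorting by deadline means the most urgent results still get through if the connection drops partway.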

I also seem to remember a time when a client would collect a few WUs before first trying to upload them (which is fine if the deadline is not near). That could also relieve pressure on the servers, if combined with the "once a connection is there, use it until nothing more needs to be communicated" strategy.


Worse, to implement it would require a change in both the clients and the servers, and this would have to be done across all projects at the same time ... :(

I don't see how to make this work ... :(


A suggestion that's even worse in that respect: it seems that server-side hardware shortage is common, whereas client-side there is hardware aplenty (especially CPU cycles, which at the moment seem to be the problem if a server is completely CPU-bound). Now servers do completely different things than clients, need different resources, and can be considered reliable (in a security sense) whereas clients cannot, but would it be possible to distribute the work a little more? Maybe let clients that work on the same work unit get in contact with each other (in a peer-to-peer manner) and handle validation (redundantly) amongst themselves, so only validated results would have to be reported back to the servers. I know this is long-term, might not combine well with dial-up clients, and needs a lot more work to get the details right, so just take it as an example of asking oneself: what else could we decentralise and let clients help us out with? The answer is probably "not much more than is already done", since I'm sure the devs have asked the same question. But it can't hurt to revisit questions like these now and again, so consider this post a friendly reminder. ;)
29) Message boards : Number crunching : Granted credit Proposal (Message 130554)
Posted 30 Jun 2005 by Bart Barenbrug
Post:
Also, imho it's fighting a symptom, not a cause. The problem is in benchmarking. I installed an optimised BOINC along with an optimised cruncher app, but apparently the benchmark code does not allow for as much optimisation as the cruncher code, so the benchmarks do not reflect how fast my cruncher can actually crunch. So I'm now claiming less credit per WU than I used to, though I'm still processing them completely. So I'm actually claiming too little, and assigning the median at least compensates for that.

Using an optimised app (= better software) is like using better hardware. Someone who buys a faster PC surely gets more credits, so why wouldn't someone who uses better software get more (not per WU of course, but because more WUs are being processed)? It's up to everybody whether they want to upgrade (hardware or software).

The assumption that the lower scores for those with optimised apps are unfair to users of unoptimised apps is not true imho. When more people start using the optimised apps, there will be cases where the median of a WU is low (because several optimised apps worked on it). And then everybody suffers by getting too little credit.

The benchmarks should reflect the amount of work a cruncher can do, and with an optimised cruncher this is currently not the case (even with an optimised BOINC).
30) Message boards : Number crunching : Optimized BOINC 4.45 clients for Windows (Message 128089)
Posted 26 Jun 2005 by Bart Barenbrug
Post:
Actually, I see every reason to use the optimized application, but I don't know why I'd use the optimized core client. As far as I know it doesn't do the science part.

That's my thinking as well. I run the optimized science app, but see no need for an optimized CC. All it does is improve benchmark scores.


It brings the benchmark scores back up to speed with the optimised app, thereby reflecting that you crunch faster with it. A faster cruncher should get more credit because it gets more work done (just as if you had faster or more hardware). Claimed credit is something like "benchmark score" times "time taken to process a WU": "work done per time unit" times "number of time units per WU" equals "work done for the WU". If I use an optimised app with a non-optimised BOINC, my "time taken to process a WU" goes down a lot, but my "benchmark score" does not go up, so I claim less credit per WU. That's not so much a problem for me, since I process more WUs, so it compensates (though it doesn't increase my RAC like it should, given that I do more science). But others who worked on the same WU might be penalized by it, because the granted credit is the median of three scores, one of which is now significantly lower.

A typical non-optimised WU would yield some 21-25 claimed credits (or so, for me anyway). With the optimised app (but unoptimised BOINC), it's more like 12-16 (again: for me). So if the validator gets one result from an unoptimised client and two from optimised clients, the granted credit will be in the 12-16 range, which will not be very nice for the person who used the unoptimised client and did spend the time to earn 20-something credits.
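The median effect described above is easy to see in a toy calculation (this sketches only the granting rule; the real validator does considerably more):

```python
from statistics import median

def granted_credit(claims):
    """Grant the median of the claimed credits for a work unit."""
    return median(claims)

# One host with an unoptimised client claims ~22 credits; two hosts
# running the optimised app with stock BOINC claim ~14 and ~13.
# The median tracks the lower pair, so the unoptimised host gets 14
# instead of the 20-something its benchmark-time product suggested.
```

With two of three claims depressed by the benchmark mismatch, the median follows them, which is exactly the penalty on the third host described above.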

All in all: use an optimised BOINC along with an optimised cruncher to let the benchmark scores better reflect how fast work is actually done. Indeed it doesn't change the science done, but it does make the credit system more accurate, and that's important to some.
31) Message boards : Number crunching : Boinc current cost ! (Message 100415)
Posted 17 Apr 2005 by Bart Barenbrug
Post:
> One way to cut down power usage if running seti (or most other BOINC enabled
> projects) 24/7 is to run it in a ramdrive, have all your harddrives spindown
> when they are not used and have enough RAM to avoid swapping.

Even without a RAM drive: my PC has 4 hard disks in it, so only one needs to be spinning for BOINC (and my satellite receiver actually also uses it for storage, so I want to keep it running so that timer recordings don't fail to start because the HDD wasn't spinning). Would there be a way (in Windows XP) to set the power management for the other three drives to spin down, but keep the one running? So far I've only been able either to have them all run, or to have them all spin down...
32) Message boards : Number crunching : Make Boinc CC 4.19+ mandatory (Message 81690)
Posted 21 Feb 2005 by Bart Barenbrug
Post:
I wonder how much of that is due to the number of 4.09 and 4.13 clients out there versus the number of 4.19 clients. Many casual users (myself included) wouldn't keep tabs on what the latest version of BOINC is, so they wouldn't upgrade unless their current BOINC told them to.

Nevertheless, I'm hoping a newer version will be an improvement, so I've just upgraded from 4.13 to 4.19 (my old client finally managed to upload all my processed results, and the upgrade nicely kept my already downloaded queue). So your post had some effect, Colt. ;)
33) Message boards : Number crunching : more power - less credit (Message 38002)
Posted 18 Oct 2004 by Bart Barenbrug
Post:
> But, if you strip out all the nonsense, and just track, say, based on basic
> CPU speed/class the total numbers come down quite a bit. With this, you can
> even average out the differences between AMD and Intel chips.

Sounds great! Go for it! ;)
34) Message boards : Number crunching : How I Would Fix the Credit System (Message 37992)
Posted 18 Oct 2004 by Bart Barenbrug
Post:
> With no Windows experience: there is just graphic client?

There's a client that can do graphics if you ask for it. Which I don't. But indeed, I don't know if the time spent rendering the graphics is excluded from the time reported to BOINC; if not, you would indeed get (or at least claim) credit for that "work" if you leave the graphics running. I guess that's one more reason for the validator to take the median claimed credit as the awarded one.

> I like credits: they tell me that my boxes work OK. Boinc is a kind of snmp
> ;-)

Good point. [OT]Similar to why I donate blood: it's a free regular medical check-up... ;)[/OT]
35) Message boards : Number crunching : more power - less credit (Message 37988)
Posted 18 Oct 2004 by Bart Barenbrug
Post:
> I remember we did the tests, but I don't recall the outcome. I guess because
> I have a hard time getting worked up about this ...

True, there are many more important things. I guess I had too much time yesterday (finally a free Sunday, and what do I do: work out some formulas... *grin*).

> The problem I still see is that there is no project independent way to derive
> the loading factor. And that is the point.

Indeed, since the new parameter is supposed to characterise a project (in terms of how much its computations are like the Dhrystone benchmark computations versus the Whetstone benchmark computations), that may indeed not be so easy. I don't know exactly how the computation time is currently predicted, but would it be possible to predict using the Dhrystone number alone, predict using the Whetstone number alone, and then process a work unit and see which interpolation weight would be needed to combine the two predicted values into the actual time taken? This would have to be averaged over a larger number of work units of course (the set being a more or less representative sample of the different work units a project might have; it's up to each project to determine such a set with more or less accuracy), and also over a representative set of machine types, but it would only have to be done once for every project (maybe updated now and again as new machines come out). Still a reasonable amount of work, which luckily can be automated (even farmed out, if you want... ;) ).
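The calibration step described above amounts to solving for a single interpolation weight per work unit and then averaging; a sketch under those assumptions (all names hypothetical):

```python
def fit_weight(t_fp_only, t_int_only, t_actual):
    """Solve q * t_fp_only + (1 - q) * t_int_only = t_actual for q.

    t_fp_only / t_int_only are the run times predicted from the
    Whetstone / Dhrystone score alone. q is clamped to [0, 1], since a
    weight outside that range means neither benchmark explains the work.
    """
    if t_fp_only == t_int_only:
        return 0.5  # both predictions agree; any weight fits equally well
    q = (t_actual - t_int_only) / (t_fp_only - t_int_only)
    return min(1.0, max(0.0, q))

def project_weight(samples):
    """Average per-WU weights over a representative set of
    (t_fp_only, t_int_only, t_actual) samples to get one q per project."""
    return sum(fit_weight(*s) for s in samples) / len(samples)
```

A project would run this once over its representative set of work units and machines, and occasionally refresh it as new hardware appears.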

I see your point about this being difficult.

> We did discuss a number of mechanisms that might in turn allow the derivation
> of the values. One of the proposals (I think it was mine, but whatever)
> looked at the stability of the benchmarks (which I am not impressed with by
> the way) and would collect these statistics in an attempt to obtain a better
> figure of merit. Basically, it would take the reported results from all
> machines of specific classes and roll-up and average the benchmarking scores.

Ah, yes: I think that is sort of what I tried to describe above, but you nicely rolled it up in the numbers already available.

> With that in place, then you could also collect estimated vs actuals ... and
> from THAT derive your loading factor. The nice thing about this is
> that there is not a major change in mid-horse ... It also allows you to
> eliminate the instability of the benchmarks and their incorrect measurement of
> the actual work. From there you now have a benchmark score, an averaged
> benchmark score, an estimate based on benchmark score, actual(s) based on WU
> processed, Actuals averaged, and finally, estimates based on actuals ...
>
> Whatever ... :)

*grin* I see your pain in SETI having to order more and more storage to accommodate the database that will store all of that... ;)

> We can toy with the ideas all we like, but it is not likely that there will be
> a change unless you can convince one or more projects that this, or any other
> scheme, is a better and more useful scheme to derive estimates and credit
> ....

Ah, some evangelizing to do. I don't think I'm ready for that (my free Sunday is over).

(Last try, can't resist:) you call this a different scheme, but I'd say it's only a small variation on the existing one. To introduce it, just add the weight between the benchmarks as a project attribute, adjust the summation of the benchmarks accordingly, and set the weight to 1/2 by default. That gives exactly the same situation as now, but any project on its own can decide whether to fine-tune the weight if it sees that that improves the consistency and predictability of its claimed credit scores.
36) Message boards : Number crunching : more power - less credit (Message 37951)
Posted 18 Oct 2004 by Bart Barenbrug
Post:
> This mostly tells me that there are problems with the benchmarking of
> HT processors

In that case, how does the comparison fare with HT turned off (i.e. setting the preferences to work on a single unit at a time only, and benchmarking accordingly)?

> The point of using Whetstone & Dhrystone are that they are reasonable
> benchmarks for the class of work we would normally do with science
> applications on BOINC.

The only point I'm trying to make is that although they are measured and stored in the database separately, they are not used as such. In the current way of working, the benchmark could actually run both benchmarks and only return the sum of the two stones (MhoistStones, so to speak ;) ), which can then be used in fabs(host.p_fpops)/1e9 + fabs(host.p_iops)/1e9 (which then simplifies to host.p_fipops/1e9, assuming the fabs are already taken care of in the summation upon benchmarking).

So I saw room for improvement by providing an additional parameter, hopefully per project, which gives a rough indication of whether that project's work is more like the Dhrystone benchmark or more like the Whetstone benchmark, and can be used to weigh the two benchmark scores. Since Benher concluded that SETI is almost all floating point, setting that extra parameter for SETI is simple: time should be weighed according to the floating-point benchmark and not according to the integer benchmark. But work for other projects might be more like the integer benchmark, in which case the parameter could be set differently for that project.

So this is a parameter that is the opposite of SETI-specific: it allows differentiation between the different projects. And using the simple version where the parameter q is used as the weight for linear interpolation (q*fabs(host.p_fpops)/1e9 + (1-q)*fabs(host.p_iops)/1e9) does not complicate things much.
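Concretely, the weighted version of that sum might look like this (the function name and the example values are illustrative; only the weighting matters for the argument):

```python
def weighted_rate(p_fpops, p_iops, q=0.5):
    """Project-weighted work rate, in giga-ops per second.

    q = 0.5 reproduces the current equal-sum scheme; q near 1 suits a
    mostly-floating-point project like SETI, q near 0 an integer-heavy one.
    """
    return q * abs(p_fpops) / 1e9 + (1 - q) * abs(p_iops) / 1e9

# With q = 0.5 both stones count equally, as today; with q = 1.0 a slow
# integer benchmark no longer drags the claimed credit down.
```

Setting q = 0.5 as the default means nothing changes for projects that don't opt in.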

But it still depends on how well the Dhrystone and Whetstone benchmarks can represent the actual work, and I agree with AthlonRob and others that that is questionable at best. But if we're measuring both anyway, and see a correlation between differences in those numbers and differences in claimed credit, we might actually use those differences to obtain more equal claimed credit, as that is what most people seem to want.
37) Message boards : Number crunching : Large Stores of Workunits = Longer Time to Grant Credits (Solution: Suggestion for credit system) (Message 37821)
Posted 18 Oct 2004 by Bart Barenbrug
Post:
> If people were able to see how long they may have to wait on John Doe to process
> his work unit, maybe they won't go so crazy over everything.

Well, as a first estimate, I can go to my results page and click on any work unit. That gives me an overview of who else is working on that work unit. Clicking on the entries in the first column will show the report deadlines for each of the results. So I can already get a good idea of how long it might take that work unit to complete.

I think it will be hard to make the estimate more precise than just using the report deadline: not everybody has their computer processing at a constant rate (laptops, other work being done on the computer, etc.), so it would be hard even for the client to come up with a better prediction in the general case.
38) Message boards : Number crunching : How I Would Fix the Credit System (Message 37820)
Posted 18 Oct 2004 by Bart Barenbrug
Post:
If people want their WU counts back: if I go to my computer summary in my own account data, there's a number there counting my "results" (right after the download rate). This is a pretty good indication of the number of WUs, right? Actually it's more like how many you've downloaded, so invalid WUs and WUs that got lost due to a reset also get counted, but given remarks like "So what if occasionally somebody gets credit for an invalid result?", that may not be so much of a problem. Just as long as someone doesn't constantly reset their project, because that would get them another cache full of instant "result credits". So basically we already have a parallel system in place.
39) Message boards : Number crunching : more power - less credit (Message 37819)
Posted 18 Oct 2004 by Bart Barenbrug
Post:
Benher informed:

> Seti is virtually ALL floating point.
> There is some integer for address offset calculations into buffers (indexing),
> and a few other places, but virtually all FP.
> In fact the estimated WU completion time calculation only uses the FP_benchmark score.

That clears up a few things. I thought I had spotted a pattern of "low FP benchmark compared to integer benchmark results in lower claimed credit". My formula might still be applied to compensate for differences between projects, but if the usage is so extreme, it may be simplified to the (more approximate) w = q*f + (1-q)*i, where q would be (almost) 1 for SETI; i.e. for SETI, use only the FP benchmark for the claimed credit as well (as is done for the estimated WU completion time), but keep the integer benchmark around for other projects that may use integer more.

I can see the trend that faster computers will take longer for that bit reversal: processing speed has increased more than memory speed, so cache misses become more and more expensive. If the benchmarks don't take this into account as much, there could indeed be a tendency for faster computers to have a higher penalty.

Bill mentioned:

> As I understand it since the linux isnt optimized it claims less credit,

If the SETI client for Linux were less optimized, it would take longer for a work unit, which would actually increase the amount of claimed credit. So it would actually have to be the benchmark code that is less optimized (less so than the SETI client) for less credit to be claimed.
40) Message boards : Number crunching : more power - less credit (Message 37659)
Posted 17 Oct 2004 by Bart Barenbrug
Post:
Not to make the other post even longer: my new formula is also far from perfect. It only splits up the work done into integer versus floating-point computations. First: there are different integer operations which run at different speeds, so the mix of which integer operations are used, and in which quantities, is also important (you can refine the benchmark to take that into account, but it will be increasingly difficult to determine accurately what the mix for a typical work unit of a project will be). And the same goes for floating point. Second, there are other types of calculation as well. "Control" calculations, for example ("if", "while", "switch", etc.), take time too, so where do those get lumped in? Third, this does not take cache behaviour, memory speeds, and suchlike into account, which also influence the computation times. Fourth, a machine might have a floating-point unit sitting next to an integer unit, meaning that if an integer and a floating-point operation are close enough together in the program, they may actually be executed in parallel rather than sequentially (which is what adding the floating-point time to the integer time in my computations corresponds to; has anybody tried to determine a proper benchmark for a VLIW machine yet? ;) ).

But since we do have two benchmark numbers, and they happen to be for floating point versus integer, we should at least measure for a work unit how much of its work is floating point and how much is integer; otherwise having two different benchmark numbers isn't really meaningful (just adding them doesn't even correspond to 50% floating-point operations and 50% integer: there's an inverse relation there, as I tried to show). And if you have that ratio for a work unit of a given project, I think you arrive at the formula I wrote down in my earlier post, which at least uses the two different benchmark numbers in a more meaningful way. It still gives only an approximation of the amount of work done, but a better one, I think.

It doesn't so much correct for the difference between fast and slow computers (just multiplying seconds spent by operations/sec will do that trick), but for the different relative speeds with which computers perform their integer versus their floating-point work. If we can find a correlation between that ratio for a given computer and how its claimed credit compares to the claimed credit of others, this new way of computing claimed credit might help.

For example, this WU has had two similar computers working on it (each claiming 38 or so credits), whereas the other computer not only claimed a different amount of credit, but also has quite a difference in processing speed of integer versus floating point (integer roughly a factor 2.8 faster than floating point for this other computer, versus roughly a factor 1.5 for the other two). Maybe it's this difference that causes the difference in claimed credit (but maybe it has another cause altogether: more WUs should be analysed).

Also [url=http://setiweb.ssl.berkeley.edu/sah/workunit.php?wuid=2978186]this[/url] work unit seems to indicate something similar. Could it be true that computers that have slow floating-point benchmark results relative to their integer benchmark results tend to claim lower credit? That would mean that SETI's work units mostly perform integer operations, and not so much floating point (so SETI would have a "p" value lower than 0.5 in my formula), and that therefore the claimed credit gets dragged down by a slow floating-point benchmark, even though low floating-point performance doesn't matter much for SETI processing. If this is a trend that can really be established on the basis of analysis of many more work units, and maybe with a different relation (but still a clear correlation) for another project which relies more on floating-point computation, going for the formula I mentioned might be worth it (once more pressing issues have been solved, of course).

Paul: you can have a look in your results at the claimed credit for your Mac (which has an integer benchmark score of about three times its floating point score) versus that of your pentiums (which score less than twice as high for integer as for floating point). If I'm right, your Mac will consistently claim lower credit per unit than your P4s (though again: that might also be related to other differences between the machines). Do you see that trend?


©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.