Posts by Scott Brown

1) Message boards : Number crunching : Response to concerns regarding the new credit system. (Message 317696)
Posted 26 May 2006 by Scott Brown
Post:
Thanks for the nice response, Jim-R.

A more severe penalty for aborted workunits might be effective, but it would likely alienate some new users who have trouble getting started, etc. I also failed to mention that the effectiveness of the current quota system (whether with fpops or flat-rate credit) at reducing the level/effect of "sweet" workunit selection is directly related to the proportion of those "sweet" units relative to long units. If they are roughly equal, then the effectiveness would be minimized, since every second unit would not be aborted (thus the quota would hold at about 50). On the other hand, the more lopsided the balance, the less the impact on the overall system (if mostly longer units, then quotas would fall toward zero if they were aborted; if mostly shorter units, then aborting the few long units would have little overall effect). In either of the latter cases, a more severe penalty might not be necessary.
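To make that proportion argument concrete, here is a toy sketch (my own illustration, not anything from the BOINC source; it assumes the simple rule implied above, quota +1 per completed unit and -1 per aborted unit, clamped between 1 and the project maximum, which may not match BOINC's actual bookkeeping):

    import random

    def average_quota(abort_fraction, n_units=100_000, quota_max=100):
        # assumed rule: +1 per completed unit, -1 per aborted unit, clamped to [1, quota_max]
        quota, total = quota_max, 0
        for _ in range(n_units):
            if random.random() < abort_fraction:
                quota = max(1, quota - 1)          # unit aborted in search of "sweet" work
            else:
                quota = min(quota_max, quota + 1)  # unit completed and returned
            total += quota
        return total / n_units

    for frac in (0.5, 0.8, 0.95):
        print(f"abort fraction {frac}: average quota ~{average_quota(frac):.0f}")

Under those assumptions, the 50/50 mix hovers around the middle of the range, while heavier aborting drives the quota toward 1, which is the behaviour described above.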

I think it is also worth throwing one more monkey wrench into the mix. Not all projects are going to use fpops. CPDN already uses a flat-rate system, and Einstein has stated that it will do the same for the S5 run (probably in about a month or so). What will the other projects do? Can they choose to remain on benchmarks?

2) Message boards : Number crunching : Response to concerns regarding the new credit system. (Message 317553)
Posted 26 May 2006 by Scott Brown
Post:

Perhaps we just validate each optimised app for every project with a gold standard PC, and assign a credit score based on the time it took to complete...
There you go, wanting to go back to a system similar to the one we had before, but a lot more complicated. Who is going to validate each app and assign credit for each optimized app? Remember, with the enhanced application having widely varying crunch times according to the angle range of the work unit, your system would require a special app for *each* AR. Or we would be seeing people aborting the longer-running WUs in favor of the shorter-running WUs, where you get the same amount of credit in a shorter amount of time. With our new system, the only way to get more credits in a shorter amount of time is to use a faster computer!!!


Actually, given the changes that have been implemented in BOINC over the last several months, this argument is no longer true. A primary reason for moving to BOINC's credit system was to avoid the "cheating" going on in SETI classic as well as to reduce the "sweet" workunit selection. The current design of BOINC renders these issues irrelevant because:

1. While angle range produces widely varying workunit times, assuming no "sweet" unit selection this evens out across workunits (that is, the random assignment of workunits would yield balanced errors around any gold standard selected).

2. BOINC now has a maximum daily workunit quota that is reduced by failures. Thus, aborting longer workunits in favor of shorter "sweet" ones would very quickly reduce that daily quota to near zero (making "sweet" unit selection largely ineffective).

3. Most of the other forms of "cheating" from the classic flat-rate credit system are dealt with via the validation process.

4. Inflated scores through optimization are eliminated in a flat-rate system.
3) Message boards : Number crunching : Response to concerns regarding the new credit system. (Message 315125)
Posted 24 May 2006 by Scott Brown
Post:

Time should no longer make a difference, that's true.
And that's why, on the same machine, the self-calculated credits/hour should be almost the same every time, not different.


Well, that is a good theory, but in practice I have noticed it is not the case. On my Athlon 4200 X2, I have two back-to-back workunits completed that were awarded almost identical credits but have vastly different completion times. The first unit completed in just over 17,000 secs (about 63.7 credits, AR=0.41) and the second completed in more than 22,000 secs (about 64.5 credits, AR=0.42). Both units were completed overnight with nothing but BOINC running on the machine. Given the very similar ARs, I don't understand the extra hour and twenty minutes on the second unit for a whole 0.8 credits extra.
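For concreteness, here is the credits-per-hour arithmetic behind that complaint, using only the approximate figures quoted above:

    # approximate credits/hour for the two back-to-back results described above
    for secs, credits in ((17_000, 63.7), (22_000, 64.5)):
        print(f"{credits / (secs / 3600):.2f} credits/hour")

That works out to roughly 13.5 versus 10.6 credits/hour for two units granted almost the same credit, which is the discrepancy being questioned.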

4) Message boards : Number crunching : report the results immediately! (Message 313313)
Posted 22 May 2006 by Scott Brown
Post:

While I do not use the option of reporting results immediately, there are some cases in which it would seem to be legitimate... For example, see the following short thread:

http://setiathome.berkeley.edu/forum_thread.php?id=31060

5) Message boards : Number crunching : Response to concerns regarding the new credit system. (Message 310607)
Posted 19 May 2006 by Scott Brown
Post:
maybe Berkeley should just consider resetting the stats like they did when we switched from classic to BOINC.

Oh yes, I look forward to seeing everyone come post at the Q&P forums for anywhere from 3×4 to 3×55 days (the result deadline times the quorum needed, which is roughly how long it takes for credit to be granted), asking "why can't I post on the main forums?" ... well, you need credit to be able to post here. Without it, no posting in NC, the Cafe or Science.

So you want to cripple everyone's ability to post here just to reset the chances of overtaking team A or team B?


What??? Are you kidding? The forum posting requirements are largely irrelevant to this discussion (and could easily be fixed by using pre-enhanced or enhanced credit for posting eligibility after a reset). The point is that, EVEN IGNORING THE OPTIMIZED CLIENT ISSUES, the credit system was so FUBAR from the beginning that a complete reset is necessary to get anything remotely close to a "level playing field".

6) Message boards : Number crunching : Response to concerns regarding the new credit system. (Message 310400)
Posted 19 May 2006 by Scott Brown
Post:

...maybe Berkeley should just consider resetting the stats like they did when we switched from classic to BOINC. You would still see your pre-enhanced stats along with post-enhanced, for instance. This is obviously not the best solution IMO...



I think that this would likely have been the best solution. Keep in mind that optimization is not the only flaw that existed under the old system. The benchmarking was screwed up for different operating systems, too (especially Linux). Thus, the existing cobblestones from SETI are largely meaningless when compared to the new system. Enhanced is supposed to establish a truly "level playing field", but by incorporating the flawed inequalities of the previous credit system via inclusion of current scores, this will never be the case.

7) Message boards : Number crunching : Unofficial BOINC Wiki closing 2006-03-31 (Message 259516)
Posted 9 Mar 2006 by Scott Brown
Post:
David Anderson realized though that he could fund SETI on the back of BOINC. To get funding for BOINC it had to appeal to a much wider audience than SETI alone. With many DC projects involved there had to be compromises; BOINC couldn't be slanted towards SETI only. With compromise, it's inevitable that it won't be ideal.
But it simply boils down to no SETI or a compromised SETI. Nothing new really; SETI@Home was a compromise after the US Senate killed NASA's SETI budget.


Not sure what planet you have been on, but it is fairly certain that David Anderson is much more interested in BOINC than in SETI specifically. Besides the fact that he is a computer scientist rather than an astronomer, he has more-or-less stated this publicly (e.g., in the CPDN video just before their BOINC startup).


Let's see now, to quote from Dr Anderson's web entry:
I work as a Research Scientist at the U.C. Berkeley Space Sciences Laboratory, where I run the BOINC and SETI@home projects.

Note, works in the SSL, not the Computer Science Department.

SETI@Home began years before BOINC was conceived, and Dr Anderson ran that project. SETI funding ended several years ago, and I think it certain that his interest in SETI did not simply end then.

Unlike you, I cannot say in which of BOINC and SETI he is more or less interested. What I can say is that he was well known to original SETI crunchers as the SETI@Home project director, and that he now directs both SETI and BOINC.
It doesn't take a mental giant to fill in the gaps.

Okay...let's try this once more...I said he IS a computer scientist, not that he works in the computer science department. I am well aware of his role with the original SETI@home. And it takes even less of a mental giant to understand, when he says (and I am paraphrasing him here from the CPDN video) that the SETI project is less likely to succeed than other projects, that it is less useful than other potential BOINC projects, and that he is most interested in distributed computing (and BOINC specifically) for the scientific community, that he probably was not really concerned with how "he could fund SETI on the back of BOINC."
8) Message boards : Number crunching : Unofficial BOINC Wiki closing 2006-03-31 (Message 258159)
Posted 6 Mar 2006 by Scott Brown
Post:
My apologies to Paul for taking this a bit OT, but a response is required here...


David Anderson realized though that he could fund SETI on the back of BOINC. To get funding for BOINC it had to appeal to a much wider audience than SETI alone. With many DC projects involved there had to be compromises; BOINC couldn't be slanted towards SETI only. With compromise, it's inevitable that it won't be ideal.
But it simply boils down to no SETI or a compromised SETI. Nothing new really; SETI@Home was a compromise after the US Senate killed NASA's SETI budget.


Not sure what planet you have been on, but it is fairly certain that David Anderson is much more interested in BOINC than in SETI specifically. Besides the fact that he is a computer scientist rather than an astronomer, he has more-or-less stated this publicly (e.g., in the CPDN video just before their BOINC startup).


Secondly, the dissent and dissatisfaction that I see (and once felt too) seem to originate when participants believe that we are something more than just that; when we believe that we are more than just volunteers; when we believe that we are stake-holders in BOINC and/or SETI. We're not (with a few exceptions). We can donate our CPU cycles, we can donate hardware and even money, but we don't own BOINC, or even a share of it. If just participating in the Science and the fun of credits and rankings isn't enough for us, we are not going to be satisfied.

Some seem to think that they are 'owed' something by the projects they join; that there is some onus on the project to guarantee some service level; that they will get work from a project; that their ideas must be listened to. None of this is true; we are volunteers, no more. We can choose the projects we join and when we run work. There just isn't any more to it.


Okay...time to say this yet again...it is in the basic nature of volunteering that feelings of "stake-holdership" or being "owed something" arise. This is not unique to SETI, BOINC, or DC projects in general. Nothing you or anyone else can say will change this fact about volunteerism! (There is plenty of literature on this topic.)

More importantly, as your own language indicates, we are RESEARCH PARTICIPANTS (note that this is not the same as research subjects, which itself has unique implications). As such, we are guaranteed numerous rights under granting-agency guidelines (here, under the NSF), many of which, somewhat ironically, are very similar to those "ideals" that Paul Buck proposed.


9) Message boards : Number crunching : Decisions- Boinc Projects (Message 255476)
Posted 1 Mar 2006 by Scott Brown
Post:

Well, I will not guess about Paul's motivations. From my few discussions with him (not infrequently in disagreement), I can only believe that he has always had the best of intentions.

However, I do find it very interesting that (even though I know he crunched for them and posted on their forum) there is no trace of Paul D. Buck at Predictor@home. His ID is no longer there and the stats sites do not report him there anymore either. I can only wonder why his account would have been removed???

10) Message boards : Number crunching : Unofficial BOINC Wiki closing 2006-03-31 (Message 254556)
Posted 27 Feb 2006 by Scott Brown
Post:

Paul, I am very sorry to see you leave. As everyone has noted below, your work with all the BOINC projects has been a tremendous resource to us all. In many ways, you have been the heart & soul of BOINC (much more so than even any of the BOINC developers!). I will greatly miss your comments in the forum and the logic that you always bring to discussions. I wish you the best of luck with whatever your next endeavor will be, as well as with your health.

Best Regards,

Scott
11) Message boards : Number crunching : Privacy Issues (Message 228857)
Posted 10 Jan 2006 by Scott Brown
Post:

I just thought I'd add a point to this discussion that has been ignored so far. Most of the BOINC projects are scientific research projects funded through external grants (e.g., NSF or NIH for projects in the US, etc.). As such, it is almost always the case that additional privacy protections for research participants apply. In other words, while the discussion regarding general notions of rights to privacy is interesting, it may be somewhat irrelevant, because the privacy guarantees of grant-supported research projects almost always EXCEED standard privacy concerns. The actions (as described by River~~) taken by an official of a project would appear almost certainly to be violations of privacy under that higher standard (though I would like to see the original postings for myself...could someone post a link to them?).

12) Message boards : Number crunching : number of classic & BOINC SETI users (Message 184775)
Posted 1 Nov 2005 by Scott Brown
Post:

It may be; it was a choice I had to make. What would be a better horizon? 14 days? A month?

I have no problem changing this if it would be more accurate.


@Willy
Well, I think it makes little sense to set the threshold for 'active' lower than the deadlines for project workunits. Since these vary, I wonder if 'active' shouldn't be defined on a project-by-project basis?

@all
And as a scientist and statistician, I'd like to say that statistics are never incorrect (proofs abound on this point); they simply vary in the level of error due to poor measurement, human factors, incorrectly specified models, etc. That said, my favorite statement on statistics is one I heard in a debate more than a decade ago: "Statistics are like a lamp post to a drunk, more for support than illumination."
13) Message boards : Number crunching : Granted credit less than ALL claimed? (Message 183829)
Posted 30 Oct 2005 by Scott Brown
Post:

Okay, I am at a loss for this one. Three units returned...all claim 30+...first one returned is still pending...other two returned (one is mine) get credit in the teens, but each gets different credit?

http://setiathome.berkeley.edu/results.php?hostid=1002952


http://setiathome.berkeley.edu/results.php?hostid=1002952

You're not looking at a WU, but at a host.



Doh!

Well, it is late here and I am tired...time to call it a day.
14) Message boards : Number crunching : Granted credit less than ALL claimed? (Message 183822)
Posted 30 Oct 2005 by Scott Brown
Post:

Okay, I am at a loss for this one. Three units returned...all claim 30+...first one returned is still pending...other two returned (one is mine) get credit in the teens, but each gets different credit?

http://setiathome.berkeley.edu/results.php?hostid=1002952

something seems amiss here...anyone wanna explain this one? Thanks.

Scott
15) Message boards : Number crunching : BOINC 5.2.2 possible download issues? (Message 180334)
Posted 20 Oct 2005 by Scott Brown
Post:

Hi Rom

Got a "ready to report" on Rosetta didnt update automatically. My connection is always on


Same here with LHC and SZTAKI...
16) Message boards : Number crunching : Improved Benchmarking System Using Calibration Concepts (Message 179741)
Posted 18 Oct 2005 by Scott Brown
Post:
If you look at Improved Benchmarking System Using Calibration Concepts, I have a new proposal.

Read and comment here ...


Overall, I like the system...though a few quirks would still need to be worked out.

If System "B" reports outside of the bounds of 2,400 to 3,000 seconds, it is not within the expected 10%.
Error bounds are one of the "unknowns" as far as a "correct" value goes; I would hope less than 5% would be common.


For example, here you are clearly assuming a normal (or at least symmetric) distribution of errors. This is a very strong assumption that is almost certainly false. Specifically, all workunit times are bounded on the lower end by zero (ignoring the negative-time errors noted at some projects) but are unbounded at the upper end. Given the widely varying workunit times, this can result in vastly different error distributions. For instance, a typical SETI unit (meaning not an errored-out short unit) will have enough processing time that the bound of zero may have little effect (i.e., since almost no machine will approach the zero-bound time, the distribution of times--and thus errors--will approximate a normal density). However, on short units (errored SETI units, short LHC units, normally short-unit projects such as PPAH or PrimeGrid, etc.) the effect of the zero bound is greater, making the normality assumption tenuous. That is, on longer units, 5% error in each direction is likely to be about the same amount of time, but on shorter units the lower bound for 5% error will be closer to the distribution's mean time than the upper bound. Furthermore, given the inherent bias of faster machines producing the shortest times (and thereby processing the most workunits overall), a significant system-wide effect could result.
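A minimal sketch of that skewness point (my own illustration, not part of the benchmarking proposal): it assumes, purely for the sake of argument, that runtimes scatter multiplicatively around a true time (a lognormal model, which automatically respects the zero bound) and that the relative scatter is much larger on very short units. Under those assumptions, the error band is roughly symmetric for long units but clearly lopsided for short ones.

    import math
    import random

    def tail_distances(true_time, rel_sigma, n=200_000):
        # lognormal scatter around true_time keeps every simulated runtime positive
        times = sorted(true_time * math.exp(random.gauss(0.0, rel_sigma)) for _ in range(n))
        mean = sum(times) / n
        lo, hi = times[int(0.05 * n)], times[int(0.95 * n)]
        return mean - lo, hi - mean  # seconds from the mean down/up to each 5% tail

    print(tail_distances(15000, 0.05))  # long unit, small relative scatter: nearly symmetric
    print(tail_distances(300, 0.50))    # short unit, large relative scatter: upper tail much longer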

How to address this is not immediately apparent. Some possibilities might include Bayesian techniques for similar bounding problems in MCMC modeling, but these usually require that we know, or have strong assumptions about, the non-normal distribution (e.g., is it Poisson, Negative Binomial, etc.). Perhaps the simplest fix would be a minimum length requirement for all benchmark units, but this would require 1) some language changes regarding the 'benchmarking through normal operations' in your original text, and 2) that projects with shorter units, such as PPAH or PrimeGrid, use a benchmark from another project (and we can probably throw CPDN into the latter category as well given its 0% error--a cross-project benchmark would be necessary).

17) Message boards : Number crunching : SuperComputer (Message 179622)
Posted 18 Oct 2005 by Scott Brown
Post:
I would also point out that mere population growth is something to be factored into any discussion of global warming. While it is true that the period of the industrial revolution was rampant with the burning of wood and coal, to say that we do not do these things today is rather naive. In many (if not most) developing nations, emissions levels far exceed those of the now-industrialized West. More importantly, there are simply a lot more people doing these things than there were over 100 years ago. The potential impact of fossil fuel emissions is considerably greater today simply because more people are using them (even adjusting for the greater efficiency of today's use--which is quite variable from nation to nation).

A further point that is frequently ignored is the increased CO2 emissions from the increased population itself. While a few million extra people might have a very small effect, the 6-billion-plus world human population puts out considerably more CO2 than even 100 years ago (world population didn't reach 1 billion until 1800, but grew from 5 to 6 billion in 12 years [1987-1999]). Added to this are the exponentially growing numbers of domesticated animals (both pets and livestock) that also emit CO2, and the clearing of land (eliminating CO2-consuming trees) to house and feed these growing human and animal populations.

That said, I am with Paul here...the 'jury' is still largely out on this debate. What is not in question is that there is substantial global climate change (we don't question events such as the ice ages). The real question is what effect humans are having on that change and what is simply the result of natural events (e.g., vulcanism, etc.). I run CPDN because I think that it is a start at answering this question, and because it may be the best option we have to be apolitical (or at least mostly apolitical) given its very public nature and reliance on a global community of 'volunteers'.
18) Message boards : Number crunching : SuperComputer (Message 178164)
Posted 14 Oct 2005 by Scott Brown
Post:
No, because if you were doing the same work with a supercomputer you would be using the same redundancy, or some other mechanism to check errors. ...

Sorry, I'll disagree on that one.

s@h imposes such high redundancy because we are all "untrusted" clients. Berkeley have no idea what our equipment is or what it does! Or even if we are OCed to a random heat death or just outright cheaters.

In contrast, a supercomputer is well controlled and already tested/proven. There's likely ECC as part of the hardware to guarantee the results. There can be much less or even zero WU redundancy.


s@h-classic and Boinc-s@h do remarkably well for such an uncontrolled environment!

Regards,
Martin


No...I think Paul is right here. If you are comparing raw performance, redundancy is irrelevant. More specifically, redundancy is project-specific. CPDN, for example, has virtually no redundancy built in.
19) Message boards : Number crunching : BOINC Popularity vs Classic SETI@Home (Message 176627)
Posted 11 Oct 2005 by Scott Brown
Post:
How many of the PCs crunching BOINC projects would have to be left "on" 24/7 if they were not being used for distributed computing projects? Virtually none. That is why it's appropriate to question the amount of resources expended on these projects. I would guess that the majority of computers being used for BOINC projects average only a couple of hours a day of "regular" computer use - some more than others, but there are many machines out there that do nothing but DC. The only PCs that really have to be on 24/7 are servers and gateways...how many of you can say your computers would have to be "on" all the time if not for DC?


I have one 24/7 machine that would be 24/7 regardless of DC projects (a university office machine that frequently does analyses for me overnight). Two others are on dial-up and are never on 24/7, though overnight runs for the slow process of updates are not unusual.

Thank you, though, for your guess that the majority of machines are only in regular use a couple of hours per day. I suppose we should write to the thousands of businesses, universities, etc. that have people employed 8 hours per day and tell them that they really only need to provide their workers with computers a quarter of the time...think of the overhead they would save!

I would guess that there are many more servers and gateways running BOINC than you suggest. Also, there are many other instances of 'have to be on 24/7' machines (e.g., university computer labs, etc.).
20) Message boards : Number crunching : BOINC Popularity vs Classic SETI@Home (Message 176390)
Posted 10 Oct 2005 by Scott Brown
Post:
When Classic stopped accepting new accounts they had about 5.5 million users. What do you suppose was the total amount of energy used to crunch the 2 billion Classic work units that were done prior to last September? If the most beneficial results of that expense of non-renewable energy were a few "interesting" locations and results, I say it was a waste of valuable resources. I believe in the distributed computing concept, and have supported it myself with significant financial and time investment. But I have to question whether the cost vs. benefit of the projects developed to date really justifies their continuation. I'm not trying to be confrontational; in fact I am really rather sad about reaching this conclusion regarding Seti@Home. I understand that finding verifiable signals was always a long shot.......both because the project was looking at very narrow regions of space and because we may have had our "radio" tuned to the wrong "frequency" (I realize this is an oversimplification but you know what I mean). With respect to S@H in particular, does this "long shot" really justify the expenditure of the very limited resources our earth possesses? And are we really meant to discover these extraterrestrial signals at all?


As has been said for years...SETI@home is about using resources that are already wasted, by utilizing idle computer cycles from computers that would be on anyway. A more appropriate question would be: does the long shot justify the expenditure of additional resources (farm building, 24/7 dedicated SETI machines, etc.)?

