WU Deadlines
| Author | Message |
|---|---|
Jord Joined: 9 Jun 99 Posts: 15175 Credit: 4,362,181 RAC: 3
|
"Even with short connection times, hosts receive more units than can be crunched in that time period. Hosts are doing more than one project, with selectable variables on percentage allocation and selectable switching times, etc." So... Take out multiple units per host. That's no longer necessary. Take out crunching for multiple projects. That's no longer necessary. See the end result: SETI Classic with a new cruncher. It's all he may want. :) |
W-K 666 Joined: 18 May 99 Posts: 13932 Credit: 40,757,560 RAC: 67
|
@PhonAcq I don't believe you have thought your suggestion through before posting it. Even with short connection times, hosts receive more units than can be crunched in that time period. Hosts are doing more than one project, with selectable variables on percentage allocation and selectable switching times, etc. As I said before, I've had units granted credit quicker than a lot of hosts can crunch one unit. The data that would need to be held about each host would far outweigh any advantages gained. If you are sent more than one unit, there is no guarantee that you receive them in the same order, etc. There are areas, even in first-world countries, where broadband is impossible. A little of the work I do sometimes requires me to go to places where there are no public communications whatsoever. The company I work for is contracted to Trinity House, the UK organisation responsible for lighthouses etc. Looking back at the units I received between the 7th and the 17th of July, it is just as feasible for me to suggest that they send out 5 copies of each unit, as I had at least two where 2 hosts failed to meet the deadline, and so they are still in the database. Andy |
Ace Casino Joined: 5 Feb 03 Posts: 285 Credit: 29,750,804 RAC: 15
|
PhonAcq, in one breath you say why not extend the deadline. In the next breath you say the 4th person reporting should not get full, or possibly any, credit. If SETI sends out 4, then regardless of the time it takes, everyone should get credit. 1. If only the first 3 to report got credit (or recognition for their work), there would be a lot of people leaving, feeling their contribution was a waste of time. 2. It would mean that to join this project you would need a new computer every 6-12 months (that's how fast a new computer becomes obsolete). So the fastest would get in first, and tough luck for the people with slower computers. 3. There are a lot of people like myself who keep an old computer running for the fun and science of it. We will never be the top cruncher, nor do we care to be. I don't optimize. I just let the computer run and crunch. Take a deep breath and keep on crunch'n |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0
|
@ML1 If I was that annoyed with something I would be doing something else with my time. @PhonAcq One of the other items in the works is that the scheduler should be trying to send work to computers that have roughly identical turnaround times. But, again, this has no relationship to the quorum size or deadline length. Once again, the point is to try to get the work unit processed and returned in the shortest amount of time so that the database has the fewest possible records stored. The need for the hand-shaking is to be sure that the client and server both keep accurate track of the work to be processed by the client, and that the client in fact does know about the work it is supposed to be doing. The primary cause of the errors has to do with the communication exchange being interrupted and one side "knowing" something that the other side never heard. Loss on the server-to-client side leads to "ghost" work on the server. Loss on the client-to-server side leads to work reported that is never recorded. Again, this is addressing a different set of problems and has nothing to do with deadlines or quorum size. |
ML1 Joined: 25 Nov 01 Posts: 10633 Credit: 7,508,002 RAC: 20
|
"There are those that are so hostile at times I am amazed that they participate in this project at all." Is it all hostility? Or just overzealous passion and frustration? (And sometimes a little ignorance and impatience...) Cheers, Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
|
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1
|
Randy, you are 100% correct, and it is the same for all of us. Getting credit makes the project fun. I think it would be more fun if that credit was also impactful. That is, if you are also one of the first three reporting valid, self-consistent results. So I think BOINC should send work units to clients that have approximately the same successful turnaround times, so that the clients would return at about the same time and the need for computing a wasteful fourth WU is minimized. One way to do this is to keep a score for each client, which is the time within which the client will respond with a valid result with a 95% (two standard deviations) probability. Perhaps with such an approach the need for subsequent handshaking (see below) would go away. Cheers. May this Farce be with You |
Ace Casino Joined: 5 Feb 03 Posts: 285 Credit: 29,750,804 RAC: 15
|
I think I might be a good example for those of you wondering if you will get credit when others have already been granted credit. In this WU the other 3 were already granted credit a week ago: http://setiweb.ssl.berkeley.edu/workunit.php?wuid=21585872 I'm still crunching, should report today, and "will" get credit. Happens all the time to me. In this WU you will see that the other 3 reported on the 24th of July and were granted credit. I came along today, 9 days later, and was granted credit. http://setiweb.ssl.berkeley.edu/workunit.php?wuid=21488447 As a couple of people said above: as long as you report before the deadline you will get credit, even if the others were already granted credit. At least this is my experience. |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0
|
There is no magic here. The quorum size of three makes sense. It also makes sense in the context of this project to attempt to turn the work around as quickly as possible. In that light, because there is a need for three, and a significant percentage of work units lose one result, the decision was made to issue a fourth right away. One of the things that is a given in this DC arena is that there is enough capacity to make this a viable and practical choice. You are correct that there are good ideas, some posted here, some only in the mailing lists, some in the bug base. All of them are known to the project types, and they choose those that they find practical and useful. The only "weakness" is that we have to speculate as to the reasons for the choices. But it is not our project. It is theirs, and it is up to them to make the choices. One of those choices is to limit communication to the participant base. I don't like it, but can certainly understand why. There are those that are so hostile at times I am amazed that they participate in this project at all. You do mix up signals with work units, but that is not terribly important. The choice of 3 results is one of those deceptively simple choices. But, truth be told, 3 is good enough for a project that has little probability of accomplishing anything and there is a small need to say, yes, the analysis is valid. The changes in the hand-shaking between the clients and the servers have no connection to the choice of quorum size; they only address an issue that in the early design stages was thought to be insignificant. Experience has shown that there are enough errors in the process that a more complicated mechanism is called for. Like all systems, BOINC was developed with some ideas on what would work and how much complication needed to be added to the system.
It was felt that the process of data interchange between the server and client would be good enough without additional complexity, and so that is how it was built. Now we know that those assumptions are not good enough. So, complexity is being added to address the issues. For those that don't like the development team this is added "proof" that they don't know what they are doing. Yet this is a logical choice: make the system, field it, find out what does not work, fix that and move on. When I taught computer science classes I always pointed out that when developing code you never "optimize" the code as you are writing it. You write it to be as clean, clear, logical, and structured as possible. Only after the system is operational do you look for the places it is slow. THEY WILL NEVER BE WHERE YOU EXPECT THEM. Anyway ... food for thought ... |
|
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1
|
I've noticed a couple of replies in John's vein, encouraging me to editorialize (but not personalize): This project might be more productive if we spent more time asking how one might do it better and less time accepting things as they are. Bad ideas should be exposed as such, but rebuttal by referencing a 'higher' authority is rarely credible in science. Conversely, with a new idea someone may actually be able to contribute more than hot transistors to the project. (Of course, the higher authority needs to be part of the discussion, or at least be informed of developments somehow.) So from this perspective, why is it that the project administrators have deemed it useful to take 4 signals after there are 3 consistent ones? If true, then why not always take 4, and not 3? There may be a perfectly good reason, but redundancy would seem to be a weak one (given 3 have already been accepted). Can someone point to the mathematics??? The other point is that the algorithm should be designed to be self-adapting and robust, which minimizes the need for ad hoc, artificial time limits. I hope the handshake idea is really going to happen, and is not a rumor, because it will help with an efficient algorithm. May this Farce be with You |
|
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0
|
"The reason for the 4th issue was to handle those occasions where one of the three does not return the result. In this case you now have a work unit and two results in the database for 4 weeks. Even with the 4th result pending, and it is late, it should return within the original two-week deadline. If it does not, the work is purged because we did form a Quorum of Results." If BOINC hands you work, and you get the work done correctly and on time, you should get credit for it, as the project administrators deemed it useful to the project when it was sent out. If it is late, and it becomes part of the quorum anyway, then you still get credit because it happened to be useful even though it was late. If it is late, and the quorum is already formed, then you are out of luck, as you did not meet your end of the agreement (getting the work back on time) and the work was not needed at that point. So, if the project decides to send 4 copies of a WU, then all 4 get credit if they get returned on time. BOINC WIKI |
|
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1
|
"The reason for the 4th issue was to handle those occasions where one of the three does not return the result. In this case you now have a work unit and two results in the database for 4 weeks. Even with the 4th result pending, and it is late, it should return within the original two-week deadline. If it does not, the work is purged because we did form a Quorum of Results." Good to know BOINC's algorithms are going to be improved. And I guess development of the handshake is an indication my position is more or less accepted. In general, a well-engineered system will only use crowbars for the unexpected, not as a matter of course. So the two-week deadline will remain ad hoc in my mind, and I will hope the improved BOINC will be more self-adaptive. May this Farce be with You |
W-K 666 Joined: 18 May 99 Posts: 13932 Credit: 40,757,560 RAC: 67
|
P.S. VISTA (M$ name for new windoze ver.) is an acronym for Of course you can quote it. It actually came to me from my son, registered as Nutter. He has never claimed to be sane, and to prove it he has a BSc in Computer Science. Andy |
ML1 Joined: 25 Nov 01 Posts: 10633 Credit: 7,508,002 RAC: 20
|
P.S. VISTA (M$ name for new windoze ver.) is an acronym for ROTFL! Nice one. Mind if I quote you on that? :) (And as far as I know, M$ is the only system that supports those things!) Cheers, Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
RDC Joined: 17 May 99 Posts: 544 Credit: 1,215,728 RAC: 0
|
"People who can return results quickly but choose long queues in order to increase their earned cobblestones will rethink that strategy. People, when faced with getting no credit for their work due to slow computing, will help the economy by buying broadband and getting a new computer, perhaps a MacTel!" Hope you'll buy me a faster PC and the broadband connection to go with it, since it's your idea. To truly explore, one must keep an open mind... |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0
|
The reason for the 4th issue was to handle those occasions where one of the three does not return the result. In this case you now have a work unit and two results in the database for 4 weeks. Even with the 4th result pending, and it is late, it should return within the original two-week deadline. If it does not, the work is purged because we did form a Quorum of Results. So, in the normal course of events, the work is held in the database for only two weeks and no re-issue is required. The hand-shaking you talk about at the end of your post is a part of the BOINC Daemon and the scheduler code that is being worked on right now. It is being tested on Einstein@Home ... |
|
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1
|
I'm not sure anyone is rebutting my position objectively. To repeat, if the SETI computer system has already decided a result, which seems to occur when there are three self-consistent returned results, the next result is over-redundant and is not needed, by definition. The science result has been determined by this point. Thus the late arrivers are not contributing. They don't deserve full credit. An a priori WU deadline/time limit is not needed. Accepting this, the database requirements may actually be reduced, because the system need not keep the latecomers around. Just assimilate the initial results and clear the space. When there are system malfunctions like those of a couple of weeks ago, people will not lose work units via timeouts. People who can return results quickly but choose long queues in order to increase their earned cobblestones will rethink that strategy. People, when faced with getting no credit for their work due to slow computing, will help the economy by buying broadband and getting a new computer, perhaps a MacTel! What would be good would be for command central to develop a system-derived abort signal, which would be issued to a client to abort any work unit that is already stale. Again, the compute power would not be wasted on overly redundant computing. See also: Is redundant SETI computing absurd? May this Farce be with You |
Pappa Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0
|
John, I remember a similar post, and it was stated that "multiple machines are in "progress", then the first three that hit high/low are tossed and then the "assigned" machines are "averaged"... This is a Very Good Idea! So obviously there is a change that no one has posted... AND looking at some results, you are stating the truth... This really is SAD! They Lied Again! With respect to the last outage, users then have to waste more "Donated" time... Yes, it really is hard to be "Positive" when the people in charge have dictated that "Users" do not matter... We can get More! Look it up. If it has three results that are already validated, abort it. Otherwise, keep crunching; you may be the third result, and prevent the need for someone else to do the crunching. If there are three results and they are not validating, then keep crunching, as you may make a quorum with two of the others and get the science complete and not require another computer to do the crunching. Please consider a Donation to the Seti Project. |
|
SURVEYOR Joined: 19 Oct 02 Posts: 375 Credit: 608,422 RAC: 0
|
I just received 4 WUs with a 2-month deadline for the test project [Beta]. Maybe they are testing the long deadline? Fred BOINC Alpha, BOINC Beta, LHC Alpha, Einstein Alpha
|
W-K 666 Joined: 18 May 99 Posts: 13932 Credit: 40,757,560 RAC: 67
|
"Yes, I forgot that the db keeps growing. But the deadline is not important if the process described below is slightly modified, to wit, as soon as there are three (or four or whatever) consistent results, close the WU. The decision on the WU has been made, de facto. Don't invoke a time limit. Then when there are system issues that delay the return of results no one gets harmed." I think that it would upset a lot of people. I had a unit yesterday morning that I received at about 03:00 UTC, and by the time I looked, at about 08:30 BST in the UK, it had already been granted credit. There have to be a lot of hosts that cannot crunch a unit in four and a half hours; my second machine, which is offline, takes 6h:30m on a good day. Andy P.S. VISTA (M$ name for new windoze ver.) is an acronym for
|
Steve Cressman Joined: 6 Jun 02 Posts: 583 Credit: 65,644 RAC: 0
|
"Yes. Of course, a lot of people are getting credit for that fourth returned result, me included. But I think the credit is not due to us because the decision on the unit was already complete. So the 4th result is not adding to the 'science' except for adding a wee bit more confidence in the result." Your thinking here needs to be revised a little bit. Please go to the Wiki and read about credit and related links. You will then learn there are many reasons it is done the way it is :) 98SE XP2500+ @ 2.1 GHz Boinc v5.8.8 And God said "Let there be light." But then the program crashed because he was trying to access the 'light' property of a NULL universe pointer. |
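
Editor's sketch: John McLeod VII's post in this thread states three cases for credit (on time, late but still needed for the quorum, late after the quorum has formed). The rule as he words it can be written down roughly as follows. This is only an illustration of the post, not BOINC's actual validator code; the function name `earns_credit` and all of its parameters are hypothetical.

```python
from datetime import datetime


def earns_credit(returned_at: datetime, deadline: datetime,
                 valid: bool, quorum_formed_before_return: bool) -> bool:
    """Hypothetical sketch of the credit rule described in the thread."""
    # Work that was not done correctly never earns credit.
    if not valid:
        return False
    # Returned on time: credit, even if three others were granted credit earlier.
    if returned_at <= deadline:
        return True
    # Late: credit only if the result was still needed to form the quorum.
    return not quorum_formed_before_return
```

Under this reading, a fourth result returned nine days after the other three still earns credit as long as it beats the deadline, which matches Ace Casino's experience above.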
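
Editor's sketch: PhonAcq's proposal to keep a per-client score ("the time the client will respond with a valid result with a 95% (two standard deviations) probability") and Paul D. Buck's remark that the scheduler should match hosts with similar turnaround times could look roughly like the Python below. Everything here is an assumption made for illustration: the function names, the one-day bucket width, and the choice of mean plus two sample standard deviations, which takes the post's wording literally and assumes roughly normally distributed turnaround times. It is not how the BOINC scheduler is implemented.

```python
from statistics import mean, stdev


def turnaround_score(turnaround_hours: list[float]) -> float:
    """Hypothetical score: mean turnaround plus two sample standard deviations."""
    if len(turnaround_hours) < 2:
        raise ValueError("need at least two past results")
    return mean(turnaround_hours) + 2 * stdev(turnaround_hours)


def group_by_score(scores: dict[str, float],
                   bucket_hours: float = 24.0) -> dict[int, list[str]]:
    """Put hosts whose scores fall in the same time bucket into one group."""
    groups: dict[int, list[str]] = {}
    for host, score in sorted(scores.items()):
        # Integer bucket index, e.g. 0 for under one day with the default width.
        groups.setdefault(int(score // bucket_hours), []).append(host)
    return groups
```

A scheduler following this idea would then try to draw all copies of a work unit from a single group, so that the results come back at about the same time.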
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.