Message boards : Number crunching : About Deadlines or Database reduction proposals
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Let's continue our discussion of the topic here. After some "intense" discussion, totally off topic on the wrong thread, my suggestion to the SETI powers is to squeeze the MB deadline down to 30 days. How could we get that done? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14677 Credit: 200,643,578 RAC: 874 |
An alternative proposal in the same off-topic discussion was to halve the current deadlines. This would make the longest task deadline a little over 26 days, while keeping shorties - short (a little over 10 days). Both proposals would affect MultiBeam tasks only - there was no proposal to change the deadline for AstroPulse tasks. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
That's an even better proposal, and I imagine it is easier to do since it uses the same code - just a simple adjustment of the upper limit used. |
Ville Saari Send message Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
I see no reason to use different deadlines for different tasks. AstroPulse and all variants of MultiBeam should have the same deadline as they all end up in the same queue anyway. The only exception is if some task type is used for a different purpose and there is some specific need for the results to be received quickly after the original observation. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"An alternative proposal in the same off-topic discussion was to halve the current deadlines. This would make the longest task deadline a little over 26 days, while keeping shorties - short (a little over 10 days)." . . I rather like the idea of 28 days, an even 4 weeks. That should not tread on very many toes (if any, in reality) but should weed out most of the delinquent hosts keeping WUs in limbo. . . I 'spoke' to Mr Kevvy about moving the off-topic messages from the panic thread to here, just to clean up the mess there, nothing else. He is willing to take care of it if we/someone tags them. Stephen ? ? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"I 'spoke' to Mr Kevvy about moving the off-topic messages from the panic thread to here, just to clean up the mess there, nothing else. He is willing to take care of it if we/someone tags them." Uggh, that's 3 pages' worth of posts. I certainly would not like to have 225 messages about post moves and deletions show up in my PM basket. Let them lie where they are. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13841 Credit: 208,696,464 RAC: 304 |
From the other thread: "Just take a look at the graph before making ANY assumption about 'having no effect', 'Because of long deadlines' - these two are totally and utterly WRONG." I did have a look at your graph, and I stand by my statement. What your graph shows is what is presently occurring, which is a result of the present deadlines. You are graphing how long it takes for a WU to be validated - and those extended times you talk about are the result of a long-deadline WU being sent to multiple systems - not just one, but multiple systems where it times out (on more than one occasion) before it is finally Validated. The slowest system I have been able to find so far (after a very brief search) takes 2 days to process a MB WU. So how could setting the maximum WU deadline to 28 days impact the ability of that system to process work for Seti?

"The truth is, and some do not accept this, that SETI@Home has a POLICY of supporting a very wide range of computer performance, and human activity such as holidays and forgetting to stop a host, infrequent processing and so on. Twenty days would mean about 40% of the tasks sent out would have to be resent, and, as these are probably on hosts that only do a very small number of tasks per year, that means alienating a very large proportion of the user base, which according to many reports is shrinking - do you want to decimate that base overnight?" How does my suggestion affect any host in the manner you suggest? As I pointed out above, there is no system presently crunching that can't return the longest possible running MB WU within 28 days, even if it only spends an hour every other day processing work, so there would be no impact on even the slowest systems' ability to do work for Seti if the MB deadlines were set to a maximum of 28 days. Grant Darwin NT |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13841 Credit: 208,696,464 RAC: 304 |
"I 'spoke' to Mr Kevvy about moving the off-topic messages from the panic thread to here, just to clean up the mess there, nothing else. He is willing to take care of it if we/someone tags them." That gets my vote. There's this thread here for the topic under discussion, and there is a new Panic mode thread for those discussions. Let the old one be. Grant Darwin NT |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13841 Credit: 208,696,464 RAC: 304 |
"I see no reason to use different deadlines for different tasks. AstroPulse and all variants of MultiBeam should have the same deadline as they all end up in the same queue anyway. The only exception is if some task type is used for a different purpose and there is some specific need for the results to be received quickly after the original observation." I agree. 28 days is plenty of time to process work (even for the slowest of slow systems). It also gives people time to get things fixed up/back online if they have issues, without them missing out on Credit for work they have done but couldn't return by the due date because it was an extremely short deadline. Grant Darwin NT |
rob smith Send message Joined: 7 Mar 03 Posts: 22495 Credit: 416,307,556 RAC: 380 |
Thinking about different task types having different deadlines: AstroPulse and MultiBeam arrive in the queue for distribution from different sources, and have a unique identifier in their file names - "ap" as the first two characters for AstroPulse, and either "BLC" or a date for MultiBeam. Thus it is a quick and easy check on the file name to see what deadline to assign (file names are assigned by the splitters). When it comes to MultiBeam it is possible to differentiate VHARs from the rest using the file names; this is down to a historical event - in the very early days of nVidia GPUs they took forever to complete (and were possibly producing incorrect results at the same time). As for the rest, there is no classification done by the splitters - it's either VHAR or not, which might help a little. It would be possible to change the splitter code so that more angle ranges were identified by filename. Then it would become a case of deciding the deadline weighting between the types, not forgetting that data from different sources may require different weightings. None of this goes anywhere near the problem of noise bombs - chunks of data with a very high proportion of noise, which fail early. Sadly it is down to us, who are doing the signal detection, to find them. The splitters would have to process each and every task to detect and filter out noise bombs; that is a few seconds extra on the splitting of every task (remember that the splitters are sitting on a server using very similar chips to our own, but are doing much more than just splitting). The weightings I talked about a few lines back could then be used to assign the deadline to each task - this should be a simple change to the logic, with a few more cases in the existing statement (which currently may be just a simple "if first characters = "ap" then deadline = x otherwise deadline = y").

Certainly do-able, but there are a number of steps to go through before it could be implemented. The process would go something along these lines:
One - characterise the Angular Range for each data source into a set of ranges (already done for Arecibo, but I don't think it has been done for GBT);
Two - establish which range is to be the baseline for the deadline weightings;
Three - establish what the baseline deadline should be;
Four - establish the actual weightings for each of the ranges;
Five - calculate the deadlines based on their weightings and the base deadline;
Six - aggregate groups of ranges that have very similar deadlines;
Seven - modify the splitter code to accommodate the new unique range identifiers. Do this at Beta first;
Eight - test the new code to make sure it is not upsetting anything else (BOINC has lots of side alleys that could cause trouble). Do this at Beta first;
Nine - modify the distribution logic to cope with the multiple choices for assigning deadlines instead of just two. Do this at Beta first;
Ten - if it is working correctly, then migrate to Main.
(Eleven - have a large glass of a suitable relaxing potion.)
Now obviously some of these steps can be done in parallel; I've listed them in some sort of sensible order. And in doing the calculations of the new baselines and their aggregation, it might be found that there is no need to go beyond that step, and just reset the baseline of all MultiBeam to something nearer the 30 day level than the current 55 (or whatever it is) days. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
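The simple two-way check described above might look something like this (a minimal sketch: the "ap" file-name prefix is from the post, but the deadline values and the VHAR branch are illustrative assumptions, not the real splitter code):

```python
# Hypothetical sketch of filename-based deadline assignment.
# The "ap" prefix convention for AstroPulse comes from the post;
# the deadline values and the VHAR case are invented placeholders.
def deadline_days(filename: str, is_vhar: bool = False) -> int:
    """Pick a task deadline (in days) from its file name."""
    if filename.lower().startswith("ap"):
        return 30          # AstroPulse: assumed deadline
    if is_vhar:
        return 10          # VHAR "shorty": assumed shorter deadline
    return 28              # all other MultiBeam tasks

print(deadline_days("ap_08mr10aa.12345"))        # 30
print(deadline_days("blc35_2bit_guppi"))         # 28
```

Adding more angle-range categories, as suggested, would just mean more branches (or a lookup table) keyed on whatever new identifier the splitters encode in the name.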
rob smith Send message Joined: 7 Mar 03 Posts: 22495 Credit: 416,307,556 RAC: 380 |
"Uggh, that's 3 pages' worth of posts. I certainly would not like to have 225 messages about post moves and deletions show up in my PM basket." @Keith: I'm with you on that. From the moderators' view there is another thought - there may be some posts buried in the middle that should stay in the server issues thread, and moving things out and back is not a pleasant thing to do. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14677 Credit: 200,643,578 RAC: 874 |
Rob - the deadlines are set automatically by direct reference to the Angle Range, using variations on Joe's formula. That code has been in place for literally decades, although tweaked along the way. File names are not on the roadmap for that pathway. The .vlar extension was a kludge for the next stage in the process - the distribution of tasks by the scheduler. It was a quick way of identifying which tasks ran painfully slowly on NVidia GPUs. The name designation doesn't actually match the technical definition of VLAR - AR <= telescope beam width, but it was useful in its day. We're not using it any more. |
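Since the post says deadlines come directly from the Angle Range rather than from file names, the calculation presumably has roughly this shape (illustrative only: "Joe's formula" is not quoted in the thread, so the thresholds and scale factors below are made-up placeholders, not the real code):

```python
# Placeholder sketch of an AR-driven deadline calculation. The real
# "Joe's formula" is not given in the thread; the thresholds (1.127,
# 0.13), the base deadline, and the weightings here are all assumptions
# that only show the general shape of such a function.
def deadline_days(ar: float, base_days: float = 53.0) -> float:
    """Longer-running (lower-AR) tasks get longer deadlines."""
    if ar > 1.127:              # fast-running "shorty" (assumed threshold)
        return base_days * 0.2
    if ar < 0.13:               # slow, VLAR-ish task (assumed threshold)
        return base_days
    return base_days * 0.5      # mid-range ARs (assumed weighting)

for ar in (2.5, 0.42, 0.01):
    print(ar, deadline_days(ar))
```

Under this shape, tightening deadlines project-wide would indeed just mean lowering the base value, exactly as suggested later in the thread.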
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13841 Credit: 208,696,464 RAC: 304 |
From the other thread: "Yes, reduced deadlines would reduce the amount of work out in the field. But so would turning off GPU-spoofing." That was my feeling too, till Tbar posted a comparison between 2 systems with almost identical hardware & applications, one spoofed, the other not. Keep in mind - the faster you return work, the higher your Pendings. The longer it takes you to return work, the greater the chance your wingman has already returned theirs & so it will go (pretty much) straight to Validated. The end result is that the load on the database is pretty much the same. The Task list All numbers - which is the load on the database (In Progress + Pendings + Inconclusives + Valids etc) - for both systems were within a few hundred of each other. The spoofed system had a much higher In Progress number, with a much lower Validation Pending number. The un-spoofed system had a much lower In Progress number, with much higher Validation Pending numbers. But overall, spoofed v unspoofed, for a given system & application the All numbers were pretty much the same; it's just a difference in status (In Progress v Validation Pending). Basically, the better a system performs, the greater the load on the database. But so would the same amount of work being done (WUs per hour) if it were being done by many, many more slower systems - with the added load of keeping track of all those extra systems, and all those extra Scheduler requests, of course. Grant Darwin NT |
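The comparison above can be reduced to a sum (the numbers here are invented placeholders; only the pattern - spoofed hosts carry more In Progress and fewer Pending rows, un-spoofed the reverse, with similar totals - is from the post):

```python
# Sketch of the database-load comparison described above: a host's
# footprint is roughly the sum of every task row it currently owns.
# All figures are invented placeholders, not Tbar's actual numbers.
def db_rows(in_progress: int, pending: int, inconclusive: int, valid: int) -> int:
    return in_progress + pending + inconclusive + valid

spoofed   = db_rows(in_progress=6400, pending=1200, inconclusive=150, valid=2300)
unspoofed = db_rows(in_progress=1600, pending=6000, inconclusive=150, valid=2300)
print(spoofed, unspoofed)  # similar totals, different status mix
```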
W-K 666 Send message Joined: 18 May 99 Posts: 19361 Credit: 40,757,560 RAC: 67 |
"The slowest system I have been able to find so far (after a very brief search) takes 2 days to process a MB WU. So how could setting the maximum WU deadlines to 28 days impact the ability of that system to process work for Seti?" Assuming you mean the task took 48 hrs or so to be crunched: then under the original assumption by Dr. D. A of volunteers donating 1 hr/day, that would specify a 48-day deadline. Are we asking for the 1 hr/day assumption to be modified to 2 hr/day? |
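The arithmetic behind that assumption can be made explicit (a trivial sketch; the 48-hour crunch time and the 1 hr/day and 2 hr/day duty cycles are the figures from the post):

```python
# The deadline implied by a task's crunch time and an assumed daily
# donation of CPU hours, per the 1 hr/day reasoning above.
def implied_deadline_days(crunch_hours: float, hours_per_day: float) -> float:
    return crunch_hours / hours_per_day

print(implied_deadline_days(48, 1))  # 48.0 - the original 1 hr/day assumption
print(implied_deadline_days(48, 2))  # 24.0 - a 2 hr/day assumption
```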
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13841 Credit: 208,696,464 RAC: 304 |
"Are we asking for the 1 hr/day assumption to be modified to 2 hr/day?" You can forward that proposal if you like. But since screen savers are no longer a thing, I don't see how it's relevant these days. Grant Darwin NT |
W-K 666 Send message Joined: 18 May 99 Posts: 19361 Credit: 40,757,560 RAC: 67 |
"'Are we asking for the 1 hr/day assumption to be modified to 2 hr/day?' You can forward that proposal if you like." I'm not making that proposal, but the suggestion of reducing the longest deadlines to 28 days is. Did the screensaver make that much difference? Can't say I was aware of that; I thought the added load was about 5%. The list of participating CPUs is at https://setiathome.berkeley.edu/cpu_list.php |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13841 Credit: 208,696,464 RAC: 304 |
"Did the screensaver make that much difference?" It did make a difference, but the point I was trying to make is that many of the original ideas/assumptions are no longer relevant. People no longer use screen savers; there is no need. So an hour of screen saver time a day really isn't relevant any more. Not to mention it can now be run on everything from phones, tablets, laptops, desktops, servers etc., when it was originally just for people's desktop or laptop computers. I'm surprised someone hasn't got it running on their Smart TV yet. Watch TV and look for aliens all at the same time. Grant Darwin NT |
W-K 666 Send message Joined: 18 May 99 Posts: 19361 Credit: 40,757,560 RAC: 67 |
Why are the initial assumptions no longer relevant? Is it wrong to assume that there are hosts out there that are only switched on for a few hours/day and, because their performance is low, are set to suspend when keyboard or mouse activity is detected? I know for a fact that my youngest's desktop is not switched on at least 3 days per week, and on Saturdays it is usually only on for a few hours in the afternoon. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13841 Credit: 208,696,464 RAC: 304 |
"Why are the initial assumptions no longer relevant?" I mentioned that in the previous post. "Is it wrong to assume that there are hosts out there that are only switched on for a few hours/day and because their performance is low are set to suspend when keyboard or mouse is detected?" So you're saying we should extend deadlines even further to allow for systems that are very slow & might only run for an hour a week or so? Personally I'd rather we continue to cater for the vast majority - average systems which would only crunch work for a few hours most days - than cater to extreme outliers. If a system can't return a single WU within 28 days then I really don't see a need to cater for it. Catering for the widest possible range of hardware and use cases does not mean catering for all possible hardware & use cases. There will always have to be a cutoff point. Grant Darwin NT |
rob smith Send message Joined: 7 Mar 03 Posts: 22495 Credit: 416,307,556 RAC: 380 |
Thanks Richard. I knew that VLAR as a tag was redundant (I'm blaming low caffeine levels for typing VHAR in the post Richard refers to). So there is already a correction of deadline for "anticipated run time" in place; all that needs to happen is for the reference value to be adjusted to give a lower deadline. But when thinking about how low to take the deadline, remember that W-K 666's host is probably in the top 1000, so not that far down the list, and we are seeing people who only run their computers part-time on SETI. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.