About Deadlines or Database reduction proposals

juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2033842 - Posted: 24 Feb 2020, 22:48:32 UTC
Last modified: 24 Feb 2020, 23:24:11 UTC

Let's continue our discussions about the topic here.

After some "intense" and totally off-topic discussion in the wrong thread, my suggestion to the SETI powers is to squeeze the deadline of the MB down to 30 days at most. How could we get that done?
ID: 2033842 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2033846 - Posted: 24 Feb 2020, 23:05:35 UTC

An alternative proposal in the same off-topic discussion was to halve the current deadlines. This would make the longest task deadline a little over 26 days, while keeping shorties - short (a little over 10 days).

Both proposals would affect MultiBeam tasks only - there was no proposal to change the deadline for AstroPulse tasks.
ID: 2033846 · Report as offensive     Reply Quote
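
Worked example for the two proposals above. This is a minimal sketch, not project code: it assumes current MultiBeam deadlines of roughly 21 days for shorties and 53 days for the longest tasks (figures inferred by doubling the halved values quoted above, not read from the server configuration), and the function names are hypothetical.

```python
def cap_deadline(days, cap=30.0):
    """Proposal 1: leave a deadline alone unless it exceeds the cap."""
    return min(days, cap)


def halve_deadline(days):
    """Proposal 2: scale every deadline by one half."""
    return days / 2.0


# Assumed current deadlines in days (illustrative values only).
current = {"shorty": 21.0, "longest": 53.0}

for name, days in current.items():
    print(name, cap_deadline(days), halve_deadline(days))
```

Under these assumptions a 30-day cap only touches the long tasks, while halving shortens every task in proportion.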
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2033850 - Posted: 24 Feb 2020, 23:23:26 UTC - in response to Message 2033846.  

That's an even better proposal, and I imagine it is easier to do, since it uses the same code with just a simple adjustment of the upper limit used.
ID: 2033850 · Report as offensive     Reply Quote
Ville Saari
Avatar

Send message
Joined: 30 Nov 00
Posts: 1158
Credit: 49,177,052
RAC: 82,530
Finland
Message 2033856 - Posted: 25 Feb 2020, 0:18:27 UTC

I see no reason to use different deadlines for different tasks. AstroPulse and all variants of MultiBeam should have the same deadline as they all end up in the same queue anyway. The only exception is if some task type is used for a different purpose and there is some specific need for the results to be received quickly after the original observation.
ID: 2033856 · Report as offensive     Reply Quote
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2033857 - Posted: 25 Feb 2020, 0:21:45 UTC - in response to Message 2033846.  
Last modified: 25 Feb 2020, 0:24:44 UTC

An alternative proposal in the same off-topic discussion was to halve the current deadlines. This would make the longest task deadline a little over 26 days, while keeping shorties - short (a little over 10 days).

Both proposals would affect MultiBeam tasks only - there was no proposal to change the deadline for AstroPulse tasks.


. . I rather like the idea of 28 days, an even 4 weeks. That should not tread on very many toes (if any in reality) but should weed out most of the delinquent hosts keeping WUs in limbo.

. . I 'spoke' to Mr Kevvy about moving the off topic messages from the panic thread to here, just to clean up the mess there if nothing else. He is willing to take care of it if we/someone tags them.

Stephen

ID: 2033857 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2033861 - Posted: 25 Feb 2020, 0:40:25 UTC - in response to Message 2033857.  

. I 'spoke' to Mr Kevvy about moving the off topic messages from the panic thread to here, just to clean up the mess there if nothing else. He is willing to take care of it if we/someone tags them.

Uggh, that's 3 pages worth of posts. I certainly would not like to have 225 messages about post moves and deletions show up in my PM basket.

Let them lie where they are.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2033861 · Report as offensive     Reply Quote
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2033889 - Posted: 25 Feb 2020, 8:09:02 UTC

From the other thread.
Just take a look at the graph before making ANY assumption about "having no effect", "Because of long deadlines" - these two are totally and utterly WRONG.
I did have a look at your graph, and i stand by my statement. What your graph shows is what is presently occurring, which is a result of the present deadlines.

You are graphing how long it takes for a WU to be validated- and those extended times you talk about are the result of a long deadline WU being sent to multiple systems- not just one - but multiple systems where it times out (on more than one occasion) before it is finally Validated.
The slowest system i have been able to find so far (after a very brief search) takes 2 days to process a MB WU. So how could setting the maximum WU deadlines to 28 days impact the ability of that system to process work for Seti?


The truth, and some do not accept this, is that SETI@Home has a POLICY of supporting a very wide range of computer performance and human activity, such as holidays and forgetting to stop a host, infrequent processing and so on. Twenty days would mean about 40% of the tasks sent out would have to be resent, and, as these are probably on hosts that only do a very small number of tasks per year, that means alienating a very large proportion of the user base, which according to many reports is shrinking - do you want to decimate that base overnight?
How does my suggestion affect any host in the manner you suggest?
As i pointed out above- there is no system presently crunching that can't return the longest possible running MB WU within 28 days, even if it only spends an hour every other day processing work, so there would be no impact on the ability of even the slowest of systems to do work for Seti if the MB deadlines were set to a maximum of 28 days.
Grant
Darwin NT
ID: 2033889 · Report as offensive     Reply Quote
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2033890 - Posted: 25 Feb 2020, 8:10:51 UTC - in response to Message 2033861.  

. I 'spoke' to Mr Kevvy about moving the off topic messages from the panic thread to here, just to clean up the mess there if nothing else. He is willing to take care of it if we/someone tags them.

Uggh, that's 3 pages worth of posts. I certainly would not like to have 225 messages about post moves and deletions show up in my PM basket.

Let them lie where they are.
That gets my vote.
There's this thread here for the topic under discussion, there is a new Panic mode thread for those discussions.
Let the old one be.
Grant
Darwin NT
ID: 2033890 · Report as offensive     Reply Quote
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2033891 - Posted: 25 Feb 2020, 8:13:21 UTC - in response to Message 2033856.  
Last modified: 25 Feb 2020, 8:13:53 UTC

I see no reason to use different deadlines for different tasks. AstroPulse and all variants of MultiBeam should have the same deadline as they all end up in the same queue anyway. The only exception is if some task type is used for a different purpose and there is some specific need for the results to be received quickly after the original observation.
I agree.
28 days is plenty of time to process work (even for the slowest of slow systems). It also gives people time to get things fixed up/back online if they have issues, without them missing out on Credit for work they have done but couldn't return by the due date because it was an extremely short deadline.
Grant
Darwin NT
ID: 2033891 · Report as offensive     Reply Quote
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2033892 - Posted: 25 Feb 2020, 8:16:45 UTC

Thinking about different task types having different deadlines:

AstroPulse and MultiBeam arrive in the queue for distribution from different sources, and have a unique identifier in their file name - "ap" as the first two characters for AstroPulse and either "BLC" or a date for MultiBeam. Thus it is a quick and easy check on the file name to see what deadline to assign (file names are assigned by the splitters).
When it comes to MultiBeam it is possible to differentiate VHARs from the rest by their file names; this is down to a historical event - in the very early days of nVidia GPUs they took forever to complete (and were possibly producing incorrect results at the same time). As for the rest, there is no classification done by the splitters - it's either VHAR or not; which might help a little.
It would be possible to change the splitter code such that there were more angle ranges identified by filename. Then it would become a case of deciding the deadline weighting between the types, not forgetting that data from different sources may require different weightings.
This goes nowhere near the problem of noise bombs - these are chunks of data that have a very high proportion of noise, so fail early. Sadly it's down to us, who are doing the signal detection, to find them. The splitters would have to process each and every task to detect and filter out noise bombs; that is a few seconds extra on the splitting of every task (remember that the splitters are sitting on a server using very similar chips to our own, but are doing much more than just splitting).

The weightings I talked about a few lines back could then be used to assign the deadline to each task - this should be a simple change to the logic with a few more cases in the existing statement (not forgetting, of course, that currently it may just be a simple "if first characters = "ap" then deadline = x otherwise deadline = y").

Certainly do-able, but there are a number of steps to go through before it could be implemented. The process would go something along these lines:
One - characterise the Angular Range for each data source into a set of ranges (already done for Arecibo, but I don't think it has been done for GBT);
Two - establish which range is to be the baseline for the deadline weightings;
Three - establish what the baseline deadline should be;
Four - establish the actual baseline weightings for each of the ranges;
Five - calculate the deadlines based on their weightings and the base deadline;
Six - aggregate groups of ranges that have very similar deadlines;
Seven - modify the splitter code to accommodate the new unique range identifier. Do this at Beta first;
Eight - test the new code to make sure it is not upsetting anything else (BOINC has lots of side alleys that could cause trouble). Do this at Beta first;
Nine - modify the distribution logic to cope with the multiple choices for assigning deadlines instead of just two. Do this at Beta first;
Ten - If it is working correctly then migrate to Main.
(Eleven - have a large glass of suitable relaxing potion.)

Now obviously some of these steps can be done in parallel; I've listed them out in some sort of sensible order. And in doing the calculations of new baselines and their aggregation it might be found that there is no need to actually go beyond that step, and just reset the baselines of all MultiBeam to something nearer the 30 day level than the current 55 (or whatever it is) days.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2033892 · Report as offensive     Reply Quote
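
Steps Four to Six in the list above can be sketched as follows. This is only an illustration under stated assumptions: the range labels, the weightings, the 20-day base deadline and the 2-day grouping tolerance are all hypothetical, and nothing here reflects the actual splitter or scheduler code.

```python
def deadlines_from_weights(base_days, weights):
    """Step Five: deadline = base deadline x weighting for each range."""
    return {name: base_days * w for name, w in weights.items()}


def aggregate(deadlines, tolerance_days=2.0):
    """Step Six: group ranges whose deadlines lie within tolerance_days
    of the first (shortest) deadline already in the group."""
    groups = []
    for name, days in sorted(deadlines.items(), key=lambda kv: kv[1]):
        if groups and days - groups[-1]["days"] <= tolerance_days:
            groups[-1]["ranges"].append(name)
        else:
            groups.append({"days": days, "ranges": [name]})
    return groups


# Hypothetical weightings relative to a baseline range with weight 1.0.
weights = {"vhar": 0.4, "mid_high": 0.95, "mid": 1.0, "vlar": 1.6}
deadlines = deadlines_from_weights(20.0, weights)
groups = aggregate(deadlines)

for group in groups:
    print(group["ranges"], round(group["days"], 1))
```

With these made-up numbers the two middle ranges collapse into one group, which is the kind of result that would suggest fewer distinct deadlines are needed than ranges.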
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2033893 - Posted: 25 Feb 2020, 8:20:06 UTC

Uggh, that's 3 pages worth of posts. I certainly would not like to have 225 messages about post moves and deletions show up in my PM basket.

Let them lie where they are.

@Keith:
I'm with you on that.
From the moderator's view there is another thought - there may be some posts buried in the middle that should stay in the server issues thread, and moving things out and back is not a pleasant thing to do.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2033893 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2033894 - Posted: 25 Feb 2020, 8:39:49 UTC - in response to Message 2033892.  

Rob - the deadlines are set automatically by direct reference to the Angle Range, using variations on Joe's formula. That code has been in place for literally decades, although tweaked along the way. File names are not on the roadmap for that pathway.

The .vlar extension was a kludge for the next stage in the process - the distribution of tasks by the scheduler. It was a quick way of identifying which tasks ran painfully slowly on NVidia GPUs. The name designation doesn't actually match the technical definition of VLAR - AR <= telescope beam width, but it was useful in its day. We're not using it any more.
ID: 2033894 · Report as offensive     Reply Quote
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2033897 - Posted: 25 Feb 2020, 8:41:34 UTC

From the other thread-
Yes, reduced deadlines would reduce the amount of work out in the field. But so would turning off GPU-spoofing
That was my feeling too, till Tbar posted a comparison between 2 systems with almost identical hardware & applications. One spoofed, the other not.

Keep in mind- the faster you return work, the higher your Pendings. The longer it takes you to return work the greater the chance your wingman has already returned theirs & so it will go (pretty much) straight to Validated.

The end result is the load on the database is pretty much the same. The Task list All numbers- which is the load on the database (In Progress + Pendings + Inconclusives + Valids etc)- for both systems was within a few hundred of each other. The spoofed system had a much higher In Progress number, with a much lower Validation Pending number. The un-spoofed system had a much lower In Progress number, with much higher Validation Pending numbers. But overall, spoofed v unspoofed, for a given system & application the All numbers were pretty much the same, it's just a difference in status (In Progress v Validation Pending).

Basically the better a system performs, the greater the load on the database. But so would the same amount of work being done (WUs per hour), if it were being done by many, many more slower systems- with the added load of keeping track of all those extra systems, and all those extra Scheduler requests, of course.
Grant
Darwin NT
ID: 2033897 · Report as offensive     Reply Quote
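
The accounting described in the post above can be sketched like this. The counts are hypothetical, chosen only to mirror the pattern reported (they are not Tbar's actual figures), and the sketch leaves out the other states covered by the "etc", such as errors and invalids.

```python
from dataclasses import dataclass


@dataclass
class TaskCounts:
    in_progress: int
    pending: int
    inconclusive: int
    valid: int

    def all_tasks(self):
        """'All' on the task list: every result the database has to track."""
        return self.in_progress + self.pending + self.inconclusive + self.valid


# Hypothetical counts for two otherwise identical hosts.
spoofed = TaskCounts(in_progress=6000, pending=2500, inconclusive=100, valid=1400)
unspoofed = TaskCounts(in_progress=900, pending=7400, inconclusive=100, valid=1400)

print(spoofed.all_tasks(), unspoofed.all_tasks())
```

The split between In Progress and Validation Pending differs sharply, but the totals, which are what the database carries, end up close to each other.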
W-K 666 Project Donor
Volunteer tester

Send message
Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2033899 - Posted: 25 Feb 2020, 9:01:13 UTC - in response to Message 2033889.  

The slowest system i have been able to find so far (after a very brief search) takes 2 days to process a MB WU. So how could setting the maximum WU deadlines to 28 days impact the ability of that system to process work for Seti?

Assuming you mean the task took 48 hrs or so to be crunched, then under the original assumption by Dr. D. A. of volunteers donating 1 hr/day, that would specify a 48 day deadline.

Are we asking for the 1 hr/day assumption to be modified to 2 hr/day?
ID: 2033899 · Report as offensive     Reply Quote
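
The arithmetic in the post above, as a small sketch. It assumes the host works on a single task and ignores download, upload and queue time; the function names are hypothetical.

```python
import math


def days_to_finish(crunch_hours, hours_per_day):
    """Whole calendar days needed to accumulate crunch_hours of processing."""
    return math.ceil(crunch_hours / hours_per_day)


def min_hours_per_day(crunch_hours, deadline_days):
    """Average daily processing time needed to finish within the deadline."""
    return crunch_hours / deadline_days


print(days_to_finish(48, 1))                # 48-hour task at 1 hr/day
print(days_to_finish(48, 2))                # same task at 2 hr/day
print(round(min_hours_per_day(48, 28), 2))  # hr/day needed for a 28-day deadline
```

On these assumptions a 48-hour task needs a little under 2 hr/day on average to fit a 28-day deadline, which is why the question of revising the 1 hr/day assumption comes up.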
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2033900 - Posted: 25 Feb 2020, 9:11:16 UTC - in response to Message 2033899.  

Are we asking for the 1 hr/day assumption to be modified to 2 hr/day?
You can forward that proposal if you like.
But since screen savers are no longer a thing, i don't see how it's relevant these days.
Grant
Darwin NT
ID: 2033900 · Report as offensive     Reply Quote
W-K 666 Project Donor
Volunteer tester

Send message
Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2033901 - Posted: 25 Feb 2020, 9:20:07 UTC - in response to Message 2033900.  
Last modified: 25 Feb 2020, 9:26:01 UTC

Are we asking for the 1 hr/day assumption to be modified to 2 hr/day?
You can forward that proposal if you like.
But since screen savers are no longer a thing, i don't see how it's relevant these days.

I'm not making that proposal, but the suggestion of reducing the longest deadlines to 28 days is.

Did the screensaver make that much difference?
Can't say I was aware of that, I thought the added load was about 5%.

The list of participating CPUs is at https://setiathome.berkeley.edu/cpu_list.php
ID: 2033901 · Report as offensive     Reply Quote
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2033902 - Posted: 25 Feb 2020, 9:30:14 UTC - in response to Message 2033901.  

Did the screensaver make that much difference?
Can't say I was aware of that, I thought the added load was about 5%.
It did make a difference, but the point i was trying to make is that many of the original ideas/assumptions are no longer relevant. People no longer use screen savers, there is no need. So an hour of screen saver time a day really isn't relevant any more. Not to mention it can now be run on everything from phones, tablets, laptops, desktops, servers etc when it was originally just for people's desktop or laptop computer.
I'm surprised someone hasn't got it running on their Smart TV yet. Watch TV, and look for aliens all at the same time.
Grant
Darwin NT
ID: 2033902 · Report as offensive     Reply Quote
W-K 666 Project Donor
Volunteer tester

Send message
Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2033903 - Posted: 25 Feb 2020, 9:41:43 UTC - in response to Message 2033902.  

Why are the initial assumptions no longer relevant?
Is it wrong to assume that there are hosts out there that are only switched on for a few hours/day and, because their performance is low, are set to suspend when keyboard or mouse activity is detected?

I know for a fact that my youngest's desktop is not switched on at least 3 days per week and on Saturdays is usually only on for a few hours in the afternoon.
ID: 2033903 · Report as offensive     Reply Quote
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 2033904 - Posted: 25 Feb 2020, 9:54:50 UTC - in response to Message 2033903.  
Last modified: 25 Feb 2020, 9:55:37 UTC

Why are the initial assumptions no longer relevant?
I mentioned that in the previous post.


Is it wrong to assume that there are hosts out there that are only switched on for a few hours/day and, because their performance is low, are set to suspend when keyboard or mouse activity is detected?
I know for a fact that my youngest's desktop is not switched on at least 3 days per week and on Saturdays is usually only on for a few hours in the afternoon.
So you're saying we should extend deadlines even further to allow for systems that are very slow & might only run for an hour a week or so?

Personally i'd rather we continue to cater for the vast majority- which are average systems which would only crunch work for a few hours most days- than cater to extreme outliers.
If a system can't return a single WU within 28 days then i really don't see a need to cater for it.
Catering for the widest possible range of hardware and use cases does not mean catering for all possible hardware & use cases. There will always have to be a cutoff point.
Grant
Darwin NT
ID: 2033904 · Report as offensive     Reply Quote
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2033906 - Posted: 25 Feb 2020, 10:10:47 UTC - in response to Message 2033894.  

Thanks Richard. I knew that VLAR as a tag was redundant (I'm blaming low caffeine levels for typing VHAR in the post Richard refers to).
So there is already a correction of deadline for "anticipated run time" in place, all that needs to happen is the reference value be adjusted to give a lower deadline.


But when thinking about how low to take the deadline, remember that W-K 666's host is probably in the top 1000, so not that far down the list, and we are seeing people who only run their computers part-time on SETI.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2033906 · Report as offensive     Reply Quote

 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.