Message boards :
Number crunching :
The spoofed client - Whatever about
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
So following the logic of reducing the out-in-the-field number, surely shortening the deadlines would improve that.
Not as much as you might think. I can't remember who did it, but they spent a considerable amount of time & effort and looked at how long it took for their WUs to be Validated. I can't remember the actual numbers, but I think it was something like well over 90% of WUs returned and Validated within 48 hours or so, and the next 5% or so within a week. For all of the WUs people see in their Pending Validation list from 1, 2 or even more months ago, the vast majority of work sent out is actually returned within 48 hours (or less). The only way shorter deadlines would have a significant effect would be to make them something like 96 hours (4 days). But who hasn't had an issue with their ISP? A power failure? System crash? Moved house? Gone on holiday? While a 1 week deadline probably would reduce the number of Results-out-in-the-field significantly, there really needs to be some extra time to allow for problems that crop up to be sorted out, and still allow the work to be rerun rather than just time out. I reckon a 1 month deadline would be a good compromise, but its effect on Results-out-in-the-field would be minimal. Grant Darwin NT |
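Grant's distribution argument can be sketched numerically. The back-of-envelope below uses Little's law (results out in the field scale with the average time a result stays in the field) together with assumed residence times loosely based on the post's "~90% within 48 hours, ~5% within a week" figures; every number here is illustrative, not project data.

```python
# Hypothetical sketch: the out-in-the-field count is proportional to the
# average residence time of a result. Assumed split, loosely from the post:
# 90% return in ~1 day on average, 5% in ~4 days, and the remaining 5%
# sit until the deadline expires.

def avg_residence_days(deadline_days: float) -> float:
    """Average days a result stays in the field under the assumed split."""
    return 0.90 * 1 + 0.05 * 4 + 0.05 * deadline_days

for deadline in (4, 7, 30, 56):  # 4 days, 1 week, ~1 month, 8 weeks
    print(f"{deadline:>2}-day deadline -> "
          f"{avg_residence_days(deadline):.2f} days average residence")
```

Under these assumptions only the ~5% tail is affected by the deadline at all, which matches the post's point: the bulk of the work comes back within 48 hours regardless, so only very aggressive deadlines move the out-in-the-field number much.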
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 30653 Credit: 53,134,872 RAC: 32 |
Host 1; State: All (29150) · In progress (500) · Validation pending (17846) · Validation inconclusive (352) · Valid (10452) · Invalid (0) · Error (0)
Thought it might be interesting to compare to a host at the extreme other end of the world, very slow SLOW: State: All (16) · In progress (2) · Validation pending (9) · Validation inconclusive (1) · Valid (4) · Invalid (0) · Error (0)
You would think that wingmates would on average have already returned, so what is up with the validation pending?! Remember that for each w/u there are at least two files on the server, one for each wingmate. As you can't validate one of your machines against another of your machines, your work can get paired with a super slow machine (there are lots more of them) and have files sitting server side for a long, long time. There will be a sweet spot. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Thought it might be interesting to compare to a host at the extreme other end of the world, very slow
Well, from that link you can explore every wingmate for the pending tasks. The first five have an average turnround time ranging from 0.9 days to 8.33 days - remember that other people's computers may have a low resource share for SETI. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
There will be a sweet spot.
You never will satisfy all the hosts, but you could help a lot by doing something like GPUGrid: they set a maximum time to return a WU. OK, theirs is a couple of days, and it doesn't need to be that extreme, but setting a deadline of around 1 month could fit most of the slow hosts and help squeeze the DB a little. That could be clearly explained in the project rules. I believe even a very slow host could return a WU crunched in 1 month. I was looking at my WU buffer and I have some (more than 100) WUs with a deadline up to Oct 10... about 3 months from now! No need to explain the consequences for the DB size of keeping these. Now imagine if a couple of them go to those "dead hosts": another 3 months of wait? |
rob smith Send message Joined: 7 Mar 03 Posts: 22205 Credit: 416,307,556 RAC: 380 |
It would be helpful to know which group of tasks has a twelve week deadline instead of the more normal six weeks. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
It would be helpful to know which group of tasks has a twelve week deadline instead of the more normal six weeks.
I don't have any in my home cache at the moment. But looking at deadlines revisited, they will be tasks with AR 0.22549 or just above - rare, but not unknown. |
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 30653 Credit: 53,134,872 RAC: 32 |
Thought it might be interesting to compare to a host at the extreme other end of the world, very slow
Well, from that link you can explore every wingmate for the pending tasks. The first five have an average turnround time ranging from 0.9 days to 8.33 days - remember that other people's computers may have a low resource share for SETI.
Very true, and the last two have now timed out and are now unsent. If this little host is "average" and 20% of work units are timing out... look what that does to the database. Since posting that link, it has returned the 2 that were in progress; one validated immediately. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
It would be helpful to know which group of tasks has a twelve week deadline instead of the more normal six weeks.
Application: Local: setiathome_v8 8.01 (cuda90)
Name: blc41_2bit_guppi_58642_03113_PSR_B1133+16_0015.19208.0.22.45.235.vlar
State: Ready to report
Received: Wed 17 Jul 2019 07:53:23 PM EST
Report deadline: Thu 10 Oct 2019 09:36:14 PM EST
Resources: 1 CPU + 1 NVIDIA GPU
Estimated computation size: 6,806 GFLOPs
CPU time: 00:00:15
Elapsed time: 00:00:56
Executable: setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101
Application: Local: setiathome_v8 8.01 (cuda90)
Name: blc41_2bit_guppi_58642_02075_3C295_0008.32290.0.22.45.30.vlar
State: Ready to report
Received: Wed 17 Jul 2019 06:14:45 PM EST
Report deadline: Fri 27 Sep 2019 09:31:39 AM EST
Resources: 1 CPU + 1 NVIDIA GPU
Estimated computation size: 247,577 GFLOPs
CPU time: 00:00:15
Elapsed time: 00:00:52
Executable: setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101
and a lot of others from the same set: 295 WUs on one host alone. I see the same on the other hosts with other sets. Is that what you asked for? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Is that what you ask for?
Well, Task ID 7871006337 is easier for others to look at, but it leads us in the right direction - thanks. "WU true angle range is: 0.026279" corresponds to Joe's graph. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Those Three Top end machines are about the same as before. I looked at the lower producing machines and was able to make a general assessment. Basically, the In Progress number shouldn't exceed the Host's Daily output, or Valid number. If the Host caches more tasks than it can complete in a Day the Database impact rises. This means a Host with a Valid number of 2000 shouldn't have a cache of 6400, or as in the old days, a Host completing 100 tasks a day shouldn't have a cache of 10000. It basically agrees with the suggestion I made a while back, the Maximum allowed cache should be set to One Day. This means a Host completing 50 tasks a day should have a cache of 50, and so forth. This will also lower the Pending number a bit as Pending tasks generally would only stay Pending a Day. If the Wingman doesn't run his machine, at least there will only be One day of tasks on his machine instead of 10 days. Of course lowering the Deadlines would also help, but, there seems to be reluctance to lower deadlines. |
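TBar's sizing rule can be put as a small calculation. The sketch below is an illustrative model only; the function names and the backlog estimate are mine, not anything from the BOINC scheduler.

```python
# Hypothetical model of the cache-sizing rule from the post: cap the
# in-progress cache at the host's daily validated output. If a host
# caches more than it completes in a day, each task lingers server-side
# for roughly cache_size / valid_per_day days before it is returned.

def recommended_cache(valid_per_day: int) -> int:
    """One-day cache cap: cache no more than a day's completed output."""
    return valid_per_day

def backlog_days(cache_size: int, valid_per_day: int) -> int:
    """Rough days a queued task waits on the host (ceiling division)."""
    return max(1, -(-cache_size // valid_per_day))

# The post's example: a host validating 2000 tasks/day with a 6400-task
# cache holds each task roughly 4 days; capped at one day's output it
# holds each task about a day.
print(backlog_days(6400, 2000))                      # 4
print(backlog_days(recommended_cache(2000), 2000))   # 1
```

The same model reproduces the post's other example: a 10,000-task cache on a 100-task-a-day host means each task sits for about 100 days, which is exactly the database impact being described.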
rob smith Send message Joined: 7 Mar 03 Posts: 22205 Credit: 416,307,556 RAC: 380 |
With a max one day cache you automatically alienate those who grab a few days' work at the weekend, process it off-line during the week, then return and refill at the weekend: that is, folks who use their cache properly, not those who just like to have a huge cache that they do not need, apart from polishing their egos by having thousands of tasks sitting around that they will never process. But, as with the current 100+100 cache, people would eventually find ways to work around it. One solution to part of the "problem" is of course to seriously restrict the flow of tasks to hosts that fail to return large numbers by their deadline: one non-error task returned, one task sent out. (I can think of one user whose cache would be reduced from >>30k to single figures if this were implemented, and I dare say there are other hosts doing much the same.) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
tazzduke Send message Joined: 15 Sep 07 Posts: 190 Credit: 28,269,068 RAC: 5 |
With a max one day cache you automatically alienate those who grab a few days work at the weekend, process it off-line during the week then return and refill at the weekend, that is folks who use their cache properly not those who just like to have a huge cache that they do not need apart from polishing their egos by having thousands of tasks sitting around that they will never process. But, as with the current 100+100 cache people would eventually find out ways to work around it. +1 |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
With a max one day cache you automatically alienate those who grab a few days work at the weekend, process it off-line during the week then return and refill at the weekend, that is folks who use their cache properly not those who just like to have a huge cache that they do not need apart from polishing their egos by having thousands of tasks sitting around that they will never process. But, as with the current 100+100 cache people would eventually find out ways to work around it.
This sounds as though it is two Very small groups of users. As far as somehow fixing the 'bad Host' problem, good luck with that. SETI has been trying to 'fix' that problem forever and I doubt it will be fixed anytime soon. As for the other 99% of SETI users, tests indicate the most efficient cache setting is to set the cache just below the Host's Daily output. That setting will result in the best In Progress to Pending validation ratio along with the Smallest Impact on the Database, for those who can actually download that many tasks. For the most productive Hosts, who will still be stuck with just a couple of hours' worth of cache, you can take solace in the knowledge that at least those few people who can't seem to find the Internet for up to a week at a time will still be able to run those dozen or so tasks a day. |
rob smith Send message Joined: 7 Mar 03 Posts: 22205 Credit: 416,307,556 RAC: 380 |
From the evidence in front of me it would appear that the former group (the weekly hitters) represents somewhere between five and ten percent of the users contributing to slow validations. For the second group (those who never return in time) with huge caches, it only takes a very small number of them to trap a very large number of tasks: 10 such hosts, each with 30,000 tasks, is 300,000. Just now there are ~5,400,000 tasks out in the field, so that's about 5.5% of the tasks out in the field trapped on only ten machines, which is far from an insignificant fraction, and well worth releasing back so they can be processed in an acceptable period of time. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
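The arithmetic in that post checks out. A quick sketch using only the figures given there (10 hosts, 30,000 tasks each, ~5,400,000 tasks out in the field; nothing here is independently measured data):

```python
# Figures taken from the post above.
trapped_hosts = 10
tasks_per_host = 30_000
out_in_field = 5_400_000

trapped = trapped_hosts * tasks_per_host   # 300,000 tasks
fraction = trapped / out_in_field          # share of the field trapped

print(f"{trapped:,} trapped tasks = {fraction:.1%} of the field")
# prints "300,000 trapped tasks = 5.6% of the field" (the post rounds to ~5.5%)
```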
pututu Send message Joined: 21 Jul 16 Posts: 12 Credit: 10,108,801 RAC: 6 |
This PC is not mine, but I give a thumbs up for his/her effort to find ways to download more tasks per rig: running multiple BOINC instances on one PC. It appears there are 4 instances running on one 1080 card, most likely due to a memory size limitation, hence the run time for the 1080 appears about 4 times slower than mine. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782554 https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782555 https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782556 https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782557 There are many guides on how to set up multiple BOINC instances. The upcoming Wow! event has certainly created some ingenuity in overcoming the 100-task limit per GPU/CPU. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
There are many guides on how one can setup BOINC multiple instances. The upcoming Wow! event has certainly created some ingenuity in overcoming the 100 tasks limitation per GPU/CPU.
See this: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8773146 This is what a real spoofed-client host can do. Observe there is no slowdown in the crunching times. Now imagine what a fleet of 10 or more of these babies could do. This shows why spoofed hosts must be managed with extreme care, to avoid swamping the servers. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Right, because all 90,000 active users are running Linux, and have the skills and means to do this. Pull the other one, Keith, it's got bells on...
. . Hi Keith
. . Your comment about having to be a member of the GPU Users Group to get the information did come across as a bit elitist at the least and rather political at worst, but I would have thought that Jimbocous has been around long enough to know that most of the serious contributors take their efforts and responsibilities seriously. My fear is not so much about the abuse of ridiculously oversized caches (personally my philosophy has always been that your cache does not need to be bigger than 1 day's work) but rather a plague of screwed up clients misbehaving and causing havoc.
. . Sadly not all of us have programming skills/experience to bring to the party, so this remains beyond our personal abilities. And you are of course right in pointing out that a great proliferation of humungous caches would cause possibly terminal database bloat.
. . Just my 2 cents worth ... Stephen :) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
These restrictions were put into place to stop huge amounts of data being returned after an outage as it caused database problems. That is it, no other reason.
. . Hi Bernie, this is a very good point, but I think not the only reason. Database bloat is another, and the database(s) have bumped their heads on the ceiling on more than one occasion. While that may have been improved with the updates that have been implemented, it is still worth remembering. Stephen :-{ |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
. . Sadly not all of us have programming skills/experience to bring to the party so this remains beyond our personal abilities. And you of course right in pointing out that a great proliferation of humungous caches would cause possibly terminal database bloat.
Hi Stephen, thanks for the comments. Would you believe I flunked out of the only programming class I took in college 40 years ago. I don't have a clue how to program. My brain just does not understand the structure. I started off knowing nothing other than that I could read the instructions provided by Seti. I never needed to know how to program or compile until I discovered a bug in the client and reported it on the BOINC developers' github site. Richard Haselgrove told me I had confirmed a bug he had reported two years earlier that was never acted upon. With my new observation confirming Richard's, that finally got the ball rolling on the problem. But when it came time to test the code, Richard told me I was on my own to compile the client, since we have no Linux developers who know how. There are automated systems in place to generate applications when new code is submitted for Windows, but none for Linux; it has to be done manually. So I had to read the instructions on the BOINC website, and finally, through a lot of trial and error, I succeeded. I can follow along and get the gist of what the code is doing since it is all in high-level C that reads like normal text, so in any particular small part of the code I can actually understand what it is doing. DA or the other developers put in little helpful comments for various sections that for the most part explain what the code is supposed to do. Putting the whole of the code together to get the bigger picture is harder for me. Sleuthing out the parts that control how many tasks are allowed on board a host at any time is easy to find and change. TBar hinted that was all he did to generate the 7.4.44 client.
Change one value and voila, you get 10,000 tasks allowed instead of 1,000. Adding a zero is all I did. It took another team member to piece the spoofed GPU count together with my assistance, since his programming knowledge was only 30 years old, not 40 years like mine. So it was a team effort that created the spoofed client, through a lot of iteration, testing, and trial and error. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
I remember reading somewhere the main reason for the limit was to stop having so many outstanding unused task assignments in the Database. The problem was, there was a number of people who would download 10000 tasks and then never run BOINC again, causing those 10000 tasks to remain in the database for months before timing out. Imagine how many tasks would be in the database if Everyone was allowed to hold 10000 tasks. In reality, very few people need that many tasks, and it really doesn't matter if those people have 500 In Progress tasks or 3000 in progress tasks. What matters to the database is what's in the All column and that column will be quite high on a Host that Needs 10000 tasks a day. Take for example this Host, do you really think it would matter if the In Progress column said 1400 rather than 3200?
. . I believe the built-in limit on reported completed WUs is 256. It can be quite simply reduced, as Richard has suggested many times, and should not allow the heavy hitters any more impact in the short term than more moderate rigs. Of course, with many times more completed WUs to report, the process will be repeated over a much longer period; so while not making the initial recovery much worse, they will extend the impact over a longer time. But as you say, these mega-crunchers are only a relatively small minority. I do believe the major issue, and the reason for the download limits, was database management. As for the offenders of the highlighted 'crime', they still abound; just take a look at the older tasks in your pending and inconclusive lists and you will find lots of them. Hosts sitting on hundreds or thousands of unprocessed WUs, holding them in limbo until they time out. :( Stephen :( |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.