Message boards :
Number crunching :
The spoofed client - Whatever about
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
So following the logic of reducing the out-in-the-field number, surely shortening the deadlines would improve that.
Not as much as you might think. I can't remember who did it, but they spent a considerable amount of time & effort and looked at how long it took for their WUs to be Validated. I can't remember the actual numbers, but I think it was something like well over 90% of WUs returned and Validated within 48 hours or so, and the next 5% or so within a week. For all of the WUs people see in their Pending Validation list from 1, 2 or even more months ago, the vast majority of work sent out is actually returned within 48 hours (or less). The only way shorter deadlines would have a significant effect would be to make them something like 96 hours (4 days). But who hasn't had an issue with their ISP? A power failure? System crash? Moved house? Gone on holiday? While a 1 week deadline probably would reduce the number of Results-out-in-the-field significantly, there really needs to be some extra time to allow for problems that crop up to be sorted out, and still allow the work to be rerun rather than just time out. I reckon a 1 month deadline would be a good compromise, but its effect on Results-out-in-the-field would be minimal. Grant Darwin NT |
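Grant's distribution argument can be sketched numerically. The back-of-envelope below uses Little's law (results out in the field scale with the average time a result stays in the field) together with assumed residence times loosely based on the post's "~90% within 48 hours, ~5% within a week" figures; every number here is illustrative, not project data.

```python
# Hypothetical sketch: the out-in-the-field count is proportional to the
# average residence time of a result. Assumed split, loosely from the post:
# 90% return in ~1 day on average, 5% in ~4 days, and the remaining 5%
# sit until the deadline expires.

def avg_residence_days(deadline_days: float) -> float:
    """Average days a result stays in the field under the assumed split."""
    return 0.90 * 1 + 0.05 * 4 + 0.05 * deadline_days

for deadline in (4, 7, 30, 56):  # 4 days, 1 week, ~1 month, 8 weeks
    print(f"{deadline:>2}-day deadline -> "
          f"{avg_residence_days(deadline):.2f} days average residence")
```

Under these assumptions only the ~5% tail is affected by the deadline at all, which matches the post's point: the bulk of the work comes back within 48 hours regardless, so only very aggressive deadlines move the out-in-the-field number much.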
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 30653 Credit: 53,134,872 RAC: 32 |
Host 1; State: All (29150) · In progress (500) · Validation pending (17846) · Validation inconclusive (352) · Valid (10452) · Invalid (0) · Error (0)
Thought it might be interesting to compare to a host at the extreme other end of the world, very slow SLOW: State: All (16) · In progress (2) · Validation pending (9) · Validation inconclusive (1) · Valid (4) · Invalid (0) · Error (0)
You would think that wingmates would on average have already returned, so what is up with the validation pending?! Remember that for each w/u there are at least two files on the server, one for each wingmate. As you can't validate one of your machines against another of your machines, your work can get paired with a super slow machine (there are lots more of them) and have files sitting server side for a long, long time. There will be a sweet spot. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Thought it might be interesting to compare to a host at the extreme other end of the world, very slow
Well, from that link you can explore every wingmate for the pending tasks. The first five have an average turnround time ranging from 0.9 days to 8.33 days - remember that other people's computers may have a low resource share for SETI. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
There will be a sweet spot.
You never will satisfy all the hosts, but you could help a lot by doing something like GPUGrid: they set a maximum time to return a WU. OK, theirs is a couple of days, and it doesn't need to be that extreme, but setting a deadline of around 1 month could fit most of the slow hosts and help squeeze the DB a little. That could be clearly explained in the project rules. I believe even a very slow host could return a WU crunched in 1 month. I was looking at my WU buffer and I have some (more than 100) WUs with a deadline up to Oct 10... about 3 months from now! No need to explain the consequences for the DB size of keeping these. Now imagine if a couple of them go to those "dead hosts": another 3 months of wait? |
rob smith Send message Joined: 7 Mar 03 Posts: 22205 Credit: 416,307,556 RAC: 380 |
It would be helpful to know which group of tasks has a twelve week deadline instead of the more normal six weeks. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
It would be helpful to know which group of tasks has a twelve week deadline instead of the more normal six weeks.
I don't have any in my home cache at the moment. But looking at deadlines revisited, they will be tasks with AR 0.22549 or just above - rare, but not unknown. |
Gary Charpentier Send message Joined: 25 Dec 00 Posts: 30653 Credit: 53,134,872 RAC: 32 |
Thought it might be interesting to compare to a host at the extreme other end of the world, very slow
Well, from that link you can explore every wingmate for the pending tasks. The first five have an average turnround time ranging from 0.9 days to 8.33 days - remember that other people's computers may have a low resource share for SETI.
Very true, and the last two have now timed out and are now unsent. If this little host is "average" and 20% of work units are timing out... look what that does to the database. Since posting that link, it has returned the 2 that were in progress; one validated immediately. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
It would be helpful to know which group of tasks has a twelve week deadline instead of the more normal six weeks.
Application: Local: setiathome_v8 8.01 (cuda90)
Name: blc41_2bit_guppi_58642_03113_PSR_B1133+16_0015.19208.0.22.45.235.vlar
State: Ready to report
Received: Wed 17 Jul 2019 07:53:23 PM EST
Report deadline: Thu 10 Oct 2019 09:36:14 PM EST
Resources: 1 CPU + 1 NVIDIA GPU
Estimated computation size: 6,806 GFLOPs
CPU time: 00:00:15
Elapsed time: 00:00:56
Executable: setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101
Application: Local: setiathome_v8 8.01 (cuda90)
Name: blc41_2bit_guppi_58642_02075_3C295_0008.32290.0.22.45.30.vlar
State: Ready to report
Received: Wed 17 Jul 2019 06:14:45 PM EST
Report deadline: Fri 27 Sep 2019 09:31:39 AM EST
Resources: 1 CPU + 1 NVIDIA GPU
Estimated computation size: 247,577 GFLOPs
CPU time: 00:00:15
Elapsed time: 00:00:52
Executable: setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101
and a lot of others from the same set: 295 WUs on one host alone. I see the same on the other hosts with other sets. Is that what you asked for? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Is that what you ask for?
Well, Task ID 7871006337 is easier for others to look at, but it leads us in the right direction - thanks. "WU true angle range is: 0.026279" corresponds to Joe's graph. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Those Three Top end machines are about the same as before. I looked at the lower producing machines and was able to make a general assessment. Basically, the In Progress number shouldn't exceed the Host's Daily output, or Valid number. If the Host caches more tasks than it can complete in a Day the Database impact rises. This means a Host with a Valid number of 2000 shouldn't have a cache of 6400, or as in the old days, a Host completing 100 tasks a day shouldn't have a cache of 10000. It basically agrees with the suggestion I made a while back, the Maximum allowed cache should be set to One Day. This means a Host completing 50 tasks a day should have a cache of 50, and so forth. This will also lower the Pending number a bit as Pending tasks generally would only stay Pending a Day. If the Wingman doesn't run his machine, at least there will only be One day of tasks on his machine instead of 10 days. Of course lowering the Deadlines would also help, but, there seems to be reluctance to lower deadlines. |
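TBar's sizing rule can be put as a small calculation. The sketch below is an illustrative model only; the function names and the backlog estimate are mine, not anything from the BOINC scheduler.

```python
# Hypothetical model of the cache-sizing rule from the post: cap the
# in-progress cache at the host's daily validated output. If a host
# caches more than it completes in a day, each task lingers server-side
# for roughly cache_size / valid_per_day days before it is returned.

def recommended_cache(valid_per_day: int) -> int:
    """One-day cache cap: cache no more than a day's completed output."""
    return valid_per_day

def backlog_days(cache_size: int, valid_per_day: int) -> int:
    """Rough days a queued task waits on the host (ceiling division)."""
    return max(1, -(-cache_size // valid_per_day))

# The post's example: a host validating 2000 tasks/day with a 6400-task
# cache holds each task roughly 4 days; capped at one day's output it
# holds each task about a day.
print(backlog_days(6400, 2000))                      # 4
print(backlog_days(recommended_cache(2000), 2000))   # 1
```

The same model reproduces the post's other example: a 10,000-task cache on a 100-task-a-day host means each task sits for about 100 days, which is exactly the database impact being described.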
rob smith Send message Joined: 7 Mar 03 Posts: 22205 Credit: 416,307,556 RAC: 380 |
With a max one day cache you automatically alienate those who grab a few days' work at the weekend, process it off-line during the week, then return and refill at the weekend: that is, folks who use their cache properly, not those who just like to have a huge cache that they do not need, apart from polishing their egos by having thousands of tasks sitting around that they will never process. But, as with the current 100+100 cache, people would eventually find ways to work around it. One solution to part of the "problem" is of course to seriously restrict the flow of tasks to hosts that fail to return large numbers by their deadline: one non-error task returned, one task sent out. (I can think of one user whose cache would be reduced from >>30k to single figures if this were implemented, and I dare say there are other hosts doing much the same.) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
tazzduke Send message Joined: 15 Sep 07 Posts: 190 Credit: 28,269,068 RAC: 5 |
With a max one day cache you automatically alienate those who grab a few days work at the weekend, process it off-line during the week then return and refill at the weekend, that is folks who use their cache properly not those who just like to have a huge cache that they do not need apart from polishing their egos by having thousands of tasks sitting around that they will never process. But, as with the current 100+100 cache people would eventually find out ways to work around it. +1 |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
With a max one day cache you automatically alienate those who grab a few days work at the weekend, process it off-line during the week then return and refill at the weekend, that is folks who use their cache properly not those who just like to have a huge cache that they do not need apart from polishing their egos by having thousands of tasks sitting around that they will never process. But, as with the current 100+100 cache people would eventually find out ways to work around it.
This sounds as though it is two Very small groups of users. As far as somehow fixing the 'bad Host' problem, good luck with that. SETI has been trying to 'fix' that problem forever and I doubt it will be fixed anytime soon. As for the other 99% of SETI users, tests indicate the most efficient cache setting is to set the cache just below the Host's Daily output. That setting will result in the best In Progress to Pending validation ratio along with the Smallest Impact on the Database, for those who can actually download that many tasks. For the most productive Hosts, who will still be stuck with just a couple of hours' worth of cache, you can take solace in the knowledge that at least those few people who can't seem to find the Internet for up to a week at a time will still be able to run those dozen or so tasks a day. |
rob smith Send message Joined: 7 Mar 03 Posts: 22205 Credit: 416,307,556 RAC: 380 |
From the evidence in front of me it would appear that the former group (the weekly hitters) represents somewhere between five and ten percent of the users contributing to slow validations. For the second group (those who never return in time) with huge caches, it only takes a very small number of them to trap a very large number of tasks: 10 such hosts, each with 30,000 tasks, is 300,000. Just now there are ~5,400,000 tasks out in the field, so that's about 5.5% of the tasks out in the field trapped on only ten machines, which is far from an insignificant fraction, and well worth releasing back so they can be processed in an acceptable period of time. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
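The arithmetic in that post checks out. A quick sketch using only the figures given there (10 hosts, 30,000 tasks each, ~5,400,000 tasks out in the field; nothing here is independently measured data):

```python
# Figures taken from the post above.
trapped_hosts = 10
tasks_per_host = 30_000
out_in_field = 5_400_000

trapped = trapped_hosts * tasks_per_host   # 300,000 tasks
fraction = trapped / out_in_field          # share of the field trapped

print(f"{trapped:,} trapped tasks = {fraction:.1%} of the field")
# prints "300,000 trapped tasks = 5.6% of the field" (the post rounds to ~5.5%)
```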
pututu Send message Joined: 21 Jul 16 Posts: 12 Credit: 10,108,801 RAC: 6 |
This PC is not mine, but I give a thumbs up for his/her effort to find ways to download more tasks per rig: running multiple BOINC instances on one PC. It appears there are 4 instances running on one 1080 card, most likely due to a memory size limitation, hence the run time for the 1080 appears about 4 times slower than mine. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782554 https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782555 https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782556 https://setiathome.berkeley.edu/show_host_detail.php?hostid=8782557 There are many guides on how to set up multiple BOINC instances. The upcoming Wow! event has certainly created some ingenuity in overcoming the 100-task limit per GPU/CPU. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
There are many guides on how one can setup BOINC multiple instances. The upcoming Wow! event has certainly created some ingenuity in overcoming the 100 tasks limitation per GPU/CPU.
See this: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8773146 This is what a real spoofed-client host can do. Observe there is no slowdown in the crunching times. Now imagine what a fleet of 10 or more of these babies could do. This shows why spoofed hosts must be managed with extreme care, to avoid swamping the servers. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Right, because all 90,000 active users are running Linux, and have the skills and means to do this. Pull the other one, Keith, it's got bells on...
. . Hi Keith
. . Your comment about having to be a member of the GPU Users Group to get the information did come across as a bit elitist at the least and rather political at worst, but I would have thought that Jimbocous has been around long enough to know that most of the serious contributors take their efforts and responsibilities seriously. My fear is not so much about the abuse of ridiculously oversized caches (personally my philosophy has always been that your cache does not need to be bigger than 1 day's work) but rather a plague of screwed up clients misbehaving and causing havoc.
. . Sadly not all of us have programming skills/experience to bring to the party, so this remains beyond our personal abilities. And you are of course right in pointing out that a great proliferation of humungous caches would cause possibly terminal database bloat.
. . Just my 2 cents worth ... Stephen :) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
These restrictions were put into place to stop huge amounts of data being returned after an outage as it caused database problems. That is it, no other reason.
. . Hi Bernie, this is a very good point, but I think not the only reason. Database bloat is another, and the database(s) have bumped their heads on the ceiling on more than one occasion. While that may have been improved with the updates that have been implemented, it is still worth remembering. Stephen :-{ |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
. . Sadly not all of us have programming skills/experience to bring to the party so this remains beyond our personal abilities. And you of course right in pointing out that a great proliferation of humungous caches would cause possibly terminal database bloat.
Hi Stephen, thanks for the comments. Would you believe I flunked out of the only programming class I took in college 40 years ago. I don't have a clue how to program. My brain just does not understand the structure. I started off knowing nothing other than that I could read the instructions provided by Seti. I never needed to know how to program or compile until I discovered a bug in the client and reported it on the BOINC developers' github site. Richard Haselgrove told me I had confirmed a bug he had reported two years earlier that was never acted upon. With my new observation confirming Richard's, that finally got the ball rolling on the problem. But when it came time to test the code, Richard told me I was on my own to compile the client, since we have no Linux developers who know how. There are automated systems in place to generate applications when new code is submitted for Windows, but none for Linux; it has to be done manually. So I had to read the instructions on the BOINC website, and finally, through a lot of trial and error, I succeeded. I can follow along and get the gist of what the code is doing since it is all in high-level C that reads like normal text, so in any particular small part of the code I can actually understand what it is doing. DA or the other developers put in little helpful comments for various sections that for the most part explain what the code is supposed to do. Putting the whole of the code together to get the bigger picture is harder for me. Sleuthing out the parts that control how many tasks are allowed on board a host at any time is easy to find and change. TBar hinted that was all he did to generate the 7.4.44 client.
Change one value and voila, you get 10,000 tasks allowed instead of 1,000. Adding a zero is all I did. It took another team member to piece the spoofed GPU count together with my assistance, since his programming knowledge was only 30 years old, not 40 years like mine. So it was a team effort that created the spoofed client, through a lot of iteration, testing, and trial and error. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
I remember reading somewhere the main reason for the limit was to stop having so many outstanding unused task assignments in the Database. The problem was, there was a number of people who would download 10000 tasks and then never run BOINC again, causing those 10000 tasks to remain in the database for months before timing out. Imagine how many tasks would be in the database if Everyone was allowed to hold 10000 tasks. In reality, very few people need that many tasks, and it really doesn't matter if those people have 500 In Progress tasks or 3000 in progress tasks. What matters to the database is what's in the All column and that column will be quite high on a Host that Needs 10000 tasks a day. Take for example this Host, do you really think it would matter if the In Progress column said 1400 rather than 3200?
. . I believe the built-in limit on reported completed WUs is 256. It can be quite simply reduced, as Richard has suggested many times, and should not allow the heavy hitters any more impact in the short term than more moderate rigs. Of course, with many times more completed WUs to report, the process will be repeated over a much longer period; so while not making the initial recovery much worse, they will extend the impact over a longer time. But as you say, these mega-crunchers are only a relatively small minority. I do believe the major issue, and the reason for the download limits, was database management. As for the offenders of the highlighted 'crime', they still abound; just take a look at the older tasks in your pending and inconclusive lists and you will find lots of them. Hosts sitting on hundreds or thousands of unprocessed WUs, holding them in limbo until they time out. :( Stephen :( |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.