The spoofed client - Whatever about


Profile Keith Myers Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 10303
Credit: 1,006,166,852
RAC: 1,387,815
United States
Message 2005752 - Posted: 4 Aug 2019, 20:28:23 UTC - in response to Message 2005749.  

I believe I've already proven the fix is to simply change the cache to One Day, or even Half a Day. This will provide the fastest Hosts enough WUs to last through an average Tuesday Outage. The current problem is the Fastest 20% of SETI Hosts are about 300% faster than the other 80%. This makes any attempt to run a smaller cache self-defeating, as it results in a large increase in the Pending tasks for those 20% of Hosts. There isn't any impact on the Database's 'results in the field' from those 20% of Fast Hosts running a vastly increased cache, as the Pending Tasks number drops as the Cache number increases.
The same Three Hosts running different caches:
State: All (31078) · In progress (500) · Validation pending (17067) · Validation inconclusive (390) · Valid (13121)
State: All (29443) · In progress (3600) · Validation pending (13368) · Validation inconclusive (456) · Valid (12019)
State: All (30827) · In progress (6500) · Validation pending (11465) · Validation inconclusive (456) · Valid (12315)

The Host with the Smallest Cache has the Largest Impact on the 'Results in the Field' with the current system.
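[Editor's note] The three host states quoted above can be sanity-checked with a short script; the tuples below are copied from the figures in the post, and the sums show the "All" total staying roughly constant while tasks shift between "In progress" and "Validation pending" (the small gap on the third host would presumably be error/aborted tasks, which "All" also counts).

```python
# Task-state counts for the same three hosts, copied from the post above:
# (In progress, Validation pending, Validation inconclusive, Valid)
hosts = [
    (500, 17067, 390, 13121),   # All = 31078
    (3600, 13368, 456, 12019),  # All = 29443
    (6500, 11465, 456, 12315),  # All = 30827 (gap of 91: error/aborted tasks)
]

for in_prog, pending, inconclusive, valid in hosts:
    total = in_prog + pending + inconclusive + valid
    print(f"In progress={in_prog:5d}  Pending={pending:5d}  sum={total}")
```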


Missing something here - changing the size did nothing for me.
At the "seti preferences" website, at "boincstats", and in my local PC's "boinc preferences" (all 3 just to be sure) I did the following:

Minimum buffer days 0.1 => 2.0
Additional work buffer days 0.25 => 2.0
I did an update and a sync repeatedly; no change in work unit count. Still just under 900 work units (9 GPUs).

Verified as shown below:

root@tb85-nvidia:/var/lib/boinc-client# grep -i ">2.000" *.xml
global_prefs_override.xml:   <work_buf_min_days>2.000000</work_buf_min_days>
global_prefs_override.xml:   <work_buf_additional_days>2.000000</work_buf_additional_days>
sched_request_setiathome.berkeley.edu.xml:   <work_buf_min_days>2.000000</work_buf_min_days>
sched_request_setiathome.berkeley.edu.xml:   <work_buf_additional_days>2.000000</work_buf_additional_days>

root@tb85-nvidia:/var/lib/boinc-client# grep -i ">2.0" *.xml
global_prefs.xml:<work_buf_min_days>2.0</work_buf_min_days>
global_prefs.xml:<work_buf_additional_days>2.0</work_buf_additional_days>


So I should have picked up additional work units in 7.14.2?

With 9 gpus you are allotted 900 tasks. So your cache is full. There are no more to be retrieved no matter how many days of work you ask for.
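[Editor's note] The arithmetic behind the "just under 900" figure, assuming the flat per-device limit described in this thread:

```python
PER_DEVICE_LIMIT = 100  # tasks per GPU, per the posts in this thread

def gpu_task_limit(n_gpus: int) -> int:
    # Server-side allotment: a flat 100 tasks per GPU, regardless of
    # how many days of work the cache settings ask for.
    return PER_DEVICE_LIMIT * n_gpus

print(gpu_task_limit(9))  # 900 -- why the 9-GPU host above is stuck there
```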
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2005752
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 17941
Credit: 408,844,278
RAC: 32,492
United Kingdom
Message 2005753 - Posted: 4 Aug 2019, 20:28:39 UTC

Reduce the additional work days to a much lower number - around 0.01 - so the refreshes will be much smoother and much less prone to the humps and bumps that happen with large (>1) additional-days settings.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2005753
Profile Joseph Stateson
Volunteer tester
Joined: 27 May 99
Posts: 226
Credit: 48,110,063
RAC: 336,563
United States
Message 2005756 - Posted: 4 Aug 2019, 20:37:28 UTC - in response to Message 2005752.  


With 9 gpus you are allotted 900 tasks. So your cache is full. There are no more to be retrieved no matter how many days of work you ask for.


that is exactly what I thought. So what is the cache that is being discussed?
ID: 2005756
TBar
Volunteer tester

Joined: 22 May 99
Posts: 4926
Credit: 664,420,365
RAC: 1,414,499
United States
Message 2005758 - Posted: 4 Aug 2019, 20:43:32 UTC - in response to Message 2005756.  

SETI is the one who has imposed the cache limits, SETI is the one who would have to change the cache limits.
The Point is, compared to other suggestions being offered, SETI changing the cache limits would be a Very Easy endeavor to accomplish.
That's all.
ID: 2005758
Profile Keith Myers Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 10303
Credit: 1,006,166,852
RAC: 1,387,815
United States
Message 2005759 - Posted: 4 Aug 2019, 20:44:07 UTC - in response to Message 2005756.  
Last modified: 4 Aug 2019, 20:44:54 UTC

Seti only allows 100 tasks per cpu and 100 tasks per gpu on board at any time for the host's cache. So setting for X days of work does nothing for you if you have a reasonably fast system. You are only going to get the maximum cache allotment for your hardware.

Setting additional days of work to 0.01 will cause the client to ask for work at every scheduler connection.
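[Editor's note] A minimal sketch of why a tiny additional-days value behaves this way. This is a simplification of the BOINC client's work fetch (the real logic in the client is considerably more involved): the client tries to hold (min + additional) days of work and asks for the shortfall, so with only 0.01 extra days of headroom a running host is almost always a little short of target at each scheduler connection.

```python
def work_request_secs(buffered_secs: float, min_days: float,
                      extra_days: float) -> float:
    # Simplified BOINC-style work fetch: keep (min_days + extra_days)
    # worth of work buffered, request the shortfall in seconds.
    target = (min_days + extra_days) * 86400
    return max(0.0, target - buffered_secs)

# Host holding exactly its minimum buffer, with 0.01 extra days configured:
print(work_request_secs(0.10 * 86400, 0.10, 0.01))  # ~864 seconds short
```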
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2005759
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1530
Credit: 193,618,221
RAC: 474,880
United States
Message 2005766 - Posted: 4 Aug 2019, 21:52:45 UTC

If the objective is to try and "smooth out the bumps" while keeping crunchers well-fed, it seems to me that the easiest way for the SETI project to handle that would be to eliminate or greatly increase hard limits on tasks in progress, and instead calculate tasks delivered based on average turnaround time, as is (I believe) already done up to the point where the limit is reached. Logically, that should be sufficient control, and self-correcting for problem clients. Am I missing something?
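[Editor's note] The suggestion above could be sketched as a scheduler rule of thumb; all names and thresholds here are invented for illustration, not actual SETI server code:

```python
def tasks_to_allot(completed_per_day: float, avg_turnaround_days: float,
                   buffer_days: float = 1.0, max_turnaround_days: float = 2.0,
                   floor: int = 10) -> int:
    # Hypothetical rule: size the in-progress allotment from the host's
    # measured completion rate, and throttle hosts whose long turnaround
    # suggests hoarding or non-return (the self-correcting part).
    if avg_turnaround_days > max_turnaround_days:
        return floor
    return max(floor, int(completed_per_day * buffer_days))

print(tasks_to_allot(2000, 0.2))  # fast, healthy host: 2000
print(tasks_to_allot(2000, 5.0))  # same rate but slow returns: trickle of 10
```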
ID: 2005766
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 11865
Credit: 184,030,687
RAC: 234,010
Australia
Message 2005774 - Posted: 4 Aug 2019, 22:51:14 UTC - in response to Message 2005766.  

Am I missing something?
Yep, I'll try to summarise things-
Basically, the more work returned per hour, the lower the Results-out-in-the-field number has to be; or, the higher the Results-out-in-the-field number, the lower the amount of work returned per hour has to be, before the servers start to choke under the load.
That is the Seti@home server issue.


As discussed earlier in the thread, the Results-out-in-the-field has been limited (by the imposition of server side limits) to stop the servers from falling over or grinding to a halt as the Seti@home servers are at their limits.
At the current levels of Results-out-in-the-field, when the Results-received-in-last-hour reaches a certain (variable) level, the Workunit-files-waiting-for-deletion start to back up. Once they reach a certain (variable) level, the splitter output falls away, sometimes to less than 10 per second. If this goes on for long enough then the Ready-to-send buffer runs out, and people can't get work.

The number of Results-out-in-the-field is the total of all the crunchers All tasks number in their account Task list.
It was suggested that the larger people's caches, the larger the number of Results-out-in-the-field. Tbar hypothesized that this wasn't the case, and provided evidence that for a given amount of work processed per hour, the All value of tasks remains the same- the ratio between In progress (people's caches) & Validation pendings varies, but for a given hourly throughput, the All tasks number remains the same.
The smaller the In progress number, the larger the Validation pending number; the larger the In progress number, the smaller the Validation pending number- but the All tasks number remains the same.
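[Editor's note] This is Little's law in queueing terms: for a steady workload, the population out in the field equals the return rate times the average time each result spends out. The figures below are illustrative only:

```python
def results_in_field(returned_per_hour: float, avg_residence_hours: float) -> float:
    # Little's law: steady-state population = throughput x residence time.
    # Moving tasks between "In progress" (a host's cache) and "Validation
    # pending" changes where they sit during that residence time, not how
    # many exist at once - matching the observation above.
    return returned_per_hour * avg_residence_hours

print(results_in_field(100.0, 300.0))  # 30000.0 results in the field
```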

In the past someone (I still can't remember his name, Jeff I think it was?) looked at the turnaround time for work, and found that, regardless of how many long-term outstanding Pendings or Inconclusives you might see in your Task List, the huge majority of work is returned within 48 hours. Work Units that take longer than 48 hours to be validated are only a very small percentage, and the long-term (2 months+) outstanding work is only a very small percentage of that very small percentage.

Because the high performance crunchers have the greatest impact on the Results-out-in-the-field (because their All number of tasks is so high), and because re-distributing the work out in the field to reduce the number of WUs that take more than 48 hours to be returned won't actually have any significant effect on the Results-out-in-the-field numbers (because those WUs are only a very small percentage of the number returned), any benefit to alleviating the servers' load issues would be virtually nil.


Rob has suggested that systems which have large numbers of Ghosts may contribute as much as 5.5% to the Results-out-in-the-field, so fixing the mechanism for limiting work to non-performing systems would help with that.
But 5.5% (think of it as 0.055) really isn't much in the overall scheme of things- fixing those systems would result in a better buffer before the servers start to have issues, but it wouldn't be enough to enable any meaningful increase in the Server side limits IMHO.


There are plans (photos have been posted) to upgrade the Upload server (which also does file deletion, database purging, and (maybe) validating) which will hopefully alleviate the present server issues and allow the Server side limits to be increased.
Grant
Darwin NT
ID: 2005774
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 4776
Credit: 163,975,465
RAC: 240,611
Australia
Message 2005793 - Posted: 5 Aug 2019, 0:50:48 UTC - in response to Message 2005701.  

IMHO what is unrealistic is to keep the size of the WU small (around 720K) with the current top crunchers, who can crunch a WU in less than 30 sec.
Instead of that small WU, a new "large" MB WU could be created and sent only to the fastest hosts, like GPUgrid does with its small/large WUs. That would make the spoofed client or rescheduling unnecessary.
As Ian posted, even the 6400-WU cache is too small for the current top GPU crunchers, and over this number the lag is too high to allow larger buffers.
For now the 100 CPU WU limit is still valid, but with the arrival of the new Ryzens with a lot of cores even that 100 CPU WU limit will become unrealistic too.

. . Sadly I cannot see that working either. If they created a separate WU format for the heavy hitters then those tasks could only be validated against other heavy hitters, and I do not think there are high enough numbers of these to make this work; it would rather defeat the purpose of the validation process itself. If they increased the size of ALL WUs then the slow machines, which we are rapidly losing even now, would cause their owners to lose interest even more quickly. If they are only doing a few tasks a day, would they still bother if that number dropped by half?

. . I still think daily limits that take into account the daily productivity of each individual host would be the most workable solution for the project. They already maintain the information about each host's productivity, so a mechanism that allows the schedulers to use this info to multiply the basic limits accordingly would require a smaller change in the system (so it seems to me anyway). That way the slow machines can still function under their current limits, while machines that produce large multiples of the work done by slower rigs can receive work in multiples of the basic limit. A machine producing 200 valid results per day keeps the current limits of 100 per device, but a machine producing 1000 valid units a day can receive 500 per device. No need for spoofing then, and it remains under the control of the project/servers rather than various and sundry workarounds out in the field.
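[Editor's note] A sketch of that proportional-limit idea. The function name and the linear scaling rule are my reading of the example figures in the post (200/day keeps the 100-per-device cap, 1000/day gets 500), not anything the project has implemented:

```python
def per_device_limit(valid_per_day: float, base_limit: int = 100,
                     base_rate: float = 200.0) -> int:
    # Scale the per-device limit with the host's measured daily output,
    # never dropping below today's flat cap of base_limit.
    return int(base_limit * max(1.0, valid_per_day / base_rate))

print(per_device_limit(200))   # 100, the current limit
print(per_device_limit(1000))  # 500 per device, as in the example
print(per_device_limit(50))    # a slow host still keeps the 100 floor
```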

Stephen

? ?
ID: 2005793
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 4776
Credit: 163,975,465
RAC: 240,611
Australia
Message 2005796 - Posted: 5 Aug 2019, 1:01:32 UTC - in response to Message 2005759.  

Setting additional days of work to 0.01 will cause the client to ask for work at every scheduler connection.


. . Also when that value is set to 0.00 which is what I run at ...

. . My rigs ask for work at every request interval.

Stephen

. .
ID: 2005796
Profile Keith Myers Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 10303
Credit: 1,006,166,852
RAC: 1,387,815
United States
Message 2005803 - Posted: 5 Aug 2019, 2:18:28 UTC - in response to Message 2005796.  

Setting additional days of work to 0.01 will cause the client to ask for work at every scheduler connection.


. . Also when that value is set to 0.00 which is what I run at ...

. . My rigs ask for work at every request interval.

Stephen

. .

I tried to set it lower than 0.01 and the web page would not accept any lower value. You can save your changes, but the value never updates to anything other than 0.01.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2005803
Profile Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1850
Credit: 2,258,721
RAC: 169
Australia
Message 2005805 - Posted: 5 Aug 2019, 2:42:18 UTC
Last modified: 5 Aug 2019, 3:10:44 UTC

I've tried it and saved it; now it shows just "0". How would this work?
ID: 2005805
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 4776
Credit: 163,975,465
RAC: 240,611
Australia
Message 2005806 - Posted: 5 Aug 2019, 2:47:46 UTC - in response to Message 2005803.  

I tried to set it lower than 0.01 and the web page would not accept any lower value. You save your changes but the value never updates to anything other than 0.01.


. . OK, I often feel I live in the twilight zone, it seems things behave entirely differently for me than for other people ...

. . I have rechecked ALL my machines and every one has accepted a value of zero, and they are running 3 different versions of Boinc client & BoincManager. One Linux machine is running 7.2.42, the others are running 7.14.2 and the Windows machine is running 7.6.33. How bizarre ...

Stephen

? ?
ID: 2005806
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 13293
Credit: 165,399,025
RAC: 212,025
United Kingdom
Message 2005829 - Posted: 5 Aug 2019, 9:45:16 UTC

Are the spoofing mods published anywhere under GPL? I might try my hand at Linux compilation...
ID: 2005829
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 8244
Credit: 517,959,434
RAC: 396,003
Panama
Message 2005847 - Posted: 5 Aug 2019, 12:11:13 UTC - in response to Message 2005829.  

Are the spoofing mods published anywhere under GPL? I might try my hand at Linux compilation...

AFAIK No.
ID: 2005847


 
©2019 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.