Limits



Message boards : Number crunching : Limits

Author Message
Josef W. Segur
Volunteer developer
Volunteer tester
Send message
Joined: 30 Oct 99
Posts: 4196
Credit: 1,028,796
RAC: 262
United States
Message 1316239 - Posted: 16 Dec 2012, 22:29:00 UTC - in response to Message 1316094.

So is this 100 WU limit per a computer or per an account?

Per computer, plus another 100 if it has at least one usable GPU.


So currently the most I'd be able to get is 100 WUs for the CPU and 100 WUs for GPUs regardless of the number of GPUs in the computer?

Yes, but note that it's a limit on tasks in progress. If you report 5 completions, you can get 5 new tasks to replace them. Most of the highly productive hosts are in that mode. It's not fair that the limits are really only affecting the top 8000 or so hosts, of course.
Joe


Keith White
Avatar
Send message
Joined: 29 May 99
Posts: 370
Credit: 2,688,798
RAC: 2,235
United States
Message 1316258 - Posted: 16 Dec 2012, 23:25:03 UTC - in response to Message 1316239.
Last modified: 16 Dec 2012, 23:30:09 UTC

So is this 100 WU limit per a computer or per an account?

Per computer, plus another 100 if it has at least one usable GPU.


So currently the most I'd be able to get is 100 WUs for the CPU and 100 WUs for GPUs regardless of the number of GPUs in the computer?

Yes, but note that it's a limit on tasks in progress. If you report 5 completions, you can get 5 new tasks to replace them. Most of the highly productive hosts are in that mode. It's not fair that the limits are really only affecting the top 8000 or so hosts, of course.
Joe



Unless you consider that the top 8000 hosts are the problem. Well, probably only the top 1000. At least that's what Eric's explanation sounds like: large queues and fast crunchers, especially during a shorty event.

My modest cruncher is currently in the 4500-5000 range of hosts, depending on whether you look at RAC or credits per month. I used to have a 6 day cache; the current limits now give me about 3 1/2 days, so it's not someone like me who's affected during downtime Tuesdays. Looking at the fastest host out there (by RAC), I see it's fetching 12-14 GPU units every five minutes. That puts its GPU unit processing rate at over 4000 per day. A 1000 unit limit would still mean less than 3 hours of cache for that system.

Now the devs can't affect the number of waiting-for-validation results in everyone's host results table other than by driving the cache size down. They can't affect the valid results, because those are mainly governed by the speed of the cruncher. They can affect the number pending, as that is set by cache size. So they are taking the only measure they can to reduce everyone's host table size.

Now when I look at my pending list, eliminating those who look to have joined for a week never to return, I still see a number of wingmen who have thousands of units queued up but seemingly forgotten about. They are examples of people who let their crunchers queue a metric ton of units; when there's a problem at their end, they end up clogging their wingmen's waiting-for-validation queues. The smaller the queue, the fewer people they screw.

Give it a little longer for these forever-waiting-for-validation units to flush out, then look at bumping the numbers up slowly: wait a week or two to see how it's shaping up, then up it again, and so on. Because when it goes bad, it goes bad for everyone, not just the top crunchers.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Grant (SSSF)
Send message
Joined: 19 Aug 99
Posts: 5683
Credit: 56,098,734
RAC: 49,919
Australia
Message 1316346 - Posted: 17 Dec 2012, 7:35:12 UTC - in response to Message 1316258.


Even though they may carry a huge cache, at least they return the work within the period their cache is set for. Whereas there would be 1,000s of systems out there that take almost to the deadline to return their work (if they do return it).
I'd expect they'd have a significant effect on the size of the database too.
____________
Grant
Darwin NT.

Keith White
Avatar
Send message
Joined: 29 May 99
Posts: 370
Credit: 2,688,798
RAC: 2,235
United States
Message 1316392 - Posted: 17 Dec 2012, 12:45:55 UTC - in response to Message 1316346.
Last modified: 17 Dec 2012, 12:49:42 UTC


Even though they may carry a huge cache, at least they return the work within the period their cache is set for. Whereas there would be 1,000s of systems out there that take almost to the deadline to return their work (if they do return it).
I'd expect they'd have a significant effect on the size of the database too.

I wish that were true. My two oldest pendings currently have a wingman (a 2nd wingman) whose host has a RAC in the 40,000-50,000 range but 1,600 units, dated between Oct 27th and Nov 4th, that have been "forgotten". So those two units will be pending until a 3rd wingman is issued, starting the day after Mayan doomsday. And when I bother to look at my units waiting for a wingman that are over 30 days old, I frequently see exactly the same thing: a host with 1,000s of units "forgotten". Or hundreds of errors. Or hundreds of false overflows, so I'm stuck in inconclusive-ville.

These are cluttering everyone's list of units assigned to each host, and they are not affected by the performance of the host or its desired queue size (or queue size as currently dictated). Currently 1/3rd of my units waiting for a wingman are over 30 days old. The current limits at least limit the damage caused by hosts having "technical difficulties" and clogging the pipes to the point the scheduler hiccups.
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 3841
Credit: 106,384,456
RAC: 90,300
United States
Message 1316427 - Posted: 17 Dec 2012, 14:43:55 UTC - in response to Message 1316239.

So is this 100 WU limit per a computer or per an account?

Per computer, plus another 100 if it has at least one usable GPU.


So currently the most I'd be able to get is 100 WUs for the CPU and 100 WUs for GPUs regardless of the number of GPUs in the computer?

Yes, but note that it's a limit on tasks in progress. If you report 5 completions, you can get 5 new tasks to replace them. Most of the highly productive hosts are in that mode. It's not fair that the limits are really only affecting the top 8000 or so hosts, of course.
Joe



I had an idea a while back: a weighted task value, based on the AR, could help prevent overloading a machine when it had nothing but VLAR tasks in its queue. That would probably add even more complexity to the client, so probably not the best idea.
However, a weighted value tracked on the server end might be an easier solution than making the workunits larger. To prevent very fast machines from acquiring a large number of tasks when VHARs are being generated, a predetermined value could be applied to each task and counted against the machine's task limit. I was thinking something along the lines of 1/x, but it might want to be centered on the "normal" AR. So .42/x might work better, though I don't know if a task with an AR of .03 should count as 14.
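As a rough sketch of the idea above (purely illustrative; the function name, the .42 "normal" AR, and the cap are assumptions, not anything SETI@home actually implemented), the server-side weighting might look like:

```python
def task_weight(ar, normal_ar=0.42, cap=14.0):
    """Weight a task's contribution to the per-host in-progress limit
    by its angle range (AR).

    VHAR (shorty) tasks count as less than one slot, VLAR tasks as
    more, centered so a task at the "normal" AR counts as exactly 1.
    A cap keeps an extreme VLAR from counting as an absurd number
    of slots.
    """
    return min(cap, normal_ar / ar)

# A normal task (AR 0.42) counts as 1.0 slot,
# a VLAR at AR 0.03 hits the cap of 14 slots,
# and a shorty at AR 1.127 counts as well under half a slot.
```

Whether a 0.03-AR task should really cost 14 slots is exactly the open question raised above; the cap parameter is one way to tune that.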
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

WinterKnight
Volunteer tester
Send message
Joined: 18 May 99
Posts: 8483
Credit: 23,007,382
RAC: 14,898
United Kingdom
Message 1320211 - Posted: 26 Dec 2012, 22:39:31 UTC
Last modified: 26 Dec 2012, 22:39:45 UTC

If the limits are still on, how come this happens?

http://setiathome.berkeley.edu/results.php?hostid=5450808&offset=580&show_names=0&state=1&appid=

msattler
Volunteer tester
Avatar
Send message
Joined: 9 Jul 00
Posts: 38153
Credit: 555,852,846
RAC: 596,776
United States
Message 1320217 - Posted: 26 Dec 2012, 22:52:19 UTC - in response to Message 1320211.

If the limits are still on, how come this happens?

http://setiathome.berkeley.edu/results.php?hostid=5450808&offset=580&show_names=0&state=1&appid=

Dunno, but I wish I could get it to happen here.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Claggy
Volunteer tester
Send message
Joined: 5 Jul 99
Posts: 4039
Credit: 32,691,290
RAC: 723
United Kingdom
Message 1320220 - Posted: 26 Dec 2012, 22:54:53 UTC - in response to Message 1320211.

If the limits are still on, how come this happens?

http://setiathome.berkeley.edu/results.php?hostid=5450808&offset=580&show_names=0&state=1&appid=


BOINC version 6.6.28

Claggy

WinterKnight
Volunteer tester
Send message
Joined: 18 May 99
Posts: 8483
Credit: 23,007,382
RAC: 14,898
United Kingdom
Message 1320221 - Posted: 26 Dec 2012, 22:55:38 UTC - in response to Message 1320220.

If the limits are still on, how come this happens?

http://setiathome.berkeley.edu/results.php?hostid=5450808&offset=580&show_names=0&state=1&appid=


BOINC version 6.6.28

Claggy

Thanks

zoom314
Avatar
Send message
Joined: 30 Nov 03
Posts: 45749
Credit: 36,379,806
RAC: 8,223
Message 1320223 - Posted: 26 Dec 2012, 22:56:46 UTC - in response to Message 1320217.

If the limits are still on, how come this happens?

http://setiathome.berkeley.edu/results.php?hostid=5450808&offset=580&show_names=0&state=1&appid=

Dunno, but I wish I could get it to happen here.

You have a lotta company in that regard Mark.
____________

Keith White
Avatar
Send message
Joined: 29 May 99
Posts: 370
Credit: 2,688,798
RAC: 2,235
United States
Message 1320703 - Posted: 28 Dec 2012, 8:21:29 UTC - in response to Message 1320223.
Last modified: 28 Dec 2012, 8:41:59 UTC

Well, one interesting side effect of the 100 unit limit per device on MB is that it makes it relatively simple to calculate your processing rate, assuming your rig runs 24/7.

Simply subtract the Time Sent from the Time Reported on valid tasks for a particular device (CPU or GPU) and divide by 100; you may want to average several tasks to help compensate for the 5 minute minimum delay between reports.

For my little rig it works out to about 1 CPU unit every 56 minutes and 1 GPU unit (done by the integrated GPU) about every 45 minutes, so about 57-58 units a day. SIV gives me, based on the average number of fetched units per day over the last 6 months, a figure of 61. So pretty spot on.
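The calculation above can be sketched in a few lines (a minimal illustration; the timestamps and the date format are made-up examples, and it assumes the 100-task cache stays full on a 24/7 host):

```python
from datetime import datetime

def per_unit_minutes(sent, reported, cache_size=100):
    """Estimate minutes per task from one task's turnaround time.

    With a constantly full in-progress cache on a 24/7 host, a task
    waits roughly (cache_size x per-unit time) between being sent
    and being reported, so dividing the turnaround by the cache size
    recovers the per-unit processing time.
    """
    fmt = "%d %b %Y %H:%M:%S"
    turnaround = datetime.strptime(reported, fmt) - datetime.strptime(sent, fmt)
    return turnaround.total_seconds() / 60 / cache_size

# Hypothetical CPU task: sent 24 Dec, reported ~93h20m later
# -> 5600 minutes / 100 tasks = 56 minutes per unit
rate = per_unit_minutes("24 Dec 2012 08:00:00", "28 Dec 2012 05:20:00")
```

Averaging several tasks, as suggested above, just means averaging the returned values.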

You can also use this to get a guesstimate of how many CPU and GPU threads other people's hosts are running. Of course this assumes they run 24/7, just SETI, and primarily MB vs AP, but you can get a ballpark figure.

Looking at the most productive host out there, owned by Morton Ross: using the average time between sent and reported of the first five 100+ credit GPU units, I calculate a GPU unit completed every 32.3 seconds. Task details show 8 GPUs in the system. The average run time of those first five is about 781 seconds, which yields a guesstimate of 24-25 GPU threads running across those 8 GPUs.

On the CPU side it's a little tougher to compute, since valid CPU units are less regular in the valid list, but after finding five I calculate one CPU unit done every 517 seconds; with an average run time of 6,946 seconds, that gives a guesstimate of 13-14 threads running CPU units. I'll assume the remaining available threads are handling the CPU overhead of all the GPU threads.

It's just a curious bit of math, still dependent on a lot of assumptions. Still, one GPU unit every 32.3 seconds is a bit of a jaw dropper. It's no wonder that one host handles about one out of every 650 units.
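The thread-count estimate above reduces to one division (a sketch under the same 24/7, steady-state assumptions, using the figures quoted in this post):

```python
def estimated_threads(avg_run_time_s, completion_interval_s):
    """Estimate concurrent worker threads on a host.

    If one unit completes every `completion_interval_s` seconds and
    each unit takes `avg_run_time_s` to run, then roughly
    run_time / interval units must be executing in parallel.
    """
    return avg_run_time_s / completion_interval_s

# GPU: 781 s run time, one completion every 32.3 s -> ~24 threads
# CPU: 6946 s run time, one completion every 517 s -> ~13 threads
gpu_threads = estimated_threads(781, 32.3)
cpu_threads = estimated_threads(6946, 517)
```

This is Little's law in miniature: work in progress equals throughput times per-item service time.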
____________
"Life is just nature's way of keeping meat fresh." - The Doctor



Copyright © 2014 University of California