Did the work unit limit change?

Message boards : Number crunching : Did the work unit limit change?

Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1517249 - Posted: 16 May 2014, 23:20:26 UTC - in response to Message 1517233.  
Last modified: 16 May 2014, 23:26:29 UTC

On a system with an NVIDIA GPU & an iGPU, the 100 tasks may all be assigned to the NVIDIA card, leaving the iGPU idle. Now that system can download 200 GPU tasks, but they may all still be assigned to the NVIDIA card, leaving the iGPU idle.

It could go the other way, and the system could assign 200 tasks to the iGPU instead of a much faster 780 Ti. Maybe stating it that way helps make more sense of how it isn't working?

Yep.
All the posts I've seen so far have pointed out how each GPU now has 100 WUs, as opposed to sharing only 100 WUs in total. Other than some inactive GPUs being counted when working out the limit (which means the system gets more than 100 WUs per active GPU), I hadn't seen any instances where inactive or much slower GPUs were getting work while faster or active GPUs weren't getting their 100 WUs.


EDIT-
Thinking about it, I think the ideal way to do what they are trying to implement would be to take the project limit and split it across the different vendors' GPUs, probably based on their processing rate. So a system with a faster NVIDIA card and an iGPU might have a limit of, say, 70 for the NVIDIA card & 30 for the iGPU.

If they were to do that, then they would have to raise the GPU limits significantly. The problem with the limits is the GPUs running out of work even during very short outages; splitting the allocation between GPUs would make that even worse.
I'm a fan of the KISS (Keep It Simple, Stupid) principle: splitting the allowance across multiple cards & multiple vendors makes for considerable complexity, as opposed to just changing the limit from x GPU WUs per system to x WUs per GPU per system.
No need to worry about vendors or percentages. Just sort out the issue that is resulting in inactive GPUs being counted in the allocation of work.
Grant
Darwin NT
ID: 1517249 · Report as offensive
Profile HAL9000
Volunteer tester

Send message
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1517264 - Posted: 17 May 2014, 0:17:29 UTC - in response to Message 1517249.  

On a system with an NVIDIA GPU & an iGPU, the 100 tasks may all be assigned to the NVIDIA card, leaving the iGPU idle. Now that system can download 200 GPU tasks, but they may all still be assigned to the NVIDIA card, leaving the iGPU idle.

It could go the other way, and the system could assign 200 tasks to the iGPU instead of a much faster 780 Ti. Maybe stating it that way helps make more sense of how it isn't working?

Yep.
All the posts I've seen so far have pointed out how each GPU now has 100 WUs, as opposed to sharing only 100 WUs in total. Other than some inactive GPUs being counted when working out the limit (which means the system gets more than 100 WUs per active GPU), I hadn't seen any instances where inactive or much slower GPUs were getting work while faster or active GPUs weren't getting their 100 WUs.

My 2nd example, with the slower iGPU getting all of the work, may never happen currently. I just thought it was a good way of demonstrating how things could go wrong.

EDIT-
Thinking about it, I think the ideal way to do what they are trying to implement would be to take the project limit and split it across the different vendors' GPUs, probably based on their processing rate. So a system with a faster NVIDIA card and an iGPU might have a limit of, say, 70 for the NVIDIA card & 30 for the iGPU.

If they were to do that, then they would have to raise the GPU limits significantly. The problem with the limits is the GPUs running out of work even during very short outages; splitting the allocation between GPUs would make that even worse.
I'm a fan of the KISS (Keep It Simple, Stupid) principle: splitting the allowance across multiple cards & multiple vendors makes for considerable complexity, as opposed to just changing the limit from x GPU WUs per system to x WUs per GPU per system.
No need to worry about vendors or percentages. Just sort out the issue that is resulting in inactive GPUs being counted in the allocation of work.

Fixing the db and scripts so they don't get grumpy & crash would probably be easier than implementing a split-limit system.
Once they fix the problem of applying the limits per type of GPU, the issue with allocating tasks for inactive GPUs will go away. If they can't make it work as expected, hopefully they will remove this code so the BOINC servers actually follow the limits set by the project admins.
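For illustration, the rate-based split discussed above could be sketched like this. This is purely hypothetical: the function, device names, and rate figures are made up for the example and are not anything in the actual BOINC server code.

```python
def split_limit(total_limit, gpu_rates):
    """Split a per-system task limit across GPUs in proportion to
    each GPU's measured processing rate (e.g. GFLOPS or tasks/hour)."""
    total_rate = sum(gpu_rates.values())
    # Each GPU's share of the limit matches its share of the total rate;
    # every GPU gets at least 1 task so none is starved entirely.
    return {gpu: max(1, round(total_limit * rate / total_rate))
            for gpu, rate in gpu_rates.items()}

# A fast NVIDIA card and a slow iGPU sharing a 100-task project limit:
print(split_limit(100, {"GTX 780 Ti": 210.0, "Intel HD 4600": 90.0}))
# → {'GTX 780 Ti': 70, 'Intel HD 4600': 30}
```

It also shows why Grant's objection bites: the slower device's slice of the cache is small, so a short outage drains it first.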
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1517264 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1517279 - Posted: 17 May 2014, 1:24:27 UTC - in response to Message 1517264.  

Fixing the db and scripts so they don't get grumpy & crash would probably be easier than implementing a split-limit system.

Not having the limits at all would be the best option; having them per GPU is the next best.

I suspect the problems with the database are hardware-related more than software-related, and if the requirements were made known, I'd expect a quick fundraiser would get what's needed.
But as there hasn't been any mention of what the cause or the fix is, there's not much we can do to help.
Grant
Darwin NT
ID: 1517279 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1517293 - Posted: 17 May 2014, 2:17:29 UTC

The assumption that the BOINC changeset was the only cause for the limits changing is probably incorrect. As Claggy noted in his msg 1515079 near the beginning of this thread, he reported the issue on the boinc_dev mailing list. Eric Korpela's reply was:
I changed config_aux.xml limit to have the <per_proc/> option set.

I'm not entirely surprised it's broken. There are two incompatible places
to set these options.

Checking the Haveland graphs indicates the change has had essentially no effect on turnaround time or on how many tasks are "in progress", so there's no perceptible downside as far as project load goes. I think the overall effect is good.
                                                                   Joe
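For context, BOINC's server-side job limits are set in the project configuration; based on Eric's description, the fragment he changed might look roughly like the sketch below. This is an assumption pieced together from the BOINC server documentation's <max_jobs_in_progress> options, not SETI@home's actual file: the numbers are invented, and whether the project reads this from config.xml or config_aux.xml is exactly the "two incompatible places" ambiguity Eric mentions.

```xml
<max_jobs_in_progress>
  <project>
    <gpu>
      <jobs>100</jobs>
      <!-- with per_proc set, the limit applies per GPU
           rather than per host -->
      <per_proc/>
    </gpu>
  </project>
</max_jobs_in_progress>
```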
ID: 1517293 · Report as offensive
Profile HAL9000
Volunteer tester

Send message
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1517304 - Posted: 17 May 2014, 2:54:42 UTC - in response to Message 1517293.  

The assumption that the BOINC changeset was the only cause for the limits changing is probably incorrect. As Claggy noted in his msg 1515079 near the beginning of this thread, he reported the issue on the boinc_dev mailing list. Eric Korpela's reply was:
I changed config_aux.xml limit to have the <per_proc/> option set.

I'm not entirely surprised it's broken. There are two incompatible places
to set these options.

Checking the Haveland graphs indicates the change has had essentially no effect on turnaround time or on how many tasks are "in progress", so there's no perceptible downside as far as project load goes. I think the overall effect is good.
                                                                   Joe

I have been watching that since the change. I suspect doing AP on GPUs has helped a bit with keeping down the number of MB tasks out in the field.
It also looks like more data is being moved over to be split, so there might be more AP for the weekend to keep the GPUs from going to town on MBs.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1517304 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1517305 - Posted: 17 May 2014, 3:03:21 UTC - in response to Message 1517293.  

Checking the haveland graphs indicates the change has had essentially no effect on turnaround time nor how many tasks are "in progress", so there's no perceptible downside as far as project load goes. I think the overall effect is good.
                                                                   Joe

In my case the turnaround time for GPU WUs has gone from around 8 hours to 16 hours. While a significant increase (it's doubled), in absolute terms it's sweet bugger all (the turnaround times for my CPUs are 43 hrs & 216 hrs).
Grant
Darwin NT
ID: 1517305 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1517586 - Posted: 17 May 2014, 18:50:32 UTC - in response to Message 1517011.  
Last modified: 17 May 2014, 19:02:21 UTC

I could.
I will not.
I'll take the mixture.
(going childish)
Would you be asking if I had 240k?
--

Now I'm a bit down.

Once upon a time there was a Setiland. All people were treated equally and there was a feeling of harmony all around.

One day TheLimit was raised!

Most people noticed nothing... But evil was lurking around: an AP hoarder amidst us Seitzens. The universe kept on expanding as time passed by.

But one day - the AP stopped appearing. That was not evident immediately. Everyone's cache was filled with a mixture of MB and AP - so nobody realised it until it was too late. No one got any AP anymore; their caches had run out of AP.

Except for one - there was this evil hoarder. He had cast a spell on the magic internet page and made sure that he'd get no MB. This way he got all the AP in the world!

Oh, the groaning and moaning every morning, throughout the day and every evening. No greater evil could have hit us - people said!

Some went to the extent of switching to another project - some waited with their mouths shut.

Time passed by...

And there was this one morning when all the remaining GPU coolers whirred so beautifully and all the world's results looked a bit more shiny. Something must have happened, said everyone - no one knew what.

But then a knight in an Arctic Silver IV shining body came and told every person still in the project that the CPU cooler of the hoarder had caught fire and caused a meltdown of those horrible AP-hoarding GPU cores, and that the King of the Setiland had issued a permanent order: ANYONE (Tim) WITH A SETTING THAT SAYS AP ONLY AND DOES NOT ALLOW ANY OTHER TYPE OF WORK SHALL HAVE A LIMIT OF 1 TASK A DAY INSTEAD OF 100 PER GPU AND CPU.

The people hoorayed for a total of three days and .....

--

Well - the children are now asleep. Me too in an hour or so.

I just looked at the tasks of another top-10 cruncher and was kind of disappointed. When running AP only I could get 240,000 a day - but I do not want to. I have AP as a preference, but at least I have the "send work from other projects when available" option on.

How would you explain it if you had N hundred AP only? (N is a big number)


Am I cheating somehow?

I don’t think so.

Is it written somewhere that I will take MB WUs or AP WUs?

I don’t think so.

I can do whatever I want with my preferences, asking no one what to do, because they ARE MINE.
No one will tell me whether I download 800 APs or 800 MB WUs.
Yes, I can take 800 AP tasks and I will do it again, because the server allows it, and it is legal.
We all wanted 100 WUs per GPU, as I remember, and now we are complaining?

By the way…
This machine is going to retire in about one month or two.
We have here at the office a new build with dual Xeons and 8 GPUs. I will install SETI and run 1000 AP tasks.

And a question to petri33…
If you were in 2nd position, and I was in 3rd, would you have those questions?

Br
Tim

To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1517586 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1517806 - Posted: 18 May 2014, 10:29:59 UTC - in response to Message 1517586.  

I could.
I will not.
I'll take the mixture.
(going childish)
Would you be asking if I had 240k?
--

Br
Tim

Thanks Tim! (just looked at your task list -- 800 MB ;) )

Br
Petri
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1517806 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1520427 - Posted: 23 May 2014, 22:23:41 UTC - in response to Message 1517806.  

Thinking of the increased database load that larger caches would cause, I notice that I've still got 13 Enhanced WUs sitting in my account. I'd expect clearing all those older WUs out would help the database considerably.
Grant
Darwin NT
ID: 1520427 · Report as offensive
ExchangeMan
Volunteer tester

Send message
Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1520495 - Posted: 24 May 2014, 2:54:27 UTC - in response to Message 1520427.  

Thinking of the increased database load that larger caches would cause, I notice that I've still got 13 Enhanced WUs sitting in my account. I'd expect clearing all those older WUs out would help the database considerably.

I still have 34 of them, all from May and June last year.
ID: 1520495 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.