GPU in the Lunatics' apps

Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1164804 - Posted: 23 Oct 2011, 19:33:57 UTC - in response to Message 1164748.  
Last modified: 23 Oct 2011, 19:40:24 UTC

Fred J. Verster wrote:
---[SNIPPED]---

Those of you who refuse to "fiddle with" the value are condemning your hosts to sending a flops value usually far less than the GPU is actually producing. That's your choice; the system will probably supply enough GPU work to keep them busy unless there's an unexpected outage. My preference is to give a reasonable approximation of what an application will do so the system works as intended; I never expect BOINC to be able to rescind the old GIGO principle.


Joe is right, and I certainly will change this. Stability is also more important than excitement.
This (ATI) host is already in constant short supply of work, which can be a result of the too-low <flops> estimate, if I understand this correctly.
If I were to set a <flops> entry of 3.0e+9, would the server/scheduler respond more realistically?
Or I'll change it to 1 MB task per GPU; I already switched to 1 AstroPulse task per GPU, because too many times I see 1 WU running on 0.5 GPU because there aren't enough.

For anonymous platform hosts with any significant amount of cached work, making an abrupt large adjustment of flops isn't a good idea. The APRs for GPU work are based on runtimes for each task, so are affected strongly by how many tasks you run simultaneously; changing from 1 to 2 tasks or vice versa would nearly double or halve the expected rate. But for any application where the APR has had time to stabilize, adjusting <flops> toward that APR in steps would make sense to me.
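(A made-up example of that effect: if a single MB task alone finishes in 20 minutes on the GPU, two running together each take roughly 40 minutes of elapsed time, so the per-task rate the server calculates is close to half, even though total throughput is unchanged.)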

Your i7 2600 with 2 ATI 5870 system has a Whetstone benchmark around 3.27e09 and APRs around 21e09 for MB CPU, 68e09 for MB GPU, 42e09 for AP CPU, and 200e09 for AP GPU. For that host, I'd set the <flops> in steps over several days, something like:

          Day1     Day2     Day3     Day4     Day5
MB CPU    6e09     12e09    21e09    APR
MB GPU    8e09     20e09    50e09    APR
AP CPU    8e09     16e09    32e09    APR
AP GPU    9e09     24e09    64e09    170e09   APR

That's assuming you won't have much more than a one-day cache; the idea is to have tasks which were scaled for a particular flops value be completed before you increase the flops too much.
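For anyone unsure where the value lives, here is a minimal sketch of one <app_version> block in an anonymous platform app_info.xml (the app and file names are only placeholders, not the exact Lunatics names; keep your existing blocks and just add or adjust the <flops> line):

    <app_version>
        <app_name>astropulse_v505</app_name>        <!-- placeholder app name -->
        <version_num>505</version_num>
        <flops>9.0e9</flops>                        <!-- Day1 AP GPU value; raise it in steps -->
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>0.05</max_ncpus>
        <coproc>
            <type>ATI</type>
            <count>1</count>                        <!-- 0.5 here would run 2 tasks per GPU -->
        </coproc>
        <file_ref>
            <file_name>ap_ati.exe</file_name>       <!-- placeholder executable name -->
            <main_program/>
        </file_ref>
    </app_version>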

{edit} As a final note, it might be wise to boost the <rsc_fpops_bound> values to avoid possible -177 execution time errors during the adjustment period, either by editing client_state.xml directly or by using Fred's Boinc Rescheduler.
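For the bound change, each task has a <workunit> entry in client_state.xml that looks roughly like the sketch below (the task name is invented for illustration). With BOINC shut down, boosting <rsc_fpops_bound> by, say, a factor of 10 gives plenty of headroom:

    <workunit>
        <name>ap_01ap11aa_12345.wu</name>           <!-- invented task name -->
        <app_name>astropulse_v505</app_name>        <!-- placeholder app name -->
        <rsc_fpops_est>42000000000000.000000</rsc_fpops_est>
        <rsc_fpops_bound>420000000000000.000000</rsc_fpops_bound>   <!-- raise this value -->
    </workunit>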
                                                               Joe
ID: 1164804
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1164807 - Posted: 23 Oct 2011, 19:46:02 UTC - in response to Message 1164804.  
Last modified: 23 Oct 2011, 19:53:39 UTC

Josef W. Segur wrote:
---[SNIPPED]---


Thanks very much, I'll save your info and try this.
(And it isn't necessary to set a large cache if the workflow is just downloading a few MBs without unnecessary delays, computing them and returning them, IMHO :))
ID: 1164807
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1164815 - Posted: 23 Oct 2011, 20:24:41 UTC

<to_flop_or_not_to_flop>0</to_flop_or_not_to_flop>

Meeeowwwwwwr. That is the question.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1164815
Kevin Olley

Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1164841 - Posted: 23 Oct 2011, 23:09:13 UTC - in response to Message 1164815.  

kittyman wrote:
<to_flop_or_not_to_flop>0</to_flop_or_not_to_flop>

Meeeowwwwwwr. That is the question.



I am hoping I don't have to. With the limits in place this machine is managing OK; if I can get the right mix of APs and MBs on the CPU, I will get enough GPU WUs to keep it happy.

When the limits come off, the worst problem I can see is getting flooded with APs (highly unlikely) or VLARs, but with careful use of the Rescheduler I can hopefully avoid that.



Kevin


ID: 1164841
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1170749 - Posted: 13 Nov 2011, 15:55:53 UTC - in response to Message 1170747.  

OK, I finally gave in on my Q8200 with an ATI 4850 GPU. The estimated times for AP on the GPU were wildly unrealistic: it finishes 2 APs in about 8 hours, but the estimates, which never really came down even after hundreds of finished APs on the GPU, were 28-35 hours per AP. The estimate went down when an AP finished on the GPU, but as soon as an AP finished on the CPU, it went back to more than 3 times the real value again.

So I gave in and have flopped all apps on that computer, and now the estimates look just fine for all newly downloaded WUs of all types.

With all the talk of taking <flops> out because it "wasn't needed", I left it in there. It doesn't hurt anything if it isn't needed, but if it is needed then, bang, it is already there. If it ain't fixed, don't broke eet.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1170749
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1170786 - Posted: 13 Nov 2011, 17:45:19 UTC

Yes, just at the moment <flops> will be needed by anyone who wants to maintain consistency of estimated runtimes while running two or more SETI applications. It's not needed if you always run a single application, but if you run both CPU and GPU, or both AP and MB, or any such mixture, you'll get erratic runtime estimates without <flops>.

This is because of the bodged server fix to the error -177 problem. But as WinterKnight pointed out, it is now 3 months since that little contretemps, and neither BOINC nor SETI are showing much urgency about fixing it. So you may as well leave <flops> in there, as a workaround for the workaround....
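(A made-up illustration of why one shared DCF misbehaves: suppose the server's raw estimate for a task is 1 hour on either resource, but the GPU actually finishes its tasks in 10 minutes. GPU completions slowly drag the single host-wide DCF down towards 0.17, shrinking every estimate, CPU tasks included; then a CPU task takes its full hour, DCF jumps straight back up - the client raises DCF quickly but lowers it slowly - and every GPU estimate balloons again. A per-application <flops> makes the raw estimates right in the first place, so DCF has nothing to swing about.)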
ID: 1170786
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1170809 - Posted: 13 Nov 2011, 18:59:06 UTC - in response to Message 1170791.  

That makes sense. Because of the way the botched workaround was applied, it made a fast machine look like a slow machine - or more specifically, the fast bits of a fast GPU look like a slow CPU. So your client is saying "I'm fast", and the server is saying (falsely) "no you're not, you're slow". That's when the estimates go crazy.

But for a slow machine, both the client and the server agree that it's slow, and they get along a lot better ;-)
ID: 1170809
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65758
Credit: 55,293,173
RAC: 49
United States
Message 1170831 - Posted: 13 Nov 2011, 21:02:16 UTC - in response to Message 1170809.  
Last modified: 13 Nov 2011, 21:03:21 UTC

Richard Haselgrove wrote:
---[SNIPPED]---
So your client is saying "I'm fast", and the server is saying (falsely) "no you're not, you're slow". That's when the estimates go crazy.

Sounds like the DCF is swinging up and then down; the PC says one way, the server says go the other way. That is nuts.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1170831
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1170834 - Posted: 13 Nov 2011, 21:23:53 UTC - in response to Message 1170831.  

Richard Haselgrove wrote:
---[SNIPPED]---

zoom3+1=4 wrote:
Sounds like the DCF is swinging up and then down; the PC says one way, the server says go the other way. That is nuts.

Exactly. That is why, each week, we ask them to take the next step back towards smooth running.

Without success, so far.
ID: 1170834
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65758
Credit: 55,293,173
RAC: 49
United States
Message 1170840 - Posted: 13 Nov 2011, 21:34:30 UTC - in response to Message 1170834.  
Last modified: 13 Nov 2011, 21:35:54 UTC

Richard Haselgrove wrote:
---[SNIPPED]---
Exactly. That is why, each week, we ask them to take the next step back towards smooth running.

Without success, so far.

So that's it. Thanks Richard, yer a real Lion Heart. :D

Keep asking; meanwhile I'll just flop around it.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1170840