GPU in the Lunatics' apps

Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1164804 - Posted: 23 Oct 2011, 19:33:57 UTC - in response to Message 1164748.  
Last modified: 23 Oct 2011, 19:40:24 UTC

Fred J. Verster wrote:
---[SNIPPED]---

Those of you who refuse to "fiddle with" the value are condemning your hosts to sending a flops value usually far less than the GPU is actually producing. That's your choice; the system will probably supply enough GPU work to keep them busy unless there's an unexpected outage. My preference is to give a reasonable approximation of what an application will do so the system works as intended; I never expect BOINC to be able to rescind the old GIGO principle.


Joe is right, and I certainly will change this. Stability is also more important than excitement.
This (ATI) host is already in constant short supply of work, which can be a result of the too-low <flops> estimate, if I understand this correctly.
If I were to set a <flops> entry of 3.0e+9, would the server/scheduler respond more realistically?
Or I'll change it to 1 MB task per GPU; I already switched to 1 AstroPulse task per GPU, because too many times I see 1 WU running on 0.5 GPU because there aren't enough.

For anonymous platform hosts with any significant amount of cached work, making an abrupt large adjustment of flops isn't a good idea. The APRs for GPU work are based on runtimes for each task, so are affected strongly by how many tasks you run simultaneously; changing from 1 to 2 tasks or vice versa would nearly double or halve the expected rate. But for any application where the APR has had time to stabilize, adjusting <flops> toward that APR in steps would make sense to me.
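(A made-up example of that effect: if a single MB task alone finishes in 20 minutes on the GPU, two running together each take roughly 40 minutes of elapsed time, so the per-task rate the server calculates is close to half, even though total throughput is unchanged.)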

Your i7 2600 with 2 ATI 5870 system has a Whetstone benchmark around 3.27e09 and APRs around 21e09 for MB CPU, 68e09 for MB GPU, 42e09 for AP CPU, and 200e09 for AP GPU. For that host, I'd set the <flops> in steps over several days, something like:

          Day1     Day2     Day3     Day4     Day5
MB CPU    6e09     12e09    21e09    APR
MB GPU    8e09     20e09    50e09    APR
AP CPU    8e09     16e09    32e09    APR
AP GPU    9e09     24e09    64e09    170e09   APR

That's assuming you won't have much more than a one-day cache; the idea is to have tasks which were scaled for a particular flops value be completed before you increase the flops too much.
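For anyone unsure where the value lives, here is a minimal sketch of one <app_version> block in an anonymous platform app_info.xml (the app and file names are only placeholders, not the exact Lunatics names; keep your existing blocks and just add or adjust the <flops> line):

    <app_version>
        <app_name>astropulse_v505</app_name>        <!-- placeholder app name -->
        <version_num>505</version_num>
        <flops>9.0e9</flops>                        <!-- Day1 AP GPU value; raise it in steps -->
        <avg_ncpus>0.05</avg_ncpus>
        <max_ncpus>0.05</max_ncpus>
        <coproc>
            <type>ATI</type>
            <count>1</count>                        <!-- 0.5 here would run 2 tasks per GPU -->
        </coproc>
        <file_ref>
            <file_name>ap_ati.exe</file_name>       <!-- placeholder executable name -->
            <main_program/>
        </file_ref>
    </app_version>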

{edit} As a final note, it might be wise to boost the <rsc_fpops_bound> values to avoid possible -177 execution time errors during the adjustment period, either by editing client_state.xml directly or by using Fred's Boinc Rescheduler.
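For the bound change, each task has a <workunit> entry in client_state.xml that looks roughly like the sketch below (the task name is invented for illustration). With BOINC shut down, boosting <rsc_fpops_bound> by, say, a factor of 10 gives plenty of headroom:

    <workunit>
        <name>ap_01ap11aa_12345.wu</name>           <!-- invented task name -->
        <app_name>astropulse_v505</app_name>        <!-- placeholder app name -->
        <rsc_fpops_est>42000000000000.000000</rsc_fpops_est>
        <rsc_fpops_bound>420000000000000.000000</rsc_fpops_bound>   <!-- raise this value -->
    </workunit>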
                                                               Joe
ID: 1164804
Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1164807 - Posted: 23 Oct 2011, 19:46:02 UTC - in response to Message 1164804.  
Last modified: 23 Oct 2011, 19:53:39 UTC

Josef W. Segur wrote:
---[SNIPPED]---


Thanks very much, I'll save your info and try this.
(And it isn't necessary to set a large cache if the workflow is just downloading a few MBs without unnecessary delays, computing them and returning them, IMHO :))
ID: 1164807
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1164815 - Posted: 23 Oct 2011, 20:24:41 UTC

<to_flop_or_not_to_flop>0</to_flop_or_not_to_flop>

Meeeowwwwwwr. That is the question.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1164815
Kevin Olley

Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1164841 - Posted: 23 Oct 2011, 23:09:13 UTC - in response to Message 1164815.  

kittyman wrote:
<to_flop_or_not_to_flop>0</to_flop_or_not_to_flop>

Meeeowwwwwwr. That is the question.



I am hoping I don't have to. With the limits in place this machine is managing OK; if I can get the right mix of APs and MBs on the CPU, I will get enough GPU WUs to keep it happy.

When the limits come off, the worst problem I can see is getting flooded with APs (highly unlikely) or VLARs, but with careful use of the Rescheduler I can hopefully avoid that.



Kevin


ID: 1164841
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1170749 - Posted: 13 Nov 2011, 15:55:53 UTC - in response to Message 1170747.  

OK, I finally gave in on my Q8200 with an ATI 4850 GPU. The estimated times for AP on the GPU were wildly unrealistic: it finishes 2 APs in about 8 hours, but the estimates, which never really came down even after hundreds of finished APs on the GPU, were 28-35 hours per AP. The estimate went down when an AP finished on the GPU, but as soon as an AP finished on the CPU, it went back to more than 3 times the real value again.

So I gave in and have flopped all apps on that computer, and now the estimates look just fine for all newly downloaded WUs of all types.

With all the talk of taking <flops> out because it "wasn't needed", I left it in there. It doesn't hurt anything if it isn't needed, but if it is needed then, bang, it is already there. If it ain't fixed, don't broke eet.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1170749
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1170786 - Posted: 13 Nov 2011, 17:45:19 UTC

Yes, just at the moment <flops> will be needed by anyone who wants to maintain consistency of estimated runtimes while running two or more SETI applications. It's not needed if you always run a single application, but if you run both CPU and GPU, or both AP and MB, or any such mixture, you'll get erratic runtime estimates without <flops>.

This is because of the bodged server fix to the error -177 problem. But as WinterKnight pointed out, it is now 3 months since that little contretemps, and neither BOINC nor SETI are showing much urgency about fixing it. So you may as well leave <flops> in there, as a workaround for the workaround....
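(A made-up illustration of why one shared DCF misbehaves: suppose the server's raw estimate for a task is 1 hour on either resource, but the GPU actually finishes its tasks in 10 minutes. GPU completions slowly drag the single host-wide DCF down towards 0.17, shrinking every estimate, CPU tasks included; then a CPU task takes its full hour, DCF jumps straight back up - the client raises DCF quickly but lowers it slowly - and every GPU estimate balloons again. A per-application <flops> makes the raw estimates right in the first place, so DCF has nothing to swing about.)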
ID: 1170786
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1170809 - Posted: 13 Nov 2011, 18:59:06 UTC - in response to Message 1170791.  

That makes sense. Because of the way the botched workaround was applied, it made a fast machine look like a slow machine - or more specifically, the fast bits of a fast GPU look like a slow CPU. So your client is saying "I'm fast", and the server is saying (falsely) "no you're not, you're slow". That's when the estimates go crazy.

But for a slow machine, both the client and the server agree that it's slow, and they get along a lot better ;-)
ID: 1170809
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65758
Credit: 55,293,173
RAC: 49
United States
Message 1170831 - Posted: 13 Nov 2011, 21:02:16 UTC - in response to Message 1170809.  
Last modified: 13 Nov 2011, 21:03:21 UTC

Richard Haselgrove wrote:
---[SNIPPED]---
So your client is saying "I'm fast", and the server is saying (falsely) "no you're not, you're slow". That's when the estimates go crazy.

Sounds like the DCF is swinging up and then down; the PC says one way, the server says go the other way. That is nuts.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1170831
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1170834 - Posted: 13 Nov 2011, 21:23:53 UTC - in response to Message 1170831.  

Richard Haselgrove wrote:
---[SNIPPED]---

zoom3+1=4 wrote:
Sounds like the DCF is swinging up and then down; the PC says one way, the server says go the other way. That is nuts.

Exactly. That is why, each week, we ask them to take the next step back towards smooth running.

Without success, so far.
ID: 1170834
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65758
Credit: 55,293,173
RAC: 49
United States
Message 1170840 - Posted: 13 Nov 2011, 21:34:30 UTC - in response to Message 1170834.  
Last modified: 13 Nov 2011, 21:35:54 UTC

Richard Haselgrove wrote:
---[SNIPPED]---
Exactly. That is why, each week, we ask them to take the next step back towards smooth running.

Without success, so far.

So that's it. Thanks Richard, yer a real Lion Heart. :D

Keep asking; meanwhile I'll just flop around it.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1170840