OpenCL NV MultiBeam v8 SoG edition for Windows
Author | Message |
---|---|
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
SoG up and running with the `<plan_class>opencl_nvidia_SoG</plan_class>`. Let's burn a GTX980 :-) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14474 Credit: 200,643,578 RAC: 874 |
"Tut are you getting work for that OpenCL SOG?" Plan Class names used under Anonymous Platform don't have to match the plan classes used for stock distributions - I've made up plan classes including my initials and the word 'test' before now, and they worked just fine. But they should include the keyword for the type of scheduling anticipated - OpenCL in this case (for BOINC versions >= 7.0.40). All mine did, so I can't speak for what happens if you leave it out. It'll be in a (debug) log if you fall foul of something, and need to look it up. This is the other way round, but error messages might look something like this: `11/15/2012 8:53:52 AM \| \| App version needs opencl but GPU doesn't support it` |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
Very high CPU usage for WUs other than high ARs. Almost a full core, for ARs other than VHARs, where the CPU usage is only 8-10%. Since the WUs I tested this with on Beta were all above 2.something in AR, the low CPU usage was what surprised me most. However, here on main, with mostly lower ARs, the high CPU usage really shows. Thanks Dog, that we do not get VLARs for GPU here, or even this GTX980 would come to a screeching halt :-) EDIT: But SoG is fast, scarily fast. So I can live with high CPU usage, just dropping a CPU core from CPU crunching. Geeze.... |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
"Very high CPU usage for WUs other than high ARs. Almost a full core, for ARs other than VHARs, where the CPU usage is only 8-10%." The ATi OpenCL build handles VLAR quite easily. Worth trying with OpenCL NV also. That's the disadvantage of beta - a subset of ARs, a subset of devices... Pulses and Triplets are still processed the old way, and syncing uses a lot of CPU as before (again, NV-specific). |
Joined: 17 Feb 01 Posts: 33270 Credit: 79,922,639 RAC: 80 |
"Very high CPU usage for WUs other than high ARs. Almost a full core, for ARs other than VHARs, where the CPU usage is only 8-10%." You can try -use_sleep or -use_sleep_ex 5 to reduce CPU usage. But I suggest using this only when running multiple instances. With each crime and every kindness we birth our future. |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
"Very high CPU usage for WUs other than high ARs. Almost a full core, for ARs other than VHARs, where the CPU usage is only 8-10%." Well, running 3 at a time is indeed multiple instances. However, I'll wait and see if I can live with this, because by using -use_sleep, this app will not be any faster than CUDA50. |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
"because by using -use_sleep, this app will not be any faster than CUDA50." It would be interesting to check this, BTW. Sleep() is implemented mostly in the PulseFind area. VHAR has only a small amount of PulseFind, so the -use_sleep impact there would be quite small, while the CPU savings with midrange AR could be substantial. On the other hand, balancing overall host performance depends on the GPU vs CPU work share. For fast GPUs most of the host RAC should come from the GPU part, and the CPU part could be negligible. |
Joined: 17 Feb 01 Posts: 33270 Credit: 79,922,639 RAC: 80 |
"Very high CPU usage for WUs other than high ARs. Almost a full core, for ARs other than VHARs, where the CPU usage is only 8-10%." That's why I suggested -use_sleep_ex 5. It shouldn't be much slower running 3 instances, but it reduces CPU usage at least a little bit. Running benches atm. |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
Thanks for the suggestions Mike, always appreciated. I'll keep it in mind, if I get really bothered about the CPU usage at some ARs. |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
Also would be interesting to check how it responds to -cpu_lock. OpenCL NV is quite uncharted territory, and what we know from the ATi side is not always directly applicable here. |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
"Also would be interesting to check how it responds to -cpu_lock." You want me to add -cpu_lock to the command line? |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
"Also would be interesting to check how it responds to -cpu_lock." Just as part of exploring the app's parameter space, later, once you have established a baseline impression of how it behaves on different ARs. A baseline is required to have something to compare with. Then things like -use_sleep and/or -cpu_lock and -sbs N variations can be tested. |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
"Also would be interesting to check how it responds to -cpu_lock." OK, I'll keep it running as it is now. |
Joined: 27 May 99 Posts: 5516 Credit: 528,817,460 RAC: 242 |
Definitely using a lot more CPU than Beta, also seems to be taking longer to process. |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
"Definitely using a lot more CPU than Beta, also seems to be taking longer to process." Yeah, but then the WUs we got on Beta were consistently over 2 in AR. Not one was a "normal" AR. Here, we see all kinds of ARs. Just now I'm crunching a bunch with an extreme AR of over 51, yes 51 :-) Those behave strangely on the app: no high CPU usage, but the progress is iffy to say the least (in the BOINC Manager, the progress % doesn't even move until it suddenly jumps to 100% for these extreme WUs, but the progress indicator in BoincTasks works for these too), but they're done in 9-10 minutes, running 3 at a time. Too high to be fast, or something. Example of one of those "crazy" WUs: WU true angle range is : 51.249186 |
Joined: 27 May 99 Posts: 5516 Credit: 528,817,460 RAC: 242 |
Well, had a chance to look at some of these processed. They are now slower than CUDA here on main. Also seeing unusually high kernel usage. Within the last 20% of the analysis, kernel activity spikes and all CPUs go to 100%. I had been using a command line but removed it when it appeared to be actually hampering the work, so now it's just running stock, 3 at a time. |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
"Well, had a chance to look at some of these processed. They are now slower than CUDA here on main. Also seeing unusually high kernel usage. Within the last 20% of the analysis, kernel activity spikes and all CPUs go to 100%. I had been using a command line but removed it when it appeared to be actually hampering the work, so now it's just running stock, 3 at a time." Well, YMMV of course. I cannot say that this app is slower than CUDA; on the contrary, for me it is much faster than CUDA50. But we'll see. I'll let it run as it does for now. I know what my production rate on CUDA looked like (per day), so I will pretty quickly be able to compare with this app. |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
Further comments: It's quite clear now that the ARs of the WUs we use to test new apps on Beta are not representative of the mix of different ARs we will meet here on main. There's a need for a better mix of ARs on Beta, that's for sure. The results we get on Beta for the new apps are in no way the results we will get for those apps here on main. That much is totally clear by now. Anyhow, I'll continue with the SoG until I can say whether it crunches V8 MBs faster or slower than CUDA50 on my GTX980 Strix in the long run. So far it's unclear, mostly because the SoG doesn't react as well to "normal" ARs and lower ARs as CUDA50 does. ARs from 2.xx up to (so far unknown) values are where SoG really shines bright. And that ends the comment from the Swedish jury, for now. :-) |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
"Well, had a chance to look at some of these processed. They are now slower than CUDA here on main. Also seeing unusually high kernel usage. Within the last 20% of the analysis, kernel activity spikes and all CPUs go to 100%. I had been using a command line but removed it when it appeared to be actually hampering the work, so now it's just running stock, 3 at a time." Could you provide links to comparison pairs, please? |
Joined: 9 Apr 04 Posts: 8634 Credit: 2,930,782 RAC: 1 |
I have installed a GeForce GTX 750 on my Windows 10 PC, reinstalled the Lunatics package, and am now crunching SETI@home tasks. In the stderr.txt I see that the nVidia driver is 353.54. Is this OK? I did nothing to install drivers, Windows 10 did all the work. Tullio |
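
For anyone wanting to put numbers on the "production rate per day" comparison mentioned in the thread, here is a rough Python sketch of the arithmetic. It only uses the figures quoted above (extreme-AR tasks done in 9-10 minutes, running 3 at a time). The function names are made up for illustration, and the sketch assumes every task in a batch takes the same wall-clock time, which a real mix of ARs will not do.

```python
MINUTES_PER_DAY = 24 * 60


def effective_minutes_per_task(minutes_per_batch: float, concurrent: int) -> float:
    """Wall-clock minutes per finished task when `concurrent` tasks share the GPU."""
    if minutes_per_batch <= 0 or concurrent <= 0:
        raise ValueError("minutes_per_batch and concurrent must be positive")
    return minutes_per_batch / concurrent


def tasks_per_day(minutes_per_batch: float, concurrent: int) -> float:
    """Tasks finished per day, assuming every batch takes the same time (a simplification)."""
    return MINUTES_PER_DAY / effective_minutes_per_task(minutes_per_batch, concurrent)


if __name__ == "__main__":
    # Figures from the thread: 9-10 minutes per task, 3 tasks at a time.
    for minutes in (9, 10):
        print(f"{minutes} min x 3 at a time: "
              f"{effective_minutes_per_task(minutes, 3):.2f} min/task, "
              f"{tasks_per_day(minutes, 3):.0f} tasks/day")
```

The same two functions can be fed the per-task times of any other app (CUDA50, say) running the same number of instances. The comparison is only meaningful when both apps see a similar mix of ARs, which is exactly the caveat raised about Beta versus main.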