Windows and Nvidia video cards
Author | Message |
---|---|
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
I am starting this thread to create a place for all the Setizens who are not running Linux and/or a zillion gpus (more than 4, actually). Based on a quick skim of the Leaderboard, the majority of Seti crunchers are running Windows (XP through 10) and some version of an Nvidia video card. This thread is focused on these users. If we need a Windows/AMD thread, please let me know or start it yourself. Thank you. ================================================================================= I have two machines that are Windows-based and likely to stay that way. http://setiathome.berkeley.edu/show_host_detail.php?hostid=8281049 This one will never stop running Windows. Basically, the only way to get the lovely graphics in the screen saver is on this setup. The only hardware change I am contemplating is using part of my large supply of gtx 1060 3GB video cards to upgrade it again. Used gtx 1060 3GB cards on eBay are getting as low as $100 USD, which makes them hard to resist. This machine runs multiple BOINC projects. http://setiathome.berkeley.edu/show_host_detail.php?hostid=8671627 This machine is an AMD Ryzen 5 2400G where I am running both the internal gpu and a gtx 750Ti. Sometimes I think the internal gpu is roughly 50% as fast as a gtx 750Ti; other times it seems to be more like 25%. I have tinkered a bit with the Vega 11 command line but haven't really been able to get it to "run" very fast. I am also now running it with slower (and cheaper) ram which, as far as I know, slows the iGPU/Vega down. I am running about 75% of the available threads, which should free up enough memory bus for the iGPU to be running full tilt. But I need to review my BIOS settings and make sure the ram is running at full XMP profile speed and the iGPU is set to about 1500 MHz. HTH, Tom A proud member of the OFA (Old Farts Association). |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Here is a very good command line for a Windows gtx 1060 3GB video card: `-sbs 1024 -period_iterations_num 10 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64` 1060 3GB cards are basically single gpu task cards. I got the above command line from this thread: https://setiathome.berkeley.edu/forum_thread.php?id=81516&postid=1870592#1870592 Thank you Wiggo https://setiathome.berkeley.edu/show_user.php?userid=3450 I have gotten an average processing time under Windows of around 7 minutes. This is using the SOG gpu task. And the average has gone higher than that when the data mix has changed. HTH, Tom A proud member of the OFA (Old Farts Association). |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
I have been getting near 10,000 RAC on smallish systems with a Gtx 750ti. I managed around 7,000+ RAC with an obsolete 2 core system and a gtx 750Ti. I sent it to a Setizen friend. It now lives here: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8671579 Tom A proud member of the OFA (Old Farts Association). |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
10,000 for a 750Ti + 4 core CPU is pretty much standard using Lunatics and tuning. You're welcome. I just got done taking a look at my 2400G and I had no command lines for any of the applications, which might explain why things were even slower than I thought they "oughta" be. So I just dropped the CL for a gtx 1060 3GB into the MB...Nvidia...sog.txt command file based on another Setizen's comment that they were having good luck with it. And it promptly ran a 15 minute task. Who knows, "where the time goes" :) Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Tom I was just looking at your 2700 Win 10 host with the 1060 3GB card. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8671092 and I saw a SoG command line I had never come across before. I see you have defined 6 kernels instead of the standard 2 and that seems to really speed up the crunching. I wonder why that isn't in more use? Simply lack of exposure or commonality? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Bruce Send message Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Keith Myers wrote: "Tom I was just looking at your 2700 Win 10 host with the 1060 3GB card. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8671092" I'm with Keith, never seen more than one or two kernels defined. Why did you choose six, and how do you determine how many kernels are in any particular card? Bruce |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Keith Myers wrote: "Tom I was just looking at your 2700 Win 10 host with the 1060 3GB card. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8671092" Bruce wrote: "I'm with Keith, never seen more than one or two kernels defined." It helps if you read Raistmer's dissertation on what the tuning commands do. http://lunatics.kwsn.info/index.php/topic,1808.msg61251/topicseen.html?PHPSESSID=6qginlckdn2a5jq0g5rc1kt550#new Pay attention to his first post in the thread about -period_iterations_num N and how that impacts the number of kernel arrays you set up. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
arkayn Send message Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
Keith Myers wrote: "Tom I was just looking at your 2700 Win 10 host with the 1060 3GB card. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8671092" Bruce wrote: "I'm with Keith, never seen more than one or two kernels defined." On my 1070s, this seems to have the best speed for them: `-sbs 1024 -period_iterations_num 10 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64` |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Keith Myers wrote: "Tom I was just looking at your 2700 Win 10 host with the 1060 3GB card. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8671092" That is a great question. I dismounted that HD and installed another one to run Linux/Cuda91. I will need to re-mount that HD to see exactly what I did. As for why, I am not sure. I usually follow the advice I have been given by people like you or Wiggo. Let me see now... Tom A proud member of the OFA (Old Farts Association). |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Keith Myers wrote: "Tom I was just looking at your 2700 Win 10 host with the 1060 3GB card. https://setiathome.berkeley.edu/show_host_detail.php?hostid=8671092" The app_config.xml I was using is this: `<app_config> <app> <name>setiathome_v8</name> <gpu_versions> <gpu_usage>1.0</gpu_usage> <cpu_usage>1.0</cpu_usage> </gpu_versions> </app> <app> <name>astropulse_v7</name> <gpu_versions> <gpu_usage>0.50</gpu_usage> <cpu_usage>2.0</cpu_usage> </gpu_versions> </app> </app_config> <project_max_concurrent>36</project_max_concurrent>` The MB command line was this: `-sbs 1024 -period_iterations_num 10 -spike_fft_thresh 4096 -tune 1 64 1 4 -tune 2 64 1 4 -tune 3 64 1 4 -tune 4 64 1 4 -tune 5 64 1 4 -tune 6 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64` Now, the reason I put in all those "-tune 1...." entries is that I THINK -tune 1, -tune 2 referenced different video cards: Card 1, Card 2, etc. I set that up because I was, once again, trying to get 3 or 4 gpus to run on "that box". I remain clueless, otherwise. Heck, I didn't even realize I was getting a 50% improvement. -edit- I went and looked at the SOG tasks for the gtx 1060 3G. And while there are some really low time numbers, there are also some 10 minute tasks. I remember running pretty reliably in the 7-8 minute range years ago. I have another box that I could drop a gtx 1060 3GB into and play with the command line after I establish a baseline. ---edit--- The AP command line file appears to be empty. Tom A proud member of the OFA (Old Farts Association). |
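
Editor's note: in the app_config.xml quoted above, `<project_max_concurrent>` sits after the closing `</app_config>` tag. BOINC's documented layout has it as a direct child of `<app_config>`, so a file pasted exactly as shown may not be read the way the poster intended. Below is a minimal Python sketch, assuming the same app names and values as the post, that generates a well-formed version. The helper name `build_app_config` is hypothetical and not part of BOINC.

```python
import xml.etree.ElementTree as ET


def build_app_config(apps, project_max_concurrent=None):
    """Build an app_config.xml tree (hypothetical helper, not a BOINC API).

    apps: list of (app_name, gpu_usage, cpu_usage) string tuples.
    """
    root = ET.Element("app_config")
    for name, gpu_usage, cpu_usage in apps:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        gpu_versions = ET.SubElement(app, "gpu_versions")
        ET.SubElement(gpu_versions, "gpu_usage").text = gpu_usage
        ET.SubElement(gpu_versions, "cpu_usage").text = cpu_usage
    if project_max_concurrent is not None:
        # Direct child of <app_config>, not placed after its closing tag.
        child = ET.SubElement(root, "project_max_concurrent")
        child.text = str(project_max_concurrent)
    if hasattr(ET, "indent"):  # pretty-printing is available on Python 3.9+
        ET.indent(root)
    return root


# Values copied from the post above.
root = build_app_config(
    [("setiathome_v8", "1.0", "1.0"), ("astropulse_v7", "0.50", "2.0")],
    project_max_concurrent=36,
)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Generating the file from a script rather than hand-editing it keeps every element inside the single root, which is easy to get wrong when pasting fragments from forum posts.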
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Looking at your finished tasks, it really seems to help the standard AR Arecibo tasks. Not so much the BLC tasks or the Arecibo VLAR tasks, in fact may be hurting them. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Ok, I got a gtx 1060 3GB (Zotac Mini card) tucked into my oldest, continuously running Intel box. It's a Dell Optiplex 7010 Mini Tower that won't take a full length card. Some time ago I upgraded to a 400 watt psu, and when I tested it with a gtx 1060 3GB it didn't even come close to drawing too much power. It's here: https://setiathome.berkeley.edu/show_host_detail.php?hostid=8281049 The 1060 has started off taking 12 minutes, so either the data is slower to crunch or something else is going on, because it used to take 7-9 minutes. Once I have a little more baseline for it, I will be perfectly happy to try out the "accidental" MB command line I created and see what happens. As a last resort I can re-install the other Windows 10 HD back in the AMD 2700 box and we can experiment with it. Tom A proud member of the OFA (Old Farts Association). |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Keith Myers wrote: "Looking at your finished tasks, it really seems to help the standard AR Arecibo tasks. Not so much the BLC tasks or the Arecibo VLAR tasks, in fact may be hurting them." It might be that some kind of intermediate number of those "thingies" will still speed up life and not slow down much on the others. Tom A proud member of the OFA (Old Farts Association). |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
Tom M wrote: "Here is a very good command line for a Windows gtx 1060 3GB video card." After taking a look at some of the MB documentation in the project directory, I remembered a couple of changes I used to use: `-sbs 192 -period_iterations_num 10 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -tt 1500` The -tt controls how long the time slice is for the task. And it seems like "it" and the docs think -sbs 192 is a better choice for an x60 card, so I will take that up for my gtx 1060 3GB too. It already looks like the "average" processing time is coming back towards 7-8 minutes, so maybe that was what I was missing. Tom A proud member of the OFA (Old Farts Association). |
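
Editor's note: the command lines in this thread differ by only a flag or two, which is hard to spot in a long one-line string. The following is a rough Python sketch, not part of the SETI@home apps, that splits two command lines into flag/value groups and reports what changed. The function names `parse_cmdline` and `diff_cmdlines` are made up for illustration. It does plain string handling only and makes no claim about what each flag does.

```python
def parse_cmdline(cmdline):
    """Group a flat option string into (flag, [values]) pairs, in order.

    A token counts as a flag if it starts with '-' and is not a negative number.
    Repeated flags (for example several -tune entries) are kept separately.
    """
    options = []
    for token in cmdline.split():
        is_number = token[1:].replace(".", "", 1).isdigit()
        if token.startswith("-") and not is_number:
            options.append((token, []))
        elif options:
            options[-1][1].append(token)
        else:
            raise ValueError(f"value {token!r} appears before any flag")
    return options


def diff_cmdlines(old, new):
    """Return (removed, added) flag groups between two command lines."""
    old_opts = {(flag, tuple(values)) for flag, values in parse_cmdline(old)}
    new_opts = {(flag, tuple(values)) for flag, values in parse_cmdline(new)}
    return sorted(old_opts - new_opts), sorted(new_opts - old_opts)


# The two gtx 1060 3GB command lines posted in this thread.
OLD = ("-sbs 1024 -period_iterations_num 10 -spike_fft_thresh 4096 -tune 1 64 1 4 "
       "-oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 "
       "-oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64")
NEW = ("-sbs 192 -period_iterations_num 10 -spike_fft_thresh 4096 -tune 1 64 1 4 "
       "-oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 "
       "-oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -tt 1500")

removed, added = diff_cmdlines(OLD, NEW)
print("removed:", removed)
print("added:  ", added)
```

Changing one group at a time and recording run times against each variant makes it easier to tell which flag actually moved the average.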
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
That is interesting. Once I added the -tt 1500 to the command line of the box I have that is running a Gtx 750Ti the wall clock time went from pretty much 20 minutes to as low as 7 minutes. If I am reading it right, the amount of cpu time hasn't changed. Just the wallclock time. So it may improve the overall average of the processing times for the gtx 750 Ti. Tom A proud member of the OFA (Old Farts Association). |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Tom M wrote: "That is interesting. Once I added the -tt 1500 to the command line of the box I have that is running a Gtx 750Ti the wall clock time went from pretty much 20 minutes to as low as 7 minutes. If I am reading it right, the amount of cpu time hasn't changed. Just the wallclock time." From Raistmer's explanation: "Since summer PulseFind algorithm had been greatly improved in part of work splitting between separate kernel calls." Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
So, if you want to change default behavior, you need to use both -period_iterations_num N and -tt n options. I wouldn't apply that as a blanket statement. Even for mid-range cards, tuning more aggressively than the default parameters yields improvements. It is just more likely that you will suffer lags if you push too hard. The more time you can spend in the kernel, the faster the task will crunch. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
In another thread a Setizen ran across an all too common issue. Sometimes Windows 10 will update your Nvidia driver. Often/usually this means you lose something called the "OpenCL" driver, which is necessary to run the SOG gpu task. A fix is to download a late model Nvidia driver from the Nvidia website and do a custom install with the "delete all previous versions" option toggled on. Then re-boot and start the Boinc Manager. Tom A proud member of the OFA (Old Farts Association). |
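
Editor's note: a quick way to see whether the operating system can still locate an OpenCL loader library after a driver update is sketched below in Python. This is only a rough first check, under the assumption that the loader is on the normal library search path. Finding the loader does not prove that the Nvidia OpenCL component is registered, and the function name `find_opencl_loader` is made up for illustration.

```python
import ctypes.util


def find_opencl_loader():
    """Return the OpenCL loader library name/path if the OS can locate it, else None."""
    # find_library searches the platform's usual locations
    # (directories on PATH on Windows, the linker cache on Linux).
    return ctypes.util.find_library("OpenCL")


loader = find_opencl_loader()
if loader is None:
    print("No OpenCL loader library found on the search path.")
else:
    print(f"OpenCL loader library found: {loader}")
```

If nothing is found, the driver reinstall described above is the step to try; the BOINC event log at startup is the more authoritative place to confirm whether an OpenCL-capable GPU was detected.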
W-K 666 Send message Joined: 18 May 99 Posts: 19099 Credit: 40,757,560 RAC: 67 |
In win 10 you can disable automatic driver updates. I used the instructions at https://www.windowscentral.com/how-disable-automatic-driver-updates-windows-10 |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
There has been a rash of posts about the SoG application not running because there is no OpenCL component on the system. I wonder if it has to do with the apparent two kinds of Nvidia drivers that can be downloaded now. There is the Standard version and the new DCH version, and they are not compatible with each other. Windows Update is installing the DCH version, and you can't then install the Standard version over the top of the Windows Update version. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |