Message boards : Number crunching : Running SETI@home on an nVidia Fermi GPU
Helli · Joined: 15 Dec 99 · Posts: 707 · Credit: 108,785,585 · RAC: 0

Only for Fermi. If you try this on a non-Fermi card it will raise your calculation time by about 4x (if you run 2 WUs on one GPU). Edit: Found it: http://setiathome.berkeley.edu/forum_thread.php?id=60592&nowrap=true#1011252

Helli

A loooong time ago: First Credits after SETI@home Restart
hbomber · Joined: 2 May 01 · Posts: 437 · Credit: 50,852,854 · RAC: 0

My observations show that the time saved by running three tasks on Fermi comes mostly from the CPU phase at the beginning of each unit's calculation. When three tasks are running, the GPU never sits idle, as it does for those 12 seconds (on my system) while a single unit is being prepared. There is some gain from fuller GPU utilization as well, but only the 4-5% difference between 95% utilization with one unit and 99% with three. This still doesn't explain why two units show no significant speed increase; perhaps my tests were too short. If two units start at the same time, the CPU preparation time is still wasted with the GPU sitting idle, and the chance of overlapping the preparation phase with other work is lower with fewer tasks. All of this is especially welcome for VHAR units (taking around two minutes here), where the CPU phase is relatively long compared with the whole crunch time of a unit.
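hbomber's overlap argument can be sketched with a toy throughput model. The 12 s prep and 120 s GPU phase below are illustrative placeholders (only the 12 s figure comes from his post), and the "fully overlapped" case is an idealized upper bound:

```python
# Toy model: each GPU task starts with a CPU-only preparation phase during
# which the GPU is idle, then a GPU compute phase. Running several tasks
# concurrently lets one task's prep overlap another task's GPU work.

def tasks_per_hour(prep_s, gpu_s, overlapped):
    """Throughput in tasks/hour for one GPU."""
    if overlapped:
        # Ideal case: prep is always hidden behind other tasks' GPU phases,
        # so the GPU is busy gpu_s per task with no idle gaps.
        per_task = gpu_s
    else:
        # One task at a time: the GPU idles through every prep phase.
        per_task = prep_s + gpu_s
    return 3600.0 / per_task

single = tasks_per_hour(12.0, 120.0, overlapped=False)  # GPU idles during prep
triple = tasks_per_hour(12.0, 120.0, overlapped=True)   # prep fully hidden
```

With these placeholder numbers the gain is roughly 10%, and it grows as the GPU phase shrinks relative to the prep phase, which matches the observation that short VHAR units benefit most.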
Terror Australis · Joined: 14 Feb 04 · Posts: 1817 · Credit: 262,693,308 · RAC: 44

This app_info worked for me on a system using a Q6600, GTX470 and WinXP_32. The GTX 470 was later moved to a similar system using an E7200 processor with no problems. It's simple, no AP or anything fancy. You'll have to set your own <flops> values depending on your card and processor; mine were chosen empirically to give correct ETAs with a DCF of 1 in the client_state.xml file. The "ncpus" value of 0.05 works quite well, as the 470 was only using around 2% of CPU time per thread. Note I'm still using the original AK_v8 app, not AK_v8b; you'll have to change that bit if you're using the later file. Best of luck with it.

T.A.

```xml
<app_info>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>AK_v8_win_SSSE3x.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cufft32_30_14.dll</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cudart32_30_14.dll</name>
    <executable/>
  </file_info>
  <file_info>
    <name>libfftw3f-3-1-1a_upx.dll</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <platform>windows_intelx86</platform>
    <flops>60519353881.00</flops>
    <file_ref>
      <file_name>AK_v8_win_SSSE3x.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <platform>windows_intelx86</platform>
    <avg_ncpus>0.05</avg_ncpus>
    <max_ncpus>0.05</max_ncpus>
    <flops>320000000000</flops>
    <plan_class>cuda</plan_class>
    <file_ref>
      <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>cudart32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
      <file_name>cufft32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
      <file_name>libfftw3f-3-1-1a_upx.dll</file_name>
    </file_ref>
    <coproc>
      <type>CUDA</type>
      <count>0.33</count>
    </coproc>
  </app_version>
</app_info>
```
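T.A.'s "correct ETAs with a DCF of 1" method follows from how BOINC estimates task duration, roughly rsc_fpops_est / flops × DCF. A hedged sketch of deriving a <flops> value from an observed runtime, where the fpops estimate and 100 s runtime are hypothetical placeholders:

```python
def flops_for_dcf1(rsc_fpops_est, observed_runtime_s):
    """Pick a <flops> value so that BOINC's duration estimate
    (rsc_fpops_est / flops * DCF) matches reality at DCF = 1."""
    return rsc_fpops_est / observed_runtime_s

# Hypothetical task: estimated at 3.2e13 fpops, finishing in 100 s on the GPU.
flops = flops_for_dcf1(3.2e13, 100.0)   # -> 3.2e11, i.e. 320 GFLOPS
```

Note this only tunes BOINC's estimates and work fetch; it does not change how fast tasks actually run.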
Joined: 4 Apr 01 · Posts: 201 · Credit: 47,158,217 · RAC: 0

How do you configure the flops value if you are running more than one type of Fermi card? I have a 470 and a 480 in the same system. The last thing I read advised against using flops for the Fermi cards; has this now changed, and is it now the way to go?

Thx
Fulvio Cavalli · Joined: 21 May 99 · Posts: 1736 · Credit: 259,180,282 · RAC: 0

> First, shut down BOINC completely and use Task Manager to be sure boinc.exe is no longer running. Then you can use the Lunatics installer and select only the appropriate AK_v8b CPU S@H Enhanced application plus the Astropulse CPU application. After the install finishes, add the Fermi information to app_info.xml. The needed files should still be in place, just check that every filename in the app_info.xml has a matching file. Then you can restart BOINC.
> Joe

Hi guys, there is a lot of very nice info in this thread, but it is so much that I may have missed something, mostly through my lack of language skills :p I'm finally going for Fermi, with 2 GTX 460s ordered to replace the GTX 295 on one of my i7s. Basically, after installing the latest BOINC and Nvidia drivers, is what I have to do what Josef says in this nice tip? Ty all in advance!
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0

I posted that for Rich-E, who had been running stock, including the 6.10 Fermi version. Your computers are hidden, but I guess from your RAC that you are already running optimized, so you ought to look at other posts for advice.

Joe
Fulvio Cavalli · Joined: 21 May 99 · Posts: 1736 · Credit: 259,180,282 · RAC: 0

Yes I am, Josef. I'm not sure what to do, but I guess I just need to add the Fermi information to the app_info.xml file and the proper files to my BOINC folder then?
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0

... Because you're converting from non-Fermi hardware to Fermi hardware, you'll need to ensure that all CUDA work is run using the stock 6.10 application. Assuming there's 6.08 cuda or 6.09 cuda23 work present, you'll need to edit the related app_version sections to use setiathome_6.10_windows_intelx86__cuda_fermi.exe and its associated DLLs. You should end up with an app_info.xml which has neither the old cuda app you were using nor its DLLs. It may have all 3 cuda types, but all must invoke the same processing.

Joe
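As a sketch of what Joe describes (filenames taken from the app_info examples earlier in the thread; the 608 version number and cuda plan class are assumptions about the stock work being redirected, not confirmed values), each old CUDA app_version ends up referencing the same Fermi executable:

```xml
<!-- Old cuda work redirected to the Fermi build -->
<app_version>
  <app_name>setiathome_enhanced</app_name>
  <version_num>608</version_num>
  <platform>windows_intelx86</platform>
  <plan_class>cuda</plan_class>
  <file_ref>
    <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
    <main_program/>
  </file_ref>
  <file_ref>
    <file_name>cudart32_30_14.dll</file_name>
  </file_ref>
  <file_ref>
    <file_name>cufft32_30_14.dll</file_name>
  </file_ref>
  <file_ref>
    <file_name>libfftw3f-3-1-1a_upx.dll</file_name>
  </file_ref>
</app_version>
<!-- Repeat analogously for the cuda23 and cuda_fermi app_version sections,
     so all three plan classes invoke the same Fermi processing. -->
```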
Fulvio Cavalli · Joined: 21 May 99 · Posts: 1736 · Credit: 259,180,282 · RAC: 0

Thank you very much for your help. I will let you know how it goes when the new hardware arrives. The brave old GTX 295 goes to my trusty Q9550 to make a nice 30k rig :D
CHARLES JACKSON · Joined: 17 May 99 · Posts: 49 · Credit: 39,349,563 · RAC: 0

Hi. Does anyone know when the new Lunatics Unified Installer is due out? Hopefully it will fix the GTX 460-480 problems.
Joined: 24 Nov 06 · Posts: 7489 · Credit: 91,093,184 · RAC: 0

> Hi. Does anyone know when the new Lunatics Unified Installer is due out? Hopefully it will fix the GTX 460-480 problems.

Hi Charles. We have no schedule, but we're doing our best to make sure every aspect of this major technology leap is under control. It is of primary concern that the next installer releases handle some long-standing issues so far unresolved in previous builds (stock & opt). There are some relatively minor issues to put to rest before release builds can be refined and fielded; installers would shortly follow. How these minor issues pan out will determine how long things take. Thanks for recognising the priority these development tasks should take; some of those things will be under consideration this weekend.

Jason (Lunatics)

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions
CHARLES JACKSON · Joined: 17 May 99 · Posts: 49 · Credit: 39,349,563 · RAC: 0

Hi Jason. Thank you for your fast reply, and for all your hard work at Lunatics; it is of great value to me.
Joined: 16 Dec 01 · Posts: 33 · Credit: 15,457,430 · RAC: 0

Hey guys, I just put together a SETI box with 2x GTX 460s... the performance is pretty poor, however. I understand Fermi support isn't real solid at present, but my results seem worse than what I'm seeing browsing around.

System info:
Intel i7 920
2x GTX460
Windows 7 64-bit
BOINC 6.11.4
Lunatics UI 0.36 (MB only)

My Fermis are currently spending ~22-30 minutes per workunit... each. According to GPU-Z, they are running at speed with ~75% GPU load. I've tried running multiple WUs on the GPUs, and that yields really awful results: GPU usage drops to ~40%, and they complete about 0.25% in 5 minutes. Below is my app_info that I cobbled together from this thread. Any assistance would be great... or maybe this is to be expected (VLARs?)?? Thank you

```xml
<app_info>
  <app>
    <name>setiathome_enhanced</name>
  </app>
  <file_info>
    <name>AK_v8b_win_x64_SSSE3x.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cufft32_30_14.dll</name>
    <executable/>
  </file_info>
  <file_info>
    <name>cudart32_30_14.dll</name>
    <executable/>
  </file_info>
  <file_info>
    <name>libfftw3f-3-1-1a_upx.dll</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <platform>windows_intelx86</platform>
    <file_ref>
      <file_name>AK_v8b_win_x64_SSSE3x.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>603</version_num>
    <platform>windows_x86_64</platform>
    <file_ref>
      <file_name>AK_v8b_win_x64_SSSE3x.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  <app_version>
    <app_name>setiathome_enhanced</app_name>
    <version_num>610</version_num>
    <platform>windows_intelx86</platform>
    <avg_ncpus>0.200000</avg_ncpus>
    <max_ncpus>0.200000</max_ncpus>
    <plan_class>cuda_fermi</plan_class>
    <file_ref>
      <file_name>setiathome_6.10_windows_intelx86__cuda_fermi.exe</file_name>
      <main_program/>
    </file_ref>
    <file_ref>
      <file_name>cudart32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
      <file_name>cufft32_30_14.dll</file_name>
    </file_ref>
    <file_ref>
      <file_name>libfftw3f-3-1-1a_upx.dll</file_name>
    </file_ref>
    <coproc>
      <type>CUDA</type>
      <count>1</count> <!-- changed to 0.5 when attempting to run 2x WUs/GPU -->
    </coproc>
  </app_version>
</app_info>
```
Richard Haselgrove · Joined: 4 Jul 99 · Posts: 14687 · Credit: 200,643,578 · RAC: 874

> Hey guys, ...

Where are you reading those figures from? The only host on your account matching those specs is host 5471538, which seems to be spitting out the longer tasks in about 13 minutes each. That feels OK: my slightly more powerful GTX 470, in a Windows XP host (known to be faster because of driver differences), tends to run singletons in about ten minutes, though it can manage 3 in 24.
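Richard's GTX 470 figures imply the throughput gain from running three at once; a quick back-of-envelope check:

```python
single = 10.0         # minutes per task, running one at a time
triple = 24.0 / 3.0   # minutes per task when 3 finish in 24 minutes -> 8.0

# Speedup in throughput from running three concurrently instead of one.
speedup = single / triple   # 1.25, i.e. about 25% more tasks per hour
```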
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0

Yes, you got a few reissues of old VLARs. Other than those, midrange ARs are showing 12 or 13 minutes and VHARs 3 or 4 minutes in your validated tasks. That's only about 6.5 times faster than your CPU for midrange; I'd expect better. As usual for Fermi cards, there's "clockRate = 810000" shown in the stderr reports. Are you sure the clocks are being boosted when doing tasks? A quick, very approximate estimation for <flops> to put in your app_info.xml, as the cards are now working, would be 3.17e10 for the CPU app and 2.05e11 for the GPU app.

Joe
Joined: 16 Dec 01 · Posts: 33 · Credit: 15,457,430 · RAC: 0

I have a batch that hasn't uploaded yet (network issues). Looks like they are finishing in the 22-25 minute range for the most part, with some 12 minute ones as you noted.
Joined: 16 Dec 01 · Posts: 33 · Credit: 15,457,430 · RAC: 0

Hi Josef, yes, I've been logging with GPU-Z and the clock speeds are staying at 100%. However, they usually hover around 60-70% load, which I'm not sure is normal or not. I will try putting in the flops values for the cards and see what that does. Appreciate the help.
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0

You're welcome. The flops won't affect runtime, but should help BOINC estimate and do work fetch better. It's just an attempt to cooperate with the way BOINC is designed rather than fight it. The 60-70% load is similar to what others have reported for Fermi cards running single instances. Maybe one of those VLARs was involved when you tried the multiple-instance setup before, or maybe the 48 processors/shaders per multiprocessor of the 460 reacts differently from the 32 on GTX 470 and 480 cards.

Joe
Joined: 24 Nov 06 · Posts: 7489 · Credit: 91,093,184 · RAC: 0

Hey! You're a Squire! When you get comfortable, you might like to try the x32f build in beta testing.

Jason
Joined: 16 Dec 01 · Posts: 33 · Credit: 15,457,430 · RAC: 0

> Hey! You're a Squire! When you get comfortable, you might like to try the x32f build in beta testing.

Hey, thanks for the heads up Jason, not sure how I missed that lol. Wow... so far it looks to be a sizable boost on my box... nice... :) On the bright side, I did also manage to get multiple WUs running on the 460s :D Guessing I had some bad units before, because I can't figure out what I did for the life of me... thanks again guys.
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.