Linux CUDA 'Special' App finally available, featuring Low CPU use
Author | Message |
---|---|
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . I am not privy to what happens within Nvidia Corp or the manufacturing companies but I feel confident that they have access to much lower level routines to effect fan and clock control than a user level interface such as xserver. But I found the documentation ironic and that amused me, if you are unable to appreciate the irony in that I am sorry. You have been very helpful and it was never my intention to offend you. Just curious. Since the Coolbits options are not included in the Public release of nVidia Settings, it would appear they are not intended for the general public. So, who do you think they were intended for? Now that the Vendors are releasing Software that uses the built-in nVidia tweaks, the choice nVidia has is to either make them available to those that want them, or see those people use someone else's software not under Nvidia's control. It's somewhat similar to an Automobile that can do well over 100mph. You will almost never be able to reach over 100mph as in most cases you will be penalized for trying. You can play around with Coolbits all you wish, but if you go to extremes and damage the hardware, there will be penalties. Nothing unusual there. . . Well I guess then it is safe to say, I am not most people, or perhaps I am? Stephen .. |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
It actually wouldn't be that hard to script a fan control. SMI can output a steady stream of temp #s from its options. One could average the last 10 readings and increase/decrease fan speed 2% (or proportional to the difference) to keep them at a desired temp (the same as any software company does). Sure it would work, as long as it is bulletproof coding. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
It actually wouldn't be that hard to script a fan control. SMI can output a steady stream of temp #s from its options. One could average the last 10 readings and increase/decrease fan speed 2% (or proportional to the difference) to keep them at a desired temp (the same as any software company does). . . Then I guess that leaves an amateur like me out of the running Stephen :) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Hi guys, . . For what it is worth I seem to be running at about 8% to 9% inconclusives, but still zero invalids. I have corrected the PCIe config on the Pentium and the runtimes are now even across the two GPUs and regular. Stephen :) |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Petri, Are you seeing a 40% increase in performance with the 1080 Ti going from 20 to 28 CUs? I'm curious, been window shopping :) |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Hi, nearly. Run time with the 1080 is 140+ seconds and with the Ti it is 107 seconds. That is for VLAR. See my results. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Hi, nearly. Once we get to 12 seconds then we're obsolete, since that is the task observation time, and it might be cheaper to crowdfund 1080 Tis to Berkeley for realtime analysis, for Arecibo/multibeam anyway. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Hi, nearly. The For shorties it balances the load differently: 2 for pulsefind, 2 for all the rest. CPU is used for chirping and it uses AVX2. EDIT: got the run time wrong. EDIT2: makes me feel like a fool. I meant 4.1. Petri To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Petri, Are you seeing a 40% increase in performance with the 1080 Ti going from 20 to 28 CUs? Hi again, The performance scales quite well. The wattage does not scale nearly as well. See image .. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
The 12 second 1.4. seconds version is running on my test host. It uses 4 Ti's simultaneously for one task. Three for long pulse finds and one for all the rest (Gauss, Triplet, Autocorrelations and Spikes). That brings to mind a saying ... "The difference between men and boys, is the price of their toys!" |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
The 12 second 1.4. seconds version is running on my test host. It uses 4 Ti's simultaneously for one task. Three for long pulse finds and one for all the rest (Gauss, Triplet, Autocorrelations and Spikes). That brings to mind a saying ... Yeah! It IS fun to play even in this age and time -- regardless of the recent outages. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
The 12 second 1.4. seconds version is running on my test host. It uses 4 Ti's simultaneously for one task. Three for long pulse finds and one for all the rest (Gauss, Triplet, Autocorrelations and Spikes). That brings to mind a saying ... . . So when do we get our Lamborghinis ?? :) Stephen :) |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Hi, Here is an interesting one: http://setiathome.berkeley.edu/workunit.php?wuid=2488511762 The SoG has the same kind of error that my version has. I'd like to know if R. finds a cure for that - it might help me too. Petri To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I ran across another one that is pretty simple. Zero signals and a Bad Best Pulse, https://setiathome.berkeley.edu/workunit.php?wuid=2488317742 Ran the task on my CPU and got; Best pulse: peak=4.564702, time=67.24, period=0.5079, d_freq=1420128564.62, score=0.8974, chirp=71.618, fft_len=64 http://boinc2.ssl.berkeley.edu/sah/download_fanout/3ba/16fe08aa.12502.25021.6.33.13 I've been running different builds since the outage, we'll see how they go. |
Wiggo Joined: 24 Jan 00 Posts: 34748 Credit: 261,360,520 RAC: 489 |
Hi, That other rig is spitting out garbage on its 560 Ti which you came up against there on that w/u. ;-) Cheers. |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Hi, used to run a 560ti, and early factory OC models were shipped with insufficient core voltage by default. They also tend to get pretty toasty. My feeling is we'll have to eventually embed some monitoring (e.g. NVML sensors during run) and possibly some lightweight spotchecks. Another potentially handy thing where you use padding, might be to use 0xDEADDEAD instead of zeroes, then throw in some extra threads with a conditional, such that the extras look for the hex value, and either set a flag or throw an exception on corruption detection. Not exactly rigorous, but low cost and better than nothing. [I plan something along those lines for the generic version, more oriented to the automated tuning, however that's further off.] "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
petri33 Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Hi, Yup, Writing and checking for 0xDEAD (or whatever bin code) in between buffers would reveal buffer under/overflows. I'll see if I have time to implement that some evening next week. Petri To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
A little more info on the False overflows. It seems all of these are occurring with the Low Angle range Arecibo tasks, mainly around 0.248126 & 0.148085. How many overflows depends upon how many of those angle range tasks you get. Seeing as how the previous versions had False overflows with the VLARs, it would appear this is a leftover from that problem. It still doesn't like the Low Angle ranges. I haven't found any at the higher angle ranges. https://setiathome.berkeley.edu/results.php?hostid=8215300&state=5 https://setiathome.berkeley.edu/results.php?hostid=7769537&state=5 https://setiathome.berkeley.edu/results.php?hostid=8136063&state=5 etc, etc... |
Kissagogo27 Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
under W7, bad WU with ATI ? https://setiathome.berkeley.edu/workunit.php?wuid=2486259533 |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
under W7, bad WU with ATI ? The task says: WARNING: This application needs newer GPU, at least ATI Radeon HD 5000 needed, exiting ! That is because SETI doesn't have an App that will work on the AMD Radeon HD 4850 in your machine. They tried one, but couldn't find a way to assign it to just the HD 4000 GPUs, so they just removed it rather than send it to All the machines. You need to Uncheck Use ATI GPU in your Preferences, https://setiathome.berkeley.edu/prefs.php?subset=project Another way would be to Install the SSE41 CPU App on your machine. The Package only has the CPU App in the app_info.xml, so it will only ask for CPU tasks. As a Bonus, the SSE41 App will be much faster on your machine than the Stock CPU App, SSE41_CPUr3344.zip |
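
Brent Norman's fan-control idea earlier in the thread (average the last 10 temperature readings, then nudge the fan speed in proportion to the difference from a desired temperature) could be sketched roughly as below. This is only an illustration, not anyone's actual script: the class name `FanController`, the 70 °C target, the gain, and the clamping limits are all assumptions. The reader function assumes `nvidia-smi` is on the path and uses its `--query-gpu=temperature.gpu` output; actually applying the fan speed (for example through nvidia-settings with Coolbits enabled) is deliberately left out.

```python
import subprocess
from collections import deque


class FanController:
    """Proportional fan control over a moving average of temperatures.

    All default values are illustrative assumptions, not tested settings.
    """

    def __init__(self, target_c=70.0, window=10, gain=2.0,
                 max_step=5.0, min_fan=30.0, max_fan=100.0, start_fan=50.0):
        self.target_c = target_c
        self.gain = gain            # fan % change per degree C of error
        self.max_step = max_step    # largest change allowed per update, in %
        self.min_fan = min_fan
        self.max_fan = max_fan
        self.fan = start_fan
        self.readings = deque(maxlen=window)  # keeps only the last `window` readings

    def update(self, temp_c):
        """Record one reading and return the new fan speed in percent."""
        self.readings.append(temp_c)
        average = sum(self.readings) / len(self.readings)
        error = average - self.target_c
        # Proportional step, limited so a single bad reading cannot swing the fan.
        step = max(-self.max_step, min(self.max_step, self.gain * error))
        self.fan = max(self.min_fan, min(self.max_fan, self.fan + step))
        return self.fan


def stream_temps(gpu_index=0, interval_s=1):
    """Yield temperatures (degrees C) for one GPU from a looping nvidia-smi call."""
    cmd = ["nvidia-smi", "-i", str(gpu_index),
           "--query-gpu=temperature.gpu",
           "--format=csv,noheader,nounits",
           "-l", str(interval_s)]
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            line = line.strip()
            if line:
                yield float(line)
```

A loop such as `for t in stream_temps(): speed = controller.update(t)` would then hand `speed` to whatever mechanism sets the fan. The per-update step limit and the minimum fan floor are there for the "bulletproof" part: a failed or garbage reading should never be able to stop the fan.
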
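
The padding check that jason_gee and petri33 discuss (fill the gaps between buffers with 0xDEADDEAD instead of zeroes, then look for any guard word that has changed) can be illustrated on the host side with a plain byte array. This is a sketch of the idea only, not the CUDA app's code: the helper names `build_arena` and `corrupted_guards`, and the choice of four guard words per gap, are assumptions made for the example. In the real application the check would run in extra GPU threads as jason_gee describes.

```python
import struct

CANARY = 0xDEADDEAD
GUARD_WORDS = 4                 # assumed guard size: four 32-bit words per gap
WORD = struct.Struct("<I")      # little-endian unsigned 32-bit word


def build_arena(buffer_sizes, guard_words=GUARD_WORDS):
    """Lay buffers out in one bytearray with canary guards before, between and after.

    Returns (arena, buffer_offsets, guard_offsets), all offsets in bytes.
    """
    guard = WORD.pack(CANARY) * guard_words
    arena = bytearray()
    buffer_offsets, guard_offsets = [], []
    for size in buffer_sizes:
        guard_offsets.append(len(arena))
        arena += guard
        buffer_offsets.append(len(arena))
        arena += bytes(size)        # zero-initialised payload
    guard_offsets.append(len(arena))
    arena += guard                  # trailing guard after the last buffer
    return arena, buffer_offsets, guard_offsets


def corrupted_guards(arena, guard_offsets, guard_words=GUARD_WORDS):
    """Return the indices of guard regions whose canary words were overwritten."""
    bad = []
    for index, offset in enumerate(guard_offsets):
        for w in range(guard_words):
            (value,) = WORD.unpack_from(arena, offset + 4 * w)
            if value != CANARY:
                bad.append(index)
                break
    return bad
```

Guard `i` sits immediately before buffer `i`, and guard `i + 1` immediately after it, so a hit on guard `i + 1` suggests an overflow from buffer `i` (or an underflow from buffer `i + 1`). As noted in the thread, this is not rigorous: a stray write that skips past the guard, or that happens to write the canary value, goes unnoticed.
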
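
For the 1080 versus 1080 Ti question, petri33's "nearly" can be put into numbers from the run times he quotes. The sketch below only does that arithmetic; it treats the "140+ seconds" figure as exactly 140, which is an assumption and makes the result a lower bound, and the function name `speedup` is just for illustration.

```python
def speedup(old_seconds, new_seconds):
    """How many times faster the new run time is than the old one."""
    return old_seconds / new_seconds


t_1080 = 140.0      # quoted as "140+" seconds, so the true value is a bit higher
t_1080_ti = 107.0   # quoted VLAR run time on the 1080 Ti
unit_ratio = 28 / 20  # compute-unit counts from the question

gain = speedup(t_1080, t_1080_ti)
efficiency = gain / unit_ratio  # fraction of the ideal 1.4x actually realised

print(f"throughput gain: {(gain - 1) * 100:.1f}%")
print(f"scaling efficiency: {efficiency * 100:.1f}%")
```

With these inputs the gain comes out at a little over 30% against an ideal 40%, which is consistent with "nearly" and with the later remark that performance scales quite well.
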
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.