CUDA Not So Fast?
Author | Message |
---|---|
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
"CUDA not so safe it seems, is it fast or not - it's the second question..." The people who are raising issues about CUDA by saying that it generates inaccurate results are making one big assumption that may or may not be valid: that the current application running on the CPU returns valid results. It is well within the realm of possibility that the CUDA application finding "more" simply means that it is finding signal candidates where the CPU-based applications are not sensitive enough. Were this a real science-based project, the project team would have created known data tasks in which the number of spikes, doubles, triples, Gaussians, etc. had been artificially generated and was known. These "test" tasks would be created both without and with noise, and the applications tested against these known signals. Instead we test against tasks which have been captured in the past and within which we assume that we know what the signals are... and test against those... The fact of the matter is that we have no idea which application is returning "correct" results because we have no idea what the results really should be... GIGO... {edit}Unless I am mistaken and Beta Test has in the last month or so generated a known signal task{/edit} |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
"CUDA not so safe it seems, is it fast or not - it's the second question..." Are you joking? The CUDA MB app is intended to do EXACTLY THE SAME calculation and produce EXACTLY THE SAME results as the CPU app does. Your assumption is much further from reality than mine. Just consider that it is intended to be validated against the CPU app... Moreover, CUDA MB and CPU MB share a single codebase now; it's a matter of a command line switch to do CPU processing instead of CUDA processing. Before making such an assumption, better to get familiar with the subject :P It "found" more signals, "other" signals, "crash" signals and so on and so forth. And especially many new unknown "signals" appear after a previous task crashes with a driver restart. Great science in these ghosts, indeed... |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
"CUDA not so safe it seems, is it fast or not - it's the second question..." The desired sensitivity is expressed by the thresholds specified in the WU header for each signal type. If CUDA processing can't interpret those the same way as CPU processing on x86, x86_64, Itanium, PowerPC, Sparc, etc., then CUDA is wrong. It is true, though, that the project has decided to use single floats for most calculations rather than doubles or some even larger arbitrary precision. That means there will be cases where the CUDA app might find one or two more or fewer signals simply due to use of a different FFT implementation, etc. The fairly consistent overflow on 31 triplets for WUs with angle range above about 2.5 degrees doesn't fall into that pattern of slight calculation difference. "Were this a real science based project, the project team would have created known data tasks where the number of spikes, doubles, triples, Gaussians, etc. would have been artificially generated and known. These "test" tasks would be created without and with noise and the applications tested against these known signals." The project does have a way to generate artificial signals; I don't remember all the details from when I read about it many years ago, nor do I know if it can create all the signal types. In any case, the extent to which processing depends on angle range would require a very large set of test WUs. Oh, and bear in mind that all WUs have noise: sky noise, receiver noise, and digitization noise at least. In fact, the basic task the application does is to decide when a potential signal is statistically far enough from the theoretical mean noise to be worth saving for further examination (persistence being required to classify it as possibly extra-terrestrial). Joe |
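Joe's point that single-float arithmetic can move a borderline candidate across a threshold can be illustrated with a small sketch. This is not SETI@home code: the sample values, the threshold of 1.0, and the helper names `to_single` and `summed_power` are all assumptions made up for the illustration. It only shows that the same sum, rounded to single precision at every step, can land on the other side of a fixed threshold from the double-precision sum.

```python
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def summed_power(samples, single_precision):
    """Add up samples, rounding every step to single precision if requested."""
    total = 0.0
    for s in samples:
        if single_precision:
            total = to_single(total + to_single(s))
        else:
            total += s
    return total


THRESHOLD = 1.0        # hypothetical detection threshold, not a real WU value
samples = [0.1] * 10   # hypothetical per-bin powers

as_double = summed_power(samples, single_precision=False)
as_single = summed_power(samples, single_precision=True)

# Same inputs, same threshold, different precision: the two sums end up on
# opposite sides of the threshold, so only one of them is "reported".
print("double:", as_double, "reported:", as_double >= THRESHOLD)
print("single:", as_single, "reported:", as_single >= THRESHOLD)
```

A difference of this kind would explain one or two extra or missing signals near a threshold; it would not by itself explain a consistent overflow such as the 31-triplet case mentioned above.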
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
The assumption that the current application suite is correct is, as I stated, unfounded in that the testing is made against real world signals, which do indeed have noise of various sorts. And the noise is indeed real noise. And thank you for confirming that what I asserted is indeed true. When testing a signal processing application, the first tests must always be against a known signal suite. Then the known signal suite is biased with varying levels of noise to determine whether the application is still able to detect the known signals correctly. By using only "real world" signals, we truly do not know what is in the task that we are processing. All we know is what previously assumed-correct programs have detected. We then assume that if the new application detects what the old program detects, the new program is correct. An equivalent example is where a Lockheed engineer released a program to test part of the sonar suite in the S3A aircraft. Within a few months the fleet was out of A2 modules for this black box (actually painted beige) because the program always found the A2 faulty. When sites complained that they could not repair these black boxes, Lockheed would send out the engineer with his A2 card and he would "prove" that the boxes were good if they only had a "good" A2 card. When his A2 failed for other reasons and was repaired, well, it turned out that the only way the program would run to completion was if it was run against his already faulty A2 card. I did not say that the GPU program was correct, or did not have other unrelated problems, only that the assertion that correctness is proven by its returning results identical to those of the current applications is unfounded. All that proves is that the program is returning the same results, not that those results are correct. 
The last assertion, that there is no point in testing with known test signals because of the large number of possibilities, is a well established problem of the testing world and one with which I am very familiar. I also did not assert that, to prove correctness, all possibilities and combinations must be tested to obtain confidence in correctness. Though, truth be told, to really prove correctness that is exactly what must occur. And Joe, having 20 versions of a program that return the same values as output proves nothing beyond that they perform the same way. The fact that they return the same values only proves that they return the same values, not that the values returned are correct. Likewise, the CUDA application failing in one particular way does not validate the other application's operation in another aspect. Your assertion that this error over here proves that CUDA is flawed, and so the detection over there must also be wrong, is simply not supportable based on the facts as we know them. Seriously, you guys are straining too hard at this. I simply stated that flaws in the CUDA application do not prove anything beyond the fact that the CUDA application has flaws. Flaws in the CUDA application as evidenced by differences in returned results do not prove anything beyond that the CUDA application returns results that are different from those of the current application. We are assuming that the results returned by the current application suite are correct... but, even though we may have confidence that that is true... I simply stated that we have no proof that it is true. |
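The known-signal testing Paul describes (first a clean synthetic task, then the same task with noise added) could be sketched roughly as follows. Everything here is a toy assumption: the spike positions, the amplitude, the Gaussian noise model, the fixed threshold, and the function names `make_task` and `detect_spikes` are invented for the illustration and have nothing to do with the actual setiathome algorithms or the project's own signal generator.

```python
import random


def make_task(n_bins, spike_bins, amplitude, noise_sigma, seed):
    """Build a synthetic task: Gaussian noise plus spikes at known bins."""
    rng = random.Random(seed)  # seeded so the test task is reproducible
    data = [rng.gauss(0.0, noise_sigma) for _ in range(n_bins)]
    for b in spike_bins:
        data[b] += amplitude
    return data


def detect_spikes(data, threshold):
    """Toy detector: report every bin whose value exceeds the threshold."""
    return {i for i, v in enumerate(data) if v > threshold}


KNOWN_SPIKES = {100, 512, 900}  # ground truth, chosen by the test author
THRESHOLD = 8.0                 # hypothetical, far above unit-sigma noise

# Step 1: known signals with no noise at all.
clean = make_task(1024, KNOWN_SPIKES, amplitude=50.0, noise_sigma=0.0, seed=1)
# Step 2: the same known signals with noise added.
noisy = make_task(1024, KNOWN_SPIKES, amplitude=50.0, noise_sigma=1.0, seed=1)

for name, task in (("clean", clean), ("noisy", noisy)):
    found = detect_spikes(task, THRESHOLD)
    print(name, "missed:", sorted(KNOWN_SPIKES - found),
          "spurious:", sorted(found - KNOWN_SPIKES))
```

Because the ground truth is chosen up front, a missed or spurious detection is an error in the detector itself, which is the distinction Paul draws between "returns the same results as the old program" and "returns correct results".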
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Skepticism is good, and I agree we (participants) have no proof that the algorithms and implementation of the setiathome application are correct. I think that's almost certainly true for any distributed computing project: the participants must take whatever description the project staff has provided and decide whether to trust them. I made that decision about S@H over 9 years ago, and have seen no reason to change my mind. Close examination of the code since it has become open source has reinforced my trust. It is too late tonight to write an extended reply, and if I put it off I'd never get to it. This will have to do. Joe |
archae86 Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0 |
Matt has recently posted that CUDA currently represents about 3% of validations. That should help damp down some of the more frantic speculations about the immense wave of new capacity suddenly brought in by this initiative. |
Jason Tobin Joined: 28 Apr 07 Posts: 21 Credit: 1,168,873 RAC: 0 |
Hi; It seems I am not experiencing any faster computation speed from the latest BOINC client 6.4.5 using the CUDA add-on. My project is SETI@home. OS: Windows XP Media Center (Pro) 32-bit w/ SP3; video card: NVIDIA GeForce 8800 GT; CPU: Intel Core2 Quad Q6600 @ 2.4 GHz; RAM: 2.0 GB; driver version: 180.48. I downloaded the new BOINC client and the latest NVIDIA driver (ver. 180.48). After a restart, the client stated that it did have CUDA-compatible components. The option for the GPU is enabled in my settings. But I am not seeing any benefit. I am not getting any error messages whatsoever. Any thoughts? Am I doing something wrong? Jason Tobin jasontobin48@hotmail.com ____________ Jason Tobin Alien Hunting Specialist |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
... When a task is sent to a host, BOINC decides which application will do the work. You have a considerable number of WUs which were sent to your host while you were running BOINC 6.2.19 and S@H 6.03; S@H 6.03 will be used to do those tasks. Tasks received since you made the conversion will be done with S@H 6.05. Temporarily you have the advantage of having both CPU and GPU crunching capability; with luck you'll get some tasks which need to be run soon on the CUDA GPU while still having some assigned to 6.03 and the CPU. After the 6.03 work runs out, you'll find that BOINC will only be activating one task at a time using the GPU, and your CPUs may be mostly idle unless you have Astropulse work or work from another project. You can also expect that 6.05 may cause various problems on your system from time to time, exit with errors fairly often, produce work which doesn't validate occasionally, etc. See the many threads here for more information. Joe |
Stacey Baird Joined: 8 Nov 04 Posts: 41 Credit: 407,924 RAC: 0 |
I am using SETI CUDA 6.05 in WinXP SP3, on a 3.2 GHz P4 simulated dual processor with an NVIDIA GeForce 8400 and the latest driver, two gigs of Patriot memory, and all of that overclocked by 13 percent to about 3600 something. Two "regular" CPU projects are running and one CUDA. So, why do my CUDA projects say "Running (.06 CPU 1 CUDA)"? .06 of what? Not a very clear status statement, and no definition for it that I have found so far. Does anyone know for sure what it means and why? The CUDA tasks don't seem to run all that fast either. Some actually seem to run in retrograde fashion, with no explanation or accurate expression of a time budget in the TASKS section of the BOINC Manager. Oh well, it's Christmas. Everyone needs some time off and a fresh start with a rested mind. best regards The welfare of the people is the highest law - Cicero If no one complains, the people must be satisfied. |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
"I am using seti cuda 6.05 in WinXP S3, on a 3.2gh P4 simulated dual processor and using an NVIDIA FORCE 8400 with latest driver, two gig patriot memory and all that over clocked by 13 percent to about 3600 something. Two "regular" cpu projects are running and one CUDA." Yes, I know... Oh, you want me to tell you! :) It means that the main processing is happening on the video adapter. However, to move data to and fro and update BOINC Manager, you are also "stealing" 0.06 of one CPU (about 6% of a core) to do that "housekeeping"... So, if all is correct, like on my one system, I have 8 tasks running on the 8 virtual CPUs I have and one CUDA task running (a total of 9 tasks), with 0.03 of a CPU "stolen" as time wends on... The fraction seems to vary, and the slower the main system the higher the number will be... this number also varies with the BOINC version and the version of the science application running. CUDA tasks may or may not run faster depending on the capability of your video card. The lower end cards may not be all that much faster than the optimized application on the CPU... *BUT*, potentially you are running one more task at a time and so the THROUGHPUT will be higher. So, for example, a week ago I built a new system and already I have 40K CS for "free", in that little or no CPU time has been used to earn that extra credit. More importantly, I have done more science. Hope this helps. |
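The throughput argument above can be put into numbers with a rough back-of-the-envelope sketch. The model and all figures are assumptions for illustration only: it treats every task as taking a fixed number of hours, treats the "CPU" figure in the status line as a fraction of one core taken away from CPU work, and uses made-up run times; `tasks_per_day` is a hypothetical helper, not anything BOINC provides.

```python
def tasks_per_day(cpu_cores, cpu_hours_per_task,
                  gpu_hours_per_task=None, gpu_cpu_fraction=0.0):
    """Estimate finished tasks per day for CPU cores plus an optional GPU.

    gpu_cpu_fraction is the share of one core the GPU task uses for
    housekeeping (the ".06 CPU" style figure shown in BOINC Manager).
    """
    use_gpu = gpu_hours_per_task is not None
    # The GPU task's housekeeping comes out of the CPU capacity.
    cpu_capacity = cpu_cores - (gpu_cpu_fraction if use_gpu else 0.0)
    cpu_rate = cpu_capacity * 24.0 / cpu_hours_per_task
    gpu_rate = 24.0 / gpu_hours_per_task if use_gpu else 0.0
    return cpu_rate + gpu_rate


# Hypothetical host: 8 virtual CPUs, 3 hours per task on a core, and a
# low-end GPU that is no faster per task than a single core.
cpu_only = tasks_per_day(8, 3.0)
with_gpu = tasks_per_day(8, 3.0, gpu_hours_per_task=3.0, gpu_cpu_fraction=0.03)

print("CPU only:", round(cpu_only, 2), "tasks/day")
print("CPU + GPU:", round(with_gpu, 2), "tasks/day")
```

Even with a GPU that is no faster per task than a core, the extra concurrent task raises the total in this toy model, because the housekeeping cost is a small fraction of one core.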
Bounce Joined: 3 Apr 99 Posts: 66 Credit: 5,604,569 RAC: 0 |
>When a task is sent to a host, BOINC decides which application will do the work. >You have a considerable number of WUs which were sent to your host while you >were running BOINC 6.2.19 and S@H 6.03, those will be used to do those tasks. >Tasks received since you made the conversion will be done with S@H 6.05. Does BOINC want to flush the cache of pending WUs before downloading more work? Since doing all the requisite upgrades and verifying that BOINC sees the CUDA GPU, all the cached WUs continue to process but no new work is received (AP or standard). I can see a trend in more average WUs per day being processed on my stats graph but there's not been a new WU for 3 days (maybe more). I have an AMD dual core and an 8600GT. The BOINC task manager continues to only show 2 tasks running at a time (nothing indicating CUDA). 12/26/2008 6:47:23 AM||Starting BOINC client version 6.4.5 for windows_intelx86 12/26/2008 6:47:24 AM||log flags: task, file_xfer, sched_ops 12/26/2008 6:47:24 AM||Libraries: libcurl/7.19.0 OpenSSL/0.9.8i zlib/1.2.3 12/26/2008 6:47:24 AM||Running as a daemon 12/26/2008 6:47:24 AM||Data directory: C:\Documents and Settings\All Users\Application Data\BOINC 12/26/2008 6:47:24 AM||Running under account Administrator 12/26/2008 6:47:24 AM||Processor: 2 AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor 6000+ [x86 Family 15 Model 67 Stepping 3] 12/26/2008 6:47:24 AM||Processor features: fpu tsc pae nx sse sse2 3dnow mmx 12/26/2008 6:47:24 AM||OS: Microsoft Windows XP: Professional x86 Editon, Service Pack 3, (05.01.2600.00) 12/26/2008 6:47:24 AM||Memory: 3.50 GB physical, 6.84 GB virtual 12/26/2008 6:47:24 AM||Disk: 232.88 GB total, 110.59 GB free 12/26/2008 6:47:24 AM||Local time is UTC -6 hours 12/26/2008 6:47:24 AM||Not using a proxy 12/26/2008 6:47:24 AM||CUDA devices found 12/26/2008 6:47:24 AM||Coprocessor: GeForce 8600 GT (1) 12/26/2008 6:47:25 AM|SETI@home|URL: http://setiathome.berkeley.edu/; Computer ID: 3710612; location: home; project prefs: home 12/26/2008 6:47:25 AM||General prefs: from SETI@home (last modified 25-Dec-2008 21:33:49) 12/26/2008 6:47:25 AM||Computer location: home 12/26/2008 6:47:25 AM||General prefs: using separate prefs for home 12/26/2008 6:47:25 AM||Reading preferences override file 12/26/2008 6:47:25 AM||Preferences limit memory usage when active to 1791.21MB 12/26/2008 6:47:25 AM||Preferences limit memory usage when idle to 3224.18MB 12/26/2008 6:47:25 AM||Preferences limit disk usage to 100.00GB |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
>When a task is sent to a host, BOINC decides which application will do the work. There were considerable changes made to the way BOINC estimates times for work, needed because the GPU processing doesn't use much CPU time. A side effect seems to be that the estimates can get really bad, so BOINC may simply think the work you already have is going to take much longer than it really will, and is not fetching more work on that basis. From other posts I guess that adjusts itself after a while, but maybe not until after all the 6.03 work is finished. Joe |
Bounce Joined: 3 Apr 99 Posts: 66 Credit: 5,604,569 RAC: 0 |
thanks for the heads up. wait and see seems the way to be. |
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
>When a task is sent to a host, BOINC decides which application will do the work. You are running it as a service (see your log, 4th line down: "Running as a daemon"). It won't run CUDA tasks when installed that way. You need to install BOINC as single user or multi-user, anything but protected mode. Apparently CUDA doesn't work in protected mode. BOINC blog |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
... That's true on Windows Vista, which has "improved security", but a service install is OK on WinXP. Joe |
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
... True that XP doesn't have a "protected" mode, but I still think you'll find it has issues communicating with a service. Maybe he could reinstall BOINC as single or multi-user and see how it goes. BOINC blog |
Bounce Joined: 3 Apr 99 Posts: 66 Credit: 5,604,569 RAC: 0 |
the intent is for it to run even when the machine is logged out and idle. single and multiuser run only while [someone] is logged into the machine. no? interesting twist. as of a reboot about an hour ago, only 1 concurrent WU is running. this hasn't been the case for a long time (year(s) maybe). and still no updates as the queue winds down to nearly zero WU remaining. maybe a reinstall is needed anyway? |
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
"the intent is for it to run even when the machine is logged out and idle. single and multiuser run only while [someone] is logged into the machine. no?" Yes, you are correct about running it as a service. Possible things to try are: 1. Zero the long_term_debt figures in client_state.xml, but be careful! You'll need to stop BOINC, change client_state.xml and restart BOINC afterwards. 2. Change the number of days of cache, which sometimes triggers BOINC to go and get more work. Don't set it too high, or it won't ask for work because it will assume that it can't complete the work by the due date. At the moment my machines are on a 1 day cache because I turn them off during the day (too hot). BOINC blog |
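For the first suggestion above, a small script is less error-prone than editing by hand. The sketch below is only an assumption about how one might do it: it treats `client_state.xml` as text and rewrites whatever sits between `<long_term_debt>` tags, leaving every other line untouched. The helper name `zero_long_term_debt` and the sample fragment are made up for illustration, and the exact layout of a real `client_state.xml` may differ between BOINC versions. As MarkJ says, stop BOINC first, and keep a backup copy of the file.

```python
import re

# Matches one <long_term_debt>...</long_term_debt> element.
_DEBT = re.compile(r"(<long_term_debt>)[^<]*(</long_term_debt>)")


def zero_long_term_debt(xml_text):
    """Return xml_text with every long_term_debt value replaced by zero."""
    return _DEBT.sub(r"\g<1>0.000000\g<2>", xml_text)


# Made-up fragment standing in for part of a client_state.xml file.
sample = """<project>
    <master_url>http://setiathome.berkeley.edu/</master_url>
    <short_term_debt>-1234.500000</short_term_debt>
    <long_term_debt>-98765.432100</long_term_debt>
</project>"""

print(zero_long_term_debt(sample))
```

To apply it to a real file, one would read the file into a string, pass it through `zero_long_term_debt`, and write the result back while BOINC is stopped.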
Bounce Joined: 3 Apr 99 Posts: 66 Credit: 5,604,569 RAC: 0 |
I disconnected and reconnected to the project. It flushed the WU cache and downloaded various non-WU DLLs that looked like CUDA stuff. It's also getting more WUs after finishing them, like it should. However, it's now processing with the GPU and only 1 of the 2 AMD Athlon cores it recognizes during restarts. I double-checked my BOINC and SETI preferences, and all configs (default, home, and work) are set to 4 CPUs and 100% of available CPUs (2 different settings depending on the version of BOINC being run). GPU processing is "nifty", but at 0.3 processor equivalencies, I'd just as soon have my dual core working like it did than run with one core tied behind my back. I'm still game to keep trying to get all 3 processors working as long as anyone else is patient enough to keep suggesting possibilities. |
Jason Tobin Joined: 28 Apr 07 Posts: 21 Credit: 1,168,873 RAC: 0 |
OK, my BOINC client is now processing CUDA WUs, but there are no graphics... and when there are graphics... there is a multi-colour haze. Anyone experiencing this? If so, has anyone notified the proper authorities? And who are the proper authorities to report these problems to? By the way, thanks for the help earlier; it did help my situation. Also, I think people are already aware of problems like inaccurate "time to completion" messages, etc. Jason Tobin Alien Hunting Specialist |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.