CUDA victim #1

Author	Message
Instytut Dziennikarstwa Volunteer tester Send message Joined: 27 Mar 03 Posts: 19 Credit: 20,629,934 RAC: 0	Message 841556 - Posted: 18 Dec 2008, 19:20:19 UTC Last modified: 18 Dec 2008, 19:20:36 UTC http://setiathome.berkeley.edu/workunit.php?wuid=381019093 What the hell? ID: 841556 ·

Dorsilfin Volunteer tester Send message Joined: 28 Jul 08 Posts: 69 Credit: 4,484,890 RAC: 0	Message 841560 - Posted: 18 Dec 2008, 19:27:27 UTC I'm popping out plenty of those super small WUs in literally 20 seconds of CPU time: 17 seconds to feed the GPU and then 3 seconds of crunching = done. I saw a few, and I'm like wow, sucks for the computer that took that long to do it. ID: 841560 ·

Instytut Dziennikarstwa Volunteer tester Send message Joined: 27 Mar 03 Posts: 19 Credit: 20,629,934 RAC: 0	Message 841563 - Posted: 18 Dec 2008, 19:28:52 UTC - in response to Message 841560. well, I don't mind the speed, I mind the completely different result for CUDA vs. non-CUDA crunch ID: 841563 ·

Dorsilfin Volunteer tester Send message Joined: 28 Jul 08 Posts: 69 Credit: 4,484,890 RAC: 0	Message 841564 - Posted: 18 Dec 2008, 19:29:18 UTC http://setiathome.berkeley.edu/workunit.php?wuid=381210455 http://setiathome.berkeley.edu/workunit.php?wuid=381210455 http://setiathome.berkeley.edu/workunit.php?wuid=381210434 ID: 841564 ·

Dorsilfin Volunteer tester Send message Joined: 28 Jul 08 Posts: 69 Credit: 4,484,890 RAC: 0	Message 841566 - Posted: 18 Dec 2008, 19:29:55 UTC - in response to Message 841563. well, I don't mind the speed, I mind the completely different result for CUDA vs. non-CUDA crunch I haven't gotten any in a while; maybe it was mismatching them with CUDA users. Shrug. ID: 841566 ·

Euan Holton Send message Joined: 4 Sep 99 Posts: 65 Credit: 17,441,343 RAC: 0	Message 841582 - Posted: 18 Dec 2008, 19:45:52 UTC - in response to Message 841563. well, I don't mind the speed, I mind the completely different result for CUDA vs. non-CUDA crunch I was lurking on the boards when the first SSE optimised applications came out, and there was considerable outcry then that they gave an 'unfair' advantage, but it died down once optimised versions became available on more platforms and more people made use of them. I can imagine that owners of CUDA-enabled machines with mid-range CPUs are actually quite glad of this, as it may come to pass that their box will outperform machines that have a high-end CPU and memory but either have ATI graphics or only basic graphics capabilities, e.g. servers and dedicated crunch boxes. ID: 841582 ·

popandbob Volunteer tester Send message Joined: 19 Mar 05 Posts: 551 Credit: 4,673,015 RAC: 0	Message 841585 - Posted: 18 Dec 2008, 19:49:21 UTC These are the result of a complete lack of testing! In the Beta test they were NOT validated against stock apps. There is still a problem with high angle ranges. This app really should not have been released so quick. Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957 Or Good Shop? http://www.goodshop.com/?charityid=888957 ID: 841585 ·

Byron S Goodgame Volunteer tester Send message Joined: 16 Jan 06 Posts: 1145 Credit: 3,936,993 RAC: 0	Message 841602 - Posted: 18 Dec 2008, 20:16:02 UTC Last modified: 18 Dec 2008, 20:25:51 UTC Excuse me if I'm mistaken, but most of us spent all of 4-5 days testing it in Beta, and many still had results to upload that never got in because the upload server went out. Maybe 1/3 of the tasks I did were validated against. The app couldn't possibly be much more than it was a few days ago when we started testing. Why does the public have it? ID: 841602 ·

1mp0£173 Volunteer tester Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 841612 - Posted: 18 Dec 2008, 20:29:49 UTC - in response to Message 841556. http://setiathome.berkeley.edu/workunit.php?wuid=381019093 What the hell? It's pretty obvious, actually. Your result, and the other result did not match. The work unit is being sent out to a third cruncher. You really need to be much more patient: credit has never been granted unless the work validates, and it has not validated yet. The key word is "yet." This has been true since SETI@Home moved to BOINC. ID: 841612 ·
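The matching rule described in the post above (two results disagree, so the work unit goes to a third cruncher) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Result` type, the `validate` function, the quorum of 2 and the exact-equality comparison are all hypothetical and are not taken from the BOINC or SETI@home validator code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """Hypothetical summary of one host's returned result."""
    host: str
    signals: tuple  # e.g. ("triplet",) or ("spike", "spike", ...)


def validate(results, quorum=2):
    """Return (canonical_result, needs_another_task).

    Results are grouped by exact agreement (an assumption; a real
    validator would likely compare with tolerances). Credit would only
    be granted once some group reaches the quorum.
    """
    groups = []
    for result in results:
        for group in groups:
            if group[0].signals == result.signals:
                group.append(result)
                break
        else:
            # No existing group agrees with this result.
            groups.append([result])

    for group in groups:
        if len(group) >= quorum:
            return group[0], False  # consensus reached
    return None, True  # no consensus yet: send to another cruncher
```

With one CPU result and one differing CUDA result, `validate` returns `(None, True)`; once a third host returns a result that agrees with either of them, that result becomes canonical.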

Wayne Frazee Volunteer tester Send message Joined: 18 Jul 00 Posts: 26 Credit: 1,939,306 RAC: 0	Message 841616 - Posted: 18 Dec 2008, 20:39:16 UTC - in response to Message 841602. Excuse me if I'm mistaken, but most of us spent all of 4-5 days testing it in Beta, and many still had results to upload that never got in because the upload server went out. Maybe 1/3 of the tasks I did were validated against. The app couldn't possibly be much more than it was a few days ago when we started testing. Why does the public have it? Agreed. Many of the same issues coming up on the forums were in progress on the beta boards, including a number of possible bugs in getting the CUDA extensions to work. Additionally, when you release something like this, there is a host of CUDA driver documentation and Q&A that you would like to have available to the community at the same time. May I suggest grabbing a couple of long-term, active technical members from the community to contribute to a Q&A wiki or similar that provides more comprehensive documentation for the community for these kinds of releases? -W "Any sufficiently developed bug is indistinguishable from a feature." ID: 841616 ·

SATAN Send message Joined: 27 Aug 06 Posts: 835 Credit: 2,129,006 RAC: 0	Message 841620 - Posted: 18 Dec 2008, 20:42:18 UTC Whilst I am not able to benefit from the new version yet, it does appear rushed and untested. If even 10% of the users have a CUDA-enabled video card, the project will not be able to maintain enough work. I am more curious as to where the extra work seems to have gone. I thought when the project switched to multibeam there was going to be 14 times the amount of work to do. Now, in 2 years, has the crunching power available to this project increased by that much? No. The project can't cope as it is; asking it to cope with this extra strain is just asking for trouble. ID: 841620 ·

Euan Holton Send message Joined: 4 Sep 99 Posts: 65 Credit: 17,441,343 RAC: 0	Message 841654 - Posted: 18 Dec 2008, 21:37:43 UTC - in response to Message 841620. Well, to be honest, I do wonder if the almost unseemly haste to get the CUDA software out is connected to nVidia wanting to get some positive PR out at a given moment and being willing to drop the project some much needed cash (and presumably some technical expertise) in return for such. What I'd really like to see in future versions of BOINC is an ability to farm out WUs to any spare compute resource a machine has, selecting the relevant application according to what's available, but whether the current science application support framework is flexible enough to allow that I do not know. ID: 841654 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14661 Credit: 200,643,578 RAC: 874	Message 841662 - Posted: 18 Dec 2008, 21:54:19 UTC - in response to Message 841612. Last modified: 18 Dec 2008, 21:54:43 UTC http://setiathome.berkeley.edu/workunit.php?wuid=381019093 What the hell? It's pretty obvious, actually. Your result, and the other result did not match. The work unit is being sent out to a third cruncher. You really need to be much more patient: credit has never been granted unless the work validates, and it has not validated yet. The key word is "yet." This has been true since SETI@Home moved to BOINC. And two more interesting observations: 1) It's the old -9 overflow question again: Linux derivative of Lunatics AK_V8 ran to completion, finding one triplet: stock CUDA found an extra 29 spikes and bailed out early. Which one has the bug? Watch this space. 2) The latest SAH validator seems to have inherited the Astropulse validator bug - "Checked, but no consensus" has been replaced by "Valid", and "pending" has been replaced by "0.00". We progress - backwards. ID: 841662 ·
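The "-9 overflow" in point 1 can be pictured as an early-exit rule: the application stops as soon as the number of reported signals reaches a cap. The sketch below is an assumption-laden illustration, not the SETI@home source. The cap of 30 is assumed here because it matches the one triplet plus 29 extra spikes in the post, and `analyze` and `MAX_SIGNALS` are hypothetical names.

```python
MAX_SIGNALS = 30  # assumed cap; consistent with 1 triplet + 29 spikes


def analyze(candidates, max_signals=MAX_SIGNALS):
    """Collect detections in order and stop early once the cap is hit.

    Returns (found, overflowed). `candidates` stands in for whatever
    the real search would report, in the order it reports it.
    """
    found = []
    for candidate in candidates:
        found.append(candidate)
        if len(found) >= max_signals:
            return found, True  # overflow: bail out early
    return found, False  # ran to completion
```

Under this picture, a run that reports a single triplet completes normally, while a run that also reports 29 spurious spikes hits the cap and exits early, so the two results cannot match even though both saw the same triplet.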

SATAN Send message Joined: 27 Aug 06 Posts: 835 Credit: 2,129,006 RAC: 0	Message 841682 - Posted: 18 Dec 2008, 22:12:05 UTC Last modified: 18 Dec 2008, 22:32:57 UTC Richard, totally agree with you. After the massive steps forward made by the optimizers and the switch to MB, things do appear to go backwards. ID: 841682 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 841685 - Posted: 18 Dec 2008, 22:28:13 UTC - in response to Message 841662. 1) It's the old -9 overflow question again: Linux derivative of Lunatics AK_V8 ran to completion, finding one triplet: stock CUDA found an extra 29 spikes and bailed out early. Which one has the bug? Watch this space. In beta I saw the following situation on my PC: after a few VLARs crashed the video driver on Vista, the screen was distorted (see the beta forum for an example; there is a picture there) and _every_ task after that finished in ~15 seconds with a -9 error. I had to reboot the OS to resolve the situation. ID: 841685 ·

1mp0£173 Volunteer tester Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 841801 - Posted: 19 Dec 2008, 4:11:21 UTC - in response to Message 841662. http://setiathome.berkeley.edu/workunit.php?wuid=381019093 What the hell? It's pretty obvious, actually. Your result, and the other result did not match. The work unit is being sent out to a third cruncher. You really need to be much more patient: credit has never been granted unless the work validates, and it has not validated yet. The key word is "yet." This has been true since SETI@Home moved to BOINC. And two more interesting observations: 1) It's the old -9 overflow question again: Linux derivative of Lunatics AK_V8 ran to completion, finding one triplet: stock CUDA found an extra 29 spikes and bailed out early. Which one has the bug? Watch this space. 2) The latest SAH validator seems to have inherited the Astropulse validator bug - "Checked, but no consensus" has been replaced by "Valid", and "pending" has been replaced by "0.00". We progress - backwards. If, as a rule of thumb one out of every ten bug fixes introduces a new bug, then it seems intuitively that the one way to not introduce new bugs is to stop fixing existing ones. I know this isn't a popular position, but given the size of the task to be done, and the size of the staff available to do it, it seems to me that, in general, things are going pretty well. Sure, it'd be better if some of these had been caught in Beta, and maybe they would have if the Beta had a longer cycle, but if we weren't complaining about the spurious -9 overflows and other issues, we'd be complaining about all the great stuff that's stuck in Beta. ID: 841801 ·

Paul D. Buck Volunteer tester Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0	Message 841802 - Posted: 19 Dec 2008, 4:13:07 UTC - in response to Message 841800. I am running the CUDA version and the last 20 workunits have finished in an average time of 3 minutes only. My system has two NVidia 260 OCs installed in SLI mode. Does CUDA take advantage of both GPUs? Also, since the NVidia 260 is one of the first GPUs to perform double precision floating point instructions, will this provide a significant edge. Does the SETI client use double precision or single precision floating point in its fourier transform module? No ... See: cudaAcc_initializeDevice: Found 1 CUDA device(s): Device 1 : GeForce GTX 260 cudaAcc_initializeDevice is determiming what CUDA device to use... user specified SETI to use CUDA device 1: GeForce GTX 260 SETI@home using CUDA accelerated device GeForce GTX 260 BOINC is only seeing one CUDA device and is using it ... if SLI mode links the two GPU cards ... that is why. You should be able to see if BOINC sees one or two cards on start up of BOINC where it tells you how many devices it found. ID: 841802 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 841803 - Posted: 19 Dec 2008, 4:17:51 UTC - in response to Message 841802. I am running the CUDA version and the last 20 workunits have finished in an average time of 3 minutes only. My system has two NVidia 260 OCs installed in SLI mode. Does CUDA take advantage of both GPUs? Also, since the NVidia 260 is one of the first GPUs to perform double precision floating point instructions, will this provide a significant edge. Does the SETI client use double precision or single precision floating point in its fourier transform module? No ... See: cudaAcc_initializeDevice: Found 1 CUDA device(s): Device 1 : GeForce GTX 260 cudaAcc_initializeDevice is determiming what CUDA device to use... user specified SETI to use CUDA device 1: GeForce GTX 260 SETI@home using CUDA accelerated device GeForce GTX 260 BOINC is only seeing one CUDA device and is using it ... if SLI mode links the two GPU cards ... that is why. You should be able to see if BOINC sees one or two cards on start up of BOINC where it tells you how many devices it found. I would have to imagine that if SLI was in place, CUDA would only see one GPU, because that's the way SLI appears to any games that go to use it, because of the hardware drivers for the cards and the configuration of them. It's like a RAID array. Multiple physical discs, one logical disc. With SLI, multiple physical cards, one logical card. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 841803 ·
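One way to check how many CUDA devices the application reported is to read the count straight out of the log text quoted above. The following is a small sketch under assumptions: the line breaks in `LOG` are a guess at the original layout (the forum copy is flattened onto one line), and `cuda_devices` is a hypothetical helper, not part of BOINC or the SETI@home client.

```python
import re

# Log text as quoted in the thread; the line breaks are assumed.
LOG = """cudaAcc_initializeDevice: Found 1 CUDA device(s):
Device 1 : GeForce GTX 260
cudaAcc_initializeDevice is determiming what CUDA device to use...
user specified SETI to use CUDA device 1: GeForce GTX 260
SETI@home using CUDA accelerated device GeForce GTX 260
"""


def cuda_devices(log_text):
    """Return (reported_count, device_names) parsed from the log text."""
    match = re.search(r"Found (\d+) CUDA device\(s\)", log_text)
    count = int(match.group(1)) if match else 0
    # Only lines of the form "Device N : <name>" list a detected device.
    names = re.findall(r"^\s*Device \d+ : (.+?)\s*$", log_text,
                       flags=re.MULTILINE)
    return count, names


count, names = cuda_devices(LOG)
print(count, names)
```

For the log quoted in the thread this reports a single device, which fits the explanation that only one card is being seen on that machine.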

Paul D. Buck Volunteer tester Send message Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0	Message 841804 - Posted: 19 Dec 2008, 4:18:52 UTC Ned, technically it is worse than that. One study I had in college days indicated that the "real" number of bugs actually stays constant. Each bug removed installs a new one that is more subtle and/or less likely to cause problems. The one in ten number is actually that the AVERAGE programmer makes one error for each 10 lines of code. The error rate goes down with skill level, with very skilled programmers only making a mistake about once in 200 LOC. That is why higher level languages are "better" for writing code, in that the error rate per feature decreases. 5th gen languages like PowerBuilder and the like greatly reduce the code required: we were making 100-window applications with about 15,000 lines of code, and most of those were actually in the framework we were using (in which I had to override code to fix bugs in the FW)... ID: 841804 ·
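To put the quoted rates side by side, here is the arithmetic for the 15,000-line application mentioned in the post. It is a back-of-the-envelope sketch: `expected_errors` is a hypothetical helper, and it assumes errors scale linearly with line count, which the post does not claim.

```python
def expected_errors(lines_of_code, loc_per_error):
    """Expected number of errors at a rate of one error per loc_per_error lines."""
    return lines_of_code / loc_per_error


average = expected_errors(15_000, 10)    # one error per 10 LOC
skilled = expected_errors(15_000, 200)   # one error per 200 LOC
print(average, skilled)
```

At the quoted rates that works out to 1,500 errors versus 75 for the same amount of code, a factor of 20.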

1mp0£173 Volunteer tester Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0	Message 841809 - Posted: 19 Dec 2008, 4:36:07 UTC - in response to Message 841804. Technically it is worse than that. One study I had in college days indicated that the "real" number of bugs actually stays constant. Each bug removed installs a new one that is more subtle and or less likely to cause problems. The one in ten number is actually for each 10 lines of code the AVERAGE programmer makes one error. This goes down with skill level to rise to higher LOC counts with very skilled programmers only making a mistake about one in 200 LOC. It probably depends whose statistics you use, and how you measure. Besides, 87.4% of all statistics are made up. ID: 841809 ·
