CUDA cards: SETI crunching speeds

Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21746
Credit: 7,508,002
RAC: 20
United Kingdom
Message 861694 - Posted: 3 Feb 2009, 23:55:02 UTC - in response to Message 861661.  

I've scraped 67 results so far and they are all a very boring AR=2.7 and they each take about 16mins to run on my nVidia 8600GT. Meanwhile, the CPU (AMD AthlonXP X2) is also running two AP WUs.

Hopefully it will run through something more varied soon-ish...

Now up at 92 results and still all VHAR. Mainly "2.7" and just now a few "2.5". I'll hold off for a better spread before hassling RH for his graphing skills!

One thing I've noticed is that one core is pretty much holding at 75% to 95% for the CUDA app. Is the CPU working in parallel to the GPU or is there a lot of CPU preparatory work being done before handing over to the GPU to finish off?


Happy fast crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 861694 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 861699 - Posted: 4 Feb 2009, 0:05:32 UTC - in response to Message 861694.  

I've scraped 67 results so far and they are all a very boring AR=2.7 and they each take about 16mins to run on my nVidia 8600GT. Meanwhile, the CPU (AMD AthlonXP X2) is also running two AP WUs.

Hopefully it will run through something more varied soon-ish...

Now up at 92 results and still all VHAR. Mainly "2.7" and just now a few "2.5". I'll hold off for a better spread before hassling RH for his graphing skills!

One thing I've noticed is that one core is pretty much holding at 75% to 95% for the CUDA app. Is the CPU working in parallel to the GPU or is there a lot of CPU preparatory work being done before handing over to the GPU to finish off?


Happy fast crunchin',
Martin

On my box, it takes the CPU 18 secs of high activity to prep the data for the GPU and then it is almost idle for the rest of the time.

F.
ID: 861699 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 861820 - Posted: 4 Feb 2009, 9:06:53 UTC

Here's today's update. Well, last night's actually, but I went to bed before the message boards were back up.


(Direct link)

More data received from both BMaytum and Raistmer - thanks, guys. Anyone with a different graphics card want to join in the fun? It would be nice to see what a 260/280/295 can do.
ID: 861820 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 861824 - Posted: 4 Feb 2009, 9:15:27 UTC

One thing I've noticed is that one core is pretty much holding at 75% to 95% for the CUDA app.

On my box, it takes the CPU 18 secs of high activity to prep the data for the GPU and then it is almost idle for the rest of the time.

I think the difference is between the Windows and Linux implementations. Am I right in thinking that the GPUgrid applications are reported to be the other way round?

@ Martin - my graphing skills, such as they are, are borrowed from the Dark Side too - you may have to pick an arrow from your own quiver. Just make sure you choose one that can do logarithmic scales - AR looks very silly if plotted on a linear scale.
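A minimal sketch of why a logarithmic axis suits AR, assuming some made-up angle-range values (the `sample_ar` list is hypothetical and is not data from this thread). It maps each value to a 0..1 position along an axis, first linearly and then by log10:

```python
import math

# Hypothetical angle-range values from low to very high AR; not thread data.
sample_ar = [0.01, 0.1, 0.4, 1.0, 2.7]

def axis_positions(values, log_scale):
    """Map values to 0..1 positions along an axis, linearly or by log10."""
    xs = [math.log10(v) for v in values] if log_scale else list(values)
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

linear = axis_positions(sample_ar, log_scale=False)
logged = axis_positions(sample_ar, log_scale=True)

for ar, lin, lg in zip(sample_ar, linear, logged):
    print(f"AR={ar:<5} linear={lin:.3f} log={lg:.3f}")
```

On the linear axis the low-AR points crowd into the first few percent of the width; on the log axis they spread out across it.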
ID: 861824 · Report as offensive
Profile Westsail and *Pyxey*
Volunteer tester
Joined: 26 Jul 99
Posts: 338
Credit: 20,544,999
RAC: 0
United States
Message 861830 - Posted: 4 Feb 2009, 9:39:56 UTC - in response to Message 861820.  

Count me in!
GTX260 here, if someone could PM me what to do to get the data.
Thanks guys, keep krunchin'


"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov
ID: 861830 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 861837 - Posted: 4 Feb 2009, 10:11:39 UTC - in response to Message 861830.  

Count me in!
GTX260 here, if someone could PM me what to do to get the data.
Thanks guys, keep krunchin'

Don't need to keep it private, it's all in public on these boards.

Make yourself an empty folder to work in.

Copy the script from the second message in this thread.

Save it in your work folder with a .BAT extension. Right-click on it, and choose 'Make shortcut'.

Right-click on the shortcut, and choose 'Properties'.

Look at the 'Target' box - the first editable one. Leave the existing text unchanged, but at the end, type a space, and then the location of your BOINC Data folder. If you don't know where your data folder is, refer to The Big BOINC 6 Answer Thread. Your Windows machines are XP, so it'll probably be something like

"C:\Documents and Settings\All Users\Application Data\BOINC"

Because that line has spaces in it, you'll need to include the quotation marks.

OK the properties, and double-click the shortcut. You should see three new files appear in the work folder: All_ar.txt, Start_times.txt, and End_times.txt: they should have data in them (not be empty). You'll be able to check that when All_ar.txt pops open in Wordpad - just save it and close Wordpad - that step tidies up some formatting. If the file is empty, go back and check the previous steps.

Move the shortcut somewhere convenient, and run it a couple of times a day. Once you've caught a decent amount of data, say in a couple of days' time, zip up the three text files, and send them to me at initial dot surname at btinternet dot com - that way, we'll build up the comparison on a single combined graph.


ID: 861837 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 861841 - Posted: 4 Feb 2009, 10:39:24 UTC
Last modified: 4 Feb 2009, 11:07:11 UTC

@Richard: emailed some results earlier, but forgot entirely that I throttled everything back (to 50% underclock, from 20% OC) over the really hot days here, so the times won't be a really useful comparison to anything (fairly scattered). Also forgot to do the open-in-Wordpad/save trick to fix the line endings... doh... Oh well, will try harder for some better quality data next time.

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 861841 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 21746
Credit: 7,508,002
RAC: 20
United Kingdom
Message 861857 - Posted: 4 Feb 2009, 12:00:35 UTC - in response to Message 861824.  

One thing I've noticed is that one core is pretty much holding at 75% to 95% for the CUDA app.

On my box, it takes the CPU 18 secs of high activity to prep the data for the GPU and then it is almost idle for the rest of the time.

I think the difference is between the Windows and Linux implementations. Am I right in thinking that the GPUgrid applications are reported to be the other way round?

The CPU process isn't doing a "busy wait" polling the GPU is it?!
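To illustrate what a "busy wait" costs, here is a minimal sketch, not the CUDA app's actual code: a sleeping thread stands in for the GPU job (an assumption purely for illustration), and the main thread either spins on a flag or blocks until it is signalled. The names `simulated_gpu_job` and `cpu_seconds_while_waiting` are hypothetical.

```python
import threading
import time

def simulated_gpu_job(done: threading.Event, seconds: float) -> None:
    """Stand-in for a GPU kernel: sleeps, then signals completion."""
    time.sleep(seconds)
    done.set()

def cpu_seconds_while_waiting(busy_wait: bool, seconds: float = 0.3) -> float:
    """Return the CPU time this process burned while waiting for the fake job."""
    done = threading.Event()
    worker = threading.Thread(target=simulated_gpu_job, args=(done, seconds))
    cpu_before = time.process_time()
    worker.start()
    if busy_wait:
        while not done.is_set():  # spin: poll the flag as fast as possible
            pass
    else:
        done.wait()               # block: sleep until the flag is set
    worker.join()
    return time.process_time() - cpu_before

spinning = cpu_seconds_while_waiting(busy_wait=True)
blocking = cpu_seconds_while_waiting(busy_wait=False)
print(f"busy-wait CPU: {spinning:.3f} s, blocking CPU: {blocking:.3f} s")
```

The wall time is about the same in both cases; only the spinning version keeps a core loaded for the whole wait, which is the pattern the 75% to 95% figure would be consistent with.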

@ Martin - my graphing skills, such as they are, are borrowed from the Dark Side too - you may have to pick an arrow from your own quiver. Just make sure you choose one that can do logarithmic scales - AR looks very silly if plotted on a linear scale.

I'm very happy to use whatever tool is most appropriate, regardless of the OS.

I do take great issue with certain very dubious practices and outright pollution... But that is for another thread!

Meanwhile, my plots would be a straight vertical line in any case... Still churning AR=2.7 :-(

Happy crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 861857 · Report as offensive
Profile Westsail and *Pyxey*
Volunteer tester
Joined: 26 Jul 99
Posts: 338
Credit: 20,544,999
RAC: 0
United States
Message 861867 - Posted: 4 Feb 2009, 12:18:01 UTC - in response to Message 861837.  

Cool thanks! Will shoot some data your way soon.
"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov
ID: 861867 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 862452 - Posted: 5 Feb 2009, 22:30:12 UTC

New data from the regulars, and additional input from BeemerBiker:


(Direct link)

But Joseph (BB) - are you sure you're scraping the right stuff? For this thread, we're interested in elapsed (wall) time: from the stuff you sent me, I wonder if you're scraping the web pages for CPU time? Either that, or your 9800GTX+ is much faster than my 9800GTX+......
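For anyone unsure of the distinction, a minimal sketch of elapsed (wall) time versus CPU time, using Python's standard timers. The sleep stands in for time spent waiting on a GPU; that stand-in is an assumption for illustration only.

```python
import time

wall_start = time.perf_counter()  # wall-clock (elapsed) timer
cpu_start = time.process_time()   # CPU time charged to this process

time.sleep(0.2)                              # waiting: wall clock runs, CPU time barely moves
total = sum(i * i for i in range(200_000))   # real work: both clocks advance

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start
print(f"wall: {wall_elapsed:.3f} s, CPU: {cpu_elapsed:.3f} s")
```

A task that mostly waits on the GPU looks very quick if only CPU time is plotted, which would explain a 9800GTX+ appearing much faster than it is.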
ID: 862452 · Report as offensive
Profile Dirk Sadowski
Volunteer tester

Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 862475 - Posted: 5 Feb 2009, 22:50:23 UTC
Last modified: 5 Feb 2009, 23:02:37 UTC

9800 GTX+
GPU Engine Specs:
Processor Cores 128
Graphics Clock (MHz) 738 MHz
Processor Clock (MHz) 1836 MHz
Texture Fill Rate (billion/sec) 47.2

Memory Specs:
Memory Clock (MHz) 1100 MHz
Standard Memory Config 512 MB
Memory Interface Width 256-bit
Memory Bandwidth (GB/sec) 70.4

GTX 280
GPU Engine Specs:
Processor Cores 240
Graphics Clock (MHz) 602 MHz
Processor Clock (MHz) 1296 MHz
Texture Fill Rate (billion/sec) 48.2

Memory Specs:
Memory Clock (MHz) 1107 MHz
Standard Memory Config 1 GB
Memory Interface Width 512-bit
Memory Bandwidth (GB/sec) 141.7


~ same crunching speed.. I'm confused.. ;-D


EDIT:
There are two 9800 GTX+ out there..
512 and 1024 MB..
Maybe because of this different speed?

EDIT #2:
OCed GPU?
ID: 862475 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 862490 - Posted: 5 Feb 2009, 23:03:54 UTC - in response to Message 862452.  
Last modified: 5 Feb 2009, 23:19:41 UTC

New data from the regulars, and additional input from BeemerBiker:


(Direct link)

But Joseph (BB) - are you sure you're scraping the right stuff? For this thread, we're interested in elapsed (wall) time: from the stuff you sent me, I wonder if you're scraping the web pages for CPU time? Either that, or your 9800GTX+ is much faster than my 9800GTX+......


Sorry - all I have access to is CPU time. I do not see how to get wall clock time from the resultid table. Also, I failed to parse for the phrase "Invalid", which can show up in the resultid table even though "Over", "Done" and "Success" show up in the hostid task list, such as any of these here. Note that my data came from seti beta.


The seti gods have changed (just a few hours ago) the layout of the user account page and I will have to rewrite part of my access database program before I can use it again.

<UPDATE>

It would appear that the elapsed time is only in that stdout.txt file and its .old counterpart. The new file is not very big, so I assume a lot of the stuff I scraped from my account pages cannot be matched against the stuff in that stdout file. I will look at it later.
ID: 862490 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 862508 - Posted: 5 Feb 2009, 23:50:42 UTC - in response to Message 862490.  

If you look at the second post in this thread, you'll see that all the other participants are parsing local files only for Start_times and End_times: no reference to the website at all.

If you can integrate local records (start, end, times) with web data (validation, credit), that would be a major development in research tool technology. The biggest hurdle is matching tasks: the local records all relate to WU/task names, whereas the web gives you WU/task IDs. The only way I've found to match the two is to use an Excel function to pull the task name out of the tooltip text on the webpage.
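Once the task name has been recovered from the web page, the matching itself is a simple join on that name. A minimal sketch, assuming both sides are already loaded as lists of records; every field name and value below is hypothetical:

```python
# Hypothetical records; real field names and values will differ.
local_runs = [
    {"task_name": "example_task_a_0", "start": 1233700000, "end": 1233700960},
    {"task_name": "example_task_b_1", "start": 1233701000, "end": 1233701950},
]
web_results = [
    {"task_id": 111, "task_name": "example_task_a_0", "validated": True, "credit": 30.5},
    {"task_id": 222, "task_name": "example_task_c_0", "validated": False, "credit": 0.0},
]

def join_on_task_name(local, web):
    """Pair each local run with the web row that carries the same task name."""
    web_by_name = {row["task_name"]: row for row in web}
    merged = []
    for run in local:
        match = web_by_name.get(run["task_name"])
        if match is None:
            continue  # no web record for this task (yet); skip it
        merged.append({**run, **match, "elapsed": run["end"] - run["start"]})
    return merged

merged = join_on_task_name(local_runs, web_results)
for row in merged:
    print(row["task_id"], row["task_name"], row["elapsed"], row["validated"])
```

Tasks present on only one side are dropped here; a real tool would probably want to report them instead.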

But while you're thinking about that one, try the recipe at message 861837 in this thread, and see how you get on.

@ Sutaru,

As it happens, my 9800GTX+ is an overclocked 512MB model. But the significant difference is that I am plotting GPU processing times, whereas BB only supplied CPU times. That C vs. G makes all the difference.
ID: 862508 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 862670 - Posted: 6 Feb 2009, 11:21:12 UTC - in response to Message 862475.  

9800 GTX+
GPU Engine Specs:
Processor Cores 128
Graphics Clock (MHz) 738 MHz
Processor Clock (MHz) 1836 MHz
Texture Fill Rate (billion/sec) 47.2

Memory Specs:
Memory Clock (MHz) 1100 MHz
Standard Memory Config 512 MB
Memory Interface Width 256-bit
Memory Bandwidth (GB/sec) 70.4

GTX 280
GPU Engine Specs:
Processor Cores 240
Graphics Clock (MHz) 602 MHz
Processor Clock (MHz) 1296 MHz
Texture Fill Rate (billion/sec) 48.2

Memory Specs:
Memory Clock (MHz) 1107 MHz
Standard Memory Config 1 GB
Memory Interface Width 512-bit
Memory Bandwidth (GB/sec) 141.7


~ same crunching speed.. I'm confused.. ;-D
...

Something I read 3 or 4 weeks ago (don't remember where), suggested that CUDA work would be like doing textures. So as a very uncertain hypothesis, perhaps the "Texture Fill Rate" specification may be very significant.
                                                              Joe
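The ratios behind that hypothesis can be worked out from the figures quoted above. A rough sketch: "cores x shader clock" is only a crude proxy for arithmetic throughput (an assumption, not a benchmark), and the bandwidth formula assumes two transfers per memory clock, which is consistent with the quoted GB/sec numbers.

```python
specs = {
    "9800 GTX+": {"cores": 128, "shader_mhz": 1836, "texture_fill": 47.2,
                  "mem_mhz": 1100, "bus_bits": 256},
    "GTX 280":   {"cores": 240, "shader_mhz": 1296, "texture_fill": 48.2,
                  "mem_mhz": 1107, "bus_bits": 512},
}

def bandwidth_gb_s(mem_mhz, bus_bits):
    """Memory bandwidth in GB/s: two transfers per clock, 8 bits per byte."""
    return mem_mhz * 2 * bus_bits / 8 / 1000

old, new = specs["9800 GTX+"], specs["GTX 280"]
ratios = {
    "cores x shader clock": (new["cores"] * new["shader_mhz"])
                            / (old["cores"] * old["shader_mhz"]),
    "memory bandwidth": bandwidth_gb_s(new["mem_mhz"], new["bus_bits"])
                        / bandwidth_gb_s(old["mem_mhz"], old["bus_bits"]),
    "texture fill rate": new["texture_fill"] / old["texture_fill"],
}

for name, ratio in ratios.items():
    print(f"GTX 280 / 9800 GTX+ {name}: {ratio:.2f}x")
```

Of the three, only the texture fill rate ratio (about 1.02x) is close to the "same crunching speed" observation; the other two come out at about 1.32x and 2.01x.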
ID: 862670 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 862730 - Posted: 6 Feb 2009, 14:46:04 UTC

Maybe in a future development it will be possible to "measure" somehow the real GPU time (through the CPU app that feeds it) and in this way a real estimate will be possible.
ID: 862730 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 862770 - Posted: 6 Feb 2009, 17:40:08 UTC - in response to Message 862508.  
Last modified: 6 Feb 2009, 17:46:39 UTC


If you can integrate local records (start, end, times) with web data (validation, credit), that would be a major development in research tool technology. The biggest hurdle is matching tasks: the local records all relate to WU/task names, whereas the web gives you WU/task IDs. The only way I've found to match the two is to use an Excel function to pull the task name out of the tooltip text on the webpage.


I now have a perl script that retrieves start, stop unix style timestamps and task names from the stdoutdae "txt" and "old" local seti project files. I also have a VS2008 C# program that walks the boinc project statistic pages for a given user and creates an access database (the one I sent you earlier).
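For comparison, a rough Python sketch of the same idea (not the Perl script itself). The log line shapes in `SAMPLE_LOG` are an assumption about what the client log looks like, and the task names are made up, so check both against your own file first.

```python
import re
from datetime import datetime

# Assumed line shapes and invented task names; verify against a real log.
SAMPLE_LOG = """\
04-Feb-2009 09:00:00 [SETI@home] Starting example_task_a_0
04-Feb-2009 09:16:05 [SETI@home] Computation for task example_task_a_0 finished
04-Feb-2009 09:16:06 [SETI@home] Starting example_task_b_1
"""

LINE = re.compile(
    r"^(?P<stamp>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \[[^\]]*\] "
    r"(?:Starting (?P<started>\S+)"
    r"|Computation for task (?P<finished>\S+) finished)$"
)

def elapsed_by_task(log_text):
    """Return {task name: seconds between its start line and its finish line}."""
    starts, elapsed = {}, {}
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if not m:
            continue
        # %b parses English month abbreviations under the default C locale.
        when = datetime.strptime(m["stamp"], "%d-%b-%Y %H:%M:%S")
        if m["started"]:
            starts[m["started"]] = when
        elif m["finished"] in starts:
            begun = starts.pop(m["finished"])
            elapsed[m["finished"]] = (when - begun).total_seconds()
    return elapsed

print(elapsed_by_task(SAMPLE_LOG))
```

Start-to-finish differences overstate the run time for any task that was suspended or pre-empted part way through, so such tasks would need filtering before plotting.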

I am looking at adding those timestamps into my database and will notify you of the URL when I get them in. It would be nice if that wallclock info was already in the resultsid statistics. GPUGRID provides some GPU time info but unfortunately no results.

I also have a perl script that automates web login to the projects I am running as I got tired of going to each project to manually run the "merge computers by name" script. I have not finished the merge part but I did manage to get the autologin part working for each of my 10 projects which was a major effort (for me) and my first ever perl script.
ID: 862770 · Report as offensive
Profile SoNic

Joined: 24 Dec 00
Posts: 140
Credit: 2,963,627
RAC: 0
Romania
Message 862772 - Posted: 6 Feb 2009, 17:43:34 UTC - in response to Message 862770.  
Last modified: 6 Feb 2009, 18:16:51 UTC

I just found out (thanks Raistmer) that the GPU wall time is provided in some results, like this one (mine):

http://setiathome.berkeley.edu/result.php?resultid=1152171164

It has a wall-time of 3174.3 sec and a CPU time of 164.3594 sec.
It is just a matter of standardization I guess...
Now I can compare the speed of my GeForce 9500 GT: a similar unit on my C2D E4300 @2.4GHz takes approx 4300-4400 sec.

Actually, CUDA doesn't look so fast now:

http://setiathome.berkeley.edu/workunit.php?wuid=408776650

It is one of my GeForce 9500GT results compared with an E8400 @3GHz running an optimized app. CPU time looks great for CUDA, but actual wall-time is longer...

If that wall-clock time info is right, CUDA is just a way of adding another processor, and not the 5-10 times speedup that is claimed!
Just marketing hype from the guys at NVidia?
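The arithmetic behind that comparison, as a small sketch using only the figures quoted above. It assumes the 4300-4400 sec CPU run really is a like-for-like work unit, which the post only describes as "similar".

```python
gpu_wall_s = 3174.3             # GeForce 9500 GT wall time from the result page
gpu_task_cpu_s = 164.3594       # CPU time reported for that same result
cpu_only_s = (4300.0, 4400.0)   # "approx 4300-4400 sec" on the C2D E4300

speedups = [t / gpu_wall_s for t in cpu_only_s]
cpu_share = gpu_task_cpu_s / gpu_wall_s

print(f"GPU vs CPU-only speedup: {speedups[0]:.2f}x to {speedups[1]:.2f}x")
print(f"CPU busy for {cpu_share:.1%} of the GPU task's wall time")
```

So on these numbers the 9500 GT is roughly 1.35x to 1.39x faster than one E4300 core, while tying up that core for only about 5% of the run.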
ID: 862772 · Report as offensive
Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 862783 - Posted: 6 Feb 2009, 18:26:18 UTC - in response to Message 862772.  
Last modified: 6 Feb 2009, 18:30:57 UTC

I'm not so sure that wallclock time is accurate. I'm showing one I did on CPU and it is showing a wallclock time of 10515.8 but a CPU time of 9269.751. Another one showed 9785.6 W/C and 9349.06 CPU.



This is the 10515.8 one http://setiathome.berkeley.edu/result.php?resultid=1150583601


PROUD MEMBER OF Team Starfire World BOINC
ID: 862783 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 862787 - Posted: 6 Feb 2009, 18:33:28 UTC

Another day, another update.


(Direct link)

I've taken BB's CPU timings off the chart while he works on his Perl script - it'll be great when that comes through, because all the other 2xx-series card owners I've invited have been too bashful to come to the party.

Two new contributions today: an 8600GT from Brodo's host 4735693, and another 8600GT from Martin (ML1) - no host ID, I'm afraid.

Brodo - not yet a lot of overlap between AR data (from cache) and time data (from logs). If you could keep running the script, please, and send me the data again in a few days, the graph will fill out - at the moment, your 10 results have identical AR, and all the times are within a 3-second (0.1%) range.

Martin - as regular readers will know - is a Linux user (running SETI@home MB CUDA 608 Linux 64bit SM 1.0 - r06 by Crunch3r), so that's a first for this charting sequence. So it's particularly useful to have Brodo's little point plumb in the middle of Martin's trendline: on the limited evidence so far, the GPU speeds of the Linux and Windows apps are the same. Just need to get the Linux CPU usage down a bit, so the CPU can do something useful at the same time.
ID: 862787 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 862788 - Posted: 6 Feb 2009, 18:34:43 UTC - in response to Message 862772.  

I just found out (thanks Raistmer) that the GPU wall time is provided in some results, like this one (mine):

http://setiathome.berkeley.edu/result.php?resultid=1152171164

It has a wall-time of 3174.3 sec and a CPU time of 164.3594 sec.
It is just a matter of standardization I guess...
Now I can compare the speed of my GeForce 9500 GT: a similar unit on my C2D E4300 @2.4GHz takes approx 4300-4400 sec.

Actually, CUDA doesn't look so fast now:

http://setiathome.berkeley.edu/workunit.php?wuid=408776650

It is one of my GeForce 9500GT results compared with an E8400 @3GHz running an optimized app. CPU time looks great for CUDA, but actual wall-time is longer...

If that wall-clock time info is right, CUDA is just a way of adding another processor, and not the 5-10 times speedup that is claimed!
Just marketing hype from the guys at NVidia?


Raistmer's and the Lunatics code is optimized for CPU (significantly SSSE3), so if you are running one of those CPUs the CUDA stats will not be that much better.

The 5-10 times figure applies to the stock or factory BOINC project code, but IANE.
ID: 862788 · Report as offensive
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.