CUDA cards: SETI crunching speeds
Author | Message |
---|---|
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Ah, OK. Well, please ask Crunch3r to build a non "-poll" Linux version. AFAIK there is no other CUDA app for Linux available... |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
Returning to our original topic....

Overview: [chart] (Direct link)
Detail: [chart] (Direct link)

Three new contributions added: SoNic's 9500 GT (host 4777950), n9zl's GTX 280 (host 3375638), and Fred W's GTX 295 (host 3452433).

SoNic and n9zl are still in the early stages of collecting data, so forgive me if I concentrate on Fred's GTX 295. He comments "I now have BOTH halves of my GTX295 running :))", suggesting it might have been running single-ended for part of this time.

The overview chart is most interesting here, especially at VLAR where the times are tightly clustered. Fred has two quite distinct groupings, one at 3000 seconds, and another at 4000 seconds. Both are better than my overclocked 9800GTX+ at ~4270 seconds, but I know if I'd paid out for a GTX295, I'd hope to get the 3000 second model: a 7% improvement from 4270 to 4000 hardly seems worth the extra money (even if it is a dual core).

Fred's data contains the start times for all his 459 tasks, so I'll delve into my database and see if there's an identifiable speed-change part way through his run.

Edit - the time-banding doesn't seem to depend either on AR or time started - the VLARs start

True_AR ................... Duration .. Start_time
0.011498800262108 ... 4040 ... 30/01/2009 22:55:58
0.011498800262108 ... 3073 ... 30/01/2009 23:05:31
0.011498800262108 ... 3017 ... 30/01/2009 23:09:30
0.011498800262108 ... 3047 ... 30/01/2009 23:18:34
0.011498800262108 ... 3065 ... 30/01/2009 23:24:31
0.011498800262108 ... 3091 ... 30/01/2009 23:56:44
0.011498800262108 ... 3052 ... 30/01/2009 23:59:47
0.011498800262108 ... 4009 ... 31/01/2009 00:03:18
0.011498800262108 ... 3032 ... 31/01/2009 00:09:21
0.011498800262108 ... 3019 ... 31/01/2009 00:15:36
0.011498800262108 ... 3003 ... 31/01/2009 00:48:15
0.011498800262108 ... 3037 ... 31/01/2009 00:50:39
0.011498800262108 ... 2984 ... 31/01/2009 00:59:53

and end

True_AR ................... Duration .. Start_time
0.010283171229633 ... 3990 ... 06/02/2009 00:57:20
0.010812442064176 ... 3826 ... 06/02/2009 01:00:52
0.010812442064176 ... 3828 ... 06/02/2009 02:27:31
0.010812442064176 ... 3988 ... 06/02/2009 02:28:51
0.010812442064176 ... 3992 ... 06/02/2009 03:51:27
0.010812442064176 ... 3821 ... 06/02/2009 04:00:58
0.010812442064176 ... 4002 ... 06/02/2009 05:22:02
0.010812442064176 ... 3826 ... 06/02/2009 05:27:16
0.010812442064176 ... 4021 ... 06/02/2009 06:49:40
0.010812442064176 ... 3820 ... 06/02/2009 06:50:55
0.010812442064176 ... 3987 ... 06/02/2009 08:13:21
0.010812442064176 ... 3822 ... 06/02/2009 08:17:32
0.010812442064176 ... 3829 ... 06/02/2009 09:43:00
0.010812442064176 ... 3997 ... 06/02/2009 09:43:53
0.010812442064176 ... 3826 ... 06/02/2009 11:02:16
0.010812442064176 ... 4014 ... 06/02/2009 11:07:04
0.010812442064176 ... 3819 ... 06/02/2009 13:00:07
0.010812442064176 ... 3979 ... 06/02/2009 13:07:04

I wonder if it matters whether both 'cores' are doing VLAR at the same time? |
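To put numbers on the two bands, here is a minimal Python sketch that uses only the Duration values listed for the start of Fred's run. The 3500-second split point and the helper names are illustrative assumptions, not anything from Richard's database or spreadsheet; the 4270-second figure is the 9800GTX+ VLAR time quoted in the post.

```python
# Durations in seconds, copied from the "VLARs start" listing in the post.
durations = [4040, 3073, 3017, 3047, 3065, 3091, 3052,
             4009, 3032, 3019, 3003, 3037, 2984]

REFERENCE_9800GTX_PLUS = 4270  # seconds, the overclocked 9800GTX+ figure from the post
BAND_SPLIT = 3500              # assumed cut-off between the two bands


def split_bands(values, cutoff=BAND_SPLIT):
    """Return (fast, slow) lists of durations on either side of the cut-off."""
    fast = [v for v in values if v < cutoff]
    slow = [v for v in values if v >= cutoff]
    return fast, slow


def throughput_gain_percent(reference_s, duration_s):
    """Percentage more tasks per hour than a card needing reference_s per task."""
    return (reference_s / duration_s - 1.0) * 100.0


fast, slow = split_bands(durations)
fast_mean = sum(fast) / len(fast)
slow_mean = sum(slow) / len(slow)

print(f"fast band: {len(fast)} tasks, mean {fast_mean:.0f} s")
print(f"slow band: {len(slow)} tasks, mean {slow_mean:.0f} s")
for label, seconds in (("4000 s band", 4000), ("3000 s band", 3000)):
    gain = throughput_gain_percent(REFERENCE_9800GTX_PLUS, seconds)
    print(f"{label}: {gain:.1f}% more throughput than the 9800GTX+")
```

On these figures the 4000-second band works out at roughly 7% more throughput than the 9800GTX+, and the 3000-second band at roughly 42%.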
Joined: 27 May 99 Posts: 309 Credit: 70,759,933 RAC: 3 |
Richard: I sent you data (an Access mdb) from my 280 and 9800GTX+. All the data came from the SETI Beta project, not the regular SETI. |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
> ... Fred's GTX 295. He comments "I now have BOTH halves of my GTX295 running :))", suggesting it might have been running single-ended for part of this time. The overview chart is most interesting here, especially at VLAR where the times are tightly clustered. Fred has two quite distinct groupings, one at 3000 seconds, and another at 4000 seconds. Both are better than my overclocked 9800GTX+ at ~4270 seconds, but I know if I'd paid out for a GTX295, I'd hope to get the 3000 second model: a 7% improvement from 4270 to 4000 hardly seems worth the extra money (even if it is a dual core). Fred's data contains the start times for all his 459 tasks, so I'll delve into my database and see if there's an identifiable speed-change part way through his run.

I can't be totally sure, but I believe that all the results have been collected since I managed to get the second core crunching. A couple of other possibilities spring to mind:

1. One core is actually driving my monitor, and this is my main machine, used for all the usual domestic stuff a web-connected computer is used for. The monitor on the second core is fired up only occasionally (e.g. Ozz & James on iPlayer, and I haven't needed that since stats collection began), so that core is pretty well devoted to crunching ATM.
2. (More clutching at straws) I happened to see 2 VLAR CUDA tasks starting up very close together earlier today - i.e. progress on one was < 2% when the second started. Up to about 8.5% the progress bars alternated: whilst one was moving, the other was invariably stopped. Beyond 10% progress this did not seem to apply. Note that this is after the initial 18 seconds of no progress while the CPU is preparing and loading data into the GPU. Also, this effect does not seem to apply to non-VLAR units.

F. |
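One way to probe the "both cores on VLAR at once" idea is to measure how much each VLAR task's run window overlaps with another VLAR task. The sketch below is only an illustration: it assumes that Start_time plus Duration gives the wall-clock window of a task (the listing does not confirm that Duration is elapsed rather than CPU time), it uses four rows copied from the end-of-run listing, and the function names are made up for the example.

```python
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M:%S"

# (duration in seconds, start time) - four rows from the end-of-run listing.
rows = [
    (3990, "06/02/2009 00:57:20"),
    (3826, "06/02/2009 01:00:52"),
    (3828, "06/02/2009 02:27:31"),
    (3988, "06/02/2009 02:28:51"),
]


def window(duration_s, start_text):
    """Return the assumed (start, end) wall-clock window of one task."""
    start = datetime.strptime(start_text, FMT)
    return start, start + timedelta(seconds=duration_s)


def overlap_seconds(a, b):
    """Seconds during which windows a and b were both running (0 if disjoint)."""
    latest_start = max(a[0], b[0])
    earliest_end = min(a[1], b[1])
    return max(0, int((earliest_end - latest_start).total_seconds()))


windows = [window(duration, start) for duration, start in rows]

for i, (duration, start) in enumerate(rows):
    shared = sum(overlap_seconds(windows[i], other)
                 for j, other in enumerate(windows) if j != i)
    print(f"{start}: {duration} s, {shared / duration:.0%} of it alongside another VLAR")
```

If the slow band lines up with tasks that spend most of their window alongside another VLAR, and the fast band with tasks that do not, that would support the idea; the same calculation over all 459 tasks would be needed before reading anything into it.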
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
> Richard: I sent you data (an Access mdb) from my 280 and 9800GTX+. All the data came from the SETI Beta project, not the regular SETI.

OK, I see it now - had my head buried in databases and spreadsheets for a while there. No problem with it being Beta data - Beta is running the identical v6.08 to Main while they struggle to find a solution to the VLAR issue. I'll do it next time - too late for another run this evening. |
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
@Richard Could you change the color (or dot shape? maybe an open dot, not filled) for the 9500 and 8800 curves, for better distinction? |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
@Richard I'm trying to keep the dots as small as possible - with ~2,500 points from my own machines alone, it can get a bit crowded in there! But I'll see if I can find another colour, and maybe take out plots of some of the cards if there's nothing new to show. |
Joined: 24 Dec 00 Posts: 140 Credit: 2,963,627 RAC: 0 |
Good job on those graphs, Richard. If you can make that distinction Raistmer was talking about, it would be even better :) My CUDA app is aborting the VLAR WUs... it's one of Raistmer's mods. I will keep adding new zip files every day on that site, just tell me when to stop :) |
Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
Well, this confirms my short-run tests, where I saw little difference between running SaH tasks on my 9800 GT and on the GTX 280 card ... I can suggest that there is actually a possible logical reason for this apparent paradox ... if the capabilities of the 9800 GT are sufficient for the task, then adding capabilities like those of the more expensive cards is just so much waste, as they are not used. Only the programmers can say for sure ... but what good are 3,000 computing elements if all that you need are 100? If all you need are 100, then 2,900 are twiddling their thumbs ... GPU Grid experience says that there should be at least a factor of 2, and as much as 5, change in processing speed when going from the 9800 class to the GTX 260, 280, 295 (there seem to be only small differences in the processing speeds of these cards) ... |
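The "3,000 elements when you only need 100" argument can be put as a toy model. The sketch below assumes, purely for illustration, that work arrives as a fixed number of independent items and that each computing element handles one item per pass; real CUDA scheduling is far more involved, and the card sizes are the hypothetical figures from the post, not real hardware.

```python
import math


def idle_elements(parallel_work_items, compute_elements):
    """Computing elements left with nothing to do when the work is too narrow."""
    return max(0, compute_elements - parallel_work_items)


def passes_needed(parallel_work_items, compute_elements):
    """Passes required if each element handles one work item per pass."""
    return math.ceil(parallel_work_items / compute_elements)


def relative_speed(parallel_work_items, small_card, big_card):
    """How many times sooner the bigger card finishes under this toy model."""
    return (passes_needed(parallel_work_items, small_card)
            / passes_needed(parallel_work_items, big_card))


# 100 items of parallel work on a hypothetical 3,000-element card.
print(idle_elements(100, 3000))         # elements twiddling their thumbs
print(relative_speed(100, 100, 3000))   # narrow work: no gain from the big card
print(relative_speed(3000, 100, 3000))  # wide work: the big card pulls ahead
```

Under these assumptions the bigger card only helps once the application exposes more parallel work than the smaller card can hold, which is the point being made about the 9800 GT versus the GTX 280.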
Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
@Paul We could check whether this powerful GPU is being starved or whether the reason is something else. Could you try the following:

1) Take AppTimes.exe from the KWSN testbench.
2) Run the CUDA app from this tool in a separate directory (as a usual standalone run):
   a) .\AppTimes .\your_cuda_app.exe
   b) .\AppTimes .\your_cuda_app.exe -poll

Then post the resulting times here (AppTimes will print elapsed and CPU times). It would be very interesting to see whether we get a big elapsed-time difference between these two ways on a top GPU or not. |
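For anyone without AppTimes to hand, the same two numbers (elapsed time and CPU time of a child process) can be collected with a few lines of Python. This is not AppTimes and makes no claim to match its output; it is a rough sketch that assumes a POSIX system (on Windows `os.times()` reports zero for child processes), and the stand-in command is only there so the example runs without a CUDA app.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return (elapsed_seconds, child_cpu_seconds)."""
    before = os.times()
    started = time.perf_counter()
    subprocess.run(argv, check=True)
    elapsed = time.perf_counter() - started
    after = os.times()
    # CPU time charged to finished child processes: user plus system.
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    return elapsed, cpu


# Stand-in command so the sketch runs anywhere; swap in the real app,
# e.g. ["./your_cuda_app"] and then ["./your_cuda_app", "-poll"].
stand_in = [sys.executable, "-c", "sum(range(200000))"]
elapsed, cpu = time_command(stand_in)
print(f"elapsed {elapsed:.2f} s, CPU {cpu:.2f} s")
```

A large gap between elapsed and CPU time means the process spent most of its run waiting rather than computing on the CPU, so comparing both figures with and without -poll is what makes the test informative.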
Joined: 24 Dec 00 Posts: 140 Credit: 2,963,627 RAC: 0 |
> GPU Grid experience says that there should be at least a factor of 2, and as much as 5, change in processing speed when going from the 9800 class to the GTX 260, 280, 295 (there seem to be only small differences in the processing speeds of these cards) ...

It would be interesting to find out what happens here. I am curious to see where my 9500GT stands compared to the other cards. |
Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0 |
You've got a PM, Richard... I did a run of the tasks this morning on the Tesla. :) Thanks to Raistmer for his assistance with getting the script working properly for me. Can't wait to see the graph. I swear the thing is the same speed as my GTX+. "The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov |
Joined: 24 May 99 Posts: 15 Credit: 1,651,989 RAC: 0 |
Hello Richard, I've sent the 3 files from your script for my 8800 GTS (320 MB). ...small contribution... Thanks. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
Sorry for the delay, been trying to tidy up the process a bit.

Overview: [chart] (Direct link)
Detail: [chart] (Direct link)

I'm now not showing every data series on every chart. Some are 'n/p' (not plotted): for those that are shown, the number of points is in the first column.

This time, I'm concentrating on the new data from BeemerBiker and westsail. BB has got his script working for his 9800GTX (Beta host 37473) and GTX280 (Beta host 37273) - the same app is running at Beta, so nothing wrong in comparing the figures: in fact it's helping us with additional Angle Ranges to fill in the gaps.

We also have the first few - very few - points from westsail's Tesla (host 4764467). He's described it as "...an early engineering sample built on a GX200 chip. It only has 2 core arrays so only 192 cores. Also the clock is only 750mhz", but I thought you'd all like to see it - definitely the fastest yet at VHAR, even though the CPU setup time probably narrows the gap between the cards on these short runs.

If you can run another scrape on the C1060 tonight or tomorrow morning, Brandon (and I can remember how to get data out of your upload handler), I can get it plotted quickly - it takes a bit of setup time to add the other cards you mentioned in your PM.

Edit - the 'n/p' is only showing on the overview chart - Excel is remembering something I told it to forget. Drat. Better next time. |
Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0 |
Sweet, thanks! Sent you a new run with about half a day's tasks. I'm most curious to see the VLAR performance, but I think the teamwork app kills them. Should I try another app? Just been waiting for 6.09. |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
> Sweet, thanks! Sent you a new run with about half a day's tasks. I'm most curious to see the VLAR performance, but I think the teamwork app kills them. Should I try another app? Just been waiting for 6.09.

There are versions of the teamwork app both with and without VLAR kill, so the option is there. F. |
Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0 |
Thanks *surfs to lunatics* Going to put the one that crunches VLARs on the Tesla host. Will dump Richard some more data in, say, 12 hours. It ought to be interesting. I just can't wait until we can do AP on CUDA. :) |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
> Thanks *surfs to lunatics* Going to put the one that crunches VLARs on the Tesla host. Will dump Richard some more data in, say, 12 hours. It ought to be interesting. I just can't wait until we can do AP on CUDA. :)

Interesting indeed. The VLARs take about 66 mins on my 295. F. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
Quick re-chart of the detail view to get the extra data points for the Tesla, and the counting working properly. [chart] (Direct link) No point showing the overview with VLARs while he's killing them all ;-) |
Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0 |
lol, VLAR_cuda_fix now running on C1060 ;) |
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.