OpenCL NV MultiBeam v8 SoG edition for Windows
Author | Message |
---|---|
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Yeah, I've been running my Macs with one of TBar's builds with pretty good success for a while, so I have a good baseline for comparison. I'll see if he will compile one with the SoG switch. Thanks, Chris |
Joe Januzzi Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
The right kernel (MultiBeam_Kernels_r3381.cl) seems to lower my CPU usage :) More FYI: a -v 8 switch test with the right kernel (MultiBeam_Kernels_r3381.cl). Hopefully it gives some better data. I'll be going back to rev. 3366, because it's faster and uses less CPU on my system, with or without the -v 8 switch. Someone with CPU cores to spare might be better off with the newer revision (rev. 3381). If you need more testing, such as different parameters based on the test results or a different revision, just let me know. Joe With -v 8 switch: http://setiathome.berkeley.edu/workunit.php?wuid=2075019060 http://setiathome.berkeley.edu/workunit.php?wuid=2075019072 http://setiathome.berkeley.edu/workunit.php?wuid=2075018859 http://setiathome.berkeley.edu/workunit.php?wuid=2074685866 Without -v 8 switch: http://setiathome.berkeley.edu/workunit.php?wuid=2073773400 http://setiathome.berkeley.edu/workunit.php?wuid=2074976160 http://setiathome.berkeley.edu/workunit.php?wuid=2074975852 http://setiathome.berkeley.edu/workunit.php?wuid=2074936776 Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
Joined: 27 May 99 Posts: 5516 Credit: 528,817,460 RAC: 242 |
"Next time you will see such coherence please record AR of involved tasks also." When I started this last night I was looking at the angles to see what effect the AR had, since these were about 2 minutes slower. AR was 0.42-0.44, which tends to be the majority of what I normally see. I've been reviewing those that processed overnight, and except for a few high-angle ones, most are within the range of 0.42-0.44. I've never got the hang of offline testing, so I stuck an exclusion line into cc_config and prevented SETI from running on 3 of the 4 GPUs; currently only GPU 0 is running. I restarted it with all the instances at the same time. It's using anywhere from 22-25% CPU utilization across 16 hyperthreaded cores (it looks like the load is spread across 6-7 cores). As the tasks approach 60% complete, CPU demand begins to increase, up to a final 30% total CPU utilization. Once all complete and new tasks start, the value drops back to 22-25% total CPU utilization. I had a series of 20dec10 tasks with AR 1.36; those used significantly less CPU (8-10% total CPU) than the other work units. Once those cleared, I ran with 2 GPUs next: CPU utilization went from a low of 27% up to 46% total CPU as completion approached. I'm going to remove the -v 8 switch, as I can't find the AR with it in there; I've been forced to look at my wingman's stderr for the AR. Think I'm going to switch back to r3366 for now. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I can't see any indication the build is actually using the SoG feature. However, it's giving an Unknown Error the other builds didn't, so I suppose the -DUSE_SIGNALS_ON_GPU option worked: ERROR: Available memory buffer of 128MB too small for PulseFind (168.7MB required), increase -sbs N value; exiting... It seems a little slower than the normal non-SoG build and uses a little more CPU. I haven't tried the nVidia SoG build yet... |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Yeah, "signals_on_gpu" shows up in the build features of the Windows app... Doesn't look like it's getting included for some reason. Chris |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Well, something is causing it to require more SBS, and the only change was adding signals_on_gpu to the configure line. It is a newer repository version, if that somehow matters. The nVidia version isn't any different: Build features: SETI8 Non-graphics OpenCL USE_OPENCL_NV OCL_CHIRP3 ASYNC_SPIKE FFTW SSSE3 64bit It also doesn't show much change from the non-SoG version... I wonder if it would be better with just -DSIGNALS_ON_GPU instead of: ...-DUSE_OPENCL -DUSE_OPENCL_HD5xxx -DUSE_SIGNALS_ON_GPU -DUSE_SSSE3 -DUSE_FFTWF -DSETI7 -DSETI8 -DOCL_CHIRP3 -DASYNC_SPIKE... |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
SIGNALS_ON_GPU is just one of the possible paths; other defines control other paths. Switching config lines is not too hard, actually. And don't build from head; it's currently under development. Use the same rev as the published Windows build. It's more stable, though it lacks recently added features. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Hmmm, it seems -DSIGNALS_ON_GPU has awoken the beast. But now it says: #error: SIGNALS_ON_GPU path currently implemented only for ZERO_COPY path From my experience, ZERO_COPY slows down a shorty by about a minute, and ASYNC_SPIKE is good for times a few seconds faster... We'll see. Now I'm seeing the same errors as with the last build: sah_v7_opt/src/counters.h:251:9: error: use of undeclared identifier '__rdtsc' Is there an easy way to fix these errors? |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I gave up on the counters and just slashed out the offending lines... as before. If I knew how, I'd just disable the counters entirely, similar to the older builds that don't have the counters. The ATI app was just a little slower than the non-SoG build; the nVidia app was quite a bit slower. I went back to the non-SoG builds. |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
I updated the whole set of builds. Better CU loading in PulseFind is now implemented, which may result in a faster app. Worth checking. |
Joined: 27 May 99 Posts: 5516 Credit: 528,817,460 RAC: 242 |
Where can we download it from? Is there a link to the new apps? |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
Good deal. I was already getting outstanding performance on my 570 just running one WU at a time. Would the new PulseFind implementation show up on a fresh compile of the non-SoG AMD app as well, or just the SoG version? Thanks, Chris |
Grumpy Swede (I stand with Ukraine) Joined: 1 Nov 08 Posts: 8927 Credit: 49,849,242 RAC: 65 |
"I updated the whole set of builds." Thanks, Raistmer. Will try it later this week (maybe not until the weekend), when I get the time to do so. Until now, I've crunched over 11000 WUs with the SoG v3366, with only one invalid, which probably wasn't a "real" invalid anyhow. BTW, Windows Defender (Windows 8.1) didn't like the new file "MB8_win_x86_SSE3_OpenCL_NV_r3401_SoG.7z" and wanted me to send it in to M$ headquarters for further analysis, which I refused to do. I'm sure it was just because it hasn't been seen before in the wild. |
Joined: 17 Feb 01 Posts: 33270 Credit: 79,922,639 RAC: 80 |
"Good deal. I was already getting outstanding performance on my 570 just running one WU at a time. Would the new PulseFind implementation show up on a fresh compile of the non-SoG AMD app as well, or just the SoG version?" Yes, both SoG and non-SoG have the new PulseFind. With each crime and every kindness we birth our future. |
Joined: 16 Jun 01 Posts: 6324 Credit: 106,370,077 RAC: 121 |
"Good deal. I was already getting outstanding performance on my 570 just running one WU at a time. Would the new PulseFind implementation show up on a fresh compile of the non-SoG AMD app as well, or just the SoG version?" Yes. And the link is: https://cloud.mail.ru/public/DMkN/x4BRCYuAV |
Joe Januzzi Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
FYI, here are some SoG 3401 WUs: http://setiathome.berkeley.edu/workunit.php?wuid=2086366753 http://setiathome.berkeley.edu/workunit.php?wuid=2086366685 http://setiathome.berkeley.edu/workunit.php?wuid=2086358459 http://setiathome.berkeley.edu/workunit.php?wuid=2086374320 Joe Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
Rasputin42 Joined: 25 Jul 08 Posts: 412 Credit: 5,834,661 RAC: 0 |
I am getting very spiky utilization, and it takes much longer than r3366. What are the requirements for r3401? |
Joined: 27 May 99 Posts: 5516 Credit: 528,817,460 RAC: 242 |
Yeah, I'm seeing large continuous kernel usage. Times look to be about 4 minutes slower than version 3366. Tomorrow I'm going to try the non-SoG version and see what that does in regard to CPU and times. |
Joe Januzzi Joined: 13 Apr 03 Posts: 54 Credit: 307,134,110 RAC: 492 |
Version 3401 is using more CPU. My CPU usage is 85-100%, closer to 100%. I'll run all night with this version. Hopefully it gives some good data. My GTX 560 Ti is now using SoG version 3366. I'm also running 1 CPU WU. Joe Real Join Date: Joe Januzzi (ID 253343) 29 Sep 1999, 22:30:36 UTC Try to learn something new everyday. |
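
For anyone hitting the PulseFind buffer error TBar quoted earlier in the thread, the message itself states how much memory is required, so a value to try for -sbs can be read straight out of it. The following is only a rough sketch: it assumes -sbs takes a size in MB, and the helper name `suggest_sbs` and the rounding step of 64 MB are made up for illustration, not something the app prescribes.

```python
import math
import re

# Matches the error text quoted in the thread and captures both sizes.
SBS_ERROR = re.compile(
    r"Available memory buffer of (?P<have>\d+(?:\.\d+)?)MB too small for "
    r"PulseFind \((?P<need>\d+(?:\.\d+)?)MB required\)"
)


def suggest_sbs(error_line: str, step_mb: int = 64) -> int:
    """Return the smallest multiple of step_mb that covers the required size.

    step_mb is an illustrative assumption; the app does not mandate it.
    """
    match = SBS_ERROR.search(error_line)
    if match is None:
        raise ValueError("not a PulseFind buffer-size error")
    needed_mb = float(match.group("need"))
    return math.ceil(needed_mb / step_mb) * step_mb


line = (
    "ERROR: Available memory buffer of 128MB too small for PulseFind "
    "(168.7MB required), increase -sbs N value; exiting..."
)
print(suggest_sbs(line))  # 192
```

With the quoted error, this suggests trying -sbs 192; whether the GPU actually has that much free memory to spare is a separate question the sketch does not check.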
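
The "exclusion line" in cc_config mentioned earlier in the thread (keeping SETI off 3 of the 4 GPUs so that only GPU 0 runs) can be sketched as below. This assumes BOINC's `<exclude_gpu>` option with its `<url>` and `<device_num>` children; the project URL and the device numbers 1-3 are guesses at that setup, and the function name `build_cc_config` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed project URL; use the URL your BOINC client lists for the project.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def build_cc_config(excluded_devices, project_url=PROJECT_URL):
    """Build a cc_config tree with one <exclude_gpu> entry per device number."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for device in excluded_devices:
        exclude = ET.SubElement(options, "exclude_gpu")
        ET.SubElement(exclude, "url").text = project_url
        ET.SubElement(exclude, "device_num").text = str(device)
    return root


# Exclude GPUs 1, 2 and 3 so that only GPU 0 keeps crunching.
root = build_cc_config([1, 2, 3])
if hasattr(ET, "indent"):  # pretty-printing helper, Python 3.9+
    ET.indent(root)
print(ET.tostring(root, encoding="unicode"))
```

The printed XML would go into cc_config.xml in the BOINC data directory, and the client needs to re-read its config files (or be restarted) before the exclusions take effect.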
©2022 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.