SETI@home v8.12 Windows GPU applications support thread

Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1818854 - Posted: 22 Sep 2016, 8:10:35 UTC - in response to Message 1818835.  

Nah, I'm getting the massive overflows from tasks from those tape periods on r3525 also. It isn't just you.



. . PHEW!

. . :)

Stephen

.
ID: 1818854 · Report as offensive
Profile Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 715
Credit: 8,032,827
RAC: 62
France
Message 1818863 - Posted: 22 Sep 2016, 10:53:24 UTC

Hi everyone, I have the same with R3330 ...
ID: 1818863 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1818885 - Posted: 22 Sep 2016, 13:25:55 UTC - in response to Message 1818863.  

Hi everyone, I have the same with R3330 ...



. . Thanks for replying, I'm glad it isn't just me.

Stephen
ID: 1818885 · Report as offensive
AMDave
Volunteer tester

Joined: 9 Mar 01
Posts: 234
Credit: 11,671,730
RAC: 0
United States
Message 1818910 - Posted: 22 Sep 2016, 15:52:36 UTC - in response to Message 1818786.  

Not 90%, but more than noticeable with r3430, and cuda50.
ID: 1818910 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1819314 - Posted: 24 Sep 2016, 6:44:48 UTC

. . Hello Raistmer,

. . I was wondering if you had RC status for v8.19/r3528 yet?

Stephen

.
ID: 1819314 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1819417 - Posted: 24 Sep 2016, 19:20:45 UTC - in response to Message 1819314.  

I hope to get them released on Tuesday. Up to Eric after that.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1819417 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1819467 - Posted: 24 Sep 2016, 22:22:54 UTC - in response to Message 1819417.  

I hope to get them released on Tuesday. Up to Eric after that.



. . That's good news, thanks for your hard work.

Stephen

.
ID: 1819467 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1820358 - Posted: 29 Sep 2016, 1:40:08 UTC - in response to Message 1819417.  
Last modified: 29 Sep 2016, 2:00:10 UTC

The only strange thing about r3528 compared with r3500 (beta vs main) is the reporting/usage of the "LotOfMem path".

(Radeon HD 6570 1024MB / Catalyst 11.12 / Windows XP)
http://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=72429
http://setiathome.berkeley.edu/show_host_detail.php?hostid=4832843

Both apps run with defaults (no command-line options).

r3528 reports "LotOfMem path: yes"
http://setiweb.ssl.berkeley.edu/beta/result.php?resultid=24879952
Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 128MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: yes
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50


r3500 "LotOfMem path: no"
http://setiathome.berkeley.edu/result.php?resultid=5122876433
Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 128MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: no
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50
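
A quick way to confirm that only one line differs between the two reports is to parse both blocks and diff them. This is a minimal sketch that assumes the stderr layout shown above; the function names `parse_device_params` and `diff_params` are made up for illustration and are not part of the SETI@home apps.

```python
# Hypothetical helpers for comparing two "Used GPU device parameters" blocks.
# The log text is copied from the two results quoted above.

R3528_LOG = """Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 128MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: yes
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50
"""

R3500_LOG = R3528_LOG.replace("LotOfMem path: yes", "LotOfMem path: no")


def parse_device_params(stderr_text):
    """Collect 'key: value' lines that follow the block header."""
    params = {}
    in_block = False
    for line in stderr_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Used GPU device parameters are"):
            in_block = True
            continue
        if in_block:
            if ":" not in stripped:
                break  # e.g. "period_iterations_num=50" ends the block
            key, value = stripped.split(":", 1)
            params[key.strip()] = value.strip()
    return params


def diff_params(first, second):
    """Return {key: (first_value, second_value)} for keys whose values differ."""
    keys = sorted(set(first) | set(second))
    return {k: (first.get(k), second.get(k)) for k in keys if first.get(k) != second.get(k)}


if __name__ == "__main__":
    print(diff_params(parse_device_params(R3528_LOG), parse_device_params(R3500_LOG)))
```

With the two blocks above, the only differing key is "LotOfMem path" (yes for r3528, no for r3500).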


Notes:
r3528 is from:
http://boinc2.ssl.berkeley.edu/beta/download/setiathome_8.19_windows_intelx86__opencl_ati5_sah.exe
http://boinc2.ssl.berkeley.edu/beta/download/MultiBeam_Kernels_r3528.cl

bd3682cea1edc9f8973f233839b1fd39 *setiathome_8.19_windows_intelx86__opencl_ati5_sah.exe
277bd652f71c7de826a90064c9280860 *MultiBeam_Kernels_r3528.cl


r3500 is from:
c9080d2b6e940dd6736f737df6260247 *Lunatics_Win32_v0.45_Beta4-for-SoG_setup.exe

dfcf205436ce6b1c85fa35a9a4a66bf8 *MB8_win_x86_SSE2_OpenCL_ATi_HD5_r3500.exe
277bd652f71c7de826a90064c9280860 *MultiBeam_Kernels_r3500.cl


(MultiBeam_Kernels_r3528.cl == MultiBeam_Kernels_r3500.cl)
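
The equality above rests on the two .cl files having the same MD5 sum. For anyone who wants to repeat that check locally, here is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `same_content` are illustrative only, and the file paths are whatever you downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)


# Example call, assuming both kernel files sit in the current directory:
# same_content("MultiBeam_Kernels_r3528.cl", "MultiBeam_Kernels_r3500.cl")
```

Matching MD5 sums are good enough to spot an unchanged file like this; they are not a defence against deliberate tampering.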
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1820358 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1820390 - Posted: 29 Sep 2016, 5:40:15 UTC - in response to Message 1820358.  

r3528 is SoG while r3500 isn't.
"Lot of mem" path has meaning only for SoG.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1820390 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1820393 - Posted: 29 Sep 2016, 5:55:21 UTC - in response to Message 1820390.  

r3528 is SoG while r3500 isn't.
"Lot of mem" path has meaning only for SoG.



. . Now I am confused. If r3500 is not SoG what is it??

Stephen

.
ID: 1820393 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1820396 - Posted: 29 Sep 2016, 6:06:55 UTC - in response to Message 1820393.  

r3528 is SoG while r3500 isn't.
"Lot of mem" path has meaning only for SoG.



. . Now I am confused. If r3500 is not SoG what is it??

Stephen

.

It's a non-SoG AMD build:

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_ZERO_COPY OCL_CHIRP3 FFTW AMD specific USE_SSE2 x86
CPUID: AMD Athlon(tm) II X3 455 Processor

Cache: L1=64K L2=512K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSE4A
OpenCL-kernels filename : MultiBeam_Kernels_r3500.cl
ar=0.429339 NumCfft=195489 NumGauss=1101398310 NumPulse=226362612175 NumTriplet=452728333399
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768

Windows optimized setiathome_v8 application
Based on Intel, Core 2-optimized v8-nographics V5.13 by Alex Kan
SSE2xj Win32 Build 3500 , Ported by : Raistmer, JDWhale

SETI8 update by Raistmer

OpenCL version by Raistmer, r3500

AMD HD5 version by Raistmer
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1820396 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1820399 - Posted: 29 Sep 2016, 6:18:30 UTC - in response to Message 1820390.  
Last modified: 29 Sep 2016, 6:23:08 UTC

r3528 is SoG while r3500 isn't.
"Lot of mem" path has meaning only for SoG.

You mean that setiathome_8.19_windows_intelx86__opencl_ati5_sah.exe is an ATI AMD SoG build?

I didn't expect "Total device global memory: 512MB" to be considered "LotOfMem"

EDIT:
OK, I now see "SIGNALS_ON_GPU"
SETI8 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_ZERO_COPY SIGNALS_ON_GPU OCL_CHIRP3 FFTW AMD specific USE_SSE2 x86
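
The same comparison can be done mechanically by splitting the two "Build features" lines into tokens. A minimal sketch follows; treating the `SIGNALS_ON_GPU` token as the SoG marker is an inference from this thread, not documented app behaviour, and the names `build_flags` and `is_sog_build` are hypothetical.

```python
# "Build features" strings copied from the r3500 and r3528 stderr output in this thread.
R3500_FEATURES = (
    "SETI8 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_ZERO_COPY "
    "OCL_CHIRP3 FFTW AMD specific USE_SSE2 x86"
)
R3528_FEATURES = (
    "SETI8 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_ZERO_COPY "
    "SIGNALS_ON_GPU OCL_CHIRP3 FFTW AMD specific USE_SSE2 x86"
)


def build_flags(features_line):
    """Split a 'Build features' line into a set of whitespace-separated tokens."""
    return set(features_line.split())


def is_sog_build(features_line):
    """Assumption: a build is SoG when its feature list contains SIGNALS_ON_GPU."""
    return "SIGNALS_ON_GPU" in build_flags(features_line)


if __name__ == "__main__":
    # Tokens present in r3528 but not in r3500.
    print(build_flags(R3528_FEATURES) - build_flags(R3500_FEATURES))
    print(is_sog_build(R3500_FEATURES), is_sog_build(R3528_FEATURES))
```

For these two lines the only extra token in r3528 is `SIGNALS_ON_GPU`.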
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1820399 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1820410 - Posted: 29 Sep 2016, 6:37:41 UTC - in response to Message 1820399.  

Yes.
And well, it means the app considered there was enough memory to choose that path.
BTW, try to experiment to see if the selection works properly: increase the -sbs value. Will you see a switch to the non-LoM path?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1820410 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1820452 - Posted: 29 Sep 2016, 8:17:03 UTC - in response to Message 1820396.  

r3528 is SoG while r3500 isn't.
"Lot of mem" path has meaning only for SoG.



. . Now I am confused. If r3500 is not SoG what is it??

Stephen

.

It's a non-SoG AMD build:

Build features: SETI8 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_ZERO_COPY OCL_CHIRP3 FFTW AMD specific USE_SSE2 x86
CPUID: AMD Athlon(tm) II X3 455 Processor

Cache: L1=64K L2=512K

CPU features: FPU TSC PAE CMPXCHG8B APIC SYSENTER MTRR CMOV/CCMP MMX FXSAVE/FXRSTOR SSE SSE2 HT SSE3 SSE4A
OpenCL-kernels filename : MultiBeam_Kernels_r3500.cl
ar=0.429339 NumCfft=195489 NumGauss=1101398310 NumPulse=226362612175 NumTriplet=452728333399
Currently allocated 209 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768

Windows optimized setiathome_v8 application
Based on Intel, Core 2-optimized v8-nographics V5.13 by Alex Kan
SSE2xj Win32 Build 3500 , Ported by : Raistmer, JDWhale

SETI8 update by Raistmer

OpenCL version by Raistmer, r3500

AMD HD5 version by Raistmer



. . OK I will not pretend I understand what all that means apart from basically it is a CPU based OpenCL app (am I right?). But it is running pretty well on my Nvidia cards. It is running on all my rigs with GT730, GTX950 and 2 x GTX970. So confused or not I am impressed :)

Stephen

.
ID: 1820452 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1820540 - Posted: 29 Sep 2016, 14:39:52 UTC - in response to Message 1820452.  
Last modified: 29 Sep 2016, 15:18:38 UTC

   OK I will not pretend I understand what all that means apart from basically it is a CPU based OpenCL app (am I right?).

No - I posted 2 links to tasks run on my ATI AMD GPU: Radeon HD 6570 on Computer 4832843 and the same physical Computer 72429 on SETI@home Beta.
Read second line of my post - "second line" of text, not the empty one ;) :
http://setiathome.berkeley.edu/forum_thread.php?id=79760&postid=1820358#1820358

"USE_OPENCL_HD5xxx" and "AMD HD5" mean ATI AMD Radeon HD 5xxx series (and above)


But it is running pretty well on my Nvidia cards

The MultiBeam_Kernels_r35xx.cl file is the same for ATI AMD and NVIDIA but the .exe is different.
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1820540 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1820548 - Posted: 29 Sep 2016, 15:04:52 UTC - in response to Message 1820410.  

BTW, try to experiment to see if the selection works properly: increase the -sbs value. Will you see a switch to the non-LoM path?

Tried it, it doesn't take ;)

The -sbs switch is "accepted" but has no real effect if set above the default of 128

All of:
Maximum single buffer size set to:256MB
Maximum single buffer size set to:144MB
Maximum single buffer size set to:160MB
Maximum single buffer size set to:136MB
Maximum single buffer size set to:130MB
Maximum single buffer size set to:320MB

Result in:
Currently allocated 201 MB for GPU buffers
...
Restarted at x.xx percent.
Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 128MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: yes
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50

and the video RAM usage is the same: 236 MB (up from ~20 MB with just the desktop)


-sbs really works for me only if < 128
Maximum single buffer size set to:120MB
...
ar=0.012633  NumCfft=146033  NumGauss=0  NumPulse=50210701440  NumTriplet=68004583584
Currently allocated 193 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 14.61 percent.
Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 120MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: yes
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50


Maximum single buffer size set to:32MB
...
OpenCL-kernels filename : MultiBeam_Kernels_r3528.cl 
ar=0.012633  NumCfft=146033  NumGauss=0  NumPulse=50210701440  NumTriplet=68004583584
Currently allocated 105 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 15.63 percent.
Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 32MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: yes
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50

 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1820548 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1820565 - Posted: 29 Sep 2016, 16:13:56 UTC - in response to Message 1820548.  

So, it just can't allow more. And the older revision, the non-SoG one?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1820565 · Report as offensive
Profile BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1820630 - Posted: 29 Sep 2016, 20:59:32 UTC - in response to Message 1820565.  

And the older revision, the non-SoG one?

The same.

The previous Beta task: 27jl16aa.5587.171137.8.42.195.vlar_1

This here: blc5_2bit_guppi_57449_46372_HIP81348_OFF_0020.18939.416.17.26.223.vlar_1


Video RAM used 220 MB (by "ATI MemoryViewer" (31.05.2011)):
Maximum single buffer size set to:132MB
...
Info: CPU affinity mask used: 1; system mask is 7
...
OpenCL-kernels filename : MultiBeam_Kernels_r3500.cl 
ar=0.010761  NumCfft=119383  NumGauss=0  NumPulse=50212798336  NumTriplet=63189698720
Currently allocated 185 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 5.21 percent.
Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 128MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: no
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50


Maximum single buffer size set to:124MB
...
OpenCL-kernels filename : MultiBeam_Kernels_r3500.cl 
ar=0.010761  NumCfft=119383  NumGauss=0  NumPulse=50212798336  NumTriplet=63189698720
Currently allocated 181 MB for GPU buffers
In v_BaseLineSmooth: NumDataPoints=1048576, BoxCarLength=8192, NumPointsInChunk=32768
Restarted at 7.95 percent.
Used GPU device parameters are:
	Number of compute units: 6
	Single buffer allocation size: 124MB
	Total device global memory: 512MB
	max WG size: 256
	local mem type: Real
	LotOfMem path: no
	LowPerformanceGPU path: no
	HighPerformanceGPU path: no
period_iterations_num=50



This seems to be a device || driver || OS limitation, see:
"Max memory allocation: 134217728"
 OpenCL Platform Name:					 AMD Accelerated Parallel Processing
Number of devices:				 1
  Max compute units:				 6
  Max work group size:				 256
  Max clock frequency:				 800Mhz
  Max memory allocation:			 134217728
  Cache type:					 None
  Cache line size:				 0
  Cache size:					 0
  Global memory size:				 536870912
  Constant buffer size:				 65536
  Max number of constant args:			 8
  Local memory type:				 Scratchpad
  Local memory size:				 32768
  Queue properties:				 
    Out-of-Order:				 No
  Name:						 Turks
  Vendor:					 Advanced Micro Devices, Inc.
  Driver version:				 CAL 1.4.1646
  Version:					 OpenCL 1.1 AMD-APP (831.4)

 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1820630 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1820636 - Posted: 29 Sep 2016, 21:37:20 UTC - in response to Message 1820540.  

   OK I will not pretend I understand what all that means apart from basically it is a CPU based OpenCL app (am I right?).


Read second line of my post - "second line" of text, not the empty one ;) :
http://setiathome.berkeley.edu/forum_thread.php?id=79760&postid=1820358#1820358

"USE_OPENCL_HD5xxx" and "AMD HD5" means ATI AMD Radeon HD 5xxx series (and above)


But it is running pretty well on my Nvidia cards

The MultiBeam_Kernels_r35xx.cl file is the same for ATI AMD and NVIDIA but the .exe is different.


. . OK, so the AMD version of r3500 is not SoG but the Nvidia version is?

. . Always so much to learn.

Stephen

.
ID: 1820636 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1820770 - Posted: 30 Sep 2016, 6:32:40 UTC - in response to Message 1820630.  

Ah, indeed. The app checks that value and will not try to acquire more than the runtime can give.
So, this GPU can't use more than 128 MB as a single allocation. But it has 512 MB total, so it can allocate enough separate buffers for the LoM path.
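
The arithmetic behind this can be sketched in a few lines. This is only a model that matches the logs posted in this thread (a requested -sbs value capped at the device's "Max memory allocation"); it is not the app's actual code, and `effective_sbs_mb` is a hypothetical name.

```python
MIB = 1024 * 1024

# Values reported for the Radeon HD 6570 (Turks) in the OpenCL device listing above.
MAX_MEM_ALLOC_BYTES = 134217728   # "Max memory allocation"
GLOBAL_MEM_BYTES = 536870912      # "Global memory size"


def effective_sbs_mb(requested_mb, max_alloc_bytes=MAX_MEM_ALLOC_BYTES):
    """Assumed model: the single buffer size is capped at the device limit."""
    return min(requested_mb, max_alloc_bytes // MIB)


if __name__ == "__main__":
    print(MAX_MEM_ALLOC_BYTES // MIB)               # per-allocation limit in MB
    print(GLOBAL_MEM_BYTES // MIB)                  # total device memory in MB
    print(GLOBAL_MEM_BYTES // MAX_MEM_ALLOC_BYTES)  # max-size buffers that fit in total
    for requested in (320, 256, 130, 120, 32):
        print(requested, "->", effective_sbs_mb(requested))
```

Under this model any request above 128 comes back as 128, while 120 and 32 pass through unchanged, which is what the "Single buffer allocation size" lines in the posted logs show.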
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1820770 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.