Posts by petri33


1) Message boards : Number crunching : GPU Wars 2016: AMD RX 480 (Polaris) (Message 1792063)
Posted 1 day ago by Profile petri33Project Donor
799€.


Perrrkele...

;)



My thoughts exactly, but one cannot help oneself when something this interesting is available. If the performance/noise/temps/power do not match the price, I can send it back.
2) Message boards : Number crunching : GPU Wars 2016: AMD RX 480 (Polaris) (Message 1792023)
Posted 1 day ago by Profile petri33Project Donor
Dayum, so the 1070 is possibly a 970/980 (non-Ti) upgrade at half-ish the initial price?

... and outperforms a GTX 980Ti & the Titan X.


I demand independent reviews, lol

Or better yet, personal inspection?


My computer parts dealer had 5 GTX 1080s on the shelf. I ordered one. Should take 1-3 days to arrive. The card is an Inno3D Founders Edition (aka slow). 799€.
3) Message boards : Number crunching : Why need 5 different stock AMD OpenCL GPU applications? (Message 1790376)
Posted 6 days ago by Profile petri33Project Donor
Well,
There is a haystack of different needles and straws. To attack them best in the long run ....
4) Message boards : Number crunching : GPU Wars 2016: AMD RX 480 (Polaris) (Message 1788937)
Posted 11 days ago by Profile petri33Project Donor
Probably tomorrow I start gathering my various test pieces, so that there is some sort of organised pack for whoever the lucky first one to get a card is, so they can extract some meaningful numbers.

Good to hear.

+1
5) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1788929)
Posted 11 days ago by Profile petri33Project Donor
This is what it shows with just the Two 950s in Yosemite;

Fri May 20 00:54:55 2016 | | Starting BOINC client version 7.6.32 for x86_64-apple-darwin
Fri May 20 00:54:55 2016 | | CUDA: NVIDIA GPU 0: Graphics Device (driver version 7.5.29, CUDA version 7.5, compute capability 5.2, 2048MB, 1874MB available, 2022 GFLOPS peak)
Fri May 20 00:54:55 2016 | | CUDA: NVIDIA GPU 1: Graphics Device (driver version 7.5.29, CUDA version 7.5, compute capability 5.2, 2048MB, 1825MB available, 2022 GFLOPS peak)
Fri May 20 00:54:55 2016 | | OpenCL: NVIDIA GPU 0: Graphics Device (driver version 10.5.2 346.02.03f06, device version OpenCL 1.2, 2048MB, 1874MB available, 2022 GFLOPS peak)
Fri May 20 00:54:55 2016 | | OpenCL: NVIDIA GPU 1: Graphics Device (driver version 10.5.2 346.02.03f06, device version OpenCL 1.2, 2048MB, 1825MB available, 2022 GFLOPS peak)
Fri May 20 00:54:55 2016 | | OpenCL CPU: Intel(R) Xeon(R) CPU E5472 @ 3.00GHz (OpenCL driver vendor: Apple, driver version 1.1, device version OpenCL 1.2)
Fri May 20 00:54:55 2016 | SETI@home | Found app_info.xml; using anonymous platform
Fri May 20 00:54:55 2016 | SETI@home Beta Test | Found app_info.xml; using anonymous platform
Fri May 20 00:54:56 2016 | | Processor: 8 GenuineIntel Intel(R) Xeon(R) CPU E5472 @ 3.00GHz [x86 Family 6 Model 23 Stepping 6]
Fri May 20 00:54:56 2016 | | OS: Mac OS X 10.10.5 (Darwin 14.5.0)

That's great, IF you only want to use Two cards. Now...why would you only want to use Two cards?
It would be nice If BOINC could handle 3 or maybe 4 cards without going Bonkers.
It does handle Three 750TI without any trouble...even Three ATI cards work. Try it with Two 950s & a 750TI though.


How about these environment variables: CUDA_VISIBLE_DEVICES and CUDA_DEVICE_ORDER?

See http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars

I'd try to export CUDA_VISIBLE_DEVICES=0,1,2 with all permutations and try to launch BOINC after each change. The order of devices may have an effect.

CUDA_VISIBLE_DEVICES=1,0,2 would probably list the GTX 750 Ti first.

Petri
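A minimal sketch of trying those permutations from a shell (the launch command is a placeholder, not BOINC's actual invocation, and the three device indices are assumed from the post above):

```shell
# Make device numbering match PCI / nvidia-smi order first,
# so the indices are stable across runs.
export CUDA_DEVICE_ORDER=PCI_BUS_ID

# Try each ordering of the three cards, launching BOINC after each change.
for order in 0,1,2 0,2,1 1,0,2 1,2,0 2,0,1 2,1,0; do
    export CUDA_VISIBLE_DEVICES="$order"
    echo "trying device order: $CUDA_VISIBLE_DEVICES"
    # ./run_client --detach   # placeholder: start (and stop) the client here
done
```

Only processes started after the `export` see the new ordering, so the client has to be restarted between tries.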
6) Message boards : Number crunching : GPU Wars 2016: AMD RX 480 (Polaris) (Message 1788698)
Posted 12 days ago by Profile petri33Project Donor
That sounds like a great plan if it can be achieved. Go Man Go! ;-)


I'm doing some testing and trying and I think that Jason will do the whole thing right.

You can get a sneak preview here. It is an ar 0.42 task done in 164 seconds. There are some high-ar tasks that take less than 60 seconds. The guppi VLARs take about 700 seconds and need some more optimising.

Oh how I wish I could get one of those GTX1080's ....
7) Message boards : Number crunching : GPU Wars 2016: AMD RX 480 (Polaris) (Message 1788691)
Posted 12 days ago by Profile petri33Project Donor
..., why does it have to tax the CPU so much? Why not just upload the task to the GPU, let it grind away at it till it's done, and pass the task back to the CPU to report it on its way?

That would have been me.
And my understanding of it is that it boils down to programme flow & control. For all their power, basically they are extremely limited in their programmability.
Most of what you can programme boils down to the processing of one section of data, once that's done it's returned to the programme running on the CPU which then supplies the next bit of data & so on till the WU is done.
The ability to do something else on the GPU, based on the result from previous processing without requiring input from the CPU is extremely minimal, as I understand it.


Hi,
Minimal but possible. Using streams and multiple staging areas for input and output one can send new work in in advance and retrieve finished work out asynchronously. The CPU does the waiting and GPU keeps on processing.
--
Petri
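The overlap Petri describes (streams plus multiple staging areas, with the CPU doing the waiting) can be sketched roughly like this. This is a minimal, hypothetical example, not the actual SETI@home code; the kernel, buffer names, and chunk sizes are made up:

```cuda
#include <cuda_runtime.h>

// Placeholder kernel standing in for the real per-chunk analysis.
__global__ void process(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] * 2.0f;
}

int main(void)
{
    const int N = 1 << 20, CHUNKS = 4, chunk = N / CHUNKS;
    float *h_in, *h_out, *d_in, *d_out;
    cudaStream_t stream[2];

    // Pinned host buffers are the "staging areas": async copies require them.
    cudaMallocHost((void **)&h_in,  N * sizeof(float));
    cudaMallocHost((void **)&h_out, N * sizeof(float));
    cudaMalloc((void **)&d_in,  N * sizeof(float));
    cudaMalloc((void **)&d_out, N * sizeof(float));
    cudaStreamCreate(&stream[0]);
    cudaStreamCreate(&stream[1]);

    for (int c = 0; c < CHUNKS; ++c) {
        cudaStream_t s = stream[c % 2];   // alternate between two streams
        int off = c * chunk;
        // Send new work in, in advance...
        cudaMemcpyAsync(d_in + off, h_in + off, chunk * sizeof(float),
                        cudaMemcpyHostToDevice, s);
        process<<<(chunk + 255) / 256, 256, 0, s>>>(d_in + off, d_out + off, chunk);
        // ...and retrieve finished work asynchronously; the GPU keeps processing.
        cudaMemcpyAsync(h_out + off, d_out + off, chunk * sizeof(float),
                        cudaMemcpyDeviceToHost, s);
    }
    cudaDeviceSynchronize();              // the CPU does the waiting here

    cudaStreamDestroy(stream[0]);
    cudaStreamDestroy(stream[1]);
    cudaFree(d_in);  cudaFree(d_out);
    cudaFreeHost(h_in); cudaFreeHost(h_out);
    return 0;
}
```

With two streams, the copy of chunk c+1 can overlap the kernel for chunk c, which is what keeps the GPU busy without the CPU feeding it one piece at a time.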
8) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786854)
Posted 20 days ago by Profile petri33Project Donor
I've got a feeling ....

The credit will turn up now, since all kinds of hosts, and especially the standard NV app, are doing Guppi too.

That is a feeling, but I hope I'll wake up to a better world, if not tomorrow then in three weeks.
9) Message boards : Number crunching : Trying To Increase The Clock Speed Of An Nvidia 750 ti ?? (Message 1786847)
Posted 20 days ago by Profile petri33Project Donor
... EDIT2:

After setting higher voltages the cards raised their clocks automagically upwards. I'm expecting some errors, but this is definitely something I have to try. :)

EDIT3:
nvidia-smi reports a couple more watts consumed by the GPUs.


Hmmm, will have to see what (if any) of the control features my 680 on Ubuntu will allow (once updated a few things). It always did go slightly better with a voltage bump, and likewise boost higher once it had it.


And, ..
please do not forget to google "NVIDIA coolbits"

EDIT: And I know you knew about this.
10) Message boards : Number crunching : Trying To Increase The Clock Speed Of An Nvidia 750 ti ?? (Message 1786807)
Posted 20 days ago by Profile petri33Project Donor
I have not yet tried this.
http://www.phoronix.com/scan.php?page=news_item&px=MTg0MDI

Tell us if it works.

EDIT: just did.

do something like this:
root@Linux1:~/sah_v7_opt/Xbranch/client# nvidia-settings -a "[GPU:0]/GPUOverVoltageOffset=16000"
  Attribute 'GPUOverVoltageOffset' (Linux1:0[gpu:0]) assigned value 16000.
root@Linux1:~/sah_v7_opt/Xbranch/client# nvidia-settings -a "[GPU:1]/GPUOverVoltageOffset=16000"
  Attribute 'GPUOverVoltageOffset' (Linux1:0[gpu:1]) assigned value 16000.
root@Linux1:~/sah_v7_opt/Xbranch/client# nvidia-settings -a "[GPU:2]/GPUOverVoltageOffset=16000"
  Attribute 'GPUOverVoltageOffset' (Linux1:0[gpu:2]) assigned value 16000.
root@Linux1:~/sah_v7_opt/Xbranch/client# nvidia-settings -a "[GPU:3]/GPUOverVoltageOffset=16000"
  Attribute 'GPUOverVoltageOffset' (Linux1:0[gpu:3]) assigned value 16000.
root@Linux1:~/sah_v7_opt/Xbranch/client#


EDIT2:

After setting higher voltages the cards raised their clocks automagically upwards. I'm expecting some errors, but this is definitely something I have to try. :)

EDIT3:
nvidia-smi reports a couple more watts consumed by the GPUs.
11) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786806)
Posted 20 days ago by Profile petri33Project Donor

"Ja, men tider dom var uppladdat är inte samma. Dom var något interlaced."
Yes, but the times they were uploaded are not the same. They were somewhat interlaced. (Not running the whole time together, and may have had other kinds of tasks running at the same time.)

In this case it doesn't matter much at all. I've run hundreds of them on BETA, 3 at a time, when I only had GBT VLARs onboard, and I know that the times are 28-30 minutes each.

Just wait until there are plenty of them available here, and you'll see that those are the standard times my 980 runs GBT VLARS with SoG, when run 3 at a time.


OK, jag ska vänta på det.
OK, I'll wait.
12) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786802)
Posted 20 days ago by Profile petri33Project Donor
Running Raistmer's SoG on my Titan X's

Running 3 at a time on each GPU with commandlines from Mike,

Each is taking just under 1 hour (56-58 minutes)

http://setiathome.berkeley.edu/result.php?resultid=4923517925

http://setiathome.berkeley.edu/result.php?resultid=4923517967

http://setiathome.berkeley.edu/result.php?resultid=4923517729

That was a long time. Running 3 at a time with SoG on my 980, takes 28-29 minutes each.

http://setiathome.berkeley.edu/result.php?resultid=4924826717
http://setiathome.berkeley.edu/result.php?resultid=4924819417
http://setiathome.berkeley.edu/result.php?resultid=4924799847


Are you dividing by 3 or is that the exact time? 3 in 54 minutes for me averages 18 minutes

Not dividing, as you can see from the results. Those are the exact times. Each of them is done in less than 30 minutes.

Name blc2_2bit_guppi_57451_24929_HIP63406_0019.20556.831.17.26.163.vlar_0
Run time 29 min 10 sec
CPU time 27 min 50 sec
--------------------------------------------------------------------------
Name blc2_2bit_guppi_57451_25284_HIP63406_OFF_0020.20361.831.17.26.177.vlar_1
Run time 28 min 25 sec
CPU time 27 min 48 sec
------------------------------------------------------------------------------
Name blc2_2bit_guppi_57451_24929_HIP63406_0019.17306.831.18.27.176.vlar_1
Run time 28 min 32 sec
CPU time 25 min 40 sec
----------------------------------------------------------------------------


"Ja, men tider dom var uppladdat är inte samma. Dom var något interlaced."
Yes, but the times they were uploaded are not the same. They were somewhat interlaced. (Not running the whole time together, and may have had other kinds of tasks running at the same time.)
13) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786799)
Posted 20 days ago by Profile petri33Project Donor
Running Raistmer's SoG on my Titan X's

Running 3 at a time on each GPU with commandlines from Mike,

Each is taking just under 1 hour (56-58 minutes)

http://setiathome.berkeley.edu/result.php?resultid=4923517925

http://setiathome.berkeley.edu/result.php?resultid=4923517967

http://setiathome.berkeley.edu/result.php?resultid=4923517729

That was a long time. Running 3 at a time with SoG on my 980, takes 28-29 minutes each.

EDIT: A retake: all were run on GPU 0. So 29 min for each. Not so good.

http://setiathome.berkeley.edu/result.php?resultid=4924826717
http://setiathome.berkeley.edu/result.php?resultid=4924819417
http://setiathome.berkeley.edu/result.php?resultid=4924799847


Are you dividing by 3 or is that the exact time? 3 in 54 minutes for me averages 18 minutes


I clicked the links and saw about 29 min for each of the three. So running one at a time, that would be about 10 min each.

EDIT: but they'd have to run on the same GPU simultaneously.
14) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786795)
Posted 20 days ago by Profile petri33Project Donor
Just as someone said: let the credit screw settle now for a week or three.
I was looking at the credit granted for host http://setiathome.berkeley.edu/results.php?hostid=7939003&offset=0&show_names=0&state=4&appid= and saw a huge difference in credit granted depending on which kind of host the task validated against.
15) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786794)
Posted 20 days ago by Profile petri33Project Donor
Running Raistmer's SoG on my Titan X's

Running 3 at a time on each GPU with commandlines from Mike,

Each is taking just under 1 hour (56-58 minutes)

http://setiathome.berkeley.edu/result.php?resultid=4923517925

http://setiathome.berkeley.edu/result.php?resultid=4923517967

http://setiathome.berkeley.edu/result.php?resultid=4923517729

That was a long time. Running 3 at a time with SoG on my 980, takes 28-29 minutes each.

http://setiathome.berkeley.edu/result.php?resultid=4924826717
http://setiathome.berkeley.edu/result.php?resultid=4924819417
http://setiathome.berkeley.edu/result.php?resultid=4924799847


Those have ar 0.009, 0.01 and 0.009. Other reporters may have an ar of around 0.005 or something. Nice times though (three under 30 minutes).
16) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786780)
Posted 20 days ago by Profile petri33Project Donor
plan autocorr R2C batched FFT failed 5
Not enough VRAM for Autocorrelations...
setiathome_CUDA: CUDA runtime ERROR in device memory allocation, attempt 1 of 6


it has failed at the start...

Sometimes a reboooooot helps. The GPU memory can get into a fragmented state: there is plenty of RAM available, but not in one contiguous block.

How about running the older build? On the top hosts there is one running happily and without any extra inconclusives. (I have trashed my version and need to revert because of too many inconclusives; no errors though.)
17) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786650)
Posted 20 days ago by Profile petri33Project Donor

I suspect they are not doing gaussian searching: Sigma > GaussTOffsetStop: 498 > -434

And that's just correct. Gaussian search isn't performed on VLAR at all.
What bug and what fix then?

EDIT: w/o boundary checking, your change will cause an out-of-range error and result in an exception for a protected memory range.


Yes, I know. I thought for a moment and realised that if the angle rate is low enough there will be no gaussian, since the telescope movement + earth rotation is not enough in a given time window to form a gaussian.

So no error.
18) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786635)
Posted 20 days ago by Profile petri33Project Donor
Sounds about right Petri. Gaussian Search requires a drift over a spot on the sky. i.e. pass the telescope beam over a specific direction and get a bell shaped signal from a constant source. VLAR targeted searches (unblinking eye at one spot) don't pass over a spot, but travel with it.


Yes, I know what a gaussian is and how it is formed into the signal.

I have noticed the same negative value in test wu PG009.

I wonder if this is a bug in the CUDA code base or in all SW versions. See the edit in my other message.
19) Message boards : Number crunching : GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this? (Message 1786629)
Posted 20 days ago by Profile petri33Project Donor
Guppi vlars are coming in and they are done in 9-12 minutes one at a time.

I suspect they are not doing gaussian searching: Sigma > GaussTOffsetStop: 498 > -434

The difference between sigma and GaussTOffsetStop would be 64 if the TOffsetStop were positive ...

analyzePoT.cpp:
  PoTInfo.GaussTOffsetStart = static_cast<int>(floor(swi.analysis_cfg.pot_t_offset * PoTInfo.GaussSigma+0.5));
  PoTInfo.GaussTOffsetStop  = swi.analysis_cfg.gauss_pot_length - PoTInfo.GaussTOffsetStart;

should probably be

analyzePoT.cpp:
  PoTInfo.GaussTOffsetStart = static_cast<int>(floor(swi.analysis_cfg.pot_t_offset * PoTInfo.GaussSigma+0.5));
  PoTInfo.GaussTOffsetStop  = PoTInfo.GaussTOffsetStart + swi.analysis_cfg.gauss_pot_length;


Just my thoughts.
20) Message boards : Number crunching : GPU Wars 2016: AMD RX 480 (Polaris) (Message 1786092)
Posted 22 days ago by Profile petri33Project Donor
but since the summer just arrived here I'm not going to 'winter' in Australia. :)

Mid teens overnight to high 20°s or low 30°s during the day up here.
Of course down south you can enjoy below zero misery over night & single digits (or not much more) during the day.


I know. A big country.

But those 1080's specifications look good. I'm going to have to consider buying one. Just do not know where to dump my 780's. I've got two that I do not use and I have to take one out of the machine if I buy a 1080.

Anyone interested in buying 3 gtx780's (cooling fans broken)?



Copyright © 2016 University of California