GBT ('guppi') .vlar tasks will be send to GPUs, what you think about this?

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1787018 - Posted: 12 May 2016, 14:08:14 UTC - in response to Message 1787017.  

Also supposedly better equipped for longer simpler serial workloads (e.g. bitcoin mining)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1787018 · Report as offensive
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1787019 - Posted: 12 May 2016, 14:12:21 UTC - in response to Message 1787007.  

In SETI Beta I saw that guppi .vlars crunched with atiapu_SoG on my Linux box with an AMD HD 7770 take twice as long as those crunched with ati5_cal132 and earn double the credit.
Tullio

And the link to your report about that on beta is...?
If someone thinks we have the manpower to look at ALL results... well, to put it very politely, you are wrong.
Beta testers (as opposed to beta hosts) are assumed to be sentient beings actively participating in the project, not just mentioning peculiarities "by the way" in a chat thread.

To make it perfectly clear: in the PulseFind area, SoG performance should be exactly equal to non-SoG performance. Since VLAR is almost entirely PulseFind, SoG and non-SoG times should be close. Anything else is worth reporting.

That's what beta is about.

I have reported, even if in the wrong place. I enlisted in SETI Beta because I had read that guppi .vlar tasks were sent to GPUs only there and not on SETI@home. Now I see them here as well, and I am crunching them on my Windows PC with its GTX 750 Ti OC. I have no experience with graphics cards; I mostly use VirtualBox in the CERN projects, where I have been an Alpha tester since November 2010.
Tullio
ID: 1787019 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1787033 - Posted: 12 May 2016, 14:56:06 UTC - in response to Message 1787019.  


I have reported, even if in the wrong place.

I was an Alpha tester since November 2010.
Tullio


So you are probably used to posting links to the original data in a discussion? Where are the links to the results under discussion, so they can be analysed?
ID: 1787033 · Report as offensive
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1787036 - Posted: 12 May 2016, 14:56:49 UTC - in response to Message 1787017.  
Last modified: 12 May 2016, 15:00:59 UTC


I am running 3 per card on my 750 Tis. I am seeing similar times to others, but what I see is that my GPU memory usage is constantly changing between 30 and 90% (in the past it was steady at about 85%). My GPU Meter on the desktop shows this, as does GPU-Z. I see mixed units running, both guppi and non-guppi. I was wondering if this is normal or has some meaning. The GPU usage shows 85-95% most of the time.

It depends on what this particular metric actually measures. If it measures uncached accesses to the device's global memory, then it is roughly equivalent to "cache misses" in the CPU world, and that is apparently a bad thing. If it measures memory accesses without distinguishing between cache hits and misses, then it is a bad thing too, but for another reason, and one that cannot really be fixed: it would mean that parts of the code with low or no compute intensity prevail. In other words, too few arithmetic operations per memory load, the very thing that makes an app memory-constrained.

Unlike Gaussian, which implies a fairly high computational load per sample, PulseFind is very similar to AstroPulse's algorithms. Maybe that's why AMD owners fare differently from NV ones with VLAR and AP: a better memory subsystem (in older generations at least).
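As a rough illustration of "too few arithmetic operations per memory load", the sketch below computes arithmetic intensity (operations per byte moved) and compares it with a device's compute-to-bandwidth ratio, in the spirit of the roofline model. The peak figures and both kernels are made-up placeholders, not measurements of any real card or of any SETI@home code.

```python
def arithmetic_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte read or written."""
    return flops / bytes_moved


def attainable_gflops(intensity, peak_gflops, peak_gbps):
    """Roofline bound: limited by compute or by memory bandwidth, whichever is lower."""
    return min(peak_gflops, intensity * peak_gbps)


def is_memory_bound(intensity, peak_gflops, peak_gbps):
    """True when the kernel sits left of the roofline ridge point."""
    return intensity < peak_gflops / peak_gbps


# Hypothetical device: 1000 GFLOP/s peak compute, 100 GB/s memory bandwidth.
PEAK_GFLOPS = 1000.0
PEAK_GBPS = 100.0

# Hypothetical kernels: 2 operations per 8 bytes loaded vs. 400 per 8 bytes.
low = arithmetic_intensity(flops=2, bytes_moved=8)
high = arithmetic_intensity(flops=400, bytes_moved=8)

for name, ai in [("low-intensity kernel", low), ("high-intensity kernel", high)]:
    bound = "memory" if is_memory_bound(ai, PEAK_GFLOPS, PEAK_GBPS) else "compute"
    print(f"{name}: {ai:.2f} flop/byte, "
          f"{attainable_gflops(ai, PEAK_GFLOPS, PEAK_GBPS):.0f} GFLOP/s attainable, "
          f"{bound}-bound")
```

Under these assumed numbers the low-intensity kernel can never use more than a small fraction of the device's arithmetic throughput, however fast the cores are, which is the sense in which such code is memory-constrained.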


Something I just noticed, and this may have something to do with it. After rebuilding this computer because of some file corruption preventing me from doing a backup of the system, I now have both CUDA50 and CUDA42 WUs running at the same time.
IF this is incorrect, How do I correct it? Both CUDAs seem to be running to completion just fine. This computer: http://setiathome.berkeley.edu/results.php?hostid=7965534

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1787036 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1787041 - Posted: 12 May 2016, 15:12:27 UTC - in response to Message 1787036.  

IF this is incorrect, How do I correct it? Both CUDAs seem to be running to completion just fine. This computer: http://setiathome.berkeley.edu/results.php?hostid=7965534

It's perfectly correct for stock setup.
ID: 1787041 · Report as offensive
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1787045 - Posted: 12 May 2016, 15:32:10 UTC - in response to Message 1787033.  

Then tell me what I had to do.
Tullio
ID: 1787045 · Report as offensive
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1787052 - Posted: 12 May 2016, 15:45:49 UTC - in response to Message 1787041.  

IF this is incorrect, How do I correct it? Both CUDAs seem to be running to completion just fine. This computer: http://setiathome.berkeley.edu/results.php?hostid=7965534

It's perfectly correct for stock setup.

I guess I did not realize it was stock as I thought I had downloaded and installed the Lunatics Installer.
Can I just run the installer now?

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1787052 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1787055 - Posted: 12 May 2016, 15:49:34 UTC - in response to Message 1787045.  

Then tell me what I had to do.
Tullio

Locate tasks you talk about and provide links to them on beta webpage.
ID: 1787055 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1787056 - Posted: 12 May 2016, 15:50:37 UTC - in response to Message 1787052.  

IF this is incorrect, How do I correct it? Both CUDAs seem to be running to completion just fine. This computer: http://setiathome.berkeley.edu/results.php?hostid=7965534

It's perfectly correct for stock setup.

I guess I did not realize it was stock as I thought I had downloaded and installed the Lunatics Installer.
Can I just run the installer now?

Yes, but after that the OpenCL GPU apps need to be updated to rev 3430 from my cloud repo.
ID: 1787056 · Report as offensive
Profile johnnymc
Volunteer tester
Joined: 5 May 99
Posts: 35
Credit: 9,138,623
RAC: 0
United States
Message 1787068 - Posted: 12 May 2016, 17:21:12 UTC
Last modified: 12 May 2016, 17:23:19 UTC

My GTX 960 is handling:
one non-GBT at 56% GPU usage, 20-25 minutes crunch time
two non-GBT at 78% GPU usage, pretty close to the same time each
three non-GBT at 99% GPU usage, but they seem to take 90 minutes each.

Once the GBT vlars came in I had to crunch one work unit at a time.

Two of these running concurrently took about an hour and ten minutes each
with the GPU pegged at 99%.

One at a time takes 25-35 minutes each with gpu usage 97-98%.
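To compare these settings on an equal footing, it can help to convert them to tasks completed per hour. The sketch below does that arithmetic for the GBT figures quoted in this post; the helper name `tasks_per_hour` is just an illustration, and the inputs are the approximate times reported above, not fresh measurements.

```python
def tasks_per_hour(concurrent, minutes_each):
    """Throughput when `concurrent` tasks run side by side and each takes `minutes_each`."""
    return concurrent * 60.0 / minutes_each


# GBT VLAR figures as reported for the GTX 960 in this post.
gbt_two_at_once = tasks_per_hour(2, 70)   # "about an hour and ten minutes each"
gbt_single_best = tasks_per_hour(1, 25)   # fast end of "25-35 minutes"
gbt_single_worst = tasks_per_hour(1, 35)  # slow end of "25-35 minutes"

for label, rate in [
    ("two at once", gbt_two_at_once),
    ("one at a time (25 min)", gbt_single_best),
    ("one at a time (35 min)", gbt_single_worst),
]:
    print(f"{label}: {rate:.2f} tasks/hour")
```

On these numbers, one task at a time is never worse than two at once: even at the slow end of 35 minutes, the single-task rate only drops to the same rate as two concurrent tasks at 70 minutes each.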
Life's short; make fun of it.
ID: 1787068 · Report as offensive
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1787073 - Posted: 12 May 2016, 17:32:36 UTC - in response to Message 1787056.  


Yes, but OpenCL GPU apps need to be updated to rev 3430 after that from my cloud repo.

I ran the Lunatics installer and now the memory usage on the GPU is much more constant.
Where do I find the rev 3430?

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1787073 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1787098 - Posted: 12 May 2016, 20:47:44 UTC - in response to Message 1787073.  


Yes, but OpenCL GPU apps need to be updated to rev 3430 after that from my cloud repo.

I ran the Lunatics installer and now the memory usage on the GPU is much more constant.
Where do I find the rev 3430?


http://mikesworldnet.de/download.html


With each crime and every kindness we birth our future.
ID: 1787098 · Report as offensive
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1787196 - Posted: 13 May 2016, 3:20:35 UTC - in response to Message 1787098.  

Thanks, I could not find it because IE decided to show me your page from the temp file, not the new page. I did a refresh and it was there as it should be.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1787196 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1787218 - Posted: 13 May 2016, 6:09:21 UTC - in response to Message 1787013.  


Why is it that the GPU's ability to process work is so heavily dependent on the CPU?

In short: because there are parts of the control code that have to be executed serially, and because currently they are executed on the CPU.

So all the crunching work should be done on the GPU, but at present it isn't.


And then, when CPU<=>GPU communication is really reduced, "my BOINC doesn't tick" complaints appear...

So ideally that communication should be asynchronous. Every 100th or 10th of a second (or half, or even a whole, second) the GPU outputs its progress on the WU until it's complete, at which time it sends out the result file and requests another WU. The GPU isn't waiting on any CPU responses (until it requests another WU), and the CPU can do whatever it needs to with the progress updates (update the progress bar, ignore them as too soon after the last update, or whatever).
If the CPU doesn't receive any progress updates while the WU is being processed, it initiates a call to the GPU to find out what's going on.
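The scheme proposed here can be sketched in a few lines. This is only a toy model under loud assumptions: a Python thread stands in for the GPU, a queue stands in for the progress channel, and the names `worker` and `run_host` are invented for the example. It is not how the SETI@home applications or the BOINC API are implemented, and it says nothing about whether real GPU hardware and drivers allow a device to report unprompted.

```python
import queue
import threading
import time


def worker(steps, step_seconds, updates):
    """Stand-in for the GPU: crunch, then post progress without waiting for a reply."""
    for done in range(1, steps + 1):
        time.sleep(step_seconds)       # pretend to process part of the WU
        updates.put(done / steps)      # fire-and-forget progress report


def run_host(steps=5, step_seconds=0.01, watchdog_seconds=1.0):
    """Stand-in for the CPU: consume updates, and poll only if none arrive in time."""
    updates = queue.Queue()
    device = threading.Thread(target=worker, args=(steps, step_seconds, updates))
    device.start()

    progress, polls = 0.0, 0
    while progress < 1.0:
        try:
            # Normal path: an update arrives well inside the watchdog window.
            progress = updates.get(timeout=watchdog_seconds)
        except queue.Empty:
            # Watchdog path: no news, so the host asks what is going on.
            polls += 1
            if not device.is_alive():
                break                  # the worker died without finishing
    device.join()
    return progress, polls


if __name__ == "__main__":
    final_progress, watchdog_polls = run_host()
    print(f"progress={final_progress:.0%}, watchdog polls={watchdog_polls}")
```

The design choice being illustrated is that the host blocks only on the arrival of updates, with a timeout acting as the fallback query, rather than stopping the worker to ask for status at every step.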
Grant
Darwin NT
ID: 1787218 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1787225 - Posted: 13 May 2016, 7:19:50 UTC - in response to Message 1787218.  


Why is it that the GPU's ability to process work is so heavily dependent on the CPU?

In short: because there are parts of the control code that have to be executed serially, and because currently they are executed on the CPU.

So all the crunching work should be done on the GPU, but at present it isn't.


And then, when CPU<=>GPU communication is really reduced, "my BOINC doesn't tick" complaints appear...

So ideally that communication should be asynchronous. Every 100th or 10th of a second (or half, or even a whole, second) the GPU outputs its progress on the WU until it's complete, at which time it sends out the result file and requests another WU. The GPU isn't waiting on any CPU responses (until it requests another WU), and the CPU can do whatever it needs to with the progress updates (update the progress bar, ignore them as too soon after the last update, or whatever).
If the CPU doesn't receive any progress updates while the WU is being processed, it initiates a call to the GPU to find out what's going on.


Ah, you know for sure how it "should be". Fine, it seems we have a new optimization team member with fresh and great ideas. Good, you can start here: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt
ID: 1787225 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1787226 - Posted: 13 May 2016, 7:44:20 UTC - in response to Message 1787225.  
Last modified: 13 May 2016, 7:44:32 UTC

Ah, you know for sure how it "should be". Fine, it seems we have a new optimization team member with fresh and great ideas. Good, you can start here: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt

If only I had even an ounce of ability...
Grant
Darwin NT
ID: 1787226 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1787227 - Posted: 13 May 2016, 7:49:45 UTC - in response to Message 1787226.  

Ah, you know for sure how it "should be". Fine, it seems we have a new optimization team member with fresh and great ideas. Good, you can start here: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt

If only I had even an ounce of ability...


So, is all this just wishful thinking and theoretical speculation?
Then I could tell you not how it "should" be but how it "is". It doesn't help,
from either a technical or a human-resources point of view.
ID: 1787227 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1787229 - Posted: 13 May 2016, 8:15:20 UTC - in response to Message 1787227.  

Ah, you know for sure how it "should be". Fine, it seems we have a new optimization team member with fresh and great ideas. Good, you can start here: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt

If only I had even an ounce of ability...


So, is all this just wishful thinking and theoretical speculation?


Curiosity, pure & simple, about why an application running on a GPU is so dependent on CPU resources.
Grant
Darwin NT
ID: 1787229 · Report as offensive
Profile RueiKe Special Project $250 donor
Volunteer tester
Joined: 14 Feb 16
Posts: 492
Credit: 378,512,430
RAC: 785
Taiwan
Message 1787230 - Posted: 13 May 2016, 8:16:27 UTC

They are taking about 10 min each for 1 instance on my Fury X.
GitHub: Ricks-Lab
Instagram: ricks_labs
ID: 1787230 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1787232 - Posted: 13 May 2016, 8:40:33 UTC - in response to Message 1787229.  
Last modified: 13 May 2016, 8:44:19 UTC

Ah, you know for sure how it "should be". Fine, it seems we have a new optimization team member with fresh and great ideas. Good, you can start here: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt

If only I had even an ounce of ability...


So, is all this just wishful thinking and theoretical speculation?


Curiosity, pure & simple, about why an application running on a GPU is so dependent on CPU resources.

Because the GPU and CPU are different devices, and the GPU is a coprocessor, not a Central Processing Unit.

EDIT: and because "application running on a GPU" is misleading jargon. The application always runs on the CPU. That's the way modern PC architecture works, and that's why only one device in a PC is called the CPU.
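One hedged way to put a number on the earlier point about serial control code: if some fraction of each task's run time stays on the CPU, Amdahl's law bounds how much a faster GPU can help overall. The 10% serial fraction below is an arbitrary figure chosen for illustration, not a measurement of any SETI@home application.

```python
def amdahl_speedup(serial_fraction, parallel_speedup):
    """Overall speedup when only the non-serial part of the run time is accelerated."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallel_speedup)


# Assumed for illustration: 10% of the run time is serial control code on the CPU.
SERIAL_FRACTION = 0.10

for gpu_factor in (2, 10, 100, 1000):
    overall = amdahl_speedup(SERIAL_FRACTION, gpu_factor)
    print(f"GPU part {gpu_factor:>4}x faster -> task {overall:.2f}x faster")
```

With a 10% serial share, the overall speedup approaches but never reaches 10x no matter how fast the GPU portion becomes, so the CPU-side part ends up dominating the run time.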
ID: 1787232 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.