Cuda 6 -> Multi-GPU Scaling support for Seti ?

Message boards : Number crunching : Cuda 6 -> Multi-GPU Scaling support for Seti ?
Zarck
Volunteer tester

Joined: 3 Apr 99
Posts: 28
Credit: 662,507
RAC: 0
France
Message 1504602 - Posted: 16 Apr 2014, 6:18:37 UTC

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1504684 - Posted: 16 Apr 2014, 11:59:04 UTC - in response to Message 1504602.  
Last modified: 16 Apr 2014, 11:59:24 UTC

Multi-GPU should only be used if the task scales well across devices.
SETI MultiBeam (and AstroPulse) generally doesn't, so it's better to run a few separate tasks on different GPUs than to use multiple GPUs to process a single task.
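The one-task-per-GPU approach can be sketched as a simple scheduler; this is a minimal illustration with invented task and device names, not SETI@home's actual dispatcher:

```python
# One independent task per GPU: each device processes a whole work unit,
# so no cross-GPU synchronisation or data transfer is needed.

def assign_tasks(tasks, gpus):
    """Round-robin whole, independent tasks onto available GPUs."""
    assignment = {gpu: [] for gpu in gpus}
    for i, task in enumerate(tasks):
        gpu = gpus[i % len(gpus)]
        assignment[gpu].append(task)
    return assignment

gpus = ["gpu0", "gpu1"]
tasks = ["wu_a", "wu_b", "wu_c", "wu_d"]
print(assign_tasks(tasks, gpus))
# Each GPU receives complete tasks; no single task is split across devices.
```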
SETI apps news
We're not gonna fight them. We're gonna transcend them.
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1504692 - Posted: 16 Apr 2014, 12:35:52 UTC - in response to Message 1504602.  
Last modified: 16 Apr 2014, 12:41:29 UTC

Cuda 6 -> Multi-GPU Scaling support for Seti ?

http://on-demand.gputechconf.com/supercomputing/2013/presentation/SC3108-New-Features-CUDA%206%20-GPU-Acceleration.pdf

https://developer.nvidia.com/cuda-toolkit

@+
*_*


In general, current MultiBeam tasks are too small a dataset to partition heavily across devices efficiently, so we tend to end up running multiple tasks per device instead. In advance of larger tasks, I'm currently working on foundations to make it possible to load multiple tasks across multiple GPUs and distribute them in different ways, such as split-streaming individual tasks, or pipelining portions of multiple larger tasks across the GPUs; both can be effective latency-hiding measures.
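A back-of-envelope model of why small tasks partition poorly: each per-device chunk pays a fixed launch/transfer overhead, so splitting only helps once compute time dominates that overhead. The numbers below are invented for illustration:

```python
def split_time(compute, n_gpus, overhead=0.05):
    """Wall time for ONE task partitioned across n_gpus: the compute
    divides by device count, but each launch still pays a fixed
    setup/transfer overhead (all units arbitrary)."""
    return overhead + compute / n_gpus

def split_speedup(compute, n_gpus, overhead=0.05):
    """Speedup of splitting one task across n_gpus vs one GPU."""
    return split_time(compute, 1, overhead) / split_time(compute, n_gpus, overhead)

# A small MultiBeam-sized task: 4 GPUs give only ~2x, not 4x.
print(round(split_speedup(0.1, 4), 2))    # 0.15 / 0.075 = 2.0
# A much larger task amortises the overhead and scales nearly linearly.
print(round(split_speedup(100.0, 4), 2))
# Running 4 independent small tasks, one per GPU, keeps ~4x throughput,
# which is why multiple-tasks-per-device wins while tasks stay small.
```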

Where CUDA 6 features will likely come in down the line is that, with so many options for processing and huge setup variations possible, it would be challenging to work out exactly which is the best way to run different tasks (it's not one-size-fits-all).

There, specially crafted outboard tools will be used to refine the GPU code. These will choose between numerous possible internal settings and parallelism configurations, and can carry some task-type (SETI-specific domain) knowledge.

Since those exhaustive tools will give us a lot of knowledge about how MultiBeam tasks load the devices, the applications will be able to reconfigure themselves, recompile CUDA kernels, work cooperatively, and adapt to user preferences, hardware changes, and other events on the fly.

Again, not so urgent while the tasks remain relatively small, but eventually those forms of latency hiding will help with the normal (small) tasks too. It's just taking a lot of planning to get together. For exhaustive optimisation, the best option looks like moving toward an install-time bench (like 3DMark for graphics) that can be rerun on command or on specific system change. That would depend heavily on CUDA 6 and OpenCL driver-level features.
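The install-time bench idea can be sketched as: time each candidate kernel configuration once, keep the fastest, and rerun only when the system changes. This is a hypothetical illustration (the configuration names and stand-in kernel are invented, not the real SETI kernels):

```python
import time

def bench(configs, run_kernel):
    """Time each candidate configuration and return the fastest one
    plus all measured timings. run_kernel(config) would launch the real
    GPU kernel; here it is any callable taking a config name."""
    timings = {}
    for cfg in configs:
        start = time.perf_counter()
        run_kernel(cfg)
        timings[cfg] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings

# Stand-in for a real kernel launch: cost varies with the chosen
# parallelism configuration (loop counts are arbitrary).
COSTS = {"64-thread": 300000, "128-thread": 100000, "256-thread": 200000}

def fake_kernel(cfg):
    for _ in range(COSTS[cfg]):
        pass

best, timings = bench(list(COSTS), fake_kernel)
print("best config:", best)
```

An installer (or a rerun triggered by a detected hardware change) would cache `best` and hand it to the application, much as 3DMark-style benches persist their results.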

An advantage to that approach might be that I could make it fairly easy for others to craft their own CUDA kernels (and even some host-side code), self-test them for sanity, and submit them to my website for incorporation. I hope this would ease the frustrating development lag I have at the moment, whereby I have Petri33's work on the chirp kernels to include, but continually get waylaid by BOINC (API) issues.

Removing the bulk of the dependence on Berkeley and (semi-)automating benches and tests, with full regression against gold references, would I hope accelerate development in a crowd-sourcing way, while possibly adding some cool tools that hardware (GPU and other) reviewers might like to include in their reviews in the future (providing greater exposure for SETI@home).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.



 
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.