Cuda 6 -> Multi-GPU Scaling support for Seti ?

Message boards : Number crunching : Cuda 6 -> Multi-GPU Scaling support for Seti ?
Zarck
Volunteer tester

Joined: 3 Apr 99
Posts: 28
Credit: 662,507
RAC: 0
France
Message 1504602 - Posted: 16 Apr 2014, 6:18:37 UTC

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1504684 - Posted: 16 Apr 2014, 11:59:04 UTC - in response to Message 1504602.  
Last modified: 16 Apr 2014, 11:59:24 UTC

Multi-GPU should only be used if the task scales well across devices.
SETI MultiBeam (and AstroPulse) generally doesn't, so it's better to run a few separate tasks on different GPUs than to use multiple GPUs to process a single task.
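The one-task-per-GPU approach can be sketched as a simple scheduler; this is a minimal illustration with invented task and device names, not SETI@home's actual dispatcher:

```python
# One independent task per GPU: each device processes a whole work unit,
# so no cross-GPU synchronisation or data transfer is needed.

def assign_tasks(tasks, gpus):
    """Round-robin whole, independent tasks onto available GPUs."""
    assignment = {gpu: [] for gpu in gpus}
    for i, task in enumerate(tasks):
        gpu = gpus[i % len(gpus)]
        assignment[gpu].append(task)
    return assignment

gpus = ["gpu0", "gpu1"]
tasks = ["wu_a", "wu_b", "wu_c", "wu_d"]
print(assign_tasks(tasks, gpus))
# Each GPU receives complete tasks; no single task is split across devices.
```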
SETI apps news
We're not gonna fight them. We're gonna transcend them.
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1504692 - Posted: 16 Apr 2014, 12:35:52 UTC - in response to Message 1504602.  
Last modified: 16 Apr 2014, 12:41:29 UTC

Cuda 6 -> Multi-GPU Scaling support for Seti ?

http://on-demand.gputechconf.com/supercomputing/2013/presentation/SC3108-New-Features-CUDA%206%20-GPU-Acceleration.pdf

https://developer.nvidia.com/cuda-toolkit

@+
*_*


In general, current MultiBeam tasks are too small a dataset to partition heavily across devices efficiently, so we tend to end up running multiple tasks per device instead. In advance of larger tasks, I'm currently working on foundations to make it possible to load multiple tasks across multiple GPUs and distribute them in different ways, such as split-streaming individual tasks, or pipelining portions of multiple larger tasks across the GPUs; both can be effective latency-hiding measures.
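A back-of-envelope model of why small tasks partition poorly: each per-device chunk pays a fixed launch/transfer overhead, so splitting only helps once compute time dominates that overhead. The numbers below are invented for illustration:

```python
def split_time(compute, n_gpus, overhead=0.05):
    """Wall time for ONE task partitioned across n_gpus: the compute
    divides by device count, but each launch still pays a fixed
    setup/transfer overhead (all units arbitrary)."""
    return overhead + compute / n_gpus

def split_speedup(compute, n_gpus, overhead=0.05):
    """Speedup of splitting one task across n_gpus vs one GPU."""
    return split_time(compute, 1, overhead) / split_time(compute, n_gpus, overhead)

# A small MultiBeam-sized task: 4 GPUs give only ~2x, not 4x.
print(round(split_speedup(0.1, 4), 2))    # 0.15 / 0.075 = 2.0
# A much larger task amortises the overhead and scales nearly linearly.
print(round(split_speedup(100.0, 4), 2))
# Running 4 independent small tasks, one per GPU, keeps ~4x throughput,
# which is why multiple-tasks-per-device wins while tasks stay small.
```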

Where CUDA 6 features will likely come in down the line is that, with so many options for processing and huge setup variations possible, it would be challenging to work out exactly which is the best way to run different tasks (it's not one-size-fits-all).

There, specially crafted outboard tools will be used to refine the GPU code. These will choose between numerous possible internal settings and parallelism configurations, and can carry some task-type (SETI-specific domain) knowledge.

Since those exhaustive tools will give us a lot of knowledge about how MultiBeam tasks load the devices, the applications will be able to reconfigure themselves, recompile CUDA kernels, work cooperatively, and adapt to user preferences, hardware changes, and other events on the fly.

Again, not so urgent while the tasks remain relatively small, but eventually those forms of latency hiding will help with the normal (small) tasks too. It's just taking a lot of planning to get together. For exhaustive optimisation, the best option looks like moving toward an install-time bench (like 3DMark for graphics) that can be rerun on command or on specific system change. That would depend heavily on CUDA 6 and OpenCL driver-level features.
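The install-time bench idea can be sketched as: time each candidate kernel configuration once, keep the fastest, and rerun only when the system changes. This is a hypothetical illustration (the configuration names and stand-in kernel are invented, not the real SETI kernels):

```python
import time

def bench(configs, run_kernel):
    """Time each candidate configuration and return the fastest one
    plus all measured timings. run_kernel(config) would launch the real
    GPU kernel; here it is any callable taking a config name."""
    timings = {}
    for cfg in configs:
        start = time.perf_counter()
        run_kernel(cfg)
        timings[cfg] = time.perf_counter() - start
    best = min(timings, key=timings.get)
    return best, timings

# Stand-in for a real kernel launch: cost varies with the chosen
# parallelism configuration (loop counts are arbitrary).
COSTS = {"64-thread": 300000, "128-thread": 100000, "256-thread": 200000}

def fake_kernel(cfg):
    for _ in range(COSTS[cfg]):
        pass

best, timings = bench(list(COSTS), fake_kernel)
print("best config:", best)
```

An installer (or a rerun triggered by a detected hardware change) would cache `best` and hand it to the application, much as 3DMark-style benches persist their results.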

An advantage to that approach might be that I could make it fairly easy for others to craft their own CUDA kernels (and even some host-side code), self-test them for sanity, and submit them to my website for incorporation. I hope this would ease the frustrating development lag I have at the moment, whereby I have Petri33's work on the chirp kernels to include, but continually get waylaid by BOINC (API) issues.

Removing the bulk of the dependence on Berkeley and (semi-)automating benches and tests, with full regression against gold references, would I hope accelerate development in a crowd-sourcing way, while possibly adding some cool tools that hardware (GPU and other) reviewers might like to include in their reviews in the future (providing greater exposure for SETI@home).
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.



 
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.