Theory and Practice of "dynamic optimization" for both cpu and gpu processing

Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1871797 - Posted: 8 Jun 2017, 20:34:18 UTC

Hi,
Because I am not much of a mathematician, or a computer programmer for that matter, I wanted to kick off a thread on optimization in the SETI@home context, specifically the applications we run on both CPUs and GPUs (all brands).

My understanding is that this is made more complicated by optimizations that are possible under Linux/Unix etc. but not under Windows.

Given all the above, one of the questions that has come up is: is "dynamic optimization" even possible in SETI@home? I think the answer may be at least a qualified yes, because I have read of GPU self-improvement efforts via "auto-tuning" (which I am not sure I understand) and some other things I don't understand.

So have at it.

Tom
(Interested bystander)
A proud member of the OFA (Old Farts Association).
ID: 1871797
jason_gee
Volunteer developer
Volunteer tester

Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1872372 - Posted: 11 Jun 2017, 3:51:45 UTC - in response to Message 1871797.  
Last modified: 11 Jun 2017, 4:03:45 UTC

There are 3 'kinds' of optimisation currently used, to varying degrees and combinations, in the seti@home CPU and GPU applications:

1 - build-time, meaning hardwired codepaths relevant to the build's target platform/device (a sketch follows this list)
2 - install-time or first-run, which amounts to selecting specific builds manually, or JIT compilation in the case of OpenCL/CUDA
3 - run-time (dynamic), which resembles a short initial benchmark, with or without FFTW-like 'wisdom'.
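
To make #1 concrete, here is a minimal sketch of a build-time hardwired codepath (the function name, sizes and SIMD choice are illustrative, not taken from the actual seti@home sources): whichever branch the compiler's target flags select (e.g. -mavx) is baked into the binary, and nothing is decided at run time.

    /* Hypothetical sketch of build-time (#1) optimisation: the codepath is
       fixed by the compiler's target flags (-mavx vs. plain), not chosen at
       run time. Function name and sizes are illustrative only. */
    #include <stdio.h>
    #include <stddef.h>

    #if defined(__AVX__)
    #include <immintrin.h>
    /* This path only exists in binaries built with -mavx (or similar). */
    static float sum_squares(const float *x, size_t n)
    {
        __m256 acc = _mm256_setzero_ps();
        size_t i = 0;
        for (; i + 8 <= n; i += 8) {            /* 8 floats per AVX register */
            __m256 v = _mm256_loadu_ps(x + i);
            acc = _mm256_add_ps(acc, _mm256_mul_ps(v, v));
        }
        float t[8];
        _mm256_storeu_ps(t, acc);
        float s = t[0] + t[1] + t[2] + t[3] + t[4] + t[5] + t[6] + t[7];
        for (; i < n; ++i)                      /* scalar tail */
            s += x[i] * x[i];
        return s;
    }
    #else
    /* Generic fallback, present in builds for any other target. */
    static float sum_squares(const float *x, size_t n)
    {
        float s = 0.0f;
        for (size_t i = 0; i < n; ++i)
            s += x[i] * x[i];
        return s;
    }
    #endif

    int main(void)
    {
        float v[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    #if defined(__AVX__)
        puts("AVX codepath hardwired into this binary");
    #else
        puts("generic codepath only in this binary");
    #endif
        printf("sum of squares = %.1f\n", (double)sum_squares(v, 10));
        return 0;
    }

Build it twice, with and without -mavx, and you get two different binaries; that is essentially why there are separate stock builds per platform.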

Few current builds are purely one or another kind, because of the use of libraries and frameworks provided by other vendors. Examples include:
1) Most builds have at least some build-time optimisation, depending on the platform and on whether the build is meant as a stock/generically applicable one or as a 3rd-party anonymous-platform installation (which requires more power-user type knowledge). There are different stock CPU builds per platform, but they are pretty generic within that. 'Too specific' optimisations of this type are problematic for stock distribution, due to Boinc infrastructure limitations.
2) Even stock CPU builds store some FFTW wisdom; GPU builds tend to cache JIT-compiled kernels as binaries on the system, and these usually change if the hardware changes. [Note: this includes user-imparted 'wisdom', such as command line settings. See the sketch after this list.]
3) Stock CPU builds do a lot of this run-time determination, in addition to the FFTW wisdom, as displayed in stderr.txt (it's possible to add a -verbose switch to display the many codepaths included); 3rd-party and GPU builds, not so much. [Auto-tuning that stores partial info overlaps with #2.]
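
As a rough illustration of the FFTW 'wisdom' mentioned above (a sketch only, assuming FFTW3 is installed; the cache filename and transform size are made up, and this is not the applications' actual code): planning with FFTW_MEASURE runs a short benchmark of candidate FFT strategies, and the result can be exported to a file so the next run skips the benchmark.

    /* Sketch of run-time tuning (#3) with cached 'wisdom' (#2) using FFTW3.
       Filename and transform size are made up; link with -lfftw3 -lm. */
    #include <stdio.h>
    #include <fftw3.h>

    #define N            4096
    #define WISDOM_FILE  "seti_demo_wisdom.dat"   /* hypothetical cache file */

    int main(void)
    {
        /* Reuse earlier measurements if a wisdom file already exists. */
        if (fftw_import_wisdom_from_filename(WISDOM_FILE))
            puts("wisdom found: planning should be nearly instant");
        else
            puts("no wisdom yet: FFTW will benchmark candidate plans");

        fftw_complex *buf = fftw_malloc(sizeof(fftw_complex) * N);

        /* FFTW_MEASURE times several FFT algorithms for this size on this
           CPU and keeps the fastest -- a small run-time benchmark. */
        fftw_plan plan = fftw_plan_dft_1d(N, buf, buf,
                                          FFTW_FORWARD, FFTW_MEASURE);

        /* Planning with FFTW_MEASURE may clobber the buffer, so fill the
           dummy input afterwards. */
        for (int i = 0; i < N; ++i) {
            buf[i][0] = (double)i;
            buf[i][1] = 0.0;
        }
        fftw_execute(plan);                      /* use the tuned plan */

        /* Persist what was learned so later runs can start from it. */
        fftw_export_wisdom_to_filename(WISDOM_FILE);

        fftw_destroy_plan(plan);
        fftw_free(buf);
        return 0;
    }

Run it twice: the second time the planning step should come back almost immediately, because the wisdom file already holds the measurements.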

Being this 'organic' an ecosystem, you'll probably see more mature builds head toward #2 & #3. As the device+system landscape grows ever more complex, optimisation becomes more of an AI-like process than a deterministic one, and the demands on our systems (other than dedicated crunchers) change from moment to moment. So eventually a #4 is likely to materialise, such that the applications will be able to work around whatever else you're doing on the system. At present, process priorities appear 'not-great' for this, and unable to cope if a device leaves or is added during a run. In a sense the Boinc client/manager should be doing this heuristic process management, and the landscape is changing. Since the Boinc infrastructure is underfunded and understaffed, alternative 3rd-party solutions will probably need to be devised for this level of management at some point.
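
Purely as a hypothetical sketch of the kind of #4 layer described above (nothing like this exists in the current applications or the Boinc client; the pid handling, thresholds and nice values are invented for illustration): a small POSIX watchdog that periodically checks system load and renices a worker process so it backs off when the machine is busy.

    /* Hypothetical #4 sketch: a tiny POSIX watchdog that lowers a worker
       process's priority when the system is busy and relaxes it when idle.
       The pid comes from the command line; thresholds and nice values are
       arbitrary placeholders, not anything the real apps or Boinc do. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/resource.h>
    #include <sys/types.h>

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <worker-pid>\n", argv[0]);
            return 1;
        }
        pid_t worker = (pid_t)atoi(argv[1]);

        for (;;) {
            double load[1];
            if (getloadavg(load, 1) == 1) {
                /* Busy system: push the worker to the lowest priority
                   (nice 19); otherwise let it run at a mild nice 10. */
                int nice_value = (load[0] > 2.0) ? 19 : 10;
                if (setpriority(PRIO_PROCESS, (id_t)worker, nice_value) != 0)
                    perror("setpriority");
                printf("load %.2f -> nice %d\n", load[0], nice_value);
            }
            sleep(30);                           /* re-evaluate periodically */
        }
        return 0;
    }
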
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1872372
