Constructing A Machine, With An Eye To SETI@Home

Questions and Answers : GPU applications : Constructing A Machine, With An Eye To SETI@Home

Drew

Joined: 22 May 17
Posts: 4
Credit: 0
RAC: 0
United States
Message 1868827 - Posted: 22 May 2017, 19:10:21 UTC
Last modified: 22 May 2017, 19:41:32 UTC

New to SETI@Home, or - I hope to be, soon. Not part of the project yet. Since I'm building a small web server to host a site for my new business, and I intend to devote server downtime to the cause, I've got half-a-dozen questions, all of the how-does-the-software-interact-with-the-hardware kind. If someone would let me know whether there's a more-appropriate forum for that kind of thing, or a more-comprehensive source of detailed hardware advice than the CPU, GPU and 'Top Machines' lists, I'd appreciate it very much.

Thanks to Mr. Kevvy for fixing the title!
ID: 1868827
Drew
Joined: 22 May 17
Posts: 4
Credit: 0
RAC: 0
United States
Message 1868937 - Posted: 23 May 2017, 8:43:35 UTC

Found the Live Help link; going to give that a try. Seems like the way to go. But! Thought I'd post my Q's anyhow, against the chance that anyone's feeling informative. Again - I'm building what amounts to a server for small-office use, and I'd like to optimize it for distributed-computing projects like S@H. But that's difficult without first-hand experience with either BOINC or SETI@Home.

- Does applying multiple GPU's to a single work unit require that they be connected by SLI, Crossfire or some such?
- How good is BOINC at delegating? Do work units tend to make maximal use of available system resources?
- Do work units tend to make maximal use of GPU and CPU power at the same time? How fluid are they?
- Is there any arbitrary limit on the number of work units the network will delegate to a single machine for simultaneous execution?
- For a given machine, is there a good method of predicting the practical limit on the number of work units it's likely to be assigned?
- Do the bandwidth and volume of system storage have noteworthy impacts on execution?
- What kind of internet-connection bandwidth is ideal? What's the top transmission speed of the SETI@Home server?
- How large do the files tend to be, actually?

Answers to any of these questions - whether measured, or personal impressions - would really help to move the ball forward, and would be much appreciated. Thanks.
ID: 1868937
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1868941 - Posted: 23 May 2017, 11:16:28 UTC

- Does applying multiple GPU's to a single work unit require that they be connected by SLI, Crossfire or some such?
Multiple GPUs run multiple tasks; they don't all work on one task, so no SLI or Crossfire link is needed. This project uses single-threaded applications only, meaning that one task will run on each CPU core you have, and one on each GPU you have. You can run multiple tasks on one GPU if it's hefty enough and has enough memory.
- How good is BOINC at delegating? Do work units tend to make maximal use of available system resources?
Depends on what you ask.
- Do work units tend to make maximal use of GPU and CPU power at the same time? How fluid are they?
See answer 1.
- Is there any arbitrary limit on the number of work units the network will delegate to a single machine for simultaneous execution?
See answer 1.
- For a given machine, is there a good method of predicting the practical limit on the number of work units it's likely to be assigned?
All new machines get only a small amount of work at first. Only after a machine has run at least 10 tasks correctly (validated OK, no errors) per piece of hardware (CPU, GPU) will BOINC be able to better estimate how long the various tasks run and ask for the correct amount of work. Since there are at least 4 different lengths of tasks, this can take a while.
- Do the bandwidth and volume of system storage have noteworthy impacts on execution?
No.
- What kind of internet-connection bandwidth is ideal? What's the top transmission speed of the SETI@Home server?
Seti's tasks are just 360KB each, so even when you get 10 at the same time, your 3.6MB download won't impact the connection/bandwidth that much. The project has a 1Tbit connection, with 100Mbit allocated for downloads. Do know that everyone else will be using that same pipe at the same time you are, so you never have the full 100Mbit to yourself.
- How large do the files tend to be, actually?
Multibeam ~360KB, Breakthrough Listen probably the same, Astropulse 8MB.
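
A rough back-of-the-envelope check of those sizes, as a minimal Python sketch. The task sizes and the 100Mbit figure come from the answers above; the function name, the choice of 1 KB = 1000 bytes, and the assumption that you get the whole pipe to yourself (which, as noted, never happens) are mine, so treat the times as a best case only.

```python
# Task sizes quoted above, in kilobytes (1 KB taken as 1000 bytes here).
TASK_SIZE_KB = {"multibeam": 360, "astropulse": 8000}


def transfer_seconds(size_kb: float, link_mbit_per_s: float) -> float:
    """Best-case seconds to move size_kb over a link of link_mbit_per_s."""
    bits = size_kb * 1000 * 8
    return bits / (link_mbit_per_s * 1_000_000)


# Ten Multibeam tasks arriving in one batch.
batch_kb = 10 * TASK_SIZE_KB["multibeam"]

print(f"10 Multibeam tasks: {batch_kb / 1000:.1f} MB")
print(f"at 100 Mbit/s: {transfer_seconds(batch_kb, 100):.3f} s")
print(f"one Astropulse task at 100 Mbit/s: "
      f"{transfer_seconds(TASK_SIZE_KB['astropulse'], 100):.2f} s")
```

Even the best case is well under a second per batch, which is why the connection is rarely the bottleneck for this kind of work.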
ID: 1868941
Drew
Joined: 22 May 17
Posts: 4
Credit: 0
RAC: 0
United States
Message 1869026 - Posted: 24 May 2017, 4:44:55 UTC - in response to Message 1868941.  

That clarifies a lot, thanks. But you really surprised me on a couple of points - would you mind a few follow-ups?

- tasks and work units are the same thing. Right?

- Are the work units compressed to 360KB? Are they larger on-disk? Or larger when they're finished? If they're so small, why do they take more-than-trivial time to process? Or - do they?

- If the work units are always single-threaded, then multiprocessing is only beneficial insomuch as it allows for executing more than one at a time. Right? If so, then about how many can a decent, last-generation GPU work on all at once?

- Once BOINC has validated a machine fully, and familiarized itself with that machine's capacities, is the turnaround between work units only as long as it takes to transmit them?

- Does SETI really have a terabyte web connection? A terabyte? 'cause that is bizonkers. How long do your work units usually take to download/upload?
ID: 1869026
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1869033 - Posted: 24 May 2017, 5:07:00 UTC

Each work unit is sent out initially to two users as "tasks" - this is so results can be validated.
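
To illustrate why the same work unit goes out twice, here is a minimal Python sketch of a quorum check. This is not the project's actual validator: the `Result` type, its `signals` field, the tolerance, and the function names are all hypothetical and only show the general idea of comparing independent results.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    host: str
    signals: List[float]  # hypothetical: reported signal strengths


def agree(a: Result, b: Result, tol: float = 1e-3) -> bool:
    """Two results agree if they report the same signals within tol."""
    return len(a.signals) == len(b.signals) and all(
        abs(x - y) <= tol for x, y in zip(a.signals, b.signals)
    )


def validate(results: List[Result], quorum: int = 2) -> Optional[Result]:
    """Return the first result that `quorum` results agree on, else None."""
    for candidate in results:
        matches = [r for r in results if agree(candidate, r)]
        if len(matches) >= quorum:
            return candidate
    return None  # no agreement yet, so another copy would have to be sent out
```

With two matching results the work unit validates; if the two disagree, nothing validates until a further result arrives to break the tie.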

Tasks are expanded in memory, not on the disk. However, a small amount of additional disk space is needed for holding a snapshot of the results. For example, on this computer the 300 tasks, along with all the overheads, occupy just over 200MB. The return data is only about 30KB, and once a task has been completed and returned, all the files for that task are deleted.
They take a long time to process because there is so much work to be done. Each processing cycle is actually several passes through the data, each pass applying a different search algorithm. A typical CPU run is 2-3 hours of processing time; the actual time depends on what else is running on the computer at the time.

BOINC is a scheduler - it tracks how much work you have in hand, how much you have to return, the times you allow communication with the servers, and which task to run next. You would normally have at least one "spare" task sitting around waiting to be processed; the maximum is 100 per CPU (not per CPU thread or per core) and 100 per GPU.
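
Those limits make the worst-case cache easy to estimate. A minimal sketch, assuming the caps work exactly as described (one allowance of 100 for the machine's CPU plus 100 per GPU) and using the roughly 200MB-for-300-tasks figure from this post as a per-task disk estimate. The function names are mine, not anything from BOINC.

```python
def max_cached_tasks(gpus: int, cpu_limit: int = 100, gpu_limit: int = 100) -> int:
    """Upper bound on tasks in hand: one CPU allowance plus one per GPU."""
    return cpu_limit + gpus * gpu_limit


# About 200 MB for 300 tasks, overheads included, as reported above.
MB_PER_TASK = 200 / 300


def cache_disk_mb(tasks: int) -> float:
    """Approximate disk space needed to hold `tasks` cached tasks."""
    return tasks * MB_PER_TASK


for gpus in (0, 1, 2):
    tasks = max_cached_tasks(gpus)
    print(f"{gpus} GPU(s): up to {tasks} tasks, roughly {cache_disk_mb(tasks):.0f} MB")
```

A machine with two GPUs lands on the 300 tasks mentioned above, so storage volume is not something to size the build around.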

Transfers from the servers to the web actually run at 100Mbit, but even at that reduced speed the data files spend more time being buffered elsewhere than they do being transmitted from the servers...
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1869033
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1869056 - Posted: 24 May 2017, 10:25:52 UTC - in response to Message 1869026.  

Some additional information, covering what Rob didn't answer.
- tasks and work units are the same thing. Right?
Each work unit is sent out as two or more tasks, which go to different computers. What those computers return is then called a result.

- If the work units are always single-threaded, then multiprocessing is only beneficial insomuch as it allows for executing more than one at a time. Right? If so, then about how many can a decent, last-generation GPU work on all at once?
The CPU applications are single-threaded, not the tasks. The tasks are just small data files.

With OpenCL and CUDA on the GPUs it is possible to run a multithreaded application on these tasks. This is why tasks run so much faster on a GPU than on a CPU: the CPU runs one task per core, while the GPU has all its stream processors hacking away in parallel.

Multithreaded OpenCL processing using a CPU is of no interest to the main volunteer developer here.

As for the number of tasks that can be run at the same time on last-generation GPUs: from what I've read around here, the Nvidia 9xx and 1xxx series are best suited for this, and they can run 2 or 3 tasks at the same time. AMD RX 400 and 500 cards can run 2 tasks at the same time, but only with specific drivers and on Windows. Otherwise it's more stable to run just one task.
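
Running more than one task per GPU is normally set up through an `app_config.xml` file in the project's folder inside the BOINC data directory. The sketch below builds such a file with Python's standard library. The `gpu_usage` and `cpu_usage` elements are standard BOINC client settings, but the app name `setiathome_v8` is an assumption on my part; check what your own client lists for the project before using it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float = 1.0) -> str:
    """Return app_config.xml text that lets tasks_per_gpu tasks share one GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Each task claims 1/N of a GPU, so N tasks fit on one card.
    ET.SubElement(gpu, "gpu_usage").text = f"{1 / tasks_per_gpu:.2f}"
    # CPU share reserved to feed each GPU task.
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_per_task:.2f}"
    return ET.tostring(root, encoding="unicode")


# Assumed app name; two tasks per GPU.
print(build_app_config("setiathome_v8", tasks_per_gpu=2))
```

Save the output as `app_config.xml` and have the client re-read its config files. Start with one task per GPU, then try two and compare throughput before going further.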

- Once BOINC has validated a machine fully, and familiarized itself with that machine's capacities, is the turnaround between work units only as long as it takes to transmit them?
No, because the upload process BOINC uses is a two-pronged thing: uploads are just file transfers moving a file from your hard drive to a hard drive over on the server. Then you have to report that task to the server in a separate communication. But here's the clincher: since reporting one task takes as much overhead on the server as reporting 200 tasks does, it is preferred to report as many tasks at the same time as possible. This can make the return time look skewed on the website, because that shows the report time, not the upload time.

To make BOINC use the connection efficiently, it prefers to report tasks when it asks for more work. This means that when you have your cache request set for multiple days, it can take multiple days for BOINC to ask for work and report all the tasks waiting to be reported. With a maximum of 100 tasks per hardware instance, of course.
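
The gap between upload time and report time can be shown with a small Python sketch. It is a simplified model, not BOINC's real scheduling logic: it assumes every result is uploaded the moment the task finishes and is reported at the next scheduler contact. The function name and the example numbers are hypothetical.

```python
from typing import List, Tuple


def report_times(
    finish_hours: List[float], contact_hours: List[float]
) -> List[Tuple[float, float]]:
    """Pair each task's upload time with the time it is actually reported.

    Assumes the upload happens as soon as the task finishes, and the report
    waits for the first scheduler contact at or after that moment.
    """
    pairs = []
    for done in finish_hours:
        later = [t for t in contact_hours if t >= done]
        if later:  # tasks finishing after the last contact are still pending
            pairs.append((done, min(later)))
    return pairs


# Hypothetical numbers: a task finishes every 3 hours, and the client
# contacts the scheduler once a day to ask for more work.
finished = [3.0, 6.0, 9.0, 12.0]
contacts = [24.0, 48.0]
for uploaded, reported in report_times(finished, contacts):
    print(f"uploaded at {uploaded:>4.1f} h, reported at {reported:.1f} h")
```

All four results show the same report time on the website even though they were uploaded hours apart, which is the skew described above.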

- Does SETI really have a terabyte web connection? A terabyte? 'cause that is bizonkers. How long do your work units usually take to download/upload?
Seti has a terabit connection, which it mostly uses to connect to other projects. The Nebula project, through which it sends large swaths of its data to the Einstein project so they can run it on their ATLAS server, uses the 1Tbit connection. The Seti project uses just 100Mbit of that for uploads and downloads and the web backend. Mind, that is the full 100 megabit that they use for uploads and downloads. And you'll have to turn that around, as their upload speed is your download speed. :)

My devices have these numbers:
Phone:
Average upload rate 30.73 KB/sec
Average download rate 3122.92 KB/sec

PC:
Average upload rate 27.33 KB/sec
Average download rate 821.97 KB/sec
ID: 1869056
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1869540 - Posted: 26 May 2017, 14:35:30 UTC

Possible suggestions for using a webserver with BOINC/SETI@home.

1) Put in a lot of system RAM. This will speed up web page serving if/when your business gets busier.

2) Make sure you have the "pause BOINC/Seti when system load exceeds X%" setting turned on (I think 30% is the default). This will mean your webserver has "priority" if/when it gets loaded up by "stuff".

3) I think the webserver is considered a background task. Assuming you are running a Windows (Server) operating system, there is a place where you can set it to give "background" tasks priority over "foreground" tasks, e.g. the monitor/keyboard/mouse and anything you are displaying.

4) Run a temperature throttling application so that your cpu (and possibly your gpu) don't get so hot it slows the cpu down and/or causes a hardware failure. Heat kills on any pc/workstation/server. I like "TThortle".

5) Unless you are buying the brand new/current generation AMD cpu, AMD cpu's do significantly less number crunching due to having 1 number cruncher core for each 2 general cpu cores.

6) Don't plan on running games or anything similar on the console of your webserver.

7) It is possible to get very good production (just not top tier production) using gpu cards like the GTX 750Ti/GTX 1050 Ti (the 750 replacement) while NOT using a lot of electricity. Yes, believe it or not, it will drive up your electricity bill. So if you have modest goals, you can use the above cards to get the best "bang for your (electricity) buck."

8) You may very well need an individual Air Conditioner wherever you place your webserver. I know my office is the hottest place in my house. (And I am down to two systems).

9) Don't run the BOINC/Seti screen saver unless you are NOT doing gpu computing. It looks neat but it slows the gpu processing down. On the other hand it is a great conversation starter. obtw, some versions of Windows 10 have had trouble with the BOINC screen saver. If that is what you are using, don't use the BOINC/Seti screen saver. (I assume you are not using Win10, but it would be feasible for a startup situation.)

10) Yes, running BOINC/Seti will increase the wear and tear on your webserver. Plan on higher maintenance and repair costs than your business budget currently has.
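
For point 7, the running cost is simple arithmetic. A minimal Python sketch follows; the wattage and the electricity price in it are placeholder assumptions, not measurements, so substitute your card's actual draw under load and your own tariff.

```python
def monthly_cost(
    watts: float, price_per_kwh: float, hours_per_day: float = 24.0, days: int = 30
) -> float:
    """Cost of running a constant load of `watts` for `days` days."""
    kwh = watts / 1000 * hours_per_day * days
    return kwh * price_per_kwh


# Placeholder figures: a 75 W card crunching around the clock at 0.12 per kWh.
print(f"{monthly_cost(75, 0.12):.2f} per month")
```

Run it once per card you are considering, and add the CPU's draw as well if it will be crunching too.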

HTH,

Tom Miller
A proud member of the OFA (Old Farts Association).
ID: 1869540
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1869593 - Posted: 26 May 2017, 17:37:59 UTC

5) Unless you are buying the brand new/current generation AMD cpu, AMD cpu's do significantly less number crunching due to having 1 number cruncher core for each 2 general cpu cores.


Not true - this isn't even true for the FX family of processors.
The older (Phenom & Athlon) series had one FPU per core; the FX series had one FPU per pair of cores. The Ryzen family (or at least the members that have been released so far) has one FPU per general core and has multithreading (which makes its architecture similar to the Intel ix & Xeon series).
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1869593
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1869604 - Posted: 26 May 2017, 19:08:20 UTC - in response to Message 1869540.  

I like "TThortle".
TThrottle.

9) Don't run the BOINC/Seti screen saver unless you are NOT doing gpu computing. It looks neat but it slows the gpu processing down.
OpenGL uses the shaders on the GPU; OpenCL and CUDA use the stream processors on the GPU. They can run simultaneously without interfering with each other. You're confusing this with using the screen saver when running work on the CPU: because both the GPU tasks and the screen saver do use the CPU, running these at the same time may result in less work being done.
The graphics application, and therefore the screen saver, only comes with CPU applications and only works with CPU tasks. The GPU application has no graphics.

On the other hand it is a great conversation starter. obtw, some versions of Windows 10 have had trouble with the BOINC screen saver. If that is what you are using, don't use the BOINC/Seti screen saver. (I assume you are not using Win10, but it would be feasible for a startup situation.)
Newsflash, the problem with the screen saver not working under Windows 10 Creators Update has been fixed by Microsoft. All you need is update KB4020102.
ID: 1869604
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1869636 - Posted: 26 May 2017, 21:52:13 UTC - in response to Message 1869593.  

5) Unless you are buying the brand new/current generation AMD cpu, AMD cpu's do significantly less number crunching due to having 1 number cruncher core for each 2 general cpu cores.


Not true - this isn't even true for the FX family of processors.
The older (Phenom & Athlon) series had one FPU per core; the FX series had one FPU per pair of cores. The Ryzen family (or at least the members that have been released so far) has one FPU per general core and has multithreading (which makes its architecture similar to the Intel ix & Xeon series).


Sorry, I generalized from my own experience with the A-10 processors and what I thought I had read. So bottom line is read the specs for the cpu chip you are buying. Make sure it has one FPU processor for each core. Otherwise it will crunch more slowly.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1869636
