Astropulse CPU vs GPU?


Message boards : Number crunching : Astropulse CPU vs GPU?

spitfire_mk_2
Joined: 14 Apr 00
Posts: 441
Credit: 12,110,807
RAC: 9,154
United States
Message 1336428 - Posted: 9 Feb 2013, 23:16:27 UTC - in response to Message 1336318.

Hi, just wondering...

How many AP tasks can a GF 560Ti 1.3 GB or 660GTX 2 GB do at a time? How many CPU cores would that need?

Just want to know if it is worth waiting for a Linux version of AP CUDA/OpenCL.



p.s.
I'm happy with my i7-3930K CPU doing 6 AP units simultaneously, producing 6 completed AP results every 13500 seconds, while it is also crunching 6 GPU MB processes on the above-mentioned cards, each doing 3 at a time. The 3930K uses the Lunatics Linux AVX build. CPU temperature rises to 74-79C, depending on core, when doing 6 AP simultaneously. The GPUs stay at 63C.


From what I have seen, an OpenCL task takes the same amount of video memory as a CUDA task. So from a purely technical angle:
560Ti with 1.3 GB can hold 3 or 4 tasks
GTX660 with 2 GB will hold 8 tasks
____________

Profile Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 23381
Credit: 31,818,607
RAC: 24,276
Germany
Message 1336429 - Posted: 9 Feb 2013, 23:17:39 UTC
Last modified: 9 Feb 2013, 23:19:25 UTC

There's a lot you can do.

Did you read the readme of the AP app?
You are using 1 core anyway, so why not run 2 instances?

I wrote up some tips on how you can tweak the app.
They're included in the readme file.
____________

Profile Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 23381
Credit: 31,818,607
RAC: 24,276
Germany
Message 1336430 - Posted: 9 Feb 2013, 23:20:00 UTC - in response to Message 1336428.

From what I have seen, an OpenCL task takes the same amount of video memory as a CUDA task. So from a purely technical angle:
560Ti with 1.3 GB can hold 3 or 4 tasks
GTX660 with 2 GB will hold 8 tasks


That's a wrong assumption.

____________

spitfire_mk_2
Joined: 14 Apr 00
Posts: 441
Credit: 12,110,807
RAC: 9,154
United States
Message 1336432 - Posted: 9 Feb 2013, 23:25:47 UTC - in response to Message 1336430.

From what I have seen, an OpenCL task takes the same amount of video memory as a CUDA task. So from a purely technical angle:
560Ti with 1.3 GB can hold 3 or 4 tasks
GTX660 with 2 GB will hold 8 tasks


That's a wrong assumption.

There is no assumption there. Only observations.

Can you tell me your observations on the size of OpenCL memory consumption on your video cards?
____________

Profile Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 23381
Credit: 31,818,607
RAC: 24,276
Germany
Message 1336434 - Posted: 9 Feb 2013, 23:32:42 UTC

Memory consumption doesn't really matter here.
Each AP unit consumes almost a full CPU core on NVIDIA cards.

More than 4 is not worth it.

____________

rob smith
Volunteer tester
Joined: 7 Mar 03
Posts: 8140
Credit: 52,727,990
RAC: 74,800
United Kingdom
Message 1336549 - Posted: 10 Feb 2013, 8:32:58 UTC

While the memory of a given card may be able to "hold" a number of WU, the processor (on the GPU) can only efficiently work on two or three at a time.

If you want to try it, start by processing one WU at a time. Allow the card to run for several hours, noting the processing time for each WU, then repeat for two, three and four WU at a time. Provided the random sample of WU you have been fed are all much the same, you will see a slight increase in processing time between 1 and 2 and between 2 and 3, and a very large increase between 3 and 4.
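
The comparison described above comes down to tasks completed per hour at each setting. A minimal Python sketch follows; the function names and the sample timings are made up for illustration and are not measurements from any card in this thread.

```python
def throughput_per_hour(concurrent_tasks, seconds_per_task):
    """Tasks finished per hour when `concurrent_tasks` run side by side,
    each taking `seconds_per_task` on average."""
    return concurrent_tasks * 3600.0 / seconds_per_task


def best_concurrency(measurements):
    """Pick the task count with the highest throughput.

    `measurements` maps tasks-at-a-time -> average seconds per task.
    """
    return max(measurements,
               key=lambda n: throughput_per_hour(n, measurements[n]))


# Hypothetical timings: small slowdowns up to 3 at a time, a big jump at 4.
timings = {1: 2100.0, 2: 2300.0, 3: 2600.0, 4: 5200.0}

if __name__ == "__main__":
    for n in sorted(timings):
        rate = throughput_per_hour(n, timings[n])
        print(f"{n} at a time: {rate:.2f} tasks/hour")
    print("best setting:", best_concurrency(timings))
```

Feed it your own averages per setting; the WU in each sample need to be comparable for the result to mean anything.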

You do of course need to have a feed of WU, so don't try this while the servers are "having a holiday", which they are as I type (and this holiday may well continue until about 18:00 UTC, 19:00 your local time). I was about to try this on my new cruncher, but I think I'll leave that set at 1 per GPU (it's a 690, which has two GPUs on one card).
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4048
Credit: 32,693,315
RAC: 531
United Kingdom
Message 1336579 - Posted: 10 Feb 2013, 10:23:44 UTC - in response to Message 1336421.

I did try running with one core feeding the GPU about 6 months ago (with the current hardware and software setup), and the overall throughput (total tasks per hour) was less with one core set aside for feeding the GPU than with the default settings, where there is a "free for all". I have no doubt that for other CPU+GPU configurations the results would be different.

(I have run GPU-X on a few occasions, and the load time (the time when the GPU is essentially idle) is only a few seconds, between 5 and 10, which is not a lot when the run time is way over a thousand.)


That's just because 1 core is not enough on AMD machines.

I finish 2 APs in less than an hour.
That's ~3000 seconds.
You finish in about 13000 - 15000 seconds.
Do you see the difference?
I've been running APs for over 2 years now and am fully aware of all the conditions.

Claggy also has a 460, and it is faster than my ATI.

My GTX460 (which is a factory-overclocked variant) does AP WUs in about 35 minutes (2100 secs), while my HD7770 (again a factory-overclocked model) does AP in about 1 hour (3600 secs), both running one at a time.

For both the Nvidia and AMD/ATI AP apps, you've absolutely got to reserve a core.
For the Nvidia app it's because of a change in the 270.xx drivers: after those drivers, Raistmer's Nvidia OpenCL apps fully utilise a core to feed the app (aka the 100% usage bug/feature).
If you don't reserve a core, the app is not fed as fast and takes a lot longer to finish.
You can downgrade to 26x.xx drivers to get around the 100% usage bug/feature, but then the app isn't as fast as with a free core.

For the AMD/ATI app it is the same but opposite: if you don't free a core, the WUs sometimes take two or three times as long, with very low GPU usage.
Freeing a core guarantees the app will proceed at full speed, with low CPU usage.

Claggy

rob smith
Volunteer tester
Joined: 7 Mar 03
Posts: 8140
Credit: 52,727,990
RAC: 74,800
United Kingdom
Message 1336580 - Posted: 10 Feb 2013, 10:25:50 UTC

Claggy, for those that don't know how to, could you post a "Janet and John" on reserving a core.
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

juan BFB
Volunteer tester
Joined: 16 Mar 07
Posts: 4939
Credit: 269,826,310
RAC: 371,027
Brazil
Message 1336582 - Posted: 10 Feb 2013, 10:30:36 UTC - in response to Message 1336549.
Last modified: 10 Feb 2013, 10:36:23 UTC

You do of course need to have a feed of WU, so don't try this while the servers are "having a holiday", which they are as I type (and this holiday may well continue until about 18:00 UTC, 19:00 your local time). I was about to try this on my new cruncher, but I think I'll leave that set at 1 per GPU (it's a 690, which has two GPUs on one card).


Just a tip, talking about the MB app only (I did not try the new AP Cuda; a download of a single AP WU takes hours here): on the 690, 2 WU at a time appears to be the best value (a total of 4 WU on the 2 GPUs), but keep at least 1 core free per GPU to feed them (my systems are all Intel; on an AMD I believe 2 is better). But each system is unique, so you need to test.

You are running x41g (the normal Lunatics optimized app). The 690 runs perfectly on the new x41zc and gets a performance gain of 10-15% with cuda5, so you must try that. Download it from Jason's site: http://jgopt.org/download.html

And don't forget to keep an eye on the temps, especially if you use x41zc (more performance = a bit more heat). I use EVGA Precision to keep the fan running faster than normal, and that keeps the 690 in the low 70s.
____________

Profile Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 23381
Credit: 31,818,607
RAC: 24,276
Germany
Message 1336584 - Posted: 10 Feb 2013, 10:45:46 UTC - in response to Message 1336580.

Claggy, for those that don't know how to, could you post a "Janet and John" on reserving a core.


The easiest way is in the BOINC Manager preferences.
Set 'On multiprocessors, use at most' to 76%.
This will reserve 2 cores on an 8-core CPU.

____________

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4048
Credit: 32,693,315
RAC: 531
United Kingdom
Message 1336588 - Posted: 10 Feb 2013, 10:55:38 UTC - in response to Message 1336580.

Claggy, for those that don't know how to, could you post a "Janet and John" on reserving a core.

There are two ways of doing it.

Either set 'On multiprocessors, use at most' in your computing preferences to the percentage of cores you want to use, i.e. for an 8-core CPU where you want to free just one core, set the percentage to 87.5%.
You can either do this in the local preferences, or set up a new location/venue with just that host at that location (you have four locations available: default, home, school and work).
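
As a rough check on the percentages quoted in this thread, here is a small Python sketch. It assumes the client rounds the result down to a whole number of cores and never goes below one; that rounding rule and the function names are assumptions for illustration, not taken from the BOINC source.

```python
import math


def usable_cores(total_cores, percent):
    """Cores left for CPU work at a given 'use at most' percentage.

    Assumption: the result is rounded down and is never less than 1.
    """
    return max(1, math.floor(total_cores * percent / 100.0))


def percent_to_free(total_cores, cores_to_free):
    """Exact percentage that leaves `cores_to_free` cores unused."""
    return 100.0 * (total_cores - cores_to_free) / total_cores


if __name__ == "__main__":
    print(usable_cores(8, 87.5))   # one core freed on an 8-core CPU
    print(usable_cores(8, 76))     # two cores freed on an 8-core CPU
    print(percent_to_free(8, 1))   # exact figure for freeing one core
```

Under that assumption, a value slightly above the exact figure (76% instead of 75% on 8 cores) gives the same number of usable cores.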

Or you can do it automatically in your app_info, by changing the <avg_ncpus> and <max_ncpus> values for the OpenCL AP app. If you're only going to run a single instance of AP at a time, set them to the following:

<avg_ncpus>1.0</avg_ncpus>
<max_ncpus>1.0</max_ncpus>

This means that every time an OpenCL AP WU starts it will have a core reserved for it, and that core will be returned for CPU use once the WU completes.

If you're going to be running two OpenCL AP WUs at a time (or have two GPUs), the above will free two cores when enough OpenCL AP WUs run, so if you only want to free one core set them to 0.5 instead, or 0.25 for two WUs on each of two GPUs.
The problem with this method is that you might find you're only running one AP WU, with the rest of the counts being filled with Seti_enhanced work, so you haven't got any cores freed; it's best to just go for the first method.
(I use a combination of the two: % of CPU usage set to 87.5%, with both the ATI and Nvidia OpenCL apps set to free a core.)
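
The 1.0 / 0.5 / 0.25 values above follow from dividing the cores you want freed by the number of AP WUs running at once. A minimal Python sketch of that arithmetic; the helper names are hypothetical and the output is only the two-line fragment shown earlier, not a complete app_info.

```python
def ncpus_per_task(cores_to_free, tasks_per_gpu, gpus=1):
    """Value for <avg_ncpus>/<max_ncpus> so that all concurrent GPU tasks
    together reserve `cores_to_free` CPU cores."""
    concurrent = tasks_per_gpu * gpus
    if concurrent < 1:
        raise ValueError("need at least one concurrent task")
    return cores_to_free / concurrent


def app_info_fragment(value):
    """Render the two elements discussed above for one app version."""
    return (f"<avg_ncpus>{value}</avg_ncpus>\n"
            f"<max_ncpus>{value}</max_ncpus>")


if __name__ == "__main__":
    # One freed core shared by two WUs on each of two GPUs.
    print(app_info_fragment(ncpus_per_task(1, tasks_per_gpu=2, gpus=2)))
```

The caveat from the post still applies: the full core is only freed while all of those AP WUs are actually running.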

Claggy

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4048
Credit: 32,693,315
RAC: 531
United Kingdom
Message 1336589 - Posted: 10 Feb 2013, 11:09:37 UTC
Last modified: 10 Feb 2013, 11:10:14 UTC

One thing I've noticed while looking at Cliff's, Rob's and spitfire_mk_2's AP results is that they are all running with the default parameters.
These parameters were set like that so the app can run on low-end GPUs like the 8400GS and HD5400s, and they won't fully utilise a GTX460 or GTX660.

Before you start running multiple instances, tune the app first. This will improve GPU usage and increase memory usage; then worry about running multiple instances.

For a mid-range GPU like a GTX460 or an HD7770, -unroll 10 -ffa_block 6144 -ffa_block_fetch 1536 is suitable.

Rather than putting it in your app_info, put it in the ap_cmdline_win_x86_SSE2_OpenCL_NV.txt or ap_cmdline_win_x86_SSE2_OpenCL_ATI.txt file instead.
This has the advantage that you can change the parameters without having to restart BOINC.

The r1761 readmes list the rest of the parameters.
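
If you prefer to script the change, a short Python sketch along these lines would write the mid-range switches quoted above into the command-line file. The helper name is hypothetical, the directory you pass in is whatever folder holds the AP app on your machine, and writing the switches as a single line is an assumption; check the readme for the exact format the app expects.

```python
from pathlib import Path

# Switches quoted above for a mid-range GPU (GTX460 / HD7770 class).
MID_RANGE = ["-unroll", "10", "-ffa_block", "6144",
             "-ffa_block_fetch", "1536"]


def write_cmdline(project_dir, filename, params):
    """Write the switches as one space-separated line and return the path."""
    path = Path(project_dir) / filename
    path.write_text(" ".join(params), encoding="ascii")
    return path


if __name__ == "__main__":
    # Hypothetical location: replace "." with your SETI@home project folder.
    written = write_cmdline(".", "ap_cmdline_win_x86_SSE2_OpenCL_NV.txt",
                            MID_RANGE)
    print(written.read_text())
```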

Claggy

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8375
Credit: 46,728,782
RAC: 21,120
United Kingdom
Message 1336590 - Posted: 10 Feb 2013, 11:14:40 UTC - in response to Message 1336584.

Claggy, for those that don't know how to, could you post a "Janet and John" on reserving a core.

The easiest way is in the BOINC Manager preferences.
Set 'On multiprocessors, use at most' to 76%.
This will reserve 2 cores on an 8-core CPU.

If we're doing "Janet and John", you should perhaps point out that there are two possible places to set this value:

1) Via the computing preferences page for your account on this website.
2) Via "Computing preferences..." on the Tools menu in BOINC Manager itself (look at the bottom of the 'processor usage' tab).

People should choose one of these locations for setting preferences, and stick with it. If you even make one single change directly in BOINC Manager, you will 'lock-in' all the other settings on the first three tabs of the BOINC Manager preferences dialog, and any later changes on the website will be ignored.

That's because BOINC gives priority to the local settings. If you're not sure which preference set you're using at the moment, look for the line

Reading preferences override file

in your message/event log when BOINC starts up (just below the list of projects you're attached to). If you find you've inadvertently set up a local override file, but prefer to use website settings, you can use the 'clear' button in the BOINC Manager preferences dialog.
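
The log check described above can also be done on a saved copy of the event log. A minimal Python sketch, assuming you have the log as a list of text lines; the function name and the sample lines are made up, and only the marker text comes from the post.

```python
MARKER = "Reading preferences override file"


def uses_local_override(log_lines):
    """True if any startup log line reports a local override file."""
    return any(MARKER in line for line in log_lines)


if __name__ == "__main__":
    # Hypothetical excerpt of a startup log.
    sample = [
        "10-Feb-2013 11:00:00 [---] Starting BOINC client",
        "10-Feb-2013 11:00:01 [---] Reading preferences override file",
    ]
    print("local override in use:", uses_local_override(sample))
```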

Profile petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 372
Credit: 66,659,240
RAC: 48,393
Finland
Message 1336596 - Posted: 10 Feb 2013, 11:37:57 UTC
Last modified: 10 Feb 2013, 11:39:05 UTC

Hi,

Thank you all for the information. I'll be waiting for a Linux OpenCL AP and an x41z version of Linux MB.

I'll definitely reserve a core for the GPU AP too.

As of now I'm running 6 of 12 cores doing CPU MB/AP, and the other six feed the CUDA MB tasks. The 6 CPU tasks run at 99.6% according to 'top', and the 6 feeding the GPU use from 6% to 10% each, running on their own cores.

The reason for running with these settings is that I think the i7-3930K has only 6 FPU/MMX/AVX units, and with HT the CPU processes would have to share them, causing context switching and register file stores/loads, and would stress the cache and memory bus.
____________

Glen
Joined: 5 Feb 13
Posts: 8
Credit: 26,875
RAC: 0
Australia
Message 1337356 - Posted: 12 Feb 2013, 6:15:40 UTC

Is it possible to stop SETI from doing Astropulse units without aborting them? I need to be able to use my computer, and Astropulse units make it impossible to use while it's doing them. PLUS I'M LOOKING FOR ET, NOT PULSARS. If I wish to find pulsars I'll do a different project. I have already burnt out 1 video card and stopped doing SETI altogether for a few years. I thought you had fixed this problem.
____________

rob smith
Volunteer tester
Joined: 7 Mar 03
Posts: 8140
Credit: 52,727,990
RAC: 74,800
United Kingdom
Message 1337360 - Posted: 12 Feb 2013, 6:24:08 UTC

You can stop S@H from sending you Astropulse quite simply.
Go to your account web page, select "SETI@home preferences", then "edit default preferences".
Deselect the two Astropulse entries and the "allow other applications" option, then hit the "update" button.

____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

spitfire_mk_2
Joined: 14 Apr 00
Posts: 441
Credit: 12,110,807
RAC: 9,154
United States
Message 1337470 - Posted: 12 Feb 2013, 15:56:26 UTC - in response to Message 1337360.

You can stop S@H from sending you Astropulse quite simply.
Go to your account web page, select "SETI@home preferences", then "edit default preferences".
Deselect the two Astropulse entries and the "allow other applications" option, then hit the "update" button.

Do you want to receive GPU AP tasks?
____________

Glen
Send message
Joined: 5 Feb 13
Posts: 8
Credit: 26,875
RAC: 0
Australia
Message 1337581 - Posted: 13 Feb 2013, 1:55:36 UTC - in response to Message 1337360.

Thanks mate, I will do what you said. Now I can relax and not worry about burning out my video card.
____________

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1177
Credit: 41,563,119
RAC: 109,597
United States
Message 1338580 - Posted: 15 Feb 2013, 19:26:10 UTC - in response to Message 1337360.
Last modified: 15 Feb 2013, 20:03:40 UTC

You can stop S@H from sending you Astropulse quite simply.
Go to your account web page, select "SETI@home preferences", then "edit default preferences".
Deselect the two Astropulse entries and the "allow other applications" option, then hit the "update" button.

It appears the nVidia preferences are broken. I'm having the exact same results with preferences set to SETI@home Enhanced: yes; AstroPulse v6: no; If no work for selected applications is available, accept work from other applications? no. The scheduler keeps sending me nVidia AstroPulse instead of Multibeam. My old nVidia card doesn't do APs very well, it's much better with MBs. It's probably the same bug that is causing the NVIDIA GPU SETI@home Enhanced tasks to be vaporized instead of being resent as a 'lost task'. This is the second time my nVidia 609 tasks have been timed-out instead of being resent along with all the ATI & CPU 'lost' tasks. Someone needs to fix that.

Profile Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 356,858
RAC: 17
Germany
Message 1338813 - Posted: 16 Feb 2013, 8:16:43 UTC - in response to Message 1338580.

It appears the nVidia preferences are broken.

Did you check that you set the preferences in the same venue (default, home, work, school) your host is attached to?

Regards,
Gundolf


Copyright © 2014 University of California