AP V7

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1587656 - Posted: 16 Oct 2014, 6:40:57 UTC
Last modified: 16 Oct 2014, 6:57:25 UTC

I can confirm this behaviour on both AMD and NV GPUs - when running a mix of AP and MB on the same GPU, the AP task will run faster than when running AP tasks only, and the MB task will run more slowly than when running MB tasks only.

Yet another vagary that makes comparing GPU performance difficult for S@h.
Soli Deo Gloria
ID: 1587656
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1587733 - Posted: 16 Oct 2014, 10:11:59 UTC - in response to Message 1587656.  

AP was quite heavily optimized for the ATi platform from the early days of Brook+.
I think it's safe to say that it suits ATi GPUs better (in terms of maximum use of available computation resources) than ATi MB does.
Up to the point where GUI lag or excessive wasted computation sets in, the bigger the kernel, the more efficiently it uses GPU resources (all other parameters being equal, of course) and the less CPU overhead it has.
Hence there is nothing to fix here.
The only thing one could check is whether an AP+MB mix still gives better performance than serial execution of AP, then MB. That requires separate testing (by anyone who wants the absolute performance peak from their GPU).
As an estimate, the answer can be assumed to be "yes": even a slower MB task running alongside AP will result in better total times than AP and MB running serially.
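
The serial-versus-mixed comparison can be worked through as simple arithmetic. The following is a minimal sketch using purely hypothetical run times (none of these figures were measured in this thread), showing how an MB task that is slowed down by sharing the GPU can still come out ahead of running the two tasks one after the other:

```python
def serial_total(ap_alone: float, mb_alone: float) -> float:
    """Wall-clock time to run one AP task, then one MB task, each alone on the GPU."""
    return ap_alone + mb_alone


def concurrent_total(ap_shared: float, mb_shared: float) -> float:
    """Wall-clock time when both tasks start together and share the GPU.

    The pair is finished when the slower of the two tasks finishes.
    """
    return max(ap_shared, mb_shared)


# Hypothetical run times in seconds (illustrative assumptions only).
AP_ALONE, MB_ALONE = 3000.0, 1200.0    # each task with the GPU to itself
AP_SHARED, MB_SHARED = 3300.0, 2400.0  # both slowed when sharing, MB much more so

serial = serial_total(AP_ALONE, MB_ALONE)
shared = concurrent_total(AP_SHARED, MB_SHARED)
print(f"serial: {serial:.0f} s, shared: {shared:.0f} s, mix wins: {shared < serial}")
```

This ignores the fact that a real host would start another task as soon as the shorter one finishes, so only measured run times on the actual GPU can settle the question.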
ID: 1587733
Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 451
Credit: 431,396,357
RAC: 553
Australia
Message 1587757 - Posted: 16 Oct 2014, 11:04:39 UTC

Oh, I wasn't complaining, just making a statement. Even if one could find the 'optimal' AP/MB mix, it would be difficult to maintain with a fluctuating number of AP tasks.

But I agree that AP on ATi/AMD GPUs is very good. Thank you for all your years of work on it. (:
Soli Deo Gloria
ID: 1587757
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1587760 - Posted: 16 Oct 2014, 11:10:13 UTC

Hi Folks,
Just completed another AP:-
16/10/2014 11:59:12 | SETI@home | Computation for task ap_24ap11ae_B5_P1_00293_20141015_10539.wu_0 finished

It had a short run time of 732.85 seconds..

And the WU running with it did NOT take much longer than normal, probably due to the short run time of the AP

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1587760
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1587784 - Posted: 16 Oct 2014, 13:33:08 UTC - in response to Message 1587656.  
Last modified: 16 Oct 2014, 13:33:49 UTC

I can confirm this behaviour on both AMD and NV GPUs - when running a mix of AP and MB on the same GPU, the AP task will run faster than when running AP tasks only, and the MB task will run more slowly than when running MB tasks only.


Oh, I wasn't complaining, just making a statement.



And the WU running with it did NOT take much longer than normal, probably due to the short run time of the AP


Thanks for the observations. So far I have collected info mostly for AP+AP configs.
The info collected so far for APv7 tasks is available here:

http://lunatics.kwsn.net/1-discussion-forum/astropulse-v7-performance-illustration.0.html

AP+MB and interactions with GPU apps from other projects have been out of scope so far. Maybe more info will be collected in time.
ID: 1587784
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1587787 - Posted: 16 Oct 2014, 13:37:24 UTC - in response to Message 1587784.  

AP+MB or interactions with other GPU apps from other projects were out of scope so far. Maybe more info will be collected in time.

In general, I think I remember the CUDA MB apps here yielding GPU occupancy/performance to Einstein's OpenCL apps too, on the rare occasions when I've tried running both on the same GPU. Nowadays I tend to restrict apps from different projects to separate devices, otherwise the runtime estimation averages get messed up.
ID: 1587787
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1587813 - Posted: 16 Oct 2014, 14:51:06 UTC - in response to Message 1587760.  

As is often the case for a 30/30 overflow, the limit for repetitive pulses was reached very early at the end of the first large chunk. After that it was only looking for single pulses in an additional 70 large chunks.

The counts indicate that the 40.74% blanking included all of the data for the short period rep. pulse search so only one (long period) search was actually done (and terminated early when the limit was reached). Raistmer has indicated the Fast Folding Algorithm (FFA) processing for repetitive pulses is usually the longest running kernel, so can be expected to have the most impact on another task sharing the GPU.

Different values for the -ffa_block and -ffa_block_fetch options might make AP have less impact on MB. It's a complex situation, and I won't try to guess the effect on overall productivity. It's possible overall RAC could be increased that way, though.
Joe
ID: 1587813
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1587817 - Posted: 16 Oct 2014, 15:04:33 UTC - in response to Message 1587813.  

Hi Josef,
That type of tweaking is somewhat beyond me. I can get into major finger trouble just adding a line to cc_config.xml.

Thanks for the info though.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1587817
W-K 666 Project Donor
Volunteer tester
Joined: 18 May 99
Posts: 19048
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1587823 - Posted: 16 Oct 2014, 15:32:57 UTC - in response to Message 1587787.  

The problems and benefits of running different BOINC applications together are not limited to GPUs.

In the days when multiple CPU cores were rare (before AP was even on Beta), it was found that running SETI and Einstein together was a good balance, but one that was difficult to achieve unless you ran with no work cache and manually reset the balance after outages.

And then, when we started AP testing at Beta, it was found that running AP and Climate Prediction tasks at the same time was a disaster.
ID: 1587823
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1587830 - Posted: 16 Oct 2014, 15:48:58 UTC

This is 100% totally just OCD nit-picking (and in no way intended to be considered as a complaint), but I just noticed something with r2692 x64 SSE3 CPU app.

In every single build that I've participated in over the years (starting with *digs in archive folder* ap_4.35rev24b54_SSE3), after the initial no-progress at the start of a WU, once the completion percentage starts increasing, every 1-second GUI update showed continuing progress.

However, with r2692, I have noticed that it shows no gain for 10-120 seconds, and then makes huge leaps and stops progressing again (the CPU thread stays fully-loaded the whole time). I started my three v7 WUs all at the same time this morning, and as I was watching them, they all paused at the same percentage intervals, even if they got there at different times.

With that being said, it still seems to be progressing along overall quite a bit faster than r557 did, so it's still definitely a plus, but this is just a new behavior that I haven't seen before. I assume it has something to do with the FFT iterations scanning ahead through blocks..or something like that? Just curious is all.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1587830
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1587892 - Posted: 16 Oct 2014, 17:52:23 UTC - in response to Message 1587830.  

There was a change in BOINC on how it shows the % completed by an app. I forget the exact nature of it, but with versions up to 7.0.42 I would see large jumps in the progress of an AP task. Then once I started using 7.2.42 the % completed was updated more often.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1587892
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1587902 - Posted: 16 Oct 2014, 18:17:56 UTC - in response to Message 1587892.  

Boinc 7.2.38 introduced:

• client: if app doesn't report fraction done, estimate it.

• client: if app doesn't report fraction done, estimate fraction done in a way that converges to but never reaches 100%.

Basically, it means that until an app checkpoints, BOINC will estimate the app's progress.
Once the app checkpoints, BOINC will display the progress that the app reports. The latest AP apps report progress less often.
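
One way to get an estimate that "converges to but never reaches 100%" is an exponential approach to 1. The sketch below only illustrates that idea; the function name, the formula and the runtime figure are assumptions made for the example, not a quote of BOINC's source code.

```python
import math


def estimated_fraction_done(elapsed_s: float, est_runtime_s: float) -> float:
    """Hypothetical estimator: rises with elapsed time and approaches 1.0 from below."""
    if est_runtime_s <= 0:
        raise ValueError("est_runtime_s must be positive")
    return 1.0 - math.exp(-elapsed_s / est_runtime_s)


EST_RUNTIME_S = 3600.0  # assumed runtime estimate for the task, in seconds
for elapsed in (0, 900, 3600, 7200, 14400):
    frac = estimated_fraction_done(elapsed, EST_RUNTIME_S)
    print(f"{elapsed:>6} s elapsed -> {100 * frac:5.1f}% (estimated)")
```

With an estimator like this, the displayed percentage keeps creeping up between real progress reports and then jumps (forwards or backwards) to the reported value, which would look like stalls followed by leaps if the app reports only occasionally.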

Claggy
ID: 1587902
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1588023 - Posted: 16 Oct 2014, 23:14:00 UTC - in response to Message 1587902.  

Well I'm using 6.10.58. Up until about 18 hours ago, r557 was updating the percentage on every 1-second GUI refresh. The only thing I changed was adding the new app for v7. *shrug* Doesn't really matter, I was just curious why it does that now.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1588023
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1588068 - Posted: 17 Oct 2014, 1:17:15 UTC
Last modified: 17 Oct 2014, 1:17:52 UTC

I have an old first-gen Intel iMac. I noticed it wasn't downloading any AP v7 work. I seem to recall hearing that 64-bit Intel was going to be a requirement for future releases? Looking at the list of apps, I see "Mac OS/X 10.3+ 7.00" under AP v7. Is that an app for PPC and/or hardware with OS X.3?
Currently I run BOINC on OS X.6 on both of my Macs. It is a boot image I use for reimaging the OS on the machines.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1588068
Mark Wyzenbeek
Joined: 28 Jun 99
Posts: 134
Credit: 6,203,079
RAC: 0
United States
Message 1588144 - Posted: 17 Oct 2014, 5:20:02 UTC

I installed Lunatics v0.43 and waited to see an AP V7. And waited. I finally checked my Seti@home Preferences and saw that AP V7 was set to no. I had to edit the preferences and check the box by AP V7. I'm surprised it didn't default to yes. Now I'll wait again. :)
The Universe is not only stranger than you imagine, it's stranger than you can imagine.

SETI@home classic workunits 1,405 CPU time 57,318 hours
ID: 1588144
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1588196 - Posted: 17 Oct 2014, 8:16:58 UTC

Re AP7 credits:-
my results Cr given by date:-

14/10/14 473.47
15/10/14 439.32
15/10/14 457.61
16/10/14 412.79
16/10/14 449.52
17/10/14 477.34

Seems to go up and down willy nilly:-)
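
To put a number on the up and down, here is a small sketch that summarises the six credit values listed above. Only those six figures come from this post; the choice of summary statistics is just for illustration.

```python
from statistics import mean, pstdev

# Credit granted per AP v7 result, copied from the list above.
credits = [473.47, 439.32, 457.61, 412.79, 449.52, 477.34]

avg = mean(credits)
spread = max(credits) - min(credits)
print(f"mean {avg:.2f}, min {min(credits):.2f}, max {max(credits):.2f}")
print(f"range {spread:.2f} ({100 * spread / avg:.1f}% of mean), std dev {pstdev(credits):.2f}")
```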

Anyway I still have 1 waiting on a wingman..

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1588196
IZ3ATV
Volunteer tester
Joined: 1 Aug 99
Posts: 28
Credit: 31,986,825
RAC: 0
Italy
Message 1588330 - Posted: 17 Oct 2014, 16:06:56 UTC

I was wondering if rescheduling would be over with AP7 and I was wrong.
http://setiathome.berkeley.edu/show_host_detail.php?hostid=7374749
IZ3ATV
ID: 1588330
woohoo
Volunteer tester
Joined: 30 Oct 13
Posts: 972
Credit: 165,671,404
RAC: 5
United States
Message 1588335 - Posted: 17 Oct 2014, 16:11:57 UTC

lots of video cards
ID: 1588335
IZ3ATV
Volunteer tester
Joined: 1 Aug 99
Posts: 28
Credit: 31,986,825
RAC: 0
Italy
Message 1588340 - Posted: 17 Oct 2014, 16:16:45 UTC - in response to Message 1588335.  

Max 500 WUs for that rig.
IZ3ATV
ID: 1588340
woohoo
Volunteer tester
Joined: 30 Oct 13
Posts: 972
Credit: 165,671,404
RAC: 5
United States
Message 1588341 - Posted: 17 Oct 2014, 16:18:37 UTC

I know, I was just imagining a dozen or so video cards
ID: 1588341

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.