Really low performance (I believe)

Message boards : Number crunching : Really low performance (I believe)

Sean Arrowsmith
Joined: 17 Feb 01
Posts: 34
Credit: 4,955,471
RAC: 0
United Kingdom
Message 1906893 - Posted: 14 Dec 2017, 0:32:10 UTC

Hi All,

I am just a little concerned something is not right with my SETI crunching.

Having got a powerful new PC, I decided to start SETI crunching again, and I was told where to get the optimized apps for my Ryzen 1800X / GTX 1060 Strix.

My settings are 90% of processors with 90% of CPU time. I am running 2 projects (SETI and Milkyway), split 50/50. Since the 27th, when I started, SETI is only doing 1256 (WorkDone) while Milkyway is averaging 32504 (WorkDone).

Is this correct in your eyes? Is something drastically wrong? Why is Milkyway performing 10x better?
Any ideas?

Thanks
ID: 1906893

Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1906897 - Posted: 14 Dec 2017, 0:45:19 UTC - in response to Message 1906893.  
Last modified: 14 Dec 2017, 0:45:39 UTC

Some projects have shorter deadlines than others, for example Einstein@home vs SETI@home. With a 50/50 split between the two projects, Einstein will actually run more tasks per day than SETI because of its shorter deadlines. It is only when a SETI work unit gets near its deadline that it will crunch. I would suggest you select one project as your primary and the other as a backup, so 100% for one and 0 for the other. That way it will crunch the primary and, when there is no work there, switch to the backup. With a 0% selection, it will only download about one work unit per card at a time to keep it busy until the primary project has more work. That is, SETI checks every 5 minutes, so if new work is available after 5 minutes, that SETI work will download and your backup project work will go to standby.
ID: 1906897

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1906905 - Posted: 14 Dec 2017, 1:30:25 UTC - in response to Message 1906893.  

Yes, you are not running the CPU optimally. The biggest factors in how fast a Ryzen runs CPU tasks are the core frequency and, most importantly, the memory clock. You are also overloading the CPU with too many concurrent CPU tasks. Knock your concurrent CPU tasks down by one or two to bring your cpu_time and run_time closer together. It also helps to lock your CPU tasks to physical cores. You can do that with either Process Explorer or Process Lasso. Let the virtual (SMT) cores support the 1060 GPU tasks.

If you look at my Ryzen 1700X, you can get an example of where both BLC and Arecibo VLAR tasks are running.

Workunit 2776558293
Workunit 2776771374

Also it helps if you are running the Optimized Apps from the Lunatics Installer. In this case the AVX app.

In my 1700X crunch system, the CPU clock is 3.925 GHz with a memory clock of 3333 MHz at CL14 timings.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1906905

rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22188
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1906949 - Posted: 14 Dec 2017, 5:50:59 UTC

Also, Milkyway pays more credits per hour of operation than SETI.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1906949

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1906969 - Posted: 14 Dec 2017, 7:10:30 UTC
Last modified: 14 Dec 2017, 7:11:12 UTC

You could also try a custom app_config.xml for each project and set max_concurrent (the maximum number of tasks of an application to run at a given time) to 8. This would split your CPU 50:50.

I used to use this when I ran POGS, as it was so aggressive with deadlines and with the timeout between server connections that it pushed SETI out.

More details on app_config at https://boinc.berkeley.edu/wiki/Client_configuration
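
A minimal sketch of what a per-application limit might look like for SETI, assuming the main application is named setiathome_v8 (the name that appears in app_config files for this project). The name has to match what your own client reports, and the file is assumed to live in the SETI project folder under the BOINC data directory. The comment is only annotation and can be dropped.

```xml
<app_config>
  <app>
    <!-- Assumption: application name as reported by your client -->
    <name>setiathome_v8</name>
    <!-- Run at most 8 tasks of this application at once -->
    <max_concurrent>8</max_concurrent>
  </app>
</app_config>
```

An equivalent file with the Milkyway application name would go in the Milkyway project folder. After saving, the client needs to re-read its config files (or be restarted) before the limit takes effect.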
ID: 1906969

Sean Arrowsmith
Joined: 17 Feb 01
Posts: 34
Credit: 4,955,471
RAC: 0
United Kingdom
Message 1907002 - Posted: 14 Dec 2017, 9:47:38 UTC - in response to Message 1906905.  

Thank you Keith for your advice.

I have tried looking at other posts for info on what you are saying but I am confused, so if you could point me in the right direction I would be most grateful. SETI was the very first project I started on back in 2001. I then joined a couple of other projects, mostly concentrating on Milkyway, but I want to build up SETI more. The other week when I started BOINC again you helped me get the right optimised apps. https://setiathome.berkeley.edu/forum_thread.php?id=82262

I have a Ryzen 1800X and, as you know, that has 8 cores (16 threads). How would I optimise the CPU as you suggested? My Ryzen is running at the stock 3.690 GHz and the memory is at 2166 MHz.
ID: 1907002

Sean Arrowsmith
Joined: 17 Feb 01
Posts: 34
Credit: 4,955,471
RAC: 0
United Kingdom
Message 1907014 - Posted: 14 Dec 2017, 10:21:25 UTC - in response to Message 1906897.  

Hi Zalster, I have put Milkyway down as a resource share of 0 to create this as the backup project whilst concentrating on SETI.

Thanks, now hopefully I can get my Ryzen CPU optimized as Keith suggests.
ID: 1907014

Sean Arrowsmith
Joined: 17 Feb 01
Posts: 34
Credit: 4,955,471
RAC: 0
United Kingdom
Message 1907222 - Posted: 15 Dec 2017, 11:12:23 UTC - in response to Message 1906969.  

Thanks David,
I will look at this, but my app_info is large (the standard one created by Lunatics), so I am not really sure which applications I would need to list in the app_config file. I will leave this alone, as I do not know what I am doing and would probably break it :-)

Thanks
ID: 1907222

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1907224 - Posted: 15 Dec 2017, 11:49:17 UTC - in response to Message 1907222.  
Last modified: 15 Dec 2017, 11:51:23 UTC

Yes, the Lunatics one is complex. Try just creating an app_config for Milkyway that limits it to 8 concurrent tasks. SETI would then get at least 8 CPUs, and if Milkyway had no work SETI would run on the other CPUs until work became available again at Milkyway.

The file would only need to contain:

<app_config>
<project_max_concurrent>8</project_max_concurrent>
</app_config>
ID: 1907224

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907270 - Posted: 15 Dec 2017, 19:13:36 UTC - in response to Message 1907002.  
Last modified: 15 Dec 2017, 19:23:00 UTC

Hi Sean, as I mentioned, Ryzen processing speed is directly affected by the memory clock, since it controls the speed of inter-CCX communication. Ryzen was designed for a stock memory speed of 2666 without overclocking, so your memory running at 2133 is a handicap. I don't know what your memory kit is, whether you just allowed the system to install the memory at the default JEDEC 2133 clocks, or whether that is the fastest your memory kit will run. The first step would be to go into the BIOS and select the XMP profile the memory kit was sold with.

Since you are running the anonymous platform, I will offer my app_config.xml file from my Win10 1700X machine. It limits the number of concurrent tasks to get better run_times and also sets the resources and tuning appropriately for the SoG app.

<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>1</ngpus>
    <cmdline>-sbs 1024 -period_iterations_num 1 -tt 1500 -high_perf -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -high_prec_timer</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>1</ngpus>
    <cmdline>-unroll 24 -oclFFT_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 8 1 -tune 2 64 8 1</cmdline>
  </app_version>
  <project_max_concurrent>11</project_max_concurrent>
</app_config>


With max concurrent set at 11, I am running 8 CPU tasks and, with 3 GPUs, 3 GPU tasks. This produces the fastest completion times for both CPU and GPU. I am also forcing the CPU app to run only on the physical cores with the Process Lasso application. The 8 virtual (SMT) cores in the 1700X support the GPU tasks and run the computer desktop and any other background programs.

I would recommend two auxiliary programs: System Information Viewer (SIV) and Process Lasso.
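
Going back to the max concurrent figure: on a machine with 8 physical cores and a single GPU, the same arithmetic would give 8 CPU tasks + 1 GPU task = 9. A minimal sketch, assuming one CPU task per physical core is the goal and only one GPU task runs at a time. The value 9 is an illustration, not a tested setting, and the comment can be dropped.

```xml
<app_config>
  <!-- Assumption: 8 CPU tasks (one per physical core) + 1 GPU task on a single card -->
  <project_max_concurrent>9</project_max_concurrent>
</app_config>
```

If cpu_time and run_time still drift apart with this setting, dropping the number by one is the next thing to try.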
ID: 1907270

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907271 - Posted: 15 Dec 2017, 19:19:49 UTC - in response to Message 1907222.  

Yes, the app_info that the Lunatics Installer creates is inordinately large, mainly because it needs to handle all possible hardware and software combinations. The default Lunatics app_info is 40 KB in size. Mine is 5 KB because I whittled out all the irrelevant CUDA applications and their assorted plan_classes.

I define only one CPU app for MB and AP tasks and only one GPU app for MB and AP tasks: in my case, the AVX apps for both MB and AP on the CPU, and the SoG app for MB and the OpenCL app for AP on the GPU.
ID: 1907271

Zalster (Special Project $250 donor)
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1907273 - Posted: 15 Dec 2017, 19:27:33 UTC - in response to Message 1907270.  


Those command lines are pretty aggressive for a 1060; you might want to change some of those values.
ID: 1907273

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907276 - Posted: 15 Dec 2017, 19:36:33 UTC

Yes, I agree with Z-man: too aggressive for a 1060. I was simply trying to suggest the use of max_concurrent in app_config. You should always consult the SoG app docs for suggested tuning parameters for your specific hardware. Then search the forum for past posts about tuning parameters that work well on a 1060. Focus on posts by Mike.
ID: 1907276

Sean Arrowsmith
Joined: 17 Feb 01
Posts: 34
Credit: 4,955,471
RAC: 0
United Kingdom
Message 1907280 - Posted: 15 Dec 2017, 20:07:24 UTC

Thanks for the extra replies.

I made a mistake, my memory is actually clocked at 2666.

Whilst in the BIOS I also updated it, then applied the Asus OC Tuner to clock the CPU at 4 GHz (from 3.6 GHz). This has raised the temps by 10 deg to 72 deg C, but under full load the 240mm water cooler is coping well and it is still 20 deg cooler than the max 95 deg the 1800X will tolerate. I also overclocked the Strix 1060 to 2075 MHz from 1873 MHz.

I shall use your app_config and see how this goes. Thanks again, all of you, for your replies.
ID: 1907280

Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907285 - Posted: 15 Dec 2017, 20:34:50 UTC - in response to Message 1907280.  

Good to hear your memory is at least at default XMP. Memory has been the biggest headache for Ryzen since launch, although the latest BIOS versions have made big improvements in memory compatibility. The best memory for Ryzen and TR is single-rank memory made with Samsung B-dies. It offers the highest memory speeds, which dramatically improve Ryzen performance.

I think my SoG tunings are too aggressive for a 1060, even a 6GB card. I have a 6GB 1060 in my Ryzen Windows 10 host, but the main cards are a GTX 1070 Ti and a 1070. Don't be confused by what BOINC reports for my system. It is just one of the quirks of BOINC that whatever card occupies the lowest enumerated PCIe bus slot on the motherboard is chosen to identify the GPUs in the system. It so happens my 1060 is in that slot, so BOINC says I have (3) GTX 1060s. I do not.

But I get away with the tunings I posted in my app_config because they are written mainly for my 1070s. I suffer a bit of performance loss on the 1060 because of it. In your single-card 1060 system you won't have to make any compromises.

You might want to start with this tune line for the SoG app.
-tt 1500 -sbs 1024 -period_iterations_num 4 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64

If you don't get system keyboard or mouse pointer lags, you can tighten up -period_iterations_num to 1. You can also add the -high_perf parameter. I'd stay away from the -hp parameter.

The -unroll for the AP app is most definitely too aggressive. I would try the stock -unroll 10 parameter first. As always, it takes time to make good system tunings. Make one change and observe the task times for both Arecibo shorties and BLC tasks, then make another change and reevaluate.
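
Put together, a starting app_config.xml for a single 1060 might look like the sketch below. It assumes the same app names and plan classes as in my own file (they have to match whatever your app_info.xml defines) and uses only the tune line and the -unroll value suggested in this post. Leaving the other AP options at the app's defaults is my assumption, not a tested recommendation. The comments are only annotation and can be dropped.

```xml
<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>1</ngpus>
    <!-- Starting tune line; tighten -period_iterations_num only if there is no desktop lag -->
    <cmdline>-tt 1500 -sbs 1024 -period_iterations_num 4 -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>1</ngpus>
    <!-- Assumption: only -unroll is set; all other AP options are left at the app defaults -->
    <cmdline>-unroll 10</cmdline>
  </app_version>
</app_config>
```

Change one value at a time from here and compare task times before changing the next.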
ID: 1907285

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.