Problems with MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2028211 - Posted: 17 Jan 2020, 22:47:25 UTC - in response to Message 2028209.  

Be careful with adding extra compiler optimisations - they can break the scientific accuracy. Eric K has just emailed me about the ATI driver bug:

Actually, all I did to turn the stock 8.22 into the "new" 8.24 was edit the .exe file to remove the offending flag from the cl compiler flags.

I did that on the linux command line.

# sed 's/-cl-unsafe-math-optimizations/                             /' setiathome_8.22_windows_intelx86__opencl_ati5_sah.exe >setiathome_8.24_windows_intelx86__opencl_ati5_sah.exe
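The flag is overwritten with spaces rather than deleted because the replacement has to be exactly as long as the original string, so nothing else inside the executable shifts position. A minimal Python sketch of the same idea on raw bytes follows; the function name `blank_out_flag` and the sample bytes are made up for illustration and are not what Eric used.

```python
FLAG = b"-cl-unsafe-math-optimizations"


def blank_out_flag(data: bytes, flag: bytes = FLAG) -> bytes:
    """Overwrite every occurrence of `flag` with the same number of spaces."""
    patched = data.replace(flag, b" " * len(flag))
    # A binary patch must not change the file size or shift any offsets.
    assert len(patched) == len(data)
    return patched


# Made-up stand-in for the option string embedded in an executable.
sample = b"MZ\x00\x01 " + FLAG + b" \x00rest-of-binary"
patched = blank_out_flag(sample)
print(len(patched) == len(sample))  # True
print(FLAG in patched)              # False
```

For a real file you would read it with `open(path, "rb")` and write the result with `open(new_path, "wb")`, keeping the original untouched.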
ID: 2028211 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2028214 - Posted: 17 Jan 2020, 22:50:17 UTC
Last modified: 17 Jan 2020, 22:50:29 UTC

From Eric's post, the flag was
-funsafe-math-optimizations

Grant
Darwin NT
ID: 2028214 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2028215 - Posted: 17 Jan 2020, 22:53:07 UTC - in response to Message 2028214.  

That might be the format for a new compilation, but he simply hacked the binary executable. I don't think he has the tools to recompile - this was quicker than liaising with Raistmer in Russia.
ID: 2028215 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 2028217 - Posted: 17 Jan 2020, 23:00:15 UTC - in response to Message 2028211.  
Last modified: 17 Jan 2020, 23:04:21 UTC

Be careful with adding extra compiler optimisations - they can break the scientific accuracy.
As was mentioned in an NVIDIA white paper I linked to some time ago on CUDA programming.

Floating Point and IEEE 754 Compliance for NVIDIA GPUs

Summary-
The key points we have covered are the following:

Use the fused multiply-add operator.
   The fused multiply-add operator on the GPU has high performance and increases the accuracy of computations. No special flags or function calls are needed to gain this benefit in CUDA programs. 
   Understand that a hardware fused multiply-add operation is not yet available on the CPU, which can cause differences in numerical results.
Compare results carefully.
   Even in the strict world of IEEE 754 operations, minor details such as organization of parentheses or thread counts can affect the final result. Take this into account when doing comparisons between implementations.
Know the capabilities of your GPU.
   The numerical capabilities are encoded in the compute capability number of your GPU. Devices of compute capability 2.0 and later are capable of single and double precision arithmetic following the IEEE 754 standard, and have hardware units for performing fused multiply-add in both single and double precision.
Take advantage of the CUDA math library functions.
   These functions are documented in Appendix E of the CUDA C++ Programming Guide [7]. The math library includes all the math functions listed in the C99 standard [3] plus some additional useful functions. 
   These functions have been tuned for a reasonable compromise between performance and accuracy.
   We constantly strive to improve the quality of our math library functionality. Please let us know about any functions that you require that we do not provide, or if the accuracy or performance of any of our functions does not meet your needs. Leave comments in the NVIDIA CUDA forum or join the Registered Developer Program and file a bug with your feedback.



Of particular relevance-
These functions have been tuned for a reasonable compromise between performance and accuracy.
Performance can be increased, but at the cost of accuracy. So any tweaks must be carefully monitored for their effect on result accuracy.
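
Two of the white paper's points (grouping of operations, and fused versus separate multiply-add) can be shown in a few lines of Python. This is only an illustration on ordinary double-precision floats, not GPU code; the fused multiply-add is emulated here with exact rational arithmetic followed by a single rounding, which is an assumption made to keep the sketch portable.

```python
from fractions import Fraction

# 1) Grouping matters: floating-point addition is not associative.
a, b, c = 1e16, -1e16, 1.0
left = (a + b) + c    # the large terms cancel first, then 1.0 is added
right = a + (b + c)   # 1.0 is lost when added to -1e16, then the rest cancels
print(left, right)    # 1.0 0.0

# 2) Separate multiply then add rounds twice; a fused multiply-add rounds once.
x = 1.0 + 2.0**-27
y = 1.0 - 2.0**-27
z = -1.0
separate = x * y + z  # x*y is rounded to 1.0 before z is added
# Emulated fused multiply-add: exact arithmetic, rounded once at the end.
fused = float(Fraction(x) * Fraction(y) + Fraction(z))
print(separate, fused)  # 0.0 -5.551115123125783e-17
```

Neither answer is "wrong" as far as the hardware is concerned, which is why two correct apps can still disagree in the last digits and why results have to be compared with a tolerance.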
Grant
Darwin NT
ID: 2028217 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2028218 - Posted: 17 Jan 2020, 23:02:30 UTC - in response to Message 2028217.  
Last modified: 17 Jan 2020, 23:03:56 UTC

Makes sense. We had to do the same things in the intel_gpu app a while back, to restore full scientific accuracy at the expense of a little speed.

The Intel problem was also in the fused multiply-add operator.
ID: 2028218 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2028233 - Posted: 18 Jan 2020, 0:17:49 UTC - in response to Message 2028211.  
Last modified: 18 Jan 2020, 1:15:54 UTC

Be careful with adding extra compiler optimisations - they can break the scientific accuracy. Eric K has just emailed me about the ATI driver bug:

Actually, all I did to turn the stock 8.22 into the "new" 8.24 was edit the .exe file to remove the offending flag from the cl compiler flags.

I did that on the linux command line.

# sed 's/-cl-unsafe-math-optimizations/                             /' setiathome_8.22_windows_intelx86__opencl_ati5_sah.exe >setiathome_8.24_windows_intelx86__opencl_ati5_sah.exe
OK, I think I have it sorted. The problem was with the compiler option -funsafe-math-optimizations, which isn't found anywhere in the sah_v7_opt folder. Someone used something from somewhere else... The options that Are in the sah_v7_opt folder are clumped together in an option called --enable-comoptions, which is in the Configure-lines provided by Urs. The --enable-comoptions option is obviously being used by the current Apps, because the Apps would be Much slower if it weren't. Hmmm, that means the current App I tried testing on Main should be Good to Go. The only problem is there doesn't seem to be any way to Test it. The last I heard, Anonymous platform Still doesn't work at Beta, and the Validator on Main seems to be Broken with AMD Tasks. The ~125 AMD Tasks I ran on Main the other day Still aren't being Validated. Any Idea Why these tasks are basically being Ignored by the Validator? https://setiathome.berkeley.edu/results.php?hostid=6979629

And yes, --enable-comoptions is in the Configure-lines provided by Urs for the CPU Apps...

Not long afterwards....
Well, there is something called -cl-unsafe-math-optimizations in both fft_setup.cpp & GPU_lock.cpp in a few locations. Oh well. So, I'll just look at removing that from the code, it doesn't look too difficult.
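
Before editing anything, it helps to list every place the option string appears. A minimal Python sketch, assuming the sources are checked out under some directory on disk; the function name `find_flag` and the suffix list are illustrative choices, and only the flag itself comes from the post above.

```python
from pathlib import Path

FLAG = "-cl-unsafe-math-optimizations"


def find_flag(root, flag=FLAG, suffixes=(".cpp", ".h")):
    """Return (path, line_number, line) for each source line containing `flag`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if flag in line:
                hits.append((str(path), number, line.strip()))
    return hits


# Usage (the directory name is hypothetical):
#   for path, number, line in find_flag("path/to/source/tree"):
#       print(f"{path}:{number}: {line}")
```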
ID: 2028233 · Report as offensive
Urs Echternacht
Volunteer tester

Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 2028243 - Posted: 18 Jan 2020, 1:12:35 UTC - in response to Message 2028233.  

"comoptions" (compiler options) can be different for every codepath (sse - avx2). See .../AKv8/m4/optimizations.m4 for details.
_\|/_
U r s
ID: 2028243 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2028246 - Posted: 18 Jan 2020, 1:21:44 UTC - in response to Message 2028243.  

Hey Urs. I finally did find -cl-unsafe-math-optimizations in those two named files. Any suggestion on the best way to remove it?
ID: 2028246 · Report as offensive
Urs Echternacht
Volunteer tester

Joined: 15 May 99
Posts: 692
Credit: 135,197,781
RAC: 211
Germany
Message 2028252 - Posted: 18 Jan 2020, 1:45:44 UTC - in response to Message 2028246.  
Last modified: 18 Jan 2020, 1:46:31 UTC

"Best way" would be to detect an AMD Navi GPU and only then remove unnecessary/disturbing options. Older GPUs might still need such options to deliver the precision the project requires. So far Eric has been lucky!

Such a change has to wait until OpenCL gets correct information from the driver about the hardware (e.g. number of compute units, ...). I have the feeling that AMD's work on the OpenCL section of their driver for Navi is not finished yet.
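
As a rough sketch of that conditional approach in Python: everything here is an assumption for illustration, including the function name, matching on the device name string, the `gfx101` prefix used to recognise Navi parts, and the placeholder option `-DEXAMPLE=1`. It is not how the SETI@home OpenCL app identifies hardware.

```python
UNSAFE_MATH = "-cl-unsafe-math-optimizations"

# Assumption: Navi devices can be recognised by a device-name prefix.
NAVI_NAME_PREFIXES = ("gfx101",)


def build_options_for(device_name, base_options):
    """Drop the unsafe-math option only for devices that look like Navi."""
    is_navi = device_name.lower().startswith(NAVI_NAME_PREFIXES)
    if not is_navi:
        return list(base_options)
    return [opt for opt in base_options if opt != UNSAFE_MATH]


options = ["-DEXAMPLE=1", UNSAFE_MATH]  # placeholder option list
print(build_options_for("gfx1010", options))  # ['-DEXAMPLE=1']
print(build_options_for("Hawaii", options))   # ['-DEXAMPLE=1', '-cl-unsafe-math-optimizations']
```

The point of the design is that older GPUs keep exactly the options they have been validated with, and only the newly detected hardware gets a different build string.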
_\|/_
U r s
ID: 2028252 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2028411 - Posted: 18 Jan 2020, 22:26:31 UTC - in response to Message 2028252.  

OK, I guess I'll put the GPU App on the back burner for now. So.... I decided to try the SSSE3 CPU App compiled in Ubuntu 16.04, since I have the seti code working in the newer versions of Linux. The SSSE3 App didn't show any improvement over the older App, so I then decided to try compiling a Ryzen tagged App in Ubuntu 18.04. I don't have a Ryzen to test it on; however, the new App seems to work very well on my Intel 6700K. It works so well, I'm going to withhold comments until I get a second opinion. The major item is that the benchmark App shows the new AVX App as having the same precision as AVX2 r4008...which is Good. I've only run it in the benchmark App, and I'd suggest others do the same, as it's strictly an alpha App for now.
Download it here, http://www.arkayn.us/lunatics/MBv8_CPU-AVX-Ryzen.7z, and let me know how it works on other machines.
ID: 2028411 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester

Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2028434 - Posted: 19 Jan 2020, 2:38:19 UTC

OK, here you go TBar. Using Rick's BenchMT benchmark tool.
benchMT v1.6.0 ― SETI MB Benchmarking Utility ― Linux edition

Suspending BOINC

System Details
Hostname: Serenity
Run Name: Petri
APP Mode: MultiBeam
Platform: Linux 5.3.0-26-generic
OS Description: Ubuntu 18.04.3 LTS
CPU Model: AMD Ryzen 9 3950X 16-Core Processor
CPU MHz: 4200.0000
CPU Cores: 16
CPU Threads: 32
GPU Count: 3
GPU Threads: 3
Total GPU Devices: [0, 1, 2]
Specified GPU Devices: [0]
Devices Map: {0: 1}
GPU Details: [TU104 [GeForce RTX 2080 Rev. A]] [TU104 [GeForce RTX 2080 Rev. A]] [GP104 [GeForce GTX 1080]]
Current Dir: /home/keith/Downloads/Utils/benchMT-master/
Slots Dir: /home/keith/Downloads/Utils/benchMT-master/workdir/Slots/
TimeNow: Sun Jan 19 00:11:11 2020
TimeNowShort: 0119_001111
CPU App Path: /home/keith/Downloads/Utils/benchMT-master/APPS_CPU/
GPU App Path: /home/keith/Downloads/Utils/benchMT-master/APPS_GPU/
REF App Path: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/
Reference Results Path: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/REF_RESULTS/
STD Signal WU Path: /home/keith/Downloads/Utils/benchMT-master/WU_std_signal/
WU Path: /home/keith/Downloads/Utils/benchMT-master/WU_test/
Test Data Path: /home/keith/Downloads/Utils/benchMT-master/testData/
BOINC Home: /home/keith/Desktop/BOINC/
Repetitions: 1
Allocated CPU Threads: 6
Allocated GPU Threads: 0
Mode yes: False
Mode noBS: False
Mode std_signals: False
Mode display_slots: False
Mode display_compact: False
Mode no_ref: False
Mode force_ref: False
Mode energy: False
Mode astropulse: False

APP List
MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu
MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu

WU List
blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu
blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu
03dc14aa.7253.7429.7.34.218.wu

================================================================================
================================================================================
================================================================================
================================================================================
App Name: MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu
App Args: --nographics
WU Name: blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu
WU Angle Range: 0.028688679491004
Spike count: 3
Autocorr count: 0
Pulse count: 12
Triplet count: 0
Gaussian count: 0
Results: /home/keith/Downloads/Utils/benchMT-master/testData/Serenity_benchMT_Petri_0119_001111//result.MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu.blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu.3a4bfee6e6004cc5a2bd8d4f57208781.sah
REF Name: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu.sah

Real Time: 1611.06
User Time: 1608.69
System Time: 0.03
MaxMem: 41680
SwapNum: 0
CtxSwt: 148511
MajPF: 1

Result : Strongly similar, Q= 99.98%
================================================================================

================================================================================
App Name: MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu
App Args: --nographics
WU Name: blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu
WU Angle Range: 0.058697256447932
Spike count: 2
Autocorr count: 0
Pulse count: 16
Triplet count: 0
Gaussian count: 0
Results: /home/keith/Downloads/Utils/benchMT-master/testData/Serenity_benchMT_Petri_0119_001111//result.MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu.blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu.8ad9e2b500434bca93d6c8b0cf9d672e.sah
REF Name: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu.sah

Real Time: 1651.86
User Time: 1649.47
System Time: 0.03
MaxMem: 43384
SwapNum: 0
CtxSwt: 152683
MajPF: 0

Result : Strongly similar, Q= 99.92%
================================================================================

================================================================================
App Name: MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu
App Args: --nographics
WU Name: blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu
WU Angle Range: 0.028688679491004
Spike count: 3
Autocorr count: 0
Pulse count: 12
Triplet count: 0
Gaussian count: 0
Results: /home/keith/Downloads/Utils/benchMT-master/testData/Serenity_benchMT_Petri_0119_001111//result.MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu.blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu.133e596d6c2f40979f3212fa725345d5.sah
REF Name: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.blc14_2bit_guppi_58691_83520_HIP79781_0103.8969.0.22.45.117.vlar.wu.sah

Real Time: 1685.10
User Time: 1682.59
System Time: 0.16
MaxMem: 42640
SwapNum: 0
CtxSwt: 155403
MajPF: 36

Result : Strongly similar, Q= 99.98%
================================================================================

================================================================================
App Name: MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu
App Args: --nographics
WU Name: blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu
WU Angle Range: 0.058697256447932
Spike count: 2
Autocorr count: 0
Pulse count: 16
Triplet count: 0
Gaussian count: 0
Results: /home/keith/Downloads/Utils/benchMT-master/testData/Serenity_benchMT_Petri_0119_001111//result.MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu.blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu.d87c41649b444e2f94d908d5106b3a59.sah
REF Name: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.blc64_2bit_guppi_58642_02075_3C295_0008.17286.0.22.45.27.vlar.wu.sah

Real Time: 1757.38
User Time: 1754.89
System Time: 0.11
MaxMem: 48448
SwapNum: 0
CtxSwt: 161444
MajPF: 22

Result : Strongly similar, Q= 99.91%
================================================================================

================================================================================
App Name: MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu
App Args: --nographics
WU Name: 03dc14aa.7253.7429.7.34.218.wu
WU Angle Range: 0.64032871789011
Spike count: 4
Autocorr count: 1
Pulse count: 2
Triplet count: 3
Gaussian count: 0
Results: /home/keith/Downloads/Utils/benchMT-master/testData/Serenity_benchMT_Petri_0119_001111//result.MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu.03dc14aa.7253.7429.7.34.218.wu.0f5207c7b74543439a6b5bc8401c326a.sah
REF Name: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.03dc14aa.7253.7429.7.34.218.wu.sah

Real Time: 1934.53
User Time: 1932.05
System Time: 0.06
MaxMem: 48440
SwapNum: 0
CtxSwt: 181511
MajPF: 0

Result : Strongly similar, Q= 99.93%
================================================================================

================================================================================
App Name: MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu
App Args: --nographics
WU Name: 03dc14aa.7253.7429.7.34.218.wu
WU Angle Range: 0.64032871789011
Spike count: 4
Autocorr count: 1
Pulse count: 2
Triplet count: 3
Gaussian count: 0
Results: /home/keith/Downloads/Utils/benchMT-master/testData/Serenity_benchMT_Petri_0119_001111//result.MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu.03dc14aa.7253.7429.7.34.218.wu.da609ed5a0f54d20a416bd02fd63293d.sah
REF Name: /home/keith/Downloads/Utils/benchMT-master/APPS_REF/REF_RESULTS/ref-result.setiathome_8.00_x86_64-pc-linux-gnu.03dc14aa.7253.7429.7.34.218.wu.sah

Real Time: 2058.20
User Time: 2055.33
System Time: 0.26
MaxMem: 50092
SwapNum: 0
CtxSwt: 193068
MajPF: 28

Result : Strongly similar, Q= 99.93%
================================================================================


9 of 9 jobs complete

┌────┬────┬───┬────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name                                    │ start  │ finish │ tot_time  │ state  │
│    │    │   │app_args                                    │wu_name                               │
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│0   │ NA │CPU│MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu    │01:27:11│01:55:18│0:28:07.416│COMPLETE│
│    │    │   │--nographics                                │blc14_2bit_guppi_58691_83520_HIP79781_│
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│1   │ NA │CPU│MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu│01:27:11│01:54:03│0:26:52.299│COMPLETE│
│    │    │   │--nographics                                │blc14_2bit_guppi_58691_83520_HIP79781_│
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│2   │ NA │CPU│MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu    │01:27:11│01:56:30│0:29:19.480│COMPLETE│
│    │    │   │--nographics                                │blc64_2bit_guppi_58642_02075_3C295_000│
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│3   │ NA │CPU│MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu│01:27:11│01:54:45│0:27:34.360│COMPLETE│
│    │    │   │--nographics                                │blc64_2bit_guppi_58642_02075_3C295_000│
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│4   │ NA │CPU│MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu    │01:27:11│02:01:30│0:34:19.763│COMPLETE│
│    │    │   │--nographics                                │03dc14aa.7253.7429.7.34.218.wu        │
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│5   │ NA │CPU│MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu│01:27:11│01:59:27│0:32:16.639│COMPLETE│
│    │    │   │--nographics                                │03dc14aa.7253.7429.7.34.218.wu        │
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│6   │ NA │REF│ref-cpu.setiathome_8.00_x86_64-pc-linux-gnu │00:11:15│00:58:06│0:46:50.587│COMPLETE│
│    │    │   │--nographics                                │blc14_2bit_guppi_58691_83520_HIP79781_│
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│7   │ NA │REF│ref-cpu.setiathome_8.00_x86_64-pc-linux-gnu │00:11:15│00:59:42│0:48:26.662│COMPLETE│
│    │    │   │--nographics                                │blc64_2bit_guppi_58642_02075_3C295_000│
├────┼────┼───┼────────────────────────────────────────────┼────────┬────────┬───────────┬────────┤
│8   │ NA │REF│ref-cpu.setiathome_8.00_x86_64-pc-linux-gnu │00:11:15│01:27:11│1:15:55.237│COMPLETE│
│    │    │   │--nographics                                │03dc14aa.7253.7429.7.34.218.wu        │
└────┴────┴───┴────────────────────────────────────────────┴──────────────────────────────────────┘
Resuming BOINC
Finish Time: Sun Jan 19 02:01:31 2020

The Ryzen AVX r4101 app is faster than reference app by:
BLC14 - 174%
BLC64 - 161%
03DC14 - 232%

The Ryzen AVX r4101 app is faster than the default AIO SSE41 r3711 app by:
BLC14 - 4.6%
BLC64 - 6.4%
03DC14- 6.4%
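
For anyone who wants to check the SSE41 comparison, here is a small Python sketch. It assumes the percentages were derived from the Real Time values in the benchMT output above; that assumption reproduces the three figures just listed. The dictionary keys and function name are made up for the example.

```python
# Real Time values (seconds) copied from the benchMT output above.
real_time = {
    "BLC14":  {"sse41": 1685.10, "avx_ryzen": 1611.06},
    "BLC64":  {"sse41": 1757.38, "avx_ryzen": 1651.86},
    "03DC14": {"sse41": 2058.20, "avx_ryzen": 1934.53},
}


def percent_faster(old_seconds, new_seconds):
    """How much more work per unit time the new app does, in percent."""
    return (old_seconds / new_seconds - 1.0) * 100.0


gains = {wu: round(percent_faster(t["sse41"], t["avx_ryzen"]), 1)
         for wu, t in real_time.items()}
print(gains)  # {'BLC14': 4.6, 'BLC64': 6.4, '03DC14': 6.4}
```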
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2028434 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2028438 - Posted: 19 Jan 2020, 4:20:53 UTC - in response to Message 2028434.  


The Ryzen AVX r4101 app is faster than reference app by:
BLC14 - 174%
BLC64 - 161%
03DC14 - 232%

The Ryzen AVX r4101 app is faster than the default AIO SSE41 r3711 app by:
BLC14 - 4.6%
BLC64 - 6.4%
03DC14- 6.4%


I'm assuming the "reference" apps are the baseline Linux cpu apps?

Keith you mentioned getting about a 3% boost using the previous Intel AVX cpu app. This one is getting a bit more.

Cool.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2028438 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester

Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2028448 - Posted: 19 Jan 2020, 5:09:14 UTC - in response to Message 2028438.  

Shows in the list.

REF│ref-cpu.setiathome_8.00_x86_64-pc-linux-gnu

I thought about including the old r3345 avx app, the r3712 avx2 app and r3306 app but I wanted to post at least a cursory result tonight and not have it run overnight.

I trusted what TBar said: that the new 4101 app was much faster than his recent 4008 app, which this thread was originally about.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2028448 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2028582 - Posted: 20 Jan 2020, 0:39:36 UTC - in response to Message 2028448.  
Last modified: 20 Jan 2020, 0:43:15 UTC

Any more news on the Ryzen 4101 build? It seems the Apps built in 18.04 give the best speedup and precision, but I'm getting a couple of SIGABRT errors with those Apps on my i7-6700K. Are you seeing any errors on the Ryzen machines?
ID: 2028582 · Report as offensive
W3Perl Project Donor
Volunteer tester

Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 2028614 - Posted: 20 Jan 2020, 6:33:52 UTC - in response to Message 2028582.  

On my i7-6700K :

Listing executable(s) in /APPS :
MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu

Listing executable in /REF_APPS :
MBv8_8.05r3345_avx_linux64
----------------------------------------------------------------
Current WU: 20ap19aa.27021.7020.5.32.145.wu

----------------------------------------------------------------
Skipping default app MBv8_8.05r3345_avx_linux64, displaying saved result(s)
Elapsed Time: ....................... 3253 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.22r4101_avx-ryzen_x86_64-pc-linux-gnu
Elapsed Time : ...................... 2649 seconds
Speed compared to default : ......... 122 %
-----------------
Comparing results
Result : Strongly similar, Q= 99.98%

But only a 1% speed improvement on an i5-9500 with the same WU.
ID: 2028614 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2028618 - Posted: 20 Jan 2020, 7:38:10 UTC - in response to Message 2028614.  

As best I can tell, the Apps compiled in Ubuntu 16.04 and 18.04 just don't work correctly with the code in the AKv8 folder. Some tasks run fast, some don't, while others give errors. It seems changing ALL the instances of define _GLIBCXX_USE_CXX11_ABI 0 to 1 will allow the code to compile; however, it won't compile correctly. The Apps compiled in 15.04 don't have that problem; those Apps work correctly. The same two work units that fail with r4101_avx-ryzen, no matter what settings you try in 18.04, work fine with r3711 or r4008. It may be different in some other version of Linux, but I don't use some other version of Linux. For now, if you want the App to work correctly then you need to compile it in Ubuntu 15.04 or lower. That means No compile options for Ryzen can be used in the optimized CPU Apps.
ID: 2028618 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2029125 - Posted: 25 Jan 2020, 6:42:56 UTC
Last modified: 25 Jan 2020, 6:53:21 UTC

Final update. It seems the problem with the CPU App only affects the AVX2 App, the AVX App seems to be unaffected. So, since the New App is quite a bit faster than the existing Apps, we'll just use the AVX version. It also appears the App works just fine on Intel CPUs even though it's flagged znver1. I included the two work units that don't work with the AVX2 App in the download, just in case someone wants to do some testing. I haven't tested the App in BOINC with an Arecibo VHAR, however, since the progress bar problem is caused by something in the code I suspect the new App will have the same problem as the other newer Apps. Just ignore the progress bar on VHAR tasks.

The New App is here, http://www.arkayn.us/lunatics/MBv8_CPU-AVX-4101.7z
If no one has any trouble with it I'll replace the r4008-AVX2 App with this one in the All-In-One.

This is the New App against the r4008 AVX2 App on My Intel i7-6700K

====================================================================
Listing executable(s) in /APPS :
MBv8_8.22r4101_avx_x86_64-pc-linux-gnu

Listing executable in /REF_APPS :
MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu
----------------------------------------------------------------
Current WU: 03my17ab.4903.11519.16.43.91.wu
----------------------------------------------------------------
Running default app with command :... MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu
Elapsed Time: ....................... 2472 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.22r4101_avx_x86_64-pc-linux-gnu
Elapsed Time : ...................... 2235 seconds
Speed compared to default : ......... 110 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 100.0%
----------------------------------------------------------------
Done with 03my17ab.4903.11519.16.43.91.wu
====================================================================
Current WU: 26ja07aa.30143.294304.16.43.191.wu
----------------------------------------------------------------
Skipping default app MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 1216 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.22r4101_avx_x86_64-pc-linux-gnu
Elapsed Time : ...................... 1124 seconds
Speed compared to default : ......... 108 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.99%
----------------------------------------------------------------
Done with 26ja07aa.30143.294304.16.43.191.wu
====================================================================
Current WU: blc32_2bit_guppi_58406_17412_HIP2473_0076.2911.818.23.46.76.vlar.wu
----------------------------------------------------------------
Skipping default app MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu, displaying saved result(s)
Elapsed Time: ....................... 2323 seconds
----------------------------------------------------------------
Running app with command : .......... MBv8_8.22r4101_avx_x86_64-pc-linux-gnu
Elapsed Time : ...................... 2095 seconds
Speed compared to default : ......... 110 %
-----------------
Comparing results
Result      : Strongly similar,  Q= 99.99%
----------------------------------------------------------------
Done with blc32_2bit_guppi_58406_17412_HIP2473_0076.2911.818.23.46.76.vlar.wu
====================================================================
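
A note on reading the "Speed compared to default" lines: the figure looks like the default app's elapsed time divided by the new app's, shown as a percentage with the fraction dropped. That is inferred from the numbers above, not from the benchmark script's source, and the function name below is made up. So 110 % means roughly a 10 % gain, not 110 % faster.

```python
def speed_vs_default(default_seconds, new_seconds):
    """Default elapsed time as a whole-number percentage of the new elapsed time."""
    return (100 * default_seconds) // new_seconds


# (default, new) elapsed times in seconds, copied from the three runs above.
runs = [(2472, 2235), (1216, 1124), (2323, 2095)]
print([speed_vs_default(d, n) for d, n in runs])  # [110, 108, 110]
```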
ID: 2029125 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester

Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2029132 - Posted: 25 Jan 2020, 8:10:49 UTC

Thanks for the update TBar. I want to test your original AVX2 version with those test work units that fail on Ryzen hardware. Will bench the new AVX against both the SSE41 and the old AVX2 also just for grins.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2029132 · Report as offensive
Profile M_M

Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 2033215 - Posted: 20 Feb 2020, 21:14:27 UTC

Any results yet on 4101? Is it stable and accurate? Is it faster than 3712 avx2?
ID: 2033215 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester

Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2033219 - Posted: 20 Feb 2020, 21:36:32 UTC - in response to Message 2033215.  

I benched r4101 against the old AVX r3345, AVX r3712 (both Intel and AMD versions), the AVX2 r4008 and the stock r3711 SSE41 version.

It beat them all by 1-10% depending on the task. Different gains for Arecibo and BLC standard angle and very low angle.

So generally faster overall but nothing earth shattering and it didn't break anything.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2033219 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.