Posts by RueiKe

21) Message boards : Number crunching : Help Needed - Can't get GPU Tasks (Message 2016432)
Posted 24 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
But what I am saying is that the appname you are using is now not recognized anymore. I assume you got the app at some point by using the stock BOINC installation and that is one of the apps that the scheduler sent you. And how you were able to put that app into the reference gpu apps in BenchMT. But it does not bode well that that app has 0 production currently. Maybe that app has been deprecated.

But unless somebody can figure out where that app is on the BOINC server through the fanout, I don't know if that app even exists in the scheduler database of apps that it considers valid for trying out.

I thought that supporting anonymous apps meant that any app name you wanted to use would work as long as it is configured properly in the app_info file.
22) Message boards : Number crunching : Help Needed - Can't get GPU Tasks (Message 2016430)
Posted 24 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
not sure what <main_program/> is, but it's missing parent tags. maybe that's breaking it.

edit, nevermind, self closing tag syntax.

what does the event log say at startup? does it complain about the app_info file?

are the permissions set correctly on the new app?


I verified permissions set properly. I didn’t see any concerns in log files, but will check in more detail when I get back to the system this evening.
23) Message boards : Number crunching : Help Needed - Can't get GPU Tasks (Message 2016426)
Posted 23 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
Are you sure that is even the correct SAH gpu appname? When I look at the list of applications, the only app with that name has 0 GigaFLOPS of production. That would lead me to believe the app never was released regardless of it existing in the list. I do see an 8.22 (opencl_ati_sah) app that has 9 GigaFLOPS of production. I would think that at least exists in reality and is in use.


It's the name of the app as used by benchMT. The SoG app that I have been using has the same naming convention, but sah is replaced by SoG. I have grep'ed the name of the app from the app_info file and find that it matches.

I also tried this one: MBv8_8.22r3584_sse2_clAMD_HD5_x86_64-pc-linux-gnu

In either case, boincmgr deletes whatever app I specify and says no app available in the project properties.
24) Message boards : Number crunching : Help Needed - Can't get GPU Tasks (Message 2016421)
Posted 23 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
It has been a while since I have reconfigured app_info.xml on my system. Not sure if things have changed or maybe I am just remembering things incorrectly...

I wanted to change my new system to use sah instead of SoG app, but I made a typo in the app_info file and lost all work. I corrected it, but now it seems to never download GPU work. When I look at project properties, it says no GPU app exists. Every time I reset project, the app and cl files get deleted when I restart boincmgr. Here are my app_info.xml file contents:
<app_info>
    <app>
      <name>setiathome_v8</name>
    </app>
    <file_info>
      <name>MBv8_8.05r3345_avx_linux64</name>
      <executable/>
    </file_info>
    <app_version>
      <app_name>setiathome_v8</app_name>
      <version_num>805</version_num>
      <platform>x86_64-pc-linux-gnu</platform>
      <cmdline>-nographics </cmdline>
      <file_ref>
        <file_name>MBv8_8.05r3345_avx_linux64</file_name>
        <main_program/>
      </file_ref>
    </app_version>
    <app_version>
      <app_name>setiathome_v8</app_name>
      <version_num>804</version_num>
      <platform>x86_64-pc-linux-gnu</platform>
      <cmdline>-nographics </cmdline>
      <file_ref>
        <file_name>MBv8_8.05r3345_avx_linux64</file_name>
        <main_program/>
      </file_ref>
    </app_version>
    <app>
      <name>setiathome_v8</name>
    </app>
    <file_info>
      <name>setiathome_8.22_x86_64-pc-linux-gnu__opencl_ati5_sah</name>
      <executable/>
    </file_info>
    <file_info>
      <name>MultiBeam_Kernels_r3584.cl</name>
    </file_info>
    <file_info>
      <name>mb_cmdline_rickslab_ati5.txt</name>
    </file_info>
    <app_version>
      <app_name>setiathome_v8</app_name>
      <platform>x86_64-pc-linux-gnu</platform>
      <version_num>822</version_num>
      <plan_class>opencl_ati5_sah</plan_class>
      <coproc>
        <type>ATI</type>
        <count>1</count>
      </coproc>
      <avg_ncpus>1.0</avg_ncpus>
      <max_ncpus>1.0</max_ncpus>
      <file_ref>
        <file_name>setiathome_8.22_x86_64-pc-linux-gnu__opencl_ati5_sah</file_name>
        <main_program/>
      </file_ref>
      <file_ref>
        <file_name>MultiBeam_Kernels_r3584.cl</file_name>
      </file_ref>
      <file_ref>
        <file_name>mb_cmdline_rickslab_ati5.txt</file_name>
        <open_name>mb_cmdline.txt</open_name>
      </file_ref>
    </app_version>
</app_info>
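
An app_info file like the one above can be checked for internal consistency before restarting the client. The following is only a sketch, not part of BOINC: the function name check_app_info is hypothetical, and the rules it applies (every file_ref needs a matching file_info, every app_version needs a matching app and exactly one main_program) are assumptions about what a well-formed file looks like, not a statement of what the BOINC client enforces.

```python
import sys
import xml.etree.ElementTree as ET
from pathlib import Path


def check_app_info(xml_text, project_dir=None):
    """Return a list of human-readable problems found in an app_info document.

    Hypothetical helper: the rules below are assumptions, not BOINC's own checks.
    """
    problems = []
    root = ET.fromstring(xml_text)

    # Names declared at the top level of <app_info>.
    declared_apps = {a.findtext("name", "").strip() for a in root.findall("app")}
    declared_files = {f.findtext("name", "").strip() for f in root.findall("file_info")}

    for version in root.findall("app_version"):
        app_name = version.findtext("app_name", "").strip()
        label = f"{app_name} {version.findtext('version_num', '?').strip()}"
        if app_name not in declared_apps:
            problems.append(f"{label}: app_name has no matching <app> entry")

        refs = version.findall("file_ref")
        for ref in refs:
            file_name = ref.findtext("file_name", "").strip()
            if file_name not in declared_files:
                problems.append(f"{label}: {file_name} has no <file_info> entry")
            elif project_dir is not None and not (Path(project_dir) / file_name).is_file():
                problems.append(f"{label}: {file_name} is missing from {project_dir}")

        # Count file_refs flagged as the main program.
        mains = [r for r in refs if r.find("main_program") is not None]
        if len(mains) != 1:
            problems.append(f"{label}: expected 1 <main_program/>, found {len(mains)}")
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_app_info.py /path/to/project/app_info.xml
    path = Path(sys.argv[1])
    for problem in check_app_info(path.read_text(), project_dir=path.parent):
        print(problem)
```

Passing the project directory also checks that each referenced file is actually present on disk, which is relevant when files get deleted after a project reset.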
25) Message boards : Number crunching : Heavy Metal! - high cpu count (Message 2015861)
Posted 18 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
That's a good question. Does BOINC simply ignore the HT cores since the count moves the host beyond 100 cores? Or do you have to disable SMT for BOINC to even work?

I ran it for a while with SMT enabled. It just ran all 100 WUs across the 128 available threads, leaving 28 unused. Doesn’t make sense to use SMT since even 36 extra WUs isn’t enough buffer. If I get a bunch of shorties, cores will be idle until work download catches up.
26) Message boards : Number crunching : Heavy Metal! - high cpu count (Message 2015858)
Posted 18 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
I have a new build using Rome 7702P with 64 cores up and running, Nexon. With no GPU, it is producing about 120K per day. Averages more than 1 WU per minute. I plan to add a Vega20 card over the weekend. This will not be a dedicated cruncher, as I have built it for an analytics project I am working on, but I will turn it over to SETI whenever it is not being used.

It runs out of work very easily. I hope that we will soon see more flexibility in max CPU work to download as core counts increase.


I guess you have disabled SMT?

Yes, I prefer it off for the development work I’m doing.
27) Message boards : Number crunching : Heavy Metal! - high cpu count (Message 2015852)
Posted 18 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
I have a new build using Rome 7702P with 64 cores up and running, Nexon. With no GPU, it is producing about 120K per day. Averages more than 1 WU per minute. I plan to add a Vega20 card over the weekend. This will not be a dedicated cruncher, as I have built it for an analytics project I am working on, but I will turn it over to SETI whenever it is not being used.

It runs out of work very easily. I hope that we will soon see more flexibility in max CPU work to download as core counts increase.
28) Message boards : Number crunching : Ryzen and Threadripper (Message 2002867)
Posted 16 Jul 2019 by Profile RueiKe Special Project $250 donor
Post:
Has anyone reported struggles with the 3700X (or the Zen 2 series overall)?
My 3700x started to reboot sporadically and only got worse and worse.
It started with a reboot in Ubuntu, tried to get it going, got it to work but then it was the same in Windows too, sporadic reboots.

Noes! :-/

(EDIT: When i installed my 2700X again all seems peachy, to be continued with the tests)


My 3700x has been very stable. Running non-stop since I finished benchmarks. I am running with optimized defaults, which includes CPB=Auto (which causes it to be enabled). I am running on an older MB, x370 C6H, with latest BIOS from Asus. Have you tried running with GPU compute disabled?
29) Message boards : Number crunching : Ryzen and Threadripper (Message 2002683)
Posted 15 Jul 2019 by Profile RueiKe Special Project $250 donor
Post:
I went back up and looked at his benchmarks and still couldn't really make heads or tails of them. The "average cpu time" seems to be too high.

Tom


System is using optimized defaults BIOS settings. CPB is enabled, though thermal solution only allows sustained boost to 4.1GHz. The benchMT runs used 3 WUs from the samples provided with the app. Two are Arecibo and one GBT. At least 2 of the 3 are vlar. That’s what causes the long run times.
30) Message boards : Number crunching : Ryzen and Threadripper (Message 2002356)
Posted 13 Jul 2019 by Profile RueiKe Special Project $250 donor
Post:
What kind of temps did you see when running the AVX apps? Were they much higher than running the SSE variants? AMD is not downclocking the cpu when AVX instructions run like Intel does.


The temps are not being displayed in glances. Perhaps the kernel isn't ready for the new processors yet, but I didn't dig any deeper into it.
31) Message boards : Number crunching : Ryzen and Threadripper (Message 2002321)
Posted 12 Jul 2019 by Profile RueiKe Special Project $250 donor
Post:
I have spent some time benchmarking apps on the 3700x on a C6H (x370) with optimized defaults for BIOS settings. The first results are from running 16 repetitions of all apps for 3 WUs.


The first test ran a mix of all apps, which may not represent real performance, so I also ran dedicated runs for each app:


The system is up and running with the r3345 avx app. The major pf results were not repeatable, so must have been some other factor impacting it.
32) Message boards : Number crunching : Developing AMD GPU Utilities (Message 1999234)
Posted 22 Jun 2019 by Profile RueiKe Special Project $250 donor
Post:
Hi Rick,

today we configured an RX580 to BOINC on Linux Mint 19.1. clinfo works, BOINC utilizes the GPU, however, amdgpu-utils don't find the card.
amdgpu-monitor showed "No AMD GPUs detected, exiting..." num_amd_gpus must have been 0 there. I couldn't fully understand how gpu_list.num_gpus() is defined to check what it was looking for.

Being Linux Mint, the official driver doesn't install, so I used the extracted OpenCL parts from AMDGPU-PRO 19.10. I'm doing that on Ubuntu 19.04 as well, there the tools work without problems.
What are you checking in that moment, where does that data come from?

Thanks!

Manually installing OpenCL is necessary to get compute working for SETI, but other components of the driver package are required for interacting with the GPU. The first thing that amdgpu-utils looks for to determine compatibility is the file:
/sys/class/drm/card?/device/pp_od_clk_voltage

There are many device files that are used from this location to read and write from the GPU. If critical files are missing or not readable, then the card is classified as not compatible.
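
The compatibility check described above can be approximated with a few lines of Python. This is a minimal sketch, not the actual amdgpu-utils code: the function name find_compatible_cards is hypothetical, and it looks only at the single pp_od_clk_voltage file named in the post, while the real tool reads many more device files.

```python
import glob
import os


def find_compatible_cards(drm_root="/sys/class/drm"):
    """Map each card directory to whether pp_od_clk_voltage is present and readable.

    Hypothetical helper for illustration; amdgpu-utils performs more checks.
    """
    results = {}
    # "card?" matches card0..card9, mirroring the path pattern quoted in the post.
    for card_dir in sorted(glob.glob(os.path.join(drm_root, "card?"))):
        od_file = os.path.join(card_dir, "device", "pp_od_clk_voltage")
        results[card_dir] = os.path.isfile(od_file) and os.access(od_file, os.R_OK)
    return results


if __name__ == "__main__":
    for card, ok in find_compatible_cards().items():
        print(card, "compatible" if ok else "pp_od_clk_voltage missing or unreadable")
```

If every card reports the file as missing, the required driver components are probably not installed, which would be consistent with the "No AMD GPUs detected" message.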
33) Message boards : Number crunching : Vega 64 Command Lines? (Message 1992634)
Posted 5 May 2019 by Profile RueiKe Special Project $250 donor
Post:
OK, I don't know why I didn't think of this before. You should look at RueiKe's hosts. https://setiathome.berkeley.edu/show_user.php?userid=10276073
He runs exclusively AMD/ATI gpu hardware including older Radeon Fury, Vega64 and Vega VII gpus. If you look at the times for his gpu tasks on his various hosts, he is running MB tasks in 3-5 minutes.
If you look at his stderr.txt outputs for his tasks you can garner some of the settings he is using in his MB tuning command line parameters. Some of his threads are here.
https://setiathome.berkeley.edu/forum_thread.php?id=81872#1886964
https://setiathome.berkeley.edu/forum_thread.php?id=82949#1936392

I see he has -hp and -high_perf in his command line but I don't see any instance of cpu_lock. So I know that card can go faster.


Here are my command line options:
 -v 1 -instances_per_device 1 -sbs 2048 -period_iterations_num 1 -tt 600 -spike_fft_thresh 4096 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -hp -high_perf -no_defaults_scaling -cpu_lock -nobs -tune 1 64 1 4 -no_use_sleep


In addition, I am using amdgpu-utils to underclock my Vega64s. I am running with a 140W power cap, max Sclk p-state of 6 and a constant Mclk p-state of 3. They are all running in compute mode. All have waterblocks and run at <50C.
34) Message boards : Number crunching : Developing AMD GPU Utilities (Message 1991666)
Posted 27 Apr 2019 by Profile RueiKe Special Project $250 donor
Post:
I have just released a new version of amdgpu-utils:
https://github.com/Ricks-Lab/amdgpu-utils/releases/tag/v2.5.0
    Implemented the --plot option for amdgpu-monitor. This will display plots of critical GPU parameters that update at an interval defined by the --sleep N option.
    Errors in reading non-critical parameters will now show a warning the first time and are disabled for future reads.
    Fixed a bug in implementation of compatibility checks and improved usage of try/except.

35) Message boards : Number crunching : Developing a Multi-Threaded Benchmarking App for Linux (Message 1989862)
Posted 13 Apr 2019 by Profile RueiKe Special Project $250 donor
Post:
I have just released a new version of benchMT:
https://github.com/Ricks-Lab/benchMT/releases/tag/v1.6.0

Changes include the following:
    Complete rewrite of commandline/config file option parsing. Original got complex and buggy.
    Support execution and time/energy metrics for AstroPulse apps/wus. Still no working results comparison utility, so comparison to reference results not possible.

36) Message boards : Number crunching : Developing a Multi-Threaded Benchmarking App for Linux (Message 1988639)
Posted 3 Apr 2019 by Profile RueiKe Special Project $250 donor
Post:

I am reworking all of the logic behind setting modes from command line and config file. Hope I can finish it today. I will let you know when it is ready.

Code on master is under development and won’t be usable until I finish this work.


Not a problem. One of my "native" talents has been to break code that was working for everyone else.....

After all. This is late ALPHA testing. If I could do this when several hundred people were using it reliably, then it might bother me.

:)
Tom


It would be great if you could be an official tester. I usually have a set way of doing things and it is difficult to imagine how many different ways people will think of using the tool. Being a tester would require basic knowledge of git/GitHub, but this is good knowledge to have.

I have completed a major re-write of how command line and cfg options are parsed. The new code is on master and ready for testing. I have gone through a bunch of tests on my side, so looking forward to seeing if you can find a bug!

To download from master, just go to the main page
https://github.com/Ricks-Lab/benchMT and click on the green "Clone or download" button.
37) Message boards : Number crunching : Anything relating to AstroPulse (3) tasks (Message 1988554)
Posted 3 Apr 2019 by Profile RueiKe Special Project $250 donor
Post:
Think I have converged on an optimum AP command line with the benchMT tool.

Command line data is truncated in chart ...

There is also a psv file created with full untruncated details. Also, a human readable summary is available.
38) Message boards : Number crunching : Developing a Multi-Threaded Benchmarking App for Linux (Message 1988505)
Posted 3 Apr 2019 by Profile RueiKe Special Project $250 donor
Post:
That is what you should expect from a typo.

tom@LYNNE-JUPITER-L:~/Downloads/benchMT-1.5.0$ ./benchMT --bonic_home /home/tom/Desktop/BOINC


I want a SPELLING CHECKER for my command line!

Anyway:
benchMT v1.5.0 ― SETI MB Benchmarking Utility ― Linux edition

Suspending BOINC

System Details
Hostname:  LYNNE-JUPITER-L
Run Name:  
Platform:  Linux 4.15.0-46-generic
OS Description:  Ubuntu 18.04.1 LTS
CPU Model:  Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz
CPU MHz:  3600.0000
CPU Cores:  20
CPU Threads:  40
GPU Count:  3
GPU Threads:  3
GPU Devices:  [0, 1, 2]
Devices Map:  {}
GPU Details:   [GP104] [GP104] [GP104]
Current Dir:  /home/tom/Downloads/benchMT-1.5.0/
Slots Dir:  /home/tom/Downloads/benchMT-1.5.0/workdir/Slots/
TimeNow:  Wed Apr  3 00:51:20 2019
TimeNowShort:  0403_005120
CPU App Path:  /home/tom/Downloads/benchMT-1.5.0/APPS_CPU/
GPU App Path:  /home/tom/Downloads/benchMT-1.5.0/APPS_GPU/
REF App Path:  /home/tom/Downloads/benchMT-1.5.0/APPS_REF/
Reference Results Path:  /home/tom/Downloads/benchMT-1.5.0/APPS_REF/REF_RESULTS/
STD Signal WU Path:  /home/tom/Downloads/benchMT-1.5.0/WU_std_signal/
WU Path:  /home/tom/Downloads/benchMT-1.5.0/WU_test/
Test Data Path:  /home/tom/Downloads/benchMT-1.5.0/testData/
BOINC Home:  /home/tom/Desktop/BOINC/
Repetitions:  1
Allocated CPU Threads:  0
Allocated GPU Threads:  0

APP List
MBv8_8.22r3711_sse41_x86_64-pc-linux-gnu

WU List


   0 of 0 jobs complete

┌────┬────┬───┬────────────────────────────────────────────────────────────┬────────┬────────┬───────────┬────────┐
│Job#│Slot│xPU│app_name                                                    │  start │ finish │tot_time   │ state  │
│    │    │   │app_args                                                    │wu_name                               │
└────┴────┴───┴────────────────────────────────────────────────────────────┴──────────────────────────────────────┘
Resuming BOINC
Finish Time: Wed Apr  3 00:51:29 2019


The results seem to imply I have my BenchCFG file set wrong.

Tom


I am reworking all of the logic behind setting modes from command line and config file. Hope I can finish it today. I will let you know when it is ready.

Code on master is under development and won’t be usable until I finish this work.
39) Message boards : Number crunching : Developing a Multi-Threaded Benchmarking App for Linux (Message 1988447)
Posted 2 Apr 2019 by Profile RueiKe Special Project $250 donor
Post:
Ok, what am I missing:

tom@LYNNE-JUPITER-L:~/Downloads/benchMT-1.5.0$ ./benchMT
BOINC Home Path [ /home/boinc/BOINC/ ] doesn't exist
Please set the correct BOINC Home Path with the --boinc_home command line option
boinccmd [ /home/boinc/BOINC//boinccmd ] doesn't exist
Error in environment.  Exiting...



It is not obvious what is going on here. Maybe it is an extra space before the pathname. If not, can you try specifying on command line? If it is an extra space, I need to modify the split command to deal with it.

OK, it didn't pick up my boinc_home statement in the Bench.cfg file either. I had to specify boinc_home on the command line for it to not complain.

#Specify path for BOINC
mode boinc_home /home/keith/Desktop/BOINC/


Just checked the code. Looks like this never worked. The check of environment happens before the read of config file. I will need to work out a fix.
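
The ordering problem described here (environment check running before the config file is read) can be illustrated as follows. This is a sketch under stated assumptions and not benchMT's actual code: the function names read_config, resolve_boinc_home and check_environment are hypothetical, and the cfg syntax is inferred from the single "mode boinc_home ..." line quoted above.

```python
import argparse
import os


def read_config(cfg_path):
    """Collect 'mode <name> <value>' lines from a cfg file into a dict."""
    settings = {}
    if not os.path.isfile(cfg_path):
        return settings
    with open(cfg_path) as cfg:
        for line in cfg:
            # split() with no argument tolerates extra spaces before the path.
            fields = line.split()
            if len(fields) >= 3 and fields[0] == "mode":
                settings[fields[1]] = " ".join(fields[2:])
    return settings


def resolve_boinc_home(argv, cfg_path, default="/home/boinc/BOINC/"):
    """Pick boinc_home: command line first, then cfg file, then the default."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--boinc_home", default=None)
    args = parser.parse_args(argv)
    settings = read_config(cfg_path)  # read the cfg file BEFORE validating anything
    return args.boinc_home or settings.get("boinc_home") or default


def check_environment(boinc_home):
    """Validate only after every source of settings has been merged."""
    return os.path.isdir(boinc_home)
```

With this ordering, a boinc_home line in the cfg file is honored even when the option is not given on the command line, and the environment check sees the final value.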
40) Message boards : Number crunching : Developing a Multi-Threaded Benchmarking App for Linux (Message 1988431)
Posted 2 Apr 2019 by Profile RueiKe Special Project $250 donor
Post:
Ok, what am I missing:

tom@LYNNE-JUPITER-L:~/Downloads/benchMT-1.5.0$ ./benchMT
BOINC Home Path [ /home/boinc/BOINC/ ] doesn't exist
Please set the correct BOINC Home Path with the --boinc_home command line option
boinccmd [ /home/boinc/BOINC//boinccmd ] doesn't exist
Error in environment.  Exiting...



It is not obvious what is going on here. Maybe it is an extra space before the pathname. If not, can you try specifying on command line? If it is an extra space, I need to modify the split command to deal with it.




 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.