Posts by RueiKe

1) Message boards : Number crunching : BOINC with Ubuntu 20.04 (Message 2071892)
Posted 27 Mar 2021 by Profile RueiKe Special Project $250 donor
Post:
Been a while since I have updated boinc on my Linux machines (Ubuntu 20.04.2). Where is the best place to download the latest from?


do you want the repository install? or the standalone "all-in-one" version?


I prefer standalone.
2) Message boards : Number crunching : BOINC with Ubuntu 20.04 (Message 2071833)
Posted 26 Mar 2021 by Profile RueiKe Special Project $250 donor
Post:
Been a while since I have updated boinc on my Linux machines (Ubuntu 20.04.2). Where is the best place to download the latest from?
3) Message boards : Number crunching : Developing AMD GPU Utilities (Message 2052627)
Posted 24 Jun 2020 by Profile RueiKe Special Project $250 donor
Post:
I have posted a package at PyPI which can be installed with:

pip install ricks-amdgpu-utils


I have discovered an issue where the PyPI distribution could not find the plot utility when run from amdgpu-monitor --plot. I have uploaded a new package to fix this. The utilities can be installed with the following command:

pip3 install ricks-amdgpu-utils
4) Message boards : Number crunching : Developing AMD GPU Utilities (Message 2052551)
Posted 23 Jun 2020 by Profile RueiKe Special Project $250 donor
Post:
I have posted a package at PyPI which can be installed with:

pip install ricks-amdgpu-utils
5) Message boards : Number crunching : Developing AMD GPU Utilities (Message 2052432)
Posted 21 Jun 2020 by Profile RueiKe Special Project $250 donor
Post:
I have been working with Keith to include NV GPUs in the utilities as readable (amdgpu-pac still doesn't support NV). Looking for users to give it a try. The latest version is here
6) Message boards : Number crunching : Developing AMD GPU Utilities (Message 2051813)
Posted 14 Jun 2020 by Profile RueiKe Special Project $250 donor
Post:
I have just released a new version of amdgpu-utils:
https://github.com/Ricks-Lab/amdgpu-utils/releases/tag/v3.2.0
    Fixed CRITICAL issue where Zero fan speed could be written when invalid fan speed was read from the GPU.
    Fixed issue in reading pciid file in Gentoo (@CH3CN).
    Modified setup to indicate minimum instead of absolute package versions (@smoe).
    Modified requirements to include min/max package versions for major packages.
    Fixed crash for missing pci-ids file and add location for Arch Linux (@berturion).
    Fixed a crash in amdgpu-pac when no fan details could be read (laptop GPU).
    Fixed deprecation warnings for several property setting functions. Consolidated all property setting to a single function in a new module, and ignore warnings for those that are deprecated. All deprecated actions are marked with FIXME in GPUgui.py.
    Replaced deprecated set properties statement for colors with css formatting.
    Implemented a more robust string format of datetime to address datetime conversion for pandas in some installations.
    Implemented debug logging across the project. Activated with --debug option and output saved to a .log file.
    Updated color scheme of Gtk applications to work in Ubuntu 20.04. Unified color scheme across all utilities.
    Additional memory parameters added to utilities.
    Read ID information for all GPUs and attempt to decode GPU name. For cards with no card path entry, determine system device path and use for reading ID. Report system device path in amdgpu-ls. Add amdgpu-ls --short report to give brief description of all installed GPUs.

7) Message boards : Number crunching : BOINC with Ubuntu 20.04 (Message 2046670)
Posted 24 Apr 2020 by Profile RueiKe Special Project $250 donor
Post:
Has anyone else tried the TBar release of BOINC under the latest release of Ubuntu? I get errors for missing packages that cannot be added as was the case for 18.04: libwebkitgtk-1.0 and libcurl3. I tried boinc with libcurl4, but then it complains about libssl.so.1.0.0.

We are working on it. We already have a running version and are testing it now. Give us some time.

Cool. Thanks for the update!
8) Message boards : Number crunching : BOINC with Ubuntu 20.04 (Message 2046668)
Posted 24 Apr 2020 by Profile RueiKe Special Project $250 donor
Post:
Has anyone else tried the TBar release of BOINC under the latest release of Ubuntu? I get errors for missing packages that cannot be added as was the case for 18.04: libwebkitgtk-1.0 and libcurl3. I tried boinc with libcurl4, but then it complains about libssl.so.1.0.0.
9) Message boards : Number crunching : Developing a Multi-Threaded Benchmarking App for Linux (Message 2035472)
Posted 3 Mar 2020 by Profile RueiKe Special Project $250 donor
Post:
Well, I have completed the rewrite of benchMT just in time for the project completion. I decided to post it anyway, just as part of the project legacy. After about 20 years away from programming, benchMT and other projects in support of SETI brought me back. It is now my passion. In the several years I have been with the project, I have learned so much about hardware, overclocking, and software development. The forums have been an amazing community and were a big part of that journey. I don't think that there will be another BOINC project that will ever be as compelling as SETI@Home. I hope there will be future opportunities for community involvement.

I have just released a new version of benchMT:
https://github.com/Ricks-Lab/benchMT/releases/tag/v2.0.0

Changes include the following:
    Complete rewrite of the core to integrate GpuItem object that was previously only used in energy mode.
    Implemented a more robust check of compute and energy compatibility for all installed GPUs. Switched from lshw to lspci and clinfo. Not working for Intel GPUs.
    Many core improvements for better robustness and style. Added reST docstrings for better supportability and improved type checking by PyCharm.
    Extend energy mode to cover Nvidia, in addition to AMD GPUs. Added max-power to psv and txt reports.
    --noBS on the command line will allow benchMT to function without a BOINC installation.
    Check devmap and give more details to help user understand mapping of BOINC Devices to Driver Card Numbers.
    Added --purge_kernels command line option to purge cached kernel files from working directory.
    Added --lsgpu command line option to display information on all installed GPUs.
    Reading BOINC's coproc_info.xml as a possible way to automate device to card mapping. Not implemented.

10) Message boards : Number crunching : Developing AMD GPU Utilities (Message 2034740)
Posted 1 Mar 2020 by Profile RueiKe Special Project $250 donor
Post:

In the monitoring and graph plots, is there any measure of RAM usage/utilization and PCIe bandwidth utilization?

In the monitor tables, memory busy percent is reported. Before I add it to the plots, I need to work on plotting efficiency. Other memory related parameters available (but not used by amdgpu-utils) are:
mem_busy_percent
mem_info_gtt_used
mem_info_vis_vram_used
mem_info_vram_used
mem_info_gtt_total
mem_info_vis_vram_total
mem_info_vram_total

I don't see any device files/sensors related to PCIe loading.
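
As a rough illustration of how the memory parameters listed above could be turned into usage figures, here is a minimal Python sketch. It assumes the files sit in a GPU device directory such as /sys/class/drm/card0/device (the path varies by system) and that each file holds a single integer. The function names are hypothetical and are not part of amdgpu-utils.

```python
from pathlib import Path
from typing import Dict, Optional

# File names as listed above; the directory they live in is an assumption.
MEM_FILES = (
    "mem_busy_percent",
    "mem_info_gtt_used", "mem_info_gtt_total",
    "mem_info_vis_vram_used", "mem_info_vis_vram_total",
    "mem_info_vram_used", "mem_info_vram_total",
)


def read_mem_params(device_dir: str) -> Dict[str, Optional[int]]:
    """Read each memory file as an integer; None if missing or unreadable."""
    values: Dict[str, Optional[int]] = {}
    for name in MEM_FILES:
        try:
            values[name] = int((Path(device_dir) / name).read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def used_percent(values: Dict[str, Optional[int]], kind: str) -> Optional[float]:
    """Percent used for kind in {'gtt', 'vis_vram', 'vram'}, or None if unknown."""
    used = values.get(f"mem_info_{kind}_used")
    total = values.get(f"mem_info_{kind}_total")
    if used is None or not total:
        return None
    return 100.0 * used / total


if __name__ == "__main__":
    # Hypothetical card path; adjust for the GPU of interest.
    params = read_mem_params("/sys/class/drm/card0/device")
    for kind in ("vram", "vis_vram", "gtt"):
        print(kind, used_percent(params, kind))
```

Missing files are reported as None rather than raising, since not every card or driver version exposes all of these entries.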
11) Message boards : Number crunching : Developing AMD GPU Utilities (Message 2034582)
Posted 29 Feb 2020 by Profile RueiKe Special Project $250 donor
Post:
I have just released a new version of amdgpu-utils:
https://github.com/Ricks-Lab/amdgpu-utils/releases/tag/v3.0.0
    Style and code robustness improvements
    Deprecated amdgpu-pciid and removed all related code
    Complete rewrite based on benchMT learning. Simplified code with ObjDict for GpuItem parameters and use of class variables for generic behavior parameters
    Use lspci as the starting point for developing GPU list and classify by vendor, readability, writability, and compute capability. Build in potential to be generic GPU util, instead of AMD focused
    Test for readability and writability of all GPUs and apply utilities as appropriate
    Add assessment of compute capability for each installed GPU
    Eliminated the use of lshw to determine driver compatibility and display of driver details is now informational with no impact on the utilities
    Add p-state masking capability for Type 2 GPUs (Vega20 and newer)
    Optimized pac writing to GPUs



The core of the utility has been modified to allow extension to other brands of GPUs. Will start this work when I have some available time.

12) Message boards : Number crunching : Problems with MBv8_8.22r4008_avx2_intel_x86_64-pc-linux-gnu (Message 2034389)
Posted 28 Feb 2020 by Profile RueiKe Special Project $250 donor
Post:
Anybody run the new app through the benchmarks yet with a comparison run against the default r3711 SSE41 cpu app?
After spending DAYS trying to get the AKv8 OpenCL App to compile in the newer Linux systems it has come to my attention a large number of Optimizations were left out of the CPU Apps. It May be possible to build a New CPU App with better speed using those said Optimizations. Right now I'm trying to find the System calls named by Eric & RueiKe that should be left out of the OpenCL App. Not having much luck. Anyone know just WHERE in the code "funsafe-math-optimizations" & "unsafe-fp-opt" are lurking? The New OpenCL App is a little faster than r3584, I suspect a Newer CPU App can be faster than r3711 as well.

I have verified that ROCm will have the same problem giving invalid results that was found with Navi on the latest Windows/Linux drivers. They all use the latest LC OpenCL compiler. I had tested ROCm on my Vega20 and confirmed the standard version gives invalid results. I have replaced a lib file with one that would not allow unsafe-math, and that system is giving valid results now. This lib file has other issues and can not be released, so we need to make the same app modification that was made for Windows. Perhaps the magnitude of the problem for Linux is small now, though if we looked hard enough, we could probably find some, but I would expect that new GPUs/drivers in Linux will cause the problem to increase in magnitude over time.

I would expect the unsafe options are related to the creation of the code that runs on the GPU and not the compile options for creating the CPU code. I am not so familiar with OpenCL programming, but I did check the binary with a viewer and did not see the same clear flags that Eric saw in the Windows app.
13) Message boards : Number crunching : Special SETI@Home Fundraiser - 26 x 16TB datacentre drives required for the project's data storage (Message 2025684)
Posted 31 Dec 2019 by Profile RueiKe Special Project $250 donor
Post:
You gave $250.00
to SETI@home
on December 31, 2019
14) Message boards : Number crunching : Keeping your rig running: UPS (Uninteruptable Power Supplies), Power Conditioners etc. (Message 2019244)
Posted 16 Nov 2019 by Profile RueiKe Special Project $250 donor
Post:

[Edit] Just thought of something, does it have to be a network card? The SMT-2200C does have the ability to connect over the network without any added network card. Just plug it into your network with an ethernet cable. So would I just have to add the SNMP interface to the computer to use your utility?

If it is capable, then you should get results with the snmpwalk command:

snmpwalk -v2c -c word ip_address

but you would need to know the community string (the "word" above) set in the setup of the SNMP device. Perhaps, when connected to the network, the UPS would have a web interface to set up SNMP.
15) Message boards : Number crunching : Keeping your rig running: UPS (Uninteruptable Power Supplies), Power Conditioners etc. (Message 2019236)
Posted 16 Nov 2019 by Profile RueiKe Special Project $250 donor
Post:
My first concern is that the output voltage of Eaton is 10V lower than APC. Input voltages are measured by both to be 120V. I have measured the output voltage of Eaton with a meter and confirmed it to be 110V. This is probably not ideal considering the power levels I run at.

Rick - given you are in the USA, 110V is the "correct" voltage for you, while 120V is close to the upper limit, so I'd be more concerned about that.... But are you certain the output waveforms from the two are the same (and pure sine wave)? If they aren't, the output "RMS" voltages could just be how the instruments approximate the different chopped waveforms as sine waves. This would also mean the power measurements are not directly comparable.


I am actually in Taiwan. Standard voltage here should be the same as the USA at 110V, but what I measure at a socket in my home is close to 120V, which matches the input/output of APC. Eaton input is 120V and output is 110V. Looks like Eaton is purposely matching the standard while APC is matching the input. I don't have a scope to look at the waveforms, but the magnitude of the power readings seems consistent with the loads I have on each UPS, so I still suspect the reported current is off.
16) Message boards : Number crunching : Keeping your rig running: UPS (Uninteruptable Power Supplies), Power Conditioners etc. (Message 2019218)
Posted 16 Nov 2019 by Profile RueiKe Special Project $250 donor
Post:

Right now I can support my compute load for about 8 minutes on battery and I have the apcupsd daemon configured to start shutting down the PC after 3 minutes of no power and the UPS on battery. But if you could dump the compute load like you can on a laptop, the UPS runtime would stretch out to an hour probably. That would allow you to continue using your computer for other things than crunching.


I developed the ups-daemon script for this purpose. If the time on battery exceeds a specified threshold, then the suspend script will be called. The suspend script in the package will call boinccmd to suspend compute. The resume script will be called when it is detected that the target UPS is no longer on battery. I am still tweaking all of the triggers, so any input would be appreciated. The only limitation is that the utilities only work with SNMP to a network-attached UPS, so USB-attached units are not detected.
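
The trigger logic described above can be sketched roughly as follows. This is only an illustration under my own assumptions, not the actual ups-daemon code: the function name, the arguments, and the polling model are hypothetical, and the real scripts would be invoked where the comments indicate.

```python
from typing import Optional


def ups_action(on_battery: bool, seconds_on_battery: float,
               threshold_s: float, suspended: bool) -> Optional[str]:
    """Decide what to do for one polling cycle.

    Returns 'suspend' when the UPS has been on battery longer than the
    threshold and compute is still running, 'resume' when mains power is
    back and compute is suspended, otherwise None.
    """
    if on_battery and not suspended and seconds_on_battery > threshold_s:
        return "suspend"   # here the suspend script would be called
    if not on_battery and suspended:
        return "resume"    # here the resume script would be called
    return None


if __name__ == "__main__":
    # Simulated readings: (on_battery, seconds_on_battery)
    readings = [(False, 0), (True, 60), (True, 200), (True, 260), (False, 0)]
    suspended = False
    for on_batt, secs in readings:
        action = ups_action(on_batt, secs, threshold_s=180, suspended=suspended)
        if action == "suspend":
            suspended = True
        elif action == "resume":
            suspended = False
        print(on_batt, secs, action)
```

Tracking the suspended state keeps the suspend script from being called on every poll while the UPS stays on battery.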
17) Message boards : Number crunching : Keeping your rig running: UPS (Uninteruptable Power Supplies), Power Conditioners etc. (Message 2019214)
Posted 16 Nov 2019 by Profile RueiKe Special Project $250 donor
Post:
I have a fully working set of ups-utils for network-attached UPSs and have been using it to monitor my setup with 3 UPSs. I have some initial observations of the monitor data I have collected. Here is a sample of log data:
ups_type	____UPS____	V in	V out	Load	I out	P out	I(P)	CurFact
apc-ap9630	RicksLab_UPS4	116	120	25	6	685	5.71	0.95
eaton-pw	RicksLab_UPS3	122	110.1	55	3.8	888	8.07	2.12
eaton-pw	RicksLab_UPS2	118.4	110	27	2.7	669	6.08	2.25
apc-ap9630	RicksLab_UPS4	117	120	25	6	681	5.68	0.95
eaton-pw	RicksLab_UPS3	117	110	46	3.2	714	6.49	2.03
eaton-pw	RicksLab_UPS2	119.5	109.9	26	2.7	705	6.41	2.38
apc-ap9630	RicksLab_UPS4	117	120	25	6	683	5.69	0.95
eaton-pw	RicksLab_UPS3	117.8	109.8	55	3.7	886	8.07	2.18
eaton-pw	RicksLab_UPS2	115.7	109.8	27	2.8	642	5.85	2.09
apc-ap9630	RicksLab_UPS4	116	120	25	6	687	5.73	0.95
eaton-pw	RicksLab_UPS3	121.2	110.1	55	3.9	896	8.14	2.09
eaton-pw	RicksLab_UPS2	115.7	110	28	3.1	705	6.41	2.07
apc-ap9630	RicksLab_UPS4	116	120	29	7	763	6.36	0.91
eaton-pw	RicksLab_UPS3	120.6	110.2	56	3.7	912	8.28	2.24
eaton-pw	RicksLab_UPS2	122.4	110.2	29	2.8	687	6.23	2.23

My first concern is that the output voltage of Eaton is 10V lower than APC. Input voltages are measured by both to be 120V. I have measured the output voltage of Eaton with a meter and confirmed it to be 110V. This is probably not ideal considering the power levels I run at.

Next concern is the inconsistency of the current measurements. I have confirmed with a MIB browser that the utility is reporting back the correct data for the appropriate MIB command. If I compare to the output current calculated from output power (assuming a resistive load), the APC current is consistent with the measured value, but the Eaton current is half of what is expected. I was considering using a 2.2 correction factor on the Eaton current measurement, but it would be good to know what the issue is before I implement it.
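
For reference, the I(P) and CurFact columns in the log above follow from output power divided by output voltage, and that result divided by the reported output current. A minimal Python sketch of the arithmetic, assuming a purely resistive load as stated (function names are just for illustration):

```python
def expected_current(p_out_w: float, v_out: float) -> float:
    """Current implied by output power and voltage, assuming a resistive load."""
    return p_out_w / v_out


def current_factor(p_out_w: float, v_out: float, i_out: float) -> float:
    """Ratio of the implied current to the current reported by the UPS."""
    return expected_current(p_out_w, v_out) / i_out


if __name__ == "__main__":
    # (ups_type, V out, I out, P out) taken from the first two log rows
    rows = [("apc-ap9630", 120.0, 6.0, 685.0),
            ("eaton-pw", 110.1, 3.8, 888.0)]
    for ups_type, v_out, i_out, p_out in rows:
        print(ups_type,
              round(expected_current(p_out, v_out), 2),
              round(current_factor(p_out, v_out, i_out), 2))
```

A factor near 1 means the reported current agrees with the power reading. The Eaton rows come out a little above 2, which is where the idea of a 2.2 correction comes from.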
18) Message boards : Number crunching : Keeping your rig running: UPS (Uninteruptable Power Supplies), Power Conditioners etc. (Message 2017664)
Posted 3 Nov 2019 by Profile RueiKe Special Project $250 donor
Post:
I have posted a utility that I have developed for my Linux machines to suspend, resume, and shutdown based on UPS state:

ups-utils

I have verified compatibility with my UPSs which include:
* APC SRT 3000 with AP9630 NMC
* Eaton C-3000-F with PowerWalker NMC
To use the utility, you must first populate a config.py file using config.py.template as a template.

It is my first SNMP app, so there are still things for me to figure out. It doesn't listen for SNMP traps yet.
19) Message boards : Number crunching : Help Needed - Can't get GPU Tasks (Message 2016452)
Posted 24 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
If the file is named in the app_info it must be present in the folder.

That was it. Thanks!
20) Message boards : Number crunching : Help Needed - Can't get GPU Tasks (Message 2016433)
Posted 24 Oct 2019 by Profile RueiKe Special Project $250 donor
Post:
Once it starts deleting files it usually deletes all the named files, even mb_cmdline_rickslab_ati5.txt. Just make sure all 3 of the named GPU files are present in the folder.


My experience had been that if the command file was missing, it would just use the app with default arguments, so I didn’t check the existence of the file. I will check it out this evening.




 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.