What percent utilization to expect on the CUDA90 linux client?

Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2005035 - Posted: 31 Jul 2019, 19:09:18 UTC

Is 85% utilization about right? I am guessing, since nvidia-smi only shows the instantaneous value at the moment I run it. I do see 0% and 100% occasionally, and the power meter fluctuates between 790 and 900 watts. I have 3 GPUs on a single splitter and the other 5 are in their own slots. I assume the following is good. I ran nvidia-smi several times and picked the output that seemed closest to the average.

jstateson@tb85-nvidia:~$ nvidia-smi
Wed Jul 31 13:27:10 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                Driver Version: 390.116                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
|100%   55C    P2    67W / 151W |   1520MiB /  8117MiB |     80%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 106...  Off  | 00000000:02:00.0 Off |                  N/A |
|100%   68C    P2   116W / 120W |   1300MiB /  6078MiB |     80%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 106...  Off  | 00000000:03:00.0 Off |                  N/A |
|100%   70C    P2    91W / 120W |   1292MiB /  3019MiB |     86%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 106...  Off  | 00000000:04:00.0 Off |                  N/A |
|100%   72C    P2    63W / 120W |   1292MiB /  3019MiB |     85%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 1070    Off  | 00000000:05:00.0  On |                  N/A |
|100%   49C    P2    87W / 151W |   1330MiB /  8119MiB |     85%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 106...  Off  | 00000000:08:00.0 Off |                  N/A |
|100%   72C    P2    71W / 120W |   1292MiB /  3019MiB |     84%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 106...  Off  | 00000000:09:00.0 Off |                  N/A |
|100%   67C    P2    85W / 120W |   1292MiB /  3019MiB |     88%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GeForce GTX 106...  Off  | 00000000:0A:00.0 Off |                  N/A |
|100%   63C    P2    83W / 120W |   1292MiB /  3019MiB |     82%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1414      G   /usr/lib/xorg/Xorg                           111MiB |
|    0      1656      G   /usr/bin/gnome-shell                          94MiB |
|    0      2895      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1303MiB |
|    1      1414      G   /usr/lib/xorg/Xorg                             6MiB |
|    1      2887      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1281MiB |
|    2      1414      G   /usr/lib/xorg/Xorg                             6MiB |
|    2      2871      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    3      1414      G   /usr/lib/xorg/Xorg                             6MiB |
|    3      2921      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    4      1414      G   /usr/lib/xorg/Xorg                            15MiB |
|    4      2909      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1303MiB |
|    5      1414      G   /usr/lib/xorg/Xorg                             6MiB |
|    5      2929      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    6      1414      G   /usr/lib/xorg/Xorg                             6MiB |
|    6      2881      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    7      1414      G   /usr/lib/xorg/Xorg                             6MiB |
|    7      2934      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
+-----------------------------------------------------------------------------+


Averaging about 2.3 minutes per work unit.

ID: 2005035 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2005038 - Posted: 31 Jul 2019, 19:27:47 UTC

Too bad you don't have more cpu cores so you can run the -nobs parameter. I normally see 98-100% on all my cards all the time with -nobs.

You should run nvidia-smi with the loop parameter if you want to watch the utilization with automatic polling.
nvidia-smi -l 2
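
For a number rather than an eyeball estimate, the same polling idea can be sketched in Python. This is only a minimal sketch: it assumes nvidia-smi is on the PATH and uses its `--query-gpu=index,utilization.gpu` CSV output; the function names `parse_sample`, `average_samples` and `poll` are made up for illustration.

```python
import subprocess
import time
from collections import defaultdict

# Query one CSV line per GPU, e.g. "0, 80" (no header, no "%" unit).
QUERY = ["nvidia-smi", "--query-gpu=index,utilization.gpu",
         "--format=csv,noheader,nounits"]

def parse_sample(csv_text):
    """Turn 'index, util' CSV lines into {gpu_index: utilization_percent}."""
    sample = {}
    for line in csv_text.strip().splitlines():
        index, util = (field.strip() for field in line.split(","))
        sample[int(index)] = float(util)
    return sample

def average_samples(samples):
    """Average a list of per-GPU samples into {gpu_index: mean_utilization}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for sample in samples:
        for gpu, util in sample.items():
            totals[gpu] += util
            counts[gpu] += 1
    return {gpu: totals[gpu] / counts[gpu] for gpu in totals}

def poll(count=30, interval=2.0):
    """Collect `count` samples `interval` seconds apart and average them."""
    samples = []
    for _ in range(count):
        out = subprocess.run(QUERY, capture_output=True, text=True,
                             check=True).stdout
        samples.append(parse_sample(out))
        time.sleep(interval)
    return average_samples(samples)
```

Calling `poll()` on the crunching machine returns one averaged figure per GPU, which smooths out the brief 0% readings between work units.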


I would say you are doing the best you can with your limited cpu support. Normally the special app performs best with a full cpu core to support each gpu task.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2005038 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2005040 - Posted: 31 Jul 2019, 19:42:16 UTC - in response to Message 2005035.  

you should be seeing 95+% utilization the majority of the time. only at the very beginning and end of the WU processing will you see brief periods of lower values.

I'm guessing with 8 GPUs, you are using USB risers. are you running PCIe gen 3 or gen 2 speeds?

you can run the following command to check:
nvidia-smi --query-gpu=name,pci.bus_id,pcie.link.gen.current --format=csv


or open nvidia-settings and check the link speed listed for each card. [5 GT/s] = gen 2, [8 GT/s] = gen 3.
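
The speed-to-generation rule above can be written down as a small lookup. This is a sketch only: the 2.5 GT/s entry for gen 1 is added here from the PCIe specification rather than from the post, and `pcie_gen_from_speed` is a hypothetical helper name.

```python
# Per-lane transfer rate in GT/s -> PCIe generation.
# 5 and 8 GT/s come from the post; 2.5 GT/s (gen 1) is added for completeness.
LINK_SPEED_TO_GEN = {2.5: 1, 5.0: 2, 8.0: 3}

def pcie_gen_from_speed(label):
    """Map a label such as '[5 GT/s]' or '8 GT/s' to a generation, else None."""
    number = label.strip("[] ").split()[0]
    return LINK_SPEED_TO_GEN.get(float(number))

print(pcie_gen_from_speed("[5 GT/s]"))  # 2
print(pcie_gen_from_speed("[8 GT/s]"))  # 3
```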

you most likely have crap USB cables that came with the riser kits. most all of them are bad, and the biggest clue to a bad riser cable in my testing and experience was when the GPU utilization would be low like this. hop on amazon and order some replacement USB 3.0 cables that are higher quality. I've had good luck with the UGREEN branded cables.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2005040 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2005042 - Posted: 31 Jul 2019, 19:47:11 UTC - in response to Message 2005038.  

Too bad you don't have more cpu cores so you can run the -nobs parameter. I normally see 98-100% on all my cards all the time with -nobs.

You should run nvidia-smi with the loop parameter if you want to watch the utilization with automatic polling.
nvidia-smi -l 2


I would say you are doing the best you can with your limited cpu support. Normally the special app performs best with a full cpu core to support each gpu task.


he can run -nobs. he has an 8-thread CPU with 8 GPUs. it would be hammering the CPU at all times, but it can be done. he probably won't see a huge impact on run speeds; it's on the edge IMO.

i never checked GPU utilization without nobs, so i'm not sure if it's lower like this or not, but I did see lower utilization with defective USB cables (which eventually manifested stability issues, but the low utilization was always the clue that they were bad)

ID: 2005042 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2005044 - Posted: 31 Jul 2019, 19:58:59 UTC - in response to Message 2005042.  

When I ran the stock app without -nobs and without an <avg_ncpus>1.0</avg_ncpus> statement in an app_config file, I was seeing around 85% utilization. I think what he has is normal, at least from my experience. You guys have the experience with risers, I don't. Even on the card that is in my X4 slot on the mobo, I can get to 100% utilization with -nobs. But I have plenty of cpu cores to spare.
ID: 2005044 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2005048 - Posted: 31 Jul 2019, 20:08:22 UTC
Last modified: 31 Jul 2019, 20:09:46 UTC

Your numbers are low, at least for the 1070s. Look at mine:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.26       Driver Version: 430.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2070    On   | 00000000:01:00.0 Off |                  N/A |
|100%   71C    P2   158W / 160W |   1429MiB /  7982MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 2070    On   | 00000000:02:00.0 Off |                  N/A |
|100%   55C    P2   159W / 160W |   1429MiB /  7982MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 1070    On   | 00000000:03:00.0 Off |                  N/A |
| 75%   57C    P2   161W / 160W |   1385MiB /  8119MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 1070    On   | 00000000:04:00.0  On |                  N/A |
| 75%   62C    P2   118W / 160W |   1709MiB /  8118MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+


I keep 1 core to feed each GPU and run only 2 CPU + 4 GPU WU at a time.
ID: 2005048 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2005049 - Posted: 31 Jul 2019, 20:13:09 UTC
Last modified: 31 Jul 2019, 20:19:19 UTC

I enabled nobs and will see what happens. Running nvidia-smi -l 2 shows the occasional 0 and 100 like I mentioned, but while some cards are in the 90s, I see more in the high 80s.

BoincTasks shows CPU utilization averaging below 30% (except on startup), so assuming about 1/4 of a CPU for each of the 8 work units, there is supposedly more than enough of the E3-1230 Xeon to go around. However, I am not sure how accurate or how useful that CPU statistic is.
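
The arithmetic behind that estimate can be sketched as follows. It assumes the BoincTasks figure is a per-task share of one CPU thread, which the post does not state outright; the helper names are hypothetical.

```python
def threads_needed(per_task_cpu_percent, tasks):
    """Total CPU threads consumed if each task uses the given share of a thread."""
    return per_task_cpu_percent / 100.0 * tasks

def spare_threads(per_task_cpu_percent, tasks, logical_cpus):
    """Threads left over after feeding all GPU tasks."""
    return logical_cpus - threads_needed(per_task_cpu_percent, tasks)

# Figures from the post: roughly 30% per task, 8 GPU tasks, 8 threads on the E3-1230.
print(round(threads_needed(30, 8), 2))    # 2.4
print(round(spare_threads(30, 8, 8), 2))  # 5.6
```

Under that assumption the CPU is nowhere near saturated, so a per-task figure this low may indicate the tasks are waiting on something else rather than that the CPU has headroom to spare.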



When I installed Ubuntu 18.04 I had a problem with 5 GPUs all on risers, and a mining thread mentioned that the TB85 worked best with the five x1 slots set to Gen 1, so I did that in the BIOS. The x16 slot was not programmable.

root@tb85-nvidia:/home/jstateson# nvidia-smi --query-gpu=name,pci.bus_id,pcie.link.gen.current --format=csv
name, pci.bus_id, pcie.link.gen.current
GeForce GTX 1070, 00000000:01:00.0, 3
GeForce GTX 1060 6GB, 00000000:02:00.0, 2
GeForce GTX 1060 3GB, 00000000:03:00.0, 1
GeForce GTX 1060 3GB, 00000000:04:00.0, 1
GeForce GTX 1070, 00000000:05:00.0, 1
GeForce GTX 1060 3GB, 00000000:08:00.0, 2
GeForce GTX 1060 3GB, 00000000:09:00.0, 1
GeForce GTX 1060 3GB, 00000000:0A:00.0, 1


Not sure what the above means. However, I had a lot of other problems on the first install of 18.04 that may have been fixed by the update & upgrade, so I may go back into the BIOS and restore the x1 slots to Gen 2, as the slot speed may not have been the problem in the first place.

I do have a USB cable problem, as I can move cables about and see really different conditions (boards not recognized). It looks like I have 6 thick blue USB cables and a pair of skinny red & black ones that I suspect are not as good as the thicker ones. It is worth a try to make them all the same.
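
One way to read the CSV output in this post is to list the cards whose current link generation is below a chosen minimum. The sketch below does that with the output pasted in as sample data; `links_below` and the minimum of Gen 2 are illustrative choices. Note as a caveat that the current link generation can read lower while a card is idle, so comparing against the `pcie.link.gen.max` query field may also be worthwhile.

```python
import csv
import io

# Output copied from the post above.
SAMPLE = """name, pci.bus_id, pcie.link.gen.current
GeForce GTX 1070, 00000000:01:00.0, 3
GeForce GTX 1060 6GB, 00000000:02:00.0, 2
GeForce GTX 1060 3GB, 00000000:03:00.0, 1
GeForce GTX 1060 3GB, 00000000:04:00.0, 1
GeForce GTX 1070, 00000000:05:00.0, 1
GeForce GTX 1060 3GB, 00000000:08:00.0, 2
GeForce GTX 1060 3GB, 00000000:09:00.0, 1
GeForce GTX 1060 3GB, 00000000:0A:00.0, 1
"""

def links_below(csv_text, minimum_gen=2):
    """Return (bus_id, name, gen) for every GPU linked below minimum_gen."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    slow = []
    for row in reader:
        gen = int(row["pcie.link.gen.current"])
        if gen < minimum_gen:
            slow.append((row["pci.bus_id"], row["name"], gen))
    return slow

for bus_id, name, gen in links_below(SAMPLE):
    print(f"{bus_id}  {name}: Gen {gen}")
```

On this data, five of the eight cards report Gen 1, one reports Gen 3 and two report Gen 2.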
ID: 2005049 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2005054 - Posted: 31 Jul 2019, 20:30:32 UTC

I find the utilization values in BoincTasks to be very accurate. So what your screenshot shows I think is the real state of the tasks running. The low values I would think are because of the X1 Gen 1 slot speeds. See if you can go back into the BIOS and change the slot speeds to at least Gen. 2.

And absolutely order those USB cables mentioned.
ID: 2005054 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2005055 - Posted: 31 Jul 2019, 20:31:30 UTC - in response to Message 2005044.  
Last modified: 31 Jul 2019, 20:36:15 UTC

If you don't have a <avg_ncpus>1.0</avg_ncpus> statement in a app_config file and just run the stock app without -nobs, then I was seeing around 85% utilization. I think what he has is normal, at least from my experience. You guys have the experience with risers, I don't. Even on the card that is in my X4 slot on the mobo, I can get to 100% utilization with -nobs. But I have plenty of cpu cores to spare.

I don't know where that myth came from, there is absolutely No Problem running nobs with more GPUs than cores. You should Never use <avg_ncpus>1.0</avg_ncpus> as it will stop you from running more GPUs. Use the default setting, that's why it is there. If you want to limit CPUs then use the 'Use at most ___% of the CPUs' setting, that's why it's there.
I have previously run 11 GPUs on a 4 core i5 using nobs. I was running 6 GPUs with the same i5 and nobs up until yesterday. I'm still running 14 GPUs and nobs with an 8 core CPU without any trouble.
As soon as I get another CPU cooler the new i7 will be running nobs with 9 GPUs. You do need a good cooler, this one will keep an i7 6700K at around 70C running flat out, https://www.newegg.com/p/N82E16835114140, I'm ordering another one today.

My 14 GPU machine runs nicely with only 8 CPU cores, the system will divide the CPU resources automatically and it obviously doesn't need a full CPU to run the GPUs at top speed. The advantage will vary, the lower end cards only see a couple of seconds improvement with using nobs, it depends on the GPU.
bar@TBar-iSETI:~$ nvidia-smi
Wed Jul 31 16:32:42 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    On   | 00000000:01:00.0  On |                  N/A |
| 49%   59C    P2    94W / 151W |   1540MiB /  8116MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1070    On   | 00000000:02:00.0 Off |                  N/A |
| 80%   70C    P2   127W / 151W |   1297MiB /  8119MiB |     97%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 106...  On   | 00000000:05:00.0 Off |                  N/A |
| 75%   74C    P2    92W / 150W |   1269MiB /  3019MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 106...  On   | 00000000:09:00.0 Off |                  N/A |
| 84%   72C    P2    90W / 120W |   1269MiB /  3019MiB |     95%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 106...  On   | 00000000:0A:00.0 Off |                  N/A |
| 65%   68C    P2    82W / 150W |   1269MiB /  3019MiB |     93%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 106...  On   | 00000000:0B:00.0 Off |                  N/A |
| 70%   64C    P2    82W / 120W |   1269MiB /  3019MiB |     86%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 970     On   | 00000000:0C:00.0 Off |                  N/A |
| 55%   65C    P2   133W / 170W |   1260MiB /  4043MiB |     92%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GeForce GTX 106...  On   | 00000000:0D:00.0 Off |                  N/A |
| 84%   72C    P2    90W / 120W |   1269MiB /  3019MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|   8  GeForce GTX 106...  On   | 00000000:0E:00.0 Off |                  N/A |
| 55%   67C    P2   136W / 150W |   1269MiB /  3019MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   9  GeForce GTX 106...  On   | 00000000:0F:00.0 Off |                  N/A |
| 70%   73C    P2    91W / 120W |   1269MiB /  3019MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|  10  GeForce GTX 106...  On   | 00000000:10:00.0 Off |                  N/A |
| 65%   72C    P2    85W / 120W |   1269MiB /  3019MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|  11  GeForce GTX 106...  On   | 00000000:11:00.0 Off |                  N/A |
| 70%   69C    P2   103W / 150W |   1269MiB /  3019MiB |     79%      Default |
+-------------------------------+----------------------+----------------------+
|  12  GeForce GTX 1070    On   | 00000000:12:00.0 Off |                  N/A |
| 64%   70C    P2   116W / 151W |   1297MiB /  8119MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|  13  GeForce GTX 106...  On   | 00000000:13:00.0 Off |                  N/A |
| 80%   73C    P2    95W / 120W |   1269MiB /  3019MiB |     87%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1225      G   /usr/lib/xorg/Xorg                            98MiB |
|    0      1869      G   /usr/bin/gnome-shell                         152MiB |
|    0     20132      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1285MiB |
|    1     20078      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1285MiB |
|    2     20045      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|    3     20186      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|    4     20032      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|    5     20140      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|    6     20149      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1247MiB |
|    7     20154      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|    8     20064      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|    9     20051      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|   10     20130      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|   11     20059      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
|   12     20116      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1285MiB |
|   13     20069      C   ...41p_V0.98b1_x86_64-pc-linux-gnu_cuda100  1257MiB |
+-----------------------------------------------------------------------------+
ID: 2005055 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 2005060 - Posted: 31 Jul 2019, 20:49:49 UTC - in response to Message 2005055.  


I don't know where that myth came from, there is absolutely No Problem running nobs with more GPUs than cores.


I think this came about due to confusion by some between the cuda special and SoG. The cuda doesn't require the full thread in order to perform while SoG does. This being the case, Cuda does work with more work units than threads available since it doesn't require that full thread. SoG on the other hand requires the full thread and since that is the case, trying to run more SoG than Threads available resulted in work units stopping and restarting from Zero. Thus we were forced into the -cpu_lock and -number_of_instances to prevent this. That is my best guess as to where it came about.

My own experience has shown a modest gain from devoting a full thread to the work (only a few seconds shaved off the time at best). I think it also depends on CPU speed as well as RAM speed on the MoBo. YMMV
ID: 2005060 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2005065 - Posted: 31 Jul 2019, 21:18:45 UTC - in response to Message 2005042.  

I did see lower utilization with defective USB cables (which eventually manifested stability issues, but the low utilization was always the clue that they were bad)


This is incredible!!!!!! Huge difference with better cables!

I replaced three cables that were "tiny" with the thicker blue one and the difference is unbelievable!!!

boincstats now shows over 95% CPU utilization where before I had under 30%. All 8 cores (threads) are busy!

nvidia-smi shows far higher GPU utilization, typically all in the 90s except for tasks that are just starting up.

I noticed for the first time that GPU tasks were waiting for memory. I fixed that by assigning 95% for both in-use and idle time. I have a pair of 4 GB sticks. This system is not used for anything else, so I probably don't need any more, although I do have a GTX 650 Ti not being used and an empty slot on the 4-in-1.

I did not change the Gen 1 allocation. Things look good here. It might be better to use a pair of 2-in-1s instead of putting all the extra GPUs on a 4-in-1.

root@tb85-nvidia:/home/jstateson# nvidia-smi
Wed Jul 31 16:05:55 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                Driver Version: 390.116                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
|100%   58C    P2   105W / 151W |   1512MiB /  8117MiB |     98%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 106...  Off  | 00000000:02:00.0 Off |                  N/A |
|100%   72C    P2    99W / 120W |   1300MiB /  6078MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 106...  Off  | 00000000:03:00.0 Off |                  N/A |
|100%   65C    P2    79W / 120W |   1292MiB /  3019MiB |     95%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 106...  Off  | 00000000:04:00.0 Off |                  N/A |
|100%   78C    P2    78W / 120W |   1292MiB /  3019MiB |     92%      Default |
+-------------------------------+----------------------+----------------------+
|   4  GeForce GTX 1070    Off  | 00000000:05:00.0  On |                  N/A |
|100%   49C    P2    89W / 151W |   1330MiB /  8119MiB |     91%      Default |
+-------------------------------+----------------------+----------------------+
|   5  GeForce GTX 106...  Off  | 00000000:08:00.0 Off |                  N/A |
|100%   70C    P2    55W / 120W |   1292MiB /  3019MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   6  GeForce GTX 106...  Off  | 00000000:09:00.0 Off |                  N/A |
|100%   66C    P2    90W / 120W |   1292MiB /  3019MiB |     95%      Default |
+-------------------------------+----------------------+----------------------+
|   7  GeForce GTX 106...  Off  | 00000000:0A:00.0 Off |                  N/A |
|100%   67C    P2    62W / 120W |   1292MiB /  3019MiB |     72%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1472      G   /usr/lib/xorg/Xorg                           104MiB |
|    0      1745      G   /usr/bin/gnome-shell                          92MiB |
|    0      2587      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1303MiB |
|    1      1472      G   /usr/lib/xorg/Xorg                             6MiB |
|    1      2576      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1281MiB |
|    2      1472      G   /usr/lib/xorg/Xorg                             6MiB |
|    2      2593      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    3      1472      G   /usr/lib/xorg/Xorg                             6MiB |
|    3      2619      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    4      1472      G   /usr/lib/xorg/Xorg                            15MiB |
|    4      2594      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1303MiB |
|    5      1472      G   /usr/lib/xorg/Xorg                             6MiB |
|    5      2614      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    6      1472      G   /usr/lib/xorg/Xorg                             6MiB |
|    6      2603      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
|    7      1472      G   /usr/lib/xorg/Xorg                             6MiB |
|    7      2608      C   ...x41p_V0.98b1_x86_64-pc-linux-gnu_cuda90  1273MiB |
+-----------------------------------------------------------------------------+
root@tb85-nvidia:/home/jstateson#


ID: 2005065 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2005066 - Posted: 31 Jul 2019, 21:19:34 UTC - in response to Message 2005055.  

OK, I stand corrected. I will try not to cross my wires and perpetuate any further myths. So just use the default settings in the app_info with -nobs with however many cpu cores you can scrounge up.
ID: 2005066 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2005067 - Posted: 31 Jul 2019, 21:21:37 UTC

See, Ian was correct about using the best cables available and that making all the difference in the world. Glad you got your mining/crunching rig working correctly.
ID: 2005067 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2005070 - Posted: 31 Jul 2019, 21:30:17 UTC - in response to Message 2005067.  

See, Ian was correct about using the best cables available and that making all the difference in the world. Glad you got your mining/crunching rig working correctly.



Going to pass you up (at least on the 2nd top computer page) ;>)
ID: 2005070 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2005075 - Posted: 31 Jul 2019, 21:50:24 UTC - in response to Message 2005070.  
Last modified: 31 Jul 2019, 21:57:53 UTC

See, Ian was correct about using the best cables available and that making all the difference in the world. Glad you got your mining/crunching rig working correctly.

Going to pass you up (at least on the 2nd top computer page) ;>)

All my USB kits came with the Same sized cable, None of them were small. I have a few dozen, and so far I haven't found a single bad cable that came with the kits. I have found a few of the PCIe connectors were bad though, I've had to replace just about every one of the 90 degree ones and need some more 'normal' ones.
If you want to see the difference, remove the nobs setting and see what happens. In my experience, nobs works better the more GPUs you have. When I was able to use it with 14 GPUs there was quite a difference. Just watch your CPU temps.
ID: 2005075 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2005077 - Posted: 31 Jul 2019, 21:56:19 UTC - in response to Message 2005070.  

Glad it helped. Problems with risers are almost always the bad cables.

I’ve had several of those thick blue cables go bad also, so I’d really recommend upgrading them all to better quality ones. It will save you trouble in the future.

What motherboard do you have? It must be an LGA 1150 board since you're running an E3 v3 Xeon. I'd definitely go into the BIOS and make sure all of your slots are set to the maximum speed possible. Gen 3 is preferable, Gen 2 is acceptable with a slight slowdown, but Gen 1 will slow down the cards considerably.
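
To put rough numbers on the difference between generations, here is a sketch of the theoretical per-lane bandwidth. The transfer rates and line encodings (8b/10b for Gen 1 and Gen 2, 128b/130b for Gen 3) come from the PCIe specification, not from this thread, and real throughput over a riser will be lower.

```python
# PCIe generation -> (transfer rate in GT/s, line-encoding efficiency)
PCIE = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}

def lane_bandwidth_mb_s(gen, lanes=1):
    """Theoretical one-direction bandwidth in MB/s for a link of `lanes` lanes."""
    rate_gt_s, efficiency = PCIE[gen]
    bits_per_second = rate_gt_s * 1e9 * efficiency
    return bits_per_second / 8 / 1e6 * lanes

for gen in (1, 2, 3):
    print(f"Gen {gen} x1: about {lane_bandwidth_mb_s(gen):.0f} MB/s")
```

On an x1 riser this works out to about 250 MB/s at Gen 1, 500 MB/s at Gen 2 and 985 MB/s at Gen 3, so a Gen 1 slot has roughly a quarter of the Gen 3 bandwidth.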

ID: 2005077 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2005079 - Posted: 31 Jul 2019, 22:25:42 UTC - in response to Message 2005077.  
Last modified: 31 Jul 2019, 22:33:24 UTC

Glad it helped. Problems with risers are almost always the bad cables.

I’ve had several of those thick blue cables go bad also, so I’d really recommend upgrading them all to better quality ones. It will save you trouble in the future.

What motherboard do you have? Must be an LGA 1150 board since you’re running an E3 v3 Xeon. I’d definitely go into the BIOS and make sure all of your slots are set to the max speeds possible. Gen 3 is preferable, gen 2 is acceptable with a slight slowdown, but gen 1 will slowdown the cards considerably.



The v3 is socket 1150. Most other systems are socket 1366, which is cheap compared to 1150. I was unable to find a mining motherboard with socket 1366 and did not see any 6 core CPUs for socket 1150. If I had it to do over, I should have gotten the motherboard that had all USB3 sockets, with no need for x1 risers, but it cost a lot more, the CPUs are newer, and I already have a lot of DDR3. I ruined an RX560 and an RX570 by putting the x1 risers in backwards. Those alone would have paid for the USB3 motherboard.

Temps for the v3 supposedly max out at 72. However, I could not find that figure in Intel's ARK; it was listed on an overclocker forum. ARK does not show any thermal specs other than stating that the chip has thermal monitoring. I am not sure whether that means it can throttle down.

sensors show the following at full load
coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:        +63.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:        +61.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:        +63.0°C  (high = +80.0°C, crit = +100.0°C)
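
The margin to the "high" threshold can be pulled out of that output with a short script. This is a sketch that assumes the coretemp line layout shown above; the regular expression and the `headroom` helper are illustrative, and other sensor chips format their lines differently.

```python
import re

# Output copied from the post above.
SENSORS_OUTPUT = """coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 0:        +64.0°C  (high = +80.0°C, crit = +100.0°C)
Core 1:        +63.0°C  (high = +80.0°C, crit = +100.0°C)
Core 2:        +61.0°C  (high = +80.0°C, crit = +100.0°C)
Core 3:        +63.0°C  (high = +80.0°C, crit = +100.0°C)
"""

# Matches "<label>:  +<temp>°C  (high = +<high>°C, ..." and skips other lines.
LINE = re.compile(
    r"^(?P<label>[^:]+):\s+\+(?P<temp>[\d.]+)°C\s+\(high = \+(?P<high>[\d.]+)°C"
)

def headroom(text):
    """Return {sensor_label: degrees C below the 'high' threshold}."""
    result = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            result[match["label"]] = float(match["high"]) - float(match["temp"])
    return result

for label, margin in headroom(SENSORS_OUTPUT).items():
    print(f"{label}: {margin:.1f} C below high")
```

At full load the tightest margin here is 16 C (the package and Core 0), well short of the 80 C "high" mark.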
ID: 2005079 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2005080 - Posted: 31 Jul 2019, 22:28:21 UTC - in response to Message 2005079.  

Temps look fine.

I was more asking for the motherboard make and model so I can look up the manual and look at the specs of the PCIe slots. Do you know the motherboard model?

ID: 2005080 · Report as offensive
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2005081 - Posted: 31 Jul 2019, 22:30:39 UTC - in response to Message 2005080.  
Last modified: 31 Jul 2019, 23:01:37 UTC

Temps look fine.

I was more asking for the motherboard make and model so I can look up the manual and look at the specs of the PCIe slots. Do you know the motherboard model?



Biostar TB85 but not the "BTC" model

https://www.biostar.com.tw/app/en/mb/introduction.php?S_ID=734

[EDIT] With CPU utilization in the high 90s on all 8 GPUs, I suspect adding more GPUs, even a 650 Ti, is probably going to impact the other boards. The best improvement is maybe that Gen 2 setting or possibly an overclock, but I am keeping my garage door partially open at night and fully open in the daytime, as even an open mining rack with a 36" fan still runs hot.
ID: 2005081 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2005084 - Posted: 31 Jul 2019, 22:33:38 UTC - in response to Message 2005065.  

...Notice for first time that gpu tasks were waiting for memory. I fixed that by assigning %95 for both in use and idle time. Have pair of 4 gig sticks. This system not used for anything else so probably don't need any more although I do have a GTX650ti not being used and an empty slot on the 4-in-1
I know someone who had the Exact same problem with BOINC saying 'waiting for memory', he too had 8 GB of RAM. Only he blamed the BOINC Error on the CUDA App instead of on being Low on memory. To this day he still blames the CUDA App even though No One Else had that problem with the CUDA App. I'd suggest you add more RAM if you don't want any more problems.
ID: 2005084 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.