GPU Curiosity Question

OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1702511 - Posted: 17 Jul 2015, 14:34:43 UTC

I have a dedicated Nvidia 750 TI (no monitor attached) running one task and the temperature of the GPU is 50C. According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds. There is one virtual CPU core free to support the GPU and "top" always shows the load on it at less than 15%. The question is, why would the GPU only show that it is working less than 5% of the time?

I have compared the server times using “ps ax” and the server logs to the times reported by the task page and they seem to agree. I suspect it is what it is, but it would be nice to know why it is. Perhaps the Linux algorithm for reporting the CPU time is flawed, but you would think the GPU temperature would be higher if that were the case. The reported temperature of 50C would certainly indicate that the GPU is not being overtaxed.
ID: 1702511
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1702513 - Posted: 17 Jul 2015, 14:50:41 UTC - in response to Message 1702511.  
Last modified: 17 Jul 2015, 14:52:38 UTC

I have a dedicated Nvidia 750 TI (no monitor attached) running one task and the temperature of the GPU is 50C. According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds. There is one virtual CPU core free to support the GPU and "top" always shows the load on it at less than 15%. The question is, why would the GPU only show that it is working less than 5% of the time?

I have compared the server times using “ps ax” and the server logs to the times reported by the task page and they seem to agree. I suspect it is what it is, but it would be nice to know why it is. Perhaps the Linux algorithm for reporting the CPU time is flawed, but you would think the GPU temperature would be higher if that were the case. The reported temperature of 50C would certainly indicate that the GPU is not being overtaxed

Maybe the -use_sleep function is causing a weird issue with the monitoring utility? Perhaps it is polling at an interval when things are simply not active.
I looked at a few other hosts that also have 750 Tis & the AP processing times seem very similar to yours. Some were a bit higher, so they may be running more than 1 task at a time or some other configuration that is less optimized. However, your CPU times are the lowest I found.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1702513
ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1702634 - Posted: 17 Jul 2015, 22:38:31 UTC - in response to Message 1702511.  

I have a dedicated Nvidia 750 TI (no monitor attached) running one task and the temperature of the GPU is 50C. According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds. There is one virtual CPU core free to support the GPU and "top" always shows the load on it at less than 15%. The question is, why would the GPU only show that it is working less than 5% of the time?

I have compared the server times using “ps ax” and the server logs to the times reported by the task page and they seem to agree. I suspect it is what it is, but it would be nice to know why it is. Perhaps the Linux algorithm for reporting the CPU time is flawed, but you would think the GPU temperature would be higher if that were the case. The reported temperature of 50C would certainly indicate that the GPU is not being overtaxed

Perhaps you are confusing GPU with CPU -- "top" knows nothing about GPUs, only CPUs. In fact, I know of nothing that will give you an estimate of GPU usage under Linux -- nvidia-smi will tell you the GPU temperature, the percentage of the fan speed and memory usage. But it won't tell you the things that GPUz (purports to) tell(s) you about the GPU on Windows systems.
50 C is a doddle for a 750 Ti -- it will easily run two instances, where the one or the other takes over computing in the "slack" periods that the other job is transferring data to the host PC. Set it to run two instances and watch your RAC climb a bit -- and your GPU run 10 or so (Celsius) degrees warmer. (I've got a 660 Ti in this Linux machine, it's showing 63 C running two Astropulse WUs with an ambient of about 25 C; I also have a 750 Ti in a Windows machine at work, it handles two SETI jobs with ease, though I do notice a bit of mouse lag when one of its jobs is an Astropulse WU).
ID: 1702634
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1702673 - Posted: 18 Jul 2015, 2:01:58 UTC - in response to Message 1702513.  

I have a dedicated Nvidia 750 TI (no monitor attached) running one task and the temperature of the GPU is 50C. According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds. There is one virtual CPU core free to support the GPU and "top" always shows the load on it at less than 15%. The question is, why would the GPU only show that it is working less than 5% of the time?

I have compared the server times using “ps ax” and the server logs to the times reported by the task page and they seem to agree. I suspect it is what it is, but it would be nice to know why it is. Perhaps the Linux algorithm for reporting the CPU time is flawed, but you would think the GPU temperature would be higher if that were the case. The reported temperature of 50C would certainly indicate that the GPU is not being overtaxed

Maybe the -use_sleep function is causing a weird issue with the monitoring utility? Perhaps it is polling at an interval when things are simply not active.
I looked at a few other hosts that also have 750 Tis & the AP processing times seem very similar to yours. Some were a bit higher, so they may be running more than 1 task at a time or some other configuration that is less optimized. However, your CPU times are the lowest I found.


Just for grins, I tried using “-no_use_sleep” and “top” indicated that the GPU application's load on the free virtual core went from less than 15% to 100%. In addition, three of the remaining seven virtual cores that normally run AP tasks at 99-100% were shown to be at a nominal 70%. Unfortunately the reported temperature of the GPU stayed about the same, so I think I only hurt my CPU performance on APs without offering any improvement in the GPU's performance. Guess I will stick with "-use_sleep" and live with the low CPU (GPU) times relative to the run times.
ID: 1702673
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1702675 - Posted: 18 Jul 2015, 2:14:32 UTC - in response to Message 1702634.  

I have a dedicated Nvidia 750 TI (no monitor attached) running one task and the temperature of the GPU is 50C. According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds. There is one virtual CPU core free to support the GPU and "top" always shows the load on it at less than 15%. The question is, why would the GPU only show that it is working less than 5% of the time?

I have compared the server times using “ps ax” and the server logs to the times reported by the task page and they seem to agree. I suspect it is what it is, but it would be nice to know why it is. Perhaps the Linux algorithm for reporting the CPU time is flawed, but you would think the GPU temperature would be higher if that were the case. The reported temperature of 50C would certainly indicate that the GPU is not being overtaxed


Perhaps you are confusing GPU with CPU -- "top" knows nothing about GPUs, only CPUs. In fact, I know of nothing that will give you an estimate of GPU usage under Linux -- nvidia-smi will tell you the GPU temperature, the percentage of the fan speed and memory usage. But it won't tell you the things that GPUz (purports to) tell(s) you about the GPU on Windows systems.


I was not basing the utilization of the GPU on “top”. I used “top” only to judge the degree to which the AP application was utilizing the free core which I indicated at no more than 15%. I did that only to show that I did not think the lack of CPU cycles in support of the GPU was slowing down the GPU.

I used “ps -ax” to list the AP application and show how much time it was actually running. In the example below it shows 69 seconds, and when a WU completes, the last time “ps ax” displays matches the CPU (GPU) time shown on the SETI Results Page, which is nominally 80 seconds.

12286 RNl 1:09. ./../projects/setiathome.berkeley.edu/astropulse_7.08…….

The actual elapsed real time is typically 2100 seconds which is what is shown in the Results Page as Run Time.

I was basing the low usage on the disparity between the reported “CPU [GPU] time” and the “Run Time” which seems to indicate that the GPU is only working less than 5% of the time. That is probably not the case and I was just curious as to why that might be. My apologies for not being clearer in my first post.

50 C is a doddle for a 750 Ti -- it will easily run two instances, where the one or the other takes over computing in the "slack" periods that the other job is transferring data to the host PC. Set it to run two instances and watch your RAC climb a bit -- and your GPU run 10 or so (Celsius) degrees warmer. (I've got a 660 Ti in this Linux machine, it's showing 63 C running two Astropulse WUs with an ambient of about 25 C; I also have a 750 Ti in a Windows machine at work, it handles two SETI jobs with ease, though I do notice a bit of mouse lag when one of its jobs is an Astropulse WU).


My results have been different. If I try to run two tasks, the tasks take twice as long and if the temperature of the GPU is any indication, I suspect I actually lose ground a bit since the temperature reported by “nvidia-smi” is about 2 degrees cooler than when running one WU.
ID: 1702675
Gene Project Donor
Joined: 26 Apr 99
Posts: 150
Credit: 48,393,279
RAC: 118
United States
Message 1702737 - Posted: 18 Jul 2015, 5:52:59 UTC

#dsh:
I'm using Nvidia driver 344.22 in Linux-64. I run "nvidia-settings" which is part of the driver package utilities to get GPU % utilization. When the nvidia-settings window opens, click on GPU0 to see lots of info including utilization and GPU memory used, etc.
Even though you have a different driver version I would be surprised if it did not have the "settings" utility.

Gene;
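
On a machine without a display, a command-line query might be an alternative to the settings window. The sketch below is only an illustration: it assumes the installed driver's nvidia-smi accepts the --query-gpu option and that the card reports these fields (some GeForce cards return "[Not Supported]" for some of them, which the parser turns into None). The helper names parse_line and query_gpus are made up for this example.

```python
import subprocess

# Fields requested from nvidia-smi; support varies by card and driver (assumption).
QUERY = "utilization.gpu,temperature.gpu,memory.used"


def parse_line(line):
    """Parse one CSV line of nvidia-smi output into a dict.

    Values that are not numeric (for example "[Not Supported]") become None.
    """
    keys = QUERY.split(",")
    values = [v.strip() for v in line.split(",")]
    result = {}
    for key, value in zip(keys, values):
        try:
            result[key] = float(value)
        except ValueError:
            result[key] = None
    return result


def query_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_line(line) for line in out.splitlines() if line.strip()]


if __name__ == "__main__":
    for index, gpu in enumerate(query_gpus()):
        print(f"GPU{index}: {gpu}")
```

Because this needs no X server, it should also work over a plain SSH session, provided the driver is loaded.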
ID: 1702737
castor
Joined: 2 Jan 02
Posts: 13
Credit: 17,721,708
RAC: 0
Finland
Message 1702785 - Posted: 18 Jul 2015, 11:36:06 UTC - in response to Message 1702511.  
Last modified: 18 Jul 2015, 11:47:44 UTC

According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds.

What you take as "CPU (GPU) Time" is actually CPU time only.
The real "GPU time" will be (close to) that shown as "Run Time".

E: I get temps around 58C on one of my 750Tis under 99% GPU load. Perhaps yours just has a very efficient cooler.
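
To put numbers on that, here is a minimal Python sketch using the figures from the first post (about 80 seconds of CPU time over a run of about 2100 seconds). The function name cpu_fraction is only for illustration; it is not part of BOINC or the SETI task page.

```python
def cpu_fraction(cpu_seconds: float, run_seconds: float) -> float:
    """Fraction of the wall-clock run time during which the host CPU was busy."""
    if run_seconds <= 0:
        raise ValueError("run_seconds must be positive")
    return cpu_seconds / run_seconds


# Figures quoted in the first post: ~80 s CPU time, ~2100 s run time.
fraction = cpu_fraction(80, 2100)
print(f"CPU busy for {fraction:.1%} of the run")  # prints "CPU busy for 3.8% of the run"
```

That is where the "less than 5%" comes from. It describes how busy the host CPU was on behalf of the task, and says nothing about how busy the GPU was.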
ID: 1702785
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1702934 - Posted: 18 Jul 2015, 22:55:41 UTC - in response to Message 1702785.  
Last modified: 18 Jul 2015, 22:59:09 UTC

E: I get temps around 58C on one of my 750Tis under 99% GPU load. Perhaps yours just has a very efficient cooler.

Depends on the cooler & ambient temperature.

My Gigabyte GTX 750Tis generally sit at around 60-65°C with around 90-95% load on them with the ambient temperatures around 36°C.
At present (coldest days so far this year) ambient temperature is down to 22°C, card temperatures are in the mid 50s, but fan speed is down to 65% of maximum. Usually the fans are at around 75-80%.
Grant
Darwin NT
ID: 1702934
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1702953 - Posted: 19 Jul 2015, 2:52:44 UTC - in response to Message 1702737.  

#dsh:
I'm using Nvidia driver 344.22 in Linux-64. I run "nvidia-settings" which is part of the driver package utilities to get GPU % utilization. When the nvidia-settings window opens, click on GPU0 to see lots of info including utilization and GPU memory used, etc.
Even though you have a different driver version I would be surprised if it did not have the "settings" utility.

Gene;


This is on a headless server, but I managed to install "Xming" on the local PC so it would run and display "xclock" and "firefox" using "Putty". When I run "nvidia-settings", however, it cannot find the driver, and if I close the error message I can click on the various tabs but I do not see anything about GPU % utilization. It looks like it is only for specifying "nvidia settings Configuration" and "Application Profiles". Based on that I have decided not to put any more time into it, but I thank you for the thought. It is appreciated.
ID: 1702953
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1702956 - Posted: 19 Jul 2015, 3:13:34 UTC - in response to Message 1702785.  

According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds.

What you take as "CPU (GPU) Time" is actually CPU time only.
The real "GPU time" will be (close to) that shown as "Run Time".

E: I get temps around 58C on one of my 750Tis under 99% GPU load. Perhaps yours just has a very efficient cooler.



After giving it some more thought, you are absolutely right. "ps ax" is still reporting what the application is using, the same as top, and the only time the times match is when the application is using 100% of the available clock cycles. That may have been what Ivan was trying to tell me but it did not sink in because he was referencing "top" and I was focused on "ps ax". What can I say. I just messed up.
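
For anyone else who trips over this, here is a small Python sketch of why the two numbers differ. It is only an analogy for what a sleep-based wait does versus a busy wait, not the actual Astropulse code: a process that sleeps while it waits piles up wall-clock time but almost no CPU time, while one that spins uses a full core. The function names are invented for the example.

```python
import time


def measure(wait_fn, duration=0.5):
    """Return (elapsed wall seconds, CPU seconds) for one call of wait_fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    wait_fn(duration)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def sleep_wait(duration):
    # The process is descheduled while sleeping, so almost no CPU time accumulates.
    time.sleep(duration)


def busy_wait(duration):
    # Spin until the deadline; the process stays on the CPU the whole time.
    deadline = time.perf_counter() + duration
    while time.perf_counter() < deadline:
        pass


if __name__ == "__main__":
    for name, fn in (("sleep", sleep_wait), ("busy", busy_wait)):
        wall, cpu = measure(fn)
        print(f"{name:5s} wall={wall:.2f}s cpu={cpu:.2f}s")
```

The tools "top" and "ps" report the second number, which is why it only matches the run time when the application keeps a core fully busy.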
ID: 1702956
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1702961 - Posted: 19 Jul 2015, 3:37:50 UTC - in response to Message 1702934.  

E: I get temps around 58C on one of my 750Tis under 99% GPU load. Perhaps yours just has a very efficient cooler.

Depends on the cooler & ambient temperature.

My Gigabyte GTX 750Tis generally sit at around 60-65°C with around 90-95% load on them with the ambient temperatures around 36°C.
At present (coldest days so far this year) ambient temperature is down to 22°C, card temperatures are in the mid 50s, but fan speed is down to 65% of maximum. Usually the fans are at around 75-80%.


My ambient temperature is currently at 27C and the 750 TI is at 51C. That is a lot less heat generated than yours and Castor's, but I have no idea what I can attribute that to. Perhaps I will try measuring the input power to the PC with the card working on a WU and when it is not. The difference should be the power used by the 750 plus whatever losses there are in the PS to power the 750. It would be interesting to see what that is in relation to the values indicated by Nvidia and those reported by the tests at "Tom's Hardware" and others when the card is maxed out.
ID: 1702961
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1702964 - Posted: 19 Jul 2015, 3:51:10 UTC - in response to Message 1702675.  

I have a dedicated Nvidia 750 TI (no monitor attached) running one task and the temperature of the GPU is 50C. According to the SETI Task Page a typical AP Run Time of 2100 seconds results in a CPU (GPU) Time of around 80 seconds. There is one virtual CPU core free to support the GPU and "top" always shows the load on it at less than 15%. The question is, why would the GPU only show that it is working less than 5% of the time?

I have compared the server times using “ps ax” and the server logs to the times reported by the task page and they seem to agree. I suspect it is what it is, but it would be nice to know why it is. Perhaps the Linux algorithm for reporting the CPU time is flawed, but you would think the GPU temperature would be higher if that were the case. The reported temperature of 50C would certainly indicate that the GPU is not being overtaxed


Perhaps you are confusing GPU with CPU -- "top" knows nothing about GPUs, only CPUs. In fact, I know of nothing that will give you an estimate of GPU usage under Linux -- nvidia-smi will tell you the GPU temperature, the percentage of the fan speed and memory usage. But it won't tell you the things that GPUz (purports to) tell(s) you about the GPU on Windows systems.


I was not basing the utilization of the GPU on “top”. I used “top” only to judge the degree to which the AP application was utilizing the free core which I indicated at no more than 15%. I did that only to show that I did not think the lack of CPU cycles in support of the GPU was slowing down the GPU.

I used “ps -ax” to list the AP application and show how much time it was actually running. In the example below it shows 69 seconds, and when a WU completes, the last time “ps ax” displays matches the CPU (GPU) time shown on the SETI Results Page, which is nominally 80 seconds.

12286 RNl 1:09. ./../projects/setiathome.berkeley.edu/astropulse_7.08…….

The actual elapsed real time is typically 2100 seconds which is what is shown in the Results Page as Run Time.

I was basing the low usage on the disparity between the reported “CPU [GPU] time” and the “Run Time” which seems to indicate that the GPU is only working less than 5% of the time. That is probably not the case and I was just curious as to why that might be. My apologies for not being clearer in my first post.

Edit: To anyone reading the above, it is in error and Ivan is absolutely correct. I realized "top" was reporting the CPU time used by the GPU app, but I was not making the connection that the same was true for the time reported by "ps -ax". My mistake. Thanks to Ivan and Castor for pointing out the error of my ways.


50 C is a doddle for a 750 Ti -- it will easily run two instances, where the one or the other takes over computing in the "slack" periods that the other job is transferring data to the host PC. Set it to run two instances and watch your RAC climb a bit -- and your GPU run 10 or so (Celsius) degrees warmer. (I've got a 660 Ti in this Linux machine, it's showing 63 C running two Astropulse WUs with an ambient of about 25 C; I also have a 750 Ti in a Windows machine at work, it handles two SETI jobs with ease, though I do notice a bit of mouse lag when one of its jobs is an Astropulse WU).


My results have been different. If I try to run two tasks, the tasks take twice as long and if the temperature of the GPU is any indication, I suspect I actually lose ground a bit since the temperature reported by “nvidia-smi” is about 2 degrees cooler than when running one WU.
ID: 1702964
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1702990 - Posted: 19 Jul 2015, 6:03:12 UTC - in response to Message 1702961.  

My ambient temperature is currently at 27C and the 750 TI is at 51C. That is a lot less heat generated than yours and Castor's, but I have no idea what I can attribute that to.


For reference,
Core clock 1163MHz
Memory clock 1350MHz
Power consumption <70% of TDP max.
VDDC 1.2V

My second card is actually clocked slightly higher at 1176MHz, power consumption is slightly less, as VDDC is 1.15V, but its temperature is considerably lower. I attribute this to its location in the system: the cooler one is near the bottom of the motherboard, while the hotter one has its cooling fans up against the back of the cooler one.
Grant
Darwin NT
ID: 1702990

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.