LotzaCores and a GTX 1080 FTW

Richard Haselgrove
Message 1793784 - Posted: 5 Jun 2016, 22:32:29 UTC

The BOINC client locally on your computer will identify each card separately and list them and their distinct characteristics in the event log at startup.

But the BOINC database on the server will summarise them as multiple copies of the 'best' card, losing the differences.

archae86
Message 1793785 - Posted: 5 Jun 2016, 22:32:47 UTC - in response to Message 1793783.  

for whatever reason, BOINC can't seem to identify more than one video card correctly.

Not really true, especially for GPUs from one vendor.
(i.e. if all are NVIDIA they use the same NVIDIA driver)

I agree with the "not identify" assertion if one modifies it to "not report to certain pages". It is my experience and observation that if a system has Nvidia GPUs of more than one model, the system summary data posted on the web site's computers list and computer detail pages will name just one of the models, with a number giving the correct total count of cards.

Somewhere I saw an assertion that if cards of differing CUDA capability levels are present, the one mentioned will have the highest CUDA capability level present. I don't know how it chooses among cards of the same level.

But "not identify" is a bit too broad, as in such mixed systems of Nvidia cards a review of the stderr for a task will show the actual card used, at least over at Einstein.

Richard Haselgrove
Message 1793789 - Posted: 5 Jun 2016, 22:55:16 UTC - in response to Message 1793785.  

Somewhere I saw an assertion that if cards of differing CUDA capability levels are present, the one mentioned will have the highest CUDA capability level present. I don't know how it chooses among cards of the same level.

All cards will be itemised in the Event Log, but 'lower' cards will be marked as (not used) unless the option to use all cards is set in cc_config.xml.

The comparison for NVidia cards depends on:

// factors (decreasing priority):
// - compute capability
// - software version
// - available memory
// - speed
//
// If "loose", ignore FLOPS and tolerate small memory diff

Al
Message 1793793 - Posted: 5 Jun 2016, 23:02:30 UTC - in response to Message 1793784.  

The BOINC client locally on your computer will identify each card separately and list them and their distinct characteristics in the event log at startup.

But the BOINC database on the server will summarise them as multiple copies of the 'best' card, losing the differences.

Yes, this. No one else looking at my info on the server summary page can see the entire roster of hardware in the system, just one vid card and one CPU. It probably isn't really important to display it all, otherwise they would have corrected that years ago.

Al
Message 1793794 - Posted: 5 Jun 2016, 23:04:24 UTC - in response to Message 1793789.  
Last modified: 5 Jun 2016, 23:07:22 UTC

That had been set, all are working, running 2 tasks each. Now if I could just reserve 3 out of 8 cores for GPU support, I'd be golden.

Grant (SSSF)
Message 1793802 - Posted: 5 Jun 2016, 23:21:05 UTC - in response to Message 1793794.  
Last modified: 5 Jun 2016, 23:21:46 UTC

That had been set, all are working, running 2 tasks each. Now if I could just reserve 3 out of 8 cores for GPU support, I'd be golden.


My app_config file
<app_config>
 <app>
  <name>setiathome_v8</name>
  <gpu_versions>
   <gpu_usage>0.50</gpu_usage>
   <cpu_usage>1.00</cpu_usage>
  </gpu_versions>
 </app>
</app_config>


That runs 2 WUs at a time & reserves 1 CPU core for each WU. Setting CPU usage to 0.50 would reserve 1 core for every 2 WUs; a sketch for reserving exactly 3 of 8 cores follows below.
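
With four cards each running two tasks there are 8 GPU WUs at once, so extending Grant's arithmetic, a cpu_usage of 0.375 would reserve 8 × 0.375 = 3 cores. That value is an extrapolation, not something tested in this thread:

<app_config>
 <app>
  <name>setiathome_v8</name>
  <gpu_versions>
   <gpu_usage>0.50</gpu_usage>
   <cpu_usage>0.375</cpu_usage>
  </gpu_versions>
 </app>
</app_config>
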
Grant
Darwin NT

Al
Message 1793807 - Posted: 5 Jun 2016, 23:37:14 UTC - in response to Message 1793802.  

So, app_config, not app_info? I've been messing around this whole time in the app_info, changing those settings. Shouldn't I be modifying that one? Ruh roh, raggy... Taking a quick look in both dirs, it appears I don't have that file currently, so I'd have to create it? If I do, what do I set the app_info back to, 0.04 CPUs again? Thanks!

Zalster
Message 1793808 - Posted: 5 Jun 2016, 23:38:29 UTC - in response to Message 1793807.  

App_config will override any setting in app_info

Grant (SSSF)
Message 1793810 - Posted: 5 Jun 2016, 23:51:19 UTC - in response to Message 1793807.  

So, app_config, not app_info?

The problem with app_info is that if you make a mistake you can trash all your work & cache; you also have to modify every GPU entry in it, and you have to exit & restart BOINC for the changes to take effect.

Mess up app_config & things may not work, but you won't lose your cache, you only need to make the one entry, and you don't have to exit & restart BOINC for the changes to take effect; just Options, Read config files.
Grant
Darwin NT

Al
Message 1793831 - Posted: 6 Jun 2016, 2:34:56 UTC - in response to Message 1793810.  

Hmm, ok, that is good to know. I usually just use find and replace in Notepad; it makes the changes almost foolproof, though it is still weird that it doesn't seem to have an effect on this computer, regardless of what value I put in there. I will try creating an app_config, I presume it goes in the same place? And I can leave the app_info file as it is, since BOINC will ignore the settings that app_config overrides?

Zalster
Message 1793832 - Posted: 6 Jun 2016, 2:38:41 UTC - in response to Message 1793831.  

The app_info provides the information that the apps need to run the work.

The app_config fine-tunes those apps.

So you need both.
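
For illustration, here is a skeleton of an anonymous-platform app_info.xml entry. The executable name, version number and plan class below are placeholders rather than Al's actual files, and avg_ncpus is the 0.04 CPUs figure Al mentioned earlier:

<app_info>
 <app>
  <name>setiathome_v8</name>
 </app>
 <file_info>
  <name>setiathome_v8_gpu.exe</name> <!-- placeholder executable name -->
  <executable/>
 </file_info>
 <app_version>
  <app_name>setiathome_v8</app_name>
  <version_num>800</version_num> <!-- placeholder version number -->
  <plan_class>cuda50</plan_class> <!-- placeholder plan class -->
  <avg_ncpus>0.04</avg_ncpus>
  <coproc>
   <type>NVIDIA</type>
   <count>0.5</count> <!-- 2 WUs per GPU, as in the app_config above -->
  </coproc>
  <file_ref>
   <file_name>setiathome_v8_gpu.exe</file_name>
   <main_program/>
  </file_ref>
 </app_version>
</app_info>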

Al
Message 1793834 - Posted: 6 Jun 2016, 2:52:32 UTC - in response to Message 1793832.  
Last modified: 6 Jun 2016, 3:22:41 UTC

Well, that was interesting. I copied and pasted your text above and created the file, then I restarted BOINC; it took maybe 20-30 seconds to start, and then said that my GTX 950 couldn't be ejected. I thought ok, that's good, nobody wants my video card ejected from the computer, that would be a little extreme. I tweaked it to .5 on both, and it then suspended 2 of the 8 GPU tasks that were running, but I still have all 8 cores running. I renamed the file to prevent it from being read on the next startup, and now my 4th card is no longer recognized: I have 3 cards running 2 tasks each, and one task that used to be running is now saying waiting to run, waiting for a GPU to open up it appears. So apparently it somehow did eject my 950, at least in its mind; my question is how do I get it back? Sheeesh, such drama...

*edit* Just for fun, I restarted my computer, just to start everything fresh, looked in Precision 16, and it lists only the 3 cards. Here is my logfile; I'm really not sure what is going on here, but for whatever reason it seems like it's having a hardware issue:

6/5/2016 9:55:14 PM |  | Starting BOINC client version 7.6.22 for windows_x86_64
6/5/2016 9:55:14 PM |  | log flags: file_xfer, sched_ops, task
6/5/2016 9:55:14 PM |  | Libraries: libcurl/7.45.0 OpenSSL/1.0.2d zlib/1.2.8
6/5/2016 9:55:14 PM |  | Data directory: C:\ProgramData\BOINC
6/5/2016 9:55:14 PM |  | Running under account Flash
6/5/2016 9:55:14 PM |  | CUDA: NVIDIA GPU 0: GeForce GTX 950 (driver version 368.22, CUDA version 8.0, compute capability 5.2, 2048MB, 1940MB available, 2158 GFLOPS peak)
6/5/2016 9:55:14 PM |  | CUDA: NVIDIA GPU 1: GeForce GTX 670 (driver version 368.22, CUDA version 8.0, compute capability 3.0, 2048MB, 1959MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM |  | CUDA: NVIDIA GPU 2: GeForce GTX 670 (driver version 368.22, CUDA version 8.0, compute capability 3.0, 2048MB, 1951MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM |  | OpenCL: NVIDIA GPU 0: GeForce GTX 950 (driver version 368.22, device version OpenCL 1.2 CUDA, 2048MB, 1940MB available, 2158 GFLOPS peak)
6/5/2016 9:55:14 PM |  | OpenCL: NVIDIA GPU 1: GeForce GTX 670 (driver version 368.22, device version OpenCL 1.2 CUDA, 2048MB, 1959MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM |  | OpenCL: NVIDIA GPU 2: GeForce GTX 670 (driver version 368.22, device version OpenCL 1.2 CUDA, 2048MB, 1951MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM | SETI@home | Found app_info.xml; using anonymous platform
6/5/2016 9:55:14 PM |  | Host name: FlashFlyer
6/5/2016 9:55:14 PM |  | Processor: 8 GenuineIntel        Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz [Family 6 Model 58 Stepping 9]
6/5/2016 9:55:14 PM |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes f16c rdrandsyscall nx lm avx vmx tm2 pbe fsgsbase smep
6/5/2016 9:55:14 PM |  | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
6/5/2016 9:55:14 PM |  | Memory: 31.95 GB physical, 63.89 GB virtual
6/5/2016 9:55:14 PM |  | Disk: 223.47 GB total, 137.07 GB free
6/5/2016 9:55:14 PM |  | Local time is UTC -5 hours
6/5/2016 9:55:14 PM |  | Config: event log limit disabled
6/5/2016 9:55:14 PM |  | Config: use all coprocessors
6/5/2016 9:55:14 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8017700; resource share 100
6/5/2016 9:55:14 PM | SETI@home | General prefs: from SETI@home (last modified 03-Apr-2013 23:59:56)
6/5/2016 9:55:14 PM | SETI@home | Computer location: home
6/5/2016 9:55:14 PM | SETI@home | General prefs: no separate prefs for home; using your defaults
6/5/2016 9:55:14 PM |  | Reading preferences override file
6/5/2016 9:55:14 PM |  | Preferences:
6/5/2016 9:55:14 PM |  | max memory usage when active: 16357.26MB
6/5/2016 9:55:14 PM |  | max memory usage when idle: 31078.80MB
6/5/2016 9:55:14 PM |  | max disk usage: 100.00GB
6/5/2016 9:55:14 PM |  | (to change preferences, visit a project web site or select Preferences in the Manager)


Zalster
Message 1793838 - Posted: 6 Jun 2016, 3:23:22 UTC - in response to Message 1793834.  

Al,

You have a cc_config.xml also telling it to use all the GPUs?

Can't remember if you said if you did or not...

Grant (SSSF)
Message 1793841 - Posted: 6 Jun 2016, 3:29:09 UTC - in response to Message 1793834.  

<app_config>
 <app>
  <name>setiathome_v8</name>
  <gpu_versions>
   <gpu_usage>0.50</gpu_usage>
   <cpu_usage>1.00</cpu_usage>
  </gpu_versions>
 </app>
</app_config>

As long as it's in the /projects/setiathome.berkeley.edu folder and is named app_config.xml then BOINC should be able to read it & use it. (Given the data directory in Al's log, that's C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_config.xml.)


Easy way to check whether it is being used or not:

1) Change <gpu_usage>0.50</gpu_usage> to <gpu_usage>1.00</gpu_usage> (or <gpu_usage>0.33</gpu_usage>), then Options, Read config files. That should result in 1 WU per card (or 3).

2) If that works, change it back to <gpu_usage>0.50</gpu_usage>, make <cpu_usage>2.00</cpu_usage> and re-read the config files; that should result in 2 CPU cores no longer crunching CPU work units for every 1 GPU WU running.

3) Then set <cpu_usage>0.50</cpu_usage> and re-read the config files; you should have 1 CPU core reserved for every 2 GPU WUs running.


If nothing happens then it's the wrong name, wrong location or something in the file got mucked up.
Grant
Darwin NT

Grant (SSSF)
Message 1793847 - Posted: 6 Jun 2016, 3:44:19 UTC - in response to Message 1793838.  
Last modified: 6 Jun 2016, 3:45:11 UTC

Al,

You have a cc_config.xml also telling it to use all the GPUs?

Can't remember if you said if you did or not...


For Al,

From the BOINC reference:

<use_all_gpus>0|1</use_all_gpus>: if 1, use all GPUs (otherwise only the most capable ones are used).

So it would be

<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>


in a file named cc_config.xml, located in the C:\ProgramData\BOINC folder.
Grant
Darwin NT

Al
Message 1793913 - Posted: 6 Jun 2016, 10:19:10 UTC - in response to Message 1793838.  

Zalster, yes, I have that line added, and it is running all the GPUs, well, all but one. It appears it may have been an unfortunate coincidence that the card flaked out just as I was testing that config file; looking at Device Manager, it is disabled in Windows, possibly due to a resource conflict. I am going to have to dig into that one a little more, as the card does show up, it's just disabled. I did do a search about ejecting cards, and found a post on Reddit about the Nvidia video driver issue from about 4 months or so ago, basically saying it was a bug in that version: they had added support for external Thunderbolt 3 GPUs in that update, and for some reason it shows the eject option for all GPUs, not just GPUs connected through Thunderbolt. As there have been a couple of releases since then, I'm not sure what to make of it, but I am letting my cache run down on this machine and may just end up reloading it, as there comes a point where it just doesn't make sense to futz with it, chasing tails as such. It may fix it, possibly not, but if after a few tries (with things like driver reinstall/regression, etc.) I don't come up with a good solution, it's Fdisk time for this bad boy.

Al
Message 1793914 - Posted: 6 Jun 2016, 10:22:39 UTC - in response to Message 1793847.  

Thanks Grant, this is the cc_config I use on all of my machines, whether they have one or 4+ GPUs in them, as I don't believe it will harm anything having it in there with just one card?

<cc_config> 
 <options> 
  <use_all_gpus>1</use_all_gpus>
  <save_stats_days>10000</save_stats_days> 
  <max_event_log_lines>0</max_event_log_lines>
  <max_stdout_file_size>40</max_stdout_file_size>
 </options>
</cc_config>


Richard Haselgrove
Message 1793916 - Posted: 6 Jun 2016, 10:25:11 UTC - in response to Message 1793914.  
Last modified: 6 Jun 2016, 10:29:35 UTC

... I don't believe it will harm anything having it in there with just one card?

Correct. No problem with that.

Edit - not sure <max_event_log_lines> 0 is such a good idea, though. The Event Log is your friend.

Al
Message 1793921 - Posted: 6 Jun 2016, 10:49:26 UTC - in response to Message 1793916.  

Richard, I was told that setting it to 0 disables the limit, so it will continue to grow and not overwrite itself. I completely agree with you, it is def your friend; certainly wouldn't want it to not work! :-)

Richard Haselgrove
Message 1793926 - Posted: 6 Jun 2016, 11:22:01 UTC - in response to Message 1793921.  

I think 'max lines' controls the number of current lines (since startup) sent by the client for live display in BOINC Manager - that can get sluggish if you go a long way beyond the 2,000 (IIRC) line default.

You might be more interested in enlarging the permanent disk archive:

<max_stdout_file_size>N</max_stdout_file_size>
Specify the maximum size of the standard out log file (stdoutdae.txt); default is 2 MB.

Two tips:

1) use the "Event Log options..." picker (Ctrl+Shift+F) once - and it creates a fully-populated cc_config.xml file, with all possible tags listed and current/default values.

2) Bookmark http://boinc.berkeley.edu/wiki/Client_configuration to find out what they all mean.

(yes, the default Event Log display limit is 2,000 lines, and 0 removes the limit - it just falls over when you run out of memory)
(the value of N in the suggestion above is bytes, not MB)
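
Putting those notes together, a sketch of what Al's file might look like with the display limit left at its default and the stdout archive enlarged to 40 MB expressed in bytes (41943040 is just 40 × 1024 × 1024, an illustration rather than a tested value):

<cc_config>
 <options>
  <use_all_gpus>1</use_all_gpus>
  <save_stats_days>10000</save_stats_days>
  <max_event_log_lines>2000</max_event_log_lines> <!-- the default; 0 removes the limit -->
  <max_stdout_file_size>41943040</max_stdout_file_size> <!-- 40 MB, given in bytes -->
 </options>
</cc_config>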