LotzaCores and a GTX 1080 FTW
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
The BOINC client locally on your computer will identify each card separately and list them and their distinct characteristics in the event log at startup. But the BOINC database on the server will summarise them as multiple copies of the 'best' card, losing the differences. |
archae86 Send message Joined: 31 Aug 99 Posts: 909 Credit: 1,582,816 RAC: 0 |
"for whatever reason, BOINC can't seem to identify more than one video card correctly."

I agree with the "not identify" assertion if one modifies it to "not report to certain pages". In my experience, if Nvidia cards of more than one model are mounted in a system, the system summary posted on the web site's computers list and computer detail pages will name just one of the models, with a number giving the correct total count of cards. Somewhere I saw an assertion that if cards of differing CUDA capability levels are present, the one mentioned will have the highest CUDA capability level present. I don't know how it chooses among cards of the same level. But "not identify" is a bit too broad, as in such mixed systems of Nvidia cards a review of the stderr for a task will show the actual card used, at least over at Einstein. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
"Somewhere I saw an assertion that if cards of differing CUDA capability levels are present, the one mentioned will have the highest CUDA capability level present. I don't know how it chooses among cards of the same level."

All cards will be itemised in the Event Log, but 'lower' cards will be marked as (not used) unless the option to use all cards is set in cc_config.xml. The comparison for NVidia cards depends on:

// factors (decreasing priority):
// - compute capability
// - software version
// - available memory
// - speed
//
// If "loose", ignore FLOPS and tolerate small memory diff
 |
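[Editor's aside] The priority list quoted above can be sketched as an ordered comparison. This is a hypothetical illustration in Python, not the real BOINC code (which is C++ inside the client); the field names and the "loose" memory granularity are assumptions for the example.

```python
# Sketch of how a "most capable GPU" choice could follow the quoted
# priority list: compute capability, then software (driver) version,
# then available memory, then speed. Field names are illustrative.

def best_gpu(gpus, loose=False):
    """Return the card such a comparison would report as 'best'."""
    def key(gpu):
        if loose:
            # "Loose" per the quoted comment: ignore FLOPS and
            # tolerate small memory differences (here, compare
            # memory only in coarse 256 MB steps - an assumption).
            return (gpu["compute_capability"], gpu["driver_version"],
                    gpu["available_mem_mb"] // 256)
        return (gpu["compute_capability"], gpu["driver_version"],
                gpu["available_mem_mb"], gpu["peak_gflops"])
    return max(gpus, key=key)

# A mixed rig like the one discussed later in this thread: a GTX 950
# (compute capability 5.2) and a GTX 670 (3.0). Compute capability
# dominates, so the 950 "wins" despite the 670's higher peak FLOPS.
cards = [
    {"name": "GTX 950", "compute_capability": (5, 2),
     "driver_version": 36822, "available_mem_mb": 1940, "peak_gflops": 2158},
    {"name": "GTX 670", "compute_capability": (3, 0),
     "driver_version": 36822, "available_mem_mb": 1959, "peak_gflops": 2915},
]
print(best_gpu(cards)["name"])   # GTX 950
```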
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
"The BOINC client locally on your computer will identify each card separately and list them and their distinct characteristics in the event log at startup."

Yes, this. No one else looking at my info on the server summary page can see the entire roster of hardware in the system, just one vid card and one CPU. Probably isn't really important to display it all, otherwise they would have corrected that years ago. |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
That had been set, all are working, running 2 tasks each. Now if I could just reserve 3 out of 8 cores for GPU support, I'd be golden. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13770 Credit: 208,696,464 RAC: 304 |
"That had been set, all are working, running 2 tasks each. Now if I could just reserve 3 out of 8 cores for GPU support, I'd be golden."

My app_config file:

<app_config>
   <app>
      <name>setiathome_v8</name>
      <gpu_versions>
         <gpu_usage>0.50</gpu_usage>
         <cpu_usage>1.00</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

That runs 2 WUs at a time & reserves 1 CPU core for each WU. Setting CPU usage to 0.50 would reserve 1 core for every 2 WUs. Grant Darwin NT |
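[Editor's aside] The arithmetic behind those two tags is simple: each task claims `<gpu_usage>` of a GPU and `<cpu_usage>` of a CPU core. A small illustrative helper (not part of BOINC; the function name is invented for this sketch):

```python
# Sketch of the app_config.xml concurrency arithmetic:
# tasks per GPU = 1 / gpu_usage; cores reserved = tasks * cpu_usage.

def concurrency(n_gpus, gpu_usage, cpu_usage):
    """Concurrent GPU tasks and CPU cores reserved for them."""
    tasks = int(n_gpus / gpu_usage)      # tasks running at once
    cores_reserved = tasks * cpu_usage   # CPU cores set aside
    return tasks, cores_reserved

# Grant's file on a 4-GPU box: 0.50 GPU + 1.00 CPU per task
# -> 8 concurrent GPU tasks, 8 cores reserved.
print(concurrency(4, 0.50, 1.00))   # (8, 8.0)

# With <cpu_usage>0.50</cpu_usage> instead: 1 core per 2 GPU WUs.
print(concurrency(4, 0.50, 0.50))   # (8, 4.0)
```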
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
So, app_config, not app_info? I've been messing around this whole time in the app_info, changing those settings. Shouldn't I be modifying that one? Ruh roh, raggy... Taking a quick look in both dirs, it appears I don't have that file currently, so I'd have to create it? If I do, what do I set the app_info back to, .04 CPU's again? Thanks! |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
App_config will override any setting in app_info |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13770 Credit: 208,696,464 RAC: 304 |
So, app_config, not app_info? The problem with app_info is if you make a mistake you can trash all your work & cache, and you have to modify every GPU entry in it, and you have to exit & restart BOINC for the changes to take effect. Mess up app_config & things may not work, but you won't lose your cache, you only need to make the one entry, and you don't have to exit & restart BOINC for the changes to take effect; just Options, Read config files. Grant Darwin NT |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Hmm, ok, that is good to know. I usually just use find and replace in notepad, it makes the changes almost foolproof, though it is still weird that it doesn't seem to have an effect on this computer, regardless what value I put in there. I will try creating an app_config, I presume it goes in the same place? And I can leave the app_info file as it is, since it will keep ignoring it as I am making the other file? |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
The app_info provides the information that the apps need to run the work. The app_config fine-tunes those apps. So you need both. |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Well, that was interesting. I copied and pasted your text above and created the file, then I restarted BOINC, it took maybe 20-30 seconds to start, and then said that my GTX 950 couldn't be ejected. I thought ok, that's good, nobody wants my video card ejected from the computer, that would be a little extreme. I tweaked it to .5 on both and now it suspended 2 of the 8 GPU tasks that were running, but I still have all 8 cores running. I renamed it to prevent it from running on the next startup, and now my 4th card is no longer recognized; I have 3 cards running 2 tasks each, and one task that used to be running now saying waiting to run, for a GPU to open up it appears. So apparently somehow it did eject my 950, at least in its mind; my question is how do I get it back? Sheeesh, such drama...

*edit* Just for fun, I restarted my computer, just to start everything fresh, looked in Precision 16, and it lists only the 3 cards. Here is my logfile; now really not sure what is going on here, but for whatever reason it seems like it's having a hardware issue:

6/5/2016 9:55:14 PM | | Starting BOINC client version 7.6.22 for windows_x86_64
6/5/2016 9:55:14 PM | | log flags: file_xfer, sched_ops, task
6/5/2016 9:55:14 PM | | Libraries: libcurl/7.45.0 OpenSSL/1.0.2d zlib/1.2.8
6/5/2016 9:55:14 PM | | Data directory: C:\ProgramData\BOINC
6/5/2016 9:55:14 PM | | Running under account Flash
6/5/2016 9:55:14 PM | | CUDA: NVIDIA GPU 0: GeForce GTX 950 (driver version 368.22, CUDA version 8.0, compute capability 5.2, 2048MB, 1940MB available, 2158 GFLOPS peak)
6/5/2016 9:55:14 PM | | CUDA: NVIDIA GPU 1: GeForce GTX 670 (driver version 368.22, CUDA version 8.0, compute capability 3.0, 2048MB, 1959MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM | | CUDA: NVIDIA GPU 2: GeForce GTX 670 (driver version 368.22, CUDA version 8.0, compute capability 3.0, 2048MB, 1951MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 950 (driver version 368.22, device version OpenCL 1.2 CUDA, 2048MB, 1940MB available, 2158 GFLOPS peak)
6/5/2016 9:55:14 PM | | OpenCL: NVIDIA GPU 1: GeForce GTX 670 (driver version 368.22, device version OpenCL 1.2 CUDA, 2048MB, 1959MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM | | OpenCL: NVIDIA GPU 2: GeForce GTX 670 (driver version 368.22, device version OpenCL 1.2 CUDA, 2048MB, 1951MB available, 2915 GFLOPS peak)
6/5/2016 9:55:14 PM | SETI@home | Found app_info.xml; using anonymous platform
6/5/2016 9:55:14 PM | | Host name: FlashFlyer
6/5/2016 9:55:14 PM | | Processor: 8 GenuineIntel Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz [Family 6 Model 58 Stepping 9]
6/5/2016 9:55:14 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes f16c rdrand syscall nx lm avx vmx tm2 pbe fsgsbase smep
6/5/2016 9:55:14 PM | | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
6/5/2016 9:55:14 PM | | Memory: 31.95 GB physical, 63.89 GB virtual
6/5/2016 9:55:14 PM | | Disk: 223.47 GB total, 137.07 GB free
6/5/2016 9:55:14 PM | | Local time is UTC -5 hours
6/5/2016 9:55:14 PM | | Config: event log limit disabled
6/5/2016 9:55:14 PM | | Config: use all coprocessors
6/5/2016 9:55:14 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8017700; resource share 100
6/5/2016 9:55:14 PM | SETI@home | General prefs: from SETI@home (last modified 03-Apr-2013 23:59:56)
6/5/2016 9:55:14 PM | SETI@home | Computer location: home
6/5/2016 9:55:14 PM | SETI@home | General prefs: no separate prefs for home; using your defaults
6/5/2016 9:55:14 PM | | Reading preferences override file
6/5/2016 9:55:14 PM | | Preferences:
6/5/2016 9:55:14 PM | | max memory usage when active: 16357.26MB
6/5/2016 9:55:14 PM | | max memory usage when idle: 31078.80MB
6/5/2016 9:55:14 PM | | max disk usage: 100.00GB
6/5/2016 9:55:14 PM | | (to change preferences, visit a project web site or select Preferences in the Manager) |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Al, You have a cc_config.xml also telling it to use all the GPUs? Can't remember if you said if you did or not... |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13770 Credit: 208,696,464 RAC: 304 |
<app_config>
   <app>
      <name>setiathome_v8</name>
      <gpu_versions>
         <gpu_usage>0.50</gpu_usage>
         <cpu_usage>1.00</cpu_usage>
      </gpu_versions>
   </app>
</app_config>

As long as it's in the /projects/setiathome.berkeley.edu folder and is named app_config.xml then BOINC should be able to read it & use it.

Easy way to check if it is being used or not - change <gpu_usage>0.50</gpu_usage> to <gpu_usage>1.00</gpu_usage> (or <gpu_usage>0.33</gpu_usage>) and then Options, Read config files. That will result in 1 WU per card (or 3).

If that works, then change it back to <gpu_usage>0.50</gpu_usage> & make <cpu_usage>2.00</cpu_usage> and re-read the config file - that should result in 2 CPU cores no longer crunching CPU work units for every 1 GPU WU running. Then change it to <cpu_usage>0.50</cpu_usage> & re-read config files & you should have 1 CPU core reserved for every 2 GPU WUs running.

If nothing happens then it's the wrong name, wrong location or something in the file got mucked up. Grant Darwin NT |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13770 Credit: 208,696,464 RAC: 304 |
For Al, from the BOINC reference:

<use_all_gpus>0|1</use_all_gpus>
If 1, use all GPUs (otherwise only the most capable ones are used).

So it would be:

<cc_config>
   <options>
      <use_all_gpus>1</use_all_gpus>
   </options>
</cc_config>

in a file named cc_config.xml, and this one is located in the C:\ProgramData\BOINC folder. Grant Darwin NT |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Zalster, yes, I have that line added, and it is running all the GPUs - well, all but one. It appears it may have been an unfortunate coincidence that the card flaked out just as I was testing that config file; looking at the device manager, it is disabled in Windows, possibly due to a resource conflict. I am going to have to try digging into that one a little more, as the card does show up, it's just disabled. I did do a search about ejecting cards, and found a post on Reddit about the Nvidia video driver issue from about 4 months or so ago, basically saying it was a bug in that version: they had added support for external Thunderbolt 3 GPUs in that update, and for some reason it shows the eject option for all GPUs, not just GPUs connected through Thunderbolt. As there have been a couple of releases since then, not sure what to make of it, but I am letting my cache run down on this machine, and may just end up reloading it, as there comes a point where it just doesn't make sense to futz with it chasing tails as such. It may fix it, possibly not, but if after a few tries (with things like driver reinstall/regression, etc.) I don't come up with a good solution, it's Fdisk time for this bad boy. |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Thanks Grant, this is the cc_config I use on all of my machines, whether they have one or 4+ GPUs in them, as I don't believe it will harm anything having it in there with just one card?

<cc_config>
   <options>
      <use_all_gpus>1</use_all_gpus>
      <save_stats_days>10000</save_stats_days>
      <max_event_log_lines>0</max_event_log_lines>
      <max_stdout_file_size>40</max_stdout_file_size>
   </options>
</cc_config> |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
... I don't believe it will harm anything having it in there with just one card? Correct. No problem with that. Edit - not sure <max_event_log_lines> 0 is such a good idea, though. The Event Log is your friend. |
Al Send message Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Richard, I was told that setting it to 0 disables the limit, it will continue to grow and not overwrite itself. I completely agree with you, it is def your friend, certainly wouldn't want it to not work! :-) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
I think 'max lines' controls the number of current lines (since startup) sent by the client for live display in BOINC Manager - that can get sluggish if you go a long way beyond the 2,000 (IIRC) line default. You might be more interested in enlarging the permanent disk archive: <max_stdout_file_size>N</max_stdout_file_size>

Two tips:
1) Use the "Event Log options..." picker (Ctrl+Shift+F) once - it creates a fully-populated cc_config.xml file, with all possible tags listed and current/default values.
2) Bookmark http://boinc.berkeley.edu/wiki/Client_configuration to find out what they all mean.

(yes, the default Event Log display limit is 2,000 lines, and 0 removes the limit - it just falls over when you run out of memory)
(the value of N in the suggestion above is bytes, not MB) |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.