2 video cards in linux. Boinc sees them as same device! |
Questions and Answers : Unix/Linux : 2 video cards in linux. Boinc sees them as same device!
| Author | Message |
|---|---|
"6.4.5 has one internal scheduler for both CPU and GPU work, so that one is kind of broken. You then have to add the <ncpus>CPUs+GPUs</ncpus> (e.g. you have 4 CPUs + 2 GPUs, so this line is <ncpus>6</ncpus>) in the options section of the cc_config.xml file." without messing with the ncpus statement it does what i need. i only allow it to use 3 out of the 4 cpus, reserving the 4th for my workstation use and cuda. it selects the correct number of cpus just using the local prefs set at 75%, and it feeds both cuda units, so at present i cannot ask for more. i found my machine got more work done, and more smoothly, with 3 cpus than with 4. work times decreased by an average of 35 min per wu by running 3. my uploaded scores were hardly affected at all, and my workstation does not suffer any pauses or other ailments caused by clogged cpus. i tried keeping 4 cpus and reducing the percentage of cpu used doing work, but it got worse instead of better. 3 seems to be my magic number. hmm, i wouldn't think device recognition and feeding would be so involved. maybe so, i have not seen the source. for now 6.4.5 appears to be working perfectly for me. unless my scores at the home site get severely messed up, i will probably use this until a 6.6.x version is fixed. ____________ | |
| ID: 919338 · | |
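For reference, a minimal sketch of the cc_config.xml change described in the quoted advice, using the 4 CPUs + 2 GPUs example from that post (so `<ncpus>` is 6). The value is specific to that example host, and the file is assumed to sit in the BOINC data directory; check where yours lives.

```xml
<!-- cc_config.xml: sketch for the 4-CPU + 2-GPU example in the quoted post -->
<cc_config>
  <options>
    <!-- CPUs + GPUs, as the quoted advice suggests for 6.4.5 -->
    <ncpus>6</ncpus>
  </options>
</cc_config>
```

The client reads this file at startup, so restart BOINC after editing it. As the reply notes, the poster got the behavior they wanted without this setting, so treat it as optional.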
|
6.6.36 was already reported to not work well on Linux. I spent an evening working on installing it on my Mandriva box only to have it fail. I backed it off to 6.4.5. You are probably better off backing off 6.6.36 unless you want to try the latest build, which may also be unstable. | |
| ID: 919353 · | |
"6.6.36 was already reported to not work well on Linux. I spent an evening working on installing it on my Mandriva box only to have it fail. I backed it off to 6.4.5. You are probably better off backing off 6.6.36 unless you want to try the latest build, which may also be unstable." 6.6.37 also behaved the same way. i went all the way back to 6.4.5, which works fine although its scheduling is a bit funky. will stay with this until i hear that one of the 6.6.x series is working properly with multiple gpus. ____________ | |
| ID: 919380 · | |
"Yeah, I figured out you must be using ps to see it, confirmed it myself... for some reason 6.6.20 and 6.6.36 both stick it all on device 0, but 6.4.5 works correctly (one on each)." hmm, one thing i noticed about 6.4.5: it has not gotten any new cuda units, only cpu. it just requested 100 cpu units and no gpu. so i put 6.6.37 back to see if any gpu units were available, and they are downloading now. looks like im gonna have to use this to get work, then switch to 6.4.5 to make it work right? ____________ | |
| ID: 919845 · | |
|
Unless you're getting nothing but AP, CPU and GPU units are the same. Check your settings here, make sure use GPU is checked, and make sure your app_info.xml is right. | |
| ID: 919941 · | |
"Unless you're getting nothing but AP, CPU and GPU units are the same. Check your settings here, make sure use GPU is checked, and make sure your app_info.xml is right." everything is set and has been. 6.6.x versions get gpu work just fine. the seti prefs for 'home', which is what this computer belongs to, are set to use gpu, and local prefs are set to use gpu. 6.4.5 just did not get any gpu work when it got mb units. i dont do ap, i only do mb. i use the AK ssse3 for intel for the cpus, and for the gpu im using the vlar killer, setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu. 6.4.5 messages give no indication of not using any gpu; it simply did not get any. but i got a few hundred via 6.6.37, then once they were queued for download i switched back and let them continue to dl under 6.4.5. it may be i didnt give it long enough to 'settle in' after the long upload problems. i tend to get impatient with this stuff. now that it's got plenty of work i will just leave 6.4.5 up and wait to see if it runs out of cuda units to do or if it gets more. i thought the reassignments were only available in the windows versions of the cuda app. this one does not reassign, it kills them off with a computation error. would be nice to get a cuda 2.2 app that would reassign them. feels like i am wasting a lot of resources on both ends with just killing, but my tesla cannot take vlars. they lock it solid every time. i just saw that url that has a script so ill check it out. thanks ____________ | |
| ID: 919956 · | |
|
Odd. all of mine go to the GPU on 6.4.5, I use that rebranding script to move them all over (obviously make sure BOINC is off at the time). Been using it for about a day now, works pretty well. | |
| ID: 919970 · | |
"Odd. all of mine go to the GPU on 6.4.5, I use that rebranding script to move them all over (obviously make sure BOINC is off at the time). Been using it for about a day now, works pretty well." it could be because i was forcing uploads and updates manually for a little bit to try to get a feel for things, and doing it manually may mess up how it gets work. still, i am surprised that when it got its first 20 cpu units it did not get any gpu units at all, and the gpus were idle at that time. when i put 6.6.37 up it immediately got 40 gpu units. im just gonna let it settle out a bit and see how it does without a human touching it :) ____________ | |
| ID: 919978 · | |
"Odd. all of mine go to the GPU on 6.4.5, I use that rebranding script to move them all over (obviously make sure BOINC is off at the time). Been using it for about a day now, works pretty well." hehe, it left my cpus with 54 work units :) i had to change the path where it goes into the projects/seti directory. it had the windows syntax of \\, so i changed it to \/ in 2 places in that line, separated the filename declarations at the top with spaces, and added #!/usr/bin/perl as the top line, and the script worked just fine. oh, and i had to change the max workunits from 550 to 1000. now all i have to do is figure out the best times to run this under cron and set up a script to stop boinc, run it, copy the new file over and restart boinc. maybe every hour? it didnt catch any vlars at all, but i suspect if there were any it would have caught them and rescheduled them to the cpus. at least it didnt report any. got v 1.5. it seems that beginning around 1.6 they started concentrating on windoze and changed it into an executable along the way. ____________ | |
| ID: 919989 · | |
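The stop / run / copy / restart cycle described in the post above could be sketched roughly as follows. Everything in `%cfg` is a placeholder and not taken from the thread: the install directory, the use of `boinccmd --quit` to stop the client, the `run_client` start command, the rebranding script's filename and the name of the file it writes are all assumptions to replace with whatever your setup actually uses. The sketch only prints the steps unless `--run` is passed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# All values are hypothetical placeholders; adjust to your installation.
my %cfg = (
    boinc_dir   => "/path/to/BOINC",          # assumed install/data directory
    stop_cmd    => "./boinccmd --quit",       # assumed way to stop the client
    settle_secs => 15,                        # assumed wait for a clean exit
    resched_cmd => "perl reschedule.pl",      # hypothetical script filename
    new_state   => "client_state.xml.new",    # hypothetical output filename
    start_cmd   => "./run_client --daemon",   # assumed way to restart
);

# Return the shell commands in the order they should run.
sub build_steps {
    my ($c) = @_;
    return (
        $c->{stop_cmd},                          # 1. stop BOINC
        "sleep $c->{settle_secs}",               # 2. let it finish writing state
        $c->{resched_cmd},                       # 3. run the rebranding script
        "cp $c->{new_state} client_state.xml",   # 4. copy the new file over
        $c->{start_cmd},                         # 5. restart BOINC
    );
}

# Print every step; execute it only when $really_run is true.
sub run_steps {
    my ($c, $really_run) = @_;
    if ($really_run) {
        chdir $c->{boinc_dir} or die "cannot chdir to $c->{boinc_dir}: $!\n";
    }
    for my $cmd (build_steps($c)) {
        print "$cmd\n";
        next unless $really_run;
        system($cmd) == 0 or die "step failed: $cmd\n";
    }
    return 1;
}

# Dry run by default; pass --run to actually execute the steps.
my $really_run = grep { $_ eq '--run' } @ARGV;
run_steps(\%cfg, $really_run);
```

Saved under a hypothetical name such as `resched_wrapper.pl`, an hourly crontab entry ("maybe every hour?") would look like `0 * * * * perl /path/to/resched_wrapper.pl --run`. Stopping at the first failed step is deliberate, so a failed script run never overwrites client_state.xml.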
|
Sounds about what I had to change as well, moved a ton to my CPU (probably too many, but we'll see). Lately I've been getting a ton of VLAR and a good amount of VHAR too. | |
| ID: 919991 · | |
|
sunu told me about boinc 6.6.11. i have been running it for about 6 hours now and it works! it understands the devices, 1 wu per device, gets proper workunits for cpu and gpu, works well with app_info.xml and the AK mb app and the 2.2 cuda app. seems this one is the best so far to run until they fix the newer versions. after it got a ton of cpu apps since i was out, i ran the V5 perl script again with lower numbers plugged in, and it changed things to 111 cpu apps with 422 gpu apps. still has not reported finding any vlar or vhar workunits to reschedule to cpu unless it is just silent about it. hehe after i ran the script when boinc came back up it promptly went out to fill the void of cpu units :) | |
| ID: 920485 · | |
|
Sounds good, where did you find 6.6.11 so I can give it a try? | |
| ID: 920544 · | |
"Sounds good, where did you find 6.6.11 so I can give it a try?" You can find almost every release at this place: http://boincdl.ssl.berkeley.edu/dl/ ____________ | |
| ID: 920600 · | |
"Sounds good, where did you find 6.6.11 so I can give it a try?" yes, that is where i got it from. specifically http://boincdl.ssl.berkeley.edu/dl/boinc_6.6.11_x86_64-pc-linux-gnu.sh ____________ | |
| ID: 920610 · | |
|
Thanks, didn't know about that one and couldn't find it from the normal download location. I'll try it out when I get home tonight! | |
| ID: 920615 · | |
"Sounds good, where did you find 6.6.11 so I can give it a try?" That works perfectly! | |
| ID: 920625 · | |
"Sounds good, where did you find 6.6.11 so I can give it a try?" cool. i think we finally hit on the magic version until they catch up with the fixes in the new versions. ____________ | |
| ID: 920633 · | |
"Sounds good, where did you find 6.6.11 so I can give it a try?" Yup, it still says "High priority" on stuff not due until the 9th, but it's downloading new WUs regularly so I'm happy with that. I'll just periodically check for VLAR (and VHAR, once my VLAR count goes back down) on the GPU and reassign them. I modified that script to make a new one that just spits out info about the types of tasks and how many are assigned to each. | |
| ID: 920638 · | |
|
cool. yeah, it does a few funky things but they all get processed so i don't really care. im hoping the 'stock' functionality of the script is sufficient, since other than changing the ratios at the top i basically have to use it as is. i can only trust that it checks cuda wus for vlar and moves them to cpu, since it doesnt report anything. | |
| ID: 920662 · | |
|
Here's the modified script I use, it's pretty simple. Just run it (I've seen no harm in running it while BOINC is, as it doesn't change anything) and it spits out any VLAR or VHAR tasks sitting on the GPU, followed by the task counts:

    $path="client_state.xml";
    open (IN, $path) || die "ERROR: cant open " . $path;
    $NumOfCPUTasks=0;
    $NumOfGPUTasks=0;
    $NumVLAR=0;
    $NumVHAR=0;
    $NumGPUToNumCPU_high_limit=25;
    $NumGPUToNumCPU_low_limit=0.5;

    # First pass: classify every workunit by angle range and count tasks.
    while (<IN>) {
        if( /<workunit>/ ){ #parsing result
            $trueAR=-1; #error condition
            $WUname="";
            while(<IN>){
                if( /<\/workunit>/ ){
                    open (WU, "projects\/setiathome.berkeley.edu\/" .$WUname)
                        || die "ERROR: cant open task file " . $WUname;
                    while(<WU>){ #reading task file and deciding where it should go
                        if( /<true_angle_range>(.*)<\/true_angle_range>/ ){
                            $trueAR=$1;
                            if( $trueAR == -1 ){
                                die "ERROR detected - cant determine AR value\n";
                            }
                            if($trueAR < 0.13){ $tasks{$WUname}=1; $NumVLAR++; }
                            elsif($trueAR > 1.127){ $tasks{$WUname}=2; $NumVHAR++; }
                            else{ $tasks{$WUname}=3; }
                            last;
                        }
                    }
                    close(WU);
                    last;
                }
                if( /<name>(.*)<\/name>/ ){
                    $WUname=$1;
                    #print "task:\\".$1."\\ \n";
                }
                elsif( /<version_num>603<\/version_num>/ ){ $NumOfCPUTasks++; }
                elsif( /<version_num>608<\/version_num>/ ){ $NumOfGPUTasks++; }
            }
        }
    }
    close(IN);

    # Second pass: report VLAR/VHAR tasks that are assigned to the GPU (608).
    open (IN, $path) || die "ERROR: cant open " . $path;
    while (<IN>) {
        if( /<name>(.*)<\/name>/ ){
            $WUname=$1;
            #print "task:\\".$1."\\ \n";
        }
        elsif( /<version_num>608<\/version_num>/ ){
            if($tasks{$WUname}){
                if($tasks{$WUname} == 1){ print "VLAR on GPU: " . $WUname ."\n"; }
                elsif($tasks{$WUname} == 2){ print "VHAR on GPU: " . $WUname ."\n"; }
            }
        }
    }
    close(IN);

    print "Number of CPU tasks:".$NumOfCPUTasks."\n";
    print "Number of GPU tasks:".$NumOfGPUTasks."\n";
    print "Number of VLAR tasks:".$NumVLAR."\n";
    print "Number of VHAR tasks:".$NumVHAR."\n";
    if($NumOfCPUTasks!=0){ $GPU_to_CPU_ratio=$NumOfGPUTasks/$NumOfCPUTasks; }
    else{ $GPU_to_CPU_ratio=1; }
    if($GPU_to_CPU_ratio > $NumGPUToNumCPU_high_limit){
        print "Too many tasks allocated to GPU already ".$GPU_to_CPU_ratio."\n";
    }
    if($GPU_to_CPU_ratio < $NumGPUToNumCPU_low_limit){
        print "Too many tasks allocated to CPU already ".$GPU_to_CPU_ratio."\n";
    }
    print "Total tasks: ".($NumOfCPUTasks+$NumOfGPUTasks)."\n";

| |
| ID: 920841 · | |
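The angle-range cutoffs used in the script above (below 0.13 counts as VLAR, above 1.127 as VHAR) can also be pulled out into small helpers, which makes it easier to try them on a single task file. This is only a sketch: the helper names `extract_ar` and `classify_ar` and the sample values are made up for illustration, and only the two thresholds and the `<true_angle_range>` tag come from the posted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Cutoffs copied from the script in the post above.
my $VLAR_BELOW = 0.13;
my $VHAR_ABOVE = 1.127;

# Pull the <true_angle_range> value out of a task file's text.
# Returns undef when the tag is missing.
sub extract_ar {
    my ($text) = @_;
    return $text =~ m{<true_angle_range>\s*([^<\s]+)\s*</true_angle_range>}
        ? $1
        : undef;
}

# Label an angle range the same way the posted script does:
# strictly below the low cutoff is VLAR, strictly above the high one is VHAR.
sub classify_ar {
    my ($ar) = @_;
    return 'VLAR' if $ar < $VLAR_BELOW;
    return 'VHAR' if $ar > $VHAR_ABOVE;
    return 'mid';
}

# Made-up sample snippets, for illustration only.
my @samples = (
    "<true_angle_range>0.0089</true_angle_range>",
    "<true_angle_range>0.42</true_angle_range>",
    "<true_angle_range>2.5</true_angle_range>",
);
for my $sample (@samples) {
    my $ar = extract_ar($sample);
    printf "%-8s %s\n", $ar, classify_ar($ar);
}
```

Returning undef for a missing tag, instead of the -1 sentinel the posted script uses, lets the caller tell "no angle range found" apart from a real value.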
| Copyright © 2013 University of California |