2 video cards in linux. Boinc sees them as same device!
Author | Message |
---|---|
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
Quoted: "6.4.5 has one internal scheduler for both CPU and GPU work, so that one is kind of broken. You then have to add the <ncpus>CPUs+GPUs</ncpus> (e.g. you have 4 CPUs + 2 GPUs, so this line is <ncpus>6</ncpus>) in the options section of the cc_config.xml file." without messing with the ncpus statement it does what i need. i only allow it to use 3 out of the 4 cpus, reserving the 4th for my workstation use and cuda. it selects the correct number of cpus just using the local prefs set at 75% and feeds both cuda units, so at present i cannot ask for more. i found my machine got more work done more smoothly with 3 cpus than 4. work times decreased by an average of 35 min per wu by running 3. my uploaded scores were hardly affected at all and my workstation does not suffer any pauses or other ailments caused by clogged cpus. i tried keeping 4 cpus and reducing the percentage of cpu used for work but it got worse instead of better. 3 seems to be my magic number. hmm, i wouldn't think device recognition and feeding would be so involved. maybe so, i have not seen the source. for now 6.4.5 appears to be working perfectly for me. unless my scores at the home site get severely messed up i will probably use this until a 6.6.x version is fixed. |
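Editor's sketch of the arithmetic discussed in the post above, in Python. It assumes the `<cc_config><options><ncpus>` layout named in the quoted advice; the function names `build_cc_config` and `cpus_allowed` are hypothetical, and the rounding used for the 75% preference is an assumption chosen to match the "3 out of 4" result reported here, not BOINC's documented behaviour.

```python
def build_cc_config(cpus: int, gpus: int) -> str:
    """Return a cc_config.xml body whose <ncpus> is CPUs + GPUs (the quoted workaround)."""
    ncpus = cpus + gpus
    return (
        "<cc_config>\n"
        "  <options>\n"
        f"    <ncpus>{ncpus}</ncpus>\n"
        "  </options>\n"
        "</cc_config>\n"
    )


def cpus_allowed(total_cpus: int, percent: int) -> int:
    """Assumed rounding: floor of total * percent / 100, but never below 1."""
    return max(1, (total_cpus * percent) // 100)


if __name__ == "__main__":
    # 4 CPUs + 2 GPUs, as in the quoted example.
    print(build_cc_config(4, 2))
    # 4 CPUs at a 75% preference, as in the reply.
    print(cpus_allowed(4, 75))
```

The two approaches are alternatives: the quoted advice raises `<ncpus>` so the single 6.4.5 scheduler has a slot for each GPU task, while the reply leaves `<ncpus>` alone and limits CPU use through the local preferences.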
skildude Send message Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
6.6.36 was already reported to not work well on Linux. I spent an evening working on installing it on my Mandriva only to have it fail. I backed it off to 6.4.5. You are probably better off backing off 6.6.36 unless you want to try the latest build, which may also be unstable. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
skildude wrote: "6.6.36 was already reported to not work well on Linux. I spent an evening working on installing it on my Mandriva only to have it fail. I backed it off to 6.4.5. You are probably better off backing off 6.6.36 unless you want to try the latest build, which may also be unstable." 6.6.37 also behaved the same way. i went all the way back to 6.4.5 which, although its scheduling is a bit funky, works fine. will stay with this until i hear that one of the 6.6.x series is working properly with multiple gpus. |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
Quoted: "Yeah, I figured out you must be using ps to see it, confirmed it myself... for some reason 6.6.20 and 6.6.36 both stick it all on device 0, but 6.4.5 works correctly (one on each)." hmm, one thing i noticed about 6.4.5: it has not gotten any new cuda units, only cpu. it just requested 100 cpu units and no gpu. so i put 6.6.37 back to see if any gpu were available and they are downloading now. looks like im gonna have to use this to get work then switch to 6.4.5 to make it work right? |
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Unless you're getting nothing but AP, CPU and GPU units are the same. Check your settings here, make sure use GPU is checked, and make sure your app_info.xml is right. http://lunatics.kwsn.net/12-gpu-crunching/cpu-gpu-rebranding-perl-script.msg17406.html#msg17406 Grab that tool, there is a V5 on page 2 or 3 I think. It will move all VLAR and VHAR to your CPU and everything else to GPU. I had to make a couple changes to it, so if it doesn't work let me know and I'll give you the changes I made (it couldn't open the files on my system for some reason). |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
Joseph Monk wrote: "Unless you're getting nothing but AP, CPU and GPU units are the same. Check your settings here, make sure use GPU is checked, and make sure your app_info.xml is right." everything is set and has been. 6.6.x versions get gpu work just fine. the seti prefs for 'home', which is what this computer belongs to, are set to use gpu and local prefs are set to use gpu. 6.4.5 just did not get any gpu work when it got mb units. i dont do ap, i only do mb. i use the AK ssse3 for intel for the cpus and for the gpu im using vlar killer setiathome-6.08.CUDA_2.2_x86_64-pc-linux-gnu. 6.4.5 messages give no indication of not using any gpu, it simply did not get any. but i got a few hundred via 6.6.37, then once they were queued for download i switched back and let them continue to dl under 6.4.5. it may be i didnt give it long enough to 'settle in' after the long upload problems. i tend to get impatient with this stuff. now that it's got plenty of work i will just leave 6.4.5 up and wait to see if it runs out of cuda units to do or if it gets more. i thought the reassignments were only available in the windows versions of the cuda app. this one does not reassign, it kills off with computation error. would be nice to get a cuda 2.2 app that would reassign them. feels like i am wasting a lot of resources on both ends with just killing, but my tesla cannot take vlars. they lock it solid every time. i just saw that url that has a script so ill check it out. thanks |
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Odd. all of mine go to the GPU on 6.4.5, I use that rebranding script to move them all over (obviously make sure BOINC is off at the time). Been using it for about a day now, works pretty well. |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
Joseph Monk wrote: "Odd. all of mine go to the GPU on 6.4.5, I use that rebranding script to move them all over (obviously make sure BOINC is off at the time). Been using it for about a day now, works pretty well." it could be because i was forcing uploads and updates manually for a little bit to try to get a feel for things, and manual requests may mess up its getting work. still, i am surprised that when it got its first 20 cpu units it did not get any gpu units at all, and the gpus were idle at that time. when i put 6.6.37 up it immediately got 40 gpu units. im just gonna let it settle out a bit and see how it does without a human touching it :) |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
Joseph Monk wrote: "Odd. all of mine go to the GPU on 6.4.5, I use that rebranding script to move them all over (obviously make sure BOINC is off at the time). Been using it for about a day now, works pretty well." hehe it left my cpus with 54 work units :) i had to change the path where it goes into the projects/seti directory. it had windows syntax of \\ so i changed it to \/ in 2 places in that line, separated the filename declarations with spaces at the top and added a #!/usr/bin/perl to the top line, and the script worked just fine. oh and i had to change the max workunits from 550 to 1000. now all i have to do is figure out the best times to run this under cron and set up a script to stop boinc, run it, copy the new file over and restart boinc. maybe every hour? it didnt catch any vlars at all but i suspect if there were any it would have caught them and rescheduled them to the cpus. at least it didnt report any. got v 1.5. it seems that beginning around 1.6 they started concentrating on windoze and changed it into an executable along the way. |
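Editor's sketch of the stop, rebrand, restart cycle described in the post above, in Python. The function name `run_rebrand_cycle` and every command shown are hypothetical placeholders: the thread does not say how BOINC is stopped or started on this machine, what the script file is called, or which file has to be copied over, so the copy step is left out.

```python
import subprocess


def run_rebrand_cycle(stop_cmd, rebrand_cmd, start_cmd, run=subprocess.run):
    """Stop BOINC, run the rebranding script, then start BOINC again.

    Each command is an argument list, e.g. ["perl", "rebrand.pl"].
    `run` is injectable so the ordering can be checked without touching BOINC.
    """
    run(stop_cmd, check=True)
    try:
        run(rebrand_cmd, check=True)
    finally:
        # Restart even if rebranding failed, so the host does not sit idle.
        run(start_cmd, check=True)


if __name__ == "__main__":
    # Placeholder commands only; substitute whatever your distribution uses.
    recorded = []
    run_rebrand_cycle(
        ["stop-boinc-placeholder"],
        ["perl", "rebrand-placeholder.pl"],
        ["start-boinc-placeholder"],
        run=lambda cmd, check=True: recorded.append(cmd),
    )
    print(recorded)
```

A wrapper like this could be called from an hourly cron entry, which is the interval floated in the post. Stopping BOINC first matters because, as noted earlier in the thread, the rebranding script should only run while BOINC is off.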
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Sounds about what I had to change as well, moved a ton to my CPU (probably too many, but we'll see). Lately I've been getting a ton of VLAR and a good amount of VHAR too. |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
sunu told me about boinc 6.6.11. i have been running it for about 6 hours now and it works! it understands the devices, 1 wu per device, gets proper workunits for cpu and gpu, works well with app_info.xml and the AK mb app and the 2.2 cuda app. seems this one is the best so far to run until they fix the newer versions. after it got a ton of cpu apps since i was out, i ran the V5 perl script again with lower numbers plugged in, and it changed things to 111 cpu apps with 422 gpu apps. still has not reported finding any vlar or vhar workunits to reschedule to cpu unless it is just silent about it. hehe after i ran the script when boinc came back up it promptly went out to fill the void of cpu units :) |
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Sounds good, where did you find 6.6.11 so I can give it a try? |
arkayn Send message Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
Joseph Monk wrote: "Sounds good, where did you find 6.6.11 so I can give it a try?" You can find almost every release at this place. http://boincdl.ssl.berkeley.edu/dl/ |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
Joseph Monk wrote: "Sounds good, where did you find 6.6.11 so I can give it a try?" yes that is where i got it from. specifically http://boincdl.ssl.berkeley.edu/dl/boinc_6.6.11_x86_64-pc-linux-gnu.sh |
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Thanks, didn't know about that one and couldn't find it from the normal download location. I'll try it out when I get home tonight! |
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Joseph Monk wrote: "Sounds good, where did you find 6.6.11 so I can give it a try?" That works perfectly! |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
Joseph Monk wrote: "Sounds good, where did you find 6.6.11 so I can give it a try?" cool. i think we finally hit on the magic version until they catch up with the fixes in the new versions. |
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Joseph Monk wrote: "Sounds good, where did you find 6.6.11 so I can give it a try?" Yup, it still says "High priority" on stuff not due until the 9th, but it's downloading new WU regularly so I'm happy with that. I'll just periodically check for VLAR (and VHAR, once my VLAR count goes back down) on the GPU and reassign them. I modified that script to make a new one that just spits out info about the types of tasks and how many are assigned to each. |
Chuck Gorish Send message Joined: 19 Jun 00 Posts: 156 Credit: 29,589,106 RAC: 0 |
cool. yeah it does a few funky things but they all get processed so i don't really care. im hoping the 'stock' functionality of the script is sufficient since, other than changing the ratios at the top, i basically have to use it as is. i can only trust it checks cuda wu for vlar and moves them to cpu since it doesnt report anything. |
Joseph Monk Send message Joined: 31 Mar 07 Posts: 150 Credit: 1,181,197 RAC: 0 |
Here's the modified script I use, it's pretty simple. Just run it (I've seen no harm in running it while BOINC is running, as it doesn't change anything) and it spits out something like:

    VHAR on GPU: 12mr09ac.6289.7025.14.10.195
    VHAR on GPU: 05dc08ae.32507.890.16.10.254
    Number of CPU tasks:413
    Number of GPU tasks:323
    Number of VLAR tasks:330
    Number of VHAR tasks:42
    Total tasks: 736

You can't run it while downloading new WU, as it can't open the WU files to read them if they aren't there yet. Right now (330 VLAR tasks) I've moved VHAR to GPU, hence it complains about it, but you can see CPU has 413 tasks and there's only 330 VLAR so I should run the rebrand script again soon. Here's the script:

    #!/usr/bin/perl
    # Read-only report: counts CPU/GPU tasks and lists VLAR/VHAR tasks assigned to the GPU.
    $path="client_state.xml";
    open (IN, $path) || die "ERROR: cant open " . $path;
    $NumOfCPUTasks=0;
    $NumOfGPUTasks=0;
    $NumVLAR=0;
    $NumVHAR=0;
    $NumGPUToNumCPU_high_limit=25;
    $NumGPUToNumCPU_low_limit=0.5;
    # First pass: classify every workunit by the angle range stored in its task file.
    while (<IN>) {
        if( /<workunit>/ ){ #parsing result
            $trueAR=-1; #error condition
            $WUname="";
            while(<IN>){
                if( /<\/workunit>/ ){
                    open (WU, "projects\/setiathome.berkeley.edu\/" .$WUname) || die "ERROR: cant open task file " . $WUname;
                    while(<WU>){ #reading task file and deciding where it should go
                        if( /<true_angle_range>(.*)<\/true_angle_range>/ ){
                            $trueAR=$1;
                            if( $trueAR == -1 ){ die "ERROR detected - cant determine AR value\n"; }
                            if($trueAR < 0.13){ $tasks{$WUname}=1; $NumVLAR++; }      # VLAR
                            elsif($trueAR > 1.127){ $tasks{$WUname}=2; $NumVHAR++; }  # VHAR
                            else{ $tasks{$WUname}=3; }                                # everything else
                            last;
                        }
                    }
                    close(WU);
                    last;
                }
                if( /<name>(.*)<\/name>/ ){
                    $WUname=$1;
                    #print "task:\\".$1."\\ \n";
                }
                elsif( /<version_num>603<\/version_num>/ ){ $NumOfCPUTasks++; }  # 603 = CPU app
                elsif( /<version_num>608<\/version_num>/ ){ $NumOfGPUTasks++; }  # 608 = GPU app
            }
        }
    }
    close(IN);
    # Second pass: report VLAR/VHAR tasks that are currently assigned to the GPU app.
    open (IN, $path) || die "ERROR: cant open " . $path;
    while (<IN>) {
        if( /<name>(.*)<\/name>/ ){
            $WUname=$1;
            #print "task:\\".$1."\\ \n";
        }
        elsif( /<version_num>608<\/version_num>/ ){
            if($tasks{$WUname}){
                if($tasks{$WUname} == 1){ print "VLAR on GPU: " . $WUname ."\n"; }
                elsif($tasks{$WUname} == 2){ print "VHAR on GPU: " . $WUname ."\n"; }
            }
        }
    }
    close(IN);
    print "Number of CPU tasks:".$NumOfCPUTasks."\n";
    print "Number of GPU tasks:".$NumOfGPUTasks."\n";
    print "Number of VLAR tasks:".$NumVLAR."\n";
    print "Number of VHAR tasks:".$NumVHAR."\n";
    if($NumOfCPUTasks!=0){ $GPU_to_CPU_ratio=$NumOfGPUTasks/$NumOfCPUTasks; }
    else{ $GPU_to_CPU_ratio=1; }
    if($GPU_to_CPU_ratio > $NumGPUToNumCPU_high_limit){ print "Too many tasks allocated to GPU already ".$GPU_to_CPU_ratio."\n"; }
    if($GPU_to_CPU_ratio < $NumGPUToNumCPU_low_limit){ print "Too many tasks allocated to CPU already ".$GPU_to_CPU_ratio."\n"; }
    print "Total tasks: ".($NumOfCPUTasks+$NumOfGPUTasks)."\n";

|
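Editor's sketch of the classification rule the script applies, restated in Python so the thresholds are easy to check in isolation. The cut-offs 0.13 and 1.127 and the 0.5 to 25 ratio limits are copied from the posted script; the function names and the "MID" label for everything in between are hypothetical, added only for illustration.

```python
import re

# Thresholds taken from the posted Perl script.
VLAR_MAX = 0.13    # angle range below this counts as VLAR
VHAR_MIN = 1.127   # angle range above this counts as VHAR
RATIO_LOW, RATIO_HIGH = 0.5, 25  # GPU:CPU task ratio limits from the script


def classify_angle_range(true_ar: float) -> str:
    """Return "VLAR", "VHAR" or "MID" (hypothetical label) for an angle range."""
    if true_ar < VLAR_MAX:
        return "VLAR"
    if true_ar > VHAR_MIN:
        return "VHAR"
    return "MID"


def angle_range_from_workunit(text: str):
    """Pull the <true_angle_range> value out of a workunit file's text, or None."""
    match = re.search(r"<true_angle_range>(.*?)</true_angle_range>", text)
    return float(match.group(1)) if match else None


def gpu_to_cpu_ratio(gpu_tasks: int, cpu_tasks: int) -> float:
    """Same fallback as the script: the ratio is treated as 1 when there are no CPU tasks."""
    return gpu_tasks / cpu_tasks if cpu_tasks else 1.0


if __name__ == "__main__":
    ar = angle_range_from_workunit("<true_angle_range>0.05</true_angle_range>")
    print(classify_angle_range(ar))
    # Task counts from the sample output in the post: 323 GPU, 413 CPU.
    ratio = gpu_to_cpu_ratio(323, 413)
    print(RATIO_LOW <= ratio <= RATIO_HIGH)
```

With the counts from the sample output the ratio falls inside both limits, which is consistent with the sample showing neither of the "Too many tasks allocated" warnings.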
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.