Upgrade from GTX275 to GTX 295 Question

Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 928913 - Posted: 26 Aug 2009, 22:45:50 UTC - in response to Message 928911.  

O.K. Will try that first.
Am I likely to lose any tasks in the cache?

A straight change of Boinc version should not affect work in-flight with the Science App.

F.
ID: 928913
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 928917 - Posted: 26 Aug 2009, 23:18:08 UTC - in response to Message 928913.  

Well Fred, I have gone back to 6.6.26 and it still sees only the 2 GPUs.
Well, today is another day, so I will try something else later.
I will let you know how things go tonight.
Dave
ID: 928917
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929216 - Posted: 28 Aug 2009, 7:43:31 UTC

Well, I have to take the "bull by the horns" and try to solve this unseen 295 problem.
Going back to BOINC 6.6.26 did not work, and I am now back on 6.6.36.

I will remove both cards, put an older type of card in the system, and remove the drivers, then try a fresh install of the cards and drivers.
At least I have not lost any ground on SETI stats, as the one 295 that is running is working as well as the 2 x 275s that were in here.
To anyone else reading this: if you have any ideas, please feel free to chip in.
Dave



ID: 929216
Chuck Gorish
Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 929236 - Posted: 28 Aug 2009, 12:17:56 UTC - in response to Message 929216.  

Try this. We had trouble getting most versions of BOINC to properly recognize and use 2 physical devices. 6.6.11 worked well, except it had other issues, mostly dealing with elapsed time reporting. I am now using 6.9.0 built from the svn source, and it works perfectly with 2 devices and time reporting. I suspect this may be your problem with 6.6.36.

Can't hurt. Give 6.6.11 or 6.9.0 a try, and I would bet it sees both devices.

ID: 929236
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929245 - Posted: 28 Aug 2009, 13:08:55 UTC - in response to Message 929236.  

Thanks for the info, Chuck, but can you direct me to the svn source to download the files from?
Dave
ID: 929245
Chuck Gorish
Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 929340 - Posted: 28 Aug 2009, 22:00:48 UTC - in response to Message 929245.  
Last modified: 28 Aug 2009, 22:03:53 UTC

Make sure svn is installed, and then just do this to get the current source:


svn co http://boinc.berkeley.edu/svn/trunk/boinc

There are also links for Windows svn clients. This is the main page to check out:

http://boinc.berkeley.edu/trac/wiki/SourceCode

I suspect the version is 6.10.x now.
ID: 929340
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 929357 - Posted: 28 Aug 2009, 23:13:30 UTC - in response to Message 929245.  

Without svn you can get any version you require from here.

F.
ID: 929357
Chuck Gorish
Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 929368 - Posted: 29 Aug 2009, 0:14:19 UTC - in response to Message 929357.  
Last modified: 29 Aug 2009, 0:15:40 UTC

True, but that page doesn't help me with my stuff, since the highest version they have for 64-bit Linux is 6.6.37, which can't handle multiple devices properly. At least svn has the source with multiple devices fixed. I see 6.10.x out there, but only for 32-bit Linux, not 64-bit. That's the only problem with getting precompiled stuff: you have to wait until someone decides to make a distribution for your arch. It is also compiled generically to run on the most machines, which is necessary for a binary distro. I like to compile specifically for each machine.

Since he has Windows, though, he can get 6.10.x from there and see how it goes for him without the extra work of compiling.
ID: 929368
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929438 - Posted: 29 Aug 2009, 10:19:22 UTC

Well, I have now updated BOINC to 6.10 and rebooted the rig, and it is still not seeing the second card.
Using Everest by Lavalys I can see the temps of the GPUs:
GPU1 59°C, GPU2 70°C, GPU3 63°C and GPU4 49°C. CPU temp is 48°C at 100% load.
The 2 GPUs that are running are returning 608s in around 3.5 mins to 13.5 mins, so they are working OK.
I think the only option left to me is the one I posted earlier.
Dave
ID: 929438
Chuck Gorish
Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 929535 - Posted: 29 Aug 2009, 21:11:25 UTC - in response to Message 929438.  

Odd. I wonder what is causing this. I have seen system configs in the stats running 4 295 devices for 8 CUDA, all working properly. I wonder what they do differently. I'm at a loss, sorry. I covered everything I knew about multiple devices in getting my GTX285 and Tesla working at the same time. There is probably some little thing being missed that is causing this, but as far as I can tell it has all been covered.
ID: 929535
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929635 - Posted: 30 Aug 2009, 6:37:34 UTC
Last modified: 30 Aug 2009, 7:20:14 UTC

Last night I did some testing of various options with the GPU cards.
I found that in the Nvidia control panel there is an "Enable Multi GPU Mode" and a "Do Not Use Multi GPU Mode".
This was not there when I had the two 295s in the rig.
There was only "Enable Quad SLI Mode" and "Do Not Use SLI".
I know that I tried it in both modes when I had both cards in, and it was in Do Not Use SLI mode when the problems were happening.
I have now been running this i7 rig overnight with 1 BFG GTX295 in it, with both GPU1 and GPU2 running OK, not a problem, and no errors on the work completed.
I will now install the second 295 and hope that BOINC sees it.
This rig is Vista 64-bit and BOINC 6.10, nothing overclocked.

Dave

[Edit] Just fitted the second card, in Do Not Use SLI mode, and BOINC is still only seeing the first card and crunching on it.
I am now wondering if there is a way to force BOINC to see all GPUs?
ID: 929635
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 929652 - Posted: 30 Aug 2009, 7:54:02 UTC - in response to Message 929635.  



Only the cc_config file, as far as I am aware, and I assume you still have that in place with the <use_all_gpus> flag set? And, while this may be a little "grandmother / egg-sucking", you are sure the cc_config file is in the right folder and is being seen by Boinc? (Set one of the log flags in there and check the effect in your BM log after re-reading it).
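For anyone following along, here is a minimal sketch of what such a cc_config.xml could look like, generated from Python. The <use_all_gpus> option and the three log flags (task, file_xfer, sched_ops) are the ones mentioned in this thread; the function name and the idea of generating the file from a script are only illustrative assumptions, and where the file has to be saved depends on your BOINC data directory.

```python
import xml.etree.ElementTree as ET


def build_cc_config(use_all_gpus=True, log_flags=("task", "file_xfer", "sched_ops")):
    """Return the text of a cc_config.xml with <use_all_gpus> and some log flags set.

    build_cc_config is a hypothetical helper, not part of BOINC.
    """
    root = ET.Element("cc_config")

    # Each enabled log flag is an element containing 1.
    flags = ET.SubElement(root, "log_flags")
    for name in log_flags:
        ET.SubElement(flags, name).text = "1"

    # The GPU switch lives in the <options> section.
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "use_all_gpus").text = "1" if use_all_gpus else "0"

    # Pretty-print when the running Python supports it (3.9+).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the file contents; save them as cc_config.xml in the BOINC data directory.
    print(build_cc_config())
```

After saving the file, have BOINC re-read the config and check the message log to confirm the change was picked up.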

Really all I can think of ATM.

F.
ID: 929652
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929663 - Posted: 30 Aug 2009, 9:29:36 UTC - in response to Message 929652.  

Hi Fred, yes, I have done that, and here are the few log lines with it set to 0 and then set to 1.

30/08/2009 10:14:58 SETI@home Starting task 27mr08ag.24858.18477.16.10.150_0 using setiathome_enhanced version 608
30/08/2009 10:22:02 Re-reading cc_config.xml
30/08/2009 10:22:02 Re-read config file
30/08/2009 10:22:02 log flags: task, file_xfer, sched_ops
30/08/2009 10:23:19 Re-reading cc_config.xml
30/08/2009 10:23:19 Re-read config file
30/08/2009 10:23:19 Configured to use all coprocessors
30/08/2009 10:23:19 log flags: task, file_xfer, sched_ops
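The same check can be scripted. This is a minimal sketch, assuming the message log has been copied out as plain text lines; the function name is made up for illustration, and it only matches the two message strings that appear in the log lines above.

```python
def all_gpus_enabled(log_lines):
    """Return True if the most recent config re-read reported use of all coprocessors."""
    enabled = False
    for line in log_lines:
        if "Re-reading cc_config.xml" in line:
            # A new re-read starts a fresh block of messages, so reset the result.
            enabled = False
        elif "Configured to use all coprocessors" in line:
            enabled = True
    return enabled


# The log lines posted above, first with the flag set to 0 and then set to 1.
LOG = [
    "30/08/2009 10:22:02 Re-reading cc_config.xml",
    "30/08/2009 10:22:02 Re-read config file",
    "30/08/2009 10:22:02 log flags: task, file_xfer, sched_ops",
    "30/08/2009 10:23:19 Re-reading cc_config.xml",
    "30/08/2009 10:23:19 Re-read config file",
    "30/08/2009 10:23:19 Configured to use all coprocessors",
    "30/08/2009 10:23:19 log flags: task, file_xfer, sched_ops",
]

if __name__ == "__main__":
    print(all_gpus_enabled(LOG))
```

Resetting on every re-read means an old "Configured to use all coprocessors" line from an earlier setting does not count once the flag has been switched off again.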



I am now lost as to what to do and am a bit frustrated.
I know that BOINC has issues with reporting the actual CPU details, and all I can think of is that it may have a problem getting details of this new version of the card, although it sees the first one and is crunching on it OK.

Dave
ID: 929663
Chuck Gorish
Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 929667 - Posted: 30 Aug 2009, 10:30:09 UTC - in response to Message 929663.  
Last modified: 30 Aug 2009, 10:31:12 UTC

I just thought of something. In the project directory, make sure a file named "number_of_gpus" exists. Typically it contains a 1 or 2. Maybe you should put a 4 in there. Can't hurt to try it.
ID: 929667
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929669 - Posted: 30 Aug 2009, 10:46:08 UTC - in response to Message 929667.  

Hi Fred
Good thought. I looked in the project folder and there isn't a file called that!
I looked on the other rig and there is one. I will copy the file from that rig to this one and try it as it is, then change it to 4 if it doesn't work. Here's hoping.
Thanks.
It is something to try; I am scratching my head at the moment.
ID: 929669
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929673 - Posted: 30 Aug 2009, 11:13:46 UTC

Fred, no luck; it is just the same with a 1 or a 4 in the number of gpus file.
After lunch I am going to take the cards out of this rig and throw them in the river (only joking).
I am going to put them in my AMD rig to see if it is a rig problem.
The AMD rig is still Vista, but 32-bit as opposed to this one, which is 64-bit. It has the two GTX275s in at the moment with the same Nvidia drivers and is running 6.6.36. I may have a power supply issue, but if I get the message that there are 4 GPUs then that will suffice; I don't need to run the tasks.

I will let you know how I get on later, as there is an F1 race at 1pm.
Dave

Thanks for your help so far, it is appreciated.
ID: 929673
Chuck Gorish
Joined: 19 Jun 00
Posts: 156
Credit: 29,589,106
RAC: 0
United States
Message 929682 - Posted: 30 Aug 2009, 13:12:06 UTC - in response to Message 929673.  

Hmm, the number of gpus file should have controlled what was being looked at, provided it was in the seti project folder living with the apps, app info file, etc. It should be a file with 1 character in it on the first line.

You have a point. If there is not sufficient power, motherboards sometimes hide resources and shut them down. Is there a power connector on the mobo for PCI-E? Mine has one that is to be used if more than 1 device is plugged in.

I suppose it is possible that at startup, since all is at idle, everything is happy and wants to run. Then, when faced with CUDA processing, the cards discover there is not enough power, panic, and begin shutdowns in order to continue running at least in single-GPU mode.

What size power supply is in that? Is it multi-rail? And are different rails feeding different cards? You might want to compare the specs to be sure a rail can actually supply the amount of current the 295 wants. For 2 of those cards plus an i7 and associated gear, an 850w would be extremely marginal. I barely have enough headroom in mine, and my cards don't draw nearly as much as the 295s do.

Each 295 is rated at 289 watts, so that's 578w. Processor/mobo/ram/drives/fans add another 150 to 200w, so a maxed-out supply would have to deliver 778w absolute minimum just to run for a bit before it burned up. That does not leave much of a margin on an 850, and considering the 'duty cycle' ratings of PSUs, only those rated 850w continuous even stand a chance.

I would use, as an absolute minimum, an Antec Signature 850 (it can run at 850 all day without breaking a sweat), preferring a 1000w. Or 2 supplies: a 650w for powering just the GPUs and another 650w for the rest of the system. The problem with that would be finding a 650 that can supply the 12v rail current requirements of the 295. You might have to go 850 for that anyway.
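The arithmetic above, written out as a small sketch. The 289w per card, the 150 to 200w for the rest of the system, and the 850w supply are the figures from this post; the function and variable names are just for illustration, and rated figures are not measured draw.

```python
def power_budget(gpu_watts, num_gpus, rest_range, psu_watts):
    """Estimate worst-case system draw and the headroom left on the PSU.

    rest_range is a (low, high) estimate in watts for CPU/mobo/RAM/drives/fans.
    """
    gpu_total = gpu_watts * num_gpus
    system_low = gpu_total + rest_range[0]
    system_high = gpu_total + rest_range[1]
    return {
        "gpu_total": gpu_total,
        "system_low": system_low,
        "system_high": system_high,
        # Headroom is smallest when the rest of the system draws the most.
        "headroom_min": psu_watts - system_high,
        "headroom_max": psu_watts - system_low,
    }


if __name__ == "__main__":
    budget = power_budget(gpu_watts=289, num_gpus=2, rest_range=(150, 200), psu_watts=850)
    for name, watts in budget.items():
        print(f"{name}: {watts} w")
```

With these numbers the two cards alone come to 578w, the whole system to between 728w and 778w, and the margin on an 850w supply is somewhere between 72w and 122w.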

ID: 929682
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929686 - Posted: 30 Aug 2009, 13:53:33 UTC - in response to Message 929682.  

Thanks, Chuck, for your input.
I have an 850w dual-rail active PSU and a kill-a-watt inline.
At the moment it is running 8 CPU threads and the 2 GPUs on the GTX295.
It is only showing 460 watts total power, so nearly 400 watts to go.

Dave
ID: 929686
Fred W
Volunteer tester
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 929696 - Posted: 30 Aug 2009, 14:14:05 UTC

What's confusing here is that everything else sees all 4 GPUs. Only BOINC refuses to see them, so I have (so far) discounted motherboard / PSU issues.

It will be interesting to see what the AMD rig makes of them.

F.
ID: 929696
FiveHamlet
Joined: 5 Oct 99
Posts: 783
Credit: 32,638,578
RAC: 0
United Kingdom
Message 929702 - Posted: 30 Aug 2009, 14:38:21 UTC - in response to Message 929696.  

Yes, I thought that too, Fred. I have to go out for a little while, so I will report later.
The 295 that is crunching at the moment is turning the WUs round in about 13 mins on each GPU and hasn't missed a beat, so 6.10 and the drivers seem to be working well.
I was thinking of going on the BOINC forum to see if anyone there can help.
I can't see a thread specific to this problem, but you never know.
Dave
ID: 929702
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.