Setting up Linux to crunch CUDA90 and above for Windows users

Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1998977 - Posted: 21 Jun 2019, 1:43:50 UTC - in response to Message 1998938.  

OK I am still struggling a bit with the problems with this machine

8732075

After the first problem I swapped the 1060s over, MB to riser, riser to MB. It then ran for 31 hours until this morning, when it happened again.

The most common riser problem is a "Cuda initialization problem": each task in the cache starts and is then postponed after 30 seconds, so all tasks show as waiting to run rather than ready to start, and are eventually aborted due to too many restarts. The other common issue is simply failing to see or start up one or more GPUs.
Generally a reseat of power or signal cable or replacement of the USB 3 cable solves this.


That describes my most common issue. Once I switched to high-grade USB 3 cables, the remaining problems seem to have come from something getting joggled (that happens when you carry a mining rack from one room to another). :)

Tom
A proud member of the OFA (Old Farts Association).
ID: 1998977
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1998984 - Posted: 21 Jun 2019, 3:20:39 UTC - in response to Message 1998977.  
Last modified: 21 Jun 2019, 3:26:16 UTC

The most common riser problem is a "Cuda initialization problem": each task in the cache starts and is then postponed after 30 seconds, so all tasks show as waiting to run rather than ready to start, and are eventually aborted due to too many restarts. The other common issue is simply failing to see or start up one or more GPUs.
Generally a reseat of power or signal cable or replacement of the USB 3 cable solves this.


That describes my most common issue. Once I switched to high-grade USB 3 cables, the remaining problems seem to have come from something getting joggled (that happens when you carry a mining rack from one room to another). :)

Tom
Another good thought is to keep the USB cable length as short as feasible, and no more connections than necessary.
My initial build had 12" m-f cables from the card slot to the I/O panel, 3' m-m cables to the external case, and 12" m-f cables into the enclosure. So 5' of cabling and two extra connections per card.
As you might imagine, those 12" m-fs went away in a hurry :).
ID: 1998984
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1998991 - Posted: 21 Jun 2019, 4:48:09 UTC
Last modified: 21 Jun 2019, 4:52:04 UTC

The most common riser problem is a "Cuda initialization problem": each task in the cache starts and is then postponed after 30 seconds, so all tasks show as waiting to run rather than ready to start, and are eventually aborted due to too many restarts.


Yes, that is the problem exactly. I have swapped the riser and all cables with a spare I had and will see how that goes.

Just in case, I have ordered a spare of a different make. It looks slightly better built and has all three power connectors (Molex, SATA and 6-pin GPU), which will make the power cable easier to fit and do away with the "adaptor".

Currently I am still running with NNT, and only updating every couple of hours, so that it cannot trash more than 200 tasks ;-)

and no more connections than necessary


Something I learnt many, many years ago, when I was an apprentice with Post Office Telephones (now BT), was that every connection is a potential fault, so fewer is always better.
ID: 1998991
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1999007 - Posted: 21 Jun 2019, 8:39:36 UTC - in response to Message 1998991.  

Something I learnt many, many years ago, when I was an apprentice with Post Office Telephones (now BT), was that every connection is a potential fault, so fewer is always better.
Indeed, as I also learned at the late great Nortel.
ID: 1999007
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999027 - Posted: 21 Jun 2019, 11:47:03 UTC - in response to Message 1999007.  

Something I learnt many, many years ago, when I was an apprentice with Post Office Telephones (now BT), was that every connection is a potential fault, so fewer is always better.
Indeed, as I also learned at the late great Nortel.


. . So is there anyone here who is not a current or ex telephone tech? :)

Stephen

LOL
ID: 1999027
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1999028 - Posted: 21 Jun 2019, 11:55:04 UTC - in response to Message 1999027.  

Something I learnt many, many years ago, when I was an apprentice with Post Office Telephones (now BT), was that every connection is a potential fault, so fewer is always better.
Indeed, as I also learned at the late great Nortel.


. . So is there anyone here who is not a current or ex telephone tech? :)

Stephen

LOL


Me. ;)

I am an ex-truck driver though.
A proud member of the OFA (Old Farts Association).
ID: 1999028
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999033 - Posted: 21 Jun 2019, 12:34:21 UTC - in response to Message 1999028.  

. . So is there anyone here who is not a current or ex telephone tech? :)
Stephen LOL


Me. ;)
I am an ex-truck driver though.


. . Well howdy anyway :)

Stephen

:)
ID: 1999033
Profile IntenseGuy

Joined: 25 Sep 00
Posts: 190
Credit: 23,498,825
RAC: 9
United States
Message 1999109 - Posted: 22 Jun 2019, 0:36:00 UTC - in response to Message 1999028.  

I'm a database management person - not much for Bumblebee - Christmas tree stuff...
SETI@home classic workunits 103,576
SETI@home classic CPU time 655,753 hours
ID: 1999109
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999110 - Posted: 22 Jun 2019, 0:41:37 UTC - in response to Message 1999109.  

I'm a database management person - not much for Bumblebee - Christmas tree stuff...


. . OK, for that I will need a translation :)

Stephen

? ?
ID: 1999110
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1999111 - Posted: 22 Jun 2019, 0:52:42 UTC - in response to Message 1999109.  
Last modified: 22 Jun 2019, 0:55:41 UTC

I'm a database management person - not much for Bumblebee - Christmas tree stuff...

Color codes??
ID: 1999111
Stephen "Heretic" (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1999112 - Posted: 22 Jun 2019, 0:53:55 UTC - in response to Message 1999111.  

I'm a database management person - not much for Bumblebee - Christmas tree stuff...

Color codes??


. . Maybe! That seems as good as anything at this point :)

Stephen

:)
ID: 1999112
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1999113 - Posted: 22 Jun 2019, 0:55:51 UTC

Heh, reminds me of the resistor color code mnemonic. Haven't thought of that in years ...
ID: 1999113
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1999117 - Posted: 22 Jun 2019, 1:13:57 UTC - in response to Message 1999113.  

Heh, reminds me of the resistor color code mnemonic. Haven't thought of that in years ...

Ha ha ha. First thing drummed into my head for basic electronics. That mnemonic is ingrained for sure. Both G and R rated versions.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1999117
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1853
Credit: 268,616,081
RAC: 1,349
United States
Message 1999128 - Posted: 22 Jun 2019, 2:54:37 UTC - in response to Message 1999117.  

Ha ha ha. First thing drummed into my head for basic electronics. That mnemonic is ingrained for sure. Both G and R rated versions.
Not sure I ever learned the G rated version. Military schools weren't known for being PC back then ...
ID: 1999128
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1999130 - Posted: 22 Jun 2019, 3:37:28 UTC

By the time I first heard the mnemonic (both versions), I'd already memorised the code from sheer repetition of daily use.
Grant
Darwin NT
ID: 1999130
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1999144 - Posted: 22 Jun 2019, 7:36:13 UTC

OK a serious question.

If I were to change GPUs on one of my Linux machines, say from 750 Tis to 1050s, would the system pick up the change, or would I have to "re-install" drivers like in Windows?

I am hoping for the former !!
ID: 1999144
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1999147 - Posted: 22 Jun 2019, 7:51:45 UTC - in response to Message 1999144.  
Last modified: 22 Jun 2019, 8:02:15 UTC

If I were to change GPUs on one of my Linux machines, say from 750 Tis to 1050s, would the system pick up the change, or would I have to "re-install" drivers like in Windows?

Like Windows, it depends.
If the current driver supports the new card, then it will be detected and supported and that's it.
If the current driver doesn't support the new card, then you'll need to install a driver that does.
So check the current driver version, and what model hardware it's good for.

Edit-
A quick search shows your current driver (418.56) will support a GTX 1050 card.
I don't know how things are with Linux initialising new hardware, but with Windows I suspend processing in the BOINC Manager, then shut down the system, install the new hardware and reboot. Once the system has found the new hardware and installed it, and everything is good, I re-enable processing.
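
For anyone who would rather script the "check the current driver version" step on Linux, here is a minimal Python sketch. It assumes the NVIDIA driver is installed and that nvidia-smi is on the PATH; the function names parse_gpu_lines and query_gpus are made up for illustration. It only reports what is installed. Whether that driver version supports a particular card still has to be checked against NVIDIA's supported-products list for the driver.

```python
import shutil
import subprocess


def parse_gpu_lines(csv_text):
    """Turn 'name, driver_version' CSV lines into (name, version) tuples."""
    gpus = []
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last comma so a comma inside a GPU name cannot break parsing.
        name, _, version = line.rpartition(",")
        gpus.append((name.strip(), version.strip()))
    return gpus


def query_gpus():
    """Ask nvidia-smi for each GPU's name and driver version.

    Returns an empty list when nvidia-smi is missing or fails
    (assumption: that usually means no working NVIDIA driver).
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    for name, version in query_gpus():
        print(f"{name}: driver {version}")
```

Running it before and after the card swap should show whether the new card is listed under the same driver version.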
Grant
Darwin NT
ID: 1999147
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1999148 - Posted: 22 Jun 2019, 7:59:15 UTC - in response to Message 1999144.  

OK a serious question.

If I were to change GPUs on one of my Linux machines, say from 750 Tis to 1050s, would the system pick up the change, or would I have to "re-install" drivers like in Windows?

I am hoping for the former !!


It should just work. You have the 418 drivers, which support both cards.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1999148
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1999149 - Posted: 22 Jun 2019, 8:35:43 UTC

It should just work. You have the 418 drivers, which support both cards.


Thanks, I may just find out this week. ;-)
ID: 1999149
Profile Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 1999153 - Posted: 22 Jun 2019, 9:44:20 UTC - in response to Message 1999113.  
Last modified: 22 Jun 2019, 9:48:27 UTC

Heh, reminds me of the resistor color code mnemonic. Haven't thought of that in years ...

B-B-R-O-Y-G-B-V-G-W lol :) I only know the R version, never heard a G version.
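
For readers who never had the code drummed in: the letters stand for black, brown, red, orange, yellow, green, blue, violet, grey, white, and a colour's position in that sequence is its digit, 0 through 9. A small Python sketch of decoding a three-band resistor follows. The function name resistance_ohms is made up for illustration, and it leaves out the gold and silver multiplier and tolerance bands.

```python
# Standard resistor colour code: the index in this list is the digit value.
COLOURS = ["black", "brown", "red", "orange", "yellow",
           "green", "blue", "violet", "grey", "white"]


def resistance_ohms(band1, band2, multiplier):
    """Decode a three-band resistor: two digit bands, then a power-of-ten multiplier."""
    digits = COLOURS.index(band1) * 10 + COLOURS.index(band2)
    return digits * 10 ** COLOURS.index(multiplier)
```

So yellow, violet, red is 47 times 100, a 4.7 kΩ resistor.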
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1999153
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.