Setting up a Linux machine to crunch CUDA80 for Windows users

Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1898006 - Posted: 29 Oct 2017, 1:41:57 UTC - in response to Message 1898002.  

Were you trying to use BoincTasks as a substitute for the manager? I have never run anything other than the Manager for each client. I just like having BoincTasks on my daily driver so I can see what is going on with each machine without having to get off my chair and visit each client machine in person. So I just use it mostly for remote monitoring of each client machine from a central location. I am lazy.


. . Yes, I thought you needed to run it on each machine to monitor that host. But I was trying to use it instead of the Manager and kept coming back to the Manager. If I recall there was an issue with VirtualBox as well. But it was quite a while ago and I cannot recall all the details now.

. . Anyway, that is for another day atm.

Stephen

..
ID: 1898006
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4506
Credit: 281,988,750
RAC: 631,519
United States
Message 1898009 - Posted: 29 Oct 2017, 2:00:44 UTC - in response to Message 1898006.  
Last modified: 29 Oct 2017, 2:01:47 UTC

No, you only need to install it on one computer. As long as the other clients are on the same network subnet, it connects to everyone. You do have to provide the gui_rpc_auth.cfg password for each machine. But there are the nifty BoincToolbox and AddToGroup utility executables in the eFMer directory that seem to find all your clients pretty reliably if the main program doesn't do so on its own.

I just make the password the same on every machine. Makes it simple.
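
For anyone who wants to check the same thing from a terminal, here is a minimal sketch. It assumes the stock boinccmd tool is installed on the monitoring machine and that each client already accepts remote GUI RPC connections. The host addresses and the password are made-up placeholders, and --get_tasks is just one of the queries boinccmd offers. It is not how BoincTasks itself does it.

```python
import subprocess

# Hypothetical client addresses; replace with your own machines.
HOSTS = ["192.168.1.20", "192.168.1.21"]
# Assumed to match the contents of gui_rpc_auth.cfg on every client.
PASSWORD = "same-on-every-box"


def boinccmd_args(host, password, query="--get_tasks"):
    """Build the boinccmd command line for one remote client."""
    return ["boinccmd", "--host", host, "--passwd", password, query]


def poll(hosts, password):
    """Run the query against each host and return {host: output or error text}."""
    results = {}
    for host in hosts:
        try:
            done = subprocess.run(
                boinccmd_args(host, password),
                capture_output=True, text=True, timeout=15,
            )
            results[host] = done.stdout if done.returncode == 0 else done.stderr
        except (OSError, subprocess.TimeoutExpired) as exc:
            # boinccmd missing, or the client never answered.
            results[host] = f"failed: {exc}"
    return results


if __name__ == "__main__":
    for host, text in poll(HOSTS, PASSWORD).items():
        print(f"== {host} ==\n{text}")
```

Using one shared password keeps the loop simple; with different passwords per machine the list would have to become a host-to-password mapping.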
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1898009
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1898019 - Posted: 29 Oct 2017, 5:46:31 UTC - in response to Message 1898009.  

No, you only need to install it on one computer. As long as the other clients are on the same network subnet, it connects to everyone. You do have to provide the gui_rpc_auth.cfg password for each machine. But there are the nifty BoincToolbox and AddToGroup utility executables in the eFMer directory that seem to find all your clients pretty reliably if the main program doesn't do so on its own.

I just make the password the same on every machine. Makes it simple.


. . And you don't have a problem between Linux and Windows boxes?

Stephen

??
ID: 1898019
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4506
Credit: 281,988,750
RAC: 631,519
United States
Message 1898029 - Posted: 29 Oct 2017, 7:25:21 UTC - in response to Message 1898019.  

No, it sees the Linux box too. I questioned that at first too but then I think it was Brent that said it works fine with Linux computers on the network and BoincTasks sees it fine. I can do everything on the Linux computer that I can on my Windows machines with BoincTasks.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1898029
Profile MarkJ Special Project $250 donor
Volunteer tester
Joined: 17 Feb 08
Posts: 1069
Credit: 52,563,964
RAC: 39,412
Australia
Message 1898053 - Posted: 29 Oct 2017, 10:46:40 UTC - in response to Message 1898019.  
Last modified: 29 Oct 2017, 10:55:46 UTC

And you don't have a problem between Linux and Windows boxes?

Stephen??

It can also see Raspberry Pis. The only difference between them and a regular Linux box is that they don’t broadcast their name on the network, so you need to give each Pi a fixed IP address and enter that IP address in BOINCtasks.

I run BOINCtasks on a Windows laptop that doesn’t do any crunching, but it can see/control the whole farm. I had a mix of Windows and Linux crunchers, including Raspberry Pis and a couple of Parallellas. The Parallellas have gone and the Windows machines became Linux crunchers.
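
A quick way to confirm a fixed address is answering before typing it into BOINCtasks, as a rough sketch only: it assumes the client is listening on the default BOINC GUI RPC port 31416, and the addresses in the list are hypothetical. It only tests that the port accepts a TCP connection, not that the password is right.

```python
import socket

BOINC_GUI_RPC_PORT = 31416  # default port the BOINC client listens on


def rpc_port_open(host, port=BOINC_GUI_RPC_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical fixed addresses given to the Pis.
    for ip in ["192.168.1.50", "192.168.1.51"]:
        state = "reachable" if rpc_port_open(ip) else "no answer"
        print(f"{ip}: {state}")
```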
BOINC blog
ID: 1898053
Profile Brent Norman Special Project $250 donor
Volunteer tester
Joined: 1 Dec 99
Posts: 2134
Credit: 206,099,164
RAC: 482,821
Canada
Message 1898063 - Posted: 29 Oct 2017, 13:28:38 UTC - in response to Message 1898029.  

BoincTasks also runs ON Linux using Wine. I found that out when my Windows box died and I went into BT withdrawals :(

A couple of quirks though.
- It complained about a language file not found when restarting, but copying the file to where it wanted it cured that.
- Reordering columns doesn't work right. This might be cured with config files, haven't looked yet.
- Not a quirk, but I renamed boinctasks.exe to BTasks.exe, so the scheduler app doesn't think BOINC Manager is running.

Other than that it works just fine, and I'm still using my Linux box more as my daily computer ... with suffering RAC mind you.
ID: 1898063
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1908709 - Posted: 24 Dec 2017, 0:38:28 UTC

. . . . . M E R R Y

. . . . . . . . . . . . C H R I S T M A S

Stephen

:)
ID: 1908709
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1908711 - Posted: 24 Dec 2017, 0:41:48 UTC

. . By the way ... does anyone know if there is a way to find BOINC stats about how many hosts are running Special Sauce???

Stephen

??
ID: 1908711
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1910984 - Posted: 5 Jan 2018, 23:11:22 UTC

. . Hello people! (not that anyone seems to read this thread anymore)

. . After a disaster when I accepted an update to Ubuntu release 97 I went back to release 96 which works.

. . The problem with release 97 was that it screwed up accessing the flash drive, for things like, you know, booting the system!

. . Anyway I finally relented and accepted another update, this time release 104, and it wants to incinerate my GPU. While the Boinc stuff has not changed, when it is running the GPU temp keeps stepping up until it gets over 80C. Back to good old release 96.

. . So now I have two dud releases clogging up drive space, not to mention several older releases as well.

. . My question is what do I have to do to completely remove these from the drive? Or what can I do? I am hoping it can be done. Autoclean and autoremove do NOT remove them at all.

Stephen

? ?
ID: 1910984
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 9910
Credit: 128,861,050
RAC: 82,619
Australia
Message 1910991 - Posted: 5 Jan 2018, 23:20:07 UTC - in response to Message 1910984.  

. . Anyway I finally relented and accepted another update, this time release 104, and it wants to incinerate my GPU. While the Boinc stuff has not changed, when it is running the GPU temp keeps stepping up until it gets over 80C. Back to good old release 96.

Were your custom fan settings still in effect?
Grant
Darwin NT
ID: 1910991
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1910994 - Posted: 5 Jan 2018, 23:25:53 UTC - in response to Message 1910991.  

. . Anyway I finally relented and accepted another update, this time release 104, and it wants to incinerate my GPU. While the Boinc stuff has not changed, when it is running the GPU temp keeps stepping up until it gets over 80C. Back to good old release 96.

Were your custom fan settings still in effect?


. . Nope, they stopped working when I upgraded the video drivers for a trial for TBar. The fans are now at 100% all the time.

Stephen

:(
ID: 1910994
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 9910
Credit: 128,861,050
RAC: 82,619
Australia
Message 1910996 - Posted: 5 Jan 2018, 23:30:59 UTC - in response to Message 1910994.  
Last modified: 5 Jan 2018, 23:31:25 UTC

. . Nope, they stopped working when I upgraded the video drivers for a trial for TBar. The fans are now at 100% all the time.

Then how did the GPUs get so hot with the fans at 100%? That just doesn't make sense.
Grant
Darwin NT
ID: 1910996
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4506
Credit: 281,988,750
RAC: 631,519
United States
Message 1911000 - Posted: 5 Jan 2018, 23:33:14 UTC - in response to Message 1910984.  

Do you have the Synaptic Package Manager installed? If so, search for linux-image in the Search box in the Installed category and mark the old kernel images for complete removal. Be careful to not remove the kernel image you want to keep.
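
A command-line take on the same cleanup, as a rough sketch only. It assumes the "releases" in question are versioned linux-image packages on a Debian/Ubuntu system; the version strings in the comments are hypothetical examples, not taken from the machine being discussed. The script lists candidates and prints a purge command for review; it removes nothing itself.

```python
import platform
import re
import subprocess

# Matches versioned kernel image packages such as linux-image-4.4.0-96-generic
# (hypothetical example), but not meta packages such as linux-image-generic.
VERSIONED = re.compile(r"^linux-image-(?:extra-)?(\d[\w.\-+]*)$")


def removable_kernels(packages, running_release):
    """Return versioned linux-image packages that do not belong to the running kernel."""
    candidates = []
    for name in packages:
        match = VERSIONED.match(name)
        if match and match.group(1) != running_release:
            candidates.append(name)
    return sorted(candidates)


def installed_kernel_packages():
    """Ask dpkg-query for fully installed packages whose name starts with linux-image."""
    out = subprocess.run(
        ["dpkg-query", "-W", "-f=${Package} ${db:Status-Abbrev}\n", "linux-image*"],
        capture_output=True, text=True, check=True,
    ).stdout
    names = []
    for line in out.splitlines():
        parts = line.split()
        # "ii" means the package is selected for install and currently installed.
        if len(parts) >= 2 and parts[1].startswith("ii"):
            names.append(parts[0])
    return names


if __name__ == "__main__":
    running = platform.release()
    extras = removable_kernels(installed_kernel_packages(), running)
    print("Running kernel:", running)
    if extras:
        # Printed for review only; nothing is removed by this script.
        print("Candidate command: sudo apt-get purge " + " ".join(extras))
```

Because the only kernel it protects is the one currently running, boot into the kernel you want to keep before trusting the list, and read it over before purging anything.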
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1911000
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4506
Credit: 281,988,750
RAC: 631,519
United States
Message 1911001 - Posted: 5 Jan 2018, 23:34:12 UTC - in response to Message 1910996.  

. . Nope, they stopped working when I upgraded the video drivers for a trial for TBar. The fans are now at 100% all the time.

Then how did the GPUs get so hot with the fans at 100%? That just doesn't make sense.

Yes. ????? How ???
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1911001
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1911005 - Posted: 5 Jan 2018, 23:40:11 UTC - in response to Message 1910996.  
Last modified: 5 Jan 2018, 23:40:27 UTC

. . Nope, they stopped working when I upgraded the video drivers for a trial for TBar. The fans are now at 100% all the time.

Then how did the GPUs get so hot with the fans at 100%? That just doesn't make sense.


. . Tell me about it!

. . When the task kicks off the temp rises to normal for that setup, about 60 to 62 C. But then continues to ramp up over the following seconds until it gets to about 80 C then after a while the temp slowly drops back to about normal just before the task ends. Now that I am back using release 96 that is not happening. So it is something to do with release 104. Another dud for this system. I thought maybe it was overdriving the GPU clocks or voltages but there are no failures and run times are exactly the same (or maybe a few seconds slower).

Stephen

:(
ID: 1911005
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 9910
Credit: 128,861,050
RAC: 82,619
Australia
Message 1911007 - Posted: 5 Jan 2018, 23:42:31 UTC - in response to Message 1911001.  

Then how did the GPUs get so hot with the fans at 100%? That just doesn't make sense.

Yes. ????? How ???

First thing that comes to my mind: invalid data. The temp readings were wrong. Might have required an updated monitoring programme to get the correct readings with the new OS version.
Could you feel the heat coming from the cards?
Grant
Darwin NT
ID: 1911007
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1911008 - Posted: 5 Jan 2018, 23:43:57 UTC - in response to Message 1911000.  

Do you have the Synaptic Package Manager installed? If so, search for linux-image in the Search box in the Installed category and mark the old kernel images for complete removal. Be careful to not remove the kernel image you want to keep.


. . Hi Keith,

. . Yep I have Package Manager, I will give that a try. Thanks mate.

Stephen

:)
ID: 1911008
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 3366
Credit: 70,098,631
RAC: 88,034
Australia
Message 1911010 - Posted: 5 Jan 2018, 23:48:14 UTC - in response to Message 1911007.  

Then how did the GPUs get so hot with the fans at 100%? That just doesn't make sense.

Yes. ????? How ???

First thing that comes to my mind: invalid data. The temp readings were wrong. Might have required an updated monitoring programme to get the correct readings with the new OS version.
Could you feel the heat coming from the cards?


. . Well I think the invalid data is out as there were no errors. This system is a low profile case and I have to open it up to physically test the temp, which I confess I didn't do.

. . It may be a sensor issue, but since it is not happening with release 96 that still puts it down to something in release 104.

Stephen

? ?
ID: 1911010
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 4506
Credit: 281,988,750
RAC: 631,519
United States
Message 1911014 - Posted: 5 Jan 2018, 23:58:43 UTC - in response to Message 1911010.  

How are you monitoring GPU temps? I assume from your comment it is via software applications.
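
One software route, sketched under the assumption that the NVIDIA driver's nvidia-smi tool is installed and on the PATH. Logging the SM clock and fan speed next to the temperature would show whether the clocks really are running higher under one kernel than another. The field list and the 5 second interval are arbitrary choices for illustration.

```python
import csv
import io
import subprocess

FIELDS = ["index", "temperature.gpu", "fan.speed", "clocks.sm", "utilization.gpu"]


def parse_rows(text):
    """Turn nvidia-smi CSV output (noheader, nounits) into a list of dicts."""
    rows = []
    for values in csv.reader(io.StringIO(text), skipinitialspace=True):
        # Skip blank or malformed lines rather than guessing at their meaning.
        if len(values) == len(FIELDS):
            rows.append(dict(zip(FIELDS, (v.strip() for v in values))))
    return rows


def sample():
    """Query every GPU once; raises if nvidia-smi is missing or fails."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_rows(out)


if __name__ == "__main__":
    import time
    while True:
        for row in sample():
            print(time.strftime("%H:%M:%S"), row)
        time.sleep(5)
```

Values are kept as strings on purpose, since some cards report a field as not available instead of a number.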
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1911014
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 9910
Credit: 128,861,050
RAC: 82,619
Australia
Message 1911018 - Posted: 6 Jan 2018, 0:06:52 UTC - in response to Message 1911010.  

. . Well I think the invalid data is out as there were no errors.

Invalid temperature data.

. . It may be a sensor issue, but since it is not happening with release 96 that still puts it down to something in release 104.

The new release has a different way of passing the data, hence the erroneous reading as the application that makes use of the data hasn't changed.

Like the early high temperature readings for Ryzen systems till they sorted out how to read & interpret the data they were getting.
Grant
Darwin NT
ID: 1911018


 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.