Bitcoin GPU-based Mining Machines good for BOINC / SETI?
| Author | Message |
|---|---|
|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 3150 Credit: 1,282,604,591 RAC: 15,062
|
WOW! 18 is so much different than 16!
It does have a different look, yes. Remember that on 18.04 the nvidia driver package is named “nvidia-driver-410”, unlike the “nvidia-410” package that you used on 16.04.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
|
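The naming difference described above can be captured in a small helper. This is only a sketch: the function name `nvidia_pkg_name` is made up for illustration, and the mapping covers just the two releases mentioned in the post (16.04 and 18.04).

```shell
#!/bin/sh
# Hypothetical helper: pick the driver package name for an Ubuntu release.
# Mapping taken from the post above:
#   16.04 -> nvidia-<branch>, 18.04 -> nvidia-driver-<branch>
nvidia_pkg_name() {
    release="$1"   # e.g. 16.04 or 18.04
    branch="$2"    # e.g. 410 or 396
    case "$release" in
        16.04) echo "nvidia-$branch" ;;
        18.04) echo "nvidia-driver-$branch" ;;
        *)     echo "unknown release: $release" >&2; return 1 ;;
    esac
}

nvidia_pkg_name 18.04 410   # prints nvidia-driver-410
```

On a real machine the release string could come from `lsb_release -rs` instead of being typed in by hand.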
|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 3150 Credit: 1,282,604,591 RAC: 15,062
|
I don’t actually have the board that I was talking about. I have some other Supermicro boards in other systems, but not in my ex-mining machine. That machine has an ASUS Z270-P motherboard, just a standard mid-range board. It can fit a max of 8 GPUs if I use the M.2 slots; that’s how I’m running the system with 7 GPUs. The Supermicro board I was referencing was the X9DRX+-F. It’s a dual LGA 2011 board for E5-2600 and E5-2600v2 Xeons. I would like to get my hands on a board like this someday, but not yet. I’m not ready to spend $500 for the board. E5-2600 v1/v2 Xeons can be had pretty cheaply though. And so can DDR3 registered ECC memory.
|
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Installing 18.04.1 |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
I've been in hardware long enough to understand the subtle possibilities of crappy boards versus crappy designs. Ian&Steve, what's the model # of the Supermicro? |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
WOW! 18 is so much different than 16! |
|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 3150 Credit: 1,282,604,591 RAC: 15,062
|
Thanks rob. That’s kind of what I was getting at before: mining boards are not built to the best standards, and some boards work while others just aren’t up to snuff (even of the same model). So sometimes you’ll have someone with a board that works perfectly and other people with the same board (but a dud) who get nothing but problems. I hope that isn’t the case for eng, but it should be kept in mind as a possibility. If you don’t get it working, don’t think it’s because you aren’t good enough or you didn’t follow the “magic recipe”. There is no magic. You could just be unlucky. If you opted to go for that very expensive Supermicro board that I listed previously, you would likely have far fewer problems.
|
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
WOW! Sounds like some stuff my boss would have me try. Well, that being said, the march goes on. I'm hanging with Ian&Steve for this run to see how it goes.... |
rob smith Send message Joined: 7 Mar 03 Posts: 18643 Credit: 416,307,556 RAC: 863
|
The following may be of little comfort to you guys trying, with various degrees of success, to get a high GPU count working properly on a mining board.

A few months back one of the jobs I was involved in was constructing a heavyweight multi-CPU, multi-GPU computer. This was not the first we'd built, but someone decided it would be a cost saving to use multi-GPU mining boards instead of the high-end ones normally employed. So a pile of nVidia P6000s was acquired, along with a selection of "consumer level" mining boards, and burn-in testing commenced using the stress-test software.

The initial stability test was to run a reduced rig (8 GPU + CPU on one board, with the single controller CPU on its board) in the test chamber at 40C for a 72-hour test run. While that might sound harsh, that's the sort of temperature we could expect in the chamber if the cooling system broke. The first mining board wouldn't even support the GPUs at idle, nor would the next four; finally the sixth and last of that batch was stable enough to start the test. Try another brand and repeat: similar result. Try a third brand: two survived, but we couldn't get the mixed motherboards to behave at the same time. So we tried some more, and eventually we had 4 boards (of the six needed) that passed muster. (~3% of motherboards passed; all GPUs passed, as did the controller board.)

At a project review meeting it was decided that the risk presented by having such a high failure rate was unpalatable. So we went back to our usual boards (mostly pre-used from earlier projects) and stuck the full rig in the test chamber for its 72-hour cooking. It passed. The test chamber was re-configured so that its cooling system would hold the temperature down to ~15C with the 48 P6000s, 10 Xeons (six on the GPU boards, four on the controller) and "tons" of RAM, comms processors etc., and we started the big run. It ran solidly from April 2018 to the end of October. No hiccups, beats missed, or anything untoward, barring a couple of disks in the external RAID arrays.

After the job was done I managed to grab an hour on the beast, loaded BOINC and, using cloned data from my PCs, did a VERY quick run. All 48 GPUs plus all the CPUs ran S@H (without a connection to the outside world, sadly). Average task run time was about a minute on the GPUs, so the 900 tasks lasted about 20 minutes of the hour of "play time" available.

So it can be done if you have a big enough hardware budget, but if you go for "consumer grade" boards then expect problems :-( (And it's a great shame that I only had an hour of "play" time with the beast and no internet connection.....)

Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe? |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Shouldn't have to bug out I meant... |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Okay Ian&Steve C., got 18.04.1 burned and ready to go. The wife has the baby so I should have to bug out. I will go slow. One step at a time. These are your steps I will execute with a reboot with all cards installed...
1. add the Ubuntu PPA for nvidia proprietary drivers
   sudo add-apt-repository ppa:graphics-drivers/ppa
2. update
   sudo apt-get update
3. install nvidia drivers from the Ubuntu PPA (substitute 396 for 410 if you insist on that version, but it makes no difference)
   sudo apt-get install nvidia-410
4. install OpenCL components
   sudo apt-get install ocl-icd-libopencl1
5. reboot |
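The five steps quoted above can be rehearsed as a dry run before touching the system. The sketch below only prints the commands by default; the `DRY_RUN` and `DRIVER_PKG` variables and the `run` / `install_steps` helpers are names invented for this example, not part of any tool. The package name is the one quoted in the steps; elsewhere in the thread it is noted that on 18.04 the package is called nvidia-driver-410 instead.

```shell
#!/bin/sh
# Dry-run sketch of the quoted install steps (assumes Ubuntu with sudo).
# DRY_RUN=1 (the default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"
DRIVER_PKG="${DRIVER_PKG:-nvidia-410}"   # package name as quoted in the steps

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_steps() {
    run sudo add-apt-repository ppa:graphics-drivers/ppa   # step 1: add the PPA
    run sudo apt-get update                                # step 2: update
    run sudo apt-get install "$DRIVER_PKG"                 # step 3: driver
    run sudo apt-get install ocl-icd-libopencl1            # step 4: OpenCL
    echo "Step 5: reboot once the steps above have finished."
}

install_steps
```

Reading the printed commands first makes it easy to post exactly what was run if something goes wrong afterwards.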
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Yeah, the closer each GPU gets to 100% the slower they go... Waiting for 18.04 to burn. The present install is junk. |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Gotta go grab a blank CD for VER 18 ISO burn... |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
I'm waiting for the six to all pop comp errors... |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
I suspect the six are locked in a fugue state. All their counters are identical... Very odd behavior... |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Wish I could have. Other than the fact I mapped the PCI addresses of all the slots, not much to share beyond that. Looks like Seti is running hot and steady on 6 of the 7 GPUs detected. GPU 0, which has the monitor plugged into it, is frozen but still displaying video... ??? After they all send their first work unit, I'll abort that GPU's work unit and see what happens... |
|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 3150 Credit: 1,282,604,591 RAC: 15,062
|
Just for grins, I popped the reset a few times on the last config. Got a message that GPU on PCI:4:0:0 "Fell off the bus". Weird.
Can you give me more info about this error? I think you should be a bit more specific about the problems you’re having. It will help us figure out what’s really happening. Slow down. Post as much info as you can and wait for instruction before moving on. I feel like you’re trying to move too fast, and giving up without posting exactly what’s happening. It makes it hard to try to isolate the problem if you don’t post details.
|
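One way to gather the details being asked for is to pull the GPU-related lines out of the kernel log and post them verbatim. The filter below is a rough sketch: `filter_gpu_errors` is a made-up name, and the pattern (the driver's NVRM tag, or any line mentioning "off the bus") is an assumption about what the relevant lines look like.

```shell
#!/bin/sh
# Hypothetical filter: keep only log lines that mention the NVIDIA kernel
# driver (NVRM) or a GPU dropping off the bus. Reads text on stdin.
filter_gpu_errors() {
    # "|| true" so that "no matching lines" is not treated as a failure
    grep -i -E 'NVRM|off the bus' || true
}

# Typical use on the affected machine (dmesg may need sudo on some systems):
#   dmesg | filter_gpu_errors
#   nvidia-smi              # which GPUs the driver currently sees
#   lspci | grep -i nvidia  # which GPUs the PCI bus currently sees
```

Posting that output together with the exact steps taken gives the thread something concrete to isolate the problem from.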
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Got 7 GPUs out of that effort... Let's run Seti for a few minutes to see how it feels... |
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Just for grins, I popped the reset a few times on the last config. Got a message that GPU on PCI:4:0:0 "Fell off the bus". Weird. So I took it out and restarted... Tick tock... |
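To tie an address like PCI:4:0:0 back to a physical slot, it helps to know that `lspci` prints bus addresses in hexadecimal (for example 04:00.0), while the PCI:bus:device:function form used in X configuration is decimal. A minimal conversion sketch follows; the helper name `lspci_to_xorg` is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert an lspci-style address (hex, bus:dev.fn,
# optionally prefixed with a 4-digit domain) to PCI:bus:dev:fn in decimal.
lspci_to_xorg() {
    addr="${1#????:}"        # drop a leading domain such as "0000:" if present
    bus="${addr%%:*}"        # text before the first colon
    rest="${addr#*:}"        # "dev.fn"
    dev="${rest%%.*}"
    fn="${rest#*.}"
    echo "PCI:$((0x$bus)):$((0x$dev)):$((0x$fn))"
}

lspci_to_xorg 04:00.0   # prints PCI:4:0:0
```

The two notations only diverge from bus 0a upward, which is easy to reach on a board with many GPUs and risers.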
eng4hire Send message Joined: 25 Dec 10 Posts: 490 Credit: 34,584,466 RAC: 0
|
Okay. Your 2.0 next... |
|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 3150 Credit: 1,282,604,591 RAC: 15,062
|
Also, why do you keep installing the 384 repository driver? You should never need to install that. It’s possible some parts of that package are causing conflicts with the newer versions you’re installing. If this doesn’t work, I want you to try Hail Mary 2.0. Wipe out the whole OS. Fresh install Ubuntu 18.04. Install all 10 cards. Install nvidia 410 drivers from the PPA as I described before. NEVER install the 384 package during this process. Only nouveau -> 410 directly. Then see what happens.
|
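A quick way to confirm that nothing from the 384 branch is left on a system is to scan the installed package names for that branch number. This is a sketch under assumptions: `find_branch_pkgs` is a made-up helper, and it simply assumes the branch number appears as its own dash-separated part of the package name.

```shell
#!/bin/sh
# Hypothetical helper: from a list of package names on stdin (one per line),
# print those that carry the given driver branch number as a name component.
find_branch_pkgs() {
    # "|| true" so that an empty result is not treated as a failure
    grep -E "(^|-)$1(\$|-)" || true
}

# On the machine itself, feed it the installed package list:
#   dpkg-query -W -f='${Package}\n' | find_branch_pkgs 384
# Anything it prints could then be removed with: sudo apt-get purge <package>
```

On a fresh 18.04 install that went straight from nouveau to 410, the 384 check should print nothing.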