Bitcoin GPU-based Mining Machines good for BOINC / SETI?
| Author | Message |
|---|---|
Keith Myers Send message Joined: 29 Apr 01 Posts: 11744 Credit: 1,160,866,277 RAC: 4,249
|
/home/tom/Desktop/BOINC/boinc(boinc_catch_signal+0x65)[0x4b53b5] I'm pretty sure this is BOINC housekeeping, or what used to be called the heartbeat. It means that BOINC lost communication with the client and lost track of the status of the running tasks. That, I believe, is what triggers the prompt about BOINC exiting and the offer to restart it. I would try to get the slots to run at Gen 2 speeds, but if you run into the same errors again, I would drop back to Gen 1 speeds and investigate why your motherboard is having communication issues on the PCIe bus, and what settings or parameters you can set to increase the tolerance for bus comms errors or increase the timeouts. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
you don't want PCIe set to gen 1. that is incredibly slow, and even worse with the risers since you are running only a single lane. I agree in principle. Prior to this situation I had some trouble (errors of some kind) when running a smaller number of gpus at Gen3 on this MB. The default on the MB is Gen2 anyway. 1) I need direction on the error message I posted. 2) So far, based on eyeballing the times, the processing has not (yet) slowed down. Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 3150 Credit: 1,282,604,591 RAC: 15,062
|
you don't want PCIe set to gen 1. that is incredibly slow, and even worse with the risers since you are running only a single lane. PCIe gen1 is roughly 4x slower per lane than PCIe gen3. you want all your slots to run PCIe gen3 if you can. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
|
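For anyone who wants to check the "4x" figure, here is a rough sketch of the per-lane arithmetic. It assumes the published signalling rates and line encodings for each PCIe generation (2.5 GT/s and 5 GT/s with 8b/10b, 8 GT/s with 128b/130b) and ignores protocol overhead, so real-world throughput will be somewhat lower. The function names are made up for illustration.

```python
# Theoretical PCIe throughput per lane, before protocol overhead.
# gen -> (signalling rate in GT/s, payload bits, total bits on the wire)
PCIE_GENERATIONS = {
    1: (2.5, 8, 10),     # 8b/10b encoding
    2: (5.0, 8, 10),     # 8b/10b encoding
    3: (8.0, 128, 130),  # 128b/130b encoding
}


def lane_bandwidth_mb_s(gen: int) -> float:
    """Return the theoretical one-direction bandwidth of a single lane in MB/s."""
    rate_gt_s, payload_bits, total_bits = PCIE_GENERATIONS[gen]
    bits_per_second = rate_gt_s * 1e9 * payload_bits / total_bits
    return bits_per_second / 8 / 1e6


def link_bandwidth_mb_s(gen: int, lanes: int = 1) -> float:
    """Return the theoretical bandwidth of a link with the given lane count."""
    return lane_bandwidth_mb_s(gen) * lanes


if __name__ == "__main__":
    for gen in sorted(PCIE_GENERATIONS):
        print(f"Gen{gen} x1: {link_bandwidth_mb_s(gen, 1):.0f} MB/s")
```

On a x1 riser this works out to about 250 MB/s at Gen1 against roughly 985 MB/s at Gen3, which is where the "about 4x" comes from.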
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
I have finally found the log I needed some weeks ago when I started getting the "Boinc Manager restarted 3 times in 5 minutes" (do you want to restart again?) prompt. This is from the "stderrdae.txt" file.
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
No protocol specified
SIGBUS: bus error
Stack trace (12 frames):
/home/tom/Desktop/BOINC/boinc(boinc_catch_signal+0x65)[0x4b53b5]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fdf9b22e390]
/home/tom/Desktop/BOINC/boinc[0x4a8c04]
/home/tom/Desktop/BOINC/boinc[0x41861d]
/home/tom/Desktop/BOINC/boinc[0x418c6a]
/home/tom/Desktop/BOINC/boinc[0x42cf08]
/home/tom/Desktop/BOINC/boinc[0x42d423]
/home/tom/Desktop/BOINC/boinc[0x421468]
/home/tom/Desktop/BOINC/boinc[0x4825a0]
/home/tom/Desktop/BOINC/boinc[0x407b1f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fdf9a5d2830]
/home/tom/Desktop/BOINC/boinc[0x407f71]
Exiting...
Can someone point me to a reference that will make it clearer what I need to do? I just reset the PCIe bus to Gen 1. It was on Gen2. First Boinc Manager came up with gpus missing. Then the manager restarted and is now showing them computing. I am running with -nobs. Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
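One way to get more out of a trace like this is to pull the frame addresses out of the log and hand them to `addr2line` from binutils. What follows is only a sketch: the regular expression is written against the frame format shown in the post, the helper names are hypothetical, and `addr2line` will only give useful file and line output if the `boinc` binary still carries debug symbols, which may not be the case here.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    binary: str            # path of the executable or shared library
    symbol: Optional[str]  # text inside the parentheses, if any
    address: str           # absolute address inside the square brackets


# Matches frames such as "/path/to/bin(symbol+0x65)[0x4b53b5]" and "/path/to/bin[0x4a8c04]".
FRAME_RE = re.compile(
    r"(?P<binary>/\S+?)(?:\((?P<symbol>[^)]*)\))?\[(?P<address>0x[0-9a-fA-F]+)\]"
)


def parse_frames(log_text: str) -> List[Frame]:
    """Extract every stack frame found in the log text, in order."""
    return [
        Frame(m["binary"], m["symbol"] or None, m["address"])
        for m in FRAME_RE.finditer(log_text)
    ]


def addr2line_command(frames: List[Frame], binary: str) -> List[str]:
    """Build (but do not run) an addr2line command for frames inside one binary."""
    addresses = [f.address for f in frames if f.binary == binary]
    return ["addr2line", "-f", "-C", "-e", binary] + addresses


if __name__ == "__main__":
    sample = (
        "/home/tom/Desktop/BOINC/boinc(boinc_catch_signal+0x65)[0x4b53b5] "
        "/home/tom/Desktop/BOINC/boinc[0x4a8c04]"
    )
    print(" ".join(addr2line_command(parse_frames(sample), "/home/tom/Desktop/BOINC/boinc")))
```

The sketch only builds the command for frames inside the main executable. Frames inside shared libraries such as libpthread would need the offset in parentheses rather than the absolute address, so they are left out.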
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
After spending way too much time trying to get something to work "better" I have two things I can say for sure. I should have examined #1 above more closely. When all is said and done I think I have 1 video card that died, 2 video card bases that died, and probably one 1-to-4 PCIe expander card that died. What is interesting is that NONE of the cables have died. Now I still don't know if 8 gpus are "stable" but at least I have 8 gpus executing under Seti again. I am running Lubuntu 16.x because I couldn't get 2 different ISOs of Ubuntu 16.x to install or to boot. I was using RUFUS under Windows to create my Linux flash drives. RUFUS complained about the first ISO not having all the correct versions of some programs, and the 2nd ISO, written with the latest release of RUFUS, refused to boot, hanging part of the way through the startup to install. Since one was from the main download site and the other was from an official mirror at a college/university in Oklahoma I really don't have a clue. Anyway, it's "sorta stable" (it's been running for 10 whole minutes now) :) Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
And it is a bunch cheaper (probably more common too). Since my hardware, BIOS and software seem to have "dug a hole" and are sitting immovably at 4 gpus I may take a look at this MB again. The narrow cooler seems to be a sub-$50 item. The MB I am looking at is under $300. I can't find the board Ian&SteveC are using for less than $600 at the moment. The main thing I lose is 50% of my cpus. On the other hand I have enough cores to drive at least 7 gpus :) Surely three gtx 1060 3GB's would offset the lost production of 20 threads? :) Who knows, maybe my 1-to-4 extender boards will start working again. Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
Seems your choice of motherboards was your downfall. TBar, Ian and eng4hire seem to be having good luck with crypto mining specifically designed motherboards. I have now been through Linux version 14 upgraded to 16 and then re-flashed the BIOS for #9. No change (4 gpus). Reflashed to version 1.08. Then I had to remember that 1.08 and earlier used the last PCIe x16 slot for the video output. Sigh. Still running 4 gpus. But I may take a vacation from trying to get to 5 or more for the week so that I can try to maintain my RAC. It's a hobby, it's a hobby.... It's a....... what am I doing here? I forgot :) Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
|
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 3150 Credit: 1,282,604,591 RAC: 15,062
|
my board is not a crypto board. it is a high end proprietary server board: Supermicro X9DRX+-F. my previous board (still in use in a different system now) was/is a normal mid-level ASUS consumer/enthusiast board, the ASUS PRIME Z270-P (but i never ran more than 7 GPUs in it due to lack of CPU resources when running with the -nobs option). after seeing the issues that TBar and eng4hire have had with crypto boards (duds that need RMA), i don't really have a big desire to use them. they seem to work ok once you get a good one though. rob has made several comments about some independent testing he or colleagues have done on the crypto boards, and very very few of them can pass their stress tests. they are just low quality boards in most cases. See Rob's comments here: "~3% motherboards passed" Tom's Asrock board is a server tier product, but i don't really have any experience with them. I think Tom is in the mindset that he wants to keep using this kind of platform to be able to re-use the CPUs which he bought previously. if he wants to stick to that platform, I would highly recommend a Supermicro board. I went with my current board for a similar reason: i knew i could repurpose the CPUs and RAM by sticking to a platform that supported DDR3 and v2 Xeons, and even if i needed to buy them, they are much cheaper than the DDR4/v3+ counterparts. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours
|
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
Seems your choice of motherboards was your downfall. TBar, Ian and eng4hire seem to be having good luck with crypto mining specifically designed motherboards. Sure sounds like it, except I was able to run 5 or more (up to 9) gpus under Lubuntu, although it would crash if I ran it continuously. Then I tried plain vanilla Ubuntu and suddenly I can't boot past 4. It might be that if I revert to Lubuntu and figure out a way to automate a daily reboot and start Boinc Manager, I could "make it work". More or less. On the other hand, at least two people running plain vanilla Ubuntu have been able to run more than 4 gpus. So apparently my hardware "loves" me :) I dunno. I may try for 5 gpus again today. If I get that far I may say DONE :) Ian&SteveC's latest motherboard is between $500 and $900 depending on where you buy it. And you need either a custom case or a custom open-air support base. I have an open-air base from a really cheap miner's frame/rig I bought. So I might not have to drill holes in a lucite base. I swear this is one of those aggravating two steps forward, one to five steps back situations. :) But I have all the hardware to transplant onto it. :/ Anyway, back to the "mines" (miners?). Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Keith Myers Send message Joined: 29 Apr 01 Posts: 11744 Credit: 1,160,866,277 RAC: 4,249
|
Seems your choice of motherboards was your downfall. TBar, Ian and eng4hire seem to be having good luck with crypto mining specifically designed motherboards. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
After spending way too much time trying to get something to work "better" I have a few things I can say for sure. 1) Every single PCIe slot will work when it is the only inhabited slot (tested with a known good riser card/cable/gpu). 2) I went back to the basics of using a single riser card per video card rather than the 1-to-4 expander cards. 3) I finally got 4 gpus working after burning way too many hours of testing using riser cards/cables. I'm going to bed! Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
Also try this command in the Terminal to list all the cards. Ok. Found this. And CTRL+ALT+F3 to drop to the terminal session. Will try some of this out during Tuesday maintenance. Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Keith Myers Send message Joined: 29 Apr 01 Posts: 11744 Credit: 1,160,866,277 RAC: 4,249
|
Is it possible to drop into a terminal session while it is still booting? Yes, just boot the recovery installation from the grub menu. Hit ESC just after the BIOS splash screen and choose the Advanced boot options from the menu. Boot the Recovery installation. Enable Networking to get Read/Write access and select the command terminal. Use your command terminal tools. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
what motherboard is this again? manufacturer and model. ASROCK Rack "13-157-352 SERVER_MB ASROCK : EPC602R" Part# 90-SXG0G0-A0UAYZ Update bios to latest release. "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Keith Myers Send message Joined: 29 Apr 01 Posts: 11744 Credit: 1,160,866,277 RAC: 4,249
|
With the monitor plugged into the card that gives you BIOS output and the boot load sequence until the X server loads the drivers, wait till hard drive access finishes or for the approximate time it normally takes to load the desktop. Then hit CTRL-ALT-F3 to drop to the tty console terminal. This switches you away from the X server display to a text console. Log in. Now use the terminal commands I have posted earlier to diagnose the identity of your installed cards. Determine the BusID of the card you are connected to and edit the xorg.conf file to make that card the one assigned to Monitor 0. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
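One detail that tends to trip people up at the BusID step: `lspci` prints the bus address in hexadecimal (for example `0a:00.0`), while the `BusID` entry in xorg.conf expects decimal numbers in the form `PCI:bus:device:function`. A small sketch of the conversion follows. The function name is made up for illustration, and it deliberately refuses non-zero PCI domains rather than guessing how to write them.

```python
import re

# lspci address: optional 4-digit domain, then bus:device.function, all hexadecimal.
LSPCI_ADDR_RE = re.compile(
    r"^(?:(?P<domain>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):(?P<device>[0-9a-fA-F]{2})\.(?P<function>[0-7])$"
)


def lspci_to_xorg_busid(address: str) -> str:
    """Convert an lspci address such as '0a:00.0' to an xorg.conf BusID string."""
    match = LSPCI_ADDR_RE.match(address.strip())
    if match is None:
        raise ValueError(f"not an lspci bus address: {address!r}")
    if match["domain"] is not None and int(match["domain"], 16) != 0:
        # Assumption of this sketch: only the default domain 0000 is handled.
        raise ValueError("non-zero PCI domains are not handled here")
    bus = int(match["bus"], 16)
    device = int(match["device"], 16)
    function = int(match["function"], 16)
    return f"PCI:{bus}:{device}:{function}"


if __name__ == "__main__":
    print(lspci_to_xorg_busid("0a:00.0"))
```

So a card that `lspci` lists at `0a:00.0` would go into the Device section as `BusID "PCI:10:0:0"`.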
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
I must have missed something on this. I added the mode--- thingy to grub following the directions. Didn't help. Removed that change. Then tried changing display managers. That didn't help. I will back up the thread and see if I can find any other things for tty that I missed. Tom I can drop to terminal and run the stuff you mentioned with 3 gpus. I can't see anything to "drop to terminal" when I run 4 gpus and drive the monitor from a Nvidia card. I can't get 5 gpus to boot past the display manager when using the internal gpu. Is it possible to drop into a terminal session while it is still booting? Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
Have you tried to find your monitor video on another card yet by moving the connection? Yes. Only the "#1 card" will display anything with 3 gpus. Only the number 1 card will tell the monitor it is there with 4 gpus. Otherwise the monitor goes into "power save" mode.
I must have missed something on this. I added the mode--- thingy to grub following the directions. Didn't help. Removed that change. Then tried changing display managers. That didn't help. I will back up the thread and see if I can find any other things for tty that I missed. Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Keith Myers Send message Joined: 29 Apr 01 Posts: 11744 Credit: 1,160,866,277 RAC: 4,249
|
Have you tried to find your monitor video on another card yet by moving the connection? Have you tried dropping to tty terminal for use of the commands I have suggested? Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
Do you have nomodeset in your kernel command line in the grub file? No joy :( "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
Tom M Send message Joined: 28 Nov 02 Posts: 4936 Credit: 276,046,078 RAC: 1,048 |
Do you have nomodeset in your kernel command line in the grub file? Ok. I have followed the instructions. I will be back in a few minutes one way or the other. Tom "I owe", "I owe", "Its off to work I go" (from a bumper sticker on a smallish Mercedes Benz) (on the back of a Semi Tractor) "If you can read this bumper sticker, I've LOST MY TRAILER!" |
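For the nomodeset question, a quick way to confirm whether the option actually made it into the grub file is to read the kernel command line variables out of it. This is a minimal sketch that assumes the usual Ubuntu layout, where the options live in `GRUB_CMDLINE_LINUX` and `GRUB_CMDLINE_LINUX_DEFAULT` inside `/etc/default/grub`. The helper names are hypothetical.

```python
import shlex
from typing import List

# Variables in /etc/default/grub that feed the kernel command line.
KERNEL_CMDLINE_KEYS = ("GRUB_CMDLINE_LINUX", "GRUB_CMDLINE_LINUX_DEFAULT")


def kernel_options(grub_text: str) -> List[str]:
    """Collect the kernel options set in the text of a grub defaults file."""
    options: List[str] = []
    for raw_line in grub_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() in KERNEL_CMDLINE_KEYS:
            # shlex strips the surrounding quotes; split() separates the options.
            for part in shlex.split(value):
                options.extend(part.split())
    return options


def has_nomodeset(grub_text: str) -> bool:
    """Return True if nomodeset appears among the configured kernel options."""
    return "nomodeset" in kernel_options(grub_text)


if __name__ == "__main__":
    with open("/etc/default/grub", encoding="utf-8") as handle:
        print("nomodeset configured:", has_nomodeset(handle.read()))
```

Keep in mind that on Ubuntu an edit to `/etc/default/grub` only takes effect after running `sudo update-grub` and rebooting. The options the running kernel actually received can be read from `/proc/cmdline`.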
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.