PC Build for my Dad
Author | Message |
---|---|
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yeah, that is what I am thinking I need to do. I asked in a PM whether there might have been an earlier BOINC install in a custom location or something. One issue is that Theo (the son), who I am dealing with, is not the owner (Ted, the Dad), and Ted hasn't been present. I would need Ted to be available so he could log in to his account for re-attaching to the project. I think I need to start completely from scratch. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
Has the mechanism for Ghost creation ever been figured out? Before, when the systems were seriously overcommitted and underperforming, they were each producing ghosts, although from memory not at the same time. I notice that another of their systems also has a bunch of Ghosts (the one with the Vega64). Possibly a modem/router issue? Is it now struggling with the load of those 2 ThreadRipper systems? Grant Darwin NT |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
No, the 1950X only had the expected 400 tasks after my Tuesday session. I looked at it yesterday and it still only had 400 assigned tasks, no ghosts. I didn't look at it today though before my session with the 1900X. [Edit] I just looked at the 1900X and it looks like the app_config tunings I applied are being used now. Maybe Theo rebooted the computer and the app_config file finally got read. I also see that the number of Tasks in Progress is almost exactly 400 greater at 9936 than the 9376 or whatever I started the TeamViewer session with. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
A thought ... it's being run with different user logins which cause a reload/ignore of client_state, etc. Could be it's being run from the wrong user account and not 'All Users' EDIT: Also might explain why Grant(?) mentioned it was on Lunatics, nope it's not, yes it is ... every time they switched logins ??? |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I never thought of that. I wonder if Theo is logging in under his account when I am TeamView'ing with him on the phone. And when he leaves, his Dad logs in under his account. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stargate (SA) Joined: 4 Mar 10 Posts: 1854 Credit: 2,258,721 RAC: 0 |
Possible |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
And the Ghosts on their other system? Grant Darwin NT |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Up until today, I have never seen ghosts on the 1950X Host 8371071. Every time I looked at it, it had the correct 400 Tasks in Progress. So I never worried about ghosts on that host. I only worried about the wrong applications being used and the overloaded CPU with the lousy task completion times. I fixed the apps and the overloaded condition in Tuesday's session. Today's session on the 1900X Host 8389828 was mainly to get rid of the ghosts, choose the correct apps and eliminate the overloaded CPU condition. I wasn't successful on the ghosts, obviously. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
Keith Myers wrote: "Up until today, I have never seen ghosts on the 1950X Host 8371071" I was mentioning the other, other system. The Apple one with the Vega 64 video card that has ghosts as well. Whatever is happening is affecting more than just the ThreadRipper systems, although they seem to be the most affected when it occurs. Hence my suspicion of their modem/router, as none of the systems in question at this stage are overcommitted (which the ThreadRippers were before at various times). I just find it odd that there is another system with Ghosts, and there are Ghosts again on systems that had the usual Ghost-producing performance issues resolved. Grant Darwin NT |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
I just wish I could keep my Ryzen running!!! The damn thing runs from 15m to a few days then crashes. Right now it's on stock 3400/3200MHz RAM and 20h up. Last boot it was 15 min at the same :((( And it always throws a handful of errors whenever it gets the mood to. I just wish it was reliably unreliable with different settings so I could figure it out. Getting darn tempted to pull the Gigabyte K7 and put in the cheaper ASUS B350 ROG ... I want to put my 1080s in it, but not too productive if it won't frigging run. The other computer is rock solid stable. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
Brent Norman wrote: "I just wish I could keep my Ryzen running!!! The damn thing runs from 15m to a few days then crashes. Right now it's on stock 3400/3200MHz RAM and 20h up. Last boot it was 15 min at the same :((( And it always throws a handful of errors whenever it gets the mood to." Mem test, different PSU? CPU temps? Any BIOS updates addressing current issues? Grant Darwin NT |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Don't think it's temp, 60C (I think) w/120 AIO which is only lukewarm o/p. 140mm front and back on board just to make sure, backplate is cold. RAM has been fine for any 2h tests I've done, was going to try it for a day or more and see. Best I got was 6d with original K4 BIOS @ 3.7/3333 (I think). I need to recheck BIOS1 setting w/K4. I'm on BIOS2 now with the updated K10 BIOS. It's a new 1200W PSU with ~225 load, but yeah I guess it's never been tested yet as known good ... |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13746 Credit: 208,696,464 RAC: 304 |
Brent Norman wrote: "It's a new 1200W PSU with ~225 load, but yeah I guess it's never been tested yet as known good ..." Known good would be a good start, or try it with a couple of video cards loaded up in there. A PSU of that caliber might be OK with such a low load, but switchmode PSUs, if not carefully designed for such low loads (i.e. less than 25%), can have stability issues. EDIT: found this reference: "Most switched mode power supplies are regulated. So as the load is reduced the controller will reduce the pulse width and hence the duty cycle in an attempt to maintain the output voltage." https://electronics.stackexchange.com/questions/80547/operating-a-switched-mode-power-supply-without-a-load Grant Darwin NT |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
It's worth a try ... a known good 750W PSU, UPS not just surge, 3.7/3333MHz, 65C w/2x120mm fans on AIO now. 1.75h and still up, will have to see ... Shaved 4m off CPU tasks from 3.4-3.7GHz, now @50m. Will return this thread back to its regular programming ... |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Brent Norman wrote: "I just wish I could keep my Ryzen running!!! The damn thing runs from 15m to a few days then crashes. Right now it's on stock 3400/3200MHz RAM and 20h up. Last boot it was 15 min at the same :((( And it always throws a handful of errors whenever it gets the mood to." What kind of crashes? Do you get memory errors that can be looked at in Event Viewer or BlueScreenView that freeze the system or green screen? Or do you get black screens where the display is blank and unresponsive, leaves no traces or residues and needs to have the system rebooted with a reset or power down? Anything that logs memory errors can be fixed with better memory or higher voltages on the memory. Black screens are ALWAYS a sign of inadequate Vcore which requires more CPU core voltage, a bump in Vsoc, higher LLC, better power delivery or all of the above. That said, Gigabyte boards are high on the list of mentioned boards that have been uncooperative in the forums. The ASUS B350 ROG has pretty good reviews in the forums. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Brent Norman wrote: "Don't think it's temp, 60C (I think) w/120 AIO which is only lukewarm o/p. 140mm front and back on board just to make sure, backplate is cold." Back the memory off to 3200. Have you used any of the Ryzen memory tools? It's only the top 25% of forum users that are able to get stable memory clocks above 3200 MHz. The main 50% of forum users can use 3200 with the latest BIOSes across all board manufacturers. The bottom 25% can only run 2400 or 2666 memory, mainly because of poor memory choice. Ryzen DRAM Calculator 0.9.9 v11 Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Brent Norman wrote: "I just wish I could keep my Ryzen running!!! The damn thing runs from 15m to a few days then crashes. Right now it's on stock 3400/3200MHz RAM and 20h up. Last boot it was 15 min at the same :((( And it always throws a handful of errors whenever it gets the mood to." I just noticed the link in the post to the errored tasks. SIGILL and SIGSEGV violations. The minimum Linux kernel that is mostly compatible with Ryzen is supposed to be 4.10. The only kernels certified 100% compatible with Ryzen are supposed to be > 4.14. I am still getting occasional SIGSEGV errors too on my Linux Ryzen. I had fewer on the 4.10 kernel. I seem to be getting more now on the new security-patched 4.13. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I believe the SIGSEGV errors on my Linux box are caused by nvidia-settings. I have had the "Ubuntu has experienced an error" message window a few times and looking at the details of the report, nvidia-settings is the culprit of the SIGSEGV errors. That program just updated after the security flaw fix to 390.12. Didn't have the error when it was back at 384.98. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
I C U Joined: 4 Apr 07 Posts: 5 Credit: 277,443 RAC: 0 |
Keith wrote: "Up until today, I have never seen ghosts on the 1950X Host 8371071" Grant (SSSF) wrote: "I was mentioning the other, other system. The Apple one with the Vega 64 video card that has ghosts as well." ...just to throw in some ideas... On Linux, you basically run BOINC as one user under one account (client app), and the manager is a different program giving direction to the client. I recall trying out Windows BOINC a few years ago, and it basically ran as a screensaver. If the Windows BOINC is a screensaver app, and you have multiple user accounts, could it be possible to create ghost accounts? The other possibility that may be worth checking is if the computers are named the same, or if they have different host names. The SETI server might be confusing similar computers as the same machine (since all Jeyl's computers are probably seen as on the same router IP address). |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Some answers ... - It has been up for 21h now since the PSU replacement with 6 new errors. - 99% of the time it is just a freeze up, normal screen, nothing logged, no sign of what happened. I think it only rebooted once on its own. - LOL, I changed to the Gigabyte board when you said you didn't like the B350, and by the video you sent me I picked the GB K7 which was said to be a good OC board. - I haven't noted any difference in stability really between 3200/3333. I wish I had a Windows machine with DDR4 to read the RAM, but I don't. I would have to clone my Win8.1 and run it offline, but my backup drive is kind of tied up with my Aunt's Mac backups at the moment. - I have seen the SIGSEGV errors on Ubuntu 16, 14 and Mint 17. Mint 17 would be the only one that had the newest drivers released this week. The Mint 17 is a new install just turned 4 days old on SSD ... I killed my USB Ubuntu 16/14 kernel files whenever I tried to push 3.9G. They can probably be repaired ... but oh well, no biggie. I was thinking it might be something to do with USB sticks, but no. - I don't think it is NVidia drivers, since it carried on thru 3 different OSes, 2 without the latest updates. My guess would be hardware. |
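
Grant's point earlier in the thread about a lightly loaded switchmode PSU comes down to simple arithmetic, sketched below. This is only an illustration: it assumes the ~225 figure Brent quoted is watts of load, and it treats the "less than 25%" remark as a rough rule of thumb, not a specification. The names `load_fraction`, `is_lightly_loaded` and `LOW_LOAD_THRESHOLD` are made up for this example.

```python
# Rough check of how lightly a PSU is loaded.
# Assumptions (not confirmed in the thread): the ~225 figure is watts,
# and 25% is just the rule-of-thumb threshold mentioned in the discussion.

LOW_LOAD_THRESHOLD = 0.25  # hypothetical constant for illustration


def load_fraction(load_watts: float, rated_watts: float) -> float:
    """Return the load as a fraction of the PSU's rated output."""
    if rated_watts <= 0:
        raise ValueError("rated_watts must be positive")
    return load_watts / rated_watts


def is_lightly_loaded(load_watts: float, rated_watts: float) -> bool:
    """True when the load falls below the rule-of-thumb threshold."""
    return load_fraction(load_watts, rated_watts) < LOW_LOAD_THRESHOLD


if __name__ == "__main__":
    # Compare the original 1200W unit with the 750W replacement.
    for rated in (1200, 750):
        frac = load_fraction(225, rated)
        print(f"{rated}W PSU: {frac:.0%} load, lightly loaded: {is_lightly_loaded(225, rated)}")
```

On those assumptions the 1200W unit sits at roughly 19% of its rating, under the 25% mark, while the 750W replacement is at 30%. That says nothing about whether the PSU was actually the cause; it only shows why the swap was a reasonable thing to test.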