Setiathome crashing on new system

Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1956739 - Posted: 22 Sep 2018, 5:10:17 UTC

I was looking at the cpu/gpu specs and I think your gpu is either internal to the cpu or at least internal to the Motherboard?

If the GPU has started processing "ok", I am wondering: when the GPU is running at 100%, what are the CPU cores doing? Are they running at 22% or so? And if the GPU is paused, do the CPU cores jump up to 100%?

Thank you for the additional information.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1956739 · Report as offensive
James

Send message
Joined: 20 Oct 07
Posts: 13
Credit: 4,749,353
RAC: 51
Australia
Message 1956775 - Posted: 22 Sep 2018, 12:44:57 UTC - in response to Message 1956708.  

With an overheating CPU I would expect problems within a minute or so of it starting crunching. With the PSU giving problems, it would take several minutes, although Thermaltake is a reasonable brand. Have you got any USB devices connected to the computer (printer, thumb drive, wireless dongle, etc.)? If so, unplug them & see if that helps.

Is the system freezing up, giving a Blue Screen of Death, or is it just re-booting?
As others have suggested, check the system error logs to see if there are any messages there that give a hint as to the cause.

A programme such as CPUID CPU-Z will allow you to monitor CPU & system temperatures, as well as CPU voltages. Do you have a multimeter? If so, you could check the rails from the power supply.
You can also run Win10's memory diagnostics to check for memory problems.


The system was freezing up (blank screen)... no BSOD or rebooting.

Strangely, or happily, BOINC appears to be running without issue (no crashes)... I've been at work for the last 8 hours and Windows Event Viewer has not recorded any Critical errors.
ID: 1956775 · Report as offensive
James

Send message
Joined: 20 Oct 07
Posts: 13
Credit: 4,749,353
RAC: 51
Australia
Message 1956776 - Posted: 22 Sep 2018, 12:53:03 UTC - in response to Message 1956730.  

The Stock Heatsink and Fan for both Intel AND AMD are woefully inadequate... More-so for AMD. I just retired an AMD A6-6400K APU setup

For the older AMD CPUs, yes the stock coolers were poor.
For the current series, they are actually rather effective and an aftermarket unit is really only needed for overclocking.
An aftermarket cooler will be better over the long term for full-blown Seti crunching, but it's not necessary like it was with many previous CPUs and coolers.


I agree, the current stock AMD coolers are quite good (the CPU sits at 30C with normal computer use). My old AMD A10 APU stock cooler was crap and I did replace it.
ID: 1956776 · Report as offensive
James

Send message
Joined: 20 Oct 07
Posts: 13
Credit: 4,749,353
RAC: 51
Australia
Message 1956777 - Posted: 22 Sep 2018, 12:57:04 UTC - in response to Message 1956734.  

I see you have reported 42 tasks in so far http://setiathome.berkeley.edu/results.php?hostid=8535523&offset=0&show_names=0&state=4&appid=
One is a common Work Unit header Error, but the others are fine so far.
What did you find to get it to run?


I actually don't know what is going on. BOINC appears to be running without crashing the system (it's been enabled for the last 12 hours). I last tested BOINC about 7 days ago... and it crashed my computer then.

So I'm scratching my head trying to figure out what is different, from a software point of view, from a week ago...

??
ID: 1956777 · Report as offensive
James

Send message
Joined: 20 Oct 07
Posts: 13
Credit: 4,749,353
RAC: 51
Australia
Message 1956778 - Posted: 22 Sep 2018, 12:59:52 UTC - in response to Message 1956739.  

I was looking at the cpu/gpu specs and I think your gpu is either internal to the cpu or at least internal to the Motherboard?

If the GPU has started processing "ok", I am wondering: when the GPU is running at 100%, what are the CPU cores doing? Are they running at 22% or so? And if the GPU is paused, do the CPU cores jump up to 100%?

Thank you for the additional information.

Tom


I believe the AMD GPU is on the CPU chip
ID: 1956778 · Report as offensive
James

Send message
Joined: 20 Oct 07
Posts: 13
Credit: 4,749,353
RAC: 51
Australia
Message 1956782 - Posted: 22 Sep 2018, 13:11:11 UTC - in response to Message 1956739.  

I was looking at the cpu/gpu specs and I think your gpu is either internal to the cpu or at least internal to the Motherboard?

If the GPU has started processing "ok", I am wondering: when the GPU is running at 100%, what are the CPU cores doing? Are they running at 22% or so? And if the GPU is paused, do the CPU cores jump up to 100%?

Thank you for the additional information.

Tom


I have just enabled my CPU & GPU to run constantly - CPU is running at 85-95% & GPU running @ 85-98%
ID: 1956782 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1956796 - Posted: 22 Sep 2018, 14:46:10 UTC - in response to Message 1956782.  

I don't have a good answer about what is different from last week but I will offer the following information.

On Nvidia-based computing under Windows 10, the community has regularly suggested that the person with a problem download a recent version of the drivers directly from Nvidia and do what is called a "clean" install.

Windows users regularly get a video driver from Microsoft that is slightly incompetent or damaged. It causes processing to stop, or to slow down etc.

It is possible that Win10 updated your video driver from Microsoft.

You might investigate that idea keeping in mind that in the past the AMD drivers have also been problematic. Normally I would say go ahead and upgrade to AMD's latest and greatest.

But I am also a fan of "if it ain't broke, don't fix it." Which means, if it is running "fine" now, maybe we don't touch anything :)


Tom
A proud member of the OFA (Old Farts Association).
ID: 1956796 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1956802 - Posted: 22 Sep 2018, 15:18:42 UTC

I have been looking at the GPU tasks that James' system has been processing. The extremely small sample makes it look like his system is processing them slower than the CPU cores are processing theirs. I am pretty sure this next advice can't "break" anything (as in start causing things to crash), though it could cause slower GPU processing.

I looked at the documentation for AMD GPUs and, assuming this is an entry-level card, they suggested something like the line below for the "mb-commandline*.txt" file in the \ProgramData\BOINC\projects\Setiathome folder. There should be an empty copy of this file already present in that folder. There is usually a version of this file for each type of GPU task you have processed. As far as I can tell, you would put the same thing in each file.

-sbs 192 -spike_fft_thresh 2048 -tune 1 2 1 16

I am not sure about the -sbs #. They suggested the 192 for multiple tasks. There was no suggestion for single task gpu's.
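If there are several of these files, a small script can put the same line into each of them. This is only a minimal sketch in Python: the file pattern and the command line are copied from the post above, the function name `apply_cmdline` is made up for illustration, and the folder has to be supplied by whoever runs it.

```python
from pathlib import Path

# Command line exactly as suggested in the post; the values are not tuned here.
CMDLINE = "-sbs 192 -spike_fft_thresh 2048 -tune 1 2 1 16"


def apply_cmdline(folder, cmdline=CMDLINE, pattern="mb-commandline*.txt"):
    """Write the same command line into every matching file in `folder`.

    Returns the names of the files that were rewritten, in sorted order.
    Files that do not match the pattern are left untouched.
    """
    written = []
    for path in sorted(Path(folder).glob(pattern)):
        path.write_text(cmdline, encoding="ascii")
        written.append(path.name)
    return written
```

It would be called with the project folder mentioned above, for example `apply_cmdline(r"C:\ProgramData\BOINC\projects\Setiathome")`, where the drive letter is an assumption. Keeping a copy of the original (empty) files first makes it easy to undo.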

Another thing you can "try" is devoting a complete CPU core to driving your video card. You would place a file called "app_config.xml" in the \ProgramData\BOINC\projects\Setiathome folder. I have had trouble with Notepad naming the saved file "app_config.xml.txt", so try saving it with the file type set to "All Files", or rename it afterwards with file extensions displayed (a View option).

<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>


I hope this is helpful and doesn't screw up anything.
You can temporarily disable either file by renaming it to anything but the original name. I like to add "_stop" to remind me that it is not currently enabled.
You can use the "Read config files" command in the BOINC Manager to make the changes start/stop.
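The renaming trick, and the Notepad ".txt" problem mentioned earlier, can both be handled with a few lines of Python. This is a rough sketch only: the helper names `disable`, `enable` and `fix_notepad_extension` are invented for illustration, and the "_stop" suffix is simply the habit described above.

```python
from pathlib import Path


def disable(path, suffix="_stop"):
    """Rename e.g. app_config.xml to app_config.xml_stop so it is ignored."""
    p = Path(path)
    target = p.with_name(p.name + suffix)
    p.rename(target)
    return target


def enable(path, suffix="_stop"):
    """Undo disable(): strip the suffix so the original name is restored."""
    p = Path(path)
    if not p.name.endswith(suffix):
        raise ValueError(f"{p.name} does not end with {suffix!r}")
    target = p.with_name(p.name[: -len(suffix)])
    p.rename(target)
    return target


def fix_notepad_extension(folder, name="app_config.xml"):
    """If Notepad saved name + '.txt', rename it to the intended name.

    Returns True if a rename happened, False otherwise. An existing
    correctly named file is never overwritten.
    """
    wrong = Path(folder) / (name + ".txt")
    right = Path(folder) / name
    if wrong.exists() and not right.exists():
        wrong.rename(right)
        return True
    return False
```

After any of these renames the client still has to be told to re-read its configuration before the change takes effect.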

Tom
A proud member of the OFA (Old Farts Association).
ID: 1956802 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Avatar

Send message
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1956805 - Posted: 22 Sep 2018, 15:27:49 UTC
Last modified: 22 Sep 2018, 15:28:23 UTC

Since it's a built-in GPU, I would rather disable GPU crunching and see if this helps.
You probably also have to disable one CPU core in BOINC.


With each crime and every kindness we birth our future.
ID: 1956805 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1956814 - Posted: 22 Sep 2018, 16:18:19 UTC - in response to Message 1956805.  
Last modified: 22 Sep 2018, 16:18:57 UTC

Since it's a built-in GPU, I would rather disable GPU crunching and see if this helps.
You probably also have to disable one CPU core in BOINC.


I'm confused. How would disabling GPU crunching show if that would speed up the GPU crunching?

When I previously asked about the processing level with the GPU running at full speed, I was investigating whether an APU issue in the A4 to A10 series had reared its ugly head here. I used to regularly see the CPU cores running at 21% load in Task Manager when the GPU was running at 100%. Petri investigated that issue thoroughly and posted his results someplace on the Lunatics website. His result was that dropping 1 of the A-series CPU cores offline allowed the other 3 to crunch at full speed while the GPU was also crunching at full speed.

James reported that both the cpu and the gpu appeared to be running at 90+ % loads so I am assuming that scenario doesn't apply.

I can see the possibility that dropping one CPU core offline might speed up the whole machine's production.
If the GPU is slowing down the rest of the system, I can see that dropping it offline would increase the whole system's production.

Sigh. Lots of testing to see if any other combination proposed speeds up the whole system production.

I was trying to see if anything would speed up the Gpu crunching to be faster than the currently reported Cpu crunching. Even with "low-end" cards, often the card is still running at something like a task every half hour. Unless the cpu is very fast/high powered they are going to be running cpu tasks every hour to 3 hours or so.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1956814 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1956817 - Posted: 22 Sep 2018, 16:40:44 UTC - in response to Message 1956796.  

But I am also a fan of "if it ain't broke, don't fix it." Which means, if it is running "fine" now, maybe we don't touch anything :)
BS

You can't keep your hands off a working computer until it IS broken!
ID: 1956817 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1956830 - Posted: 22 Sep 2018, 19:19:09 UTC - in response to Message 1956817.  

But I am also a fan of "if it ain't broke, don't fix it." Which means, if it is running "fine" now, maybe we don't touch anything :)
BS

You can't keep your hands off a working computer until it IS broken!


It is clear I am not clear here (sorry for the rythm).

I should have qualified it because all the other evidence I have ever provided here is under my Hobbyist hat.

In a business setting where "real" money and customer service are the front-ranked priorities, "fixing what is currently working adequately" is a non-starter. After getting caught as a front-line person in a "system upgrade" that slowed an entire dispatch and communications system (via satellite) down by 100-200%, I am VERY fond of pilot testing before scaling up to the production level.

I plead guilty to tinkering as a hobbyist. Otherwise where is the fun?

Tom
A proud member of the OFA (Old Farts Association).
ID: 1956830 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1956832 - Posted: 22 Sep 2018, 19:39:10 UTC

I just got done reading a review from Tom's Hardware (that wasn't about RTX 2080 gpus :) that generates some ideas about one simple change to experiment with (more later).

https://www.tomshardware.com/reviews/amd-ryzen-3-2200g-raven-ridge-cpu,5472.html

As part of the discussion, the article raises the issue that when both the CPU and GPU are fully engaged there are going to be performance limitations because of the bus speed and the speed and number of channels of the memory.

So James, one question. You said you have 8 GB of memory. I forgot to ask, is that 1 stick or 2 sticks of ram? Can you tell if you have dual channel memory if it is 2 sticks?

The reason I am asking is because two channel memory (2 or 4 etc. sticks apparently?) will speed up this kind of integrated cpu/gpu system. This presumes the motherboard supports dual or higher channel memory.

The "simple" change I would propose is after you have run your system long enough, without any material changes, to establish a baseline of RAC/Performance, you might try using the Boinc Manager local configuration menu to drop 1 of your cpu cores offline.

If this works, the other 3 cores would start processing at 100% and your gpu would start processing at 100% and your RAC would start going up.

A preliminary indicator would be that the length of time your tasks are taking declines, especially on the GPU. When the length of time tasks take starts changing, you cannot depend on the "estimated time to completion" column. You need to watch a couple of tasks run from 99% to 100%. The time listed briefly at 100% is what you are interested in. Has it changed down/up from the estimated time?
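Once a few finished times have been noted down before and after the change, the comparison is simple arithmetic. Here is a minimal sketch in Python, assuming the times are written as "hh:mm:ss"; the function names are hypothetical, and any sample times fed to it are whatever was observed on the machine.

```python
from statistics import mean


def to_seconds(hms):
    """Convert an 'hh:mm:ss' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def percent_change(before, after):
    """Percent change in mean finished time; negative means tasks got faster."""
    mean_before = mean(to_seconds(t) for t in before)
    mean_after = mean(to_seconds(t) for t in after)
    return (mean_after - mean_before) / mean_before * 100.0
```

With only a couple of tasks per side the result is noisy, since task length varies from one work unit to the next, so it is a hint rather than a measurement.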

If you decide to continue running your system 24/7 this will give you something to look forward to after coming home from work :)

Tom
A proud member of the OFA (Old Farts Association).
ID: 1956832 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1956848 - Posted: 22 Sep 2018, 21:11:36 UTC

It is clear I am not clear here (sorry for the rythm).

Tom, does that mean you can't find the rhythm ??? Boom - shaka laka boom - shaka laka ;^}

Or that you didn't have the time to do the rhyme?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1956848 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1956849 - Posted: 22 Sep 2018, 21:14:53 UTC - in response to Message 1956848.  


Tom, does that mean you can't find the rhythm ??? Boom - shaka laka boom - shaka laka ;^}

https://youtu.be/83nFiPoSuzU?t=7s
ID: 1956849 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1956852 - Posted: 22 Sep 2018, 21:46:52 UTC

+1 love it!.. Never seen that video.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1956852 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1956855 - Posted: 22 Sep 2018, 22:08:05 UTC - in response to Message 1956814.  

Since it's a built-in GPU, I would rather disable GPU crunching and see if this helps.
You probably also have to disable one CPU core in BOINC.

I'm confused. How would disabling GPU crunching show if that would speed up the GPU crunching?

It wouldn't.
But it would allow the system to produce more work as CPUs with onboard GPUs have to share memory, heat & power limits between them. The end result is that just running all CPU cores will produce more work than running the CPU cores & the GPU as well. Hence disabling GPU crunching to improve overall throughput.
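One way to check this on a given machine is to compare tasks per day for the two setups. A rough sketch in Python follows, with hypothetical function names; the hours-per-task figures passed in have to come from the host's own results, and it assumes CPU and GPU tasks are worth roughly the same.

```python
def tasks_per_day(hours_per_task, workers=1):
    """Tasks finished in 24 hours by `workers` devices at the same pace."""
    return workers * 24.0 / hours_per_task


def system_throughput(cpu_cores, cpu_hours_per_task, gpu_hours_per_task=None):
    """Total tasks per day for CPU cores plus an optional single GPU."""
    total = tasks_per_day(cpu_hours_per_task, cpu_cores)
    if gpu_hours_per_task is not None:
        total += tasks_per_day(gpu_hours_per_task)
    return total
```

For example, with made-up numbers, 4 cores at 2 hours per task give `system_throughput(4, 2.0)`, which is 48 tasks a day, while 3 cores slowed to 3 hours per task plus a GPU at 4 hours per task give `system_throughput(3, 3.0, 4.0)`, which is 30. Whether the real figures on this APU look anything like that is exactly what the test would show.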
Grant
Darwin NT
ID: 1956855 · Report as offensive
James

Send message
Joined: 20 Oct 07
Posts: 13
Credit: 4,749,353
RAC: 51
Australia
Message 1956915 - Posted: 23 Sep 2018, 12:33:09 UTC - in response to Message 1956832.  
Last modified: 23 Sep 2018, 12:37:07 UTC

I just got done reading a review from Tom's Hardware (that wasn't about RTX 2080 gpus :) that generates some ideas about one simple change to experiment with (more later).

https://www.tomshardware.com/reviews/amd-ryzen-3-2200g-raven-ridge-cpu,5472.html

As part of the discussion, the article raises the issue that when both the CPU and GPU are fully engaged there are going to be performance limitations because of the bus speed and the speed and number of channels of the memory.

So James, one question. You said you have 8 GB of memory. I forgot to ask, is that 1 stick or 2 sticks of ram? Can you tell if you have dual channel memory if it is 2 sticks?

The reason I am asking is because two channel memory (2 or 4 etc. sticks apparently?) will speed up this kind of integrated cpu/gpu system. This presumes the motherboard supports dual or higher channel memory.

The "simple" change I would propose is after you have run your system long enough, without any material changes, to establish a baseline of RAC/Performance, you might try using the Boinc Manager local configuration menu to drop 1 of your cpu cores offline.

If this works, the other 3 cores would start processing at 100% and your gpu would start processing at 100% and your RAC would start going up.

A preliminary indicator would be that the length of time your tasks are taking declines, especially on the GPU. When the length of time tasks take starts changing, you cannot depend on the "estimated time to completion" column. You need to watch a couple of tasks run from 99% to 100%. The time listed briefly at 100% is what you are interested in. Has it changed down/up from the estimated time?

If you decide to continue running your system 24/7 this will give you something to look forward to after coming home from work :)

Tom


1 stick of 8GB DDR4 RAM, single channel
&
when BOINC is running, only 3 CPU cores and the GPU appear to be active (on the Tasks tab)
ID: 1956915 · Report as offensive
James

Send message
Joined: 20 Oct 07
Posts: 13
Credit: 4,749,353
RAC: 51
Australia
Message 1956916 - Posted: 23 Sep 2018, 12:38:15 UTC - in response to Message 1956855.  

Since it's a built-in GPU, I would rather disable GPU crunching and see if this helps.
You probably also have to disable one CPU core in BOINC.

I'm confused. How would disabling GPU crunching show if that would speed up the GPU crunching?

It wouldn't.
But it would allow the system to produce more work as CPUs with onboard GPUs have to share memory, heat & power limits between them. The end result is that just running all CPU cores will produce more work than running the CPU cores & the GPU as well. Hence disabling GPU crunching to improve overall throughput.


I'll give this idea a go....

No crashes today with BOINC running... maybe a driver update has fixed my issue? Win 10 did do a big update last week.
ID: 1956916 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1956919 - Posted: 23 Sep 2018, 13:29:50 UTC - in response to Message 1956915.  


1 stick of 8GB DDR4 RAM, single channel
&
when BOINC is running, only 3 CPU cores and the GPU appear to be active (on the Tasks tab)


Here is where research would be important before you were to buy a second stick of memory.

1) Does the MB have 1, 2 or 4 (?) memory channels?
2) If you put single channel memory into memory slots that access different channels of memory on the MB does that allow "dual channel" access? Or must you buy dual channel memory?
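To put a number on why the channel count matters for a shared-memory CPU/GPU, the theoretical peak bandwidth can be worked out from the transfer rate. This is a small sketch in Python, assuming the usual 64-bit (8-byte) bus per channel; DDR4-2400 is used only as an illustrative speed, since the actual speed of the stick is not given in this thread.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, channels=1, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB).

    mega_transfers_per_s is the DDR rating, e.g. 2400 for DDR4-2400.
    """
    return mega_transfers_per_s * bus_bytes * channels / 1000.0


single = peak_bandwidth_gb_s(2400, channels=1)  # about 19.2 GB/s
dual = peak_bandwidth_gb_s(2400, channels=2)    # about 38.4 GB/s
```

These are ceilings rather than measured figures, and they only apply if the motherboard and the slots used actually run in dual-channel mode.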

I will confess to not having studied that. I do know that I was "tinkering" with a Netbook many years ago (yes it was running Seti) and it turned out to have a single channel MB (and one memory slot).

As for the effect of the system self throttling vs. using Boinc or a parameter file to idle one cpu core, I have no clue if one would be better than the other.

What would be interesting is to try to get an idea of either the theoretical or the practical production your gpu "should" be producing and see what changes would cause that to happen without slowing down your cpu production.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1956919 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.