Blue Screen:REFERENCED_BY_POINTER - what could be the reason?

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1301608 - Posted: 3 Nov 2012, 10:12:33 UTC
Last modified: 3 Nov 2012, 10:12:45 UTC

My main cruncher recently started rebooting by itself.
When I disabled auto-restart, I saw a BSoD with the REFERENCE_BY_POINTER bugcheck message.
AFAIK no system-wide updates were done immediately before these reboots began.

Has anyone encountered this type of BSoD, and what was the cause? (I read the MS article about a ref/deref mismatch, but I don't know which driver causes it.)
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1301608 · Report as offensive
Highlander
Joined: 5 Oct 99
Posts: 167
Credit: 37,987,668
RAC: 16
Germany
Message 1301615 - Posted: 3 Nov 2012, 10:31:19 UTC

Only one question, Raistmer: do you use an NDAS?
- Performance is not a simple linear function of the number of CPUs you throw at the problem. -
ID: 1301615 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1301618 - Posted: 3 Nov 2012, 10:47:42 UTC - in response to Message 1301615.  

Only one question, Raistmer: do you use an NDAS?

If NDAS means Network Direct Attached Storage, then no, I don't use one.

SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1301618 · Report as offensive
Lazydude
Volunteer tester

Joined: 17 Jan 01
Posts: 45
Credit: 96,158,001
RAC: 136
Sweden
Message 1301622 - Posted: 3 Nov 2012, 10:54:46 UTC - in response to Message 1301608.  

If you have dump files enabled:
This is a set of tools for analysing what happened.
Look at "WhoCrashed".

http://www.resplendence.com/downloads

Hope this will help you
ID: 1301622 · Report as offensive
Morten Ross
Volunteer tester
Joined: 30 Apr 01
Posts: 183
Credit: 385,664,915
RAC: 0
Norway
Message 1301660 - Posted: 3 Nov 2012, 13:22:35 UTC

First of all, you should ensure that only minidumps are created, as a full memory dump is not necessary unless a third party needs it to create a fix.

I've used BlueScreenView for many years, as it gets straight to the point and quickly shows which drivers were loaded.

Some more URIs on the topic here on MSDN.
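
For anyone who wants to check the dump type without clicking through the Startup and Recovery dialog: Windows stores it in the registry under `HKLM\SYSTEM\CurrentControlSet\Control\CrashControl`, in the `CrashDumpEnabled` value (3 means small memory dump), next to `AutoReboot`. Below is a minimal Python sketch, not a finished tool. The helper names (`describe_dump_mode`, `is_minidump_only`, `read_crash_control`) are made up for illustration, and the registry read only works on Windows.

```python
import sys

# Registry key that holds the crash dump settings on Windows.
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Documented meanings of the CrashDumpEnabled value.
CRASH_DUMP_MODES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
    7: "automatic memory dump",
}


def describe_dump_mode(value):
    """Return a readable name for a CrashDumpEnabled value."""
    return CRASH_DUMP_MODES.get(value, f"unknown ({value})")


def is_minidump_only(value):
    """True when the system is set to write small memory dumps."""
    return value == 3


def read_crash_control():
    """Read (CrashDumpEnabled, AutoReboot) from the registry. Windows only."""
    import winreg  # imported here so the module still loads on other systems

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        dump_mode, _ = winreg.QueryValueEx(key, "CrashDumpEnabled")
        auto_reboot, _ = winreg.QueryValueEx(key, "AutoReboot")
    return dump_mode, auto_reboot


if __name__ == "__main__":
    if sys.platform == "win32":
        mode, reboot = read_crash_control()
        print("Dump type:", describe_dump_mode(mode))
        print("Auto-restart after crash:", "on" if reboot else "off")
    else:
        print("Registry check skipped: not running on Windows.")
```

With auto-restart off and the dump type on minidump, the next crash should leave a small .dmp file that tools like BlueScreenView or WhoCrashed can read.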
Morten Ross
ID: 1301660 · Report as offensive
Profile janneseti
Joined: 14 Oct 09
Posts: 14106
Credit: 655,366
RAC: 0
Sweden
Message 1301684 - Posted: 3 Nov 2012, 14:48:13 UTC - in response to Message 1301608.  

If it is your computer with the AMD Radeon HD 6900 series card that is malfunctioning, it may be the same problem I have had for a long time.
It started with Catalyst version 12.4, which made my computer BSOD occasionally with the error message REFERENCE_BY_POINTER.
But since I recently upgraded to version 12.10, the problem seems to have disappeared.
I have now been using that driver for over a week without a BSOD.
ID: 1301684 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1301813 - Posted: 3 Nov 2012, 21:01:02 UTC

Thanks all for the suggestions.
I will try them.
Regarding the HD6950: yes, it's that host. But the driver was installed a long time ago (Cat 12.6). Some time ago there were frequent reboots; when I reduced the number of simultaneously running tasks to 1, they went away. Recently I set it to 2 simultaneous tasks again, and that config worked for a week or even more. But now the restarts (and BSoDs, now that auto-restart is disabled) happen too often to do anything useful on that host.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1301813 · Report as offensive
Profile SciManStev
Volunteer tester
Joined: 20 Jun 99
Posts: 6651
Credit: 121,090,076
RAC: 0
United States
Message 1301833 - Posted: 3 Nov 2012, 22:04:07 UTC

When that happened to me this summer, it was because I had run the CPU at 1.47 VDC for about 3 years at 4.2 GHz, and eventually fried the CPU. I originally thought it was a Windows update, then the hard drive, then the RAM, then the motherboard. I knew it wasn't the PSU, so all that was left was the CPU. I replaced the i7 980 with an old i7 950, and it worked just fine. I really do miss that unlocked multiplier though....

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1301833 · Report as offensive
Profile janneseti
Joined: 14 Oct 09
Posts: 14106
Credit: 655,366
RAC: 0
Sweden
Message 1301852 - Posted: 3 Nov 2012, 23:05:13 UTC - in response to Message 1301813.  

Try upgrading your drivers.
You are using OpenCL 1.2 AMD-APP (938.1), which in my case caused problems.
I'm using OpenCL 1.2 AMD-APP (1016.4) right now without problems.
ID: 1301852 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1301859 - Posted: 3 Nov 2012, 23:22:58 UTC - in response to Message 1301833.  

When that happened to me this summer, it was because I had run the CPU at 1.47 VDC for about 3 years at 4.2 GHz, and eventually fried the CPU. I originally thought it was a Windows update, then the hard drive, then the RAM, then the motherboard. I knew it wasn't the PSU, so all that was left was the CPU. I replaced the i7 980 with an old i7 950, and it worked just fine. I really do miss that unlocked multiplier though....

Steve


It's a Q35M motherboard without any overclocking features... And the stock fan was replaced with a Titan long ago. CPU temperature is quite low and, moreover, the host has had no CPU work for the last few days... and it still hangs. So, probably something else in my case.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1301859 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1301860 - Posted: 3 Nov 2012, 23:24:21 UTC - in response to Message 1301852.  
Last modified: 3 Nov 2012, 23:26:04 UTC

Try upgrading your drivers.
You are using OpenCL 1.2 AMD-APP (938.1), which in my case caused problems.
I'm using OpenCL 1.2 AMD-APP (1016.4) right now without problems.


I will leave it as is overnight with minidumps enabled. Maybe I will catch the faulty AMD driver and raise the case with their support once again...
Then I will do the update.
[Not sure whether it should be an update or a rollback, though. I just discovered that the app built with the old APP SDK 2.5 is faster than the one built with APP SDK 2.6 ... And it will support many more hosts ...]
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1301860 · Report as offensive
Profile Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1301869 - Posted: 3 Nov 2012, 23:56:51 UTC - in response to Message 1301860.  

Try upgrading your drivers.
You are using OpenCL 1.2 AMD-APP (938.1), which in my case caused problems.
I'm using OpenCL 1.2 AMD-APP (1016.4) right now without problems.


I will leave it as is overnight with minidumps enabled. Maybe I will catch the faulty AMD driver and raise the case with their support once again...
Then I will do the update.
[Not sure whether it should be an update or a rollback, though. I just discovered that the app built with the old APP SDK 2.5 is faster than the one built with APP SDK 2.6 ... And it will support many more hosts ...]


And it will cause much more trouble because of all those buggy drivers.



With each crime and every kindness we birth our future.
ID: 1301869 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1302478 - Posted: 5 Nov 2012, 14:14:33 UTC
Last modified: 5 Nov 2012, 14:17:04 UTC

It looks like the motherboard of that host is completely dead now.
After another BSoD it refused to reboot.
It starts and stops the CPU and system fans in a loop.
I have already removed everything but the CPU itself... I even replaced the PSU with an old one. Same behavior.
Now, when the PSU is switched on (by its own hard switch), it sometimes spins up the system and CPU fans a few times, and sometimes does not start at all. There is no reaction to pressing the "Power" button (on the front of the case).

Before this, the motherboard briefly spun up the fans and HDD when the PSU was powered on, then turned off and waited for a power button press.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1302478 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1305365 - Posted: 12 Nov 2012, 14:41:53 UTC

After replacing the motherboard, the same BSoD happened again.
But now it was caught!

WhoCrashed gave the following info:


This was probably caused by the following module: atikmdag.sys (atikmdag+0x14CF5)
Bugcheck code: 0x18 (0xFFFFFFFF84F7ADB0, 0xFFFFFFFF878621D8, 0x1, 0x1)
Error: REFERENCE_BY_POINTER
file path: C:\Windows\system32\drivers\atikmdag.sys
product: ATI Radeon Family
company: Advanced Micro Devices, Inc.
description: ATI Radeon Kernel Mode Driver


So, it's AMD's crappy driver again...
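
For reading such a report: per Microsoft's bug check reference, for 0x18 the first parameter is the object type of the object whose reference count is being lowered, the second is the object itself, and the last two are reserved. Below is a minimal Python sketch that splits the "Bugcheck code" line from the WhoCrashed output into those fields. The function name `parse_bugcheck` and the regular expression are my own for illustration, and it assumes the line has exactly the layout shown above.

```python
import re

# Meaning of the four parameters for bug check 0x18 (REFERENCE_BY_POINTER),
# as listed in Microsoft's bug check reference.
BUGCHECK_0X18_PARAMS = (
    "object type of the object whose reference count is being lowered",
    "object whose reference count is being lowered",
    "reserved",
    "reserved",
)

# Assumed layout: "Bugcheck code: 0x18 (p1, p2, p3, p4)"
LINE_RE = re.compile(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def parse_bugcheck(line):
    """Return (code, [parameters]) parsed from a WhoCrashed bugcheck line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError("not a bugcheck line: " + repr(line))
    code = int(match.group(1), 16)
    params = [int(p.strip(), 16) for p in match.group(2).split(",")]
    return code, params


if __name__ == "__main__":
    report_line = (
        "Bugcheck code: 0x18 "
        "(0xFFFFFFFF84F7ADB0, 0xFFFFFFFF878621D8, 0x1, 0x1)"
    )
    code, params = parse_bugcheck(report_line)
    print(f"Bug check {code:#x}")
    if code == 0x18:
        for value, meaning in zip(params, BUGCHECK_0X18_PARAMS):
            print(f"  {value:#x}: {meaning}")
```

The parameters alone do not name the culprit; it is the module WhoCrashed reports next to them (atikmdag.sys here) that points at the driver.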
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1305365 · Report as offensive
