Win98SE + BOINC 4.05 hangs with stack overflow

Questions and Answers : Windows : Win98SE + BOINC 4.05 hangs with stack overflow
Walt Gribben
Volunteer tester

Joined: 16 May 99
Posts: 353
Credit: 304,016
RAC: 0
United States
Message 20141 - Posted: 31 Aug 2004, 1:54:29 UTC
Last modified: 31 Aug 2004, 1:56:24 UTC

System is a 700 MHz PIII with 256 MB RAM, running Win98SE.

Every time a workunit completes, Windows appears to lock up and BOINC eventually terminates with an "invalid page fault". The esp register shows it has just dropped below the bottom of the stack, and the instruction that got the exception is a push. So while technically it's an invalid page access, the cause is a stack overflow. It appears to be stuck in recursive calls to an exception handler (an exception inside an exception routine).

Is this from the BOINC_GUI program, or an artifact of the compiler?


BOINC appears "stuck" at this point and won't end. Shutdown gets stuck too and puts up a "Program is not responding" box; the only recourse is to reset.

After a few of these, I put a link to drwatson.exe in the Startup folder to log more info on the error.

Another occurrence, from the Dr Watson log:

>Trap 0c 0006 - Stack fault
>eax=c00309c4 ebx=006effec ecx=00000000 edx=bff76855 esi=00000000 edi=bff79060
>eip=bff88396 esp=005efe98 ebp=005f0010 -- -- -- nv up EI pl nz AC po nc
>cs=0167 ss=016f ds=016f es=016f fs=2c77 gs=0000
>KERNEL32.DLL:.text+0xf396:
?>0167:bff88396 53 push ebx


Looking through the Dr Watson output, it appears to be inside an exception handler.

Top stack entry, getting ready to call FREQASM+0x56a3:

016f:005f0010 0167:bff88396 KERNEL32.DLL:.text+0xf396
(006effec,bffc04d4,00000000,00000000,
005f0038,bff79060,00000000,006effec)
> 0167:bff88389 a1e09cfcbf mov eax,dword ptr [bffc9ce0]
> 0167:bff8838e 8bec mov ebp,esp
> 0167:bff88390 81ec78010000 sub esp,00000178
KERNEL32.DLL:.text+0xf396:
>*0167:bff88396 53 push ebx
> 0167:bff88397 56 push esi
> 0167:bff88398 57 push edi
> 0167:bff88399 8b7510 mov esi,dword ptr [ebp+10]
> 0167:bff8839c 8b38 mov edi,dword ptr [eax]
> 0167:bff8839e 33db xor ebx,ebx
> 0167:bff883a0 85f6 test esi,esi
> 0167:bff883a2 752d jnz bff883d1 = KERNEL32.DLL:.text+0xf3d1
> 0167:bff883a4 8db554ffffff lea esi,[ebp-000000ac]
> 0167:bff883aa 899d58ffffff mov dword ptr [ebp-000000a8],ebx
> 0167:bff883b0 c78554ffffff260000c0 mov dword ptr [ebp-000000ac],c0000026
> 0167:bff883ba 899d5cffffff mov dword ptr [ebp-000000a4],ebx
> 0167:bff883c0 e8dee2feff call bff766a3 = KERNEL32.DLL:_FREQASM+0x56a3

(The asterisk shows the next instruction to execute; the one above it is the last one that did execute.)

The next stack entry (executed right before this one) shows it had just called a stack unwind routine, which is part of the exception handler:
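
The register values in the dump can be cross-checked with a bit of arithmetic. This is only a minimal sketch: the constants are copied from the log above, but the stack limit of 0x005f0000 is an assumption inferred from the frame addresses in the trace, not something Dr Watson reports.

```python
# Values copied from the Dr Watson dump above.
EBP = 0x005F0010            # frame pointer set by "mov ebp,esp"
FRAME_SIZE = 0x178          # from "sub esp,00000178"
ESP_REPORTED = 0x005EFE98   # esp at the time of the fault

# ASSUMPTION: the stack region bottoms out at 0x005f0000.
# This is inferred from the trace, it is not printed in the log.
ASSUMED_STACK_LIMIT = 0x005F0000

# esp after the prologue reserves its locals
esp_after_prologue = EBP - FRAME_SIZE

# True if the next push would write below the assumed stack limit
below_limit = esp_after_prologue < ASSUMED_STACK_LIMIT

print(hex(esp_after_prologue), hex(ESP_REPORTED), below_limit)
```

If the computed esp matches the reported one and sits below the assumed limit, that is consistent with the "push ebx" being the first write into unmapped stack space.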


>016f:005f0038 0167:bffc04d4 KERNEL32.DLL:.text+0x474d4
> (006effec,005f005c,005f012c,8192e77c,
> 006effec,005f012c,005f0148,005f0080)
> 0167:bffc04bf 53 push ebx
> 0167:bffc04c0 56 push esi
> 0167:bffc04c1 57 push edi
> 0167:bffc04c2 55 push ebp
> 0167:bffc04c3 6a00 push +00
> 0167:bffc04c5 6a00 push +00
> 0167:bffc04c7 68d404fcbf push bffc04d4
> 0167:bffc04cc ff7508 push dword ptr [ebp+08]
> 0167:bffc04cf e8b47efcff call bff88388 = KERNEL32.DLL!RtlUnwind
>KERNEL32.DLL:.text+0x474d4:
>*0167:bffc04d4 5d pop ebp
> 0167:bffc04d5 5f pop edi
> 0167:bffc04d6 5e pop esi
> 0167:bffc04d7 5b pop ebx
> 0167:bffc04d8 8be5 mov esp,ebp
> 0167:bffc04da 5d pop ebp
> 0167:bffc04db c3 retd

(The asterisk shows the return point for the routine in the first stack trace entry; it had just called the RtlUnwind routine.)

Continuing to eyeball the stack trace shows two more calls to KERNEL32.DLL!RtlUnwind, so it looks like the exception handling is itself getting exceptions.
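
The call targets Dr Watson prints can be re-derived from the raw instruction bytes, which also shows how far into the unwind routine the fault happened. A rough sketch, assuming the bytes in the log are plain 5-byte E8 near calls (opcode plus a little-endian rel32); the helper name `call_target` is made up for illustration:

```python
def call_target(instr_addr: int, encoding: bytes) -> int:
    """Return the target of a 5-byte x86 near call (opcode E8 + rel32)."""
    if len(encoding) != 5 or encoding[0] != 0xE8:
        raise ValueError("not an E8 rel32 call")
    # rel32 is a signed little-endian displacement from the next instruction
    rel32 = int.from_bytes(encoding[1:], "little", signed=True)
    return (instr_addr + 5 + rel32) & 0xFFFFFFFF

# "0167:bffc04cf e8b47efcff call bff88388 = KERNEL32.DLL!RtlUnwind"
unwind_entry = call_target(0xBFFC04CF, bytes.fromhex("e8b47efcff"))

# "0167:bff883c0 e8dee2feff call bff766a3 = KERNEL32.DLL:_FREQASM+0x56a3"
freqasm_target = call_target(0xBFF883C0, bytes.fromhex("e8dee2feff"))

# Offset of the faulting "push ebx" (eip=bff88396) from the unwind entry point
fault_offset = 0xBFF88396 - unwind_entry

print(hex(unwind_entry), hex(freqasm_target), hex(fault_offset))
```

A small fault offset would mean the faulting push is still in the prologue of the routine that was just called, which fits the picture of each nested unwind needing a fresh frame that no longer fits on the stack.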

This all got started when BOINC_GUI called DispatchMessage:

006efcd8 00456d79 = BOINC_GUI.EXE:.text+0x55d79

>--------------------
>
> 0167:00456d60 56 push esi
> 0167:00456d61 e8fefeffff call 00456c64 = BOINC_GUI.EXE:.text+0x55c64
> 0167:00456d66 85c0 test eax,eax
> 0167:00456d68 59 pop ecx
> 0167:00456d69 750e jnz 00456d79 = BOINC_GUI.EXE:.text+0x55d79
> 0167:00456d6b 56 push esi
> 0167:00456d6c ff1588f34500 call dword ptr [0045f388] -> USER32.DLL!TranslateMessage
> 0167:00456d72 56 push esi
> 0167:00456d73 ff150cf44500 call dword ptr [0045f40c] -> USER32.DLL!DispatchMessageA
>BOINC_GUI.EXE:.text+0x55d79:
>*0167:00456d79 33c0 xor eax,eax
> 0167:00456d7b 40 inc eax
> 0167:00456d7c 5f pop edi
> 0167:00456d7d 5e pop esi
> 0167:00456d7e c3 retd
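
As a sanity check on the symbolized addresses, each "module:.text+offset" label should imply the same section start for a given module. A small sketch using only address/offset pairs that appear in the log; the derived base addresses are computed here and are not stated anywhere in the Dr Watson output:

```python
# (absolute address, offset within .text) pairs as printed by Dr Watson
pairs = {
    "KERNEL32.DLL": [
        (0xBFF88396, 0xF396),
        (0xBFFC04D4, 0x474D4),
        (0xBFF883D1, 0xF3D1),
    ],
    "BOINC_GUI.EXE": [
        (0x00456D79, 0x55D79),
        (0x00456C64, 0x55C64),
    ],
}

# For each module, collect every implied .text start; a consistent log
# yields exactly one value per module.
bases = {
    module: {addr - offset for addr, offset in items}
    for module, items in pairs.items()
}

for module, found in bases.items():
    print(module, sorted(hex(b) for b in found))
```

One base per module suggests the labels in the trace can be trusted when matching addresses against a map file or a disassembly of the same build.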

There's lots more, but I haven't had much time to go through the log in detail.
ID: 20141
Team AUSTRALIA (AlexD)

Joined: 1 Jun 99
Posts: 54
Credit: 66,602
RAC: 0
Australia
Message 20173 - Posted: 31 Aug 2004, 2:32:26 UTC

I have the same problem. My system is a PIII 600 MHz with 512 MB RAM and Win98SE. All my other machines are either WinXP Pro or Unix-based and don't have the problem, so I suspect this is an issue with the latest Wintel client version, perhaps compiled with a compiler that isn't friendly to pre-WinXP systems (hmmm, wonder if it's written with a Microsoft product???).

:(

ID: 20173
Walt Gribben
Volunteer tester

Joined: 16 May 99
Posts: 353
Credit: 304,016
RAC: 0
United States
Message 20190 - Posted: 31 Aug 2004, 2:54:56 UTC - in response to Message 20173.  

> Have same problem, my system is a PIII 600Mhz, with 512Mb RAM and Win98SE.
> All other machines are either WinXP Pro or unix-based, and don't have the same
> problem, so suspect this might be an issue with the latest Wintel client
> version perhaps compiled with a non-preWinXP friendly compiler (hmmm, wonder
> if it's written with a microsoft product???).
>
> :(

Dunno what it's written with; it compiles on a bunch of platforms. For Windows, it's compiled with Microsoft's Visual C++. They have project/workspace files for both VS6 and VS7.


ID: 20190
Pooh Bear 27
Volunteer tester

Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 20206 - Posted: 31 Aug 2004, 3:10:58 UTC

I keep telling people, DO NOT USE the BOINC screensaver. The screensaver is the issue. Use any other screensaver and this issue will go away.

If you are not using the screensaver, then I am unsure what the issue is. I have a P1-166 and a P1-200MMX, both running BOINC 4.05 for over 24 hours, and neither has crashed. I am using the BLANK screensaver.

The screensaver in 2000/XP still does not work correctly, either. It now says "will be fixed in future version".
ID: 20206
Team AUSTRALIA (AlexD)

Joined: 1 Jun 99
Posts: 54
Credit: 66,602
RAC: 0
Australia
Message 20272 - Posted: 31 Aug 2004, 5:40:40 UTC - in response to Message 20206.  
Last modified: 1 Sep 2004, 6:54:39 UTC

> I keep telling people, DO NOT USE the BOINC screensaver. The screensaver is
> the issue. Use any other screensaver and this issue will go away.

I'm not using BOINC/SETI@home as the screen saver and never have, not even with S@HC; it always runs as a background process. Anyway, as per the news item on the home page, it's being investigated by the BOINC people.

ID: 20272
DevPao

Joined: 10 Mar 00
Posts: 3
Credit: 36,549
RAC: 0
Italy
Message 20319 - Posted: 31 Aug 2004, 9:01:40 UTC

I never had the screensaver or the graphics active, but I have the issue.

Given the poor Win98 memory management, I tried a memory optimizer.
The crash happens when free memory is at a low mark and the tool starts to shuffle memory chunks to recover memory.

Maybe the issue is memory fragmentation mishandled by Win98xx.
ID: 20319
DevPao

Joined: 10 Mar 00
Posts: 3
Credit: 36,549
RAC: 0
Italy
Message 20324 - Posted: 31 Aug 2004, 9:21:20 UTC - in response to Message 20319.  

> I never had the screensaver or the graphics active but I have the issue.
>
> Given the poor win98 memory management I tried a memory optimizer.
> The crash happens while free memory is at a low mark and the tool starts to
> shuffle memory chunks to recover memory.
>
> Maybe the issue is memory fragmentation mishandled by win98xx.
>

After this I rebooted, BOINC started and everything was smooth.

There was a completed CP WU, and when the upload started the PC became unresponsive again.
ID: 20324
Zlartibartfast

Joined: 22 May 99
Posts: 36
Credit: 64,493
RAC: 0
United States
Message 20631 - Posted: 31 Aug 2004, 22:44:43 UTC - in response to Message 20206.  

> I keep telling people, DO NOT USE the BOINC screensaver. The screensaver is
> the issue. Use any other screensaver and this issue will go away.
>
> If you are not using the screensaver, then I am unsure what the issue is. I
> have a P1-166, and a P1-200MMX both running BOINC 4.05 for over 24 hours, and
> neither have crashed. I am using the BLANK screensaver.
>
> The screensaver in 2000/XP still does not work correctly, either. It now says
> "will be fixed in future version".
>
I have never wasted cycles with screensavers, and until this (http://setiweb.ssl.berkeley.edu/sah/forum_thread.php?id=3403) I never had a problem with a SETI client (5 years now). There IS an issue with BOINC v4.x under Win 9x/ME (of course, ME is an issue by itself). I wish I knew enough to tell you what it is, but all I know now is that NT kernels (4, 5, possibly 6) and Unix/Linux kernels seem unaffected. Mac users? Don't know, probably not.
ID: 20631



 