Issue running Seti projects

TJ13

Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1383987 - Posted: 23 Jun 2013, 18:18:50 UTC - in response to Message 1383934.  

Apparently there's an even finer difference between the tasks that work and those that don't. I finished two 7.03 tasks, but two new ones also failed (getting stuck at 20% again). The differences are the deadline (July 12 works; August 14 doesn't) and the name (the 14j09ab.13842.13155.3.12. series doesn't work; the 10dc08af.26426.19704.9.12. series does).

No idea why this is the case but there it is.

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1383999 - Posted: 23 Jun 2013, 19:00:32 UTC

I see no aborted/errored tasks since you moved to the anonymous platform.
If you still see driver restarts now, please abort at least one such task.

SETI apps news
We're not gonna fight them. We're gonna transcend them.

TJ13
Message 1384215 - Posted: 24 Jun 2013, 15:56:38 UTC - in response to Message 1383999.  

Went ahead and aborted one of the stuck tasks. I'd been suspending them rather than doing them.

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1384256 - Posted: 24 Jun 2013, 16:40:51 UTC - in response to Message 1383987.  
Last modified: 24 Jun 2013, 16:53:25 UTC


This is your first task (finished OK) after you 're-run Lunatics installer and chose non-HD5 app' (MB7_win_x86_SSE_OpenCL_ATi_r1843.exe):
http://setiathome.berkeley.edu/result.php?resultid=3047300762

(non-HD5 app shows: Build features: SETI7 Non-graphics OpenCL OCL_CHIRP3 FFTW AMD specific USE_SSE x86
the HD5 app shows: Build features: SETI7 Non-graphics OpenCL USE_OPENCL_HD5xxx OCL_ZERO_COPY OCL_CHIRP3 FFTW AMD specific USE_SSE x86
)

You used:
-period_iterations_num 40 -sbs 128 -hp


Number of period iterations for PulseFind set to:40
Maximum single buffer size set to:128MB
Priority of worker thread raised successfully
Priority of process adjusted successfully, high priority class used

(I know this is your first task with this app as it shows:
"WARNING: can't open binary kernel file ......, continue with recompile..."
Later tasks do not (and will not) show this 'WARNING' because the binary kernel files have already been compiled (and some .bin files are now saved next to the app .exe)
http://setiathome.berkeley.edu/result.php?resultid=3047300764
)

***

To check which app you really run:
- in Windows Task Manager check which .exe is actually running:
MB7_win_x86_SSE_OpenCL_ATi_r1843.exe
MB7_win_x86_SSE_OpenCL_ATi_HD5_r1843.exe

- Go to:
C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\

Open the file app_info.xml with Notepad.
Press Ctrl+F and paste one of these strings into Find:
MB7_win_x86_SSE_OpenCL_ATi_r1843.exe
MB7_win_x86_SSE_OpenCL_ATi_HD5_r1843.exe

(only the first should be found; searching for the second should give 'Cannot find "MB7_win_x86_SSE_OpenCL_ATi_HD5_r1843.exe"')
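
The same Notepad search can be scripted. This is only a rough sketch in Python, assuming the default BOINC data path given in this post; the function name find_app_names is made up for illustration and is not part of BOINC or the Lunatics installer.

```python
from pathlib import Path

# Executable names quoted in the post above.
NON_HD5 = "MB7_win_x86_SSE_OpenCL_ATi_r1843.exe"
HD5 = "MB7_win_x86_SSE_OpenCL_ATi_HD5_r1843.exe"


def find_app_names(app_info_text, names=(NON_HD5, HD5)):
    """Return {name: True/False} telling whether each name occurs in the text."""
    return {name: name in app_info_text for name in names}


if __name__ == "__main__":
    # Path as given in the post; adjust it if the BOINC data directory is elsewhere.
    path = Path(r"C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_info.xml")
    if path.is_file():
        results = find_app_names(path.read_text(errors="replace"))
        for name, found in results.items():
            print(("found:   " if found else "missing: ") + name)
    else:
        print("app_info.xml not found at", path)
```

For the non-HD5 setup described here, the first name should be reported as found and the second as missing.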

***

What BOINC Manager shows is not definitive (it does not show which app/exe you really run); it depends on lines in app_info.xml such as:
<plan_class>opencl_ati5_sah</plan_class>
<plan_class>opencl_ati_sah</plan_class>
<plan_class>opencl_ati_cat132</plan_class>
<plan_class>opencl_ati5_cat132</plan_class>

(in BOINC Manager you may see any of these, irrespective of which app you really run: the 'HD5' or the 'non-HD5' one)
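
To illustrate why the plan class says little about the .exe, here is a small Python sketch that lists which main program each app_version entry points to. The XML fragment is a hypothetical, cut-down example written for this illustration (a real app_info.xml has more entries and fields), and plan_class_to_exe is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical app_info.xml fragment for illustration only.
SAMPLE = """
<app_info>
  <app_version>
    <app_name>setiathome_v7</app_name>
    <plan_class>opencl_ati5_sah</plan_class>
    <file_ref>
      <file_name>MB7_win_x86_SSE_OpenCL_ATi_r1843.exe</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
"""


def plan_class_to_exe(xml_text):
    """Return (plan_class, main program file name) pairs, one per app_version."""
    pairs = []
    for app_version in ET.fromstring(xml_text).iter("app_version"):
        plan = app_version.findtext("plan_class", default="")
        for ref in app_version.findall("file_ref"):
            # Only the file_ref marked <main_program/> is the app that runs.
            if ref.find("main_program") is not None:
                pairs.append((plan, ref.findtext("file_name", default="")))
    return pairs


if __name__ == "__main__":
    for plan, exe in plan_class_to_exe(SAMPLE):
        print(plan, "->", exe)
```

In this made-up fragment an 'ati5' plan class is paired with the non-HD5 .exe, which is the kind of mismatch between label and program described above.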


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 

TJ13
Message 1384310 - Posted: 24 Jun 2013, 18:09:02 UTC - in response to Message 1384256.  

In Task Manager MB7_win_x86_SSE_OpenCL_ATi_r1843.exe (the non HD5 one) shows up when I run a task.

Raistmer
Message 1384359 - Posted: 24 Jun 2013, 20:43:44 UTC - in response to Message 1384215.  

Went ahead and aborted one of the stuck tasks. I'd been suspending them rather than doing them.

Pity that BOINC doesn't save the log when a task is aborted via the GUI. No need to do that again - no info received :/

Raistmer
Message 1384363 - Posted: 24 Jun 2013, 20:45:39 UTC - in response to Message 1384310.  

In Task Manager MB7_win_x86_SSE_OpenCL_ATi_r1843.exe (the non HD5 one) shows up when I run a task.


Try to remove -hp from command line options (and leave other options there).

BilBg
Message 1384421 - Posted: 25 Jun 2013, 2:19:30 UTC - in response to Message 1384363.  

In Task Manager MB7_win_x86_SSE_OpenCL_ATi_r1843.exe (the non HD5 one) shows up when I run a task.

Try to remove -hp from command line options (and leave other options there).

I think Raistmer wants you to try this for two reasons:
- less lag for you (when a GPU task runs while you use the computer)
- to see if the app will still run OK without -hp


with -hp the app .exe runs as a 'high priority' process
without -hp the app .exe runs as a 'below normal priority' process


 


TJ13
Message 1384444 - Posted: 25 Jun 2013, 4:35:27 UTC - in response to Message 1384421.  

This is the -hp that you had me add earlier correct (mb_cmdline_win_x86_SSE_OpenCL_ATi.txt)? I did that and still have the issue.

I only have tasks that get stuck at 20% right now and I've tried updating the Project to get more and don't receive them (I think it's either the dead period with no tasks available or there's none available to me right now).

BilBg
Message 1384478 - Posted: 25 Jun 2013, 6:14:19 UTC - in response to Message 1384444.  
Last modified: 25 Jun 2013, 6:22:12 UTC

This is the -hp that you had me add earlier correct (mb_cmdline_win_x86_SSE_OpenCL_ATi.txt)?

Yes
(in fact not me but Raistmer 'had you add' the mb_cmdline options)


I did that and still have the issue.

? But you have/had several OK completed tasks:
http://setiathome.berkeley.edu/results.php?hostid=6973921&offset=0&show_names=0&state=4&appid=11

Edit: Sorry, all the 'SETI@home v7 Anonymous platform (ATI GPU)' valid tasks vanished seconds after I posted (that is normal - they are removed from the list ~24 h after the validation, you had 5-6 OK ATI tasks)

Did you start having the problem again only after you removed -hp?


I only have tasks that get stuck at 20% right now...

You may have only 3 at the moment:
http://setiathome.berkeley.edu/results.php?hostid=6973921&offset=0&show_names=0&state=1&appid=11


... and I've tried updating the Project to get more and don't receive them ...

BOINC does not ask for more work from a project if you have suspended ([Suspend]) any task for that project.
(and you can see that in Event Log: 'Not reporting or requesting tasks')

***

I wonder what is written in stderr.txt for a stuck task.

To see:
In BOINC Manager, select a stuck task and click [Properties].
At the bottom of the 'Properties of task ...' window, look for 'Directory slots/0' <- note the number

Now go to:
C:\ProgramData\BOINC\slots\

... and go into the slot number (subdirectory) you already know from 'Properties of task ...'

Open the stderr.txt and look at the messages.
If not very long - post it here (if long - you already know where and how ;) )
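
If you prefer not to click through the folders, the lookup above could be sketched in Python like this. It assumes the default C:\ProgramData\BOINC\slots location mentioned in this post; read_slot_stderr is a made-up helper name, and slot 0 is only the example number from 'Directory slots/0'.

```python
from pathlib import Path


def read_slot_stderr(slot, slots_dir=r"C:\ProgramData\BOINC\slots"):
    """Return the text of stderr.txt in the given slot directory, or None if absent."""
    path = Path(slots_dir) / str(slot) / "stderr.txt"
    if not path.is_file():
        return None
    return path.read_text(errors="replace")


if __name__ == "__main__":
    # Use the slot number shown in the task's Properties window instead of 0.
    text = read_slot_stderr(0)
    print(text if text is not None else "no stderr.txt in that slot")
```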


 


TJ13
Message 1384555 - Posted: 25 Jun 2013, 14:34:56 UTC - in response to Message 1384478.  

http://pastebin.com/z4ntPU8g

I have the problem only on a select few types of 7.03 tasks for some reason. I'm not sure what the difference is between the ones that completed successfully and the ones that don't work, but as I posted above, the due date and the name were the only differences I could find.

Raistmer
Message 1384564 - Posted: 25 Jun 2013, 15:12:44 UTC

Stuck = driver restart in this case.

Try to increase the TDR limit in the Windows registry.
Try to remove -hp.
Will the driver restarts continue?

TJ13
Message 1384602 - Posted: 25 Jun 2013, 18:35:42 UTC - in response to Message 1384564.  

I've removed the -hp and am still stuck; don't know how to remove the TDR limit in the windows registry.

Raistmer
Message 1384616 - Posted: 25 Jun 2013, 19:54:48 UTC - in response to Message 1384602.  

I've removed the -hp and am still stuck; don't know how to remove the TDR limit in the windows registry.


Described earlier in this thread: http://setiathome.berkeley.edu/forum_thread.php?id=72035&postid=1383819

Raistmer
Message 1384617 - Posted: 25 Jun 2013, 19:55:47 UTC - in response to Message 1384602.  

I've removed the -hp and am still stuck; don't know how to remove the TDR limit in the windows registry.


But can you confirm that each time the app gets stuck at ~20-something % you indeed get a driver restart?

TJ13
Message 1384624 - Posted: 25 Jun 2013, 20:29:28 UTC - in response to Message 1384617.  
Last modified: 25 Jun 2013, 20:57:04 UTC

Yes. Each time a task first gets stuck at 20%, I receive a 'driver restarted successfully' message. I'm just now reading the TDR page; I'll get back to you when I finish doing that.

Edit: How much do you want me to increase the TDR limit by (or what delay do you want it to be at)? It says the default is 2 seconds.

Edit2: OK apparently I can't even find where this TdrLevel: REG_DWORD is in Windows 8. The page was for Vista maybe it was renamed? I tried to Run the "HKLM\System\CurrentControlSet\Control\GraphicsDrivers" but it couldn't be found also tried searching for it and couldn't find it either.

Raistmer
Message 1384637 - Posted: 25 Jun 2013, 21:27:48 UTC

I'm afraid I can't help with Win8. I don't know it either.

BilBg
Message 1384758 - Posted: 26 Jun 2013, 8:17:47 UTC - in response to Message 1384624.  

Edit2: OK apparently I can't even find where this TdrLevel: REG_DWORD is in Windows 8. The page was for Vista maybe it was renamed? I tried to Run the "HKLM\System\CurrentControlSet\Control\GraphicsDrivers" but it couldn't be found also tried searching for it and couldn't find it either.

What does "I tried to Run" mean?
This is not a command you can "Run".

This is a place in the registry, not a folder on the disk.
To go there we use RegEdit (if you are not familiar with it, don't play with RegEdit; there is no 'Undo')
(the so-called 'registry' is in fact stored in several files on the disk, but to see inside and manipulate the contents of those files you need a special tool like RegEdit)

(and I would prefer to use TdrDelay instead of TdrLevel 0)


The TdrDelay value does not exist in the registry by default; it has to be created.
(you can check by starting RegEdit, navigating to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers and confirming that there is no TdrDelay in the right-hand pane)

To create the TdrDelay value (without using RegEdit manually):
Paste the following text into Notepad and save it:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers]
"TdrDelay"=dword:00000005


Rename the .txt file to .reg (my file now has the name TdrDelay.reg)
Run (double-click) the TdrDelay.reg file (there will be a prompt "Are you sure ...")
I think you need to reboot for any change to TdrDelay to take effect.


"TdrDelay"=dword:00000005 means 5 seconds

You can change it, but note that the number is in hex, e.g.
"TdrDelay"=dword:00000002 means 2 seconds
"TdrDelay"=dword:00000009 means 9 seconds
"TdrDelay"=dword:0000000A means 10 seconds
"TdrDelay"=dword:00000010 means 16 seconds

! Never make it 00000000 (zero)!
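
To avoid hex mistakes, the line can be generated from a decimal number of seconds. A minimal Python sketch, assuming nothing beyond the .reg text shown in this post; the function names tdr_delay_line and tdr_delay_reg are made up for illustration, and the script only builds text (it does not touch the registry).

```python
def tdr_delay_line(seconds):
    """Format the TdrDelay line for a .reg file; the dword is 8 hex digits."""
    # Zero is rejected on purpose, following the warning in the post.
    if not 1 <= seconds <= 0xFFFFFFFF:
        raise ValueError("seconds must be at least 1 (never 0) and fit in a dword")
    return '"TdrDelay"=dword:{:08X}'.format(seconds)


def tdr_delay_reg(seconds):
    """Build the full .reg text from the post for the given delay in seconds."""
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        "[HKEY_LOCAL_MACHINE\\SYSTEM\\CurrentControlSet\\Control\\GraphicsDrivers]\n"
        + tdr_delay_line(seconds)
        + "\n"
    )


if __name__ == "__main__":
    for secs in (2, 5, 9, 10, 16):
        print(secs, "seconds ->", tdr_delay_line(secs))
```

The printed lines should match the examples listed above, e.g. 10 seconds gives dword:0000000A and 16 seconds gives dword:00000010.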


Now you may ask: "How do I remove TdrDelay that I added in the registry? (so that Windows will use its default)"

I do it this way:
Copy the file TdrDelay.reg
Rename the copy to TdrDelay-NO.reg
Edit TdrDelay-NO.reg
Change the relevant line to:
"TdrDelay"=-

(just a 'minus sign' after =)
Save TdrDelay-NO.reg and Run (double-click) it

(TdrDelay will be removed from the registry even though the prompt/warning will still read "Are you sure you want to add the information in ...\TdrDelay-NO.reg to the registry?")


 


BilBg
Message 1384767 - Posted: 26 Jun 2013, 8:46:34 UTC - in response to Message 1384564.  

Stuck = driver restart in this case.

Try to increase the TDR limit in the Windows registry.
Try to remove -hp.
Will the driver restarts continue?

I think I see a pattern:
The VHAR tasks compute OK:
WU true angle range is : 1.476699
WU true angle range is : 2.587602
WU true angle range is : 2.587602
http://setiathome.berkeley.edu/results.php?hostid=6973921&offset=0&show_names=0&state=4&appid=11

But Normal AR tasks hang:
WU true angle range is : 0.425506
http://pastebin.com/z4ntPU8g
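
A quick way to compare more tasks is to pull the angle range out of each stderr text. This is a rough Python sketch, assuming the 'WU true angle range is :' line format quoted above; the helper names are made up, and the cutoff of 1.0 in the example is an arbitrary illustrative value, not an official VHAR threshold.

```python
import re

# Matches lines such as "WU true angle range is :  0.425506".
AR_PATTERN = re.compile(r"WU true angle range is\s*:\s*([0-9]*\.?[0-9]+)")


def angle_ranges(stderr_text):
    """Return every 'WU true angle range' value found in the text, as floats."""
    return [float(value) for value in AR_PATTERN.findall(stderr_text)]


def split_by_cutoff(values, cutoff):
    """Split values into (at or below cutoff, above cutoff)."""
    low = [v for v in values if v <= cutoff]
    high = [v for v in values if v > cutoff]
    return low, high


if __name__ == "__main__":
    sample = (
        "WU true angle range is :  1.476699\n"
        "WU true angle range is :  2.587602\n"
        "WU true angle range is :  0.425506\n"
    )
    # cutoff=1.0 is only for illustration; choose your own value.
    print(split_by_cutoff(angle_ranges(sample), cutoff=1.0))
```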


You know which part of the algorithm kicks in above/below a particular angle range.


 


Raistmer
Message 1384798 - Posted: 26 Jun 2013, 10:19:00 UTC

Most probably it's a driver restart on Gaussians. But maybe not.
Anyway, I prepared a special build to run on this host.



 