Issue running Seti projects

Questions and Answers : Windows : Issue running Seti projects

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1384799 - Posted: 26 Jun 2013, 10:22:40 UTC
Last modified: 26 Jun 2013, 10:23:30 UTC

@TJ13

Please try to substitute the MB ATi app you are running now with the app from this pack: https://dl.dropboxusercontent.com/u/60381958/MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874_OCL_SYNCHED.7z

You need to unpack all files into the project folder and edit your app_info.xml file for that.
You can either replace the whole ATi app section with the one provided in the *.aistub file (inside the archive) or manually change the executable file name (there are a few places in app_info.xml to change, not a single one).

It's a build that performs OpenCL queue synchronization after each OpenCL runtime API call. I expect no driver restarts with this build, but we will see.
SETI apps news
We're not gonna fight them. We're gonna transcend them.

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1384807 - Posted: 26 Jun 2013, 11:48:10 UTC - in response to Message 1384799.  
Last modified: 26 Jun 2013, 12:39:23 UTC


OK, if Raistmer does not mind I will elaborate on his instructions ;)

- You may skip for now changing TdrDelay, try this 'r1874_OCL_SYNCHED' build first -

! Upload and Report all finished tasks ([Update] project)
Exit BOINC (check in Windows Task Manager that boinc.exe 'vanishes' from memory)

In 'project folder':
Delete: (or move to some other folder)
MB7_win_x86_SSE_OpenCL_ATi.aistub

Extract all files from MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874_OCL_SYNCHED.7z and copy them to:
C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\
(the directory above is what we call 'the project folder')

* You will need WinRAR or the free 7-Zip to uncompress .7z (I use both WinRAR and 7-Zip):
http://www.7-zip.org/

* in fact not all the files need to be copied but it's easier for us to explain
* for some of the files you will get 'overwrite warning' - allow the overwrite (files in this .7z are either the same (e.g. libfftw3f-3.dll) or newer)


Read the full post before next step:

Option A - In 'project folder' find and run this file:
aimerge.cmd

(It will create new app_info.xml from all *.aistub files)

You may lose some tasks of type 'opencl_ati5_sah' (that is why you 'Upload and Report all finished tasks' before any changes)
because this new .aistub has only these inside:
<plan_class>ati_opencl_sah</plan_class>
<plan_class>opencl_ati_sah</plan_class>

(with Option A the file in effect will be: mb_cmdline_win_x86_SSE_OpenCL_ATi_HD5.txt)
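
To see in advance which task types a given app_info.xml will still cover, the plan classes can be listed from the file. This is only a minimal Python sketch: it assumes the file contains <plan_class> elements exactly as quoted above, and the function name and sample text are made up for illustration.

```python
import re

def plan_classes(app_info_text):
    """Return the sorted, de-duplicated <plan_class> values found in the text."""
    found = re.findall(r"<plan_class>\s*([^<]*?)\s*</plan_class>", app_info_text)
    return sorted(set(found))

# Hypothetical excerpt; a real app_info.xml holds much more than this.
sample = """
<app_version>
    <plan_class>ati_opencl_sah</plan_class>
</app_version>
<app_version>
    <plan_class>opencl_ati_sah</plan_class>
</app_version>
"""

classes = plan_classes(sample)
print(classes)
# Tasks of a plan class that is not listed are the ones that may be lost.
print("opencl_ati5_sah" in classes)
```

On the laptop, the text would come from reading app_info.xml in the project folder while BOINC is not running.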


Option B - Or instead of running aimerge.cmd you can "edit your app_info.xml file for that":

Open app_info.xml with Notepad

Press Ctrl+H
Find what: MB7_win_x86_SSE_OpenCL_ATi_r1843.exe
Replace with: MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe
[Replace All]

Save app_info.xml

(with Option B the file in effect will be: mb_cmdline_win_x86_SSE_OpenCL_ATi.txt)
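
The Notepad 'Replace All' of Option B can also be sketched in a few lines of Python, which has the side benefit of reporting how many occurrences were changed. The two file names are the ones given above; the function name and the sample XML are only illustrative assumptions.

```python
OLD = "MB7_win_x86_SSE_OpenCL_ATi_r1843.exe"
NEW = "MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe"

def replace_app_name(text, old=OLD, new=NEW):
    """Return (new_text, count) with every occurrence of old swapped for new."""
    count = text.count(old)
    return text.replace(old, new), count

# Hypothetical excerpt of an app_info.xml, just to exercise the function.
sample = (
    "<file_info>\n"
    "    <name>MB7_win_x86_SSE_OpenCL_ATi_r1843.exe</name>\n"
    "</file_info>\n"
    "<file_ref>\n"
    "    <file_name>MB7_win_x86_SSE_OpenCL_ATi_r1843.exe</file_name>\n"
    "</file_ref>\n"
)

updated, changed = replace_app_name(sample)
print(changed)   # number of occurrences replaced in this sample
print(updated)
```

For the real file you would read app_info.xml (with BOINC exited), keep a backup copy, and write the updated text back. A count of 0 would mean the old name was not in the file at all.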


Now you can start BOINC


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 

TJ13
Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1384823 - Posted: 26 Jun 2013, 13:31:47 UTC - in response to Message 1384807.  

OK. I did all that and did Option B. Then it didn't run properly so I reran Lunatics and made sure the ATi was on HD5. Then I got the driver restart message again so I reran Lunatics and made sure the Ati wasn't on HD5 and got the same driver restart message. So I dunno what I might have done wrong.

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1384830 - Posted: 26 Jun 2013, 13:58:01 UTC - in response to Message 1384823.  
Last modified: 26 Jun 2013, 14:11:17 UTC

OK. I did all that and did Option B. Then it didn't run properly ...

"didn't run properly" may mean many things (e.g. error messages in Event Log)

Most probably you copy/pasted spaces around the name.
After 'Option B' the lines should look like:
<name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe</name>
<file_name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe</file_name>
(total of 9 lines will change)

... and not any of these:
<name> MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe</name>
<file_name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe </file_name>
<name> MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe </name>
<file_name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.ex</file_name>
<name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874</name>
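
Stray spaces like these are hard to spot by eye, so here is a small Python sketch that flags them. It only assumes that names sit between <name> or <file_name> tags as in the lines above; the function name is made up, and truncated names (like the '.ex' example) are not detected by this check.

```python
import re

# Matches <name>...</name> and <file_name>...</file_name>, keeping the raw value.
TAG_RE = re.compile(r"<(name|file_name)>(.*?)</\1>", re.DOTALL)

def padded_names(app_info_text):
    """Return raw tag values that carry leading or trailing whitespace."""
    bad = []
    for _tag, value in TAG_RE.findall(app_info_text):
        if value != value.strip():
            bad.append(value)
    return bad

sample = (
    "<name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe</name>\n"
    "<file_name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe </file_name>\n"
    "<name> MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe</name>\n"
)

for value in padded_names(sample):
    print(repr(value))   # repr() makes the extra spaces visible
```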


If you are in doubt - after the edit post your app_info.xml
Also (when BOINC runs with the newly edited app_info.xml) look in the Event Log for 'strange' error messages.

And of course "didn't run properly" may mean this app (r1874) itself is broken (it is a test/beta build with a new feature added)
Clarify what you mean by "didn't run properly" ...


 


TJ13
Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1384839 - Posted: 26 Jun 2013, 14:43:47 UTC - in response to Message 1384830.  
Last modified: 26 Jun 2013, 14:45:42 UTC

By "didn't run properly" I meant I got the notice that the driver restarted and recovered successfully.

Anyway I just got a Notice saying "File referenced in app_info.xml does not exist: MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe" when I restarted BOINC after I re-replaced the lines to make sure there weren't any spaces in my copy and paste. Now I only have Astropulse and seti 7.00 tasks so I don't know if I will still have the issue or not.

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1384866 - Posted: 26 Jun 2013, 16:06:06 UTC - in response to Message 1384839.  
Last modified: 26 Jun 2013, 16:36:23 UTC

Anyway I just got a Notice saying "File referenced in app_info.xml does not exist: MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe"

Then it's clear what to do (you may guess yourself from the message):
Check that MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe exists (in the same dir where app_info.xml is)

If not:
Exit BOINC
Again extract all files from MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874_OCL_SYNCHED.7z and copy them to C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\
Start BOINC

(make/keep copy somewhere of your current app_info.xml for reference, it is probably OK (post it for us to see), just the .exe file is missing)

After you see no more "File referenced .... does not exist" messages
check (in Windows Task Manager) that MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe really is in memory (and not some older .exe)
Then Reboot the computer to start ATI driver clean
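
The "does not exist" check can also be done before starting BOINC. The Python sketch below lists every file named in app_info.xml that is missing from the same folder. It assumes the usual layout where names appear as <file_info><name> and <file_ref><file_name>; only the <name> and <file_name> tags themselves are confirmed in this thread, and the function name and demo folder are made up.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def missing_files(app_info_path):
    """Return names referenced in app_info.xml that are absent from its folder."""
    folder = os.path.dirname(os.path.abspath(app_info_path))
    root = ET.parse(app_info_path).getroot()
    names = set()
    for info in root.iter("file_info"):      # assumed element name
        name = info.findtext("name")
        if name:
            names.add(name.strip())
    for ref in root.iter("file_ref"):        # assumed element name
        name = ref.findtext("file_name")
        if name:
            names.add(name.strip())
    return sorted(n for n in names if not os.path.isfile(os.path.join(folder, n)))

# Demo in a throwaway folder: the .dll is present, the .exe is not.
with tempfile.TemporaryDirectory() as folder:
    app_info = os.path.join(folder, "app_info.xml")
    with open(app_info, "w") as f:
        f.write(
            "<app_info>"
            "<file_info><name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe</name></file_info>"
            "<file_info><name>libfftw3f-3.dll</name></file_info>"
            "<app_version><file_ref>"
            "<file_name>MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe</file_name>"
            "</file_ref></app_version>"
            "</app_info>"
        )
    open(os.path.join(folder, "libfftw3f-3.dll"), "w").close()
    demo_missing = missing_files(app_info)
    print(demo_missing)
```

On the laptop the call would be missing_files(r"C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\app_info.xml"); an empty list means every referenced file is in place.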


The files may/will 'vanish' because of this:
"Then it didn't run properly so I reran Lunatics ..."
Lunatics installer moves many files (*.exe *.dll *.bin ...) to oldApp_backup
before putting/extracting its 'new' files

(if you now have 7-Zip you may open with it Lunatics_Win64_v0.41_setup.exe
and see what files are inside and even extract them)


 


TJ13
Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1384898 - Posted: 26 Jun 2013, 19:00:38 UTC - in response to Message 1384866.  

OK repaired it so that when I start BOINC I get no Notices about "File referenced .... does not exist". Restarted my laptop, but I currently don't have any files that I couldn't finish so I'll find out in the next few days I guess if it worked or not.

TJ13
Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1385111 - Posted: 27 Jun 2013, 15:03:03 UTC - in response to Message 1384898.  

Apparently something is still wrong. I let it run last night and it downloaded new files that have issues. Now instead of stopping at 20% they don't progress past 0%. I also get an error message in the Event Log.

6/27/2013 10:59:48 AM | SETI@home | Task 18ap08ai.20154.1299.8.12.12_0 exited with zero status but no 'finished' file
6/27/2013 10:59:48 AM | SETI@home | If this happens repeatedly you may need to reset the project.
6/27/2013 10:59:48 AM | SETI@home | Restarting task 18ap08ai.20154.1299.8.12.12_0 using setiathome_v7 version 703 (opencl_ati_sah) in slot 0


And judging by the full event log it has happened repeatedly. Should I go ahead and reset the project or hold off for now?
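
A quick way to see how often this has happened is to count the messages per task. This Python sketch works on Event Log lines copied as text in the format shown above; the regular expression and function name are illustrative assumptions, not anything BOINC provides.

```python
import re
from collections import Counter

NO_FINISH_RE = re.compile(r"Task (\S+) exited with zero status but no 'finished' file")

def count_no_finish(log_lines):
    """Count 'exited with zero status but no finished file' messages per task."""
    counts = Counter()
    for line in log_lines:
        match = NO_FINISH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

log = [
    "6/27/2013 10:59:48 AM | SETI@home | Task 18ap08ai.20154.1299.8.12.12_0 exited with zero status but no 'finished' file",
    "6/27/2013 10:59:48 AM | SETI@home | If this happens repeatedly you may need to reset the project.",
    "6/27/2013 10:59:48 AM | SETI@home | Restarting task 18ap08ai.20154.1299.8.12.12_0 using setiathome_v7 version 703 (opencl_ati_sah) in slot 0",
]

counts = count_no_finish(log)
for task, n in counts.items():
    print(task, n)
```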

skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1385182 - Posted: 27 Jun 2013, 19:12:14 UTC

Just wondering if backing off to an older driver might help?


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1385193 - Posted: 27 Jun 2013, 19:37:03 UTC - in response to Message 1385111.  

Apparently something is still wrong. I let it run last night and it downloaded new files that have issues. Now instead of stopping at 20% they don't progress past 0%. I also get an error message in the Event Log.

6/27/2013 10:59:48 AM | SETI@home | Task 18ap08ai.20154.1299.8.12.12_0 exited with zero status but no 'finished' file
6/27/2013 10:59:48 AM | SETI@home | If this happens repeatedly you may need to reset the project.
6/27/2013 10:59:48 AM | SETI@home | Restarting task 18ap08ai.20154.1299.8.12.12_0 using setiathome_v7 version 703 (opencl_ati_sah) in slot 0


And judging by the full event log it has happened repeatedly. Should I go ahead and reset the project or hold off for now?


Could you post stderr log ?

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1385217 - Posted: 27 Jun 2013, 20:52:27 UTC - in response to Message 1385111.  

And judging by the full event log it has happened repeatedly. Should I go ahead and reset the project or hold off for now?

No! This message is aimed at stock app users.
'Reset project' will delete all tasks and apps (exe, dll files) for stock app users.
For you it will only delete all the tasks (even finished ones that are not yet reported)


P.S.
Please don't change the setup (apps, app_info.xml) before reporting the current stderr.txt to Raistmer (will help him change the app to fix issues)
(during testing you may from time to time save somewhere the stderr.txt files for different tasks - e.g. by compressing all 'slots' into slots-A.7z , slots-B.7z ...)

What Raistmer asks from you with "Could you post stderr log ?" is what you already know how to do:
- in BOINC Manager select a problem task and click [Properties]
in the 'Properties of task ...' window look at the bottom for 'Directory slots/0' <- note the number

Now go to:
C:\ProgramData\BOINC\slots\

... and go into the slot number (subdirectory) you already know from 'Properties of task ...'

Open the stderr.txt and look at the messages.
If not very long - post it here (if long - you already know where and how ;) )
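
The path steps above boil down to one join. Here is a small Python sketch that builds the stderr.txt path for a slot number and trims a long log to its last lines before posting. The default data directory is the one named in this thread; the function names and the 40-line limit are arbitrary choices for illustration.

```python
import ntpath  # Windows path rules, whatever OS this sketch runs on

def slot_stderr_path(slot, data_dir=r"C:\ProgramData\BOINC"):
    """Build the path of stderr.txt for the given slot number."""
    return ntpath.join(data_dir, "slots", str(slot), "stderr.txt")

def tail(text, max_lines=40):
    """Keep only the last max_lines lines of a log."""
    lines = text.splitlines()
    return "\n".join(lines[-max_lines:])

path = slot_stderr_path(0)
print(path)
print(tail("line 1\nline 2\nline 3", max_lines=2))
```

On the laptop, the text to post would come from opening that path and passing its contents to tail().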


PPS:
To make any link clickable - select/mark the full line/URL and click the [URL] button above (when in 'Post to thread' page):

If you press [Quote] under my post you will see the difference between the 2 lines:
http://pastebin.com/z4ntPU8g
http://pastebin.com/z4ntPU8g


 


TJ13
Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1385281 - Posted: 28 Jun 2013, 3:57:31 UTC - in response to Message 1385217.  


BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1385438 - Posted: 28 Jun 2013, 16:47:03 UTC - in response to Message 1385281.  


Looks like the app restarted ~44 times without actual processing (no first checkpoint == no messages 'Restarted at XX.xx percent.')

Did you see usage of CPU and GPU by this app/exe ?


 


TJ13
Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1385499 - Posted: 28 Jun 2013, 18:37:10 UTC - in response to Message 1385438.  

It seems to be hovering around 2% CPU and 69.3 MB of Memory. I assume Memory in task manager means GPU. I seem to have three tasks running at the same time now (which is weird since when I started I could only run one). Two are 7.00 and are running fine using about 50% total (25% each) of CPU and 60 MB (29.9 MB each) of Memory.

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1385649 - Posted: 29 Jun 2013, 3:18:20 UTC - in response to Message 1385499.  

I assume Memory in task manager means GPU.

Of course not.

(I don't know if I have to explain this:
GPU (Graphics Processing Unit) is the chip used in video cards.
In your case of AMD A10-4600M APU the GPU is inside the CPU package ('integrated graphics'):
http://www.cpu-world.com/CPUs/Bulldozer/AMD-A10-Series%20A10-4600M.html
http://www.notebookreview.com/default.asp?newsID=6472&review=amd+trinity+apu+a10&p=2
http://www.notebookcheck.net/Trinity-in-Review-AMD-A10-4600M-APU.74852.0.html
)

To see GPU load/usage/utilisation % (term differs depending on the author of the program)
use any of: SIV, GPU-Z, GPU Caps Viewer, MSI Afterburner, EVGA Precision, ...


I seem to have three tasks running at the same time now (which is weird since when I started I could only run one). Two are 7.00 and are running fine using about 50% total (25% each) of CPU and 60 MB (29.9 MB each) of Memory.

We can't know what you mean by "when I started"
Is it when you started back on 20 Oct 2006 using Intel Celeron M processor 1.50GHz ?:
http://setiathome.berkeley.edu/show_host_detail.php?hostid=2791839

Or is it when you first started using BOINC on this laptop:
http://setiathome.berkeley.edu/show_host_detail.php?hostid=6973921

In normal circumstances on 'this laptop' you should see 5 tasks running: 4 CPU and 1 GPU
But if you use one of these preferences:

To free 1 CPU core:
"On multiprocessors, use at most 99% of the processors"

To free 2 CPU cores (for your AMD A10-4600M APU "Number of processors 4"):
"On multiprocessors, use at most 50% of the processors"

... fewer CPU tasks will run.
(so you probably "use at most 50% of the processors" and therefore see 2 CPU and 1 GPU tasks running)
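
The arithmetic behind those two percentages can be sketched in Python. The rule used here (multiply, drop the fraction, keep at least one core) is an assumption that matches the two examples above; it is not taken from BOINC's source, and the function name is made up.

```python
def usable_cpus(n_processors, max_percent):
    """CPU cores left for BOINC, assuming the fraction is simply dropped."""
    return max(1, int(n_processors * max_percent / 100))

# For a 4-processor APU like the A10-4600M:
for percent in (100, 99, 50):
    print(percent, usable_cpus(4, percent))
```

Under that assumption 99% of 4 processors gives 3 CPU tasks (one core freed) and 50% gives 2 CPU tasks, to which the single GPU task is added.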

"7.00" is not telling us much.
I had to check in AKv8c_AVX.aistub to see <version_num>700</version_num>
and in MB7_win_x86_SSE_OpenCL_ATi.aistub to see <version_num>703</version_num>
to (possibly) know what that means.

As already discussed:
To check which app you really run:
- in Windows Task Manager check which .exe is actually running.
Probably:
AKv8c_Bb_r1846_winx86_AVXx.exe
AKv8c_Bb_r1846_winx86_AVXx.exe
MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe


 


TJ13
Joined: 13 Mar 06
Posts: 35
Credit: 7,970,069
RAC: 6
United States
Message 1385680 - Posted: 29 Jun 2013, 4:58:19 UTC - in response to Message 1385649.  
Last modified: 29 Jun 2013, 4:58:44 UTC

We can't know what you mean by "when I started"

When I started this thread and for this laptop.


AKv8c_Bb_r1846_winx86_AVXx.exe
AKv8c_Bb_r1846_winx86_AVXx.exe
MB7_win_x86_SSE_OpenCL_ATi_HD5_r1874.exe

Those are it exactly. I was just saying I have two running ok and one that's not working. I didn't think it was necessary to say which files were working sorry.

I downloaded the GPU-Z program and it says that the GPU Core Clock jumps between 496.6 MHz and 685.7 MHz when it's trying to run the task but when it restarts (after registering in the event log that it exited with zero status) it stops and only does 334.9 MHz. Similarly the GPU load hovers between 75% and 99% when it's running and drops to 0% (when it registers in the event log the same as above). I'm not sure if the GPU Temperature is relevant or not but it is about 103* C.

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1385693 - Posted: 29 Jun 2013, 5:21:16 UTC - in response to Message 1385680.  

I'm not sure if the GPU Temperature is relevant or not but it is about 103* C.

Uh-oh!
Stop using the GPU immediately!
(From Activity menu in BOINC)

I don't think that any GPU can run reliably at 103°C (unless it is 103°F ?)
It is dangerous for the chip.

We didn't know it was so high until now.
Earlier I asked:
"Check in SIV what are the Temperatures of CPU and GPU during the computations."
Your answer was "GPU is at 54* C":
http://setiathome.berkeley.edu/forum_thread.php?id=72035&postid=1383702#1383702


This page says:
Maximum operating temperature: 100°C
http://www.cpu-world.com/CPUs/Bulldozer/AMD-A10-Series%20A10-4600M.html

(100°C is for the CPU but in your case CPU+GPU are on the same chip)


 


Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1385705 - Posted: 29 Jun 2013, 6:29:39 UTC

My C-60 APU netbook can work at ~90°C but will shut down at slightly higher temperatures...
100°C is unreachable for it.

The reason for the app restarts is not quite clear.
It would be good to test this build on some other host that is known to work. I have had no time for this yet.

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22200
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1385706 - Posted: 29 Jun 2013, 6:34:39 UTC

If it is hitting 103°C then it is most likely doing a protective thermal shutdown.
Check the cooling system for "dust bunnies", and check that the fan(s) are actually running (I had one that reported as working but had an internal fault: it was just windmilling at very low speed, not screaming its head off at ~3000rpm.)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1385728 - Posted: 29 Jun 2013, 7:31:37 UTC - in response to Message 1385693.  


After you stop using the GPU for BOINC: "Use GPU never" from Activity menu
you may do some other controlled test (to see if there are general problems with cooling and stability):

Get GPU Caps Viewer (Zip archive)
http://www.ozone3d.net/gpu_caps_viewer/

Extract the Zip
Run GpuCapsViewer.exe

It will just show some info like GPU-Z

Now start SIV and/or GPU-Z to monitor GPU load and Temperature all the time.
(GPU Caps Viewer shows those also but it's good to have 'second opinion' by other utility)

Next tests can be aborted at any time by <Esc>

At the bottom of GPU Caps Viewer there are 'demos' (they are the tests I'm talking about)
At the right are OpenCL demos - run any/all of them with GPU in name, e.g.:
"CL GPU - 4D Quaternion Julia Set"
(if you run the same demos but with CPU in name they will run slower and will load/heat more the CPU instead of GPU)

At the left are OpenGL demos - run any/all of them.
! Be careful with "GL 2.1 - Furry Cube" - it heats most, be ready to abort it!


At all times monitor the Temperatures
If you see more than 80-85°C - abort the tests
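
If you note the readings down while a demo runs, the abort rule is a simple comparison. A minimal Python sketch, using the lower end of the 80-85°C range as the limit; the function name and most of the sample readings are made up for illustration (54 and 103 are the values reported in this thread).

```python
ABORT_ABOVE_C = 80  # lower end of the 80-85 °C range given above

def should_abort(readings_c, limit_c=ABORT_ABOVE_C):
    """True if any temperature reading (in °C) is above the limit."""
    return any(t > limit_c for t in readings_c)

print(should_abort([54, 62, 71]))    # all readings below the limit
print(should_abort([54, 79, 103]))   # one reading far above the limit
```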


 




 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.