FM1 LLano

Message boards : Number crunching : FM1 LLano

garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1190352 - Posted: 31 Jan 2012, 13:24:13 UTC

Hi,

Yesterday I assembled my new system, a 4-core A8 with onboard HD 6550D.

Since I also want to use this system for Albert@home, I installed BM 7.0.11 and the latest available CCC from the AMD homepage.
I installed the Lunatics optimised apps, using the settings for HD5XXX cards.
My system has an HD5830 and the on-chip HD 6550D.

A single OpenCL WU finished yesterday; now all WUs start, work for ~15 seconds, and then stop. There is no error; they start again after a short time, but they do not crunch.
The message log says 'Restarting task ... using setiathome_enhanced version 610(ati13ati)'

I have no special settings to run more than one WU at a time.
Any idea where I should start looking?

Alexander
ID: 1190352
skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1190355 - Posted: 31 Jan 2012, 13:31:22 UTC - in response to Message 1190352.  

That sounds like it is related to your computing settings for 'run while computer is in use' and 'use GPU while computer is in use'. Check the computing preferences on your account page.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1190355
garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1190371 - Posted: 31 Jan 2012, 14:01:19 UTC

Hi skildude,

Sorry, I checked the settings and they seem to be OK, and Albert@home is running fine. The SETI WUs try to start and switch back to 'verdrängt' (preempted) after 13-15 seconds.
ID: 1190371
ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 20291
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1190441 - Posted: 31 Jan 2012, 21:52:39 UTC

Not enough GPU memory available for the s@h tasks?...

Happy fast crunchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1190441
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1190483 - Posted: 31 Jan 2012, 23:20:29 UTC - in response to Message 1190352.  

The stderr.txt file produced by the application should have some interesting entries. It can be found in the slot directory, and if you select a task in BOINC Manager then click "Properties" the information shown should include a line indicating which slot directory is in use for that task.
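For anyone who would rather script the search than click through BOINC Manager, here is a minimal sketch that lists every stderr.txt under the slot directories. The data directory path is an assumption (the usual default on Windows 7) and `find_slot_stderr` is just an illustrative helper name, not part of BOINC.

```python
from pathlib import Path

# Assumption: default BOINC data directory on Windows 7; adjust for your install.
BOINC_DATA_DIR = Path(r"C:\ProgramData\BOINC")


def find_slot_stderr(data_dir):
    """Return sorted (slot_number, path) pairs for every slots/<n>/stderr.txt."""
    slots = Path(data_dir) / "slots"
    found = []
    if not slots.is_dir():
        return found
    for slot in slots.iterdir():
        candidate = slot / "stderr.txt"
        # Slot directories are numbered; skip anything else.
        if slot.is_dir() and slot.name.isdigit() and candidate.is_file():
            found.append((int(slot.name), candidate))
    return sorted(found)


if __name__ == "__main__":
    for number, path in find_slot_stderr(BOINC_DATA_DIR):
        print(f"slot {number}: {path} ({path.stat().st_size} bytes)")
```

The slot number printed here should match the slot directory shown in the task's "Properties" dialog.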
                                                                   Joe
ID: 1190483
garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1190605 - Posted: 1 Feb 2012, 6:15:45 UTC

Windows optimized S@H Enhanced application by Alex Kan
Version info: v8b2-SSE3x (AMD/Intel, Core 2-optimized v8-nographics) V5.13 by Alex Kan
SSE3x Win32 Build 386 , Ported by : Jason G, Raistmer, JDWhale

CPUID: AMD A8-3870 APU with Radeon(tm) HD Graphics
Speed: 1 x 3000 MHz
Cache: L1=64K L2=1024K
Features: MMX SSE SSE2 SSE3

Work Unit Info:
...............
Credit multiplier is : 2.85
WU true angle range is : 0.431849
boinc_exit(): requesting safe worker shutdown ->
boinc_exit(): worker didn't respond to exit request within 2 seconds, exiting anyway.
Restarted at 3.85 percent.
boinc_exit(): requesting safe worker shutdown ->
boinc_exit(): worker didn't respond to exit request within 2 seconds, exiting anyway.


Will this help?
ID: 1190605
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1190723 - Posted: 1 Feb 2012, 17:42:43 UTC - in response to Message 1190605.  

That looks like it's from the CPU AK_v8b2 application, so it doesn't say much about the OpenCL application for the ATI GPU.

Your 'Restarting task ... using setiathome_enhanced version 610(ati13ati)' note implies boinc_temporary_exit() is being used. I was hoping a stderr.txt from the OpenCL app would also show why the app wasn't able to run. The BOINC wiki page assumes not enough memory available and that's the most likely thing, but other kinds of initialization problems may cause the application to ask BOINC to try again a little later.
                                                                  Joe
ID: 1190723
garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1190759 - Posted: 1 Feb 2012, 21:13:27 UTC

Joe,
many thanks for your effort to help.
I cannot believe that memory is an issue; more likely it is some other process. I've seen a short message saying 'scheduler wait' in the status column, disappearing after a couple of seconds.
I've tried putting the other AMD GPU project on hold; it did not change anything.
Albert@home WUs run fine.

01.02.2012 13:41:12 | | Processor: 4 AuthenticAMD AMD A8-3870 APU with Radeon(tm) HD Graphics [Family 18 Model 1 Stepping 0]
01.02.2012 13:41:12 | | Processor: 1.00 MB cache
01.02.2012 13:41:12 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni cx16 syscall nx lm svm sse4a osvw ibs skinit wdt page1gb rdtscp 3dnowext 3dnow
01.02.2012 13:41:12 | | OS: Microsoft Windows 7: Home Premium x64 Edition, Service Pack 1, (06.01.7601.00)
01.02.2012 13:41:12 | | Memory: 7.50 GB physical, 14.99 GB virtual
01.02.2012 13:41:12 | | Disk: 111.69 GB total, 73.43 GB free
01.02.2012 13:41:12 | | Local time is UTC +1 hours
01.02.2012 13:41:12 | | VirtualBox version: 4.1.8
01.02.2012 13:41:12 | | NVIDIA GPU 0: GeForce 9500 GT (driver version 285.62, CUDA version 4.10, compute capability 1.1, 512MB, 473MB available, 130 GFLOPS peak)
01.02.2012 13:41:12 | | ATI GPU 0: ATI Radeon HD 5800 series (Cypress) (CAL version 1.4.1664, 1024MB, 991MB available, 2688 GFLOPS peak)
01.02.2012 13:41:12 | | ATI GPU 1: AMD SUPERSUMO (CAL version 1.4.1664, 512MB, 479MB available, 960 GFLOPS peak)
01.02.2012 13:41:12 | | OpenCL: NVIDIA GPU 0: GeForce 9500 GT (driver version 285.62, device version OpenCL 1.0 CUDA, 512MB, 473MB available)
01.02.2012 13:41:12 | | OpenCL: ATI GPU 0: Cypress (driver version CAL 1.4.1664 (VM), device version OpenCL 1.1 AMD-APP (851.4), 1024MB, 991MB available)
01.02.2012 13:41:12 | | OpenCL: ATI GPU 1: BeaverCreek (driver version CAL 1.4.1664 (VM), device version OpenCL 1.1 AMD-APP (851.4), 512MB, 479MB available)
01.02.2012 13:41:12 | | NVIDIA GPU is OpenCL-capable
01.02.2012 13:41:12 | | ATI GPU is OpenCL-capable
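The OpenCL lines in the log above can be pulled out with a short script, which makes it easier to see which devices BOINC enumerated and in what order. This is only a sketch: the regular expression is written against the three lines shown here and may need adjusting for other BOINC versions, and `OpenCLDevice` and `parse_opencl_devices` are illustrative names.

```python
import re
from typing import NamedTuple


class OpenCLDevice(NamedTuple):
    vendor: str
    index: int
    name: str
    total_mb: int
    available_mb: int


# Matches lines such as "... | | OpenCL: ATI GPU 1: BeaverCreek (driver version ..., 512MB, 479MB available)"
OPENCL_LINE = re.compile(
    r"OpenCL: (?P<vendor>\w+) GPU (?P<index>\d+): (?P<name>.+?) "
    r"\(driver version .*?, (?P<total>\d+)MB, (?P<avail>\d+)MB available\)"
)


def parse_opencl_devices(log_text):
    """Return the OpenCL devices in the order they appear in the log."""
    devices = []
    for line in log_text.splitlines():
        match = OPENCL_LINE.search(line)
        if match:
            devices.append(OpenCLDevice(
                match["vendor"], int(match["index"]), match["name"],
                int(match["total"]), int(match["avail"]),
            ))
    return devices


# The three OpenCL lines from the log above.
SAMPLE = """\
01.02.2012 13:41:12 | | OpenCL: NVIDIA GPU 0: GeForce 9500 GT (driver version 285.62, device version OpenCL 1.0 CUDA, 512MB, 473MB available)
01.02.2012 13:41:12 | | OpenCL: ATI GPU 0: Cypress (driver version CAL 1.4.1664 (VM), device version OpenCL 1.1 AMD-APP (851.4), 1024MB, 991MB available)
01.02.2012 13:41:12 | | OpenCL: ATI GPU 1: BeaverCreek (driver version CAL 1.4.1664 (VM), device version OpenCL 1.1 AMD-APP (851.4), 512MB, 479MB available)
"""

for device in parse_opencl_devices(SAMPLE):
    print(device)
```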

Alexander
ID: 1190759
garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1190769 - Posted: 1 Feb 2012, 22:09:55 UTC

I've now updated to BM 7.0.12.
It's not exactly the same behaviour; one WU now has a very long stderr.txt.
Here is a link to it:
http://dl.dropbox.com/u/50246791/stderr.txt

And it contains the line
ERROR: OpenCL kernel/call 'Enqueueing kernel:PC_find_spike_kernel_cl' call failed (-55) in file ..\analyzeFuncs.cpp near line 3106.

Hope this helps!

Alexander
ID: 1190769
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1190827 - Posted: 2 Feb 2012, 1:39:29 UTC - in response to Message 1190769.  

Hmm, the reported error is fairly inscrutable. Raistmer has pointed out elsewhere that -55 is the numerical code for CL_INVALID_WORK_ITEM_SIZE but the selected ATI GPU has "Max work group size: 256" so that should not be the problem. It's curious that so little information is presented for the BeaverCreek GPU within the APU, but that doesn't seem likely to be the cause.
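To make the distinction between the two size limits concrete, here is a small pure-Python sketch of the checks involved. It does not call OpenCL at all; the error numbers are the ones from the Khronos cl.h header, the 256 group-size limit is the value quoted above, and the per-dimension limits of (256, 256, 256) are a hypothetical example rather than values read from this host. The order of the two checks is also my assumption.

```python
# Subset of OpenCL status codes as defined in the Khronos cl.h header.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -53: "CL_INVALID_WORK_DIMENSION",
    -54: "CL_INVALID_WORK_GROUP_SIZE",
    -55: "CL_INVALID_WORK_ITEM_SIZE",
}


def check_local_size(local_size, max_item_sizes, max_group_size):
    """Mimic two of the size checks a runtime makes when a kernel is enqueued."""
    # Any single dimension larger than the device's per-dimension limit
    # (CL_DEVICE_MAX_WORK_ITEM_SIZES) is a work-item size error.
    if any(n > limit for n, limit in zip(local_size, max_item_sizes)):
        return -55
    # The product of all dimensions must fit CL_DEVICE_MAX_WORK_GROUP_SIZE.
    total = 1
    for n in local_size:
        total *= n
    if total > max_group_size:
        return -54
    return 0


# Hypothetical per-dimension limits; 256 is the group-size limit quoted above.
ITEM_LIMITS, GROUP_LIMIT = (256, 256, 256), 256

print(CL_ERRORS[check_local_size((64, 4), ITEM_LIMITS, GROUP_LIMIT)])   # CL_SUCCESS
print(CL_ERRORS[check_local_size((512, 1), ITEM_LIMITS, GROUP_LIMIT)])  # CL_INVALID_WORK_ITEM_SIZE
print(CL_ERRORS[check_local_size((64, 8), ITEM_LIMITS, GROUP_LIMIT)])   # CL_INVALID_WORK_GROUP_SIZE
```

The point of the sketch is that a launch size chosen from one device's limits can fail with -55 if it ends up being submitted to a device with smaller limits.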

I'm out of my depth on this, but I'll note that others who have had unexplained -55 errors have cured them by changing to different drivers. And I'll also note that with both ATI and NVIDIA GPUs in the system, ATI drivers after NV drivers is the install order which users have found actually works.

Beyond that, I hope those with some practical OpenCL experience will look for clues in the stderr.txt you captured. If the task happened to finish, BOINC would only report the last 64KB. That would actually be fairly complete, since the more than 500KB consists of 126 iterations of the same cycle, but it's better to know than to have to guess that it's all repetitive.
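As a rough way to check how repetitive a long stderr.txt is, and to see what a 64KB tail would still contain, something like the following could be used. It is only a sketch: `tail_bytes` and `error_line_counts` are illustrative helper names, and matching on the literal text "ERROR" is an assumption based on the one error line quoted in this thread.

```python
from collections import Counter

REPORT_LIMIT = 64 * 1024  # the 64KB figure mentioned above


def tail_bytes(path, limit=REPORT_LIMIT):
    """Return the last `limit` bytes of a file (the whole file if it is smaller)."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)                  # jump to the end to learn the size
        size = handle.tell()
        handle.seek(max(0, size - limit))  # step back at most `limit` bytes
        return handle.read()


def error_line_counts(data):
    """Count identical lines containing 'ERROR' in a bytes blob."""
    text = data.decode("utf-8", errors="replace")
    return Counter(line.strip() for line in text.splitlines() if "ERROR" in line)


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        # Usage: python this_script.py path/to/stderr.txt
        for line, count in error_line_counts(tail_bytes(sys.argv[1])).most_common(5):
            print(count, line)
```

If the same few lines account for nearly all of the counts, the tail is very likely representative of the whole file.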
                                                                  Joe
ID: 1190827
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1190842 - Posted: 2 Feb 2012, 3:31:28 UTC
Last modified: 2 Feb 2012, 3:36:56 UTC

-55 is indeed the invalid work-item size error (CL_INVALID_WORK_ITEM_SIZE), and in this case it can be caused by a mixed ATi/NV setup.

To confirm this, try removing the NV GPU from the PC and restarting that task.
Does the error still appear?

EDIT: one possibility would be to physically swap the GPUs inside the case to get a different reporting order for OpenCL enumeration. Also, reinstalling the AMD driver after the NV one could give a different order too.

Switching to the non-HD5 build would not help, but switching to the LHD4K build could (though this variant will definitely lower host performance for ATi GPUs, so it is better to swap the cards for now).
I hope this issue will be fixed in the upcoming build.
ID: 1190842
garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1190863 - Posted: 2 Feb 2012, 6:01:15 UTC

THX for all the help.

I've set my nVidia project to no new work. When the remaining tasks are done I'll try to switch the cards or even remove the nvidia.
BTW, the drivers were installed in the posted order, nVidia first and AMD afterwards.

Alexander
ID: 1190863
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1191016 - Posted: 2 Feb 2012, 18:38:47 UTC - in response to Message 1190863.  

I'm testing a build with a fix for your issue, so maybe GPU removal will soon not be needed...
ID: 1191016
garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1191020 - Posted: 2 Feb 2012, 18:52:07 UTC
Last modified: 2 Feb 2012, 18:53:13 UTC

well,

removing the nvidia card did the trick ...
Swapping the PCI-E slots is a problem due to the available space.

So, what is the source of the problem and how can it be solved? Different settings in the Lunatics installer, or is it a scheduler problem that needs to be posted in the BOINC forum?

Alexander

Edit: a couple of seconds too late, Raistmer already posted an answer ...
ID: 1191020
garfield
Volunteer tester
Joined: 4 Jan 02
Posts: 45
Credit: 7,409,265
RAC: 65
Austria
Message 1191336 - Posted: 3 Feb 2012, 23:51:25 UTC

I've got the new build; it's working now.
http://setiathome.berkeley.edu/show_host_detail.php?hostid=6392383

THX a lot for the help!

Alexander
ID: 1191336
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1191404 - Posted: 4 Feb 2012, 6:49:36 UTC

If anyone else runs into the same problem, here is the link to the updated binaries: http://dl.dropbox.com/u/60381958/MB7_r426hd5.7z
The bugfix will of course be included in the next regular release.
ID: 1191404
