ATI MB rev170 Now 177 beta app testing

Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1069605 - Posted: 22 Jan 2011, 21:59:26 UTC - in response to Message 1069523.  
Last modified: 22 Jan 2011, 22:01:55 UTC

I did some research on my results.

The run times on 0.42 units are a little confusing to me.

Some finish in 1800 seconds and others need over 3000 seconds.
Same AR.

Any ideas?

Links to these specific results?



http://setiathome.berkeley.edu/result.php?resultid=1778140015

http://setiathome.berkeley.edu/result.php?resultid=1778437846


With each crime and every kindness we birth our future.
ID: 1069605
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1069809 - Posted: 23 Jan 2011, 9:02:59 UTC

The first one (~1800 s) looks quite OK; the second has unfortunately already been purged.
ID: 1069809
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1069810 - Posted: 23 Jan 2011, 9:04:34 UTC - in response to Message 1069809.  
Last modified: 23 Jan 2011, 9:08:15 UTC

The first one (~1800 s) looks quite OK; the second has unfortunately already been purged.


Another one.

http://setiathome.berkeley.edu/result.php?resultid=1778214897

Fresh one.

http://setiathome.berkeley.edu/result.php?resultid=1778214644


With each crime and every kindness we birth our future.
ID: 1069810
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1069818 - Posted: 23 Jan 2011, 9:34:42 UTC - in response to Message 1069810.  

All looks OK; nothing apparent in the counters could account for the huge differences in elapsed time.
ID: 1069818
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1069826 - Posted: 23 Jan 2011, 10:39:59 UTC - in response to Message 1069818.  

All looks OK; nothing apparent in the counters could account for the huge differences in elapsed time.

I thought the same.
That's the reason I'm confused.

With each crime and every kindness we birth our future.
ID: 1069826
jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1069827 - Posted: 23 Jan 2011, 10:41:47 UTC - in response to Message 1069818.  
Last modified: 23 Jan 2011, 10:50:01 UTC

All looks OK; nothing apparent in the counters could account for the huge differences in elapsed time.

Disable your timers ;) You are executing serialising instructions before and after every kernel call, thereby flushing all cache levels and forcing refetches that are prone to bus contention.

Jason
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1069827
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1069888 - Posted: 23 Jan 2011, 17:31:37 UTC - in response to Message 1069827.  


Disable your timers ;) You are executing serialising instructions before and after every kernel call, thereby flushing all cache levels and forcing refetches that are prone to bus contention.

Jason

Most of them are just counters with a non-blocking count.
The others are called only around big chunks of code (and, after all, each transfer between host memory and the GPU should flush the cache too).
ID: 1069888
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1069978 - Posted: 23 Jan 2011, 20:50:12 UTC

I'm failing in several attempts to download the app. At the supplied files.mail.ru site I see links for two files. But if I click on the MB link, my browser takes me back to the same page offering the download.

I was able to do this at a previous link provided for the AP R504 files, and trying that link today I am still actually offered the download.
ID: 1069978
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1070023 - Posted: 23 Jan 2011, 22:51:34 UTC
Last modified: 23 Jan 2011, 22:55:45 UTC

r177 is working OK on an HD6970:

http://setiathome.berkeley.edu/result.php?resultid=1753166479


Max compute units: 24
Max work group size: 256
Max clock frequency: 880Mhz
Max memory allocation: 268435456
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 1073741824
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties:
Out-of-Order: No
Name: Cayman
Vendor: Advanced Micro Devices, Inc.
Driver version: CAL 1.4.900
Version: OpenCL 1.1 ATI-Stream-v2.3 (451)
Extensions: cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_amd_device_attribute_query cl_amd_printf cl_amd_media_ops cl_amd_popcnt cl_khr_d3d10_sharing
ID: 1070023
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1070041 - Posted: 23 Jan 2011, 23:40:55 UTC - in response to Message 1069978.  

I'm failing in several attempts to download the app. At the supplied files.mail.ru site I see links for two files. But if I click on the MB link, my browser takes me back to the same page offering the download.
Ah, found the difference: one URL started with download82, which works, while the other started with content3, which takes one around in a circle.

Downloaded and installed--now to wait for Berkeley to resume sending work.

ID: 1070041
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1070158 - Posted: 24 Jan 2011, 9:49:15 UTC - in response to Message 1070041.  

I'm failing in several attempts to download the app. At the supplied files.mail.ru site I see links for two files. But if I click on the MB link, my browser takes me back to the same page offering the download.
Ah, found the difference: one URL started with download82, which works, while the other started with content3, which takes one around in a circle.

Downloaded and installed--now to wait for Berkeley to resume sending work.

It's probably because the last link contained two archives while the first one had only one. Perhaps the web server builds the download pages differently in these cases.
ID: 1070158
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1070421 - Posted: 25 Jan 2011, 4:37:00 UTC

After downloading and installing MB r177, I initially had no work. With downloads resumed, I've got 8 MB tasks assigned to ATI, and the first has started running (after I suspended some ATI AP results).

Here is the thing that seems odd: %progress is incrementing steadily, but seemingly based entirely on CPU work, not GPU. With the result currently showing 5.6% complete, most samples on GPU-Z show 0% GPU workload, while the task is using about 3/4 of a virtual CPU on my hyperthreaded Westmere.

Here is the work unit

Are there some kinds of WU for which this ATI GPU app leaves all the work on the CPU? Or is it more likely that I have done my installation incorrectly, and this is a symptom of my error?
ID: 1070421
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1070459 - Posted: 25 Jan 2011, 7:55:40 UTC - in response to Message 1070421.  
Last modified: 25 Jan 2011, 7:56:17 UTC

After downloading and installing MB r177, I initially had no work. With downloads resumed, I've got 8 MB tasks assigned to ATI, and the first has started running (after I suspended some ATI AP results).

Here is the thing that seems odd: %progress is incrementing steadily, but seemingly based entirely on CPU work, not GPU. With the result currently showing 5.6% complete, most samples on GPU-Z show 0% GPU workload, while the task is using about 3/4 of a virtual CPU on my hyperthreaded Westmere.

Here is the work unit

Are there some kinds of WU for which this ATI GPU app leaves all the work on the CPU? Or is it more likely that I have done my installation incorrectly, and this is a symptom of my error?

AFAIK your GPU has a maximum of 128 work-items per workgroup. Most ATI GPUs have 256 instead.
Unfortunately, the current ATI MB version (r177) cannot run on such GPUs.
This issue was detected by Helli; your results support it.
Check stderr.txt; it probably contains many error messages.
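
For anyone who wants to check their own card against this limit, here is a minimal Python sketch. It assumes you have a device report in the same text format as the HD6970 listing posted earlier in the thread, and it assumes r177 needs 256 work-items per workgroup, which is only inferred from the statement that 128-limited GPUs cannot run it. The function names are made up for illustration and are not part of the app.

```python
import re
from typing import Optional

# Assumption: r177 needs 256 work-items per workgroup. This is inferred from
# the thread (GPUs limited to 128 cannot run it), not taken from the app source.
REQUIRED_WORK_GROUP_SIZE = 256


def max_work_group_size(device_report: str) -> Optional[int]:
    """Return the 'Max work group size' value from a device report, or None."""
    match = re.search(
        r"^\s*Max work group size:\s*(\d+)\s*$", device_report, re.MULTILINE
    )
    return int(match.group(1)) if match else None


def can_run_r177(device_report: str) -> bool:
    """True if the reported limit meets the assumed r177 requirement."""
    size = max_work_group_size(device_report)
    return size is not None and size >= REQUIRED_WORK_GROUP_SIZE


# A few lines copied from the HD6970 report earlier in the thread.
hd6970_report = """Max compute units: 24
Max work group size: 256
Max clock frequency: 880Mhz
"""

print(max_work_group_size(hd6970_report))         # 256
print(can_run_r177(hd6970_report))                # True
print(can_run_r177("Max work group size: 128"))   # False
```

If the report has no such line, the sketch treats the card as unsupported rather than guessing.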
ID: 1070459
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1070483 - Posted: 25 Jan 2011, 13:39:55 UTC - in response to Message 1070459.  

Raistmer wrote:
Unfortunately, the current ATI MB version (r177) cannot run on such GPUs.
This issue was detected by Helli; your results support it.
Check stderr.txt; it probably contains many error messages.
I don't see anything I recognize as an error message in stderr.txt. Just eight lines echoing the command line parameter settings, worker thread and process priority adjustment, device number and slot number reports, building program, ending in "main kernels: OK code 0".

The odd thing is that it was reporting steady progress, at a rate suggesting it would hit 100% in about four hours. In case it might show something interesting, I'll take this one out of suspend and see if it completes. But from your information it seems unwise to run more than the one WU of MB r177 on this host.

ID: 1070483
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1070523 - Posted: 25 Jan 2011, 15:55:45 UTC - in response to Message 1070483.  

In case it might show something interesting, I'll take this one out of suspend and see if it completes. But from your information it seems unwise to run more than the one WU of MB r177 on this host.


If it runs to completion and validates OK...
ID: 1070523
archae86
Joined: 31 Aug 99
Posts: 909
Credit: 1,582,816
RAC: 0
United States
Message 1070574 - Posted: 25 Jan 2011, 23:06:52 UTC - in response to Message 1070523.  

Raistmer wrote:
If it runs to completion and validates OK...

The apparent progress in percent complete was reported by BoincTasks, which usually agrees in all digits with that displayed by BoincMgr, after a few seconds of sampling delay.

But BoincMgr never changed from displaying 0.000% complete, even after more than three hours had passed and BoincTasks was reporting more than 100%.

I won't try any more on r177.

ID: 1070574
dnolan
Joined: 30 Aug 01
Posts: 1228
Credit: 47,779,411
RAC: 32
United States
Message 1070827 - Posted: 26 Jan 2011, 15:54:31 UTC

Just noticed an error on this WU.
In case it expires,
Exit status -177 (0xffffffffffffff4f)

and down in the details of the debug code:

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x75CA22A1

Hopefully a one-time error, but thought I'd post it for info anyway.
The task was done on my i7 2600 w/ 2 X HD5870 cards, and ran on device 0.

-Dave
ID: 1070827
Mike (Special Project $75 donor)
Volunteer tester
Joined: 17 Feb 01
Posts: 34258
Credit: 79,922,639
RAC: 80
Germany
Message 1070832 - Posted: 26 Jan 2011, 16:09:33 UTC
Last modified: 26 Jan 2011, 16:10:01 UTC

Had those too, Dave.

It was a bunch of bad units.
Just make sure you don't have more from this tape.

I got over 100 back then.


With each crime and every kindness we birth our future.
ID: 1070832
dnolan
Joined: 30 Aug 01
Posts: 1228
Credit: 47,779,411
RAC: 32
United States
Message 1070834 - Posted: 26 Jan 2011, 16:10:50 UTC

Ok, thanks Mike, good to know.

-Dave
ID: 1070834
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1070909 - Posted: 26 Jan 2011, 20:39:20 UTC - in response to Message 1070827.  

Just noticed an error on this WU.
In case it expires,
Exit status -177 (0xffffffffffffff4f)

and down in the details of the debug code:

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x75CA22A1

Hopefully a one-time error, but thought I'd post it for info anyway.
The task was done on my i7 2600 w/ 2 X HD5870 cards, and ran on device 0.

-Dave

<message>
Maximum elapsed time exceeded
</message>

When BOINC detects that a task has run longer than the project allows, or has allocated more memory than the project allows, or when the user aborts a task, it tells the science application to abort the work. On Windows the science application does so by calling the Windows API function DebugBreak(), which forces a breakpoint to activate the Windows debugger. The idea is to get a dump so that someone can analyze what went wrong and led to the abort. IOW, the "Breakpoint Encountered (0x80000003)" is intentionally programmed for such cases. If the science application were so badly hung that it didn't react to the abort command, BOINC would forcefully terminate it a few seconds later, but I've never seen that happen.
                                                               Joe
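
As a side note, the two forms of the exit status quoted above, -177 and 0xffffffffffffff4f, are the same number: the hex form is the value read as a 64-bit two's-complement integer (the 16 hex digits suggest 64 bits; that width is an assumption here). A small Python sketch to check it, with helper names invented for illustration (they are not BOINC functions):

```python
def to_signed(value: int, bits: int) -> int:
    """Interpret an unsigned integer as a two's-complement signed value."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):      # sign bit set, so the value is negative
        return value - (1 << bits)
    return value


def to_unsigned_hex(value: int, bits: int) -> str:
    """Format a signed integer as its unsigned two's-complement hex form."""
    return hex(value & ((1 << bits) - 1))


# Hex string copied from the task report quoted above.
exit_status_hex = "0xffffffffffffff4f"

print(to_signed(int(exit_status_hex, 16), 64))   # -177
print(to_unsigned_hex(-177, 64))                 # 0xffffffffffffff4f
```

The low byte shows the same thing by hand: 177 is 0xb1, and 0x100 - 0xb1 = 0x4f.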
ID: 1070909

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.