OpenCL AstroPulse crash after processing completion - write here.
Author | Message |
---|---|
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Most probably the app can't get the buffer it requires in GPU memory. The allocated memory size depends on both the unroll and ffa_block params. SETI apps news We're not gonna fight them. We're gonna transcend them. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I can consistently provoke the CL_OUT_OF_RESOURCES Error with r1316, using one at a time, with the settings; DATA_CHUNK_UNROLL set to:12; FFA thread block override value:10240; FFA thread fetchblock override value:2560. There are a few articles on that Error, OpenCL kernel crashes with -5 Error. This is a memory error, just as the thread topic appears to be. I have already reinstalled the driver since reinstalling it when I swapped the NV8800 & 250, and just received the CL_OUT_OF_RESOURCES Error with settings of -unroll 12 -ffa_block 9216 -ffa_block_fetch 2304 -sbs 256. Doesn't sound right, to me. Running out of memory on a single task with a card that has 1 GB of vRAM? -1 (0xffffffffffffffff) Unknown error number - exit code -1 (0xffffffff) |
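For readers trying to place the "-5 Error" mentioned above, here is a minimal sketch that maps a few OpenCL status codes to their symbolic names. The numeric values are the ones defined in the Khronos `CL/cl.h` header; the table is deliberately partial, and the helper name `describe_cl_error` is made up for this illustration and is not part of the AstroPulse app.

```python
# A few OpenCL status codes and their names, as defined in CL/cl.h.
# This table is intentionally partial; it only covers memory-related codes.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -61: "CL_INVALID_BUFFER_SIZE",
}


def describe_cl_error(code: int) -> str:
    """Return the symbolic name of an OpenCL status code, if it is in the table."""
    return OPENCL_ERRORS.get(code, f"unrecognized OpenCL status {code}")


if __name__ == "__main__":
    print(describe_cl_error(-5))
```

Note that CL_OUT_OF_RESOURCES is a fairly generic failure on the device side, so the name alone does not prove that a buffer allocation was the cause.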
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Have you got it sorted yet? The Debugger is still running, here is a fresh one for you; ap_07oc12ae_B1_P0_00061_20130224_25925.wu_0 Yes, my performance might decrease if I keep getting Invalids on Overflows where each App Exits at a different time depending on which App you are running. You would think that each App agreeing on 30/30/0.00 would be enough. I'm beginning to lose my amusement... |
Mike Send message Joined: 17 Feb 01 Posts: 34271 Credit: 79,922,639 RAC: 80 |
In all honesty, for a mid-range card you are running too aggressive params. You have finished enough valid units, so I know the card and the app are working. Also, you change the values too often. Let it settle down first with a safe setting. With each crime and every kindness we birth our future. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Right now I'm using -unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -sbs 256 -hp. The card is running in the mid-80s percent of GPU load. The card has 12 Compute units, hence the Unroll 12. How do you suggest increasing the GPU load up to around 90-95% without trying more instances? I've already found that I have to lower the settings to such a point that running 2 does nothing except increase completion times. If I go above 8192, I receive the Memory Error. Below, well, I'd really like to run at more than 70% GPU load. Why do you think running a card with 12 Compute units at 85% is being too aggressive? |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
This is a memory error, just as the thread topic appears to be. I have already reinstalled the driver since reinstalling it when I swapped the NV8800 & 250, and just received the CL_OUT_OF_RESOURCES Error with settings of -unroll 12 -ffa_block 9216 -ffa_block_fetch 2304 -sbs 256. Doesn't sound right, to me. Running out of memory on a single task with a card that has 1 GB of vRAM? Try a different driver/APP runtime, according to your stderr.txt the driver/APP runtime you're running reports the following: OpenCL Platform Name: AMD Accelerated Parallel Processing While my HD7770 on Cat 12.8 reports: OpenCL Platform Name: AMD Accelerated Parallel Processing Claggy |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I'm running Windows XP. I'm using the most recent driver for Windows XP. I tried going backwards a few months ago, it didn't help anything back then. BTW, The App was designed for the Driver I'm using, 12.1. I predict that we will see many more 'inconclusive' results as more people use a newer driver/app than the people using XP who can't update the Driver... |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
I'm running Windows XP. I'm using the most recent driver for Windows XP. I tried going backwards a few months ago, it didn't help anything back then... Then the next step would be Windows 7, and first the same driver, then a later driver. Claggy |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Go ahead and send me a copy, I'll try it out. I'd much rather reboot my machine and use the installed version of OSX Lion though. So much so, that I refuse to buy another copy of Windows for this Mac. I did buy a copy of Windows 8 for the MC PC though. It appears my old CUDA cards hate it, the ATI 4670 appears unaffected. You can send any version of Windows 7 for my Mac, I'm not picky. Thanks ;-) |
Claggy Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Go ahead and send me a copy, I'll try it out. I'd much rather reboot my machine and use the installed version of OSX Lion though. So much so, that I refuse to buy another copy of Windows for this Mac. I did buy a copy of Windows 8 for the MC PC though. It appears my old CUDA cards hate it, the ATI 4670 appears unaffected. AMD/ATI has changed the memory sizes available from driver to driver before, if you can't see that the only way you're going to increase them is to run a recent driver, and that means Windows Vista, 7 or 8, then it's not my problem, and snidey remarks aren't going to make me help you again. Claggy |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Running out of memory on a single task with a card that has 1 GB of vRAM? Sure, if a single buffer's size exceeds the maximum allocation size for that particular card. The memory can't all be allocated in one buffer; look at the clinfo output that each stderr contains. SETI apps news We're not gonna fight them. We're gonna transcend them. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
I'm running Windows XP. I'm using the most recent driver for Windows XP. I tried going backwards a few months ago, it didn't help anything back then. Unfortunately, everything has its own limitations. What about 3 instances on this quite fast GPU? Or maybe even 4, with more free CPU cores. But then you would need to decrease the params. Currently the AP app (to save GPU memory) reallocates GPU buffers (unlike the MB app, which does its allocations only at the beginning). AP uses 2 quite different computation loops: the so-called "main loop", where memory consumption is governed by the -unroll switch, and the FFA loop, where memory consumption is governed by the -ffa_block switch. If one of the modes requires considerably more GPU memory than the other and you happen to run a few app instances with an unlucky offset where all of the instances require max memory, then you will see semi-random out of resources failures. A fast GPU and "only" 1 GB of memory make it harder to reach max load. Especially with XP + a non-VM driver (VM should mean virtual memory, as people on the AMD forums think; I saw no confirmation of this from AMD staff, but it makes some sense). P.S. And, as Claggy noticed, your driver can cut the GPU memory size by a factor of 2, leaving you with only 512MB of OpenCL-available memory. That makes the situation even worse. I know you can't go to a later Cat with XP, but maybe it's worth trying lower ones like 11.12? Looks like there is a need for 2 separate builds: one with the latest SDK to match the latest Cat drivers and one with the oldest (like 2.4) SDK to support older configs. A one-fits-all approach doesn't work well, it seems. SETI apps news We're not gonna fight them. We're gonna transcend them. |
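The "unlucky offset" point above can be sketched with a toy model. Everything numeric here is a hypothetical placeholder (the per-phase footprints and the budget are not measured values from the AP app); the sketch only shows why failures would look semi-random: the instances exceed the budget only for some combinations of phases.

```python
from itertools import product

# Hypothetical numbers for illustration only, not measured from the AP app.
BUDGET_MB = 512                          # assumed OpenCL-visible GPU memory
PHASE_MB = {"main": 100, "ffa": 200}     # assumed per-instance footprint per loop


def total_usage(phases):
    """Sum the footprint of every instance, given the phase each one is in."""
    return sum(PHASE_MB[p] for p in phases)


def overlap_report(instances, budget_mb=BUDGET_MB):
    """Count how many phase combinations exceed the budget.

    Returns (failing_combinations, all_combinations).
    """
    combos = list(product(PHASE_MB, repeat=instances))
    failing = [c for c in combos if total_usage(c) > budget_mb]
    return len(failing), len(combos)


if __name__ == "__main__":
    for n in (2, 3, 4):
        bad, total = overlap_report(n)
        print(f"{n} instances: {bad} of {total} phase combinations exceed the budget")
```

Under these assumed numbers, two instances always fit, three fail only when all of them sit in the heavier loop at once, and four fail most of the time, which is the pattern of occasional failures described in the post.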
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Go ahead and send me a copy, I'll try it out. I'd much rather reboot my machine and use the installed version of OSX Lion though. So much so, that I refuse to buy another copy of Windows for this Mac. I did buy a copy of Windows 8 for the MC PC though. It appears my old CUDA cards hate it, the ATI 4670 appears unaffected. There you go, fixed it for you. It was actually an attempt at Humor. Do you think running a driver the App wasn't designed for is a 'good' idea? I remember a little thread about a new Driver AMD released that caused problems around here. Something about going back to 12.3, or something like that. I'm still hearing about problems with the newest AMD drivers and the AP App. The Default App r1316 wasn't designed for the newer Driver/App. I can see real problems ahead. What is the latest, install the old OpenCL but the newer Driver? I'm getting that 'inconclusive' feeling already. Seriously, do you think that is a good idea? |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I'm running Windows XP. I'm using the most recent driver for Windows XP. I tried going backwards a few months ago, it didn't help anything back then. No luck with 11.12, it gives the same as 12.1. I suspect someone reading this knows about the others down to 11.7.
Number of devices: 1
Max compute units: 12
Max work group size: 256
Max clock frequency: 775Mhz
Max memory allocation: 134217728
Cache type: None
Cache line size: 0
Cache size: 0
Global memory size: 536870912
Constant buffer size: 65536
Max number of constant args: 8
Local memory type: Scratchpad
Local memory size: 32768
Queue properties: Out-of-Order: No
Name: Barts
Vendor: Advanced Micro Devices, Inc.
Driver version: CAL 1.4.1646
Version: OpenCL 1.1 AMD-APP (831.4)
|
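To put the clinfo figures quoted above in more familiar units, here is a small sketch that converts them to MiB and checks whether a requested buffer would fit under the per-buffer limit. The two constants are copied from that output; the helper `fits_in_one_buffer` and the 200 MiB request are hypothetical, chosen only for illustration.

```python
MIB = 1024 * 1024

# Values copied from the clinfo output quoted in the thread.
MAX_MEM_ALLOC = 134217728     # "Max memory allocation"
GLOBAL_MEM_SIZE = 536870912   # "Global memory size"


def fits_in_one_buffer(requested_bytes: int, max_alloc: int = MAX_MEM_ALLOC) -> bool:
    """True if a single buffer of this size is within the per-buffer limit."""
    return requested_bytes <= max_alloc


if __name__ == "__main__":
    print("max single allocation:", MAX_MEM_ALLOC // MIB, "MiB")
    print("global memory:", GLOBAL_MEM_SIZE // MIB, "MiB")
    print("limit as a fraction of global memory: 1 /", GLOBAL_MEM_SIZE // MAX_MEM_ALLOC)
    # Hypothetical request size, for illustration only.
    print("200 MiB buffer fits:", fits_in_one_buffer(200 * MIB))
```

So with this driver the card exposes 512 MiB to OpenCL and allows at most 128 MiB in any one buffer, which is how a single task can fail on a card sold with 1 GB of vRAM.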
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
Sorry to interrupt the current conversation, but I suddenly had three error results which may or may not be related to the restart-at-end-of-processing issue I believe was being discussed earlier: Host 1504137, NV WU 1 Host 1504137, NV WU 2 Host 6077487, ATI WU All run-times seem normal, but I don't see any error messages in the log - the only error given is that the client aborted the application for some reason... Soli Deo Gloria |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
|
Wedge009 Send message Joined: 3 Apr 99 Posts: 451 Credit: 431,396,357 RAC: 553 |
No worries, hopefully they won't be that common. Soli Deo Gloria |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I'm running Windows XP. I'm using the most recent driver for Windows XP. I tried going backwards a few months ago, it didn't help anything back then. I thought I'd run 11.12 until I get the Exit Code -1073741819 (0xc0000005) Error, nothing yet. Looks like there is need in 2 separate builds - with latest SDK to match latest Cat drivers and with oldest (like 2.4) SDK to support older configs. I see nothing wrong with the current r1316 for the older rigs; at some point everyone will eventually update to the last Driver XP can use. The 0xc0000005 Error seems to occur only every 50 or so tasks for those that get it. You could make the cutoff for the older rigs OpenCL 1.2/SDK 2.8, that appears to be the problem point. I'm still getting 'Inconclusive/Invalids' from those tasks I ran two at a time with lowered settings. Something is going on there. I ran the App single a few times, then switched to two at a time. I immediately got the out of memory errors with BOINC .45 doing its restart routine. I switched to .52 to get away from the BOINC restarts, then lowered the settings to get away from the out of memory errors. For some reason, the tasks run from that point on have something 'different' about them. Everything seems fine with the ones run one at a time. I 'could' try two at a time with 11.12/r1316 and see if it reoccurs... |
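The two spellings of the exit code above are the same 32-bit value read as signed and as unsigned. A minimal sketch of the conversion follows; the helper names are made up for this example, and the lookup table holds only the one Windows status code relevant here (0xC0000005 is STATUS_ACCESS_VIOLATION).

```python
# Partial lookup: only the Windows NTSTATUS value that appears in this thread.
KNOWN_STATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def as_unsigned32(code: int) -> str:
    """Format a (possibly negative) exit code as an unsigned 32-bit hex string."""
    return f"0x{code & 0xFFFFFFFF:08x}"


def status_name(code: int) -> str:
    """Return the name of a known status code, or 'unknown' if not in the table."""
    return KNOWN_STATUS.get(code & 0xFFFFFFFF, "unknown")


if __name__ == "__main__":
    for exit_code in (-1073741819, -1):
        print(exit_code, as_unsigned32(exit_code), status_name(exit_code))
```

An access violation points at the process touching memory it should not, which is a different failure from the OpenCL out-of-resources error discussed earlier in the thread.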
trader Send message Joined: 25 Jun 00 Posts: 126 Credit: 4,968,173 RAC: 0 |
been reading this thread and i think i'll ask a question prior to installing a new video card. first off, specs: msi z77a-g45 mb, corsair 16gb ram 1600mhz xmp, i7 3770k oc'd to 4.2ghz, intel 120gb ssd, wd 1tb hdd, win 7 ult x64 sp1. currently using onboard intel graphics, but ordered and expect to get by friday an msi gtx680 lightning. now before i install this and start up seti, is there anything i can do to prevent the headaches i see here? I RTFM and it was WYSIWYG then i found out it was a PEBKAC error |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Sure, you can disable AP work on the NV GPU (in your case, just disable AP work entirely). But if you want to test Intel OpenCL AP processing, you could go into an anonymous platform setup and configure the project to run AP tasks only on the Intel GPU and not on your NV card. Then you could help with Intel GPU testing without sacrificing NV GPU productivity. SETI apps news We're not gonna fight them. We're gonna transcend them. |
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.