Discussion of Invalid Host Messaging
Author | Message |
---|---|
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
So, not a single AstroPulse "Computation error" since updating to BOINC 7.0.38. Interesting. I had been receiving around one, sometimes two, a day with 7.0.28. I see the AstroPulse error was vaporized, but I did receive 4 others with the nVidia card before adding the FLOP entry. You can see when I updated to 7.0.38 here: Task 2780311703. It's become somewhat of a cliffhanger now: will it pass or throw the Error? The suspense is growing with every passing hour: All AstroPulse v6 tasks for computer 6797524. I suppose I could just change the Unroll to 2 and see what happens, but that would remove the drama... This could save a large amount of Computer Time if updating BOINC solved the problem :-) |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
*** I suppose I could just change the Unroll to 2 and see what happens *** I've completed 6 APs in a row with the Unroll setting at 2. If I had done that back here, ^Something isn't right^, around 5 of those 6 would have ended in a "Computation error". Here are the first 3 of the 6: Task 2784334429, Task 2784338147, Task 2784338745. I'm ready to declare Victory. |
Fred J. Verster Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
*** I suppose I could just change the Unroll to 2 and see what happens *** You could adjust the Fetch_Block and Thread_Block. I see that you use a 1:3 ratio, with Fetch_Block 2048 and Thread_Block 6144. Have you tried other values, for instance Thread_Block 10240 and Fetch_Block 5120, or 8192 and 4096 (1:2)? Running on device number: 0 DATA_CHUNK_UNROLL at default:2 DATA_CHUNK_UNROLL set to:2 FFA thread block override value:6144 FFA thread fetchblock override value:2048 With UNROLL=2, Thread_Block and Fetch_Block can be bigger! Which one is the most effective also depends on which GPU you're using. I see that you use a 9800GT and a BARTS GPU. Quite different architectures, and thus Regsize. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
*** I suppose I could just change the Unroll to 2 and see what happens *** Look at my results from last night and you will see different -ffa_block & -ffa_block_fetch numbers: All AstroPulse v6 tasks for computer 6797524. Since declaring Victory I have been attempting to up the average GPU usage back to around 90%. BOINC 7.0.38 seems to have lowered the average to around 80%. From my experience the -ffa_block & -ffa_block_fetch numbers don't make that much of a difference above the 6144 & 1536 setting I was using for a long time. The Unroll numbers do make a noticeable difference. From what I've read, the optimum Unroll number is equal to your Compute Units, which for the 6850 is 12. I only work Multibeam on the 8800, and try to keep the 6850 on AstroPulses. They use different Apps, different settings. There are a large number of people receiving the "Access Violation (0xc0000005) at address 0x0040xxxx read attempt to address 0x04A3xxxx" Error; all you have to do is look around and you will find plenty. I find them by just looking at the results from my AstroPulse Wingmen. |
Mike Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
GPU usage has nothing to do with the BOINC version. Blankings are responsible for GPU usage because the blanking calculation is done by the CPU. With each crime and every kindness we birth our future. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
GPU usage has nothing to do with the Boinc version. Sorry, Mike. My experience has shown that different BOINC versions do make a difference. I'm well aware of the Blanking slowing down GPU usage, as I've watched the process for quite a while now. I just watched one crawl by at around 40% GPU usage. BTW, remember the problem with running 2 MBs with BOINC 7.0.36? It was solved by going back to BOINC 7.0.28. Well, the problem is back with 7.0.38. Not only that, NOW I'm having problems with running 2 APs with 7.0.38. It's fine until I hit 2 of those Blanked APs at the same time, then I receive a hang. But....But.... I didn't have that problem with APs with BOINC 7.0.28 OR 7.0.36. BOINC versions do make a difference on my MacPro running Win XPsp3 with an AMD 6850. How many of those do you have around here? A MacPro, running Win XPsp3, with a 6850? |
Mike Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
Do you keep a CPU core free? It's more the mixed setup, I would say. Running 2 different GPUs always uses a lot of resources. I'm fully aware of the 6850; it has issues running multiple instances. With each crime and every kindness we birth our future. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
A while back, I went to using a CPU setting of 60% for Multiprocessors. That gave me about the same as I had using a different setting, 2 CPUs for 603 Tasks and 2 CPUs for the GPUs. I've never had a problem with having 2 CPUs free for the 2 GPUs, even when running Multiple Instances on 1 GPU. I'm running the wonderful System Information Viewer continuously, and it has a CPU Process graph for each Process. When the Hang occurs, the CPU process for one AP maxes out in the Red. My guess is BOINC 7.0.38 wants me to sacrifice another CPU for the GPUs. Again, I didn't have that problem in 7.0.28 or 7.0.36. I'm really not interested in running 2 APs at the same time, I was just testing to see if I got the (0xc0000005) error. I didn't receive the (0xc0000005) error, I got something else. I'm happy with running 1 AP at a time, at around 90% GPU usage. If you look back at my results when using 7.0.28, you will notice I was getting some fast times. Those times were with the GPU running around 90% with most tasks. Since updating to 7.0.38, the most I've seen is around 80% usage on the fastest APs I've run so far. I should have run across quite a few at 90%, I haven't....So Far. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
BTW, I still haven't received the dreaded "Access Violation (0xc0000005) at address 0x0040A1FA read attempt to address 0x00399F64" Error on my MacPro since updating to 7.0.38. Later tonight I will be updating the machine with the ATI 4670 to 7.0.38. I'm not sure what's going on with that machine, it's had a terrible week. It usually doesn't give any problems. Now it's given the "Unhandled Exception Detected... - Unhandled Exception Record - Reason: Access Violation (0xc0000005) at address 0x0040A1FA read attempt to address 0x00399F64" Error, followed by an Invalid, and then more questionable results. We'll see what happens after the update. When it rains.... Check this out. I just looked at my last completed AP, and look at what my Wingman got: Access Violation (0xc0000005) at address 0x0040A1FA read attempt to address 0x0052901C. You can't make this stuff up.... |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
After 9 days, another Error. Well, it's better than it used to be: Access Violation (0xc0000005) at address 0x0040A1FA read attempt to address 0x003AA2DC. When the other results are in, I'll bet they validate the results that are currently labeled "Invalid". Something is causing that Error after the results are written... |
trader Joined: 25 Jun 00 Posts: 126 Credit: 4,968,173 RAC: 0 |
If you are getting errors using the stock app and are running Windows x64, PM me and I might be able to help. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
i might be able to help With a 10-week-old problem? Do share. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Maybe someone could send a PM to Cody Sharp and explain to him how to add the command line entry to his ap_cmdline_6.04_windows_intelx86__opencl_ati.txt file. He just showed up at the top of one of my Workgroups again. I guess his file might have a different name since he's running Win 7 64bit. If he just added one of these lines to that file he could probably avoid most of those Errors. I keep thinking about it every time I see one of his Errors, but there are so many of those affected Hosts... Maybe SETI could make a post on the News page about the problem and explain how to add one of the lines. Command Line Parameters: High end cards (more than 12 compute units): -unroll 12 -ffa_block 8192 -ffa_block_fetch 4096 -hp; Mid range cards (less than 12 compute units): -unroll 10 -ffa_block 6144 -ffa_block_fetch 1536 -hp; Entry level GPU (less than 6 compute units): -unroll 4 -ffa_block 2048 -ffa_block_fetch 1024 -hp. It might save some bandwidth...and give people sending PMs a link to reference. |
Mike Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80 |
The app has the lowest values set as default. You only need to do that if you want to go faster. With each crime and every kindness we birth our future. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
It also causes something else, related to the Error. What happens if you set the settings too high in XP? *Out of Memory* It causes the App to use more memory. But you should know that... It sure helped me back here: Then try adding -unroll 10 -ffa_block 6144 -ffa_block_fetch 1536 to your ap_cmdline_6.04_windows_intelx86__opencl_ati.txt file. Send someone having a lot of these Errors a message, see if it helps them. Oh, I'm about out of APs again... |
trader Joined: 25 Jun 00 Posts: 126 Credit: 4,968,173 RAC: 0 |
i might be able to help An answer via an objective interpolation of your definition of the word share when applied to the meager number of posts I have precludes me from providing you with the answer that you desire that will not violate the august moderator's interpretation of what a flame/hate mail is, and I actually care about what he thinks. |
William Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0 |
i might be able to help Which 'he' would that be now, in the last sentence? *puzzled look* A person who won't read has no advantage over one who can't read. (Mark Twain) |
andybutt Joined: 18 Mar 03 Posts: 262 Credit: 164,205,187 RAC: 516 |
We would still like to know your input on the problem, whatever Richard said to offend you. Andy |
Dimly Lit Lightbulb 😀 Joined: 30 Aug 08 Posts: 15399 Credit: 7,423,413 RAC: 1 |
Bump. Member of the People Encouraging Niceness In Society club. |
Wiggo Joined: 24 Jan 00 Posts: 34871 Credit: 261,360,520 RAC: 489 |
Bump. You just had to do that didn't you? It triggered me going through that long list of those non-anonymous bad rigs of mine from last October and seeing how those rigs are going now. A lot have gone, some have been fixed, but sadly a lot are still in "mangle mode". If I get a chance today I'll go through my "Anonymous" list from that same time and see how that one stacks up. Then I may just feel nasty enough to post the full details of all those bad rigs. Cheers. |
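
For reference, the three command line tiers TBar posted earlier in this thread can be sketched as a small Python helper. This is only an illustration and not part of the AstroPulse app or BOINC: the names `ApTuning` and `pick_tuning` are hypothetical. The thread also does not say which tier applies at exactly 12 compute units, so the sketch assumes the more conservative mid range line there, given the out-of-memory risk mentioned for settings that are too high.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApTuning:
    """One set of AstroPulse OpenCL parameters, as quoted in the thread."""
    unroll: int
    ffa_block: int
    ffa_block_fetch: int

    def as_cmdline(self) -> str:
        # Formats the values in the same order and spelling as the posted lines.
        return (f"-unroll {self.unroll} -ffa_block {self.ffa_block} "
                f"-ffa_block_fetch {self.ffa_block_fetch} -hp")


# Values copied from the three tiers posted in the thread.
HIGH_END = ApTuning(unroll=12, ffa_block=8192, ffa_block_fetch=4096)
MID_RANGE = ApTuning(unroll=10, ffa_block=6144, ffa_block_fetch=1536)
ENTRY_LEVEL = ApTuning(unroll=4, ffa_block=2048, ffa_block_fetch=1024)


def pick_tuning(compute_units: int) -> ApTuning:
    """Pick a tier from the GPU's compute unit count.

    Assumption: exactly 12 compute units falls back to MID_RANGE,
    because the thread only defines "more than 12" and "less than 12".
    """
    if compute_units < 1:
        raise ValueError("compute_units must be at least 1")
    if compute_units < 6:
        return ENTRY_LEVEL
    if compute_units <= 12:
        return MID_RANGE
    return HIGH_END


if __name__ == "__main__":
    for cu in (4, 12, 14):
        print(cu, "->", pick_tuning(cu).as_cmdline())
```

The resulting line is what would go into the ap_cmdline_6.04_windows_intelx86__opencl_ati.txt file named above; as TBar notes, that file may have a different name on other platforms.
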
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.