SETI@home v8.22 Windows GPU applications support thread

Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 10050
Credit: 969,065,047
RAC: 1,533,504
United States
Message 2009456 - Posted: 27 Aug 2019, 1:05:47 UTC - in response to Message 2009443.  

Several 15fe13aa GPU tasks that appear to have problems doing much other than hitting the time limit, on my computer and sometimes at least one more computer:
These all gave:
Exit status 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED
Could you check if these share at least one bad input file?

. . the first question is ... have you been doing a lot of task rescheduling?

Stephen

?

What is task rescheduling?

SETI@Home is not the only GPU BOINC project this computer is connected to, though.

If you don't know the answer to that question, then don't worry about it. It's not the problem. Now at least two separate accounts with Windows 10 and 436 Nvidia drivers are erroring out all VHAR tasks with time exceeded errors. I don't call that coincidence. All other hosts running those same tasks on different platforms and different drivers complete them successfully in the usual task times.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2009456 · Report as offensive     Reply Quote
robertmiles
Volunteer tester

Joined: 16 Jan 12
Posts: 187
Credit: 3,708,987
RAC: 2,291
United States
Message 2009458 - Posted: 27 Aug 2019, 1:32:33 UTC - in response to Message 2009456.  

How do I tell VHAR tasks from all other types of tasks?

More than half of the SETI@Home GPU tasks on my computer today have completed much faster, and apparently properly.

I am using a 436 driver, so I'll look for an older driver. Does it matter which one as long as it is recent?
ID: 2009458 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 10050
Credit: 969,065,047
RAC: 1,533,504
United States
Message 2009462 - Posted: 27 Aug 2019, 2:34:13 UTC - in response to Message 2009458.  

How do I tell VHAR tasks from all other types of tasks?

More than half of the SETI@Home GPU tasks on my computer today have completed much faster, and apparently properly.

I am using a 436 driver, so I'll look for an older driver. Does it matter which one as long as it is recent?

The only way to tell is by the task name, and even that only narrows it down: any VHAR task has to be from Arecibo Observatory by default, and its name has to end in just a single _digit suffix such as _0, _1, _2, _3, etc. That suffix is the normal way to designate the copies of the task sent out to the various wingmen. Beyond that, you still have to look at the properties of the task in the client_state.xml file if it hasn't been processed yet, or wait for it to finish and report, and then view the stderr.txt output. The stderr.txt always lists the AR (angle range) of the task.

https://setiathome.berkeley.edu/result.php?resultid=7990213477

WU true angle range is : 0.446129

is what we call the standard angle range for Arecibo tasks.

The breakdown of AR is:

VHARs >1.0 (aka "Shorties")
Mid-range (0.12 - 0.99) (aka "MARs"?)
VLARs <0.12 (aka "OMG Why are these SO SLOW!")

All the tasks coming from the Green Bank Telescope are VLAR and have .vlar appended to the task name. The reason is that the GBT is a tracking telescope, so the target does not move relative to the telescope, unlike the Arecibo telescope, where the Earth is always moving relative to the target in the sky.

Your failing tasks all are VHAR tasks with angle ranges of 2.7.
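
For anyone who wants to sort tasks by the breakdown above, here is a minimal Python sketch. It pulls the angle range out of a stderr.txt line in the format quoted earlier and applies the same thresholds. The function names are made up for illustration, and how the exact boundary values (0.12 and 1.0) are binned is an assumption, since the breakdown does not say.

```python
import re

# Thresholds taken from the breakdown in this post. Treating the exact
# boundary values (0.12 and 1.0) as mid-range is an assumption of this sketch.
VLAR_MAX = 0.12
VHAR_MIN = 1.0

# Matches the stderr.txt line format quoted in this post.
_AR_LINE = re.compile(r"WU true angle range is\s*:\s*([0-9]*\.?[0-9]+)")


def parse_angle_range(stderr_text):
    """Return the angle range found in stderr text, or None if it is absent."""
    match = _AR_LINE.search(stderr_text)
    return float(match.group(1)) if match else None


def classify_angle_range(ar):
    """Map an angle range to the informal class names used in this thread."""
    if ar > VHAR_MIN:
        return "VHAR"
    if ar < VLAR_MAX:
        return "VLAR"
    return "mid-range"


if __name__ == "__main__":
    sample = "WU true angle range is : 0.446129"
    ar = parse_angle_range(sample)
    print(ar, classify_angle_range(ar))
    print(2.7, classify_angle_range(2.7))
```

Paste the contents of a task's stderr.txt into parse_angle_range to check a finished task. For an unprocessed task the value would have to come from client_state.xml instead, which this sketch does not attempt to read.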
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2009462 · Report as offensive     Reply Quote
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1467
Credit: 181,397,520
RAC: 457,389
United States
Message 2009463 - Posted: 27 Aug 2019, 2:36:55 UTC - in response to Message 2009456.  
Last modified: 27 Aug 2019, 2:37:54 UTC

...Now at least two separate accounts with Windows 10 and 436 Nvidia drivers are erroring out all VHAR tasks with time exceeded errors. I don't call that coincidence. And all other hosts running those same tasks on different platforms and different drivers successfully complete them normally in the usual task times.

FWIW, I'm seeing some similar errors as well on my Linux box (Ubuntu 18.04 with the 430.26 driver): https://setiathome.berkeley.edu/results.php?hostid=8729943&offset=0&show_names=0&state=6&appid=
ID: 2009463 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 10050
Credit: 969,065,047
RAC: 1,533,504
United States
Message 2009464 - Posted: 27 Aug 2019, 2:38:02 UTC

You would need to run another VHAR task with an AR of 2.7 on, for example, the 430 driver and have it validate successfully, as a test to determine whether the driver version is the problem.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2009464 · Report as offensive     Reply Quote
robertmiles
Volunteer tester

Joined: 16 Jan 12
Posts: 187
Credit: 3,708,987
RAC: 2,291
United States
Message 2009469 - Posted: 27 Aug 2019, 3:22:21 UTC - in response to Message 2009458.  

How do I tell VHAR tasks from all other types of tasks?

More than half of the SETI@Home GPU tasks on my computer today have completed much faster, and apparently properly.

I am using a 436 driver, so I'll look for an older driver. Does it matter which one as long as it is recent?

I've now installed a 431 driver.
ID: 2009469 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6115
Credit: 100,883,796
RAC: 51,278
Russia
Message 2009487 - Posted: 27 Aug 2019, 9:31:29 UTC

VHAR is the kind of task where SoG runs at full power. There is no PulseFind, so almost all OpenCL calls are scheduled to run on the GPU immediately: true parallel execution. If the driver fails to process such a long sequence, it gets restarted.
That's why VHAR tasks might be more fragile with respect to any driver changes.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2009487 · Report as offensive     Reply Quote
Profile CalicoSkies
Joined: 20 May 99
Posts: 29
Credit: 949,970
RAC: 3,638
United States
Message 2010417 - Posted: 2 Sep 2019, 3:08:11 UTC - in response to Message 1842133.  

I have an Nvidia RTX 2070 Super graphics card (Asus ROG STRIX), and SETI@Home Nvidia tasks would normally take up to 3-4 minutes to complete on the GPU. However, today I noticed SETI@Home Nvidia tasks get to about 0.603% complete and stop, and the estimated time to finish keeps going up.

What might be causing this?
Today I had updated my Nvidia driver, and I'm wondering if that might have something to do with it. I'm currently using the Nvidia 436.15 driver.
ID: 2010417 · Report as offensive     Reply Quote
robertmiles
Volunteer tester

Joined: 16 Jan 12
Posts: 187
Credit: 3,708,987
RAC: 2,291
United States
Message 2010420 - Posted: 2 Sep 2019, 3:28:02 UTC - in response to Message 2010417.  
Last modified: 2 Sep 2019, 3:28:23 UTC

I have an Nvidia RTX 2070 Super graphics card (Asus ROG STRIX), and SETI@Home Nvidia tasks would normally take up to 3-4 minutes to complete on the GPU. However, today I noticed SETI@Home Nvidia tasks get to about 0.603% complete and stop, and the estimated time to finish keeps going up.

What might be causing this?
Today I had updated my Nvidia driver, and I'm wondering if that might have something to do with it. I'm currently using the Nvidia 436.15 driver.


I've recently seen the 436 driver listed as a known cause of errors for some GPU BOINC project, but I don't remember which one. You might try substituting an older driver until you get an answer better than this.
ID: 2010420 · Report as offensive     Reply Quote
Profile Wiggo "Democratic Socialist"
Joined: 24 Jan 00
Posts: 16951
Credit: 235,365,304
RAC: 182,916
Australia
Message 2010422 - Posted: 2 Sep 2019, 3:45:38 UTC - in response to Message 2010417.  

I have an Nvidia RTX 2070 Super graphics card (Asus ROG STRIX), and SETI@Home Nvidia tasks would normally take up to 3-4 minutes to complete on the GPU. However, today I noticed SETI@Home Nvidia tasks get to about 0.603% complete and stop, and the estimated time to finish keeps going up.

What might be causing this?
Today I had updated my Nvidia driver, and I'm wondering if that might have something to do with it. I'm currently using the Nvidia 436.15 driver.
Roll back your driver at least as far as 431.xx, as the latest drivers have a very bad bug in them. ;-)

Cheers.
ID: 2010422 · Report as offensive     Reply Quote
hsdecalc

Joined: 1 Feb 15
Posts: 4
Credit: 2,154,852
RAC: 2,247
Germany
Message 2010437 - Posted: 2 Sep 2019, 8:33:16 UTC

Same problem here (WIN 10). Solved by changing the NVIDIA driver to version 413.70.
It is easy to install with the program “NVIDIA GeForce Experience”.
I opened the options (three dots) and selected the “NVIDIA Studio Driver” instead of the “Game Ready Driver”.
ID: 2010437 · Report as offensive     Reply Quote
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1467
Credit: 181,397,520
RAC: 457,389
United States
Message 2010439 - Posted: 2 Sep 2019, 9:03:29 UTC - in response to Message 2010437.  
Last modified: 2 Sep 2019, 9:05:59 UTC

Same problem here (WIN 10). Solved by changing the NVIDIA driver to version 413.70.
It is easy to install with the program “NVIDIA GeForce Experience”.
I opened the options (three dots) and selected the “NVIDIA Studio Driver” instead of the “Game Ready Driver”.

Unless I'm missing something, you're still running 436.15? At least, that's what BOINC reported a couple minutes ago.
ID: 2010439 · Report as offensive     Reply Quote
CryptokiD
Joined: 2 Dec 00
Posts: 150
Credit: 3,153,258
RAC: 3,070
United States
Message 2010762 - Posted: 5 Sep 2019, 15:01:29 UTC

OK, I have a bit of a problem with my GPU not being fully utilised, and maybe one of you guys could help with this. One of my computers has 2 ATI GPUs, numbered GPU #0 and GPU #1.

OK so, GPU #0 runs fine crunching 1 work unit at a time. The application GPU-Z shows me that GPU #0 is at about 98%. My problem is with GPU #1.....

GPU #1 also works fine; however, it has a low % of GPU usage. GPU-Z shows it running at about 50% to 60%, and it even has periods of 0% lasting 1 to 10 seconds. In the process of troubleshooting I discovered the fix is to run 2 or maybe even 3 work units at a time on GPU #1 to keep it fully utilised. Herein lies the problem: how do I run multiple work units on GPU #1 while keeping GPU #0 at 1 work unit at a time? Because they are both ATI cards, I have not been able to figure this out. If they were 2 different brands, like Nvidia and ATI, then this would be simple, but I can't seem to get GPU #0 to remain at 1 work unit at a time. I can get both GPUs to run 2 work units each. Or 1 work unit each. But I cannot get 3 out of them. Anyone know of a process by which I may do this?

And next question: is there a list of undocumented command line switches for the various applications we use? I've noticed a few times that people have referenced the command line switch "-nobs" or something similar, but I cannot find any documentation on what exactly this nobs switch does. I have seen people here mention other command line switches which are also undocumented. Well, if no documentation exists, then can someone please do a quick writeup on all of them?

Last question: where can I download the nocal versions of the GPU applications and their associated files? I would love to try running the ATI HD5 SoG nocal app, but I have not been able to find a download link. There are actually a few ATI nocal apps for the GPU that I would like to play with, but finding a pesky download link has so far proven futile for me.

And yes, I have used the forum search function for the questions I have, as well as DuckDuckGo search in general.

Thanks, gents.
ID: 2010762 · Report as offensive     Reply Quote
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 8186
Credit: 507,425,630
RAC: 419,169
Panama
Message 2010769 - Posted: 5 Sep 2019, 16:04:56 UTC - in response to Message 2010762.  
Last modified: 5 Sep 2019, 16:09:16 UTC

I can get both GPUs to run 2 work units each. Or 1 work unit each. But I cannot get 3 out of them. Anyone know of a process by which I may do this?

The only way I know to run a different number of WUs on each GPU (of the same brand) on the same host is to run several instances of BOINC on the host. Then you configure each instance to use a single GPU in isolation.
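
As a rough illustration of that setup, here is a minimal Python sketch that generates the two config files each client instance would need: a cc_config.xml that hides the other ATI device, and an app_config.xml that sets how many tasks share a GPU. Several things here are assumptions rather than anything stated in this thread: the app name setiathome_v8, the device numbers BOINC assigns to your GPUs (check the client's startup messages), and the helper function names, which are made up for this example.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ignored_ati_devices):
    """Build cc_config.xml text for one client instance.

    The instance will not use the ATI device numbers listed here.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    # Lets more than one BOINC client run on the same host.
    ET.SubElement(options, "allow_multiple_clients").text = "1"
    for dev in ignored_ati_devices:
        ET.SubElement(options, "ignore_ati_dev").text = str(dev)
    return ET.tostring(root, encoding="unicode")


def build_app_config(app_name, tasks_per_gpu):
    """Build app_config.xml text running tasks_per_gpu tasks per visible GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # A gpu_usage of 0.50 means each task claims half a GPU, so two run at once.
    ET.SubElement(gpu, "gpu_usage").text = f"{1 / tasks_per_gpu:.2f}"
    ET.SubElement(gpu, "cpu_usage").text = "1.0"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Instance A: only sees device 0 and runs one task at a time on it.
    print(build_cc_config(ignored_ati_devices=[1]))
    print(build_app_config("setiathome_v8", tasks_per_gpu=1))
    # Instance B: only sees device 1 and runs two tasks at a time on it.
    print(build_cc_config(ignored_ati_devices=[0]))
    print(build_app_config("setiathome_v8", tasks_per_gpu=2))
```

Each instance needs its own data directory. cc_config.xml goes in that data directory and app_config.xml goes in the project folder underneath it.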

And next question: is there a list of undocumented command line switches for the various applications we use? I've noticed a few times that people have referenced the command line switch "-nobs" or something similar, but I cannot find any documentation on what exactly this nobs switch does. I have seen people here mention other command line switches which are also undocumented. Well, if no documentation exists, then can someone please do a quick writeup on all of them?

The -nobs switch is part of the Linux optimized builds and is not supported in the Windows versions. It is documented there. Since you run Windows hosts, your app does not use it.

Last question: where can I download the nocal versions of the GPU applications and their associated files? I would love to try running the ATI HD5 SoG nocal app, but I have not been able to find a download link. There are actually a few ATI nocal apps for the GPU that I would like to play with, but finding a pesky download link has so far proven futile for me.

I don't use ATI stuff, so I don't know the answer to this question. Maybe you could visit the Mike or Arkaym sites. They normally store all variants of the app on their sites.
ID: 2010769 · Report as offensive     Reply Quote
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 17874
Credit: 408,033,007
RAC: 43,837
United Kingdom
Message 2010775 - Posted: 5 Sep 2019, 16:32:00 UTC - in response to Message 2010762.  

It all depends on what the two GPUs are - if they are not the same, although both of the same "brand", then it is "fairly normal" to see one running harder than the other. There are various techniques to get around this, but they are all hard work and require varying degrees of trial and error.
If you have a mixture of a highly capable GPU and a very weak one and try to run them both with the same settings, then one of them will always suffer - as you imply. The current applications do not play well with more than two tasks. Even the SoG application grabs as much of the GPU as it can, so it is often better to run only one task: when running two tasks, a fair bit of effort is spent swapping between tasks rather than actually calculating.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2010775 · Report as offensive     Reply Quote
Stephen "Heretic" Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 4718
Credit: 157,592,800
RAC: 262,329
Australia
Message 2010784 - Posted: 5 Sep 2019, 17:32:35 UTC - in response to Message 2010762.  

And next question: is there a list of undocumented command line switches for the various applications we use? I've noticed a few times that people have referenced the command line switch "-nobs" or something similar, but I cannot find any documentation on what exactly this nobs switch does. I have seen people here mention other command line switches which are also undocumented. Well, if no documentation exists, then can someone please do a quick writeup on all of them?
Thanks. Gents.

. . The -nobs command is part of the later CUDA apps, and CUDA apps ONLY work on Nvidia cards, not ATI. So what Juan forgot to tell you is that even if you ran Linux you still could not use those apps. SoG versions 3557 and 3584 usually run well as a one-at-a-time app, but you may want to play with the period_iterations_num and tt values to get the most out of the card. Lower-end cards may work best with somewhat higher values of period... like 50 or higher, while higher-end cards will do their best with much lower values, such as under 10. Maybe a compromise value will get things working well for you on both cards.

. . Those parameters are documented in the SoG readme files.
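
If you end up testing several values, a small helper can keep the bookkeeping straight. The Python sketch below is one possible way to pick a compromise, not anything from the readme: it takes run times you measured yourself for each candidate period_iterations_num value on each card and picks the value whose slowest card finishes soonest. The function name and the numbers in the example are placeholders for illustration only.

```python
def pick_compromise(timings):
    """Pick the setting whose slowest GPU finishes soonest (a minimax choice).

    timings maps a candidate period_iterations_num value to a dict of
    {gpu_label: measured seconds per task} taken from your own test runs.
    """
    if not timings:
        raise ValueError("no measurements supplied")
    return min(timings, key=lambda value: max(timings[value].values()))


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not real measurements.
    measured = {
        10: {"GPU #0": 300.0, "GPU #1": 900.0},
        30: {"GPU #0": 350.0, "GPU #1": 500.0},
        50: {"GPU #0": 600.0, "GPU #1": 450.0},
    }
    print(pick_compromise(measured))
```

Minimising the worst card is only one possible rule. If total throughput matters more than balance, summing tasks per hour across the cards would be the obvious alternative.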

Stephen
ID: 2010784 · Report as offensive     Reply Quote
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1467
Credit: 181,397,520
RAC: 457,389
United States
Message 2010799 - Posted: 5 Sep 2019, 19:34:45 UTC - in response to Message 2010762.  
Last modified: 5 Sep 2019, 19:35:02 UTC

OK, I have a bit of a problem with my GPU not being fully utilised ... My problem is with GPU #1..... GPU #1 also works fine; however, it has a low % of GPU usage.

It's also worth considering that your motherboard architecture may be in play here. Not all PCIe slots get equal CPU resources and access.
For example, I have several mobos with 2 x16 slots and 2 x4 slots. The first slot is truly x16 and gets to the CPU directly. The second x16 slot is actually x8, and it and the two x4 slots get to the CPU through a different I/O controller. I've long noted that the three "indirect" slots get serviced after the first one and perform worse, even though there generally isn't enough I/O going on that it should matter strictly on bus speed. Unless you have a mobo designed for mining or the like, yours probably has similar limitations.
GPU utilization itself, as measured by HWInfo or GPU-Z, does make it into the 90s on all 4 GPUs, however.
ID: 2010799 · Report as offensive     Reply Quote
CryptokiD
Joined: 2 Dec 00
Posts: 150
Credit: 3,153,258
RAC: 3,070
United States
Message 2010812 - Posted: 5 Sep 2019, 21:34:33 UTC

Thanks for the replies. Too bad about nobs. What does it do, anyway? A few things to mention. I would still love it if someone would cough up a list of cmd line switches. I have seen references and hints on this very forum, as well as on a few other BOINC-related forums, about the existence of undocumented switches, but so far I haven't been able to get anyone to divulge what they know. I guess it's like a secret sauce....

OK, so on my system GPU #0 is integrated with the CPU, in what is called an "APU" system. GPU #0 is physically on the same chip as the 4-core CPU, and so it gets first dibs on everything, bandwidth included. OTOH, GPU #1 is on a PCIe x16 link that's currently stuck at x8 while I work out some issues with the modified BIOS I made for myself. But even when it was at x16 it barely made a difference according to GPU-Z. I guess I'm stuck for now, as I don't want the headache of multiple BOINCs. Again, thanks to all who replied.
ID: 2010812 · Report as offensive     Reply Quote
Profile Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1467
Credit: 181,397,520
RAC: 457,389
United States
Message 2010814 - Posted: 5 Sep 2019, 21:50:42 UTC - in response to Message 2010812.  
Last modified: 5 Sep 2019, 21:58:39 UTC

... I would still love it if someone would cough up a list of cmd line switches ...
Cough, cough ... from the docs ...

Umm, I quoted a full set of docs, but I think it was for the SAH app rather than the SoG app, so I deleted it.
Check your C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\docs folder for ReadMe_MultiBeam_OpenCL_NV_SoG.txt or similar. I understand you're not on NV, but it has references to the other GPUs as well. If it's there, it should have all you need.
If not, PM me and I'll reply with a copy.
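
If you'd rather not dig through the folder by hand, here is a minimal Python sketch that lists any readme-style text files in that docs folder. The path is the one given above and may differ on your install. The helper name and the "starts with readme, ends with .txt" filter are assumptions made for this example.

```python
from pathlib import Path

# Path quoted in this post; adjust it if your BOINC data directory is elsewhere.
DOCS_DIR = Path(r"C:\ProgramData\BOINC\projects\setiathome.berkeley.edu\docs")


def find_readmes(docs_dir):
    """Return readme-like .txt files in docs_dir, sorted by name (ignoring case)."""
    docs_dir = Path(docs_dir)
    if not docs_dir.is_dir():
        return []
    return sorted(
        (
            p
            for p in docs_dir.iterdir()
            if p.is_file()
            and p.suffix.lower() == ".txt"
            and p.name.lower().startswith("readme")
        ),
        key=lambda p: p.name.lower(),
    )


if __name__ == "__main__":
    for readme in find_readmes(DOCS_DIR):
        print(readme)
```

An empty result just means the folder is missing or holds no matching files, in which case the PM offer above still stands.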
Jim ...
ID: 2010814 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 10050
Credit: 969,065,047
RAC: 1,533,504
United States
Message 2010815 - Posted: 5 Sep 2019, 21:51:06 UTC - in response to Message 2010812.  

Thanks for the replies. Too bad about nobs. What does it do, anyway? A few things to mention. I would still love it if someone would cough up a list of cmd line switches. I have seen references and hints on this very forum, as well as on a few other BOINC-related forums, about the existence of undocumented switches, but so far I haven't been able to get anyone to divulge what they know. I guess it's like a secret sauce....

OK, so on my system GPU #0 is integrated with the CPU, in what is called an "APU" system. GPU #0 is physically on the same chip as the 4-core CPU, and so it gets first dibs on everything, bandwidth included. OTOH, GPU #1 is on a PCIe x16 link that's currently stuck at x8 while I work out some issues with the modified BIOS I made for myself. But even when it was at x16 it barely made a difference according to GPU-Z. I guess I'm stuck for now, as I don't want the headache of multiple BOINCs. Again, thanks to all who replied.

The command line switches have been documented by the developer for years. They have always been available.
Re: Some considerations regarding OpenCL MultiBeam app tuning from algorithm view
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 2010815 · Report as offensive     Reply Quote


 
©2019 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.