GPU task stuck - cannot process anymore GPU work

Message boards : Number crunching : GPU task stuck - cannot process anymore GPU work

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888566 - Posted: 8 Sep 2017, 8:09:25 UTC - in response to Message 1888375.  

Good news so far.

I had another batch of GPU work units sent last night after the 24 hour "ban" was up. I missed them appearing in the event log, but they were all showing as aborted on the web page. So I started afresh this morning: uninstalled BOINC, reinstalled it and reinstalled Lunatics. I logged onto my general Windows userid, reconnected to SETI@home and got some new GPU work units. Now I just need to see if they validate. Maybe my experiment of switching to a different, dedicated Windows user account messed up some config somewhere.

Reading around the forum, it looks like my Nvidia Geforce GTX 750Ti 2GB should be capable of running two GPU work units. My next question: how do I modify the app_config file to allow this? The Lunatics one looks complex. Is there an FAQ that explains this?
ID: 1888566

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1888570 - Posted: 8 Sep 2017, 8:53:47 UTC - in response to Message 1888566.  

Reading around the forum it looks like my Nvidia Geforce GTX 750Ti 2GB should be capable of running two GPU work units.

If you're running a CUDA application it's worthwhile. If you're running the SoG application it isn't. And the SoG application is the one to run: it is much more productive than the older CUDA applications, even when they are running 2 WUs at a time.
Grant
Darwin NT
ID: 1888570

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888573 - Posted: 8 Sep 2017, 9:04:12 UTC - in response to Message 1888570.  

Reading around the forum it looks like my Nvidia Geforce GTX 750Ti 2GB should be capable of running two GPU work units.

If you're running a CUDA application it's worthwhile. If you're running the SoG application it isn't. And the SoG application is the one to run: it is much more productive than the older CUDA applications, even when they are running 2 WUs at a time.


I have not heard of the SoG application. Is that the standard application from SETI@home, or is it another optimised application? If so, where would I find it?
ID: 1888573

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1888574 - Posted: 8 Sep 2017, 9:11:36 UTC - in response to Message 1888573.  

I have not heard of the SoG application. Is that the standard application from SETI@home, or is it another optimised application? If so, where would I find it?

It's available as a stock application, and is an option in the Lunatics Beta 6 installer.
Grant
Darwin NT
ID: 1888574

rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22188
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1888578 - Posted: 8 Sep 2017, 9:18:38 UTC

If you are running stock you may find that the servers slip the odd one in to see if your system can really handle the application.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1888578

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888580 - Posted: 8 Sep 2017, 9:29:18 UTC - in response to Message 1888574.  

Many thanks, still lots to learn about optimising SETI@home.

Do you think SoG applies to my old graphics card? My initial research on the web found the tech specs at

https://www.geforce.co.uk/hardware/desktop-gpus/geforce-gtx-750-ti/specifications

But there is no reference to the Fermi architecture. Does SoG need this architecture?

I found the beta 6 installer at Mike's World - is that the correct one?
ID: 1888580

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1888581 - Posted: 8 Sep 2017, 9:29:55 UTC - in response to Message 1888578.  

If you are running stock you may find that the servers slip the odd one in to see if your system can really handle the application.

And if the work mix isn't too out of balance, it will end up using the fastest one- SoG.
Grant
Darwin NT
ID: 1888581

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1888582 - Posted: 8 Sep 2017, 9:35:54 UTC - in response to Message 1888580.  

Do you think SoG applies to my old graphics card? My initial research on the web found the tech specs at

When I had it running on my GTX 750Tis, it gave over double the output for GBT WUs compared to the CUDA application.

But there is no reference to the Fermi architecture. Does SoG need this architecture?

No.
It actually works better with the more recent architectures- Maxwell (your GTX 750Ti) and Pascal- particularly with the GBT WUs, which the older CUDA applications struggle with.

I found the beta 6 installer at Mike's World - is that the correct one?

Yep.
Make sure you get the right one- 32 or 64bit.

If your CPU supports AVX, make sure to select it. From memory, it gives approx. 40% speed-up over the SSE3/SSSE3 CPU application.
Grant
Darwin NT
ID: 1888582

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888585 - Posted: 8 Sep 2017, 10:14:47 UTC - in response to Message 1888582.  
Last modified: 8 Sep 2017, 10:15:15 UTC

Do you think SoG applies to my old graphics card? My initial research on the web found the tech specs at

When I had it running on my GTX 750Tis, it gave over double the output for GBT WUs compared to the CUDA application.

But there is no reference to the Fermi architecture. Does SoG need this architecture?

No.
It actually works better with the more recent architectures- Maxwell (your GTX 750Ti) and Pascal- particularly with the GBT WUs, which the older CUDA applications struggle with.

I found the beta 6 installer at Mike's World - is that the correct one?

Yep.
Make sure you get the right one- 32 or 64bit.

If your CPU supports AVX, make sure to select it. From memory, it gives approx. 40% speed-up over the SSE3/SSSE3 CPU application.


Cool, new SoG version installed. It is impressive stuff, the old GPU work units are still running with the CUDA50 but new ones are getting assigned to SoG. Will be interesting to see how quick the SoG ones work out at.

Many thanks for all the help. I am pleased to see GPU work units back, and I have optimised them in the process.
ID: 1888585

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1888587 - Posted: 8 Sep 2017, 10:32:55 UTC - in response to Message 1888585.  

Cool, new SoG version installed. It is impressive stuff, the old GPU work units are still running with the CUDA50 but new ones are getting assigned to SoG. Will be interesting to see how quick the SoG ones work out at.

If you check with Task Manager, you'll see that the existing CUDA50 WUs are being processed with the SoG application, and any new work gets the SoG label.
There are some command line options to help boost output further, but I'd suggest letting the system run for a week or 3 to make sure everything is working OK before worrying about further improvements.
Grant
Darwin NT
ID: 1888587

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888623 - Posted: 8 Sep 2017, 14:47:48 UTC

Had a few (5 to date) GPU work units fail, e.g.

https://setiathome.berkeley.edu/result.php?resultid=6005315081

I have not seen such failures with CUDA50. Do you recommend that I switch back to CUDA50 or stick with SoG?
ID: 1888623

Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1888625 - Posted: 8 Sep 2017, 14:52:45 UTC - in response to Message 1888623.  

Exit status 197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED
This usually occurs when people reschedule tasks, but I don't think that is the cause in your case. More likely it is because CUDA50 tasks are so SLOW.

Stick with SoG.
ID: 1888625

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1888736 - Posted: 9 Sep 2017, 0:19:00 UTC
Last modified: 9 Sep 2017, 0:22:30 UTC

In the manager, what is the estimated completion time for the GPU WUs?
Are you only running 1 GPU WU at a time?
As other completed WUs are returned to the server, newly downloaded WUs will have an estimated run time closer to the actual time.
Grant
Darwin NT
ID: 1888736

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888750 - Posted: 9 Sep 2017, 1:19:34 UTC - in response to Message 1888736.  

In the manager, what is the estimated completion time for the GPU WUs?
Are you only running 1 GPU WU at a time?
As other completed WUs are returned to the server, newly downloaded WUs will have an estimated run time closer to the actual time.


In BOINC manager GPU work units are estimated to take between 12 and 32 minutes.

Only running one GPU work unit at a time. (I haven't figured out how to change the config file to run two yet, puzzled by the GPU and CPU numbers to use).

I never had this error message with CUDA work units. Will this error with SoG work units go away when the estimated time to run gets more accurate?
ID: 1888750

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 1888755 - Posted: 9 Sep 2017, 1:43:37 UTC - in response to Message 1888750.  

In BOINC manager GPU work units are estimated to take between 12 and 32 minutes.

So (most likely) the errors are occurring on the ones with the 12 or so minute estimate and they're taking longer than that to complete and so erroring due to exceeding the estimated run time.

Only running one GPU work unit at a time. (I haven't figured out how to change the config file to run two yet, puzzled by the GPU and CPU numbers to use).

No worries, just checking. With the SoG application on the GTX 750Tis, there is no advantage to running more than 1 WU at a time. With higher end hardware, some people find 2 at a time gives more work per hour, although on my GTX 1070s I've never got any benefit from more than 1 WU at a time, so I've just stuck with 1.
Why some of the WU estimated runtimes are so low compared to the actual time, I've no idea. For me the initial estimated runtimes have always been way higher than the actual run time. Eventually they've come to within a few minutes of the actual time, although generally on the high side, not the low (except for the really short run time WUs, those usually get estimated at slightly lower than the actual run time).

Will this error with SoG work units go away when the estimated time to run gets more accurate?

Yep.
Usually the actual run time has to be way more than the estimated time for the error to occur, so even with an estimated time of 12 minutes, I don't know why you would get an "Exceeded run time" WU abortion at 15 minutes.
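
To put rough numbers on "way more than the estimated time", here is a minimal Python sketch. It assumes (my understanding of the BOINC client, not something confirmed in this thread) that the displayed estimate is roughly rsc_fpops_est divided by the app's estimated speed, while the abort limit is rsc_fpops_bound divided by that same speed. The speed, the workunit sizes and the 10x bound-to-estimate ratio below are all made-up illustrative values, not figures from a real SETI@home workunit.

```python
def estimated_runtime_s(rsc_fpops_est: float, flops: float) -> float:
    """Estimated run time in seconds (assumed: estimate = fpops_est / flops)."""
    return rsc_fpops_est / flops


def time_limit_s(rsc_fpops_bound: float, flops: float) -> float:
    """Elapsed time in seconds after which the task would be aborted
    (assumed: limit = fpops_bound / flops)."""
    return rsc_fpops_bound / flops


def exceeds_limit(elapsed_s: float, rsc_fpops_bound: float, flops: float) -> bool:
    """True if a task that has run for elapsed_s would hit the time limit."""
    return elapsed_s > time_limit_s(rsc_fpops_bound, flops)


# Hypothetical values chosen only so the estimate comes out at 12 minutes.
FLOPS = 5.0e10                  # assumed app speed, operations per second
FPOPS_EST = 3.6e13              # assumed workunit size -> 720 s estimate
FPOPS_BOUND = 10 * FPOPS_EST    # assumed 10x safety margin

if __name__ == "__main__":
    print("estimate (min):", estimated_runtime_s(FPOPS_EST, FLOPS) / 60)
    print("limit (min):   ", time_limit_s(FPOPS_BOUND, FLOPS) / 60)
    print("aborted at 15 min?", exceeds_limit(15 * 60, FPOPS_BOUND, FLOPS))
```

Under these assumptions a task estimated at 12 minutes would not be aborted at 15 minutes, which is why an abort that early looks odd. If the app's speed figure were badly wrong for a freshly installed application, both the estimate and the limit would shrink together.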
Grant
Darwin NT
ID: 1888755

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888812 - Posted: 9 Sep 2017, 13:57:58 UTC
Last modified: 9 Sep 2017, 14:49:20 UTC

I have not had any GPU work units error for a while now, and they are validating OK. My RAC keeps creeping up, so I wondered if there was any more I could squeeze out of my hardware?

GPU-Z reports the following on my Nvidia GTX 750 Ti graphics card:

1) GPU Temp 61 Celsius

2) GPU load 94-98%, as such doesn't look like I should run 2 GPU work units, but see memory load

3) Memory used is only at 646MB and I have 2GB on the graphics card

4) Memory controller load - very variable from 1% to 80%

5) Power consumption 50-60% TDP

Is there any means of optimising GPU memory use to take advantage of more of the 2GB? (Similarly, for CPU workloads I have lots of spare memory available.)
ID: 1888812

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1888844 - Posted: 9 Sep 2017, 16:32:00 UTC - in response to Message 1888812.  

Since you are running the Lunatics installer version of the SoG app, look in the Seti projects directory for the /Docs folder. Read the ReadMe_MultiBeam_OpenCL_NV_SoG.txt file. It tells you how to optimize the SoG app with command line or app_config optional parameters.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1888844

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 1888878 - Posted: 9 Sep 2017, 20:09:03 UTC - in response to Message 1888844.  

Since you are running the Lunatics installer version of the SoG app, look in the Seti projects directory for the /Docs folder. Read the ReadMe_MultiBeam_OpenCL_NV_SoG.txt file. It tells you how to optimize the SoG app with command line or app_config optional parameters.


That's a complex read. It suggests adding command line switches to mb_cmdline.txt. Does this affect both CPU and GPU tasks?

Also in that folder is an empty mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt file. Should I put the command line switches in that file, as it sounds like it is limited to SoG?
ID: 1888878

Zalster
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1888880 - Posted: 9 Sep 2017, 20:20:22 UTC - in response to Message 1888878.  

<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>0.5</ngpus>
    <cmdline>-sbs 512 -period_iterations_num 10 -high_perf -hp -high_prec_timer -tt 500</cmdline>
  </app_version>
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>0.5</ngpus>
    <cmdline></cmdline>
  </app_version>
</app_config>


I just find it easier to make an app_config.xml with the command lines in there. Either way, you can copy the text between the <cmdline> and </cmdline> tags and place it into the OpenCL_NV_SoG.txt file.

I did not include any commandlines for APs since I can't recall what they are for a 750 Series card.
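
On the GPU and CPU numbers: as a rough guide, ngpus is the fraction of one GPU each task claims, so 0.5 lets two tasks share a card and 1 keeps it at one task at a time, while avg_ncpus is the CPU share reserved per task. The Python sketch below just does that arithmetic on a trimmed copy of the config above. The helper names (tasks_per_gpu, summarize) are made up for illustration and are not part of BOINC.

```python
import math
import xml.etree.ElementTree as ET

# Trimmed copy of the app_config.xml shown above (cmdline omitted).
APP_CONFIG = """<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_SoG</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <ngpus>0.5</ngpus>
  </app_version>
</app_config>"""


def tasks_per_gpu(ngpus: float) -> int:
    """Number of tasks that fit on one GPU if each claims `ngpus` of it."""
    if ngpus <= 0:
        raise ValueError("ngpus must be positive")
    # Small tolerance so values such as 0.1 are not rounded down by
    # floating-point error; never report fewer than one task.
    return max(1, math.floor(1.0 / ngpus + 1e-9))


def summarize(xml_text: str) -> dict:
    """Map (app_name, plan_class) -> (tasks per GPU, CPU cores reserved per GPU)."""
    root = ET.fromstring(xml_text)
    summary = {}
    for av in root.findall("app_version"):
        key = (av.findtext("app_name"), av.findtext("plan_class"))
        n = tasks_per_gpu(float(av.findtext("ngpus")))
        summary[key] = (n, n * float(av.findtext("avg_ncpus")))
    return summary


if __name__ == "__main__":
    for (app, plan), (n, cpus) in summarize(APP_CONFIG).items():
        print(f"{app} / {plan}: {n} task(s) per GPU, {cpus} CPU core(s) reserved")
```

With the values above this works out to two tasks per GPU and two CPU cores set aside for them. Given the earlier advice in this thread that SoG on a GTX 750Ti gains nothing from a second task, changing ngpus to 1 would keep it at one at a time.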
ID: 1888880

Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1888884 - Posted: 9 Sep 2017, 20:26:20 UTC - in response to Message 1888878.  

Since you are running the Lunatics installer version of the SoG app, look in the Seti projects directory for the /Docs folder. Read the ReadMe_MultiBeam_OpenCL_NV_SoG.txt file. It tells you how to optimize the SoG app with command line or app_config optional parameters.


That's a complex read. It suggests adding command line switches to mb_cmdline.txt. Does this affect both CPU and GPU tasks?

Also in that folder is an empty mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt file. Should I put the command line switches in that file, as it sounds like it is limited to SoG?

No, that just refers in general to any mb_cmdline.txt file. The name changes based on the card type, Nvidia, ATI or Intel. You would put your parameters into the mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt file. Since you have a 750Ti, I would use the suggested parameters from the Nvidia file section:

Super clocked x50TI / x60TI
-sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32 (*requires testing)

Use Notepad to open the empty file and paste "-sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32" (without the quotes) into the file and save as *.txt format.

If you experience sluggish keyboard or mouse input, add -use_sleep to the file.
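
If you would rather script this than use Notepad, here is a minimal Python sketch of the same steps. It only assembles the parameter string quoted above and writes it to a file as a single line of plain text. The helper names are made up, and the demo writes to a temporary folder rather than the real projects directory, since the exact path on your machine is not something I can assume.

```python
import tempfile
from pathlib import Path

# Parameters quoted above for super clocked x50TI / x60TI cards.
SUGGESTED = (
    "-sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 "
    "-oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 "
    "-oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32"
)


def build_cmdline(base: str, use_sleep: bool = False) -> str:
    """Return the parameter string, optionally with -use_sleep appended once."""
    parts = base.split()
    if use_sleep and "-use_sleep" not in parts:
        parts.append("-use_sleep")
    return " ".join(parts)


def write_cmdline(path: Path, cmdline: str) -> None:
    """Write the parameters as a single line of plain text, no quotes."""
    path.write_text(cmdline, encoding="ascii")


if __name__ == "__main__":
    # Demo only: a temporary folder stands in for the SETI@home project folder.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "mb_cmdline_win_x86_SSE3_OpenCL_NV_SoG.txt"
        write_cmdline(target, build_cmdline(SUGGESTED, use_sleep=True))
        print(target.read_text(encoding="ascii"))
```

Stop BOINC before editing the real file and restart it afterwards, so you can be sure the new parameters are in effect.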
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1888884


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.