High performance Linux clients at SETI

Message boards : Number crunching : High performance Linux clients at SETI

Previous · 1 . . . 12 · 13 · 14 · 15 · 16 · 17 · 18 . . . 20 · Next

rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22227
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1992709 - Posted: 5 May 2019, 17:27:13 UTC

For a number of years SETI has only allowed 100 concurrent tasks for the CPU plus 100 per GPU.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1992709
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1992712 - Posted: 5 May 2019, 17:50:30 UTC - in response to Message 1992707.  

Is there any workaround for this? Otherwise, with every downtime longer than a few hours the machine becomes idle.

The workaround is to anticipate the loss of work from the planned and unplanned outages. You can reschedule GPU work from the GPU cache to the CPU cache in advance of an outage, and then move it back afterwards, using any one of the rescheduling solutions. Rescheduling has had its own dedicated thread for a few years now; I suggest a read.
https://setiathome.berkeley.edu/forum_thread.php?id=79954&postid=1803817#1803817
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1992712
Profile M_M
Joined: 20 May 04
Posts: 76
Credit: 45,752,966
RAC: 8
Serbia
Message 1992718 - Posted: 5 May 2019, 18:21:58 UTC - in response to Message 1992708.  
Last modified: 5 May 2019, 18:23:22 UTC

The most work we have ever been able to download for our work caches is 100 tasks per CPU + 100 tasks per GPU. So the "days of work" and "additional days of work" settings are meaningless on modern fast hosts. Maybe still applicable to phones and single-board computers like the Raspberry Pi.


It would make sense for the size of the work cache to be related to RAC: keep 100+100 as the default (a minimum, appropriate for slow hosts) and increase it per host, based on RAC, to cover the typical planned outage period. That way everyone is happy, resources are used optimally, and no "unnecessary" additional work is sent to hosts... Probably not so difficult to implement the logic...
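As a rough illustration of the kind of scaling being proposed, here is a hypothetical Python sketch. The function name, the assumed average credit per task, and the outage length are all invented for illustration; nothing like this exists in the actual scheduler.

```python
def task_limit(rac, outage_hours=24.0, base=100):
    """Hypothetical per-device task limit scaled by RAC.

    Estimates how many tasks a host burns through during a planned
    outage, using its RAC (credit/day) and an assumed average credit
    per task, then keeps the historical 100-task floor for slow hosts.
    All constants are illustrative, not project values.
    """
    CREDIT_PER_TASK = 80.0          # assumed average credit awarded per task
    tasks_per_day = rac / CREDIT_PER_TASK
    needed = int(tasks_per_day * outage_hours / 24.0)
    return max(base, needed)

# A slow host keeps the 100-task default; a fast host gets enough
# work to ride out a one-day outage.
print(task_limit(rac=500))      # 100
print(task_limit(rac=40000))    # 500
```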
ID: 1992718
Loren Datlof

Joined: 24 Jan 14
Posts: 73
Credit: 19,652,385
RAC: 0
United States
Message 1992720 - Posted: 5 May 2019, 18:43:11 UTC - in response to Message 1992712.  
Last modified: 5 May 2019, 18:59:04 UTC

Is there any workaround for this? Otherwise, with every downtime longer than a few hours the machine becomes idle.

The workaround is to anticipate the loss of work from the planned and unplanned outages. You can reschedule GPU work from the GPU cache to the CPU cache in advance of an outage, and then move it back afterwards, using any one of the rescheduling solutions. Rescheduling has had its own dedicated thread for a few years now; I suggest a read.
https://setiathome.berkeley.edu/forum_thread.php?id=79954&postid=1803817#1803817
I just use another project and set its resource share to zero. That way it only runs when Seti is down or out of WUs, and your computer doesn't sit idle.
ID: 1992720
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1992725 - Posted: 5 May 2019, 19:40:45 UTC - in response to Message 1992718.  

It would make sense for the size of the work cache to be related to RAC: keep 100+100 as the default (a minimum, appropriate for slow hosts) and increase it per host, based on RAC, to cover the typical planned outage period. That way everyone is happy, resources are used optimally, and no "unnecessary" additional work is sent to hosts... Probably not so difficult to implement the logic...

This topic has been raised innumerable times, and the reality is that the current work cache size is not going to change. The project has never guaranteed an endless supply of work, and it recommends backup projects for when Seti has none. The current work cache size was sufficient back when the project first started on the hardware of the time, when the point was to use spare computer cycles by letting the Seti screensaver crunch work in the background. It still hews to that original purpose; you still have a Seti screensaver, even though the performance capabilities of today's hardware are orders of magnitude greater than when the project started.

Also, after 10 years of development, the BOINC code is very complicated and not as simple to fix or change as you suggest.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1992725
Loren Datlof

Joined: 24 Jan 14
Posts: 73
Credit: 19,652,385
RAC: 0
United States
Message 1992763 - Posted: 6 May 2019, 1:00:40 UTC

You guys were right it was a VRAM issue. The GT 730 (2 GB VRAM) is up and running the CUDA60 app flawlessly. Thanks for your help.
ID: 1992763
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1992798 - Posted: 6 May 2019, 9:37:27 UTC - in response to Message 1992763.  

Since you are running a GPU with 2 GB of VRAM, you should go back to the default unroll setting. Just remove -unroll 1 and it will go back to autotune and automatically run unroll 2 or whatever. The times will improve on that card with unroll 2.
ID: 1992798
Loren Datlof

Joined: 24 Jan 14
Posts: 73
Credit: 19,652,385
RAC: 0
United States
Message 1992821 - Posted: 6 May 2019, 14:06:53 UTC - in response to Message 1992798.  

Since you are running a GPU with 2 GB of VRAM, you should go back to the default unroll setting. Just remove -unroll 1 and it will go back to autotune and automatically run unroll 2 or whatever. The times will improve on that card with unroll 2.
Done. Thanks again.
ID: 1992821
BoincSpy
Volunteer tester

Joined: 3 Apr 99
Posts: 146
Credit: 124,775,115
RAC: 353
Canada
Message 1992880 - Posted: 6 May 2019, 23:31:20 UTC

Hi

Are there specific command-line arguments I should try for the RTX 2070 graphics card?

Thanks in advance,
BoincSpy
ID: 1992880
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1992901 - Posted: 7 May 2019, 1:20:58 UTC - in response to Message 1992880.  

Without knowing what your hardware consists of, because you have hidden your hosts, it is impossible to suggest anything.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1992901
BoincSpy
Volunteer tester

Joined: 3 Apr 99
Posts: 146
Credit: 124,775,115
RAC: 353
Canada
Message 1993034 - Posted: 7 May 2019, 23:05:41 UTC - in response to Message 1992901.  

Sorry about that, I thought I had the hosts visible. They are now visible...

Thank you in advance.
ID: 1993034
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1993055 - Posted: 8 May 2019, 1:06:24 UTC - in response to Message 1993034.  

Sorry about that, I thought I had the hosts visible. They are now visible...

Thank you in advance.

You could speed them up a bit by adding the -nobs parameter to the <cmdline></cmdline> entry in either the app_info.xml or the app_config.xml.
Make sure you reduce your CPU thread usage, as the -nobs parameter requires a full CPU core to support each GPU task.
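For anyone unsure where the parameter goes, a minimal app_config.xml might look like the sketch below. The app name and plan class here are taken from the app_info.xml fragments posted in this thread and may differ on your install; avg_ncpus is raised to 1.0 to reflect the full core that -nobs occupies.

```xml
<app_config>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>cuda90</plan_class>
    <cmdline>-nobs</cmdline>
    <avg_ncpus>1.0</avg_ncpus>
    <ngpus>1.0</ngpus>
  </app_version>
</app_config>
```

Unlike app_info.xml, changes to app_config.xml take effect after a re-read of config files rather than a full client restart.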
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1993055
BoincSpy
Volunteer tester

Joined: 3 Apr 99
Posts: 146
Credit: 124,775,115
RAC: 353
Canada
Message 1993132 - Posted: 8 May 2019, 16:15:14 UTC - in response to Message 1993055.  
Last modified: 8 May 2019, 16:15:37 UTC

Thanks for the suggestion. I am now getting work units completed in just shy of a minute.
ID: 1993132
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1993142 - Posted: 8 May 2019, 17:28:31 UTC - in response to Message 1993132.  

I don't see any use of the -nobs parameter yet. Did you restart BOINC or re-read config files in the Manager?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1993142
BoincSpy
Volunteer tester

Joined: 3 Apr 99
Posts: 146
Credit: 124,775,115
RAC: 353
Canada
Message 1993144 - Posted: 8 May 2019, 17:38:27 UTC - in response to Message 1993142.  

I just suspended BOINC and resumed it after I added -nobs; I have not restarted BOINC. Here is a portion of my app_info.xml:

<app_info>
  <app>
    <name>setiathome_v8</name>
  </app>
  <file_info>
    <name>setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <platform>x86_64-pc-linux-gnu</platform>
    <version_num>801</version_num>
    <plan_class>cuda90</plan_class>
    <cmdline>-nobs</cmdline>
    <coproc>
      <type>NVIDIA</type>
      <count>1</count>
    </coproc>
    <avg_ncpus>0.1</avg_ncpus>
    <max_ncpus>0.1</max_ncpus>
    <file_ref>
      <file_name>setiathome_x41p_V0.98b1_x86_64-pc-linux-gnu_cuda101</file_name>
      <main_program/>
    </file_ref>
  </app_version>
...
ID: 1993144
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1993145 - Posted: 8 May 2019, 17:43:30 UTC

Suspending BOINC does not re-read the app_info.xml. You have to completely restart the client. If the parameter is placed in an app_config.xml file, only a re-read of config files is necessary.
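On a typical Linux install the two cases look roughly like this (a sketch assuming the packaged client runs as a systemd service; the service name and whether you need sudo vary by distro):

```shell
# After editing app_info.xml (anonymous platform), a full client restart is required:
sudo systemctl restart boinc-client

# After editing app_config.xml, re-reading the config files is enough:
boinccmd --read-cc-config
```

The second command is the command-line equivalent of "Options / Read config files" in the BOINC Manager.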
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1993145
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1993431 - Posted: 11 May 2019, 15:50:20 UTC

Hi,
I am wondering if I should roll back to the previous version on both of my HEDC boxes?

I am still seeing a lot of inconclusives on a daily basis. And while some of them are "Darwin" disagreements, at least some are not.

Could someone take a look and offer an opinion?

I know some of my problems came from pushing my AMD CPU harder than it appears able to work. But a lot are just plain inconclusives.

Thank you.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1993431
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1993437 - Posted: 11 May 2019, 17:29:14 UTC

I don't see anything out of the ordinary. You can discount all the overflows too.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1993437
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1993439 - Posted: 11 May 2019, 17:40:21 UTC - in response to Message 1993437.  

I don't see anything out of the ordinary. You can discount all the overflows too.


Ok, Thank you.

I just hate slowing things down because of a simple fixable problem.

Tom
A proud member of the OFA (Old Farts Association).
ID: 1993439
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1995072 - Posted: 24 May 2019, 22:23:22 UTC - in response to Message 1991417.  
Last modified: 24 May 2019, 22:37:46 UTC

Updated to 0.98b1 CUDA90

In stderr:
@
SETI@home using CUDA accelerated device GeForce GTX 1050 Ti
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1
@

Should I change params, or is it OK for the 1050 Ti to run with such default settings?


. . I thought that unroll = 1 was the default for 0.98b1, because it has the reduced external-access process that keeps everything in GPU memory for pulse find, so the unroll setting doesn't matter.

Stephen

? ?
ID: 1995072


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. Astropulse is funded in part by the NSF through grant AST-0307956.