High performance Linux clients at SETI

rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1984579 - Posted: 11 Mar 2019, 17:28:16 UTC

These days with Linux there are some pretty good GUIs around which make the command line all but redundant.
But it is there, and can make some things easier than using a GUI - especially the way many of the GUIs remember the last (big number) of commands used, even after a shutdown in many cases.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1984579
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1984583 - Posted: 11 Mar 2019, 17:57:37 UTC - in response to Message 1984536.  

As suggested, starting a thread to separate this discussion from the "panic mode thread"

Thanks to Richard for suggesting it. I had the same thought while out for my morning walk!


Thank you!
A proud member of the OFA (Old Farts Association).
ID: 1984583
tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1984585 - Posted: 11 Mar 2019, 18:04:34 UTC

On my SuSE Linux boxen I use KDE 5.15.2, a GUI I prefer to Gnome, although I have the option of choosing Gnome when I install the OS. I am using SuSE Leap 15.0 on an HP laptop and Tumbleweed, a development version, on a Virtual Machine hosted on a Windows 10 PC.
On my AT&T Olivetti UNIX PC, dated 1986 and still working, I have a primitive GUI on which I can see the LOGO turtle. Happy days!
Tullio
ID: 1984585
Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1984586 - Posted: 11 Mar 2019, 18:04:42 UTC - in response to Message 1984548.  
Last modified: 11 Mar 2019, 18:08:28 UTC

We even have a guide that is set to only report 100 results at a time just to prevent these timeouts, because that's only unnecessary requests.


Where do I apply that limit?

--edit----Think I found it:
<cc_config>
 <log_flags>
   <sched_op_debug>1</sched_op_debug>
 </log_flags>
 <options>
   <use_all_gpus>1</use_all_gpus>
   <save_stats_days>365</save_stats_days>
   <max_tasks_reported>100</max_tasks_reported>
 </options>
</cc_config>



<max_tasks_reported>100</max_tasks_reported>
---edit----

Right?
A proud member of the OFA (Old Farts Association).
ID: 1984586
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1984587 - Posted: 11 Mar 2019, 18:08:14 UTC - in response to Message 1984567.  
Last modified: 11 Mar 2019, 18:13:55 UTC

OK, since I'm the one who started this hare running, I'd better try and answer some of the points raised overnight.

Tom M wrote:
I dusted off a Linux/Cuda91 HD and Seti didn't want to play. I removed Seti, re-installed Tbars-all-in-One.
That's where I came in. There had been an http 500 internal server error. On a Sunday night, that didn't seem like the normal Tuesday outage recovery problem, and I wondered whether something else might be causing it. Given that the only thing the servers ever see from our computers is that sched_request file, I asked (with a question mark) whether there might be something unusual about it? More for future reference than anything else, since Tom had already re-installed and got past the problem by that stage.

TBar wrote:
As the person that compiled 7.8.3 ...
That's the first time the code version number was quoted, and to be honest it makes me even more suspicious. In my opinion....
In my opinion...
That's the point that matters. The fact is, since the 7.8.3 version was released almost TWO Years ago, every time You hear it mentioned you Jump at the chance to find something wrong with it. The Fact is, Not a single person has had any trouble with it in almost Two Years, and it cured such ills as the jumping Tasks/Transfers page and a non-working Simple View. Again, there is absolutely NOTHING Special about it, it is 100% stock. I suggest you look at the computer list and note which versions are being used. There are a number of people using 7.8.3 without any trouble. The only trouble mentioned in almost Two years is one user who has been known to have troubles trying to get as many GPUs as he can to work with a Non-Ubuntu system. No other Users have reported any trouble, but, You saw 'Tbars-all-in-One' and again, not knowing any facts, or even what versions are being used, again decided to Jump in and try to find trouble where there is none. Until at least One other person reports the same behavior, I'd suggest you lay off 7.8.3 and admit that after almost Two Years it works Very nicely for those that actually Use it. It might help if you actually tried running Linux before forming any opinions on something you know nothing about. BTW, Raistmer has a machine running Linux, guess which version he is using, https://setiathome.berkeley.edu/show_host_detail.php?hostid=8647915
ID: 1984587
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984589 - Posted: 11 Mar 2019, 18:13:30 UTC - in response to Message 1984586.  

Yes, that's what I think Vyper was referring to - it needs to be inside the <options> section, not dropping off the bottom.

Note that it's a maximum, not a 'wait until...' setting, so it probably won't help during normal running (BOINC will report completed tasks when the oldest is 1 hour old anyway). But it certainly helps clear the backlog after maintenance.
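
For anyone who wants to double-check the placement, here is a minimal sketch in Python that parses a cc_config.xml and confirms the tag sits directly inside <options>. The sample text mirrors the config posted above; the helper names (max_tasks_reported, is_misplaced) are made up for illustration and are not part of BOINC.

```python
import xml.etree.ElementTree as ET

# Sample config mirroring the one posted earlier in the thread.
SAMPLE = """<cc_config>
  <log_flags>
    <sched_op_debug>1</sched_op_debug>
  </log_flags>
  <options>
    <use_all_gpus>1</use_all_gpus>
    <save_stats_days>365</save_stats_days>
    <max_tasks_reported>100</max_tasks_reported>
  </options>
</cc_config>"""


def max_tasks_reported(xml_text):
    """Return the limit if the tag sits directly inside <options>, else None."""
    root = ET.fromstring(xml_text)
    node = root.find("options/max_tasks_reported")
    if node is None or node.text is None:
        return None
    return int(node.text.strip())


def is_misplaced(xml_text):
    """True if the tag exists somewhere in the file but not under <options>."""
    root = ET.fromstring(xml_text)
    exists = next(root.iter("max_tasks_reported"), None) is not None
    return exists and root.find("options/max_tasks_reported") is None


print(max_tasks_reported(SAMPLE))
print(is_misplaced(SAMPLE))
```

To check a real file, read its contents and pass the text to the same functions; the file's location depends on the install, so no path is assumed here.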
ID: 1984589
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984590 - Posted: 11 Mar 2019, 18:20:26 UTC - in response to Message 1984587.  

If there's nothing wrong with v7.8.3, why did I find a list of over 20 fixed bugs which had been omitted from the release code? GitHub #2065

But as you say, that was two years ago, and the patches were included in v7.10 and later. Can we both agree to let bygones be bygones, please?
ID: 1984590
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1984592 - Posted: 11 Mar 2019, 18:25:23 UTC - in response to Message 1984590.  

I think most of those were Your Windows Errors, if I remember correctly. None actually affected Linux. The Fact is, No One is reporting any trouble with 7.8.3 in Ubuntu. That is Not an opinion, it's fact.
ID: 1984592
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984594 - Posted: 11 Mar 2019, 18:34:10 UTC - in response to Message 1984592.  

As my comment says,

This list does NOT include commits targeted on Mac, Linux, or release 7.10
because I knew I wasn't qualified to judge whether those were ready for release. The purpose of my list was to draw other developers' (and users') attention to how much had been left behind. If they had been taken up, we'd have gone through the rest of the list in much more detail. But the consensus was to leave them dangling for another release cycle, and leave users to muddle through as best they could. The policy was "fix showstoppers only", so I agree that the ones left behind were minor and/or cosmetic. But it was still poor quality control, IMO.
ID: 1984594
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1984597 - Posted: 11 Mar 2019, 19:10:44 UTC

More info about the spoofed builds:

Actually, I run the 7.8.3 BOINC Manager with the modified BOINC 7.15.0 client. I know it is a mix, but it works perfectly. I encountered some problems when I tried to run the 7.15.0 BOINC Manager on my host, something related to the way the latest CCX compiler works. Keith knows about it and has a fix (nothing related to BOINC itself), but since my mix works I never tried it; he could explain it better.
ID: 1984597
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1984701 - Posted: 12 Mar 2019, 8:22:13 UTC

Late getting to this thread as I was out and about today. I set NNT (No New Tasks) before or during the outage, since you aren't going to get work anyway with the schedulers unavailable. The only way I know to consistently make a connection to the scheduler after the outage, once they come back online, is to only report finished work, with NNT still set and not asking for work. With the spoofed client I can go a day crunching from my cache. I have my max reported tasks set to 100, and that is a reasonable size that works pretty much all the time, even when the schedulers are busiest directly after they come back and are deluged with requests from normal empty hosts. It takes 6-8 hours to report my finished tasks on the normal 305 second connection schedule. I try and limit my impact to the servers by just reporting a modest amount of finished tasks at each connection.
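
As a rough sanity check of those figures, the sketch below estimates how long a backlog takes to drain when at most 100 tasks are reported per scheduler contact and contacts happen every 305 seconds (both numbers from the post). It assumes every contact succeeds and reports a full batch, which is optimistic right after an outage; the backlog size of 8000 is a made-up example, and hours_to_report is a hypothetical helper name.

```python
import math


def hours_to_report(backlog, per_contact=100, interval_s=305):
    """Estimate hours needed to report `backlog` finished tasks.

    Assumes one successful scheduler contact every `interval_s` seconds,
    each reporting at most `per_contact` tasks.
    """
    contacts = math.ceil(backlog / per_contact)
    return contacts * interval_s / 3600


# Hypothetical backlog of 8000 finished tasks.
print(round(hours_to_report(8000), 2))
```

Under the same assumptions, a 6-8 hour reporting window corresponds to a backlog of very roughly 7,000 to 9,400 tasks.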

I too run the 7.8.3 Manager on most of my hosts. I do have a couple of hosts running the 7.15.0 Manager. No difference in performance or stability. The menus are just slightly different. I had to satisfy a few more dependencies to compile the 7.15.0 Manager and install a half dozen extra libraries. Not a big deal once I saw what was missing. But I decided since the 7.8.3 Manager that comes in the All-in-One is perfectly functional, why go through the extra work for the 7.15.0 Manager when the already provided 7.8.3 works fine.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1984701
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1984704 - Posted: 12 Mar 2019, 8:58:12 UTC

I try and limit my impact to the servers by just reporting a modest amount of finished tasks at each connection.


Thank you Keith, all of the info I am seeing suggests that you and your fellow "extended GPU" users are aware of the impact you could have and are mitigating the consequences.

I was unaware of this and it is nice to know, as seeing the total of 595 apparent GPUs in the top 20 machines caused me some concern.
ID: 1984704
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1984709 - Posted: 12 Mar 2019, 9:43:12 UTC

This is slightly offtopic!
But I think that if everyone starts to "delete" old systems from their own account, the database will shrink when they do maintenance and remove obsolete computer IDs.
I've cleared out my old computers, because why should they need to be there when no workunits have been assigned to that ID for ages?

I think this is one thing that all of us users can do to reduce RAM usage to some degree for seti@home: get rid of obsolete systems that take up rows in the database.

https://setiathome.berkeley.edu/hosts_user.php?sort=rpc_time&rev=0&show_all=1&userid=

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 1984709
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1984730 - Posted: 12 Mar 2019, 12:24:13 UTC
Last modified: 12 Mar 2019, 12:28:40 UTC

For those who don't realise the link in the last post:

https://setiathome.berkeley.edu/hosts_user.php?sort=rpc_time&rev=0&show_all=1&userid=

Shows you all your computers, even ones that you may no longer own, each person sees their own computers.
ID: 1984730
marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984731 - Posted: 12 Mar 2019, 12:30:18 UTC - in response to Message 1984730.  
Last modified: 12 Mar 2019, 12:30:48 UTC


Shows you all your computers, even one that you may no longer own, each person sees their own computers.


Yeah, there's a couple I sold off, several dead machines, my first tablet, my first netbook, my first Dell, my first Linux attempt.

Absolutely NO WAY I'm deleting those records!

They are a few bytes of info and treasured memories of past machines.
ID: 1984731
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1984773 - Posted: 12 Mar 2019, 14:59:28 UTC

the only "problem" i can report with 7.8.3, and only in regards to the boincmgr, not the client itself, is that with a very large number of tasks (>9000, heh :)), it sometimes crashes the mgr. but this doesn't stop the client and it still keeps crunching in the background. This is also a very very specific case, and this probably wouldn't affect anyone else.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1984773
Oddbjornik (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 15 May 99
Posts: 220
Credit: 349,610,548
RAC: 1,728
Norway
Message 1984792 - Posted: 12 Mar 2019, 19:48:53 UTC - in response to Message 1984587.  

...cured such ills as the jumping Tasks/Transfers page...
@TBar; what changes did you make to fix the jumping tasks list in boincmgr?
I ask because I've just compiled the 7.14.2 manager straight from the source repository, and today, after the outage, each time I get new tasks, the list jumps to the end (as if to show off the new tasks)!
ID: 1984792
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1984793 - Posted: 12 Mar 2019, 19:54:03 UTC - in response to Message 1984792.  

...cured such ills as the jumping Tasks/Transfers page...
@TBar; what changes did you make to fix the jumping tasks list in boincmgr?
I ask because I've just compiled the 7.14.2 manager straight from the source repository, and today, after the outage, each time I get new tasks, the list jumps to the end (as if to show off the new tasks)!


oh wow, i thought the devs had fixed this.

anyway, you can sort your list by descending progress and keep the scroll bar at the bottom instead. that's what i did when i ran the old repository version.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1984793
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1984799 - Posted: 12 Mar 2019, 20:23:25 UTC - in response to Message 1984792.  

...cured such ills as the jumping Tasks/Transfers page...
@TBar; what changes did you make to fix the jumping tasks list in boincmgr?
I ask because I've just compiled the 7.14.2 manager straight from the source repository, and today, after the outage, each time I get new tasks, the list jumps to the end (as if to show off the new tasks)!

It's Still in the code being compiled today? Amazing...
Well, it was pointed out back here, Posted: 23 Sep 2017, 13:32:50 UTC
So, from 'committed on Jan 22, 2015' to present. That would be a Four year old Bug now?
That Bug caused me to run 7.2.33 for around Three years, 'cause anything much newer had the Jumping Tasks Page which I found unbearable.
ID: 1984799
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984804 - Posted: 12 Mar 2019, 20:50:27 UTC - in response to Message 1984799.  

Well, it was pointed out back here, Posted: 23 Sep 2017, 13:32:50 UTC
The commit you linked in that post is headed

Work around an apparent bug in wxWidgets 3.0 on Linux
Can you confirm that removing that (I have to assume well-intentioned) new line "m_pListPane->EnsureVisible(iDocCount - 1);" was the only change you made to eliminate the jumping bug?

Christian Beer has just started work on making the necessary compatibility changes to enable us to develop with wxWidgets 3.1 and thus (I sincerely hope) remove the apparent bug we were apparently working round. I'll ask him tomorrow to include your report in his testing.

I did actually pass your previous report upstream as #2147, so there's a convenient reference to use, but as you say nobody appears to have actioned it yet.
ID: 1984804

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.