High performance Linux clients at SETI
Author | Message |
---|---|
-= Vyper =- Send message Joined: 5 Sep 99 Posts: 1652 Credit: 1,065,191,981 RAC: 2,537 |
I use BOINC Manager 7.9.3 and got that bug too! The BOINC executable is 7.15.0, installed with apt-get install boinc-client from the Ubuntu / Debian repository, not downloaded from elsewhere. _________________________________________________________________________ Addicted to SETI crunching! Founder of GPU Users Group |
Oddbjornik Send message Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728 |
Now I'm up and running with a boincmgr without the offending (not-)fix. No list-jumps in the first five minutes :-) As this bug is somewhat sporadic, it may take a while before I'm certain that it's gone. Let's see what TBar says. |
Oddbjornik Send message Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728 |
While I'm at it: the reason I decided to build my own boincmgr in the first place was that I noticed that when the task list tab was selected, the process would use a lot of CPU even when the window was minimized to the task bar. The problem was not very visible on a client with 400 tasks, but on my main cruncher, with 1600 tasks (admittedly not completely rightfully obtained), the boincmgr process would use 20% CPU when minimized. And I found out why. Lines 1052 to 1056 in the current version of MainDocument.cpp contain the following: // Don't do periodic RPC calls when hidden / minimized if (!pFrame->IsShown()) return; #ifdef __WXMAC__ if (!wxGetApp().IsApplicationVisible()) return; #endif But when the window is minimized, it is apparently both Shown and Visible. So I added a call to IsIconized: // Don't do periodic RPC calls when hidden / minimized if (!pFrame->IsShown() || pFrame->IsIconized()) return; #ifdef __WXMAC__ if (!wxGetApp().IsApplicationVisible()) return; #endif And with that change, CPU usage practically disappears once the window is minimized. [edit: not sure about this line at all] The new call probably belongs inside the #ifdef block, but I haven't tested this under Windows, so I'm not sure. Could anyone in here forward this modification as a suggestion to the right people? Richard? |
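The difference between the two guards can be sketched without wxWidgets. This is only a model: `FrameState` is a hypothetical stand-in for the frame's `IsShown()` / `IsIconized()` answers, and it assumes, as the post reports, that a minimized window still counts as shown.

```cpp
#include <cassert>

// Hypothetical stand-in for the window-state queries named in the post.
// This is NOT the real wxWidgets or BOINC Manager class.
struct FrameState {
    bool shown;     // what pFrame->IsShown() would report
    bool iconized;  // what pFrame->IsIconized() would report
};

// Original guard: skip the periodic RPCs only when the frame is not shown.
bool skipRpcOriginal(const FrameState& f) {
    return !f.shown;
}

// Modified guard from the post: also skip when the frame is iconized.
bool skipRpcPatched(const FrameState& f) {
    return !f.shown || f.iconized;
}

// Walks through the three window states discussed in the post.
void checkGuards() {
    const FrameState minimized{true, true};   // assumed: still "shown"
    const FrameState visible{true, false};
    const FrameState hidden{false, false};

    assert(!skipRpcOriginal(minimized));  // original keeps polling
    assert(skipRpcPatched(minimized));    // patched stops polling
    assert(!skipRpcPatched(visible));     // normal window still polls
    assert(skipRpcPatched(hidden));       // hidden window never polls
}
```

Under that assumption only the minimized case changes behaviour, which would match the drop in CPU usage described above.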
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Hmmm, this sounds similar to a question the FBI would ask when trying to accuse you of lying to the FBI. I cannot recall... Actually, the disk with that system on it died long ago. However, I vaguely remember something about using that section from 7.4.42, which doesn't have the Bug, and changing 7.4.44 & 7.8.3 to match that section of 7.4.42. Anyway, it seems to have worked, and I love my 7.4.44. But, it is getting old... Well, it was pointed out back here, Posted: 23 Sep 2017, 13:32:50 UTC The commit you linked in that post is headed |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I try to limit my impact on the servers by reporting just a modest number of finished tasks at each connection. Yes, I only have 17 physical GPUs in my five hosts. The same kind of low physical count applies to the other top hosts that show a spoofed GPU client. There are some members with actual counts up to about 12, which I think is the most we've seen on a repurposed crypto mining motherboard. The members who have more than 3 or 4 GPUs in a single host can be counted on one hand. It is very difficult to get more than 4 GPUs to cooperate on typical motherboard hardware. I applaud their efforts because it shows great skill and perseverance. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
This is the very first I, and I expect others have heard about the "shutdown" during and after outages, this is to be commended, but I think that when anyone who cares to look can see these 64 GPU machines in the top 20, it might have been a good "PR" exercise to let people know that you weren't "swamping the server" A little more explanation of how the "shutdown" works, using today's outage as an example: SETI came back from the outage a few hours ago, and my host still has not requested or reported a single WU (I have about 2K WUs ready to upload and report). It will not do so for the next 4 hours, and when it does, it will report only 100 WUs every 5 minutes. Now multiply that by 10 or more hungry hosts doing the same, some with a lot more WUs thanks to their top GPUs. This is why we said that with this approach we are actually helping both ourselves and the servers get through the outages with a little less pain. |
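To put rough numbers on that throttling, here is a small sketch of the arithmetic. The function names are made up for illustration, and the backlog of exactly 2000 WUs is an assumption standing in for "about 2K".

```cpp
#include <cassert>

// Number of scheduler contacts needed to report `backlog` tasks when each
// contact reports at most `perContact` tasks (ceiling division).
long contactsNeeded(long backlog, long perContact) {
    return (backlog + perContact - 1) / perContact;
}

// Minutes between the first and the last contact, given a fixed gap
// between consecutive contacts.
long minutesToDrain(long backlog, long perContact, long gapMinutes) {
    const long contacts = contactsNeeded(backlog, perContact);
    return contacts > 0 ? (contacts - 1) * gapMinutes : 0;
}

// Figures from the post: ~2000 WUs, 100 per report, one report every 5 min.
void checkDrainFigures() {
    assert(contactsNeeded(2000, 100) == 20);
    assert(minutesToDrain(2000, 100, 5) == 95);
}
```

So one such host would spread its backlog over 20 small reports across roughly an hour and a half, instead of handing everything to the servers in one go right after the outage.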
-= Vyper =- Send message Joined: 5 Sep 99 Posts: 1652 Credit: 1,065,191,981 RAC: 2,537 |
Yes, same for me. Seti was back online very early this time, but all the times are due to be adjusted in the future. My top hosts are now "disabled", as you can see by looking here: https://setiathome.berkeley.edu/hosts_user.php?userid=1635 .. They're not allowed to "talk" to the network. That setting unfortunately affects all BOINC projects, but I don't care because I only do s@h. _________________________________________________________________________ Addicted to SETI crunching! Founder of GPU Users Group |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Yes, same for me. Seti was very early to be back online again but all times are due to be adjusted in the future. My top hosts are now "disabled" , you'll see that by looking here. https://setiathome.berkeley.edu/hosts_user.php?userid=1635 .. They're not allowed to "talk" to the network. That setting unfortunately affect all boinc Projects but i don't care because i only do s@h solely. Yes, this is why I changed the setting today to: <day_prefs> <day_of_week>2</day_of_week> <net_start_hour>22.00</net_start_hour> <net_end_hour>8.00</net_end_hour> </day_prefs> That stops a little later, closer to the outage start time, and wakes up 3 hours later, for a 14-hour shutdown period instead of 12. |
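For anyone checking the 14-hour figure, here is a minimal sketch of how such a window can be evaluated. It assumes that a start hour later than the end hour means the allowed window wraps past midnight; the function names are hypothetical and this is not the BOINC client's own code.

```cpp
#include <cassert>

// True when `hour` (0 <= hour < 24) falls inside the allowed network window.
// Assumption: start > end means the window wraps past midnight.
bool networkAllowed(double hour, double startHour, double endHour) {
    if (startHour == endHour) return true;  // assumed: no restriction
    if (startHour < endHour) return hour >= startHour && hour < endHour;
    return hour >= startHour || hour < endHour;
}

// Length of the allowed window in hours, under the same assumption.
double allowedHours(double startHour, double endHour) {
    if (startHour == endHour) return 24.0;
    return startHour < endHour ? endHour - startHour
                               : 24.0 - startHour + endHour;
}

// Values from the <day_prefs> block above: start 22.00, end 8.00.
void checkTuesdayWindow() {
    assert(allowedHours(22.0, 8.0) == 10.0);         // network on for 10 h
    assert(24.0 - allowedHours(22.0, 8.0) == 14.0);  // network off for 14 h
    assert(!networkAllowed(12.0, 22.0, 8.0));        // midday: blocked
    assert(networkAllowed(23.0, 22.0, 8.0));         // late evening: allowed
    assert(networkAllowed(7.5, 22.0, 8.0));          // early morning: allowed
}
```

With those values the host stays off the network from 08:00 to 22:00 on the chosen day, which is where the 14 hours come from.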
-= Vyper =- Send message Joined: 5 Sep 99 Posts: 1652 Credit: 1,065,191,981 RAC: 2,537 |
|
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Hmmm, this sounds similar to a question the FBI would ask when trying to accuse you of lying to the FBI. I cannot recall... Actually, the disk with that system on it died long ago. However, I vaguely remember something about using that section from 7.4.42, which doesn't have the Bug, and changing 7.4.44 & 7.8.3 to match that section of 7.4.42. Anyway, it seems to have worked, and I love my 7.4.44. But, it is getting old... Well, I assure you that I'm not connected with the FBI - nor with its British equivalent (in this context) Special Branch. But this is the way that bugs get fixed. It is incredibly helpful to have clear, unemotional, reports of
|
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Perhaps if you go back to where I was compiling numerous versions of BOINC and running each one until I found where the BUG started? You can look around here, and below, https://setiathome.berkeley.edu/forum_thread.php?id=81916&postid=1891240#1891240 I probably compiled and tested a couple dozen versions of BOINC before narrowing it down to 7.4.43. From there it was a simple case of testing what was new in 7.4.43 from 7.4.42. That didn't take long, after tracking it down. A couple of victims have already posted about it in this thread; I ask because I've just compiled the 7.14.2 manager straight from the source repository, and today, after the outage, each time I get new tasks, the list jumps to the end (as if to show off the new tasks)! I Use Boinc manager 7.9.3 and got that bug to! Everyone running Linux is a victim here, they will all say the same thing, "each time I get new tasks, the list jumps to the end" All you have to do is run the BOINC Manager in the Tasks Tab, scrolled to the Top. Soon it will start, then just about every 5 minutes, the page will be jumping to the bottom. Unless, you are running one of My versions of BOINC, My BOINCs don't do that. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
If the branches you wanted to compare were newer, then you could use the compare function of Github at the commit level. But I can't figure out how to do a branch commit compare between 7.4.42 and 7.4.43 since the picklist only gives you the major 7.4 branch to choose from and not the sub branches. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Oddbjornik Send message Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728 |
Everyone running Linux is a victim here, they will all say the same thing, "each time I get new tasks, the list jumps to the end" Also, the offending call is inside some logic where GetDocCount is compared to GetCacheCount. I think this code means that the bug only shows itself when the number of tasks in the list changes, specifically when it is reduced, as when, after the outage, I report 50 tasks and typically only get 7 new tasks back. |
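A rough way to picture that guess, not taken from the BOINC source: the helpers below are hypothetical, the 1600-task starting point is just an illustrative figure, and all the sketch shows is that a report/fetch exchange changes the count, which is the condition the comparison would react to.

```cpp
#include <cassert>

// Tasks in the list after one scheduler exchange (hypothetical helper).
long tasksAfterExchange(long before, long reported, long received) {
    return before - reported + received;
}

// Stand-in for the comparison described in the post: the resync path
// would only be taken when the document count and cached count differ.
bool countsDiffer(long docCount, long cacheCount) {
    return docCount != cacheCount;
}

// Figures from the post: report 50 tasks, get 7 back.
void checkExchange() {
    const long cached = 1600;  // assumed list size before the exchange
    const long current = tasksAfterExchange(cached, 50, 7);
    assert(current == 1557);
    assert(countsDiffer(current, cached));  // list shrank: resync path taken
    assert(!countsDiffer(cached, cached));  // unchanged list: nothing happens
}
```

If the theory is right, a steady list with no reports or downloads would never trigger the jump, which fits the "about every 5 minutes" pattern people describe.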
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
OK, I hear what you say. I've updated #2147 to describe the problem in more formal terms (please check I've got that right, because I can't see it myself), and I've asked Christian Beer to investigate while he's working on #3050. He was online and active at the same time I was, so hopefully he will see the report quickly. |
Tom M Send message Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462 |
I don't feel like I have great skill. :) I won't deny the "keep trying" though :) Tom A proud member of the OFA (Old Farts Association). |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
OK, I hear what you say. I've updated #2147 to describe the problem in more formal terms (please check I've got that right, because I can't see it myself), and I've asked Christian Beer to investigate while he's working on #3050. He was online and active at the same time I was, so hopefully he will see the report quickly. It's better than it was. However, My choice of Titles would have been different. "bug/workround no longer needed?" doesn't imply it's actually causing display problems to the effect it is. Also, to the best of My knowledge, this Bug affects All Linux users, especially those returning completed tasks every 5 minutes. The title would imply there isn't any problem, just that something isn't needed anymore. Quite misleading in My Opinion. For someone who keeps the Manager open to Tasks, trying to display the Completed/Active Tasks, this Bug is a Showstopper. I certainly won't use a version of BOINC with this Bug, which is why I built a version without this Bug. Try running the Manager with this view, it won't happen with any of the BOINC versions released in the past FOUR years. Every 5 minutes the page will jump to the bottom displaying only recently downloaded tasks. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Christian Beer has tested and reported back: I've tested the described behavior with a local built using wxWidgets 3.1. The workaround is still required to handle the initial problem but it seems the side-effect described by Richard are gone now. So for the next Client Release we should update wxWidgets to also be in sync with the Mac Client. So, assuming my description was accurate and complete, there is the prospect of a resolution on the horizon. Is anyone currently experiencing the problem in a position to build and test a Manager using the development code from the referenced pull request, and confirm that the result is as you would like it? Note that this solution is due to a change in wxWidgets code, not BOINC code. |
Oddbjornik Send message Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728 |
Christian Beer has tested and reported back: I suppose I can do a build this weekend and see how it behaves. But I could use a slightly less convoluted description, just so I'll know that I do exactly what you want me to do and not something else. Regular version 3.1.2 of wxWidgets? Is there a more precise definition of "development code from the referenced pull request"? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Is there a more precise definition of "development code from the referenced pull request"? I would start with a clean clone from master. Pull Request #3050 comprises these commits: "Manager: remove deprecated wxWidgets flags" (bbceb19b967f77132a78da8c2e94ca7c1df6cf0e); "Build: update wxWidgets macro from wxWidgets 3.1.2" (fbd15aff54030ffacc629d9139f1ec8bca76fa0f); "Build: prepare m4 macros for wxWidgets 3.1" (af1e1cb3607e4e2d46ed99489f74f9b261167a1f). You probably need to visit the individual pages to read the more detailed notes on the changes. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Well, if you compile 7.14.2 with wxWidgets 3.0.3 all goes well. Just as with BOINC 7.8.3. Trying it with wxWidgets 3.1.0, which is what My Mac is using, ends with an error: /home/tbar/wxWidgets-3.1.0/include/wx/vector.h:44:23: note: previous declaration of ‘void wxQsort(void*, size_t, size_t, wxSortCallback, const void*)’ WXDLLIMPEXP_BASE void wxQsort(void* pbase, size_t total_elems, ^ In file included from BOINCListCtrl.h:59:0, from ViewProjects.cpp:29: ViewProjects.cpp: In constructor ‘CViewProjects::CViewProjects(wxNotebook*)’: BOINCBaseView.h:26:58: error: ‘wxADJUST_MINSIZE’ was not declared in this scope #define DEFAULT_TASK_FLAGS wxTAB_TRAVERSAL | wxADJUST_MINSIZE | wxFULL_REPAIN ^ ViewProjects.cpp:186:53: note: in expansion of macro ‘DEFAULT_TASK_FLAGS’ CBOINCBaseView(pNotebook, ID_TASK_PROJECTSVIEW, DEFAULT_TASK_FLAGS, ID_LIST_PROJECTSV ^ Makefile:1685: recipe for target 'boincmgr-ViewProjects.o' failed make[2]: *** [boincmgr-ViewProjects.o] Error 1 make[2]: Leaving directory '/home/tbar/boinc/clientgui' So, that's going to need to be fixed to use 3.1.0. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.