BOINC needs an overhaul


The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1896
Credit: 2,622,541
RAC: 0
Australia
Message 975434 - Posted: 3 Mar 2010, 6:19:12 UTC

All up I think BOINC works very well. I don't always like how it handles work fetch and how it works out what to crunch and what it preempts. I also don't like how it can lose all the stats if your computer freezes at the wrong point in time - which happens all too regularly with 'doze. I also don't like the way GPU jobs are scheduled (FIFO is just all wrong and needs to be fixed).

I'm no fanboy - but it has come a very long way since its inception and is many times better than the very early versions. And OMG look at the number of projects using it.

It looks like it's doing exactly what it was designed to do.

I also wish reporting bugs / interacting with the devs was easier.

YMMV

The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1896
Credit: 2,622,541
RAC: 0
Australia
Message 975649 - Posted: 4 Mar 2010, 5:22:20 UTC
Last modified: 4 Mar 2010, 5:24:04 UTC

Speaking of not liking the work scheduler, see this link

PrimeGrid has gotten into deadline issues due to some other weird BOINC issue (I'll try and catch that one next) so now BOINC is going to crunch every WU it thinks is in deadline trouble until the sum of the estimated time to completion is OK again.

This means that each WU will remain in RAM (removing them is not the answer) and cause my machine to start paging to disk. UGH!

Kibble (KB7TIB)
Joined: 6 Dec 99
Posts: 21
Credit: 1,844,286
RAC: 2,265
United States
Message 975677 - Posted: 4 Mar 2010, 8:18:31 UTC

@ WinterKnight:


There is a case for users to participate in more than one project, but it can never be forced and if a user wants to do only one project then as I see it this rule makes BOINC broken. Find a different solution that doesn't affect projects that don't need it.


Please try this point out at the LHC@home fora. LOL!

Regards

____________

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,902,797
RAC: 257
Netherlands
Message 975727 - Posted: 4 Mar 2010, 17:43:08 UTC - in response to Message 975677.

Hi, anyway, I'm changing my opinion of the overhaul after carefully weighing the pros and cons. Another thing is the manpower needed to realize a complete rewrite of the code, perhaps even a change to another language.
I no longer think it's a real solution, though.

Nothing against you, Luke, of course.

____________

Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 7623
Credit: 7,027,026
RAC: 814
United States
Message 975757 - Posted: 4 Mar 2010, 20:29:09 UTC

Nice thread...

Luke--who I affectionately call "Trigger" now--and this fits perfectly this time--I truly think you jumped the gun on this one. I'm saying this not as a "metaphorical punch in the nose" but simply as a fact.

Other people have requested changes in code, etc (most recently Sattler) and it just doesn't happen when posted here. A petition/vote isn't going to help matters and, as we've seen, is starting to dredge up old arguments again (credits, etc).

Others have stated it--the project doesn't have the manpower or finances to do a total rewrite--it's simply not going to happen.

While Boinc works just fine for me, I do see where maybe it'd be a fair idea to stop all Boinc production for let's say--6 to 8 months--and revise what is already out there to fix some of the issues. However has anyone considered the status of DA's grant that funds Boinc and the parameters that state when these upgrades must occur? I just have to wonder if that is a factor that may be involved.

Until I'm forced to or see a version that I think is worthy, I'm sticking with 5.10.45
____________


Gary Charpentier
Project donor
Volunteer tester
Joined: 25 Dec 00
Posts: 13000
Credit: 7,666,268
RAC: 6,147
United States
Message 975777 - Posted: 4 Mar 2010, 22:45:21 UTC - in response to Message 975434.

I also wish reporting bugs / interacting with the devs was easier.

Yes, that is what needs the total re-write!

____________

Cannibal Corpse
Volunteer tester
Joined: 9 Aug 02
Posts: 13
Credit: 154,389
RAC: 0
United States
Message 975792 - Posted: 5 Mar 2010, 0:14:42 UTC - in response to Message 974751.

Since there are third-party Opt Apps that are better than the Boinc apps, why can't Boinc sanction those apps (authenticate?) and release them? So yes I agree, it needs a total restructure.
____________
Do WHAT THO WILL SHALL BE THE WHOLE OF THE LAW.

THE ONLY PATH WORTH TRAVELING IS THE PATH WITH HEART.
Proud member of TEAM CARL SAGAN

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4600
Credit: 121,644,053
RAC: 39,629
United States
Message 975793 - Posted: 5 Mar 2010, 0:16:35 UTC - in response to Message 975792.
Last modified: 5 Mar 2010, 0:17:50 UTC

Since there are third-party Opt Apps that are better than the Boinc apps, why can't Boinc sanction those apps (authenticate?) and release them? So yes I agree, it needs a total restructure.

There are Opt Apps for projects, such as seti@home, not BOINC. BOINC is just the software that runs those projects.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Cannibal Corpse
Volunteer tester
Joined: 9 Aug 02
Posts: 13
Credit: 154,389
RAC: 0
United States
Message 975794 - Posted: 5 Mar 2010, 0:25:00 UTC - in response to Message 975793.
Last modified: 5 Mar 2010, 0:32:35 UTC

Cool thx...understood. Like Win running Boinc. lol
Oh BTW some might wonder why I am crunching Mw@h and posting on S@H... Well, that's until S@H can fix the problems, which I wish I could donate money towards. And well, this is the better message board, I feel. So kudos and best wishes to all the crunchers out there...
____________
Do WHAT THO WILL SHALL BE THE WHOLE OF THE LAW.

THE ONLY PATH WORTH TRAVELING IS THE PATH WITH HEART.
Proud member of TEAM CARL SAGAN

woodenboatguy
Joined: 10 Nov 00
Posts: 368
Credit: 3,969,364
RAC: 0
Canada
Message 975812 - Posted: 5 Mar 2010, 1:20:58 UTC

Put me down for an informed "no" vote.

I wrote a great long (read "boring") post and then decided not to post it. The long and short of it is, after 10+ years as a coder beginning in the late '70s and as a consulting PM since 1990, my feeling is that those who use the word "rewrite" don't understand just what they are suggesting. One might as well say "rewrite" the Empire State Building. An exciting idea and perhaps in a half dozen years we'd get to see what the view is like from the same elevation on the new building. But then again, at least the washrooms would be new.

BOINC, like a substantial building, is a collection of highly complex interconnections worked in over many many (MANY) hours of work. Unless a legion of volunteer uber-coders are about to volunteer fantastic amounts of effort to essentially reproduce a major portion of exactly the same functionality (remember, not everything in this goes out in the trash), then we have no reference point. Add to that massive (MASSIVE) amounts of time and effort to test, deploy, respond to the new bugs (not the same as the old bugs of course so lots and lots AND LOTS of new things to complain about there).

If you are still reading at this point, you must be a PM too. Otherwise you are getting the drift. 98% of what is under the covers needs no "rewrite" and enhancements are the way of IT. I once enhanced code older than I am....it was that good and still running meeting the original needs it was written for, over 30 years prior.

Regards,

____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24788
Credit: 524,053
RAC: 86
United States
Message 975819 - Posted: 5 Mar 2010, 2:08:02 UTC - in response to Message 975052.

SNIP

I would suggest that the 2*cpu's rule be rejected on these grounds.

Sorry, you have not made your case.

1) Multi projects are encouraged. So if you cannot get work from one, you could get work from another.

2) There are real cases where a limit is required as explained.

If you can come up with a better limit, propose it and how to calculate it.

I am so tired of people in this Seti forum trying to get me to run other projects. I signed on to Seti in the Beginning and that is what I want to run. I never liked the idea of Boinc and that was why I left for awhile. If I didn't run seti I would probably Fold.

Fine, you don't have to sign up for another project. However, if your project goes down for a month, you cannot expect to be able to keep crunching through that.

But you still have not come up with a different limit, and a rationale for it.
____________


BOINC WIKI

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24788
Credit: 524,053
RAC: 86
United States
Message 975821 - Posted: 5 Mar 2010, 2:18:33 UTC - in response to Message 975812.

Put me down for an informed "no" vote.

I wrote a great long (read "boring") post and then decided not to post it. The long and short of it is, after 10+ years as a coder beginning in the late '70s and as a consulting PM since 1990, my feeling is that those who use the word "rewrite" don't understand just what they are suggesting. One might as well say "rewrite" the Empire State Building. An exciting idea and perhaps in a half dozen years we'd get to see what the view is like from the same elevation on the new building. But then again, at least the washrooms would be new.

BOINC, like a substantial building, is a collection of highly complex interconnections worked in over many many (MANY) hours of work. Unless a legion of volunteer uber-coders are about to volunteer fantastic amounts of effort to essentially reproduce a major portion of exactly the same functionality (remember, not everything in this goes out in the trash), then we have no reference point. Add to that massive (MASSIVE) amounts of time and effort to test, deploy, respond to the new bugs (not the same as the old bugs of course so lots and lots AND LOTS of new things to complain about there).

If you are still reading at this point, you must be a PM too. Otherwise you are getting the drift. 98% of what is under the covers needs no "rewrite" and enhancements are the way of IT. I once enhanced code older than I am....it was that good and still running meeting the original needs it was written for, over 30 years prior.

Regards,

I have been party to one re-write. It was going to take 4 years to cram the next two major features into the old code and only 2.5 to rewrite and get the next 5 major features in place. We were only 1 month wrong on the 2.5 years BTW. That was 2.5 years with NO new releases.

Are there things that could be done better in BOINC? Almost certainly. Are some ideas for changing things going to work worse than it does now? Absolutely. Have some changes been made that make things worse? Yes. (most recent example is 6.10.35 that has no preemption for single CPU applications, and no test for loooong times between checkpoints). Contrary to what people said before this was implemented, it DOES lead to late work that gets NO credit, as it is scientifically WORTHLESS.
____________


BOINC WIKI

Lionel
Joined: 25 Mar 00
Posts: 587
Credit: 240,693,897
RAC: 68,291
Australia
Message 975854 - Posted: 5 Mar 2010, 4:59:46 UTC


I for one would like to see the back of "Communicating with BOINC Client, Please Wait..." It seems that whenever I open Boinc Manager I wait an incredibly long time for the message box to go, and then I get about 5 seconds to do something before it's back...

The problem is only going to get worse with faster multi-core processors and GPUs.

I do think it's time to re-think how BOINC works and redesign the app.


____________

Aurora Borealis
Volunteer tester
Joined: 14 Jan 01
Posts: 3001
Credit: 5,187,343
RAC: 2,388
Canada
Message 975889 - Posted: 5 Mar 2010, 8:17:48 UTC - in response to Message 975854.
Last modified: 5 Mar 2010, 8:23:24 UTC


I for one would like to see the back of "Communicating with BOINC Client, Please Wait..." It seems that whenever I open Boinc Manager I wait an incredibly long time for the message box to go, and then I get about 5 seconds to do something before it's back...

The problem is only going to get worse with faster multi-core processors and GPUs.

I do think it's time to re-think how BOINC works and redesign the app.


Your systems have between 3,000 and 8,000 tasks. The 'Show active tasks' option was added for people with very large caches. I'm sure it helped a little for display, but Boinc still needs to read and update the very large files needed to keep track of so many tasks, and that takes a lot of time. The only way I can think of to speed things up a bit would be for Boinc to have an indexed database of the tasks on the system. This would require a major overhaul and months if not years of coding and debugging.
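To make the "indexed database" idea a little more concrete, here is a rough sketch using Python's built-in sqlite3 module. It is only an illustration of the general technique: the table layout, the state names and the function names are all made up for the example and are not how BOINC stores its task state.

```python
import sqlite3


def build_task_db(tasks):
    """Load (name, state) pairs into an in-memory table indexed by state.

    Hypothetical schema for illustration only; not BOINC's own format.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasks (name TEXT PRIMARY KEY, state TEXT NOT NULL)")
    # The index lets a lookup by state avoid scanning every row.
    conn.execute("CREATE INDEX idx_tasks_state ON tasks (state)")
    conn.executemany("INSERT INTO tasks (name, state) VALUES (?, ?)", tasks)
    conn.commit()
    return conn


def tasks_in_states(conn, states):
    """Return the names of all tasks whose state is one of `states`."""
    placeholders = ",".join("?" for _ in states)
    query = f"SELECT name FROM tasks WHERE state IN ({placeholders}) ORDER BY name"
    return [row[0] for row in conn.execute(query, list(states))]


# 8,000 made-up tasks, of which only four are running.
tasks = [(f"wu_{i:05d}", "running" if i < 4 else "ready_to_start") for i in range(8000)]
conn = build_task_db(tasks)
running = tasks_in_states(conn, ["running"])
```

With something like this, a display that only cares about running tasks asks for four rows instead of re-reading all 8,000.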
____________

Boinc V7.2.42
Win7 i5 3.33G 4GB, GTX470

Ageless
Joined: 9 Jun 99
Posts: 12397
Credit: 2,668,453
RAC: 826
Netherlands
Message 975902 - Posted: 5 Mar 2010, 9:09:06 UTC - in response to Message 975889.

.. but Boinc still needs to read and update very large files needed to keep track of so many tasks and that takes a lot of time.

And display those in real-time. At present the RPC to fetch the information takes 1.5 seconds to display 1,200 tasks, while the refresh update in BOINC Manager is 1 second. So 3,000 tasks take 3.75 seconds, while 8,000 take 10 seconds. There's just no way to get all that information into a 1 second refresh rate.
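For anyone who wants to redo the arithmetic, a small sketch follows. It assumes the fetch time grows in a straight line from the one measurement quoted above (1.5 seconds for 1,200 tasks); the constant and function names are just for the example.

```python
MEASURED_TASKS = 1200      # task count in the quoted measurement
MEASURED_SECONDS = 1.5     # RPC time for that many tasks
REFRESH_SECONDS = 1.0      # BOINC Manager refresh interval


def rpc_seconds(task_count):
    """Estimated RPC time, assuming cost grows linearly with task count."""
    return task_count * MEASURED_SECONDS / MEASURED_TASKS


def max_tasks_per_refresh():
    """Largest task count whose estimated RPC time fits in one refresh."""
    return int(REFRESH_SECONDS * MEASURED_TASKS / MEASURED_SECONDS)


for count in (3000, 8000):
    print(count, "tasks:", rpc_seconds(count), "seconds")
print("fits in one refresh:", max_tasks_per_refresh(), "tasks")
```

Under that linear assumption, 3,000 tasks come out at 3.75 seconds and 8,000 at 10 seconds, and only about 800 tasks fit inside a single 1 second refresh.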

And then the question is what's the use of showing/updating all those tasks that aren't running/otherwise actively being used? Hence the use of the "Show active tasks" button, which is on by default. If you clicked the button while it showed "Show all tasks", then left it at that, it's no wonder BM loses the connection to the client.

So now you can't run BM long enough to click that button. No, but you can edit the registry to reactivate this button.

Start->Run, type regedit, click OK.
Navigate to HKEY_CURRENT_USER\Software\Space Sciences Laboratory, U.C. Berkeley\BOINC Manager\Tasks and change the value of ActiveTasksOnly from 0 to 1.
Then start BOINC Manager. Changes are immediate.

Active tasks are only those that are running, waiting to run and suspended.
It will not show the ready to start, uploading, downloading and ready to report.
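If you would rather not click through regedit by hand, the same change can be put in a .reg file. The sketch below only builds the text of such a file. The key path and value name are the ones given above; that the value is stored as a DWORD is my assumption, so check the existing value's type in regedit before importing anything.

```python
KEY_PATH = (
    r"HKEY_CURRENT_USER\Software\Space Sciences Laboratory, U.C. Berkeley"
    r"\BOINC Manager\Tasks"
)
VALUE_NAME = "ActiveTasksOnly"


def reg_file_text(active_only=True):
    """Build .reg file text that sets ActiveTasksOnly to 1 or 0.

    Assumes the value is a DWORD; the post above does not state the type.
    """
    dword = 1 if active_only else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
        f'"{VALUE_NAME}"=dword:{dword:08x}',
        "",
    ]
    return "\r\n".join(lines)


def write_reg_file(path, active_only=True):
    """Write the .reg text to `path` as UTF-16 with Windows line endings."""
    with open(path, "w", encoding="utf-16", newline="") as handle:
        handle.write(reg_file_text(active_only))


print(reg_file_text(True))
```

Import the resulting file while BOINC Manager is closed, then start the Manager as described above.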
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24788
Credit: 524,053
RAC: 86
United States
Message 975934 - Posted: 5 Mar 2010, 12:04:19 UTC - in response to Message 975902.

.. but Boinc still needs to read and update very large files needed to keep track of so many tasks and that takes a lot of time.

And display those in real-time. At present the RPC to fetch the information takes 1.5 seconds to display 1,200 tasks, while the refresh update in BOINC Manager is 1 second. So 3,000 tasks take 3.75 seconds, while 8,000 take 10 seconds. There's just no way to get all that information into a 1 second refresh rate.

And then the question is what's the use of showing/updating all those tasks that aren't running/otherwise actively being used? Hence the use of the "Show active tasks" button, which is on by default. If you clicked the button while it showed "Show all tasks", then left it at that, it's no wonder BM loses the connection to the client.

So now you can't run BM long enough to click that button. No, but you can edit the registry to reactivate this button.

Start->Run, type regedit, click OK.
Navigate to HKEY_CURRENT_USER\Software\Space Sciences Laboratory, U.C. Berkeley\BOINC Manager\Tasks and change the value of ActiveTasksOnly from 0 to 1.
Then start BOINC Manager. Changes are immediate.

Active tasks are only those that are running, waiting to run and suspended.
It will not show the ready to start, uploading, downloading and ready to report.

That assumes linear scaling. I know there is a sort involved, so the cost is at least n * lg(n), and if the wrong sort is used it may be as large as n^2. For simplicity, say 1,000 tasks take 1 second. Then 10,000 tasks would take about 13 seconds with n*lg(n) and 100 seconds with n^2.
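The scaling can be checked by normalizing each cost function to the 1,000-tasks-in-1-second baseline. This is only back-of-the-envelope arithmetic with made-up helper names; it says nothing about which sort BOINC Manager really uses.

```python
import math


def scaled_seconds(base_tasks, base_seconds, tasks, cost):
    """Scale a measured time to another task count under a given cost model."""
    return base_seconds * cost(tasks) / cost(base_tasks)


def n_log_n(n):
    return n * math.log2(n)


def n_squared(n):
    return n * n


# Baseline from the simplification above: 1,000 tasks take 1 second.
nlogn_time = scaled_seconds(1000, 1.0, 10_000, n_log_n)
quadratic_time = scaled_seconds(1000, 1.0, 10_000, n_squared)
print(f"n*lg(n): {nlogn_time:.1f} s, n^2: {quadratic_time:.0f} s")
```

The base of the logarithm does not matter here, because it cancels out in the ratio.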

One possible problem (and I don't know if it exists) is that adding tasks to a windows list box that is sorted - sorts tasks every time an item is added.
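To show why that would hurt, here is a sketch that counts comparisons when a list is re-sorted after every single addition versus filled first and sorted once. It uses a plain Python list as a stand-in; it is not a Windows list box and not BOINC Manager code, and the class and function names are invented for the example.

```python
import random


class Counted:
    """Wraps a value and counts every comparison made while sorting."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def comparisons_sorting_each_insert(values):
    """Append items one at a time, re-sorting the whole list each time."""
    Counted.comparisons = 0
    shown = []
    for value in values:
        shown.append(Counted(value))
        shown.sort()
    return Counted.comparisons


def comparisons_sorting_once(values):
    """Add every item first, then sort a single time."""
    Counted.comparisons = 0
    shown = [Counted(value) for value in values]
    shown.sort()
    return Counted.comparisons


random.seed(0)
values = random.sample(range(100_000), 500)
per_insert = comparisons_sorting_each_insert(values)
once = comparisons_sorting_once(values)
print("sort after every insert:", per_insert, "comparisons")
print("sort once at the end:   ", once, "comparisons")
```

The gap widens quickly as the task count grows, which is why filling the list first and sorting once is the usual way around this.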
____________


BOINC WIKI

hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 975951 - Posted: 5 Mar 2010, 14:02:05 UTC - in response to Message 975902.

.. but Boinc still needs to read and update very large files needed to keep track of so many tasks and that takes a lot of time.

And display those in real-time. At present the RPC to fetch the information takes 1.5 seconds to display 1,200 tasks, while the refresh update in BOINC Manager is 1 second. So 3,000 tasks take 3.75 seconds, while 8,000 take 10 seconds. There's just no way to get all that information into a 1 second refresh rate.

And then the question is what's the use of showing/updating all those tasks that aren't running/otherwise actively being used? Hence the use of the "Show active tasks" button, which is on by default. If you clicked the button while it showed "Show all tasks", then left it at that, it's no wonder BM loses the connection to the client.

So now you can't run BM long enough to click that button. No, but you can edit the registry to reactivate this button.

Start->Run, type regedit, click OK.
Navigate to HKEY_CURRENT_USER\Software\Space Sciences Laboratory, U.C. Berkeley\BOINC Manager\Tasks and change the value of ActiveTasksOnly from 0 to 1.
Then start BOINC Manager. Changes are immediate.

Active tasks are only those that are running, waiting to run and suspended.
It will not show the ready to start, uploading, downloading and ready to report.

Thanks Ageless, I really needed that advice a few weeks ago. I will save this post.
____________
Official Abuser of Boinc Buttons...
And no good credit hound!

Ageless
Joined: 9 Jun 99
Posts: 12397
Credit: 2,668,453
RAC: 826
Netherlands
Message 975955 - Posted: 5 Mar 2010, 14:38:06 UTC

You don't need to run BOINC manager to run the client (and science apps), even when you haven't got BOINC installed as a service.

By enabling the exit dialog (available since BOINC 6.3.23. In case it's off: through Advanced->Options, check "Enable Manager exit dialog?") you can exit BOINC Manager without stopping the client or the science applications. Uncheck "Stop running science applications when exiting Manager" and you're done.

You can even start BOINC without BOINC Manager, just start BOINC from a command line (or batch file) with boinc.exe --detach (detach will close the console window).
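The same thing can be done from a script instead of a batch file. A minimal sketch, assuming boinc.exe can be found on the PATH (otherwise pass its full path); the --detach flag is the one mentioned above, and the function names and the optional working directory are just for the example.

```python
import subprocess


def detached_client_command(boinc_exe="boinc.exe"):
    """Command line for starting the client without BOINC Manager."""
    return [boinc_exe, "--detach"]


def start_client(boinc_exe="boinc.exe", working_dir=None):
    """Launch the client and return the process handle.

    `working_dir` is an optional directory to start the client in
    (hypothetical parameter for this sketch).
    """
    return subprocess.Popen(detached_client_command(boinc_exe), cwd=working_dir)


# Only print the command here; call start_client() to actually launch it.
print(" ".join(detached_client_command()))
```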
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Norwich Gadfly
Joined: 29 Dec 08
Posts: 100
Credit: 488,414
RAC: 0
United Kingdom
Message 975994 - Posted: 5 Mar 2010, 17:59:02 UTC - in response to Message 975907.

OK...the biggest problem I have with Boinc is its overhead when dealing with large caches.
3,000 to 4,000 WUs and above.
Boincmanager seems to suddenly become self-centered, and consumes as much CPU time as possible just pondering what to do next and reading/writing to the hard drive. Hard drive bursts of 5-15 seconds with little going on besides that.

Meow meow.


I'm not surprised; it will spend most of its time scratching its head trying to work out what to do next. But why would anyone want a cache with that many units? On a four-core machine, I currently have five active tasks, four ready to report, and nine ready to start, spread over three projects. My additional work buffer is one day.

____________


Copyright © 2014 University of California