BOINC Wish List


John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24221
Credit: 519,558
RAC: 57
United States
Message 928904 - Posted: 26 Aug 2009, 22:17:04 UTC - in response to Message 928827.

My 2 cents.....

Leave my update button ALONE.
It is very helpful when working through Boinc and/or server problems.

And I personally think this is much ado about nothing.

I rather suspect that the fraction of a percent of users who could possibly be using the update button at any given time represent a non-measurable impact on the server load.

I think the comms backoffs built into Boinc roll back user requests enough to handle things when the server side is having problems. And that covers the 99.whatever percent of users who are not sitting in front of their computer trying to figure out what their problem is.

The tiny fraction of us (that's you and I on the forums, folks) who do monitor their rigs and occasionally abuse the update button to coax things along are NOT going to significantly add to the server load.

As I said....just my opinion.

This would only be for the time that the server would not respond to the button anyway. Once that period has passed, the button would re-enable, even if the client had decided to extend the communications deferral.
____________


BOINC WIKI

Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 3,886,962
RAC: 18,001
Norway
Message 928984 - Posted: 27 Aug 2009, 9:17:13 UTC - in response to Message 928904.

This would only be for the time that the server would not respond to the button anyway. Once that period has passed, the button would re-enable, even if the client had decided to extend the communications deferral.

Well, with the current 11-second deferral, any greying-out of the button would have very little effect, so it is basically a waste of time to add. Now, some projects have a longer deferral, but most of these are either giving a fairly steady work supply, or are (nearly) permanently out of work (like LHC).

But, one thing that's been overlooked is, the client doesn't know the difference between "N seconds deferred to not reconnect immediately", and "N hours deferred due to reaching daily quota"... If the user has fixed whatever caused a quota of 1, and has one or more finished tasks that will increase the quota, being blocked from connecting for up to 24 hours before they can get more work would be a bad idea.


____________
"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."

Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2297
Credit: 2,999,007
RAC: 123
United States
Message 929017 - Posted: 27 Aug 2009, 12:45:26 UTC - in response to Message 928984.

But, one thing that's been overlooked is, the client doesn't know the difference between "N seconds deferred to not reconnect immediately", and "N hours deferred due to reaching daily quota"...


Um ... Correct me if I'm wrong, but isn't "the client" written by the same people in Berkeley who send the messages to defer communications? Or, at least, the two groups of people can talk to each other and decide that "Defer 12 seconds" means "defer 12 seconds", regardless of the reason for the deferral?

But, in any case, we have discussed this to no end already and have (I think) collectively come to the conclusion that any sort of lock out by the client is going to be a big problem for folks who are having trouble with communications and are trying to debug the problem, and thus the lock out idea is a "No go" and we are left with the honor system.
____________
FireFox Personas


1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 929185 - Posted: 28 Aug 2009, 3:06:42 UTC - in response to Message 929017.

But, one thing that's been overlooked is, the client doesn't know the difference between "N seconds deferred to not reconnect immediately", and "N hours deferred due to reaching daily quota"...


Um ... Correct me if I'm wrong, but isn't "the client" written by the same people in Berkeley who send the messages to defer communications?

In a very real sense, no.

BOINC is a project to write a universal volunteer computing platform.

SETI@Home is a volunteer computing platform that uses BOINC.

They're friendly, they probably help each other a lot, and some are at least listed as part of both projects, but they're different things.

... and BOINC is project-neutral.

____________

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38667
Credit: 572,453,124
RAC: 587,081
United States
Message 929210 - Posted: 28 Aug 2009, 6:22:53 UTC - in response to Message 929185.
Last modified: 28 Aug 2009, 6:27:59 UTC

But, one thing that's been overlooked is, the client doesn't know the difference between "N seconds deferred to not reconnect immediately", and "N hours deferred due to reaching daily quota"...


Um ... Correct me if I'm wrong, but isn't "the client" written by the same people in Berkeley who send the messages to defer communications?

In a very real sense, no.

BOINC is a project to write a universal volunteer computing platform.

SETI@Home is a volunteer computing platform that uses BOINC.

They're friendly, they probably help each other a lot, and some are at least listed as part of both projects, but they're different things.

... and BOINC is project-neutral.

And that's part of the problem occasionally when a few of us (myself included) start to ask for changes to the Boinc client when looking at things from a Seti-centric only point of view.

I must plead guilty to this from time to time, as I ONLY run Seti, and sometimes I forget that Boinc must not only handle the way Seti functions, but every other project under the Boinc umbrella as well.

I also never have any impact or problems involving workshare and debts between different projects, because other than for a very short time here and there, I have never tried to run anything alongside Seti.

What might be grand from a Seti point of view may not always work well with some of the other projects.

As Ned pointed out....Boinc needs to remain, as much as possible, project-neutral.

A more difficult task than some might imagine........
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 3,886,962
RAC: 18,001
Norway
Message 929215 - Posted: 28 Aug 2009, 7:39:56 UTC - in response to Message 929185.

Um ... Correct me if I'm wrong, but isn't "the client" written by the same people in Berkeley who send the messages to defer communications?

In a very real sense, no.

BOINC is a project to write a universal volunteer computing platform.

SETI@Home is a volunteer computing platform that uses BOINC.

They're friendly, they probably help each other a lot, and some are at least listed as part of both projects, but they're different things.

... and BOINC is project-neutral.

Well, as long as a project doesn't start re-coding the scheduling-server, at least in this instance BOINC is writing both the client-side and server-side...

And it is a fact in this instance that the client doesn't know why it's asked to "defer for N hours": whether it's due to the project being down (getting 1 hour), due to the user reaching the daily quota (up to 24 hours of deferral), or due to the project's setting of "don't re-connect for N seconds". The client, if left alone, will treat them all the same, and will defer for N seconds (or N hours...).

The user, on the other hand, will see the difference between "don't re-connect for N seconds, since if you do, the server will deny you any work anyway", and "oops, I have now manually fixed the problem that made the quota take a dive, so I will report the finished results so I can get new work"...

The suggested client change would take away the user's option of re-connecting in the second instance, something that could lead to many hours of downtime after the user has fixed the problem.

Well, of course BOINC could change both client-side and server-side, so you'll get messages like "Quota exceeded, don't re-connect for (example) 6 hours, 4 minutes and 23 seconds... Oh, and disable the 'update' button for the project for... 11 seconds".
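
A minimal sketch of that two-duration idea in Python. It assumes a hypothetical scheduler reply that carries a reason code plus two separate timers; none of these names come from the actual BOINC client or server code, they only illustrate keeping the long work deferral apart from the short button lockout.

```python
from dataclasses import dataclass
from enum import Enum


class DeferReason(Enum):
    # Hypothetical reason codes; per the post, the real client only gets a delay.
    MIN_INTERVAL = "min_interval"      # "don't re-connect for N seconds"
    QUOTA_EXCEEDED = "quota_exceeded"  # daily quota reached
    PROJECT_DOWN = "project_down"      # project unavailable


@dataclass
class SchedulerReply:
    reason: DeferReason
    defer_seconds: int        # how long automatic contacts are deferred
    button_lock_seconds: int  # how long the manual Update button stays greyed out


def update_button_enabled(reply: SchedulerReply, seconds_since_reply: float) -> bool:
    """The button depends only on the short lockout, never on the long deferral."""
    return seconds_since_reply >= reply.button_lock_seconds


def automatic_contact_allowed(reply: SchedulerReply, seconds_since_reply: float) -> bool:
    """Automatic scheduler requests still honour the full deferral."""
    return seconds_since_reply >= reply.defer_seconds


# The example from the post: quota exceeded for 6 h 4 min 23 s, button locked for 11 s.
reply = SchedulerReply(DeferReason.QUOTA_EXCEEDED, 6 * 3600 + 4 * 60 + 23, 11)
```

With a split like this, a user who has fixed a quota problem gets the button back after 11 seconds even though the client itself would otherwise wait out the full deferral.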

But, frankly, 11 seconds is so short an interval that, at least in my opinion, it's a waste of time to make this change...


BTW, AFAIK the 24-hour deferrals due to "not enough free disk space", "not enough memory", "no application for your OS/platform" and so on have been removed from the scheduling-server. But it's maybe still possible to reach a project that runs very old server code, and nothing stops BOINC from re-introducing something like this. And if code like this re-appears, you definitely don't want users to be blocked from re-connecting after fixing a problem like "oops, setting disk-parameters to leave 100 GB free wasn't my intention, I wanted 1 GB free".


____________
"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."

Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2297
Credit: 2,999,007
RAC: 123
United States
Message 929247 - Posted: 28 Aug 2009, 13:19:26 UTC
Last modified: 28 Aug 2009, 13:28:14 UTC

Ok, here is a thought...

User XYZZY has connected to, and is actively processing WU for projects ABC, BCD, and DEF.

BOINC has issued new client-side software with the "Update All Projects" item on the context menu:

+------------------------------------+
| Open BOINC manager                 |
+------------------------------------+
| Snooze                             |
+------------------------------------+
| * Use GPU while computer is in use |
| Update all projects                |
| Do network communications          |
+------------------------------------+
| About BOINC Manager                |
+------------------------------------+
| Exit                               |
+------------------------------------+

Now user XYZZY is fond of keeping everything up-to-date, so s/he often clicks on the "Update all projects" button which causes the BOINC client to send Update messages to all of the connected projects.

Each of the three projects keeps a record of the last (5?) times when each user has contacted the project to do updates, and each of the three projects has code that decides when the user is overusing the update facility.

When that threshold is reached, then that project responds to the next update request with "Please Don't Do That Again For x Seconds/Minutes/Hours", which the BOINC client will respect.
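
The project-side half of this proposal could be sketched roughly as below. This is only an illustration of the idea in the post, not anything from the BOINC server: the class name is hypothetical, the history length of 5 comes from the "(5?)" above, and the 60-second window and 300-second deferral are made-up placeholder values.

```python
from collections import deque


class UpdateThrottle:
    """Per-user record of recent update requests, kept by one project."""

    def __init__(self, history=5, window_seconds=60.0, defer_seconds=300):
        self.window_seconds = window_seconds
        self.defer_seconds = defer_seconds
        # Timestamps of the last `history` updates; older ones fall off the left.
        self.recent = deque(maxlen=history)

    def on_update(self, now):
        """Record an update at time `now`; return a deferral in seconds, or 0."""
        self.recent.append(now)
        history_full = len(self.recent) == self.recent.maxlen
        if history_full and now - self.recent[0] < self.window_seconds:
            # All remembered updates fell inside the window: "Please Don't Do That Again".
            return self.defer_seconds
        return 0
```

For example, five updates at 0, 10, 20, 30 and 40 seconds trigger the deferral on the fifth one, while a later update at 200 seconds does not, because the oldest remembered update is by then more than 60 seconds old.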

When a project has responded with such a deferral, then BOINC client will not include that project in any future "Update all projects" requested by the user until the deferral period has elapsed.

If the user's use of the "Update all projects" button has been deferred by all of the attached projects, the BOINC client will cause the button to respond to the user with a message such as "Your attached projects have requested that you not do that again until ..." and list the projects and the date and time when each project will allow user XYZZY to contact the project again.

If user XYZZY is trying to debug some issue with his or her connection to project ABC by contacting ABC to explain the problem and an adviser at project ABC has responded and told the user to try some fixes and then use the Update button to test the fix, the adviser will set a temporary flag on the user's account to prevent the user from getting any "Please Don't Do That Again ..." messages for a period of time.

If the user is able to connect to the other projects with the Update button, and those projects respond with "Please Don't Do That Again ..." messages, then those projects will not be bothered by the user's BOINC client during the testing, but the BOINC client will not disable the Update button because the user has not received any such messages from project ABC.

This puts the "Please Don't Do That Again ..." messages under the control of the projects, and puts the disabling of the Update button under the control of the BOINC client, while still allowing use of the Update button for debugging purposes.

[edit]
Oh, and by the way, if any project has not implemented the "Please Don't Do That Again ..." messages, then that project will always get those Update messages and the project's users will never get the "Your attached projects have requested that you not do that again..." message.
[/edit]
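
The client-side half could look something like the following sketch. All names here are hypothetical and times are plain seconds on an arbitrary clock; the point is only that each project's deferral is tracked separately and the button is refused only when every attached project has asked for a pause.

```python
class UpdateAllTracker:
    """Client-side bookkeeping for a hypothetical "Update all projects" button."""

    def __init__(self, projects):
        # A project that never sends a deferral keeps the value 0 and is always contacted.
        self.deferred_until = {name: 0.0 for name in projects}

    def record_deferral(self, project, now, seconds):
        """Store a "Please Don't Do That Again For x Seconds" reply from one project."""
        self.deferred_until[project] = now + seconds

    def projects_to_contact(self, now):
        """Projects that a click on "Update all projects" would contact right now."""
        return [p for p, until in self.deferred_until.items() if now >= until]

    def button_message(self, now):
        """Return None if the click goes ahead, else the message shown to the user."""
        if self.projects_to_contact(now):
            return None
        lines = [f"{p}: not before t={until:.0f}"
                 for p, until in sorted(self.deferred_until.items())]
        return ("Your attached projects have requested that you not do that again until ...\n"
                + "\n".join(lines))
```

A project that has flagged the user's account for debugging simply never sends a deferral, so it stays in the contact list and the button keeps working.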
____________
FireFox Personas


hbomber
Volunteer tester
Joined: 2 May 01
Posts: 437
Credit: 50,852,854
RAC: 31
Bulgaria
Message 929388 - Posted: 29 Aug 2009, 2:24:57 UTC
Last modified: 29 Aug 2009, 2:48:44 UTC

I modified the sources and compiled BOINC Manager 6.6.36 for Windows x86 (my x64 development rig has dead memory), so that a "Use GPU" option appears in the context menu and does the same thing as the checkbox in the Preferences dialog. It works OK and shows checked or unchecked, depending on the current preferences.
I'm not sure under what circumstances I can distribute a modified executable to other people, or whether I have the right to do so.
If it is okay, I'll put it up for public download. Let me know if anyone is interested.
____________

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 929390 - Posted: 29 Aug 2009, 2:31:29 UTC - in response to Message 929247.

When that threshold is reached, then that project responds to the next update request with "Please Don't Do That Again For x Seconds/Minutes/Hours", which the BOINC client will respect.

For the case where this is most needed, connecting to the project servers will fail -- it is the extreme overload that's hard.

The rest of this (trying to slow up users so they don't push the project into overload to start with) is a little bit like rearranging deck chairs on the Titanic.

... and solving the real problem is very difficult.

____________

msattler
Project donor
Volunteer tester
Joined: 9 Jul 00
Posts: 38667
Credit: 572,453,124
RAC: 587,081
United States
Message 929475 - Posted: 29 Aug 2009, 16:35:12 UTC - in response to Message 929390.

When that threshold is reached, then that project responds to the next update request with "Please Don't Do That Again For x Seconds/Minutes/Hours", which the BOINC client will respect.

For the case where this is most needed, connecting to the project servers will fail -- it is the extreme overload that's hard.

The rest of this (trying to slow up users so they don't push the project into overload to start with) is a little bit like rearranging deck chairs on the Titanic.

... and solving the real problem is very difficult.

Again I will say......
The tiny fraction of a percent of users that have their finger on the buttons at any given time could not crash the server even if they tried......
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5774
Credit: 57,410,899
RAC: 48,626
Australia
Message 929599 - Posted: 30 Aug 2009, 1:23:17 UTC - in response to Message 929475.

Again I will say......
The tiny fraction of a percent of users that have their finger on the buttons at any given time could not crash the server even if they tried......

Nope.
But it does mean it takes longer for people to be able to return work, longer for them to get new work, longer for them to report work.
It just makes the system load heavier for longer than it needs to be.
____________
Grant
Darwin NT.

Profile [B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1139
Credit: 3,602,362
RAC: 4,100
United Kingdom
Message 929674 - Posted: 30 Aug 2009, 11:18:05 UTC

I would like a page where all the errors that can happen are explained in layman's terms. Is it in the Wikipedia?
____________

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13562
Credit: 29,737,137
RAC: 16,636
United States
Message 929771 - Posted: 30 Aug 2009, 18:57:43 UTC - in response to Message 929599.

Again I will say......
The tiny fraction of a percent of users that have their finger on the buttons at any given time could not crash the server even if they tried......

Nope.
But it does mean it takes longer for people to be able to return work, longer for them to get new work, longer for them to report work.
It just makes the system load heavier for longer than it needs to be.


Agreed. I don't think I've ever suggested the idea that those who abuse the Update button have caused irreparable harm to the servers (in fact, I think those who abuse the button bring this point up so they do not feel guilty about abusing it), but rather, like you said, it still causes unnecessary load and makes things that much worse when trying to get through. It's like people cutting in line: there are so few who do it that it doesn't increase line wait times considerably, but it still means that others have to wait that much longer because you (collectively) feel that you're more important than everyone else and must get yours in immediately.

I just think it's selfish and inconsiderate.
____________

Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2297
Credit: 2,999,007
RAC: 123
United States
Message 930078 - Posted: 1 Sep 2009, 1:51:39 UTC
Last modified: 1 Sep 2009, 1:58:21 UTC

More wishes.


  • Savable/Loadable Profiles with items like ...

    • Use at most X% of the CPU time.
    • Use at most Y% of the GPU time.
    • Use at most Z% of the processors.


  • Command-line method to load a profile.

These profiles can be useful when setting up BOINC for special situations that come up often enough that it becomes tedious to do it all from scratch each time.

For example, the "Triple 100 Days", when


  • Ambient temperatures are over 100F.
  • The fans are all going at 100%.
  • The GPU temperature is over 100C.



With a monitor program that can run some command when temps get to some level, one can automate the process of slowing BOINC down to keep the processors from going super-critical.
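
As a rough sketch of what such a monitor might do, assuming the wished-for profiles existed: the profile fields mirror the three bullet items above, while the profile names, the temperature thresholds and the percentages are invented for illustration. Loading the chosen profile is left out, since the command-line method is itself only a wish here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    cpu_time_pct: int    # "Use at most X% of the CPU time"
    gpu_time_pct: int    # "Use at most Y% of the GPU time"
    processors_pct: int  # "Use at most Z% of the processors"


# Hypothetical saved profiles.
PROFILES = {
    "normal": Profile("normal", 100, 100, 100),
    "hot_day": Profile("hot_day", 60, 50, 50),
    "critical": Profile("critical", 20, 0, 25),
}

# (GPU temperature threshold in degrees C, profile name), hottest rule first.
RULES = [(100.0, "critical"), (85.0, "hot_day")]


def pick_profile(gpu_temp_c: float) -> Profile:
    """Return the profile for the first rule whose threshold is reached."""
    for threshold, name in RULES:
        if gpu_temp_c >= threshold:
            return PROFILES[name]
    return PROFILES["normal"]
```

The monitor program would call something like `pick_profile` each time it reads the temperature and then ask BOINC to load the returned profile by name.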
____________
FireFox Personas


Profile Ageless
Joined: 9 Jun 99
Posts: 12280
Credit: 2,566,923
RAC: 804
Netherlands
Message 930117 - Posted: 1 Sep 2009, 5:42:59 UTC - in response to Message 930078.

* Use at most Y% of the GPU time.

There's no reliable (cross-platform) way available yet to throttle a GPU. It can only do all or not much, or in other words, it can do intricate calculations or it can show you what's happening on your desktop. So you may scratch that one.
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2297
Credit: 2,999,007
RAC: 123
United States
Message 930124 - Posted: 1 Sep 2009, 6:56:47 UTC - in response to Message 930117.
Last modified: 1 Sep 2009, 7:00:44 UTC

* Use at most Y% of the GPU time.

There's no reliable (cross-platform) way available yet to throttle a GPU. It can only do all or not much, or in other words, it can do intricate calculations or it can show you what's happening on your desktop. So you may scratch that one.


It seems to me that if the user wants less than 100% GPU use, then sending a batch of data to be crunched on for a while, then waiting for a bit before sending the next batch, would do the trick.

After all, that is what the games do when they are showing stuff on the screen.

Not that they deliberately wait, just that something has to happen, like a rocket moving a little way across the screen towards your face . . . or feet.
____________
FireFox Personas


Profile Ageless
Joined: 9 Jun 99
Posts: 12280
Credit: 2,566,923
RAC: 804
Netherlands
Message 930139 - Posted: 1 Sep 2009, 9:12:06 UTC - in response to Message 930124.

It seems to me that if the user wants less than 100% GPU use, then sending a batch of data to be crunched on for a while, then waiting for a bit before sending the next batch, would do the trick.

It isn't that easy. A task will be translated by the application running on the CPU into kernels which are sent to the videocard's memory, where they wait until they're done by the GPU. When done, that data is sent back to the application running on the CPU, which will translate it back into something it can write to disk. Any lull in this to-and-fro transfer can break the task or stop BOINC receiving a heartbeat from the application, which will crash the task.

After all, that is what the games do when they are showing stuff on the screen.

Most games of late use large maps (750MB and more), of which parts are constantly moved into and out of the videocard's memory, waiting until they're being used or being discarded (unceremoniously dumped out of memory).

You can't really compare how a game is run and how CUDA is done, as games use a lot more functionality on the videocard, including registers for DirectX, pixel and vertex shaders, (anti-)aliasing, where to place what pixel and in what color, etc.
Most of those aren't used when doing CUDA calculations.
____________
Jord

Fighting for the correct use of the apostrophe, together with Weird Al Yankovic

Profile Steven Meyer
Joined: 24 Mar 08
Posts: 2297
Credit: 2,999,007
RAC: 123
United States
Message 930168 - Posted: 1 Sep 2009, 12:51:14 UTC - in response to Message 930139.
Last modified: 1 Sep 2009, 12:52:18 UTC

It seems to me that if the user wants less than 100% GPU use, then sending a batch of data to be crunched on for a while, then waiting for a bit before sending the next batch, would do the trick.

It isn't that easy. A task will be translated by the application running on the CPU into kernels which are sent to the videocard's memory, where they wait until they're done by the GPU. When done, that data is sent back to the application running on the CPU, which will translate it back into something it can write to disk. Any lull in this to-and-fro transfer can break the task or stop BOINC receiving a heartbeat from the application, which will crash the task.

After all, that is what the games do when they are showing stuff on the screen.

Most games of late use large maps (750MB and more), of which parts are constantly moved into and out of the videocard's memory, waiting until they're being used or being discarded (unceremoniously dumped out of memory).

You can't really compare how a game is run and how CUDA is done, as games use a lot more functionality on the videocard, including registers for DirectX, pixel and vertex shaders, (anti-)aliasing, where to place what pixel and in what color, etc.
Most of those aren't used when doing CUDA calculations.


And yet, the games move data into memory on the GPU when it needs to be there, and then back out of the GPU's memory when that is needed, all without losing anything, or crashing, or such. S@H would seem to be a far simpler process.

I know that all this moving data around would slow down the processing of the S@H task, but then, that is the whole point isn't it? When the GPU is overheating at 100% usage, you will want to reduce that to maybe 50% usage.

Maybe the percentage for the GPU usage will have to have some large granularity. Maybe it would have to allow just 5 values: 0%, 25%, 50%, 75%, or 100%.

Currently we do not even have the 0% option. Short of suspending every CUDA task, or suspending BOINC completely, there doesn't seem to be any way to temporarily turn off usage of the GPU.

Maybe it will be difficult to make a GPU throttle happen, but I think that it is time to make it happen.

Have you ever considered the possibility of using the heartbeat as an opportunity to put a pause into the CUDA code? After each heartbeat it could wait for some time period before it continues.

Seconds 0 ... 1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 .. 11 .. 12
        ..Crunching data.. ...Cooling Down... ..Crunching data.. ...Cooling Down...
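
The arithmetic behind that crunch/cool-down pattern can be sketched as below. This only computes the timing; it does not touch a GPU, and it ignores the concern raised earlier in the thread that pauses might break a running task. The function names and the 6-second period (read off the diagram above) are assumptions for illustration.

```python
# The five coarse settings suggested in the post.
ALLOWED_STEPS = (0, 25, 50, 75, 100)


def snap_to_step(percent: float) -> int:
    """Round a requested GPU percentage to the nearest allowed step."""
    return min(ALLOWED_STEPS, key=lambda step: abs(step - percent))


def duty_cycle(percent: float, period_seconds: float = 6.0):
    """Return (crunch_seconds, cool_down_seconds) for one period."""
    step = snap_to_step(percent)
    crunch = period_seconds * step / 100.0
    return crunch, period_seconds - crunch
```

At 50% this gives three seconds of crunching followed by three seconds of cooling down, and at 0% the GPU is never given work at all.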

____________
FireFox Personas


wlh2008
Joined: 25 Aug 09
Posts: 8
Credit: 11,432
RAC: 0
United States
Message 934267 - Posted: 18 Sep 2009, 14:30:22 UTC

I am new, but I think it would be nice if the server would provide a way to allow the client to request that particular WUs be resent in case of a local error which wipes out current work units. An example would be where a system crash causes the OS to zap certain inodes in Linux/Unix on reboot due to the instance being dirty and those inodes are connected to WUs.

wlh2008

OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 13562
Credit: 29,737,137
RAC: 16,636
United States
Message 934279 - Posted: 18 Sep 2009, 14:45:51 UTC - in response to Message 934267.

I am new, but I think it would be nice if the server would provide a way to allow the client to request that particular WUs be resent in case of a local error which wipes out current work units. An example would be where a system crash causes the OS to zap certain inodes in Linux/Unix on reboot due to the instance being dirty and those inodes are connected to WUs.

wlh2008


There is an option on the server side that allows for this and many other projects use this function. SETI@Home once used it as well, but due to the sheer size of the project and the bandwidth/resources required to enable this option, the SETI Admins were forced to turn it off.
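
In principle, such a resend option boils down to comparing two lists. The sketch below is an illustration only, not the server's actual code; the function name and the work unit names are hypothetical.

```python
def find_lost_tasks(server_in_progress, client_reported):
    """Tasks the server believes the host holds but the host did not report."""
    return sorted(set(server_in_progress) - set(client_reported))


# The server thinks the host has three tasks; after the crash the host only lists one.
lost = find_lost_tasks(["wu_1", "wu_2", "wu_3"], ["wu_2"])
```

Here `lost` contains `wu_1` and `wu_3`, which the project would then send again.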
____________


Copyright © 2014 University of California