Making two BOINC projects play nice together

Cheopis

Joined: 17 Sep 00
Posts: 156
Credit: 18,451,329
RAC: 0
United States
Message 1620829 - Posted: 30 Dec 2014, 22:10:08 UTC
Last modified: 30 Dec 2014, 22:11:26 UTC

OK, Asteroids@home is dominating my machine, and I can't get any SETI work done.

I have SETI@home set for 100 resource share
I have Asteroids@home set for 5 resource share

Asteroids@home average work is @28,000
SETI@home average work is @2,600

Asteroids@home has completed @1,000,000 work units
SETI@home has completed @11,000,000 work units

Despite these numbers, Asteroids@home is using every available resource in my machine and letting SETI have none.

It's clear that something here is terribly broken.

First, and most importantly, does anyone know a rock-solid method of ensuring that two projects devote a proportion of time that I want devoted to each project, without needing to constantly monitor things? I want to set it, and forget it.

Secondly, and this is NOT critical, the only time when I want the machine to ignore the distribution of CPU time is when there is NO work available from one or the other projects. When that happens, up to 100% of resources can go to whatever project has work.
ID: 1620829
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22200
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1620836 - Posted: 30 Dec 2014, 22:28:08 UTC

One thing to grasp is that BOINC works on a long time base: the work share is based on a long-term average, not on short-term resource use.
Additionally, until a steady state is reached, the project with the shorter deadlines will often "swamp" a cruncher with work until the "deficit" at the other projects grows too high.
And third, SETI is suffering from a shortage of work to send out just now, so even if everything else were in balance you wouldn't be getting a full quota from S@H :-(
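
To make the "long-term average" idea concrete, here is a minimal sketch of an exponentially decaying average. It is not BOINC's actual code: the function name `decayed_average` and the 10-day half-life are assumptions picked purely for illustration.

```python
def decayed_average(old_avg, new_rate, dt_days, half_life_days=10.0):
    """Blend a new work-rate sample into a running average.

    half_life_days is an assumed value for illustration only.
    """
    # Fraction of the old average that survives after dt_days.
    weight = 0.5 ** (dt_days / half_life_days)
    return old_avg * weight + new_rate * (1.0 - weight)


# A project that did nothing for a long time (average 0) and then runs
# flat out at rate 100 is only halfway there after one half-life.
avg = decayed_average(0.0, 100.0, dt_days=10.0)
print(avg)
```

The point of the sketch is only that a burst of work from one project takes days to show up in, or fade from, an average like this, which is why short-term behaviour can look nothing like the configured shares.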
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1620836
shizaru
Volunteer tester

Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1620840 - Posted: 30 Dec 2014, 22:38:36 UTC - in response to Message 1620829.  
Last modified: 30 Dec 2014, 22:48:39 UTC

I have SETI@home set for 100 resource share
I have Asteroids@home set for 5 resource share

Asteroids@home average work is @28,000
SETI@home average work is @2,600


Achtung!: "Is your machine plugged in" type question follows

Are you 100% sure you don't have the resource share the other way around? i.e. Astro@100 and SETI@5?

One really good thing about credit new* is that it will look at RAC for resource share, so projects with ridiculous GFLOPs claims essentially get penalized. However, I have NO idea if Asteroids is an overclaimer, and I'm not saying that it is.

Anyway, if you put both @100 RS, for example, then they'll eventually have a similar RAC (when both projects have WUs flowing, of course).

*Actually, it could be a newer version of BOINC, but I have no idea right now. Too ill to think straight :)
ID: 1620840
BONNSaR

Joined: 9 Nov 04
Posts: 38
Credit: 21,538,589
RAC: 9
Australia
Message 1620843 - Posted: 30 Dec 2014, 22:51:52 UTC

Are you running Asteroids and Seti on GPU or CPU? I run both and have one GPU dedicated to Asteroids and the other GPU dedicated to Seti MB. Is this what you are trying to achieve?
ID: 1620843
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1620853 - Posted: 30 Dec 2014, 23:16:28 UTC

I think I may have tracked this one down a few days ago, and the history is in this thread (Bonic Bug??). Currently I don't see a good way around it when you are mixing CPU and GPU in a single REC value. The GPU and CPU each need their own REC, but that would affect a fair amount of code.
ID: 1620853
Cheopis

Joined: 17 Sep 00
Posts: 156
Credit: 18,451,329
RAC: 0
United States
Message 1620944 - Posted: 31 Dec 2014, 2:47:57 UTC
Last modified: 31 Dec 2014, 2:48:26 UTC

Thanks for the answers folks, and here are some clarifications.

1) I have work for both projects queued. It's not a matter of getting work. I have SETI work to do, but Asteroids is taking all the processor time.

2) I am absolutely certain about my resource share being set (100 SETI) and (5 Asteroids)

**Disclaimer - NONE of the below is aimed at the SETI team, or the Asteroids team. I recognize that these are not the BOINC forums. I'm venting here, and if no solution is forthcoming, then I'll go over to the BOINC forums and play a little nicer there since I've already vented here, heh. **

Right now, the way things are currently set, BOINC should be requesting 20x more SETI work than Asteroids work, regardless of any deadlines. I've had this issue before, years ago, trying to make another project work with SETI. BOINC did the same thing then. It's almost like the BOINC folks don't recognize that I'm donating my computer time here, and I want to say how much work it does for any given project.

If I cannot make this work to my satisfaction, I will once again pull myself out of alternate projects and only work a single project. SETI. Forcing me to put more work into a project I want to put less work into will just make me stop doing anything for that project.
ID: 1620944
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1620948 - Posted: 31 Dec 2014, 3:02:29 UTC - in response to Message 1620944.  

Thanks for the answers folks, and here are some clarifications.

1) I have work for both projects queued. It's not a matter of getting work. I have SETI work to do, but Asteroids is taking all the processor time.

2) I am absolutely certain about my resource share being set (100 SETI) and (5 Asteroids)

**Disclaimer - NONE of the below is aimed at the SETI team, or the Asteroids team. I recognize that these are not the BOINC forums. I'm venting here, and if no solution is forthcoming, then I'll go over to the BOINC forums and play a little nicer there since I've already vented here, heh. **

Right now, the way things are currently set, BOINC should be requesting 20x more SETI work than Asteroids work, regardless of any deadlines. I've had this issue before, years ago, trying to make another project work with SETI. BOINC did the same thing then. It's almost like the BOINC folks don't recognize that I'm donating my computer time here, and I want to say how much work it does for any given project.

If I cannot make this work to my satisfaction, I will once again pull myself out of alternate projects and only work a single project. SETI. Forcing me to put more work into a project I want to put less work into will just make me stop doing anything for that project.

The other little nasty is hurry-up mode. If it looks like a work unit may not be processed before its deadline, all other projects are locked out until the work unit is processed. I had this problem with World Community Grid because I had short turnaround times, so I received units whose earlier results had failed to agree, and these came with very short deadlines. In the end it balances out, but I wasn't happy about it. I haven't seen the problem with the new computer because it turns records around so fast, but it is something to watch out for if a project has units that take a bunch of time to process.
ID: 1620948
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1621218 - Posted: 31 Dec 2014, 11:45:25 UTC - in response to Message 1620944.  
Last modified: 31 Dec 2014, 12:00:02 UTC

Thanks for the answers folks, and here are some clarifications.

1) I have work for both projects queued. It's not a matter of getting work. I have SETI work to do, but Asteroids is taking all the processor time.

You have 175 Seti tasks here atm. With Seti having a 100-task limit per device, once Boinc fills up a device with its 100 Seti tasks,
it'll then only be able to get tasks from other projects. That should be Nvidia GPU tasks, but since you've got your host hidden at Asteroids I can't be sure.

2) I am absolutely certain about my resource share being set (100 SETI) and (5 Asteroids)

Putting the resource share too low will guarantee that Asteroids is run first, because otherwise there won't be enough time available to get its tasks done by deadline.
Putting your cache settings too high will also guarantee that Asteroids is run first.
With Asteroids having a 10 day deadline, anything above about 5 days minimum buffer size will mean Asteroids is run first.
(The minimum buffer size also tells Boinc how many days the host may be offline, meaning that 5 days of work needs to be done five days early to meet the deadline.)
Or a combination of the two.
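
The deadline-versus-buffer reasoning above can be sketched as a rough rule of thumb. This is a deliberate simplification and not the BOINC client's real scheduling code; the function name `needs_high_priority` and its parameters are hypothetical.

```python
def needs_high_priority(deadline_days, buffer_days, queued_work_days):
    """Rough model: would the queued work miss its 'early' deadline?

    The buffer is treated as time the host may be offline, so the work
    has to be finished buffer_days before the real deadline.
    """
    effective_window = deadline_days - buffer_days
    return queued_work_days > effective_window


# 10 day deadline, 6 day buffer filled with 6 days of work: only 4 days
# are left to do 6 days of work, so these tasks jump the queue.
print(needs_high_priority(10, 6, 6))

# 10 day deadline, 2 day buffer filled with 2 days of work: plenty of room.
print(needs_high_priority(10, 2, 2))
```

When the queue is as long as the buffer, this model tips over once the buffer exceeds half the deadline, which matches the "about 5 days" figure for a 10 day deadline.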

Boinc 7.4.24 and later no longer report high priority as a task status, so you won't be able to see why they're running first:

http://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commit;h=bca7c006deae20cc31be20fc37396bb4c0cfafc2
Manager: omit ", high priority" from task status. This makes it sound like BOINC is running the job at high OS priority.

Boinc 7.4.28 and later will report High priority in the Event Log if you set cpu_sched_debug:

http://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commit;h=28f18bea30d819290ba647286fdec01763c53180
client: indicate "high-priority" tasks in event log (if cpu_sched_debug set)

So you could upgrade to Boinc 7.4.36, set the cpu_sched_debug flag, then check the Event Log to see if Asteroids tasks are running in high priority.
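
For anyone who hasn't set a log flag before, here is a small sketch that builds the XML for it. It assumes the flag goes in a `cc_config.xml` file in the BOINC data directory, inside a `<log_flags>` section; the helper name `build_cc_config` is made up, and you should check the layout against the BOINC client configuration docs for your version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml content with the given log flags switched on."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each flag is its own element with the text "1" meaning enabled.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


config_xml = build_cc_config(["cpu_sched_debug"])
print(config_xml)
```

After saving the output into the file, the client has to re-read its config files (or be restarted) before the extra lines show up in the Event Log.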

Claggy
ID: 1621218
Cheopis

Joined: 17 Sep 00
Posts: 156
Credit: 18,451,329
RAC: 0
United States
Message 1621259 - Posted: 31 Dec 2014, 13:08:01 UTC - in response to Message 1621218.  

Thanks Claggy!

I have unhidden my machine over at Asteroids. Not quite sure how that happened.

I also noticed that both projects are on the same account, but they do have different usernames. I'm not certain why that is. If it's potentially a problem, I'll try to figure out how to assign a username at Asteroids that matches the SETI username.

The resource share is making no sense to me. Maybe I'm just being silly.

If I say that I want my machine to resource share X work on one project and resource share Y work on another, then when BOINC reports to the project that I want ten days work for each, then I should be retrieving work from the projects based on the resource share I requested.

For example:

SETI @ 100 resource share
Asteroids @ 5 resource share.

BOINC calculates that my client should ask for:
SETI work equivalent to 95.24% of my machine's GPU capacity for ten days.
SETI work equivalent to 95.24% of my machine's CPU capacity for ten days.
Asteroids work equivalent to 4.76% of my machine's GPU capacity for ten days.
Asteroids work equivalent to 4.76% of my machine's CPU capacity for ten days.
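
Those percentages come straight from the two share values. A quick check, with `share_fractions` being a name invented for this example:

```python
def share_fractions(shares):
    """Turn raw resource-share numbers into fractions of the whole."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


fractions = share_fractions({"SETI@home": 100, "Asteroids@home": 5})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.2%}")
# SETI@home: 95.24%
# Asteroids@home: 4.76%
```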

BUT this doesn't appear to be working. I think what is happening is that BOINC is simply giving the total GPU & CPU computing capacity of my machine to BOTH projects, and BOTH projects are sending me ten days work like they were BOTH running on my machine with 100% of the resources.

If I am understanding this right, or mostly right, then it's quite simply broken. BOINC should only report to projects how much CPU/GPU time I am willing to commit to their projects, and then the projects provide me work based on that data. If that means that I run a couple Asteroids work units at high priority a few times per month, I have no problem with that.

Right now, what seems to be happening is that Asteroids is pushing everything REALLY hard. The deadlines I am currently looking at for Asteroids work start on 1/8, within 10 days, while the next SETI deadline I have is for 1/16.

Now, you might say that the method I propose above could still be gamed. You are right. However, if the client machine tracks the resource share that is actually devoted to each project in CPU/GPU time and usage% then the client could autocorrect.

Example:

I want two projects to do equal work on my machine.

Resource share is set to 100 and 100.

My client monitors CPU and GPU usage in terms of percentage and time.

If either project starts running ahead of the other by a significant margin, then it is auto-throttled by the client running on the host machine!! The discrepancy is then reported to BOINC so they can bring up the discrepancy with the team that's being a resource hog.

Clearly, if one project runs out of work, then the other project would NOT be throttled. When work is once again available on the other project, the counter would reset and percentage work distribution would once again be controlled and monitored.
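
As a sketch of the kind of client-side check I have in mind (this is my proposal written out, not anything BOINC does today; every name here, including the 10% margin, is hypothetical):

```python
def projects_to_throttle(shares, used_seconds, has_work, margin=0.10):
    """Return the projects running ahead of their share by more than margin."""
    active = [p for p in shares if has_work[p]]
    if len(active) < 2:
        return []  # only one project has work, so nothing gets throttled
    total_share = sum(shares[p] for p in active)
    total_used = sum(used_seconds[p] for p in active)
    if total_used == 0:
        return []  # nothing has run yet, so there is nothing to compare
    ahead = []
    for p in active:
        target = shares[p] / total_share      # fraction the user asked for
        actual = used_seconds[p] / total_used  # fraction actually consumed
        if actual - target > margin:
            ahead.append(p)
    return ahead


shares = {"A": 100, "B": 100}
used = {"A": 9000, "B": 1000}
print(projects_to_throttle(shares, used, {"A": True, "B": True}))
print(projects_to_throttle(shares, used, {"A": True, "B": False}))
```

The second call shows the "no work available" case: with only one project holding work, the list of projects to throttle is empty.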

I think it's about time I wander over to the BOINC forums and bounce this off them, especially the part about the client side resource control, and how much computing resources are reported to projects.
ID: 1621259
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1621263 - Posted: 31 Dec 2014, 13:26:06 UTC

Boinc is working pretty much as designed, but in my years of debugging other people's software, I've found that a design doesn't always work out to be the best way in the real world. There are weaknesses in BOINC that can be exploited by a project. Most of the time this exploitation isn't intentional; it happens because the project wants its data processed in a timely manner, which conflicts with a more relaxed project. The best way around this problem is running a really small queue of a day or less of work. With what you want, you also need to avoid projects with long run times if you aren't willing to give them a large part of your run time.

In short as we say all the time, it's not a bug, it's a design feature!!!!
ID: 1621263
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1621280 - Posted: 31 Dec 2014, 14:19:01 UTC - in response to Message 1621259.  
Last modified: 31 Dec 2014, 14:20:05 UTC

The resource share is making no sense to me. Maybe I'm just being silly.

If I say that I want my machine to resource share X work on one project and resource share Y work on another, then when BOINC reports to the project that I want ten days work for each, then I should be retrieving work from the projects based on the resource share I requested.

For example:

SETI @ 100 resource share
Asteroids @ 5 resource share.

BOINC calculates that my client should ask for:
SETI work equivalent to 95.24% of my machine's GPU capacity for ten days.
SETI work equivalent to 95.24% of my machine's CPU capacity for ten days.
Asteroids work equivalent to 4.76% of my machine's GPU capacity for ten days.
Asteroids work equivalent to 4.76% of my machine's CPU capacity for ten days.

Cache settings trump resource share when projects restrict work availability.

If you set Boinc to a cache of 10+0, Boinc will ask for work from Seti for 10 days (864,000 seconds).
Seti's scheduler will go: here's 75 tasks for your CPU, and here's 100 tasks for your Nvidia GPU (you're not having any more, you have enough).
Boinc will go: I have enough work for the CPU, but need another 5 days of work for the Nvidia GPU.
Boinc will ask each project in turn for 5 days of work. If Asteroids has the highest priority for work, it'll get work from that project.
Once it has the work, the scheduler in Boinc will decide which work to run.
Since you've managed to get 5 days of GPU work from Asteroids, Boinc will now try to get that 5 days of work done 10 days before deadline,
so potentially that 5 days of work needs to have been started 5 days ago to meet Boinc's perceived deadline.

Running a smaller cache will fix this: anything under Seti's ability to supply work, say 5+0 days; you could even do 3+2 days.
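
The arithmetic behind that walk-through, as a rough sketch. The helper names are invented here, and the "5 days supplied by Seti" figure is just the number used in the example above, not something the client reports in this form.

```python
SECONDS_PER_DAY = 86_400


def buffer_seconds(min_days, extra_days):
    """Size of the work request implied by a 'min + extra' cache setting."""
    return (min_days + extra_days) * SECONDS_PER_DAY


def shortfall_days(buffer_days, supplied_days):
    """Days of work still missing after the first project hits its task limit."""
    return max(0.0, float(buffer_days) - float(supplied_days))


print(buffer_seconds(10, 0))   # 864000, the figure quoted above
print(shortfall_days(10, 5))   # gap that gets filled from other projects
print(shortfall_days(5, 5))    # no gap once the cache fits what Seti can supply
```

With a 10+0 cache the gap is five days of GPU work, and that gap is what ends up being requested from Asteroids. With 5+0 (or 3+2) there is no gap to fill.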

Claggy
ID: 1621280
Cheopis

Joined: 17 Sep 00
Posts: 156
Credit: 18,451,329
RAC: 0
United States
Message 1621291 - Posted: 31 Dec 2014, 14:48:19 UTC

Thanks Claggy,

From what you are saying, there is nothing that can be fixed from my end. I was pretty sure about this after the last time around. There's a logical flaw in the BOINC system that is preventing me from distributing my computer resources as I see fit.

Resource share should be THE end-all-be-all decision making process. Period. The individual projects should not be allowed to hijack my machine and ignore my wishes.

Thank you everyone for your time. I've created a new thread over in the BOINC forums.
ID: 1621291
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1621359 - Posted: 31 Dec 2014, 18:16:42 UTC - in response to Message 1621291.  

From what you are saying, there is nothing that can be fixed from my end.

It is you that is causing Boinc on your host to fill up with work from Asteroids; it is entirely under your control what cache settings you set.
If you set a cache level that doesn't exceed Seti's capability of supplying work, then you'll find that Boinc will pick up tasks from Asteroids in smaller numbers.
Because of your cache settings you are causing all of that 5/100 Asteroids share to be done now. Boinc now won't ask for work from Asteroids until it has priority again.
Remember, resource share is long term, not short term:

12/31/2014 8:27:20 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 8:27:20 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 8:27:20 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 8:27:22 AM | Asteroids@home | Scheduler request completed
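
If you want to pull lines like these out of a saved Event Log, a small sketch follows. It assumes the `time | project | message` layout shown in the lines above; the function names are made up for illustration.

```python
SAMPLE_LOG = """\
12/31/2014 8:27:20 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 8:27:20 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 8:27:20 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 8:27:22 AM | Asteroids@home | Scheduler request completed
"""


def parse_event_line(line):
    """Split one 'time | project | message' line into a dict."""
    timestamp, project, message = (part.strip() for part in line.split("|", 2))
    return {"time": timestamp, "project": project, "message": message}


def work_fetch_refusals(log_text):
    """Return entries where the client chose not to ask for new tasks."""
    entries = (
        parse_event_line(line)
        for line in log_text.splitlines()
        if line.count("|") >= 2  # skip anything that is not a log entry
    )
    return [e for e in entries if e["message"].startswith("Not requesting tasks")]


for entry in work_fetch_refusals(SAMPLE_LOG):
    print(entry["project"], "->", entry["message"])
```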

So saying that there's nothing that you can do is just plain wrong.

Claggy
ID: 1621359
Brent Norman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1621362 - Posted: 31 Dec 2014, 18:22:14 UTC
Last modified: 31 Dec 2014, 18:25:41 UTC

Would it help to set Seti at a 5 day cache, and Asteroids to 1 day?
That should help with the deadline to report issue.

I'm not even sure if that is possible.
ID: 1621362
Aurora Borealis
Volunteer tester

Joined: 14 Jan 01
Posts: 3075
Credit: 5,631,463
RAC: 0
Canada
Message 1621364 - Posted: 31 Dec 2014, 18:32:11 UTC - in response to Message 1621362.  

Would it help to set Seti at a 3 day cache, and Asteroids to 1 day?
That should help with the deadline to report issue.

I'm not even sure if that is possible.

The cache is global. But, a smaller setting like 2 and 1 is better to let BOINC make quicker adjustments to the resource share. In any case, it will take BOINC a couple of weeks to find a balance every time you make a change to preferences. You can't expect BOINC to instantly reflect changes.
ID: 1621364
Dena Wiltsie
Volunteer tester

Joined: 19 Apr 01
Posts: 1628
Credit: 24,230,968
RAC: 26
United States
Message 1621379 - Posted: 31 Dec 2014, 19:19:03 UTC

Correct me if I am wrong, but if you request 10 days of work on each of your projects, you could have up to 40 days of work in your system. With that much work you will be hitting deadlines, and the work will most likely not be done the way you want. If you want to keep a bunch of work on hand, use a 2-3 day limit on your projects. I only run a single day's worth of work, and I sometimes find I have more work in my system than I want. With backup projects, it is rare to run out of work and have your system go cold. You might not always be crunching the balance you want, but it will work out over time.
ID: 1621379
HAL9000
Volunteer tester

Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1621382 - Posted: 31 Dec 2014, 19:25:18 UTC - in response to Message 1621379.  

Correct me if I am wrong, but if you request 10 days of work on each of your projects, you could have up to 40 days of work in your system. With that much work you will be hitting deadlines, and the work will most likely not be done the way you want. If you want to keep a bunch of work on hand, use a 2-3 day limit on your projects. I only run a single day's worth of work, and I sometimes find I have more work in my system than I want. With backup projects, it is rare to run out of work and have your system go cold. You might not always be crunching the balance you want, but it will work out over time.

When you specify the amount of work for BOINC to cache, it is the total number of days of work to cache, not the number per project. Cache settings of 5 days + 5 days would be 10 days of total work whether you were running 1 project or 50.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1621382
W-K 666 (Project Donor)
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1621402 - Posted: 31 Dec 2014, 20:23:16 UTC

Having run into these problems in the past, I'd ask you to please read and try to understand Dena's posts.

I found the only way to get resource allocations to work from day one, assuming all projects are up 24/7, is to run a cache size of ZERO.

With a small cache size BOINC will try to honour the resource setting, but it will probably take a long time if one of the projects has frequent downtime. Having Seti as the largest resource setting will probably be doomed to failure.
ID: 1621402
Jord
Volunteer tester

Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1621405 - Posted: 31 Dec 2014, 20:33:15 UTC - in response to Message 1620829.  

Asteroids@home has completed @1,000,000 work units
SETI@home has completed @11,000,000 work units

Apropos, no one took note of this yet, but the above is wrong. At Asteroids, you may have a million credits, while here at Seti you may have 11 million credits, but that doesn't mean BOINC ran a million and 11 million tasks respectively.

Asteroids pays 480 credit per task, Seti pays a variety depending on angle range and whether or not you run Multibeam or Astropulse. Multibeam can be 20s, 30s, 40s, 50s, 100s. Astropulse I've seen 500s and 600s. There is probably a lot of variety in between.

For Asteroids you ran 2,221 tasks.
For Seti at a wild guess 22,000 tasks (who can say).
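
As a sanity check on the credit-versus-task distinction, dividing total credit by credit per task gives a ballpark task count. The helper name is invented for this example, and the credit total is the rounded figure from the quoted post, so the result only roughly matches the real count.

```python
def estimated_tasks(total_credit, credit_per_task):
    """Ballpark number of tasks behind a credit total."""
    return round(total_credit / credit_per_task)


# About 1,000,000 Asteroids credits at 480 credits per task.
print(estimated_tasks(1_000_000, 480))  # 2083

# Going the other way: 2,221 tasks at 480 credits each.
print(2_221 * 480)  # 1066080
```

Either way the answer is in the low thousands of tasks, nowhere near a million.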
ID: 1621405
Cheopis

Joined: 17 Sep 00
Posts: 156
Credit: 18,451,329
RAC: 0
United States
Message 1621488 - Posted: 1 Jan 2015, 1:15:43 UTC - in response to Message 1621359.  
Last modified: 1 Jan 2015, 1:16:22 UTC

From what you are saying, there is nothing that can be fixed from my end.

It is you that is causing Boinc on your host to fill up with work from Asteroids; it is entirely under your control what cache settings you set.
If you set a cache level that doesn't exceed Seti's capability of supplying work, then you'll find that Boinc will pick up tasks from Asteroids in smaller numbers.
Because of your cache settings you are causing all of that 5/100 Asteroids share to be done now. Boinc now won't ask for work from Asteroids until it has priority again.
Remember, resource share is long term, not short term:

12/31/2014 8:27:20 AM | Asteroids@home | Sending scheduler request: To report completed tasks.
12/31/2014 8:27:20 AM | Asteroids@home | Reporting 1 completed tasks
12/31/2014 8:27:20 AM | Asteroids@home | Not requesting tasks: don't need (CPU: not highest priority project; NVIDIA GPU: not highest priority project)
12/31/2014 8:27:22 AM | Asteroids@home | Scheduler request completed

So saying that there's nothing that you can do is just plain wrong.

Claggy


In every way that I can calculate the resource "sharing" that is happening on my machine, Asteroids is getting 100% of my system resources when both projects have work in my cache.

Sorry, but where I grew up, that's not called sharing.

It should not matter what I have in my cache. If I say I want a 20:1 ratio of SETI to Asteroids work done, then that's what should be happening, if both projects have work in my cache. Anything else is broken.
ID: 1621488


 