Long term debt, again

PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 819762 - Posted: 17 Oct 2008, 17:25:43 UTC

I'm sure this has been hashed out several times here, but I can't get my arms around it.

I have 7 computers, each with a steadily growing long term debt (LTD). The back-up project doesn't get much work, but it does get a bit, and that keeps pushing my LTD scores up. Most of the computers have a debt in the millions. I have the resource share set at 0.1% for the back-up project, connect continuously to the net, and have a 1-day cache as default.

Boinc seems incapable of digging out of this hole. I have disabled new work on a couple of the slowest computers, but that didn't matter.

The back-up project will occasionally sync with its home-base, but in almost all cases it is denied new work due to the existing local seti queue. So I would guess I'm sending about 1 result back per month per cpu to the back-up project (the one with an increasing LTD).

Any ideas? Does boinc balance LTD only if the imbalance is small? It seems that something is broken.

Oh yes, I am running boinc 5.10.45 on some and 6.? on a couple. So I don't think that is the 'problem' here.
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 819769 - Posted: 17 Oct 2008, 17:39:23 UTC - in response to Message 819762.  

There appears to be a combination of parameters, a large cache (so that BOINC has to crunch one project to meet deadlines) together with widely varied resource shares (100 to 1 or higher), where BOINC gets caught in a real Catch-22.

In my opinion, if you run multiple projects, a very small cache (6 hours or less) is probably the best answer. You'll always have work, and the "shorty" issue you posted about in another thread will also be less pronounced.

I've tried to cause it to happen by messing with the duration correction factors, and I haven't hit on the deadly combination.

I don't know if anyone has completely confirmed the issue; I don't see it here.
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 819782 - Posted: 17 Oct 2008, 18:25:11 UTC - in response to Message 819769.  

Sigh....

We've been over this more times than I can remember.

The reason is that late-model CCs will always try to keep the overall work cache full, regardless of whether LTD says a project should get work at all.

Therefore, if any individual project has cache 'slack', and all other LTD eligible projects are unavailable or do not have sufficient cache slack based on the CI (or cache override setting), then the CC will pull a task from a project with high negative LTD.

Back in the day, the CC policy was to not draw tasks from a project with negative LTD unless there was no other choice. But of course the people who like to carry the absolute maximum amount of work in their cache were whining that this could mean the CC would deliberately let the cache run down in some circumstances.

As a result, the policy was changed, and thus we now have to deal with this 'Catch 22' effect where some hosts will routinely break from expected Resource Share behaviour, even if you run the lowest CI/CO settings possible.

Alinator
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 819793 - Posted: 17 Oct 2008, 18:48:03 UTC

While I decipher this, is there something I can/should do? Ned has a suggestion. Are there others?

Will the LTD number update if I select no-new-work for the back-up project? I seem to remember someone saying 'no', but I can't find that thread and can't decipher the boinc wiki very well yet (assuming the answer is there).
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 819820 - Posted: 17 Oct 2008, 19:32:59 UTC - in response to Message 819793.  

What projects are you crunching?
web03
Volunteer tester

Joined: 13 Feb 01
Posts: 355
Credit: 719,156
RAC: 0
United States
Message 819822 - Posted: 17 Oct 2008, 19:35:31 UTC

Looks like his backup is Einstein.
Wendy



1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 819830 - Posted: 17 Oct 2008, 19:48:09 UTC - in response to Message 819822.  

You're right, I didn't peek. Thanks.

There are two mechanisms in BOINC to deal with outages:

- You can cache work and crunch from the cache when a project is down.

- You can crunch more than one project.

If you have two or more diverse projects, you don't need much cache. For two or more, I'd probably set the "connect every 'x' days" setting to 0.5/projects, and set "extra days" to zero.

... and I'd be interested to hear if that lets the LTD settle. Probably take a while.
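
The rule of thumb above is just a division, but it can help to see what it gives for a few project counts. This is a minimal sketch; the function name is made up for illustration, and the only input taken from the post is "0.5/projects" for two or more projects.

```python
def suggested_connect_interval_days(num_projects: int) -> float:
    """Rule of thumb from the post: 0.5 days divided by the number of projects."""
    if num_projects < 2:
        # The post only states the rule for two or more projects.
        raise ValueError("rule of thumb is stated for two or more projects")
    return 0.5 / num_projects


for n in (2, 3, 4):
    days = suggested_connect_interval_days(n)
    print(f"{n} projects: connect every {days:.3f} days ({days * 24:.1f} hours)")
```

With "extra days" left at zero, two projects would work out to a quarter of a day between connections.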
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 819863 - Posted: 17 Oct 2008, 20:36:15 UTC - in response to Message 819830.  

While that helps keep the host from running out of work, it doesn't do anything to address the issue that PhonAcq is describing.

When talking about work fetch behaviour you have to take into account (not necessarily in priority order):

1.) Resource Share.

2.) CI and CO settings.

3.) Total Work Cache slack.

4.) Individual Project Cache slack.

5.) Project availability at the time of a work fetch request.

The problem is that operationally speaking, the CC tries to keep the Total Work Cache full at all times. It does this by taking on as much work as it can for all LTD eligible projects up to the CI/CO settings.

However, let's look at the case where the caches of all LTD eligible projects are 'full' except for the one which generates a work fetch, and the high negative LTD project is empty or has cache slack left (usually due to low resource share).

Further assume that the CC can't hit the scheduler when it generates the fetch.

What will happen is that the CC will end up pulling a task from the high negative LTD project to satisfy the 'Keep Total Cache Full' policy, even though this will force a break from Resource Share in the short term.

In addition to that, since the work fetch behaviour is to replace every task near completion with a new one for all projects which are LTD eligible (which is set by the -TSI gate), any time you can't hit the project or don't have enough individual project cache slack in another LTD eligible project the CC will resort to pulling a task from a high negative LTD project if possible.
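
The fallback described above can be sketched in a few lines. This is a toy model, not the actual BOINC client code: the `Project` fields, the function name, the assumption that debts are tracked in seconds, and the tie-break of picking the highest LTD are all assumptions made for illustration. Only the LTD > -TSI eligibility gate and the 'Keep Total Cache Full' fallback come from the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Project:
    name: str
    ltd: float        # long term debt in seconds (assumed unit); negative = over-served
    slack: float      # individual project cache slack in seconds
    reachable: bool   # could the project's scheduler be contacted right now?


def pick_fetch_target(projects: List[Project], total_slack: float,
                      tsi_seconds: float) -> Optional[Project]:
    """Pick the project to ask for work, or None if no request is made."""
    if total_slack <= 0:
        return None  # the total cache is already full
    candidates = [p for p in projects if p.reachable and p.slack > 0]
    eligible = [p for p in candidates if p.ltd > -tsi_seconds]
    # 'Keep Total Cache Full': if no LTD eligible project can supply work,
    # fall back to whatever can, even a high negative LTD project.
    pool = eligible or candidates
    return max(pool, key=lambda p: p.ltd) if pool else None


# Hypothetical host: the primary project is down, the backup is over-served.
seti = Project("seti", ltd=2_000_000.0, slack=3_600.0, reachable=False)
einstein = Project("einstein", ltd=-2_000_000.0, slack=80_000.0, reachable=True)

target = pick_fetch_target([seti, einstein], total_slack=3_600.0, tsi_seconds=3_600.0)
print(target.name if target else "no fetch")
```

In this made-up scenario the request goes to the over-served backup project simply because it is the only one that can be reached, which is the short-term break from Resource Share being described.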

So where does that leave it in terms of what you can do about it?

If the goal is to make a CC 'tweak' to implement an automatic 'fix', the answer is you can't do it completely. The reason is you have no way to predict most project outages or network connection difficulties (except for the weekly outage here at SAH). I know that for a fact, because I've tried.

The best you can do is to carry enough cache to cover at least a day or two, then make sure all your attached projects are up, have work, etc., and manually allow the CC to contact them periodically when conditions look favorable. Even then, this doesn't guarantee that things will go the way you expect every time.

Alinator

PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 819890 - Posted: 17 Oct 2008, 21:46:58 UTC - in response to Message 819863.  

Alinator seems to understand the issue. His remedy is what I've been trying for a couple of months: manually topping off my seti cache and leaving Einstein alone. It hasn't helped to reduce the LTD, it seems.

Two thoughts:
1) Does the LTD update when one of my two projects has been told not to request more work? If not, shouldn't it? The ability to request more work and the LTD seem orthogonal in nature, so why are they linked in the code?

2) The mechanism of CC/CI/CO described (I'm only guessing what these mean) is what I'm observing, I think. Yet it derives from some sort of binary decision of when to request additional work. That is, I've seen my system ask for less than 60 s of work (for a queue that is several days long). I surmise that it is trying to top off the tank. This seems too inflexible. Why not put a grey scale in there? Say, let it give the resource allocation preference until the deficit gets too great. That way I may or may not get seti WUs for a while, but will definitely not get einstein units until the tank drops below its maximum by a significant amount. At which point, the system goes into yellow alert and tries to find a willing gas station.

With the mechanism Alinator describes, I wonder if the LTD balancing works for anyone over the long term, or, conversely, whether it makes any sense at all to specify a resource share. That is, in time the LTD will get out of balance permanently, unless one manually intervenes. Right?
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 819915 - Posted: 17 Oct 2008, 22:32:15 UTC - in response to Message 819890.  
Last modified: 17 Oct 2008, 22:52:55 UTC

LOL...

Sorry, the acronyms are:

CC = Core Client

CI = Connect to Network Interval setting

CO = Work Cache Override setting

TSI = Task Switch Interval setting.

LTD Eligible Project = LTD > -TSI

To answer your first follow-up question: when you have a project set to No New Tasks (NNT), as long as there is work for it onboard, both STD and LTD update normally. Once the last task is completed, STD goes to zero and LTD stops updating and remains at the last value until there is work for the project again.
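
As a rough illustration of that NNT behaviour, here is a small update rule. The function name, its parameters, and the idea of applying fixed per-step deltas are assumptions for the sketch; how the client actually computes the debt changes is not covered here.

```python
def update_debts(std: float, ltd: float, std_delta: float, ltd_delta: float,
                 has_work_onboard: bool) -> tuple:
    """Return the new (STD, LTD) pair for a project set to No New Tasks."""
    if has_work_onboard:
        # Work still onboard: both debts keep updating normally.
        return std + std_delta, ltd + ltd_delta
    # Last task completed: STD goes to zero, LTD stays at its last value.
    return 0.0, ltd


print(update_debts(100.0, -5000.0, 10.0, -20.0, has_work_onboard=True))
print(update_debts(100.0, -5000.0, 10.0, -20.0, has_work_onboard=False))
```

The second call shows why setting NNT does not, by itself, work off a large debt once the project has run dry.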

As far as the rest goes, here's the rest of the story. This is the part that usually causes eyes to glaze over until you have thought about it for a significant amount of time. ;-)

To understand the complete work fetch behaviour, in addition to the basic parameters I mentioned before you also have to take into account the 'Tightness factor' of all the attached projects. This is defined as the ratio of the average runtime of the work compared to the deadline, and the other time metrics which impact how much work the fetch algorithm is going to request.

The really hard part is reconciling all the fetch parameters with the concept of Cache Slack.

Overall Cache Slack is the difference between your Work Cache Setting and the sum of the estimated runtimes of all the tasks onboard. This value determines whether the host needs to get new work from any project in the first place, and the policy is to try and keep the value as close to zero as possible.

The other part is the Individual Project Slack. This is defined as the difference between Work Cache Setting and the sum of the estimated runtimes for the tasks of that project. One important thing to remember here is that this value is calculated assuming the project is the only one running on the host.
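
The two slack definitions above are simple differences, sketched here with made-up numbers. The function names and the task runtimes are hypothetical; the only things taken from the text are the two formulas and the note that project slack is computed as if the project were alone on the host.

```python
def overall_slack(cache_setting_s: float, runtimes_by_project: dict) -> float:
    """Work Cache Setting minus the estimated runtimes of all tasks onboard."""
    onboard = sum(sum(runtimes) for runtimes in runtimes_by_project.values())
    return cache_setting_s - onboard


def project_slack(cache_setting_s: float, runtimes: list) -> float:
    """Work Cache Setting minus the estimated runtimes of one project's tasks,
    as if that project were the only one running on the host."""
    return cache_setting_s - sum(runtimes)


cache_s = 86_400.0  # a 1-day cache, in seconds
onboard = {
    "seti": [30_000.0, 30_000.0, 20_000.0],  # hypothetical estimated runtimes
    "einstein": [],                           # low-share project with nothing onboard
}

print(overall_slack(cache_s, onboard))             # 6400.0
print(project_slack(cache_s, onboard["seti"]))     # 6400.0
print(project_slack(cache_s, onboard["einstein"])) # 86400.0
```

Here the host as a whole still wants a little work, and the low-share project is the one showing by far the most individual slack.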

So I think you can see now that any combination of settings and circumstances which leads to a situation where a high negative LTD project is the one which has the most slack when a work fetch is requested and an LTD eligible one can't be reached will lead to a 'forced' DL from the negative one to satisfy the first policy.

The only way to accommodate this is to break from the Resource Share setting in order to not miss deadlines for the 'overloaded' project.

As I said before, this policy was implemented a couple of years ago to placate the noisy group who demanded that the cache be stuffed to the max at all times, no matter what.

Finally, to sum it up: the faster the host, the fewer the projects running, and the lower you set the CI and CO, the better the odds are that the CC will honor your resource share both in the short and long term. However, given that the policy is to keep the cache as full as possible at all times, there is no guarantee that it will maintain short-term compliance with resource share, especially for slower and/or part-time hosts as well as fast full-time hosts with big caches.

Alinator
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 820068 - Posted: 18 Oct 2008, 4:19:20 UTC - in response to Message 819890.  

So, if what you've done so far hasn't worked, then trying a very short cache couldn't hurt, could it?
Gary Charpentier
Volunteer tester

Joined: 25 Dec 00
Posts: 30651
Credit: 53,134,872
RAC: 32
United States
Message 820074 - Posted: 18 Oct 2008, 4:54:23 UTC - in response to Message 819915.  

Ah, so. Not naming a project that has now set its deadline so short that all work units from that project run in panic mode (on most machines). That explains why, when the work unit is done, it always fetches another and of course goes right back into panic mode. That despite the project share. Of course on a multi-core machine at least some other project work gets done.

I knew there had to be a flaw there, and it was the people screaming for high RACs, not science, that got the code broken so projects could fiddle with it.

Gary


PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 820165 - Posted: 18 Oct 2008, 13:49:23 UTC - in response to Message 820068.  

Yes, I did so already. It hasn't helped yet, but let's give it a couple of days. My default condition is now CI continuous and CO = 0 days.
Ace Casino

Joined: 5 Feb 03
Posts: 285
Credit: 29,750,804
RAC: 15
United States
Message 820178 - Posted: 18 Oct 2008, 14:19:08 UTC

The easiest way to do away with your LTD is to detach and reattach.

I’ve been fiddling with resource share and debt for a few months now. I’ve detached and reattached several times to get rid of LTD. No one gets hurt… Redundancy… remember.

I said the resource share should be based on 100%, it is….but it isn’t, from my experience. Here is what I mean:

Example 1

2 projects: 99% on one project, 1% on another project = 100%. This ratio is not good for what I want. I will download days, sometimes weeks, worth of work for the project with the 1% share, a project I only want to run a little. Even with this ratio the debt can climb and climb, especially if no work is available from the project, even for a short time.

Example 2

2 projects: 375% on one project, 1% on another project: 99.73% to .27% = 100%. This ratio is getting close to what I want. But the difference in these 2 ways of dividing resource share is amazing. I know 1 project is only getting .27% of resource share in this example as opposed to 1% with the other example, but dividing resource like example 2, is like night and day.
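
The arithmetic behind the two examples can be checked with a short sketch. It assumes only that resource share values are proportional weights (each project's fraction is its value over the sum of all values); the function name and the project labels are made up for illustration.

```python
def share_fractions(shares: dict) -> dict:
    """Convert raw resource share values into fractions of the total."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


example_1 = share_fractions({"main": 99, "backup": 1})
example_2 = share_fractions({"main": 375, "backup": 1})

for label, fractions in (("Example 1", example_1), ("Example 2", example_2)):
    parts = ", ".join(f"{name} {frac * 100:.2f}%" for name, frac in fractions.items())
    print(f"{label}: {parts}")
```

Entering 375 and 1 gives the backup project 1/376 of the total, about 0.27%, which matches the figures quoted for Example 2.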

Alinator, I know you know more about Boinc than most. I also see that you don’t have a dual or quad computer. Things might look one way on paper, but real world application might surprise you. To see hundreds of WU’s download when resource share is set to 1% amazes me. I hear all the time that Boinc has been set up for the average user to be as simple as possible. How many average users would know to raise the percent of resource share into the hundreds (example 500%) on one project and possibly a fraction of a percent on another to run a project part-time? 99% to 1% seems logical, but is not on a multiple core computer. Having single, dual and quad computers myself, I notice the ratio goes up exponentially with more cores (don’t say ya daa) I’m talking huge differences. Differences the average (and the above average) person could not see coming when setting resource share. Example 1 and 2 the difference is only .73% for the second project, but the way the resource share is set up (99% vs. 375%) is a gigantic difference.


This is why sometime in the future it would be nice to be able to divide resource share by the number of cores of your computer. The number of cores on computers keeps rising, so it seems a logical step to take at some point. 16 cores - 16 different projects (if that is what you choose), all running simultaneously at 100% resource share.
W-K 666
Volunteer tester

Joined: 18 May 99
Posts: 19062
Credit: 40,757,560
RAC: 67
United Kingdom
Message 820189 - Posted: 18 Oct 2008, 14:55:17 UTC

I am having the same problems with LTD that PhonAcq is seeing. The problem is running Seti as the main project and A.N.Other project as backup, combined with the unrequired fix to keep the cache full **. It means that every time Seti has a problem the cache is filled with units from the other project; then, because the scheduler sees it has work from both projects, it runs as per the resource settings and the LTD is never reduced. And it probably never will be unless Seti has several months without problems or the other project goes down for an extended period. If the A.N.Other project is Einstein, as it is in PhonAcq's and my case, that is probably never going to happen.

I want my resource share to be honoured. Resetting debts will not do this, and NNT and suspend don't allow debts to be adjusted.
So as a trial, I am suspending comms for several hours, so that any Seti hiccups don't force Einstein downloads. Then I check Seti is running OK, set Einstein to NNT, allow comms to upload and report all completed work, and refill the cache with Seti tasks. After the cache is refilled, I suspend comms again and set Einstein back to allow new tasks.

** If you see this, JM7, please get them to reset it back to the way you designed it.
Ignorance of others is not an excuse to introduce a non-required feature.
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 820192 - Posted: 18 Oct 2008, 15:21:02 UTC - in response to Message 820189.  

You are affirming my point: the LTD algorithm is unstable and pretty much useless. Alinator certainly has a deep understanding of why.

I wouldn't care, but it does seem that boinc is lying to the public when they introduce the resource sharing concept. It just doesn't work UNLESS you have happy fingers and want to fiddle with the downloads ad nauseam.

If historical (hysterical?) rabble rousers persuaded someone to introduce this artifact, please eliminate it or improve the algorithm to correct the issues.

Are the code keepers listening?
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 820197 - Posted: 18 Oct 2008, 15:32:03 UTC - in response to Message 820165.  
Last modified: 18 Oct 2008, 15:53:18 UTC

Well, as I said before, my CI/CO is set at 0.01 and 0.25 respectively, which I went to when you originally brought up your observations a while back. I would recommend not going to Zero/Zero though. The reason is that zero for the CI in the latest CCs emulates RRI, and that could result in losing work if the 'backend needs time for DB bookkeeping' problem crops up again. Zero for the CO means that the CC won't start looking for work for a project until there isn't any left for it. Obviously, this can aggravate the Resource Share problem if the LTD eligible project doesn't respond on the first try.

When I was running the high bias share scenario with EAH as the primary and all my other ones at a low share, my observation was that even though EAH is one of the best projects out there in terms of uptime and scheduler availability, my slow hosts had trouble maintaining Resource Share, especially short term.

I had them all set up for extensive logging, and the reason for virtually every single break from expected behaviour was the CC not being able to hit the LTD eligible project it was seeking work from, and thus resorting to another one to satisfy the Full Cache Overall policy, typically a high negative LTD one.

If you have SAH set as the primary, and given they go off the air every Tuesday, it's not hard to see why breaks from Resource Share are to be expected regularly when running with the network always available to the CC.

Currently I'm running even share splits, and even with MW being an ultra high Tightness Factor project, all of my hosts are doing a fairly decent job of maintaining Resource Share overall. As it turns out, Leiden is the project which causes the majority of share breaks. This is because they use homogeneous redundancy, and the way their work generator is set up it doesn't always have work queued up in the 'bullpen' for all the classes of hosts which they support.

Alinator
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 820199 - Posted: 18 Oct 2008, 15:42:45 UTC - in response to Message 820189.  
Last modified: 18 Oct 2008, 16:24:18 UTC

I'd have to agree there.

If you are running a large cache, then so what if the CC decides it should let it run down a bit overall in order to stay closer to the Resource Share in the short term? In the minimal cache configuration, any break caused by project unavailability or other reasons tends to get corrected much sooner.

There should be some logic in the fetch algorithm which takes into account the share setting for a high negative LTD project and whether the LTD eligible project (or the machine overall) is in danger of going dry before the next connect interval expires. Personally I don't see any need to force a share break when the host has tens to hundreds of hours of work currently onboard and the next connection opportunity is minutes away! ;-)
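
One way to picture the kind of check suggested above is a single comparison between the work onboard and the time until the next connection. This is purely a sketch of the proposal, not anything the client is known to do; the function name, parameters, and the optional safety margin are all hypothetical.

```python
def should_force_share_break(onboard_work_hours: float,
                             hours_until_next_connect: float,
                             safety_margin_hours: float = 0.0) -> bool:
    """True only if the host would go dry before it can next ask for work."""
    return onboard_work_hours < hours_until_next_connect + safety_margin_hours


# Plenty of work onboard and the next connection is minutes away: no break needed.
print(should_force_share_break(100.0, 0.25))  # False
# Nearly dry with a long wait until the next connection: a break is justified.
print(should_force_share_break(0.1, 6.0))     # True
```

Under a gate like this, a host with tens of hours of work would simply wait for the LTD eligible project to come back instead of pulling from a high negative LTD one.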

Alinator

<edit> @ Wavemaker:

I do have a dual on this account. I also have quads and higher, just not running on my 'public' account.

The reason I don't run them here is that my big gun battleships are dedicated to economic 'warfare' as their primary function, and thus I don't want any chance that the games I play with BOINC experimentally can have any negative impact on their main mission.

<edit2> One other thing which just caught my eye in your post was the part about the Resource Share settings on the web site UI.

I agree that the impression a casual participant would take away is that the setting is a 'percentage' and so ranges from zero to one hundred. However, it is stated in the description of the setting that it is a proportionality value.

Granted that distinction and difference would not be picked up on by a lot of folks immediately.

Alinator
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 820277 - Posted: 18 Oct 2008, 22:28:06 UTC - in response to Message 820165.  
Last modified: 18 Oct 2008, 22:56:36 UTC

I would expect it to take a few weeks to show any great improvement.

Edit:

I run SETI with a share of 900, SETI BETA with a share of 99, and BOINC Alpha with a share of 1.

My observation is that, on average, I mostly crunch SETI. At the moment I have a BETA AP unit running "priority" due to deadlines. I have two AP units for SETI (crunching along nicely with the enhanced AP app.).

I'll go several weeks, then get a few UpperCase work units (BOINC Alpha does no useful work, but the work units go very quickly), then I'll go for days with just SETI "main" then I'll get one Beta.

In other words, it appears that Long Term Debt does properly enforce resource share if your cache isn't too big, and that it is much less effective if the short-term scheduling is always trying to stay out of trouble.

I think there is a corner here that just plain doesn't work, but I haven't found it.
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 820304 - Posted: 19 Oct 2008, 0:04:34 UTC - in response to Message 820277.  

I think a better way to judge LTD alone is to look in the client_state file and record the number listed. BoincLogX does so automatically and gives a snapshot of the actual LTD number. Either way, I jot down the numbers and compare them over time.
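
For anyone who wants to script that instead of jotting numbers down, here is a minimal sketch. It assumes the client state is XML with one `<project>` element per attached project containing `<project_name>` and `<long_term_debt>` children; that layout, and the sample values below, are assumptions made for illustration and should be checked against a real client_state file before relying on them.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed layout, not data from a real host.
SAMPLE = """<client_state>
  <project>
    <project_name>SETI@home</project_name>
    <long_term_debt>1234567.000000</long_term_debt>
  </project>
  <project>
    <project_name>Einstein@Home</project_name>
    <long_term_debt>-1234567.000000</long_term_debt>
  </project>
</client_state>"""


def read_long_term_debts(xml_text: str) -> dict:
    """Return {project name: long term debt} for every project that lists one."""
    root = ET.fromstring(xml_text)
    debts = {}
    for project in root.iter("project"):
        name = project.findtext("project_name", default="(unnamed)")
        ltd = project.findtext("long_term_debt")
        if ltd is not None:
            debts[name] = float(ltd)
    return debts


for name, ltd in read_long_term_debts(SAMPLE).items():
    print(f"{name}: {ltd:.0f}")
```

Running something like this on a schedule and appending the output to a log would give the over-time comparison without manual note-taking.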

What I find is that LTD is bogus, but the main project gets most of the work. Right now my resource share is 999 for seti and 1 for einstein (0.1%). But in reality, the cache system can't seem to let the einstein component drop below 1% of the RAC (and the LTD just keeps getting more negative).

So let's see what happens with the reduced cache settings. I may switch over to Alinator's suggestion once I see what happens with the 0/0 setting. Of course, what I see will depend on the CPU. The old CPUs (Pentium IIs, for example) will not know the difference because they take more than a day for a WU.