How does work fetch in 7.0.25 work?

W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1225198 - Posted: 30 Apr 2012, 3:27:49 UTC - in response to Message 1225151.  
Last modified: 30 Apr 2012, 3:29:33 UTC

Correct.

But then that isn't a 10 day cache, which I have told it to maintain... :)

So it should still be requesting work, and it's not.

With 0.5/10 you have asked for a 0.5 day cache with 10 days extra; try 10/0.5, which would be a 10 day cache with up to 0.5 day extra.


Nope, he has 7.0.25 installed and that is what he had and was not getting work.

We were told that Version 7 now has a form of hysteresis built into the scheduler. You set your minimum cache with the first setting and then use the second setting to set the size of the hysteresis band. So with a 10/0.5 setting it should (if you can get that much work) request work when the cache falls below 10 days and fill you up to 10.5 days, then not make another request until the cache has fallen below 10 days again.

Setting 0.5/10 sets the minimum at 0.5 days with a hoped-for fill to 10.5 days, and it will not request again until the cache has fallen below 0.5 days.

With this setting it could request work, get a little, and then not request more because you are now above 0.5 days.

Hysteresis was something that I and others suggested and discussed over 4 years ago. At that stage Dr. A did not like it and thought it was too complex to implement. I do not know what has caused the change of mind.
But I would not have implemented it in its present form.
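
For anyone who finds it easier to read as code, here is a minimal Python sketch of the rule as described in this post. It is only a model of the described behaviour, not BOINC's actual work fetch code, and the function names `should_request` and `request_size` are made up for the illustration.

```python
def should_request(buffered_days: float, min_days: float) -> bool:
    """Ask for work only once the cache has drained below the minimum."""
    return buffered_days < min_days


def request_size(buffered_days: float, min_days: float, extra_days: float) -> float:
    """Days of work to ask for: enough to reach min + extra, or 0 if no request is due."""
    if not should_request(buffered_days, min_days):
        return 0.0
    return (min_days + extra_days) - buffered_days


# 10/0.5: any drop below 10 days triggers a request for a fill to 10.5 days.
print(request_size(9.8, 10.0, 0.5))

# 0.5/10: a partial fill that leaves 0.6 days on hand already suppresses
# further requests, even though the hoped-for level is 10.5 days.
print(request_size(0.6, 0.5, 10.0))
```

The second call is the 0.5/10 case from the post: once a small delivery lifts the cache above the minimum, the model stops asking.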
ID: 1225198
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1225204 - Posted: 30 Apr 2012, 3:43:33 UTC - in response to Message 1225198.  

On my one computer running 7.0.25, I have my cache settings at

Maintain enough tasks to keep busy for at least (max 10 days):  5 days
... and up to an additional:                                    0.5 days

while my 6.12.43 computer runs this:

Maintain enough tasks to keep busy for at least (max 10 days):  --- days
... and up to an additional:                                    5 days

That is keeping both computers around the current max tasks.

ID: 1225204
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1225223 - Posted: 30 Apr 2012, 4:50:45 UTC - in response to Message 1225204.  


On my one computer running 7.0.25, I have my cache settings at

Maintain enough tasks to keep busy for at least (max 10 days):  5 days
... and up to an additional:                                    0.5 days

while my 6.12.43 computer runs this:

Maintain enough tasks to keep busy for at least (max 10 days):  --- days
... and up to an additional:                                    5 days

That is keeping both computers around the current max tasks.

Which confirms the change of settings required when upgrading to Ver 7
ID: 1225223
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1225237 - Posted: 30 Apr 2012, 5:47:00 UTC - in response to Message 1225198.  

Setting 0.5/10 sets the cache at 0.5 with a hopeful fill to 10.5 and not request again until the cache has fallen to 0.5.

Which is completely & totally counterintuitive.

However, I now notice that the wording on the settings page has been changed to match the new version.

What used to be "Connect every x days" has now become
"Maintain enough tasks to keep busy for at least x days"
It's done away with any reference to how often the software is able to access the net.
And the second one is
"... and up to an additional x days".

It doesn't make any sense.
The idea of having a 4 day cache is that you have enough work for 4 days. If you want work for an additional 3 days, then what you're saying is that you want a 7 day cache.
Why not just set it to 7 days?

I understand what hysteresis is; I don't understand why you would want to apply it to work caches, other than for a very slight reduction in Scheduler requests. And I don't see the reduction in Scheduler requests being significant enough to bother with such a change.
If people want a 4 day cache, they'll set it to that (once they realise how). Only now they can load up yet more work (putting more load on the database).

I foresee a Scheduler request load similar to now, only with a considerably higher load on the database from work in progress once (or if) the server side limits are ever removed.
Grant
Darwin NT
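
To put rough numbers on the Scheduler request question, here is a toy Python model. Everything in it is an assumption made for the illustration: the cache drains at a steady rate, the client checks once per tick of 1/100 day, and the server always fills the request completely. The function name `count_requests` is hypothetical, and none of this is BOINC's actual logic.

```python
def count_requests(low: int, high: int, total_ticks: int) -> int:
    """Count work requests over total_ticks, with all quantities in 1/100 day.

    The cache drains one unit per tick. Whenever it is below `low`, the
    client asks once and (by assumption) the server fills it back to `high`.
    """
    buffered = high
    requests = 0
    for _ in range(total_ticks):
        buffered -= 1
        if buffered < low:
            requests += 1
            buffered = high
    return requests


TICKS = 30 * 100  # 30 days in 1/100-day ticks

# Plain top-up to a 4 day target: low == high, so every check asks for work.
top_up = count_requests(400, 400, TICKS)

# Hysteresis with 4/0.5: drain from 4.5 days to just under 4 before asking.
banded = count_requests(400, 450, TICKS)

print(top_up, banded)
```

How large the difference looks depends almost entirely on the assumed check interval and on the server actually delivering what is asked for. With the server side limits in place, or with a client that already spaces out its requests, the real saving could be far smaller than this idealised model suggests.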
ID: 1225237
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19012
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1225248 - Posted: 30 Apr 2012, 7:18:56 UTC

When I and the others first proposed hysteresis, the aim was to reduce server load by having many tasks reported at the same time.

The main problem these days is moving the requested tasks from the servers to our hosts, and I, also, don't see how this is going to help with that. We are still going to be requesting 2 million tasks/day, or more as our hosts get more powerful.
ID: 1225248
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1225249 - Posted: 30 Apr 2012, 7:22:23 UTC - in response to Message 1225248.  

We are still going to be requesting 2 million tasks/day, or more as our hosts get more powerful.

And more still as people try for ... an additional x days.
Grant
Darwin NT
ID: 1225249
Karsten Vinding
Volunteer tester
Joined: 18 May 99
Posts: 239
Credit: 25,201,931
RAC: 11
Denmark
Message 1225367 - Posted: 30 Apr 2012, 17:30:21 UTC - in response to Message 1225249.  

Well, setting 0.5/10 didn't really work out well.

But setting 10/0.5 in the BM settings seems to have helped.

All my computers are now requesting work like mad (except the Ubuntu one, where seti won't work :( ).

A lot of my WUs have gone high priority (?), and every single WU that's finished seems to be uploaded and returned immediately. I don't see the large numbers of tasks waiting to report that I had.

Of course the servers are not giving me any work, but at least the client asks for it now.

Why it behaves so differently now, I can't explain.
ID: 1225367
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1225444 - Posted: 30 Apr 2012, 20:50:52 UTC - in response to Message 1225367.  

That's because the minimum buffer of 10 days is also used as an indicator that the host may not connect to the internet for the next 10 days. So BOINC tries to get all tasks finished, uploaded, and reported that early.

The BOINC developers basically think the only reason a user should try for a large cache is because they're going on vacation or the hosts are in a situation where an IT Department only allows connecting to the servers once a week, etc.
Joe
ID: 1225444
Karsten Vinding
Volunteer tester
Joined: 18 May 99
Posts: 239
Credit: 25,201,931
RAC: 11
Denmark
Message 1225450 - Posted: 30 Apr 2012, 21:09:31 UTC - in response to Message 1225444.  
Last modified: 30 Apr 2012, 21:44:10 UTC


That's because the minimum buffer of 10 days is also used as an indicator that the host may not connect to the internet for the next 10 days. So BOINC tries to get all tasks finished, uploaded, and reported that early.

Joe


OK. That makes sense (sort of).

What I don't understand is why it behaves this way now, when it didn't when I had the cache set to 10/10 in a desperate attempt to get work.
Back then nothing worked: no work was being requested, no work was being reported, and no tasks were in high priority.
The tasks that are now high priority were already in the GPU cache back then.

I have gone down to 6/0.5 on my ATI host because I was hitting a 400 WU limit, and that wasn't enough for a 10 day cache for the GPU.
So I was afraid the servers would still try to fill the GPU cache (because the client would still ask for work) but would not send much work because of that limit. And since they honored the GPU request first, the CPU would not get much work.

I haven't tested whether it works that way in reality (a 6 day cache should be fine), but I wouldn't be surprised.
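
The worry above can be written down as a small Python sketch. It assumes a single per-host task limit shared by GPU and CPU and that the GPU request is satisfied first, which is the untested guess from this post rather than confirmed server behaviour. The function name `split_under_limit` and the task counts are hypothetical.

```python
def split_under_limit(limit: int, gpu_wanted: int, cpu_wanted: int) -> tuple[int, int]:
    """Hand out at most `limit` tasks, satisfying the GPU request first."""
    gpu_sent = min(gpu_wanted, limit)
    cpu_sent = min(cpu_wanted, limit - gpu_sent)
    return gpu_sent, cpu_sent


# Hypothetical demand for a large cache: the GPU alone exceeds the 400 limit,
# so nothing is left over for the CPU.
print(split_under_limit(400, gpu_wanted=600, cpu_wanted=80))

# Hypothetical demand for a smaller cache: the GPU fits under the limit and
# the CPU gets whatever room remains.
print(split_under_limit(400, gpu_wanted=360, cpu_wanted=48))
```

Under these assumptions, lowering the cache setting until the GPU demand fits below the limit is what frees up room for CPU work.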
ID: 1225450
perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 1225478 - Posted: 30 Apr 2012, 22:13:04 UTC - in response to Message 1225450.  
Last modified: 30 Apr 2012, 22:17:58 UTC

I may be wrong here but if you had it set for 10/10 wouldn't that tell it you only connect every ten days and you are requesting work for ten days?

Okay, to finish my thought. Seems it might have problems with finding work units with short enough time limits for that. Now you are telling it you can connect twice a day and want six days worth of additional work so it is loading you up to the limit.


PROUD MEMBER OF Team Starfire World BOINC
ID: 1225478
Tom95134

Joined: 27 Nov 01
Posts: 216
Credit: 3,790,200
RAC: 0
United States
Message 1225485 - Posted: 30 Apr 2012, 22:23:53 UTC
Last modified: 30 Apr 2012, 22:25:37 UTC

FYI...

This new work fetch approach doesn't play well with others. It has an issue when you are running projects that use BOINC to fetch work but do their crunching outside of BOINC. One of these projects is T4T, which lives under BOINC but does its crunching in an Oracle virtual machine. The VM can run dry and then end up waiting for work until BOINC goes through its work fetch cycle, based solely on the new fetch scheme.

This has little effect on BOINC work but the other project can end up doing a lot of nothing.
ID: 1225485
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1225581 - Posted: 1 May 2012, 2:07:33 UTC - in response to Message 1225478.  

I may be wrong here but if you had it set for 10/10 wouldn't that tell it you only connect every ten days and you are requesting work for ten days?

Okay, to finish my thought. Seems it might have problems with finding work units with short enough time limits for that. Now you are telling it you can connect twice a day and want six days worth of additional work so it is loading you up to the limit.

No, 10/10 is 10 days minimum plus 10 days additional. With 7.0.x they still add to define the max level, it's just that the cache drains down to the minimum before trying to refill to maximum.

As Karsten has already discovered, the short deadline tasks tend to go into high priority if the minimum is more than half the deadline. So it's long deadline tasks which are suitable for a large cache. 10/10 would be practical for someone specializing in Astropulse who had a good reason to ignore what it does to the servers.
Joe
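
The two rules of thumb in this post can be sketched in Python as follows. This is only a restatement of the description above, not BOINC's scheduling code. The function names are made up, and the 7 and 30 day deadlines are illustrative values, not actual task deadlines.

```python
def max_cache_days(min_days: float, extra_days: float) -> float:
    """With 7.0.x the two settings still add up to define the maximum level."""
    return min_days + extra_days


def may_go_high_priority(deadline_days: float, min_days: float) -> bool:
    """Rule of thumb: a task tends to go high priority when the minimum
    buffer is more than half of its deadline."""
    return min_days > deadline_days / 2


# 10/10 means a 10 day minimum and a refill target of 20 days.
print(max_cache_days(10, 10))

# Illustrative deadlines only: a short deadline trips the rule with a
# 10 day minimum, a long one does not.
for deadline in (7, 30):
    print(deadline, may_go_high_priority(deadline, 10))
```

Read this way, a large minimum only suits tasks whose deadline is at least twice the minimum buffer.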
ID: 1225581
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1225644 - Posted: 1 May 2012, 5:47:25 UTC - in response to Message 1225444.  
Last modified: 1 May 2012, 5:50:09 UTC

The BOINC developers basically think the only reason a user should try for a large cache is because they're going on vacation or the hosts are in a situation where an IT Department only allows connecting to the servers once a week, etc.

That ignores the main reason: people just want to have enough work to get through any project outages & downtime. If it weren't for project downtime, then only being able to connect to the net every few days would be the only reason for needing a cache.


EDIT-
If they don't think maintaining work during downtime is the main reason, then they obviously haven't spent much time reading all the posts here bemoaning the server side limits & the other problems caused by the changes that screwed up estimated completion times so much.
Grant
Darwin NT
ID: 1225644
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1226080 - Posted: 2 May 2012, 5:51:37 UTC - in response to Message 1225644.  
Last modified: 2 May 2012, 5:52:48 UTC

I don't know why, but I decided to change my cache settings, just to see what would happen.

I'm running 6.10.xx and had my settings at 0.1 & 10 (usually 4 days, but due to the DCF problems I upped it to get what I could). After changing it to 4 & 10, BOINC still requested GPU work but stopped requesting CPU work. I was probably down to less than half a day's work when I set it back to 0.1, and BOINC started requesting CPU work again.
Grant
Darwin NT
ID: 1226080
shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1226129 - Posted: 2 May 2012, 10:44:55 UTC - in response to Message 1226080.  

After changing it to 4 & 10 BOINC still requested GPU work, but stopped requesting CPU work.


Same here. Had it at 4 & 8 and just pretty much ran out. Saw the same behavior as the OP, until LadyL suggested the workaround I posted above (which worked like a treat).
ID: 1226129


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.