Bigger work buffer

Message boards : Number crunching : Bigger work buffer
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19072
Credit: 40,757,560
RAC: 67
United Kingdom
Message 657264 - Posted: 10 Oct 2007, 9:37:23 UTC - in response to Message 657232.  
Last modified: 10 Oct 2007, 9:37:58 UTC

% of time BOINC client is running 71.8875 %
While BOINC running, % of time work is allowed 99.9883 %
Average CPU efficiency 0.767899
Result duration correction factor 0.58026

The only reason the top one is so low is that when I'm away with work and BOINC has finished all its queued units, I close it until I get back to an internet connection to upload them.

Surely this shouldn't be causing me to download only about 3 days' worth of units when it's set at 10/10.

The thing is, if you have any VHAR units in there with deadlines of 8 days, then the BOINC manager is going to think that, because of the 10-day connection interval, they will have difficulty being returned on time. Therefore no more work until they are returned. To allow for any unexpected delays in uploading and reporting, the manager will try to complete units 24 hrs before the actual deadline.
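A rough sketch of that reasoning in Python. This is not the actual BOINC client logic; the function name and the way the 24-hour margin is applied are assumptions made only to illustrate the point about an 8-day deadline against a 10-day connection interval.

```python
from datetime import timedelta

# Assumed safety margin, taken from the "24 hrs before the deadline" remark.
SAFETY_MARGIN = timedelta(hours=24)

def can_return_on_time(time_to_deadline, connect_interval):
    """Return True if a task could still be reported before its deadline
    when the host only connects once per connect_interval."""
    return time_to_deadline - SAFETY_MARGIN >= connect_interval

# 8-day deadline with a 10-day connection interval: looks unreturnable.
long_ci = can_return_on_time(timedelta(days=8), timedelta(days=10))

# Same deadline with a half-day connection interval: no problem.
short_ci = can_return_on_time(timedelta(days=8), timedelta(days=0.5))
```

With these assumed numbers the first check fails and the second passes, which is the reason a shorter connection interval keeps coming up as a suggestion.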

Andy
ID: 657264 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 657462 - Posted: 10 Oct 2007, 19:42:23 UTC - in response to Message 657232.  

Several people have suggested that you might try a shorter connect interval (i.e. less than one day). The connect interval interacts with cache size in ways you might not expect.

... but if you won't try them, even as an experiment, then we can't help.
ID: 657462 · Report as offensive
Profile Rowe Family and Friends

Joined: 25 Dec 00
Posts: 17
Credit: 38,395,231
RAC: 67
New Zealand
Message 657630 - Posted: 11 Oct 2007, 4:19:59 UTC

I have, and it makes no difference: I can't get any more than about 20 work units at once. The thing is, changing the cache size works fine on my desktop. I change it from 5/5 to 10/10 and it downloads heaps of units, yet my laptop doesn't get any.

I'll leave it and see what it does over the weekend.
ID: 657630 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 657644 - Posted: 11 Oct 2007, 5:40:55 UTC - in response to Message 657630.  

You're using numbers like 5 and 10, when I'm suggesting values below 1.
ID: 657644 · Report as offensive
Profile Rowe Family and Friends

Joined: 25 Dec 00
Posts: 17
Credit: 38,395,231
RAC: 67
New Zealand
Message 657702 - Posted: 11 Oct 2007, 8:30:47 UTC - in response to Message 657644.  

That's because I was talking about my desktop.

Like I said, it hasn't made any difference having the cache setting at .1/10 or .5/5 or anything like that; I still only get about 20 units.
ID: 657702 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14653
Credit: 200,643,578
RAC: 874
United Kingdom
Message 657706 - Posted: 11 Oct 2007, 8:38:03 UTC - in response to Message 657702.  

If making all those changes is having no effect at all, then the change isn't being recognised by the laptop. Why not?

1. Not updating client after making preference change (unlikely, but included for completeness)
2. Not updating correct 'venue' preference for computer
3. Preference override file on computer negates all changes

Got to be one of those - check all settings (including those you know couldn't possibly be wrong, LOL) and tell us what you see.
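For the third possibility, a small Python sketch of how you might inspect an override file. It assumes the file is the client's global_prefs_override.xml in the BOINC data directory and that the two buffer settings are stored under the tags work_buf_min_days and work_buf_additional_days; check these against your own client version. The function name and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed tag names for "connect interval" and "extra work" in the override file.
BUFFER_TAGS = ("work_buf_min_days", "work_buf_additional_days")

def read_buffer_overrides(xml_text):
    """Return any buffer settings found in an override file's XML text."""
    root = ET.fromstring(xml_text)
    found = {}
    for tag in BUFFER_TAGS:
        node = root.find(tag)
        if node is not None and node.text:
            found[tag] = float(node.text)
    return found

# Made-up sample contents, not taken from any real host.
sample = """<global_preferences>
  <work_buf_min_days>0.1</work_buf_min_days>
  <work_buf_additional_days>0.25</work_buf_additional_days>
</global_preferences>"""

overrides = read_buffer_overrides(sample)
```

If a file like this exists on the laptop and carries its own buffer values, changes made on the website would have no visible effect there, which would match the symptom.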
ID: 657706 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 657890 - Posted: 11 Oct 2007, 17:41:37 UTC - in response to Message 657706.  

Hmmm...

Take another look at the host's time metrics. From that I would expect it to only carry about half of what you would think it should from the cache settings.
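As a back-of-the-envelope check on "about half", here is a Python sketch that simply multiplies the three time metrics posted earlier in the thread. Treating the product as the fraction of wall-clock time that turns into useful crunching is an assumption, not a statement of how the client computes it, and the result duration correction factor is left out.

```python
def effective_fraction(on_frac, active_frac, cpu_efficiency):
    """Rough fraction of wall-clock time that ends up as useful CPU time."""
    return on_frac * active_frac * cpu_efficiency

# Values from the host's posted metrics:
#   71.8875 % of time BOINC is running
#   99.9883 % of that time work is allowed
#   0.767899 average CPU efficiency
frac = effective_fraction(0.718875, 0.999883, 0.767899)
print(f"effective fraction of wall-clock time: {frac:.3f}")
```

The product comes out a little over one half, which fits the expectation that the host carries roughly half of what the cache settings alone would suggest.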

Also, it hasn't been mentioned here yet, but when you run decoupled the CI doesn't play a role in determining the cache size anymore per se. IOW, setting BOINC to 10/10 doesn't mean you will carry 20 days' worth of work.

As John has tried to explain, the difference between coupled cache mode and decoupled is that in coupled mode the CC will make sure that all deadlines can be met by at least one contact session sooner than when they are ultimately due back and adjust work fetching and scheduling accordingly. When running decoupled, the CC ignores the impact of any contact schedule when deciding how much work to get and when to run it and only considers the deadlines of the tasks onboard. This doesn't make for much of a difference if the host has unrestricted network access and/or runs 24/7, but can lead to missed deadlines for dialup hosts, part timers, and/or ones where you have set a restricted network schedule when running decoupled.

Alinator
ID: 657890 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 658000 - Posted: 11 Oct 2007, 21:33:08 UTC - in response to Message 657890.  

A 10-day Connect Interval and 10 days of extra work would indeed be an attempt to keep 20 days' worth of work on the host. The Connect Interval is used both to determine the minimum cache and to calculate the task completion deadlines (work must be completed at least one connect interval before the report deadline). If the computer has uninterrupted network access, you are best off with a CI of 0. Use the extra work setting to maintain a queue of work.
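The two roles of the Connect Interval can be sketched in a few lines of Python. This is only an illustration of the arithmetic described here, not client code; both function names are invented, and the 8-day deadline is the figure mentioned earlier in the thread.

```python
def buffer_target_days(connect_interval_days, extra_days):
    """Days of work the client would try to keep on hand."""
    return connect_interval_days + extra_days

def days_available_to_finish(report_deadline_days, connect_interval_days):
    """Days left to complete a task if it must be done one connect
    interval before its report deadline."""
    return report_deadline_days - connect_interval_days

# 10/10 settings: an attempt to hold 20 days of work.
target = buffer_target_days(10, 10)

# An 8-day deadline with a CI of 10 leaves a negative window,
# while a CI of 0 leaves the full 8 days.
window_ci_10 = days_available_to_finish(8, 10)
window_ci_0 = days_available_to_finish(8, 0)
```

A negative window means the task already looks late as far as this arithmetic is concerned, so a large CI and short-deadline work do not mix well.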


BOINC WIKI
ID: 658000 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 658011 - Posted: 11 Oct 2007, 21:55:15 UTC - in response to Message 658000.  

Well, I stand corrected on that. I must have misunderstood you when we were talking about that when the feature came about. When I tried testing the 10/10 20-day hypothesis on one of my fastest hosts, I never got over what I would carry in the cache in coupled mode, so I assumed you had addressed the matter of irresolvable scheduling paradoxes when running decoupled by having the CI just set the work fetch and report trigger point and nothing else.

Based on this info, you can still shoot yourself in the foot pretty good if you go 10/10 and don't pay attention to what's going on under some circumstances. ;-)

Alinator
ID: 658011 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 658016 - Posted: 11 Oct 2007, 22:13:57 UTC - in response to Message 658011.  

12 gauge, anyone?


BOINC WIKI
ID: 658016 · Report as offensive
recondas

Joined: 3 Nov 06
Posts: 2
Credit: 58,756
RAC: 0
United States
Message 658449 - Posted: 12 Oct 2007, 17:22:07 UTC

I've seen a number of convincing arguments for both large and small buffer sizes.

Unfortunately, it looks like I've got a work unit pending where my wingman is the poster child for too much buffer. Looking at the listed computer, I'm not sure how he managed to accumulate *2157* tasks, but dozens of them have died on the vine, and more, including mine, probably aren't far behind.

ID: 658449 · Report as offensive
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 658563 - Posted: 12 Oct 2007, 19:44:53 UTC - in response to Message 658449.  

How in the world is a doggy Celery processor sitting on that much work??
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 658563 · Report as offensive
recondas

Joined: 3 Nov 06
Posts: 2
Credit: 58,756
RAC: 0
United States
Message 658587 - Posted: 12 Oct 2007, 20:41:15 UTC - in response to Message 658563.  

I scrolled thru all 20+ pages of work units. It looks like the last one processed was in mid September - but the system continued to download work on a nearly daily basis since (as recently as yesterday).

There may be good reasons why processing stopped, but it is rather surprising the system continued to request work. Makes one wonder what the quota settings are.
ID: 658587 · Report as offensive
web03
Volunteer tester
Joined: 13 Feb 01
Posts: 355
Credit: 719,156
RAC: 0
United States
Message 658592 - Posted: 12 Oct 2007, 21:17:29 UTC - in response to Message 658587.  
Last modified: 12 Oct 2007, 21:35:47 UTC

At least it's down to a 1 per day quota. EDIT: But is taking 37 days to turn anything around.
ID: 658592 · Report as offensive
Profile Keith T.
Volunteer tester
Joined: 23 Aug 99
Posts: 962
Credit: 537,293
RAC: 9
United Kingdom
Message 658595 - Posted: 12 Oct 2007, 21:24:07 UTC
Last modified: 12 Oct 2007, 21:25:18 UTC

According to BOINC Stats that host does occasionally return a valid WU.

http://boincstats.com/stats/host_graph.php?pr=sah&id=2415609

Why is it hoarding so many? I can't work out if this is being done accidentally or deliberately!
Sir Arthur C Clarke 1917-2008
ID: 658595 · Report as offensive
DJStarfox

Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 658740 - Posted: 13 Oct 2007, 1:21:44 UTC - in response to Message 658595.  

That sounds like a BOINC server feature request! "Absolute max number of sent WU per computer". This would limit downloaded tasks to a specific number until the machine returns either a valid result or a valid result past its deadline.
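To make the idea concrete, here is a purely hypothetical Python sketch of such a cap. Nothing here is an existing BOINC server option; the class, its methods, and the limit of 2 are invented for illustration.

```python
class HostQuota:
    """Hypothetical per-host cap on tasks that are sent but not yet resolved."""

    def __init__(self, max_outstanding):
        self.max_outstanding = max_outstanding
        self.outstanding = 0

    def try_send(self):
        """Send one more task only if the host is under its cap."""
        if self.outstanding >= self.max_outstanding:
            return False
        self.outstanding += 1
        return True

    def resolve(self):
        """Call when the host returns a valid result, on time or late."""
        if self.outstanding > 0:
            self.outstanding -= 1

quota = HostQuota(max_outstanding=2)
sent = [quota.try_send() for _ in range(3)]  # third request is refused
quota.resolve()                              # one valid result comes back
sent_after = quota.try_send()                # host may fetch again
```

Under a rule like this, a host that stopped returning results would stop receiving new work once it hit the cap, instead of piling up thousands of tasks.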
ID: 658740 · Report as offensive
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 658749 - Posted: 13 Oct 2007, 1:29:41 UTC
Last modified: 13 Oct 2007, 1:30:07 UTC

It doesn't really matter much in the total context of things. If one host has a few thousand extra WUs, the rest of the system goes on.......
Hangs up a few results, but it's no biggy.
The whole thing about pending WUs is getting stale.
They have been increasing, and they will increase until they reach an apex and then they will settle down.
If you put 38,000 in pending like I have, and then it stays about the same, it's like you are getting your results granted in real time, they shouldn't build any further. And if they do, for a short span, they will come back down and things will even out.
It's all just a puff of smoke, and should not concern anybody.
The real concern is whether the servers can be configured to stay stable for more than a few days at a time, whether Seti can secure enough funding to stay running, and whether Arecibo comes back and stays on line to feed Seti the data that we are all analyzing.
Any of the above fails, and we are all without our pet project.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 658749 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.