4.42 has been posted.

Message boards : Number crunching : 4.42 has been posted.
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 112190 - Posted: 17 May 2005, 23:48:25 UTC - in response to Message 112186.  

Just my rant continuing........

ps. Wow really futzed boards on IE..I had to go in and edit this post to make it readable!


I agree on all counts. Since 4.35 I've been asking (on the forums) if resource share was respected in "panic" mode, but have NEVER got an answer. I see Rom and JM7 respond to questions before and after mine. They leave mine alone. I'm wondering if it's because of the uproar the real answer would cause.

PS setting your forum preference to "don't show Signatures" will fix your scroll issue. Rob is working on it.

tony
ID: 112190
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 112194 - Posted: 18 May 2005, 0:00:29 UTC - in response to Message 112190.  

I agree on all counts. Since 4.35 I've been asking (on the forums) if resource share was respected in "panic" mode, but have NEVER got an answer. I see Rom and JM7 respond to questions before and after mine. They leave mine alone. I'm wondering if it's because of the uproar the real answer would cause.
tony

Actually, Tony, you can answer this yourself by observation.

In panic mode, resource share is not respected -- it is, after all, a "panic" to finish work that may run past deadline.

When the panic is over, the other projects (which got "locked out") have accumulated debt, and that debt gets paid back.

Resource shares are part of the "debt accumulation" so if a project gets a big boost from "panic mode" then BOINC won't even download another WU until the other projects get their share.
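
The accounting described above can be sketched in a few lines. This is only an illustration under stated assumptions, not BOINC's actual scheduler code: the function names `run_interval` and `may_fetch_work`, the project names, and the rule "no fetch while debt is negative" are hypothetical simplifications of what the post describes.

```python
# Hypothetical sketch of debt accounting; not taken from the BOINC source.

def run_interval(debts, shares, running, seconds):
    """Update debts after `running` had the CPU alone for `seconds`.

    Every project earns its resource share of the elapsed time;
    the project that actually ran is charged the full interval.
    """
    updated = {}
    for project, share in shares.items():
        earned = share * seconds
        used = seconds if project == running else 0
        updated[project] = debts.get(project, 0.0) + earned - used
    return updated


def may_fetch_work(debts, project):
    """Assumed rule: a project with negative debt (it got more than its share) gets no new work."""
    return debts[project] >= 0


shares = {"Einstein": 0.5, "SETI": 0.5}   # equal resource shares
debts = {"Einstein": 0.0, "SETI": 0.0}

# One hour of "panic" in which Einstein runs alone.
debts = run_interval(debts, shares, running="Einstein", seconds=3600)
print(debts)   # Einstein is 1800 s ahead of its share, SETI is owed 1800 s
```

In this toy model the locked-out project ends the panic with positive debt, which is what later gets "paid back".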
ID: 112194
ilyanep
Joined: 16 Nov 04
Posts: 90
Credit: 3,172,949
RAC: 9
United States
Message 112196 - Posted: 18 May 2005, 0:08:43 UTC - in response to Message 112169.  

Oh and come on, why does BOINC have to wait so long for a scheduler reply? It can't do anything in between, like checking other projects?



It shouldn't take a minute for a scheduler to respond. For me it takes 10-35 seconds.
ID: 112196
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 112208 - Posted: 18 May 2005, 0:44:11 UTC - in response to Message 112194.  
Last modified: 18 May 2005, 0:47:02 UTC

Resource shares are part of the "debt accumulation" so if a project gets a big boost from "panic mode" then BOINC won't even download another WU until the other projects get their share.

And there's the problem.

I don't ask much of the other projects I am attached to, 0.1, 0.2 days at max.
So I should get 1 unit per project, but am getting 2 or 3.

I want to crunch lots of the project I have put in front, so my resource share for that one is way up. It is also the project that has been having problems reaching its deadlines since CC4.35. Then again, since CC4.35 that project has gotten priority under the "earliest deadline" policy in the Shell.

So after all those units got out today, none were reloaded. The project had units enough, it was just that my -71,942 second LTD was keeping units out.

"Nooooh, we first have to crunch these Einstein & Seti units (with deadlines of 23 to 31 May!!), as they have the most positive debt. So go home boy, I won't let you have any new units of your fave project!"

Say again? Since when is BOINC Boss to tell me which project I have to crunch first, when my resource shares say something different?

So I would say, either take out the resource share on the preferences pages, or take out the STD/LTD on the new client, as the two together don't work. With them I can't even get Pirates or LHC units with a short deadline, since "other projects have priority!"...

yeah, right...

P.S: I know LHC is off for about a week.
ID: 112208
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 112210 - Posted: 18 May 2005, 0:46:42 UTC - in response to Message 112194.  

Actually, Tony, you can answer this yourself by observation.

In panic mode, resource share is not respected -- it is, after all, a "panic" to finish work that may run past deadline.

When the panic is over, the other projects (which got "locked out") have accumulated debt, and that debt gets paid back.


Hi Ned, my slow puter doesn't do many WU/day, and with the switching from previous clients to new ones, I can't count on what it's doing now to be definitive. I of course believe that resource share is NOT honored in "panic" mode, and that's why I'd like it answered.

So what Seti/Boinc is saying is that if you do work for projects with a 7 day deadline you can't have more than a 1 day cache to have resource share work? If so then what happens in bad times? Especially when 1 day is LESS than a day's work, and outages usually last a day for the outage, then another day for recovery. This is just plain wrong.

I have seen it myself with 4.35 (the only version that lasted long enough to start to see patterns) that PPAH and Einstein were getting a lot more of my CPU cycles than Seti was. As for it won't download more work with a negative long term debt, I have seen both LHC and PPAH download more, one after another until seti was in Panic mode. Seeing this additional downloading is why I feel we need an answer from the Devs on this.

have a nice nite Ned

tony
ID: 112210
MJKelleher
Volunteer tester
Joined: 1 Jul 99
Posts: 2048
Credit: 1,575,401
RAC: 0
United States
Message 112231 - Posted: 18 May 2005, 1:19:43 UTC

For the folk who are having issues with long-term debts overriding current resource share allocations, there might be a setting to re-set the debt loads. I manually re-set all the debt numbers in client_state.xml to 0.000000 when I switched to 4.40. Since all my projects have work at the moment, the load has been pretty well distributed the way I set it (40 S@H-40 E@H-10 Protein-10 CPDN, connect every .25 days). I'll see what happens when somebody goes down for a while.
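
For anyone who would rather script the manual reset than edit by hand, here is a minimal sketch. It assumes the debt values are stored in elements named `short_term_debt` and `long_term_debt`; those tag names are an assumption about the 4.4x client_state.xml layout and should be checked against your own file. The helper name `reset_debts` is made up for this example.

```python
import re

# Assumed element names; verify them in your own client_state.xml first.
DEBT_TAGS = ("short_term_debt", "long_term_debt")


def reset_debts(xml_text):
    """Return a copy of the state file text with every debt value set to 0.000000."""
    for tag in DEBT_TAGS:
        pattern = re.compile(rf"(<{tag}>)[^<]*(</{tag}>)")
        # \g<1> and \g<2> keep the opening and closing tags untouched.
        xml_text = pattern.sub(r"\g<1>0.000000\g<2>", xml_text)
    return xml_text


sample = (
    "<project>\n"
    "    <short_term_debt>1234.500000</short_term_debt>\n"
    "    <long_term_debt>-71942.000000</long_term_debt>\n"
    "</project>\n"
)
print(reset_debts(sample))
```

The function only transforms text; reading and writing the real file is left out on purpose. Stop BOINC before touching the file and keep a backup copy.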

ID: 112231
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 112243 - Posted: 18 May 2005, 1:31:52 UTC - in response to Message 112231.  
Last modified: 18 May 2005, 1:53:05 UTC

I know how to do it, MJ, but it still isn't the right way to do it. No one should fiddle around with his or her client_state.xml file!!

Heck, the thing was somewhat invented to go against cheating. If we can cheat our applications into thinking they need more work, then what's next?

On the other hand, if BOINC thinks my application that has the highest priority (in my eyes) doesn't need any work, then hell yes! I will edit the file. :)

Something I just found in my messages list:
18/05/2005 02:09:34||Computer is overcommitted
18/05/2005 02:09:34||Nearly overcommitted.
18/05/2005 02:09:34||New work fetch policy: no work fetch allowed.
18/05/2005 02:09:34||New CPU scheduler policy: earliest deadline first.

Hello? The next deadline is 2.5 days away!!

Then it switched its mind, without interference:
18/05/2005 03:38:06||schedule_cpus: time 5400.000000
18/05/2005 03:38:06||New work fetch policy: work fetch allowed.
18/05/2005 03:38:06||New CPU scheduler policy: highest debt first.

Great... so now Einstein is running alone again, is it?

I'm thinking about switching back to 4.35 .. at least that one only tried to hunt after the closest deadline.
ID: 112243
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 112260 - Posted: 18 May 2005, 1:53:36 UTC

I'm using a dial-up connection. I'm also now on my "office" puter, vs the laptop I used for my previous posts. I'm using WinXP SP2 and BOINC 4.42. I switched yesterday to the new 4.42. I got work (connect to set at 2 days). I finished all my PPAH and all my Seti but one, and that one will be finished in 7 minutes. I signed on and uploaded my work via Retry Now. Then I reported the work via Project update and got this message:

5/17/2005 9:32:55 PM|ProteinPredictorAtHome|Sending scheduler request to http://predictor.scripps.edu/predictor_cgi/cgi
5/17/2005 9:32:58 PM|ProteinPredictorAtHome|Scheduler request to http://predictor.scripps.edu/predictor_cgi/cgi succeeded
5/17/2005 9:33:45 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
5/17/2005 9:33:47 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded

Notice the time. As of 9:51 it still hasn't requested any new work. I'll be out of work in minutes. Just how long do dial-up people need to stay connected to get work?

Anyone else not getting new work??

tony

ID: 112260
MJKelleher
Volunteer tester
Joined: 1 Jul 99
Posts: 2048
Credit: 1,575,401
RAC: 0
United States
Message 112268 - Posted: 18 May 2005, 2:04:10 UTC - in response to Message 112243.  

I know how to do it, MJ, but it still isn't the right way to do it. No one should fiddle around with his or her client_state.xml file!!

Heck, the thing was somewhat invented to go against cheating. If we can cheat our applications into thinking they need more work, then what's next?

I don't think of it as cheating, though I agree it isn't the right way to do it. But the system lets us suspend activity on a project, or a work unit, at any time. Why shouldn't we be able to tell our system to start the balancing again from scratch? Ideally, and maybe this could be put on a wish list for future versions, there could be a button on the Projects tab to reset the debt load and start the calculations over again.

ID: 112268
wckesq
Volunteer tester
Joined: 17 May 99
Posts: 15
Credit: 246,704
RAC: 0
United States
Message 112269 - Posted: 18 May 2005, 2:04:44 UTC

Tony:

I must agree. Since 4.35, that has been my problem (getting new work) and as a result, I have stayed with 4.32 which has been stable and with a scheduler that downloads work for my machines. I am connected to about 6 or 7 projects with a 1 day cache, for this 4.32 has performed very well.

When the newest DEV versions can do the same, I'll download.

Best luck!

WCK
ID: 112269
Kajunfisher
Volunteer tester
Joined: 29 Mar 05
Posts: 1407
Credit: 126,476
RAC: 0
United States
Message 112275 - Posted: 18 May 2005, 2:12:56 UTC

I thought it was 4.42 having a problem, looks like it's not.

"Over the weekend we had problems generating work. At some point duplicate work units were created and sent out. The validator does not know how to handle duplicate work units. This is the cause of the disappearing work units. We are looking into it."

dlb


David Lee Braun
Predictor@Home

This message can be found at:

http://predictor.scripps.edu/forum_thread.php?id=1652

ID: 112275
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 112281 - Posted: 18 May 2005, 2:22:30 UTC - in response to Message 112208.  
Last modified: 18 May 2005, 2:23:08 UTC

Resource shares are part of the "debt accumulation" so if a project gets a big boost from "panic mode" then BOINC won't even download another WU until the other projects get their share.

And there's the problem.

I don't ask much of the other projects I am attached to, 0.1, 0.2 days at max.
So I should get 1 unit per project, but am getting 2 or 3.

I want to crunch lots of the project I have put in front, so my resource share for that one is way up. It is also the project that has been having problems reaching its deadlines since CC4.35. Then again, since CC4.35 that project has gotten priority under the "earliest deadline" policy in the Shell.

So after all those units got out today, none were reloaded. The project had units enough, it was just that my -71,942 second LTD was keeping units out.

"Nooooh, we first have to crunch these Einstein & Seti units (with deadlines of 23 to 31 May!!), as they have the most positive debt. So go home boy, I won't let you have any new units of your fave project!"

Say again? Since when is BOINC Boss to tell me which project I have to crunch first, when my resource shares say something different?

So I would say, either take out the resource share on the preferences pages, or take out the STD/LTD on the new client, as the two together don't work. With them I can't even get Pirates or LHC units with a short deadline, since "other projects have priority!"...

yeah, right...

P.S: I know LHC is off for about a week.

The part that you're missing here is that while you can't get Pirates or LHC, both Pirates and LHC are accumulating debt, and once the work you have is crunched, priority will go to Pirates and LHC -- to the exclusion of Einstein and SETI. Then it will be their turn again.

If you look at what is in your cache at any given moment, or look at the work short-term, it really does look broken. If you look at what gets done over a several day period, you can start to see the pattern.

During the last SETI outage, my machine crunched a bunch of LHC and E@H. For a few days afterwards, it crunched only SETI until the time balanced.

4.25 didn't do that. It just returned work late.
ID: 112281
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 112288 - Posted: 18 May 2005, 2:32:17 UTC - in response to Message 112210.  

Actually, Tony, you can answer this yourself by observation.

In panic mode, resource share is not respected -- it is, after all, a "panic" to finish work that may run past deadline.

When the panic is over, the other projects (which got "locked out") have accumulated debt, and that debt gets paid back.


So what Seti/Boinc is saying is that if you do work for projects with a 7 day deadline you can't have more than a 1 day cache to have resource share work? If so then what happens in bad times? Especially when 1 day is LESS than a day's work, and outages usually last a day for the outage, then another day for recovery. This is just plain wrong.

That isn't what I said at all.

Let's say you have Einstein and SETI, and let's say 10 days of cache (just to get the worst case). 50% resource share, and reasonable history so that isn't taken into account.

You download 5 days of Einstein and 5 days of SETI, and bang! we're in panic mode because you won't be reporting within the 7 day Einstein limit.

So you crunch Einstein solid for five days -- no SETI at all because Einstein has short deadlines.

Now, the Einstein is done, and SETI is owed 5 days of processing.

So, for the next five days, BOINC will not even download Einstein work -- it is going to do five days of SETI to "pay back" the five days of Einstein.

As BOINC runs SETI, the Einstein debt decreases, and at some point SETI owes Einstein, and some work is downloaded.

BOINC may go right back to Panic Mode, but crunching builds debt that has to be "paid back" to the other projects.

The big difference is that you can't look at the cache and say "hey, I've got 12 hours of SETI and 12 hours of Einstein -- so all is well with the world" any more than you can see Star Wars Episode III by looking at a stack of snapshots your buddy took when he saw the movie.

It all balances, but on a longer timeline.
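
The arithmetic behind the "bang! we're in panic mode" step can be written out. This is a back-of-the-envelope sketch, not the test the client really performs: it assumes a project only ever gets its resource share of one CPU, and the function names are invented for the example.

```python
# Simplified check: does cached work finish before its deadline if the
# project only gets its resource share of the CPU? (Assumed model.)

def wall_clock_days(work_days, resource_share):
    """Days of real time needed to finish `work_days` of CPU work at a given share."""
    return work_days / resource_share


def would_miss_deadline(work_days, resource_share, deadline_days):
    return wall_clock_days(work_days, resource_share) > deadline_days


# The worst case from the post: 5 days of Einstein work, 50% share, 7 day deadline.
print(wall_clock_days(5, 0.5))            # 10.0 days of real time
print(would_miss_deadline(5, 0.5, 7))     # True, so the client has to "panic"

# A much smaller cache: half a day of Einstein work at the same share.
print(would_miss_deadline(0.5, 0.5, 7))   # False
```

Five days of work at half speed needs ten days on the wall clock, which is why the 7 day limit cannot be met without giving Einstein the whole CPU for a while.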

ID: 112288
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 112291 - Posted: 18 May 2005, 2:35:57 UTC

read my last post to understand this one as it's a continuation.

After the last WU finished, it uploaded and the scheduler requested more work from both PPAH and Seti. If I had not been signed on at the second it finished I'd be sitting dead in the water. Why doesn't the scheduler request work when prompted by project update? There was no other work for any other project. Doesn't this kind of defeat the idea of a cache? Doesn't Seti want dial-uppers? Not everyone can be there the second the last one gets sent in.

tony
ID: 112291
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 112292 - Posted: 18 May 2005, 2:44:27 UTC - in response to Message 112243.  

Something I just found in my messages list:
18/05/2005 02:09:34||Computer is overcommitted
18/05/2005 02:09:34||Nearly overcommitted.
18/05/2005 02:09:34||New work fetch policy: no work fetch allowed.
18/05/2005 02:09:34||New CPU scheduler policy: earliest deadline first.

Hello? The next deadline is 2.5 days away!!

Then it switched its mind, without interference:
18/05/2005 03:38:06||schedule_cpus: time 5400.000000
18/05/2005 03:38:06||New work fetch policy: work fetch allowed.
18/05/2005 03:38:06||New CPU scheduler policy: highest debt first.

Great... so now Einstein is running alone again, is it?

I'm thinking about switching back to 4.35 .. at least that one only tried to hunt after the closest deadline.

Actually, 4.35 worked strictly on short-term debt. It didn't really pay attention to deadlines.

What I see here is that BOINC ran for 90 minutes, reviewed the scheduling, and saw that it didn't need to panic any more.
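
The gap between the two policy changes can be read straight from the timestamps. The snippet below is a small sketch that assumes the message format seen in the pasted lines: a day/month/year timestamp, then a project field (empty for client-wide messages), then the text, separated by `|`. The helper name `parse_line` is made up for the example.

```python
from datetime import datetime

# The two policy-change lines quoted above.
LOG = """\
18/05/2005 02:09:34||New CPU scheduler policy: earliest deadline first.
18/05/2005 03:38:06||New CPU scheduler policy: highest debt first.
"""


def parse_line(line):
    """Split 'timestamp|project|message' into (datetime, project, message)."""
    stamp, project, text = line.split("|", 2)
    when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
    return when, project, text


entries = [parse_line(line) for line in LOG.splitlines()]
elapsed = (entries[1][0] - entries[0][0]).total_seconds()
print(elapsed, "seconds, or about", round(elapsed / 60, 1), "minutes")
```

That works out to 5312 seconds, close to the 5400 shown in the `schedule_cpus: time` line.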

As an aside, I've fiddled a bit with the debt numbers just to see how they work, and I'm convinced that if you leave them alone they'll balance out.

If you're really convinced that everything is totally out of whack, stop BOINC and edit the client state file. Set all of the debts to zero. Then try your best to leave it alone and just watch. BoincView makes watching easy.

I've crunched SETI most of the day. Right now, SETI has 0 debt, and Einstein has +90 minutes worth, so I think it's Einstein for the next couple of hours.
ID: 112292
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 112293 - Posted: 18 May 2005, 2:45:42 UTC - in response to Message 112288.  
Last modified: 18 May 2005, 2:48:13 UTC

It all balances, but on a longer timeline.


Yes Ned I agree that's the way it's supposed to work, but I'm seeing different behaviour (or did with 4.35). That's why I want the devs to answer. It would keep downloading PPAH and LHC and crunching those one after another even though Seti was already downloaded. 4.35 was only out a week, so I haven't had the time to confirm it. Also, these newer versions haven't been out long enough to witness it either. I have seen it enter panic mode crunch 5 PPAH, download 5 more PPAH, crunch them, download 5 more PPAH, then switch to normal mode crunch part of ONE seti, then back into panic mode crunching PPAH and LHC. Over time I'll be able to witness how this really affects credit/resource share.

Ned, try setting your cache to 4 days and see what you think. The negative LTD is supposed to stop downloads but doesn't with PPAH and LHC. Again, I will be watching this, and I realize what 4.35 does is not necessarily what 4.42 will do, so I asked the question in an effort to avoid watching for the next week or so.
ID: 112293
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 112294 - Posted: 18 May 2005, 2:48:40 UTC - in response to Message 112291.  

read my last post to understand this one as it's a continuation.

After the last WU finished, it uploaded and the scheduler requested more work from both PPAH and Seti. If I had not been signed on at the second it finished I'd be sitting dead in the water. Why doesn't the scheduler request work when prompted by project update? There was no other work for any other project. Doesn't this kind of defeat the idea of a cache? Doesn't Seti want dial-uppers? Not everyone can be there the second the last one gets sent in.

tony

Tony,

The new scheduler is a work in progress, and from what I've read they're aware that it isn't the best on dialup.

... and there certainly are issues with BOINC controlling a modem connection.

At this time, the stable release is probably better for modem connections, or you can keep running 4.42 and reporting so the folks working on it can see how it looks from where you are.

-- Ned
ID: 112294
The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1904
Credit: 2,646,654
RAC: 0
Australia
Message 112295 - Posted: 18 May 2005, 2:49:31 UTC

So what's the point of the preference "switch between applications every" if BOINC doesn't respect it?
ID: 112295
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 112300 - Posted: 18 May 2005, 2:56:25 UTC - in response to Message 112294.  

At this time, the stable release is probably better for modem connections, or you can keep running 4.42 and reporting so the folks working on it can see how it looks from where you are.

-- Ned


Ned, I know it's not the best. I want them to be aware of issues so they can fix them. If I post problems here, and someone (or many someones) sees it and responds with similar issues, then it's more likely that Berkeley will realize there is a problem and FIX the problem. I don't like software telling me that I can only have less than a day's work and still share my CPU cycles with the projects the way I choose to do so.

I haven't seen any recognition of this problem by devs or programmers. I'd like them to say it.

Ned I know you're trying to help, and thank you for it.

tony
ID: 112300
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 112302 - Posted: 18 May 2005, 2:57:01 UTC - in response to Message 112295.  

So what's the point of the preference "switch between applications every" if BOINC doesn't respect it?

Let's say you have SETI at 90% and E@H at 10% in your project preferences.

If we have "switch every 60 minutes" your statement is that BOINC should run SETI for an hour and then switch to Einstein. 60 minutes later it should respect the "switch every" setting and go back to SETI.

"Switch every 60 minutes" means that BOINC will check every 60 minutes and see if it should switch. If SETI is running, it will stay with SETI for 9 hours before it runs Einstein for 1 hour.

In 4.4x, the decision will be reviewed every 60 minutes, but if Einstein owes a lot to SETI then it will go much longer than 9 hours.

But, on average, SETI will get 9 hours for every 1 hour of Einstein, even if Einstein runs for 10 hours straight because of approaching deadlines.
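
A toy simulation shows how a once-per-period check still ends up at a 9:1 split. It is only a sketch under assumptions: the client is modelled as picking the project with the highest debt at each check, deadlines are ignored, ties go to the larger share, and `simulate` is an invented name. The order in which the hours fall out will not match the real client; only the ratio is the point.

```python
# Toy model of "switch between applications every N minutes".
# Debts are kept as integers (percent * minutes) to avoid rounding noise.

def simulate(share_pct, period_min, periods):
    """Count how many periods each project runs under a highest-debt-first rule."""
    debts = {name: 0 for name in share_pct}
    slots = {name: 0 for name in share_pct}
    for _ in range(periods):
        # Highest debt wins; on a tie the project with the larger share goes first.
        chosen = max(debts, key=lambda name: (debts[name], share_pct[name]))
        for name in debts:
            debts[name] += share_pct[name] * period_min   # everyone earns their share
        debts[chosen] -= 100 * period_min                 # the runner pays for the period
        slots[chosen] += 1
    return slots


print(simulate({"SETI": 90, "Einstein": 10}, period_min=60, periods=10))
```

Over ten one-hour periods this gives SETI nine slots and Einstein one, and the same ratio holds over any multiple of ten periods.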
ID: 112302


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.