4.42 has been posted.
Author | Message |
---|---|
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
Just my rant continuing........ I agree on all counts. Since 4.35 I've been asking (on the forums) if resource share was respected in "panic" mode, but have NEVER got an answer. I see Rom and JM7 respond to questions before and after mine. They leave mine alone. I'm wondering if it's because of the uproar the real answer would cause. PS setting your forum preference to "don't show Signatures" will fix your scroll issue. Rob is working on it. tony |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"I agree on all counts. Since 4.35 I've been asking (on the forums) if resource share was respected in 'panic' mode, but have NEVER got an answer. I see Rom and JM7 respond to questions before and after mine. They leave mine alone. I'm wondering if it's because of the uproar the real answer would cause." Actually, Tony, you can answer this yourself by observation. In panic mode, resource share is not respected -- it is after a "panic" to finish work that may run past deadline. When the panic is over, the other projects (which got "locked out") have accumulated debt, and that debt gets paid back. Resource shares are part of the "debt accumulation" so if a project gets a big boost from "panic mode" then BOINC won't even download another WU until the other projects get their share. |
ilyanep Joined: 16 Nov 04 Posts: 90 Credit: 3,172,949 RAC: 9 |
Oh and come on, why does BOINC have to wait so long for a scheduler reply? Can't it do anything in between, like checking other projects? It shouldn't take a minute for a scheduler to respond. For me it takes 10-35 seconds. |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"Resource shares are part of the 'debt accumulation' so if a project gets a big boost from 'panic mode' then BOINC won't even download another WU until the other projects get their share." And there's the problem. I don't ask much of the other projects I am attached to, 0.1 or 0.2 days at most. So I should get 1 unit per project, but am getting 2 or 3. I want to crunch lots of the project I have put in front, so my resource share for that one is way up. It is also the project that has been having problems with reaching the deadline since CC4.35. Then again, since CC4.35 that project has gotten priority upon the "earliest deadline" in the Shell. So after all those units got out today, none were reloaded. The project had units enough, it was just that my -71,942 second LTD was keeping units out. "Nooooh, we first have to crunch these Einstein & Seti units (with deadlines of 23 to 31 May!!), as they have the most positive debt. So go home boy, I won't let you have any new units of your fave project!" Say again? Since when is BOINC Boss to tell me which project I have to crunch first, when my resource shares say something different? So I would say, either take out the resource share on the preferences pages, or take out the STD/LTD on the new client, as the two together don't work. With them I can't even get Pirates or LHC units with a short deadline, since "other projects have priority!"... yeah, right... P.S.: I know LHC is off for about a week. |
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
"Actually, Tony, you can answer this yourself by observation." Hi Ned, my slow puter don't do many WU/day, and with the switching from previous clients to new ones, I can't count on what it's doing now to be definitive. I of course believe that resource share is NOT honored in "panic" mode, and that's why I'd like it answered. So what Seti/Boinc is saying is that if you do work for projects with a 7 day deadline you can't have more than a 1 day cache if you want resource share to work? If so, then what happens in bad times? Especially when 1 day is LESS than a day's work, and outages usually last a day, plus another day for recovery. This is just plain wrong. I have seen it myself with 4.35 (the only version that lasted long enough to start to see patterns) that PPAH and Einstein were getting a lot more of my CPU cycles than Seti was. As for it not downloading more work with a negative long term debt, I have seen both LHC and PPAH download more, one after another, until Seti was in panic mode. Seeing this additional downloading is why I feel we need an answer from the Devs on this. Have a nice nite Ned. tony |
MJKelleher Joined: 1 Jul 99 Posts: 2048 Credit: 1,575,401 RAC: 0 |
For the folk who are having issues with long-term debts overriding current resource share allocations, there might be a setting to re-set the debt loads. I manually re-set all the debt numbers in client_state.xml to 0.000000 when I switched to 4.40. Since all my projects have work at the moment, the load has been pretty well distributed the way I set it (40 S@H-40 E@H-10 Protein-10 CPDN, connect every .25 days). I'll see what happens when somebody goes down for a while. |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
I know how to do it, MJ, but it still isn't the right way to do it. No one should fiddle around with his or her client_state.xml file!! Heck, the thing was somewhat invented to go against cheating. If we can cheat our applications into thinking they need more work, then what's next? On the other hand, if BOINC thinks my application that has the highest priority (in my eyes) doesn't need any work, then hell yes! I will edit the file. :) Something I just found in my messages list: 18/05/2005 02:09:34||Computer is overcommitted 18/05/2005 02:09:34||Nearly overcommitted. 18/05/2005 02:09:34||New work fetch policy: no work fetch allowed. 18/05/2005 02:09:34||New CPU scheduler policy: earliest deadline first. Hello? The next deadline is 2.5 days away!! Then it changed its mind, without interference: 18/05/2005 03:38:06||schedule_cpus: time 5400.000000 18/05/2005 03:38:06||New work fetch policy: work fetch allowed. 18/05/2005 03:38:06||New CPU scheduler policy: highest debt first. Great... so now Einstein is running alone again, is it? I'm thinking about switching back to 4.35... at least that one only tried to hunt after the closest deadline. |
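
For anyone who would rather measure how long the client stayed in each mode than eyeball it, here is a minimal Python sketch that parses the message lines quoted in the post above. It assumes the format is `DD/MM/YYYY HH:MM:SS|project|message` with an empty project field, which is inferred from these lines only. `parse_line` and `policy_changes` are illustrative names, not part of BOINC.

```python
from datetime import datetime

# The message lines quoted in the post, one per line.
LOG = """\
18/05/2005 02:09:34||Computer is overcommitted
18/05/2005 02:09:34||Nearly overcommitted.
18/05/2005 02:09:34||New work fetch policy: no work fetch allowed.
18/05/2005 02:09:34||New CPU scheduler policy: earliest deadline first.
18/05/2005 03:38:06||schedule_cpus: time 5400.000000
18/05/2005 03:38:06||New work fetch policy: work fetch allowed.
18/05/2005 03:38:06||New CPU scheduler policy: highest debt first.
"""


def parse_line(line):
    """Split one message line into (timestamp, project, message).

    Assumption: fields are separated by '|' and the date is day-first.
    """
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def policy_changes(log_text):
    """Return (timestamp, policy) for every CPU scheduler policy message."""
    prefix = "New CPU scheduler policy: "
    changes = []
    for line in log_text.splitlines():
        when, _project, message = parse_line(line)
        if message.startswith(prefix):
            changes.append((when, message[len(prefix):].rstrip(".")))
    return changes


changes = policy_changes(LOG)
panic_seconds = (changes[1][0] - changes[0][0]).total_seconds()
print(changes[0][1], "->", changes[1][1], "after", panic_seconds / 60, "minutes")
```

On these lines the "earliest deadline first" period lasted roughly 88.5 minutes, which is close to the 5400 seconds shown in the `schedule_cpus` message.
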
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
I'm using a dial-up connection. I'm also now on my "office" puter, vs the laptop I used for my previous posts. I'm using WinXP SP2 and BOINC 4.42. I switched yesterday to the new 4.42. I got work ("connect to" set at 2 days). I finished all my PPAH and all my Seti but one, and that one will be finished in 7 minutes. I signed on and uploaded my work via Retry Now. Then I reported the work via Project update and got these messages: 5/17/2005 9:32:55 PM|ProteinPredictorAtHome|Sending scheduler request to http://predictor.scripps.edu/predictor_cgi/cgi 5/17/2005 9:32:58 PM|ProteinPredictorAtHome|Scheduler request to http://predictor.scripps.edu/predictor_cgi/cgi succeeded 5/17/2005 9:33:45 PM|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi 5/17/2005 9:33:47 PM|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded Notice the time. As of 9:51 it still hasn't requested any new work. I'll be out of work in minutes. Just how long do dial-up people need to stay connected to get work? Anyone else not getting new work? tony |
MJKelleher Joined: 1 Jul 99 Posts: 2048 Credit: 1,575,401 RAC: 0 |
"I know how to do it, MJ, but it still isn't the right way to do. No one should fiddle around with his or her client_state.xml file!!" I don't think of it as cheating, though I agree it isn't the right way to do it. But the system lets us suspend activity on a project, or a work unit, at any time. Why shouldn't we be able to tell our system to start the balancing again from scratch? Ideally, and maybe this could be put on a wish list for future versions, there could be a button on the Projects tab to reset the debt load and start the calculations over again. |
wckesq Joined: 17 May 99 Posts: 15 Credit: 246,704 RAC: 0 |
Tony: I must agree. Since 4.35, that has been my problem (getting new work) and as a result, I have stayed with 4.32, which has been stable and has a scheduler that downloads work for my machines. I am connected to about 6 or 7 projects with a 1 day cache, and for this 4.32 has performed very well. When the newest DEV versions can do the same, I'll download. Best luck! WCK |
Kajunfisher Joined: 29 Mar 05 Posts: 1407 Credit: 126,476 RAC: 0 |
I thought it was 4.42 having a problem, looks like it's not. "Over the weekend we had problems generating work. At some point duplicate work units were created and sent out. The validator does not know how to handle duplicate work units. This is the cause of the disappearing work units. We are looking into it." dlb David Lee Braun Predictor@Home This message can be found at: http://predictor.scripps.edu/forum_thread.php?id=1652 |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"Resource shares are part of the 'debt accumulation' so if a project gets a big boost from 'panic mode' then BOINC won't even download another WU until the other projects get their share." The part that you're missing here is that while you can't get Pirates or LHC, both Pirates and LHC are accumulating debt, and once the work you have is crunched, priority will go to Pirates and LHC -- to the exclusion of Einstein and SETI. Then it will be their turn again. If you look at what is in your cache at any given moment, or look at the work short-term, it really does look broken. If you look at what gets done over a several day period, you can start to see the pattern. During the last SETI outage, my machine crunched a bunch of LHC and E@H. For a few days afterwards, it crunched only SETI until the time balanced. 4.25 didn't do that. It just returned work late. |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"Actually, Tony, you can answer this yourself by observation." That isn't what I said at all. Let's say you have Einstein and SETI, and let's say 10 days of cache (just to get the worst case). 50% resource share, and reasonable history so that isn't taken into account. You download 5 days of Einstein and 5 days of SETI, and bang! we're in panic mode because you won't be reporting within the 7 day Einstein limit. So you crunch Einstein solid for five days -- no SETI at all because Einstein has short deadlines. Now, the Einstein is done, and SETI is owed 5 days of processing. So, for the next five days, BOINC will not even download Einstein work -- it is going to do five days of SETI to "pay back" the five days of Einstein. As BOINC runs SETI, the Einstein debt decreases, and at some point SETI owes Einstein, and some work is downloaded. BOINC may go right back to Panic Mode, but crunching builds debt that has to be "paid back" to the other projects. The big difference is that you can't look at the cache and say "hey, I've got 12 hours of SETI and 12 hours of Einstein -- so all is well with the world" any more than you can see Star Wars Episode III by looking at a stack of snapshots your buddy took when he saw the movie. It all balances, but on a longer timeline. |
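
The five-days-on, five-days-back example above can be played out with a toy model. This is only a sketch under stated assumptions, not the actual BOINC scheduler: each simulated day every project is credited its resource share, the project that ran is charged one full day, and outside panic mode the project with the highest debt runs. The function name `simulate` and its parameters are made up for illustration.

```python
def simulate(shares, panic, days):
    """Toy day-by-day debt model (an assumption, not BOINC's real code).

    shares: project name -> fraction of CPU time it should get
    panic:  projects forced to run on the first len(panic) days
    days:   total number of days to simulate
    """
    debt = {name: 0.0 for name in shares}
    schedule = []
    for day in range(days):
        if day < len(panic):
            running = panic[day]              # deadline pressure wins
        else:
            running = max(debt, key=debt.get)  # highest debt first
        for name, share in shares.items():
            debt[name] += share               # what each project is owed
        debt[running] -= 1.0                  # what the running one received
        schedule.append(running)
    return schedule, debt


schedule, debt = simulate(
    {"SETI": 0.5, "Einstein": 0.5},
    panic=["Einstein"] * 5,   # five days of Einstein to beat its deadline
    days=10,
)
print(schedule)
print(debt)
```

In this model the five forced Einstein days leave SETI owed 2.5 days, and the next five days all go to SETI before both debts are back at zero, which is the "longer timeline" balance described in the post.
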
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
Read my last post to understand this one, as it's a continuation. After the last WU finished, it uploaded and the scheduler requested more work from both PPAH and Seti. If I had not been signed on at the second it finished I'd be sitting dead in the water. Why doesn't the scheduler request work when prompted by project update? There was no other work for any other project. Doesn't this kind of defeat the idea of a cache? Doesn't Seti want dial-uppers? Not everyone can be there the second the last one gets sent in. tony |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"Something I just found in my messages list:" Actually, 4.35 worked strictly on short-term debt. It didn't really pay attention to deadlines. What I see here is that BOINC ran for 90 minutes, reviewed the scheduling, and saw that it didn't need to panic any more. As an aside, I've fiddled a bit with the debt numbers just to see how they work, and I'm convinced that if you leave them alone they'll balance out. If you're really convinced that everything is totally out of whack, stop BOINC and edit the client state file. Set all of the debts to zero. Then try your best to leave it alone and just watch. BoincView makes watching easy. I've crunched SETI most of the day. Right now, SETI has 0 debt, and Einstein has +90 minutes worth, so I think it's Einstein for the next couple of hours. |
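
If you do decide to zero the debts, the edit itself is a simple text substitution. The sketch below works on a sample string rather than a real file. The element names in `SAMPLE` (`short_term_debt`, `long_term_debt`, `project_name`) are assumptions about what client_state.xml contains, so check your own file before relying on them. `zero_debts` is a hypothetical helper, not a BOINC tool.

```python
import re

# Hypothetical excerpt; the element names are an assumption, check your own file.
SAMPLE = """\
<project>
    <project_name>ExampleProject</project_name>
    <short_term_debt>5400.000000</short_term_debt>
    <long_term_debt>-71942.000000</long_term_debt>
</project>
"""

# Matches any element whose tag name ends in "debt" and captures its
# opening and closing tags so only the value in between is replaced.
DEBT_ELEMENT = re.compile(r"(<(\w*debt)>)[^<]*(</\2>)")


def zero_debts(xml_text):
    """Return xml_text with every *debt element set to 0.000000."""
    return DEBT_ELEMENT.sub(r"\g<1>0.000000\g<3>", xml_text)


cleaned = zero_debts(SAMPLE)
print(cleaned)
```

As the post says, stop BOINC before touching the file. Keeping a copy of the original first is a sensible precaution.
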
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
"It all balances, but on a longer timeline." Yes Ned, I agree that's the way it's supposed to work, but I'm seeing different behaviour (or did with 4.35). That's why I want the devs to answer. It would keep downloading PPAH and LHC and crunching those one after another even though Seti was already downloaded. 4.35 was only out a week, so I haven't had the time to confirm it. Also, these newer versions haven't been out long enough to witness it either. I have seen it enter panic mode, crunch 5 PPAH, download 5 more PPAH, crunch them, download 5 more PPAH, then switch to normal mode, crunch part of ONE Seti, then go back into panic mode crunching PPAH and LHC. Over time I'll be able to witness how this really affects credit/resource share. Ned, try setting your cache to 4 days and see what you think. The negative LTD is supposed to stop downloads but doesn't with PPAH and LHC. Again, I will be watching this, and I realize what 4.35 does is not necessarily what 4.42 will do, so I asked the question in an effort to avoid watching for the next week or so. |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"read my last post to understand this one as it's a continuation." Tony, the new scheduler is a work in progress, and from what I've read they're aware that it isn't the best on dialup... and there certainly are issues with BOINC controlling a modem connection. At this time, the stable release is probably better for modem connections, or you can keep running 4.42 and reporting so the folks working on it can see how it looks from where you are. -- Ned |
The Gas Giant Joined: 22 Nov 01 Posts: 1904 Credit: 2,646,654 RAC: 0 |
So what's the point of the preference "switch between applications every" if BOINC doesn't respect it? |
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
"At this time, the stable release is probably better for modem connections, or you can keep running 4.42 and reporting so the folks working on it can see how it looks from where you are." Ned, I know it's not the best. I want them to be aware of issues so they can fix them. If I post problems here, and someone (or many someones) sees it and responds with similar issues, then it's more likely that Berkeley will realize there is a problem and FIX the problem. I don't like software telling me that I can only have less than a day's work and still share my CPU cycles with the projects the way I choose to do so. I haven't seen any recognition of this problem by devs or programmers. I'd like them to say it. Ned, I know you're trying to help, and thank you for it. tony |
1mp0£173 Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
"So what's the point of the preference 'switch between applications every' if BOINC doesn't respect it?" Let's say you have SETI at 90% and E@H at 10% in your project preferences. If we have "switch every 60 minutes" your statement is that BOINC should run SETI for an hour and then switch to Einstein. 60 minutes later it should respect the "switch every" setting and go back to SETI. "Switch every 60 minutes" means that BOINC will check every 60 minutes and see if it should switch. If SETI is running, it will stay with SETI for 9 hours before it runs Einstein for 1 hour. In 4.4x, the decision will be reviewed every 60 minutes, but if Einstein owes a lot to SETI then it will go much longer than 9 hours. But, on average, SETI will get 9 hours for every 1 hour of Einstein, even if Einstein runs for 10 hours straight because of approaching deadlines. |
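
To put numbers on the 90/10 case, here is a small sketch that treats "switch every 60 minutes" as an hourly check. It is a toy model under assumptions, not BOINC's real scheduler: at each check every project is credited its share in percent, the project that ran is charged 100, and the project with the highest debt runs next unless a deadline forces another one. `run_hours` and its parameters are illustrative names only.

```python
from collections import Counter


def run_hours(shares_percent, hours, forced=()):
    """Count hours run per project over a number of hourly checks.

    shares_percent: project name -> resource share in percent (sums to 100)
    forced:         projects that must run during the first len(forced) hours,
                    standing in for deadline pressure
    """
    debt = {name: 0 for name in shares_percent}  # integers avoid rounding drift
    ran = Counter()
    for hour in range(hours):
        if hour < len(forced):
            running = forced[hour]
        else:
            running = max(debt, key=debt.get)   # the hourly "should I switch?"
        for name, share in shares_percent.items():
            debt[name] += share
        debt[running] -= 100
        ran[running] += 1
    return ran, debt


SHARES = {"SETI": 90, "Einstein": 10}
normal, _ = run_hours(SHARES, 100)
after_panic, final_debt = run_hours(SHARES, 100, forced=["Einstein"] * 10)
print(normal, after_panic, final_debt)
```

The order of the hours in this toy model will not match a real client, but the totals illustrate the point of the post: over 100 hours SETI gets 90 and Einstein 10, both in the normal run and in the run where Einstein first takes 10 hours straight.
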
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.