Project Status - 08/26/2005 4pm PST
Author | Message |
---|---|
Mike Send message Joined: 17 Feb 01 Posts: 34255 Credit: 79,922,639 RAC: 80 |
Hi. You are not alone. They are working on it. Maybe they will open the door for a few hours. greetz Mike With each crime and every kindness we birth our future. |
Blanckaert Send message Joined: 30 May 99 Posts: 15 Credit: 393,159 RAC: 14 |
Rom, I'm sitting here with 685 work units waiting to upload, some have drifted past due by a day or two, many others are getting close. Will Berkeley take into account the down time and allow these work units as finished on time or am I just generating heat at a time when CALISO would prefer that I use my "appliances" sparingly? I'm guessing they will still accept them... I know if a quorum ain't reached and a late WU shows up and meets the quorum it'll be accepted... (But I'm sure someone will correct me...) Mark |
lynxtra Send message Joined: 3 Sep 04 Posts: 137 Credit: 273,636 RAC: 0 |
Is this why I am not getting any work units at the moment? How long is this situation going to last? |
KWSN - MajorKong Send message Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0 |
I don't think anyone, not even the Devs, knows for sure. Short answer (without meaning to be a smartass) is "until it isn't." Of course, everyone hopes it is sooner rather than later. https://youtu.be/iY57ErBkFFE #Texit Don't blame me, I voted for Johnson(L) in 2016. Truth is dangerous... especially when it challenges those in power. |
gomeyer Send message Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
Murphy is alive and well and living at my house. While we are all waiting for these latest issues to be resolved, as with most people, all of my crunchers ran out of work. No problem, you say? The results will eventually be returned, credited, and added to the science db? Well, not quite. Murphy couldn't resist this golden opportunity to totally trash the hard disk on one of my faster machines. Even booting from the Windoz distribution CD and going into Console mode, I am unable to overwrite the files that Windoz complains about while trying to boot. I couldn't even rename or delete them and try again to copy new files. Finally, "Chkdsk /R" reports "one or more unrecoverable errors". And of course, worst of all, Console mode won't even read the BOINC directory to save those files off to a flash drive. Can you say Toast? So now, in addition to losing a full queue's worth of new workunits (bad enough but not tragic), I've also lost almost 4 days' worth of crunching. Disaster? End of the world? Hell no, it just sucks. Oh well, life's a bitch and then you die. Guess I'm on my way out the door to buy a new drive. This is NOT aimed at anyone in particular, but my message is: Don't complain, this could be worse. |
[B^S] Spydermb Send message Joined: 16 Jul 99 Posts: 496 Credit: 10,860,148 RAC: 0 |
Just added to Front Page News: August 27, 2005 Outage Update: We have been offline since Tuesday, August 23 and plan to stay offline until after the weekend. This is because of a disk failure as well as an unwieldy backlog of results to validate and delete. We apologize for the inconvenience, but credit that has been pending for weeks is now being granted. We want to be as "caught up" as possible before coming back on line. The process of catching up is taking much longer than expected. More information in Technical News. O Well, Enjoy the weekend. BOINC SYNERGY is an International Team and We Welcome All BOINC Participants! BOINC Synergy Click to Join BOINC Synergy |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
Hi folks ... are there any plans to add new/better HW to the existing one? There were many problems in the availability of the system all year long, and you people at SETI are giving your best to keep going ... Sometimes, more hardware is not the solution. The problem is the sheer number of files that have accumulated for various reasons: the "antique" work units mentioned in the technical news, a backlog caused by work on the science database (the back-end we don't see), etc. I can think of at least two ways to avoid the large number of files, and others have come up with additional ideas. My personal favorite: put instructions in the master fetch file telling the clients to upload/download/report more slowly (or more quickly). If the validators/assimilators/deleters start getting full, slow down the BOINC clients. (My favorite algorithm is p-Persistence -- see "Computer Networks" by Andrew Tanenbaum). |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
Just added to Front Page News Note that this was updated today, Saturday. Earlier in this thread Rom gave us his update, and it sounds like they are going to be turning processes off and on all weekend to "coax" the work through the system -- giving the validators unrestricted access, then giving the transitioners or assimilators or deleters unrestricted access. Less competition between processes should be faster. |
[B^S] Spydermb Send message Joined: 16 Jul 99 Posts: 496 Credit: 10,860,148 RAC: 0 |
Note that this was updated today, Saturday. Earlier in this thread Rom gave us his update, and it sounds like they are going to be turning processes off and on all weekend to "coax" the work through the system -- giving the validators unrestricted access, then giving the transitioners or assimilators or deleters unrestricted access. Thanks Ned, I missed that information... BOINC SYNERGY is an International Team and We Welcome All BOINC Participants! BOINC Synergy Click to Join BOINC Synergy |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
Note that this was updated today, Saturday. Earlier in this thread Rom gave us his update, and it sounds like they are going to be turning processes off and on all weekend to "coax" the work through the system -- giving the validators unrestricted access, then giving the transitioners or assimilators or deleters unrestricted access. For something like this, you don't need someone camping out in the office. You do something, you go do something else for a few hours, then you come back and see what happened. If it was good, you either leave it running, or maybe change the mix a little to see if you can get more speed, then come back again in a few hours and see what happened. It looks like someone is doing just that -- and for the most part, doing it through SSH or something similar. |
BarryAZ Send message Joined: 1 Apr 01 Posts: 2580 Credit: 16,982,517 RAC: 0 |
I may be wrong, but I believe this is turning out to be the longest sustained outage of the year, so one would expect (especially after what has been a troubled two months here) to find folks at the edge of their 'patience is a virtue' quota. I am also sure the folks at Berkeley are doing all that they are capable of to get this project going again. Realizing that, I think we all should cut the Berkeley folks some slack by not tossing stones at them -- I suspect they are at least as close to the edge of their patience quota as many of us. |
dre Send message Joined: 17 May 99 Posts: 26 Credit: 429,124 RAC: 0 |
The validator really seems to be working now. In the last 15 minutes the waiting for validation queue has gone down by almost 18,000 WUs. Let's hope the worst is over and that the hardware will now be able to keep up with the demands placed on it. |
Blablablabla Send message Joined: 23 Jan 05 Posts: 1 Credit: 9,117,463 RAC: 0 |
Just a thought: It seems the problems are related to pending credit, right??? When 4 results are returned before the deadline, the credit is calculated through some formula (that I just cannot figure out...). So credit is pending until the fourth result is in. What puzzles me is that when a result returns past the deadline, you throw away the formula and grant credit according to an "in-the-middle" principle (results: 12-23-20, granted: 20). Why not throw away the formula and use the "in-the-middle" crediting? When the fourth result is in before the deadline, it is granted the same credit as the other three. This would mean: - 3 results are enough to grant credit - no more calculation of credit, just sort 3 results and grant #2. As I said.....just a thought. BTW: I just love a regular update. |
KWSN - MajorKong Send message Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0 |
Just a thought: To the best of my understanding, that is pretty much the way things work now. The only time the 'formula' is used is when the 4th result comes back before the first three get validated. Then, the high and low are discarded, and the 2 remaining ones are averaged. Of course, this all assumes that all of the returns pass validation. https://youtu.be/iY57ErBkFFE #Texit Don't blame me, I voted for Johnson(L) in 2016. Truth is dangerous... especially when it challenges those in power. |
Blanckaert Send message Joined: 30 May 99 Posts: 15 Credit: 393,159 RAC: 14 |
Well, something is slowly working... From the Stat pages at 1610UTC and 1710UTC... the Ready to Send is down 3,088 In Progress is down 958 Waiting Validation is down 19,178 So with luck maybe the system will be cleared by Monday, and hardware to replace the Disk failure will be in place and maybe have a good run next week... (well after the system attempts to catch up from the backlog that is waiting, with ppl ready to hold down their Enter keys....) *gone to get a beer, and wait for the flames that may or may not come...* *and make some popcorn....* **EDIT** From the Stat pages at 1710UTC and 1730UTC... the Ready to Send is down 2,691 In Progress is down 656 Waiting Validation is down 14,294 So it is coming down.... slowly... but hey it's progress.... **End Edit** Mark |
chrisjohnston Send message Joined: 31 Aug 99 Posts: 385 Credit: 91,410 RAC: 0 |
Why not throw away formula and use the "in-the-middle" crediting. When the fourth result is in before deadline it is granted the same credit as the other three. Isn't that how it works? If the three are similar enough, then it grants credit on 3, if there is a problem, it takes the fourth. - cJ |
1mp0£173 Send message Joined: 3 Apr 99 Posts: 8423 Credit: 356,897 RAC: 0 |
Just a thought: That's pretty close to what it does now: When there are three results, you toss the highest and lowest, and that's it, the work is done, and the work can be transitioned/deleted. When there are more than three, the highest and lowest are tossed, and the granted credit is the average of the remaining work. I don't think the problem is caused by the normal validation of work. It seems that the problem is caused by the abnormal accumulation of results. |
xx Send message Joined: 23 May 99 Posts: 166 Credit: 3,450,910 RAC: 0 |
If the scheduler is off, how is it possible that the "ready to send" queue has decreased? |
Blanckaert Send message Joined: 30 May 99 Posts: 15 Credit: 393,159 RAC: 14 |
If the scheduler is off, how is it possible that the "ready to send" queue has decreased? Because as it validates WUs it may find that having to send a fifth or sixth unit isn't needed anymore since it now has it. (or it may have had the 7th or 8th result to send, which it didn't need as the 5th or 6th was in the Waiting to Validate queue) Mark |
Sir Ulli Send message Joined: 21 Oct 99 Posts: 2246 Credit: 6,136,250 RAC: 0 |
btw my credit is climbing.. :) since my last contact to the Berkeley Servers I got 1.000 new CS, thanks to the Firefox Plugin...to notice this. Greetings from Germany NRW Ulli |
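
Ned's suggestion earlier in the thread (let the server tell clients to slow down, using p-persistence) could look roughly like the sketch below. This is only an illustration under stated assumptions: the function names, the backlog-to-probability rule, and the 0.05 floor are all hypothetical and are not part of BOINC. The only piece taken from the textbook technique is the p-persistent rule itself: in each time slot, act with probability p, otherwise defer to the next slot.

```python
import random


def persistence_from_backlog(backlog, capacity, floor=0.05):
    """Hypothetical server-side rule: p shrinks as the backlog fills up.

    backlog and capacity are counts of queued results; floor keeps p
    above zero so that clients are slowed down but never locked out.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    fill = min(max(backlog / capacity, 0.0), 1.0)
    return max(floor, 1.0 - fill)


def slots_until_contact(p, rng, max_slots=10_000):
    """Client side of p-persistence: in each slot, contact the server
    with probability p, otherwise wait for the next slot."""
    for slot in range(1, max_slots + 1):
        if rng.random() < p:
            return slot
    return max_slots


if __name__ == "__main__":
    rng = random.Random(2005)
    for backlog in (0, 500_000, 900_000):
        p = persistence_from_backlog(backlog, capacity=1_000_000)
        waits = [slots_until_contact(p, rng) for _ in range(5_000)]
        print(f"backlog={backlog:>7}  p={p:.2f}  "
              f"mean wait={sum(waits) / len(waits):.1f} slots")
```

With this rule the average wait is about 1/p slots, so a server that publishes a smaller p spreads client contacts out over time without refusing anyone outright.
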
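
The credit rule as several posters describe it (toss the highest and lowest claims, then grant the middle one, or the average of what remains when there are more than three) can be written down in a few lines. This is a sketch of the rule as stated in this thread, not the project's actual validator code; the function name and the minimum of three claims are assumptions made for the example.

```python
def granted_credit(claims):
    """Drop the highest and lowest claimed credit, average the rest.

    With exactly three claims this is the "in-the-middle" value;
    with four or more it is the mean of the remaining claims.
    """
    if len(claims) < 3:
        raise ValueError("this sketch assumes at least three claims")
    kept = sorted(claims)[1:-1]  # discard one lowest and one highest
    return sum(kept) / len(kept)


if __name__ == "__main__":
    print(granted_credit([12, 23, 20]))      # the example from the thread
    print(granted_credit([12, 23, 20, 18]))  # hypothetical fourth claim
```

For the thread's example of claims 12, 23 and 20, the sketch grants 20, which matches the "in-the-middle" result quoted above. The four-claim input is made up purely to show the averaging case.
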
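
Mark's two status-page snapshots can be turned into drain rates to see whether validation is speeding up. The figures below are the ones he posted; the helper name and the HHMM parsing are assumptions for this example, and the sketch only handles two times on the same UTC day. Because the total size of the Waiting Validation queue is not given in the thread, no time-to-empty estimate is attempted.

```python
def drain_rate_per_minute(start_hhmm, end_hhmm, decrease):
    """Average queue decrease per minute between two same-day UTC times
    written as HHMM strings, e.g. "1610"."""
    def to_minutes(hhmm):
        return int(hhmm[:2]) * 60 + int(hhmm[2:])

    elapsed = to_minutes(end_hhmm) - to_minutes(start_hhmm)
    if elapsed <= 0:
        raise ValueError("end time must be later than start time")
    return decrease / elapsed


if __name__ == "__main__":
    first = drain_rate_per_minute("1610", "1710", 19_178)
    second = drain_rate_per_minute("1710", "1730", 14_294)
    print(f"1610-1710 UTC: {first:.1f} results/min")
    print(f"1710-1730 UTC: {second:.1f} results/min")
    print(f"speed-up: {second / first:.2f}x")
```

On those numbers the Waiting Validation queue fell by roughly 320 results per minute in the first hour and roughly 715 per minute in the following twenty minutes, so the validators were clearing work a little more than twice as fast in the second window.
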
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.