Panic Mode On (33) Server problems
Author | Message |
---|---|
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Oh Boy, do you have some reading ahead of you :-) (and commiserations on the Hayfever) And hitting the 'ignore' button is gonna fix things, eh? Nice attitude. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
W-K 666 Send message Joined: 18 May 99 Posts: 18996 Credit: 40,757,560 RAC: 67 |
According to the "in Progress" page there are now 203 tasks for my quad after it downloaded 5 CUDA tasks; that includes 4 AP tasks being processed and 6 waiting. But I still cannot get AP tasks for the CPU; if no more are received today, the CPU will go cold at ~03:00 tomorrow morning. ASAP after that it will be switched off. |
Terror Australis Send message Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44 |
BOINC will always ask for so many seconds of work. Even if 1 second of work, you get a task that's more than 1 second long. If asking for 113426 seconds of work and this is slightly more than quota, you'll get it. The BOINC client used to show how many seconds of work were being requested in the Messages tab. This disappeared somewhere in the early 6.10.xx versions. Any idea why? Brodo |
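The point about requests being made in seconds while work arrives as whole tasks can be illustrated with a minimal sketch. This is not the actual scheduler logic: `fill_request` and the task durations are hypothetical, and quota limits are ignored entirely.

```python
def fill_request(requested_seconds, task_durations):
    """Hand out whole tasks until the requested seconds are covered.

    Hypothetical illustration only: tasks cannot be split, so even a
    1-second request is answered with at least one full task.
    """
    sent = []
    covered = 0.0
    for duration in task_durations:
        if covered >= requested_seconds:
            break
        sent.append(duration)
        covered += duration
    return sent


# A 1-second request still yields one whole (here, one-hour) task.
print(fill_request(1, [3600, 3600]))
# A 113426-second request is rounded up to the next whole task.
print(len(fill_request(113426, [40000] * 5)))
```

Under these assumptions the host always receives at least as many seconds as it asked for, which is why the requested figure is useful to see in the log.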
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
BOINC will always ask for so many seconds of work. Even if 1 second of work, you get a task that's more than 1 second long. If asking for 113426 seconds of work and this is slightly more than quota, you'll get it. Because they tried to 'dumb it down' so folks like you and I would have less information to question what is actually happening. Enough said? "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14644 Credit: 200,643,578 RAC: 874 |
I've looked for John's post and found it here: 'fraid not. Sutaru thought John had made a typo, and I had to correct him: now you think he has made a different typo. There is no typo. I haven't had time to get up to 1,000 at Beta, but host 28361 is up to 631 - in principle, I don't think there's any limit. And when you report an error, it does go down to 99 - or typically 100, because you're usually reporting more than one task at a time, and the good ones instantly reset the base quota. But the 'Consecutive valid tasks' counter does reset to zero: SnakesAndLadders@home, anyone? |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14644 Credit: 200,643,578 RAC: 874 |
BOINC will always ask for so many seconds of work. Even if 1 second of work, you get a task that's more than 1 second long. If asking for 113426 seconds of work and this is slightly more than quota, you'll get it. Absolutely true. But you can restore it with the [sched_op_debug] logging flag - which isn't too verbose for general use, and very useful. |
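For anyone wanting to turn that flag on, here is a minimal sketch that builds the XML. It assumes the client reads its log flags from a `cc_config.xml` file in the BOINC data directory; the helper name `build_cc_config` is made up for illustration, and the resulting text would still need to be saved to that file and re-read by the client.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        # Each log flag is its own element; 1 means enabled.
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


print(build_cc_config(["sched_op_debug"]))
```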
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Answering these two separately. (For legibility I'll be numbering them) Identification of applications, 1. Which is where you (plural, not Andy alone) come in. Complain, comment, don't comply, tell how to do it better. Tell it to the developers themselves, do not expect of them to trawl through 40 threads looking for your post. And of course, comments like "dictatorship", "this is ****" and "test it better" are best left behind. On the latter, they're testing it here in a live environment just to see what could go wrong. 2. Which will eventually be fixed. It'll always be eventually fixed. And then when it's introduced on the next project, they will run into their own problems, again due to incompatibilities with database used, hardware, software, the amount of cosmic rays passing through their office and the phase of the moon causing shifts in the Earth's magnetic field which affects the platters in their servers. :-) Credits 3. See Credit New for the low-down. Best to be read after 17 cups of coffee. Smoke 'em if you got and do 'em as well. 4. I actually like that. The biggest problem was always that people expected the claimed credit to be theirs, no matter what. OK, you won't get fun threads anymore that you claimed 17 trillion credits, but let's be honest, the method in which the claimed credit is calculated isn't in use here anymore (time * benchmarks). 5. I don't understand? But then I don't run APs. Perhaps that I'll enable it on my new system, but so far all the APs I have ever seen grace any of my systems can be counted on one finger. :-) |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
SnakesAndLadders@home, anyone? Enough board games for me, Richard. I have always highly regarded your input, but I think you have been far too condescending of the recent debacle. Your background knowledge is always appreciated, but you seem to lack consideration for those like myself that have been whacked by this new code. I am considering shutting down... I have been dreading hitting the button for a day or two now. And you know I am not prone to doing so easily. I just want to be able to contribute to the best my hardware can do. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Oh Boy, do you have some reading ahead of you :-) (and commiserations on the Hayfever) Sorry? Why attack me over this? I don't work here, I run Seti on some of my systems and I try, completely voluntarily, to help people with their BOINC troubles. I am not responsible for the code or its introduction, while due to me installing a completely new system I for once am away while new things are introduced around here. Including this thread there are 35 threads in this forum alone about this problem. I don't have the need to read them all. If that's not good enough for you, then tough! Be that as it may, I'll stop posting and trying to help clear things up a little. Things I picked up, things as I see them from my perspective. Have a good rest of the day and continue in your struggle to anticipate changes. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14644 Credit: 200,643,578 RAC: 874 |
1. Which is where you (plural, not Andy alone) come in. Complain, comment, don't comply, tell how to do it better. Tell it to the developers themselves, do not expect of them to trawl through 40 threads looking for your post. And of course, comments like "dictatorship", "this is ****" and "test it better" are best left behind. On the latter, they're testing it here in a live environment just to see what could go wrong. That's the reality, but it's bad practice. In the real computing world, whenever a major application or upgrade is launched, the developers should be proactively monitoring the rollout and catching issues as they arise. I still remember (with cold shivers down my spine) the Saturday night I migrated a live telesales database from Microsoft Access to SQL Server. I had to wait until the last call ended at 10pm, then perform the transfer. But I regarded it as a consequent duty to be on-site at 10am the following (Sunday) morning, when the sales lines opened again, to monitor that everything was running smoothly. It was - we didn't lose an order. 4. I actually like that. The biggest problem was always that people expected the claimed credit to be theirs, no matter what. OK, you won't get fun threads anymore that you claimed 17 trillion credits, but let's be honest, the method in which the claimed credit is calculated isn't in use here anymore (time * benchmarks). Not true. The "claimed credit" shown on this project's website has been derived from the flopcounter for years, and is incredibly stable and reliable - except for the minute percentage of users still using the very earliest v5 clients or before. |
Raistmer Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
Quota: This is indeed the performance-limiting factor now. Resetting the quota from close to infinity to some default value (previously 100*NUMBER_OF_CPU_CORES+100*5*NUMBERS_OF_GPUs AFAIK; the GPU part can differ) when an error is encountered is OK. If it's only a random error like -12, the host will have enough work to continue and to prove to the server that it's a good one. But if it's the first sign of a big host failure, the sooner fetch is inhibited the better. If "close to infinity" were decreased by only 1 for each failure, the quota mechanism would be ineffective at dealing with a broken host. But currently everything suggests that the new quota was implemented with bugs. My own host still receives the message about reached quota (294 so far), but it has not downloaded anything even close to that number in the past few days. That is, the reset of the downloaded-tasks counter is broken. It also looks as if the same quota is still applied to all app versions: I get the quota-reached message on ATI GPU AP work requests too. It's absolutely clear that this host could not have downloaded ~300 AP tasks in the last day under any conditions; actually it downloaded no AP tasks yesterday and no AP tasks today... |
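The quota behaviour described here can be sketched roughly as follows. This is a toy model, not the server code: the default comes from the formula quoted above (itself marked "AFAIK"), the growth of 1 per valid task is an assumption made only for illustration, and the names `base_quota` and `HostQuota` are hypothetical.

```python
def base_quota(cpu_cores, gpus):
    """Default daily quota per the formula quoted in the post (unverified)."""
    return 100 * cpu_cores + 100 * 5 * gpus


class HostQuota:
    """Toy model: quota grows with valid results and resets on an error."""

    def __init__(self, cpu_cores, gpus):
        self.cpu_cores = cpu_cores
        self.gpus = gpus
        self.quota = base_quota(cpu_cores, gpus)
        self.consecutive_valid = 0

    def report_valid(self):
        # Assumption: each valid task raises the quota by 1.
        self.quota += 1
        self.consecutive_valid += 1

    def report_error(self):
        # An error drops the quota back to the default rather than
        # shaving off 1, so a broken host is throttled quickly.
        self.quota = base_quota(self.cpu_cores, self.gpus)
        self.consecutive_valid = 0


host = HostQuota(cpu_cores=4, gpus=1)
for _ in range(10):
    host.report_valid()
host.report_error()
print(host.quota, host.consecutive_valid)
```

In this model a single random error costs a healthy host little, because the default still covers a day of work, while a failing host loses its accumulated headroom at once.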
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Oh Boy, do you have some reading ahead of you :-) (and commiserations on the Hayfever) Cute......... I could respond in a way that would get me banned........really cutesey. I have a valid complaint.......and if you cannot acknowledge that...... You might just as well just jump offa the same bridge as your Boinc companions......... Don't EVEN give me any crap about voicing my thoughts on this matter. You are in the wrong. Have a nice day. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
All, Mark knows best. He'll fix it. |
W-K 666 Send message Joined: 18 May 99 Posts: 18996 Credit: 40,757,560 RAC: 67 |
1 & 2. The identification of apps should have been the first step with nothing more done until it was accurate. Quota, The quota should be per application, and therefore 1 & 2 apply. 3 & 4. Richard effectively answered that. 5. Credits for AP, because of Eric's modifying flop count method are at ~800cr. Brodo answered what I was going to say about extra tasks downloaded, i.e. we no longer know how much is asked for. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
All, Mark knows best. He'll fix it. That was a simple comment from a simple mind, apparently. Your making light of both me and the situation appears to make your attitude clear. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
[AF>france>pas-de-calais]symaski62 Send message Joined: 12 Aug 05 Posts: 258 Credit: 100,548 RAC: 0 |
18/06/2010 11:40:31 SETI@home Sending scheduler request: To fetch work. 18/06/2010 11:40:31 SETI@home Requesting new tasks for GPU 18/06/2010 11:40:36 SETI@home Scheduler request completed: got 0 new tasks 18/06/2010 11:40:36 SETI@home Message from server: Project has no jobs available RED servers ^^ SETI@Home Informational message -9 result_overflow with a general handicap of 80%; he makes a lot of effort for the community and to express himself, thank you for being understanding. |
Miep Send message Joined: 23 Jul 99 Posts: 2412 Credit: 351,996 RAC: 0 |
Oh Dear: WU waiting to validate 43000 and climbing - according to status page one of the validators is down. So the trickle of work people are getting from quota going up from valid tasks will be even smaller. And it's still a few hours till the guys get in. There are now so many small fires to put out, it starts looking like the forest is up in flames. Interesting dilemma: will people be angrier if they fix on the fly (and introduce other problems or we just can't see the fixes quickly enough to appease the community) or if they shut down for another day? Seems that nerves are so frayed even the most patient of us are having a hard time. Carola ------- I'm multilingual - I can misunderstand people in several languages! |
zoom3+1=4 Send message Joined: 30 Nov 03 Posts: 65690 Credit: 55,293,173 RAC: 49 |
I'm on empty and so far Seti will not send Me any new work, But then I gather It won't send work out unless one has the right app, whatever that is, Yet My useless quota kept going up and My complaints have gone unanswered. 6/18/2010 7:50:26 AM SETI@home Sending scheduler request: To fetch work. 6/18/2010 7:50:26 AM SETI@home Requesting new tasks 6/18/2010 7:50:27 AM SETI@home Scheduler request completed: got 0 new tasks 6/18/2010 7:50:27 AM SETI@home Message from server: No work sent 6/18/2010 7:50:27 AM SETI@home Message from server: Your app_info.xml file doesn't have a usable version of SETI@home Enhanced. 6/18/2010 7:50:27 AM SETI@home Message from server: (reached daily quota of 241 tasks) The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's |
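Messages like the ones pasted above can be checked mechanically. The following is a small sketch that pulls the tasks-received count and the reported daily quota out of the log text; the regular expressions are based only on the wording of the lines quoted in this thread, and `summarize` is a hypothetical helper name.

```python
import re

# Patterns match the wording of the messages quoted in this thread.
GOT_RE = re.compile(r"got (\d+) new tasks")
QUOTA_RE = re.compile(r"reached daily quota of (\d+) tasks")


def summarize(log_text):
    """Extract tasks received and the reported daily quota, if present."""
    got = GOT_RE.search(log_text)
    quota = QUOTA_RE.search(log_text)
    return {
        "tasks_received": int(got.group(1)) if got else None,
        "quota": int(quota.group(1)) if quota else None,
    }


sample = (
    "6/18/2010 7:50:27 AM SETI@home Scheduler request completed: got 0 new tasks\n"
    "6/18/2010 7:50:27 AM SETI@home Message from server: No work sent\n"
    "6/18/2010 7:50:27 AM SETI@home Message from server: "
    "(reached daily quota of 241 tasks)\n"
)
print(summarize(sample))
```

Comparing the reported quota with the number of tasks actually downloaded that day is one way to show that the quota message does not match what the host received.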
Blurf Send message Joined: 2 Sep 06 Posts: 8962 Credit: 12,678,685 RAC: 0 |
From Matt's 6/16 post: I know most of you who read these updates know this already, but it bears repeating: nobody working directly on SETI@home (all 5 of us) works full time, and we all have enough other things going on that make it impossible for us to be "on call" in case of outage/emergencies. In my case, I currently have four regular separate sources of income with jobs/gigs in four completely different industries (covering all the bases in case one or more dry up). As for last night, when the httpd problems arose, I was working elsewhere, and when I checked in again around 10:30pm everyone else was asleep and I didn't want to start up the scheduler processes without others' input as they were still effectively on the operating table. We've pretty much given up any hope for 24/7 uptime, but BOINC takes care of that as long as you sign up for other projects. Something for all of us to keep in mind. |
soft^spirit Send message Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 |
Thank you Blurf.. and thank you for helping renew my ticked-offedness. Yeah we get it. it is not 24/7. Never asked for that. Yeah we get it. We can sign up for other projects we may or may not agree with or want to help with. We understand they are under funded/paid. And we get it that on top of previous server problems, they dumped a bunch of poorly written non-tested code on us, basically keeping things tied up in a knot for over a week. How about being up 12/2?? Cause it has been ages since I remember seeing all servers up at once. But.. feel free to quote "so sayeth Matt" again. It really.. helps. Not sure who it helps, but I am sure it does. Or not. "Bite my shiney metal a**" So sayeth Bender. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.