Panic Mode On (78) Server Problems?
Author | Message |
---|---|
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Might be worth posting a link to a few of them so those that know about these things can have a look. WU 1109239375 is enough to demonstrate that they weren't all VLAR. They were judged infeasible for some other reason. All 2877 were expired between 3:57:47 UTC and 3:57:54 UTC, so the database or other server delays apparently took about 7 seconds to get through that long list of "lost" tasks. Joe |
Tron Joined: 16 Aug 09 Posts: 180 Credit: 2,250,468 RAC: 0 |
ok, now all my machines are empty ...what the heck are you guys doing? turn the work back on! it's freezing in here :P |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
ok, now all my machines are empty ...what the heck are you guys doing? I don't think any of my rigs actually ran out yet. But I haven't checked all 9 of them. If they do, they all have Einstein as a backup project. But hopefully da boyz in da lab will have on their best thinking hats and kicking boots today and will start to get to the root of the problem. Best of luck with it, guys. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
Mark Sattler posted an interesting theory yesterday. He wondered whether asking Synergy to run the Scheduler, several MB splitters, and several AP splitters all at the same time might have been too much, and caused the initial slowdown we saw after maintenance last week. Sounds plausible to me. Take a look at the database graphs: usual activity these days is around 700-800 queries/s. Until the splitters were shut down, it didn't drop below 1,000/s, with sustained periods of just below 1,500/s and many peaks over 1,500/s. Even now there are many surges to 1,500/s+, but it's also dropping down to 700/s or less on occasion. Grant Darwin NT |
Mad Fritz Joined: 20 Jul 01 Posts: 87 Credit: 11,334,904 RAC: 0 |
@Joe Thanks for looking into it :-) Am I right that there will not really be any serious harm as the tasks were given to other crunchers in the meantime? Andy |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
My notional list of "work in progress" has gone up from 1,500 to 2,100 in the last two hours. It has been 27 hours since you said that. The ready to send has been at or near 0 (I assume it only ticks upward because of occasional timeout reassignments) and the splitters off for six hours that I'm aware of, probably a lot longer, and the Crickets are still maxed out! There was a mild downspike yesterday and an even smaller one just now, but there can't possibly still be that many ghost resends going on, can there? It's got me wondering if either something is wrong with the servers or there's an outside DoS attack going on. Or perhaps a web spider slipped through the filters and is trying to catalog every one of those 9 million results out in the field and 7 million waiting for validation, or something like that. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
I was going to remark on the high number of error WUs for my i7 where I had a short-deadline timeout, and where my original wingman and my replacement have since both had natural timeouts, so the WU has been sent to two more hosts. But now I'm wondering if the first two hosts completed the work and have been unable to upload and report due to the server problems of the last few days. (And if that's the case, they'll eventually report late and the WUs will end up stuck and take even longer to disappear off my error list.) David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
, and the Crickets are still maxed out! Seems to be slowly dropping back, hopefully this is a good sign! |
rob smith Joined: 7 Mar 03 Posts: 22160 Credit: 416,307,556 RAC: 380 |
Dave (N9JFE) A look at one of your timed out tasks shows that it was sent to you with an "impossible" deadline, so you timed out; the same thing happened, at the same time, to your wingman. The task was then sent out to two more crunchers, one of whom has reported, while the other is still "in progress". Since the task was delivered to you in September, and faulted on the same day, it is pretty safe to say that the task is not under the influence of the current server woes. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
Dave (N9JFE) I know what happened to me (and also to my wingman in at least one case). What I found remarkable was that so many of my timeouts have now had full-time (i.e., not impossible) timeouts by the other users, and I wondered if *they* might be caused by the server problems. But that's a minor issue. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Well the Crickets aren't maxed out anymore, and I just made a scheduler contact and it took three seconds to acknowledge five completed APs. That's pretty quick. Regarding having too many AP splitters going.. I've been wondering/asking for a while now if we can just knock it down to one AP splitter. If you load up 10 full tapes and let the splitters go as fast as they can, AP finishes all 10 tapes in usually around the same time as MB takes to get through 2-3. Maybe just slow AP down and limit the hindering effect it has on everything else? Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Here comes fun - they've turned on every splitter, and the cricket graph has fallen through the floor. I'm setting NNT and going to bed - tell me about it in the morning. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
The blindest are those who don't want to see... Hope I'm wrong... (edit) 05/11/2012 22:33:08 SETI@home Message from server: This computer has reached a limit on tasks in progress A new limit? |
fscheel Joined: 13 Apr 12 Posts: 73 Credit: 11,135,641 RAC: 0 |
The blindest are those who don't want to see... I have one PC that's getting that message also... wonder what it means or what the limit is. |
Fred E. Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0 |
I'm back to timeouts when just reporting on NNT. Wonder what they worked on today? Edit: The next try was successful, of course. Will try work fetch now, but may go NNT all night. Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. |
Khangollo Joined: 1 Aug 00 Posts: 245 Credit: 36,410,524 RAC: 0 |
Timeouts are back plus now I can't download tons of lost tasks I still have; when scheduler is finally successful, all I'm getting is limit reached message. Looks like the limits are as ridiculous as they were before (50 tasks/CPU), which means even my slowest hosts are going to make scheduler requests *endlessly*, never able to fill a 5 day cache, only compounding the scheduler overload problem... |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
@Joe Yes, you're right. Just a slight delay in actually getting tasks to 2 hosts. Joe |
Donald L. Johnson Joined: 5 Aug 02 Posts: 8240 Credit: 14,654,533 RAC: 20 |
Timeouts are back plus now I can't download tons of lost tasks I still have; when scheduler is finally successful, all I'm getting is limit reached message. So you might as well reduce your cache settings to something close to the limits, to reduce the number of failed scheduler requests and ease the strain on the servers..... Donald Infernal Optimist / Submariner, retired |
Sutaru Tsureku Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Message from server: This computer has reached a limit on tasks in progress I don't know what it is this time (no admin announced it). If I remember correctly, the last time it was max. 50 WUs/CPU-thread and 400 WUs/GPU in BOINC. So my Intel Core2 Duo E7600 with NVIDIA GeForce GTX260 should get (50 x 2) + 400 = 500 WUs, maybe this time as well. * Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. * |
Fred E. Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0 |
If I remember correct - the last time it was max. 50 WUs/CPU-thread and 400 WUs/GPU in BOINC. Not even sure it is 50 per core. I'm down to 230 CPU tasks for my hexcore, and still get the limit message. That's with a CPU-only request. I'm only using 5 for crunching, but if that value is used I'd still get 250. I'm way over 400 on GPU tasks, and hate to see a limit there. I won't be able to handle much more than a one day outage, especially if there are a lot of shorties. New card is faster than the one I was running the last time we had these limits. Haven't seen more timeouts like I reported earlier in the thread. Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. |
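
For anyone wanting to check the limit arithmetic from the last few posts, here is a minimal Python sketch. It assumes the figures recalled above (50 WUs per CPU thread, 400 WUs per GPU), which no admin has confirmed, and Fred E.'s report of hitting the limit at 230 CPU tasks suggests the real values may be lower. The function names `task_limit` and `cache_days_covered` and the 250 tasks/day throughput are made up for illustration.

```python
# Per-host "tasks in progress" limit, using the figures recalled in the thread.
# Both constants are assumptions: no admin confirmed the current values.
CPU_TASKS_PER_THREAD = 50   # recalled as "50 WUs/CPU-thread"
GPU_TASKS_PER_GPU = 400     # recalled as "400 WUs/GPU"


def task_limit(cpu_threads, gpus,
               per_thread=CPU_TASKS_PER_THREAD, per_gpu=GPU_TASKS_PER_GPU):
    """Return the most tasks a host could hold under the assumed limits."""
    return cpu_threads * per_thread + gpus * per_gpu


def cache_days_covered(limit, tasks_per_day):
    """Return how many days of work `limit` tasks represent at a given throughput."""
    if tasks_per_day <= 0:
        raise ValueError("tasks_per_day must be positive")
    return limit / tasks_per_day


if __name__ == "__main__":
    print(task_limit(cpu_threads=2, gpus=1))   # Core2 Duo + one GPU -> 500
    print(task_limit(cpu_threads=6, gpus=0))   # hex-core, CPU only  -> 300
    print(task_limit(cpu_threads=5, gpus=0))   # 5 cores crunching   -> 250
    # Hypothetical throughput of 250 tasks/day: 500 tasks last 2.0 days.
    print(cache_days_covered(task_limit(2, 1), tasks_per_day=250))
```

If the number of days covered comes out below a host's cache setting, that host can never fill its cache and will keep asking for work. That is the scheduler-load concern raised above, and the reason for the suggestion to lower cache settings to something close to the limit.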
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.