Panic Mode On (83) Server Problems?
Author | Message |
---|---|
ExchangeMan Joined: 9 Jan 00 Posts: 115 Credit: 157,719,104 RAC: 0 |
Yesterday evening I had a feeling we were going to run into problems overnight. Ya, the MB splitters seem to have been running OK, but with the AP splitters not running at full capacity, problems were inevitable overnight. It seems like for a few days, things weren't too bad. But partial functionality on either type of splitter eventually results in problems. Maybe someone forgot to change tapes. It seems that they need to have 6 splitters of each type to keep up with work unit demand. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14690 Credit: 200,643,578 RAC: 874 |
I was running low until I hit the right moment, and got 69 new tasks. Apart from one solitary mid-AR resend, every single one was a shorty. They just get snapped up too darn quickly during a shorty storm - we need mid-AR or AP to damp things down a bit. |
Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0 |
"I was running low until I hit the right moment, and got 69 new tasks. Apart from one solitary mid-AR resend, every single one was a shorty. They just get snapped up too darn quickly during a shorty storm - we need mid-AR or AP to damp things down a bit." Had a similar result with a lucky fetch. I'm out of AP and have mostly shorties, so my 83 GPU tasks sum to 2 hours of crunch time per BoincTasks. Have set up my zero resource share project for the maintenance window. Another Fred Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop. |
ExchangeMan Joined: 9 Jan 00 Posts: 115 Credit: 157,719,104 RAC: 0 |
"I was running low until I hit the right moment, and got 69 new tasks. Apart from one solitary mid-AR resend, every single one was a shorty. They just get snapped up too darn quickly during a shorty storm - we need mid-AR or AP to damp things down a bit." It seemed that for a few days there was sort of a nice mix of shorties, mid-ARs and AP work units. I don't know if anyone planned it that way or it was just random chance. With a decent mix, the system can almost maintain itself until someone needs to change tapes; then it's pot luck. |
Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
It is a game of Hungry Hungry Hippos right now, it would seem. With my 24-core box reporting a 3-hour queue, it looks like it will be doing some PG work today once maintenance starts. SETI@home classic workunits: 93,865 CPU time: 863,447 hours |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13932 Credit: 208,696,464 RAC: 304 |
"+1, the splitters our next bottleneck?" It would appear so. Just coming back from the outage, there is only 1 splitter running, and it is only producing 6 WUs/sec. With shorties we need at least 55/s. So we really need 10 PFB splitters, and there are only 6, and it's been a while since they've all been running. No wonder we keep running out of work, even while it is being split. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13932 Credit: 208,696,464 RAC: 304 |
"+1, the splitters our next bottleneck?" All the splitters are now running, and they're producing less than half the rate needed. Grant Darwin NT |
andybutt Joined: 18 Mar 03 Posts: 262 Credit: 164,205,187 RAC: 516 |
Hungry GPUs sitting here doing nothing! Ho hum! |
Joined: 22 Jul 99 Posts: 768 Credit: 24,140,697 RAC: 0 |
Just got a "Project down for maintenance" and a one hour backoff, so something's underway. Another Fred |
Sirius B Joined: 26 Dec 00 Posts: 24929 Credit: 3,081,182 RAC: 7 |
Hmmn, I get this: `30/04/2013 22:30:55 \| SETI@home \| Requesting new tasks for CPU`; `30/04/2013 22:30:57 \| SETI@home \| Scheduler request completed: got 0 new tasks`; `30/04/2013 22:30:57 \| SETI@home \| Project is temporarily shut down for maintenance` |
Joined: 26 May 99 Posts: 9958 Credit: 103,452,613 RAC: 328 |
My main cruncher just downloaded 98 GPU tasks, but my second one still can't get any. |
Sakletare Joined: 18 May 99 Posts: 132 Credit: 23,423,829 RAC: 0 |
And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance? http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-211/gigabitethernet6_17&ranges=d%3Aw&view=Octets |
Joined: 5 Mar 01 Posts: 783 Credit: 348,560,338 RAC: 223 |
"And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?" I don't think anyone ever guaranteed that that link was exclusively for SETI@home. |
Joined: 24 Jan 00 Posts: 38019 Credit: 261,360,520 RAC: 489 |
The MB splitters are certainly having a hard time getting up to their usual speed. Cheers. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
"And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?" By the nature of a gigabit router interface, that link must be for SETI@home exclusively. My guess is that the backup operation, which is the reason for the Tuesday outage, is sensibly being sent somewhere outside the colocation facility. Joe |
Tom* Joined: 12 Aug 11 Posts: 127 Credit: 20,769,223 RAC: 9 |
Never thought I'd say this - please split some APs to slow down the feeder, link and client systems. |
Joined: 15 Mar 01 Posts: 1011 Credit: 230,314,058 RAC: 0 |
"And what is this 90 Mbits/sec upload that's running 24/7, even when the project is down for maintenance?" Then let's program the upload and download servers to report their data rates and output them on the status page. |
Keith White Joined: 29 May 99 Posts: 392 Credit: 13,035,233 RAC: 22 |
What screams at me looking at the Munin graphs is that only when we ran out of AstroPulse units did the MB units start to plummet from 300K to 0. This suggests to me that a lot of GPU AstroPulse is being done and when that ran out the MB reserve was quickly devoured by hungry hungry GPUs. It's interesting that 30 units/s creation rate isn't enough, at least during a shorty storm. "Life is just nature's way of keeping meat fresh." - The Doctor |
ExchangeMan Joined: 9 Jan 00 Posts: 115 Credit: 157,719,104 RAC: 0 |
"What screams at me looking at the Munin graphs is that only when we ran out of AstroPulse units did the MB units start to plummet from 300K to 0. This suggests to me that a lot of GPU AstroPulse is being done and when that ran out the MB reserve was quickly devoured by hungry hungry GPUs." I noticed this too before the shutdown. When MB and AP units were both being split, things kind of behaved themselves and I got all the MB I could crunch. I sure hope this is fixed tomorrow. The current server status page shows 6 MB splitters active. 8 were active earlier and couldn't keep up. This is very frustrating for dedicated SETI crunchers like me. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13932 Credit: 208,696,464 RAC: 304 |
"It's interesting that 30 units/s creation rate isn't enough, at least during a shorty storm." During a shorty storm 55/s is the minimum needed to meet demand. The storm is over, but still the splitters aren't able to crank out enough work. For a while they were doing about 30/s (barely enough when there are a lot of VLARs in the mix). Now they've dropped down to less than 20. Someone in the lab needs to take a look at what is going on: the splitters used to be able to sustain 70/s with no problems at all, and now they can't even reach it as a peak. EDIT: at least the shorty storm was over for a while. The work my systems were able to get after the outage while I was at work didn't have many shorties in it, but one of the systems was just able to get some more work (still nowhere near enough...) and it was almost all shorties. Grant Darwin NT |
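
For anyone who wants to check the capacity arithmetic in the posts above, here is a rough Python sketch. The figures come from the thread: one splitter observed at about 6 WUs/sec, 55/s demand during a shorty storm, an MB reserve of roughly 300K, and a 30/s creation rate. The function names are made up for illustration. The sketch also assumes that every splitter produces what that single one did, and that supply and demand stay constant, neither of which the thread confirms.

```python
import math


def splitters_needed(demand_per_s: float, rate_per_splitter: float) -> int:
    """Smallest whole number of splitters whose combined output meets demand."""
    return math.ceil(demand_per_s / rate_per_splitter)


def hours_to_drain(backlog: int, demand_per_s: float, supply_per_s: float) -> float:
    """Hours until the ready-to-send backlog is empty (infinite if supply keeps up)."""
    deficit = demand_per_s - supply_per_s  # net WUs leaving the backlog each second
    if deficit <= 0:
        return math.inf
    return backlog / deficit / 3600


# Assumed inputs, taken from the posts: 55/s demand, 6/s per splitter.
print(splitters_needed(55, 6))  # 10

# Assumed inputs: 300K reserve, 55/s demand, 30/s creation rate.
print(round(hours_to_drain(300_000, 55, 30), 1))  # 3.3
```

Under those assumptions the numbers line up with what was reported: 10 splitters to cover a shorty storm, and a 300K reserve gone in a little over three hours once creation falls to 30/s.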