Panic Mode On (89) Server Problems?
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
On the contrary when we have AP splitting the intensity of the problem is smaller because fewer hosts ask for MB WU, releasing some of the remaining MB splitters workload. Yep, that's what Richard & I were suggesting. The only reason we have a Ready-to-send buffer building up at the moment is all the machines that are now running AP WUs. If they were still chewing on MB, there wouldn't be enough to meet demand, let alone build up the Ready-to-send buffer. The problem with the splitters producing work is just because of the sticking tapes, not the amount of work being downloaded. Grant Darwin NT |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
On the contrary when we have AP splitting the intensity of the problem is smaller because fewer hosts ask for MB WU, releasing some of the remaining MB splitters workload. The amount of MB work being downloaded varies by a factor of at least 2, possibly nearer 3, depending on whether they're shorties or not (uncertainty because I'm not sure what the ratio of CPU to CUDA is these days. CPUs have nearer to the 3:1 ratio for shorties, compared to the 2:1 ratio for the currently inefficient (at VHAR) CUDA apps). The number of straws draining beer from the bottom of the barrel affects the ready-to-drink level, just as the number of barmaids pouring buckets of beer into the top of the barrel affects it. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
I'm not sure about that. Why? Simply because the slowdown of the splitting process happened before the AP frenzy started; the problem starts when any tape is stuck. At that time, 3 days ago, there were no AP splitters running; just remember last Monday. Before maintenance 4 splitters were keeping RTS around 300,000, with it falling and rising a bit. However, once the stuck tape had a 2nd splitter on it, RTS dropped quickly. There were around 40K RTS by the time maintenance started. For whatever reason splitting shorties also seemed to slow down the process & the average output has also picked up a little. Of course the AP going out is also helpful for building the MB queue up, but I wouldn't put all of my money on it alone, as AP RTS started growing after MB RTS was already doing so. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
The answer to all our problems is simple: kick the dam 18fe09ag tape and you all will see the MB cache fill very fast. Can anyone ping the lab people to at least verify why this splitter is stuck at the #3 channel after 3 days? The WOW event starts tomorrow and in about a day we will be running out of new AP WU, so the MB cache will need to be ready to handle all the requests. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
The answer to all our problems is simple: kick the dam 18fe09ag tape and you all will see the MB cache fill very fast. With the WOW event, demand may be greater than normal. It would certainly make things interesting. It looks like the tapes are being processed in FIFO order instead of alphanumerical order. So after sitting in the queue since at least May (maybe April), 31au13aa, 31au13ac, & 31mr13ae may actually get processed! SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
rob smith Send message Joined: 7 Mar 03 Posts: 22204 Credit: 416,307,556 RAC: 380 |
There is a throttle on the size of the RTS queue, and APs are 8Mb compared to 300kB of an MB. It is a shame they don't throttle back the AP production somewhat which would allow a larger number of MBs in the RTS queue. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
There is a throttle on the size of the RTS queue, and APs are 8Mb compared to 300kB of an MB. The problem is not with the size of the RTS queue. It is with the splitting rate. MB RTS when all is well usually hangs around 300k +/- 20k or so, which is plenty, even in times of shorty storms. UNLESS we have a hung splitter or two, which unfortunately seems to have happened all too frequently in the last few weeks. If I understand correctly, there IS a mechanism in place that is supposed to be able to detect a splitting process that is hung up. It apparently is not robust enough or needs some more tweaking to be able to detect and restart things automatically under whatever conditions have been causing the recent problems. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Was anybody able to contact the lab and talk about the dam 18fe09ag tape? |
Blurf Send message Joined: 2 Sep 06 Posts: 8962 Credit: 12,678,685 RAC: 0 |
Was anybody able to contact the lab and talk about the dam 18fe09ag tape? Patience Juan....Eric is out of town. It may need to wait until he gets back. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Was anybody able to contact the lab and talk about the dam 18fe09ag tape? Seriously? He is the only one at the lab? I don't believe it... |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
No need to panic until all of the present AP work is gone. Then whatever Ready-to-send MB buffer we might have by then will start to feel the demand. Then we can panic. Grant Darwin NT |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Was anybody able to contact the lab and talk about the dam 18fe09ag tape? There is a very small number of people that work at the lab & I believe Matt and Eric are the only 2 that know how to correct the issue at hand. Please correct me if I am wrong. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Was anybody able to contact the lab and talk about the dam 18fe09ag tape? I'm sure Dr. Anderson could figure it out, but I expect he wouldn't touch it unless everything had crashed. Even if the RTS buffer gets depleted from reduced splitter capacity, the project is still up and running, just not at full capacity. Some requests for new work would be answered and some would not. As far as I know Eric, Matt, and Jeff are the only people that do work for the SETI@home project, which is only part of their focus. Jeff seems to be the least talkative of them, so I'm not sure if he is less involved or is there even less than the other guys. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
OK guys, very soon we will see whether it is time to panic or not. This round of tapes for AP splitting is almost at the end (just 7 channels to do). Soon (in a few hours at most; there are still about 20k AP WU in the buffer) we will be out of new AP WU for a while and all the duty will be in the hands of the MB splitters and our 5-day-old stuck tape. And since the WOW event starts today, I imagine a lot of users will transfer their processing power from other projects to SETI, so that will be an interesting moment to follow. |
Oz Send message Joined: 6 Jun 99 Posts: 233 Credit: 200,655,462 RAC: 212 |
You are right, there may be problems, I will be watching the graphs as well. But so far fewer than 700 users have signed up for WOW. I am guessing most of them are "heavy hitters", so it should be interesting. I have committed my modest resources to WOW as well, just for fun! Member of the 20 Year Club |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
AP splitting stopped a few moments ago, and the AP buffer dropped from 24K to about 18K; at this rate it will be empty tomorrow. The WOW event is going to start in about 1 1/2 hrs. That's a hell of a coincidence, of course; we will see the fun very soon. Hope I'm wrong and the MB splitters can handle it, because the dam tape is still stuck at the #3 channel and apparently nothing will be done until next week. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
OK guys, very soon we will see whether it is time to panic or not. If we use Results returned in the last hour as an indicator of demand, with the average returned for the past month being 64100, then average demand is about 17.8055 per second. Meaning an average splitter output of as little as 18 per second is more than enough. With 4 splitters the average output seems to be running 18.75 and rising. Splitting VHAR seems to slow the splitting down for some reason. So if we get a bunch of tapes full of shorties, which would also drive up demand, or another splitter goes down we should be alright. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
OK, kitties. AP RTS just ran out. I hope somebody's in the lab to babysit the servers. We're gonna need more power splitting MB, I suspect. Meow and away! "Freedom is just Chaos, with better lighting." Alan Dean Foster |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Panic Mode: ON Somebody please kick the dam stuck tape splitter. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Once again, demand exceeds supply. MB Ready-to-send buffer slowly draining. Grant Darwin NT |
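
For anyone who wants to play with the numbers quoted in this thread, here is a minimal Python sketch of HAL9000's demand estimate and Richard's barrel picture (splitters pour in at the top, hosts drain from the bottom). The function names `per_second` and `buffer_after` are made up for illustration, and the scenario at the end (demand doubling for 4 hours from a 300,000 buffer) is an assumption, not a measurement. The 64100 results per hour, the 18.75 per second splitter output, and the roughly 300k buffer are the figures posted above.

```python
SECONDS_PER_HOUR = 3600


def per_second(results_per_hour: float) -> float:
    """Convert an hourly result count into an average per-second rate."""
    return results_per_hour / SECONDS_PER_HOUR


def buffer_after(start_level: float, supply_per_s: float,
                 demand_per_s: float, hours: float) -> float:
    """Ready-to-send level after `hours` of constant supply and demand.

    The level is floored at zero: an empty barrel cannot go negative.
    """
    net_per_s = supply_per_s - demand_per_s
    return max(0.0, start_level + net_per_s * hours * SECONDS_PER_HOUR)


# Figures quoted in the thread.
demand = per_second(64_100)   # average results returned per hour, last month
supply = 18.75                # reported output with 4 splitters, per second

print(f"average demand: {demand:.4f} per second")
print(f"surplus at current output: {supply - demand:.4f} per second")

# Hypothetical scenario (assumption): demand doubles for 4 hours,
# starting from a 300,000 ready-to-send buffer.
level = buffer_after(300_000, supply, 2 * demand, hours=4)
print(f"buffer after 4 hours of doubled demand: {level:.0f}")
```

The surplus at average demand is under one result per second, so even a modest rise in demand or the loss of one splitter flips the sign and the buffer starts to drain.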
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.