Panic Mode On (42) Server problems

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51470 Credit: 1,018,363,574 RAC: 1,004	Message 1057436 - Posted: 18 Dec 2010, 10:24:08 UTC It would be interesting if they could actually implement true QoS..... Favoring established connections at the expense of new ones. I don't think that is what was happening at the time. But, as things seem to have settled down a bit, I think it's time to raise the limits, eh? "Time is simply the mechanism that keeps everything from happening all at once." ID: 1057436 ·

Terror Australis Volunteer tester Send message Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44	Message 1057536 - Posted: 18 Dec 2010, 16:46:38 UTC Whatever. By my calculations the current limits are compatible with a 4 day cache. As any future *planned* outages will be only 3 days. Isn't this discussion academic?? T.A. ID: 1057536 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51470 Credit: 1,018,363,574 RAC: 1,004	Message 1057546 - Posted: 18 Dec 2010, 17:28:46 UTC - in response to Message 1057536. Last modified: 18 Dec 2010, 17:30:11 UTC Whatever. By my calculations the current limits are compatible with a 4 day cache. As any future *planned* outages will be only 3 days. Isn't this discussion academic?? T.A. And what, pray tell, are you basing your calculations on? The Frozen 920, running at 4.1 GHz, completes most MB work in about an hour. Many only take half that time or less. Depends on the AR. 150 WUs won't get that rig through a single day on the 4 CPU cores. You can have a look at the rig's valid results here. "Time is simply the mechanism that keeps everything from happening all at once." ID: 1057546 ·
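
For anyone wanting to check the cache arithmetic in the post above, here is a minimal sketch. The function name and parameters are hypothetical; the only figures taken from the thread are the 150 WU limit, the 4 CPU cores, and the "about an hour, many half that" run times. It assumes every core is busy the whole time and that all tasks take the same time, which real work does not.

```python
def days_of_work(tasks: int, workers: int, hours_per_task: float) -> float:
    """Estimate how many days a cache of `tasks` lasts.

    Assumes `workers` cores run continuously and every task takes
    `hours_per_task` hours (a simplification; run time varies with AR).
    """
    tasks_per_day = workers * 24.0 / hours_per_task
    return tasks / tasks_per_day


if __name__ == "__main__":
    # Figures quoted in the thread: 150 WUs, 4 CPU cores.
    for hours in (1.0, 0.5):
        days = days_of_work(tasks=150, workers=4, hours_per_task=hours)
        print(f"{hours:.1f} h per task: {days:.2f} days of work")
```

Under these assumptions the result is about 1.56 days at one hour per task and about 0.78 days at half an hour per task, so how long the limit lasts depends heavily on the mix of task lengths. Either way it is well short of a 4 day cache.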

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1057577 - Posted: 18 Dec 2010, 19:06:11 UTC - in response to Message 1057536. By my calculations the current limits are compatible with a 4 day cache. ? My system isn't a powerful one, and my 4 day cache isn't full due to the present server side limits. I've got about 2.5-3 days worth with the present limit on work. Grant Darwin NT ID: 1057577 ·

Terror Australis Volunteer tester Send message Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44	Message 1057667 - Posted: 19 Dec 2010, 1:10:39 UTC - in response to Message 1057536. Whatever. By my calculations the current limits are compatible with a 4 day cache. As any future *planned* outages will be only 3 days. Isn't this discussion academic?? T.A. OK. Statement retracted. Jose` Quervo was helping me with my calculations and we misplaced a decimal point. Sorry for any angst caused :-) T.A. ID: 1057667 ·

perryjay Volunteer tester Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0	Message 1057669 - Posted: 19 Dec 2010, 1:24:16 UTC - in response to Message 1057667. For those that missed it, Matt says the limits have been raised. PROUD MEMBER OF Team Starfire World BOINC ID: 1057669 ·

Lionel Send message Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597	Message 1057686 - Posted: 19 Dec 2010, 3:45:35 UTC - in response to Message 1057536. Last modified: 19 Dec 2010, 3:46:58 UTC Whatever. By my calculations the current limits are compatible with a 4 day cache. As any future *planned* outages will be only 3 days. Isn't this discussion academic?? T.A. not so my friend ... it appears as though the limit is 320 per GPU. In my case that means that the dual GTX470s are limited to 640 wus. This is just under 2 days worth of work for each of these boxes the problem is that they will go dry before the next outage finishes (as will many others) and the feeding frenzy begins again and it will take 2-3 days for these caches to fill back to this low limit, and then the cycle repeats. all of us are effectively brought down to the lowest common denominator. the limits need to be raised or abolished so that we can slowly build our caches up so that we can operate outside of seti's 3 day outage period, unscheduled down time, etc., and not be affected by the virtual DDoS attack that is generated by the outage. in the past i ran my caches deep to avoid seti's ups and downs and never had an issue. if it went down i just waited till after it came back for few days to avoid congestion. with the limits in place i am now subject to seti's intermittent behaviour and its congestion issues. L. ID: 1057686 ·

-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0	Message 1057688 - Posted: 19 Dec 2010, 3:47:37 UTC - in response to Message 1057686. Last modified: 19 Dec 2010, 3:48:04 UTC not so my friend ... it appears as though the limit is 320 per GPU. In my case that means that the dual GTX470s are limited to 640 wus. This is just under 2 days worth of work for each of these boxes the problem is that they will go dry before the next outage finishes (as will many others) and the feeding frenzy begins again and it will take 2-3 days for these caches to fill back to this low limit, and then the cycle repeats. all of us are effectively brought down to the lowest common denominator. the limits need to be raised or abolished so that we can slowly build our caches up so that we can operate outside of seti's 3 day outage period, unscheduled down time, etc., and not be affected by the virtual DDoS attack that is generated by the outage. in the past i ran my caches deep to avoid seti's ups and downs and never had an issue. if it went down i just waited till after it came back for few days to avoid congestion. with the limits in place i am now subject to seti's intermittent behaviour and its congestion issues. L. It must be because right now my GTS 250 has 564 tasks and my 480 has 376 before I communicate with the project again. And that's not counting the tasks for my cpus. Traveling through space at ~67,000mph! ID: 1057688 ·

Lionel Send message Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597	Message 1057689 - Posted: 19 Dec 2010, 3:48:48 UTC - in response to Message 1057669. For those that missed it, Matt says the limits have been raised. to what ... ID: 1057689 ·

-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0	Message 1057692 - Posted: 19 Dec 2010, 3:50:32 UTC - in response to Message 1057689. For those that missed it, Matt says the limits have been raised. to what ... Gotta be near double or higher considering I have 564 units on one machine. Traveling through space at ~67,000mph! ID: 1057692 ·

W-K 666 Volunteer tester Send message Joined: 18 May 99 Posts: 19227 Credit: 40,757,560 RAC: 67	Message 1057694 - Posted: 19 Dec 2010, 3:52:12 UTC Last modified: 19 Dec 2010, 3:52:32 UTC At the moment, this discussion about d/load limits is a bit academic, cause if you take a peek at the server status page they will run out of tasks very soon. Unless someone comes in and finds some blanked data and puts it in the splitters. ID: 1057694 ·

-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0	Message 1057697 - Posted: 19 Dec 2010, 3:55:25 UTC - in response to Message 1057694. At the moment, this discussion about d/load limits is a bit academic, cause if you take a peek at the server status page they will run out of tasks very soon. Unless someone comes in and finds some blanked data and puts it in the splitters. It'll get handled. It's Saturday night, and I'm sure they won't be back around to deal with any of that till Monday morning. Kind of expected at this point in time. I know there has been some talk about re-striping the raid etc. So maybe they want it dried up for a reason, or maybe they are still underestimating the power of the new servers. Either way my machines have work according to Boinc for the next 8/5 days and then some. Traveling through space at ~67,000mph! ID: 1057697 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1057700 - Posted: 19 Dec 2010, 4:14:22 UTC - in response to Message 1057392. ... Not sure how accurate their cricket graphs are but the highest I've ever seen it was about 97mbps, so kind of makes you wonder what QoS they do have running. ... I've looked numerous times at setting up Cricket on my own equipment (to no success so far), but from what I've seen, the "chirps" (each vertical column of pixels) can be configured as an average of an interval of a time period. It's not a case of making a query to the hardware every X seconds, but instead, the hardware sends out SNMP packets saying what is going on, and Cricket just simply listens to those packets and makes sense of the data it is looking for. The default is, I believe, 5 minutes per chirp. That makes it not an excellent tool for real-time monitoring, but for time-lapse trends instead. As far as what the throttling is set for, or the QoS... no idea. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1057700 ·
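
The point above about 5 minute averages hiding short peaks can be illustrated with a small sketch. All numbers here are made up for illustration (a steady 60 Mbps with one 30 second burst at 100 Mbps); they are not taken from the SETI@home graphs, and the function name is hypothetical. It says nothing about how Cricket collects its samples, only what averaging over a fixed interval does to a short spike.

```python
def bin_averages(samples: list[float], bin_size: int) -> list[float]:
    """Average consecutive groups of `bin_size` samples.

    Each returned value corresponds to one plotted column on a graph
    that shows one averaged point per interval.
    """
    return [
        sum(samples[i:i + bin_size]) / len(samples[i:i + bin_size])
        for i in range(0, len(samples), bin_size)
    ]


if __name__ == "__main__":
    # Hypothetical per-second traffic in Mbps over 10 minutes.
    samples = [60.0] * 600
    for second in range(100, 130):  # a 30 second burst at link speed
        samples[second] = 100.0

    averages = bin_averages(samples, bin_size=300)  # 5 minute columns
    print("true peak:", max(samples))
    print("5 minute averages:", averages)
```

With these made-up figures the true peak is 100 Mbps, but the column containing the burst only shows 64 Mbps. A graph built this way reaches the link maximum only when the link stays saturated for most of a full interval.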

Scarecrow Send message Joined: 15 Jul 00 Posts: 4520 Credit: 486,601 RAC: 0	Message 1057703 - Posted: 19 Dec 2010, 4:42:59 UTC Last modified: 19 Dec 2010, 4:44:16 UTC I think I feel a disturbance in the Scheduler force. Project communication failed: attempting access to reference site Scheduler request failed: Server returned nothing (no headers, no data) Internet access OK - project servers may be temporarily down. ID: 1057703 ·

-BeNt- Send message Joined: 17 Oct 99 Posts: 1234 Credit: 10,116,112 RAC: 0	Message 1057720 - Posted: 19 Dec 2010, 5:42:41 UTC - in response to Message 1057700. ... Not sure how accurate their cricket graphs are but the highest I've ever seen it was about 97mbps, so kind of makes you wonder what QoS they do have running. ... I've looked numerous times at setting up Cricket on my own equipment (to no success so far), but from what I've seen, the "chirps" (each vertical column of pixels) can be configured as an average of an interval of a time period. It's not a case of making a query to the hardware every X seconds, but instead, the hardware sends out SNMP packets saying what is going on, and Cricket just simply listens to those packets and makes sense of the data it is looking for. The default is, I believe, 5 minutes per chirp. That makes it not an excellent tool for real-time monitoring, but for time-lapse trends instead. As far as what the throttling is set for, or the QoS... no idea. Yeah sounds about right. I've setup MRTG on hardware numerous times. And it only updates at max every 5 minutes. So that could be, I suppose, the reason why it never goes over that, however on hardware I've dealt with if you spike all the way out it will show eventually with the graphs running 24/7. No huge deal anyways as it's just an indication not 100% since it only updates every 5 minutes or so. Traveling through space at ~67,000mph! ID: 1057720 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13797 Credit: 208,696,464 RAC: 304	Message 1058350 - Posted: 21 Dec 2010, 7:33:18 UTC I was going to post about the assimilator queue rapidly growing, but then read in the Tech News that Matt has turned them off in preparation for some shuffling about of data over the next couple of days. Grant Darwin NT ID: 1058350 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51470 Credit: 1,018,363,574 RAC: 1,004	Message 1058426 - Posted: 21 Dec 2010, 13:14:48 UTC Ruh roh........ I noticed some glitches and timeouts in forum access in the last half hour or so. And now I see that the Cricket graphs seem to have taken a dive. Could there be problems in kittyland? Meow? "Time is simply the mechanism that keeps everything from happening all at once." ID: 1058426 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14661 Credit: 200,643,578 RAC: 874	Message 1058427 - Posted: 21 Dec 2010, 13:17:54 UTC - in response to Message 1058426. Probably just a maintenance brown-out, like Matt posted about last week. ID: 1058427 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51470 Credit: 1,018,363,574 RAC: 1,004	Message 1058428 - Posted: 21 Dec 2010, 13:19:11 UTC - in response to Message 1058427. Probably just a maintenance brown-out, like Matt posted about last week. At THIS time of the morning????? "Time is simply the mechanism that keeps everything from happening all at once." ID: 1058428 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14661 Credit: 200,643,578 RAC: 874	Message 1058430 - Posted: 21 Dec 2010, 13:28:14 UTC - in response to Message 1058428. Probably just a maintenance brown-out, like Matt posted about last week. At THIS time of the morning????? Quite likely - it was around this time last week too. Campus maintenance elves have to work unsocial hours - they get complaints from faculty otherwise. ID: 1058430 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.