Panic Mode On (97) Server Problems?
Author | Message |
---|---|
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
What is the answer to life the universe and everything? 42 |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
I've had situations where a power failure over a holiday or some other event caused my machines to be off. So when I get around to bringing them up I will abort work that is already past due or too near its due date. I don't have nearly that many machines, and I have my hosts shown. However, out of my 30 or so machines 5 are identical. So that is at least 500, or up to 3500, tasks that would get dumped if I was in that situation. I know I have dumped as many as 1500 in one go before. Looking at BOINCstats it shows there are 591 hosts with the same CPU as those hosts that dumped their work. I like to think all 591 are from 1 account. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |
WezH Send message Joined: 19 Aug 99 Posts: 576 Credit: 67,033,957 RAC: 95 |
Berkeley, Seti@Home staff. We have a problem. Anonymous user. Several identical computers. Several hundred Astropulse tasks. Aborted, 203 (0xcb) EXIT_ABORTED_VIA_GUI. Just before Time Out. Bug in system, or Script from user. |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Looking through the few more tasks I got, I found another runaway machine that has lots and lots of MB invalids. host 7259786 I have not sent a PM, because I'm not sure what the problem is. I figure some of you can get a better idea of what may be going wrong. I just see a very bad ratio of valid:invalid (49:491, presently), with 943 inconclusives. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
wiesel111 Send message Joined: 5 Jan 08 Posts: 9 Credit: 1,227,675 RAC: 1 |
Here is a small update on the identical computers. One was doubled (7568508), but I found some more: 7572890, 7567931, 7568597, 7568503, 7569519, 7572867 In chronological order of creation (all between the 7th and 9th of May, the last two on the 13th of May): 7567886, 7567912, 7567913, 7567924, 7567931, 7567941, 7567951, 7568499, 7568501, 7568503, 7568504, 7568508, 7568511, 7568512, 7568513, 7568516, 7568519, 7568596, 7568597, 7568637, 7568640, 7568642, 7568643, 7568646, 7569519, 7572867, 7572890 27 identical hosts from this anonymous user, and there may be many more... Edit: Looking at BOINCstats it shows there are 591 hosts with the same CPU as those hosts that dumped their work. I like to think all 591 are from 1 account. I don't think all 591 belong to this anonymous user; all of the ones I looked at were created in the last 4 weeks and have not earned very many points, and most of their tasks are aborted. Only a few APs were crunched and, as far as I can see, no MB. |
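For anyone who wants to double-check a host list like the one above before passing it on, here is a minimal sketch in Python. It only uses the host IDs quoted in the post; the variable names are made up for illustration and nothing here talks to the project servers.

```python
# Host IDs copied from the post above. The variable names are illustrative only.
newly_found = [7572890, 7567931, 7568597, 7568503, 7569519, 7572867]

all_hosts = [
    7567886, 7567912, 7567913, 7567924, 7567931, 7567941, 7567951,
    7568499, 7568501, 7568503, 7568504, 7568508, 7568511, 7568512,
    7568513, 7568516, 7568519, 7568596, 7568597, 7568637, 7568640,
    7568642, 7568643, 7568646, 7569519, 7572867, 7572890,
]

# Remove any accidental repeats and sort by host ID, which is assumed
# here to follow creation order.
unique_hosts = sorted(set(all_hosts))

# How many entries were listed more than once.
duplicates = len(all_hosts) - len(unique_hosts)

# Any newly found host that did not make it into the full list.
missing = [h for h in newly_found if h not in unique_hosts]

print(f"{len(unique_hosts)} unique hosts, {duplicates} duplicates, missing: {missing}")
```

Sorting by ID only approximates "order of creation" if host IDs are handed out sequentially, which is an assumption rather than something stated in the thread.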
rob smith Send message Joined: 7 Mar 03 Posts: 22190 Credit: 416,307,556 RAC: 380 |
Looking through the few more tasks I got, I found another runaway machine that has lots and lots of MB invalids. host 7259786 A quick look shows the GPU to be overclocked (1267 MHz vs. stock 1046 MHz), so that could be part of the problem. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
What is the answer to life the universe and everything? +1 That would make it 43 ;) To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
Berkeley, Seti@Home staff. That may be so true. Det kan vara sant ("That may be true"). Be ... To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
Hi Phil, Don't let it bother you:-) It crept up on me some time ago:-) It's not age that's the problem, it's everyone else being younger:-) Regards, Cliff, Been there, Done that, Still no damm T shirt! |
cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
Hi Jord,
I don't think that there are all that many users with NDAs crunching WUs, any more than there are users actually being stalked. I can't see a problem with the user's ID being undisclosed as long as it's not permanent; it should only happen in special circumstances with the approval of S@H staff, not just be available as an option anyone can choose willy-nilly. Regards, Cliff, Been there, Done that, Still no damm T shirt! |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Hi Jord, And WHY NOT? If somebody wishes to participate in this project and does not wish anybody else on the planet to know that they are doing so, or what hardware they are running it on, what is the problem with that? They can already hide their identity with whatever screen name they wish to use. Why should they be forced to display the platforms they are running on? "Freedom is just Chaos, with better lighting." Alan Dean Foster |
cliff Send message Joined: 16 Dec 07 Posts: 625 Credit: 3,590,440 RAC: 0 |
Hi Mark, Because far too often the anon user simply abuses it, as seen from recent posts: far too large a cache, aborted WUs, timed-out WUs, and as often as not the user gets a large tranche of WUs then takes a hike, leaving every one of their wingmen waiting for better days to get WUs 'they' have spent 'their' time, resources and money completing, just to have some anon prat do a runner and leave them waiting for the server to re-assign the WU, which could just as well be to another anon prat, who will blodge around as well. I don't see you opting to use anon, nor do the majority of other setizens. I'd rather like to get a return on my effort before I go toes up:-) Regards, Cliff, Been there, Done that, Still no damm T shirt! |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Hi Mark, I don't hide my computers. I am actually rather proud of what I do with the rather aging mishmosh of what I run here...LOL. All I am saying is some may have a reason to do so, and the project allows for that. You can have a user out in the open that has the same problems of caching abuse, errors up the wazoo, etc., etc., etc., and if they don't respond to PMs, it's really no different than if they were hiding their computers. Might make it easier to diagnose the problem at hand, but does not change the situation. And given the microscopic percentage of users that we here on the Number Crunching board might call out as having a problem.... I don't see it as an issue, I really don't. I don't know what the number of current ACTIVE users is right now on Seti, but if you take that into consideration, you might see my point on just how insignificant it is........(according to Boincstats, there are 113,727 active users as I look it up.) EDIT.... The only viewpoint on the subject that I DO agree with is that it would be wonderful if the servers could properly diagnose and deal with as many of the rogue rigs on the project as possible via server side coding. But, first, we have to have somebody with time to diagnose the situations and do the coding. And the other fly in the ointment is that some very old versions of Boinc, although still capable of running a computer and doing good work at this time, do not recognize some of the limiting coding already in place. And I don't think the project is ready to chop any existing, working version of Boinc off at the knees and disconnect those users. I am sure there are many thousands of them out there still producing good results. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Jeff Buck Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Back in October, 2013, I did some analysis of about 80,000 WUs that I had processed in the previous 6 months to see if my "Anonymous" wingmen were demonstrably more responsible for trashing WUs than were identifiable wingmen (which was my "gut" feeling at the time). Surprisingly, the bottom line was that "Anonymous" users (12.87% failure rate) were only marginally worse Setizens than everybody else (10.66% failure rate). Take a look at my Message 1434876 for the details of my findings. There are also some very valuable crunchers who choose to keep their computers hidden. Charles Long, for instance, the top cruncher by far for about a year now, is apparently still stress-testing a large number of his "hidden" data center computers using S@h. I know that last summer he would apparently sign up many new boxes on the same day and load them up with tasks. Some of those computers would only appear to be active for 3 or 4 weeks, then stop cold, leaving the remaining tasks to time out. Perhaps that wasn't particularly considerate of his wingmen, but I would consider that a minor quibble in light of the huge contribution he's made (and continues to make) to the project. I think there are some other data center administrators who also keep their computers hidden for perfectly legitimate reasons, and to outlaw "Anonymous" users could very well cause a rapid decline in their participation in S@h. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Back in October, 2013, I did some analysis of about 80,000 WUs that I had processed in the previous 6 months to see if my "Anonymous" wingmen were demonstrably more responsible for trashing WUs than were identifiable wingmen (which was my "gut" feeling at the time). Surprisingly, the bottom line was that "Anonymous" users (12.87% failure rate) were only marginally worse Setizens than everybody else (10.66% failure rate). Take a look at my Message 1434876 for the details of my findings. LOL...I have a slightly different slant on users like Charles Long, who seem to have unlimited access to unlimited resources. It makes my own contributions to the project seem a bit skewed, as I happen to be one of the few, and I know one other, with whom I am friends, that can do what we do from our own homes, with our own resources, and when somebody comes in and stomps on us with a corporate server farm or such, it kinda takes our thunder....LOL. All is fair in love and Seti. I wish there was a way to discern between the two types of activity, but I am content in that most intelligent folks can understand that CL has access to many servers, while I have 9 computers here that are under my control at all times, and struggle at times to keep running. Not a mega server farm. The kitty crunching farm is made up of bits accumulated over the years. Gifts from many admirers, GPUs from far and near. The oldest is my first core 2 duo. The newest is a gift from a fellow cruncher, a mobo, that is also at least 3 years old in the tooth. The GPUs were in most part donated to the kitties by a true friend that sent them when he retired them in upgrades. This, my friends, is what legends are made of. Not some guy with a keyboard and access to a hundred servers. No diss on CL..but, I did it ..............with a little help from my friends. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I remember discussing Charles Long last year. I think Richard had done some digging and found multiple blocks of 20 to hundreds of identical machines that were created within minutes of each other, and all the hardware was identical, and 12-32 core CPUs at that. So... he's got to be at a datacenter, and a quite sizable one at that. Just imagine how much more the RAC could be with optimized apps... but then.. who would want to install those on potentially a thousand machines? Regarding hiding computers and being anonymous.. what if there were a compromise? It's not total ambiguity like it is now... but what if you could just disable the public list of the machines for your account, but all individual machines (when you find them as a wingmate to one of your own tasks) would still show who they belong to? At least then you'd be able to contact the owner, but if someone has a hundred machines, the rest of us wouldn't know how many they truly had. Or.. what if there was a "report this host to admins" button where you had to A) prove that you're a human (CAPTCHA or something similar) and B) state why you are reporting that machine (similar to what I mentioned yesterday about a machine that has a valid:invalid ratio of 49:491.. i.e. roughly 1 in 10 being a good enough reason to report them). Of course, that would add one more thing for the over-worked staff to handle.. but it's still an idea. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
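To make the "good enough reason to report" idea concrete, here is a rough sketch in Python of how such a check could look. Everything in it is hypothetical: the function names and the 50% threshold are invented for illustration, and no such feature exists in the project as far as this thread shows. With the numbers quoted above, 49 valid against 491 invalid works out to 49 / (49 + 491), about 9% valid.

```python
# Hypothetical cut-off: report a host when fewer than half of its decided
# results are valid. The value is an assumption, not a project setting.
REPORT_THRESHOLD = 0.5


def valid_fraction(valid: int, invalid: int) -> float:
    """Share of decided results that validated (inconclusives are ignored)."""
    total = valid + invalid
    if total == 0:
        raise ValueError("host has no decided results yet")
    return valid / total


def worth_reporting(valid: int, invalid: int,
                    threshold: float = REPORT_THRESHOLD) -> bool:
    """True when the host's valid share falls below the chosen threshold."""
    if valid + invalid == 0:
        return False  # nothing to judge the host on
    return valid_fraction(valid, invalid) < threshold


# Numbers quoted in the post for host 7259786.
share = valid_fraction(49, 491)
print(f"valid share: {share:.1%}, report: {worth_reporting(49, 491)}")
```

Inconclusive results (943 on that host at the time) are deliberately left out, since they have not been decided either way.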
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
I can't see a problem with the user's ID being undisclosed as long as it's not permanent; it should only happen in special circumstances with the approval of S@H staff. I think the staff has more important things to do than decide whether or not a person asking to be hidden to be able to run work on new alpha-hardware can have his request granted. Aside from that, time zones. A person in Japan may not want to wait 20+ hours for approval of something that shouldn't be a problem in the first place. And what about vacations? Staff on vacation, person will have to wait 3 weeks plus for approval, in this electronic day and age? Next you demand that people write in their request through snail mail to the Berkeley P.O. box or use a carrier pigeon. :) Even if there's only a handful of users that are testing certain hardware or software, they should be allowed to decide for themselves and given the option to hide their names. And as said, in case of gross wrong-doing, the staff can reach them because they can read the person's email address. If that's at least an existing address, because anyone can make an account here and fill in a non-existing email address. But even then, the administrators can disable the account quite easily. Over at CPDN the admins work closely together with the board moderators. When someone finds another user is wasting models by the tens, hundreds and more, their hostID is given to the admin, who will block downloads to that host. That will always attract the attention of the user, who will then come complain in the forums or in PM to one of the moderators that he cannot download work. After which he'll be told that his system has a problem, and that it's blocked until he has had time to fix his problem. And only when he has done so and shown he has done so, will his hostID be allowed to download work again off that project. That's something the staff here could do as well.
It's friendlier than many of the other solutions, and still allows the user to stay anonymous if he wants to be. |
JaundicedEye Send message Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1 |
Perhaps the larger question here is how does 'anonymous' amass such a cache of AP's on a brand new machine? I guess I don't understand the scheduler priorities. "Sour Grapes make a bitter Whine." <(0)> |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Perhaps the larger question here is how does 'anonymous' amass such a cache of AP's on a brand new machine? I guess I don't understand the scheduler priorities. That is a pretty valid curiosity question, for sure. The rest of us have time-tested, reliable machines and when APs are being split, we're lucky to get a small handful of tasks, meanwhile, these users with farms of machines that were created the same day the splitting started can somehow get 100+ APs on each machine. *scratches head* Something tastes weird about that. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Perhaps the larger question here is how does 'anonymous' amass such a cache of AP's on a brand new machine? I guess I don't understand the scheduler priorities. When you request work every 310 seconds you have a much higher chance of getting work than with whatever BOINC does by default. If BOINC has a low cache condition it will check every 5 min on its own unless it can't contact the server. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[ |