Panic Mode On (97) Server Problems?

Message boards : Number crunching : Panic Mode On (97) Server Problems?

Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1688220 - Posted: 5 Jun 2015, 18:19:15 UTC - in response to Message 1688212.  
Last modified: 5 Jun 2015, 18:19:52 UTC

What is the answer to life the universe and everything?


42
ID: 1688220 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1688230 - Posted: 5 Jun 2015, 18:39:32 UTC - in response to Message 1688212.  
Last modified: 5 Jun 2015, 18:44:49 UTC

I've had situations where a power failure over a holiday or some other event caused my machines to be off. So when I get around to bringing them up I will abort work that is already past due or too near its due date.


This is different.

At least 22 IDENTICAL computers from an ANONYMOUS user.

Almost all of this user's Astropulse tasks are aborted by GUI just before deadline.

203 (0xcb) EXIT_ABORTED_VIA_GUI

Why?
Who?
What is the answer to life the universe and everything?

I don't have nearly that many machines, and I have my hosts shown. However, out of my 30 or so machines, 5 are identical. So that is at least 500, or up to 3500, tasks that would get dumped if I were in that situation. I know I have dumped as many as 1500 in one go before.

BOINCstats shows there are 591 hosts with the same CPU as those hosts that dumped their work. I like to think all 591 are from 1 account.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1688230 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1688232 - Posted: 5 Jun 2015, 18:54:15 UTC

Berkeley, Seti@Home staff.

We have a problem.

Anonymous user.

Several identical computers.

Several hundred Astropulse tasks.

Aborted, 203 (0xcb) EXIT_ABORTED_VIA_GUI.

Just before Time Out.

Bug in system, or Script from user.
ID: 1688232 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1688239 - Posted: 5 Jun 2015, 19:16:26 UTC

Looking through the few more tasks I got, I found another runaway machine that has lots and lots of MB invalids. host 7259786

I have not sent a PM, because I'm not sure what the problem is. I figure some of you can get a better idea of what may be going wrong. I just see a very bad ratio of valid:invalid (49:491, presently), with 943 inconclusives.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1688239 · Report as offensive
wiesel111

Joined: 5 Jan 08
Posts: 9
Credit: 1,227,675
RAC: 1
Germany
Message 1688258 - Posted: 5 Jun 2015, 20:08:14 UTC - in response to Message 1688144.  
Last modified: 5 Jun 2015, 20:24:15 UTC

Here is a small update on the identical computers. One was listed twice (7568508), but I found some more: 7572890, 7567931, 7568597, 7568503, 7568519, 7572867

In chronological order of creation (all between the 7th and 9th of May, the last two on the 13th of May):
7567886, 7567912, 7567913, 7567924, 7567931, 7567941, 7567951, 7568499, 7568501, 7568503, 7568504, 7568508, 7568511, 7568512, 7568513, 7568516, 7568519, 7568596, 7568597, 7568637, 7568640, 7568642, 7568643, 7568646, 7569519, 7572867, 7572890

27 identical hosts from this anonymous user, and there may be many more...
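
A hand-collected list like this is easy to check for repeats by counting each ID and sorting the result. The snippet below is only a sketch: the IDs are copied from the list above, 7568508 is entered twice on purpose to mimic the earlier slip, and the variable names are made up. Sorting by ID matches creation order only on the assumption that host IDs are handed out sequentially.

```python
from collections import Counter

# Host IDs copied from the list above; 7568508 appears twice on purpose
# to mimic the duplicate that slipped into the earlier post.
reported = [
    7567886, 7567912, 7567913, 7567924, 7567931, 7567941, 7567951,
    7568499, 7568501, 7568503, 7568504, 7568508, 7568508, 7568511,
    7568512, 7568513, 7568516, 7568519, 7568596, 7568597, 7568637,
    7568640, 7568642, 7568643, 7568646, 7569519, 7572867, 7572890,
]

counts = Counter(reported)

# IDs that were entered more than once.
duplicates = sorted(host for host, n in counts.items() if n > 1)

# Each host exactly once, in ascending ID order.
unique_hosts = sorted(counts)

print("duplicates:", duplicates)
print("unique hosts:", len(unique_hosts))
```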

Edit:
BOINCstats shows there are 591 hosts with the same CPU as those hosts that dumped their work. I like to think all 591 are from 1 account.

I don't think all 591 belong to this anonymous user. All of the ones I looked at were created in the last 4 weeks and have not earned much credit; most of their tasks were aborted. Only a few APs were crunched and, as far as I can see, no MB.
ID: 1688258 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1688264 - Posted: 5 Jun 2015, 20:45:52 UTC - in response to Message 1688239.  

Looking through the few more tasks I got, I found another runaway machine that has lots and lots of MB invalids. host 7259786

I have not sent a PM, because I'm not sure what the problem is. I figure some of you can get a better idea of what may be going wrong. I just see a very bad ratio of valid:invalid (49:491, presently), with 943 inconclusives.

A quick look shows the GPU to be overclocked (1267 MHz vs. the stock 1046 MHz), so that could be part of the problem.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1688264 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1688270 - Posted: 5 Jun 2015, 21:00:43 UTC - in response to Message 1688220.  

What is the answer to life the universe and everything?


42


+1

That would make it 43 ;)
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1688270 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1688274 - Posted: 5 Jun 2015, 21:03:59 UTC - in response to Message 1688240.  

Berkeley, Seti@Home staff.

We have a problem.

Anonymous user.

Several identical computers.

Several hundred Astropulse tasks.

Aborted, 203 (0xcb) EXIT_ABORTED_VIA_GUI.

Just before Time Out.

Bug in system, or Script from user.

Or the project dumping AP's from users that likely will not crunch them in time, so they can go back out again. All in preparation for the introduction of the new repaired full AP database.

That may well be true.


Be ...
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1688274 · Report as offensive
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1688339 - Posted: 6 Jun 2015, 0:01:07 UTC - in response to Message 1688078.  

Hi Phil,

Don't let it bother you:-) It crept up on me some time ago:-)

It's not age that's the problem, it's everyone else being younger:-)

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1688339 · Report as offensive
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1688340 - Posted: 6 Jun 2015, 0:07:49 UTC - in response to Message 1688107.  

Hi Jord,


People testing such hardware will have to sign a Non-Disclosure Agreement (NDA) that states that while you may use the hardware for anything you want, you will also do everything in your power to hide from the world that you are testing this unreleased and unannounced hardware. And as such, at projects you can set to have your computers shown or not.


I don't think that there are all that many users with NDAs crunching WUs, any more than those actually being stalked.

I can't see a problem with the user's ID being undisclosed as long as it's not permanent; it should only happen in special circumstances with the approval of S@H staff.
Not just available as an option anyone can choose willy-nilly.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1688340 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1688365 - Posted: 6 Jun 2015, 2:27:36 UTC - in response to Message 1688340.  
Last modified: 6 Jun 2015, 2:35:22 UTC

Hi Jord,


People testing such hardware will have to sign a Non-Disclosure Agreement (NDA) that states that while you may use the hardware for anything you want, you will also do everything in your power to hide from the world that you are testing this unreleased and unannounced hardware. And as such, at projects you can set to have your computers shown or not.


I don't think that there are all that many users with NDAs crunching WUs, any more than those actually being stalked.

I can't see a problem with the user's ID being undisclosed as long as it's not permanent; it should only happen in special circumstances with the approval of S@H staff.
Not just available as an option anyone can choose willy-nilly.

Regards,

And WHY NOT?
If somebody wishes to participate in this project and does not wish anybody else on the planet to know that they are doing so, or what hardware they are running it on, what is the problem with that?
They can already hide their identity with whatever screen name they wish to use.
Why should they be forced to display the platforms they are running on?
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1688365 · Report as offensive
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1688381 - Posted: 6 Jun 2015, 3:14:57 UTC - in response to Message 1688365.  

Hi Mark,

Because far too often the anon user simply abuses it, as seen from recent posts: far too large a cache, aborted WUs, timed-out WUs, and as often as not the user gets a large tranche of WUs and then takes a hike.

Leaving every one of their wingmen waiting for better days to get the WUs 'they' have spent 'their' time, resources and money completing, just to have some anon prat do a runner and leave them waiting for the server to re-assign the WU, which could just as well go to another anon prat, who will blodge around as well.

I don't see you opting to use anon, nor do the majority of other setizens.
I'd rather like to get a return on my effort before I go toes up:-)

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1688381 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1688384 - Posted: 6 Jun 2015, 3:22:15 UTC - in response to Message 1688381.  
Last modified: 6 Jun 2015, 3:35:45 UTC

Hi Mark,

Because far too often the anon user simply abuses it, as seen from recent posts: far too large a cache, aborted WUs, timed-out WUs, and as often as not the user gets a large tranche of WUs and then takes a hike.

Leaving every one of their wingmen waiting for better days to get the WUs 'they' have spent 'their' time, resources and money completing, just to have some anon prat do a runner and leave them waiting for the server to re-assign the WU, which could just as well go to another anon prat, who will blodge around as well.

I don't see you opting to use anon, nor do the majority of other setizens.
I'd rather like to get a return on my effort before I go toes up:-)

Regards,

I don't hide my computers. I am actually rather proud of what I do with the rather aging mishmosh of what I run here...LOL.

All I am saying is some may have a reason to do so, and the project allows for that. You can have a user out in the open that has the same problems of caching abuse, errors up the wazoo, etc., etc., etc., and if they don't respond to PMs, it's really no different than if they were hiding their computers. Might make it easier to diagnose the problem at hand, but does not change the situation.

And given the microscopic percentage of users that we here on the Number Crunching board might call out as having a problem....
I don't see it as an issue, I really don't.

I don't know what the number of current ACTIVE users is right now on Seti, but if you take that into consideration, you might see my point on just how insignificant it is........(according to Boincstats, there are 113,727 active users as I look it up.)

EDIT....
The only viewpoint on the subject that I DO agree with is that it would be wonderful if the servers could properly diagnose and deal with as many of the rogue rigs on the project as possible via server-side coding. But, first, we have to have somebody with time to diagnose the situations and do the coding.
And the other fly in the ointment is that some very old versions of Boinc, although still capable of running a computer and doing good work at this time, do not recognize some of the limiting coding already in place.
And I don't think the project is ready to chop any existing, working version of Boinc off at the knees and disconnect those users. I am sure there are many thousands of them out there still producing good results.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1688384 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1688404 - Posted: 6 Jun 2015, 4:43:22 UTC

Back in October, 2013, I did some analysis of about 80,000 WUs that I had processed in the previous 6 months to see if my "Anonymous" wingmen were demonstrably more responsible for trashing WUs than were identifiable wingmen (which was my "gut" feeling at the time). Surprisingly, the bottom line was that "Anonymous" users (12.87% failure rate) were only marginally worse Setizens than everybody else (10.66% failure rate). Take a look at my Message 1434876 for the details of my findings.
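
To put "only marginally worse" into numbers, the two quoted rates can be compared both as an absolute gap and relative to the identifiable-wingman baseline. This is just a worked calculation using the two percentages from the paragraph above; the variable names are made up, and nothing here reproduces the underlying 80,000-WU analysis.

```python
# Failure rates quoted in the paragraph above, in percent.
anonymous_rate = 12.87
identified_rate = 10.66

# Absolute gap in percentage points.
gap_points = anonymous_rate - identified_rate

# Relative gap: how much higher the anonymous rate is than the baseline.
relative_gap = gap_points / identified_rate

print(f"gap: {gap_points:.2f} percentage points")
print(f"relative: {relative_gap:.1%} higher than identifiable wingmen")
```

So the anonymous wingmen fail about 2.2 percentage points more often, which is roughly a fifth more than the baseline rate.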

There are also some very valuable crunchers who choose to keep their computers hidden. Charles Long, for instance, the top cruncher by far for about a year now, is apparently still stress-testing a large number of his "hidden" data center computers using S@h. I know that last summer he would apparently sign up many new boxes on the same day and load them up with tasks. Some of those computers would only appear to be active for 3 or 4 weeks, then stop cold, leaving the remaining tasks to time out. Perhaps that wasn't particularly considerate of his wingmen, but I would consider that a minor quibble in light of the huge contribution he's made (and continues to make) to the project. I think there are some other data center administrators who also keep their computers hidden for perfectly legitimate reasons, and to outlaw "Anonymous" users could very well cause a rapid decline in their participation in S@h.
ID: 1688404 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1688407 - Posted: 6 Jun 2015, 5:01:11 UTC - in response to Message 1688404.  

Back in October, 2013, I did some analysis of about 80,000 WUs that I had processed in the previous 6 months to see if my "Anonymous" wingmen were demonstrably more responsible for trashing WUs than were identifiable wingmen (which was my "gut" feeling at the time). Surprisingly, the bottom line was that "Anonymous" users (12.87% failure rate) were only marginally worse Setizens than everybody else (10.66% failure rate). Take a look at my Message 1434876 for the details of my findings.

There are also some very valuable crunchers who choose to keep their computers hidden. Charles Long, for instance, the top cruncher by far for about a year now, is apparently still stress-testing a large number of his "hidden" data center computers using S@h. I know that last summer he would apparently sign up many new boxes on the same day and load them up with tasks. Some of those computers would only appear to be active for 3 or 4 weeks, then stop cold, leaving the remaining tasks to time out. Perhaps that wasn't particularly considerate of his wingmen, but I would consider that a minor quibble in light of the huge contribution he's made (and continues to make) to the project. I think there are some other data center administrators who also keep their computers hidden for perfectly legitimate reasons, and to outlaw "Anonymous" users could very well cause a rapid decline in their participation in S@h.

LOL...I have a slightly different slant on users like Charles Long, who seem to have unlimited access to unlimited resources.
It makes my own contributions to the project seem a bit skewed. I happen to be one of the few (I know one other, with whom I am friends) who can do what we do from our own homes, with our own resources, and when somebody comes in and stomps on us with a corporate server farm or such, it kinda steals our thunder....LOL.

All is fair in love and Seti.

I wish there was a way to discern between the two types of activity, but I am content in that most intelligent folks can understand that CL has access to many servers, I have 9 computers here that are under my control at all times, and struggle at times to keep running.
Not a mega server farm.

The kitty crunching farm is made up of bits accumulated over the years. Gifts from many admirers, GPUs from far and near. The oldest is my first core 2 duo.
The newest is a gift from a fellow cruncher, a mobo, that is also at least 3 years old. The GPUs were for the most part donated to the kitties by a true friend that sent them when he retired them in upgrades.

This, my friends, is what legends are made of. Not some guy with a keyboard and access to a hundred servers.
No diss on CL..but, I did it ..............with a little help from my friends.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1688407 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1688441 - Posted: 6 Jun 2015, 7:07:10 UTC
Last modified: 6 Jun 2015, 7:12:35 UTC

I remember discussing Charles Long last year. I think Richard had done some digging and found multiple blocks of anywhere from 20 to hundreds of identical machines that were created within minutes of each other, and all the hardware was identical, with 12-32 core CPUs at that. So... he's got to be at a datacenter, and quite a sizable one at that.

Just imagine how much more the RAC could be with optimized apps... but then.. who would want to install those on potentially a thousand machines?



Regarding hiding computers and being anonymous.. what if there were a compromise? It's not total anonymity like it is now... but what if you could just disable the public list of the machines for your account, but all individual machines (when you find them as a wingmate to one of your own tasks) would still show who they belong to? At least then, you'd be able to contact the owner, but if someone has a hundred machines, the rest of us wouldn't know how many they truly had.

Or.. what if there was a "report this host to admins" button where you had to A) prove that you're a human (CAPTCHA or something similar) and B) state why you are reporting that machine (similar to what I mentioned yesterday about a machine that has a valid:invalid ratio of 49:491.. i.e. roughly 1 in 10 being a good enough reason to report them)? Of course, that would add one more thing for the over-worked staff to handle.. but it's still an idea.
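
As a rough sketch of the kind of check such a button could run before accepting a report, the snippet below turns the valid:invalid counts into a share of decided tasks and compares it with a cut-off. The function name and the 50% threshold are purely hypothetical and not anything the project uses; inconclusive tasks are left out because they have not been decided either way.

```python
def valid_fraction(valid: int, invalid: int) -> float:
    """Share of validated results among tasks that reached a verdict."""
    decided = valid + invalid
    if decided == 0:
        raise ValueError("no decided tasks yet")
    return valid / decided


# Hypothetical cut-off for "bad enough to report"; not a project rule.
REPORT_THRESHOLD = 0.5

# Counts quoted for the host mentioned above (49 valid, 491 invalid).
fraction = valid_fraction(49, 491)

print(f"valid share: {fraction:.1%}")
print("worth reporting:", fraction < REPORT_THRESHOLD)
```

With 49 valid out of 540 decided tasks, the valid share comes to about 9%, which lines up with the "roughly 1 in 10" figure.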
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1688441 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1688545 - Posted: 6 Jun 2015, 14:34:08 UTC - in response to Message 1688340.  
Last modified: 6 Jun 2015, 15:24:27 UTC

I can't see a problem with the user's ID being undisclosed as long as it's not permanent; it should only happen in special circumstances with the approval of S@H staff.

Not just available as an option anyone can choose willy-nilly.

I think the staff has more important things to do than decide whether or not a person asking to be hidden, so that he can run work on new alpha hardware, can have his request granted. Aside from that, there are time zones. A person in Japan may not want to wait 20+ hours for approval of something that shouldn't be a problem in the first place. And what about vacations? With staff on vacation, a person would have to wait 3 weeks or more for approval, in this electronic day and age? Next you'll demand that people send in their request through snail mail to the Berkeley P.O. box or use a carrier pigeon. :)

Even if there's only a handful of users that are testing certain hardware or software, they should be allowed to decide for themselves and given the option to hide their names. And as said, in case of gross wrong-doing, the staff can reach them because they can read the person's email address. If that's at least an existing address, because anyone can make an account here and fill in a non-existing email address. But even then, the administrators can disable the account quite easily.

Over at CPDN the admins work closely together with the board moderators. When someone finds another user is wasting models by the tens, hundreds and more, their hostID is given to the admin, who will block downloads to that host. That will always attract the attention of the user, who will then come complain in the forums or in PM to one of the moderators that he cannot download work. After which he'll be told that his system has a problem, and that it's blocked until he has had time to fix his problem. And only when he has done so and shown he has done so, will his hostID be allowed to download work again off that project.

That's something the staff here could do as well. It's friendlier than many of the other solutions, and still allows the user to stay anonymous if he wants to be.
ID: 1688545 · Report as offensive
Profile JaundicedEye
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1688548 - Posted: 6 Jun 2015, 14:41:47 UTC

Perhaps the larger question here is how does 'anonymous' amass such a cache of AP's on a brand new machine? I guess I don't understand the scheduler priorities.

"Sour Grapes make a bitter Whine." <(0)>
ID: 1688548 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1688630 - Posted: 6 Jun 2015, 18:34:20 UTC - in response to Message 1688548.  

Perhaps the larger question here is how does 'anonymous' amass such a cache of AP's on a brand new machine? I guess I don't understand the scheduler priorities.

That is a pretty valid curiosity question, for sure. The rest of us have time-tested, reliable machines and when APs are being split, we're lucky to get a small handful of tasks, meanwhile, these users with farms of machines that were created the same day the splitting started can somehow get 100+ APs on each machine. *scratches head* Something tastes weird about that.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1688630 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1688638 - Posted: 6 Jun 2015, 18:54:36 UTC - in response to Message 1688630.  

Perhaps the larger question here is how does 'anonymous' amass such a cache of AP's on a brand new machine? I guess I don't understand the scheduler priorities.

That is a pretty valid curiosity question, for sure. The rest of us have time-tested, reliable machines and when APs are being split, we're lucky to get a small handful of tasks, meanwhile, these users with farms of machines that were created the same day the splitting started can somehow get 100+ APs on each machine. *scratches head* Something tastes weird about that.

When you request work every 310 seconds you have a much higher chance of getting work than with whatever BOINC does by default. If BOINC has a low cache condition it will check every 5 min on its own unless it can't contact the server.
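
To get a feel for how much difference the request interval makes, here is a back-of-the-envelope count of how many scheduler requests fit into one day. It is only arithmetic on the 310-second figure above; the hourly interval is a made-up comparison value, and the sketch does not model BOINC's real backoff behaviour or the odds that any single request returns work.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def requests_per_day(interval_seconds: int) -> int:
    """Upper bound on scheduler requests a host can make in one day."""
    return SECONDS_PER_DAY // interval_seconds


# 310 s is the interval mentioned above; 3600 s is a made-up comparison
# value for a host that only asks once an hour.
for interval in (310, 3600):
    print(interval, "s ->", requests_per_day(interval), "requests/day")
```

A host asking every 310 seconds gets 278 chances a day to catch freshly split APs, against 24 for one that asks hourly.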
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1688638 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.