Panic Mode On (19) Server problems

BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 914990 - Posted: 6 Jul 2009, 22:43:37 UTC - in response to Message 914978.  

OK -- we may disagree about the value of disabling manual retry, but we agree that this won't have significant impact.

In my 'client wishlist' would be a means to more effectively control individual projects on systems which have multiple projects. Currently, I believe the only way that is done is via resource share or at the workstation to suspend a project, and that doesn't provide the level of control needed for project specific conditions. For example, I'd love to set a condition on the workstation of, don't talk to SETI on Tuesdays, without shutting off contact to other projects.

I think there is value in disabling the manual retry, but ultimately, you can't control people, users are never going to do what you want, and in this case, I don't think they are the source of the problem.

When I say "client" I'm talking about the BOINC program on users' machines. If BOINCstats can be trusted, there are 180,000 of those, with multiple work units, and I don't think the backoff parameters can be tuned currently at all.


ID: 914990
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915014 - Posted: 6 Jul 2009, 23:19:11 UTC - in response to Message 914973.  

Whoops -- so instead of the phrase 'silly users' (my bad choice of words which garnered a reaction), it is 'grown adults who should know better'. Seems to me that you've reintroduced a variant of my phrase here and in doing so are blaming the users. Probably not your intent.


The only nerve it touched was the suggestive comment that we are "blaming the users" when the problem lies elsewhere, and I don't feel that's an accurate accusation of my views...

...it may be a small subset of users, but grown adults should know better and not try to force the situation. Allowing the option to remain only allows abuse of that option that quite a few people do.


Successfully taken out of context.

All I've said is that the users do not help the situation. That can be easily spun into "blaming the users", but I know it's not the users causing the majority of the problems. Whatever way you choose to spin it, the fact remains that it doesn't help.

So no, not my intent, and not exactly what I was saying either.
ID: 915014
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915038 - Posted: 6 Jul 2009, 23:43:08 UTC - in response to Message 914986.  
Last modified: 6 Jul 2009, 23:56:58 UTC

Then again, I didn't perceive *myself* to be defensive either....


If you are suggesting that, just as you didn't perceive yourself to be defensive, I am not "perceiving" my own defensiveness, then that suggestion is off base.

I made a comment about taking away a problem that users were adding to, and you took off into left field about wanting to control several other aspects you did not have control over, certainly showing a degree of irritation with my "blame the users" (again, your perception) statement.

To your point about frustration levels -- I agree, that won't solve problems and may generate more heat than light. We (those who post here) are most inclined to engage in back and forth that might portray angst of one form or another simply because, by posting here, we are demonstrating a level of interest that the vast majority of SETI users simply don't have. Your participation as a moderator indicates that you too have a level of interest far above that of the vast majority of SETI users.


My participation as a Moderator indicates that I was asked to volunteer my time to help make sure things run smoothly. There are others here who seem to have a far higher interest in SETI than I do with my Mod tag.

Now the back and forth of 'I'm right, you're wrong' -- well I suspect that can cause angst, frustration, and defensiveness even among the best of us -- even, dare I say it, moderators such as yourself. Again, I don't mean that in a negative way, just voicing my view of the human condition.


That's so black and white: right and wrong. You're suggesting a game of one-upmanship, or that somehow the "back and forth" goes on only to prove oneself "right or wrong".

Why can't it simply be a discussion with subsequent differences of opinion? Your insistence that your words were not meant to be defensive, as I had labeled them, suggests that you believe this to be the case for your side. Yet at the same moment you're suggesting that I'm engaging in a "back and forth" simply to be right, and that this back and forth is causing angst, frustration and defensiveness in the best of us, with your direct suggestion "including myself", which implies that, from your perspective, this is the nature of my posts.

I think your view of the human condition is a bit skewed.

And that level of interest means that even you just might be subject to feelings of defensiveness, angst, or being insulted even when that was not the intent of the original post. (insert appropriate mantra here).


I think you've overstated my level of interest, and have assumed incorrectly from there. But I do note that you're still suggesting that I am subject to feelings of defensiveness and angst when it was inappropriate to do so - still being off base.

It may not have been your intent to insult in your original reply to me, but anyone can see from that reply that you took a defensive tone at your perception that I was "blaming the users", and then attempted misdirection and out-of-context quotes of mine to justify your replies. I questioned you on it and you denied having any feelings of defensiveness, and have even gone so far as to suggest psychological projection of said feelings back onto me, which is misplaced.

I'd not characterize you as an apologist, or purist or sycophant. The term I'd use, and NOT in a pejorative manner, is advocate.


I think this is one point I can agree with.
ID: 915038
Vistro
Joined: 6 Aug 08
Posts: 233
Credit: 316,549
RAC: 0
United States
Message 915042 - Posted: 6 Jul 2009, 23:45:51 UTC - in response to Message 915038.  

There are cases where the manual retry is necessary (it can't connect because the user made a fail, like turning on the microwave during a transfer, disconnecting the wireless card, router off, ISP outage, etc.), and after it's fixed, they can retry then.

ID: 915042
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915048 - Posted: 6 Jul 2009, 23:53:35 UTC - in response to Message 915042.  

There are cases where the manual retry is necessary (it can't connect because the user made a fail, like turning on the microwave during a transfer, disconnecting the wireless card, router off, ISP outage, etc.), and after it's fixed, they can retry then.


Then perhaps after the server has failed to respond, the retry button for that project should be disabled for as long as the automatic backoff is enacted.
ID: 915048
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 915051 - Posted: 6 Jul 2009, 23:57:19 UTC - in response to Message 915042.  

There are cases where the manual retry is necessary (it can't connect because the user made a fail, like turning on the microwave during a transfer, disconnecting the wireless card, router off, ISP outage, etc.), and after it's fixed, they can retry then.


BOINC should be able to handle this by itself.
ID: 915051
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915054 - Posted: 7 Jul 2009, 0:00:05 UTC - in response to Message 915051.  

There are cases where the manual retry is necessary (it can't connect because the user made a fail, like turning on the microwave during a transfer, disconnecting the wireless card, router off, ISP outage, etc.), and after it's fixed, they can retry then.


BOINC should be able to handle this by itself.


Indeed it already does. No manual intervention necessary.
ID: 915054
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 915056 - Posted: 7 Jul 2009, 0:01:37 UTC - in response to Message 914990.  

In my 'client wishlist' would be a means to more effectively control individual projects on systems which have multiple projects. Currently, I believe the only way that is done is via resource share or at the workstation to suspend a project, and that doesn't provide the level of control needed for project specific conditions. For example, I'd love to set a condition on the workstation of, don't talk to SETI on Tuesdays, without shutting off contact to other projects.

This is really just a case of aesthetics: you know (through out-of-band knowledge) that SETI will be down on Tuesdays, and rather than have it try, you'd like it to take a pass.

BOINC on the other hand, tries to connect, fails, and the failed connection just goes on the "retry" list.

I know that there were versions of BOINC that didn't follow resource share correctly when a project was unreachable, and that is a functional flaw that should be (and I believe, has been) fixed.

In other words, the scheduling is nice for those who don't like to see failed connects, but it doesn't really add to the capabilities.
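
For illustration only, the "don't talk to SETI on Tuesdays" wishlist item could be sketched roughly as below. This is not an existing BOINC feature or API; the `QUIET_DAYS` table and the `may_contact` function are hypothetical names made up for the sketch, and the other project name is a placeholder.

```python
from datetime import datetime

# Hypothetical per-project "quiet days" table: weekday numbers (Monday = 0)
# on which the client would skip all contact with that project.
QUIET_DAYS = {
    "SETI@home": {1},  # 1 = Tuesday, the weekly maintenance day mentioned above
}


def may_contact(project: str, now: datetime) -> bool:
    """Return True if the client may contact `project` at time `now`."""
    return now.weekday() not in QUIET_DAYS.get(project, set())
```

A client loop would check `may_contact(project, datetime.now())` before each upload or scheduler request. Projects that are not listed in the table are never blocked, so contact with other projects carries on as usual.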

ID: 915056
Vistro
Joined: 6 Aug 08
Posts: 233
Credit: 316,549
RAC: 0
United States
Message 915068 - Posted: 7 Jul 2009, 0:13:47 UTC - in response to Message 915056.  
Last modified: 7 Jul 2009, 0:14:35 UTC

I don't want to be "this guy"... but... I suppose somebody has to break the news...


Uploading has no problems now.

EDIT: Of course now they won't go through. Disregard..... >:(
ID: 915068
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 915076 - Posted: 7 Jul 2009, 0:20:26 UTC - in response to Message 915014.  

Again, as I noted, I didn't think it was your intent. But please, do understand, if my phrasing was poor (and I agree with you here, can you agree with me agreeing with you?), I'd submit yours was as well.

I actually don't think it is an out of context thing here - I think your phrasing, whatever the context, is no more appropriate than mine. I suggested you wanted to see the change to protect the project from 'silly users' (my words, not yours) by not allowing the retry option, and you replied that this is unfair, but then indicated what you wanted to do was remove an unhelpful option in order to stop people who don't act as grown adults. OK, fair enough.

The thing is, we both agree, and heck Ned agrees as well, that this is not the cause of the 'stuck uploads' problem. I believe we all agree that the root cause is that there is a bandwidth constraint (which is particularly evident in its adverse effects at this juncture only with the SETI project). Ned has suggested that a significant contributing factor is the default (and as far as either he or I can figure out, currently fixed in the BOINC client) retry backoff settings. I can't argue with that, though I'd submit that's a secondary factor. I think we do agree that *users* can do little to favorably affect a solution here -- short of a mass exodus, which I am NOT advocating.

I am REALLY trying to not be argumentative here, though it seems that my posts have that effect on you and for my part in that I do apologize.

I suspect that until bandwidth is increased (and we can all agree that getting that done is at least for now, simply too expensive for the project), this problem is going to persist.

I further suspect that even with the change you suggest (elimination of the retry option in future client releases), or even with the change that Ned suggests (revision in the default retry backoff settings in future client releases), we can all agree that the 'stuck uploads' problem will persist for SETI.

But I think the core difference we have may be one regarding some of the thinking behind the BOINC client development these days. In this area, some of my frustration with various issues which have surfaced and remain with the 6.x client in its many iterations may well have affected my tone in this thread and for this I apologize. On the other hand, I don't think that a 'cleaner client' would help SETI's bandwidth problem all that much, it would just make my BOINC life a bit simpler.

Lastly, I wonder if some of the reaction I've gotten has to do with me not being SETI-centric (and I am not saying this in a negative sense).

The only nerve it touched was the suggestive comment that we are "blaming the users" when the problem lies elsewhere, and I don't feel that's an accurate accusation of my views...

...it may be a small subset of users, but grown adults should know better and not try to force the situation. Allowing the option to remain only allows abuse of that option that quite a few people do.

Successfully taken out of context.

All I've said is that the users do not help the situation. That can be easily spun into "blaming the users", but I know it's not the users causing the majority of the problems. Whatever way you choose to spin it, the fact remains that it doesn't help.

So no, not my intent, and not exactly what I was saying either.


ID: 915076
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 915080 - Posted: 7 Jul 2009, 0:26:30 UTC - in response to Message 915056.  

But it might help simply by, in effect, eliminating network traffic - those failed retries, such as completed work that is now in 'stuck upload land'. I essentially do that manually by suspending the project for maintenance Tuesdays but it would be nice to do it without the intervention.

In any event, I recognize it as a wishlist type item. I'd rather the various work fetch issues be figured out and cleared first. My sense is that they are made even more problematic with the GPU/CPU mix -- I saw that in the *released* 6.6.36 client and bailed out to 6.4.5 earlier this week on a workstation.



In other words, the scheduling is nice for those who don't like to see failed connects, but it doesn't really add to the capabilities.


ID: 915080
Vistro
Joined: 6 Aug 08
Posts: 233
Credit: 316,549
RAC: 0
United States
Message 915089 - Posted: 7 Jul 2009, 0:35:46 UTC - in response to Message 915080.  

We are getting more and more computers attached every day. I attached two machines, 24 hours apart, and the ID numbers were hundreds apart.

This problem will get worse and worse. This requires an emergency file change (a version update won't be enough), something that will edit exactly how BOINC deals with failed uploads.

I heard a previous idea of letting x amount of IP ranges in, and slowly adding more to keep it steady. This cycle would repeat every Tuesday, but at least it would only last for a couple of hours.
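
As a rough sketch of that staged-admission idea (not anything the project servers are known to do), the server could open address ranges gradually after an outage. Everything here is an assumption made for illustration: the `admitted` function, grouping clients by the first octet of their IPv4 address, and the starting count and growth rate.

```python
import ipaddress


def admitted(client_ip: str, minutes_since_recovery: float,
             initial_ranges: int = 16, ranges_per_minute: float = 2.0) -> bool:
    """Return True if the client's address range has been opened so far.

    Clients are grouped into 256 ranges by the first octet of their IPv4
    address. `initial_ranges` are open immediately after recovery, and
    `ranges_per_minute` more are opened as time passes, up to all 256.
    """
    open_ranges = min(256, initial_ranges + int(minutes_since_recovery * ranges_per_minute))
    first_octet = int(ipaddress.IPv4Address(client_ip)) >> 24
    return first_octet < open_ranges
```

With these made-up defaults every range is open after two hours. Opening ranges in numeric order always favors the same clients, so a real scheme would probably rotate or shuffle the order each week.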

A friend (the guy with the Vista Duo) uninstalled the client completely when he saw that it would not download new tasks. I remember him saying something about the messages, but a mass exodus WILL occur with people like him; new to the BOINC thing, doesn't really understand the server issue.

We will lose CPU time, and it won't really matter if we add the new link.

Users that actually stay with the project are more likely to donate to the cause.
ID: 915089
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915092 - Posted: 7 Jul 2009, 0:37:26 UTC - in response to Message 915076.  

Lastly, I wonder if some of the reaction I've gotten has to do with me not being SETI-centric (and I am not saying this in a negative sense).


I can answer that very honestly for you: not at all. I'm not the type of person who likes to create "cliques" and then attack those who are not part of them. I fully support everyone's right to do as they wish, including not making SETI their primary project. In fact, my gf has insisted she has wanted to make other projects our primary, but not because of server problems, rather because of her interest in other sciences - namely biological sciences finding cures to diseases. She has a vested interest in this subject because her family owns the Intermediate Health Care facility for the Mentally Ill that I work at.
ID: 915092
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915099 - Posted: 7 Jul 2009, 0:43:00 UTC - in response to Message 915089.  

A friend (the guy with the Vista Duo) uninstalled the client completely when he saw that it would not download new tasks. I remember him saying something about the messages, but a mass exodus WILL occur with people like him; new to the BOINC thing, doesn't really understand the server issue.

We will lose CPU time, and it won't really matter if we add the new link.

Users that actually stay with the project are more likely to donate to the cause.


Agreed, but what Ned and I have been attempting to do is show people that their expectations of the project were simply too high. The perception that any upload error means that it's "not working" is a mistaken assumption, and we try to help people understand that.

Of course, fixing the issues on the server side would be preferred, but the entire point of BOINC is that it's not necessary to have even 90% availability. Part of the grand experiment for BOINC is to help under-funded science projects get the computing power they need through volunteers. If every project had the funds for 99.9% server availability, it would mean they're not very "under-funded" or they don't have enough users contributing to an obscenely large database. SETI, for what it's worth, pushes the limits of BOINC, and that's a good thing. It lets all other projects know how far they can go before breaking down or having to buy better hardware.
ID: 915099
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 915100 - Posted: 7 Jul 2009, 0:43:31 UTC - in response to Message 915048.  

There are cases where the manual retry is necessary (it can't connect because the user made a fail, like turning on the microwave during a transfer, disconnecting the wireless card, router off, ISP outage, etc.), and after it's fixed, they can retry then.


Then perhaps after the server has failed to respond, the retry button for that project should be disabled for as long as the automatic backoff is enacted.

The whole point of having a "retry" available is for cases where the problem is user-side and not server-side. Since the BOINC client can't know if the user has made an "oops, shouldn't have blocked that server in my firewall" or something, you can't just remove the "retry".

Also, for users who are still stuck on dialup, you definitely don't want them to rely on the normal up-to-4-hour random backoff after "fixing" a problem that's on the user's side of things...
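
To make the trade-off concrete, here is a minimal sketch of a randomized backoff with a 4-hour ceiling and a manual retry that skips the remaining wait. It is not the BOINC client's actual code: the `UploadRetry` class, the doubling window and the 60-second base are assumptions, and only the 4-hour ceiling comes from the figure quoted above.

```python
import random

MAX_BACKOFF_S = 4 * 60 * 60  # ceiling taken from the "up to 4-hour" figure above
BASE_BACKOFF_S = 60          # assumed first-retry window, not BOINC's real value


class UploadRetry:
    """Tracks when a failed upload may next be attempted."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        self.failures = 0
        self.next_attempt = 0.0  # seconds, on whatever clock the caller uses

    def record_failure(self, now: float) -> None:
        """Schedule the next automatic retry at a random point in a growing window."""
        self.failures += 1
        window = min(MAX_BACKOFF_S, BASE_BACKOFF_S * 2 ** self.failures)
        self.next_attempt = now + self.rng.uniform(0, window)

    def manual_retry(self, now: float) -> None:
        """User pressed "retry": skip the remaining wait but keep the failure count."""
        self.next_attempt = now

    def due(self, now: float) -> bool:
        """Return True once the upload may be attempted again."""
        return now >= self.next_attempt
```

Keeping the failure count after a manual retry means a server that is still down pushes the client straight back into the long window, while a user who really did fix a local problem gets through on the first press.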


"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 915100
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915105 - Posted: 7 Jul 2009, 0:45:55 UTC - in response to Message 915100.  

There are cases where the manual retry is necessary (it can't connect because the user made a fail, like turning on the microwave during a transfer, disconnecting the wireless card, router off, ISP outage, etc.), and after it's fixed, they can retry then.


Then perhaps after the server has failed to respond, the retry button for that project should be disabled for as long as the automatic backoff is enacted.

The whole point of having a "retry" available is for cases where the problem is user-side and not server-side. Since the BOINC client can't know if the user has made an "oops, shouldn't have blocked that server in my firewall" or something, you can't just remove the "retry".

Also, for users who are still stuck on dialup, you definitely don't want them to rely on the normal up-to-4-hour random backoff after "fixing" a problem that's on the user's side of things...


Point taken. Unfortunately, anything with good intentions always has abuses that go along with it, and my thought process was to simply minimize or eliminate such abuses so they do not add to the problem.
ID: 915105
Vistro
Joined: 6 Aug 08
Posts: 233
Credit: 316,549
RAC: 0
United States
Message 915141 - Posted: 7 Jul 2009, 1:38:33 UTC
Last modified: 7 Jul 2009, 1:39:01 UTC

I have a question:

People ARE uploading, right? Or is nobody able to upload because of all of the request packets?

I should at least hope that SOME people are uploading.
ID: 915141
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15692
Credit: 84,761,841
RAC: 28
United States
Message 915157 - Posted: 7 Jul 2009, 2:06:57 UTC - in response to Message 915141.  

I've had uploads go through after a few automatic retries.
ID: 915157
PhonAcq

Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 915174 - Posted: 7 Jul 2009, 2:40:00 UTC

I've throttled my uploads on most of my hosts to a few hours a day, and run a larger cache to span the frequent days of Berkeley blues. Things more or less work OK, albeit with lots of no-connect and no-work messages during those hours. By doing so, I'm trying to be part of the solution and not part of the problem, I guess.
ID: 915174
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 915193 - Posted: 7 Jul 2009, 3:33:46 UTC - in response to Message 915141.  

I have a question:

People ARE uploading, right? Or is nobody able to upload because of all of the request packets?

I should at least hope that SOME people are uploading.

Admittedly, I don't crank through work like most on here, but I've had all of my pending uploads complete today.
ID: 915193


 
©2026 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.