Panic Mode On (80) Server Problems?

Message boards : Number crunching : Panic Mode On (80) Server Problems?
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1326371 - Posted: 10 Jan 2013, 15:37:30 UTC - in response to Message 1326191.  

I did say my "fear" is that nobody cares, and by "nobody" I meant anyone who could do anything about server issues.


I do not believe that Matt, Jeff and Eric "don't care". Sorry, that is going too far.
I don't believe they have the time and resources to do what they want, but to say the guys in the lab "don't care" is, to me, an insult.

I think you're still not understanding him. He didn't say they don't care, he said his fear is that they don't care. There's a difference between a fear and a belief.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1326371 · Report as offensive
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1326412 - Posted: 10 Jan 2013, 17:59:21 UTC - in response to Message 1326371.  

I did say my "fear" is that nobody cares, and by "nobody" I meant anyone who could do anything about server issues.


I do not believe that Matt, Jeff and Eric "don't care". Sorry, that is going too far.
I don't believe they have the time and resources to do what they want, but to say the guys in the lab "don't care" is, to me, an insult.

I think you're still not understanding him. He didn't say they don't care, he said his fear is that they don't care. There's a difference between a fear and a belief.

I understand the words but why even hint that perhaps no one cares, because that is what that statement suggests. Anyone who has been around these boards should realise that "a lack of caring" is NOT the problem and it is disingenuous to even suggest it. Also coming from someone with a total of 95,733,064 and running 13 machines. It just strikes the wrong note with me.
ID: 1326412 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13904
Credit: 208,696,464
RAC: 304
Australia
Message 1326414 - Posted: 10 Jan 2013, 18:14:03 UTC - in response to Message 1326371.  
Last modified: 10 Jan 2013, 18:14:37 UTC

I think you're still not understanding him. He didn't say they don't care, he said his fear is that they don't care. There's a difference between a fear and a belief.

Belief or fear makes no difference; he is implying that they don't care.
Of course, if he were to read the donation thread, he'd see that there are plans to address a couple of the major problems the project has. That they've gone to the effort to inform the fundraisers of their requirements would indicate to most people that they do care.
Grant
Darwin NT
ID: 1326414 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1326415 - Posted: 10 Jan 2013, 18:21:52 UTC

I think I'm going to assume that the fact that you've got nothing better to do than argue semantic points from a post eight messages, two days and a maintenance outage ago is a good thing.

It means that the project has (very quietly) started running so smoothly that you've all got nothing better to panic about :P
ID: 1326415 · Report as offensive
QuietDad
Joined: 2 Oct 99
Posts: 83
Credit: 28,926,603
RAC: 59
United States
Message 1326437 - Posted: 10 Jan 2013, 19:30:29 UTC

What some forget is that the original idea for distributed computing was that we donate our spare computer cycles for scientists to use for research purposes. This was NEVER intended to be a "game" for points.
ID: 1326437 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19550
Credit: 40,757,560
RAC: 67
United Kingdom
Message 1326444 - Posted: 10 Jan 2013, 19:49:01 UTC - in response to Message 1326415.  

I think I'm going to assume that the fact that you've got nothing better to do than argue semantic points from a post eight messages, two days and a maintenance outage ago is a good thing.

It means that the project has (very quietly) started running so smoothly that you've all got nothing better to panic about :P

It's at times like this when I think we should have a "Preparing to Panic" thread.
ID: 1326444 · Report as offensive
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1326447 - Posted: 10 Jan 2013, 20:13:53 UTC - in response to Message 1326415.  

I think I'm going to assume that the fact that you've got nothing better to do than argue semantic points from a post eight messages, two days and a maintenance outage ago is a good thing.

It means that the project has (very quietly) started running so smoothly that you've all got nothing better to panic about :P

Quite correct, I haven't been panicking for ages.
ID: 1326447 · Report as offensive
Rolf

Joined: 16 Jun 09
Posts: 114
Credit: 7,817,146
RAC: 0
Switzerland
Message 1326450 - Posted: 10 Jan 2013, 20:24:45 UTC

Please, no panic!
Panic is when people start losing connection to reality - but we are living in reality.

If "he" doesn't care (or "they" don't care), there are still two possibilities: either he doesn't want to take care (his own decision) or he does not have the time to take care (somebody else decided).
As an example take Matt's blog: http://setiathome.berkeley.edu/forum_thread.php?id=59157
Although it's nearly 3 years old, some parts could still be valid!

I prefer the last Q-A-block:
Q: What do you mean "you hit the limit?" Are you mad at us?
A: No. No. No. Nothing personal, but I'd really like other project staff to chime in more often, and maybe I need to step away in order for that to happen. Plus being in the spotlight I sometimes find myself having to argue on behalf of policies or practices around here which I don't know much about, or I don't exactly agree with. Meh. I do wish SETI@home had better avenues and resources for public relations. In any case I'll still keep working towards improving that in the future whenever I can, whether or not that's direct contact from me or otherwise. The daily notes were fun, but honestly probably not the best use of my time. I may just report about bigger projects as they happen. We'll see.

So I'll be around - just a lot quieter.

ID: 1326450 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1326452 - Posted: 10 Jan 2013, 20:30:31 UTC

I've been receiving Short CPU tasks that insist they run Immediately. Plus, all my recently downloaded CUDA 23 tasks are...Shorties. I see an approaching Shortie Storm.

Batten down the hatches, no need to panic, it's just a little Short thing. Unless...
ID: 1326452 · Report as offensive
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1326513 - Posted: 11 Jan 2013, 0:15:39 UTC

Matt has posted an update that I believe all serious Setizens should read:

Matt's Post Here
ID: 1326513 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51523
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1326670 - Posted: 11 Jan 2013, 14:21:55 UTC

Something's gone away....
Everything was still going fine about 5 hours ago when I went to bed. This morning I have a bunch of rigs that have dropped back to Einstein because they can't connect to the server to report and get new work.

Whazzup?
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1326670 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1326673 - Posted: 11 Jan 2013, 14:31:16 UTC - in response to Message 1326670.  

Something's gone away....
Everything was still going fine about 5 hours ago when I went to bed. This morning I have a bunch of rigs that have dropped back to Einstein because they can't connect to the server to report and get new work.

Whazzup?

Looking over my logs for the night I see spurts of "Scheduler request failed: HTTP gateway timeout", but I often see that when the bandwidth is constantly pegged at its maximum.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1326673 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51523
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1326675 - Posted: 11 Jan 2013, 14:36:45 UTC - in response to Message 1326673.  

Something's gone away....
Everything was still going fine about 5 hours ago when I went to bed. This morning I have a bunch of rigs that have dropped back to Einstein because they can't connect to the server to report and get new work.

Whazzup?

Looking over my logs for the night I see spurts of "Scheduler request failed: HTTP gateway timeout", but I often see that when the bandwidth is constantly pegged at its maximum.

Seems to be not working as smoothly as it has been for a while...
My #1 rig took a bunch of update button pushing and about 20-25 attempts to connect to report. Mostly could not connect to server and a couple of partial connects with no response. Finally just now got through and got 100 GPU tasks in one shot.
But something is tying up scheduler comms badly again.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1326675 · Report as offensive
Sp@ceNv@der Project Donor
Joined: 10 Jul 05
Posts: 41
Credit: 117,366,167
RAC: 152
Belgium
Message 1326679 - Posted: 11 Jan 2013, 14:51:00 UTC - in response to Message 1326675.  

Same here in Belgium .... can upload finished tasks, but can't report them nor get new work as long as reporting fails ...

here we go again lol :D
To boldly crunch ...
ID: 1326679 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51523
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1326681 - Posted: 11 Jan 2013, 14:56:05 UTC - in response to Message 1326679.  
Last modified: 11 Jan 2013, 15:01:48 UTC

Same here in Belgium .... can upload finished tasks, but can't report them nor get new work as long as reporting fails ...

here we go again lol :D

Yup.
You can't fool the kitties. When in a matter of hours 5 out of 9 rigs here all of a sudden have started running Einstein, something's gone away in the server closet again.
That means that they have run through their 100 task GPU cache, which takes at least a few hours on the fast rigs, have not been able to replenish the cache, and the kitties have gone off sniffing for something else to do.

And if you look at the inbound traffic on the Cricket graph, the flow had been very smooth....now looks a bit ragged.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1326681 · Report as offensive
Tom*

Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1326694 - Posted: 11 Jan 2013, 15:35:45 UTC - in response to Message 1326415.  

Not quietly enough :-)

It means that the project has (very quietly) started running so smoothly that you've all got nothing better to panic about :P

LANDO is not running; there are also errors on the MB channels.

Someone said we started getting a storm of shorties; that, together with APs, is what usually throws us over the Network Performance Knee.
ID: 1326694 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1326712 - Posted: 11 Jan 2013, 16:26:31 UTC

I just did an update on a baby host that only does about 1 CPU task a day. Report one, request zero, only three tasks listed as 'other results'. It still took 71 seconds to turn it round (successfully) - that's mighty slow.
ID: 1326712 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51523
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1326718 - Posted: 11 Jan 2013, 16:37:44 UTC - in response to Message 1326712.  
Last modified: 11 Jan 2013, 16:52:10 UTC

I just did an update on a baby host that only does about 1 CPU task a day. Report one, request zero, only three tasks listed as 'other results'. It still took 71 seconds to turn it round (successfully) - that's mighty slow.

Well, like I said Richard, you can't fool the kitties.
With 9 rigs running, all with GPUs, some fast, some slow, when I can see a trend in 5 of them going off to another project, nobody can tell me something has not changed. Been at this too long not to see the writing on the wall.

It's loosened up just a little bit in the last half an hour, but still not back to where it was last week.


EDIT...
And, LOL...
I must confess it drives me a bit mad when I see my backup project, Einstein, download MBits worth of files with all speed and see reporting and work requests go through in less than a second or two. (And I understand why this is, I just wish it could be so for Seti.)
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1326718 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1326723 - Posted: 11 Jan 2013, 17:00:10 UTC
Last modified: 11 Jan 2013, 17:02:33 UTC

My logs started showing reporting problems around 5am EST. Uploads are fine; it's just the scheduler reports that fail.

Edit: And always as soon as I post here a request goes through...
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1326723 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1326747 - Posted: 11 Jan 2013, 18:31:01 UTC - in response to Message 1326723.  

My logs started showing reporting problems around 5am EST. Uploads are fine; it's just the scheduler reports that fail.

Edit: And always as soon as I post here a request goes through...


Well, it worked that one time... now back to not reporting unless I NNT.

"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1326747 · Report as offensive


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.