Panic Mode On (40) Server problems

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1042758 - Posted: 16 Oct 2010, 18:22:58 UTC - in response to Message 1042621.  

Not sure what's happening, but inbound traffic is way up. DDoS attack? When everyone hits the per machine limit the inbound traffic ramps up, but outbound is still pegged & i expect it'll be another day till everyone hits the limits.


Inbound traffic is still way above normal. My downloads are also taking several attempts before they finally complete (5kB/s at best, some as slow as 1kB/s). And i notice that the assimilators are no longer making headway- the queue is growing. Last time that happened was 2 (or 3?) weeks ago & they had to shut down the forums & some other services to get things working again.
Grant
Darwin NT
ID: 1042758
Wiggo
Joined: 24 Jan 00
Posts: 37377
Credit: 261,360,520
RAC: 489
Australia
Message 1042817 - Posted: 16 Oct 2010, 21:35:33 UTC - in response to Message 1042665.  

DLs?

Scheduler request completed: got 0 new tasks
Message from server: No work sent
Message from server: This computer has reached a limit on tasks in progress


IIRC, Jeff mentioned that we'd have the high limit this week after the outage.

Hmm.. it doesn't look like it.
At least not on my machines..

Does anyone know why we only have the small limit?

Well I've received a lot more work this time than I got at this same time last week.

Cheers.
ID: 1042817
Wiggo
Joined: 24 Jan 00
Posts: 37377
Credit: 261,360,520
RAC: 489
Australia
Message 1042819 - Posted: 16 Oct 2010, 21:37:29 UTC - in response to Message 1042758.  

Not sure what's happening, but inbound traffic is way up. DDoS attack? When everyone hits the per machine limit the inbound traffic ramps up, but outbound is still pegged & i expect it'll be another day till everyone hits the limits.


Inbound traffic is still way above normal. My downloads are also taking several attempts before they finally complete (5kB/s at best, some as slow as 1kB/s). And i notice that the assimilators are no longer making headway- the queue is growing. Last time that happened was 2 (or 3?) weeks ago & they had to shut down the forums & some other services to get things working again.

Yes it's just like pulling teeth from a hen.

Cheers.
ID: 1042819
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51511
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1042821 - Posted: 16 Oct 2010, 21:40:47 UTC - in response to Message 1042819.  

Not sure what's happening, but inbound traffic is way up. DDoS attack? When everyone hits the per machine limit the inbound traffic ramps up, but outbound is still pegged & i expect it'll be another day till everyone hits the limits.


Inbound traffic is still way above normal. My downloads are also taking several attempts before they finally complete (5kB/s at best, some as slow as 1kB/s). And i notice that the assimilators are no longer making headway- the queue is growing. Last time that happened was 2 (or 3?) weeks ago & they had to shut down the forums & some other services to get things working again.

Yes it's just like pulling teeth from a hen.

Cheers.

Eric will be looking into it when he can.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1042821
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14687
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1042832 - Posted: 16 Oct 2010, 21:54:59 UTC - in response to Message 1042821.  

Eric will be looking into it when he can.

Not until Monday, I trust.

With the cricket graphs still pegged out at maximum, there's absolutely no point in him sacrificing a family weekend to speed up any other part of the system.
ID: 1042832
Wiggo
Joined: 24 Jan 00
Posts: 37377
Credit: 261,360,520
RAC: 489
Australia
Message 1042933 - Posted: 17 Oct 2010, 0:59:18 UTC - in response to Message 1042832.  

Maybe I should have spoken a bit sooner as not long after my last post the dam burst.

All 3 pc's are now happy campers with plenty of work.

Cheers.
ID: 1042933
SciManStev (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 6659
Credit: 121,090,076
RAC: 0
United States
Message 1042937 - Posted: 17 Oct 2010, 1:12:09 UTC - in response to Message 1042933.  

Even my rig, which would not request work, seems to have a decent backlog of tasks to download. I feel much better now, as when there is available bandwidth, I have WUs to download.

It seemed to pick up speed as the day drew on. The less I looked at it, the better it did. Hmmmmm...

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1042937
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1042944 - Posted: 17 Oct 2010, 1:22:51 UTC

hitting project limits here.

So done.. for now.
Janice
ID: 1042944
SciManStev (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 20 Jun 99
Posts: 6659
Credit: 121,090,076
RAC: 0
United States
Message 1042965 - Posted: 17 Oct 2010, 2:02:44 UTC

From the situation earlier today of not being able to request work, I have gone to plenty of WUs in the download cache with at least 80 AP units. I do the AP units in about 7 hours each, and with 5 cores crunching on them, bring 'em on! My chiller won't start cycling on and off tonight!

Steve
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website
ID: 1042965
DJStarfox
Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 1042986 - Posted: 17 Oct 2010, 3:07:12 UTC

This is a classic computer science problem. N computers request data from a server at a Poisson rate. If a request succeeds, the computer waits a large random time. If a request fails, the computer tries again in 11 seconds plus a small random time.

Expected outcome: Server overload.
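
For anyone who wants to play with that model, here is a rough Python sketch. Only the 11 second retry and the "long random wait after success" come from the description above. The function name `simulate`, the one-second tick, the per-second server capacity and the jitter range are assumptions made up for illustration, and this is not a claim about how the real BOINC client or scheduler behaves.

```python
import random


def simulate(n_clients, capacity_per_sec, mean_success_wait,
             retry_base=11.0, retry_jitter=5.0, duration=1800, seed=0):
    """Toy model: returns (attempts, failures) over `duration` seconds.

    Assumptions (hypothetical, not from any real scheduler):
    - time advances in 1-second ticks
    - the server can satisfy at most `capacity_per_sec` requests per tick
    - after a success a client sleeps for an exponential random time
      (which makes the un-congested arrivals a Poisson process)
    - after a failure it retries in retry_base + uniform(0, retry_jitter)
    """
    rng = random.Random(seed)
    # Time at which each client will next contact the server.
    next_time = [rng.expovariate(1.0 / mean_success_wait)
                 for _ in range(n_clients)]
    attempts = 0
    failures = 0
    for t in range(duration):
        # Everyone whose timer has expired asks for work during this tick.
        due = [i for i in range(n_clients) if next_time[i] <= t]
        rng.shuffle(due)  # the server has no favourites
        for rank, i in enumerate(due):
            attempts += 1
            if rank < capacity_per_sec:
                # Served: go away for a long random time.
                next_time[i] = t + rng.expovariate(1.0 / mean_success_wait)
            else:
                # Refused: come back in roughly 11 seconds.
                failures += 1
                next_time[i] = t + retry_base + rng.uniform(0.0, retry_jitter)
    return attempts, failures


if __name__ == "__main__":
    for capacity in (1, 10):
        attempts, failures = simulate(500, capacity, 300.0)
        print(f"capacity {capacity}/s: {attempts} attempts, {failures} failed")
```

With demand above capacity (500 clients, 300 s mean wait, 1 request served per second), refused clients pile up in the short retry loop and failed attempts should far outnumber successful ones. With plenty of capacity (10 per second), failures should be rare.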
ID: 1042986
Lionel
Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1043006 - Posted: 17 Oct 2010, 4:17:11 UTC - in response to Message 1042544.  


already have 1 machine at "limit on tasks in progress"...

how about switching it off and letting us fill our caches up before the next outage...


Don't know which of your machines you are talking about but..........

2 of your machines show nearly 5000 wu on board. Allowing some for ghosts, just one of your machines has enough work to last for the entire 3 day outage for all 4 of my machines. I do about 4000 total for the 3 day outage.

How about leave some for us..........


re allocated wus...

one machine has ~1k wus
one still downloading has ~720 wus with about 300 still coming in
one with ~1600 wus

the rest are just ghosts ...

in terms of throughput, the above is just enough to cover the 3 day outage on 1/2 boxes only...the third will run dry by end of day 2 ... my caches normally run with ~6k-7k wus per machine to avoid seti outages/downtime but what i have noticed is that since the 3 day outage was implemented, the caches have been dwindling and before the last outage they ran dry and have not been able to refill over 2 successive cycles ... (i don't think it's working very well) ...

at the moment all the boxes are asking seti for more work about every 1-2 minutes and getting "reached a limit...." ... i know i'm not the only one that's in this position or feeling pain

my pending credit is now at 857k+ ... it used to sit around 130k...
ID: 1043006
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1043035 - Posted: 17 Oct 2010, 8:08:01 UTC - in response to Message 1042817.  
Last modified: 17 Oct 2010, 8:09:58 UTC

DLs?

Scheduler request completed: got 0 new tasks
Message from server: No work sent
Message from server: This computer has reached a limit on tasks in progress


IIRC, Jeff mentioned that we'd have the high limit this week after the outage.

Hmm.. it doesn't look like it.
At least not on my machines..

Does anyone know why we only have the small limit?

Well I've received a lot more work this time than I got at this same time last week.

Cheers.


We currently have:
 40 WUs CPU/Core
320 WUs GPU


But I'm confused - why?
Jeff mentioned that we would have the high limit this week. But we don't.

-----------------------------------------------------------

The one and only problem is the bandwidth (max. ~95 Mbit/s) between S@h and us.
So why not solve this problem?

What is the point of buying new servers if the members can't reach them?

What is the current status? Will we one day see double the speed, or much more?

If more and more people upgrade their systems with GPUs.. one day no one will be able to reach S@h..
OTOH, more and more people will leave S@h - because they can't reach S@h..
Either way, the available bandwidth is too low!
ID: 1043035
Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,383,149
RAC: 0
United States
Message 1043039 - Posted: 17 Oct 2010, 8:23:45 UTC - in response to Message 1043035.  

From my understanding, I believe that if the new servers work out as planned, the old ones can be turned into machines to do the science on the back end, without the need for a full 3 day shutdown. If that is the case, the bandwidth will not be an issue, as the project can run longer than 4 days a week. I know I read in one of the threads on here that it is at least being considered.
ID: 1043039
Wiggo
Joined: 24 Jan 00
Posts: 37377
Credit: 261,360,520
RAC: 489
Australia
Message 1043047 - Posted: 17 Oct 2010, 8:53:47 UTC - in response to Message 1043035.  

DLs?

Scheduler request completed: got 0 new tasks
Message from server: No work sent
Message from server: This computer has reached a limit on tasks in progress


IIRC, Jeff mentioned that we'd have the high limit this week after the outage.

Hmm.. it doesn't look like it.
At least not on my machines..

Does anyone know why we only have the small limit?

Well I've received a lot more work this time than I got at this same time last week.

Cheers.


We currently have:
 40 WUs CPU/Core
320 WUs GPU


But I'm confused - why?
Jeff mentioned that we would have the high limit this week. But we don't.

-----------------------------------------------------------

The one and only problem is the bandwidth (max. ~95 Mbit/s) between S@h and us.
So why not solve this problem?

What is the point of buying new servers if the members can't reach them?

What is the current status? Will we one day see double the speed, or much more?

If more and more people upgrade their systems with GPUs.. one day no one will be able to reach S@h..
OTOH, more and more people will leave S@h - because they can't reach S@h..
Either way, the available bandwidth is too low!

Well I don't know where you are getting your figures from for this week but I have about 30% more work against this time last week.

Cheers.
ID: 1043047
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13886
Credit: 208,696,464
RAC: 304
Australia
Message 1043050 - Posted: 17 Oct 2010, 9:09:32 UTC - in response to Message 1043047.  


I see the network traffic has dropped off quite a bit. It's still a bit high, but a lot better than what it was.
However the assimilator queue still continues to grow- disk space may yet become an issue.
Grant
Darwin NT
ID: 1043050
CHARLES JACKSON
Joined: 17 May 99
Posts: 49
Credit: 39,349,563
RAC: 0
United States
Message 1043051 - Posted: 17 Oct 2010, 9:09:41 UTC - in response to Message 1043047.  

Hi, as I recall, the SETI lab has a limit of 100 Mbit per second.
ID: 1043051
Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,383,149
RAC: 0
United States
Message 1043055 - Posted: 17 Oct 2010, 9:14:06 UTC - in response to Message 1043050.  


I see the network traffic has dropped off quite a bit. It's still a bit high, but a lot better than what it was.
However the assimilator queue still continues to grow- disk space may yet become an issue.


It's still pegging the outbound line, sitting right at 91-93 Mbit/s. Uploads did drop off a bunch though, from about 35 Mbit/s to 25 Mbit/s.
ID: 1043055
Wiggo
Joined: 24 Jan 00
Posts: 37377
Credit: 261,360,520
RAC: 489
Australia
Message 1043058 - Posted: 17 Oct 2010, 9:31:40 UTC - in response to Message 1043055.  


Hi, as I recall, the SETI lab has a limit of 100 Mbit per second.

I see the network traffic has dropped off quite a bit. It's still a bit high, but a lot better than what it was.
However the assimilator queue still continues to grow- disk space may yet become an issue.


It's still pegging the outbound line, sitting right at 91-93 Mbit/s. Uploads did drop off a bunch though, from about 35 Mbit/s to 25 Mbit/s.

Yes, it's only a 100 Mbit link, and only the uploads have dropped off a bit; the downloads are certainly still running at max and likely will be for another 24 hrs or so.

Cheers.
ID: 1043058
Robert Ribbeck
Joined: 7 Jun 02
Posts: 644
Credit: 5,283,174
RAC: 0
United States
Message 1043191 - Posted: 17 Oct 2010, 17:46:51 UTC - in response to Message 1043051.  

Hi, as I recall, the SETI lab has a limit of 100 Mbit per second.


Why can't that limit be raised on weekends,
when no one is around?
ID: 1043191
andybutt
Volunteer tester
Joined: 18 Mar 03
Posts: 262
Credit: 164,205,187
RAC: 516
United Kingdom
Message 1043205 - Posted: 17 Oct 2010, 22:32:41 UTC

web pages are back
ID: 1043205


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.