Panic Mode On (68) Server problems?

rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1196786 - Posted: 18 Feb 2012, 10:41:30 UTC

It's worth noting that quite a number of US-hosted sites are suffering a performance hit just now, so there may be a wider (US) problem.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1196786 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1196803 - Posted: 18 Feb 2012, 12:19:14 UTC

The number of completed WUs being returned is rising steadily as well, either because of a shorty storm or because some rigs finally got some work.

Looking at the cache on my top rig, out of about 1500 WUs, 1100 are VHAR shorties. My total cache across 9 rigs has dropped by about 600 total over the last 2 hours.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1196803 · Report as offensive
Profile cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1196811 - Posted: 18 Feb 2012, 13:35:53 UTC - in response to Message 1196806.  

Hi,
Back on a diet again :-) All servers are working, but no new WUs are showing up for BOINC to grab.
Guess it's gonna take a bit of time for stuff to get back in the swing.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1196811 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1196814 - Posted: 18 Feb 2012, 13:41:28 UTC

I got up because my shoulder blades are hurting (OA), noticed I had about 85 WUs waiting to download, and started downloading them. There must not have been a lot of traffic, as I've got them all now.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1196814 · Report as offensive
Profile Slavac
Volunteer tester
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1196819 - Posted: 18 Feb 2012, 13:52:35 UTC - in response to Message 1196814.  

I got up because my shoulder blades are hurting (OA), noticed I had about 85 WUs waiting to download, and started downloading them. There must not have been a lot of traffic, as I've got them all now.


I cleared my DL cache (500ish) in minutes. Not sure if that's a good sign or not.


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1196819 · Report as offensive
Profile Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1196825 - Posted: 18 Feb 2012, 14:43:31 UTC - in response to Message 1196817.  

The cricket graph doesn't seem as reliable as it used to be.

I'm sure the simple traffic grapher is as reliable as ever; it's just that there are probably one or more rogue hosts out there, constantly downloading resends, or something similar.
ID: 1196825 · Report as offensive
Profile John Clark
Volunteer tester
Joined: 29 Sep 99
Posts: 16515
Credit: 4,418,829
RAC: 0
United Kingdom
Message 1196841 - Posted: 18 Feb 2012, 15:51:43 UTC

After a couple of days of downloads becoming increasingly slow and the number of WUs growing, this morning the download speed was good, and they all cleared while I was in town.

It's good to be back amongst friends and colleagues



ID: 1196841 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1196844 - Posted: 18 Feb 2012, 15:57:21 UTC - in response to Message 1196841.  

After a couple of days of downloads becoming increasingly slow and the number of WUs growing, this morning the download speed was good, and they all cleared while I was in town.

Yeah, and I'm on the rebound too. Now I just have to get the energy to dust out two 295 cards today...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1196844 · Report as offensive
Kevin Olley

Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1196851 - Posted: 18 Feb 2012, 16:04:20 UTC - in response to Message 1196817.  



The cricket graph doesn't seem as reliable as it used to be. When there was little work, or if there were other problems, it was visible directly on the cricket. Now, though, cricket is pegged all the time when the system is up, no matter whether there seems to be anything to send out or not. Just like now, when the status page shows that not much is being split at all, despite the splitters having tons of files to work on.


There is a big gray area: we know how many WUs are ready to send, but if a WU is removed from this list as soon as it is allocated to a machine, which I suspect it is, then it goes into that big unknown: awaiting download.

When the pipe is running at max, the queue could build up to a substantial amount. Remember all of those machines running 6.12.xx: some of those WUs could be waiting days before they get downloaded.



Kevin


ID: 1196851 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1196853 - Posted: 18 Feb 2012, 16:10:15 UTC - in response to Message 1196833.  

I am pretty sure the green bars in the cricket graph count not only the outgoing connection but also all the bandwidth inside their own network.
When data is outgoing, it can go to me, to you, to us, but it can also be going to another server on the network: from tape to splitter to storage_rdy_to_send to 100_WU_buffer, etc. And when we send results, there are some internal moves, linked to the results and to the old data that is no longer required and is to be deleted.

Not true, I'm afraid. The cricket graphs are run by the campus-wide infrastructure guys, and are outside the SSL labs themselves.

You can see the whole campus network: our graphs are just one of the 48 gigabit interfaces on router inr-250 - about half-way down the Tier2 column.

We are "gigabitethernet2_3: 169.229.0.190: SETI@Home_P2P_to_sslringeva1fes_Gi0/1" - third up from the bottom of the inr-250 page. As in the more normal view, the busy direction (light blue in the first column) is downloads from the lab to us - labelled 'bits in' from the point of view of that router, receiving data from SSL.

Having said that, and having bothered to call that page up for the first time in ages, I'm a bit worried by the number of 'Broadcast/Multicast Packets/Second' (third column) being received from the lab on that port. Ideas, anyone?
ID: 1196853 · Report as offensive
Kevin Olley

Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1196854 - Posted: 18 Feb 2012, 16:11:59 UTC - in response to Message 1196803.  

The number of completed WUs being returned is rising steadily as well, either because of a shorty storm or because some rigs finally got some work.

Looking at the cache on my top rig, out of about 1500 WUs, 1100 are VHAR shorties. My total cache across 9 rigs has dropped by about 600 total over the last 2 hours.


No shorties here, I left this machine in shorty bashing mode.

OTOH GPU cache is only at 50%:-(



Kevin


ID: 1196854 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1196859 - Posted: 18 Feb 2012, 16:17:53 UTC

Little soon to tell, but it looks like something may have come to a crashing halt. And the Cricket graph is starting to show signs of download traffic coming down too.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1196859 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1196861 - Posted: 18 Feb 2012, 16:21:06 UTC - in response to Message 1196853.  



Having said that, and having bothered to call that page up for the first time in ages, I'm a bit worried by the number of 'Broadcast/Multicast Packets/Second' (third column) being received from the lab on that port. Ideas, anyone?


Strange!!
But it's a "feature" of a number of other "sites" on the same page that you refer to, Richard, and most of those appear (by name) to be something to do with sports/recreation activities, for example:
gigabitethernet1_43: 10.129.129.97: RecCEV Zone PPCS1918 Net


Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1196861 · Report as offensive
Profile Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1196866 - Posted: 18 Feb 2012, 16:25:45 UTC - in response to Message 1196859.  
Last modified: 18 Feb 2012, 16:27:13 UTC

Little soon to tell, but it looks like something may have come to a crashing halt. And the Cricket graph is starting to show signs of download traffic coming down too.

Well, nothing unexpected with the current result creation rate.
ID: 1196866 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1196869 - Posted: 18 Feb 2012, 16:31:00 UTC - in response to Message 1196861.  



Having said that, and having bothered to call that page up for the first time in ages, I'm a bit worried by the number of 'Broadcast/Multicast Packets/Second' (third column) being received from the lab on that port. Ideas, anyone?

Strange!!
But it's a "feature" of a number of other "sites" on the same page that you refer to, Richard, and most of those appear (by name) to be something to do with sports/recreation activities, for example:
gigabitethernet1_43: 10.129.129.97: RecCEV Zone PPCS1918 Net

Weird!!
hazmat-net is quiet, but old-hazmat-ppcs-net is broadcasting fit to bust. No recreational hazardous materials, I trust.
ID: 1196869 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1196870 - Posted: 18 Feb 2012, 16:32:15 UTC - in response to Message 1196866.  

Little soon to tell, but it looks like something may have come to a crashing halt. And the Cricket graph is starting to show signs of download traffic coming down too.

Well, nothing unexpected with the current result creation rate.

Didn't know what to expect with the way the bandwidth stayed up when little or no work was being generated before this week's outage.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1196870 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1196882 - Posted: 18 Feb 2012, 17:07:55 UTC

Well, whatever is going on, it isn't pretty for the kitties....
Across 9 rigs, I got exactly 1 WU in the last hour.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1196882 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1196902 - Posted: 18 Feb 2012, 17:34:34 UTC - in response to Message 1196896.  
Last modified: 18 Feb 2012, 17:35:41 UTC

Well, whatever is going on, it isn't pretty for the kitties....
Across 9 rigs, I got exactly 1 WU in the last hour.


Oh yeah, sure!

From my end, everything looks fine: plenty of work. When my client needs more WUs to replace those completed, I am getting some :)

Looking back through my log, I don't see any HTTP errors at all for a while, probably since yesterday...

Everything is perfect
for the moment

Mine could be cooler, but for the moment it's working. Later, after 1pm, I've got to try and clean two cards out. One might be easy, as I don't need to do too much to it besides blow it out; the other one needs the card's cover removed so that it can be cleaned out.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1196902 · Report as offensive
Profile Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1196918 - Posted: 18 Feb 2012, 18:17:18 UTC - in response to Message 1196870.  

Little soon to tell, but it looks like something may have come to a crashing halt. And the Cricket graph is starting to show signs of download traffic coming down too.

Well, nothing unexpected with the current result creation rate.

Didn't know what to expect with the way the bandwidth stayed up when little or no work was being generated before this week's outage.

Yeah, that was unusual. The current behavior is the one we are used to from the past.
ID: 1196918 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1196926 - Posted: 18 Feb 2012, 18:44:05 UTC

The scheduler and upload server are currently disabled, so someone might be in the lab, or remoted in, poking the servers.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1196926 · Report as offensive


 