Panic Mode On (77) Server Problems?

Message boards : Number crunching : Panic Mode On (77) Server Problems?
Profile Gundolf Jahn

Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 1291125 - Posted: 4 Oct 2012, 10:38:23 UTC - in response to Message 1291099.  

There's a workaround if you are willing to jump through a few hoops. Click Account (bottom and/or top of this page) and then click on SETI@home preferences and set Use Nvidia GPU to NO.

An additional advantage of this is that lost VLARs are not resent to the GPU and timed out immediately!

Gruß,
Gundolf
ID: 1291125
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1291127 - Posted: 4 Oct 2012, 10:59:01 UTC - in response to Message 1291119.  

Bandwidth is obviously an issue here, I've been wondering why it's restricted but the last few posts have helped. So if I understand it right:

1) SETI@home servers are in the SSL building on the UC Berkeley campus. The SSL building is way up on a hill.

2) SETI@home has purchased a gigabit connection to the outside world through Hurricane Electric.

3) The Hurricane Electric line terminates somewhere across campus, and all our traffic must move through the University's network, specifically through a single fiber going up the hill to the SSL building.

4) Right now the University is giving SETI@home 10% of that line, or 100 Mbit/s.

5) In order to get more bandwidth down to the Hurricane Electric switch, there needs to be permission granted by the University to use more of their network.

6) This is tricky because of politics and the need to serve people on campus.

Right?

I think that pretty much sums it up. Maybe just a couple more points...

7) The Hurricane Electric gigabit link actually terminates at Palo Alto, the other side of San Francisco Bay from Berkeley. There's a powerful dedicated router in the data centre there - donated by a volunteer - which supports a VPN to a matching router in the SSL. That makes configuration changes difficult.

8) The connection between SSL on the hill, and the campus network centre that all the cables pass through, has gone through many changes over the years. One time, a substantial length of copper wire was dug up and stolen for scrap. I think it would probably be over-simplistic to describe the current connection as "one single fiber" - but I don't know the full story.
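The figure in point 4 above is simple arithmetic, but bits and bytes are easy to mix up. A minimal sketch, assuming the purchased link is exactly 1 Gbit/s and the campus allocation is exactly 10% (both figures come from the list above; the function name is made up for illustration):

```python
def link_share(link_mbit_per_s, fraction):
    """Return the allocated share of a link as (Mbit/s, MB/s)."""
    mbit = link_mbit_per_s * fraction
    return mbit, mbit / 8  # 8 bits per byte


# 10% of a gigabit link, expressed in megabits and megabytes per second.
share_mbit, share_mbyte = link_share(1000, 0.10)
print(f"{share_mbit:.0f} Mbit/s is about {share_mbyte:.1f} MB/s")
```

So the allocation is roughly 100 megabits per second, which is only about 12.5 megabytes per second of actual file transfer.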
ID: 1291127
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1291129 - Posted: 4 Oct 2012, 11:04:24 UTC - in response to Message 1291125.  

There's a workaround if you are willing to jump through a few hoops. Click Account (bottom and/or top of this page) and then click on SETI@home preferences and set Use Nvidia GPU to NO.

An additional advantage of this is that lost VLARs are not resent to the GPU and timed out immediately!

Gruß,
Gundolf

As soon as an outage is over I set all my rigs to accept CPU work only, and only once those requests are finished do I return them to accepting GPU work (each PC also has a separate venue so I can adjust the settings as each one needs it).

P.S. But it would be nice if the SETI@home science database could be turned back on.

Cheers.
ID: 1291129
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1291151 - Posted: 4 Oct 2012, 12:42:43 UTC - in response to Message 1291067.  

We're not seeing significantly more upload failures on the server side than usual from what I can tell. 20 to 30 successful uploads per second. Are there any geographic or ISP similarities for people who are having problems?

Last week I had to help BOINC along to get work: using the Retry option in the Transfers tab multiple times, exiting and restarting BOINC, or aborting the couple of really stuck downloads.

This week, however, I have yet to help BOINC along. At the time of writing this answer there's nothing waiting to upload or download and my cache (1.00 + 0.50) is full. Perhaps I got into Mike's cloud of no trouble. :-)
ID: 1291151
juan BFP · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1291168 - Posted: 4 Oct 2012, 13:24:07 UTC - in response to Message 1291127.  
Last modified: 4 Oct 2012, 13:30:26 UTC

But the main question remains unanswered:

Why can't we DL MB each time the AP splitting starts?

The saturation of the bandwidth is understood by all, but there are some simple tricks to work around problems like that until they do the political work to get more bandwidth.

A simple load balancing, i.e. something like 25% for AP and 75% for MB, would keep the data flowing. Slower, OK, but better than nothing. Just to clarify, I have nothing against AP; I just want to keep my host working on the search for our little green friends.

I think somebody is missing the main focus of the project, the SETI, so they need to keep the SETI work (MB) running. If they can use the data and the resources for another project (AP), that's OK and makes perfect sense, but only if they can supply both projects with work.

If you look closely, when AP is out, the bandwidth is still saturated at almost 100%, so MB alone uses almost all the bandwidth available. If you try to extract more from the link, problems like this must be expected.

And you can expect more and more problems as GPUs get faster and spread worldwide; a lot of new >50K RAC crunchers will appear in the next months and they need to be fed. So any temporary solution will not last long.
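The 25% / 75% split suggested above could be sketched as a weighted fair share of the link. This is only an illustration under stated assumptions, not how the project's servers actually schedule downloads: the function name, the 100 Mbit/s capacity, and the demand figures are hypothetical, and the rule that an application's unused share flows to the other one is an addition to the idea in the post.

```python
def weighted_share(capacity, weights, demands):
    """Split `capacity` among apps in proportion to `weights`.

    An app never receives more than its demand; whatever it leaves
    unused is redistributed to the apps that still want more.
    """
    alloc = {app: 0.0 for app in weights}
    active = {app for app in weights if demands[app] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_w = sum(weights[a] for a in active)
        # Apps whose outstanding demand fits inside their weighted share.
        satisfied = {
            a for a in active
            if demands[a] - alloc[a] <= remaining * weights[a] / total_w
        }
        if not satisfied:
            # Everyone wants more than their share: split by weight and stop.
            for a in active:
                alloc[a] += remaining * weights[a] / total_w
            break
        for a in satisfied:
            remaining -= demands[a] - alloc[a]
            alloc[a] = float(demands[a])
        active -= satisfied
    return alloc


WEIGHTS = {"AP": 0.25, "MB": 0.75}  # the split proposed in the post

# Hypothetical demands in Mbit/s on an assumed 100 Mbit/s link.
both_busy = weighted_share(100, WEIGHTS, {"AP": 200, "MB": 200})
ap_quiet = weighted_share(100, WEIGHTS, {"AP": 10, "MB": 200})
print(both_busy, ap_quiet)
```

When both applications want more than the link can carry, AP is held to a quarter of it and MB keeps flowing; when AP is quiet, MB picks up the slack instead of leaving bandwidth idle.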
ID: 1291168
Cruncher-American · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1291184 - Posted: 4 Oct 2012, 14:11:40 UTC

Hey - what happened to available WUs? I noticed the splitters are turned off - did they run out of tapes? Or is new code being installed (I hope) to take care of the shorty problem?
ID: 1291184
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1291221 - Posted: 4 Oct 2012, 15:46:07 UTC - in response to Message 1291067.  

We're not seeing significantly more upload failures on the server side than usual from what I can tell. 20 to 30 successful uploads per second. Are there any geographic or ISP similarities for people who are having problems?

I had a backlog of uploads (a fairly small one, compared to what others here are describing) late last week, but by Sunday evening (I think) it was slow but okay: uploads lingered on the Transfers tab for over 10 seconds, as opposed to normal, where they're gone in under 3 seconds and often so fast I can't even see them. I don't, however, monitor my machines constantly or even daily. Mostly, every fourth day I spend about an hour with them, and if everything is okay I use most of that time playing Internet Backgammon on my Win7 box (match is 5 points, not 3 like on XP).

I'm on AT&T U-verse in the Chicago suburbs.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1291221
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1291233 - Posted: 4 Oct 2012, 16:08:20 UTC - in response to Message 1291221.  
Last modified: 4 Oct 2012, 16:08:50 UTC

At times like these I'm glad to jump over to a backup project like Milkyway, LHC, or Cosmology to keep my machines crunching. I'll stop the backup projects when the crisis here is over.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1291233
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1291251 - Posted: 4 Oct 2012, 16:25:12 UTC

Here we go!

In the last 20 minutes, the science database has come back online and a couple of splitter tapes have started running.

No doubt someone started that as soon as he came in this morning.

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1291251
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1291357 - Posted: 4 Oct 2012, 19:48:03 UTC - in response to Message 1291251.  


Glad to see the splitters splitting again.
Unfortunately we still have the same problem with them that we've had since the last outage: they aren't splitting enough work to keep up with demand, let alone to create a Ready to Send buffer.
Grant
Darwin NT
ID: 1291357
Profile Len
Joined: 15 Mar 10
Posts: 52
Credit: 11,725,173
RAC: 86
United Kingdom
Message 1291394 - Posted: 4 Oct 2012, 21:30:18 UTC - in response to Message 1289365.  

Yup, downloading is borked again :-(
and it's taken uploading with it just to make sure we have a "good" weekend


... and it's not just SETI@home either.

I tried rejoining CPDN in order to get something for the CPU cores to nibble on during the day. All I'm getting there is project backoff, yet SETI is able to 'send' 111 of those tasks that expire before the download is complete. This comes after a rash of tiny tasks that take longer to download than they do to crunch.

I could be a happier bunny I guess.
I think I am. Therefore I am. I think.
ID: 1291394
juan BFP · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1291431 - Posted: 4 Oct 2012, 23:44:51 UTC - in response to Message 1291067.  

We're not seeing significantly more upload failures on the server side than usual from what I can tell. 20 to 30 successful uploads per second. Are there any geographic or ISP similarities for people who are having problems?


DLs < 1kbps
ID: 1291431
chromespringer
Joined: 3 Dec 05
Posts: 296
Credit: 55,183,482
RAC: 0
United States
Message 1291450 - Posted: 5 Oct 2012, 0:56:02 UTC - in response to Message 1291067.  

We're not seeing significantly more upload failures on the server side than usual from what I can tell. 20 to 30 successful uploads per second. Are there any geographic or ISP similarities for people who are having problems?


10/4/2012 6:40:43 PM | SETI@home | Computation for task 02au12af.4481.63107.140733193388039.10.32_1 finished
10/4/2012 6:40:43 PM | SETI@home | Starting task 02au12af.4481.63107.140733193388039.10.74_1 using setiathome_enhanced version 603 in slot 3
10/4/2012 6:40:45 PM | SETI@home | Started upload of 02au12af.4481.63107.140733193388039.10.32_1_0
10/4/2012 6:40:52 PM | SETI@home | Finished upload of 02au12af.4481.63107.140733193388039.10.32_1_0
10/4/2012 6:42:08 PM | SETI@home | Computation for task 02au12af.4481.63107.140733193388039.10.44_1 finished
10/4/2012 6:42:08 PM | SETI@home | Starting task 02au12af.4481.63107.140733193388039.10.134_0 using setiathome_enhanced version 603 in slot 5
10/4/2012 6:42:10 PM | SETI@home | Started upload of 02au12af.4481.63107.140733193388039.10.44_1_0
10/4/2012 6:42:16 PM | SETI@home | Finished upload of 02au12af.4481.63107.140733193388039.10.44_1_0
10/4/2012 6:42:18 PM | SETI@home | Computation for task 01se10aa.17614.19014.140733193388041.10.63_1 finished
10/4/2012 6:42:18 PM | SETI@home | Starting task 01se10aa.17614.19014.140733193388041.10.177_0 using setiathome_enhanced version 610 (ati13ati) in slot 1
10/4/2012 6:42:20 PM | SETI@home | Started upload of 01se10aa.17614.19014.140733193388041.10.63_1_0
10/4/2012 6:42:26 PM | SETI@home | Finished upload of 01se10aa.17614.19014.140733193388041.10.63_1_0

Huge improvement :) .. all uploads completed and downloads are cruising along @ 25-40 kbps
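For anyone who wants to compare upload times before and after an outage, the durations can be read off the event log by pairing each "Started upload" line with its "Finished upload" line. A rough sketch, assuming the log keeps the `timestamp | project | message` layout and US-style date format shown above; the function name is made up for illustration and the sample lines are copied from the log in this post:

```python
from datetime import datetime

LOG = """\
10/4/2012 6:40:45 PM | SETI@home | Started upload of 02au12af.4481.63107.140733193388039.10.32_1_0
10/4/2012 6:40:52 PM | SETI@home | Finished upload of 02au12af.4481.63107.140733193388039.10.32_1_0
10/4/2012 6:42:10 PM | SETI@home | Started upload of 02au12af.4481.63107.140733193388039.10.44_1_0
10/4/2012 6:42:16 PM | SETI@home | Finished upload of 02au12af.4481.63107.140733193388039.10.44_1_0
10/4/2012 6:42:20 PM | SETI@home | Started upload of 01se10aa.17614.19014.140733193388041.10.63_1_0
10/4/2012 6:42:26 PM | SETI@home | Finished upload of 01se10aa.17614.19014.140733193388041.10.63_1_0
"""

STARTED = "Started upload of "
FINISHED = "Finished upload of "


def upload_durations(log_text):
    """Map each uploaded file name to its upload time in seconds."""
    started = {}
    durations = {}
    for line in log_text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # not a "time | project | message" line
        stamp, _project, message = parts
        when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
        if message.startswith(STARTED):
            started[message[len(STARTED):]] = when
        elif message.startswith(FINISHED):
            name = message[len(FINISHED):]
            if name in started:
                durations[name] = (when - started.pop(name)).total_seconds()
    return durations


durations = upload_durations(LOG)
for name, seconds in durations.items():
    print(f"{seconds:4.0f} s  {name}")
```

On the lines above this gives 6 to 7 seconds per upload, which fits the "all uploads completed" report.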
ID: 1291450
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1291451 - Posted: 5 Oct 2012, 0:59:18 UTC - in response to Message 1291450.  

Huge improvement :) .. all uploads completed and downloads are cruising along @ 25-40 kbps

It'd just be nice if there were work to download; most of the time there isn't.

Grant
Darwin NT
ID: 1291451
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1291453 - Posted: 5 Oct 2012, 1:05:11 UTC

I think I will join GhostBusters@home. I get more work from them, up to twenty at a time.
ID: 1291453
chromespringer
Joined: 3 Dec 05
Posts: 296
Credit: 55,183,482
RAC: 0
United States
Message 1291455 - Posted: 5 Oct 2012, 1:09:06 UTC - in response to Message 1291451.  

Huge improvement :) .. all uploads completed and downloads are cruising along @ 25-40 kbps

It'd just be nice if there were work to download; most of the time there isn't.


That be a truth .. however, in the last hour I've received 60 tasks for the CPU .. the likes of which I haven't seen for well over a week
ID: 1291455
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1291501 - Posted: 5 Oct 2012, 4:13:56 UTC - in response to Message 1291455.  
Last modified: 5 Oct 2012, 4:14:34 UTC

however, in the last hour I've received 60 tasks for the CPU .. the likes of which I haven't seen for well over a week

I'm still getting work, but not that much. Probably every 4th or 5th request results in work. So my caches continue to shrink, but much more slowly than they have been.
Grant
Darwin NT
ID: 1291501
rob smith · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1291603 - Posted: 5 Oct 2012, 13:29:46 UTC

A quick look at the server status page - tapes available, but splitters not splitting them, and no tasks available.

As it's Friday afternoon here I'll pull up a chair, pop the top on a beer and sip it quietly...
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1291603
Profile Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1291607 - Posted: 5 Oct 2012, 13:37:18 UTC - in response to Message 1291603.  

The splitters are splitting them; the current result creation rate for MB is 32.5953/sec. Apparently not enough to build up any Ready to Send buffer, but enough to max out the bandwidth.
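A back-of-the-envelope check of that last claim, as a sketch only. The creation rate is the figure quoted above; the per-task download size of about 366 KB is an assumed round number for an MB task and is not stated anywhere in this thread, and the calculation further assumes that every newly created result is downloaded once:

```python
def download_mbit_per_s(results_per_s, task_kb):
    """Bandwidth needed to send every newly created result once."""
    bytes_per_s = results_per_s * task_kb * 1000  # KB taken as 1000 bytes
    return bytes_per_s * 8 / 1_000_000


ASSUMED_TASK_KB = 366  # assumption for illustration, not from this thread
needed = download_mbit_per_s(32.5953, ASSUMED_TASK_KB)
print(f"about {needed:.0f} Mbit/s")
```

Under those assumptions the download traffic alone comes to roughly 95 Mbit/s, which sits right at the 100 Mbit/s allocation mentioned earlier in the thread.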
ID: 1291607
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1291763 - Posted: 5 Oct 2012, 20:54:15 UTC - in response to Message 1291501.  

however, in the last hour I've received 60 tasks for the CPU .. the likes of which I haven't seen for well over a week

I'm still getting work, but not that much. Probably every 4th or 5th request results in work. So my caches continue to shrink, but much more slowly than they have been.


Although the splitters are still limited in what they can produce, luckily most of the current WUs aren't shorties, so my cache has actually grown overnight. Not enough to get a cache of CPU work (I'm always only hours from running out), but at least my cache of GPU work has stopped shrinking.
Now if they could just crank the splitters up a couple of notches...
Grant
Darwin NT
ID: 1291763


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.