Panic Mode On (83) Server Problems?

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1362538 - Posted: 29 Apr 2013, 15:34:56 UTC - in response to Message 1362275.  

Never mind, must have been traffic gettin in the way, downloads completed.

Been getting the occasional project backoff on downloads for a couple of weeks now, but haven't been able to spot any particular patterns.

I got an AP assigned last night and was watching the transfers tab and it took over 30 seconds for it to actually start receiving data, and when it did, I received it at 3MB/sec. I was actually waiting for it to time-out and retry, but then it started, finally.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1362538 · Report as offensive
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1362540 - Posted: 29 Apr 2013, 15:38:50 UTC - in response to Message 1362538.  

Never mind, must have been traffic gettin in the way, downloads completed.

Been getting the occasional project backoff on downloads for a couple of weeks now, but haven't been able to spot any particular patterns.

I got an AP assigned last night and was watching the transfers tab and it took over 30 seconds for it to actually start receiving data, and when it did, I received it at 3MB/sec. I was actually waiting for it to time-out and retry, but then it started, finally.

Well, thankfully, that's the exception rather than the rule these days.
When I happen to be checking one of my rigs and some new tasks come through, I have to be pretty fast to click the downloads tab to watch them stream in.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1362540 · Report as offensive
James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1362545 - Posted: 29 Apr 2013, 15:52:12 UTC

I have noticed over the past few days that all my machines have one work unit that is in retry. When I hit the retry button they get downloaded fast. Maybe they go so fast now the server just loses track:)

Old James
ID: 1362545 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13722
Credit: 208,696,464
RAC: 304
Australia
Message 1362626 - Posted: 29 Apr 2013, 17:58:37 UTC - in response to Message 1362545.  


Looks like some more shorties going through the system. And once again the splitters are unable to pick up the pace. Ready-to-send buffer rapidly shrinking.
Grant
Darwin NT
ID: 1362626 · Report as offensive
Tom*

Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1362642 - Posted: 29 Apr 2013, 18:39:44 UTC
Last modified: 29 Apr 2013, 18:43:05 UTC

Looking at the graphs of result creation for the week, there is a definite drop-off on 26 Apr, when I first noticed all the mb_splitters were renamed pfb_splitter. Not a major drop-off, but noticeable.

ID: 1362642 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1362700 - Posted: 29 Apr 2013, 21:12:03 UTC - in response to Message 1362538.  

I got an AP assigned last night and was watching the transfers tab and it took over 30 seconds for it to actually start receiving data, and when it did, I received it at 3MB/sec. I was actually waiting for it to time-out and retry, but then it started, finally.


The delay you notice in downloading is not specific to AP, it has been happening to MB as well. I first noticed it about two weeks ago.

ID: 1362700 · Report as offensive
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1362722 - Posted: 29 Apr 2013, 23:37:40 UTC - in response to Message 1362700.  

time to PANIC

downloads have stopped!!!


ID: 1362722 · Report as offensive
RottenMutt
Joined: 15 Mar 01
Posts: 1011
Credit: 230,314,058
RAC: 0
United States
Message 1362724 - Posted: 29 Apr 2013, 23:47:41 UTC - in response to Message 1362722.  

rig needed a reboot. all good now:D

time to PANIC

downloads have stopped!!!



ID: 1362724 · Report as offensive
zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1362725 - Posted: 29 Apr 2013, 23:48:44 UTC - in response to Message 1362722.  
Last modified: 29 Apr 2013, 23:49:05 UTC

time to PANIC

downloads have stopped!!!


Of course uploads are still jumping to the server...
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1362725 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1362737 - Posted: 30 Apr 2013, 1:33:29 UTC - in response to Message 1362700.  

I got an AP assigned last night and was watching the transfers tab and it took over 30 seconds for it to actually start receiving data, and when it did, I received it at 3MB/sec. I was actually waiting for it to time-out and retry, but then it started, finally.


The delay you notice in downloading is not specific to AP, it has been happening to MB as well. I first noticed it about two weeks ago.

I figured it applied to everything, but with AP it is easier to see what happens, since the WU is much larger and takes longer to download. With MB, the WU is pretty much gone within one UI update interval, but with AP it takes 3-5 UI update intervals before it finishes.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1362737 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13722
Credit: 208,696,464
RAC: 304
Australia
Message 1362842 - Posted: 30 Apr 2013, 9:10:19 UTC - in response to Message 1362737.  


Almost out of CPU work. Most requests for work result in "Project has no tasks available" messages.
The splitters need fixing.
Grant
Darwin NT
ID: 1362842 · Report as offensive
andybutt
Volunteer tester
Joined: 18 Mar 03
Posts: 262
Credit: 164,205,187
RAC: 516
United Kingdom
Message 1362858 - Posted: 30 Apr 2013, 10:41:33 UTC - in response to Message 1362842.  

same here
ID: 1362858 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1362860 - Posted: 30 Apr 2013, 10:56:12 UTC

+1, are the splitters our next bottleneck?
ID: 1362860 · Report as offensive
Tom*

Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1362861 - Posted: 30 Apr 2013, 10:59:49 UTC

Maybe we are drinking the wrong bottles of Beer? Too many bottlenecks?

If we switch to Cans of Beer would that help?
ID: 1362861 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1362864 - Posted: 30 Apr 2013, 11:21:09 UTC - in response to Message 1362861.  
Last modified: 30 Apr 2013, 11:22:14 UTC

Yes, now switching to beer cans; a 24-pack is a good start. I really believe I need some: all my hosts are running out of GPU WUs.
ID: 1362864 · Report as offensive
ExchangeMan
Volunteer tester

Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1362865 - Posted: 30 Apr 2013, 11:25:32 UTC

Yesterday evening I had a feeling we were going to run into problems overnight. Yeah, the MB splitters seem to have been running OK, but with the AP splitters not running at full capacity, problems were inevitable overnight.

It seems like for a few days, things weren't too bad. But partial functionality on either of the splitters eventually results in problems. Maybe someone forgot to change tapes.

It seems that they need to have 6 splitters of each type to keep up with work unit demand.

ID: 1362865 · Report as offensive
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1362867 - Posted: 30 Apr 2013, 11:29:45 UTC

I was running low until I hit the right moment, and got 69 new tasks. Apart from one solitary mid-AR resend, every single one was a shorty. They just get snapped up too darn quickly during a shorty storm - we need mid-AR or AP to damp things down a bit.
ID: 1362867 · Report as offensive
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1362872 - Posted: 30 Apr 2013, 11:51:03 UTC - in response to Message 1362867.  

I was running low until I hit the right moment, and got 69 new tasks. Apart from one solitary mid-AR resend, every single one was a shorty. They just get snapped up too darn quickly during a shorty storm - we need mid-AR or AP to damp things down a bit.

Had a similar result with a lucky fetch. I'm out of AP and have mostly shorties, so my 83 GPU tasks sum to 2 hours of crunch time, according to BoincTasks. I have set up my zero-resource-share project for the maintenance window.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1362872 · Report as offensive
ExchangeMan
Volunteer tester

Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1362877 - Posted: 30 Apr 2013, 12:09:37 UTC - in response to Message 1362867.  

I was running low until I hit the right moment, and got 69 new tasks. Apart from one solitary mid-AR resend, every single one was a shorty. They just get snapped up too darn quickly during a shorty storm - we need mid-AR or AP to damp things down a bit.

It seemed that for a few days there was a nice mix of shorties, mid-ARs and AP work units. I don't know if anyone planned it that way or if it was just random chance. With a decent mix, the system can almost maintain itself until someone needs to change tapes; then it's pot luck.

ID: 1362877 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1362900 - Posted: 30 Apr 2013, 14:11:04 UTC

It is a game of Hungry Hungry Hippos right now, it would seem. With my 24-core box reporting a 3-hour queue, it looks like it will be doing some PG work today once maintenance starts.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1362900 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.