Panic Mode On (88) Server Problems?

Cruncher-American (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1509975 - Posted: 29 Apr 2014, 19:22:47 UTC - in response to Message 1509934.  

I would agree with that, but the Cricket graph shows 15 hours of download followed by 13 hours of upload. New tapes wouldn't cause that pattern to appear, would they?


Maybe - consider this scenario:

Download new data for 15 hours.
Start splitting (some of) the new data.
=> Lots of APs for a while.
Means higher rates of data (WUs) being sent out until the APs run out.
ID: 1509975 · Report as offensive
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1510287 - Posted: 30 Apr 2014, 16:15:20 UTC - in response to Message 1509975.  

I would agree with that, but the Cricket graph shows 15 hours of download followed by 13 hours of upload. New tapes wouldn't cause that pattern to appear, would they?


Maybe - consider this scenario:

Download new data for 15 hours.
Start splitting (some of) the new data.
=> Lots of APs for a while.
Means higher rates of data (WUs) being sent out until the APs run out.

The pattern is backwards for that. If we were seeing new tapes being loaded and then new work being sent, the blue line would spike first, followed by the green mass.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1510287 · Report as offensive
FeK9

Joined: 20 May 99
Posts: 40
Credit: 61,229,677
RAC: 26
South Africa
Message 1510500 - Posted: 30 Apr 2014, 22:22:18 UTC

My 'Three Kittens' are still purring...
Noli tangere circulos meos...
ID: 1510500 · Report as offensive
juan BFP (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1510620 - Posted: 1 May 2014, 5:49:17 UTC - in response to Message 1510287.  

Charge!!!
ID: 1510620 · Report as offensive
Thomas
Volunteer tester

Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1511148 - Posted: 2 May 2014, 7:19:26 UTC

My team's credit counter has been stuck for a few hours :(
Could there be a problem somewhere?
The SSP doesn't show anything abnormal.
Houston?
ID: 1511148 · Report as offensive
Thomas
Volunteer tester

Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1511173 - Posted: 2 May 2014, 7:57:28 UTC

It is coming back very slowly...
ID: 1511173 · Report as offensive
Profile Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 1511246 - Posted: 2 May 2014, 13:19:03 UTC - in response to Message 1511148.  
Last modified: 2 May 2014, 13:21:21 UTC

My team's credit counter has been stuck for a few hours :(
Could there be a problem somewhere?
The SSP doesn't show anything abnormal.
Houston?


Hi,

This is likely because AP units are being split and distributed. Since they are larger and take much longer to complete, it becomes more likely that one will be waiting on one's wingman for a report. I find that my scores go down for a few days when AP units become available until the "pendings" catch up. This ripple effect would not occur if AP units were always available, but that's not how it is.

Cheers!
Member of the 20 Year Club



ID: 1511246 · Report as offensive
Thomas
Volunteer tester

Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1511276 - Posted: 2 May 2014, 14:17:06 UTC - in response to Message 1511246.  

My team's credit counter has been stuck for a few hours :(
Could there be a problem somewhere?
The SSP doesn't show anything abnormal.
Houston?


Hi,

This is likely because AP units are being split and distributed. Since they are larger and take much longer to complete, it becomes more likely that one will be waiting on one's wingman for a report. I find that my scores go down for a few days when AP units become available until the "pendings" catch up. This ripple effect would not occur if AP units were always available, but that's not how it is.

Cheers!

Maybe, maybe not... It's a strange day for the stats...
THX for your response
ID: 1511276 · Report as offensive
ExchangeMan
Volunteer tester

Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1511412 - Posted: 2 May 2014, 17:14:57 UTC - in response to Message 1511246.  

My team's credit counter has been stuck for a few hours :(
Could there be a problem somewhere?
The SSP doesn't show anything abnormal.
Houston?


Hi,

This is likely because AP units are being split and distributed. Since they are larger and take much longer to complete, it becomes more likely that one will be waiting on one's wingman for a report. I find that my scores go down for a few days when AP units become available until the "pendings" catch up. This ripple effect would not occur if AP units were always available, but that's not how it is.

Cheers!

I've noticed this behavior repeatedly. When I switch from 100% MB to 100% AP, my RAC will drop noticeably (since MBs validate a lot faster than APs), but then after a couple of days it turns around and starts climbing. When I switch back to 100% MB, the RAC keeps climbing for a while then starts dropping after many of the AP units get validated. So what you have is a roller coaster as you switch between AP and MB. If the credit awarding system behaved as it should, this wouldn't happen; you would reach a plateau and only have slight variations thereafter.
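
The roller coaster described above can be reproduced with a toy model. This is only a sketch, not BOINC's actual credit code: it assumes RAC behaves like an exponentially decaying average with a one-week half-life, updated once per day, that the host earns the same credit per day on either application, and that MB results validate after 1 day while AP results take 6 days. The delays, the daily credit figure, and the names `simulate_rac` and `schedule` are all made up for illustration.

```python
# Toy model of RAC when switching between MB and AP work.
# Assumptions (hypothetical, for illustration only):
#   - RAC decays with a 7-day half-life and is updated once per day
#   - credit for a day's work is granted only after a validation delay
#   - both applications earn the same credit per day of crunching

HALF_LIFE_DAYS = 7.0
DECAY = 0.5 ** (1.0 / HALF_LIFE_DAYS)  # per-day decay factor


def simulate_rac(days, app_for_day, delay_days, daily_credit=1000.0):
    """Return the RAC value at the end of each simulated day."""
    # granted[d] = credit that actually arrives on day d
    granted = [0.0] * (days + max(delay_days.values()) + 1)
    for day in range(days):
        app = app_for_day(day)
        granted[day + delay_days[app]] += daily_credit

    rac = 0.0
    history = []
    for day in range(days):
        # Exponentially weighted average of the granted credit
        rac = rac * DECAY + (1.0 - DECAY) * granted[day]
        history.append(rac)
    return history


def schedule(day):
    """100% MB, then 100% AP from day 60, then back to MB from day 120."""
    return "AP" if 60 <= day < 120 else "MB"


delays = {"MB": 1, "AP": 6}  # assumed validation delays in days
rac = simulate_rac(180, schedule, delays)

for day in (59, 65, 90, 119, 125, 179):
    print(f"day {day:3d}: RAC = {rac[day]:7.1f}")
```

In this model the dip after the switch to AP comes from the days on which no credit arrives at all, and the bump after switching back comes from the days on which delayed AP credit and fresh MB credit arrive together. The total credit is the same either way; only its timing moves.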
ID: 1511412 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1511590 - Posted: 3 May 2014, 1:41:19 UTC - in response to Message 1511412.  
Last modified: 3 May 2014, 1:51:38 UTC

... If the credit awarding system behaved as it should, this wouldn't happen; you would reach a plateau and only have slight variations thereafter.


Exactly. In engineering terms the credit system, manifesting in RAC, is 'unstable'. The modes of instability are overshoot, ringing (oscillation, roller-coastering), and drift. You can even see complex self-similar oscillations in successive credit awards for similar tasks.

I would slightly extend your comment and say that 'if the credit system was behaving as it should...', then it would also adapt (converge) quickly to hardware, workunit data, or application change without inducing those instability markers.

I suppose that sounds complex, but these matters have been well understood in control engineering theory and practice since the 1800s, and have simple solutions. "Don't use statistics when an engineered [proven] optimal classical solution exists"
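
To make "overshoot" and "ringing" concrete, here is a minimal sketch of a generic second-order system responding to a step change in its target, for three damping ratios. It is a textbook model and not a model of the credit code; the names `step_response`, `overshoot`, and `settling_time`, the time step, and the 2% settling band are assumptions chosen for illustration.

```python
# Step response of x'' + 2*zeta*omega*x' + omega^2 * x = omega^2 * target
# with target = 1, integrated with a small fixed time step.

DT = 0.001     # integration step (assumed)
T_END = 60.0   # simulated time span (assumed)


def step_response(zeta, omega=1.0):
    """Return the sampled response x(t) for damping ratio zeta."""
    x, v = 0.0, 0.0
    xs = []
    for _ in range(int(T_END / DT)):
        a = omega * omega * (1.0 - x) - 2.0 * zeta * omega * v
        v += a * DT   # semi-implicit Euler: update velocity first,
        x += v * DT   # then position using the new velocity
        xs.append(x)
    return xs


def overshoot(xs):
    """How far the response rises above the target value of 1."""
    return max(0.0, max(xs) - 1.0)


def settling_time(xs, band=0.02):
    """Time after which the response stays within +/- band of the target."""
    last_outside = -1
    for i, x in enumerate(xs):
        if abs(x - 1.0) > band:
            last_outside = i
    return (last_outside + 1) * DT


for zeta in (0.1, 1.0, 2.0):
    xs = step_response(zeta)
    print(f"zeta={zeta}: overshoot={overshoot(xs):.3f}, "
          f"settling time={settling_time(xs):.1f}")
```

The lightly damped case (zeta = 0.1) overshoots and rings for a long time, the critically damped case (zeta = 1.0) does not overshoot, and the overdamped case (zeta = 2.0) creeps toward the target. Tuning is the business of picking a point between those extremes.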
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1511590 · Report as offensive
Profile Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 1511612 - Posted: 3 May 2014, 3:23:24 UTC - in response to Message 1511590.  

While I agree that RAC activity follows well-understood models of improperly damped or controlled systems, it should be noted that, while the operator(s) control the workload, the system's capacity to perform work is completely variable. Therefore, critical damping of the system can only be accomplished by limiting the available "load" to the minimum anticipated capacity - and no one would like that.
An extreme example, limiting all users to one WU per day would smooth things out quite nicely.
Member of the 20 Year Club



ID: 1511612 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1511619 - Posted: 3 May 2014, 4:07:26 UTC - in response to Message 1511612.  
Last modified: 3 May 2014, 4:18:10 UTC

While I agree that RAC activity follows well-understood models of improperly damped or controlled systems, it should be noted that, while the operator(s) control the workload, the system's capacity to perform work is completely variable. Therefore, critical damping of the system can only be accomplished by limiting the available "load" to the minimum anticipated capacity - and no one would like that.
An extreme example, limiting all users to one WU per day would smooth things out quite nicely.


Certainly, critically damped wouldn't be ideal here: too slow to respond. We want to be at least somewhat responsive to change, which requires tuning. Beyond natural variation (which can be damped), the main inputs at the moment are a coarse scaling error from using FPU Whetstone for SIMD (a factor of ~2-6x), sensitivity to initial conditions (estimates), and stochastic (at least non-linear, non-deterministic) time between validations (formally, topological mixing in the temporal domain). Together these meet the criteria under chaos theory to set up self-similar oscillation. They currently have no damping at all.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1511619 · Report as offensive
Profile Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 1511624 - Posted: 3 May 2014, 4:37:23 UTC - in response to Message 1511619.  
Last modified: 3 May 2014, 4:38:59 UTC

Wow, sorry mate, all the experience I have is with control of mechanical and electronic systems in the real world. I had to have solutions that worked. The stuff I think you are talking about (and personally do not understand) is what gets applied to things like the demultiplexing of cell phone audio, which works on paper but not in practice, at least not well or I could stop saying "what? what? Please repeat that. Can you spell it? Send me an Email. Email. E M A I L... Edward, Michael, Albert, Irving, Larry... yeah - bye."
But if you have a solution, write the code and send it to the Berkeley team, I'm sure that they would like a solution as much as the rest of us. Be warned, if you are proposing a solution that varies work unit credit, you may find you will be burned at the stake.

Cheers ;^)
Member of the 20 Year Club



ID: 1511624 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1511625 - Posted: 3 May 2014, 4:45:53 UTC - in response to Message 1511624.  
Last modified: 3 May 2014, 4:57:18 UTC

Wow, sorry mate, all the experience I have is with control of mechanical and electronic systems in the real world. I had to have solutions that worked. The stuff I think you are talking about (and personally do not understand) is what gets applied to things like the demultiplexing of cell phone audio, which works on paper but not in practice, at least not well or I could stop saying "what? what? Please repeat that. Can you spell it? Send me an Email. Email. E M A I L... Edward, Michael, Albert, Irving, Larry... yeah - bye."
But if you have a solution, write the code and send it to the Berkeley team, I'm sure that they would like a solution as much as the rest of us. Be warned, if you are proposing a solution that varies work unit credit, you may find you will be burned at the stake.

Cheers ;^)


Nope. I'm aware of my language barrier and apologies for that ;). This should look familiar:



They are called PID controllers, used as governors for a throttle. Granted, the mathematical detail is horrendous, and you wouldn't be the first to want to burn me at the stake for suggesting science and engineering need to shake hands and get on with the job.

[Edit:] implementation in code is soon to be tested at albert@home, so you're arcing up for no reason.
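
For readers who have not met one, a PID controller is short to write down even if the analysis behind it is not. The following is a minimal textbook sketch, not the implementation being tested at albert@home: the class name `PID`, the helper `run`, the gains, and the first-order "plant" it drives are all assumptions picked purely for illustration.

```python
class PID:
    """Textbook discrete PID controller (illustrative sketch only)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        # Integral term: accumulated error removes steady-state offset
        self.integral += error * self.dt
        # Derivative term: rate of change of error adds damping
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


def run(pid, steps, setpoint=1.0, tau=2.0, dt=0.1):
    """Drive a simple first-order lag (a hypothetical plant) with the PID."""
    y = 0.0
    outputs = []
    for _ in range(steps):
        u = pid.update(setpoint, y)
        y += dt * (u - y) / tau  # plant: tau * dy/dt = u - y
        outputs.append(y)
    return outputs


p_only = run(PID(kp=2.0, ki=0.0, kd=0.0, dt=0.1), steps=600)
full_pid = run(PID(kp=2.0, ki=1.0, kd=0.1, dt=0.1), steps=600)

print(f"P only : final output = {p_only[-1]:.4f}")
print(f"PID    : final output = {full_pid[-1]:.4f}")
```

With proportional action alone the output settles short of the setpoint. Adding the integral term closes that gap, and the derivative term is there to damp the approach.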
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1511625 · Report as offensive
Profile Oz
Joined: 6 Jun 99
Posts: 233
Credit: 200,655,462
RAC: 212
United States
Message 1511628 - Posted: 3 May 2014, 4:57:41 UTC - in response to Message 1511625.  

You are a better man than I, I barely made it through calculus.

And I would not want to burn you at the stake, particularly for suggesting a solution - any solution, but after the last attempt to re-adjust things by credit value manipulation, many folks here are - shall we say - a bit "sensitive".

I say, let's just keep crunching!
Member of the 20 Year Club



ID: 1511628 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1511631 - Posted: 3 May 2014, 5:10:10 UTC - in response to Message 1511628.  

You are a better man than I, I barely made it through calculus.

And I would not want to burn you at the stake, particularly for suggesting a solution - any solution, but after the last attempt to re-adjust things by credit value manipulation, many folks here are - shall we say - a bit "sensitive".

I say, let's just keep crunching!


Good on you for having a go. You're the first to see the bigger picture that's come along in the five years since I studied this problem (which started as simple DCF divergence back then).

All that's happened is that the screwed-up project DCF we used to have has been moved server side, renamed to pfc_scale, and squared, adding in host_scale.

Yeah, some of us are working to get fixes through 'proper channels', and I expect there'll be many more roadblocks and witch hunts to come. The only part I find frustrating is that such a noble institution can't fund proper solutions to real problems, but then I guess that's what we're here for.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1511631 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1511634 - Posted: 3 May 2014, 5:17:59 UTC - in response to Message 1511625.  

, and you wouldn't be the first to want to burn me at the stake for suggesting science and engineering need to shake hands and get on with the job.

Engineering = applied science.
:-)
Grant
Darwin NT
ID: 1511634 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1511637 - Posted: 3 May 2014, 5:24:31 UTC - in response to Message 1511634.  

, and you wouldn't be the first to want to burn me at the stake for suggesting science and engineering need to shake hands and get on with the job.

Engineering = applied science.
:-)


I agree in principle. I happen to have a background in both Computer Science and Engineering (various). If you say that in front of a traditionally western trained theoretical physicist or mathematician though, you might want to duck.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1511637 · Report as offensive
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1511669 - Posted: 3 May 2014, 6:35:01 UTC

topological mixing in the temporal domain

Gee I love that kind of talk. :D

T.A.
ID: 1511669 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1511672 - Posted: 3 May 2014, 6:38:02 UTC - in response to Message 1511669.  

topological mixing in the temporal domain

Gee I love that kind of talk. :D

T.A.


Lol, me too. It just means 'Stirring a cup of tea'.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1511672 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.