Panic Mode On (98) Server Problems?

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1703695 - Posted: 21 Jul 2015, 11:50:51 UTC - in response to Message 1703690.  
Last modified: 21 Jul 2015, 11:52:52 UTC


If you can stand before the angry mob and say 'yes I did it, I had my reasons' - well. But don't be amazed if you get lynched before you managed to explain your reasons.


LoL, I thought Westerners had become more civilized since those times; that's the illusion your mass media try to embed in our minds :P

Regarding the topic: time estimates are as screwed up as credit is. Does anybody not know this since the last app updates?
Anyway, a host-related effect, if any, will occur much sooner than a project-wide one, so there's no reason to worry anyway. Also, you mentioned credit lowering as a result, not as something putative but as fact. Links, please.
In general, moving a task from one device to another is roughly similar to changing the properties of the device itself.
A few examples: CPU overclocking, GPU underclocking. If the underclocking is too big, then indeed a task could be aborted for not meeting BOINC's expectations for its duration. Well, even taking too many vitamins can kill you, like any other mindless action. As the Russian saying goes, "Всё хорошо в меру": everything is good in moderation (~Measure is a treasure).
ID: 1703695 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703697 - Posted: 21 Jul 2015, 11:55:40 UTC - in response to Message 1703695.  


If you can stand before the angry mob and say 'yes I did it, I had my reasons' - well. But don't be amazed if you get lynched before you managed to explain your reasons.


LoL, I thought Westerners had become more civilized since those times; that's the illusion your mass media try to embed in our minds :P

Regarding the topic: time estimates are as screwed up as credit is. Does anybody not know this since the last app updates? ...


Well we might disagree on many things, but I can proudly say this is one we agree on. Nice to see state of the art east-west chaos has some common ground :D
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703697 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1703702 - Posted: 21 Jul 2015, 12:05:52 UTC - in response to Message 1703695.  
Last modified: 21 Jul 2015, 12:14:39 UTC

BTW, speaking of technical consequences, it's worth mentioning the host config with a top-end and a low-end GPU from the same vendor.
In such a config, "re-scheduling" happens task by task, 24/7, because, as has been said a huge number of times, BOINC does not have a device-oriented design. Not being device-oriented, it must be fault-tolerant to such configs by broadening the acceptable range of deviations from expected values.
We know that low-end GPU devices are roughly similar to, if not slower than, a modern CPU core.
Hence, there should be no distinction between CPU+GPU with re-scheduling and low-end GPU + top GPU configs. Re-scheduling between a CPU core and a GPU can't deceive BOINC's estimation mechanism more than a host with a low-end + top GPU does (not necessarily top; a low-end GPU plus an "average" GPU will still fit my logic).
And hence, either BOINC is stable against re-scheduling, or it just can't handle naturally occurring hosts (which, obviously, requires a fix).
ID: 1703702 · Report as offensive
Profile BANZAI56
Volunteer tester

Joined: 17 May 00
Posts: 139
Credit: 47,299,948
RAC: 2
United States
Message 1703706 - Posted: 21 Jul 2015, 12:34:11 UTC - in response to Message 1703690.  

I would discourage from rescheduling in general - you don't want to rock the boat more than you absolutely have to.


Reading deep between the lines here, dangerous I know, but perhaps some feel that a serious rocking of the boat is the only way to inspire change or improvement?

Besides, isn't the religious mantra here all about how S@H participants are discouraged from caring about their RAC and/or overall credit?
No toasters for you soup guy...

(e.g. If RAC/Credit is really that important, one should be running Collatz or the like where the payout is much much better.)
ID: 1703706 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34398
Credit: 79,922,639
RAC: 80
Germany
Message 1703709 - Posted: 21 Jul 2015, 12:39:40 UTC

Richard, please.


With each crime and every kindness we birth our future.
ID: 1703709 · Report as offensive
qbit
Volunteer tester
Joined: 19 Sep 04
Posts: 630
Credit: 6,868,528
RAC: 0
Austria
Message 1703723 - Posted: 21 Jul 2015, 13:30:19 UTC

Maybe, just maybe, they should do their own "credit" system, like vLHC has for example.
ID: 1703723 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703727 - Posted: 21 Jul 2015, 13:35:55 UTC - in response to Message 1703723.  
Last modified: 21 Jul 2015, 13:36:48 UTC

Maybe, just maybe, they should do their own "credit" system, like vLHC has for example.


Well, that's the illusion: that it's about credit. It's actually about the task estimates the servers make, which control how many tasks you get, of what type, and when. The best numbers collected so far (incidentally, by Richard) seem to suggest more than +/- 30% variability on a rapid timescale under the best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better-established engineering methods are needed.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703727 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14680
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1703730 - Posted: 21 Jul 2015, 13:43:55 UTC - in response to Message 1703727.  

Maybe, just maybe, they should do their own "credit" system, like vLHC has for example.

Well, that's the illusion: that it's about credit. It's actually about the task estimates the servers make, which control how many tasks you get, of what type, and when. The best numbers collected so far (incidentally, by Richard) seem to suggest more than +/- 30% variability on a rapid timescale under the best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better-established engineering methods are needed.

When did I collect estimated runtimes? I'd forgotten that.

Outturns, yes. But estimates would be tricky.
ID: 1703730 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703734 - Posted: 21 Jul 2015, 13:48:50 UTC - in response to Message 1703730.  

Maybe, just maybe, they should do their own "credit" system, like vLHC has for example.

Well, that's the illusion: that it's about credit. It's actually about the task estimates the servers make, which control how many tasks you get, of what type, and when. The best numbers collected so far (incidentally, by Richard) seem to suggest more than +/- 30% variability on a rapid timescale under the best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better-established engineering methods are needed.

When did I collect estimated runtimes? I'd forgotten that.

Outturns, yes. But estimates would be tricky.


What's an "outturn"? You may or may not be describing a number literally representable in engineering terms. My closest English-language description would be 'best estimate', whether it be an exact figure or some noise function. In the cases you presented at the time, they were Gaussian in appearance, though most recently they appear log-normal, as Eric had projected.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703734 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34398
Credit: 79,922,639
RAC: 80
Germany
Message 1703735 - Posted: 21 Jul 2015, 13:51:38 UTC - in response to Message 1703727.  
Last modified: 21 Jul 2015, 13:52:05 UTC

Maybe, just maybe, they should do their own "credit" system, like vLHC has for example.


Well, that's the illusion: that it's about credit. It's actually about the task estimates the servers make, which control how many tasks you get, of what type, and when. The best numbers collected so far (incidentally, by Richard) seem to suggest more than +/- 30% variability on a rapid timescale under the best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better-established engineering methods are needed.


Or just a little bit of cheating.
I'm using some sort of mirroring technique to fix this.


With each crime and every kindness we birth our future.
ID: 1703735 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703736 - Posted: 21 Jul 2015, 13:54:16 UTC - in response to Message 1703734.  

Maybe, just maybe, they should do their own "credit" system, like vLHC has for example.

Well, that's the illusion: that it's about credit. It's actually about the task estimates the servers make, which control how many tasks you get, of what type, and when. The best numbers collected so far (incidentally, by Richard) seem to suggest more than +/- 30% variability on a rapid timescale under the best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better-established engineering methods are needed.

When did I collect estimated runtimes? I'd forgotten that.

Outturns, yes. But estimates would be tricky.


What's an "outturn"? You may or may not be describing a number literally representable in engineering terms. My closest English-language description would be 'best estimate', whether it be an exact figure or some noise function. In the cases you presented at the time, they were Gaussian in appearance, though most recently they appear log-normal, as Eric had projected.


Never mind, I get it. Yeah, the numbers you yield (I guess outturns) stimulate the system. The predictive behaviour is the domain of creditnew, and engineering. That should cover it, I think.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703736 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703737 - Posted: 21 Jul 2015, 13:55:50 UTC - in response to Message 1703735.  

Maybe, just maybe, they should do their own "credit" system, like vLHC has for example.


Well, that's the illusion: that it's about credit. It's actually about the task estimates the servers make, which control how many tasks you get, of what type, and when. The best numbers collected so far (incidentally, by Richard) seem to suggest more than +/- 30% variability on a rapid timescale under the best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better-established engineering methods are needed.


Or just a little bit of cheating.
I'm using some sort of mirroring technique to fix this.


Yep, I think Raistmer might agree there is no black or white here too :)
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703737 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14680
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1703739 - Posted: 21 Jul 2015, 14:11:08 UTC - in response to Message 1703734.  

What's an "outturn" ?

How things turn out after the event, compared to what you estimated before you started. Perhaps more familiar in a financial control context: "It cost how much ??? - you said it would only be tuppence-halfpenny".
ID: 1703739 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703741 - Posted: 21 Jul 2015, 14:16:24 UTC - in response to Message 1703739.  
Last modified: 21 Jul 2015, 14:19:56 UTC

What's an "outturn" ?

How things turn out after the event, compared to what you estimated before you started. Perhaps more familiar in a financial control context: "It cost how much ??? - you said it would only be tuppence-halfpenny".


So, estimate quality? OK, yep. Under our controlled Albert conditions, you saw > 37% variation in awarded credit, despite relatively constant runtimes. That reflects directly on the remaining variability, the estimate quality. I don't know for certain, as far as certainty goes in these times, but I'm pretty sure that you could predict the runtimes and proper credit heuristically on the back of an envelope at least as well as, or better than, a robot should be able to. That's an algorithm, better than creditnew, achieving the intended purpose.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703741 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14680
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1703742 - Posted: 21 Jul 2015, 14:25:11 UTC - in response to Message 1703741.  

What's an "outturn" ?

How things turn out after the event, compared to what you estimated before you started. Perhaps more familiar in a financial control context: "It cost how much ??? - you said it would only be tuppence-halfpenny".

So, estimate quality? OK, yep. Under our controlled Albert conditions, you saw > 37% variation in awarded credit, despite relatively constant runtimes. That reflects directly on the remaining variability, the estimate quality. I don't know for certain, as far as certainty goes in these times, but I'm pretty sure that you could predict the runtimes and proper credit heuristically on the back of an envelope at least as well as, or better than, a robot should be able to. That's an algorithm, better than creditnew, achieving the intended purpose.

Ah, credit - yes (that's an outturn). I thought you were guiding us to concentrate on runtime estimates instead - which would be easier to do at Albert, where tasks were essentially identical. Initial runtime estimates here are further complicated (for MultiBeam only) by the AR --> fpops calibration curve only being fine-tuned (and not very fine, even then) for stock CPU apps.
ID: 1703742 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1703765 - Posted: 21 Jul 2015, 19:03:36 UTC - in response to Message 1703742.  
Last modified: 21 Jul 2015, 19:04:40 UTC

And the free fall continues; still struggling to get any work for 2 of my computers. Maybe 20 here and there, but nowhere near a full cache...

edit..

Ok, looks like 1 struck gold, the other still is struggling
ID: 1703765 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703785 - Posted: 21 Jul 2015, 20:52:09 UTC - in response to Message 1703742.  
Last modified: 21 Jul 2015, 21:25:29 UTC

Ah, credit - yes (that's an outturn). I thought you were guiding us to concentrate on runtime estimates instead - which would be easier to do at Albert, where tasks were essentially identical. Initial runtime estimates here are further complicated (for MultiBeam only) by the AR --> fpops calibration curve only being fine-tuned (and not very fine, even then) for stock CPU apps.


Well yes, that too. At the risk of seeming very Californian, I'll wave my arms wide and say "It's all connected.... dude", lol

[Edit:] To be fair to the system, well-tuned closed-loop control seems to be capable of giving pretty good estimates (sometimes to the second) without the addition of a transfer function by workunit parameters, which is also a reflection on the base estimate quality (which for MB does include a transfer function by AR already). I suspect that with good tuning and that kind of compensation added it could seem downright spooky to a casual observer, but it isn't really more complex. It'd just reflect a deeper understanding of the system, to a point that it might not be ready for yet.
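
To make the closed-loop idea concrete, here's a minimal sketch, assuming plain exponential smoothing of the ratio between actual and estimated runtime. It is not how BOINC or creditnew actually does it; `RuntimeEstimator`, `gain` and the numbers in the demo are all hypothetical.

```python
class RuntimeEstimator:
    """Closed-loop correction of a base runtime estimate (illustrative only)."""

    def __init__(self, gain=0.2):
        self.gain = gain        # hypothetical loop gain: how fast to follow new data
        self.correction = 1.0   # multiplier applied to every base estimate

    def estimate(self, base_estimate_s):
        """Corrected estimate in seconds for a task with the given base estimate."""
        return base_estimate_s * self.correction

    def feedback(self, base_estimate_s, actual_s):
        """Feed back one finished task: nudge the correction toward actual/base."""
        observed_ratio = actual_s / base_estimate_s
        self.correction += self.gain * (observed_ratio - self.correction)


if __name__ == "__main__":
    est = RuntimeEstimator(gain=0.2)
    # Made-up scenario: the base estimate says 1000 s, the host really takes 2000 s.
    for _ in range(50):
        est.feedback(base_estimate_s=1000.0, actual_s=2000.0)
    print(round(est.estimate(1000.0)))  # settles close to the real 2000 s
```

A small gain follows slow drift and ignores noise; a large gain reacts quickly but chases every outlier. That trade-off is the "tuning" part.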
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703785 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703799 - Posted: 21 Jul 2015, 21:51:53 UTC - in response to Message 1703526.  

Now I'd be happy to get VLARs to my NVIDIA cards.
Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time)
(Fake they are ATI/AMD ...)


you could pretend they are CPU instead.
And revive my old @teammod@ to process both CPU and GPU apps via CPU-only BOINC scheduling. The good old days, before BOINC even knew what a GPU was :DDD


I thought of that. I remember seeing a bit of code that determined what GPU to use by looking which one had most free memory or something similar.

I once coded a solution that had a variable in shared (CPU) memory and an increment counter modulo N (N = number of GPUs) to put the next task on that GPU.
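
As a rough illustration of the two ideas above (pick the GPU with the most free memory, or hand tasks out in rotation with a counter modulo N), here's a minimal Python sketch. It is not the original code: the names `make_gpu_picker` and `pick_by_free_memory` are made up for the example, the free-memory figures are passed in by the caller rather than queried from a driver, and the counter lives in one process behind a lock instead of in shared memory.

```python
import threading


def make_gpu_picker(num_gpus):
    """Return a function that hands out GPU indices 0..num_gpus-1 in rotation."""
    lock = threading.Lock()
    state = {"next": 0}  # stand-in for the counter kept in shared memory

    def next_gpu():
        with lock:  # make read-and-increment atomic between threads
            gpu = state["next"]
            state["next"] = (state["next"] + 1) % num_gpus
        return gpu

    return next_gpu


def pick_by_free_memory(free_bytes):
    """Alternative policy: index of the GPU reporting the most free memory."""
    return max(range(len(free_bytes)), key=lambda i: free_bytes[i])


if __name__ == "__main__":
    pick = make_gpu_picker(3)
    print([pick() for _ in range(7)])            # [0, 1, 2, 0, 1, 2, 0]
    print(pick_by_free_memory([100, 300, 200]))  # 1
```

Rotation is simple and spreads tasks evenly; the free-memory policy adapts better when the cards differ, at the cost of querying each device first.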

But none of that is necessary now, since the normal MB work is flowing in again.

I used the down-time to upgrade my Linux (Ubuntu) version to a newer one.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703799 · Report as offensive
Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 1703802 - Posted: 21 Jul 2015, 21:59:54 UTC - in response to Message 1703799.  

I thought of that. I remember seeing a bit of code that determined what GPU to use by looking which one had most free memory or something similar.
That's cool :) I think adaptive and heterogeneous things are still a bit scary for lots of people, but to me if it saves micromanaging so I have more special beer time, then it's good.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1703802 · Report as offensive
Profile petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1703811 - Posted: 21 Jul 2015, 22:16:32 UTC - in response to Message 1703487.  

My version has still some accuracy problems.


Still didn't find the complete story there, though have the full team winding up to put each bit in and see what breaks (watch that Github soon). The Chirp explains a little, but not all of the issue. It'll be interesting what falls out the next few weeks in the background.


My answer is off topic. Not a server problem. I'll pm.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1703811 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.