Panic Mode On (98) Server Problems?

Author	Message
Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1703695 - Posted: 21 Jul 2015, 11:50:51 UTC - in response to Message 1703690. Last modified: 21 Jul 2015, 11:52:52 UTC If you can stand before the angry mob and say 'yes I did it, I had my reasons' - well. But don't be amazed if you get lynched before you managed to explain your reasons. LoL, I thought Westerners had become more civilized since those times; that's the illusion your mass media try to embed in our minds :P Regarding the topic: time estimates are as screwed up as credit is; does anybody not know this since the last app updates? Anyway, a host-related effect, if any, will occur much sooner than a project-wide one, so there's no reason to worry. Also, you mentioned credit lowering as a result, not as putative but as fact. Links, please. In general, moving a task from one device to another is roughly similar to changing the properties of the device itself. A few examples: CPU overclocking, GPU underclocking. If the underclocking is too big, indeed, the task could be aborted for not meeting BOINC's expectations for its duration. Well, even taking too many vitamins can kill you, like any other mindless action. Everything is good in moderation (~Measure is a treasure), they say. ID: 1703695 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703697 - Posted: 21 Jul 2015, 11:55:40 UTC - in response to Message 1703695. If you can stand before the angry mob and say 'yes I did it, I had my reasons' - well. But don't be amazed if you get lynched before you managed to explain your reasons. LoL, I thought Westerners had become more civilized since those times; that's the illusion your mass media try to embed in our minds :P Regarding the topic: time estimates are as screwed up as credit is; does anybody not know this since the last app updates? ... Well, we might disagree on many things, but I can proudly say this is one we agree on. Nice to see state-of-the-art east-west chaos has some common ground :D "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703697 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1703702 - Posted: 21 Jul 2015, 12:05:52 UTC - in response to Message 1703695. Last modified: 21 Jul 2015, 12:14:39 UTC BTW, speaking of technical consequences, it's worth mentioning the host config with top and low-end GPUs from the same vendor. In such a config, "re-scheduling" happens task by task on a 24/7 basis because, as has been said a huge number of times, BOINC does not have a device-oriented design. Not being device-oriented, it must be fault-tolerant to such configs by broadening the acceptable range of deviations from expected values. We know that low-end GPU devices are roughly similar to, if not slower than, a modern CPU core. Hence, there should be no distinction between CPU+GPU with re-scheduling and low-end GPU + top GPU configs. Re-scheduling between a CPU core and a GPU can't deceive BOINC's estimation mechanism more than a host with a low-end + top GPU does (not necessarily top; just a low-end GPU and an "average" GPU will still fit my logic). And hence, either BOINC is stable against re-scheduling or it just can't handle naturally occurring hosts (which, obviously, requires a fix). ID: 1703702 ·

BANZAI56 Volunteer tester Send message Joined: 17 May 00 Posts: 139 Credit: 47,299,948 RAC: 2	Message 1703706 - Posted: 21 Jul 2015, 12:34:11 UTC - in response to Message 1703690. I would discourage from rescheduling in general - you don't want to rock the boat more than you absolutely have to. Reading deep between the lines here, dangerous I know, but perhaps some feel that a serious rocking of the boat is the only way to inspire change or improvement? Besides, isn't the religious mantra here all about how S@H participants are discouraged from caring about their RAC and/or overall credit? No toasters for you soup guy... (e.g. If RAC/Credit is really that important, one should be running Collatz or the like where the payout is much much better.) ID: 1703706 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34398 Credit: 79,922,639 RAC: 80	Message 1703709 - Posted: 21 Jul 2015, 12:39:40 UTC Richard, please. With each crime and every kindness we birth our future. ID: 1703709 ·

qbit Volunteer tester Send message Joined: 19 Sep 04 Posts: 630 Credit: 6,868,528 RAC: 0	Message 1703723 - Posted: 21 Jul 2015, 13:30:19 UTC Maybe, just maybe, they should do their own "credit" system, like vLHC has for example. ID: 1703723 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703727 - Posted: 21 Jul 2015, 13:35:55 UTC - in response to Message 1703723. Last modified: 21 Jul 2015, 13:36:48 UTC Maybe, just maybe, they should do their own "credit" system, like vLHC has for example. Well that's the illusion, that it's about credit. It's actually about the task estimates the servers make, so responsible for the control of how many and when what type of tasks you get. Best numbers collected so far (incidentally by Richard) seem to suggest > +/- 30% variability on a rapid timescale under best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better established engineering methods are needed. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703727 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874	Message 1703730 - Posted: 21 Jul 2015, 13:43:55 UTC - in response to Message 1703727. Maybe, just maybe, they should do their own "credit" system, like vLHC has for example. Well that's the illusion, that it's about credit. It's actually about the task estimates the servers make, so responsible for the control of how many and when what type of tasks you get. Best numbers collected so far (incidentally by Richard) seem to suggest > +/- 30% variability on a rapid timescale under best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better established engineering methods are needed. When did I collect estimated runtimes? I'd forgotten that. Outturns, yes. But estimates would be tricky. ID: 1703730 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703734 - Posted: 21 Jul 2015, 13:48:50 UTC - in response to Message 1703730. Maybe, just maybe, they should do their own "credit" system, like vLHC has for example. Well that's the illusion, that it's about credit. It's actually about the task estimates the servers make, so responsible for the control of how many and when what type of tasks you get. Best numbers collected so far (incidentally by Richard) seem to suggest > +/- 30% variability on a rapid timescale under best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better established engineering methods are needed. When did I collect estimated runtimes? I'd forgotten that. Outturns, yes. But estimates would be tricky. What's an "outturn"? You may or may not be describing a number literally representable in engineering terms. My closest English-language description would be 'best estimate', whether it be an exact figure or some noise function. In the cases you presented at the time, they were Gaussian in appearance, though most recently they appear log-normal, like Eric had projected. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703734 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34398 Credit: 79,922,639 RAC: 80	Message 1703735 - Posted: 21 Jul 2015, 13:51:38 UTC - in response to Message 1703727. Last modified: 21 Jul 2015, 13:52:05 UTC Maybe, just maybe, they should do their own "credit" system, like vLHC has for example. Well that's the illusion, that it's about credit. It's actually about the task estimates the servers make, so responsible for the control of how many and when what type of tasks you get. Best numbers collected so far (incidentally by Richard) seem to suggest > +/- 30% variability on a rapid timescale under best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better established engineering methods are needed. Or just a little bit of cheating. I`m using some sort of mirroring technique to fix this. With each crime and every kindness we birth our future. ID: 1703735 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703736 - Posted: 21 Jul 2015, 13:54:16 UTC - in response to Message 1703734. Maybe, just maybe, they should do their own "credit" system, like vLHC has for example. Well that's the illusion, that it's about credit. It's actually about the task estimates the servers make, so responsible for the control of how many and when what type of tasks you get. Best numbers collected so far (incidentally by Richard) seem to suggest > +/- 30% variability on a rapid timescale under best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better established engineering methods are needed. When did I collect estimated runtimes? I'd forgotten that. Outturns, yes. But estimates would be tricky. What's an "outturn"? You may or may not be describing a number literally representable in engineering terms. My closest English-language description would be 'best estimate', whether it be an exact figure or some noise function. In the cases you presented at the time, they were Gaussian in appearance, though most recently they appear log-normal, like Eric had projected. Never mind, I get it. Yeah, the numbers you yield (I guess outturns) stimulate the system. The predictive behaviour is the domain of creditnew, and engineering. That should cover it I think. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703736 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703737 - Posted: 21 Jul 2015, 13:55:50 UTC - in response to Message 1703735. Maybe, just maybe, they should do their own "credit" system, like vLHC has for example. Well that's the illusion, that it's about credit. It's actually about the task estimates the servers make, so responsible for the control of how many and when what type of tasks you get. Best numbers collected so far (incidentally by Richard) seem to suggest > +/- 30% variability on a rapid timescale under best conditions. I suspect that'll work for watering orange trees just fine, but afaik I'm less predictable than an orange tree, so better established engineering methods are needed. Or just a little bit of cheating. I`m using some sort of mirroring technique to fix this. Yep, I think Raistmer might agree there is no black or white here too :) "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703737 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874	Message 1703739 - Posted: 21 Jul 2015, 14:11:08 UTC - in response to Message 1703734. What's an "outturn" ? How things turn out after the event, compared to what you estimated before you started. Perhaps more familiar in a financial control context: "It cost how much ??? - you said it would only be tuppence-halfpenny". ID: 1703739 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703741 - Posted: 21 Jul 2015, 14:16:24 UTC - in response to Message 1703739. Last modified: 21 Jul 2015, 14:19:56 UTC What's an "outturn" ? How things turn out after the event, compared to what you estimated before you started. Perhaps more familiar in a financial control context: "It cost how much ??? - you said it would only be tuppence-halfpenny". So estimate quality ? OK yep. Under our controlled Albert conditions, you saw > 37% variation in awarded credit, despite relatively constant runtimes. That reflects directly on the remaining variability, the estimate quality. I don't know for certain, as far as certainty goes in these times, but I'm pretty sure that you could predict the runtimes and proper credit heuristically on the back of an envelope, better than a robot should be able to do. at least as good or better. That's an algorithm, better than creditnew, achieving the intended purpose. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703741 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874	Message 1703742 - Posted: 21 Jul 2015, 14:25:11 UTC - in response to Message 1703741. What's an "outturn" ? How things turn out after the event, compared to what you estimated before you started. Perhaps more familiar in a financial control context: "It cost how much ??? - you said it would only be tuppence-halfpenny". So estimate quality ? OK yep. Under our controlled Albert conditions, you saw > 37% variation in awarded credit, despite relatively constant runtimes. That reflects directly on the remaining variability, the estimate quality. I don't know for certain, as far as certainty goes in these times, but I'm pretty sure that you could predict the runtimes and proper credit heuristically on the back of an envelope, better than a robot should be able to do. at least as good or better. That's an algorithm, better than creditnew, achieving the intended purpose. Ah, credit - yes (that's an outturn). I thought you were guiding us to concentrate on runtime estimates instead - which would be easier to do at Albert, where tasks were essentially identical. Initial runtime estimates here are further complicated (for MultiBeam only) by the AR --> fpops calibration curve only being fine-tuned (and not very fine, even then) for stock CPU apps. ID: 1703742 ·

Zalster Volunteer tester Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242	Message 1703765 - Posted: 21 Jul 2015, 19:03:36 UTC - in response to Message 1703742. Last modified: 21 Jul 2015, 19:04:40 UTC And the free fall continues, still struggling to get any work for 2 of my computers. Maybe 20 here and there but nowhere near a full cache... edit.. Ok, looks like 1 struck gold, the other is still struggling ID: 1703765 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703785 - Posted: 21 Jul 2015, 20:52:09 UTC - in response to Message 1703742. Last modified: 21 Jul 2015, 21:25:29 UTC Ah, credit - yes (that's an outturn). I thought you were guiding us to concentrate on runtime estimates instead - which would be easier to do at Albert, where tasks were essentially identical. Initial runtime estimates here are further complicated (for MultiBeam only) by the AR --> fpops calibration curve only being fine-tuned (and not very fine, even then) for stock CPU apps. Well yes, that too. At the risk of seeming very Californian, I'll wave my arms wide and say "It's all connected.... dude", lol [Edit:] To be fair on the system, well-tuned closed-loop control seems to be capable of giving pretty good estimates (sometimes to the second) without the addition of a transfer function by workunit parameters, which is also a reflection on the base estimate quality (which for MB does include a transfer function by AR already). I suspect that with good tuning and that kind of compensation added it could seem downright spooky to a casual observer, but it isn't really more complex. It'd just reflect a deeper understanding of the system, to the point that it might not be ready for yet. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703785 ·
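
Editor's sketch of the closed-loop idea mentioned in the post above: a base runtime estimate is scaled by a correction factor that is nudged toward the observed actual/estimated ratio after each finished task. This is only an illustration under stated assumptions. The class name `EstimateCorrector`, the `gain` parameter, and the simple first-order update are all hypothetical and are not taken from BOINC or CreditNew.

```python
class EstimateCorrector:
    """Hypothetical first-order feedback loop for runtime estimates."""

    def __init__(self, gain: float = 0.1) -> None:
        if not 0.0 < gain <= 1.0:
            raise ValueError("gain must be in (0, 1]")
        self.gain = gain    # how strongly one result moves the correction
        self.scale = 1.0    # current correction applied to base estimates

    def predict(self, base_estimate_s: float) -> float:
        """Return the corrected runtime estimate in seconds."""
        return base_estimate_s * self.scale

    def update(self, base_estimate_s: float, actual_s: float) -> None:
        """Feed back one finished task: move scale toward actual/base."""
        if base_estimate_s <= 0:
            raise ValueError("base estimate must be positive")
        observed_ratio = actual_s / base_estimate_s
        self.scale += self.gain * (observed_ratio - self.scale)


corrector = EstimateCorrector(gain=0.5)
corrector.update(base_estimate_s=100.0, actual_s=200.0)
print(corrector.predict(100.0))
```

A small gain reacts slowly but smooths out noisy results; a large gain tracks changes quickly but passes more of the noise through, which is the tuning trade-off the post alludes to.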

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1703799 - Posted: 21 Jul 2015, 21:51:53 UTC - in response to Message 1703526. Now I'd be happy to get VLARs to my NVIDIA cards. Is there a way to say in app_info.xml that my cards could take a try? (2000 seconds one at a time) (Fake they are ATI/AMD ...) you could pretend they are CPU instead. And revive my old @teammod@ to process both CPU and GPU apps via CPU-only BOINC scheduling. Good old days before BOINC even knew what a GPU is :DDD I thought of that. I remember seeing a bit of code that determined what GPU to use by looking at which one had the most free memory or something similar. I once coded a solution that had a variable in shared (CPU) memory and an increment counter modulo N (N = number of GPUs) to put the next task on that GPU. But none of that is necessary since the normal MB work is flowing in again. I used the down-time to upgrade my Linux (Ubuntu) version to a newer one. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1703799 ·
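
Editor's sketch of the two GPU-selection ideas described in the post above: a counter taken modulo N to hand tasks to GPUs in turn, and picking the GPU with the most free memory. The names `RoundRobinGpuPicker` and `gpu_with_most_free_memory` are hypothetical, the free-memory figures are supplied by the caller, not queried from a driver, and the counter here lives in a single process, whereas the post describes one kept in shared CPU memory (which would also need locking between processes).

```python
from typing import Sequence


class RoundRobinGpuPicker:
    """Hand out GPU indices 0..N-1 in turn, one per new task."""

    def __init__(self, num_gpus: int) -> None:
        if num_gpus <= 0:
            raise ValueError("num_gpus must be positive")
        self.num_gpus = num_gpus
        self._counter = 0  # stand-in for the counter in shared memory

    def next_gpu(self) -> int:
        """Return the GPU index for the next task and advance the counter."""
        gpu = self._counter % self.num_gpus
        self._counter += 1
        return gpu


def gpu_with_most_free_memory(free_bytes: Sequence[int]) -> int:
    """Return the index of the GPU reporting the most free memory.

    Ties go to the lowest index.
    """
    if not free_bytes:
        raise ValueError("no GPUs reported")
    return max(range(len(free_bytes)), key=lambda i: free_bytes[i])


picker = RoundRobinGpuPicker(num_gpus=3)
print([picker.next_gpu() for _ in range(5)])
print(gpu_with_most_free_memory([100, 500, 300]))
```

Round-robin needs no information about the devices but ignores how loaded they are; the free-memory rule adapts to load but depends on an up-to-date reading for every GPU.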

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1703802 - Posted: 21 Jul 2015, 21:59:54 UTC - in response to Message 1703799. I thought of that. I remember seeing a bit of code that determined what GPU to use by looking which one had most free memory or something similar. That's cool :) I think adaptive and heterogeneous things are still a bit scary for lots of people, but to me if it saves micromanaging so I have more special beer time, then it's good. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1703802 ·

petri33 Volunteer tester Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156	Message 1703811 - Posted: 21 Jul 2015, 22:16:32 UTC - in response to Message 1703487. My version has still some accuracy problems. Still didn't find the complete story there, though have the full team winding up to put each bit in and see what breaks (watch that Github soon). The Chirp explains a little, but not all of the issue. It'll be interesting what falls out the next few weeks in the background. My answer is off topic. Not a server problem. I'll pm. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones ID: 1703811 ·
