Rescheduling - good practice if applied properly

Author	Message
Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1478778 - Posted: 18 Feb 2014, 15:05:52 UTC In this thread I will give arguments in defense of the thesis in the topic title. First, some preliminary statements that explain the purpose of this post and why it is needed.
1) There is a topic on these boards with the directly opposite name. Because I don't agree with the thesis in its title, I decided to stop bumping that thread with attempts to explain why it is wrong. Better to support the right thesis than to argue against the wrong one, IMO.
2) This topic is not about credits in any form. If you want to discuss the influence of rescheduling on the current "CreditScrew" system, go elsewhere or create your own topic (it's easy to do), but please don't pollute this thread. Warning given. Credit discussion is off-topic here.
3) The discussion in the opposite thread descended into personal insults, which is another reason to start fresh. Let's try to stay civilized one more time.
There is quite a lot of misunderstanding in this area, and it grows into local urban myths day by day. Let's consider some of them:
1) "If everyone did <smth.>". If everyone did the same thing, even occasionally, we would live in a completely different world. Such "if everyone..." arguments are pointless in their very essence.
2) "Don't break the limits, it's not fair." Let's consider why limits are needed. Some hosts run a BOINC client too old to handle cache settings properly. Such hosts download thousands of tasks, which results in missed deadlines through no fault of the user. In other cases a host can (maybe even intentionally) download many more tasks than it can handle in time. Whatever the reason, all these situations increase the average turnaround, waste server network resources, and bloat the database.
Requiring a higher BOINC version would "kill" not only such hosts (and not even all of them) but also many set-and-forget hosts that still produce fully valid work in time. Hence it's a bad option for the project. Instead, to prevent excessive database bloat from such "rogue hosts", limits were imposed on the number of tasks per task type. This is a very clumsy approach: it limits neither tasks per host nor tasks per device (both would be better than what was implemented), but we have what we have until someone finds free time to improve it. The limits were chosen to cut rogue hosts severely while leaving "room to breathe" for others. Unfortunately, this "room to breathe" differs hugely between midrange hosts and hosts with powerful ATi GPUs on board. As one can see, there is nothing about "fairness" here; it's about drawing a threshold somewhere, because more differentiated approaches would take much more time to implement. Another consequence: the limits will hardly be increased, since that would increase the share of tasks going to misbehaving hosts.
Now to the original topic. First of all, let's distinguish precisely between cherry-picking and rescheduling. What cherry-picking means in this post: processing only some work and aborting other (already received!) work. Rescheduling is obviously different. No tasks are aborted and no additional server resources are consumed; client resources are just rearranged in a better way. Hence those who put cherry-picking and rescheduling in the same line are obviously wrong.
Also, rescheduling should be applied "properly". What "properly" means in this context: not collecting more tasks than the host can process in time. Collecting more tasks than the host can process before the task deadlines moves the host into the very same category the limits are meant to prevent. In exactly that case, circumventing the limits is wrong (missed deadlines lead to additional server resource consumption, just as cherry-picking and misbehaving hosts do).
But as long as all deadlines are met, there is no issue with hosts going above the imposed limit of 200 tasks. We know that results are stored in the database for about 1 day. So a high-performance host will "bloat the database" over 1 day much more than a host with an increased number of tasks in cache but lower throughput. Also, if we take into account how few people read these boards, and how few of them will spend time additionally optimizing their hosts via rescheduling, we see that the database-bloat argument is completely baseless (unless rescheduling is applied improperly and puts the host into the malicious-hosts category).
All this was why it's not bad. Now why it's good: different types of workload suit different devices better, but not all types of work are distributed constantly. Hence, with limits imposed on the number of tasks per device type, there can (and will) be situations when a device is forced to sit idle or to process work that fits it less well (if an app for that work exists at all). In such cases rescheduling helps keep the device busy. And again, why not just remove the limits? See the first part of this post: limits are needed, but not to make work distribution "fair". They are needed to prevent database bloat (and other waste of project resources) by a large number of uncontrolled hosts with too-old BOINC clients, and (maybe) even to guard against malicious hosts that intentionally download tasks and do not return (!) results.
What changes in the project config could reduce the need for rescheduling? With a more constant flow of all types of work, the need for rescheduling would be greatly reduced, if not removed completely. This could be done by reducing the number of splitters for AstroPulse tasks.
And an example from not-so-distant history: with the introduction of the CUDA app and CUDA devices, we met a situation where one type of supplied work was badly suited to these computing devices. Performance was awful, and there were even driver crashes that required a full system reboot.
So the need arose to reschedule that type of work. The project has too much inertia to handle such a situation properly in time, so opt apps were written (partly by me) that got rid of the issue. The initial implementation was the so-called VLARkill mod, which was just a lesser evil. It did a kind of rescheduling, but in a quite wasteful way, by aborting VLAR tasks on the GPU. Nevertheless, this mod was widely distributed because it was the only way to handle the issue at that moment. Later I developed a more advanced approach, the so-called team-mod, which starts the CPU app instead of the GPU one if a VLAR is detected. This approach was better than VLARkill in some respects but still far from a perfect solution. Later, a few different reschedulers appeared that solve this issue in an even more optimal way (though still not perfectly). And only finally, after many months (!), the project itself adapted to the situation: the .vlar type was introduced, with a restriction against sending it to GPUs. But again, this measure in its current implementation is not perfect: there are ATi devices that can handle VLARs OK but now don't receive them. So there is still room for improvement even in this problem, which has lived with us for a few years already.
What this example should illustrate: there is no need to treat the current project config as "God's revelation". It can be suboptimal, and it's always some balance of "lesser evils" that has to be set globally (like not sending VLARs to ATi GPUs either, or limiting even multi-GPU hosts to 100 tasks) but can be improved locally. A non-optimal project-wide configuration is the result of a tradeoff between the project's complexity and the number of real people involved in its management. In such a situation "good enough" is all we can hope to get on a global scale. But there is nothing bad in making that "good enough" better on a local scale, on one's own hosts. SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1478778 ·
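
Editorial illustration of the "properly" criterion from the post above (never hold more tasks than the host can process before their deadlines). This is a minimal Python sketch, not part of BOINC or of any rescheduler tool. The Task fields, the function names, and the assumption of a single device that works through its cache one task at a time in earliest-deadline-first order are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str                 # made-up label, not a real workunit name
    est_hours: float          # assumed run-time estimate on this device
    hours_to_deadline: float  # time left until the task's deadline


def cache_is_safe(tasks):
    """Return True if every task finishes before its deadline when one
    device processes the cache sequentially, earliest deadline first."""
    elapsed = 0.0
    for task in sorted(tasks, key=lambda t: t.hours_to_deadline):
        elapsed += task.est_hours
        if elapsed > task.hours_to_deadline:
            return False  # this task would miss its deadline
    return True


def max_safe_cache(est_hours, hours_to_deadline):
    """Largest number of identical tasks one device can hold and still
    finish all of them before a common deadline."""
    return int(hours_to_deadline // est_hours)


small_cache = [Task("a", 1.0, 2.0), Task("b", 1.0, 2.0)]
too_big_cache = small_cache + [Task("c", 1.0, 2.0)]
```

With these made-up numbers, small_cache passes the check and too_big_cache fails it. The same arithmetic shows why a fixed task count means very different things on different hosts: max_safe_cache(0.5, 600.0) and max_safe_cache(12.0, 600.0) differ by a factor of 24.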

bill Send message Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0	Message 1478880 - Posted: 18 Feb 2014, 23:26:00 UTC - in response to Message 1478778. Last modified: 18 Feb 2014, 23:26:38 UTC Excellent. Thank you for starting this thread. I would like to point out, one more time, that some CUDA cards can handle VLARS. If there ever comes a time when the option is made available to handle VLARS with a GPU (OpenCL or CUDA) I would also like the option to reschedule VLARS to my GPU when it has no other work from SAH to do. ID: 1478880 ·

Claggy Volunteer tester Send message Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4	Message 1478883 - Posted: 18 Feb 2014, 23:40:15 UTC - in response to Message 1478880. Last modified: 18 Feb 2014, 23:40:51 UTC "I would like to point out, one more time, that some CUDA cards can handle VLARS. If there ever comes a time when the option is made available to handle VLARS with a GPU (OpenCL or CUDA) I would also like the option to reschedule VLARS to my GPU when it has no other work from SAH to do." But very inefficiently, and that inefficiency affects the APR, lowering it and the app's overall efficiency project-wide, and that affects you know what, which is outside the scope of this thread. Once the CUDA app's pulse-finding code is improved, and the autocorrelation code too (that is inefficient as well, for different reasons), then the project may just remove the "don't send VLARs to GPUs" restriction, but that will depend on how much improvement x42 brings. Claggy ID: 1478883 ·
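
Editorial illustration of the VLAR routing discussed in the posts above: a toy Python sketch of moving VLAR tasks to the CPU queue unless the GPU is known to cope with them. It does not reproduce any real rescheduler or the team-mod. The task names, the ".vlar" substring test, and the gpu_handles_vlar switch are assumptions made for the example.

```python
def pick_device(task_name, gpu_handles_vlar=False):
    """Return "cpu" or "gpu" for one task, using a toy VLAR rule."""
    is_vlar = ".vlar" in task_name  # assumed naming convention
    if is_vlar and not gpu_handles_vlar:
        return "cpu"
    return "gpu"


def reschedule(task_names, gpu_handles_vlar=False):
    """Split a cache into per-device queues without dropping any task."""
    queues = {"cpu": [], "gpu": []}
    for name in task_names:
        queues[pick_device(name, gpu_handles_vlar)].append(name)
    return queues


cache = ["wu_001.vlar_0", "wu_002_1", "wu_003.vlar_1"]  # made-up names
queues = reschedule(cache)
```

Every task in the cache ends up in exactly one queue, which is the difference from the VLARkill approach: nothing is aborted, work is only moved between devices. Passing gpu_handles_vlar=True models a card that can take VLARs, as bill asks for.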

draco Volunteer tester Send message Joined: 6 Dec 05 Posts: 119 Credit: 3,327,457 RAC: 0	Message 1479053 - Posted: 19 Feb 2014, 9:04:55 UTC - in response to Message 1478883. I think all these situations arise because there are (as everywhere else) people who race for RAC and for their positions in the world tables. In other words, their goal is not to participate in SETI and help analyze signals, but to get the highest possible score by any means. If scoring were eliminated, those people would leave the project, and many of the problems related to cheating and so on would go away automatically too. ID: 1479053 ·

Wiggo Send message Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489	Message 1479054 - Posted: 19 Feb 2014, 9:13:30 UTC I'm just totally amazed by people who get caught with their hands in the cookie jar taking more than they should and then trying to justify their actions. :-O ID: 1479054 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1479056 - Posted: 19 Feb 2014, 9:25:29 UTC - in response to Message 1479054. "I'm just totally amazed by people who get caught with their hands in the cookie jar taking more than they should and then trying to justify their actions. :-O" Amen, brother. I guess they think their RAC is worth more than their reputation. At least in my book, it is not. 'Nuff said here, I shall abdicate further opinions on this subject. You all know where the kitties stand. Now, here is the question. 'What would the kitties do?' Meow meow meow. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1479056 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1479057 - Posted: 19 Feb 2014, 9:35:09 UTC - in response to Message 1479053. Last modified: 19 Feb 2014, 9:45:45 UTC "I think all these situations arise because there are (as everywhere else) people who race for RAC and for their positions in the world tables. In other words, their goal is not to participate in SETI and help analyze signals, but to get the highest possible score by any means. If scoring were eliminated, those people would leave the project, and many of the problems related to cheating and so on would go away automatically too." Yes, most of the misunderstanding about rescheduling comes from unjustified actions aimed at maximizing RAC. That's why I completely refuse to discuss RAC-related topics here. Applied wisely, rescheduling can help improve throughput without negative influence on server resources. It should not be confused with so-called cherry-picking, which negatively affects server resources and can even harm the project's science (there is a limit on failures for any given task, so if some unfortunate task happens to land on a few cherry-picking hosts and is aborted a few times, it can be marked as bad and not processed at all). Rescheduling should be applied only within boundaries that ensure in-time processing of all acquired tasks. The particular values of such boundaries can vary hugely between hosts. Host performance varies too much across the project for any hardwired common limit to be effective. There are hosts with a throughput of a few tasks per day or less, while others can produce hundreds if not thousands of results every day. Hence the particular size of a cache should not be a matter of concern. The question is whether the host has deadline misses or task abortions. These, and only these, parameters differentiate good hosts from bad ones. SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1479057 ·
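
Editorial illustration of the two criteria in the post above: a host counts as "good" if it has no deadline misses and no aborted tasks, and a task that collects too many failures can be marked bad. This is a hedged Python sketch only. The function names, the result labels, and the max_errors value are hypothetical, and treating the first successful result as "processed" ignores the project's real validation rules.

```python
def host_is_well_behaved(missed_deadlines, aborted_tasks):
    """The post's criterion: cache size is irrelevant, only deadline
    misses and task abortions count against a host."""
    return missed_deadlines == 0 and aborted_tasks == 0


def task_outcome(results, max_errors):
    """Walk through results in the order hosts returned them.

    results    -- list of "ok", "aborted" or "missed" (made-up labels)
    max_errors -- assumed per-task failure limit
    """
    errors = 0
    for result in results:
        if result == "ok":
            return "processed"
        errors += 1
        if errors >= max_errors:
            return "marked bad"  # failure limit reached before any success
    return "still in progress"
```

With an assumed limit of 3, a task aborted by two cherry-picking hosts and then completed by a third is still processed, while a task aborted three times in a row is lost even if a later host would have finished it.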

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1479060 - Posted: 19 Feb 2014, 9:46:55 UTC Last modified: 19 Feb 2014, 9:47:21 UTC Raistmer, I shall not diss you, because you have done great things in the programming bit... But I still think you are amiss in your opinion of rescheduling ad infinitum just to get a large AP cache. It just sucks. Most of the major contributors to the project do so without any such games. Get real. I could get hundreds upon hundreds of AP tasks if I chose to cheat. You are saying that I am not doing the best I can for the project because I do NOT? Wake up and smell the bullshit, Raistmer. There is nothing honest about cheating and accumulating more AP or any other tasks than what the project currently allows by limits. Anybody that does it is cheating, and cherry picking what they wish to accumulate for future processing. End of line, end of bullshit. I shall NOT. Mewofingwow. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1479060 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1479066 - Posted: 19 Feb 2014, 9:56:22 UTC And regarding the proposal to remove scoring completely: as with almost anything, there are positive outcomes and negative ones. The positives I see: it would greatly diminish the number of misbehaving hosts, so the number of result abortions and missed deadlines should go down. Negatives: as was mentioned, competition-oriented people without real interest in the science of this project will leave it, taking away their computational resources. The very initial goal of the scoring system was to attract such people and their resources. So some balance needs to be found: what weighs more, the harm from excessive competition or the additional computational resources that competition brings to the project. To find this balance, statistical data for the project as a whole are required, and maybe even some sociological surveys (to understand what part of the project's computational power comes purely from competition). So the answer is not as obvious as one might think. SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1479066 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1479068 - Posted: 19 Feb 2014, 9:56:58 UTC msattler, please stop polluting the thread. SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1479068 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1479069 - Posted: 19 Feb 2014, 9:59:31 UTC - in response to Message 1479068. Last modified: 19 Feb 2014, 10:04:46 UTC "msattler, please stop polluting the thread." What are you saying??? I made an honest post about my opinion about rescheduling. I appreciated your programming efforts, so why make a comment like that to me? You really do not want to piss ME off now. Get on with your argument, but do not piss on ME if I disagree with it, Raistmer. I have not dissed you. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1479069 ·

draco Volunteer tester Send message Joined: 6 Dec 05 Posts: 119 Credit: 3,327,457 RAC: 0	Message 1479071 - Posted: 19 Feb 2014, 10:03:58 UTC As I see the situation: it is very bad if a host takes more workunits than it can process before the deadline. If it processes all its work before the deadline and has no aborted tasks and so on, I do not see any harm to the project or to anything else. And vice versa: if a host has aborted tasks, does not process them before the deadline, and so on, then whatever the reason, it is a bad thing. As I understand it, Raistmer does not have problems with workunit processing? Then what do you blame him for? Do you want to play games like in kindergarten? "Archy got a larger shovel than me!?" :D Grow up... Sorry for the harsh opinion, but that is what I see... ID: 1479071 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1479072 - Posted: 19 Feb 2014, 10:06:38 UTC - in response to Message 1479060. Last modified: 19 Feb 2014, 10:07:28 UTC Well, among the quite rough expressions of that off-topic post there is one point worth discussing. "I could get hundreds upon hundreds of AP tasks if I chose to cheat. You are saying that I am not doing the best I can for the project because I do NOT?" The author of this sentence shows a complete misunderstanding of the purpose of rescheduling. Why? 1) Hoarding as many AP tasks as possible should never be the goal; I never called for such actions. The point of rescheduling is to keep every device busy with the work it does best. 2) AFAIK the author of that sentence uses mostly NV cards. Perhaps he missed my recommendations about what is best for which computational resource. If he stopped expressing his opinion in rude form and looked at the provided link, he would see that for NV GPUs I recommend CUDA MB processing, though it will negatively affect host RAC! Missed that? Well, read more carefully then. http://lunatics.kwsn.net/2-windows/what-is-best-hardware-for-what-seti-application.0.html SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1479072 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1479073 - Posted: 19 Feb 2014, 10:10:31 UTC - in response to Message 1479071. "As I see the situation: it is very bad if a host takes more workunits than it can process before the deadline. If it processes all its work before the deadline and has no aborted tasks and so on, I do not see any harm to the project or to anything else. And vice versa: if a host has aborted tasks, does not process them before the deadline, and so on, then whatever the reason, it is a bad thing. As I understand it, Raistmer does not have problems with workunit processing? Then what do you blame him for? Do you want to play games like in kindergarten? "Archy got a larger shovel than me!?" :D Grow up... Sorry for the harsh opinion, but that is what I see..." Well, perhaps my English is easier to understand for non-native English speakers (I suppose you are a non-native English speaker, as I am) than for native ones. Or maybe some people just can't read carefully and take some time to think about what they have just read. And yes, I tried to say what you are saying, in slightly different wording. Thanks. SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1479073 ·

Geek@Play Volunteer tester Send message Joined: 31 Jul 01 Posts: 2467 Credit: 86,146,931 RAC: 0	Message 1479074 - Posted: 19 Feb 2014, 10:11:37 UTC Although I believe the current work limits to be low, I would not reschedule any work I have, even if I could. The SETI servers have got me pretty much dialed in as far as estimated time to complete a task. Rescheduling work would mess this up. I would like to see the current limits set by SETI doubled. For those who do reschedule work, I would prefer that it not be done, as it reportedly screws with credits awarded. However, as long as work is not aborted and is processed and returned to the project before the deadline, there remain very few grounds for objection. In short, I am in favor of getting the work properly crunched and reported. Credit granted is secondary as far as I am concerned. Science is my primary goal here. Boinc....Boinc....Boinc....Boinc.... ID: 1479074 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1479077 - Posted: 19 Feb 2014, 10:17:35 UTC I am done here. My opinion has been made quite clear. Anybody that reschedules and garners more than the current limits is a cheat. They may do what the scheduler allows them to get away with, but I shall not respect them. Final word, and you can take from it what you may, Raistmer. There is no excuse or acceptance from me for anybody rescheduling to garner more AP tasks in their caches. I stand by the servers sending me what is right for my rigs, and shall crunch WHATEVER they send me. Believe me... I could have hundreds of thousands of AP bits of work in my caches 'if I chose to cheat'. I could probably have hundreds of thousands. But the kitties don't work that way. Have a good day, Raistmer. Done here. Quite done here. Meow. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1479077 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1479078 - Posted: 19 Feb 2014, 10:24:29 UTC - in response to Message 1479074. Last modified: 19 Feb 2014, 10:25:28 UTC "Although I believe the current work limits to be low," My first reaction to the limits, when I was first hit by them, was "remove them completely". After thinking more carefully about their purpose I realised why they are needed (initially I was deceived by the urban myth that they are for "fair work distribution", which I considered nonsense; they are needed for another reason, which I described in the first post). Hence it's a balance between work being trashed by misbehaving hosts and work not being received by some high-performance AP-oriented hosts. Where to draw the line is a hard question. I suppose the limits could be raised (or at least implemented on a per-device rather than per-type basis), but without statistics on misbehaving hosts it's hard to be sure that this would be good for the project. "The SETI servers have got me pretty much dialed in as far as estimated time to complete a task. Rescheduling work would mess this up." Thanks, that's a strong objection against rescheduling. But let's answer it via a case study. My HD6950 host has been rescheduling AP tasks for some time already. I will report in this thread whether its time estimates get screwed up or not. It would be good to understand what can prevent estimate distortion, if there is any. "In short, I am in favor of getting the work properly crunched and reported. Credit granted is secondary as far as I am concerned. Science is my primary goal here." As do I. Rescheduling requires additional operator time and effort; it would be nice if it were not needed. I hope the project config will be adapted in time. One of the proposed methods for that is to reduce the number of AP splitters. SETI apps news We're not gonna fight them. We're gonna transcend them. ID: 1479078 ·

Wiggo Send message Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489	Message 1479082 - Posted: 19 Feb 2014, 10:50:49 UTC "The top users on the project live by the limits" It'd be nice if that was true Mark, but andybutt is another up to no good as well as a few others that run themselves and/or their rigs anonymously. :-( ID: 1479082 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1479085 - Posted: 19 Feb 2014, 10:57:54 UTC - in response to Message 1479082. Last modified: 19 Feb 2014, 10:59:21 UTC "'The top users on the project live by the limits' It'd be nice if that was true Mark, but andybutt is another up to no good as well as a few others that run themselves and/or their rigs anonymously. :-(" Anonymously?? Well, it's nice for them to remain hidden. I, personally, reveal all. I hide nothing from nobody. All specs of all my rigs can be viewed. All tasks I have in cache are revealed. I simply have nothing to hide from anybody on this project. Ask me anything you want. If I can remember, I shall tell you. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1479085 ·

Wiggo Send message Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489	Message 1479086 - Posted: 19 Feb 2014, 11:02:53 UTC Thread requires moderation. Actually it should be totally deleted as it will only encourage other idiots to do the same thing. ID: 1479086 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.