Rescheduling - final attempt

Author	Message
William Volunteer tester Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0	Message 1479590 - Posted: 20 Feb 2014, 12:11:08 UTC Ok, guys. You're going to get another chance to discuss this controversial topic. Nicely, civilly and EMOTIONFREE. I mean it. I don't want anything that smacks even remotely of flaming, insults, accusations or similar. You are perfectly entitled to your various opinions, but please state them rationally. This thread is going to see hard moderation. If you can't turn off your emotions before posting, don't post. You'll only get modded. I reserve the right to have any posts removed that don't fit my criteria of an emotionfree debate. I realise this is difficult for some people. You will just have to try very hard if you wish to contribute. I reserve the right to have this thread closed if it turns out that you just cannot discuss this topic civilly and without getting upset. Think of it as a panel discussion. People who disturb the proceedings will be escorted off the premises. If the panel comes to blows, the proceedings will be halted without further ado. Before everybody starts talking, can we please have one post from each side - in favour and against - laying out your position and reasoning in a few words. First person to use the c-word owes me a drink ;) A person who won't read has no advantage over one who can't read. (Mark Twain) ID: 1479590 ·

juan BFP Volunteer tester Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1479599 - Posted: 20 Feb 2014, 12:52:22 UTC - in response to Message 1479590. Last modified: 20 Feb 2014, 13:06:37 UTC First person to use the c-word owes me a drink ;) LOL - I'm doomed... :) First of all, we need to separate "good and bad reschedulers". The first group is composed of those who need to use rescheduling to keep their hosts working, for several real reasons; the best example of this group is when no MB apps were available for their OS. They have no other option than to reschedule their jobs, so in this case it is good for the project (it keeps more hosts crunching). There are several others who really know how to reschedule and use the rescheduler just to store more WUs on their hosts, but crunch each WU on the right device (GPU WUs on GPU only), so in theory this still has no impact on the project. From my POV both groups are "good reschedulers", if they stay within the 200 WU limit of course, and if you look at it impartially, what they are doing is good for the project (applause to them). And there is a third group, those who only reschedule to obtain more RAC, the ones already called "cheaters". They are the ones we need to worry about. Their hosts could crunch MB with no problems, so why are they stockpiling so many AP WUs and artificially creating a cache of 15 or more days, if that is not needed after COLO? That I need to understand better, but my only clue is that it is to obtain more RAC. Some of them (mostly anonymous) could even return all the crunched WUs in time, so they could say "we are doing no harm to the project", but they forget one point: we live in a community, and they have no more right than the others to DL WUs. Just the 3 top anonymous rescheduler hosts hold in their caches about 2% of the entire project's available WUs and produce only 0.1% of the project output. That can't be right!
And there is the 100+100 WU limit. If the limits are in place, they are there for a reason: to keep the DB as small as possible (or at least that is what was reported), so building large caches that exceed these limits is clearly "against" the project rules, and so it is clearly wrong. In my opinion, there are no excuses for breaking these limits (I agree these limits s@#@!! and I have already asked several times on this forum to raise them a little, but that is for another thread), so anyone who breaks the limit is wrong, no matter if it is for a good cause; rules are rules and are there to be followed. Just imagine (I'm not saying anyone is going to do that) if any of the top crunchers decided to do the same: just one like me could accumulate 2-5% of the entire project's AP WUs, and some could accumulate up to 10% (just do the math), so where will this end? But we are lucky that almost all the top crunchers are not cheating; that is the nice part, congrats to all. I hope I was able to start a peaceful exchange of words. ID: 1479599 ·

FeK9 Joined: 20 May 99 Posts: 40 Credit: 61,229,677 RAC: 26	Message 1479637 - Posted: 20 Feb 2014, 15:55:43 UTC - in response to Message 1479599. Here in ZA the c-word is not used, instead the d-word... :) Jokes apart, I agree with 'juan BFP'. What the third group is doing is not 'Cool'... :( Noli tangere circulos meos... ID: 1479637 ·

betreger Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1479670 - Posted: 20 Feb 2014, 17:39:09 UTC Last modified: 20 Feb 2014, 17:54:27 UTC I see 2 science projects competing, SETI vs BOINC (computer science). BOINC is a great idea and if they can get credit new fixed it will be a better idea. A problem I see with rescheduling is that it messes up the validity of the credits, so the BOINC developers don't get as good data as they could to fix credit new. So a question is: should the science of SETI take precedence over BOINC development? ID: 1479670 ·

TBar Volunteer tester Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1479672 - Posted: 20 Feb 2014, 17:43:30 UTC I'd like to repeat a few worthy concepts from the previous threads that would attempt to remove the need to Reschedule. 1) Slow down the AstroPulse Splitters to lengthen the time AP tasks are available. This could be accomplished by simply changing one AP splitter to MB. If you haven't noticed, the MB splitters are once again falling behind and the ready to send number will soon be Zero. 2) Do not load more than around 200-300 channels in the splitters at once. This will shorten the periods when AstroPulses are not available. It will also help in times when the MB splitters fall behind, as they are currently. The last time the MB ready to send was approaching Zero bringing more AP channels on line solved the problem. Just as it was before, there are currently around 250 MB channels remaining. If fewer channels had been loaded, it would be time to load more channels containing APs which would help the MB splitters maintain a buffer. 3) Change the limits to allow more GPU tasks than CPU tasks. For most hosts, 100/100 is not realistic. Ratios such as 150/50 would better suit actual workflow. Some Dual core hosts would be fine with 20 CPU tasks and 180 GPU tasks. My 2¢. ID: 1479672 ·

Jeff Buck Volunteer tester Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1479677 - Posted: 20 Feb 2014, 18:03:55 UTC I haven't joined in the previous rescheduler discussions, or even read very much of them, because I didn't think I had any interest in the topic. However, when the GPUs on my top cruncher ran out of work during this week's Tuesday outage (because of all the shorties in the queue), I realized that I do have an interest. While the GPUs burned through their allotted 100 tasks in about 3.5 hours, the CPUs (which never push the 100 task limit to begin with) still had over 30 tasks ready to start. If I could've reassigned those tasks to the GPUs, it would have kept them crunching for close to another hour, just enough to last through the outage. It's not about APs vs. MBs, or exceeding the mandated limits, or even about buying William a drink (which I see betreger has already volunteered to do), it's just about rescheduling tasks in a way that would have helped maximize my contribution to S@H by keeping my GPUs from going idle. ID: 1479677 ·

betreger Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1479688 - Posted: 20 Feb 2014, 18:32:36 UTC - in response to Message 1479677. even about buying William a drink (which I see betreger has already volunteered to do) Huh? ID: 1479688 ·

Jeff Buck Volunteer tester Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1479693 - Posted: 20 Feb 2014, 18:40:26 UTC - in response to Message 1479688. even about buying William a drink (which I see betreger has already volunteered to do) Huh? First person to use the c-word owes me a drink ;) Perhaps I didn't understand William correctly, but I thought the c-word he referred to showed up 3 times in your post. :^) ID: 1479693 ·

TBar Volunteer tester Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1479703 - Posted: 20 Feb 2014, 18:52:19 UTC - in response to Message 1479688. Last modified: 20 Feb 2014, 19:01:36 UTC Don't worry, it's probably one of those virtual drinks. I'll take care of it for you; There, problem solved. :-) ID: 1479703 ·

David S Volunteer tester Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12	Message 1479714 - Posted: 20 Feb 2014, 19:07:24 UTC - in response to Message 1479703. Don't worry, it's probably one of those virtual drinks. I'll take care of it for you; There, problem solved. :-) Come over to Rocky's in the Café and get all the free drinks you want, mixed by a roo and served by a koala. In fact, I'm going to repost this nice image there. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. ID: 1479714 ·

shizaru Volunteer tester Joined: 14 Jun 04 Posts: 1130 Credit: 1,967,904 RAC: 0	Message 1479746 - Posted: 20 Feb 2014, 20:01:10 UTC I wanted to post this in Raistmer's thread but then it got locked, so I sent him this PM yesterday. (Now that this thread has opened up I'm posting it here.) Since you are making a lot of sense, I had a quick think to figure out why no-one understands what you are saying. I may have found one of the reasons.... BOINC's inability to enforce limits on a per-app basis. You know this well but it hasn't been mentioned? (Didn't follow Juan's thread.) Example: SETI@home v7 7.00 windows_intelx86: 50 tasks; SETI@home v7 7.00 windows_intelx86 (cuda23): 200 tasks; AstroPulse v6 6.04 windows_intelx86 (opencl_nvidia_100): 100 tasks; etc. Of course I just made the numbers up. They are not important. The fact that they should be treated differently is what is important. You and Jason (I think) have complained about this before but it was probably over on Beta... Maybe you should explain it here? Sorry if I misunderstood something! ID: 1479746 ·
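For readers following along, here is a minimal sketch of what a per-app limit check like the one shizaru describes might look like. It is not BOINC code: the table `PER_APP_LIMITS`, the function `may_send`, and the `"cpu"` label for the plain CPU app are all hypothetical, and the numbers are the made-up ones from the post.

```python
# Hypothetical per-app-version limits, keyed by (application, plan class).
# The numbers are the made-up ones from the post, not real project settings.
PER_APP_LIMITS = {
    ("SETI@home v7", "cpu"): 50,
    ("SETI@home v7", "cuda23"): 200,
    ("AstroPulse v6", "opencl_nvidia_100"): 100,
}


def may_send(app, plan_class, in_progress):
    """Return True if one more task of this app version may go to the host.

    in_progress maps (app, plan_class) to the number of tasks the host
    already holds for that app version.
    """
    limit = PER_APP_LIMITS.get((app, plan_class))
    if limit is None:
        return False  # unknown app version: send nothing in this sketch
    return in_progress.get((app, plan_class), 0) < limit
```

The point of the sketch is only that each app version gets its own counter and its own ceiling, instead of one CPU bucket and one GPU bucket.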

Wiggo Joined: 24 Jan 00 Posts: 34747 Credit: 261,360,520 RAC: 489	Message 1479753 - Posted: 20 Feb 2014, 20:13:02 UTC Last modified: 20 Feb 2014, 20:16:32 UTC From my POV both groups are "good reschedulers", if they stay within the 200 WU limit of course, and if you look at it impartially, what they are doing is good for the project (applause to them). If they stayed within the limits there would not be any of this friction. The thing is that they're not. 1 rig has been regularly seen with over 1000 APs in progress, 2 others regularly with over 700 APs and a couple with over 300 APs, so far. Cheers. ID: 1479753 ·

David S Volunteer tester Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12	Message 1479804 - Posted: 20 Feb 2014, 21:25:59 UTC - in response to Message 1479753. From my POV both groups are "good reschedulers", if they stay within the 200 WU limit of course, and if you look at it impartially, what they are doing is good for the project (applause to them). If they stayed within the limits there would not be any of this friction. The thing is that they're not. 1 rig has been regularly seen with over 1000 APs in progress, 2 others regularly with over 700 APs and a couple with over 300 APs, so far. Cheers. I think that Juan's point is that a lot of them DO stay within the limits. It's only a few bad apples that are tarnishing the reputation of the rest. (Kind of like the debate over gun control.) David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. ID: 1479804 ·

juan BFP Volunteer tester Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1479821 - Posted: 20 Feb 2014, 22:17:25 UTC Last modified: 20 Feb 2014, 22:24:47 UTC Yes, if they stay within the limits and just use the rescheduler to keep their hosts running, it's my personal opinion that they are not doing anything bad for the project; on the contrary, they just add more computing power to the project, which is good. It's simply a better use of the resources each one has available, like those of us who use optimized apps vs stock. So I have nothing against that, on the contrary. The big argument against it was the credit mess it could cause, but the credit system is already screwed and we all know that now, so screwing it a little more will have little impact on the project itself. Do not forget, there are some examples posted on the forums showing that the credit mess from rescheduling is not confirmed, at least on AP WUs. I'm just not sure the bad apples are really few; I found some more examples just in the top 40 hosts. But quantity does not matter: always remember that just one rotten apple can spoil an entire bag of good apples. ID: 1479821 ·

Bernie Vine Volunteer moderator Volunteer tester Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328	Message 1479824 - Posted: 20 Feb 2014, 22:20:35 UTC - in response to Message 1479804. From my POV both groups are "good reschedulers", if they stay within the 200 WU limit of course, and if you look at it impartially, what they are doing is good for the project (applause to them). If they stayed within the limits there would not be any of this friction. The thing is that they're not. 1 rig has been regularly seen with over 1000 APs in progress, 2 others regularly with over 700 APs and a couple with over 300 APs, so far. Cheers. I think that Juan's point is that a lot of them DO stay within the limits. It's only a few bad apples that are tarnishing the reputation of the rest. (Kind of like the debate over gun control.) And I believe it was Raistmer's point as well, although something may have got lost in translation. From what I have read here most seem to be in agreement: rescheduling within limits is OK. Outside of that it is bad for the project and considered cheating, and there should perhaps be a way of stopping it. So what is the way forward? ID: 1479824 ·

OzzFan Volunteer tester Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28	Message 1479826 - Posted: 20 Feb 2014, 22:26:46 UTC I'm left wondering if there's a way to solve this through the BOINC framework. Would it be possible to modify the BOINC client to look at the entire cache of workunits as a single pool, and assign work to whatever resource needs it at the moment it's ready to be worked on? I understand that this kind of change would be much larger than my simple question, involving lots of re-designing of several of BOINC's calculations for work-fetch, etc., but it could be done. Is this something that should be suggested to the developers (David) as a goal to work toward or something to implement in the next major release? ID: 1479826 ·
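A rough illustration of the single-pool idea OzzFan raises, purely as a sketch: the client keeps one queue of workunits, and whichever resource goes idle takes the first task it is able to run. Nothing here is the BOINC client's actual design; `next_task_for`, the `runs_on` field and the task names are all hypothetical.

```python
def next_task_for(resource, pool):
    """Remove and return the first queued task that `resource` can run.

    pool is a list of dicts; each task lists the resources it has an
    application for in task["runs_on"]. Returns None if nothing fits.
    """
    for i, task in enumerate(pool):
        if resource in task["runs_on"]:
            return pool.pop(i)
    return None


# One shared cache instead of separate CPU and GPU queues.
pool = [
    {"name": "wu_1", "runs_on": {"cpu"}},
    {"name": "wu_2", "runs_on": {"cpu", "gpu"}},
    {"name": "wu_3", "runs_on": {"cpu", "gpu"}},
]
```

With a pool like this, an idle GPU would pick up `wu_2` even though it was not fetched as a GPU task, which is roughly the situation Jeff Buck describes earlier in the thread. The real work-fetch and runtime-estimate calculations that OzzFan mentions are left out entirely.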

juan BFP Volunteer tester Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1479827 - Posted: 20 Feb 2014, 22:32:37 UTC - in response to Message 1479826. Last modified: 20 Feb 2014, 22:33:16 UTC I don't believe it's too hard to implement. Since the servers actually know how many WUs each host has pending, blocking cheating is simple. Think about it: the @¨&@%!!! limits use almost the same function, so you just need to block the DL of new WUs from a host that has already exceeded the 200 WU limit in its pending cache. Some tweaks could be needed to handle crashed hosts, etc., but deep down it's a very simple function. ID: 1479827 ·
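The check juan describes could look something like the sketch below. It assumes the server has, for each host, the deadlines of the tasks it believes are still in progress; the function name `tasks_allowed` and the rule that overdue tasks stop counting (one possible "tweak" for crashed hosts) are assumptions for illustration, not how the SETI@home scheduler is known to behave.

```python
from datetime import datetime, timedelta

HOST_LIMIT = 200  # the 100 CPU + 100 GPU limit discussed in this thread


def tasks_allowed(in_progress_deadlines, now, requested, limit=HOST_LIMIT):
    """Return how many new tasks to send to a host in this sketch.

    Tasks whose deadline has already passed are not counted, so a host
    that crashed and lost its cache is not locked out forever.
    """
    live = sum(1 for deadline in in_progress_deadlines if deadline > now)
    return max(0, min(requested, limit - live))
```

A host already holding 200 live tasks would get nothing, however the tasks are split between CPU and GPU on the client side.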

William Volunteer tester Joined: 14 Feb 13 Posts: 2037 Credit: 17,689,662 RAC: 0	Message 1479828 - Posted: 20 Feb 2014, 22:33:22 UTC My experience with making suggestions to David is that they come back and bite you in the face! A person who won't read has no advantage over one who can't read. (Mark Twain) ID: 1479828 ·

juan BFP Volunteer tester Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1479829 - Posted: 20 Feb 2014, 22:34:38 UTC - in response to Message 1479828. Last modified: 20 Feb 2014, 22:36:20 UTC My experience with making suggestions to David is that they come back and bite you in the face! Looks like we are doomed x 2!!! So the best I could do now is go smoke a c-word and drink some beers. Cheers. ID: 1479829 ·

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1479832 - Posted: 20 Feb 2014, 22:41:13 UTC - in response to Message 1479827. I don't believe it's too hard to implement. Since the servers actually know how many WUs each host has pending, blocking cheating is simple. Think about it: the @¨&@%!!! limits use almost the same function, so you just need to block the DL of new WUs from a host that has already exceeded the 200 WU limit in its pending cache. Some tweaks could be needed to handle crashed hosts, etc., but deep down it's a very simple function. Right, I'll craft a lengthy post detailing the way I see the whole situation. It'll take a while to present the things needed in a diplomatic, non-inflammatory way, but I think it'll help at this point, and explain why I've stayed out of this particular circus (until now). "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1479832 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.