Question on High Priority Mode
Author | Message |
---|---|
Gundolf Jahn Send message Joined: 19 Sep 00 Posts: 3184 Credit: 446,358 RAC: 0 |
The thing is that we used to have EDF Mode (Earliest Deadline First) which used to work. It used to work before CUDA. So we now have 'It worked and we fixed it' and now it doesn't. It never worked after the introduction of CUDA but is getting better with every new release (some more, some less :-). Regards, Gundolf Computers aren't everything in life. (Just a little joke.) SETI@home classic workunits 3,758 SETI@home classic CPU time 66,520 hours |
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
The thing is that we used to have EDF Mode (Earliest Deadline First) which used to work. Danke, I think I had realised that. Would have thought, not being a programmer, that it was fairly easy to select the project, application and task with the earliest date and time (especially as this is just a Julian number). |
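To make the idea in the post above concrete, here is a minimal sketch of "pick the task with the earliest deadline". It is not BOINC's code: the `Task` class, the task names and the Julian dates are all hypothetical, chosen only to mirror the 14th versus 19th example discussed in this thread.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    project: str
    deadline: float  # report deadline as a Julian date (hypothetical values)


def earliest_deadline_first(tasks):
    """Return the tasks ordered by deadline, soonest first."""
    return sorted(tasks, key=lambda t: t.deadline)


# Hypothetical work queue: the task downloaded first has the nearer deadline.
tasks = [
    Task("vhar_downloaded_19th", "SETI@home", 2455319.5),
    Task("ap_task", "SETI@home", 2455330.0),
    Task("vhar_downloaded_14th", "SETI@home", 2455314.5),
]

queue = earliest_deadline_first(tasks)
print(queue[0].name)
```

Because the deadline is a single number, the comparison itself is trivial; the hard part in a real client is everything this sketch ignores, such as multiple CPUs and GPUs, several projects, and run-time estimates that change.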
Bill Walker Send message Joined: 4 Sep 99 Posts: 3868 Credit: 2,697,267 RAC: 0 |
The basic question remains: are tasks missing deadlines because of the behaviour you have noted? None of mine are. Or to put it in terms of software requirements: is the goal of High Priority Mode to do the closest deadlines first, or to minimize the number of WUs that miss deadlines? Those are two different requirements. |
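The two requirements named above really can give different answers. The sketch below uses a textbook single-CPU model with made-up jobs (none of this is taken from BOINC): it counts late jobs when everything is run strictly in deadline order, and compares that with the Moore-Hodgson rule, a standard algorithm for minimising the number of late jobs on one machine.

```python
import heapq


def count_late_edf(jobs):
    """Run every job in deadline order and count those finishing late.

    jobs: list of (runtime, deadline) pairs, all available at time 0.
    """
    clock = 0
    late = 0
    for runtime, deadline in sorted(jobs, key=lambda j: j[1]):
        clock += runtime
        if clock > deadline:
            late += 1
    return late


def count_late_moore_hodgson(jobs):
    """Minimum number of late jobs on one machine (Moore-Hodgson rule).

    Walk the jobs in deadline order; whenever the schedule runs past a
    deadline, postpone the longest job accepted so far to the very end.
    """
    clock = 0
    kept = []  # max-heap of accepted runtimes, stored negated
    for runtime, deadline in sorted(jobs, key=lambda j: j[1]):
        heapq.heappush(kept, -runtime)
        clock += runtime
        if clock > deadline:
            clock += heapq.heappop(kept)  # popped value is -(longest runtime)
    return len(jobs) - len(kept)


# Hypothetical jobs as (runtime, deadline), in hours.
jobs = [(5, 5), (2, 6), (2, 7)]
print(count_late_edf(jobs), count_late_moore_hodgson(jobs))
```

With these numbers, strict deadline order finishes the long job on time but makes both short jobs late, while postponing the long job leaves only that one job late.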
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
The basic question remains: are tasks missing deadlines because of the behaviour you have noted? None of mine are. None of my tasks are liable to miss their deadlines, but that is not really the point. As far as I can tell the BOINC Priority Mode is not working correctly, therefore there is a possibility that users will miss deadlines. Therefore it should be fixed. As SETI, on the MB tasks at least, sets deadlines in accordance with the variations caused by the angle range, the tasks that are processed fastest (VHAR) are given the shortest deadlines. This means the VHAR tasks that I downloaded on the 14th should be processed before the tasks downloaded on the 19th; they are not. That would also fit the requirement to minimise the number of tasks that miss deadlines. Personally, although I can see that in certain circumstances EDF and minimising the number of missed deadlines could be different, they are not, at least for the projects I do and have crunched for. |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Run with <rr_simulation> for a while and post (part of) that log. |
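For anyone unsure how to switch that flag on, the sketch below generates a small `cc_config.xml`. It assumes the usual layout of that file, with log flags nested under `<cc_config><log_flags>` and set to `1`; the helper name `build_cc_config` is hypothetical. The file would normally go in the BOINC data directory, and the client has to re-read its configuration (or be restarted) before the flag takes effect.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build cc_config.xml text that enables the given log flags.

    Assumes the <cc_config><log_flags>...</log_flags></cc_config> layout.
    """
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


xml_text = build_cc_config(["rr_simulation"])
print(xml_text)
```

The extra log lines are verbose, so it is probably worth removing the flag again once the part of the log covering a Priority Mode event has been captured.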
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14656 Credit: 200,643,578 RAC: 874 |
Are any of the contributors to this thread running BOINC v6.10.45, which should have changeset [trac]changeset:21085[/trac]: - Client: fix bug that caused wrong jobs to be run EDF |
woodenboatguy Send message Joined: 10 Nov 00 Posts: 368 Credit: 3,969,364 RAC: 0 |
The basic question remains: are tasks missing deadlines because of the behaviour you have noted? None of mine are. I apologise if the following sounds harsh, but the bolded text above is simply speculation and conclusion based on speculation. If this were one of my projects I would have to insist you come up with something factual before the project invests time into reviewing and resolving (if needed). You haven't done more than state your beliefs. That isn't the basis for action from the project. Again, I apologise for how this might come across. Writing this instead of talking about it face to face always makes it seem harsh. The intent is exactly the opposite. I am interested in what you are saying; I just don't see the evidence of it, or the acknowledgement that many more factors than those reviewed thus far are equally (if not more) important. You yourself say that tasks aren't missing their deadline. That alone must be a pretty important design requirement that's being met. The most effective utilization of resources is what I think you are getting at, but again, we don't know all the design requirements to be able to say if that is at issue. Bill Walker makes the point much clearer than I've done. Regards, |
woodenboatguy Send message Joined: 10 Nov 00 Posts: 368 Credit: 3,969,364 RAC: 0 |
Are any of the contributors to this thread running BOINC v6.10.45, which should have changeset [trac]changeset:21085[/trac]: I am: http://setiathome.berkeley.edu/show_host_detail.php?hostid=5342012 I was waiting the usual period to see if anything untoward happened to the rig before extending it to the rest, and especially my main cruncher. Overall I hadn't noticed anything unfortunate. The box is coming back up after some changes made to it, so I was also waiting for its RAC to stabilize again before deciding whether things were all good before taking .45 through to the rest. Anything noticeable for anyone else? [edit] grrrrrrr.... I THOUGHT I was....I had to reinstall the OS on that box and purposely downloaded .45 - I wonder how I picked up the wrong one off the network?! [/edit] Regards, |
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
Are any of the contributors to this thread running BOINC v6.10.45, which should have changeset [trac]changeset:21085[/trac]: Just looked at that host of yours. Do you have an explanation why your VHAR (AR=1.389265) task in wuid=600431810 has not been processed yet but other VHARs issued later have, like resultid=1581920884? [edit] If I didn't state earlier, I am using 6.10.43, the latest official release. |
woodenboatguy Send message Joined: 10 Nov 00 Posts: 368 Credit: 3,969,364 RAC: 0 |
Are any of the contributors to this thread running BOINC v6.10.45, which should have changeset [trac]changeset:21085[/trac]: It's the ReSchedule tool from Lunatics, I believe (haven't dived completely into that but it arose once I ran it). I think what is happening is that a task "in-flight" is being caught as one to move off the GPU and onto the CPU. BOINC, however, is keeping the CPUs busy at that same moment and hence the in-flight task goes to its wait-state. BOINC has then swept through thereafter and picked up a waiting task (or more, I've noticed at times) and finished it. Only my speculation though, as I don't know the internal workings enough to be certain. Whenever I run ReSchedule I see a few fall into that. A good example of where BOINC is doing its level best and some rascal is changing the rules on it! Regards, |
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
I don't run the re-schedule tool because I do AP on the CPU and MB on the GPU. But my guess is, as your computer is showing the same symptoms as mine, that it is not the re-scheduler but BOINC. |
Bill Walker Send message Joined: 4 Sep 99 Posts: 3868 Credit: 2,697,267 RAC: 0 |
None of my tasks are liable to miss their deadlines, but that is not really the point. Then what is the point of High Priority Mode? |
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
None of my tasks are liable to miss their deadlines, but that is not really the point. One scenario would be if a host lost power for a few days; then the VHAR tasks with their 14-day deadline could be in trouble if the host had a 10-day cache. |
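The arithmetic behind that scenario can be sketched in a few lines. The model is a deliberate simplification and an assumption of this note, not something stated in the thread: the task joins the back of a full cache, the cache is worked through in download order, and the host otherwise crunches continuously. The function name `max_outage_days` is hypothetical.

```python
def max_outage_days(deadline_days, cache_days, runtime_days=0.0):
    """Longest outage the last-queued task can absorb and still finish on time.

    The task waits cache_days behind earlier work, then needs runtime_days
    of processing, all inside a window of deadline_days.
    """
    return deadline_days - cache_days - runtime_days


# 14-day deadline, 10-day cache, run time treated as negligible.
slack = max_outage_days(14, 10)
print(slack)
```

Under those assumptions a 14-day deadline behind a 10-day cache leaves roughly four days of margin, so an outage longer than that puts the task at risk unless the scheduler moves it forward.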
woodenboatguy Send message Joined: 10 Nov 00 Posts: 368 Credit: 3,969,364 RAC: 0 |
I don't run the re-schedule tool because I do AP on the CPU and MB on the GPU. It is certainly possible. Regards, |
Bill Walker Send message Joined: 4 Sep 99 Posts: 3868 Credit: 2,697,267 RAC: 0 |
None of my tasks are liable to miss their deadlines, but that is not really the point. Right. That is exactly what happens to me quite regularly. My main cruncher is my laptop. When I work from home it may run 24 hours a day for a few weeks, with my cache filling up based on this. Then when I travel on business it may average only a few hours every other day for a week or two. That, plus running multiple projects with different ideas of what a "sensible" deadline is, means I go into Priority Mode quite often. Haven't missed a deadline in years, that I recall. My point is that, as far as I can tell, Priority Mode is what does this, even if its detailed workings aren't what we expect. If somebody is regularly missing deadlines, then we can complain to the programmers. By the way, most sources recommend shorter caches than 10 days. I'm sure there is some cache size that blows the Priority Mode assumptions, but that isn't a reason to change Priority Mode. It is a reason to change your cache size. The programmers have to cater to a number of users' "styles", not just the extremes. |
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
None of my tasks are liable to miss their deadlines, but that is not really the point. I have always run a cache lower than 7 days; two or three was my normal. The reason I increased it was the difficulty of obtaining AP tasks, but this was only to a max of 5 days, until the last outage, where I had to increase it to 7 days to keep the quad warm. I was using BOINC 5.10.13 at this stage. Then last week, on the 12th, I was upgrading a friend's computer: more RAM, Win7 and a graphics card. That left two 512 MB sticks of RAM and the NVidia 240 graphics card without homes. These items are now fitted, the GT 240 into my quad, and I upgraded to a CUDA-capable BOINC client. Reports seemed to say the latest official release was OK. So here's me with my first CUDA card and a new version of BOINC. I managed to install the required Lunatics apps and app_info with no problems and didn't lose any work. (Definite gold star for that.) Things went along quite nicely and tasks were being done; I noticed the sawtooth effect on DCF as CUDA tasks drove it down slowly and AP would bash it back up again. It wasn't until about the 17th that I spotted the first Priority Mode event. Two VHAR tasks had already been completed, and completion of the third put things back to normal running, and those VHAR tasks disappeared fast as BOINC also decided it was now short of tasks. This repeated a couple of times; sometimes I saw it but mainly it occurred when I was occupied elsewhere. Then just before the first post in this thread, I saw there was a group of VHARs from the same tape just about to be processed. Except it didn't happen. BOINC went into Priority Mode and selected VHAR tasks that had only been downloaded hours before. That should not have happened: on either of your requirements, doing the closest deadlines first or minimising the number of tasks that miss deadlines, the VHAR tasks closest to deadline needed doing first. |
woodenboatguy Send message Joined: 10 Nov 00 Posts: 368 Credit: 3,969,364 RAC: 0 |
I don't run the re-schedule tool because I do AP on the CPU and MB on the GPU. I brought down a new batch of tasks this morning and then reran ReSchedule. It does seem to cause a running task (presently at 89.890% complete) to get bumped to "waiting". This one: http://setiathome.berkeley.edu/result.php?resultid=1581886479 was running prior to ReSchedule. You can see in the message log that it hasn't been touched as BOINC gets restarted by ReSchedule: 21/04/2010 6:39:00 AM Starting BOINC client version 6.10.45 for windows_intelx86 A part of the environment over which BOINC has no control. Oh - this rig now has .45 as I'd thought it was loaded up with when I did the rebuild over the weekend. Regards, |
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
The question is, if it went into Priority Mode, what tasks did it then run? Are they the tasks with the shortest deadline? Are they VHAR? If they are VHAR, are there other VHAR with closer deadlines? |
woodenboatguy Send message Joined: 10 Nov 00 Posts: 368 Credit: 3,969,364 RAC: 0 |
The question is, if it went into Priority Mode, what tasks did it then run? Good questions. I did look at that briefly by sorting the list of tasks by their return date. I couldn't make a quick conclusion (headed out for work at that moment). There were tasks with earlier return dates not started and ones with slightly later dates that did start (you can see these in the log getting kicked off when BOINC was returned to running by ReSchedule). All the dates were very close but, as you've observed previously, the later ones were getting priority. If I recall correctly (a bit hazy here) I think the one I spotted that was paused had an earlier date than one of the new ones that got kicked off. On another thread (http://setiathome.berkeley.edu/forum_thread.php?id=59701) there seems to be some informed feedback on the algorithm. One of the points being made there is that BOINC has some sophistication around what scheduling strategy it switches to, based on the anticipated timings of when things need to be returned. There is no one-and-only approach but instead several it will switch between, based on events (simply getting a new download of tasks changes the landscape, since we can get all sorts of new timing needs as a result). What I took from the discussion there was that it wasn't exclusively FIFO, even when it projects none of its tasks will miss their deadline. It is comfortable adjusting as events change, even when something like ReSchedule changes the game and moves something away from a speedy GPU onto a slower CPU. You are making me want to know more about this! Regards, |
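As a rough illustration of "switching strategy based on anticipated timings", the sketch below simulates the queue in its current order and only reorders by deadline if that simulation projects a miss. It is a simplified stand-in under stated assumptions (one CPU, first-in-first-out as the default order, made-up jobs, hypothetical function names) and does not reproduce the round-robin simulation the BOINC client actually performs.

```python
def projected_misses(queue):
    """Names of jobs that would finish late if run in the given order.

    queue: list of (name, runtime_hours, deadline_hours) tuples.
    """
    clock = 0.0
    late = []
    for name, runtime, deadline in queue:
        clock += runtime
        if clock > deadline:
            late.append(name)
    return late


def choose_order(queue):
    """Keep download order unless a miss is projected, then use deadline order."""
    if not projected_misses(queue):
        return "fifo", list(queue)
    return "edf", sorted(queue, key=lambda job: job[2])


# Hypothetical queue in download order.
queue = [("a", 4.0, 20.0), ("b", 2.0, 5.0), ("c", 1.0, 30.0)]
mode, order = choose_order(queue)
print(mode, [name for name, _, _ in order])
```

Every new download or change in run-time estimate alters the simulated outcome, which is one plausible reason the client's choices can look inconsistent from one hour to the next.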
W-K 666 Send message Joined: 18 May 99 Posts: 19144 Credit: 40,757,560 RAC: 67 |
Had a few words with a tired Jord, upgraded to ver 6.10.45 and put in the cc_config.xml file as posted in msg 990648. My present theory, after running for a short while, is that it 'tries' to predict the run time of each task, and then if Priority Mode is triggered it will pick the shortest task. If my theory is correct, I am not at all convinced that it will ensure all tasks meet their deadlines. If the task with the shortest deadline is very close to that deadline (i.e. if not done now it will miss the deadline) and is not the task with the shortest predicted runtime, then it will miss the deadline, unless there is a second line of backup. I am not opposed to this method of predicting the runtime of a task but think it needs integrating with the deadline, so that a task with a later deadline but longer predicted runtime is possibly started before a task with a nearer deadline but shorter runtime. [Edit] Will keep it running, but unless a GPU MB task takes longer than predicted I don't expect the computer to go into Priority Mode for about 4 hours. |
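The concern in the post above can be checked with a toy single-CPU model. The two tasks and their hours are invented for illustration, and "least slack" (deadline minus predicted runtime) is only one possible reading of "integrating the runtime with the deadline"; none of this is taken from the BOINC source.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    runtime: float   # predicted hours of processing still needed
    deadline: float  # hours from now until the report deadline


def missed(tasks, key):
    """Run the tasks back to back in the order given by key; return the late ones."""
    clock = 0.0
    late = []
    for task in sorted(tasks, key=key):
        clock += task.runtime
        if clock > task.deadline:
            late.append(task.name)
    return late


# Hypothetical pair: the urgent task is also the long one.
tasks = [Task("long_urgent", 3.0, 3.5), Task("short_relaxed", 1.0, 6.0)]

shortest_first = missed(tasks, key=lambda t: t.runtime)
edf = missed(tasks, key=lambda t: t.deadline)
least_slack = missed(tasks, key=lambda t: t.deadline - t.runtime)
print(shortest_first, edf, least_slack)
```

In this example picking the shortest task first makes the urgent task late, while both deadline order and least-slack order finish everything on time. That supports the worry only if the client really does choose by predicted runtime, which remains a theory at this point in the thread.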