GUPPI Rescheduler for Linux and Windows - Move GUPPI work to CPU and non-GUPPI to GPU
Author | Message |
---|---|
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Thanks for the clarification Jeff. I'm surprised I never saw client_state_next.xml or heard of it. (I don't see a reason for having 2 backups for a fraction of a second but oh well, there's probably a good reason for this. It might be a way to keep the client_state.xml file from getting highly fragmented on HDDs. [edit] Or it's probably because it is much faster to simply rename a file than to overwrite it with fresh content from RAM, which in effect shortens the time when something could go wrong. It would also ensure that if there is a client_state.xml then it is expected to be complete.)[/e] If I understand the full picture correctly: After Boinc has restarted, the client_state_prev.xml instance will get replaced fairly quickly. (Do you know what the triggers are for the creation of a new client_state_next.xml?) The first time client_state_prev.xml gets replaced, it will be in effect a "better backup" since it will have the client_state.xml content from when Boinc wasn't running. But as soon as there is a second replacement, there is no backup left from before the time when Boinc was restarted. If that's the case, then client_state_prev.xml should not be relied on as a backup of client_state.xml. Am I missing something again? |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
If the information message can't be removed entirely, then at least put into the help file that the message can be expected as normal if there are minimal or no tasks to move. As of now, it unnecessarily causes alarm to everyone who starts using the apps until they have more experience with them and likely have posted to the forum already why they are seeing the message and then receive the previous response from the authors et al. I think there are 2 issues (maybe bugs) here: 1. the instance where sometimes GuppiR reports something like: ~ 0 tasks moved to GPU and 0 tasks moved to CPU ~ when there is definitely something to move... and in my experience, sometimes hundreds of tasks. (I don't have the exact quote captured anywhere but I will report my "not-in-scope" experience in a separate post after this one); and 2. yoda51's "Error: could not determine CPU version_num from client_state. Nothing changed." From my experience, that is a bug and in this case it might be indirectly caused by yoda51's bloated app_info.xml. I wrote "might be indirectly" since MrK's GuppiR...exe works well in a separate directory with just 2 files: - client_state.xml and - sched_request_setiathome.berkeley.edu.xml (It could look for other files in the default Boinc directory that I would be unaware of from my little test). I find it very interesting that yoda51 HAD a bloated app_info.xml and replacing it seems to have fixed the issue. I suspect a separate source for that issue than MrK's GuppiR since it doesn't modify that file. @yoda51: if you still have a copy of the bloated app_info.xml then someone might be interested in looking at it. (not me though since I'm still learning a lot about client_state.xml) R :-) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . The rescheduler will only move equal numbers of tasks, that is one nonVLAR from CPU to GPU then one guppi from GPU to CPU to keep the numbers balanced. If there are no nonVLAR tasks in the CPU queue then it will not move any jobs, and the guppi tasks in your GPU queue will stay there. Stephen . |
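Stephen's description of the balancing rule can be sketched in a few lines of Python. This is only an illustration of one-for-one swapping, not Mr. Kevvy's actual code: the function names, the example task names, and the rule that a task counts as GUPPI/VLAR when its name contains "guppi" or ".vlar" are all assumptions made for the example.

```python
def is_guppi_or_vlar(task_name):
    """Hypothetical classifier: assumes the task name itself marks GUPPI/VLAR work."""
    lowered = task_name.lower()
    return "guppi" in lowered or ".vlar" in lowered


def plan_swaps(cpu_queue, gpu_queue):
    """Pair each non-VLAR task on the CPU with one GUPPI task on the GPU.

    Returns (to_gpu, to_cpu). Both lists always have the same length, so the
    size of each queue is unchanged after the swap.
    """
    non_vlar_on_cpu = [t for t in cpu_queue if not is_guppi_or_vlar(t)]
    guppi_on_gpu = [t for t in gpu_queue if "guppi" in t.lower()]
    swaps = min(len(non_vlar_on_cpu), len(guppi_on_gpu))
    return non_vlar_on_cpu[:swaps], guppi_on_gpu[:swaps]


if __name__ == "__main__":
    # Made-up task names, for illustration only.
    cpu = ["blc_guppi_example_1.vlar", "arecibo_example_1"]
    gpu = ["blc_guppi_example_2.vlar", "blc_guppi_example_3.vlar"]
    print(plan_swaps(cpu, gpu))
    print(plan_swaps(["blc_guppi_example_1.vlar"], gpu))
```

Under this rule, a CPU queue holding no non-VLAR tasks produces an empty plan, which would be consistent with a "0 moved" message even though GUPPI tasks remain in the GPU queue.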
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Hi Jeff, . . That makes sense and is useful info to know. Thanks. Stephen . |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
As I mentioned in my previous post: 1. the instance where sometimes GuppiR reports something like: I have come across the GuppiR message: 0 moved... and 0 ...moved many times. Up until now, I was using MrK's GuppiR in a way that I considered "out-of-scope" since, with my semi-automated pre-MrK script, I can stash more tasks than the S@h server allows but without breaking the Boinc Client's 1,000 task limit per project. This gives me a <4-day stash with 1,000 tasks and it allows me to measure tweaks to my setups more quickly (since the tasks processed during the next 24hrs are older and therefore get validated more quickly). So back to my "out-of-scope" bug. I sometimes get the nothing moved message even when I have hundreds of nonVLARs stashed in the CPU queue that should be moved to a GPU App queue. Is it a real bug? Not until a CPU queue can exceed 100 tasks. I'm just mentioning it because I read Keith's post: ... then at least put into the help file that the message can be expected as normal if there are minimal or no tasks to move. and the output message is the same ...even if I suspect a different cause. I will put off any hardcore testing of my "out-of-scope" bug until we hear back from MrK. But I will start logging when I come across the scenario again. Cheers, RobG |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
Thanks for the clarification Jeff. It's not surprising that you've never seen a client_state_next.xml file since, unless there's a failure during the Write/Delete/Rename/Rename sequence, that file has a lifespan of perhaps a half second or so. Two lines from an old Process Monitor log show the initial file creation and final rename, with a whole lot of omitted entries in that brief interval. 11:21:09.7572316 AM boinc.exe 2904 CreateFile C:\ProgramData\BOINC\client_state_next.xml SUCCESS Desired Access: Generic Write, Read Attributes, Disposition: OverwriteIf, Options: Synchronous IO Non-Alert, Non-Directory File, Attributes: N, ShareMode: Read, Write, AllocationSize: 0, OpenResult: Created ... ... 11:21:10.3405110 AM boinc.exe 2904 SetRenameInformationFile C:\ProgramData\BOINC\client_state_next.xml SUCCESS ReplaceIfExists: True, FileName: C:\ProgramData\BOINC\client_state.xml As far as not hearing about it, well, I did bring it up in your original rescheduling thread as being something that anyone working on a rescheduling application should take into consideration. I don't see a reason for having 2 backups for a fraction of a second but oh well, there's probably a good reason for this. The sequence is a very sensible and safe approach to updating such a critical file. I use the same approach with my own rescheduler. You don't want to have a WriteFile hiccup trash your most current client_state.xml, which is what would happen if you were overwriting it directly with the new file. If I understand the full picture correctly: I don't know what the various triggers are for updating the client_state files, but yes, it would probably be best not to count on client_state_prev.xml being the backup following a rescheduling run. Better to create your own client_state backup. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
client_state.xml is *the* most critical file in any BOINC installation - your entire BOINC life is in there. If you make the slightest mistake in editing it, it is likely that at least one project will become unusable with all work discarded. It's not a live working file. All BOINC's working data is held in memory, so editing the file while BOINC is running has no effect - it'll get overwritten in seconds, and changes will be discarded. BUT - it is crucial when BOINC is restarted. Everything is read back into memory then - the only time the file is read, and it has to be read in its entirety. The writing process (next-current-prev) is a variation of the Grandfather-Father-Son backup strategy, designed so that you always have two complete copies of everything, no matter at what point in the writing cycle the lights go out or some other disaster strikes. --- A note on the size of app_info.xml: the file size depends on the number of applications you choose to run when you run the Lunatics installer. There are two apps (SETI MB and Astropulse), and four hardware platforms (CPU, ATi, NV, and intel_gpu), so you can have up to eight apps in there. Some of them have been deployed in so many variants - see the Applications page - that there has to be a very complex specification to catch all installation and upgrade paths. Now that the setiathome_v7 applications have finally been removed from that page, I could do a public release following the SoG Beta4 model with setiathome_v7 references removed (that would save a lot of space), but I'll wait until we have the final go-ahead for the stock deployment of the improved SoG app, and pick up the new version numbers that will be needed when that finally happens. It's been a while since I walked through Mr. Kevvy's code, but he certainly can't be looking at app_info.xml directly (it's an optional file, which may not be present in all cases): I don't even think he's looking at the <app_version> entries which are copied into client_state.xml |
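The Write/Delete/Rename/Rename sequence Jeff describes, which leaves behind the next/current/prev files Richard mentions, can be sketched as follows. This is a minimal Python illustration of the idea and not BOINC's own code; the function name `safe_update` and the `os.fsync` call are assumptions added for the example.

```python
import os
from pathlib import Path


def safe_update(directory, new_content, stem="client_state"):
    """Replace <stem>.xml without ever overwriting it in place (illustrative)."""
    folder = Path(directory)
    current = folder / f"{stem}.xml"
    next_file = folder / f"{stem}_next.xml"
    prev_file = folder / f"{stem}_prev.xml"

    # 1. Write: the new state goes to a separate file, so a failed write
    #    cannot damage the current copy.
    with open(next_file, "w", encoding="utf-8") as handle:
        handle.write(new_content)
        handle.flush()
        os.fsync(handle.fileno())  # assumption: push the data to disk first

    # 2. Delete: drop the old backup, if there is one.
    if prev_file.exists():
        prev_file.unlink()

    # 3. Rename: the current state becomes the backup.
    if current.exists():
        os.rename(current, prev_file)

    # 4. Rename: the freshly written file becomes the current state.
    os.replace(next_file, current)
```

Note that after two calls the `_prev` file holds the state written by the first call, not whatever was on disk before it. That is the reason given in the thread for keeping a separate backup rather than relying on client_state_prev.xml.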
Kieron Walsh Joined: 2 Mar 00 Posts: 74 Credit: 43,502,325 RAC: 112 |
Thanks for clarifying that Stephen. I just started using the rescheduler yesterday and got slightly worried today when I ran it and no changes were made. Having read your post and downloaded a batch of new work I ran it again - success! Great work Mr Kevvy btw! |
Stubbles Joined: 29 Nov 99 Posts: 358 Credit: 5,909,255 RAC: 0 |
Thanks Jeff for all the great info there. As far as not hearing about it, well, I did bring it up in your original rescheduling thread as being something that anyone working on a rescheduling application should take into consideration. WoW your memory is great! I was fairly green back then (and I still am!) and barely understood that it was not a big issue for a prototype since it just meant that a reassignment of tasks would have seemed to have never occurred. It definitely needs to be added to Jim's QOpt before v1.0 to cover all scenarios. Anything else that Jimbocous and his Beta testers should know or consider? Rob :-) |
Jimbocous Joined: 1 Apr 13 Posts: 1857 Credit: 268,616,081 RAC: 1,349 |
I don't know what the various triggers are for an updating of the client_state files, but yes, it would probably be best not to count on client_state_prev.xml being the backup following a rescheduling run. Better to create your own client_state backup. Hi, Jeff. I don't consider anything but client_state.xml in QOpt. Once I've verified via a tasklist query that the BOINC client is completely down, a separate backup of client_state.xml (client_state_backup.xml) is made, and then verified using a File Compare. Only after that (and other checks) is Kevvy's Rescheduler invoked. As I understand the environment, this should cover it. We've all been banging on this code for a while now, and one thing that has not been reported by anyone is any issue related to this. Of course, in the final analysis, that all goes back to Kevvy anyway. QOpt is only a framework manager to effectively invoke the Rescheduler (or anything else someone comes up with in future) in the right environment and under control of the user without a lot of fuss and bother. I also added functionality to create a backup, or restore from one, independently of invoking the Rescheduler and, in each case, again confirming that boinc.exe is not a running process during any client_state file operations. So if something does go sideways, the user will have the tools to readily revert and find out what the issue was without trashing work. I did a from-the-top review this evening of both threads on this, to ensure that there weren't any nasties out there. I think I have it all covered, especially thanks to the folks who have been doing great Beta testing for me, and those who were kind enough to review my early code and get my mind right on certain things. Jim ... |
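The check-then-backup-then-verify routine Jim outlines might look roughly like this in Python. It is a sketch under stated assumptions, not QOpt's code: the function names are made up, the process check simply looks for the image name in the output of the Windows `tasklist` command, and the verification uses a byte-for-byte comparison from the standard library.

```python
import filecmp
import shutil
import subprocess
from pathlib import Path


def boinc_is_running(image_name="boinc.exe"):
    """Windows only: ask tasklist whether a process with this image name exists."""
    result = subprocess.run(
        ["tasklist", "/FI", f"IMAGENAME eq {image_name}"],
        capture_output=True, text=True, check=False,
    )
    return image_name.lower() in result.stdout.lower()


def verified_backup(state_file, backup_file):
    """Copy the state file, then confirm the copy is identical byte for byte."""
    source, target = Path(state_file), Path(backup_file)
    shutil.copy2(source, target)
    # shallow=False forces a content comparison rather than a size/mtime check.
    if not filecmp.cmp(source, target, shallow=False):
        target.unlink()
        raise OSError(f"backup of {source} failed verification")
    return target


def backup_if_safe(boinc_dir):
    """Refuse to touch client_state.xml while the client is still running."""
    if boinc_is_running():
        raise RuntimeError("BOINC client is still running; not touching client_state.xml")
    folder = Path(boinc_dir)
    return verified_backup(folder / "client_state.xml",
                           folder / "client_state_backup.xml")
```

Restoring from a backup would be the same copy-and-compare in the opposite direction, again only after confirming the client is down.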
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Glad I could be of assistance :) Stephen . |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Hi people, . . I have to say running Jimbocous' V1.0 with MrKevvy's app is doing very well, as long as you have 3 or 4 nonVLAR tasks in your CPU queue it works a treat and quite quickly. A few seconds (20 or so) a few times a day and I have a nicely balanced workload. Stephen . |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I concur. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
What exactly is Qopt? And where is Jim hiding it? :) |
Jimbocous Joined: 1 Apr 13 Posts: 1857 Credit: 268,616,081 RAC: 1,349 |
Hi, Brent, QOpt - SETI@home Device Queue Optimizer ------------------------------------------------------------ QOpt is a program for Windows 7, 8.x and 10 designed to manage Mr. Kevvy's GUPPIRescheduler.exe, a program that reassigns work units between CPU and GPU queues. QOpt will: 1. Inventory running processes and shut them down if needed or by user request, 2. Perform a verified backup of client_state.xml, 3. Invoke Mr. Kevvy's script version 0.51 or higher, and redirect output to a log, 4. Restart any processes identified as running in 1. above. Optionally, QOpt also can do the following (primarily for diagnostics): a. Do a verified backup of client_state.xml only, b. Do a verified restore to client_state.xml from a previous backup, c. Support enhanced logging that permits multiple machines to share logging across the network, d. Kill BoincTasks if requested by command line option, e. Kill Boinc Manager if requested by command line option, f. Abort processing if APs are present, by command line option. Bwahahaha, well hidden:) Actually, still in Beta so I haven't put out a general download link, but am passing it out to folks that run Kevvy's 0.51 by request. This all evolved from a batch file Stubbles69 had put out a while ago. |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
I would give it a try if you send me a PM. One update I can think of (reading Al's situation) is an option to not run unless there are X number of tasks to reassign. |
Jimbocous Joined: 1 Apr 13 Posts: 1857 Credit: 268,616,081 RAC: 1,349 |
I would give it a try if you send me a PM. I can see where that could be of use, and I had a similar thought a while back. It would make a lot of sense to pull a quick analysis before going through all the trouble of shutting down, backing up, etc., just to discover there's nothing worth doing. Unfortunately, I'd have to refer that back to Mr. Kevvy, as I'm just the traffic cop here, and he does the head-count:) How's that for a mangled metaphor? At this point analyzing the Client State is far beyond the scope of QOpt, and I'd really like to leave that to Kevin. I'll definitely add that to the list of wants for the future, in case a down-n-dirty way of checking surfaces. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
I would give it a try if you send me a PM. . . Hi Jim, . . Yes that does make sense but is it easy to do? Stephen . |
Jimbocous Joined: 1 Apr 13 Posts: 1857 Credit: 268,616,081 RAC: 1,349 |
QOpt - SETI@home Device Queue Optimizer ------------------------------------------------------------ QOpt is a program for Windows 7, 8.x and 10 designed to manage Mr. Kevvy's GUPPIRescheduler.exe, a program that reassigns work units between CPU and GPU queues. QOpt will: 1. Inventory running processes and shut them down if needed or by user request, 2. Perform a verified backup of client_state.xml, 3. Invoke Mr. Kevvy's script version 0.51 or higher, and redirect output to a log, 4. Restart any processes identified as running in 1. above. Optionally, QOpt also can do the following (primarily for diagnostics): a. Do a verified backup of client_state.xml only, b. Do a verified restore to client_state.xml from a previous backup, c. Support enhanced logging that permits multiple machines to share logging across the network, d. Kill BoincTasks if requested by command line option, e. Kill Boinc Manager if requested by command line option, f. Abort processing if APs are present, by command line option. Barring any further bug fixes or feature requests, I think I'm done fiddling with this. I have added a couple of other things to the distribution, which somehow has ended up being called 1.02e: 1. Considerably expanded the read_me, including discussion about using this with Windows Task Scheduler, 2. Added a quick little batch file that can be scheduled to run once per day to manage the log file by doing a backup, 3. Added a warning in the read_me discussing the fact that certain Antivirus programs are intermittently tagging QOpt with a false positive as a Trojan. At this point, anyone who wishes to is welcome to come grab a copy and use it. If you do put it to use, I'd appreciate a quick note either here or in PM. |
EdwardPF Joined: 26 Jul 99 Posts: 389 Credit: 236,772,605 RAC: 374 |
come grab a copy McAfee prevents me from downloading this file. Just FYI. Ed F |