Open Beta test: SoG for NVidia, Lunatics v0.45

Author	Message
Stephen "Heretic" Volunteer tester Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1839042 - Posted: 31 Dec 2016, 2:31:58 UTC - in response to Message 1839027.
I suspect Richard would be quite happy if someone else were to release a Lunatics SoG-specific update installer. People can use the current Beta installers to install SoG (or whatever application they wish) as their selected application. The SoG-specific update installer would be for those that already have SoG selected as their main application & wish to update to the most recent release without the risk of trashing all their work manually fiddling with the app_info.xml file.
I can definitely get behind that idea. That is the only thing that needs changing in the current Betas. I just won't touch manually editing app_info anymore. Too many mistakes and over 4 months to finally clear "ghosts" up. Too painful a memory.
. . I have created a few ghosts myself, and I am sure not all were due to my mistakes. But this process, while fiddly, does work to recover them. I received a detailed step-by-step approach to recover lost tasks on SETI hosts from another very helpful SETIzen. It works, but I encountered several snags in trying to implement it, such as when uploading the completed task an immediate update would foil the attempt to recover ghosted WUs. So I have added a few extra steps.
1. Suspend network and set No New Tasks.
2. Wait until you have a single task that has finished and is ready to upload.
3. With network suspended, execute a project update to start the timer until the next update; this will give you a 5 minute window.
4. After the update sequence has finished, resume network activity and allow the task to upload, then suspend network again before an update can start.
5. Suspend processing.
6. Make a backup copy of the BOINC data directory.* (see NOTE, below)
7. Resume network activity, then "update" the project to report the waiting task (although, with NNT set, that will probably happen automatically).
8. Suspend network activity again, then exit BOINC completely.
9. Restore the BOINC directory backup.* (see NOTE, below)
10. Restart BOINC and processing should have resumed; if not, start it.
11. Increase your work buffer to accommodate your "lost" tasks. (BOINC won't send any tasks if it thinks your buffer is already "full".)
12. Check that the 5 minute window has passed since the last scheduler request in Step 4 completed, set "Allow new tasks", then resume network activity.
13. If BOINC doesn't automatically initiate a scheduler request, hit "Update" to once again report the same waiting task. This "should" result in one or more "lost" tasks being re-sent. [I find that regardless of how many ghosted tasks are assigned to you the system will only resend about 20 at a time].
NOTE: As was also suggested in that thread, it is probably sufficient to simply backup and restore client_state.xml (and perhaps client_state_prev.xml), but I tend to like to be sure I'm keeping everything in sync and so I choose to do the entire BOINC data directory. It's just as simple, really.
Happy Crunching
Stephen :)
ID: 1839042
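For anyone who prefers to script the backup and restore in steps 6 and 9, here is a minimal sketch covering only the two state files named in the NOTE. It is an illustration under assumptions, not a tested BOINC tool: the directory names are hypothetical stand-ins, the demo runs against a throwaway temporary directory rather than a real installation, and BOINC is assumed to be fully exited before any restore, as in step 8.

```python
import shutil
import tempfile
from pathlib import Path

# File names taken from the NOTE above; everything else here is illustrative.
STATE_FILES = ("client_state.xml", "client_state_prev.xml")


def backup_state(data_dir: Path, backup_dir: Path) -> list[str]:
    """Copy the client state files that exist in data_dir into backup_dir."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in STATE_FILES:
        src = data_dir / name
        if src.is_file():
            shutil.copy2(src, backup_dir / name)  # copy2 also keeps timestamps
            copied.append(name)
    return copied


def restore_state(backup_dir: Path, data_dir: Path) -> list[str]:
    """Copy backed-up state files back over the ones in data_dir."""
    restored = []
    for name in STATE_FILES:
        src = backup_dir / name
        if src.is_file():
            shutil.copy2(src, data_dir / name)
            restored.append(name)
    return restored


def demo() -> str:
    """Run one backup/restore cycle on a throwaway directory, not a real BOINC install."""
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp) / "boinc_data"  # hypothetical stand-in for the data directory
        backup_dir = Path(tmp) / "backup"
        data_dir.mkdir()
        state = data_dir / "client_state.xml"
        state.write_text("<client_state>before report</client_state>")
        backup_state(data_dir, backup_dir)                             # step 6
        state.write_text("<client_state>after report</client_state>")  # steps 7-8 change the state
        restore_state(backup_dir, data_dir)                            # step 9
        return state.read_text()


if __name__ == "__main__":
    print(demo())
```

Copying the whole data directory, as the post recommends, would be the same idea with `shutil.copytree` in place of the per-file copies.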

Stephen "Heretic" Volunteer tester Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1839043 - Posted: 31 Dec 2016, 2:33:52 UTC - in response to Message 1839028.
With v0.45 installed, is there a way to download a newer app used on Main, then edit the app_info.xml from v0.45 accordingly?
You certainly can. I believe Ramster posted the files here somewhere. Just put the new files in your folder, and carefully change the version numbers in your app_info
. . Very, very carefully.
Stephen .
ID: 1839043

Stephen "Heretic" Volunteer tester Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628	Message 1839045 - Posted: 31 Dec 2016, 2:42:10 UTC - in response to Message 1839032.
It would make good sense if the workunit deadlines were cut if the aim is to keep the database small.
True, however the project also wants to make use of the widest possible range of hardware, so the longer deadlines are necessary for the much, much slower crunchers out there.
I have heard this said in the past and it is a laudable aim, but allowing 6 weeks when 99% are probably being returned inside of 7 days is a substantial commitment towards outreach and could be considered unwarranted, especially if it is impacting the performance of the project.
. . And a shorter time limit does NOT preclude slower machines from taking part, it merely requires them to contact the servers more often. And while a 1 week limit may seem a bit short for some, it can certainly be less than the 4 to 6 weeks currently used; maybe half would be a good move. A healthy and active cruncher, no matter how slow, rarely needs more than 2 weeks to process a WU, and the unhealthy ones would not hold a task in limbo for more than 3 weeks (and even that is too long).
. . My beliefs anyway ...
Stephen :)
ID: 1839045

Keith Myers Volunteer tester Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873	Message 1839069 - Posted: 31 Dec 2016, 6:32:37 UTC - in response to Message 1839042.
You have to do EVERYTHING right.... cross your fingers ...... and hold your tongue just so, to use the method you describe. In my case of 600 "ghosted" tasks the process would take forever at 20 tasks retrieved every sequence. Also, when you are constantly at your server-defined limits because you process work so quickly, you would have to run your onboard tasks down to nil after a NNT to even begin the sequence.
I found out the hard way that what you would think is a simple task of restoring an old backup of client_state.xml has its own dangers. I lost 5 years of work history at Einstein because the backup restoration caused a change in cross-project ID and a new computer Host ID to be created at Einstein, because BOINC detected an unexpected change in <rpc_seqno>. This also caused a "phantom" Host ID at Einstein and the loss of 13 million credits because they are attached to the old Host ID. If you try to merge the old and new hosts, BOINC always chooses the newest Host ID.
That fiasco is why I will never edit app_info manually anymore. I'll just have to wait for Richard to crank out a new Lunatics installer with r3584 in it, or whatever Raistmer is working on in his latest revision.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
A proud member of the OFA (Old Farts Association)
ID: 1839069

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1839081 - Posted: 31 Dec 2016, 8:08:17 UTC Last modified: 31 Dec 2016, 8:10:52 UTC
Consider an ARM6 phone crunching only when charging.
http://setiathome.berkeley.edu/host_app_versions.php?hostid=7435236
Average processing time 25.85 days
That is, ~4 weeks. No need to touch deadlines. If "97%" complete "just in one-two days" then fine, those tasks will be validated and deleted from the BOINC database faster. They do not even need to be kept a week, just those 2 days. But others still have a chance to participate.
P.S. What needs to be touched instead is the quota system that fails to catch hyper-fast broken GPU hosts. Those can produce thousands of broken results per day - an amount that slow crunchers will not accumulate through a whole year!
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1839081

Dr Grey Joined: 27 May 99 Posts: 154 Credit: 104,147,344 RAC: 21	Message 1839083 - Posted: 31 Dec 2016, 9:09:46 UTC - in response to Message 1839081.
Consider an ARM6 phone crunching only when charging.
http://setiathome.berkeley.edu/host_app_versions.php?hostid=7435236
Average processing time 25.85 days
That is, ~4 weeks. No need to touch deadlines. If "97%" complete "just in one-two days" then fine, those tasks will be validated and deleted from the BOINC database faster. They do not even need to be kept a week, just those 2 days. But others still have a chance to participate.
P.S. What needs to be touched instead is the quota system that fails to catch hyper-fast broken GPU hosts. Those can produce thousands of broken results per day - an amount that slow crunchers will not accumulate through a whole year!
I cannot see the benefit to the project of catering for such low processing capability devices. Do they generate publicity or revenue? Even my Raspberry Pi, with a RAC of 87.46 (!) still has a turnaround of 2.74 days. That is about 1/500 of what my PC produces. The only reason I keep it connected is because it makes me happy that way - a bit like keeping a hamster with a wheel. Is it right to keep the database inflated just to enable folks to keep their pet devices warm? What is the real benefit here?
I agree on your last point.
ID: 1839083

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1839090 - Posted: 31 Dec 2016, 10:00:08 UTC - in response to Message 1839083.
I cannot see the benefit to the project of catering for such low processing capability devices. The only reason I keep it connected is because it makes me happy that way
You answered your own question.
Is it right to keep the database inflated just to enable folks to keep their pet devices warm? What is the real benefit here?
"Inflated database" is quite a relative term. As long as it works OK it's not "inflated". And as I already said, the influence of slow hosts on that perceived "inflation" is over-estimated. So yes, it is right to provide the ability to help SETI on everything that can compute. That's the right spirit. And spirit is much more valuable than revenue. It should be, even in a "free market" world :P
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1839090

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1839091 - Posted: 31 Dec 2016, 10:07:04 UTC - in response to Message 1839090.
I cannot see the benefit to the project of catering for such low processing capability devices. The only reason I keep it connected is because it makes me happy that way
You answered your own question.
Is it right to keep the database inflated just to enable folks to keep their pet devices warm? What is the real benefit here?
"Inflated database" is quite a relative term. As long as it works OK it's not "inflated". And as I already said, the influence of slow hosts on that perceived "inflation" is over-estimated. So yes, it is right to provide the ability to help SETI on everything that can compute. That's the right spirit. And spirit is much more valuable than revenue. It should be, even in a "free market" world :P
As soon as we adopt the intentions of the technology vendors at present, then more than 95% of our hardware compute capacity is immediately defunct. It would be nice if we could all have shiny new Kaby Lakes and GTX 1080s, but it's not realistically going to happen... especially when the gains represent poor return for the money. They need to do a lot better before we can say goodbye to stuff that works.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1839091

Dr Grey Joined: 27 May 99 Posts: 154 Credit: 104,147,344 RAC: 21	Message 1839096 - Posted: 31 Dec 2016, 11:26:39 UTC - in response to Message 1839091.
As soon as we adopt the intentions of the technology vendors at present, then more than 95% of our hardware compute capacity is immediately defunct. It would be nice if we could all have shiny new Kaby Lakes and GTX 1080s, but it's not realistically going to happen... especially when the gains represent poor return for the money. They need to do a lot better before we can say goodbye to stuff that works.
Nevertheless, time moves on. Looking back over the last few years, SETI compute power has increased by 5-10% each year. 1080s are soon to be old hat with the 1080Ti arriving soon, AMD's RyZen will hopefully perk up the CPU market, pushing Intel towards adopting 10 nm quicker, and I'm reading that we're hopeful for even better SETI apps in the near future. Older devices will continue to be retired and all this will impact the average turnaround time, moving analysis ever closer from near time to real time. The current deadlines for completion will continue to grow to be further away from the time taken for the bulk of canonical results to be achieved.
ID: 1839096

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1839100 - Posted: 31 Dec 2016, 11:42:50 UTC - in response to Message 1839096.
The current deadlines for completion will continue to grow to be further away from the time taken for the bulk of canonical results to be achieved.
1) Deadlines will not grow. They will remain the same.
2) So what? Why bother with this? Did you ever observe a new version roll-out where BOINC mis-predicts the estimated time to completion? With a shorter deadline the % of killed tasks will be even higher. So, all this just to get credit accounted faster for those over-competitive ones who can't just leave a host as it is and spend time on optimization instead of watching credit rise? It will be accounted eventually - that's all that matters.
Regarding real time processing: we do only pre-processing. Final processing is Nebula. And it requires data movement into Atlas. This SETI search isn't a real-time one by design. Its sensitivity comes from data accumulation over a few observations spanning years. Real-time processing is just a self-imposed goal, not really required for this kind of search. It would be good to complete the search faster, but cutting some processing power will not make it faster, it will make it slower. The mean validation time of a particular result doesn't matter. There are millions of such results.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1839100

Dr Grey Joined: 27 May 99 Posts: 154 Credit: 104,147,344 RAC: 21	Message 1839108 - Posted: 31 Dec 2016, 12:34:33 UTC - in response to Message 1839100.
The current deadlines for completion will continue to grow to be further away from the time taken for the bulk of canonical results to be achieved.
1) Deadlines will not grow. They will remain the same.
2) So what? Why bother with this? Did you ever observe a new version roll-out where BOINC mis-predicts the estimated time to completion? With a shorter deadline the % of killed tasks will be even higher. So, all this just to get credit accounted faster for those over-competitive ones who can't just leave a host as it is and spend time on optimization instead of watching credit rise? It will be accounted eventually - that's all that matters.
Regarding real time processing: we do only pre-processing. Final processing is Nebula. And it requires data movement into Atlas. This SETI search isn't a real-time one by design. Its sensitivity comes from data accumulation over a few observations spanning years. Real-time processing is just a self-imposed goal, not really required for this kind of search. It would be good to complete the search faster, but cutting some processing power will not make it faster, it will make it slower. The mean validation time of a particular result doesn't matter. There are millions of such results.
1) As average turnaround shrinks and deadlines remain the same, the time between them grows.
2) Why bother optimising the process? To decrease entries from slow-to-verify workunits. From my understanding this would enable a greater cache size to be allowed without impacting the database size. If older, unverified workunits sitting in the database that would otherwise be reduced by shortening the deadlines have little impact on the backend system, then I have little to argue about. But it would be interesting to know what proportion that would be. I could do that by looking at my own cache I guess. I'll go and make a coffee and see if I can make some figures.
On your last point though, the potential computational benefit to be gained, I agree, is minimal. It would arise from those very high end machines that run dry during an outage. Looking at Petri's record breaker, his turnaround average is 0.16 days. A little under 4 hours. That means he's probably running dry half way into an 8 hour outage, about once a week. Or around 2.5 % of the time. Everyone else will be less than this. However, it could be argued that enabling a larger cache to ensure that 2.5 % performance gain from Petri, at the expense of our slower tail, would be better for the computational output of the project - as that 2.5% could be substantially larger than the tail output. But that would put us at odds with an egalitarian ideal.
ID: 1839108
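The 2.5 % estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the reasoning in the post, assuming the host's cache lasts exactly one average turnaround (0.16 days) and that there is one 8 hour outage per week; the function name and parameters are made up for illustration.

```python
HOURS_PER_DAY = 24
HOURS_PER_WEEK = 7 * HOURS_PER_DAY


def dry_fraction(turnaround_days: float, outage_hours: float,
                 outages_per_week: float = 1.0) -> float:
    """Fraction of the week a host sits idle.

    Assumption: the cache holds exactly one average turnaround of work,
    so the host runs dry once an outage outlasts that turnaround.
    """
    cache_hours = turnaround_days * HOURS_PER_DAY
    idle_hours = max(0.0, outage_hours - cache_hours) * outages_per_week
    return idle_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    frac = dry_fraction(turnaround_days=0.16, outage_hours=8)
    print(f"cache lasts {0.16 * HOURS_PER_DAY:.2f} h, idle {frac:.1%} of the week")
```

Under these assumptions a host with a turnaround of 8 hours or more never runs dry, which matches the remark that everyone else will lose less than this.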

Raistmer Volunteer developer Volunteer tester Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1839110 - Posted: 31 Dec 2016, 13:05:48 UTC - in response to Message 1839108. Last modified: 31 Dec 2016, 13:25:02 UTC
This implies the assumption that maintenance time directly depends on database size, and that database size directly (or largely enough) depends on mean deadline time. The second assumption looks very unjustified for the reasons listed earlier. The first assumption requires some proof too. And the third assumption is that if deadline shrinkage takes place it will automatically lead to a rise in the number-of-tasks-per-host limit. Hardly...
EDIT: also, there are means to increase the number of cached tasks to survive an outage (if the operator is really interested in such). There will be no technical means to extend a deadline in case of its shrinkage though. So, still no-no.
EDIT2: From another thread:
Remember, if they do have a high error or invalid count they are actually doing NOTHING to hurt or damage the project, simply because the project is marking them as errors or invalids. Also the fact that they CAN run up a high total of invalids or errors is due to the project not them.
Very true. And it imposes a much stronger load on the database than all the slow hosts together. Just because a GPU can produce a faulty result in a couple of SECONDS. And sending, let's say, 1 valid result (~30min for an average modern device) per 10 such invalids will sustain a steady flow of results - how many per day? That's the real danger, not deadlines...
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1839110
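The "how many per day?" question can be roughed out as follows. This is a back-of-envelope sketch only: the 30 minute valid task and the 1-in-10 pattern come from the post, while the 2 second figure for a faulty result is an assumed reading of "a couple of SECONDS", and it covers a single device with no server-side quota applied.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def results_per_day(valid_seconds: float, invalid_seconds: float,
                    invalids_per_valid: int) -> tuple[float, float]:
    """Return (valid, invalid) results per day for one device.

    One cycle is a single valid task followed by `invalids_per_valid`
    faulty ones; the device is assumed to repeat that cycle all day.
    """
    cycle_seconds = valid_seconds + invalids_per_valid * invalid_seconds
    cycles = SECONDS_PER_DAY / cycle_seconds
    return cycles, cycles * invalids_per_valid


if __name__ == "__main__":
    # 30 min per valid result (from the post), 2 s per faulty one (assumed).
    valid, invalid = results_per_day(30 * 60, 2, 10)
    print(f"mixed host: {valid:.0f} valid and {invalid:.0f} invalid results per day")
    # Upper bound for a host that returns nothing but faulty results.
    print(f"fully broken host: {SECONDS_PER_DAY / 2:.0f} invalid results per day")
```

Changing the assumed seconds per faulty result, or the ratio of invalids to valids, shifts the totals a lot, so the output is best read as an order of magnitude.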

Dr Grey Joined: 27 May 99 Posts: 154 Credit: 104,147,344 RAC: 21	Message 1839117 - Posted: 31 Dec 2016, 13:55:24 UTC - in response to Message 1839110.
This implies the assumption that maintenance time directly depends on database size, and that database size directly (or largely enough) depends on mean deadline time. The second assumption looks very unjustified for the reasons listed earlier. The first assumption requires some proof too. And the third assumption is that if deadline shrinkage takes place it will automatically lead to a rise in the number-of-tasks-per-host limit. Hardly...
Not sure all those implications directly follow, but nevertheless I've taken a look at my older pendings and can prove I am wrong...
80 % of the pendings are less than 2 weeks old. 12 % are older than 4 weeks and 6 % are older than 6 weeks, with the oldest being about 8 weeks old. Current deadlines appear to be about 7.5 weeks, although some appear to be 3 weeks and I'm not sure why that is. All my pendings fall within just over one 7.5 week deadline, suggesting that workunits requiring more than one resend are extremely rare, and when resends happen they turn around on average in a couple of days.
According to these figures anyway, trying to reclaim space by reducing deadlines from 7.5 to 4 weeks will gain only about 12 % space. That would be just 90 workunits based on my pendings and probably not worth it.
ID: 1839117
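The figures in the post above can be cross-checked with a short calculation. This sketch assumes the "90 workunits" corresponds exactly to the 12 % of pendings older than 4 weeks; the total and the 2 to 4 week bucket are derived from that assumption and are not stated in the post.

```python
# Shares of pending tasks by age, as reported in the post above.
SHARE_UNDER_2_WEEKS = 0.80
SHARE_OVER_4_WEEKS = 0.12
SHARE_OVER_6_WEEKS = 0.06
TASKS_OVER_4_WEEKS = 90  # "just 90 workunits"


def age_breakdown() -> dict[str, int]:
    """Derive approximate task counts per age bucket from the reported shares."""
    total = TASKS_OVER_4_WEEKS / SHARE_OVER_4_WEEKS  # implied size of the pending list
    share_2_to_4 = 1.0 - SHARE_UNDER_2_WEEKS - SHARE_OVER_4_WEEKS
    return {
        "total pendings (implied)": round(total),
        "under 2 weeks": round(total * SHARE_UNDER_2_WEEKS),
        "2 to 4 weeks (remainder)": round(total * share_2_to_4),
        "over 4 weeks": round(total * SHARE_OVER_4_WEEKS),
        "over 6 weeks": round(total * SHARE_OVER_6_WEEKS),
    }


if __name__ == "__main__":
    for bucket, count in age_breakdown().items():
        print(f"{bucket:>26}: {count}")
```

Only the "over 4 weeks" bucket would be affected by cutting the deadline from 7.5 to 4 weeks, which is where the roughly 12 % saving comes from.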

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1839130 - Posted: 31 Dec 2016, 14:59:25 UTC - in response to Message 1839117. Last modified: 31 Dec 2016, 15:00:28 UTC
Adding: With v8, certain subtle precision refinements became effective, designed to converge the dominant CPU and proper GPU applications cross platform, along with easing prior discrepancies that affected 64 bit CPU builds vs. the x86 original. While the desired effect was achieved, yielding better than 5% inconclusive (i.e. resend) to pending ratios on the large scale, an unintended consequence was that 'rogue' machines and/or applications became more visible. That's not necessarily a bad thing by any means, just unexpected.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1839130

jason_gee Volunteer developer Volunteer tester Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1839141 - Posted: 31 Dec 2016, 15:46:49 UTC - in response to Message 1839139.
I've no idea who this Dr Grey is, but what I do know is that I would back Jason and Raistmer against his technical ability any day.
Have a look at his team name.
I've got no qualms about anyone questioning my, Raistmer's or the probably 30+ other people's work. It's been a long road and there is a long way to go. The only thing I take issue with is the fairly recent drive to abandon working hardware & applications in the name of progress. That's simply a dead-end approach, because it discards the most time-proven parts for unknowns.
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 1839141

Dr Grey Joined: 27 May 99 Posts: 154 Credit: 104,147,344 RAC: 21	Message 1839154 - Posted: 31 Dec 2016, 16:05:23 UTC - in response to Message 1839139.
I've no idea who this Dr Grey is, but what I do know is that I would back Jason and Raistmer against his technical ability any day.
Have a look at his team name.
Thanks for the vote of confidence for my technical ability. For the record I have none, but I don't see any harm in challenging the status quo and offering up ideas that are worth discussing. Do you?
ID: 1839154

Mike Volunteer tester Joined: 17 Feb 01 Posts: 34256 Credit: 79,922,639 RAC: 80	Message 1839179 - Posted: 31 Dec 2016, 17:38:21 UTC Last modified: 31 Dec 2016, 17:38:53 UTC
First of all, it is Richard's decision when he will release a new Installer. From my testing point of view, and being part of the installer crew, the work is done. But we shouldn't forget that at this time of the year Richard might have taken a few days off, which is well deserved IMHO. The whole Lunatics team works in its spare time and a lot of work has been finished so far. So another few days/weeks doesn't really matter.
Happy New Year @all.
With each crime and every kindness we birth our future.
ID: 1839179

kittyman Volunteer tester Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1839184 - Posted: 31 Dec 2016, 17:45:11 UTC - in response to Message 1839179.
First of all, it is Richard's decision when he will release a new Installer. From my testing point of view, and being part of the installer crew, the work is done. But we shouldn't forget that at this time of the year Richard might have taken a few days off, which is well deserved IMHO. The whole Lunatics team works in its spare time and a lot of work has been finished so far. So another few days/weeks doesn't really matter.
Happy New Year @all.
Richard has always been very generous with donating his time and services to the project, and does a great job. No worries. He shall get to it when he gets to it.
Happy New Year to all from the kitty crew. Meow!
"Freedom is just Chaos, with better lighting." Alan Dean Foster
ID: 1839184

AMDave Volunteer tester Joined: 9 Mar 01 Posts: 234 Credit: 11,671,730 RAC: 0	Message 1839198 - Posted: 31 Dec 2016, 18:43:58 UTC - in response to Message 1839028.
With v0.45 installed, is there a way to download a newer app used on Main, then edit the app_info.xml from v0.45 accordingly?
You certainly can. I believe Ramster posted the files here somewhere. Just put the new files in your folder, and carefully change the version numbers in your app_info
How about copying the requisite files from the Seti Beta directory to the Seti Main directory? I think those files are
♦ mb_cmdline-8.22_windows_intel__opencl_nvidia_SoG.txt
♦ MultiBeam_Kernels_r3584.cl
♦ MultiBeam_Kernels_r3584.cl_GeForceGTX950.bin_V7_SoG_35900
♦ r3584_IntelRCoreTMi76700KCPU400GHz_x86.wisdom
♦ setiathome_8.22_windows_intelx86__opencl_nvidia_SoG.exe
ID: 1839198

HAL9000 Volunteer tester Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57	Message 1839210 - Posted: 31 Dec 2016, 19:59:10 UTC - in response to Message 1839198.
With v0.45 installed, is there a way to download a newer app used on Main, then edit the app_info.xml from v0.45 accordingly?
You certainly can. I believe Ramster posted the files here somewhere. Just put the new files in your folder, and carefully change the version numbers in your app_info
How about copying the requisite files from the Seti Beta directory to the Seti Main directory? I think those files are
♦ mb_cmdline-8.22_windows_intel__opencl_nvidia_SoG.txt
♦ MultiBeam_Kernels_r3584.cl
♦ MultiBeam_Kernels_r3584.cl_GeForceGTX950.bin_V7_SoG_35900
♦ r3584_IntelRCoreTMi76700KCPU400GHz_x86.wisdom
♦ setiathome_8.22_windows_intelx86__opencl_nvidia_SoG.exe
Two of those files
MultiBeam_Kernels_r3584.cl_GeForceGTX950.bin_V7_SoG_35900
r3584_IntelRCoreTMi76700KCPU400GHz_x86.wisdom
are created when you run the app.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu)
ID: 1839210
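Putting the last two posts together, the list of files worth copying can be narrowed down with a small filter. This is a sketch only: the five names are the ones listed in the thread, but the matching rule (names containing ".bin_" or ending in ".wisdom" are generated at run time) is an assumption inferred from the two files named above, not a documented naming scheme.

```python
# File names as listed in the thread.
CANDIDATE_FILES = [
    "mb_cmdline-8.22_windows_intel__opencl_nvidia_SoG.txt",
    "MultiBeam_Kernels_r3584.cl",
    "MultiBeam_Kernels_r3584.cl_GeForceGTX950.bin_V7_SoG_35900",
    "r3584_IntelRCoreTMi76700KCPU400GHz_x86.wisdom",
    "setiathome_8.22_windows_intelx86__opencl_nvidia_SoG.exe",
]


def is_generated_at_runtime(name: str) -> bool:
    """Assumed rule: per-device kernel binaries and wisdom files are rebuilt by the app."""
    return ".bin_" in name or name.endswith(".wisdom")


def files_to_copy(names: list[str]) -> list[str]:
    """Keep only the files that would need to be copied by hand."""
    return [name for name in names if not is_generated_at_runtime(name)]


if __name__ == "__main__":
    for name in files_to_copy(CANDIDATE_FILES):
        print(name)
```

The two skipped names embed a GPU model and a CPU model, which fits the observation that they are created on the machine that runs the app.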


SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.

Open Beta test: SoG for NVidia, Lunatics v0.45 - Beta6 (RC again)