Reorganization of resources idea

Cheopis (Joined: 17 Sep 00, Posts: 156, Credit: 18,451,329, RAC: 0)
Message 1012895 - Posted: 6 Jul 2010, 19:32:24 UTC

I have a hard time imagining this has not been considered before, and I know the SETI@home team has been pretty busy recently, so this is a "maybe someday" request which might already be in the planning stages.

Perhaps allow home users to do splitting?

I know this would require several different layers of implementation, but it seems as if the science the team is actually capable of performing is limited by the sheer amount of beancounting and non-analysis work that is required to keep new data coming in.

I'm envisioning the following implementation:

1) The same raw data is sent to several different users.
2) That data is then split into work units independently by each user.
3) The resulting hashes are crosschecked against each other.
4) When enough identical readings are received, each user with a "good" dataset is assigned to analyze a certain % of the resulting data that they already have on their machines.

This implementation would remove splitting from the server side, and it would remove the need to send individual work units to at least some users. The resources freed up could be used for actual science.

How would one get the large initial data sets to users? I'm thinking that one might be able to find a respectable p2p provider out there that would be willing to work with SETI@home and the BOINC project. That way the data would only need to be sent once, and the p2p network would propagate it to other SETI@home users. I know that several MMOs use dedicated channels in p2p networks for their client downloads, so this would be nothing new for them. The same p2p network could be used for crosschecking the work units generated by splitting.
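The four-step scheme in the post above can be sketched roughly as follows. This is only an illustration under stated assumptions, not how SETI@home or BOINC actually works: the fixed-size byte chunking stands in for the real splitter (which does more than cut bytes), and every function name, the SHA-256 choice, and the quorum rule are hypothetical.

```python
import hashlib
from collections import Counter


def split_into_work_units(raw: bytes, unit_size: int) -> list[bytes]:
    """Stand-in for splitting: cut the raw data into fixed-size chunks (assumption)."""
    return [raw[i:i + unit_size] for i in range(0, len(raw), unit_size)]


def split_digest(units: list[bytes]) -> str:
    """One digest summarizing a client's whole split, built from per-unit hashes."""
    h = hashlib.sha256()
    for unit in units:
        h.update(hashlib.sha256(unit).digest())
    return h.hexdigest()


def agreeing_clients(digests: dict[str, str], quorum: int) -> set[str]:
    """Step 3: return the clients whose digest matches the most common one,
    or an empty set if fewer than `quorum` clients agree."""
    if not digests:
        return set()
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes < quorum:
        return set()
    return {client for client, d in digests.items() if d == winner}


def assign_units(clients: set[str], unit_count: int) -> dict[str, list[int]]:
    """Step 4: share the unit indices round-robin among the agreeing clients."""
    ordered = sorted(clients)
    if not ordered:
        return {}
    assignment: dict[str, list[int]] = {c: [] for c in ordered}
    for index in range(unit_count):
        assignment[ordered[index % len(ordered)]].append(index)
    return assignment


# Hypothetical run: two honest clients and one that splits altered data.
raw = bytes(range(256)) * 4
honest = split_digest(split_into_work_units(raw, 256))
altered = split_digest(split_into_work_units(raw[::-1], 256))
digests = {"a": honest, "b": honest, "c": altered}
good = agreeing_clients(digests, quorum=2)
plan = assign_units(good, unit_count=4)
```

Note that the server still has to receive and compare one digest per client per data set, so the comparison work does not disappear; it only becomes much smaller than the splitting itself.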

ML1 (Volunteer moderator, Volunteer tester; Joined: 25 Nov 01, Posts: 20283, Credit: 7,508,002, RAC: 20)
Message 1014836 - Posted: 11 Jul 2010, 23:39:40 UTC - in response to Message 1012895.

Quoted from Message 1012895:
  ... Perhaps allow home users to do splitting? ...

No can do unless you can include some completely robust checking to check for cheats... And the robust checking would require wasted duplicated effort in doing the splitting...

The present division of processing for s@h and BOINC makes good system sense. More of a question is whether we could usefully do any other types of processing with the Arecibo data. Astropulse is perhaps one example answer...

Also bear in mind that the present Berkeley system bottleneck appears to be the database used for the science and for BOINC processing. I have my own suspicion that the entire BOINC system is limited by the maximum possible (disk) IO speed for the database logging!

Keep searchin',
Martin

Cheopis (Joined: 17 Sep 00, Posts: 156, Credit: 18,451,329, RAC: 0)
Message 1014888 - Posted: 12 Jul 2010, 4:19:14 UTC - in response to Message 1014836. Last modified: 12 Jul 2010, 4:21:02 UTC

Quoted from Message 1014836:
  ... Perhaps allow home users to do splitting? ...
  No can do unless you can include some completely robust checking to check for cheats... And the robust checking would require wasted duplicated effort in doing the splitting...
  <snip some interesting stuff>
  Keep searchin', Martin

Well, the robust data checking would simply be to give the same data to multiple users with comparable work completion rates, compare results, and not allow clients to work on or report results for work units that are not "registered" as coming from one of the set of clients that generated identical data.

Since I don't know how many work units are generated by a typical "split" data set, I can't say what would be an efficient number of clients per data set. I strongly suspect that each work unit is distributed several times in any case under the current system. The number of clients needed for sufficiently robust cross-checking of client-side splitting might be perfectly in line with the number of clients that already rework data for the current level of work unit verification.
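The "registered" rule described in this reply could look something like the sketch below. It is a hypothetical illustration only: the `SplitRegistry` class, the data set identifier, and the report tuple layout are all invented here and do not correspond to anything in BOINC.

```python
from dataclasses import dataclass, field


@dataclass
class SplitRegistry:
    """Tracks, per data set, which clients produced the agreed-upon split (hypothetical)."""
    trusted: dict[str, set[str]] = field(default_factory=dict)

    def register(self, dataset_id: str, client_ids: set[str]) -> None:
        """Record clients whose split of this data set matched the others."""
        self.trusted.setdefault(dataset_id, set()).update(client_ids)

    def accepts(self, dataset_id: str, client_id: str) -> bool:
        """A client may report results only for data sets it is registered for."""
        return client_id in self.trusted.get(dataset_id, set())


def filter_reports(registry: SplitRegistry, dataset_id: str,
                   reports: list[tuple[str, int, str]]) -> list[tuple[str, int, str]]:
    """Keep only (client_id, unit_index, result) reports from registered clients."""
    return [r for r in reports if registry.accepts(dataset_id, r[0])]


# Hypothetical usage: clients "a" and "b" agreed on the split, "c" did not.
registry = SplitRegistry()
registry.register("dataset-01", {"a", "b"})
reports = [("a", 0, "result-a0"), ("c", 1, "result-c1"), ("b", 1, "result-b1")]
accepted = filter_reports(registry, "dataset-01", reports)
```

In this sketch a rejected client simply has its reports dropped; whether the same redundancy that verifies the split could also double as result verification is the open question raised in the post.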
