Use DeDuplication in the files
Author | Message |
---|---|
cthame Joined: 20 Aug 06 Posts: 3 Credit: 2,097,551 RAC: 0 |
I am wondering: if we deduplicated the data files, couldn't processing or file transmission be quicker? |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
Deduplication compresses files, which adds overhead for compressing and decompressing. How would this make processing quicker? In fact, SETI@home is trying to find ways to make processing take longer, to keep the 150,000+ hosts from constantly pounding the servers for more work. Processing time was recently increased in SETI@home v7 by adding autocorrelation. |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Added to that, the SETI@home tasks are 367KB each, which is not too big. Even if you compress them, they are still 270KB, a reduction of only about 26%. The Astropulse tasks are 8MB each and can only be partially compressed. I tried 7zip LZMA Ultra: 7zip a -t7z -m0=lzma -mx=9 -mfb=256 -md=512m -ms=on compresses my 8,196KB test AP task to 5,666KB, a reduction of ~31%. This also depends on the AR of the task, as not all of them compress that well. The problem is that the compression method needs to be available on the four main platforms that BOINC supports: FreeBSD, Linux, Windows and Macintosh OS X. 7zip (7z) may cover all of those, but it still needs an external 7zip program before it can be used. That means we fall back to the next best thing, one that all of them support out of the box, which is plain zip. Using zip -m0=deflate -mx=9 -mfb=128 -md=32k -ms=on, the SETI@home v7 task compresses from 367KB to 266KB, while the AP task goes from 8,196KB to 8,096KB: reductions of about 28% and 1.2%. |
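
For anyone who wants to check the percentages or try this on their own task files, here is a rough Python sketch. It recomputes the reductions from the sizes quoted in the post above and compares Deflate (the method plain zip uses) against LZMA in memory. The function names `reduction_percent` and `compare_codecs` are made up for this illustration, and the codec settings are Python's standard-library defaults, not the exact 7zip switches quoted above, so real results will likely differ somewhat.

```python
import lzma
import zlib


def reduction_percent(original_size: int, compressed_size: int) -> float:
    """Return the size reduction as a percentage of the original size."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size


def compare_codecs(data: bytes) -> dict:
    """Compress data in memory with Deflate and LZMA; return the reduction per codec.

    Assumption: zlib level 9 stands in for 'plain zip' and the default lzma
    preset stands in for 7zip LZMA. Neither matches the quoted switches exactly.
    """
    if not data:
        raise ValueError("data must not be empty")
    sizes = {
        "deflate": len(zlib.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }
    return {name: reduction_percent(len(data), size) for name, size in sizes.items()}


# Sizes in KB as quoted in the post above: (before, after).
reported = {
    "SETI@home task, first figure": (367, 270),
    "SETI@home v7 task, deflate": (367, 266),
    "Astropulse task, LZMA": (8196, 5666),
    "Astropulse task, deflate": (8196, 8096),
}

for label, (before, after) in reported.items():
    print(f"{label}: {reduction_percent(before, after):.1f}% smaller")
```

To try it on a real task file, read the file as bytes and pass it in, for example `compare_codecs(open(path, "rb").read())`, where `path` is wherever your client stores the downloaded task.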
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.