News from the top - file compression
Author | Message |
---|---|
trux Send message Joined: 6 Feb 01 Posts: 344 Credit: 1,127,051 RAC: 0 |
You all seem to be going on about bandwidth here but where is the cpu power coming from to do the compression/decompression at the server end? As I explained, that will happen only once per four WU downloads, and once per ~830,000 2.6 MB application downloads, saving more than a terabyte of bandwidth. Although we can speculate about the impact of the additional CPU load on the servers from packing at WU downloads and/or result uploads, there is absolutely no question when it comes to application downloads. trux BOINC software Freediving Team Czech Republic |
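As a rough check on the figures in the post above, the arithmetic can be sketched as follows. The download count and application size come from the post; the compression ratio of about one half is an assumption borrowed from an estimate given later in this thread, and the variable names are purely illustrative.

```python
# Rough check of the bandwidth claim for application downloads.
APP_DOWNLOADS = 830_000   # approximate number of downloads quoted in the post
APP_SIZE_MB = 2.6         # application size quoted in the post
ASSUMED_RATIO = 0.5       # assumption: compressed size is about half the original

# Total traffic without compression, in decimal terabytes (1 TB = 1,000,000 MB).
uncompressed_tb = APP_DOWNLOADS * APP_SIZE_MB / 1_000_000

# Traffic avoided if every download shrinks by the assumed ratio.
saved_tb = uncompressed_tb * (1 - ASSUMED_RATIO)

print(f"uncompressed traffic: {uncompressed_tb:.2f} TB")
print(f"saved at ratio {ASSUMED_RATIO}: {saved_tb:.2f} TB")
```

Under that assumed ratio the saving comes out just above one terabyte, which is in line with the claim; a weaker ratio would bring it below.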
W-K 666 Send message Joined: 18 May 99 Posts: 19080 Credit: 40,757,560 RAC: 67 |
But how often do we need to do an app d/load? As far as I can remember, we have had at most three changes in the official app in the last year, so compressing the app will only save bandwidth on a few days/year. Most hosts only do one unit/day, and one way of saving bandwidth may be to allow those of us with more than one host, and especially the farmers, to download a new app once and copy it locally to all the user's hosts. |
John McLeod VII Send message Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
The host that does the ul/dl is, I believe, not normally CPU bound. It is occasionally I/O bound, and it is also sometimes disk bound (caused by a bug, hopefully never to happen again). BOINC WIKI |
Jim-R. Send message Joined: 7 Feb 06 Posts: 1494 Credit: 194,148 RAC: 0 |
The host that does the ul/dl is, I believe, not normally CPU bound. It is occasionally I/O bound, and it is also sometimes disk bound (caused by a bug, hopefully never to happen again). Ok, the disk usage for doing the compression may cause a problem then. So much for our cpu concerns! haha. But good news as far as the cpu then. Sounds like if disk problems can be conquered it might be worth it, at least for apps, but as was mentioned that's rare. A greater benefit to the I/O would be compressing wu's but it would put a lot more strain on the cpu and disk. Possibly not worth it? Jim Some people plan their life out and look back at the wealth they've had. Others live life day by day and look back at the wealth of experiences and enjoyment they've had. |
MikeSW17 Send message Joined: 3 Apr 99 Posts: 1603 Credit: 2,700,523 RAC: 0 |
But how often do we need to do an app d/load? As far as I can remember, we have had at most three changes in the official app in the last year, so compressing the app will only save bandwidth on a few days/year. Most hosts only do one unit/day, and one way of saving bandwidth may be to allow those of us with more than one host, and especially the farmers, to download a new app once and copy it locally to all the user's hosts. /EDIT (See edit at end!) Actually, in the past year there have been the following releases: 12/05/2005 22:39 6,288,477 boinc_4.39_windows_intelx86.exe; 13/05/2005 06:21 6,288,989 boinc_4.40_windows_intelx86.exe; 15/05/2005 10:43 6,288,989 boinc_4.41_windows_intelx86.exe; 17/05/2005 00:03 6,288,989 boinc_4.42_windows_intelx86.exe; 19/05/2005 19:36 6,290,013 boinc_4.43_windows_intelx86.exe; 27/05/2005 06:58 6,347,357 boinc_4.44_windows_intelx86.exe; 08/06/2005 19:36 6,341,725 boinc_4.45_windows_intelx86.exe; 08/07/2005 13:17 6,363,741 boinc_4.70_windows_intelx86.exe; 29/09/2005 10:16 10,496,089 boinc_5.1.5_windows_intelx86.exe; 11/10/2005 19:26 10,508,377 boinc_5.2.1_windows_intelx86.exe; 20/10/2005 21:09 10,507,865 boinc_5.2.2_windows_intelx86.exe; 28/10/2005 08:36 10,514,521 boinc_5.2.5_windows_intelx86.exe; 04/11/2005 19:39 10,515,545 boinc_5.2.6_windows_intelx86.exe; 26/11/2005 19:13 10,532,954 boinc_5.2.10_windows_intelx86.exe; 26/11/2005 23:05 10,541,658 boinc_5.2.11_windows_intelx86.exe; 19/12/2005 23:58 10,546,778 boinc_5.2.13_windows_intelx86.exe. That's 16. Over 130 MB of BOINC releases, and I may have missed a few short-lived ones. Admittedly, not everyone downloads every/most releases, but it's still a significant transfer load. /EDIT Oops! On re-reading I realise you're talking about the science app. Nevertheless, compression should certainly help on the one-off side, but I'm less convinced that it will help with WUs. |
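The release count and the "over 130 MB" total in the post above can be checked with a short sum. The byte sizes are copied from the listing; the dictionary name and the version-string keys are illustrative only.

```python
# Installer sizes in bytes, copied from the directory listing in the post.
release_sizes = {
    "4.39": 6_288_477,
    "4.40": 6_288_989,
    "4.41": 6_288_989,
    "4.42": 6_288_989,
    "4.43": 6_290_013,
    "4.44": 6_347_357,
    "4.45": 6_341_725,
    "4.70": 6_363_741,
    "5.1.5": 10_496_089,
    "5.2.1": 10_508_377,
    "5.2.2": 10_507_865,
    "5.2.5": 10_514_521,
    "5.2.6": 10_515_545,
    "5.2.10": 10_532_954,
    "5.2.11": 10_541_658,
    "5.2.13": 10_546_778,
}

release_count = len(release_sizes)
total_bytes = sum(release_sizes.values())

print(f"{release_count} releases, {total_bytes:,} bytes")
print(f"{total_bytes / 1_000_000:.1f} MB (decimal)")
print(f"{total_bytes / 2**20:.1f} MiB (binary)")
```

The total is about 134.7 million bytes, so "over 130 MB" holds when a megabyte is taken as one million bytes; counted in binary mebibytes the figure is slightly under 130.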
Tigher Send message Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
I'm not sure the app is important in this measurement exercise that's going on. I took just one of those apps last year - 5.2.13. What I do take every day is (a decent guess here) about 160 work units at something like 354 KB each. That's approaching 60 MB per day, every day. Compress it, I say, and the results going back too! Batch them too and stop giving in to "I want to see each result get credit as it completes"; I vote for one batch a day, all compressed. Now there is a serious saving in server overhead. |
Jim-R. Send message Joined: 7 Feb 06 Posts: 1494 Credit: 194,148 RAC: 0 |
But how often do we need to do an app d/load? As far as I can remember, we have had at most three changes in the official app in the last year, so compressing the app will only save bandwidth on a few days/year. Most hosts only do one unit/day, and one way of saving bandwidth may be to allow those of us with more than one host, and especially the farmers, to download a new app once and copy it locally to all the user's hosts. The greatest benefit as far as disk space and bandwidth (about 100 KB apiece) would be if it were possible to store the wu's on the disk as compressed files. However, since there are only four or a few more copies of each wu on the disk, the cpu usage for compressing all those files would be high. A slightly better situation would be if the splitter could compress the files (I'm assuming the splitter is a separate computer and has the cpu cycles to spare), but I understand the header of the wu has to be parsed for information to store in the database. Of course, here it might be possible to uncompress just the first couple hundred bytes (just enough to get the header info) without uncompressing the entire file. Also, I'm talking about compressing the wu's before they are even stored, not "on the fly" when they're sent. This would mean less disk space needed to store them, and the compression/storage could be done "off-peak". That would give the greatest return on the investment of cpu cycles. Jim Some people plan their life out and look back at the wealth they've had. Others live life day by day and look back at the wealth of experiences and enjoyment they've had. |
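The idea of uncompressing only the first couple hundred bytes can be illustrated with a streaming decompressor. This is a minimal sketch, assuming the work unit is stored as a single zlib stream; the thread does not say which compression format BOINC uses, and the header text, payload, and function name below are hypothetical stand-ins.

```python
import zlib


def read_header_prefix(compressed: bytes, max_bytes: int = 200) -> bytes:
    """Return at most max_bytes of decompressed data from the start of a zlib stream."""
    decompressor = zlib.decompressobj()
    # The second argument caps the amount of output produced, so the bulk
    # of the file is never inflated.
    return decompressor.decompress(compressed, max_bytes)


# Hypothetical work unit: a small XML header followed by a large payload.
header = b"<workunit_header><name>example_wu</name></workunit_header>\n"
payload = b"0123456789abcdef" * 20_000
compressed = zlib.compress(header + payload)

prefix = read_header_prefix(compressed)
print(prefix[: len(header)].decode())
```

In this sketch the whole compressed buffer is still held in memory; a real server would more likely read only the first block of the file from disk and feed that to the decompressor.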
Jim-R. Send message Joined: 7 Feb 06 Posts: 1494 Credit: 194,148 RAC: 0 |
I'm not sure the app is important in this measurement exercise that's going on. I took just one of those apps last year - 5.2.13. What I do take every day is (a decent guess here) about 160 work units at something like 354 KB each. That's approaching 60 MB per day, every day. The compression of the wu's would save about 1/3 in bandwidth, and if they were stored on disk in compressed format it would also save 1/3 of the disk space; however, the cpu power to compress all those files would be fairly large. It's been found, though, that compressing the application saves about 1/2 of the file size, so half the bandwidth of sending them out with just a slight one-time cpu jump. Very little cpu usage and negligible disk space savings, but a tremendous return on saving bandwidth. Wu's, however, would give greater savings overall on both disk space and bandwidth, but at a very high cost in cpu cycles. Can't get something for nothing! It's just a question of whether the saving in disk space and bandwidth is worth the extra expense in cpu cycles. Jim Some people plan their life out and look back at the wealth they've had. Others live life day by day and look back at the wealth of experiences and enjoyment they've had. |
Tigher Send message Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
Well, it's in 5.3.22, which has been released (non-stable test version!), so we will see or hear perhaps how successful it is or is not. Then again... we never hear much about anything, so let's not build our hopes up! I think it's likely to be pretty flaky, as I "think" each WU will be zipped as it is sent, so it may then be zipped 3 or 4 or 5 times. Unnecessary CPU time for sure, but saved bandwidth. No change on disc. As I say, I think this is how it's done. I might get the source and take a peek. |
PRouleau Send message Joined: 3 Apr 99 Posts: 15 Credit: 20,638,960 RAC: 0 |
Hi! I would like to have an update on this thread. Are the workunits sent in a compressed form? My problem is that we have a lot of idle CPU cycles at work, but we are limited by the bandwidth. With only 7 PCs, BOINC uses ~1 GB of our 20 GB/month. The dual cores boost the download frequency... It would be nice if the WUs were also saved in compressed form, but the HD space isn't a problem (on the client side). A compressed partition can do the job when needed. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Hi! I understand that the latest Alpha test versions of BOINC have this ability, but I don't know if any projects are planning to use it. In any event, how much use would it be here on SETI? The data we're searching is very close to random - that's why we have to use so much computing power to find patterns in it! And random files don't compress well. I recently used WinZip to compress a data file in case it might be useful for the optimisers to analyse one with a fairly rare AR. It went down from 354KB to 267KB, a 25% saving in bandwidth. Useful, but not the answer to all your problems. |
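To put the WinZip measurement above in context, the saving can be worked out per work unit and per day. The 354 KB and 267 KB sizes come from the post; the figure of 160 work units per day is one poster's estimate from earlier in this thread, not a general number, and the variable names are illustrative.

```python
WU_SIZE_KB = 354        # uncompressed work unit size quoted in the post
ZIPPED_SIZE_KB = 267    # size after WinZip, as reported in the post
WUS_PER_DAY = 160       # assumption: one poster's daily estimate from this thread

# Fraction of each download that compression would remove.
saving_fraction = 1 - ZIPPED_SIZE_KB / WU_SIZE_KB

# Daily download volume and daily saving for a host at that rate.
daily_kb = WUS_PER_DAY * WU_SIZE_KB
daily_saved_kb = WUS_PER_DAY * (WU_SIZE_KB - ZIPPED_SIZE_KB)

print(f"saving per WU: {saving_fraction:.1%}")
print(f"daily download: {daily_kb / 1000:.1f} MB, saved: {daily_saved_kb / 1000:.1f} MB")
```

The per-file saving works out to just under a quarter, matching the roughly 25% quoted.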
Benher Send message Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
The majority of the data in the WU files is also ASCII encoded binary data. (the first little part is XML). Can't be sure it would compress if still in binary form. |
John McLeod VII Send message Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
The majority of the data in the WU files is also ASCII encoded binary data. (the first little part is XML). Can't be sure it would compress if still in binary form. But ASCII compresses very nicely. BOINC WIKI |
Hans Dorn Send message Joined: 3 Apr 99 Posts: 2262 Credit: 26,448,570 RAC: 0 |
The majority of the data in the WU files is also ASCII encoded binary data. (the first little part is XML). Can't be sure it would compress if still in binary form. Sending the WUs in binary in the first place would compress even better :o) Regards Hans |
Benher Send message Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
The majority of the data in the WU files is also ASCII encoded binary data. (the first little part is XML). Can't be sure it would compress if still in binary form. Normally true, John, but not the seemingly random ASCII in one of the WUs. Open one yourself and have a look. |
Benher Send message Joined: 25 Jul 99 Posts: 517 Credit: 465,152 RAC: 0 |
The majority of the data in the WU files is also ASCII encoded binary data. (the first little part is XML). Can't be sure it would compress if still in binary form. Normally true, John, but not the seemingly random ASCII of an encoded WU. Open one yourself and look at the last hundred lines or so. |
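The disagreement in the last few posts (whether ASCII-encoded noise compresses, and whether sending raw binary would be smaller) can be tried out with a small experiment. This is only a sketch under stated assumptions: the thread does not say how work unit data is actually encoded, so base64 is used here as a stand-in, pseudo-random bytes stand in for the noise-like signal data, and zlib stands in for whatever compressor a project might use.

```python
import base64
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Stand-in for noise-like binary data (assumption: real WU data behaves similarly).
raw = bytes(rng.getrandbits(8) for _ in range(100_000))

# The same data carried as ASCII text (assumption: base64, 6 bits per character).
encoded = base64.b64encode(raw)

raw_z = len(zlib.compress(raw, 9))
encoded_z = len(zlib.compress(encoded, 9))

print(f"raw binary:      {len(raw):>7} bytes -> {raw_z:>7} compressed")
print(f"ASCII (base64):  {len(encoded):>7} bytes -> {encoded_z:>7} compressed")
print(f"ASCII compression ratio: {encoded_z / len(encoded):.2f}")
```

Both views in the thread hold up in this setting. The ASCII form does shrink, because each character carries fewer than eight bits of information, but the compressed text cannot get meaningfully smaller than the raw binary it encodes, and the raw noise itself barely compresses at all. A saving of roughly a quarter on the encoded text would be consistent with the WinZip result reported above, though that match alone does not show which encoding the real files use.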
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.