Message boards :
Number crunching :
Idea for Boinc.....
Rick · Joined: 3 Dec 99 · Posts: 79 · Credit: 11,486,227 · RAC: 0

Is the WU data compressed? If not, has anybody ever tried to compress the WU data to see if it would make much difference? I know compressing the data on the SETI side would take more resources, but I'm not sure how that would balance out against the reduced traffic.

I would also suggest changing the download process so the client makes only one connection and then downloads multiple files across it; that would also reduce the load on the router. If it meant fewer lost connections, the actual throughput would probably be better than what we're seeing now.
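The single-connection idea above is essentially HTTP keep-alive. A rough back-of-envelope sketch of why reusing one connection helps — the round-trip counts here are illustrative assumptions, not measured values:

```python
# Connection overhead when fetching many small files.
# Assumed (hypothetical) costs: 3 round trips to set up a TCP+HTTP
# connection, 1 round trip per file request once connected.
def round_trips(n_files, keep_alive):
    """Rough count of round trips to fetch n_files workunits."""
    if keep_alive:
        return 3 + n_files          # one handshake, then requests reuse it
    return n_files * (3 + 1)        # handshake repeated for every file

print(round_trips(10, keep_alive=False))  # 40
print(round_trips(10, keep_alive=True))   # 13
```

Under these assumptions, ten small downloads cost three times as many round trips without connection reuse, which is also why lost connections hurt throughput so much.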
kittyman · Joined: 9 Jul 00 · Posts: 51468 · Credit: 1,018,363,574 · RAC: 1,004

> Is the WU data compressed? If not, has anybody ever tried to compress the WU data to see if it would make much difference?

I believe it's been tested, and compressing the actual WUs results in very little gain.

"Freedom is just Chaos, with better lighting." Alan Dean Foster
Rick · Joined: 3 Dec 99 · Posts: 79 · Credit: 11,486,227 · RAC: 0

> Is the WU data compressed? If not, has anybody ever tried to compress the WU data to see if it would make much difference?

Well, it was worth a shot, but after this many years I should have guessed that someone would have looked into it by now.
Arvid Almstrom · Joined: 23 Mar 00 · Posts: 98 · Credit: 137,331,372 · RAC: 0
janneseti · Joined: 14 Oct 09 · Posts: 14106 · Credit: 655,366 · RAC: 0

> You have to remember that graph is inside-out.... The green is downloads going out. The blue line is uploads and work requests going in.

OK. Help me out, since I'm not a native English speaker. You say downloads are going out and uploads are going in. Please answer the following questions: if Alice sends a message to Bob, does she download the message? If I receive a message from Bob, do I upload the message?
Dimly Lit Lightbulb 😀 · Joined: 30 Aug 08 · Posts: 15399 · Credit: 7,423,413 · RAC: 1

What if every host was only allowed to connect to the project once a certain percentage of its cache had been depleted, and the update button did nothing? For example, with a ten-day cache, BOINC would only connect to the server — to simultaneously upload, report, and get new workunits — after one day of cache had been depleted. If your cache hasn't been filled, maybe BOINC could be limited to connecting just once an hour, or even less often if necessary. Except for when a host was out of workunits.
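A minimal sketch of that throttling rule — the one-day threshold, the function name, and the out-of-work exception are all taken from the post above, not from anything in the actual BOINC client:

```python
def should_contact_server(depleted_days, threshold_days=1.0, out_of_work=False):
    """Allow a scheduler contact only when enough of the cache has drained,
    or when the host has run completely dry (the exception in the post)."""
    if out_of_work:
        return True
    return depleted_days >= threshold_days

print(should_contact_server(0.2))                    # False: cache barely touched
print(should_contact_server(1.5))                    # True: a day's work depleted
print(should_contact_server(0.0, out_of_work=True))  # True: empty host may connect
```

The appeal of a rule like this is that hosts batch their traffic into fewer, larger transactions instead of hammering the scheduler with small requests.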
kittyman · Joined: 9 Jul 00 · Posts: 51468 · Credit: 1,018,363,574 · RAC: 1,004

No, the opposite. If you receive a file onto your computer, you are downloading it from another computer or server. If you send a file out of your computer, you are uploading it to another computer or server.

"Freedom is just Chaos, with better lighting." Alan Dean Foster
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0

> Is the WU data compressed? If not, has anybody ever tried to compress the WU data to see if it would make much difference?

MB WUs compress to about 75 to 85% of their original size, because the data format is BASE64 and there's a fairly large XML header section. AP WUs have a smaller header and the data is pure binary, so compression gains much less.

Joe
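Joe's ~75% figure follows from BASE64 itself: each 8-bit character carries only 6 bits of information, so even incompressible binary shrinks to roughly three-quarters once BASE64-coded. A quick illustration with Python's `zlib` — random bytes stand in for telescope data here; this is not actual WU content:

```python
import base64
import os
import zlib

raw = os.urandom(30000)           # stand-in for pure binary (AP-style) data
b64 = base64.b64encode(raw)       # stand-in for BASE64-coded (MB-style) data

ratio_raw = len(zlib.compress(raw, 9)) / len(raw)
ratio_b64 = len(zlib.compress(b64, 9)) / len(b64)
print(round(ratio_raw, 2))        # ~1.0: random binary doesn't compress
print(round(ratio_b64, 2))        # ~0.76: the BASE64 redundancy is recovered
```

This matches the post: compressing MB work mostly just undoes the BASE64 expansion, while binary AP data has little redundancy left to squeeze.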
kittyman · Joined: 9 Jul 00 · Posts: 51468 · Credit: 1,018,363,574 · RAC: 1,004

> MB WUs compress to about 75 to 85% of their original size, because the data format is BASE64 and there's a fairly large XML header section.

Interesting.... I thought that MB work was almost incompressible too.

"Freedom is just Chaos, with better lighting." Alan Dean Foster
janneseti · Joined: 14 Oct 09 · Posts: 14106 · Credit: 655,366 · RAC: 0

Thanks, msattler. However, the cricket graph still makes me wonder. What is gigabitethernet2_3, and who is sending and receiving what?
John McLeod VII · Joined: 15 Jul 99 · Posts: 24806 · Credit: 790,712 · RAC: 0

gigabitethernet2_3 carries SETI@home upload, download, and possibly scheduler traffic (work requests and reports). There are two paths into SSL (Space Sciences Lab): the one that carries the web pages goes through the normal Berkeley network, and the one that carries uploads and downloads goes through Hurricane Electric. The cricket graph is for a switch at Hurricane Electric, in the Hurricane Electric building, facing SSL.

Since "in" and "out" are from the point of view of a router/switch facing SSL, they have to be swapped to match the point of view of SSL. Think of it this way: the port side of a boat is to your left if you are in the stern facing the bow. If you turn around, so you are looking at the wake, the port side of the boat will be on your right.

BOINC WIKI
W-K 666 · Joined: 18 May 99 · Posts: 19048 · Credit: 40,757,560 · RAC: 67

gigabitethernet2_3 is the Berkeley campus designation, and the cricket graphs are generated by the Berkeley campus. Therefore the cricket graphs show downloading from the SETI servers in green and uploads to the SETI servers in blue.
tbret · Joined: 28 May 99 · Posts: 3380 · Credit: 296,162,071 · RAC: 40

> Why not make the workunits bigger?

Here's an idea: what if they just didn't send VHAR tasks at all and processed them on a VHAR-optimized cruncher or two locally? I'm seeing completion times under 2:00 on some computers, and under 4:00 on many, many (and I assume these are running two or three at a time).

Say eight GTX 590s running six units each (48) completing every three minutes: that's 20× that per hour, so 960 WUs/hr, or 23,040 WUs per day that wouldn't have to be sent, reported, and retrieved... ×2 for replication makes 46,080 WUs of bandwidth @ 366.55 KB, roughly 16 GB/day downloaded — and results are how big? Do that twice and you've got 32 GB/day of bandwidth opened up while crunching 92,000 VHARs a day.

Well, whatever the real number is, it's a lot of bandwidth, a lot of CPU cycles wasted trying to download and upload, a lot of scheduler tasks, and a lot of everything else. If nothing else, crunching them locally would save half the bandwidth, because I assume they wouldn't need duplication for verification. Now if I just knew what portion of the 40-50k results every hour were VHARs, I'd know if I'm slowing things down or speeding things up for the project as a whole.

And then there is the question: would anything be left for us to crunch? We would "hit" the servers a WHOLE lot less frequently if we were doing nothing but larger tasks. As it is, I've got a pot-load of under-4:00 tasks I've been downloading, crunching, and uploading.

Maybe one of these? http://www.youtube.com/watch?NR=1&v=PZ3CJw-u2cc Yeah, I know. But it looks so cool.

I've just got this feeling that we're not thinking outside the box and there's some very simple principle that could be applied to make things much better very quickly. I just "feel" like there's an "ah-HA!" to be had.
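The back-of-envelope arithmetic in the post above checks out. Redone with the post's own numbers (the replication factor of 2 is the post's assumption about verification):

```python
KB_PER_TASK = 366.55              # download size per task, from the post
tasks_at_once = 8 * 6             # eight GTX 590 hosts running six units each
per_day = tasks_at_once * (60 // 3) * 24   # one batch every 3 minutes
with_replication = per_day * 2    # each WU normally goes out to two hosts
gb_per_day = with_replication * KB_PER_TASK / (1024 * 1024)
print(per_day, with_replication, round(gb_per_day, 1))  # 23040 46080 16.1
```

So the ~16 GB/day figure (and ~32 GB/day for two such farms) is consistent with the stated task size and completion rate.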
MarkJ · Joined: 17 Feb 08 · Posts: 1139 · Credit: 80,854,192 · RAC: 5

Einstein does locality scheduling, meaning that once you have some work files, you can generate your own WUs from them; this saves downloading data all the time. Once the scheduler has worked out that all the work units for a work file have been done, it instructs the BOINC client to delete it. Don't know if it's at all possible to do here.

I would also suggest gzip for the scheduler requests to reduce their traffic; while not a big saving in bandwidth, every little bit helps.

BOINC blog
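Scheduler requests are repetitive XML, which is exactly what gzip is good at. A toy illustration — the XML here is made up for the example, not a real BOINC scheduler request:

```python
import gzip

# Hypothetical, repetitive XML standing in for a scheduler request body.
body = (b"<scheduler_request><hostid>123</hostid>"
        b"<work_req_seconds>8640</work_req_seconds>"
        b"</scheduler_request>\n") * 20
packed = gzip.compress(body)
print(len(packed) < len(body) // 4)  # True: repetitive XML shrinks a lot
```

As the post says, each individual request is small, so the absolute saving is modest; the win is per-request, multiplied across every host contacting the scheduler.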
janneseti · Joined: 14 Oct 09 · Posts: 14106 · Credit: 655,366 · RAC: 0

Thanks all for the help. I can now see that Berkeley uses all the bandwidth all the time. The bottom line is that Berkeley lacks funding, and we all have to wait some years until they can get all the bandwidth they (we) want.
Sami · Joined: 12 Aug 99 · Posts: 38 · Credit: 12,671,175 · RAC: 4

> Why not make the workunits larger in data size?

RNA World beta did something like that: they combined many small WUs into one big WU. One reason was to reduce server load. http://www.rnaworld.de/rnaworld/forum_thread.php?id=76 Is this something you meant?
MarkJ · Joined: 17 Feb 08 · Posts: 1139 · Credit: 80,854,192 · RAC: 5

> Why not make the workunits larger in data size?

They also did this for the Einstein CUDA work. The tasks are actually 8 WUs combined: to process one, you download 8 × 4 MB data files, which are processed serially and uploaded as 8 result files.

BOINC blog
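Bundling like this trades per-file overhead for one larger transfer. A rough sketch of the effect on the number of server contacts — the bundle size of 8 comes from the post; the task count is an illustrative assumption:

```python
import math

def contacts(n_tasks, bundle_size):
    """Server contacts needed when tasks ship in bundles of bundle_size."""
    return math.ceil(n_tasks / bundle_size)

# 64 tasks fetched singly vs. in Einstein-style bundles of 8:
print(contacts(64, 1), contacts(64, 8))  # 64 8
```

The total bytes of science data are the same either way; what bundling saves is scheduler transactions, connections, and per-request overhead.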
W-K 666 · Joined: 18 May 99 · Posts: 19048 · Credit: 40,757,560 · RAC: 67

> Einstein does locality scheduling, meaning that once you have some work files, you can generate your own WUs from them; this saves downloading data all the time.

Until you get to the end of a run and you are on clean-up duty, processing the odd ones not completed yet. Then you download those large files just to extract one unit for processing, and when that one task is complete, the order is to delete those massive downloads.
janneseti · Joined: 14 Oct 09 · Posts: 14106 · Credit: 655,366 · RAC: 0

> Why not make the workunits larger in data size?
janneseti · Joined: 14 Oct 09 · Posts: 14106 · Credit: 655,366 · RAC: 0

> Why not make the workunits larger in data size?

Good day (Hyvää päivää). Exactly. However, the Berkeley network load is probably too high, and the crunchers' demand is more than they can handle. But as they say, every bit counts.
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.