Message boards :
Number crunching :
Idea for Boinc.....
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Just pondering.... I know that bandwidth conservation may not be an issue for other Boinc projects, but it is for Seti. At least right now. I wonder if..... With the increasing number of fast hosts and faster GPUs, cache sizes regardless of a user's settings are only going to increase. Every time a host reports a completed WU or requests work, their entire cache list is sent along with the request. When cache sizes get into the 1000's of WUs, this can be a sizable transmission. Would it be possible or practical for Boinc to transmit only information relating to what has changed in the list since the last host update, and not the entire list every time? And perhaps the entire list only every 10 completed requests to be sure everything stays in synch? Would this be cumbersome to implement? Would the bandwidth saved be worth the effort? Just asking for trouble? Just kitty thoughts. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
I agree with this. Even when downloads/uploads are off, the feeder is empty, etc., there is still a decent amount of traffic just from people trying to get new work. Perhaps a quick comparison could be done between the host and the server of just the NUMBER of WUs on board, instead of sending the entire task list. Then, if there is a discrepancy (see: ghosts), the full list could be sent to see which tasks are missing. The server already has to query the DB to compare task lists; surely it wouldn't be hard to just compare a number. I can't imagine it would add much, if any, extra load to the servers, and it would surely save some bandwidth. Then it would just be: "Requesting new tasks, have 4250 on board." "ACK, 4250 on board, DB agrees. Here's more tasks." Or: "Requesting new tasks, have 4250 on board." "Error: should be 4275 on board. Sending entire task list." etc. |
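Blake's count-comparison handshake could look something like the sketch below (a hedged mock-up; the function and message strings are invented, not real BOINC scheduler code). The cheap integer comparison happens on every request, and the expensive full-list reconciliation only on a mismatch:

```python
# Toy model of the count-first handshake (invented names, not BOINC code).

def handle_request(client_count, db_task_names, full_list=None):
    """Return (reply message, tasks that need resending)."""
    if client_count == len(db_task_names):
        # Cheap path: one integer comparison, no list needed.
        return "ACK: counts agree, sending new tasks", []
    if full_list is None:
        # Mismatch (e.g. ghost tasks): demand the full list next time.
        return (f"ERROR: should be {len(db_task_names)} on board, "
                "send full task list", [])
    # Full list supplied: work out exactly which tasks went missing.
    missing = sorted(set(db_task_names) - set(full_list))
    return "Resending lost tasks", missing
```

Only hosts with ghosts ever pay for the full-list exchange; everyone else sends a single number.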
musicplayer Send message Joined: 17 May 10 Posts: 2430 Credit: 926,046 RAC: 0 |
My guess is that I am in fact running tasks remotely which are stored, or "downloaded", to "my" user area on the download or workunit storage server. The only fields still missing before the result finishes are CPU time, power, and score for the four elements, as well as a couple of other fields. The upload of the finished result should update only those fields. |
AndyJ Send message Joined: 17 Aug 02 Posts: 248 Credit: 27,380,797 RAC: 0 |
Would it be possible or practical for Boinc to transmit only information relating to what has changed in the list since the last host update, and not the entire list every time? And perhaps the entire list only every 10 completed requests to be sure everything stays in synch? Elegant idea. Can't be that simple, can it.......? Here's hoping it is. Regards, Andy |
John McLeod VII Send message Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
Would it be possible or practical for Boinc to transmit only information relating to what has changed in the list since the last host update, and not the entire list every time? And perhaps the entire list only every 10 completed requests to be sure everything stays in synch? Probably a bit of a pain to implement though. BOINC WIKI |
Blake Bonkofsky Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0 |
It would surely require a client side update, as well as compatible server side changes, but I wouldn't think it would be TOO bad... |
Josef W. Segur Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
The server code is prepared to handle requests which don't have those lists of tasks on hand, some very old versions of the core client don't send them. So it's certainly possible to have a new core client which only sends the lists once in every 5 or 10 requests, etc. Note that requests without the lists wouldn't be useful for the resend lost work capability, but maybe having only a fraction of requests triggering the comparison would reduce the database load it causes to a negligible amount. However, another possibility to consider is having the request gzipped before sending. That would reduce the size considerably, and the method of sending a request is very similar to uploading a result (which already has an optional gzip feature). A 19236 byte request listing 37 in progress compresses to 3301 bytes, for instance. Joe |
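Joe's compression point is easy to demonstrate: a scheduler request is mostly repetitive XML, which gzip squeezes very well. The request body below is a mock-up (the tag names and workunit names are invented for illustration), not a real BOINC scheduler request:

```python
import gzip

# Build a mock scheduler request listing 37 in-progress tasks.
# Real requests differ in detail, but are similarly repetitive XML.
tasks = "".join(
    f"<other_result><name>wu_{i:05d}.vlar_1</name></other_result>\n"
    for i in range(37)
)
request = f"<scheduler_request>\n{tasks}</scheduler_request>\n".encode()

compressed = gzip.compress(request)
print(len(request), len(compressed))  # compressed is several times smaller
```

The exact ratio depends on the content, but highly repetitive task lists routinely compress to a fraction of their size, consistent with Joe's 19236 → 3301 byte example.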
John McLeod VII Send message Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
The server code is prepared to handle requests which don't have those lists of tasks on hand, some very old versions of the core client don't send them. So it's certainly possible to have a new core client which only sends the lists once in every 5 or 10 requests, etc. The way it would have to work would be something like:
1. The client requests work and reports the count of tasks. It would also have a flag that indicates that it can do this.
2. The server would see the flag, and notice a discrepancy in the count.
3. The server would pick some tasks that were already allocated to the client to send (if done with a reasonable amount of intelligence, this MIGHT avoid the next round trip). The server would add a flag to indicate that the client is missing some tasks.
4. The client, upon receiving the tasks, sets a flag for the project noting that on the next connection it would have to report the full list.
5. The client would inspect the tasks that the server has suggested it can work on, and pitch any that it already has.
6. The client would then determine if it had to request more work.
If the server receives a list rather than just a count, it behaves pretty much as it does today. However, if the work request does not ask for enough tasks to use up all of the tasks already allocated to the client, the flag to report the list of tasks would be sent to the client. (Yes, this could cycle several times before the tasks are either sent to someone else or all delivered to the client.) BOINC WIKI |
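The resend half of the flow above can be modelled in a few lines (a toy sketch with invented names, not BOINC code): on a count mismatch the server resends tasks it believes the client holds, and the client pitches duplicates before deciding whether to ask for more:

```python
# Toy model of the count-mismatch resend flow (invented names).

def server_reply(client_count, allocated):
    """allocated: set of task names the DB says this host holds."""
    if client_count == len(allocated):
        return {"resend": [], "need_full_list": False}
    # Discrepancy: resend allocated tasks (here, naively, all of them)
    # and flag the client to report its full list next connection.
    return {"resend": list(allocated), "need_full_list": True}

def client_merge(on_board, reply):
    """Pitch tasks we already have; keep only genuinely new ones."""
    new = [t for t in reply["resend"] if t not in on_board]
    return on_board | set(new), reply["need_full_list"]
```

A smarter `server_reply` would resend only the tasks most likely to be ghosts, which is John's "reasonable amount of intelligence" step.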
janneseti Send message Joined: 14 Oct 09 Posts: 14106 Credit: 655,366 RAC: 0 |
Why not make the workunits bigger? Every server at Seti would benefit from it. |
musicplayer Send message Joined: 17 May 10 Posts: 2430 Credit: 926,046 RAC: 0 |
Bigger, you say? Probably not a good idea, in my opinion. So what about those VLARs then, do they return anything of interest? Should we perhaps concentrate on those locations where the numbers found were a little better (if we are still listening for signals from the stars)? We could have fewer results returned, but possibly better results instead. But then we would not be able to track those elusive UFOs that are supposed to be visiting us from time to time (believe it or not). |
janneseti Send message Joined: 14 Oct 09 Posts: 14106 Credit: 655,366 RAC: 0 |
Why not make the workunits bigger? I'll try again. Why not make the workunits larger in data size? Every server at Seti would benefit from it. Instead of about 300 Kbyte files per WU, increase them to, let's say, 1 Mbyte. That would cut the number of workunits to less than a third. The network load for WUs surely must drop. |
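Working janneseti's numbers through (the daily volume figure is purely an assumption for illustration; only the per-WU sizes come from the post): the same total data volume carved into 1 MB files instead of 300 KB files means under a third as many workunits, and hence under a third as many scheduler requests:

```python
# Same total data volume, larger files: how does the WU count change?
wu_size_now = 300 * 1024          # ~300 KB per workunit (from the post)
wu_size_big = 1 * 1024 * 1024     # ~1 MB per workunit (proposed)

daily_volume = 500 * 10**9        # assumed daily send volume, illustrative

wus_now = daily_volume / wu_size_now
wus_big = daily_volume / wu_size_big
print(wus_big / wus_now)          # ~0.29: count drops to under a third
```

Note that the payload bandwidth itself is unchanged; only the number of requests, and their per-request overhead, falls.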
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Why not make the workunits bigger? That depends on the time required for the hosts to crunch the WU. Simply increasing their size will not reduce server load if the crunch time per Kbyte of data remains the same. AP is a good example... the WUs are much larger, but the time required to process them does not increase in proportion to their size, so they are actually harder on bandwidth than MB work. Of course, VHAR MB work, with its very quick processing times, is even worse. The only thing that would change the ratios is the science application doing more work on the data sent. I am not sure how the new version of Seti being rolled out later this year compares in terms of processing time per WU sent. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
janneseti Send message Joined: 14 Oct 09 Posts: 14106 Credit: 655,366 RAC: 0 |
I don't believe it's a bandwidth problem; rather, network requests are coming in too often to SETI because of the large number of WUs. The solution is to make fewer WUs by making them larger. The crunching time on the volunteers' computers has no relevance to this problem. Meeeooowwwrrr. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
I don't believe it's a bandwidth problem; rather, network requests are coming in too often to SETI because of the large number of WUs. The solution is to make fewer WUs by making them larger. The crunching time on the volunteers' computers has no relevance to this problem. If you look at the cricket graph, you'll see that bandwidth is indeed one of the many limiting factors: we've been hard against the top peg for 23 hours solid now. You are also correct about the number of network requests being a strain on the servers, but you would need to solve both halves of the problem (and probably many more besides) to make a significant difference. |
W-K 666 Send message Joined: 18 May 99 Posts: 19014 Credit: 40,757,560 RAC: 67 |
For over 12 years the data size has been the same, and after we have processed them the results are stored in the science database, with up to 30 items of interest for each result. If the data size were increased, do we still look for only a maximum of thirty items of interest and throw away any other items found, OR do you have some magic way of storing these larger tasks (twice the size, 60 items, etc.) that would be compatible with all the older data? |
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
One AstroPulse task took 8 retries and 45 minutes to download, and MB work downloads are also very slow and need several retries. Bigger tasks don't seem like a good idea; zipping them might be, if it's worth the effort. (I noticed CPDN was also slow and took several retries to download its (big) WUs.) I also lowered my cache from 4 to 3 and 2 days, depending on the host. When BOINC (6.12.33/34) asks for work and there is no work, a new request is made a few minutes later, and it takes some time before it has downloaded enough, or it waits over an hour before requesting again. |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
For over 12 years the data size has been the same, and after we have processed them the results are stored in the science database, with up to 30 items of interest for each result. I have seen it mentioned several times that they have considered processing MB work when AP tasks are downloaded, since there are about 20 MB tasks in the same data that generates 1 AP task. I think it is just a matter of working out the logic on the back end. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
skildude Send message Joined: 4 Oct 00 Posts: 9541 Credit: 50,759,529 RAC: 60 |
That number is about 30, IIRC. I would suggest that, since we already label tasks as VLAR, why not simply increase the size of just the VHAR WUs 3-4 fold? That would cut down the number of requests for work and keep the GPUs crunching away on a single WU instead of starting and stopping every couple of minutes. In a rich man's house there is no place to spit but his face. Diogenes Of Sinope |
janneseti Send message Joined: 14 Oct 09 Posts: 14106 Credit: 655,366 RAC: 0 |
I'm a bit confused now. I have looked at the cricket graph and got this:
Graphs for gigabitethernet2_3, values at last update:
Average bits in (for the day): Cur: 92.94 Mbits/sec, Avg: 93.57 Mbits/sec, Max: 95.96 Mbits/sec
Average bits out (for the day): Cur: 12.13 Mbits/sec, Avg: 12.55 Mbits/sec, Max: 13.70 Mbits/sec
Last updated at Fri Oct 7 14:05:18 2011
Is gigabitethernet2_3 only for SETI, or is it for Berkeley University and others? To me it looks like they are sending 8 times more data than they are getting. My contribution is getting 360 KB and sending 30 KB per WU, that is, I'm sending less data than I get. Like you said, it's indeed a bandwidth problem. To me it seems there are other projects than SETI that maybe have to look over their routines. |
kittyman Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
I'm a bit confused now. You have to remember that graph is inside-out.... The green is downloads going out. The blue line is uploads and work requests going in. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.