More AP WU's please! |
Message boards : Number crunching : More AP WU's please!
| Author | Message |
|---|---|
|
Any idea on when we'll be getting more AP work to crunch? | |
| ID: 897261 · | |
Any idea on when we'll be getting more AP work to crunch? The kitties also wanna know.... They still have a little cache to keep the crunchers warm, but it won't hold out forever... And whilst we are on the subject....any projection on the release of the new version of AP being beta'd? ____________ ****** "Ask not, what your kitty can do for you. Ask what you can do for your kitty." As it is kitten, so shall it be done. | |
| ID: 897275 · | |
|
This is interesting. When the Beta Group got AP ready for Seti Main, Seti could not get anyone to Crunch AP. If I look at the "Server Status Page" we see AP all but complete (with trickles and resends) and tons of images (134 currently) trying to split/complete MultiBeam WU's | |
| ID: 897280 · | |
This is interesting. When the Beta Group got AP ready for Seti Main, Seti could not get anyone to Crunch AP. If I look at the "Server Status Page" we see AP all but complete (with trickles and resends) and tons of images (134 currently) trying to split/complete MultiBeam WU's Mebbe us AP folks figured the Cuda crowd was gonna gobble up all that MB work....LOL. ____________ ****** "Ask not, what your kitty can do for you. Ask what you can do for your kitty." As it is kitten, so shall it be done. | |
| ID: 897291 · | |
|
If only it were that easy... :) This is interesting. When the Beta Group got AP ready for Seti Main, Seti could not get anyone to Crunch AP. If I look at the "Server Status Page" we see AP all but complete (with trickles and resends) and tons of images (134 currently) trying to split/complete MultiBeam WU's ____________ Please consider a Donation to the Seti Project. | |
| ID: 897310 · | |
|
I'm doing my part! :) | |
| ID: 897312 · | |
So we end up with a dilemma: some users have focused entirely on AP for various reasons, and MultiBeam has fallen behind. To an extent this is happening while tape images are being retrieved from off-site storage; see Matt's Tech post Countdown (May 18 2009). With 134 images loaded locally and, as Joe states, ~220000 MB WU's/image, that indicates MultiBeam is roughly 29480000 WU's behind. My feeling is that users should focus on eating as many of the MultiBeam WU's as they can. Hey, up to a couple of weeks ago I had done nothing but MB... tried CUDA and dropped it just as fast... | |
| ID: 897316 · | |
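For anyone who wants to check the backlog figure above, here is a minimal sketch of the arithmetic. It only uses the two numbers quoted in the post (134 images, ~220000 MB WU's per image); the function and constant names are made up for illustration.

```python
# Back-of-envelope MultiBeam backlog, using only the figures quoted in the thread.
IMAGES_WAITING = 134          # tape images loaded locally (from the post)
MB_WUS_PER_IMAGE = 220_000    # approximate MB workunits per image (Joe's figure)


def estimated_backlog(images: int, wus_per_image: int) -> int:
    """Return the approximate number of MB workunits still to be split."""
    return images * wus_per_image


backlog = estimated_backlog(IMAGES_WAITING, MB_WUS_PER_IMAGE)
print(f"{backlog:,}")  # 29,480,000
```

Since the per-image figure is itself approximate, the result is only good to a couple of significant digits.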
I'm doing my part! :) With MultiBeam roughly 29480000 WU's behind, that takes us down to ~29475000. LOL ____________ Please consider a Donation to the Seti Project. | |
| ID: 897318 · | |
|
Looks like I'll be helping out with the MBs here shortly. Right now I'm running two APs and a Cuda. After these two I have one more AP. Today the server has been feeding me 6.08s for my cuda and 6.03s for my CPUs. It's been a while since I've seen this. The server has been just feeding me APs along with my Cudas. | |
| ID: 897319 · | |
So we end up with a dilemma: some users have focused entirely on AP for various reasons, and MultiBeam has fallen behind. To an extent this is happening while tape images are being retrieved from off-site storage; see Matt's Tech post Countdown (May 18 2009). With 134 images loaded locally and, as Joe states, ~220000 MB WU's/image, that indicates MultiBeam is roughly 29480000 WU's behind. My feeling is that users should focus on eating as many of the MultiBeam WU's as they can. Depending on the computer and video card, Cuda is not for everyone, not counting some of the issues getting it working properly. The generally recommended version of Boinc is 6.6.20 (which has its own story), with 182.5 drivers, and when the Server recognizes it, it "should" send Cuda. Because of how the Server is configured, it usually tells Boinc that the Cuda Application should use x.xxx percent of the CPU and as much of the Video card as it can grab. This sometimes overpowers the Video Card. What you end up with is something "odd" that may seem unworkable. In the case of the Optimized Cuda, it tells Boinc and the Cuda Application to back off a bit in the use of the CPU and uses the video card more effectively. Then you end up with something that is "workable" or a reason to disable Cuda. No, Cuda is not for everyone. ____________ Please consider a Donation to the Seti Project. | |
| ID: 897324 · | |
|
Once upon a time there was a flock of ugly ducklings called APs. And there were many calls from the masses that AP be dropped or disabled by default. Almost everyone wanted more of the ducks called MBs, which are especially delicious served with CUDA sauce. | |
| ID: 897631 · | |
This is interesting. When the Beta Group got AP ready for Seti Main, Seti could not get anyone to Crunch AP. If I look at the "Server Status Page" we see AP all but complete (with trickles and resends) and tons of images (134 currently) trying to split/complete MultiBeam WU's As I have stated several times over the past few months, I would love to be able to crunch MB and "help the problem"; however, when things are running normally, the scheduler will not send me MBs if I have AP selected as an allowed application. From the testing I have done, it appears that part of the scheduler logic is something like: if (CUDA == yes) give MB; else give AP; I don't know the validity of that statement, but that's what it seems like. If I go to an MB-only venue, I obviously get nothing but MBs. Before the APs ran out, there were times when the ready-to-send queue was running low and I requested work, and the messages tab said "no jobs available", but I switched to the MB venue before the next work fetch retry and got 50 MBs in short order. So the bottom line is, there is still some scheduler logic that seems to only want to give MB out to the CUDA folks. This may need some deeper analysis, but I've done all I can do from the client-side, of course. However, now that APs are pretty hard to come by, the scheduler does go ahead and give me some MBs, but I'm sure it really does not want to. ____________ Linux laptop uptime: 1484d 22h 42m Ended due to UPS failure, found 14 hours after the fact | |
| ID: 897641 · | |
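The behaviour guessed at in the post above can be written out as a small sketch. This is only a model of what the poster observed from the client side; it is not the real BOINC scheduler code. The function name, the parameters and the `ap_available` flag are hypothetical, and as the next reply points out, the guess itself may not be entirely accurate.

```python
from typing import Optional


def guessed_app_choice(host_has_cuda: bool,
                       allowed_apps: set,
                       ap_available: bool = True) -> Optional[str]:
    """Model of the poster's guess: CUDA hosts get MB, everyone else gets AP.

    Hypothetical illustration only, not actual scheduler logic.
    `allowed_apps` is the set of applications enabled for the venue,
    e.g. {"MB", "AP"} or {"MB"}.
    """
    if host_has_cuda and "MB" in allowed_apps:
        return "MB"                      # "if (CUDA == yes) give MB"
    if "AP" in allowed_apps and ap_available:
        return "AP"                      # "else give AP"
    if "MB" in allowed_apps:
        return "MB"                      # MB-only venue, or AP has run dry
    return None                          # nothing this host may receive


print(guessed_app_choice(False, {"MB", "AP"}))                       # AP
print(guessed_app_choice(False, {"MB"}))                             # MB
print(guessed_app_choice(False, {"MB", "AP"}, ap_available=False))   # MB
```

The last two calls correspond to the two observations in the post: switching to an MB-only venue produces MBs, and once APs are hard to come by the scheduler falls back to MB.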
|
The server logic you stated as............ | |
| ID: 897650 · | |
... Not sure this is entirely accurate. 6.6.28 makes separate work requests for CUDA and non-CUDA. The non-CUDA work requests should be filled with a (20 / 1 ?) mix of MB / AP but always seem to be AP under "normal" running conditions. Not that I'm complaining - just clarifying the facts as I see them. F. ____________ | |
| ID: 897661 · | |
|
Well.....all I know is there are a couple of datasets in the splitters right now, so at least some new AP work is being created.... | |
| ID: 897670 · | |
|
Just wondering out loud. | |
| ID: 897708 · | |
|
I am using SuSE Linux 10.3 and its firewall, no antivirus. Recently I noticed attempts to connect to my box using ssh and I killed the sshd daemon. | |
| ID: 897710 · | |
|
Still one AP dataset splitting.... | |
| ID: 897728 · | |
|
Chelski wrote: ... Two factors you missed make the ratio very close to 40x. Although the mb_splitters make groups of WUs with 107.37 second duration they only advance about 85.7 seconds for the next group; the time overlap of about 20 seconds ensures we don't miss signals at the seams between WUs. AP WUs don't have any overlap; for short single pulses it wouldn't add any significant value. Second, data is packed 6 bits per byte in MB (plus there's a newline after each 64 data bytes), but full 8 bits per byte in AP. I don't think the actual crunch rate is this high on average (of course, if we sample now, it could be higher because we are starving the project of AP WUs). And overall we need to get the ratio to be bigger than 22:1 to start correcting the backlog. A back-of-envelope estimate: my cuda machine does 80 MBs per day on CUDA and 4 APs per day (a 20:1 ratio), so yes, I think I'm part of the problem. It will be interesting to see whether some of the crunchers with Dreadnought class CUDA machines are doing much better. The only sensible thing for the project is to adjust work delivery so a given set of 'tapes' is handled at nearly the same rate by both sets of splitters. If everything were working as it should, they'd deliver 40 MB tasks for every AP task. That would take 3 slots for AP_v5 and 96 for MB in the 100 shared memory slots between the Feeder and Scheduler (and 1 for any original AP reissues). But there are more tasks needed to get an AP_v5 WU finished, so 5 slots for AP_v5 and 94 for MB would allow for the needed reissues. I'd expect the AP_v5 reissue rate to drop, then its number of slots could be further reduced. Note that having 94 or 96 slots for MB should mean fairly rare cases where the MB slots get emptied between Feeder cycles. The delivery rate for MB should go up to near what it was before AP was released. Edit: If it seems I'm advocating fewer AP WUs, that isn't really so. 
If the MB delivery had been keeping up, there wouldn't be 137 'tapes' waiting to be done by the mb_splitters and recalling data from NERSC HPSS would have started sooner. Work delivery should have remained reasonably stable for both MB and AP. Joe | |
| ID: 897878 · | |
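The two corrections and the slot split in Joe's post can be sketched roughly as below. The numbers (107.37 s, 85.7 s, 6 vs 8 bits per byte, a newline per 64 data bytes, 100 Feeder slots, 1 slot held back, 40 MB per AP) come from the post. How the factors multiply together, the rounding up of the AP share, and all the names are assumptions made for this illustration. The starting file-size ratio is elided in the thread, so it is left as a parameter.

```python
import math

MB_GROUP_DURATION_S = 107.37   # time span covered by one group of MB WUs
MB_ADVANCE_S = 85.7            # how far the mb_splitter advances per group

# More MB groups per tape than a no-overlap estimate would suggest.
overlap_factor = MB_GROUP_DURATION_S / MB_ADVANCE_S          # about 1.25

# MB files carry 6 data bits per byte plus a newline per 64 data bytes,
# AP files carry a full 8 bits per byte (assumed combination of the two).
packing_factor = (8 / 6) * (65 / 64)                         # about 1.35


def corrected_ratio(naive_file_size_ratio: float) -> float:
    """Scale a plain file-size ratio by the overlap and packing corrections."""
    return naive_file_size_ratio * overlap_factor * packing_factor


def feeder_slots(mb_per_ap: float, total_slots: int = 100, reserved: int = 1):
    """Split the Feeder slots in proportion mb_per_ap : 1.

    `reserved` slots are held back (Joe keeps 1 for original AP reissues);
    the AP share is rounded up, which is an assumption of this sketch.
    Returns (ap_slots, mb_slots).
    """
    usable = total_slots - reserved
    ap_slots = math.ceil(usable / (mb_per_ap + 1))
    return ap_slots, usable - ap_slots


print(round(overlap_factor * packing_factor, 2))   # 1.7
print(feeder_slots(40))                            # (3, 96)
```

This reproduces the 3/96 split for a 40:1 ratio. The 5/94 split Joe suggests for covering AP_v5 reissues is a manual adjustment on top of that and is not modelled here.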
|
Joe wrote: Two factors you missed make the ratio very close to 40x. Although the mb_splitters make groups of WUs with 107.37 second duration they only advance about 85.7 seconds for the next group; the time overlap of about 20 seconds ensures we don't miss signals at the seams between WUs. AP WUs don't have any overlap; for short single pulses it wouldn't add any significant value. Second, data is packed 6 bits per byte in MB (plus there's a newline after each 64 data bytes), but full 8 bits per byte in AP. Thanks, I didn't know about the overlap in MB and the increased packing density of AP, so I used the direct filesize comparison, which obviously underestimated the severity of the problem. Joe wrote: Edit: If it seems I'm advocating fewer AP WUs, that isn't really so. If the MB delivery had been keeping up, there wouldn't be 137 'tapes' waiting to be done by the mb_splitters and recalling data from NERSC HPSS would have started sooner. Work delivery should have remained reasonably stable for both MB and AP. Agreed, this will not reduce the AP WUs since each tape will still give a fixed number of AP WUs and a fixed number of MB WUs. At least this workaround will correct the imbalance, and the effect will be that clients that do not limit themselves to one kind of work will receive MBs and APs in the correct ratio. A quick gander at the last hour's results returned showed only 30:1 MB:AP, so we still have a long way to go in bringing balance back to the Force. Maybe, controversially, inflate the MB credits for a limited duration (e.g. putting CPU MB and CUDA MB at parity for a start) so that those people who bother to specify preferences for CPU crunching will help clear some of the MB backlog? ____________ | |
| ID: 898113 · | |
| Copyright © 2013 University of California |