More AP WU's please!



Message boards : Number crunching : More AP WU's please!

JAMC
Volunteer tester
Send message
Joined: 2 Jan 07
Posts: 71
Credit: 9,506,431
RAC: 0
Message 897261 - Posted: 20 May 2009, 15:32:28 UTC

Any idea on when we'll be getting more AP work to crunch?

Profile Pappa
Volunteer tester
Avatar
Send message
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 897280 - Posted: 20 May 2009, 16:08:14 UTC
Last modified: 20 May 2009, 16:09:07 UTC

This is interesting. When the Beta Group got AP ready for Seti Main, Seti could not get anyone to crunch AP. If I look at the "Server Status Page" we see AP all but complete (with trickles and resends) and tons of images (134 currently) waiting to be split into MultiBeam WU's.

Joe describes a part of the process in this post/thread: Astropulse Beam 3 Polarity 1 Errors

So we end up with a dilemma: some users have focused entirely on AP for various reasons, and MultiBeam has fallen behind, at least while tape images are being retrieved from off-site storage; see Matt's tech post Countdown (May 18 2009). With 134 images loaded locally and, as Joe states, ~220,000 MB WU's/image, MultiBeam is roughly ~29,480,000 WU's behind. My feeling is that users should focus on eating as many of the MultiBeam WU's as they can.
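The backlog figure above is easy to sanity-check. A minimal sketch, using only the numbers quoted in the post (134 images, ~220000 MB WU's per image):

```python
# Sanity check of the MultiBeam backlog estimate, using the
# figures quoted in the post above.
images_on_disk = 134          # 'tapes' currently loaded locally
mb_wus_per_image = 220_000    # Joe's rough estimate of MB WU's per image

backlog = images_on_disk * mb_wus_per_image
print(backlog)  # 29480000 -- the ~29480000 WU's quoted above
```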

Regards
____________
Please consider a Donation to the Seti Project.

Profile Pappa
Volunteer tester
Avatar
Send message
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 897310 - Posted: 20 May 2009, 16:48:45 UTC - in response to Message 897291.

If only it were that easy... :)

This is interesting. When the Beta Group got AP ready for Seti Main, Seti could not get anyone to crunch AP. If I look at the "Server Status Page" we see AP all but complete (with trickles and resends) and tons of images (134 currently) waiting to be split into MultiBeam WU's.

Joe describes a part of the process in this post/thread: Astropulse Beam 3 Polarity 1 Errors

So we end up with a dilemma: some users have focused entirely on AP for various reasons, and MultiBeam has fallen behind, at least while tape images are being retrieved from off-site storage; see Matt's tech post Countdown (May 18 2009). With 134 images loaded locally and, as Joe states, ~220,000 MB WU's/image, MultiBeam is roughly ~29,480,000 WU's behind. My feeling is that users should focus on eating as many of the MultiBeam WU's as they can.

Regards

Mebbe us AP folks figured the Cuda crowd was gonna gobble up all that MB work....LOL.


____________
Please consider a Donation to the Seti Project.

Profile Tw34k3d
Avatar
Send message
Joined: 18 May 99
Posts: 342
Credit: 85,846,621
RAC: 136
United States
Message 897312 - Posted: 20 May 2009, 16:49:31 UTC - in response to Message 897291.
Last modified: 20 May 2009, 16:50:03 UTC

I'm doing my part! :)

I crunch both. I prefer AP for the CPUs, but a quick survey shows a good deal of 6.03s waiting in addition to my 6.08s done by CUDA.

Rob
____________

JAMC
Volunteer tester
Send message
Joined: 2 Jan 07
Posts: 71
Credit: 9,506,431
RAC: 0
Message 897316 - Posted: 20 May 2009, 16:53:57 UTC - in response to Message 897280.
Last modified: 20 May 2009, 16:54:18 UTC

So we end up with a dilemma: some users have focused entirely on AP for various reasons, and MultiBeam has fallen behind, at least while tape images are being retrieved from off-site storage; see Matt's tech post Countdown (May 18 2009). With 134 images loaded locally and, as Joe states, ~220,000 MB WU's/image, MultiBeam is roughly ~29,480,000 WU's behind. My feeling is that users should focus on eating as many of the MultiBeam WU's as they can.

Regards


Hey up to a couple of weeks ago I had done nothing but MB... tried CUDA and dropped it just as fast...

Profile Pappa
Volunteer tester
Avatar
Send message
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 897318 - Posted: 20 May 2009, 17:14:10 UTC - in response to Message 897312.

I'm doing my part! :)

I crunch both. I prefer AP for the CPUs, but a quick survey shows a good deal of 6.03s waiting in addition to my 6.08s done by CUDA.

Rob

With MultiBeam roughly ~29,480,000 WU's behind, that takes us down to ~29,475,000 LOL

____________
Please consider a Donation to the Seti Project.

Profile perryjay
Volunteer tester
Avatar
Send message
Joined: 20 Aug 02
Posts: 3377
Credit: 15,483,865
RAC: 11,289
United States
Message 897319 - Posted: 20 May 2009, 17:16:55 UTC - in response to Message 897316.

Looks like I'll be helping out with the MBs here shortly. Right now I'm running two APs and a Cuda, and after these two I have one more AP. Today the server has been feeding me 6.08s for my Cuda and 6.03s for my CPUs. It's been a while since I've seen this; the server had been feeding me just APs along with my Cudas.
____________


PROUD MEMBER OF Team Starfire World BOINC

Profile Pappa
Volunteer tester
Avatar
Send message
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 897324 - Posted: 20 May 2009, 17:34:05 UTC - in response to Message 897316.

So we end up with a dilemma: some users have focused entirely on AP for various reasons, and MultiBeam has fallen behind, at least while tape images are being retrieved from off-site storage; see Matt's tech post Countdown (May 18 2009). With 134 images loaded locally and, as Joe states, ~220,000 MB WU's/image, MultiBeam is roughly ~29,480,000 WU's behind. My feeling is that users should focus on eating as many of the MultiBeam WU's as they can.

Regards


Hey up to a couple of weeks ago I had done nothing but MB... tried CUDA and dropped it just as fast...


Depending on the computer and video card, Cuda is not for everyone, not counting some of the issues getting it working properly.
The generally recommended version of Boinc is 6.6.20 (it has its own story) with the 182.5 drivers, and when the Server recognizes the card it "should" send Cuda work. Because of how the Server is configured, it usually tells Boinc that the Cuda Application should use x.xxx percent of the CPU and as much of the Video card as it can grab. This sometimes overpowers the Video Card, and what you end up with is something "odd" that may seem unworkable. The Optimized Cuda build tells Boinc and the Cuda Application to back off a bit in the use of the CPU and uses the video card more effectively. Then you end up with something that is "workable", or else a reason to disable Cuda. No, Cuda is not for everyone.

____________
Please consider a Donation to the Seti Project.

Chelski
Avatar
Send message
Joined: 3 Jan 00
Posts: 121
Credit: 8,824,102
RAC: 613
Malaysia
Message 897631 - Posted: 21 May 2009, 3:50:12 UTC
Last modified: 21 May 2009, 3:50:38 UTC

Once upon a time there was a flock of ugly ducklings called APs. And there were many calls from the masses that AP be dropped or disabled by default. Almost everyone wanted more of the ducks called MBs, which are especially delicious served with CUDA sauce.

Then someone found out that APs are prettier than swans: they are geese that lay golden eggs. And then they were hunted to near extinction and joined Spock as members of an endangered species.

Actually, MB splitting has always lagged behind AP, but never this acutely.

The file size difference between MB and AP is almost 22:1, so for each tape we get about 22x as many MB WUs as AP WUs.

I don't think the actual crunch rate is this high on average (of course, if we sample now, it could be higher because we are starving the project of AP WUs), and overall we need the ratio to be bigger than 22:1 to start correcting the backlog. By a back-of-envelope estimate, my cuda machine does 80 MBs per day on CUDA and 4 APs per day, a 20:1 ratio, so yes, I think I'm part of the problem. It will be interesting to see whether the crunchers with Dreadnought-class CUDA machines are doing much better.
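As a quick check of the back-of-envelope estimate above (the per-day rates are the ones stated in the post, so this is only illustrative):

```python
# Chelski's per-machine crunch rates, as stated in the post.
mb_per_day = 80   # MB tasks/day (CUDA)
ap_per_day = 4    # AP tasks/day (CPU)

ratio = mb_per_day / ap_per_day
print(ratio)  # 20.0 -- below the ~22:1 split ratio, so the MB backlog keeps growing
```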
____________

Cosmic_Ocean
Avatar
Send message
Joined: 23 Dec 00
Posts: 2264
Credit: 8,657,670
RAC: 4,243
United States
Message 897641 - Posted: 21 May 2009, 4:59:16 UTC - in response to Message 897280.
Last modified: 21 May 2009, 4:59:59 UTC

This is interesting. When the Beta Group got AP ready for Seti Main, Seti could not get anyone to crunch AP. If I look at the "Server Status Page" we see AP all but complete (with trickles and resends) and tons of images (134 currently) waiting to be split into MultiBeam WU's.

Joe describes a part of the process in this post/thread: Astropulse Beam 3 Polarity 1 Errors

So we end up with a dilemma: some users have focused entirely on AP for various reasons, and MultiBeam has fallen behind, at least while tape images are being retrieved from off-site storage; see Matt's tech post Countdown (May 18 2009). With 134 images loaded locally and, as Joe states, ~220,000 MB WU's/image, MultiBeam is roughly ~29,480,000 WU's behind. My feeling is that users should focus on eating as many of the MultiBeam WU's as they can.

Regards

As I have stated several times over the past few months, I would love to be able to crunch MB and "help the problem", however, when things are running normally, the scheduler will not send me MBs if I have AP selected as an allowed application. From the testing I have done, it appears that part of the scheduler logic is something like:

if (CUDA == yes)
    give MB;
else
    give AP;

I don't know the validity of that statement, but that's what it seems like. If I go to an MB-only venue, I obviously get nothing but MBs.

Before the APs ran out, there were times where the ready-to-send queue was running low and I requested work, and the messages tab said "no jobs available", but I switched to the MB venue before the next work fetch retry and got 50 MBs in short order.

So the bottom line is, there is still some scheduler logic that seems to only want to give MB out to the CUDA folks. This may need some deeper analysis, but I've done all I can do from the client-side, of course.

However, now that APs are pretty hard to come by, the scheduler does go ahead and give me some MBs, but I'm sure it really does not want to.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Profile Geek@PlayProject donor
Volunteer tester
Avatar
Send message
Joined: 31 Jul 01
Posts: 2467
Credit: 86,098,804
RAC: 22,997
United States
Message 897650 - Posted: 21 May 2009, 5:18:31 UTC - in response to Message 897641.
Last modified: 21 May 2009, 5:19:24 UTC

The server logic you stated as............

if (CUDA == yes)
    give MB;
else
    give AP;

May be true when Boinc 6.2.19 is making the request.

I know for a fact that Boinc version 6.6.28, the current recommended version, will make separate work requests for MB and AP. Thus if MB is requested and available it will be issued to you. If AP is requested and available it will be issued to you.

Perhaps you should consider upgrading to the current version of Boinc. It should upgrade fine without destroying the work you have on hand.
____________
Boinc....Boinc....Boinc....Boinc....

Fred W
Volunteer tester
Send message
Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 897661 - Posted: 21 May 2009, 5:53:02 UTC - in response to Message 897650.

...
I know for a fact that Boinc version 6.6.28, the current recommended version, will make separate work requests for MB and AP. ...

Not sure this is entirely accurate. 6.6.28 makes separate work requests for CUDA and non-CUDA. The non-CUDA work requests should be filled with a (20 / 1 ?) mix of MB / AP but always seem to be AP under "normal" running conditions. Not that I'm complaining - just clarifying the facts as I see them.

F.
____________

Profile Virtual Boss*
Volunteer tester
Avatar
Send message
Joined: 4 May 08
Posts: 417
Credit: 6,178,370
RAC: 33
Australia
Message 897708 - Posted: 21 May 2009, 10:31:53 UTC

Just wondering out loud.

Are all the older "tapes" that were split for MB before AP started still in existence, or were they deleted?

And if they are still around, can they be re-split for AP only?

Profile tullioProject donor
Send message
Joined: 9 Apr 04
Posts: 3708
Credit: 378,585
RAC: 571
Italy
Message 897710 - Posted: 21 May 2009, 10:47:56 UTC

I am using SuSE Linux 10.3 and its firewall, no antivirus. Recently I noticed attempts to connect to my box using ssh and I killed the sshd daemon.
Tullio
____________

Josef W. SegurProject donor
Volunteer developer
Volunteer tester
Send message
Joined: 30 Oct 99
Posts: 4241
Credit: 1,046,990
RAC: 302
United States
Message 897878 - Posted: 21 May 2009, 20:44:27 UTC - in response to Message 897631.
Last modified: 21 May 2009, 20:52:36 UTC

Chelski wrote:
...
The file size difference between MB and AP is almost 22:1, so for each tape we get about 22x as many MB WUs as AP WUs.


Two factors you missed make the ratio very close to 40x. Although the mb_splitters make groups of WUs with 107.37 second duration they only advance about 85.7 seconds for the next group; the time overlap of about 20 seconds ensures we don't miss signals at the seams between WUs. AP WUs don't have any overlap, for short single pulses it wouldn't add any significant value. Second, data is packed 6 bits per byte in MB (plus there's a newline after each 64 data bytes), but full 8 bits per byte in AP.
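A rough sketch of how those two corrections stack on top of the ~22:1 file-size ratio (all inputs are the rounded figures from these posts, so the product is only approximate; it lands in the high 30s, consistent with Joe's "very close to 40x" given that the 22:1 was itself rounded):

```python
# Combine the rough file-size ratio with the two corrections Joe describes.
file_size_ratio = 22             # MB:AP WU count per tape by file size alone
overlap = 107.37 / 85.7          # each MB group covers 107.37 s but advances only 85.7 s
packing = (8 / 6) * (65 / 64)    # AP packs 8 bits/byte; MB packs 6, plus a newline per 64 bytes

print(round(file_size_ratio * overlap * packing, 1))  # ~37, in the ballpark of Joe's ~40x
```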

I don't think the actual crunch rate is this high on average (of course, if we sample now, it could be higher because we are starving the project of AP WUs), and overall we need the ratio to be bigger than 22:1 to start correcting the backlog. By a back-of-envelope estimate, my cuda machine does 80 MBs per day on CUDA and 4 APs per day, a 20:1 ratio, so yes, I think I'm part of the problem. It will be interesting to see whether the crunchers with Dreadnought-class CUDA machines are doing much better.

The only sensible thing for the project is to adjust work delivery so a given set of 'tapes' is handled at nearly the same rate by both sets of splitters. If everything were working as it should, they'd deliver 40 MB tasks for every AP task. That would take 3 slots for AP_v5 and 96 for MB in the 100 shared memory slots between the Feeder and Scheduler. (and 1 for any original AP reissues) But there are more tasks needed to get an AP_v5 WU finished, so 5 slots for AP_v5 and 94 for MB would allow for the needed reissues. I'd expect the AP_v5 reissue rate to drop, then its number of slots could be further reduced.
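The slot arithmetic Joe describes can be sketched as follows (a hedged illustration of the proportions, not the actual feeder code):

```python
# Proportional share of the 100 shared-memory feeder slots at a
# 40:1 MB:AP delivery ratio, as proposed in the post above.
total_slots = 100
mb_per_ap = 40

ap_share = total_slots / (mb_per_ap + 1)   # AP_v5's exact proportional share
print(round(ap_share, 1))                  # 2.4 -> rounds up to Joe's 3 slots

# Joe's adjusted proposal, padding AP_v5 to cover its higher reissue rate:
ap_slots, mb_slots, original_ap_reissues = 5, 94, 1
print(ap_slots + mb_slots + original_ap_reissues)  # 100
```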

Note that having 94 or 96 slots for MB should mean fairly rare cases where the MB slots get emptied between Feeder cycles. The delivery rate for MB should go up to near what it was before AP was released.

Edit: If it seems I'm advocating fewer AP WUs, that isn't really so. If the MB delivery had been keeping up, there wouldn't be 137 'tapes' waiting to be done by the mb_splitters and recalling data from NERSC HPSS would have started sooner. Work delivery should have remained reasonably stable for both MB and AP.
Joe

Chelski
Avatar
Send message
Joined: 3 Jan 00
Posts: 121
Credit: 8,824,102
RAC: 613
Malaysia
Message 898113 - Posted: 22 May 2009, 6:27:53 UTC - in response to Message 897878.
Last modified: 22 May 2009, 6:28:59 UTC

Joe wrote:
Two factors you missed make the ratio very close to 40x. Although the mb_splitters make groups of WUs with 107.37 second duration they only advance about 85.7 seconds for the next group; the time overlap of about 20 seconds ensures we don't miss signals at the seams between WUs. AP WUs don't have any overlap, for short single pulses it wouldn't add any significant value. Second, data is packed 6 bits per byte in MB (plus there's a newline after each 64 data bytes), but full 8 bits per byte in AP.

Thanks, I didn't know about the overlap in MB or the increased packing density of AP, so I used the direct file-size comparison, which obviously underestimated the severity of the problem.

Joe wrote:
Edit: If it seems I'm advocating fewer AP WUs, that isn't really so. If the MB delivery had been keeping up, there wouldn't be 137 'tapes' waiting to be done by the mb_splitters and recalling data from NERSC HPSS would have started sooner. Work delivery should have remained reasonably stable for both MB and AP.

Agreed, this will not reduce the AP WUs, since each tape still gives a fixed number of AP WUs and a fixed number of MB WUs. At least this workaround will correct the imbalance, and the effect will be that clients that don't restrict themselves to one kind of work will receive MBs and APs in the correct ratio.

A quick gander at the last hour's results returned showed only 30:1 MB:AP, so we still have a long way to go in bringing balance back to the Force.

Maybe controversially inflate the MB credits for a limited duration (e.g. putting CPU MB and CUDA MB at parity for a start), so that those people who bother to specify preferences for CPU crunching will help clear some of the MB backlog?
____________



Copyright © 2014 University of California