New Twist For "Aborted by project" Issue ?
Author | Message |
---|---|
Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Oversimplifying, many signal processing algorithms, Fast Fourier Transforms for example, achieve what they do by sorting the signal into 'buckets', and the data are 'windowed'. Both the number of buckets and the size and shape of the 'windows' determine the resolution of the final data [a lot is thrown out at this stage: think 4-band equaliser versus 32-band equaliser]. Those are choices made in the design of the algorithms and program, and they are influenced by practical computational complexity and the required precision. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to Live By: The Computer Science of Human Decisions. |
Brian Silvers Joined: 11 Jun 99 Posts: 1681 Credit: 492,052 RAC: 0 |
Brian Silvers wrote: "...multibeam will increase processing time, reducing even further the number of results that can be processed in any 'average' day. This will mean that 400/day will still be able to fill up a cache quicker." Joe, MajorKong already pointed that out, and I thank both of you for clearing up my misunderstanding regarding MB. AP though will likely still take longer. I guess what needs to be understood is whether fast hosts will be given a higher share of AP to work on, or is it just "potluck"...? Brian |
Brian Silvers Joined: 11 Jun 99 Posts: 1681 Credit: 492,052 RAC: 0 |
Eh, the life I could've had (US Navy nuclear operator -> civilian physicist / engineer) would've likely had me easily understanding all that and then some without the equalizer comment. Instead, I decided that being lazy and not going to class on "minor subjects" such as Organic Chemistry in favor of playing Spades...was somehow a "good idea"... Duhhhhhhhhhhhhhhhh..... Funny but true story: My high school Physics instructor who wanted me to go into Physics (if not the Navy) cuts the grass for the woman that lives next door... It's a small world after all... |
Joined: 2 Aug 99 Posts: 14 Credit: 4,592,515 RAC: 42 |
Here's a thought: It's not the DC but your cache/machine combination that is failing to keep up with the DC. Your WUs are being aborted because the DC has reached its quota, since other machines have finished their WUs. The only way for the system not to run into the abort issue is to send out only the number of WUs needed for the quota and wait for the timeout to pass before issuing another WU. Even then, if you have cut off network activity, you yourself are defeating the purpose of the DC. |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Actually, Multibeam will decrease processing time. Because of the difference in beam width, ar=0.441 Multibeam WUs like those we're running in SETI Beta now are processed like ar=0.732 Line feed WUs. I had read MajorKong's post, but wanted to point out that there were technical reasons that would boost S@H work throughput even using identical apps. My guesstimate is that those who have been using the stock 5.15 here and will be automatically upgraded to the 5.2x Multibeam app will experience nearly a doubling of WUs/day. That combines the above factors with the optimizations Eric has added to the Multibeam app. In the test project, AP work is distributed on a "potluck" basis, except that it requires the host's memory size as reported by BOINC to be at least 256 MiB. The AP WUs contain 8 times as much data as S@H WUs, but processing is different enough that I doubt the eventual crunch time will be directly proportional. The AP WUs contain the full 2.5 MHz bandwidth rather than 1/256 subbands like S@H, but only represent 13.42 seconds of telescope time. Assuming they have the same ~20% time overlaps as S@H WUs, there can be 8 AP WUs for each 256 S@H WUs. Joe |
Brian Silvers Joined: 11 Jun 99 Posts: 1681 Credit: 492,052 RAC: 0 |
OK, then it is important to figure out if host selection for AP is going to have a bump in requirements once it enters "real world". Also it is important to note performance of MB vs. current optimized apps to see if top-end machines are really going to be able to process 400/day on an average day if they got all MB. If the answers are "no" and "yes", then I'm willing to tentatively endorse the idea of going to 200/core (4 max). The consequence of all that means that the splitting better be fast and the back-end better be able to handle the load. I personally doubt it could at this point, although since Sidious was swapped, things have seemed much snappier... Also, I know the "no guarantee" verbiage, but the reality is that a lot of "no work from project" messages will irk a lot of people... |
Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0 |
8 per 256 instead of 1 per 256. Thanks Joe for correcting my flawed understanding! |
W-K 666 Joined: 18 May 99 Posts: 19592 Credit: 40,757,560 RAC: 67 |
In the short term, before AP gets here, 'no work from project' is probably going to be fairly common. As Joe has said, the MB units, at the same AR, take approx. 70% of the time of present enhanced units, because of the reduced beamwidth. But the standard app now also incorporates most of the generic CPU optimisation and is therefore another 20% faster. So this all means that the majority of users, who don't know about and therefore don't use optimised apps, will see the crunch time per unit decrease to under 60% of present enhanced times. There will still be optimised apps, but their gains will come from using SSEx etc., and the gains will not be as large as before. Going back to the original thread subject: the only way to stop 'aborted by project' is to initially replicate only as many results as are required to meet quorum, i.e. 2, unless the BOINC code can be modified to delay the issue of extra results for a limited period. If the issuing of the extra results could be delayed for 2 days, here on Seti, then from my observations, for over 80% of WUs they would not be issued. AstroPulse: assuming we can use most of the old data, there should be plenty of work for AP when it arrives here. But don't hold your breath, AP is in alpha phase. It took enhanced about 9 months to go through the Beta phase. Andy |
Brian Silvers Joined: 11 Jun 99 Posts: 1681 Credit: 492,052 RAC: 0 |
How is MB in regards to noisy units? Are there large batches of early-ending results? |
W-K 666 Joined: 18 May 99 Posts: 19592 Credit: 40,757,560 RAC: 67 |
To be honest, I don't think we know. So far on my computers all units have been close to AR=0.4nnnn. Experience indicates most noisy units are at VHAR, or had radar interference, now cleared. But I have just asked over there, about an hour ago, msg 25184. |
Brian Silvers Joined: 11 Jun 99 Posts: 1681 Credit: 492,052 RAC: 0 |
Well, that's a critical component. If what you are saying comes to fruition, then that would push "Octo" (happy now????) machines up to the point where they could do 400/day on a consistent basis, as my estimate for Zombie before was in the 200-250 range. Since we know he'll use a further optimized app, then you have to assume 2X the capability. The quandary becomes thus: Are people like Zombie kept happy, or are more users kept happy? Can both be kept happy (faster splitting)? Can the back-end handle this? I feel slightly better about that seeing how the responsiveness of the forums and viewing result pages has improved since Jocelyn was made the master db... It will be interesting how the next outage goes. Will it take less time to complete? Will it take less time to recover? If the backend can't efficiently deal with the higher load again, then if I were making the decision, the most I'd do is bump to 150/core (4 max), but I'd be more inclined to leave it alone and try to augment the equipment to handle the extra load before doing that... |
W-K 666 Joined: 18 May 99 Posts: 19592 Credit: 40,757,560 RAC: 67 |
As the MB splitter is new and the MB data comes from Arecibo on HDD, I have no idea of its max splitting rate compared to the enhanced tape splitter. With 7 beams and H and V polarization there could be more data, but that is governed by the periods the MB receiver is going to be on, after Arecibo finishes its maintenance/repaint. My personal view is that I don't believe the 'Management' will increase the daily allowance unless they can be sure the backend can handle it reliably and the new splitter can handle the increased rate required. I think they will try to ensure all hosts get some work, rather than a few fast multicores getting the lion's share. If you think about it, more hosts, each with a bit, is probably more reliable due to redundancy, when relying on 'unknown' computers, than a few fast multicores. Plus, the users that only do a few units/day at present, if they start noticing they are only getting work now and then, will soon question whether Seti and/or BOINC really needs them and start detaching. Then when they are needed (AP) they won't be here. If you look at the top 20 computers, the only ones at the mo that stand a chance of crunching more than 400/day, what would be the effect on the project if we switched them all off? Absolutely nothing (<1%), it's only 8000/day out of 800,000/day. Sorry Zombie, but your octo is a very small part (<0.05%) in a large machine. Andy |
W-K 666 Joined: 18 May 99 Posts: 19592 Credit: 40,757,560 RAC: 67 |
I knew a rogue host would pop up sooner or later. This host is an example of why the download limits are imposed, and why they probably should not be increased until the Berkeley backend can handle a larger load reliably. Andy |
Alinator Joined: 19 Apr 05 Posts: 4178 Credit: 4,647,982 RAC: 0 |
Hmmm.... I rooted around in this host's summary pages a bit. Turns out it's running 4.19. I couldn't find any sign that it started malfunctioning per se. IOW, no compute errors, just timeouts. It had a few successful returns at the very end of the listings, back toward the end of May. Weren't there some real problems with the CC scheduler and big CIs back with the early versions of 4x (it's been so long I don't remember for sure)? In any event, I don't see why the project doesn't use a 4.45 (or whatever is appropriate) cutoff. This would allow the folks who need to use 4x to participate, yet eliminate problems which arise from the use of the real old stuff (which obviously had problems or there wouldn't have been the need to release newer versions). Alinator |
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
bumped from 2nd page for Modesto |
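
For anyone who wants to check the arithmetic in Joe's and Andy's posts above, here is a minimal Python sketch. It uses only the figures quoted in the thread; the variable and function names are made up for illustration, and it assumes that the 70% and 80% time factors simply multiply and that each of the top 20 hosts returns exactly the 400/day quota. The implied S@H time span is only an inference from Joe's "8 AP WUs for each 256 S@H WUs" and his 13.42 second figure, not something stated in the thread.

```python
# Figures quoted in the thread; all names below are illustrative only.
BEAMWIDTH_TIME_FACTOR = 0.70       # MB unit vs. enhanced unit at the same AR (Andy)
STOCK_OPT_TIME_FACTOR = 0.80       # stock app "another 20% faster" (Andy)
QUOTA_PER_HOST = 400               # results/day discussed as the daily limit
TOP_HOSTS = 20                     # hosts Andy considers able to exceed the quota
PROJECT_RESULTS_PER_DAY = 800_000  # project-wide figure quoted by Andy
AP_SECONDS_PER_WU = 13.42          # telescope time per AP WU (Joe)
AP_WUS_PER_256_SAH = 8             # AP WUs per 256 S@H WUs (Joe)


def combined_time_factor(*factors: float) -> float:
    """Multiply per-unit time factors, assuming they are independent."""
    product = 1.0
    for factor in factors:
        product *= factor
    return product


# Crunch time per unit relative to present enhanced units.
time_factor = combined_time_factor(BEAMWIDTH_TIME_FACTOR, STOCK_OPT_TIME_FACTOR)

# Units per day scale with the inverse of the time per unit.
throughput_gain = 1.0 / time_factor

# Share of daily project output from the top hosts, and from one such host.
top_hosts_share = TOP_HOSTS * QUOTA_PER_HOST / PROJECT_RESULTS_PER_DAY
single_host_share = QUOTA_PER_HOST / PROJECT_RESULTS_PER_DAY

# Time span of one S@H WU implied by the 8-per-256 ratio (inference).
implied_sah_seconds = AP_WUS_PER_256_SAH * AP_SECONDS_PER_WU

print(f"time per unit vs. enhanced: {time_factor:.0%}")
print(f"throughput gain:            {throughput_gain:.2f}x")
print(f"top {TOP_HOSTS} hosts share:         {top_hosts_share:.2%}")
print(f"single host share:          {single_host_share:.2%}")
print(f"implied S@H WU time span:   {implied_sah_seconds:.2f} s")
```

Under those assumptions the combined factor comes out at about 56% of present crunch time, which matches "under 60%" and is in line with Joe's "nearly a doubling" guesstimate, while twenty hosts at 400/day account for about 1% of the 800,000/day total.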
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.