New Twist For "Aborted by project" Issue ?

Message boards : Number crunching : New Twist For "Aborted by project" Issue ?

Profile jason_gee
Volunteer developer
Volunteer tester
Joined: 24 Nov 06
Posts: 7489
Credit: 91,093,184
RAC: 0
Australia
Message 586927 - Posted: 14 Jun 2007, 19:27:38 UTC - in response to Message 586922.  
Last modified: 14 Jun 2007, 19:33:11 UTC


Yep.
But you can only increase the resolution of the search in the data so much without making any results meaningless.


OK, my lack of sleep is really kicking in now... Due to noise enhancement, or due to having sufficient samples so that any more samples just add redundant data that could be extrapolated (à la integral calculus), or something else entirely?



Overly simplifying, many signal processing algorithms, like Fast Fourier Transforms for example, achieve what they do by sorting into 'buckets' and are 'windowed'. Both the number of buckets and the size and shape of the 'windows' determine the resolution of the final data [a lot is thrown out at this stage: think 4-band equaliser versus 32-band equaliser]. Those are choices, made in the design of the algorithms and program, which are influenced by practical computational complexity and the required precision.
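The bucket/window trade-off described above can be sketched in a few lines of Python. This is an illustrative toy, not SETI@home's actual processing; the sample rate, tone frequencies and bin counts are made-up numbers for the demo:

```python
import numpy as np

# Illustrative sketch, not SETI@home code: how the number of FFT
# 'buckets' sets frequency resolution.  All figures are invented.
fs = 1024                                   # sample rate, Hz
t = np.arange(fs) / fs                      # one second of samples
signal = np.sin(2*np.pi*100*t) + np.sin(2*np.pi*103*t)  # tones 3 Hz apart

def analyse(n_bins):
    """Strongest bucket and bucket width for an n_bins-point windowed FFT."""
    window = np.hanning(n_bins)             # window size/shape controls leakage
    spectrum = np.abs(np.fft.rfft(signal[:n_bins] * window))
    return int(np.argmax(spectrum)), fs / n_bins   # (bucket index, Hz/bucket)

# 64 buckets of 16 Hz each: both tones fall into the same bucket and
# cannot be told apart.  1024 buckets of 1 Hz each: the tones occupy
# separate buckets -- the 4-band vs 32-band equaliser analogy.
print(analyse(64))
print(analyse(1024))
```

More buckets resolve finer detail but cost more computation, which is exactly the design trade-off mentioned above.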
"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions.
ID: 586927
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 586930 - Posted: 14 Jun 2007, 19:32:48 UTC - in response to Message 586923.  

Brian Silvers:
...multibeam will increase processing time, reducing even further the number of results that can be processed in any "average" day. This will mean that 400/day will still be able to fill up a cache quicker.
...

Actually, Multibeam will decrease processing time. Because of the difference in beam width, ar=0.441 Multibeam WUs like those we're running in SETI Beta now are processed like ar=0.732 Line feed WUs.

In addition, one of the techniques used with the ALFA receivers is a basket weave scan which Kevin Douglas described as "nodding on the meridian". The nodding rate is such that those observations will produce VHAR WUs in abundance.
                                                                  Joe


Joe,

MajorKong already pointed that out, and I thank both of you for clearing up my misunderstanding in regards to MB. AP though will likely still take longer. I guess what needs to be understood is if fast hosts will be given a higher share of AP to work on, or is it just "potluck"...?

Brian
ID: 586930
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 586932 - Posted: 14 Jun 2007, 19:41:58 UTC - in response to Message 586927.  


Yep.
But you can only increase the resolution of the search in the data so much without making any results meaningless.


OK, my lack of sleep is really kicking in now... Due to noise enhancement, or due to having sufficient samples so that any more samples just add redundant data that could be extrapolated (à la integral calculus), or something else entirely?



Overly simplifying, many signal processing algorithms, like Fast Fourier Transforms for example, achieve what they do by sorting into 'buckets' and are 'windowed'. Both the number of buckets and the size and shape of the 'windows' determine the resolution of the final data [a lot is thrown out at this stage: think 4-band equaliser versus 32-band equaliser]. Those are choices, made in the design of the algorithms and program, which are influenced by practical computational complexity and the required precision.



Eh, the life I could've had (US Navy nuclear operator -> civilian physicist / engineer) would've likely had me easily understanding all that and then some without the equalizer comment. Instead, I decided that being lazy and not going to class on "minor subjects" such as Organic Chemistry in favor of playing Spades...was somehow a "good idea"... Duhhhhhhhhhhhhhhhh.....

Funny but true story: My high school Physics instructor who wanted me to go into Physics (if not the Navy) cuts the grass for the woman that lives next door... It's a small world after all...
ID: 586932
Profile LTDInvestments
Volunteer tester
Joined: 2 Aug 99
Posts: 14
Credit: 4,592,515
RAC: 42
United States
Message 586948 - Posted: 14 Jun 2007, 20:36:39 UTC - in response to Message 586693.  

Here's a thought:

If a DC project cannot keep up with the machines of their contributors, then perhaps they shouldn't be DCs. At that point, their projects just don't require the crunching power offered by DC. Just go buy a few machines and do the work in-house.


It's not the DC but your cache/machine combination that is failing to keep up. Your WUs are aborting because the quota has already been reached, other machines having finished their WUs. The only way for the system to avoid the abort issue is to send out only as many WUs as the quota allows and wait for the timeout to pass before issuing another WU. Even then, if you have cut off network activity, you yourself are defeating the purpose of the DC.
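The quota-plus-timeout scheme described above can be sketched roughly as follows. This is a hypothetical illustration, not actual BOINC scheduler code:

```python
# Hypothetical sketch of the scheme described above, not actual BOINC
# code: issue at most `quota` replicas of a workunit at once, and
# re-issue one only after a replica's deadline passes with no result.
class Workunit:
    def __init__(self, quota, deadline_s):
        self.quota = quota            # max replicas in flight at once
        self.deadline_s = deadline_s  # per-replica timeout, seconds
        self.in_flight = []           # timestamps of issued replicas

    def try_issue(self, now):
        # drop replicas whose deadline has passed (they timed out)
        self.in_flight = [t for t in self.in_flight
                          if now - t < self.deadline_s]
        if len(self.in_flight) < self.quota:
            self.in_flight.append(now)
            return True               # replica sent to the asking host
        return False                  # quota reached: "no work from project"

wu = Workunit(quota=2, deadline_s=14 * 86400)     # e.g. a two-week deadline
print(wu.try_issue(0), wu.try_issue(60), wu.try_issue(120))  # True True False
```

Under this scheme a third host asking for the same WU is simply refused until a deadline lapses, so nothing ever needs to be aborted after the fact.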
ID: 586948
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 586961 - Posted: 14 Jun 2007, 21:15:38 UTC - in response to Message 586930.  
Last modified: 14 Jun 2007, 21:16:34 UTC

Actually, Multibeam will decrease processing time. Because of the difference in beam width, ar=0.441 Multibeam WUs like those we're running in SETI Beta now are processed like ar=0.732 Line feed WUs.

In addition, one of the techniques used with the ALFA receivers is a basket weave scan which Kevin Douglas described as "nodding on the meridian". The nodding rate is such that those observations will produce VHAR WUs in abundance.
                                                                  Joe


Joe,

MajorKong already pointed that out, and I thank both of you for clearing up my misunderstanding in regards to MB. AP though will likely still take longer. I guess what needs to be understood is if fast hosts will be given a higher share of AP to work on, or is it just "potluck"...?

Brian

I had read MajorKong's post, but wanted to point out that there were technical reasons that would boost S@H work throughput even using identical apps. My guesstimate is that those who have been using the stock 5.15 here and will be automatically upgraded to the 5.2x Multibeam app will experience nearly a doubling of WUs/day. That combines the above factors with the optimizations Eric has added to the Multibeam app.

In the test project, AP work is distributed on a "potluck" basis, except that it requires the host's memory size as reported by BOINC to be at least 256 MiB. The AP WUs contain 8 times as much data as S@H WUs, but processing is different enough that I doubt the eventual crunch time will be directly proportional. The AP WUs contain the full 2.5 MHz bandwidth rather than 1/256 subbands like S@H, but only represent 13.42 seconds of telescope time. Assuming they have the same ~20% time overlaps as S@H WUs, there can be 8 AP WUs for each 256 S@H WUs.
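The 8-per-256 ratio can be checked with rough numbers. Note the ~107 s telescope time per S@H WU is an assumption on my part, not stated in this thread:

```python
# Back-of-the-envelope check of the WU counts above.  The ~107 s figure
# for an S@H WU's telescope time is an assumption, not from this post.
subbands = 256            # S@H splits the 2.5 MHz band into 256 subbands
sah_wu_seconds = 107.37   # approx. telescope time covered by one S@H WU
ap_wu_seconds = 13.42     # telescope time covered by one AP WU (from above)

# A full-band recording of ~107 s yields 256 S@H WUs (one per subband)
# but only ~8 AP WUs carved along the time axis:
ap_per_256_sah = round(sah_wu_seconds / ap_wu_seconds)
print(ap_per_256_sah)     # -> 8 AP WUs per 256 S@H WUs
```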
                                                                 Joe
ID: 586961
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 586972 - Posted: 14 Jun 2007, 21:40:14 UTC - in response to Message 586961.  


In the test project, AP work is distributed on a "potluck" basis, except that it requires the host's memory size as reported by BOINC to be at least 256 MiB. The AP WUs contain 8 times as much data as S@H WUs, but processing is different enough that I doubt the eventual crunch time will be directly proportional. The AP WUs contain the full 2.5 MHz bandwidth rather than 1/256 subbands like S@H, but only represent 13.42 seconds of telescope time. Assuming they have the same ~20% time overlaps as S@H WUs, there can be 8 AP WUs for each 256 S@H WUs.


OK, then it is important to figure out if host selection for AP is going to have a bump in requirements once it enters "real world". Also it is important to note performance of MB vs. current optimized apps to see if top-end machines are really going to be able to process 400/day on an average day if they got all MB. If the answers are "no" and "yes", then I'm willing to tentatively endorse the idea of going to 200/core (4 max). The consequence of all that means that the splitting better be fast and the back-end better be able to handle the load. I personally doubt it could at this point, although since Sidious was swapped, things have seemed much snappier...

Also, I know the "no guarantee" verbiage, but the reality is that a lot of "no work from project" messages will irk a lot of people...

ID: 586972
Profile KWSN - MajorKong
Volunteer tester
Joined: 5 Jan 00
Posts: 2892
Credit: 1,499,890
RAC: 0
United States
Message 586983 - Posted: 14 Jun 2007, 21:58:24 UTC - in response to Message 586961.  


The AP WUs contain the full 2.5 MHz bandwidth rather than 1/256 subbands like S@H, but only represent 13.42 seconds of telescope time. Assuming they have the same ~20% time overlaps as S@H WUs, there can be 8 AP WUs for each 256 S@H WUs.
                                                                 Joe


8 per 256 instead of 1 per 256. Thanks Joe for correcting my flawed understanding!
ID: 586983
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19059
Credit: 40,757,560
RAC: 67
United Kingdom
Message 587105 - Posted: 15 Jun 2007, 3:25:44 UTC

In the short term, before AP gets here, the 'no work from project' is probably going to be fairly common.
As Joe has said, the MB units at the same AR take approx. 70% as long as present enhanced units, because of the reduced beamwidth. But the standard app now also incorporates most of the generic CPU optimisation and is therefore another 20% faster.

So this all means that the majority of users, who don't know about and therefore don't use optimised apps, will see the crunch time per unit decrease to under 60% of present enhanced times.
There will still be optimised apps, but their gains will come from using SSEx etc., and will not be as large as before.
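The "under 60%" figure follows directly from multiplying the two factors Andy cites; a quick check:

```python
# Arithmetic behind "under 60%": MB geometry cuts per-unit crunch time
# to ~70% of enhanced, and the new stock app's built-in optimisations
# shave off roughly another 20% on top of that.
mb_factor = 0.70            # reduced-beamwidth effect
opt_factor = 1 - 0.20       # generic CPU optimisation effect
print(round(mb_factor * opt_factor, 2))   # -> 0.56, i.e. ~56% of present times
```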

Going back to the original thread subject: the only way to stop 'aborted by project' is to initially replicate only as many results as required to meet quorum, i.e. 2, unless the BOINC code can be modified to delay issue of the extra results for a limited period. If the issuing of the extra results could be delayed for 2 days here on SETI, then from my observations over 80% of WUs would never need them issued.
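The delayed-issue idea can be sketched as follows. This is a hypothetical illustration of the suggestion, not BOINC's actual behaviour:

```python
# Hypothetical sketch of the delayed-issue idea, not BOINC code:
# create only `QUORUM` replicas up front, and issue spares only for
# results still missing when the holding period expires.
QUORUM = 2
HOLD_DAYS = 2               # holding period before spares may be issued

def spares_to_issue(results_returned_within_hold):
    """Extra replicas needed once the 2-day hold expires."""
    return max(QUORUM - results_returned_within_hold, 0)

print(spares_to_issue(2))   # both results back within 2 days: no spare
print(spares_to_issue(1))   # one result missing: issue a single late spare
```

On the observation above that over 80% of WUs have both results back within 2 days, the first case would be the common one and most spare replicas would never be created at all.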

AstroPulse: assuming we can use most of the old data, there should be plenty of work for AP when it arrives here. But don't hold your breath; AP is in the alpha phase, and it took enhanced about 9 months to go through the beta phase.

Andy
ID: 587105
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 587110 - Posted: 15 Jun 2007, 3:39:33 UTC - in response to Message 587105.  


So this all means that the majority of users who don't know about and therefore don't use optimised apps will see the crunch time per unit decrease to under 60% of present enhanced times.
There will still be optimised apps, but these gains will be for using SSEx etc. And the gains will not be as much as before.


How is MB in regards to noisy units? Are there large batches of early-ending results?
ID: 587110
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19059
Credit: 40,757,560
RAC: 67
United Kingdom
Message 587118 - Posted: 15 Jun 2007, 3:55:40 UTC - in response to Message 587110.  


So this all means that the majority of users who don't know about and therefore don't use optimised apps will see the crunch time per unit decrease to under 60% of present enhanced times.
There will still be optimised apps, but these gains will be for using SSEx etc. And the gains will not be as much as before.


How is MB in regards to noisy units? Are there large batches of early-ending results?

To be honest, I don't think we know. So far on my computers all units have been close to AR=0.4nnnn. Experience indicates most noisy units are at VHAR, or had radar interference, which has now been cleared.

But I have just asked over there, about an hour ago, msg 25184.


ID: 587118
Brian Silvers

Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 587125 - Posted: 15 Jun 2007, 4:13:47 UTC - in response to Message 587118.  


So this all means that the majority of users who don't know about and therefore don't use optimised apps will see the crunch time per unit decrease to under 60% of present enhanced times.
There will still be optimised apps, but these gains will be for using SSEx etc. And the gains will not be as much as before.


How is MB in regards to noisy units? Are there large batches of early-ending results?

To be honest, I don't think we know. So far on my computers all units have been close to AR=0.4nnnn. Experience indicates most noisy units are at VHAR, or had radar interference, now cleared.

But I have just asked over there, about an hour ago, msg 25184.



Well, that's a critical component.

If what you are saying comes to fruition, then that would push "Octo" (happy now????) machines up to the point where they could do 400/day on a consistent basis; my earlier estimate for Zombie was in the 200-250 range. Since we know he'll use a further optimized app, you have to assume 2X the capability.

The quandary becomes this: are people like Zombie kept happy, or are more users kept happy? Can both be kept happy (faster splitting)? Can the back-end handle this? I feel slightly better about that, seeing how the responsiveness of the forums and result pages has improved since Jocelyn was made the master db... It will be interesting to see how the next outage goes. Will it take less time to complete? Will it take less time to recover?

If the backend can't efficiently deal with the higher load again, then if I were making the decision, the most I'd do is bump to 150/core (4 max), but I'd be more inclined to leave it alone and try to augment the equipment to handle the extra load before doing that...
ID: 587125
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19059
Credit: 40,757,560
RAC: 67
United Kingdom
Message 587149 - Posted: 15 Jun 2007, 6:05:26 UTC - in response to Message 587125.  
Last modified: 15 Jun 2007, 6:06:10 UTC


So this all means that the majority of users who don't know about and therefore don't use optimised apps will see the crunch time per unit decrease to under 60% of present enhanced times.
There will still be optimised apps, but these gains will be for using SSEx etc. And the gains will not be as much as before.


How is MB in regards to noisy units? Are there large batches of early-ending results?

To be honest, I don't think we know. So far on my computers all units have been close to AR=0.4nnnn. Experience indicates most noisy units are at VHAR, or had radar interference, now cleared.

But I have just asked over there, about an hour ago, msg 25184.



Well, that's a critical component.

If what you are saying comes to fruition, then that would push "Octo" (happy now????) machines up to the point where they could do 400/day on a consistent basis, as my estimate for Zombie before was in the 200-250 range. Since we know he'll use a further optimized app, then you have to assume 2X the capability.

The quandary becomes thus: Are people like Zombie kept happy, or are more users kept happy? Can both be kept happy (faster splitting)? Can the back-end handle this? I feel slightly better about that seeing how the responsiveness of the forums and viewing result pages has improved since Jocelyn was made the master db... It will be interesting how the next outage goes. Will it take less time to complete? Will it take less time to recover?

If the backend can't efficiently deal with the higher load again, then if I were making the decision, the most I'd do is bump to 150/core (4 max), but I'd be more inclined to leave it alone and try to augment the equipment to handle the extra load before doing that...

As the MB splitter is new and the MB data comes from Arecibo on HDD, I have no idea of its max splitting rate compared to the enhanced tape splitter. But with 7 beams and H and V polarization there could be more data, though that is governed by the periods the MB receiver will be on once Arecibo finishes its maintenance/repaint.

My personal view is that I don't believe the 'Management' will increase the daily allowance unless they can be sure the backend can handle it reliably and the new splitter can sustain the increased rate required. I think they will try to ensure all hosts get some work, rather than a few fast multicores getting the lion's share.
If you think about it, more hosts, each with a bit of work, is probably more reliable, due to redundancy, than a few fast multicores when relying on 'unknown' computers. Plus, if the users that only do a few units/day at present start noticing they are only getting work now and then, they will soon question whether SETI and/or BOINC really needs them and start detaching. Then when they are needed (for AP) they won't be here. If you look at the top 20 computers, the only ones at the moment that stand a chance of crunching more than 400/day, what would be the effect on the project if we switched them all off? Almost nothing (~1%): it's only 8,000/day out of 800,000/day. Sorry Zombie, but your octo is a very small part (<0.05%) of a large machine.
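Andy's percentages check out on his own estimates:

```python
# Quick check of the percentages above (daily totals are Andy's estimates).
top20_daily = 20 * 400            # top-20 hosts at ~400 results/day each
project_daily = 800_000           # estimated results/day across the project
print(f"{top20_daily / project_daily:.0%}")   # top 20 combined -> 1%
print(f"{400 / project_daily:.2%}")           # one 'octo' host -> 0.05%
```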

Andy
ID: 587149
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19059
Credit: 40,757,560
RAC: 67
United Kingdom
Message 587302 - Posted: 15 Jun 2007, 16:25:56 UTC

I knew a rogue host would pop up sooner or later. This host is an example of why the download limits are imposed, and why they probably should not be increased until the Berkeley backend can handle a larger load reliably.

Andy
ID: 587302
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 587329 - Posted: 15 Jun 2007, 17:49:03 UTC

Hmmm....

I rooted around in this host's summary pages a bit. Turns out it's running 4.19.

I couldn't find any sign that it started malfunctioning per se. IOW, no compute errors, just timeouts. It had a few successful returns at the very end of the listings, back toward the end of May.

Weren't there some real problems with the CC scheduler and big CIs back with the early versions of 4.x? (It's been so long I don't remember for sure.)

In any event, I don't see why the project doesn't use a 4.45 (or whatever is appropriate) cutoff. That would allow the folks who need to use 4.x to participate, yet eliminate the problems which arise from the really old versions (which obviously had problems, or there wouldn't have been the need to release newer ones).

Alinator
ID: 587329
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 592370 - Posted: 25 Jun 2007, 11:33:27 UTC

bumped from 2nd page for Modesto
ID: 592370


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.