Multiplet-finding revisited

David Anderson (Volunteer moderator, Project administrator, Project developer, Project scientist) Joined: 13 Feb 99 Posts: 173 Credit: 502,653 RAC: 0
Message 1972905 - Posted: 1 Jan 2019, 2:59:59 UTC Last modified: 1 Jan 2019, 3:33:23 UTC

In my last post I described how Eric has been working on generating more realistic birdies; that is, sets of signals that approximate what the SETI@home front end would produce given
a) various combinations of transmitter parameters, including Doppler shift due to orbital and rotational accelerations, and
b) the way the S@h client works: its order of chirp rates and FFT lengths, and the threshold on the number of signals it generates.

This got me thinking - again - about how we detect and score multiplets. A couple of major problems became apparent.

1) Sequences of consistent signals should count more than single signals

Previously, multiplet-finding had an "observation selection" phase where we discarded all but one signal from each 10-minute period. The idea was that there might be a bunch of different signals, maybe at different FFT lengths or in neighboring frequency bins, that all corresponded to the same physical source, and that counting them all would give an inflated multiplet score.

But what if there's a sequence of signals over time, during the same minute or two, with consistent frequency and chirp parameters? Shouldn't that count for more than just one signal? Of course. But we still need to discard simultaneous signals.

So I replaced observation selection with what I call "overlap pruning". Given a set of signals, we find a subset that is non-overlapping. Furthermore, we try to pick signals that are likely to maximize multiplet score, namely those that have high power, are clustered in position, and have low "frequency deviation" (see below).

2) Non-bary signals can have large frequency variation.
This picture shows a possible non-bary ET signal (or birdie):

[image]

The frequency varies due to rotation (short time scale) and orbit (long time scale). Depending on the orbital parameters, the total variation might be as much as 200 kHz.

Generally, our data will include only a few observations of that sky position, and each observation will be short - a minute or two. We'll have signals only for these periods. This is indicated by the gray rectangles.

Previously, our multiplet-finding algorithm looked for groups of signals that are close in sky position and frequency. The multiplet "score" included a term that penalized frequency variation. The only difference between bary and non-bary was that we used a wider frequency range (50 kHz) for non-bary.

This is clearly not the right approach for sources like the one shown above. Because frequency variation is penalized, we'd get a multiplet consisting of one of the 3 groups, together with noise signals at nearby frequencies.

So I changed the algorithms as follows:

a) Within each 0.1-day "chirp segment" (during which chirp is assumed to be more or less constant, and frequency more or less linear with time) we find the median of chirp, and the median of chirp-adjusted frequency. For each signal, we compute a "frequency deviation" from this line. This is a local deviation; in the example above, if we find signals along the wiggly line, they'll have low deviation even though globally they have a wide variation.
b) Overlap pruning (see above) is based on the local deviation.
c) Multiplet scoring is based on the local deviation.

That's the algorithm for non-bary multiplets. For bary multiplets, we no longer do chirp pruning; for a bary source, the only chirp is due to receiver motion. For now, we just discard signals whose chirp is outside the range we'd expect from Arecibo's motion. We could (and possibly will) get fancier and look at Arecibo's motion at the time of the signal.
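As a rough illustration of step a), the local frequency deviation could be computed along the following lines. This is only a sketch under stated assumptions, not Nebula's actual code: the `Signal` fields, the units (seconds, Hz, Hz/s), the choice of the segment start as the reference time, and the use of the segment's median chirp to adjust each frequency are all guesses made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median

# Length of one "chirp segment": 0.1 day, expressed in seconds.
SEGMENT_SECONDS = 0.1 * 86400


@dataclass
class Signal:
    """Hypothetical signal record; field names and units are assumptions."""
    time: float   # seconds since some epoch
    freq: float   # detection frequency, Hz
    chirp: float  # drift rate, Hz/s


def local_deviations(signals):
    """Return one frequency deviation (Hz) per signal, in input order.

    Within each segment, a line is defined by the median chirp (slope)
    and the median chirp-adjusted frequency (intercept at segment start).
    The deviation is the distance of each signal from that line.
    """
    # Group signal indices by the segment their time falls into.
    segments = defaultdict(list)
    for i, s in enumerate(signals):
        segments[int(s.time // SEGMENT_SECONDS)].append(i)

    deviations = [0.0] * len(signals)
    for seg, idxs in segments.items():
        t_ref = seg * SEGMENT_SECONDS
        med_chirp = median(signals[i].chirp for i in idxs)
        # Chirp-adjusted frequency: each frequency projected back to t_ref.
        med_f0 = median(
            signals[i].freq - med_chirp * (signals[i].time - t_ref)
            for i in idxs
        )
        for i in idxs:
            s = signals[i]
            expected = med_f0 + med_chirp * (s.time - t_ref)
            deviations[i] = abs(s.freq - expected)
    return deviations
```

Because each segment gets its own line, signals that follow the drifting source have low deviation within their segment, even when the segments themselves sit at very different frequencies.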
3) Big Picture

The issues discussed here are basic and obvious. Why are we addressing them so late in the game?

The reason, I think, is that Nebula started with a large code base that evolved over a long period of time, starting from pre-SETI@home days and going through the NtPckr phase of S@h. At the beginning there was no notion of chirp or of non-bary multiplets, and as these notions were added their ramifications weren't always thought through.

When I started Nebula, I rewrote a lot of this code to make it more efficient, but I didn't change the algorithms. As Nebula has progressed, Eric and I have examined the algorithms in turn, and in most cases we've realized that they were no longer optimal, and we've changed or replaced them. This won't go on forever; at this point we've reworked pretty much everything.
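Returning to the overlap pruning from section 1: the post does not say how the non-overlapping subset is chosen, so the following is just one plausible sketch, a greedy selection. The `Signal` fields and the ranking formula `power / (1 + freq_dev)` are hypothetical, and position clustering is left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    """Hypothetical signal record; field names are assumptions."""
    start: float     # start of the signal's time interval, seconds
    end: float       # end of the signal's time interval, seconds
    power: float     # detection power (arbitrary units)
    freq_dev: float  # local frequency deviation, Hz


def rank(s):
    # Illustrative ranking only: prefer high power and low deviation.
    return s.power / (1.0 + s.freq_dev)


def overlaps(a, b):
    # Two half-open intervals [start, end) overlap if each starts
    # before the other ends.
    return a.start < b.end and b.start < a.end


def prune_overlaps(signals):
    """Greedily keep the best-ranked signals that do not overlap in time."""
    kept = []
    for s in sorted(signals, key=rank, reverse=True):
        if not any(overlaps(s, k) for k in kept):
            kept.append(s)
    # Return the survivors in time order.
    return sorted(kept, key=lambda s: s.start)
```

A greedy pass like this does not guarantee the best possible subset, but it keeps consecutive signals from the same minute or two while discarding simultaneous ones, which is the behavior the post describes.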

George 254 (Volunteer tester) Joined: 25 Jul 99 Posts: 155 Credit: 16,507,264 RAC: 19
Message 1975265 - Posted: 15 Jan 2019, 10:00:55 UTC - in response to Message 1972905.

Apologies for asking this but bary and non-bary signals? TIA

Tom M (Volunteer tester) Joined: 28 Nov 02 Posts: 5124 Credit: 276,046,078 RAC: 462
Message 1984096 - Posted: 8 Mar 2019, 13:31:24 UTC - in response to Message 1975265.

"Apologies for asking this but bary and non-bary signals? TIA"

I found this:

o Whether the signal is "barycentric", i.e. whether its frequency has been adjusted by the sender to cancel Doppler shift due to acceleration of the sender's reference frame (e.g. planetary rotation and orbit).
o For non-barycentric birdies, we model its frequency over time as a sinusoid with random amplitude, frequency, and phase.

From here: https://setiathome.berkeley.edu/nebula/web/birdies.php

HTH,
Tom

A proud member of the OFA (Old Farts Association).
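To make the quoted description of non-barycentric birdies concrete, here is a minimal sketch of a frequency that varies as a sinusoid with random amplitude, period, and phase. It is not taken from the birdie generator itself: the function name `make_birdie` and all parameter ranges are assumptions chosen for illustration (the 100 kHz amplitude cap simply matches the "as much as 200 kHz" total variation mentioned in the first post).

```python
import math
import random


def make_birdie(base_freq_hz, rng):
    """Return (freq_at, amplitude_hz) for a hypothetical non-bary birdie.

    freq_at(t) gives the frequency in Hz at time t (seconds).
    The ranges below are illustrative guesses, not project values.
    """
    amplitude_hz = rng.uniform(0.0, 100e3)    # up to 200 kHz peak to peak
    period_s = rng.uniform(3600.0, 86400.0)   # one hour to one day
    phase = rng.uniform(0.0, 2.0 * math.pi)

    def freq_at(t):
        angle = 2.0 * math.pi * t / period_s + phase
        return base_freq_hz + amplitude_hz * math.sin(angle)

    return freq_at, amplitude_hz


# Example: sample one birdie every 10 minutes over a day.
rng = random.Random(1)  # seeded so the run is repeatable
freq_at, amplitude_hz = make_birdie(1.42e9, rng)
samples = [freq_at(t) for t in range(0, 86400, 600)]
```

A barycentric birdie, by contrast, would correspond to zero amplitude here: the sender has already removed the Doppler variation, so the frequency stays at the base value.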

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.