Multiplet-finding revisited

Message boards : Nebula : Multiplet-finding revisited

Profile David Anderson
Volunteer moderator
Project administrator
Project developer
Project scientist

Joined: 13 Feb 99
Posts: 157
Credit: 502,653
RAC: 0
Message 1972905 - Posted: 1 Jan 2019, 2:59:59 UTC
Last modified: 1 Jan 2019, 3:33:23 UTC

In my last post I described how Eric has been working on generating more realistic birdies; that is, sets of signals that approximate what the SETI@home front end would produce given

a) various combinations of transmitter parameters, including doppler shift due to orbital and rotational accelerations, and
b) the way the S@h client works: its order of chirp rates and FFT lengths, and the threshold on the number of signals it generates.

This got me thinking - again - about how we detect and score multiplets. A couple of major problems became apparent.

1) Sequences of consistent signals should count more than single signals

Previously, multiplet-finding had an "observation selection" phase where we discarded all but one signal from each 10-minute period. The idea was that there might be a bunch of different signals, maybe at different FFT lengths or in neighboring frequency bins, that all corresponded to the same physical source, and that counting them all would give an inflated multiplet score.

But what if there's a sequence of signals over time, during the same minute or two, with consistent frequency and chirp parameters? Shouldn't that count for more than just one signal?

Of course. But we still need to discard simultaneous signals. So I replaced observation selection with what I call "overlap pruning". Given a set of signals, we find a subset that is non-overlapping. Furthermore, we try to pick signals that are likely to maximize multiplet score, namely those that have high power, are clustered in position, and have low "frequency deviation" (see below).
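
To make the idea concrete, here is a rough sketch of overlap pruning as a greedy pass. This is not the Nebula code: the Signal fields, the rank() formula, and its weights are all assumptions made up for illustration. Signals are ranked by a hypothetical score, and each one is kept only if it does not overlap in time with a signal already kept.

```python
from dataclasses import dataclass

@dataclass
class Signal:
    start: float     # start time, seconds
    end: float       # end time, seconds
    power: float
    pos_dist: float  # distance from the group's center (hypothetical units)
    freq_dev: float  # local frequency deviation, Hz

def rank(s, w_pos=1.0, w_dev=1.0):
    # Hypothetical ranking: favor high power, penalize
    # positional scatter and frequency deviation.
    return s.power - w_pos * s.pos_dist - w_dev * s.freq_dev

def overlaps(a, b):
    # Two time intervals overlap if each starts before the other ends.
    return a.start < b.end and b.start < a.end

def prune_overlaps(signals):
    kept = []
    # Visit the best-ranked signals first, so a weaker simultaneous
    # signal is the one that gets dropped.
    for s in sorted(signals, key=rank, reverse=True):
        if not any(overlaps(s, k) for k in kept):
            kept.append(s)
    return sorted(kept, key=lambda s: s.start)
```

A greedy pass like this does not guarantee the best possible subset; it is just the simplest way to show the selection criteria working together.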

2) Non-bary signals can have large frequency variation.

This picture shows a possible non-bary ET signal (or birdie):

The frequency varies due to rotation (short time scale) and orbit (long time scale). Depending on the orbital parameters, the total variation might be as much as 200 kHz. Generally, our data will include only a few observations of that sky position, and each observation will be short - a minute or two. We'll have signals only for these periods. This is indicated by the gray rectangles.

Previously, our multiplet-finding algorithm looked for groups of signals that are close in sky position and frequency. The multiplet "score" included a term that penalized frequency variation. The only difference between bary and non-bary was that we used a wider frequency range (50 kHz) for non-bary. This is clearly not the right approach for sources like the one shown above. Because frequency variation is penalized, we'd get a multiplet consisting of one of the 3 groups, together with noise signals at nearby frequencies.

So I changed the algorithms as follows:

a) Within each 0.1-day "chirp segment" (during which chirp is assumed to be more or less constant, and frequency more or less linear with time) we find the median of chirp, and the median of chirp-adjusted frequency. Together these define a line in time/frequency. For each signal, we compute a "frequency deviation" from this line. This is a local deviation; in the example above, if we find signals along the wiggly line, they'll have low deviation even though globally they have a wide variation.

b) Overlap pruning (see above) is based on the local deviation.

c) Multiplet scoring is based on the local deviation.
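
Step a) can be sketched roughly as follows. This is an illustration under assumptions, not the Nebula implementation: signals are plain (time, frequency, chirp) tuples, times are in seconds, and "chirp-adjusted frequency" is taken to mean the frequency projected back to the start of the segment using the segment's median chirp.

```python
from collections import defaultdict
from statistics import median

SEGMENT_SECONDS = 8640.0  # 0.1 day

def local_deviations(signals):
    """signals: list of (time_s, freq_hz, chirp_hz_per_s) tuples.
    Returns each signal's deviation (Hz) from its segment's line."""
    segments = defaultdict(list)
    for i, (t, _, _) in enumerate(signals):
        segments[int(t // SEGMENT_SECONDS)].append(i)

    dev = [0.0] * len(signals)
    for seg, idxs in segments.items():
        t0 = seg * SEGMENT_SECONDS
        # Slope of the line: median chirp in this segment.
        c_med = median(signals[i][2] for i in idxs)
        # Intercept: median frequency after removing the drift since t0.
        f_med = median(signals[i][1] - c_med * (signals[i][0] - t0)
                       for i in idxs)
        for i in idxs:
            t, f, _ = signals[i]
            dev[i] = abs(f - (f_med + c_med * (t - t0)))
    return dev
```

For example, three signals drifting at 2 Hz/s from 1000 Hz have deviation 0, while a fourth signal 40 Hz off that line has deviation 40, no matter how far the line itself has moved in frequency.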

That's the algorithm for non-bary multiplets. For bary multiplets, we no longer do chirp pruning; for a bary source, the only chirp is due to receiver motion. For now, we just discard signals whose chirp is outside the range we'd expect from Arecibo's motion. We could (and possibly will) get fancier and look at Arecibo's motion at the time of the signal.
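
The current bary filter amounts to something like the sketch below. The tuple layout and the symmetric limit are assumptions for illustration; the actual chirp range expected from Arecibo's motion is not stated here, so max_abs_chirp is a placeholder the caller must supply.

```python
def filter_bary_chirp(signals, max_abs_chirp):
    """signals: list of (time_s, freq_hz, chirp_hz_per_s) tuples.
    Keeps signals whose chirp magnitude is within max_abs_chirp,
    a placeholder for the range expected from receiver motion."""
    return [s for s in signals if abs(s[2]) <= max_abs_chirp]
```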

3) Big Picture

The issues discussed here are basic and obvious. Why are we addressing them so late in the game?

The reason, I think, is that Nebula started with a large code base that evolved over a long period of time, starting from pre-SETI@home days and going through the NtPckr phase of S@h. At the beginning there was no notion of chirp or of non-bary multiplets, and as these notions were added their ramifications weren't always thought through.

When I started Nebula, I rewrote a lot of this code to make it more efficient, but I didn't change the algorithms. As Nebula has progressed, Eric and I have examined the algorithms in turn, and in most cases we've realized that they were no longer optimal, and we've changed or replaced them. This won't go on forever; at this point we've reworked pretty much everything.
ID: 1972905
Profile George 254
Volunteer tester

Joined: 25 Jul 99
Posts: 155
Credit: 16,507,264
RAC: 19
United Kingdom
Message 1975265 - Posted: 15 Jan 2019, 10:00:55 UTC - in response to Message 1972905.  

Apologies for asking this but bary and non-bary signals?
ID: 1975265
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5117
Credit: 276,046,078
RAC: 462
Message 1984096 - Posted: 8 Mar 2019, 13:31:24 UTC - in response to Message 1975265.  

Apologies for asking this but bary and non-bary signals?

I found this:
o Whether the signal is "barycentric", i.e. whether its frequency has been adjusted by the sender to cancel Doppler shift due to acceleration of the sender's reference frame (e.g. planetary rotation and orbit).

o For non-barycentric birdies, we model its frequency over time as a sinusoid with random amplitude, frequency, and phase.

From here:

A proud member of the OFA (Old Farts Association).
ID: 1984096


©2021 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.