Finding persistent signals

Now we come to the part where we actually find ET! Although SETI@home is able to detect short, one-time ET transmissions, it's best at detecting beacons that remain on for many years. So we search through the entire set of non-RFI signals, looking for groups of signals that are close in sky position and frequency, but possibly spread out over time.

Multiplets

These groups of signals are called "multiplets". We look for two kinds of multiplets, barycentric and non-barycentric, which differ in the size of the frequency window (see "Scanning signals").

For each multiplet we compute the probability that it would have occurred in noise. The lower the probability, the more interesting the multiplet. The multiplet's "score" is the log of the probability (this gives better numerical resolution for extremely small probabilities). When we refer to "high-scoring" multiplets we actually mean those with large negative values.
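To see why the log helps, here is a minimal Python sketch. The probabilities are made-up values, and treating the joint probability as a product of independent factors is an assumption made only for illustration; it is not the actual SETI@home scoring formula.

```python
import math

# Hypothetical per-signal noise probabilities (illustrative values only).
probs = [1e-80] * 5

# Multiplying directly: the true product is 1e-400, which is smaller than
# the smallest positive double, so the result underflows to zero.
product = 1.0
for p in probs:
    product *= p

# Summing logs instead keeps a finite, comparable number.
log_score = sum(math.log(p) for p in probs)

print(product)    # 0.0
print(log_score)  # roughly -921.03
```

Because lower probabilities give more negative logs, sorting by ascending score puts the most interesting multiplets first.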

Originally, we found multiplets separately for each signal type. However, doing this throws away sensitivity, because an ET signal could manifest as both spikes and gaussians, depending on the slew rate of the telescope. Similarly with pulses and triplets. So starting in March 2018 we look for mixed-type multiplets: spikes/gaussians, and pulses/triplets. Autocorrs are handled separately.

Pixels

SETI@home uses a system called Healpix that divides the celestial sphere into small equal-area regions called "pixels". Healpix supports a range of resolutions.

We use a resolution at which each pixel is about as large as a telescope beam. This resolution divides the sphere into 2^26 (about 67 million) pixels, of which about 16 million are visible from the Arecibo telescope.
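As a quick sanity check on these figures (plain arithmetic; the 16 million value is the approximate count quoted above):

```python
total_pixels = 2 ** 26            # the "about 67 million" figure
visible_pixels = 16_000_000       # approximate count visible from Arecibo
visible_fraction = visible_pixels / total_pixels

print(total_pixels)                # 67108864
print(round(visible_fraction, 2))  # 0.24
```

So roughly a quarter of the pixels are visible from the telescope.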

After finding the multiplets in a given pixel, we compute a score for the pixel itself. This score is based on the number of multiplets and their scores, and on the presence of stars, and especially Sun-like stars, in the pixel. Pixel scoring lets us detect ET transmissions that manifest as several multiplets; e.g. a signal with several frequencies.

Algorithms

I'll now describe the algorithms for finding the multiplets in a given pixel, and for scoring the pixel.

Assembling signals

For each signal type, we make an in-memory list of the signals within a beam width of the pixel. The area of a pixel is a bit less than the beam size of the telescope, so we include all the signals in a disc that is centered on the pixel and contains parts of the 8 adjacent pixels. These lists range in size from about 100 to 100,000 signals.
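The disc selection can be sketched in Python as a simple angular-distance filter. This is not the code in candidate_set_t::assemble(); the Signal fields, the function names, and the 0.05 degree radius are all hypothetical choices made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Signal:            # hypothetical minimal record
    ra_deg: float        # right ascension, degrees
    dec_deg: float       # declination, degrees
    freq_hz: float

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    r1, d1, r2, d2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((d2 - d1) / 2) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))

def signals_in_disc(signals, center_ra, center_dec, radius_deg):
    """Keep the signals whose position lies within radius_deg of the center."""
    return [s for s in signals
            if angular_separation_deg(s.ra_deg, s.dec_deg,
                                      center_ra, center_dec) <= radius_deg]

signals = [
    Signal(10.00, 20.0, 1.42e9),
    Signal(10.03, 20.0, 1.42e9),   # about 0.028 degrees from the center
    Signal(10.50, 20.0, 1.42e9),   # about 0.47 degrees away: outside the disc
]
kept = signals_in_disc(signals, center_ra=10.0, center_dec=20.0, radius_deg=0.05)
print(len(kept))
```

A linear scan like this is only meant to show the geometry; with tens of thousands of signals per disc, a real implementation would more likely look signals up by pixel.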

Scanning signals

We sort the list of signals by their barycentric frequency. Then we do the following steps, with different frequency range parameters, to find the barycentric and non-barycentric multiplets. For barycentric multiplets the frequency range is 10 Hz; for non-barycentric, 50 kHz.

We scan the frequency-sorted list in order, maintaining a "sliding window" of signals whose size is limited by the frequency range parameter. When we append a signal, if the new signal is outside the range and the window has more than one signal, we try to form a multiplet from the signals in the window. If this succeeds, we trim signals from the window up to the highest-frequency signal in the multiplet, then try to form a multiplet from the remaining signals, and so on. This ensures that multiplets are disjoint, i.e. no two multiplets share a signal.
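Here is a rough Python sketch of the sliding-window scan, not a copy of signal_disc_t::find_multiplets(). Signals are reduced to bare frequencies, and try_form is a hypothetical callback standing in for the pruning and selection steps described below: it returns the indices (within the window) of the signals it would put in a multiplet, or None. Two details are my assumptions, since the text does not spell them out: when a new signal does not fit, leftover low-frequency signals are dropped until it does, and the final window is also tried at the end of the list.

```python
def find_multiplets(freqs, freq_range, try_form):
    """Scan frequencies (sorted ascending) and return disjoint multiplets."""
    multiplets = []
    window = []

    def flush():
        # Repeatedly try to form multiplets from the current window.
        while len(window) > 1:
            idx = try_form(window)
            if not idx:
                break
            multiplets.append([window[i] for i in idx])
            # Trim up to and including the highest-frequency member.
            del window[: max(idx) + 1]

    for f in freqs:
        if window and f - window[0] > freq_range:
            flush()
            # Assumption: drop leading signals until the new one fits.
            while window and f - window[0] > freq_range:
                window.pop(0)
        window.append(f)
    flush()  # assumption: also try whatever is left at the end
    return multiplets

def take_all(window):
    """Stand-in for the real selection logic: accept every signal."""
    return list(range(len(window)))

freqs = [100.0, 102.0, 105.0, 200.0, 203.0, 500.0]
result = find_multiplets(freqs, freq_range=10.0, try_form=take_all)
print(result)  # [[100.0, 102.0, 105.0], [200.0, 203.0]]
```

Because the window is trimmed past every signal that ends up in a multiplet, no signal can appear in two multiplets.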

Chirp pruning

If a group of signals are from the same ET source, then over short periods of time (say, an hour):

The "chirp pruning" takes a set of signals and returns a subset that is consistent in these ways. It tries to find a subset that is likely to produce the highest-scoring multiplet. It works as follows:

Observation selection

An "observation" is a contiguous time interval during which a beam sees a given pixel. An ET signal could produce many detected signals during a given observation. For example, a narrow-band signal may be detected as a spike in several consecutive FFTs. But these spikes are all really the same signal. To calculate the scores of multiplets properly, we need to include only one of them in the multiplet.

In practice, during scoring we don't have the detailed pointing information we'd need to identify observations. Instead, we consider signals separated by less than 10 minutes to be in the same observation.
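The 10-minute rule can be sketched like this in Python. I'm assuming here that the rule is applied to consecutive signals in time order, so a long chain of closely spaced signals forms a single observation; the function name and the use of plain timestamps in seconds are illustrative.

```python
def group_observations(times_s, max_gap_s=600.0):
    """Split timestamps (seconds) into groups wherever the gap between
    consecutive signals is 10 minutes or more."""
    groups = []
    for t in sorted(times_s):
        if groups and t - groups[-1][-1] < max_gap_s:
            groups[-1].append(t)      # close enough: same observation
        else:
            groups.append([t])        # start a new observation
    return groups

observations = group_observations([5000.0, 0.0, 650.0, 100.0, 5200.0])
print(observations)  # [[0.0, 100.0, 650.0], [5000.0, 5200.0]]
```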

If there are multiple signals in an observation, we want to include the one that will contribute most to a high multiplet score. The score of a multiplet (see below) reflects its compactness in frequency and sky position, and the power of its component signals. So we compute the median frequency and sky position of the signals in the multiplet, and compute a signal "score" based on proximity to these medians and on signal power. Then, from each observation, we keep the signal with the highest score and discard the others.
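A minimal Python sketch of this selection step follows. The scoring formula (power minus weighted distances from the medians), the weights, the flat small-angle distance, and the Signal fields are all hypothetical; the real formula is in signal_disc_t::observation_selection().

```python
import math
from dataclasses import dataclass
from statistics import median

@dataclass
class Signal:            # hypothetical minimal record
    obs_id: int          # which observation the signal belongs to
    freq_hz: float
    ra_deg: float
    dec_deg: float
    power: float

def select_per_observation(signals, freq_weight=1.0, pos_weight=1.0):
    """Keep, from each observation, the signal with the best illustrative score."""
    med_f = median([s.freq_hz for s in signals])
    med_ra = median([s.ra_deg for s in signals])
    med_dec = median([s.dec_deg for s in signals])

    def score(s):
        # Hypothetical score: reward power, penalize distance from the medians.
        pos_dist = math.hypot(s.ra_deg - med_ra, s.dec_deg - med_dec)
        return (s.power
                - freq_weight * abs(s.freq_hz - med_f)
                - pos_weight * pos_dist)

    best = {}
    for s in signals:
        if s.obs_id not in best or score(s) > score(best[s.obs_id]):
            best[s.obs_id] = s
    return [best[k] for k in sorted(best)]

signals = [
    Signal(0, 1000.0, 10.00, 20.0, 30.0),
    Signal(0, 1000.5, 10.00, 20.0, 25.0),
    Signal(1, 1001.0, 10.00, 20.0, 28.0),
    Signal(1, 1009.0, 10.01, 20.0, 29.0),  # strong, but far from the median frequency
]
chosen = select_per_observation(signals)
print([(s.obs_id, s.freq_hz) for s in chosen])  # [(0, 1000.0), (1, 1001.0)]
```

Note that in the second observation the more powerful signal loses, because it sits far from the median frequency and would make the multiplet less compact.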

Scoring multiplets

We then compute a score for each multiplet. This score includes several factors, including:

Scoring pixels

Once we've found all the multiplets (barycentric and non-barycentric) for a pixel, we compute the score of the pixel. This includes several factors, including:

Normalizing multiplet scores

The distribution of multiplet scores varies between signal types. To allow meaningful ranking of multiplets of different types, we "normalize" the scores. This is done after large numbers of pixels have been processed.

The procedure is as follows. For each signal type we find the set of pixels having a multiplet of that type. For each such pixel we find the top-scoring multiplet. Then we compute the median, over pixels, of these top scores. The differences between these medians determine normalization factors that are added to multiplet scores.
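The normalization can be sketched in Python as below. Two things are assumptions on my part: "top-scoring" is taken to mean the most negative score (since scores are log probabilities), and the offsets are measured relative to an arbitrarily chosen reference type. The function name and input layout are illustrative, not taken from multiplet_stats.php.

```python
from collections import defaultdict
from statistics import median

def normalization_offsets(multiplets, reference_type):
    """multiplets: iterable of (signal_type, pixel_id, score) tuples.
    Returns, per signal type, an offset to add to that type's scores."""
    top = defaultdict(dict)   # type -> {pixel: top (most negative) score}
    for sig_type, pixel, score in multiplets:
        current = top[sig_type].get(pixel)
        if current is None or score < current:
            top[sig_type][pixel] = score

    # Median, over pixels, of the per-pixel top scores.
    medians = {t: median(list(per_pixel.values())) for t, per_pixel in top.items()}

    # Offsets shift each type's median onto the reference type's median.
    return {t: medians[reference_type] - m for t, m in medians.items()}

multiplets = [
    ("spike", 1, -50), ("spike", 1, -40), ("spike", 2, -60), ("spike", 3, -70),
    ("pulse", 1, -20), ("pulse", 2, -30), ("pulse", 2, -10),
]
offsets = normalization_offsets(multiplets, reference_type="spike")
print(offsets)  # {'spike': 0, 'pulse': -35.0}
```

With these offsets, a pulse multiplet scoring -20 becomes -55 after normalization, so it can be ranked against spike multiplets on a comparable scale.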

Code

If you know C++ pretty well, you might be interested in looking at the source code for the above algorithms. In fact, I encourage you to read the code carefully and let me know if you find any bugs or problems.

Here are some starting points (the function or file to look for in each case):

Main program: run_task().

Assembling signals: candidate_set_t::assemble().

Scanning signals: signal_disc_t::find_multiplets().

Chirp pruning: signal_disc_t::prune_chirp().

Observation selection: signal_disc_t::observation_selection().

Multiplet scoring: signal_disc_t::score_multiplet().

Pixel scoring: meta_candidate_score(). Note: in the code, "meta-candidate" is synonymous with "pixel".

Multiplet score normalization: multiplet_stats.php.





 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.