Finding persistent signals

Now we come to the part where we actually find ET! Although SETI@home is able to detect short, one-time ET transmissions, it's designed to detect signals that are "persistent" - that remain on for many years. So we search through the entire set of non-RFI detections, looking for groups of detections that are close in sky position and frequency, but possibly spread out over time. These groups of detections are called "multiplets".

Types of multiplets

We look for two kinds of multiplets:

Barycentric: for this type of multiplet, we assume that ET is sending a directional beacon, and that they modulate its frequency to cancel out chirp due to the transmitter's acceleration in that direction. In other words, a barycentric signal would be received at a fixed frequency in an inertial reference frame. ET might transmit a signal like this if they were trying to communicate with us, or if they were transmitting to their own deep-space probe. Once we compensate for our own acceleration, such a signal would be detected at more or less a fixed frequency, so we can use a small frequency window when looking for barycentric multiplets.
Non-barycentric: for these, we assume that the transmitted frequency is not adjusted for transmitter acceleration, either because it's omnidirectional or because it's not intended for receivers in other reference frames. We need to use a wider frequency window, corresponding to the likely limits of the transmitter's radial velocity.

For each multiplet we compute the probability that it would have occurred in noise. The lower the probability, the more interesting the multiplet. The multiplet's "score" is the log of the probability (this gives better numerical resolution for extremely small probabilities). When we refer to "high-scoring" multiplets we actually mean those with large negative values.
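To see why the log helps, here is a small Python sketch. It is only an illustration: the function names are made up, and the natural log is an assumption, since the text above doesn't say which base is used.

```python
import math

def multiplet_score(noise_probability: float) -> float:
    """Log of the probability that noise alone produced the multiplet.

    Assumption: natural log; the base is not stated in the text.
    """
    if not 0.0 < noise_probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return math.log(noise_probability)

def combined_log_probability(scores):
    """Add logs rather than multiplying raw probabilities.

    Illustrative only: this treats the events as independent.
    """
    return sum(scores)

# The less likely the multiplet is in noise, the more negative its score.
ordinary = multiplet_score(1e-3)
interesting = multiplet_score(1e-30)

# Multiplying two tiny probabilities underflows to zero in floating point,
# but the sum of their logs is still an ordinary finite number.
underflowed = 1e-200 * 1e-200
still_finite = combined_log_probability([multiplet_score(1e-200)] * 2)
```

Here `interesting` is more negative than `ordinary`, so it is the "higher-scoring" of the two in the sense used above.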

Originally, we found multiplets separately for each detection type. However, doing this throws away sensitivity, because an ET signal could manifest as both spikes and gaussians, depending on the slew rate of the telescope. The same is true of pulses and triplets. So starting in March 2018 we look for mixed-type multiplets: spikes/gaussians, and pulses/triplets. Autocorrs are handled separately.

Pixels

SETI@home uses a system called Healpix that divides the celestial sphere into small equal-area quadrilaterals called "pixels". Healpix allows for different resolutions.

We use a resolution at which each pixel is about as large as a telescope beam. This resolution divides the sphere into 2²⁶ (about 67 million) pixels, of which about 16 million are visible from the Arecibo telescope.

After finding the multiplets in a given pixel, we compute a score for the pixel itself. This score is based on the number of multiplets and their scores, and on the presence of stars, and especially Sun-like stars, in the pixel. Pixel scoring lets us detect ET transmissions that manifest as several multiplets; e.g. a signal with several frequencies.

Algorithms

I'll now describe the algorithms for finding the multiplets in a given pixel, and for scoring the pixel.

Assembling detections

For each detection type, we make an in-memory list of the detections within a beam width of the pixel. The area of a pixel is a bit less than the beam size of the telescope. We include all the detections in a disc that is centered at this pixel and contains parts of the 8 adjacent pixels. These lists are in the range of 100 to 100K detections.
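The disc selection can be sketched in Python as follows. This is not the actual code: the `Detection` fields, the function names, and the radius used in the example are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Detection:
    ra_deg: float    # right ascension, degrees
    dec_deg: float   # declination, degrees
    freq_hz: float   # barycentric frequency, Hz

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def detections_in_disc(detections, center_ra, center_dec, radius_deg):
    """Keep the detections within radius_deg of the pixel center."""
    return [
        d for d in detections
        if angular_separation_deg(d.ra_deg, d.dec_deg,
                                  center_ra, center_dec) <= radius_deg
    ]

# Example with a hypothetical 0.05-degree radius standing in for a beam width.
sample = [
    Detection(10.00, 20.0, 1.42e9),
    Detection(10.01, 20.0, 1.42e9),
    Detection(11.00, 20.0, 1.42e9),  # about a degree away, so excluded
]
nearby = detections_in_disc(sample, 10.0, 20.0, radius_deg=0.05)
```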

Scanning detections

We sort the list of detections by their barycentric frequency. Then we do the following steps, with different frequency range parameters, to find the barycentric and non-barycentric multiplets. For barycentric the frequency range is 10 Hz; for non-barycentric, 50 kHz.

We scan the frequency-sorted list in order, maintaining a "sliding window" of detections whose frequency span is limited by the frequency range parameter. Each time we're about to append a detection, we check whether it falls outside the range; if it does, and the window holds more than one detection, we try to form a multiplet from the detections in the window. If this succeeds, we trim detections from the window up to and including the highest-frequency detection in the multiplet, then try to form a multiplet from the remaining detections, and so on. This ensures that multiplets are disjoint, i.e. no two multiplets share a detection.
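Here is a rough Python sketch of this scan. The names are hypothetical, and two details are assumptions that the description above doesn't settle: when no multiplet can be formed, the sketch drops low-frequency detections until the new one fits, and it processes the last window the same way as the others. The `try_form_multiplet` argument stands in for the pruning and selection steps described in the following sections.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    freq_hz: float  # barycentric frequency, Hz

def _drain(window, try_form_multiplet, multiplets):
    """Form multiplets from the window until no more can be formed."""
    while len(window) > 1:
        multiplet = try_form_multiplet(window)
        if not multiplet:
            break
        multiplets.append(multiplet)
        # Trim up to and including the highest-frequency multiplet member.
        member_ids = {id(d) for d in multiplet}
        last = max(i for i, d in enumerate(window) if id(d) in member_ids)
        window = window[last + 1:]
    return window

def find_multiplets(detections, freq_range_hz, try_form_multiplet):
    """Scan detections in frequency order with a sliding window."""
    multiplets = []
    window = []
    for det in sorted(detections, key=lambda d: d.freq_hz):
        if window and det.freq_hz - window[0].freq_hz > freq_range_hz:
            window = _drain(window, try_form_multiplet, multiplets)
            # Assumption: drop low-frequency detections until det fits.
            while window and det.freq_hz - window[0].freq_hz > freq_range_hz:
                window.pop(0)
        window.append(det)
    # Assumption: the final window is handled like the others.
    _drain(window, try_form_multiplet, multiplets)
    return multiplets

def accept_all(window):
    """Toy stand-in: treat every window of 2+ detections as a multiplet."""
    return list(window)
```

Because the window is trimmed past the last member of each multiplet, a detection can never end up in two multiplets.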

Chirp pruning

If a group of detections are from the same ET source, then over short periods of time (say, an hour):

They'd have roughly the same chirp rate.
Their frequencies, when adjusted by this chirp rate, would be similar. For example, if the chirp rate is 1 Hz/sec, a detection one minute later should have a frequency about 60 Hz greater.

The "chirp pruning" takes a set of detections and returns a subset that is consistent in these ways. It tries to find a subset that is likely to produce the highest-scoring multiplet. It works as follows:

Divide the set of detections into 0.1-day segments.
For each segment, find the 20 Hz/sec chirp range for which the sum of powers of detections in that range is greatest; discard the detections outside this range.
Find the median chirp of the remaining detections.
Adjust the frequencies of the detections by this chirp rate.
Find the 10 kHz range for which the sum of powers of detections in that range is greatest; discard detections outside this range.
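The steps above can be sketched in Python like this. Several things here are assumptions rather than facts about the real code: the field and function names, applying every step separately within each segment, using the earliest detection in a segment as the reference time, and the sign convention for the frequency adjustment.

```python
import math
import statistics
from collections import defaultdict
from dataclasses import dataclass

SEGMENT_S = 0.1 * 86400        # 0.1 day, in seconds
CHIRP_RANGE_HZ_S = 20.0        # width of the chirp window, Hz/sec
FREQ_RANGE_HZ = 10_000.0       # width of the frequency window, Hz

@dataclass
class Detection:
    time_s: float
    freq_hz: float
    chirp_hz_s: float
    power: float

def best_power_window(dets, key, width):
    """Detections in the width-wide range of key() with the most power."""
    ordered = sorted(dets, key=key)
    best, best_power = [], float("-inf")
    hi, power = 0, 0.0
    for lo in range(len(ordered)):
        # Extend the window while it stays within `width` of its low end.
        while hi < len(ordered) and key(ordered[hi]) - key(ordered[lo]) <= width:
            power += ordered[hi].power
            hi += 1
        if power > best_power:
            best_power, best = power, ordered[lo:hi]
        power -= ordered[lo].power
    return best

def prune_chirp(dets):
    segments = defaultdict(list)
    for d in dets:
        segments[math.floor(d.time_s / SEGMENT_S)].append(d)
    kept = []
    for seg in segments.values():
        seg = best_power_window(seg, lambda d: d.chirp_hz_s, CHIRP_RANGE_HZ_S)
        chirp = statistics.median([d.chirp_hz_s for d in seg])
        t0 = min(d.time_s for d in seg)  # assumed reference time

        def adjusted_freq(d, chirp=chirp, t0=t0):
            # Assumed convention: remove the drift accumulated since t0.
            return d.freq_hz - chirp * (d.time_s - t0)

        kept.extend(best_power_window(seg, adjusted_freq, FREQ_RANGE_HZ))
    return kept
```

With the example from the text, a detection at 1060 Hz one minute after a 1000 Hz detection, both chirping at about 1 Hz/sec, is adjusted back to 1000 Hz, so the two land in the same frequency window.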

Observation selection

An "observation" is a contiguous time interval during which a beam sees a given pixel. An ET signal could produce many detections during a given observation. For example, a narrow-band signal may be detected as a spike in several consecutive FFTs. But these spikes are all really the same signal. To calculate the scores of multiplets properly, we need to include only one of them in the multiplet.

In practice, during scoring we don't have the detailed pointing information we'd need to identify observations. Instead, we consider detections separated by less than 10 minutes to be in the same observation.
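A minimal Python sketch of this grouping rule follows. The function name is made up, and it assumes the 10-minute test is applied to consecutive detections in time order, so a long run of closely spaced detections forms a single observation.

```python
def group_into_observations(times_s, max_gap_s=600.0):
    """Split detection times (seconds) into observations.

    A detection joins the current observation if it comes less than
    max_gap_s after the previous detection; otherwise it starts a new one.
    """
    observations = []
    for t in sorted(times_s):
        if observations and t - observations[-1][-1] < max_gap_s:
            observations[-1].append(t)
        else:
            observations.append([t])
    return observations

# Gaps of 100 s and 550 s stay in one observation; the later gaps split.
groups = group_into_observations([0, 100, 650, 5000, 5599, 20000])
```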

If there are multiple detections in an observation, we want to include the one that will contribute most to a high multiplet score. The score of a multiplet (see below) reflects its compactness in frequency and sky position, and the power of its component detections. So we compute the median frequency and sky position of the detections in the multiplet, and compute a detection "score" based on proximity to these medians and on detection power. Then, from each observation, we keep the detection with the highest score and discard the others.
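Here is a Python sketch of the selection step. The detection score formula in it is invented purely for illustration (log power minus scaled distances from the medians), as are the scale parameters and all the names; the real formula is in the source code listed at the end of this page. For this score, higher is better.

```python
import math
import statistics
from dataclasses import dataclass

@dataclass
class Detection:
    obs_id: int      # which observation the detection belongs to
    freq_hz: float
    ra_deg: float
    dec_deg: float
    power: float

def select_per_observation(dets, freq_scale_hz=10.0, pos_scale_deg=0.05):
    """Keep one detection per observation: the one with the best score."""
    f_med = statistics.median([d.freq_hz for d in dets])
    ra_med = statistics.median([d.ra_deg for d in dets])
    dec_med = statistics.median([d.dec_deg for d in dets])

    def score(d):
        # Hypothetical score: reward power, penalize distance from medians.
        freq_term = abs(d.freq_hz - f_med) / freq_scale_hz
        d_ra = (d.ra_deg - ra_med) * math.cos(math.radians(dec_med))
        pos_term = math.hypot(d_ra, d.dec_deg - dec_med) / pos_scale_deg
        return math.log(d.power) - freq_term - pos_term

    best = {}
    for d in dets:
        if d.obs_id not in best or score(d) > score(best[d.obs_id]):
            best[d.obs_id] = d
    return list(best.values())

sample = [
    Detection(1, 1000.0, 10.0, 20.0, 10.0),
    Detection(1, 1008.0, 10.0, 20.0, 10.0),  # same observation, further off
    Detection(2, 1001.0, 10.0, 20.0, 5.0),
]
selected = select_per_observation(sample)
```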

Scoring multiplets

We then compute a score for each multiplet. This score includes several factors, including:

The scores of the component detections.
The "tightness" of the detections, as measured by the standard deviation of sky position, chirp rate, and barycentric frequency.

Scoring pixels

Once we've found all the multiplets (barycentric and non-barycentric) for a pixel, we compute the score of the pixel. This includes several factors, including:

The number and scores of the multiplets.
Stars: the presence of stars in the pixel increases its score, and the presence of Sun-like stars increases it more.

Normalizing multiplet scores

The distribution of multiplet scores varies between detection types. To allow meaningful ranking of multiplets of different types, we "normalize" the scores. This is done after large numbers of pixels have been processed.

The procedure is as follows. For each detection type we find the set of pixels having a multiplet of that type. For each such pixel we find the top-scoring multiplet. Then we compute the median, over pixels, of these top scores. The differences between these medians determine normalization factors that are added to multiplet scores.
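As a rough Python sketch of the normalization (not the actual multiplet_stats.php logic): the function name, the type labels, and the choice of one type as the reference are assumptions. Since scores are log probabilities, "top-scoring" is taken here to mean the most negative score.

```python
import statistics
from collections import defaultdict

def normalization_offsets(multiplets, reference_type):
    """Compute an additive offset per detection type.

    multiplets: iterable of (detection_type, pixel_id, score) tuples.
    Returns {detection_type: offset}; adding the offset to a type's scores
    lines its median top score up with that of reference_type.
    """
    top = defaultdict(dict)  # type -> pixel -> top (most negative) score
    for dtype, pixel, score in multiplets:
        current = top[dtype].get(pixel)
        if current is None or score < current:
            top[dtype][pixel] = score
    medians = {t: statistics.median(list(p.values())) for t, p in top.items()}
    ref = medians[reference_type]
    return {t: ref - m for t, m in medians.items()}

sample = [
    ("spike_gaussian", 1, -10.0),
    ("spike_gaussian", 1, -30.0),
    ("spike_gaussian", 2, -20.0),
    ("spike_gaussian", 3, -40.0),
    ("pulse_triplet", 1, -5.0),
    ("pulse_triplet", 2, -15.0),
]
offsets = normalization_offsets(sample, reference_type="spike_gaussian")
```

In this toy data the median top score is -30 for the first type and -10 for the second, so the second type's scores get -20 added to them.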

Code

If you know C++ pretty well, you might be interested in looking at the source code for the above algorithms. In fact, I encourage you to read the code carefully and let me know if you find any bugs or problems.

Here are some starting points (search the source files for these functions):

Main program: process_pixel().

Assembling detections: candidate_set_t::get_signals().

Scanning detections: signal_disc_t::find_multiplets().

Chirp pruning: signal_disc_t::prune_chirp().

Observation selection: signal_disc_t::observation_selection().

Multiplet scoring: signal_disc_t::score_multiplet().

Pixel scoring: meta_candidate_score(). Note: in the code, "meta-candidate" is synonymous with "pixel".

Multiplet score normalization: multiplet_stats.php.

©2026 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.