Distinguishing Possible Extraterrestrial Signals From Noise and RFI
Eric Person, Dan Werthimer, Jeff Cobb, Matt Lebofsky
Thanks to the processing power provided by millions of SETI@home participants, an incredible number of interesting signals are being detected every day. For example, SETI@home participants detected a total of 478,716 spikes (strong signals) in data collected in a 16-hour period from the Arecibo Radio Telescope on November 13, 2000. This number of spike results is typical for any 16-hour period. Clearly, we expect most of these spikes to have originated either from noise (naturally occurring radio waves) or from Earth (called "radio frequency interference", or RFI), not from an extraterrestrial civilization. But how can we tell the difference? In this newsletter we'll show some common features of the spike signals detected by SETI@home clients and demonstrate a few methods of separating signals created by noise or RFI from those that are interesting (possibly extraterrestrial) in origin.
Let's begin by taking a closer look at the spike data mentioned above. To learn how these spikes were distributed, we plotted each spike at the time and frequency at which it was detected. Figure 1 shows the detected spikes, with frequency (in units of MHz) along the horizontal (x) axis and time (in seconds from the beginning of the measured 16-hour period) along the vertical (y) axis. This type of plot is known as a "waterfall", since its appearance typically resembles a sheet of dripping water.
Also notice that the signal strength (or "power") of each spike is color-coded.
Reducing Waterfall Density: Viewing Spikes at a Specific FFT Length
There are so many signals displayed in Figure 1 that the plot is too crowded to reveal any patterns in the data. However, if we select a subset of the data, such as only those spikes detected from an FFT length of 16k, the plot becomes less crowded and patterns begin to emerge.
Figure 2 shows only those spikes detected from an FFT length of 16k (a total of 17,464 spikes). Note the vertical lines at 1419 and 1421 MHz. The signals detected at these two frequencies are not from extraterrestrials; rather, they are "test signals" (also called "birdies") that we inject into the telescope receiver to make sure that the instrumentation and software are working properly. Also notice that at an FFT length of 16k, we're mostly detecting signals with strengths less than 100.
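The selection described above can be sketched in a few lines of Python. This is only an illustration, not the code used to produce Figure 2: the Spike record, its field names, the reading of "16k" as 16 x 1024 points, and the frequency tolerance used to recognize a test signal are all assumptions made for the example.

```python
from typing import NamedTuple

K = 1024  # assumption: "16k" means 16 * 1024 points


class Spike(NamedTuple):
    """Hypothetical record for one reported spike."""
    freq_mhz: float  # frequency at which the spike was detected
    time_s: float    # seconds from the start of the observing period
    power: float     # signal strength
    fft_len: int     # FFT length used by the analysis


# Test-signal frequencies named in the text; the tolerance is an assumption.
BIRDIE_FREQS_MHZ = (1419.0, 1421.0)
BIRDIE_TOLERANCE_MHZ = 0.01


def select_fft_length(spikes, fft_len):
    """Keep only the spikes found by analyses at one FFT length."""
    return [s for s in spikes if s.fft_len == fft_len]


def is_birdie(spike):
    """True if the spike sits at one of the injected test-signal frequencies."""
    return any(abs(spike.freq_mhz - f) <= BIRDIE_TOLERANCE_MHZ
               for f in BIRDIE_FREQS_MHZ)


def remove_birdies(spikes):
    """Drop spikes that coincide with the injected test signals."""
    return [s for s in spikes if not is_birdie(s)]
```

Calling select_fft_length(spikes, 16 * K) gives the kind of subset plotted in Figure 2, and remove_birdies then strips the two vertical lines.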
The Power Distribution of Spikes
Let's take a closer look at the signal strengths of the spikes. Figure 3 is a histogram showing the number of spikes detected at each power. Notice that the vertical (y) axis has a logarithmic scale, where each major tick mark represents a quantity 10 times that of the major mark beneath it. Logarithmic scales are useful for displaying data with very wide ranges, such as the case here where the number of spikes at a given power ranges anywhere from 0 to 100,000. Also notice that the upper bound of the horizontal (x) axis is 1,000. We saw from the blue dots appearing in Figure 1 that spike powers can have magnitudes well over 100,000. In fact, there is no upper limit to the power any given spike can have; there are spikes in this sample with powers extending beyond 13 decimal places. This power range is far too large for even a logarithmic scale to handle: all of the low-power spikes would be bunched up on the left, making it difficult to discern any patterns. Since the vast majority of spikes have powers less than 200, we restrict the plot to powers less than 1,000 so that we can view the distribution of these spikes more clearly.
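For readers who want to build a similar histogram from their own list of spike powers, here is a minimal sketch. The bin width of 10 is an arbitrary choice for the example (the newsletter does not state the bin width used in Figure 3), and the function names are hypothetical.

```python
import math
from collections import Counter


def power_histogram(powers, bin_width=10.0, max_power=1000.0):
    """Count spikes per power bin, ignoring powers at or above max_power.

    Returns a dict mapping each bin's lower edge to the number of spikes in it.
    """
    counts = Counter()
    for p in powers:
        if 0 <= p < max_power:
            counts[int(p // bin_width)] += 1
    return {b * bin_width: n for b, n in sorted(counts.items())}


def log10_counts(hist):
    """Convert counts to log10(count), as a logarithmic y axis would show them.

    Empty bins never appear in hist, so log10(0) cannot occur.
    """
    return {edge: math.log10(n) for edge, n in hist.items()}
```

Restricting the range before binning is what keeps the handful of extremely strong spikes from stretching the x axis and squeezing everything else against the left edge.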
As you can see, Figure 3 has a very interesting pattern. The peaks on the left of the graph, the largest of which are located at power values 44, 88, and 176, come from spikes detected by analyses using FFT lengths of 32k, 64k, and 128k, respectively. As the FFT length increases, the power threshold for signal detection is set higher; this increased threshold compensates for the fact that power values are amplified for analyses at long FFTs. So, analyses performed at an FFT length of 128k won't detect signals weaker than 176, etc. Also, analyses using longer FFTs are better at detecting narrowband spikes, and so you see high peaks in the graph where each analysis at a particular FFT length "kicks in". The hump on the right of the graph, peaking at a power of about 700, is from the strong test signals we inject (the same test signals visible in Figure 2). These test signals produced a hump in Figure 3 rather than a sharp peak because they can vary in terms of their relative power. (Their average is 700, but at any given time an individual test signal can be weaker or stronger than 700.) If we remove these "birdies" from the data, the hump disappears, as shown in Figure 4.
Power Distribution at FFT Length 128k
Let's examine a subset of the power distribution, taking only spikes detected from analyses using an FFT length of 128k, with test signals removed. Figure 5 shows the distribution of these signal strengths. As mentioned earlier, a SETI@home analysis using an FFT length of 128k should only report spikes whose power is greater than 176. Note the small hump on the left of the graph, centering at a power of around 95. We have no explanation for these spikes yet.
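One way to pick out unexpected spikes like those in the small hump is to compare each spike's power with the reporting threshold for its FFT length. The sketch below uses the value of 176 stated for 128k; the values of 44 and 88 for 32k and 64k are inferred from the peak positions described earlier and should be treated as assumptions, as should the reading of "k" as 1024 points.

```python
K = 1024  # assumption: "k" means 1024 points

# Reporting thresholds by FFT length. Only the 128k value is stated outright
# in the text; the 32k and 64k values are inferred from the histogram peaks.
REPORT_THRESHOLD = {32 * K: 44.0, 64 * K: 88.0, 128 * K: 176.0}


def below_threshold(spikes):
    """Return the (fft_len, power) pairs that fall below their threshold.

    spikes is an iterable of (fft_len, power) pairs. FFT lengths without a
    known threshold are skipped rather than guessed at.
    """
    flagged = []
    for fft_len, power in spikes:
        threshold = REPORT_THRESHOLD.get(fft_len)
        if threshold is not None and power < threshold:
            flagged.append((fft_len, power))
    return flagged
```

A spike reported at 128k with a power near 95 would be flagged by this check, which makes such anomalies easy to set aside for closer inspection.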
What Kind of Distribution Would One Expect From Noise?
In Figure 5 above, most of the spikes are detected just above the threshold of 176; fewer and fewer signals are detected at higher powers. It turns out that this exponential drop-off follows the same pattern as a chi-square distribution with two degrees of freedom. Interestingly, the power distribution one would expect for pure noise also follows this same pattern. Hence, the vast majority of signals that follow this pattern can be attributed to noise. The remaining signals that don't follow the noise pattern (such as the excess of signals with power ranging from 225 to 500) are mostly (if not all) due to radio frequency interference. Luckily, the number of these RFI signals is very low (about 1% of signals detected).
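A chi-square distribution with two degrees of freedom has a purely exponential tail, so among noise spikes above a threshold T the fraction exceeding a power p is exp(-(p - T) / s) for some scale s. The sketch below estimates s from the data and compares the observed count above a chosen power with the count the noise model predicts. It is an illustration only: the newsletter does not give the scale of its power units, so s is fitted here from the median excess over the threshold (an estimator chosen because a few strong RFI spikes barely move it), and all function names are hypothetical.

```python
import math
import statistics


def fit_noise_scale(powers, threshold):
    """Estimate the exponential scale s from spikes at or above threshold.

    For an exponential tail, the median excess over the threshold is s * ln 2.
    """
    excess = [p - threshold for p in powers if p >= threshold]
    return statistics.median(excess) / math.log(2)


def expected_fraction_above(power, threshold, scale):
    """Fraction of above-threshold noise spikes expected to exceed power."""
    return math.exp(-(power - threshold) / scale)


def excess_count(powers, lo, threshold, scale):
    """Observed minus expected number of spikes with power >= lo.

    A clearly positive result suggests something other than noise.
    """
    above = [p for p in powers if p >= threshold]
    observed = sum(1 for p in above if p >= lo)
    expected = len(above) * expected_fraction_above(lo, threshold, scale)
    return observed - expected
```

With the threshold set to 176 and lo set to 225, a large positive excess_count would correspond to the surplus of signals the text attributes to RFI.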
Of course, extraterrestrial signals might be embedded in the noise pattern somewhere or in the range we attribute to RFI. Further analyses are being performed to determine which spikes (if any) occur consistently from specific locations in the sky. A spike that occurs repeatedly from the direction of a particular star, for example, would be a candidate for extraterrestrial origin. In this way we hope to discriminate signals caused by extraterrestrial civilizations from signals caused by noise, events on Earth, satellites, or natural astronomical events.
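The idea of looking for repeats from the same place in the sky can be sketched as a simple grouping step. This is not the SETI@home analysis itself: the input format (right ascension, declination, and an identifier for each observing pass), the 0.1-degree cell size, and the function names are all assumptions made for the example.

```python
import math
from collections import defaultdict


def sky_cell(ra_deg, dec_deg, cell_deg=0.1):
    """Map a sky position to a coarse grid cell (hypothetical gridding)."""
    return (math.floor(ra_deg / cell_deg), math.floor(dec_deg / cell_deg))


def repeat_candidates(spikes, min_passes=2, cell_deg=0.1):
    """Find sky cells where spikes were seen on several separate passes.

    spikes is an iterable of (ra_deg, dec_deg, pass_id) tuples. Returns a dict
    mapping each qualifying cell to its number of distinct passes.
    """
    passes_by_cell = defaultdict(set)
    for ra_deg, dec_deg, pass_id in spikes:
        passes_by_cell[sky_cell(ra_deg, dec_deg, cell_deg)].add(pass_id)
    return {cell: len(passes)
            for cell, passes in passes_by_cell.items()
            if len(passes) >= min_passes}
```

Counting distinct passes rather than raw spikes matters: many spikes during a single pass are more likely to be a burst of interference than a persistent source. A fixed grid can also split a source that sits on a cell boundary, so a fuller treatment would compare angular separations directly.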
Of the 478,716 spikes we addressed in this newsletter, about 3% are actually test signals ("birdies") that we inject into the data, and about 96% follow a pattern attributable to noise. We currently attribute the final 1% of signals to RFI and technical anomalies (the small hump in Figure 5 may turn out to be one such anomaly). More sophisticated analyses are underway to determine which of these signals are arriving consistently from specific locations in the sky, a characteristic that might indicate extraterrestrial communication.
©2021 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.