Near-Time Persistency Checker (NTPCkr) Frequently Asked Questions

(last updated: 20 Aug 2012, 22:03:40 UTC)

Answers to frequently asked questions about the NTPCkr:

Q: What is the NTPCkr?

A: A program that sits at the end of our data pipeline, scans SETI@home results soon after they are validated, and determines whether we've previously seen similar signal types at similar frequencies at that exact point in the sky (i.e. a persistent signal).

Q: Shouldn't it be easy to find strong signals in the results we return to you?

A: Individual signals reported in your SETI@home results are never interesting no matter how strong they are. They only get interesting if they persist over multiple observations (i.e. if we look at the same point in the sky, say, a year later do we see the same exact type of signal at the same frequency?). This is why we need a tool to hunt for persistent signals.
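
To make "persistent" concrete, here is a minimal sketch of the idea, not the actual NTPCkr code. It groups signals by sky pixel, signal type, and frequency bin, and keeps only the groups seen in more than one observation. The field names, the 1 Hz bin width, and the sample data are all assumptions made up for illustration.

```python
from collections import defaultdict

def find_persistent(signals, freq_bin_hz=1.0, min_observations=2):
    """Group signals by (pixel, type, frequency bin) and keep the groups
    that appear in at least `min_observations` distinct observations."""
    groups = defaultdict(set)
    for s in signals:
        # Hypothetical binning: signals within the same bin count as "similar".
        key = (s["pixel"], s["type"], round(s["freq_hz"] / freq_bin_hz))
        groups[key].add(s["observation"])
    return {k: sorted(v) for k, v in groups.items() if len(v) >= min_observations}

# Made-up signals: only the two spikes at pixel 101 repeat across observations.
signals = [
    {"pixel": 101, "type": "spike", "freq_hz": 1420000000.2, "observation": "2008-11"},
    {"pixel": 101, "type": "spike", "freq_hz": 1420000000.4, "observation": "2009-11"},
    {"pixel": 101, "type": "gaussian", "freq_hz": 1420000500.0, "observation": "2009-11"},
    {"pixel": 202, "type": "spike", "freq_hz": 1420000000.3, "observation": "2008-11"},
]
persistent = find_persistent(signals)
print(persistent)
```

Fixed bins are the simplest choice, but two signals that straddle a bin edge would be missed; a real matcher would likely compare against a frequency tolerance instead.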

Q: Didn't you already have a program that does this?

A: Yes and no. The name of the game in SETI has always been to look for persistent signals. Individual spikes, Gaussians, etc. are uninteresting as they are commonly found due to random noise, or radio frequency interference (RFI). They are far more interesting when the same signals are seen over many observations.

In the past we didn't have the computing power to do this kind of analysis in "real time," i.e. as new data arrive. Instead there was a painful process of accumulating resources to analyze our entire, large data set in one pass - what we called "turning the crank." Finding pairs (or triplets, etc.) of similar signals in a database of billions of signals is quite difficult - imagine playing a game of Concentration with billions of cards. That's why it usually took years between each time we turned the crank, during which our hardware/tools evolved enough to require reinventing the wheel again and again. However, the NTPCkr keeps up with incoming data close to real time, updating our list of most interesting candidates every day (or hour, or minute!).

Q: What do you mean by "candidates"?

A: It sounds exciting, but we always found it difficult to come up with less exciting-sounding yet meaningful terminology in SETI. Think of it this way: there are always lots of candidates for U.S. president. A "candidate" simply means it has non-zero potential for interest.

We actually go further than that: a candidate is simply a point in the sky. We divide our observable space into pixels just like on a computer monitor, each identified with a unique number. Every spot on the sky is a candidate, even if we haven't seen anything there.

For more information about how we pixelize the sky, see here.
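
As a toy illustration of "every spot on the sky gets a unique number," the sketch below lays a plain grid over the sky and numbers the cells row by row. This is not the pixelization scheme the project uses; the grid size and function names are hypothetical, and it assumes RA is given in hours and Dec in degrees.

```python
def toy_pixel_id(ra_hours, dec_deg, n_ra=360, n_dec=180):
    """Map sky coordinates to a cell number on a simple RA/Dec grid."""
    col = int(ra_hours / 24.0 * n_ra) % n_ra
    # Clamp so that Dec = +90 exactly still lands in the top row.
    row = min(int((dec_deg + 90.0) / 180.0 * n_dec), n_dec - 1)
    return row * n_ra + col

def toy_pixel_center(pixel_id, n_ra=360, n_dec=180):
    """Invert toy_pixel_id: return the (RA hours, Dec degrees) of the cell center."""
    row, col = divmod(pixel_id, n_ra)
    ra_hours = (col + 0.5) * 24.0 / n_ra
    dec_deg = (row + 0.5) * 180.0 / n_dec - 90.0
    return ra_hours, dec_deg

pid = toy_pixel_id(19.529297, 21.442044)
center = toy_pixel_center(pid)
print(pid, center)
```

A grid like this has cells of very unequal area near the poles, which is one reason real sky surveys use more careful schemes.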

Q: How are candidates ranked?

A: They are ranked by score. The score is the probability that the exact set of signals we observed at that point in the sky would occur due to random chance. So the lower the score, the better.
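
In code terms, ranking is just an ascending sort on the score. The sketch below uses made-up candidate ids and probabilities. (The sample lines later in this FAQ show negative scores, so the displayed values are presumably on a logarithmic scale; that is an inference, and the sort order would be the same either way.)

```python
# Hypothetical candidates: ids and scores are invented for illustration.
candidates = [
    {"id": 11, "score": 3e-4},
    {"id": 12, "score": 2e-9},
    {"id": 13, "score": 0.5},
]

# Lower score = less likely to be random chance = higher in the list.
ranked = sorted(candidates, key=lambda c: c["score"])
for rank, c in enumerate(ranked, start=1):
    print(rank, c["id"], c["score"])
```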

Q: So what I'm seeing at the top of the NTPCkr page must be E.T., right?

A: These may be our current top candidates, but they are most certainly due to our own earth's radio frequency interference. We're a very noisy planet - in fact there are a half dozen radar signal patterns we commonly see in our data from Arecibo. When a candidate stays near the top of the list, even after our best efforts to minimize the interference, then that's a little more interesting.

But I still wouldn't get too excited. Remember our data set contains billions and billions of signals - and so the natural random noise out in space will still cause some curious candidates. At that point we'd tag these candidates for reobservation. Only if we see it again during these reobservations will we start raising our eyebrows.

In short, what you see on the NTPCkr page is just the very first step in finding E.T., and it's highly likely that 100% of the candidates that show up on this page are noise or interference.

Q: Why are so many candidates at the same frequency?

A: Remember this is currently a list of "top" candidates ranked by score, and therefore it is highly likely a single short burst of earth radio interference (which we haven't rejected yet) is affecting the scores of many candidates. Once these are determined to be uninteresting, they will be dropped from the top of the list.

Q: Why are so many candidates at the same Declination?

A: Quite often the telescope at Arecibo is "parked" when not in use by other projects. When it's parked it is sitting at a single line of Declination (while the Right Ascension changes due to earth rotation). Since we are collecting data all the time, we get an excess of data from these parked Declinations (like 18 degrees, which is quite common).

Q: Hey! A candidate (or a bunch of candidates) disappeared from the list! I know they were there yesterday - is this some kind of cover up?

A: No - this could be for several reasons. Here are four of them:

1. When a candidate gets a good score, it is automatically tagged for "cleaning." We'll check it for RFI contamination, and re-score it. It is very, very likely RFI contaminated, and so the score will increase (i.e. get worse) and the candidate will drop down in the list.

2. A candidate may have a good score because, say, we observed there 5 times and saw a Gaussian there each time. Then it just so happens the telescope passes over that point in the sky again, but this time saw nothing. The NTPCkr will rescore this candidate, and now that it only saw 5 Gaussians out of 6 observations, the score gets worse and the candidate will drop down in the list.

3. You could imagine that our scoring algorithms are quite complicated, given the varied nature of the signal types, the chaotic nature of our observation schedule, etc. We may find over time we need to adjust our algorithms, and this may cause some previously good candidates to become less good.

4. The NTPCkr will always be a work in progress, and due to testing/debugging we may find reason to scrap our current set of candidates and start from scratch. This isn't destructive: if a candidate is truly interesting it will soon reappear in our new top candidate list.
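
The second reason above (5 Gaussians in 5 observations versus 5 in 6) can be illustrated with a toy binomial model. This is only a sketch: the real scoring is more involved, and the per-observation chance probability `p` below is an invented number.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more chance detections in n independent observations."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p = 0.01  # assumed chance of a noise Gaussian in any one observation

five_of_five = prob_at_least(5, 5, p)  # a Gaussian every time we looked
five_of_six = prob_at_least(5, 6, p)   # one look came up empty

print(five_of_five, five_of_six)
```

With these numbers the probability rises from about 1e-10 to about 6e-10, so the score gets worse after the empty observation even though no detection was taken away.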

Q: What does this candidate line mean?

Example:

#	id	RA	Dec	Spikes	Gaussians	Pulses	Triplets	Stars	Metascore	State	Last Updated
1	14683503	19.529297	21.442044	770 (6/308)	148 (0/0)	273 (0/121)	196 (0/49)	0	-1.280627e+5	0	9 Jun 2009 18:49:42 UTC	skyplot

A: Going column by column...

id: A unique number which identifies this candidate. It also happens to be its "sky pixel" (i.e. with some math you can convert that into sky coordinates). Click on that to get a "pull down" of this candidate's most interesting multiplets.
RA/Dec: Right Ascension and Declination, which are the actual sky coordinates which point to the center of the candidate (remember that a candidate is made up of many signals - these coordinates point to the center of the sky pixel closest to the average of all the individual signal coordinates).
Spikes/Gaussians/Pulses/Triplets: Counts of all these signals, arranged by type, which make up this candidate. In the parentheses are counts of the multiplets, or groups of similar signals that showed up at similar frequencies across multiple observations. The first such number is a count of "barycentric" multiplets (i.e. close in frequency), the second a count of nonbarycentric multiplets (i.e. not as close in frequency). In summation: signal count (barycentric multiplet count/nonbarycentric multiplet count).
Stars: Count of any known stars that are also on this sky pixel (which makes it more interesting).
Metascore: The cumulative score of everything that made up this candidate (the individual signals and multiplets). If we did the math right, this is the probability we'd see this candidate in random noise (so the lower the score, the better).
State: State of RFI (radio frequency interference) rejection/cleaning. Still in progress.
Last Updated: When the pixel was last updated, i.e. last time new signals from that pixel were assimilated into the database. Each time new data come in this candidate has to be rescored and may rise higher (or drop lower) in the list.
skyplot link: Click there to see some fun pictures regarding this candidate's location on the sky.
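
If you copy a candidate line off the page and want to pick it apart, something like the following sketch would do it. It assumes the copied columns are separated by tabs, and the dictionary keys are names chosen here for illustration.

```python
import re

# Matches fields such as "770 (6/308)": signal count (barycentric/nonbarycentric).
COUNT_RE = re.compile(r"^\s*(\d+)\s*\((\d+)/(\d+)\)\s*$")

def parse_count(field):
    """Split 'N (b/nb)' into its three integer counts."""
    m = COUNT_RE.match(field)
    if m is None:
        raise ValueError(f"unexpected count field: {field!r}")
    signals, bary, nonbary = map(int, m.groups())
    return {"signals": signals, "barycentric": bary, "nonbarycentric": nonbary}

# The example line from this FAQ, assumed to be tab-separated.
line = ("1\t14683503\t19.529297\t21.442044\t770 (6/308)\t148 (0/0)\t"
        "273 (0/121)\t196 (0/49)\t0\t-1.280627e+5\t0\t9 Jun 2009 18:49:42 UTC\tskyplot")
fields = line.split("\t")

candidate = {
    "id": int(fields[1]),
    "ra": float(fields[2]),
    "dec": float(fields[3]),
    "spikes": parse_count(fields[4]),
    "gaussians": parse_count(fields[5]),
    "pulses": parse_count(fields[6]),
    "triplets": parse_count(fields[7]),
    "stars": int(fields[8]),
    "metascore": float(fields[9]),
}
print(candidate["spikes"], candidate["metascore"])
```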

Q: What does this multiplet line mean?

Example:

id	type	ang dist	freq	chirp	period	snr	score	# det
700075	spike (b)	0.00383570	5.29402355	0.00000000	0.00000000	0.00000000	-1.486208e+2	3

A: Going column by column...

id: A unique id which identifies this multiplet. Click on that to get a "pull down" of the individual signals that make up this multiplet.
type: Signal type. In the parentheses is "b" for barycentric or "nb" for nonbarycentric. For now, we're only showing barycentric multiplets, as those are far more interesting (and it would take too long to display all the nonbarycentric multiplets).
ang dist: The standard deviation showing the tightness of sky location between all the constituent signals.
freq: The standard deviation showing the tightness of frequency between all the constituent signals.
chirp: The standard deviation showing the tightness of chirp rate (or doppler shift) between all the constituent signals.
period: The standard deviation showing the tightness of period between all the constituent signals. Note that this only pertains to signals with periods (i.e. pulses).
snr: The standard deviation showing the tightness of signal to noise ratio between all the constituent signals. Note that this only pertains to signals with snr's.
score: Score of this individual multiplet. Like the meta-score, this should be the probability we'd see such a candidate in random noise, so the lower the score the better.
# det: Number of detections that make up this multiplet.
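
The "tightness" columns above are standard deviations across the signals in a multiplet. Here is a minimal sketch of that calculation with made-up detections. Whether the page uses the population or the sample standard deviation, and which units it reports, is not stated here, so the choice of `pstdev` and the field names are assumptions.

```python
from statistics import pstdev

def tightness(signals, field):
    """Standard deviation of one field across a multiplet's signals.
    Returns 0.0 when fewer than two signals carry that field."""
    values = [s[field] for s in signals if s.get(field) is not None]
    return pstdev(values) if len(values) >= 2 else 0.0

# Three hypothetical spike detections; spikes have no period, so it is None.
detections = [
    {"det_freq": 1420001134.0, "chirp": 0.0, "period": None},
    {"det_freq": 1420001139.0, "chirp": 0.0, "period": None},
    {"det_freq": 1420001144.0, "chirp": 0.0, "period": None},
]

freq_spread = tightness(detections, "det_freq")
chirp_spread = tightness(detections, "chirp")
period_spread = tightness(detections, "period")
print(freq_spread, chirp_spread, period_spread)
```

The smaller the spread, the more closely the constituent signals agree with each other.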

Q: What does this signal line mean?

Example (for a spike):

id	result	time	peak power	det freq	fft	chirp	rfi c/f		chi sqr	null chi sqr	max power	period	snr	thresh	len_prof	score
1728077228	1947478456	2454784.5199488	50.032177	1420001134.3509	131072	0	0/0										plots

A: First, note that these columns don't always pertain to a given signal type, so many fields may be blank. Going column by column...

id: A unique id which identifies this signal.
result: A unique id which identifies which SETI@home result contained this signal.
time: Exact time when the signal was observed. This is represented as a Julian date.
peak power: The power (strength) of this signal at its highest point.
det freq: The frequency of the signal as it was detected.
fft: Short for Fast Fourier Transform, this represents the length of the transform used to find this signal in our data. Different lengths cause different resolutions in time and frequency.
chirp: The chirp rate, or doppler shift, of this signal.
rfi c/f: RFI (radio frequency interference) checked / RFI found - flags to tell us the state of RFI cleaning (in progress).
sigma / chi sqr / null chi sqr / period / snr / thresh / len_prof: Real nerdy stuff depending on the signal type. In order: the sigma of the signal, its chi squared and null chi squared (determining "goodness of fit" for Gaussians), its period, the signal to noise ratio, threshold, and length of the profile that contains information about the signal's shape.
score: Score for this individual signal. Like all the other scores, lower is better as it is the probability of finding this signal in random noise.
plots: If this link exists, click on it to see "waterfall" plots, i.e. time vs. frequency around the signal. Very useful for finding RFI.
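
The Julian date in the time column can be turned into an ordinary calendar date. The sketch below uses the standard fact that Julian date 2440587.5 corresponds to 1 Jan 1970 00:00 UTC; treating the page's value as UTC is an assumption, and the function name is made up.

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH_JD = 2440587.5  # Julian date of 1970-01-01 00:00 UTC

def julian_to_utc(jd):
    """Convert a Julian date to a timezone-aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(days=jd - UNIX_EPOCH_JD)

# The time value from the example spike line above.
observed = julian_to_utc(2454784.5199488)
print(observed.isoformat())
```

For the example spike this works out to 14 Nov 2008, a little before 00:29 UTC.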

Q: So what are "waterfall plots" exactly?

A: A waterfall plot shows all spike signals observed in a given time window by their frequencies. It's called a "waterfall plot" because it sometimes actually looks like a waterfall.

These plots are useful for quickly identifying certain frequencies that are both strong and constant. In short, if you see anything resembling a vertical line, this is radio frequency interference from earth. We know this because as the earth rotates and the telescope moves, constant frequencies must be local, i.e. from our own planet. Chances are if you look at these waterfall plots, you'll find lots of our "candidates" living on one of these vertical lines.
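
What your eye does when it spots a vertical line can be roughly sketched in code: bin the spikes by frequency, then flag any bin that keeps showing up as time goes by. This is an illustration only, not one of our RFI rejection schemes; the bin sizes, the 80% threshold, and the sample data are all invented.

```python
from collections import defaultdict

def constant_frequency_bins(spikes, freq_bin_hz=10.0, time_bin_s=60.0, min_fraction=0.8):
    """Return the lower edges of frequency bins that contain spikes in at
    least `min_fraction` of all time slices (a 'vertical line')."""
    slices_per_bin = defaultdict(set)
    all_slices = set()
    for t, f in spikes:
        time_slice = int(t // time_bin_s)
        all_slices.add(time_slice)
        slices_per_bin[int(f // freq_bin_hz)].add(time_slice)
    needed = min_fraction * len(all_slices)
    return sorted(b * freq_bin_hz for b, seen in slices_per_bin.items() if len(seen) >= needed)

# Made-up spikes as (seconds, Hz): one steady tone plus a few scattered hits.
spikes = [(60.0 * i, 1420000105.0) for i in range(10)]
spikes += [(30.0, 1420000300.0), (200.0, 1420000555.0), (400.0, 1420000790.0)]

suspect = constant_frequency_bins(spikes)
print(suspect)
```

Here only the bin starting at 1420000100 Hz is flagged, since it has a spike in every one of the ten time slices while the scattered spikes each appear once.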

More information about waterfall plots can be found in this old science newsletter.

Q: What are the long range plans for getting rid of all this RFI (radio frequency interference)?

A: We have several schemes, most of which we used in earlier SETI projects, for automatically detecting and rejecting obvious RFI. However, we are currently devising new (and hopefully better) methods, including using many human eyeballs to identify questionable candidates (i.e. ones that live on or near interference). These eyeballs could include yours, if you're interested.

Q: Hey - I zoomed in on the Google skymap involving this candidate and see a big green diamond. What's that?!

A: That is showing you the border of the "pixel" which makes up this particular candidate. Code to generate those borders was donated by Tiaan Geldenhuys. Thanks Tiaan!

Q: I'm still confused. Is there a glossary of basic terms or concepts somewhere?

A: Yes, over here.

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.