We at SETI@home are fully vaxxed now, but we're still meeting on Zoom. Progress has been good. We're finally generating spikes for most birdies, and "finding" their multiplets. We're trying lower-power birdies to explore sensitivity. Here's what we've been doing/thinking recently.
If a radio SETI project (like SETI@home) detects something resembling an ET signal, it could also be RFI, noise, a satellite transmission, or an artifact of the project's hardware or software. One way to rule these out is to reobserve that sky location, preferably using a different telescope and data analysis system, and see if you detect the same thing. (This assumes, of course, that ET's transmitter is on for long periods.)
SETI@home, because it looks at most sky locations multiple times, has a limited form of reobservation built in. We use this (in our multiplet score function) to weed out one-time anomalies. But, to have confidence that a candidate is ET, we still need to reobserve separately. We started making plans to do this.
The system used for reobservation must be at least as sensitive as the original. Arecibo would have been fine, but alas. Another option is the FAST radio telescope in China, and we're applying for observing time there.
Drifting RFI algorithm
The drifting RFI algorithm had been erroneously removing lots of birdie spikes (~30% of them). We fixed this and made some other improvements to the algorithm:
I got curious about how much total Arecibo observing time we (commensally) used. It turns out to be less than I thought - about 400 days of data (per beam) in 12 calendar years, or about 9%. This is because the ALFA receiver, which we used, is just one of several Arecibo receivers, and it was only installed part of the time. Details are here and here.
A while back I discussed "bandwidth-dependent sky coverage" - the fact that for long FFT lengths (like 128K samples, or ~13 seconds) we don't have sufficiently long observations in some pixels, and therefore we have effectively covered less sky at those bandwidths.
I was calculating this in a "pessimistic" way: counting a pixel only if it was certain to have a sufficiently long interval. That is, you need a 26-second observation to be sure that a 13-second period of unknown start time is contained within it. Dan pointed out that if you have a shorter observation - say 14 seconds - the random 13-second interval MIGHT be contained in it. So I recalculated the bandwidth-dependent sky coverage with this "optimistic" assumption. Indeed, sky coverage is greater - but at 128K FFT length, not by much. Results are here.
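To make the two assumptions concrete, here is a minimal Python sketch. It is not the actual Nebula calculation. It assumes a simple model in which the FFT-length interval starts at an offset drawn uniformly from one FFT length after the observation begins; that model reproduces the 26-second figure for certainty. The function names are made up for illustration.

```python
def pessimistic_covered(obs_len, fft_len):
    """Count the pixel only if an interval of fft_len is certain to fit."""
    return obs_len >= 2 * fft_len


def optimistic_fraction(obs_len, fft_len):
    """Probability that an interval of fft_len fits inside the observation.

    Assumed model (not from the pipeline): the interval starts at an offset
    drawn uniformly from [0, fft_len) after the observation begins, and it
    fits when offset + fft_len <= obs_len.
    """
    if fft_len <= 0:
        raise ValueError("fft_len must be positive")
    # Fraction of allowed offsets, clamped to the range [0, 1].
    return min(1.0, max(0.0, (obs_len - fft_len) / fft_len))
```

Under this assumed model, a 14-second observation counts as 1/13 of a pixel at a 13-second FFT length, whereas the pessimistic rule counts it as zero.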
Frequency range pruning
Our existing RFI algorithms are run, together, on the entire set of detections. We added a new type of RFI rejection which is done later in the pipeline: namely, during multiplet finding. It applies only to spike/gaussian barycentric multiplets. It's based on the fact that the barycentric frequencies of the detections in these multiplets shouldn't vary much, but the topocentric frequencies (i.e. the actual received frequency, not adjusted by Doppler shift from Arecibo's motion) SHOULD vary because of this motion. For RFI (from terrestrial sources) it's the other way around.
We call this "frequency range pruning". We remove groups of detections that fall within a 0.1-day interval and for which B > 2*max(max_bw, D), where B is the RMS variation in barycentric frequency, D is the RMS variation in detection (topocentric) frequency, and max_bw is the maximum bandwidth (FFT frequency resolution) of the detections. This is done at the very end (i.e. after time overlap pruning).
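The rule can be sketched in Python as follows. This is an illustration under stated assumptions, not the Nebula code: the Detection fields, the function names, and the greedy way of forming 0.1-day groups are all hypothetical, and "RMS variation" is taken to mean the population standard deviation about the mean.

```python
from dataclasses import dataclass
from statistics import pstdev


@dataclass
class Detection:
    time_days: float    # detection time, in days
    bary_freq: float    # barycentric frequency, Hz
    detect_freq: float  # topocentric (as-received) frequency, Hz
    bandwidth: float    # FFT frequency resolution, Hz


def looks_like_rfi(group):
    """Apply the B > 2*max(max_bw, D) test to one group of detections."""
    if len(group) < 2:
        return False  # assumption: one detection has no measurable spread
    b = pstdev([d.bary_freq for d in group])        # B in the text
    d_rms = pstdev([d.detect_freq for d in group])  # D in the text
    max_bw = max(d.bandwidth for d in group)
    return b > 2 * max(max_bw, d_rms)


def prune_frequency_range(detections, window_days=0.1):
    """Drop groups that fail the test; grouping strategy is an assumption."""
    kept, group = [], []
    for det in sorted(detections, key=lambda d: d.time_days):
        # Close the current group once it would span more than the window.
        if group and det.time_days - group[0].time_days > window_days:
            if not looks_like_rfi(group):
                kept.extend(group)
            group = []
        group.append(det)
    if group and not looks_like_rfi(group):
        kept.extend(group)
    return kept
```

A group whose received frequency stays fixed while its barycentric frequency wanders is dropped; a group with the opposite behavior is kept.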
I changed score factor normalization so that the 25th percentile goes to -1 and the 75th goes to 1. This puts the mean somewhere around 0, so it's easier to interpret a score.
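One way to get this behavior is a linear rescaling, sketched below in Python. Whether the real normalization is linear is an assumption on my part here, and make_normalizer is a hypothetical name; the sketch only shows a map that satisfies the two percentile conditions.

```python
from statistics import quantiles


def make_normalizer(values):
    """Return a linear map sending the 25th percentile to -1 and the 75th to +1."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q25, _median, q75 = quantiles(values, n=4, method="inclusive")
    center = (q25 + q75) / 2      # this point maps to 0
    half_range = (q75 - q25) / 2  # this distance maps to 1
    if half_range == 0:
        raise ValueError("25th and 75th percentiles coincide")
    return lambda x: (x - center) / half_range
```

With a map like this, a normalized score outside [-1, 1] lies outside the middle half of the distribution, which is what makes the numbers easier to read at a glance.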
Finally, a couple of programming things. I fixed an extremely rare crashing bug in RFI removal. I was getting a "reference" to the first item in a deque, then removing the item, then using the reference. This is a no-no. Once I saw this in the debugger it was obvious, but it took many hours to find because it happened in 1 process out of 56, about 1 hour into a 90-minute computation. This made me briefly wish I was using a language that prevented this sort of mistake (C++ does not).
Also, I discovered that the program that computes RFI zones was crashing, but I hadn't noticed it because the way I run the Nebula pipeline (using the Unix "make" program) wasn't checking the exit code. So we were using zone bitmaps from 6 months ago, though I don't think this made any difference. In any case, the makefile now checks all exit codes.
©2021 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.