## Finding persistent non-barycentric signals (work in progress)


David Anderson
Volunteer moderator
Project developer
Project scientist

Joined: 13 Feb 99
Posts: 173
Credit: 502,653
RAC: 0
Message 2060391 - Posted: 1 Nov 2020, 8:08:53 UTC

SETI@home is supposed to find persistent non-barycentric signals. We currently do a bad job of this:
• The high-scoring non-bary multiplets look like noise; they don't resemble the sort of signals we're looking for.
• We create non-bary birdies, with high power, but we're not finding them. The multiplets containing their detections contain lots of other detections as well, and their scores don't put them anywhere near the top 1000.

Our current algorithm for finding non-bary multiplets goes like this:
• For a given pixel, assemble the detections in its "pixel disc".
• Move a "sliding window" 200 KHz wide across the detections (200 KHz was assumed to be the largest variation for the range of planetary parameters we wanted to consider).
• In a given 200 KHz band, divide the detections by time into 0.1-day intervals.
• In a given interval, find the "clump" of width at most 200 Hz for which the sum of detection powers is greatest.
• Toss out clumps whose frequency is too far off from nearby clumps, based on planetary parameter assumptions. I call this "long-term frequency consistency".
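The steps above can be sketched roughly as follows. This is a minimal illustration, not the Nebula code: the `(time_days, freq_hz, power)` tuple layout, the function names, and the greedy form of the consistency check (with its `max_step_hz` parameter) are all assumptions made for the example. Only the 200 KHz band, the 0.1-day interval, and the 200 Hz clump width come from the description.

```python
from collections import defaultdict

BAND_HZ = 200_000.0    # sliding-window width (from the post)
INTERVAL_DAYS = 0.1    # time-bin length (from the post)
CLUMP_HZ = 200.0       # maximum clump width (from the post)


def best_clump(dets, width=CLUMP_HZ):
    """Return (power_sum, mean_freq) of the strongest clump in one interval.

    dets is a list of (time_days, freq_hz, power) tuples (hypothetical layout).
    A clump is any set of detections spanning at most `width` Hz.
    """
    dets = sorted(dets, key=lambda d: d[1])
    best = (0.0, None)
    lo = 0
    total = 0.0
    for hi in range(len(dets)):
        total += dets[hi][2]
        # Shrink the window from the left until it fits within `width`.
        while dets[hi][1] - dets[lo][1] > width:
            total -= dets[lo][2]
            lo += 1
        if total > best[0]:
            freqs = [d[1] for d in dets[lo:hi + 1]]
            best = (total, sum(freqs) / len(freqs))
    return best


def clumps_in_band(detections, band_start):
    """Bin the detections of one 200 KHz band by time; keep the best clump per bin."""
    by_interval = defaultdict(list)
    for t, f, p in detections:
        if band_start <= f < band_start + BAND_HZ:
            by_interval[int(t // INTERVAL_DAYS)].append((t, f, p))
    return {k: best_clump(v) for k, v in sorted(by_interval.items())}


def enforce_consistency(clumps, max_step_hz):
    """Greedy stand-in for 'long-term frequency consistency' (an assumption):
    drop any clump more than max_step_hz away from the last clump kept."""
    kept = {}
    last_freq = None
    for k, (power, freq) in clumps.items():
        if last_freq is None or abs(freq - last_freq) <= max_step_hz:
            kept[k] = (power, freq)
            last_freq = freq
    return kept


detections = [
    (0.05, 1000.0, 5.0),
    (0.05, 1100.0, 4.0),
    (0.05, 5000.0, 6.0),
    (0.15, 1050.0, 3.0),
]
clumps = clumps_in_band(detections, band_start=0.0)
```

Note that `best_clump` always returns something as long as the interval contains any detection at all, so the procedure yields a "clump" per interval even when the input is pure noise.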

This is what I'd call a "wishful thinking" algorithm. I developed it based on a mental model of what the data looked like, rather than on actual data.

A while back I added web features for plotting (in time/freq space) the detections making up non-bary multiplets. When I got around to actually looking at these plots, I realized that the algorithm wasn't working at all. The multiplets jumped all over the place in frequency, way too fast to be from any plausible transmitter, and with none of the sinusoidal structure we'd expect. My brilliant algorithm was outputting garbage.

Aside: in developing Nebula I wrote algorithms first, then created data visualizations. In retrospect, I should have done it the other way around.

Anyway, let's think about what a non-bary multiplet should look like. The orbital shift would be sinusoidal (if the orbit is circular; a bit different if it's eccentric, but we'll ignore that for now). The rotational shift would also be sinusoidal, most likely with a smaller amplitude and higher frequency. The shift depends on the nature of the star/planet system; the overall width can be pretty big, like hundreds of KHz. We'd expect to see detections (e.g. spikes) scattered along this frequency/time curve.
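As a sketch of that model, the expected frequency offset can be written as the sum of two sinusoids, and its time derivative gives the drift (chirp) rate a detection on the curve would have. The class and function names are hypothetical, and the example amplitudes and periods are illustrative, roughly Earth-like values at 1.42 GHz, not project parameters.

```python
import math
from dataclasses import dataclass


@dataclass
class Sinusoid:
    """One Doppler component: circular orbit or planetary rotation."""
    amplitude_hz: float
    period_s: float
    phase_rad: float = 0.0

    def shift(self, t):
        """Frequency offset in Hz at time t (seconds)."""
        w = 2 * math.pi / self.period_s
        return self.amplitude_hz * math.sin(w * t + self.phase_rad)

    def chirp(self, t):
        """Time derivative of the offset, in Hz/s."""
        w = 2 * math.pi / self.period_s
        return self.amplitude_hz * w * math.cos(w * t + self.phase_rad)


def doppler_shift(t, orbit, rotation):
    """Total offset: orbital term plus rotational term."""
    return orbit.shift(t) + rotation.shift(t)


def chirp_rate(t, orbit, rotation):
    """Total drift rate implied by the same parameters."""
    return orbit.chirp(t) + rotation.chirp(t)


# Illustrative values only: a one-year orbit and a one-day rotation.
orbit = Sinusoid(amplitude_hz=141e3, period_s=365.25 * 86400)
rotation = Sinusoid(amplitude_hz=2.2e3, period_s=86400, phase_rad=1.0)
offset_hz = doppler_shift(12345.0, orbit, rotation)
drift_hz_per_s = chirp_rate(12345.0, orbit, rotation)
```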

So when I saw the garbage multiplets my algorithm was finding, my first instinct was to think about curve-fitting: take the set of detections and fit (in the least-squares sense) a sine wave (or a sum of 2 sine waves, if you want to include rotation) to them. I spent some time looking at libraries that did this: GSL (which seemed impossibly complicated), NLopt, and ALGLIB. I wrote programs to test these with actual sinusoidal data, with and without noise.

And with sporadic (but regular) data:

This is a nice idea but there are problems:
• In my tests, fitting a single sine wave to synthetic data, the curve-fitting functions converged to the right parameters only if the initial guess was close to the answer. Otherwise they converged to a local minimum:

And the presence of random noise pulls the fitted curve toward the middle:

• My tests used data spaced regularly in time. In practice, sky locations are observed only sporadically during our 20-year history. Most observations are short - a few seconds, as one of the telescope beams passes over the sky location. We might have 10 observations in an hour, then a gap of a year or two.
• The range of planetary parameters we were considering included Earth-like systems, i.e. F and G-type stars, where the orbital periods are on the order of a year, but also M type stars where the periods can be much shorter. In this case, with a few observations sprinkled over 20 years, there could be an almost unlimited number of parameters that would fit even ideal data.
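The local-minimum problem in the first bullet can be illustrated with a small sketch. It is a simplification, not the tests described above: amplitude and phase are held at their true values and only the trial frequency varies, on noise-free, regularly spaced synthetic data. Even so, the least-squares cost has many local minima, so a descent-based fitter started away from the true frequency can stall in one of them.

```python
import math

TRUE_FREQ = 1.0  # cycles per unit time (synthetic, arbitrary units)

# Regularly spaced, noise-free samples of a unit-amplitude sine wave.
times = [0.1 * i for i in range(101)]
values = [math.sin(2 * math.pi * TRUE_FREQ * t) for t in times]


def cost(freq):
    """Sum of squared residuals for a trial frequency."""
    return sum((math.sin(2 * math.pi * freq * t) - v) ** 2
               for t, v in zip(times, values))


# Scan trial frequencies from 0.2 to 3.0 and look for local minima of the cost.
trial = [0.2 + 0.005 * i for i in range(561)]
costs = [cost(f) for f in trial]
minima = [trial[i] for i in range(1, len(trial) - 1)
          if costs[i] < costs[i - 1] and costs[i] < costs[i + 1]]

# The global minimum sits at the true frequency; the rest are traps.
best = min(zip(costs, trial))[1]
```

Printing `minima` shows the spurious minima spread across the whole scanned range, with the global minimum only at `TRUE_FREQ`.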

At this point I did what I should have done in the first place, and added a web feature for showing all the detections in the pixel disc in the 200 KHz band - i.e. the input to the non-bary multiplet finding algorithm. This showed that, in each observation interval, there were tons of detections randomly distributed across the entire frequency band. Here's the data for the multiplet shown above:

There were no well-defined clumps. The algorithm was finding essentially random clumps, tossing out a few to enforce the long-term consistency constraint, and calling that a multiplet. And it was clear that fitting sine waves to this data would be meaningless.

It was clear that we had to somehow "thin out" the data. Eric and I decided to do this by restricting our search to signals from F and G type stars. Transmitters in this case would be accelerated relatively slowly, so the chirp rate of the resulting detections would be small (like in the range 0 to 1 Hz/sec; SETI@home searches all the way out to +/- 100 Hz/sec).
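As a rough check on that range, the drift rate from a circular motion is the observing frequency times the centripetal acceleration divided by the speed of light. The sketch below plugs in Earth's orbit and equatorial rotation at about 1.42 GHz and then applies the thinning as a filter. The dict layout with a `"chirp"` key is hypothetical, and treating the 1 Hz/sec limit as a bound on the absolute chirp rate is an assumption.

```python
C_M_PER_S = 299_792_458.0   # speed of light
OBS_FREQ_HZ = 1.42e9        # approximate observing frequency (hydrogen line)


def max_chirp(speed_m_s, radius_m, freq_hz=OBS_FREQ_HZ):
    """Peak drift rate (Hz/s) for circular motion: f * (v^2 / r) / c."""
    return freq_hz * speed_m_s ** 2 / radius_m / C_M_PER_S


def thin_by_chirp(detections, max_abs_chirp=1.0):
    """Keep detections whose |chirp rate| is at most max_abs_chirp Hz/s."""
    return [d for d in detections if abs(d["chirp"]) <= max_abs_chirp]


# Earth's orbit: about 29.78 km/s at 1 AU.
orbital_chirp = max_chirp(29_780.0, 1.496e11)
# Earth's rotation at the equator: about 465 m/s at 6378 km.
rotational_chirp = max_chirp(465.1, 6.378e6)

detections = [
    {"freq": 1.0e3, "chirp": 0.2},
    {"freq": 2.0e3, "chirp": -35.0},
    {"freq": 3.0e3, "chirp": 0.9},
    {"freq": 4.0e3, "chirp": 12.5},
]
thinned = thin_by_chirp(detections)
```

For these Earth values the orbital term comes out near 0.03 Hz/sec and the rotational term near 0.16 Hz/sec, both comfortably inside the 0 to 1 Hz/sec range.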

Here's the above pixel disc with chirp-rate thinning:

This F/G star restriction also means that the "long-term consistency constraint" can be quite a bit tighter: the signal can't change frequency that fast over time.

So we're in the middle of changing the way we generate non-bary birdies, and the way we find non-bary multiplets, to reflect these restrictions. We'll see how things look after that. My view is: if we reach the point of being able to easily see birdie signals, and their sinusoidal structure, in the pixel-disc plots, then we can return to the task of fitting curves to them.

Notes on curve fitting:
• a given set of planetary motion parameters determines not just frequency as a function of time, but also chirp rate. So we can include chirp rate in the least-squares fit; this may improve things. Or it may not.
• To deal with the effects of noise on fitting (see above) I floated the idea of doing a fit, discarding everything outside a moderate band around the curve, doing another fit, and so on.
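The fit, discard, refit idea in the second note might look like the sketch below. It makes a strong simplifying assumption that is not part of the proposal: the period is taken as known, which turns the sine fit into a linear least-squares problem (coefficients of sin, cos, and a constant) and sidesteps the local-minimum issue. The band width and the synthetic data are hypothetical.

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x


def fit_sine(points, period):
    """Least-squares fit of y = a*sin(wt) + b*cos(wt) + c via normal equations."""
    w = 2 * math.pi / period
    rows = [(math.sin(w * t), math.cos(w * t), 1.0) for t, _ in points]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * y for r, (_, y) in zip(rows, points)) for i in range(3)]
    return solve3(A, b)


def predict(coef, period, t):
    w = 2 * math.pi / period
    return coef[0] * math.sin(w * t) + coef[1] * math.cos(w * t) + coef[2]


def iterative_fit(points, period, band, max_rounds=10):
    """Fit, discard points more than `band` from the curve, and refit."""
    kept = list(points)
    for _ in range(max_rounds):
        coef = fit_sine(kept, period)
        new_kept = [(t, y) for t, y in kept
                    if abs(y - predict(coef, period, t)) <= band]
        if len(new_kept) < 3 or new_kept == kept:
            break  # too few points left, or nothing more to discard
        kept = new_kept
    return fit_sine(kept, period), kept


# Synthetic data: 60 points on the curve plus 6 far-off outliers.
PERIOD = 60.0
w = 2 * math.pi / PERIOD
signal = [(float(t), 1000.0 * math.sin(w * t) + 200.0) for t in range(60)]
outliers = [(float(t), 5000.0) for t in range(5, 60, 10)]
points = signal + outliers

single = fit_sine(points, PERIOD)
coef, kept = iterative_fit(points, PERIOD, band=1000.0)
```

On this toy data the single fit is pulled toward the outliers (sine coefficient around 909 instead of 1000, offset around 636 instead of 200), while the second round recovers the true curve once the outliers fall outside the band. Whether the same holds with dense, uniformly scattered noise is a separate question that this sketch does not address.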

It remains to be seen whether SETI@home's multiple observations give us greater sensitivity to non-bary signals. It seems like they should: signals presumably persist, but noise doesn't. The problem is that, for non-bary signals, the detected frequency can change widely over time. But we have a bunch of ideas to explore, and maybe some of them will pay off.
ID: 2060391 ·
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22285
Credit: 416,307,556
RAC: 380
Message 2060429 - Posted: 1 Nov 2020, 18:11:34 UTC

Well, you never said it was going to be a simple, smooth ride to plough through all the first part filtered data. It now looks as if you are getting into the tiddly step forward each time instead of the one forward, two back of your initial attempts.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2060429 ·
Bill F
Volunteer tester

Joined: 18 May 99
Posts: 42
Credit: 5,653,653
RAC: 2
Message 2060450 - Posted: 2 Nov 2020, 1:43:12 UTC - in response to Message 2060429.

Thank you for the update... your volunteers are out here waiting to hear from you and these updates are great.

Like the Arecibo Observatory ... we are "all ears"

Bill F
ID: 2060450 ·
Rich

Joined: 25 Apr 08
Posts: 15
Credit: 3,727,057
RAC: 65
Message 2060650 - Posted: 5 Nov 2020, 12:01:28 UTC - in response to Message 2060391.

Thanks very much for the update – that’s really interesting.

And... I have nothing useful to offer, so I won’t interrupt further. Please keep on sharing, when you can.
ID: 2060650 ·
Steven

Joined: 5 Aug 18
Posts: 1
Credit: 1,600,364
RAC: 1
Message 2061118 - Posted: 11 Nov 2020, 0:29:47 UTC

Could you create some kind of dataset for an AI to chew on and see what it could spit out (I am not fully caught up on that stuff)?
ID: 2061118 ·

Joined: 27 Feb 19
Posts: 3
Credit: 3,026,745
RAC: 7
Message 2061688 - Posted: 18 Nov 2020, 19:15:12 UTC - in response to Message 2060391.

If/when you devise a suitable algorithm, will we be running 'old' datasets again?
ID: 2061688 ·
Peter Hucker
Volunteer tester

Joined: 3 Apr 99
Posts: 39
Credit: 7,555,481
RAC: 0
Message 2062007 - Posted: 22 Nov 2020, 16:11:51 UTC - in response to Message 2060391.

7 PCs at the ready when you are.
ID: 2062007 ·
rob smith
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22285
Credit: 416,307,556
RAC: 380
Message 2064559 - Posted: 29 Dec 2020, 16:51:48 UTC - in response to Message 2062007.

Nebula will never be a "BOINC Style" project - which you would have learned if you read the introduction written by D.Anderson when he started this phase of SETI.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2064559 ·
