## Improved multiplet-finding

Message boards : Nebula : Improved multiplet-finding
Message board moderation

AuthorMessage
David Anderson
Volunteer moderator
Project developer
Project scientist

Joined: 13 Feb 99
Posts: 173
Credit: 502,653
RAC: 0
Message 1956619 - Posted: 21 Sep 2018, 8:19:33 UTC

I recently thought of a way to improve our multiplet-finding algorithm. In a nutshell, the algorithm works on one pixel at a time. It collects the signals whose sky locations are in a small disk centered at that pixel. It sorts these signals by barycentric frequency, then scans the signals with a "sliding window" whose frequency width depends on whether we're looking for barycentric and non-barycentric multiplets. When it finds a multiplet, it records it and moves the window past it.

What I realized is that when we find a multiplet whose signals are in the middle or end of the window, it's possible that there are signals beyond the end of the window which, if added to the multiplet, would increase its score.

These additional signals might be detected as another multiplet. But then we'd have two multiplets with lower scores, rather than one big multiplet with a higher score. If there's an ET signal in our data it's critical that the corresponding multiplet have as high a score as possible, so that it stands out from the noise multiplets. Similarly, we want birdies to be detected with as high scores as possible, to demonstrate the full sensitivity of our system.

I came up with a way to address the problem. The idea is that when you find a multiplet M, instead of outputting it immediately, you "stash" it, slide the window so that it starts at the first signal in M, and look for a multiplet in the resulting window. If you find one, and either a) it is entirely after M, or b) its score is lower than that of M, then you output M. Otherwise you stash the new multiplet and continue.

That's the basic idea. In full detail (see below) it's quite a bit more complex, and figuring it out required extra coffee, extra concentration, and a lot of trial and error. BTW, there may be errors in my solution, and there may be a better approach entirely. If so, please let me know!

I should point out that the multiplet-finding algorithm is "heuristic". That means it only approximates the right answer,
i.e. the highest-scoring multiplets. Finding the actual right answer, I believe, is "NP-complete", essentially meaning that it would take more than the lifetime of the universe to solve. In any case, the score function is itself a heuristic for estimating the probability of a set of signals occurring in noise.

When I work on things like this I write "pseudocode", which is halfway between English and a program. Here's my final pseudocode for this.
```maintain a variable "have_mp" and mp0
Invariant: if have_mp, then mp0 begins at start of window

Invariant: MPs are disjoint and don't overlap in freq

Assumption: if X is a subset of Y,
then find_multiplet(X) has lower score than find_multiplet(Y)

for signals S (in bary freq order)
while window not empty
if S doesn't exceed window, break

// invariant: current window is maximal (S is beyond it)
// each iteration of while removes at least 1 signal from start of window

if find a multiplet m
if have_mp
have_mp = false
if start(m) > end(mp0)
else if score(mp0) > score(m)
move start of window past mp0
// discard m in this case
continue
// decide if we should stash m
f = freq of first signal in m
if S.freq < f + window size
// S is in the window starting at start of m; extend window
mp0 = m
have_mp = true
move start of window to start(m)
break
else
move start of window past end(m)
else
trim front of window enough to include S at end
break
append S to window

at end
if have_mp
look for multiplet in final window

```
ID: 1956619 ·
Hugo A. Durantini Luca

Joined: 24 Sep 09
Posts: 14
Credit: 7,253,842
RAC: 10
Message 1962577 - Posted: 30 Oct 2018, 13:35:13 UTC

Its a shame what helping with this part of the project looks a bit out of reach for most of the people.

There any chance of someday seeing a Zooniverse version of this like say... Gravity Spy?
ID: 1962577 ·
Mr. Kevvy
Volunteer moderator
Volunteer tester

Joined: 15 May 99
Posts: 3776
Credit: 1,114,826,392
RAC: 3,319
Message 1962579 - Posted: 30 Oct 2018, 13:49:13 UTC - in response to Message 1962577.

There any chance of someday seeing a Zooniverse version of this like say... Gravity Spy?

There was already a project as you describe on Zooniverse: SETILive however it ceased in 2014.
ID: 1962579 ·
[VENETO] boboviz
Volunteer tester

Joined: 5 Oct 99
Posts: 16
Credit: 613,159
RAC: 0
Message 1962846 - Posted: 1 Nov 2018, 16:34:44 UTC

Very interesting!!
ID: 1962846 ·

Message boards : Nebula : Improved multiplet-finding