Improved multiplet-finding

Message boards : Nebula : Improved multiplet-finding
Message board moderation

To post messages, you must log in.

AuthorMessage
Profile David Anderson
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar

Send message
Joined: 13 Feb 99
Posts: 110
Credit: 465,399
RAC: 583
Message 1956619 - Posted: 21 Sep 2018, 8:19:33 UTC

I recently thought of a way to improve our multiplet-finding algorithm. In a nutshell, the algorithm works on one pixel at a time. It collects the signals whose sky locations are in a small disk centered at that pixel. It sorts these signals by barycentric frequency, then scans the signals with a "sliding window" whose frequency width depends on whether we're looking for barycentric and non-barycentric multiplets. When it finds a multiplet, it records it and moves the window past it.

What I realized is that when we find a multiplet whose signals are in the middle or end of the window, it's possible that there are signals beyond the end of the window which, if added to the multiplet, would increase its score.

These additional signals might be detected as another multiplet. But then we'd have two multiplets with lower scores, rather than one big multiplet with a higher score. If there's an ET signal in our data it's critical that the corresponding multiplet have as high a score as possible, so that it stands out from the noise multiplets. Similarly, we want birdies to be detected with as high scores as possible, to demonstrate the full sensitivity of our system.

I came up with a way to address the problem. The idea is that when you find a multiplet M, instead of outputting it immediately, you "stash" it, slide the window so that it starts at the first signal in M, and look for a multiplet in the resulting window. If you find one, and either a) it is entirely after M, or b) its score is lower than that of M, then you output M. Otherwise you stash the new multiplet and continue.

That's the basic idea. In full detail (see below) it's quite a bit more complex, and figuring it out required extra coffee, extra concentration, and a lot of trial and error. BTW, there may be errors in my solution, and there may be a better approach entirely. If so, please let me know!

I should point out that the multiplet-finding algorithm is "heuristic". That means it only approximates the right answer,
i.e. the highest-scoring multiplets. Finding the actual right answer, I believe, is "NP-complete", essentially meaning that it would take more than the lifetime of the universe to solve. In any case, the score function is itself a heuristic for estimating the probability of a set of signals occurring in noise.

When I work on things like this I write "pseudocode", which is halfway between English and a program. Here's my final pseudocode for this.
maintain a variable "have_mp" and mp0
Invariant: if have_mp, then mp0 begins at start of window

Invariant: MPs are disjoint and don't overlap in freq

Assumption: if X is a subset of Y,
then find_multiplet(X) has lower score than find_multiplet(Y)

for signals S (in bary freq order)
	while window not empty
		if S doesn't exceed window, break

		// invariant: current window is maximal (S is beyond it)
		// each iteration of while removes at least 1 signal from start of window

		if find a multiplet m
			if have_mp
				have_mp = false
				if start(m) > end(mp0)
					add mp0 to result vector
				else if score(mp0) > score(m)
					add mp0 to result vector
					move start of window past mp0
					// discard m in this case
					continue
			// decide if we should stash m
			f = freq of first signal in m
			if S.freq < f + window size
				// S is in the window starting at start of m; extend window
				mp0 = m
				have_mp = true
				move start of window to start(m)
				break
			else
				add m to result vector
				move start of window past end(m)
		else
			trim front of window enough to include S at end
			break
	append S to window

at end
	if have_mp
		add m0 to result
	look for multiplet in final window

ID: 1956619 · Report as offensive     Reply Quote

Message boards : Nebula : Improved multiplet-finding


 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.