A quick review: we find "multiplets" - groups of detections that are plausibly from a single extraterrestrial radio beacon. We then compute a "score" for each multiplet. The score is intended to estimate the probability that the multiplet is noise, so a lower score is better.
Multiplet scores are used for two purposes:
- We will (at some point) manually examine the 1000 or so best-scoring multiplets of each category, looking for ET. ("Category" is the combination of detection type and barycentric-ness).
- We consider a birdie multiplet to be "detected" if its score ranks among the top 1000 non-birdie multiplets of that category. This is how we measure the sensitivity of our search, and also the effectiveness of our algorithms.
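The "detected" test above can be sketched in a few lines. This is only an illustration, not the project's actual code; the function name `birdie_detected` and the `top_n` parameter are made up for the example, and it assumes scores are plain floats where lower is better.

```python
def birdie_detected(birdie_score, non_birdie_scores, top_n=1000):
    """Return True if the birdie would rank among the top_n non-birdie scores.

    Hypothetical helper: lower scores are better, so the birdie is
    "detected" if fewer than top_n non-birdie multiplets beat it.
    """
    # Count the non-birdie multiplets that score strictly better (lower).
    better = sum(1 for s in non_birdie_scores if s < birdie_score)
    return better < top_n


# Toy example with a cutoff of 2 instead of 1000.
scores = [-10.0, -8.0, -3.0, -1.0]
found = birdie_detected(-9.0, scores, top_n=2)    # only -10.0 beats it
missed = birdie_detected(-5.0, scores, top_n=2)   # -10.0 and -8.0 beat it
```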
So the score function is critical to the entire experiment. We could be doing RFI removal perfectly, and finding the best possible multiplets, including perhaps the elusive "ET multiplet". But if the score function puts that multiplet outside the top 1000, we'll never see it, and we won't find ET.
The score function has evolved a lot, as described in previous blog entries. To preserve precision for good scores (close to zero probability) we compute the score in log space. Scores are generally negative numbers, and better scores are more negative.
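To see why log space matters, here is a small illustration with made-up probabilities. The use of the natural log is an assumption; the base only rescales the scores. Multiplying three tiny probabilities underflows to zero in double precision, while the sum of their logs is an ordinary negative number.

```python
import math

# Three made-up, very small probabilities.
probs = [1e-120, 1e-130, 1e-110]

# Direct product: 1e-360 is below the smallest double, so this becomes 0.0.
product = 1.0
for p in probs:
    product *= p

# Log-space score: a sum of logs, which stays well within range.
log_score = sum(math.log(p) for p in probs)
```

Once the product has collapsed to 0.0, all such multiplets would tie; in log space they still sort correctly.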
Currently the score function is the sum of three terms (this corresponds to multiplying three probabilities). The terms are:
- ND factor: the probability of finding N consistent detections in the multiplet's frequency band and sky disk, given the number of detections across all frequencies; N is the number of detections in the multiplet before pruning.
- Signal factor: the median of the scores of the detections in the multiplet. (The score of a detection is its power plus, for non-spike types, a goodness-of-fit factor; these are adjusted to normalize scores across detection types).
- Time factor: a measure of how uniformly the detection times are distributed over the time we observed this part of the sky.
These terms are somewhat ad-hoc. Each of them tries to measure something, but we're not sure how well. When we look at the top-scoring multiplets we invariably find ones that - intuitively - shouldn't have scored that high.
How can we improve multiplet scoring? We can improve the factors, or maybe find new ones. But another thing we can do is adjust the weights of the existing factors, i.e. multiply each of them by some number prior to adding them. Currently the factors are weighted equally. But we could weight them unequally (this would correspond to raising each probability to some power).
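The correspondence between weights and powers is easy to check numerically. In this sketch, `weighted_score` is a hypothetical helper and the probabilities and weights are made-up values chosen only to make the arithmetic visible.

```python
import math


def weighted_score(log_factors, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the log-space factors (ND, signal, time)."""
    return sum(w * f for w, f in zip(weights, log_factors))


# Made-up probabilities for the three factors, and made-up weights.
probs = (0.5, 0.1, 0.2)
weights = (2.0, 1.0, 0.5)

log_factors = [math.log(p) for p in probs]
score = weighted_score(log_factors, weights)

# The same thing in probability space: each probability raised to its weight.
product_of_powers = 0.5 ** 2.0 * 0.1 ** 1.0 * 0.2 ** 0.5
```

Here exp(score) equals product_of_powers up to rounding, and with the default weights (1, 1, 1) the score is just the plain sum of the three terms.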
OK. But what should the weights be? Birdies (once again) are the key to answering this. Our underlying goal is for birdie multiplets to score well relative to non-birdie multiplets. So we want to find weights that do this.
A while back we had a visiting student do this using neural networks. The results were positive but not dramatically so. I decided to use a simpler approach based on function minimization. We need to define a "figure of merit" for the score function: a single number that says how well the score function works. Then we want to optimize this value.
Specifically, let X be the vector of weights for the 3 factors, and let
f(X) = B(X) - NB(X) + penalty(X)
where
B(X) = the median score of birdie multiplets
NB(X) = the median score of the best 1000 non-birdie multiplets
penalty(X) = a multiple of the squared distance from X to the unit sphere. This keeps X from diverging.
We then find the X* that minimizes f(X). The function f isn't differentiable (because of the medians) but it's continuous and (I think) convex, so it's easy to minimize; the minimize() function in Python's SciPy converges quickly.
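Here is a minimal sketch of the optimization on made-up data; it is not the code I actually ran. Several things are assumptions for the sake of the example: the factor arrays are random, the penalty multiple is 10, the "best 1000" non-birdie multiplets are re-selected for each X, and the Nelder-Mead method is chosen because it needs no derivatives (the text above doesn't say which method minimize() was given).

```python
import numpy as np
from scipy.optimize import minimize

TOP_N = 1000            # size of the "best non-birdie" group, from the text
PENALTY_WEIGHT = 10.0   # assumed; the text only says "a multiple"


def figure_of_merit(x, birdie_factors, non_birdie_factors):
    """f(X) = B(X) - NB(X) + penalty(X); lower is better."""
    birdie_scores = birdie_factors @ x
    non_birdie_scores = non_birdie_factors @ x
    # Best means lowest: take the TOP_N smallest scores under these weights.
    k = min(TOP_N, len(non_birdie_scores))
    best = np.partition(non_birdie_scores, k - 1)[:k]
    # Squared distance from x to the unit sphere.
    penalty = PENALTY_WEIGHT * (np.linalg.norm(x) - 1.0) ** 2
    return np.median(birdie_scores) - np.median(best) + penalty


# Made-up data: one row per multiplet, columns = (ND, signal, time) factors.
rng = np.random.default_rng(0)
birdie_factors = -rng.gamma(shape=2.0, scale=10.0, size=(200, 3))
non_birdie_factors = -rng.gamma(shape=2.0, scale=12.0, size=(5000, 3))

x0 = np.ones(3)  # the current equal weighting
result = minimize(figure_of_merit, x0,
                  args=(birdie_factors, non_birdie_factors),
                  method="Nelder-Mead")
print("weights:", result.x, "f:", result.fun)
```

The numbers this prints mean nothing, since the data is random; the point is only the shape of the computation. With real data, the two arrays would hold the three factor values for each birdie and non-birdie multiplet of one category.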
Notes:
- f(X) is one way to formulate the problem; there are others. If you have ideas, let me know.
- We do all this separately for barycentric and non-barycentric multiplets; the optimal weights are different. In particular, B(X) includes only multiplets for birdies of this baryness (i.e. if we're doing non-bary, we don't include the non-bary multiplets from bary birdies).
- We can only do this for spike/gaussian multiplets, because we only have spike/gaussian birdies. For the other types (pulse/triple and autocorr) we can use the weights given by X*, or stick with (1,1,1). We'll figure that out later.
OK. I coded this and ran it. The results are as follows.
Barycentric:
Original medians: birdie -51.55, non-birdie -90.71.
Optimized medians: birdie -4.44, non-birdie -3.69.
Optimal weights: (.13, -.003, 1.7)
Notes:
- It worked; the median birdie score was originally worse (less negative) than the non-birdie median, and now it's better.
- The scores are closer to zero, but that's OK; magnitude of score doesn't matter, just the order.
- Signal factor has a negative weight. Why?
Non-barycentric:
Original medians: birdie -63.50, non-birdie -403.58
Optimized medians: birdie 59.57, non-birdie 233.46
Optimal weights: (-1.73, .03, -.02)
Notes:
- It worked.
- Scores are now positive; again, this doesn't matter.
- Two out of three weights are negative. Why?