Nebula: architecture

Why not use volunteer computing?

We thought about using volunteer computing for the SETI@home back end as well as the front end. In this approach, we'd distribute the signal database to volunteer computers, divided by sky position, frequency, and/or time, and then farm out back-end tasks to these computers.

I decided not to take this approach for several reasons:

Data architecture

I knew we needed to move away from traditional SQL databases for primary data storage. I considered using a distributed "NoSQL" database system such as MongoDB, Hadoop, or Google's Bigtable. In the end I decided to develop my own data storage system, based on Unix files and directories. Reasons for this include:
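
To make this concrete, here's a sketch of what a file-and-directory storage scheme can look like; the layout and names below are hypothetical, not necessarily what Nebula actually uses:

    // Sketch of file-based storage (hypothetical layout): one directory per
    // signal type, one binary file per sky pixel, so fetching all signals of
    // a given type in a pixel is a single open() and read().
    #include <cstdio>
    #include <string>

    std::string signal_file_path(const char* signal_type, int pixel) {
        char buf[256];
        std::snprintf(buf, sizeof(buf), "signals/%s/%d.bin", signal_type, pixel);
        return std::string(buf);
    }

    // Appending a fixed-size binary record is a plain Unix file append;
    // no database server, schema, or query language is involved.
    template<typename SIGNAL>
    void append_signal(const SIGNAL& s, const char* signal_type, int pixel) {
        std::FILE* f = std::fopen(signal_file_path(signal_type, pixel).c_str(), "ab");
        if (f) {
            std::fwrite(&s, sizeof(s), 1, f);
            std::fclose(f);
        }
    }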

Hardware architecture

The first plan was to use the Amazon cloud: EC2 (computing) and S3 (storage). In June 2015 I began working on this with an undergrad student, Kevin Luong. We spent several months on it and got quite far along. But in the end we abandoned it for several reasons:

Next I considered using the UC Berkeley "Research Cluster". But it was more expensive than Amazon, and had limited capacity.

Around this time - late 2015 - I had a chat with my friend and collaborator Bruce Allen. Bruce is a brilliant physicist who runs the Albert Einstein Institute for Gravitational Physics (AEI) in Hannover, Germany. He also likes to build and experiment with computer systems. He created the Einstein@Home project, and contributed lots of code to BOINC during its early development.

Bruce built a huge computing cluster at AEI called Atlas. Atlas has thousands of compute nodes and lots of high-performance storage. It uses standard cluster software like NFS and Lustre (for storage) and Condor (for job processing). When I told Bruce about my issues with Nebula, he immediately offered to let me use Atlas for it. His only condition was that we credit Atlas's contribution in our announcement of the discovery of ET.

This was a game-changer. Atlas was perfectly suited to the requirements of Nebula. My productivity skyrocketed. Instead of struggling with unfamiliar systems like EC2, I was working in Unix, my environment of choice. I hadn't used Condor before, but it was easy to use and worked great.
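
For readers who, like me, were new to Condor: you describe a batch of jobs in a small submit file and hand it to condor_submit, which queues and schedules them across the cluster. A minimal sketch (the program and file names here are made up, not Nebula's actual jobs):

    # Hypothetical submit file: run 1000 instances of a pixel-scoring program,
    # one per chunk, each with its own output and error files.
    executable = score_pixels
    arguments  = --chunk $(Process)
    output     = logs/score.$(Process).out
    error      = logs/score.$(Process).err
    log        = logs/score.log
    queue 1000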

So a million thanks to Bruce for his generous help, and also to Carsten Aulbert, who manages Atlas; he taught me about it and has fixed many problems for me.

Software architecture

When I started Nebula, there was a lot of back-end code: roughly 30,000 lines of C++. This code consisted of several parts:
  1. General scientific code: computing angles and sky positions, correcting frequencies for Earth's motion, etc.
  2. Project-specific scientific code for things like finding multiplets and computing pixel scores.
  3. Code for retrieving records from the Informix DB.
  4. Code specific to NTPCkr, e.g. for maintaining the "hot pixel" list.

I wanted to reuse as much of this code as possible; in particular I wanted to use all of 1) and 2), with little modification. But the way the code was organized made this difficult. The project-specific scientific code represented signals and multiplets using C++ classes whose member functions included Informix DB access; in fact the .h and .cpp files were automatically generated from the SQL schema.

At first I tried removing these Informix dependencies, but this would have involved massive code changes. So I decided instead to keep all the dependencies; Nebula programs would be linked with the Informix client libraries, they just wouldn't use them. Later, when I moved the Nebula programs to Atlas, I had to figure out how to statically link the Informix libraries, since Informix doesn't exist on Atlas. But this wasn't hard to do.
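
Statically linking a particular library mostly comes down to naming its .a archive explicitly on the link line instead of letting -l pick up the shared version. A sketch (the paths and archive names are placeholders, not the actual Informix file names):

    # Hypothetical link line: the Informix client code is embedded in the
    # binary, so it runs on Atlas even though Informix isn't installed there.
    g++ -o nebula_prog nebula_prog.o \
        /path/to/informix/lib/libifxclient.a \
        -lpthread -lm -ldl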

In addition to the Informix dependencies, the old signal classes had constructors and destructors, contained members with their own constructors, and so on. None of this did anything useful for me, but it slowed everything down. So I made my own versions of the signal classes, keeping them clean and minimal and adding my own member functions as needed. I called these N_SPIKE, N_GAUSSIAN, etc. The source code is here. In the parts of Nebula that I wrote from scratch, I use these classes. In the parts that are based on existing code, I use the old classes.
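
To give the flavor of it, the Nebula versions are essentially plain structs; the fields below are illustrative, not the actual N_SPIKE definition (see the source code for that):

    #include <cstdio>

    // Sketch of a minimal signal class: no constructors, destructors, or DB
    // code, so arrays of these can be read and written as raw bytes.
    struct N_SPIKE {
        double power;       // signal power
        double time;        // time of detection
        double ra, decl;    // sky position
        double freq;        // barycentric frequency
        int    qpix;        // sky pixel containing the signal

        // member functions are added only as needed, e.g. binary output
        void write(std::FILE* f) const {
            std::fwrite(this, sizeof(*this), 1, f);
        }
    };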

Nebula consists of three bodies of code:

Browsing the source code

If you want to read the SETI@home back-end source code, it's divided among three Subversion repositories:

You can browse these through the web, or use Subversion to download the code.
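
For example, to get a local copy of one of them (the URL here is a placeholder; use the repository's actual URL):

    # Check out a working copy into the directory "nebula"
    svn checkout https://example.org/svn/nebula nebula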

Next: More about signals.




 