SETI@home: An Experiment in Public-Resource Computing


David P. Anderson1
Jeff Cobb
Eric Korpela
Matt Lebofsky
Dan Werthimer

Space Sciences Laboratory
U.C. Berkeley

1United Devices, Austin TX

Abstract

SETI@home uses computers in homes and offices around the world to perform scientific data processing. This approach, though it presents some difficulties, has provided unprecedented computing power. It has also led to a unique world-wide public scientific involvement. We describe SETI@home's design and implementation, and discuss its relevance to future distributed systems.

Introduction

SETI (Search for Extraterrestrial Intelligence) is a scientific research area whose goal is to detect intelligent life outside the Earth [SHO98]. One approach is to search for certain types of radio signals. Natural radio sources occupy wide frequency bands, while man-made signals such as television and radio have narrow bandwidth. A narrow-bandwidth signal originating outside the solar system would therefore be evidence of extraterrestrial technology, and hence of extraterrestrial life. In 1959 Morrison and Cocconi proposed searching near the 1.42 GHz Hydrogen line [COC59], and a number of projects have proceeded along these lines (see Table 1).

Project name        Sensitivity  Sky coverage               Frequency       Max. signal drift  Signal bandwidth(s)        Computing power
                    (W/m^2)      (% of sky)                 coverage (MHz)  rate (Hz/sec)      searched (Hz)              (GFLOPS)
Phoenix             1e-26        0.005 (1000 nearby stars)  2000            1                  1                          200
SETI@home           3e-25        33                         2.5             50                 0.07 to 1200 (15 octaves)  20,000
SERENDIP            1e-24        33                         100             0.4                0.6                        150
Southern SERENDIP   3e-24        70                         300             0.4                0.6                        50
Beta                2e-23        70                         320             0.25               0.5                        25

Table 1: A sample of radio SETI projects.

Radio SETI projects differ in the trade-off among sensitivity, frequency coverage and sky coverage; however, their data analysis generally has a similar form:

  1. Compute the signal's power spectra over time (a minimal sketch of this step follows the list).
  2. Apply pattern recognition to the power spectra to identify "candidate" signals.
  3. Eliminate candidate signals that are probably natural or man-made.
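
As a concrete illustration of step 1, the following C++ sketch computes the power spectrum of one block of complex time samples. It is a minimal sketch only: the function name is ours, a direct DFT is used for brevity, and a real implementation would use an FFT and process many blocks per work unit.

    #include <complex>
    #include <vector>
    #include <cmath>

    // Power spectrum of one block of complex time-domain samples.
    // A direct O(N^2) DFT is shown for clarity; production code would use an FFT.
    std::vector<double> power_spectrum(const std::vector<std::complex<double>>& x) {
        const double PI = 3.141592653589793;
        const size_t n = x.size();
        std::vector<double> power(n);
        for (size_t k = 0; k < n; k++) {
            std::complex<double> sum(0.0, 0.0);
            for (size_t t = 0; t < n; t++) {
                double angle = -2.0 * PI * double(k) * double(t) / double(n);
                sum += x[t] * std::complex<double>(std::cos(angle), std::sin(angle));
            }
            power[k] = std::norm(sum);   // |X[k]|^2
        }
        return power;
    }

Computing such spectra for successive blocks of samples yields power as a function of both frequency and time, the input to step 2.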

Previous projects have used special-purpose supercomputers, located at the telescope, to perform steps 1 and 2. The speed of these computers has often been the limiting factor in overall sensitivity and generality. The SETI@home project, however, uses Internet-connected computers for these tasks. We call this public-resource computing because the computing resources are provided by the general public. Since its release in May 1999, SETI@home has gathered more than 3 million users.

In the remainder of this paper we describe SETI@home - the scientific design, the process and data architecture, and the interactions with participants. We give some SETI@home usage statistics, compare the project to other distributed computing efforts, and discuss its relevance to the future design of computer systems.

Properties of the SETI@home computing task

SETI@home is a fixed-rate data processing task. Data is recorded at a constant rate, and the CPU time per unit data is roughly constant [KOR01]. If a task of this type receives too little CPU power, the input queue grows without bound (acceptable in our case, since we archive all tapes anyway) and completion of the analysis is delayed accordingly. If there is too much CPU power, the project has two choices: turn participants away, or process each work unit multiple times.

SETI@home chose to do redundant computation. This has provided the added benefit of letting us detect and discard results from faulty processors and from malicious users. A redundancy level of three is adequate for this purpose. However, as the number of participants and the average CPU speed have increased, our redundancy level has at times exceeded this. To limit the redundancy level we again have had two choices: turn participants away, or modify the application to use more CPU time per unit data. We have done the latter twice; this has allowed us to extend the range of signals detected.

SETI@home's computing task has several properties that make it well-suited to public-resource computing:

  • High computing-to-data ratio. Each data unit takes 3.9 trillion floating-point operations, or about 10 hours on a 500 MHz Pentium II, yet involves only a 350 KB download and a 1 KB upload. This ties up a 56 Kbps modem for about 60 seconds (see the arithmetic after this list), a minimal inconvenience to participants. This high ratio also keeps our server network traffic at a manageable level.
  • Independent parallelism. Work unit computations are independent, so participant computers never have to wait for or communicate with one another. If a computer fails while processing a work unit, the work unit is eventually issued to another computer.
  • Tolerance of errors and omissions. If a data unit is analyzed incorrectly or not at all, it affects the overall goal only slightly. Furthermore, the omission is remedied the next time the telescope scans the same point in the sky.
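
The transfer-time claim above can be checked with simple arithmetic, assuming an ideal 56 Kbps connection with no protocol overhead:

    350 KB x 8 bits/byte = 2,800 kbits
    2,800 kbits / 56 kbits/sec = 50 seconds

which is consistent with the quoted figure of roughly a minute once connection setup and protocol overhead are included.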

Data recording and distribution

The Arecibo radio telescope in Puerto Rico, the world's largest radio telescope, is used for varied astronomical and atmospheric research. Starting in 1997, the SERENDIP project began "piggybacking" a secondary antenna at Arecibo [WER97]. As the main antenna tracks a fixed point in the sky, the secondary antenna traverses an arc. The disk of sensitivity (the beam) is about 0.1 degree in diameter; it passes over a fixed point in about 12 seconds (the beam period).

SETI@home shares SERENDIP's data source at Arecibo. A 2.5 MHz frequency band centered at the Hydrogen line (1.42 GHz) is recorded digitally. Radio signals are Doppler shifted in proportion to the sender's velocity relative to Earth; SETI@home's 2.5 MHz band is large enough to handle Doppler shifts for velocities up to 260 km/sec, or about the galactic rotation rate.
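
As a rough check of this bandwidth claim: the Doppler shift of a signal at frequency f from a sender moving at radial velocity v is approximately f*v/c, so at the Hydrogen line

    1.42e9 Hz x (260 km/sec / 3.0e5 km/sec) ≈ 1.23 MHz,

and shifts of up to about ±1.23 MHz fit within the 2.5 MHz recorded band.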

The recorded signal consists primarily of noise (from celestial sources and the receiver's electronics) and man-made terrestrial signals (such as TV stations, radar, and satellites). It also contains sine-wave signals injected by SETI@home for testing and calibration.

Data is sampled at 1-bit resolution and written to 35 GB Digital Linear Tape (DLT) cartridges. Every 16 hours a tape is filled, and about once a month a box of tapes is mailed from Arecibo to the SETI@home lab at U.C. Berkeley.
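
These numbers are consistent, assuming the 2.5 MHz band is recorded as complex samples with 1 bit per component:

    2.5e6 samples/sec x 2 bits/sample = 5e6 bits/sec = 0.625 MB/sec
    0.625 MB/sec x 16 hours (57,600 sec) ≈ 36 GB,

roughly the capacity of one 35 GB tape.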


Figure 1: The SETI@home data distribution system.

The data is then "split" into work units. This is done by dividing the signal into 256 frequency bands, each about 10 kHz wide. Each band is then divided into 107-second segments. These segments are overlapped in time by 20 seconds, ensuring that every beam period is contained entirely in at least one work unit. The resulting work units are about 350 KB - enough data to keep a typical computer busy for several hours, but small enough to download over a slow modem in a few minutes.
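
The time-segmentation step of splitting can be sketched as follows (illustrative C++ only; the names and data structures are ours, not the actual splitter's, and the frequency-division step is omitted). Because consecutive segments overlap by 20 seconds, their start times advance by 107 - 20 = 87 seconds:

    #include <vector>

    // One work unit: a sub-band index and a time range within the recording
    // (illustrative only).
    struct WorkUnit {
        int    band;        // which of the 256 ~10 kHz sub-bands
        double start_sec;   // segment start time
        double end_sec;     // segment end time
    };

    // Generate work-unit descriptors for a recording of the given length.
    // Segments are 107 seconds long and overlap by 20 seconds, so every
    // 12-second beam period lies entirely within at least one segment.
    std::vector<WorkUnit> split(double recording_sec, int num_bands = 256) {
        const double seg_len = 107.0, overlap = 20.0, step = seg_len - overlap;
        std::vector<WorkUnit> units;
        for (int band = 0; band < num_bands; band++) {
            for (double start = 0.0; start + seg_len <= recording_sec; start += step) {
                units.push_back({band, start, start + seg_len});
            }
        }
        return units;
    }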

A relational database stores information about tapes and work units, as well as many other aspects of the project (see Table 2).

Table name     Description                                   Number of rows
user           user account (name, password, work totals)    3.1 million
platform       compilation target (CPU + operating system)   175
version        compiled version of the client                657
team           group of users                                81 thousand
credit         work totals for countries, platforms, etc.    435 thousand
tape           DLT tape of radio data                        565
workunit_grp   107-second data segment                       361 thousand
workunit       work unit                                     92 million
result         result of analyzing a work unit               281 million
spike          spike candidate signal                        2.1 billion
gaussian       Gaussian candidate signal                     171 million
triplet        triplet candidate signal                      112 million
pulse          pulse candidate signal                        99 million

Table 2: Tables in the SETI@home database.

A data/result server program distributes work units to clients. It uses an HTTP-based protocol so that clients inside firewalls can contact the server. The server uses a large number of processes that share memory. Work units are sent round-robin in FIFO order.
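
One way to picture the round-robin FIFO distribution is a minimal sketch under our own assumptions; the real server's protocol and data structures are not shown here, and it uses many cooperating processes with shared memory rather than threads:

    #include <deque>
    #include <mutex>
    #include <string>

    // Illustrative only: a FIFO of work-unit names drawn on by server workers.
    // Threads and a mutex keep the sketch short; the actual server uses
    // multiple processes sharing memory.
    class WorkUnitQueue {
    public:
        void add(const std::string& name) {
            std::lock_guard<std::mutex> lock(m_);
            queue_.push_back(name);
        }
        // Hand out the oldest work unit and re-queue it at the back, so units
        // are sent in FIFO order and re-sent until garbage-collected.
        std::string next() {
            std::lock_guard<std::mutex> lock(m_);
            if (queue_.empty()) return "";
            std::string wu = queue_.front();
            queue_.pop_front();
            queue_.push_back(wu);
            return wu;
        }
    private:
        std::mutex m_;
        std::deque<std::string> queue_;
    };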

A garbage collector program removes work units from disk, clearing an "on disk" flag in their database records. We have experimented with two policies:

  • Delete work units for which N results have been received, where N is our target redundancy level. If work unit storage space fills up, work unit production is blocked and system throughput declines.
  • Delete work units that have been sent M times, where M is slightly more than N. This eliminates the above throughput bottleneck, but causes some work units never to have results. This fraction can be made arbitrarily small by increasing M. We are currently using this policy (a minimal sketch follows this list).
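
A minimal sketch of the second policy, under our own assumed record layout (the field and function names are hypothetical, not the project's actual code):

    #include <vector>

    // Hypothetical stand-in for a work unit's database record.
    struct WorkUnitRecord {
        int  times_sent;   // how many times the data/result server has sent this unit
        bool on_disk;      // whether the work unit's data file is still on disk
    };

    // Delete work units that have been sent at least M times, where M is
    // slightly more than the target redundancy level N.
    void garbage_collect(std::vector<WorkUnitRecord>& workunits, int M) {
        for (auto& wu : workunits) {
            if (wu.on_disk && wu.times_sent >= M) {
                // remove_workunit_file(wu);   // delete the data file from disk
                wu.on_disk = false;            // clear the "on disk" flag in the database
            }
        }
    }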

In general we have divided server functionality into parts (such as the splitter, data/result server, and garbage collector) that are independent of one another and, where possible, of the database server. For example, the data/result server can be run in a mode where, instead of using the database to enumerate work units to send, it gets this information from a disk file.

The client program

The client program repeatedly gets a work unit from the data/result server, analyzes it, and returns the result (a list of candidate signals) to the server. It needs an Internet connection only while communicating with the server. The client can be configured to compute only when its host is idle, or to compute constantly at a low priority. The program periodically writes its state to a disk file, and reads this file on startup; hence it makes progress even if the host is frequently turned off.
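
The client's outer loop can be pictured roughly as follows. This is a structural sketch under our own naming; the types and functions are placeholders, not the actual client API:

    #include <string>

    // Placeholder types; the real client's data structures are more elaborate.
    struct WorkUnit { std::string data; };
    struct Result   { std::string signals; };

    struct AnalysisState {
        WorkUnit wu;
        int step_count = 0;
        bool   done() const { return step_count >= 1000; }   // placeholder end test
        void   step()       { ++step_count; }                 // one slice of analysis
        bool   time_for_checkpoint() const { return step_count % 100 == 0; }
        Result result() const { return Result{"candidate signals"}; }
    };

    WorkUnit fetch_work_unit();                    // HTTP download from the data/result server
    void load_checkpoint(AnalysisState& s);        // resume from the state file, if present
    void save_checkpoint(const AnalysisState& s);  // write state to disk
    void send_result(const Result& r);             // HTTP upload of candidate signals

    void client_main_loop() {
        for (;;) {
            AnalysisState state;
            state.wu = fetch_work_unit();
            load_checkpoint(state);                // pick up where a previous run left off
            while (!state.done()) {
                state.step();
                if (state.time_for_checkpoint())
                    save_checkpoint(state);        // progress survives shutdowns and reboots
            }
            send_result(state.result());
        }
    }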

Analyzing a work unit involves computing its power as a function of frequency and time, then looking for several types of patterns in this power function: spikes (short bursts), Gaussians (narrow-bandwidth constant signals with a Gaussian envelope, resulting from the telescope's beam moving across a point), pulsed signals (constant signals pulsed with arbitrary period, phase, and duty cycle), and triplets (three equally-spaced spikes at the same frequency; a simple pulsed signal). Signals whose power and goodness-of-fit exceed thresholds are recorded in the output file.

Outer loops vary the following parameters (a structural sketch follows this list):

  • Doppler drift rate. If the sender of a fixed-frequency signal is accelerated relative to the receiver (e.g. by planetary motion) then the received signal drifts in frequency. Such signals can best be detected by undoing the drift in the original data, then looking for constant-frequency signals. The drift rate is unknown; we check 31,555 different drift rates, covering the range of physically likely acceleration rates.
  • Frequency resolution. Modulated signals will have different bandwidths; SETI@home covers 15 frequency resolutions ranging from 0.075 to 1220.7 Hz.
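
Putting these pieces together, the analysis has roughly the nested structure below. This is a sketch under our own naming only; the routines are placeholders for the algorithms described above, and the real client need not run every detector at every resolution:

    #include <complex>
    #include <vector>

    // Placeholder declarations: the real routines implement coherent removal of
    // Doppler drift, power spectra at each frequency resolution, and the four
    // signal detectors described above.
    using Samples = std::vector<std::complex<float>>;
    using Spectra = std::vector<std::vector<float>>;   // power vs. (time, frequency)

    Samples undo_drift(const Samples& data, double drift_rate_hz_per_sec);
    Spectra power_spectra(const Samples& data, double freq_resolution_hz);
    void find_spikes(const Spectra& p);
    void find_gaussians(const Spectra& p);
    void find_pulses(const Spectra& p);
    void find_triplets(const Spectra& p);

    void analyze(const Samples& workunit,
                 const std::vector<double>& drift_rates,          // 31,555 values
                 const std::vector<double>& freq_resolutions) {   // 15 values, 0.075 to 1220.7 Hz
        for (double drift : drift_rates) {
            Samples dedrifted = undo_drift(workunit, drift);
            for (double res : freq_resolutions) {
                Spectra p = power_spectra(dedrifted, res);
                find_spikes(p);
                find_gaussians(p);
                find_pulses(p);
                find_triplets(p);
            }
        }
    }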


Figure 2: The SETI@home display, showing the power spectrum currently being computed (bottom) and the best-fit Gaussian (left).

The SETI@home client program is written in C++. It is structured in several parts: a platform-independent framework for distributed computing (6,423 lines); components with platform-specific implementations, such as the graphics library (2,058 lines in the UNIX version); SETI-specific data analysis code (6,572 lines); and SETI-specific graphics code (2,247 lines).

The SETI@home client has been ported to 175 different platforms. The GNU tools, including gcc and autoconf, have greatly facilitated this task. SETI@home maintains the Windows, Macintosh, and SPARC/Solaris versions; all other porting is done by volunteers.

On most platforms the client comprises a main program that does all communication and data processing, and a separate graphics display program. These processes communicate via shared memory.
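
A minimal POSIX sketch of such shared-memory communication is shown below. The state structure and shared-memory name are our own illustrations, not the client's actual layout; on Windows and Macintosh the equivalent is done with the native shared-memory facilities.

    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    // Hypothetical state block shared between the main process (writer) and the
    // graphics display process (reader).
    struct SharedState {
        double current_frequency;     // frequency band being analyzed
        double best_gaussian_power;   // power of the best-fit Gaussian so far
        float  progress;              // fraction of the work unit completed
    };

    // Map the shared block into this process, creating it if requested.
    SharedState* open_shared_state(bool create) {
        int flags = create ? (O_CREAT | O_RDWR) : O_RDWR;
        int fd = shm_open("/seti_demo_state", flags, 0600);
        if (fd < 0) return nullptr;
        if (create && ftruncate(fd, sizeof(SharedState)) != 0) { close(fd); return nullptr; }
        void* p = mmap(nullptr, sizeof(SharedState),
                       PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);
        return (p == MAP_FAILED) ? nullptr : static_cast<SharedState*>(p);
    }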

Collecting and analyzing results

A large part of the SETI@home server complex is devoted to collecting and analyzing results (see Figure 3).


Figure 3: The system for collecting and analyzing results.

When a client has finished processing a work unit, it sends the result to the data/result server. Handling a result consists of two tasks:

  • Scientific: The data/result server writes the result to a disk file. The "process science files" program reads these files, creating result and signal records in the database. To optimize throughput, several copies of this program run concurrently.
  • Accounting: For each result, the server appends to a log file a line describing the result's user, its CPU time, and so on. The "process accounting files" program reads these log files, accumulating in a memory cache the updates to all relevant database records (user, team, credit). Every few minutes it flushes this cache to the database (a minimal sketch follows this list).
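
The accounting cache can be sketched as follows (our own names and structure, shown only to illustrate buffering database updates in memory and flushing them in batches):

    #include <map>
    #include <string>

    // Accumulated credit updates for one user, parsed from the result log files.
    struct CreditDelta {
        int    results   = 0;
        double cpu_hours = 0.0;
    };

    class AccountingCache {
    public:
        void record(const std::string& user, double cpu_hours) {
            CreditDelta& d = cache_[user];
            d.results   += 1;
            d.cpu_hours += cpu_hours;
        }
        // Called every few minutes: one database update per user touched,
        // rather than one per result.
        template <typename Database>
        void flush(Database& db) {
            for (const auto& entry : cache_)
                db.add_credit(entry.first, entry.second.results, entry.second.cpu_hours);
            cache_.clear();
        }
    private:
        std::map<std::string, CreditDelta> cache_;
    };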

By buffering database updates in disk files, the server system is able to handle periods of database outage, as well as periods of database overload such as when the data/result server starts up after a long outage.

Eventually, each work unit has a number of results in the database. The next step, result verification, examines each group of redundant results - which may differ in number of signals and signal parameters - and replaces them with a single "canonical" result for that work unit. The canonical results are stored in a separate database.

The final phase, back-end processing, consists of several steps. To verify the system, we check for the test signals injected at the telescope. Man-made signals (RFI) are identified and eliminated. We look for signals with similar frequency and sky coordinates detected at different times. These "repeated signals", as well as one-time signals of sufficient merit, are investigated further, potentially leading to a final cross-check by other radio SETI projects according to an accepted protocol [DOP90].

Interaction with and among participants

SETI@home relies primarily on news coverage and word-of-mouth to attract participants. Its web site (http://setiathome.berkeley.edu) informs visitors about the project and lets them download the client program. The site also shows the current scientific and technical status of the project.

The web site shows "leader boards" (based on work units processed) for individuals and for user categories such as country and email domain. Users can form teams which compete within categories; 82,000 such teams exist. Users are recognized on the web site when they pass work unit milestones. Leader-board competition (between individuals, teams, owners of particular computer types, and so on) has helped attract and retain participants.

We have tried to foster a SETI@home community by allowing users to exchange information and opinions. The web site has an online poll with questions concerning demographics, SETI, and distributed computing. Of the 95,000 users who have completed this poll, for example, 93% are male. We assisted in the creation of a newsgroup, sci.astro.seti, whose traffic is largely devoted to SETI@home. Users have created various "add-on" software, including programs to buffer multiple work units and results, and systems for graphically displaying work progress. Our web site contains links to these contributions. Users have also translated the web site into 30 languages.

History and statistics

SETI@home was conceived by David Gedye in 1995, and work began in 1997. Project staffing has averaged three full-time employees and several part-time volunteers. 400,000 people pre-registered for SETI@home. In May 1999 we released the Windows and Macintosh versions of the client. Within a week about 200,000 people had run the client. This number has grown to over 3.13 million as of July 2001. People in 226 countries run SETI@home; about half are in the U.S.

In the 12 months starting in July 2000 SETI@home participants processed 178 million work units. The average throughput during that period was 22.05 TeraFLOPS. Overall, the computation has performed 8.6e20 floating point operations, the largest computation on record.
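
These figures are mutually consistent, taking the figure of roughly 3.9 trillion floating-point operations per work unit quoted earlier:

    178e6 work units x 3.9e12 FLOP ≈ 6.9e20 FLOP
    6.9e20 FLOP / 3.15e7 sec (one year) ≈ 2.2e13 FLOP/sec ≈ 22 TeraFLOPS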

Security

A public-resource computing project is responsible for protecting its participants. It should ensure, for example, that its client program does not act as a vector for software "viruses". SETI@home has succeeded in this; our code download and data distribution servers have not (as far as we know) been penetrated, and in any case the client program does not download or install code. There have been two noteworthy attacks. Our web server was compromised, but only as a prank; the hackers did not, for example, install a Trojan-horse download page. Later, exploiting a design flaw in our client/server protocol, hackers obtained user email addresses and work totals. We fixed the flaw, but not until thousands of addresses had been stolen. In addition, a user developed an email-propagated virus that downloads and installs SETI@home on the infected computer, configuring it to give credit to his SETI@home account. This might have been prevented by forcing a manual step in the install process.

On the other hand, a public-resource computing project must protect itself from misbehaving and malicious participants. SETI@home has been confronted with many instances of this, although only a tiny fraction of participants are involved. A relatively benign example: some users modified the executable to improve its performance on specific processors. We didn't trust the correctness of such modifications, and we didn't want SETI@home to be used in "benchmark wars", so we adopted a policy banning modifications.

We have discovered and defeated many schemes for getting undeserved work credit. Other users have deliberately sent erroneous results. These activities are hard to prevent if users can run the client program under a debugger, analyze its logic, and obtain its encryption keys [MOL00]. Our redundancy checking, together with the error tolerance of our computing task, is sufficient for dealing with the problem; more sophisticated mechanisms have been proposed [SAR01].

Conclusion

By using public-resource computing, SETI@home harnessed unprecedented computing power, and directly involved a large, international group of participants, many of them non-scientists, in a significant scientific experiment. These successes were facilitated by several factors. A large sector of the population is fascinated by SETI, so it has been easy to attract participants; and the task of analyzing radio signals has properties (high compute-to-data ratio, independent parallelism, and error tolerance) that make it amenable to public-resource computing.

Public-resource computing relies on personal computers with excess capacity, e.g. large amounts of idle CPU time. The idea of using these cycles for distributed computing was explored by the Worm computation project at Xerox PARC, which used workstations within a research lab [SHO82]. We call this organization-resource computing.

Large-scale public-resource computing became feasible with the growth of the Internet in the 1990s. Two major public-resource projects predate SETI@home. The Great Internet Mersenne Prime Search (GIMPS), which searches for prime numbers, started in 1996. Distributed.net, which demonstrates brute-force decryption, started in 1997. More recent examples involve biology and medicine, such as folding@home and the Intel-United Devices Cancer Research Project. These projects use client/server architectures similar to SETI@home. Compared to SETI@home they have much higher compute/data ratios.

Several efforts are underway to develop general-purpose frameworks for public-resource and organization-resource computing. Projects collectively called The Grid are developing systems for resource-sharing between academic and research organizations [FOS99]. Private companies such as Entropia and United Devices are developing platforms for distributed computation and storage in both public and organizational settings.

What range of problems is amenable to public-resource computing? Applications that require frequent synchronization and data-sharing between nodes have been attacked via hardware-based approaches such as shared-memory multiprocessors, and more recently via software-based cluster computing, such as PVM [SUN90]. Public-resource computing, with its frequent computer outages and network disconnections, seems ill-suited to these applications. But appropriate scheduling mechanisms, such as exploiting groups of LAN-connected machines, may eliminate these difficulties.

Applications such as computer graphics rendering require large amounts of data per unit of computation, perhaps making them unsuitable for public-resource computation. However, reductions in bandwidth costs will ease these problems, and when a large part of the data is constant across work units, multicast techniques can reduce them further.

The question of how to attract participants is complex. While commercial efforts are exploring the use of financial and other incentives, it seems likely that volunteer-participation projects like SETI@home will continue to thrive. It is projected that by 2004 one billion computers will be on the Internet. If only 10% of these take part in public-resource computation, they will support 100 projects the size of SETI@home. But a project can attract participants only if it explains and justifies its goals. In a sense, the public will participate in the research funding process, increasing public awareness of science.

On a more general level, public-resource computing is an aspect of the "peer-to-peer paradigm", which involves shifting resource-intensive functions (such as computation and data storage) from central computers to computers at the edge of the Internet, such as workstations and home PCs [ORA01]. Economic factors (such as the falling price, increasing proliferation, and decreasing utilization of personal computers) favor this paradigm, which is applicable to a range of tasks spanning the computation/storage/communication space.

Acknowledgements

SETI@home was conceived by David Gedye. Woody Sullivan, Craig Kasnov, Kyle Granger, Charlie Fenton, Hiram Clawson, Peter Leiser, Eric Heien, and Steve Fulton have contributed ideas and effort to SETI@home. Thanks to our participants, and thanks to The Planetary Society, Sun Microsystems, the U.C. DiMI program, Fujifilm, Cosmos Studios, and the other organizations and individuals who have donated to SETI@home.


References

COC59
Cocconi, G., and Morrison, P., "Searching for Interstellar Communications," Nature, Vol. 184, #4690, p. 844. Sept. 1959.
DOP90
Declaration of Principles Concerning Activities Following the Detection of Extraterrestrial Intelligence. Acta Astronautica 21(2) pp. 153-154, 1990.
KOR01
Korpela, E., D. Werthimer, D. Anderson, J. Cobb, and M. Lebofsky. "SETI@home - Massively distributed computing for SETI", Computing in Science and Engineering 3(1), p. 79, 2001.
FOS99
Foster, I. and C. Kesselman. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 1999.
MOL00
Molnar, D. "The SETI@home Problem". ACM Crossroads, Sept. 2000.
ORA01
Oram, Andy (editor). Peer-to-Peer: Harnessing the Power of Disruptive Technologies. O'Reilly and Associates, 2001.
SAR01
Sarmenta, L. F. G. "Sabotage-Tolerance Mechanisms for Volunteer Computing Systems". ACM/IEEE International Symposium on Cluster Computing and the Grid, Brisbane, Australia, May 15-18 2001.
SHO82
Shoch, J. and J. Hupp. "The Worm Programs -- Early Experience with a Distributed Computation". CACM 25(3), pp. 172-180, March 1982.
SHO98
Shostak, Seth. Sharing the Universe: Perspectives on Extraterrestrial Life. Berkeley Hills Books, 1998.
SUN90
Sunderam, V.S. "PVM: A Framework for Parallel Distributed Programming". Concurrency: Practice and Experience, 2(4) pp. 315-339, Dec. 1990.
WER97
Werthimer, Bowyer, Ng, Donnelly, Cobb, Lampton and Airieau (1997). The Berkeley SETI Program: SERENDIP IV Instrumentation; in the book "Astronomical and Biochemical Origins and the Search for Life in the Universe". Cosmovici, Bowyer and Werthimer, editors.