Scaling up

David Anderson
Volunteer moderator
Project administrator
Project developer
Project scientist
Message 1997522 - Posted: 9 Jun 2019, 6:23:38 UTC

Our machine-learning collaborators requested a larger number of birdies to run their algorithms on, so I did an analysis run with 20,000 birdies (10X as many as before). Curiously, the number of birdie multiplets increased by only 1.5X rather than the expected 10X. I scratched my head for a while, then had a "d'oh!" moment.

Here's what happened: scoring is done by running a large number of jobs in parallel on the Atlas cluster. Each job scores a block of 64 adjacent Healpix pixels. When birdies are present, we score the block containing each birdie pixel; the other blocks are chosen randomly over the pixels visible to Arecibo.
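The block selection described above might be sketched roughly as follows. This is an illustration only, not the actual Nebula code: the names `block_of` and `choose_blocks`, the mapping of a pixel to its block by integer division, and the `visible_blocks` argument are all assumptions made for the example.

```python
import random

BLOCK_SIZE = 64  # pixels per scoring job, as stated in the post


def block_of(pixel):
    # Assumption: a block is a run of 64 consecutive pixel indices.
    return pixel // BLOCK_SIZE


def choose_blocks(birdie_pixels, visible_blocks, n_blocks, seed=0):
    """Pick up to n_blocks blocks: birdie blocks first, then a random fill."""
    chosen = []
    seen = set()
    # Take the block of each birdie pixel, in order, until the budget runs out.
    for pixel in birdie_pixels:
        if len(chosen) >= n_blocks:
            break
        block = block_of(pixel)
        if block not in seen:
            seen.add(block)
            chosen.append(block)
    # Fill whatever budget is left with random blocks from the visible set.
    remaining = n_blocks - len(chosen)
    if remaining > 0:
        rng = random.Random(seed)
        pool = [b for b in visible_blocks if b not in seen]
        chosen.extend(rng.sample(pool, min(remaining, len(pool))))
    return chosen
```

With a budget smaller than the number of birdie blocks, the loop stops early and the later birdies are never scored, which is the effect described in the next paragraph.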

I was - as usual - scoring 100K pixels. But that's only 1,562 blocks of 64 pixels. So only the first 1,562 birdies - plus a few more that happened to lie in the same block as one of these - were actually being scored.

To score all 20K birdie pixels, I could either reduce jobs to 4 pixels, or score a lot more pixels (namely 1.28M, or 64*20,000). I chose to do the latter, partly because it would be a first step to scaling up to the full 16M pixels.
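The arithmetic behind these two options can be checked in a few lines. This is a back-of-the-envelope sketch, not project code; the function names are made up for the example, and `pixels_needed` assumes the worst case where every birdie falls in its own block.

```python
BLOCK_SIZE = 64         # pixels per job
N_BIRDIES = 20_000      # birdies in the new run
PIXEL_BUDGET = 100_000  # pixels scored in the usual run


def blocks_in_budget(pixels, block_size):
    # Number of whole blocks that fit in a given pixel budget.
    return pixels // block_size


def pixels_needed(n_birdies, block_size):
    # Worst case: no two birdies share a block.
    return n_birdies * block_size


# Usual run: far fewer blocks than birdies.
print(blocks_in_budget(PIXEL_BUDGET, BLOCK_SIZE))  # 1562

# Option 1: shrink jobs to 4 pixels, keeping the 100K-pixel budget.
print(blocks_in_budget(PIXEL_BUDGET, 4))           # 25000

# Option 2: keep 64-pixel jobs and raise the pixel budget.
print(pixels_needed(N_BIRDIES, BLOCK_SIZE))        # 1280000
```

Either option gives at least 20,000 blocks, enough for one block per birdie.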

It went well. The actual scoring took a couple of hours - not much longer than for 100K pixels. The main difference is the 10X increase in the size of the output data. It took a while to copy everything from Atlas to the UCB servers. And certain web pages - like the birdie page - now take a long time to load because they read the full list of multiplets; there are now 28M of them.

So, this is promising. I don't see any major problems in scoring the entire sky. We should be getting to that fairly soon. Eric is currently focused on writing the first of two SETI@home papers (describing data collection and front-end analysis). When he's done we'll take a look at the results of the most recent run, maybe tweak a thing or two, then do a full-sky run. We're in the home stretch.

-- David
