Bottlenecks of SETI computations

Dan W

Send message
Joined: 25 Jun 09
Posts: 7
Credit: 73
RAC: 0
United Kingdom
Message 915966 - Posted: 9 Jul 2009, 1:14:51 UTC

Hi all,

I've just recently joined SETI@home, and I have a few questions about the number-crunching side of things and the long-term plans of SETI...

Since SETI is always collecting data, is the distributed project keeping up with all the data, or is it lagging behind in the hope of exponential growth in CPU power (multi-core etc.)?

If the former, how big is the lag between when the data is initially collected and when it is analysed by users? Is data collected in June 2009 analysed by users in June 2009?

If the latter, how quickly is the lag growing? (Just rough approximations would be interesting.)

Ideally, how much more resolution would SETI like users to analyse at if CPU power were no barrier? Is there any chance of increasing the resolution at some point in the future?

I don't see these questions answered on the main website, but it's always interesting to know the limits of computation, and how much more CPU power we would really need for the project to reach a saturation point with no more CPU bottlenecks.
ID: 915966 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 915988 - Posted: 9 Jul 2009, 1:47:06 UTC - in response to Message 915966.  
Last modified: 9 Jul 2009, 1:48:02 UTC

Hi all,

I've just recently joined SETI@home, and I have a few questions about the number-crunching side of things and the long-term plans of SETI...

Since SETI is always collecting data, is the distributed project keeping up with all the data, or is it lagging behind in the hope of exponential growth in CPU power (multi-core etc.)?

If the former, how big is the lag between when the data is initially collected and when it is analysed by users? Is data collected in June 2009 analysed by users in June 2009?

If the latter, how quickly is the lag growing? (Just rough approximations would be interesting.)

Ideally, how much more resolution would SETI like users to analyse at if CPU power were no barrier? Is there any chance of increasing the resolution at some point in the future?

I don't see these questions answered on the main website, but it's always interesting to know the limits of computation, and how much more CPU power we would really need for the project to reach a saturation point with no more CPU bottlenecks.

Until recently, SETI has been more or less keeping up with the incoming flow (with maybe a three-to-six-month lag between recording and analysis). The flow of data isn't constant for a number of reasons, including current problems with the data recorder and maintenance at the telescope or the planetary radar.

We're currently working on data recorded a while ago, from the archives.

It's also possible for the project to go back and do more sensitive analysis and recycle all that old data all over again: there is a lot of older data that has been through the Multibeam search (for narrowband signals) but not Astropulse (wideband signals).

It is also true that signals recorded last week may have spent decades (or centuries) getting to the telescope, so being a year or two behind isn't any big deal.

On the other side of the project, one goal is to demonstrate that BIG science can be done on a vanishingly small budget. SETI is often struggling under extra load caused by either "fast" work or outages. We're in one of those periods right now.

For many of us, it's no big deal. SETI (and BOINC) are supposed to harvest waste CPU cycles, and we have computers that are on anyway, so no worries.

Others have built special-purpose machines just for SETI, and they find any interruption in the flow of work to be very upsetting.

So, welcome. You came at an interesting time. Have a seat, buckle in, put on your helmet and enjoy the ride. I am.
ID: 915988 · Report as offensive
Dan W

Send message
Joined: 25 Jun 09
Posts: 7
Credit: 73
RAC: 0
United Kingdom
Message 915992 - Posted: 9 Jul 2009, 1:53:45 UTC
Last modified: 9 Jul 2009, 2:29:53 UTC

Cheers Ned for the informative reply!

Assuming a 'keeping up with the data' scenario, do you know roughly how much extra CPU power would be needed if we wanted to analyze the data at an ideal resolution? Would, say, 1000x let us achieve everything we would want? Or maybe more or less than that?
ID: 915992 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 916014 - Posted: 9 Jul 2009, 3:00:08 UTC - in response to Message 915992.  

Cheers Ned for the informative reply!

Assuming a 'keeping up with the data' scenario, do you know roughly how much extra CPU power would be needed if we wanted to analyze the data at an ideal resolution? Would, say, 1000x let us achieve everything we would want? Or maybe more or less than that?

Leaving aside practicalities, the kinds of analysis we're doing on the data could be improved perhaps 4 to 10x given additional compute resources. Beyond that, it would make sense to consider different kinds of analysis rather than idealizing what we have.

Considering practicalities, the limiting factors are getting the data back from the telescope and getting it out to participants' computers.

The true ideal from my point of view? Hmm, the ALFA array on the Arecibo telescope sees about 1 part in 3 million of the sky at any instant, the array is only in use 25 to 30 percent of the time, S@H records less than 1% of the bandwidth of the ALFA system (which is only a small fraction of the electromagnetic spectrum anyhow), etc. A major increase in any of those limitations would be very welcome to me. I don't seem to be able to really conceive an 'ideal' limit, though it would probably be possible to think along the lines of what could happen if every country in the world committed its 'defense' budget to SETI rather than buying weapons...
                                                                Joe
ID: 916014 · Report as offensive
Profile HAL9000
Volunteer tester

Send message
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 916015 - Posted: 9 Jul 2009, 3:01:51 UTC - in response to Message 915992.  

Cheers Ned for the informative reply!

Assuming a 'keeping up with the data' scenario, do you know roughly how much extra CPU power would be needed if we wanted to analyze the data at an ideal resolution? Would, say, 1000x let us achieve everything we would want? Or maybe more or less than that?


Welcome to the wonders of making your computer look for aliens.
I think the answer to how much CPU power SETI would ideally need is located here. The whole list, if you please.

However, they do not have the budget for that, so we kindly let them use our idle computers to make a giant global supercomputer.

If you visit the Server Status page here, you can see the current status of the tape splitters.

Looks like some Aug. '08 tapes are in there at the moment.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 916015 · Report as offensive
John McLeod VII
Volunteer developer
Volunteer tester

Send message
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 916016 - Posted: 9 Jul 2009, 3:02:22 UTC - in response to Message 916014.  

Cheers Ned for the informative reply!

Assuming a 'keeping up with the data' scenario, do you know roughly how much extra CPU power would be needed if we wanted to analyze the data at an ideal resolution? Would, say, 1000x let us achieve everything we would want? Or maybe more or less than that?

Leaving aside practicalities, the kinds of analysis we're doing on the data could be improved perhaps 4 to 10x given additional compute resources. Beyond that, it would make sense to consider different kinds of analysis rather than idealizing what we have.

Considering practicalities, the limiting factors are getting the data back from the telescope and getting it out to participants' computers.

The true ideal from my point of view? Hmm, the ALFA array on the Arecibo telescope sees about 1 part in 3 million of the sky at any instant, the array is only in use 25 to 30 percent of the time, S@H records less than 1% of the bandwidth of the ALFA system (which is only a small fraction of the electromagnetic spectrum anyhow), etc. A major increase in any of those limitations would be very welcome to me. I don't seem to be able to really conceive an 'ideal' limit, though it would probably be possible to think along the lines of what could happen if every country in the world committed its 'defense' budget to SETI rather than buying weapons...
                                                                Joe

And the entire sweep available to the Arecibo telescope is only a few percent of the total sky.


BOINC WIKI
ID: 916016 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Send message
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 916026 - Posted: 9 Jul 2009, 3:10:49 UTC - in response to Message 916016.  

...
Hmm, the ALFA array on the Arecibo telescope sees about 1 part in 3 million of the sky at any instant, the array is only in use 25 to 30 percent of the time, S@H records less than 1% of the bandwidth of the ALFA system (which is only a small fraction of the electromagnetic spectrum anyhow), etc.
...

And the entire sweep available to the Arecibo telescope is only a few percent of the total sky.

About 32%, but that's hard to see on the usual cylindrical-projection sky map.
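
A quick way to sanity-check that figure, assuming Arecibo sits at roughly 18.3° N and can point up to about 19.7° from the zenith (those two numbers are my own approximations, not anything official): the fraction of the celestial sphere between two declination limits is (sin δmax - sin δmin) / 2.

```python
import math

# Rough sanity check of the ~32% figure.
# Assumed geometry (approximate, from memory, not official values):
# Arecibo sits at about 18.34 deg N latitude and can point up to about
# 19.7 deg from the zenith, so it can track declinations of roughly
# latitude +/- 19.7 deg.
LATITUDE_DEG = 18.34
MAX_ZENITH_ANGLE_DEG = 19.7

dec_min = math.radians(LATITUDE_DEG - MAX_ZENITH_ANGLE_DEG)
dec_max = math.radians(LATITUDE_DEG + MAX_ZENITH_ANGLE_DEG)

# Fraction of the celestial sphere between two declination limits:
# (sin(dec_max) - sin(dec_min)) / 2
fraction = (math.sin(dec_max) - math.sin(dec_min)) / 2
print(f"Sky fraction visible to Arecibo: {fraction:.1%}")  # ~32%
```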
                                                                Joe
ID: 916026 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916032 - Posted: 9 Jul 2009, 3:22:36 UTC - in response to Message 916014.  

Cheers Ned for the informative reply!

Assuming a 'keeping up with the data' scenario, do you know roughly how much extra CPU power would be needed if we wanted to analyze the data at an ideal resolution? Would, say, 1000x let us achieve everything we would want? Or maybe more or less than that?

Leaving aside practicalities, the kinds of analysis we're doing on the data could be improved perhaps 4 to 10x given additional compute resources. Beyond that, it would make sense to consider different kinds of analysis rather than idealizing what we have.

Considering practicalities, the limiting factors are getting the data back from the telescope and getting it out to participants' computers.

The true ideal from my point of view? Hmm, the ALFA array on the Arecibo telescope sees about 1 part in 3 million of the sky at any instant, the array is only in use 25 to 30 percent of the time, S@H records less than 1% of the bandwidth of the ALFA system (which is only a small fraction of the electromagnetic spectrum anyhow), etc. A major increase in any of those limitations would be very welcome to me. I don't seem to be able to really conceive an 'ideal' limit, though it would probably be possible to think along the lines of what could happen if every country in the world committed its 'defense' budget to SETI rather than buying weapons...
                                                                Joe

... or how about the money wasted by the typical marketing department at any of the Fortune 1000 corporations trying to convince their customers to be happy instead of actually doing things to make their customers happy?

Sorry, but I've been trying to convince a major bank to give me the local number for their branch, and the best I've been able to get is one with a similar name 400 miles away.

They'd make the gang at the Sirius Cybernetics Corporation proud.
ID: 916032 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester

Send message
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 916094 - Posted: 9 Jul 2009, 8:25:53 UTC - in response to Message 915988.  

On the other side of the project, one goal is to demonstrate that BIG science can be done on a vanishingly small budget.

Let's hope this goal isn't noticed by the bureaucracy, or this approach (small budget for big science) will be implemented in other areas too ;)
ID: 916094 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916191 - Posted: 9 Jul 2009, 16:23:41 UTC - in response to Message 916094.  

On the other side of the project, one goal is to demonstrate that BIG science can be done on a vanishingly small budget.

Let's hope this goal isn't noticed by the bureaucracy, or this approach (small budget for big science) will be implemented in other areas too ;)

The alternative is worthwhile science that is not done because no one will fund it.

Like so many things in life, this is a trade-off. We either have to accept less being done for more money, or we have to make "more for less" possible.

BOINC means that an individual can actually "fund" a project out of his own pocket. A small project can be run on servers that happen to be on hand, using an inexpensive DSL line.

IIRC, HashClash was run by a graduate student, with no outside funding.
ID: 916191 · Report as offensive
Dan W

Send message
Joined: 25 Jun 09
Posts: 7
Credit: 73
RAC: 0
United Kingdom
Message 916340 - Posted: 9 Jul 2009, 22:18:39 UTC
Last modified: 9 Jul 2009, 22:45:49 UTC

Wow, thanks all for the input!

[...]the kinds of analysis we're doing on the data could be improved perhaps 4 to 10x given additional compute resources[...]

The true ideal from my point of view? Hmm, the ALFA array on the Arecibo telescope sees about 1 part in 3 million of the sky at any instant, the array is only in use 25 to 30 percent of the time, S@H records less than 1% of the bandwidth of the ALFA system (which is only a small fraction of the electromagnetic spectrum anyhow), etc.


So if I were to make the calculation, 10 * 3,000,000 * 3 * 100 * 50 = 450,000,000,000...

(The "50" is my guess at the figure for the "small fraction of the electromagnetic spectrum" - is it about right? Also, the 10 at the beginning is from your first paragraph; perhaps that needs increasing by an order of magnitude for 'as-good-as-we-can-get' analysis?)

So assuming our telescope records all possible sky all the time, am I right in saying that gargantuan number (450 billion) gives us a very approximate idea of the ideal increase in CPU speed we would need to exhaustively process (and keep up with) all possible data? :)
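
Just to make the arithmetic explicit, here is the same back-of-envelope product written out (every factor is only a rough guess taken from the discussion above, not an official project number):

```python
# Back-of-envelope product of the rough "how far from ideal" factors
# discussed above. All of these are guesses/approximations from this
# thread, not official SETI@home numbers.
factors = {
    "deeper analysis":    10,         # "perhaps 4 to 10x" better analysis
    "sky coverage":       3_000_000,  # ALFA sees ~1 part in 3 million of the sky
    "duty cycle":         3,          # array in use only ~25-30% of the time
    "recorded bandwidth": 100,        # S@H records <1% of ALFA's bandwidth
    "spectrum fraction":  50,         # pure guess at the "small fraction" of the EM spectrum
}

total = 1
for name, factor in factors.items():
    total *= factor

print(f"Combined shortfall factor: ~{total:,}")  # ~450,000,000,000
```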
ID: 916340 · Report as offensive
Ianab
Volunteer tester

Send message
Joined: 11 Jun 08
Posts: 732
Credit: 20,635,586
RAC: 5
New Zealand
Message 916369 - Posted: 10 Jul 2009, 0:26:24 UTC - in response to Message 916340.  

I'm not sure about your exact numbers, but enough CPU power to scan the whole sky continuously would be a BIG number - more than all the processors in the world, I would guess.

Of course you could scan the whole sky at a lower resolution, and that's done to look for BIG 'noisy' things like supernovae etc. What SETI is looking for is low-level signals that could be coming from an intelligent source. Any such signal would be mixed up with all the natural background radio noise - hence the need for detailed analysis to find any signal that might be there.

Looking that closely at small portions of the sky is all that it's practical to do right now. Looking at the whole sky, you don't have enough sensitivity to find what we are looking for, even if it were captured.

Ian
ID: 916369 · Report as offensive
Dan W

Send message
Joined: 25 Jun 09
Posts: 7
Credit: 73
RAC: 0
United Kingdom
Message 916703 - Posted: 11 Jul 2009, 2:27:38 UTC

From this page, SETI analyzes between 1418.75 MHz and 1421.25 MHz. I wonder how low and how high it would need to go to cover all bases.

There's also the question of whether higher or lower frequencies would be quicker or slower to inspect, all else being equal.
ID: 916703 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916723 - Posted: 11 Jul 2009, 3:29:10 UTC - in response to Message 916703.  
Last modified: 11 Jul 2009, 3:30:51 UTC

From this page, SETI analyzes between 1418.75 MHz and 1421.25 MHz. I wonder how low and how high it would need to go to cover all bases.

There's also the question of whether higher or lower frequencies would be quicker or slower to inspect, all else being equal.

This is called the "water hole" -- it is a quiet area of spectrum between the hydrogen line and the hydroxyl line.

One story says this was chosen because everyone "gathers around the water hole", although I suspect the truth is that it's a quiet spot, and as good a place as any.

The radio spectrum goes from DC to Light -- something below 10 kHz (0.01 MHz) to over 300 GHz (300,000 MHz).

Lower frequencies have less data, and should take less time to analyze.

Edit: It isn't practical to design a single receiving system that covers 0.01 MHz to 300,000 MHz.
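
One way to see why lower frequencies (and the narrower usable bands that come with them) mean less data: the recording rate has to scale with the bandwidth you capture (the Nyquist rate), so the data volume, and hence the crunching, grows linearly with bandwidth. A minimal sketch, assuming complex (I/Q) sampling at one complex sample per Hz of bandwidth and an assumed 2 bits per complex sample (the real data recorder's format may differ):

```python
# Rough illustration: recorded data volume scales with bandwidth.
# Assumes complex (I/Q) Nyquist sampling (one complex sample per Hz of
# bandwidth) and an assumed 2 bits per complex sample; the actual
# recorder format may differ.
BITS_PER_COMPLEX_SAMPLE = 2

def megabytes_per_second(bandwidth_hz: float) -> float:
    """Recording rate in MB/s for a given bandwidth."""
    samples_per_second = bandwidth_hz  # complex Nyquist sampling
    return samples_per_second * BITS_PER_COMPLEX_SAMPLE / 8 / 1e6

for label, bw_hz in [("100 kHz band", 1e5),
                     ("2.5 MHz water-hole band (above)", 2.5e6),
                     ("300 GHz (the whole radio spectrum)", 3e11)]:
    print(f"{label}: {megabytes_per_second(bw_hz):,.3f} MB/s")
```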
ID: 916723 · Report as offensive
Dan W

Send message
Joined: 25 Jun 09
Posts: 7
Credit: 73
RAC: 0
United Kingdom
Message 916792 - Posted: 11 Jul 2009, 13:54:52 UTC
Last modified: 11 Jul 2009, 14:33:14 UTC

I see. So hypothetically (given unlimited telescope resources) would it make more sense to analyze in fixed linear steps or logarithmic steps? So either:

0.01 MHz ... 2.51 ... 5.01 ............ 299,995 ... 299,997.5 ... 300,000 MHz.

(the 2.5 comes from 1421.25 - 1418.75)

or more like... (logarithmic style)
0.01 MHz ... 0.010017 ... 0.010035 ......... 298,982 ... 299,490 ... 300,000 MHz.

(the factor of 1.0017 comes from 1421.25 / 1418.75)

From what you said, the former seems to make more sense. Anyway, if it's the former, then it's going to be around 120,000 times slower (the latter is more like 10,000x slower). Would that sound about right to exhaustively analyze the entire useful spectrum?

That brings our giant number up to 5*10^16. Well, that *is* big, but I'm fairly optimistic Moore's law will be sparked off again (if only because so many people would like to see fully globally-illuminated, ray-traced graphics in games, with proper physics for billions of particles).
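
For what it's worth, those step counts are easy to check; the band edges and channel width below are just the figures quoted earlier in the thread:

```python
import math

# Check the rough step counts above.
F_LOW_MHZ = 0.01           # ~10 kHz, lower edge quoted above
F_HIGH_MHZ = 300_000.0     # ~300 GHz, upper edge quoted above
CHANNEL_MHZ = 2.5          # 1421.25 - 1418.75 MHz
RATIO = 1421.25 / 1418.75  # ~1.0018, ratio for logarithmic steps

linear_steps = (F_HIGH_MHZ - F_LOW_MHZ) / CHANNEL_MHZ
log_steps = math.log(F_HIGH_MHZ / F_LOW_MHZ) / math.log(RATIO)

print(f"Linear 2.5 MHz steps:  ~{linear_steps:,.0f}")         # ~120,000
print(f"Logarithmic steps:     ~{log_steps:,.0f}")            # ~10,000
print(f"450e9 x linear steps:  ~{450e9 * linear_steps:.1e}")  # ~5e16
```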

What might help reduce it somewhat further is if we could get telescopes on the far side of the moon. I'm guessing that the lack of atmosphere there may ease the computational algorithms...(?)

******POST EDITED FOR NUMBERS******
ID: 916792 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 916796 - Posted: 11 Jul 2009, 14:08:45 UTC - in response to Message 916792.  

What might help reduce it somewhat further is if we could get telescopes on the far side of the moon. I'm guessing that the lack of atmosphere there may ease the computational algorithms...(?)

Wouldn't help the computational algorithms, but it would sure help to block the interference from all those cellphones and "I Love Lucy".
ID: 916796 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Send message
Joined: 25 Nov 01
Posts: 21129
Credit: 7,508,002
RAC: 20
United Kingdom
Message 916850 - Posted: 11 Jul 2009, 17:50:06 UTC - in response to Message 916723.  

The radio spectrum goes from DC to Light -- something below 10 kHz (0.01 MHz) to over 300 GHz (300,000 MHz).

Lower frequencies have less data, and should take less time to analyze.

What sort of sensitivities could be achieved down at the very low frequencies?

Is there less interstellar background noise so that a signal can be detected at a greater range?...

Or could we achieve a greater detection range due to easier/better electronics and analysis/processing?

Keep searchin',
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 916850 · Report as offensive
DJStarfox

Send message
Joined: 23 May 01
Posts: 1066
Credit: 1,226,053
RAC: 2
United States
Message 916915 - Posted: 11 Jul 2009, 20:53:13 UTC - in response to Message 916796.  

What might help reduce it somewhat further is if we could get telescopes on the far side of the moon. I'm guessing that the lack of atmosphere there may ease the computational algorithms...(?)

Wouldn't help the computational algorithms, but it would sure help to block the interference from all those cellphones and "I Love Lucy".


True, but you'd just have to build it to be gamma ray burst and solar flare resistant. :)
ID: 916915 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 916916 - Posted: 11 Jul 2009, 20:57:52 UTC - in response to Message 916915.  

What might help reduce it somewhat further is if we could get telescopes on the far side of the moon. I'm guessing that the lack of atmosphere there may ease the computational algorithms...(?)

Wouldn't help the computational algorithms, but it would sure help to block the interference from all those cellphones and "I Love Lucy".

True, but you'd just have to build it to be gamma ray burst and solar flare resistant. :)

What happens if ET aims a gamma ray burst at it - do you get the detection, or harden against it? :-)
ID: 916916 · Report as offensive
1mp0£173
Volunteer tester

Send message
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 916964 - Posted: 11 Jul 2009, 23:35:49 UTC - in response to Message 916850.  
Last modified: 11 Jul 2009, 23:38:05 UTC

The radio spectrum goes from DC to Light -- something below 10 kHz (0.01 MHz) to over 300 GHz (300,000 MHz).

Lower frequencies have less data, and should take less time to analyze.

What sort of sensitivities could be achieved down at the very low frequencies?

Is there less interstellar background noise so that a signal can be detected at a greater range?...

Or could we achieve a greater detection range due to easier/better electronics and analysis/processing?

Keep searchin',
Martin

The main use of very low frequency transmissions has been communication with ballistic missile submarines -- the sub can simply trail a really long wire for the antenna (lower frequency, longer wavelength, bigger antenna) and VLF will penetrate water.

The signalling speed is very, very slow - barely enough to tell a sub on patrol to come up to periscope depth and use SATCOM, which is quite fast.

P.S. The other advantage of lower frequencies is that they tend to follow the ground instead of just shooting off into space. A bit higher (3 to 30 MHz or so), signals tend to go up and, with any luck, be bent back by the ionosphere; that's how international broadcasting and amateur radio ops. can cover the whole world. Above 30 MHz, signals tend to go in a straight line, unless they hit something opaque or highly absorbent at that frequency.
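
The "lower frequency, longer wavelength, bigger antenna" point is just wavelength = c / f; a quick illustration (the example frequencies are my own picks for VLF, the water hole, and the top of the radio spectrum):

```python
# Wavelength = speed of light / frequency, which is why efficient
# antennas for very low frequencies get enormous.
C = 299_792_458.0  # speed of light, m/s

for label, freq_hz in [("VLF (10 kHz)", 1e4),
                       ("Water hole (1420 MHz)", 1.42e9),
                       ("Top of the radio spectrum (300 GHz)", 3e11)]:
    wavelength_m = C / freq_hz
    print(f"{label}: wavelength ~ {wavelength_m:.3g} m")
```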
ID: 916964 · Report as offensive