The Plan


Message boards : Number crunching : The Plan

Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 206378 - Posted: 8 Dec 2005, 4:19:36 UTC
Last modified: 8 Dec 2005, 4:36:39 UTC

Hi all, I'm about to lay out the conversion plan as best I can piece it together. Nothing I say here should be interpreted as coming from Berkeley. I'm seated comfortably in my office chair at my home in South Carolina. There, the disclaimer is out of the way.

I've been watching BOINC develop and grow. I've kept track in my mind of what the developers have said about the plans. Most of what I say can be traced back to statements by developers, but it is in bits and pieces, not one big laid-out plan.

My intention is that, given what I know, some of the newer users may feel comforted that there is at least a plan, and that it's more or less on track.

History
SETI Classic was wildly successful, so successful that they had more computer power than they needed. In order to keep users happy they would "on occasion" send out the exact same WU upwards of 29 times (source: Rom Walton). Matt Lebofsky stated that the average number of times a WU was sent was 6-9. Note: to do good science, only two "strongly similar" results are needed to validate an answer. This is inefficient (tony's interpretation). I could see Dr. A saying to himself, "Wow, this is fantastic, but I hate to see all this wasted computer power. What can I do? I know, I'll design a program that will manage applications to allow other poor projects to benefit from our success and these extra CPU cycles." BOINC was born.

The Problems
Now there are two projects crunching similar WUs with similar software (the application part). By now Dr. Anderson had made his decision to use BOINC; he was getting grant money to do it, and it was the future. It was more efficient at achieving the science, since only 4 results (work units) are sent instead of 6-9. Now, how to get both these projects merged into one? They have limited hardware and limited money.
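
To put rough numbers on the redundancy figures above, here is a minimal sketch in Python. It assumes two matching results are enough to validate a work unit, as noted in the History section; the function name is made up for illustration and nothing here comes from the project's code.

```python
def redundancy_factor(copies_sent: float, copies_needed: float = 2) -> float:
    """How many times more results are crunched than validation strictly needs."""
    if copies_needed <= 0:
        raise ValueError("copies_needed must be positive")
    return copies_sent / copies_needed


# Figures quoted in the post: Classic averaged 6-9 copies, BOINC sends 4.
for label, copies in (("Classic (low)", 6), ("Classic (high)", 9), ("BOINC", 4)):
    print(f"{label}: {redundancy_factor(copies):.1f}x the minimum")
```

Under that assumption, Classic was doing roughly 3 to 4.5 times the minimum work per unit, while BOINC does about twice the minimum.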

The growth
BOINC started small: a couple of servers and some volunteers. It grew. They started taking servers from Classic to feed BOINC's growth, and it kept growing. They snagged another server. Now both projects are on the ragged edge, in need of gear.

The solution
Merge them both, but how to do that? If we put all those people in one place, then since BOINC is more efficient we'll run out of work again, and the existing servers at BOINC won't handle it. What to do? They've chosen to do the following:

1) recommend joining other projects
2) increase the sensitivity of our search (this is what the SETI beta team has been working on since June 2005; it doubles the sensitivity of the search and takes 4 to 10 times longer to crunch a result. When released it will reduce the load by at least a factor of 4)
3) add on Astropulse and data from other sources (multibeam receiver) as soon as ready
4) add the Classic equipment to BOINC
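
The "factor of 4" in point 2 can be sketched as simple arithmetic. This is a back-of-the-envelope model, not anything from Berkeley: it assumes the total volunteer computing power stays the same and that server traffic scales with the number of results returned. The function name is hypothetical.

```python
def relative_result_traffic(slowdown: float) -> float:
    """Results returned per day relative to today, for the same total CPU power."""
    if slowdown <= 0:
        raise ValueError("slowdown must be positive")
    return 1.0 / slowdown


# The post quotes 4 to 10 times longer per result.
for slowdown in (4, 10):
    share = relative_result_traffic(slowdown)
    print(f"{slowdown}x longer per result -> {share:.0%} of today's result traffic")
```

With those assumptions, a result that takes 4 times longer means a quarter of the results per day, and 10 times longer means a tenth.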

The plan
1) merge the master science databases (November 15th)
2) send out emails about the closure of Classic
3) adopt the beta application as the SETI standard (so the load will be reduced on the servers)
4) close Classic (December 15th)
5) reconfigure the Classic gear and task it in with the BOINC gear

Today
SETI planned to merge the master science database of Classic with that of BOINC on November 15th (see Tech News), but software/incompatibility issues postponed it.
Then they released the Classic closure emails and blogs. All the new users overloaded the server. While it was operating at reduced capacity, they decided to do the database merge; it's like killing two birds with one stone. It accomplishes the merge and lets the server operate faster.

Since 22 November we've gone from 256K to 311K users (boincstats). This is the problem, but the solution is the next step, scheduled to happen within 8 days from now (as I've heard the news, unless it gets pushed back). That's the adoption of the beta client, which will cut the load drastically, thereby making communications easier. Then, when the Classic gear is added, it will get even easier and there will be room for future growth (new users).

I can't predict the future; maybe something doesn't happen like it should (like the master science database merge), and dates can change. My point here is that people need to know a plan exists to ease the current problems. Hopefully new users will see this and say, "Hey, they knew there'd be problems and they tried to account for it in the plans."

I hope you stick around; if nothing else, the next couple of weeks ought to be very interesting.

thanks for your time

tony
____________

KB7RZF
Volunteer tester
Joined: 15 Aug 99
Posts: 9463
Credit: 3,020,518
RAC: 1,860
United States
Message 206390 - Posted: 8 Dec 2005, 4:27:22 UTC
Last modified: 8 Dec 2005, 4:54:03 UTC

Wow, awesome input Tony. Thanks for summing it all up. I think also once enhanced SETI comes to, things will improve. Right now I've got 1 WU that is crunching; it has until March 31st 2007 to complete (I'm thinking that's just a date they set for fun, but dunno), and it does take about 8-10 times longer. Once it rolls out, I think the whole SETI community will be a lot better off. Just my thoughts.

And Tony, keep up the great work you're doing. It does not go unnoticed.

Jeremy
"edit" Also gave ya a plus Tony, I just didn't post it
____________

Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 206421 - Posted: 8 Dec 2005, 4:38:55 UTC

Note: if anyone has seen flaws in the way I represented the plan, please speak up. I'd rather everyone be armed with the truth instead of gossip.

good night

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8470
Credit: 22,980,006
RAC: 13,754
United Kingdom
Message 206425 - Posted: 8 Dec 2005, 4:41:13 UTC

Tony,

Nice one, let's hope everybody reads it.

(give it a +)
____________
Only two things are infinite: the universe and human stupidity, and I am not sure about the former. - Albert Einstein

web03
Volunteer tester
Joined: 13 Feb 01
Posts: 355
Credit: 719,156
RAC: 0
United States
Message 206427 - Posted: 8 Dec 2005, 4:41:49 UTC

Great job Tony!!! Plus from me as well.....

Wendy
____________
Wendy




Darth Dogbytes™
Volunteer tester
Joined: 30 Jul 03
Posts: 7512
Credit: 2,021,148
RAC: 0
United States
Message 206428 - Posted: 8 Dec 2005, 4:49:16 UTC

Ditto's
____________
Account frozen...

SteveK
Volunteer tester
Joined: 23 Dec 04
Posts: 54
Credit: 15,550
RAC: 0
United States
Message 206431 - Posted: 8 Dec 2005, 4:52:48 UTC

You get a "plus" from me too, big guy! A great summary to help keep everything in perspective!
____________

Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 206470 - Posted: 8 Dec 2005, 5:59:43 UTC - in response to Message 206378.

Tony

Sorry, I was having too much Fun in the Whining Thread... You have provided more from what was pieced together than what is actually in the Main page or Tech News..

It should be "sticky'd" or they can provide appropriate revisions...

Applause!

Al

____________
Please consider a Donation to the Seti Project.

Jim
Joined: 28 Jan 00
Posts: 614
Credit: 2,031,206
RAC: 0
United States
Message 206486 - Posted: 8 Dec 2005, 6:27:03 UTC

Tony - nicely done. Well written and very informative.

Bound to make the NY Times best seller list.

A big effort and much appreciated.

Jim
____________

Without love, breath is just a clock ... ticking.
Equilibrium

Paul D. Buck
Volunteer tester
Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 206684 - Posted: 8 Dec 2005, 13:37:41 UTC

Just one minor quibble.

The introduction of the enhanced client will be "phased" in through the restriction of the work sent for the new version. In other words, it will not be a dramatic "switch-over" ...

When things are working as expected, the proportion of new work and old will be changed. One of the driving factors again is server load. By controlling how much work is issued they can control how many people will be downloading the new enhanced application.

(From the mailing lists)
____________

James Nelson
Volunteer tester
Joined: 23 Mar 02
Posts: 377
Credit: 1,982,325
RAC: 211
United States
Message 206736 - Posted: 8 Dec 2005, 14:25:30 UTC - in response to Message 206378.

2) increase the sensitivity of our search (this is what the SETI beta team has been working on since June 2005; it doubles the sensitivity of the search and takes 4 to 10 times longer to crunch a result. When released it will reduce the load by at least a factor of 4)

My only question is: if it takes 4 - 10 times longer, won't the result files be 4 - 10 times larger? If so, then total throughput should be about the same, maybe a little less, but not a factor of 4 less.
I'm no techie, but I'm not sure the enhanced version is the solution to all our problems, although reducing the number of files on the disk array should help stop all the timeouts.
____________

Landroval
Joined: 7 Oct 01
Posts: 188
Credit: 825,169
RAC: 842
United States
Message 206742 - Posted: 8 Dec 2005, 14:30:16 UTC - in response to Message 206736.
Last modified: 8 Dec 2005, 14:30:46 UTC

My only question is: if it takes 4 - 10 times longer, won't the result files be 4 - 10 times larger? If so, then total throughput should be about the same, maybe a little less, but not a factor of 4 less.
I'm no techie, but I'm not sure the enhanced version is the solution to all our problems, although reducing the number of files on the disk array should help stop all the timeouts.


Uploading and reporting involves several steps and database accesses, and the time it takes to transfer the file is a relatively small part of it. Larger transfers, but fewer of them, will reduce the total load on the servers quite a bit. It won't be the answer to everything, but it'll help.
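
A toy model of that point, with made-up cost units: each reported result costs the server a fixed amount (connection, scheduler request, database updates) plus a transfer cost that scales with file size. The 9:1 split between fixed and transfer cost, and the worst case of files growing 4 times larger, are purely assumptions for illustration.

```python
def total_server_cost(results: int, fixed_cost: float, transfer_cost: float) -> float:
    """Total server cost when every result pays a fixed overhead plus a transfer cost."""
    return results * (fixed_cost + transfer_cost)


# Hypothetical numbers: today, 1000 results with mostly fixed overhead.
today = total_server_cost(results=1000, fixed_cost=9.0, transfer_cost=1.0)

# Enhanced, worst case: a quarter of the results, each transfer 4x larger.
enhanced = total_server_cost(results=250, fixed_cost=9.0, transfer_cost=4.0)

print(f"today: {today:.0f} units, enhanced: {enhanced:.0f} units")
```

Even with the transfers assumed to be 4 times larger, the total drops to about a third in this model, because the fixed per-result overhead is paid far less often.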

Cheers,
Brian

edit: dumb typo fixed.
____________
If you think education is expensive, try ignorance.

Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 206745 - Posted: 8 Dec 2005, 14:35:01 UTC

The problem isn't the file size, as I understand it; it's getting a socket connection. Once you have a socket, then up they go. The total bandwidth is currently sufficient to the task. Also, the crunch times are dependent upon angle range, and are supposed to vary between 4 and 10 times longer. I haven't seen an established average, so figures are sketchy.

"The Plan" is long enough already. If it were longer I probably wouldn't want to read it. I just tried to cover the highlights to keep it as trim as possible to increase the read count. For me the important part was letting frustrated users know that there WAS a plan.

thanks for your input.

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8470
Credit: 22,980,006
RAC: 13,754
United Kingdom
Message 206759 - Posted: 8 Dec 2005, 14:51:25 UTC
Last modified: 8 Dec 2005, 14:54:27 UTC

The files on Seti enhanced are exactly the same type and size of file we are presently processing.

The extra work is looking deeper into the data that is actually there. As my tutor on spread spectrum techniques described it: imagine looking sideways at your lawn when it's being cut. Normal is like a 3 inch cut; it just cuts off the tips of the longest blades of grass to be analysed. Enhanced is like a 2 inch cut, where it not only cuts the longest blades but many of the shorter blades as well for analysis.

[edit] sorry - spread spectrum techniques are used in communications, where you spread the power of the transmitter over a wide spectrum so that many channels can be used; doing this usually means that the received signal is buried in the noise.
____________
Only two things are infinite: the universe and human stupidity, and I am not sure about the former. - Albert Einstein

Daniel Schaalma
Volunteer tester
Joined: 28 May 99
Posts: 297
Credit: 16,953,703
RAC: 0
United States
Message 206766 - Posted: 8 Dec 2005, 14:56:59 UTC - in response to Message 206684.

Just one minor quibble.

The introduction of the enhanced client will be "phased" in through the restriction of the work sent for the new version. In other words, it will not be a dramatic "switch-over" ...

When things are working as expected, the proportion of new work and old will be changed. One of the driving factors again is server load. By controlling how much work is issued they can control how many people will be downloading the new enhanced application.

(From the mailing lists)


I thought that the W/U's for the SETI_Enhanced app were the same W/U's now being crunched by the existing app, just that the enhanced app did finer analysis of the data? I could be very wrong here, but I thought I read something to that effect. Please correct me if I am wrong. Otherwise, I would like to know how to tell the difference between the W/U's so that I can remove the optimized science app running on my machines at the proper time, not too early, and not too late...

Also, major KUDOS to Tony for his hard work here.

Regards, Daniel.
____________

Landroval
Joined: 7 Oct 01
Posts: 188
Credit: 825,169
RAC: 842
United States
Message 206772 - Posted: 8 Dec 2005, 15:01:43 UTC - in response to Message 206766.

I thought that the W/U's for the SETI_Enhanced app were the same W/U's now being crunched by the existing app, just that the enhanced app did finer analysis of the data? I could be very wrong here, but I thought I read something to that effect. Please correct me if I am wrong. Otherwise, I would like to know how to tell the difference between the W/U's so that I can remove the optimized science app running on my machines at the proper time, not too early, and not too late...

There's a tag at the top of the work unit that tells BOINC which science app should be used to analyze the file. The Enhanced work units are the same data, just a different tag. ASSuming the tag indicates a major version upgrade, the new app should be downloaded automagically.

I'm also running an optimized app. My plan is that when Enhanced goes live, I'll set my work queue to a small value and let it drain, which will take a few days, then let it switch to the new science app when it's ready.

Cheers,
Brian
____________
If you think education is expensive, try ignorance.

tekwyzrd
Volunteer tester
Joined: 21 Nov 01
Posts: 767
Credit: 30,009
RAC: 0
United States
Message 206915 - Posted: 8 Dec 2005, 18:08:19 UTC - in response to Message 206759.
Last modified: 8 Dec 2005, 18:14:51 UTC

The files on Seti enhanced are exactly the same type and size of file we are presently processing.

The extra work is looking deeper into the data that is actually there. As my tutor on spread spectrum techniques described it: imagine looking sideways at your lawn when it's being cut. Normal is like a 3 inch cut; it just cuts off the tips of the longest blades of grass to be analysed. Enhanced is like a 2 inch cut, where it not only cuts the longest blades but many of the shorter blades as well for analysis.

[edit] sorry - spread spectrum techniques are used in communications, where you spread the power of the transmitter over a wide spectrum so that many channels can be used; doing this usually means that the received signal is buried in the noise.


I'll try once more to explain a few problems I see as likely to be encountered by switching to the enhanced app and the longer work unit run times.

I like your analogy so I'll use it to explain one of my concerns.
The current science app is like the 3 inch cut. The results returned are like descriptions of the portion of the blades cut (height, width, position). By switching to the enhanced app, or 2 inch cut, more blades are cut (more signals analyzed) and more data is included in the returned data. This means more points of comparison to reach a quorum and more work for the validators. It's a decrease in the number of results to compare, but an increase in the work per comparison.

As for the three month deadline, it will increase the length of time that information is stored on the drives at Berkeley, further burdening an already overloaded file system. Files pertaining to a particular work unit will be held at least three months, possibly much longer if a quorum isn't reached at the end of the three months. This could mean a delay of three months, six months, or longer before credit is granted for some units.

Then there's the factor of the widely varied capabilities of computers. With my computer's current configuration, it's processing work units using setiathome_SSE-naparst-r3.4 at a rate of 2 units every 4.5 hours. There are many newer computers processing work units in one or two hours. If a work unit is assigned to my computer and three faster computers, the probability of a quorum being reached before my computer completes the unit is much greater with the enhanced SETI app than with the current version. Some may not see how this is possible, but if you consider that the enhanced version will take up to 45 hours to process a work unit on my computer (90 hours if I have to use an unoptimized app) and 10 to 20 hours on newer computers, on many work units a quorum will be reached a full day or more before my computer finishes the work. The time spent processing these units would be wasted. It happens occasionally now, but not very often due to the practice of caching work. With the current app I often see work reported by faster computers a day or more after my computer.
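
The timing argument can be sketched roughly as follows. This ignores work caches and assumes every host starts crunching at the same moment, which is a simplification. The quorum size is a parameter because the thread does not state it; the host names and the quorum of 3 are hypothetical, while the hours are the ones quoted above.

```python
def late_hosts(finish_hours: dict, quorum: int) -> list:
    """Return the hosts that finish after the quorum has already been reached."""
    ordered = sorted(finish_hours.values())
    if quorum < 1 or quorum > len(ordered):
        return []  # a quorum can never be reached with this set of hosts
    quorum_time = ordered[quorum - 1]  # finish time of the quorum-th result
    return [host for host, hours in finish_hours.items() if hours > quorum_time]


# One slower machine (45 h) and three newer ones (10 to 20 h).
hosts = {"slow": 45.0, "fast_a": 10.0, "fast_b": 15.0, "fast_c": 20.0}
print(late_hosts(hosts, quorum=3))
```

With a quorum of 3 the slow host is the only one reported as late, finishing 25 hours after the third result arrives. With a quorum of 4 nobody is late, since the slow host's own result completes the quorum.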

my $0.02

edited to fix typos.
____________
Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws.
Douglas Adams (1952 - 2001)

The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1894
Credit: 2,621,868
RAC: 282
Australia
Message 207016 - Posted: 8 Dec 2005, 19:47:19 UTC - in response to Message 206915.

Why would your computer time be wasted? You'll still get credit.

Live long and crunch.

____________
Paul
(S@H1 8888)
And proud of it!

tekwyzrd
Volunteer tester
Joined: 21 Nov 01
Posts: 767
Credit: 30,009
RAC: 0
United States
Message 207080 - Posted: 8 Dec 2005, 20:31:59 UTC - in response to Message 207016.

Why would your computer time be wasted? You'll still get credit.

Live long and crunch.


It's not really a matter of credit. It's a matter of making a useful contribution to the project. In my opinion, if my computer spends time processing a unit for which a quorum has already been reached, the time spent on that unit is wasted. Don't misunderstand me. I am all for the enhanced version and its improved capabilities, but there has to be a way to implement it and reduce wasted resources.

I see a strong possibility that the changeover to the enhanced app as planned could lead to a decision to set minimum processor requirements and eliminate many computers from the project.

____________
Nothing travels faster than the speed of light with the possible exception of bad news, which obeys its own special laws.
Douglas Adams (1952 - 2001)

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 207118 - Posted: 8 Dec 2005, 20:50:57 UTC - in response to Message 207080.

I see a strong possibility that the changeover to the enhanced app as planned could lead to a decision to set minimum processor requirements and eliminate many computers from the project.

Why would you see that?

____________

Copyright © 2014 University of California