WASTING MY TIME?



Message boards : Number crunching : WASTING MY TIME?

Profile Fred J. Verster
Volunteer tester
Send message
Joined: 21 Apr 04
Posts: 3238
Credit: 31,758,000
RAC: 4,382
Netherlands
Message 1306867 - Posted: 16 Nov 2012, 19:04:31 UTC - in response to Message 1306861.

Was everyone told the point is to find ET? Yes.


Using exactly your words...

If the focus is to FIND ET, then why try to keep AP splitting working and make this whole weird situation worse, when it is crystal clear the current configuration can't handle both projects running at full capacity?

Astropulse development was funded in part by an NSF grant. Trying for future NSF grants without continuing AP processing would be silly. Stopping AP temporarily is possible, but the general idea of analyzing all fresh data with both SaH and AP algorithms is necessarily the goal.

In terms of future funding, it might actually be better to skip the SaH processing and do only AP on some of the data.
Joe


I agree with this.
AstroPulse has a better chance of finding 'ET' or something like it.

At least in the same data Einstein has found several Pulsars.

NOT what the majority of the crunchers want to hear, probably, but we do
have to be realistic: funding is absolutely necessary for this project!



____________

Profile HAL9000
Volunteer tester
Avatar
Send message
Joined: 11 Sep 99
Posts: 4167
Credit: 113,999,888
RAC: 142,148
United States
Message 1306888 - Posted: 16 Nov 2012, 20:14:48 UTC - in response to Message 1306861.

Was everyone told the point is to find ET? Yes.


Using exactly your words...

If the focus is to FIND ET, then why try to keep AP splitting working and make this whole weird situation worse, when it is crystal clear the current configuration can't handle both projects running at full capacity?

Astropulse development was funded in part by an NSF grant. Trying for future NSF grants without continuing AP processing would be silly. Stopping AP temporarily is possible, but the general idea of analyzing all fresh data with both SaH and AP algorithms is necessarily the goal.

In terms of future funding, it might actually be better to skip the SaH processing and do only AP on some of the data.
Joe

As I recall there was some talk about changing the software to download AP tasks & then process them for both AP & MB data, as the raw data is the same, just broken up differently. It might have just been an idea someone mentioned here rather than something from the guys in the lab. I don't recall.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Profile Gary CharpentierProject donor
Volunteer tester
Avatar
Send message
Joined: 25 Dec 00
Posts: 12492
Credit: 6,803,025
RAC: 6,085
United States
Message 1306918 - Posted: 16 Nov 2012, 21:43:35 UTC - in response to Message 1306888.

Was everyone told the point is to find ET? Yes.


Using exactly your words...

If the focus is to FIND ET, then why try to keep AP splitting working and make this whole weird situation worse, when it is crystal clear the current configuration can't handle both projects running at full capacity?

Astropulse development was funded in part by an NSF grant. Trying for future NSF grants without continuing AP processing would be silly. Stopping AP temporarily is possible, but the general idea of analyzing all fresh data with both SaH and AP algorithms is necessarily the goal.

In terms of future funding, it might actually be better to skip the SaH processing and do only AP on some of the data.
Joe

As I recall there was some talk about changing the software to download AP tasks & then process them for both AP & MB data, as the raw data is the same, just broken up differently. It might have just been an idea someone mentioned here rather than something from the guys in the lab. I don't recall.

That is the splitters' job. When they load a "tape" it is sent to both the AP and MB splitters, and both kinds of work units are generated.

____________

Profile ivan
Volunteer tester
Avatar
Send message
Joined: 5 Mar 01
Posts: 608
Credit: 137,982,721
RAC: 149,677
United Kingdom
Message 1306931 - Posted: 16 Nov 2012, 22:21:42 UTC - in response to Message 1306918.

As I recall there was some talk about changing the software to download AP tasks & then process them for both AP & MB data, as the raw data is the same, just broken up differently. It might have just been an idea someone mentioned here rather than something from the guys in the lab. I don't recall.

That is the splitters' job. When they load a "tape" it is sent to both the AP and MB splitters, and both kinds of work units are generated.

Which raises the question that (IIRC; I wish this forum would include attributions like modern software such as Usenet :-) HAL9000 alluded to -- could AP data be sent to hosts which could then re-process (locally split) it into simultaneous MB WUs? Note I said "could" -- the logistics of making sure work was sent to hosts capable of processing both data types would probably be prohibitive.
____________

Profile Gary CharpentierProject donor
Volunteer tester
Avatar
Send message
Joined: 25 Dec 00
Posts: 12492
Credit: 6,803,025
RAC: 6,085
United States
Message 1306950 - Posted: 16 Nov 2012, 23:15:53 UTC - in response to Message 1306931.

As I recall there was some talk about changing the software to download AP tasks & then process them for both AP & MB data, as the raw data is the same, just broken up differently. It might have just been an idea someone mentioned here rather than something from the guys in the lab. I don't recall.

That is the splitters' job. When they load a "tape" it is sent to both the AP and MB splitters, and both kinds of work units are generated.

Which raises the question that (IIRC; I wish this forum would include attributions like modern software such as Usenet :-) HAL9000 alluded to -- could AP data be sent to hosts which could then re-process (locally split) it into simultaneous MB WUs? Note I said "could" -- the logistics of making sure work was sent to hosts capable of processing both data types would probably be prohibitive.

Could, yes. The issue is that the raw data is many times larger than the split data, so it would seriously clog the pipe, and the pipe is already clogged.


____________

Josef W. SegurProject donor
Volunteer developer
Volunteer tester
Send message
Joined: 30 Oct 99
Posts: 4242
Credit: 1,047,276
RAC: 293
United States
Message 1306987 - Posted: 17 Nov 2012, 0:40:45 UTC - in response to Message 1306918.


As I recall there was some talk about changing the software to download AP tasks & then process them for both AP & MB data, as the raw data is the same, just broken up differently. It might have just been an idea someone mentioned here rather than something from the guys in the lab. I don't recall.

That is the splitters' job. When they load a "tape" it is sent to both the AP and MB splitters, and both kinds of work units are generated.

There has been some speculation about unburdening the servers by delivering suitably sized chunks of the raw data and having the science application do the splitting. AFAIK that's user speculation only; if the project considered it at all in 2003 when work on Astropulse started, they probably decided almost instantly it was neither needed nor practical.

Now it is at least close to practical for many fairly new hosts. The general idea would be to send 107.374 seconds of raw data from a channel to the host (a 64MiB + overhead super WU) and have the app split it into 256 frequency subbands (MB WUs) plus 8 sequential time sections at full bandwidth (AP WUs). The advantage is that a single ~64MiB download replaces ~162MiB of downloads for work split server-side. Both the raw data and the split data use the same 2 bits to represent a complex data point; the savings come from only sending the data once, plus far less overhead in the WU header information.

That simple scheme only matches current splitting for the first 107.374 seconds of data from a channel, because the MB splitting has an overlap of ~20%. The second group of 256 MB WUs should start at ~85.77 seconds, but the ninth AP WU starts at 107.374 seconds. Compromise parameters could be chosen and applied to both client-side and server-side splitting; IMO that would be better than needing a separate science database for the dissimilar results.

Result reporting plus other BOINC aspects would need a lot of thought, and there would be a lot of new code to implement the change. If we had funded the project well enough that they had a programmer with no high priority work under way, perhaps...
Joe
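The arithmetic behind Joe's figures can be sanity-checked quickly. The 64 MiB / 107.374 s chunk and the 2 bits per complex sample come from the post above; the derived ~2.5 Msamples/s channel rate and the ~20%-overlap accounting below are back-of-envelope assumptions, not official project specifications:

```python
# Back-of-envelope check of the "super WU" bandwidth claim above.
# Figures from the post: 64 MiB of raw data covers 107.374 s of one
# channel at 2 bits per complex sample. Everything derived here is
# illustrative, not an official SETI@home specification.

MIB = 1024 * 1024
raw_chunk_bytes = 64 * MIB
seconds = 107.374
bits_per_sample = 2

samples = raw_chunk_bytes * 8 // bits_per_sample
rate = samples / seconds
print(f"samples per chunk : {samples:,}")
print(f"implied data rate : {rate / 1e6:.2f} Msamples/s per channel")

# Server-side splitting ships the same span twice: overlapping MB WUs
# (~20% overlap inflates the data by roughly 1/0.8) plus the AP WUs.
mb_bytes = raw_chunk_bytes * 1.25
ap_bytes = raw_chunk_bytes
total = mb_bytes + ap_bytes  # per-WU header overhead pushes this toward ~162 MiB
print(f"server-split total: ~{total / MIB:.0f} MiB vs {raw_chunk_bytes // MIB} MiB raw")
```

The raw side of the ledger works out to about 144 MiB versus 64 MiB before header overhead, consistent with the ~162 MiB vs ~64 MiB comparison in the post.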

Profile Blurf
Volunteer tester
Send message
Joined: 2 Sep 06
Posts: 7547
Credit: 6,813,692
RAC: 8,175
United States
Message 1307001 - Posted: 17 Nov 2012, 2:18:39 UTC - in response to Message 1306736.
Last modified: 17 Nov 2012, 2:19:08 UTC

I take your point Gary, we long-term hard-liners are OK, but there are those that want the "feel-good factor" which Matt managed to give them.

I think that a fortnightly "SETI Newsletter" bringing together the latest news could be a useful idea that would give many people an update on what is happening with the project, and hopefully make them feel that they haven't been forgotten. But it would need regular input from people like Richard Haselgrove, the lab guys, and the GPU Users Group on donations raised and kit bought, etc. Could that input be regularly guaranteed?

I'd be willing to volunteer to produce such a Newsletter, provided I got enough regular input to make it worthwhile.


Chris, there's the problem -- you won't get enough input. The lab staff simply won't have the time to give you what you want, and people like Richard can only give so much info.

We really need to find a way to bring in more staff to help with the workload. Our GPU fundraiser group needs to start finding fresh sources of new funds. We have a local billionaire named Thomas Golisano -- I'm thinking of trying to use some connections I have to him.
____________


bill
Send message
Joined: 16 Jun 99
Posts: 861
Credit: 23,380,397
RAC: 25,239
United States
Message 1307003 - Posted: 17 Nov 2012, 3:09:53 UTC - in response to Message 1307001.

$25 million spread out over the next 5 years
would probably be adequate for enough salaried staff.

Profile Tron
Send message
Joined: 16 Aug 09
Posts: 180
Credit: 2,236,055
RAC: 0
United States
Message 1307015 - Posted: 17 Nov 2012, 5:12:39 UTC - in response to Message 1306987.

The general idea would be to send 107.374 seconds of raw data from a channel to the host (a 64MiB + overhead super WU) and have the app split it into 256 frequency subbands (MB WUs) plus 8 sequential time sections at full bandwidth (AP WUs). The advantage is that a single ~64MiB download replaces ~162MiB of downloads for work split server-side. Both the raw data and the split data use the same 2 bits to represent a complex data point; the savings come from only sending the data once, plus far less overhead in the WU header information.



Take that a step further and have the host distribute some of the "locally split" WUs.

Profile Slavac
Volunteer tester
Avatar
Send message
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1307034 - Posted: 17 Nov 2012, 6:32:25 UTC - in response to Message 1306486.

Mike, if I had volunteers contributing to my efforts, I would treat them as a valuable resource and not ignore them and take them for granted. It is simply the courteous thing to do.


While there are a few things the lab could do better (namely keeping the volunteers informed), here's something to keep in mind:

Through our volunteers who donate their time and money to SETI through the GPUUG, here is what we've accomplished to date: http://www.gpuug.org/content/what-we-do.

Before we started, the infrastructure of the lab was in fairly poor condition. For evidence of this, take a look at the first server donated, Synergy, and now look at how many tasks that one machine is running. Add in Paddy and George, and together these 3 servers have taken over the duties of, I believe, 8 now-retired servers.

Since then our donors have upgraded everything from the server closet (Synergy, GeorgeM, PaddyM, a new switch, more RAM, a filled JBOD) to the lab itself (workstations, desktop setups, UPSes) to the basics (120 and counting transport drives plus protective cases). We're even upgrading how SETI collects the data you process, with our compute nodes, Brocade switches, docks and so on. Heck, our donors have even contributed a large fistful of cash.

______________________________

While I like to point to the above and rant and rave about how awesome our donors are and what they've done, the issue remains: if it doesn't show up on Jim Donor's computer in some visible way, it doesn't seem to matter. This issue is frustrating to me; however, it's completely understandable given that the scientific community is largely focused on tangible results.

______________________________

The issue we're facing is a bit understandable. Consider that we're the largest BOINC project currently running. We chew through immense amounts of data thanks to ever-increasing technology. Compare a 560 Ti to a 690, for example, and realize that advancement represents about a year of development.

As a result of the above, coupled with our need to upgrade infrastructure, we run into problems like the ones we've been having.

______________________________

For my end of the chain, we're going to continue to work through our donors to upgrade the project's infrastructure in hopes that we can avoid these issues we're having in the future. One of my primary goals is smoothing out the system while at the same time increasing the amount of data we're processing (yay more science!).

In short, try to be understanding. We have X resources while Y (the amount of data users can process per unit time) is ever increasing. There are several logjams in our way, namely a lack of staff, a lack of proper bandwidth, and necessary infrastructure upgrades. We (GPUUG) are working on fixing all of the above, but we need time, money, and volunteers who want to lend us a hand.

------------------------------

Sorry for the very long-winded response, but I hope it gives a few folks something to think about in light of the issues we've been having.
____________


Executive Director GPU Users Group Inc. -
brad@gpuug.org

Profile Slavac
Volunteer tester
Avatar
Send message
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1307036 - Posted: 17 Nov 2012, 6:36:47 UTC - in response to Message 1307001.

I take your point Gary, we long-term hard-liners are OK, but there are those that want the "feel-good factor" which Matt managed to give them.

I think that a fortnightly "SETI Newsletter" bringing together the latest news could be a useful idea that would give many people an update on what is happening with the project, and hopefully make them feel that they haven't been forgotten. But it would need regular input from people like Richard Haselgrove, the lab guys, and the GPU Users Group on donations raised and kit bought, etc. Could that input be regularly guaranteed?

I'd be willing to volunteer to produce such a Newsletter, provided I got enough regular input to make it worthwhile.


Chris, there's the problem -- you won't get enough input. The lab staff simply won't have the time to give you what you want, and people like Richard can only give so much info.

We really need to find a way to bring in more staff to help with the workload. Our GPU fundraiser group needs to start finding fresh sources of new funds. We have a local billionaire named Thomas Golisano -- I'm thinking of trying to use some connections I have to him.


To be fair, we're essentially a two-man operation. I've begged for volunteers to write grants or letters, make calls, and generally help us however they can, but so far I've gotten next to nothing in the way of volunteers.

If I had two people who could spend maybe 4 hours a week contacting potential donors, I could do some good. One day I'll have them, I hope.
____________


Executive Director GPU Users Group Inc. -
brad@gpuug.org

alan
Avatar
Send message
Joined: 18 Feb 00
Posts: 131
Credit: 401,606
RAC: 0
United Kingdom
Message 1307073 - Posted: 17 Nov 2012, 10:34:21 UTC - in response to Message 1307015.


Josef W. Segur wrote


The general idea would be to send 107.374 seconds of raw data from a channel to the host (a 64MiB + overhead super WU) and have the app split it into 256 frequency subbands (MB WUs) plus 8 sequential time sections at full bandwidth (AP WUs). The advantage is that a single ~64MiB download replaces ~162MiB of downloads for work split server-side. Both the raw data and the split data use the same 2 bits to represent a complex data point; the savings come from only sending the data once, plus far less overhead in the WU header information.


Exactly. By this method you would be distributing more of the processing out to the end-user machines, which is what this project is all about. It should attract Dr. Anderson's attention.

Nobody has addressed validation yet, though.

Tron wrote

Take that a step further and have the host distribute some of the "locally split" WUs.


Why on Earth would you want to redistribute these WUs? If the end-user machine is big and fast enough to take on a whole raw data unit, it's also big and fast enough to process the WUs so generated. Indeed, the main saving is in reducing the bandwidth required at the central servers, by making only one download of the data and one upload of the combined set of results.

The combined results would need to be validated as a unit against the same unit processed by a different end-user. Any disagreement would result in the whole unit being sent out again for processing by a third user, exactly as is done at present.

So you would reduce the load on the splitter processes and reduce the bandwidth requirement.
____________
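The validation flow alan describes (compare the combined result set as a unit; on disagreement, reissue the whole super WU to another host) can be sketched in a few lines. The function name and result encoding here are hypothetical illustrations, not BOINC's actual validator API:

```python
# Toy quorum validator for "super WU" results, modeled on the scheme
# described above. In real BOINC the validator compares structured
# signal lists with tolerances; here a result is just a hashable value.

def validate(results, min_quorum=2):
    """Return the canonical result if min_quorum hosts agree,
    else None, meaning the unit must go out to one more host."""
    for candidate in results:
        if sum(1 for r in results if r == candidate) >= min_quorum:
            return candidate
    return None

# Two hosts agree: the combined result set becomes canonical.
print(validate(["signals:A", "signals:A"]))               # signals:A
# Hosts disagree: the whole unit is reissued, exactly as today.
print(validate(["signals:A", "signals:B"]))               # None
# A third host breaks the tie.
print(validate(["signals:A", "signals:B", "signals:A"]))  # signals:A
```

The key point alan makes survives in the sketch: validation granularity moves from one small WU to one whole super WU, so a single disagreement forces re-crunching the entire 64 MiB unit.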

Profile Gary CharpentierProject donor
Volunteer tester
Avatar
Send message
Joined: 25 Dec 00
Posts: 12492
Credit: 6,803,025
RAC: 6,085
United States
Message 1307100 - Posted: 17 Nov 2012, 14:57:59 UTC

All this splitting talk is forgetting one thing: it still would need to be split. A "tape" isn't just 107+ seconds of data. The "tape" may be several hours of data, and that data includes parameters that presently aren't being sent to us but that the splitters use in splitting the data. Aren't the "tapes" presently 2 TB hard drives? Do you want to wait for that to D/L before you can do the next work unit? Are you willing to have work units that make CPDN work units look small?

____________

Profile Slavac
Volunteer tester
Avatar
Send message
Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1307133 - Posted: 17 Nov 2012, 17:52:33 UTC - in response to Message 1307100.

All this splitting talk is forgetting one thing: it still would need to be split. A "tape" isn't just 107+ seconds of data. The "tape" may be several hours of data, and that data includes parameters that presently aren't being sent to us but that the splitters use in splitting the data. Aren't the "tapes" presently 2 TB hard drives? Do you want to wait for that to D/L before you can do the next work unit? Are you willing to have work units that make CPDN work units look small?


2 TB for most everything (GBT/Arecibo), 3 TB for AP reobs, and a few 1 TB drives tossed into the mix.
____________


Executive Director GPU Users Group Inc. -
brad@gpuug.org

Horacio
Send message
Joined: 14 Jan 00
Posts: 536
Credit: 73,331,351
RAC: 90,542
Argentina
Message 1307136 - Posted: 17 Nov 2012, 17:58:10 UTC - in response to Message 1307100.

All this splitting talk is forgetting one thing: it still would need to be split. A "tape" isn't just 107+ seconds of data. The "tape" may be several hours of data, and that data includes parameters that presently aren't being sent to us but that the splitters use in splitting the data. Aren't the "tapes" presently 2 TB hard drives? Do you want to wait for that to D/L before you can do the next work unit? Are you willing to have work units that make CPDN work units look small?

The idea, which might work or not, is not to avoid the splitting, but to avoid sending the "same" data twice, once for AP and once for MB...

Of course this is not a trivial change in the way things work, but the general idea is to modify the AP WU so that we, on the client side, would be able to extract the MB data and crunch it all with both approaches... if this were possible, it would lessen the bandwidth usage.

I know that not every host/user has a broadband connection, which means the project would need to send standard MBs and APs along with these new WUs, but I think those hosts are not the ones clogging the pipes, and all the "heavy" users would opt for this new kind of work, which would keep them busy for more time with less bandwidth needed and also with less interaction with the scheduler...

In theory it sounds good... In practice I don't see them being able to implement this at the current level of resources and "busy-ness"... But as this is not a short-term project, and if there are no technical reasons against the idea, they could plan it for the long term as the next step after MB v7 or any other already planned step...

And if there is something that makes this definitely not possible, it would be good to know, so we can forget about it and think up another "wonderful" idea to scorch with here! ;D
____________

juan BFBProject donor
Volunteer tester
Avatar
Send message
Joined: 16 Mar 07
Posts: 5277
Credit: 292,612,552
RAC: 471,825
Brazil
Message 1307148 - Posted: 17 Nov 2012, 18:48:30 UTC

I'm tired of trying to make SETI work; now, with or without Proxy+TCP+Hosts, I can't even report the results...

Will sail in other seas for a while, at least until the people from the lab come up with something that really, finally fixes the problem.

And all that on the day I celebrate the 100MM mark; that's not fair...


____________

Andre Howard
Volunteer tester
Avatar
Send message
Joined: 16 May 99
Posts: 119
Credit: 153,420,607
RAC: 73,004
United States
Message 1307155 - Posted: 17 Nov 2012, 19:30:39 UTC - in response to Message 1307148.

I'm tired of trying to make SETI work; now, with or without Proxy+TCP+Hosts, I can't even report the results...

Will sail in other seas for a while, at least until the people from the lab come up with something that really, finally fixes the problem.

And all that on the day I celebrate the 100MM mark; that's not fair...




Same problem here with the proxies in the last hour or so. Congratulations on the 100MM; hope things get straightened out soon so I can get there too.
____________

Profile MusicGod
Avatar
Send message
Joined: 7 Dec 02
Posts: 97
Credit: 24,698,006
RAC: 61
United States
Message 1307157 - Posted: 17 Nov 2012, 19:59:22 UTC

I'm with you Clyde, I started in 2002 and the project just seems to be plagued. I've been thinking for some time about shutting down permanently on my 10th anniversary. I can't babysit my machines, and I'm donating my time and energy (in the form of electricity) to the project while everything seems to be getting worse. I'm not crying, I'm not complaining... I am just saying!!!! With so many bad hosts out there sending bad results, it has caused a lot of problems, in my belief... and then we get stuck with the leftovers. I know these guys put a lot of their own time into the project, and I thank them for that, but it is getting to be too much on this end.
____________

Josef W. SegurProject donor
Volunteer developer
Volunteer tester
Send message
Joined: 30 Oct 99
Posts: 4242
Credit: 1,047,276
RAC: 293
United States
Message 1307197 - Posted: 17 Nov 2012, 22:48:14 UTC - in response to Message 1307133.

All this splitting talk is forgetting one thing: it still would need to be split. A "tape" isn't just 107+ seconds of data. The "tape" may be several hours of data, and that data includes parameters that presently aren't being sent to us but that the splitters use in splitting the data. Aren't the "tapes" presently 2 TB hard drives? Do you want to wait for that to D/L before you can do the next work unit? Are you willing to have work units that make CPDN work units look small?

2TB for most everything (GBT/Arecibo), 3TB for AP reob and a few 1TB's tossed into the mix.

The "tape" files delivered to the splitters are generally 50.2 GB and contain raw data for all channels over a period of about 1.5 hours. Yes, if this pie-in-the-sky idea were ever implemented, a splitter process would have to extract 64MiB sections of raw data for one channel, do any radar blanking needed, and package the data with suitable header information. In effect that would be a minor change from what the ap_splitter processes now do for 8MiB data sections, so I don't see it as a significant hindrance to the idea.
Joe
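Joe's tape figures roughly check out against the numbers earlier in the thread. The channel count below is my assumption (7 Arecibo ALFA beams times 2 polarizations), not something stated in the thread; the sample rate and bit depth follow from the 64 MiB / 107.374 s super WU described above:

```python
# Cross-check of the ~50.2 GB / ~1.5 hour "tape" figures quoted above.
# Assumption (mine, not from the thread): 14 recorder channels, i.e.
# 7 ALFA beams x 2 polarizations, at ~2.5 Msamples/s per channel.

CHANNELS = 14
SAMPLE_RATE = 2.5e6          # complex samples per second per channel
BITS_PER_SAMPLE = 2
SECONDS = 1.5 * 3600

tape_bytes = CHANNELS * SAMPLE_RATE * BITS_PER_SAMPLE * SECONDS / 8
print(f"~{tape_bytes / 1e9:.1f} GB per tape")   # in the ballpark of the quoted 50.2 GB

# How many 64 MiB "super WU" chunks one channel would yield per tape:
chunk = 64 * 1024 ** 2
per_channel = tape_bytes / CHANNELS
print(f"~{per_channel / chunk:.0f} super WUs per channel per tape")
```

The raw-sample arithmetic lands a few GB under the quoted 50.2 GB, which is plausible if the files also carry the header and blanking parameters Gary mentions.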

Profile Gary CharpentierProject donor
Volunteer tester
Avatar
Send message
Joined: 25 Dec 00
Posts: 12492
Credit: 6,803,025
RAC: 6,085
United States
Message 1307211 - Posted: 17 Nov 2012, 23:48:27 UTC - in response to Message 1307169.

I am a little bewildered.

How good do the numbers have to be in order to be deemed a signal?

As far as I know, a gaussian score alone is not a definitive indication of an extraterrestrial presence.

What more is needed?

For the same spot in the sky to show a signal repeatedly. The next look at a given spot may be a couple of years after the last look. Others, principally near objects of interest to radio astronomers, get looked at several times a year.

____________



Copyright © 2014 University of California