Who analyzes the results?

Message boards : Number crunching : Who analyzes the results?


Health Networks

Joined: 7 Mar 12
Posts: 16
Credit: 195,739
RAC: 0
United States
Message 1203786 - Posted: 8 Mar 2012, 18:05:01 UTC

Years ago I was involved with the team at Scripps that was running "FightAIDS@Home". Since then it has branched off into something else, and they've made some discoveries. I was super excited to be involved as a volunteer, and my expertise in building online communities was something they were looking for: a way to make the experience more fun for users and get them involved (stats, competition, etc.).

Unfortunately, when I finally showed up at their offices, what I had been afraid of in the first place became all too clear. All the data being crunched by the unsuspecting participants at home was sitting in a growing pile of paper printouts. Some piles were literally 5 feet high. The lead programmer at the time told me point blank: "We have the data. We just have no way to analyze it. There's too much."

As a result, I lost a lot of hope in the project and did not continue with them. I have always wondered: wouldn't it take another 300,000 people with computers to analyze the data that the first 300,000 "crunch"?

I realize SETI may be unique in that we are doing the analyzing on the first pass through, and maybe automated alerts are put in place to trigger "finds". So crunching and analyzing are synonymous in this case. But with other projects, especially those involving health issues (protein folding, etc.), I can't imagine the manpower needed to take all that data and do something with it!

tullio
Project Donor
Volunteer tester

Joined: 9 Apr 04
Posts: 5714
Credit: 972,831
RAC: 2,858
Italy
Message 1203809 - Posted: 8 Mar 2012, 18:58:15 UTC - in response to Message 1203786.  
Last modified: 8 Mar 2012, 19:11:32 UTC

There is an article on text mining in the March 8 issue of "Nature". It is very interesting, especially for pharmaceutical firms looking for genome sequences in published texts. I think it could be applied to any printed text.
Tullio
Nature


JLConawayII

Joined: 2 Apr 02
Posts: 188
Credit: 2,834,354
RAC: 0
United States
Message 1203873 - Posted: 8 Mar 2012, 21:17:47 UTC
Last modified: 8 Mar 2012, 21:23:32 UTC

In most cases, analyzing the results takes nowhere near the same processing power as creating them. That's the whole point of doing it this way: the projects don't have the supercomputing horsepower on their end to perform this level of calculation themselves. If you've designed your project so that you need as much or more compute power to analyze the results as it took to create them, you've done something horribly wrong.

Take GPUGrid, for example. The thousands of WUs that are sent out to end users each calculate a small part of a molecular dynamics model. This takes enormous horsepower because every atomic interaction must be calculated. However, once this gets sent back it gets put together into a completed model that is (relatively) simple for the team to analyze. The WUs just have to be constructed so that their system knows where to put them in the model when they get sent back. This is a pretty oversimplified description of the process, but I trust the point is made.
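
The "knows where to put them" part can be sketched roughly as below. This is only an illustration under my own assumptions, not GPUGrid's actual code: the names `WorkUnitResult`, `segment_index`, `frames`, and `assemble` are all hypothetical, and a real molecular dynamics result is far more than a short list of numbers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class WorkUnitResult:
    segment_index: int   # hypothetical: where this piece belongs in the full model
    frames: List[float]  # hypothetical payload, e.g. one value per simulated step


def assemble(results: List[WorkUnitResult], total_segments: int) -> Optional[List[float]]:
    """Stitch returned pieces together in segment order.

    Returns None while any segment is still missing.
    """
    by_index: Dict[int, WorkUnitResult] = {}
    for r in results:
        by_index[r.segment_index] = r  # arrival order does not matter
    if any(i not in by_index for i in range(total_segments)):
        return None
    model: List[float] = []
    for i in range(total_segments):
        model.extend(by_index[i].frames)
    return model


# Results come back out of order from different volunteers.
returned = [
    WorkUnitResult(2, [0.5, 0.6]),
    WorkUnitResult(0, [0.1, 0.2]),
    WorkUnitResult(1, [0.3, 0.4]),
]
print(assemble(returned, total_segments=3))
```

The expensive part is producing each `frames` list on a volunteer's machine. Putting the pieces back in order is a cheap lookup by index, which is why the server side needs so much less horsepower.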


Health Networks

Joined: 7 Mar 12
Posts: 16
Credit: 195,739
RAC: 0
United States
Message 1203908 - Posted: 8 Mar 2012, 23:19:58 UTC - in response to Message 1203873.  

Okay.

So who is analyzing the numbers for SETI?

Is there a room full of hundreds of people?

Kafo

Joined: 17 Dec 00
Posts: 17
Credit: 5,907,625
RAC: 3,222
Canada
Message 1203942 - Posted: 9 Mar 2012, 0:53:25 UTC

SETI@home has outsourced the analysis to an extraterrestrial conglomerate located off-planet.

B-Man
Volunteer tester

Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1203952 - Posted: 9 Mar 2012, 1:07:23 UTC
Last modified: 9 Mar 2012, 1:18:36 UTC

Discussion of NTPCkr, the analysis program (the thread starts back in 2008):
http://setiathome.berkeley.edu/forum_thread.php?id=46636#743622

The project had constructed a server (Synergy, I believe) to run the data search algorithm on the returned data, but ran into several problems:
1) Several other servers died, and the analysis server took over the functions of the dead or dying servers.
2) The DB pounding slowed down the main project so much that both could not run at the same time, so for a time the project instituted 3-day analysis breaks each week. No one was happy.
3) The early version of the software returned many false positives, so, last I heard, the programs need to be rewritten. The problem is that the staff are too short on time, with day-to-day problems to fix, to do this. Also, to test the code, the project may need to shut down other parts of the project to support the new NTPCkr program's DB access.

Solutions in progress and proposed:
1) Two new servers came in today that, at the end of the day, will speed up several parts of the backend systems so that NTPCkr can run at the same time as primary data processing.
2) The Synergy server should be freed up to return to NTPCkr duty when the new servers come online.
3) A new RAID box is coming in with the new servers to give faster science DB access, allowing the Validators for primary data processing and NTPCkr to use it simultaneously.
Link to the staff post on the new RAID and servers: http://setiathome.berkeley.edu/forum_thread.php?id=67190#1203945
I know I have seen other discussions of proposed fixes, but I am not finding them right now. They may have been at the end of the original NTPCkr thread.


B-Man
Volunteer tester

Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1214049 - Posted: 4 Apr 2012, 20:57:37 UTC

Matt just posted in the tech forum that the new builds of the analysis programs are building on the test machines again, and some of the features are starting to work. The project does post updates on the backend stuff; this update is in the April 3rd, 2012 post.


Slavac
Volunteer tester

Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1214050 - Posted: 4 Apr 2012, 21:00:43 UTC - in response to Message 1214049.  

The current plan is that GeorgeM will take on most, if not all, of the analytical work once it's properly configured. Prior to this, Synergy was doing the lion's share, but it has been retasked into other roles.

BTW, GeorgeM is a Synergy clone with 128 GB of RAM vs. Synergy's 96 and has updated CPUs. It's quite a beast.




Executive Director GPU Users Group Inc. -
brad@gpuug.org

Michel448a
Volunteer tester

Joined: 27 Oct 00
Posts: 1201
Credit: 2,891,635
RAC: 0
Canada
Message 1214064 - Posted: 4 Apr 2012, 22:21:09 UTC
Last modified: 4 Apr 2012, 22:22:02 UTC

But... couldn't we have an analyzer like that to analyze my own work before I send it to SETI, in the same way GeorgeM would do it? ^^


ignorance is no excuse

Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 1214116 - Posted: 5 Apr 2012, 1:56:51 UTC - in response to Message 1214064.  

You don't think some sneaky coding wonk would use your idea to intentionally add data to the file and quickly make a mockery of the process?

Not that someone couldn't do that now. However, the work being done also needs corroborating results. Examining a result before making sure it is actually good is putting the cart before the horse.
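
To illustrate what "corroborating results" means, here is a minimal sketch, assuming the same work is sent to several hosts and a result only counts once enough of them agree. It is not SETI@home's actual validator: the function name `validate`, the `quorum` and `tolerance` parameters, and the host names are all made up for the example.

```python
from typing import Dict, List, Optional


def validate(results: Dict[str, List[float]], quorum: int = 2,
             tolerance: float = 1e-6) -> Optional[List[float]]:
    """Return a result that at least `quorum` hosts agree on, else None."""
    hosts = list(results)
    for candidate_host in hosts:
        candidate = results[candidate_host]
        agreeing = 0
        for other_host in hosts:
            other = results[other_host]
            if len(other) == len(candidate) and all(
                abs(a - b) <= tolerance for a, b in zip(candidate, other)
            ):
                agreeing += 1  # a host always agrees with itself
        if agreeing >= quorum:
            return candidate
    return None


# Two honest hosts agree; the third has tampered with its numbers.
reports = {
    "host_a": [1.0, 2.0],
    "host_b": [1.0, 2.0],
    "host_c": [9.9, 2.0],
}
print(validate(reports))
```

A single doctored result has nothing to corroborate it, so it never reaches the analysis stage. That check can only happen on the server, after the copies from different hosts have come back.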


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school




 
©2016 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.