Who analyzes the results?

Message boards : Number crunching : Who analyzes the results?
Health Networks

Joined: 7 Mar 12
Posts: 16
Credit: 195,739
RAC: 0
United States
Message 1203786 - Posted: 8 Mar 2012, 18:05:01 UTC

Years ago I was involved with the team at Scripps that was running "FightAIDS@Home". Since then it has branched off into something else, and they've made some discoveries. I was super excited to be involved as a volunteer, and my expertise in building online communities was something they were looking for: a way to make the experience more fun for users and get them involved (stats, competition, etc.).

Unfortunately, when I finally showed up at their offices, it became all too clear to me what I was afraid of in the first place. All the data being crunched by the unsuspecting participants at home was sitting in a growing pile of paper printouts. Some piles were literally 5 feet high. The lead programmer at the time told me point blank: "We have the data. We just have no way to analyze it. There's too much."

As a result I lost a lot of hope in the project and did not continue with them. I have always wondered: wouldn't it take another 300,000 people with computers to analyze the data that the first 300,000 "crunch"?

I realize SETI may be unique in that we are doing the analyzing on the first pass-through, and maybe automated alerts are put in place to trigger "finds", so crunching and analyzing are effectively the same step in this case. But with other projects, especially those involving health issues, protein folding, etc., I can't imagine the manpower needed to take all that data and do something with it!

ID: 1203786
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1203809 - Posted: 8 Mar 2012, 18:58:15 UTC - in response to Message 1203786.  
Last modified: 8 Mar 2012, 19:11:32 UTC

There is an article in Nature magazine of March 8 on text mining. Very interesting, especially for pharmaceutical firms looking for genome sequences in published texts. I think it could be applied to every printed text.
Tullio
Nature
ID: 1203809
JLConawayII

Joined: 2 Apr 02
Posts: 188
Credit: 2,840,460
RAC: 0
United States
Message 1203873 - Posted: 8 Mar 2012, 21:17:47 UTC
Last modified: 8 Mar 2012, 21:23:32 UTC

In most cases, analyzing the results takes nowhere near the same processing power as creating them. That's the whole point of doing it this way: the projects don't have the supercomputing horsepower on their end to accomplish this level of calculation themselves. If you've designed your project so that analyzing the results actually takes as much or more compute power than creating them, you've done something horribly wrong.

Take GPUGrid, for example. Each of the thousands of WUs sent out to end users calculates a small part of a molecular dynamics model. This takes enormous horsepower because every atomic interaction must be calculated. Once a WU is sent back, however, it gets put together into a completed model that is (relatively) simple for the team to analyze. The WUs just have to be constructed so that the system knows where each one fits in the model when it is returned. This is a pretty oversimplified description of the process, but I trust the point is made.
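The pattern described above can be sketched in a few lines. This is a hypothetical toy, not GPUGrid's actual code: `make_work_units`, `crunch`, and `assemble` are invented names, and the "frames" stand in for real per-atom calculations. The point is only that generating results is the heavy step, while reassembly on the server side is cheap bookkeeping.

```python
# Toy sketch of the split/reassemble pattern: the server cuts a long
# simulation into work units, volunteers "crunch" each unit, and the
# server stitches the returned pieces back into one trajectory.

def make_work_units(total_steps, steps_per_wu):
    """Describe each work unit by its slice of the simulation."""
    return [
        {"wu_id": i, "start": s, "steps": min(steps_per_wu, total_steps - s)}
        for i, s in enumerate(range(0, total_steps, steps_per_wu))
    ]

def crunch(wu):
    """Stand-in for the expensive per-atom work a volunteer's machine does."""
    # Pretend each simulation step produces one frame of the trajectory.
    return {"wu_id": wu["wu_id"],
            "frames": [wu["start"] + k for k in range(wu["steps"])]}

def assemble(results):
    """Cheap server-side step: order returned results and concatenate."""
    trajectory = []
    for r in sorted(results, key=lambda r: r["wu_id"]):
        trajectory.extend(r["frames"])
    return trajectory

wus = make_work_units(total_steps=10, steps_per_wu=4)
# Results may come back in any order; wu_id tells the server where each fits.
results = [crunch(wu) for wu in reversed(wus)]
assert assemble(results) == list(range(10))
```

The key design point is the `wu_id` tag carried by every unit: however late or out of order a result arrives, the server knows exactly where it slots into the model, so analysis never needs to redo the expensive computation.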
ID: 1203873
Health Networks

Joined: 7 Mar 12
Posts: 16
Credit: 195,739
RAC: 0
United States
Message 1203908 - Posted: 8 Mar 2012, 23:19:58 UTC - in response to Message 1203873.  

Okay.

So who is analyzing the numbers for SETI ?

Is there a room full of hundreds of people?
ID: 1203908
Kafo

Joined: 17 Dec 00
Posts: 19
Credit: 15,432,367
RAC: 18
Canada
Message 1203942 - Posted: 9 Mar 2012, 0:53:25 UTC

Seti@home has outsourced the analysis to an extra-terrestrial conglomerate located off planet.
ID: 1203942
B-Man
Volunteer tester

Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1203952 - Posted: 9 Mar 2012, 1:07:23 UTC
Last modified: 9 Mar 2012, 1:18:36 UTC

NTPCkr the analysis program discussion (Thread starts back in 2008)
http://setiathome.berkeley.edu/forum_thread.php?id=46636#743622

The project had constructed a server (Synergy, I believe) to run the data-search algorithm on the returned data, but ran into several problems:
1) Several other servers died, and the analysis server took over the functions of the dead or dying servers.
2) The database pounding slowed down the main project so much that both could not run at the same time, so for a while the project instituted three-day analysis breaks per week; no one was happy.
3) The early version of the software returned many false positives, so the programs need to be rewritten, last I heard. The problem is that the staff are too busy fixing day-to-day problems to do this. Also, to test the code, the project may need to shut down other parts of the project to support database access for the new NTPCkr program.

Solutions in progress and proposed:
1) Two new servers came in today that will ultimately speed up several parts of the backend systems, allowing NTPCkr to run at the same time as primary data processing.
2) The Synergy server should be freed up to return to NTPCkr duty when the new servers come online.
3) A new RAID box is coming in with the new servers to allow faster science-DB access, so the validators of primary data processing and NTPCkr can use the database simultaneously.
Link to the staff post on the new RAID and servers: http://setiathome.berkeley.edu/forum_thread.php?id=67190#1203945
I know I have seen other discussions of proposed fixes but am not finding them right now. They may have been in the original NTPCkr thread at the end.
ID: 1203952
B-Man
Volunteer tester

Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1214049 - Posted: 4 Apr 2012, 20:57:37 UTC

Matt just posted in the tech forum that the new builds of the analysis programs are again building on test machines and some of the features are starting to work. The project does post updates on the backend stuff; the update is the April 3rd, 2012 post.
ID: 1214049
Profile Slavac
Volunteer tester

Joined: 27 Apr 11
Posts: 1932
Credit: 17,952,639
RAC: 0
United States
Message 1214050 - Posted: 4 Apr 2012, 21:00:43 UTC - in response to Message 1214049.  

The current plan is that GeorgeM will take on most if not all of the analytical work once it's properly configured. Prior to this, Synergy was doing the lion's share, but it has been retasked into other roles.

BTW, GeorgeM is a Synergy clone with 128 GB of RAM vs Synergy's 96, and it has updated CPUs. It's quite a beast.


Executive Director GPU Users Group Inc. -
brad@gpuug.org
ID: 1214050
Profile skildude

Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1214116 - Posted: 5 Apr 2012, 1:56:51 UTC - in response to Message 1214064.  

You don't think some sneaky coding wonk would use your idea to intentionally add data to a file and quickly make a mockery of the process?

Not that someone couldn't do that now. However, the work being done also needs to have corroborating results. Examining the result before making sure yours are actually good is putting the cart before the horse.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1214116



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.