Message boards :
Number crunching :
Who analyzes the results?
Health Networks | Joined: 7 Mar 12 | Posts: 16 | Credit: 195,739 | RAC: 0
Years ago I was involved with the team at Scripps that ran "FightAIDS@Home". Since then it has branched off into something else, and they've made some discoveries. I was super excited to be involved as a volunteer, and my expertise in building online communities was something they were looking for: a way to make the experience more fun for users and get them involved (stats, competition, etc.). Unfortunately, when I finally showed up at their offices, it became all too clear that what I had feared in the first place was true. All the data being crunched by the unsuspecting participants at home was sitting in a growing pile of paper printouts. Some piles were literally 5 feet high. The lead programmer at the time told me point blank: "We have the data. We just have no way to analyze it. There's too much." As a result I lost a lot of hope in the project and did not continue with them.

I have always wondered: wouldn't it take another 300,000 people with computers to analyze the data that the first 300,000 "crunch"? I realize SETI may be unique in that we are doing the analyzing on the first pass through, and maybe automated alerts are in place to trigger "finds", so crunching and analyzing are effectively the same step in this case. But with other projects, especially those involving health issues, protein folding, etc., I can't imagine the manpower needed to take all that data and do something with it!
tullio | Joined: 9 Apr 04 | Posts: 8797 | Credit: 2,930,782 | RAC: 1
There is an article in "Nature" magazine of March 8 on text mining. Very interesting, especially for pharmaceutical firms looking for genome sequences in published texts. I think it could be applied to any printed text. Tullio
JLConawayII | Joined: 2 Apr 02 | Posts: 188 | Credit: 2,840,460 | RAC: 0
In most cases, analyzing the results takes nowhere near the same processing power as creating them. That's the whole point of doing it this way: the projects don't have the supercomputing horsepower on their end to accomplish this level of calculation themselves. If you've designed your project such that you actually need as much or more compute power to analyze the results as it takes to create them, you've done something horribly wrong. Take GPUGrid, for example. The thousands of WUs that are sent out to end users each calculate a small part of a molecular dynamics model. This takes enormous horsepower because every atomic interaction must be calculated. However, once a WU gets sent back, it is put together into a completed model that is (relatively) simple for the team to analyze. The WUs just have to be constructed so that the system knows where each one fits in the model when it comes back. This is a pretty oversimplified description of the process, but I trust the point is made.
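The pattern described above, expensive independent work units that carry an index so the server can cheaply reassemble them, can be sketched roughly as follows. This is a toy illustration with made-up names (`wu_id`, `crunch`, `assemble`), not GPUGrid's or BOINC's actual work-unit format:

```python
# Toy sketch of the split-compute / cheap-reassembly pattern.
# Each work unit (WU) carries an index (wu_id) telling the server
# where its partial result belongs in the assembled model.

def make_workunits(n_steps, chunk):
    """Split a long computation into small, independent WUs."""
    return [{"wu_id": i, "start": s, "end": min(s + chunk, n_steps)}
            for i, s in enumerate(range(0, n_steps, chunk))]

def crunch(wu):
    """Stand-in for the expensive volunteer-side computation."""
    # A real client would do heavy numerical work here.
    return {"wu_id": wu["wu_id"],
            "partial": list(range(wu["start"], wu["end"]))}

def assemble(results):
    """Server side: cheap reassembly, keyed by wu_id."""
    model = []
    for r in sorted(results, key=lambda r: r["wu_id"]):
        model.extend(r["partial"])
    return model

wus = make_workunits(10, 3)
results = [crunch(w) for w in wus]   # done by volunteers, in any order
model = assemble(results)            # cheap server-side step
assert model == list(range(10))
```

The asymmetry is the point: `crunch` is where all the horsepower goes and is trivially parallel, while `assemble` is a linear pass over the returned pieces.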
Health Networks | Joined: 7 Mar 12 | Posts: 16 | Credit: 195,739 | RAC: 0
Okay. So who is analyzing the numbers for SETI? Is there a room full of hundreds of people?
Kafo | Joined: 17 Dec 00 | Posts: 19 | Credit: 15,432,367 | RAC: 18
Seti@home has outsourced the analysis to an extra-terrestrial conglomerate located off planet.
B-Man | Joined: 11 Feb 01 | Posts: 253 | Credit: 147,366 | RAC: 0
NTPCkr is the analysis program; the discussion thread starts back in 2008: http://setiathome.berkeley.edu/forum_thread.php?id=46636#743622

The project had built a server (Synergy, I believe) to run the data-search algorithm on the returned data, but ran into several problems:

1) Several other servers died, and the analysis server took over the functions of the dead or dying machines.
2) The database pounding slowed down the main project so much that both could not run at the same time, so for a while the project instituted three-day analysis breaks each week; no one was happy.
3) The early version of the software returned many false positives, so the programs need to be rewritten, the last I heard. The problem is that the staff are too busy fixing day-to-day problems to do this. Also, to test the code, the project may need to shut down other parts of the project to support database access for the new NTPCkr program.

Solutions in progress and proposed:

1) Two new servers came in today that will, when everything is in place, speed up several parts of the backend systems to allow NTPCkr to run at the same time as primary data processing.
2) The Synergy server should be freed up to return to NTPCkr duty when the new servers come online.
3) A new RAID box is coming in with the new servers to allow faster science-database access, so the validators of primary data processing and NTPCkr can use it simultaneously.

Link to the staff post on the new RAID and servers: http://setiathome.berkeley.edu/forum_thread.php?id=67190#1203945

I know I have seen other discussions of proposed fixes but am not finding them right now. They may have been in the original NTPCkr thread, at the end.
B-Man | Joined: 11 Feb 01 | Posts: 253 | Credit: 147,366 | RAC: 0
Matt just posted in the tech forum that new builds of the analysis programs are again building on test machines and some of the features are starting to work. The project does post updates on the backend work; the update in question is the April 3rd, 2012 post.
Slavac | Joined: 27 Apr 11 | Posts: 1932 | Credit: 17,952,639 | RAC: 0
The current plan is that GeorgeM will take on most, if not all, of the analytical work once it's properly configured. Prior to this, Synergy was doing the lion's share, but it has been retasked into other roles. BTW, GeorgeM is a Synergy clone with 128 GB of RAM vs. Synergy's 96, and has updated CPUs. It's quite a beast.

Executive Director, GPU Users Group Inc. - brad@gpuug.org
skildude | Joined: 4 Oct 00 | Posts: 9541 | Credit: 50,759,529 | RAC: 60
You don't think some sneaky coding wonk would use your idea to intentionally add data to a file and quickly make a mockery of the process? Not that someone couldn't do that now. However, the work being done also needs corroborating results. Examining a result before making sure yours is actually good is putting the cart before the horse.

In a rich man's house there is no place to spit but his face. Diogenes of Sinope
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.