Near Time Persistency Checker (NTPCKR)


Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 691176 - Posted: 13 Dec 2007, 19:32:45 UTC
Last modified: 14 Dec 2007, 0:59:53 UTC

Matt's Tech News has had several items about the Near Time Persistency Checker (NTPCKR) and progress towards analysis. Here are some excerpts.
Broon's Bane (Dec 11 2007)


As we continue pushing forward on the analysis code, we needed to build another index on the master science database (thumper). This takes many hours, during which the table in question is locked and therefore the parts of the back end that require science database access have to be shut down, which is why we time such events with the regular outages.



Prrifth (Dec 06 2007)


Early tech news report today as we're going to have a power outage in about 4-5 hours. Yep. Everything is coming down. No web sites and no data servers until we power up Friday morning. That said, there's not much to report. Still waiting on final pieces to fall into place before I start sending out the mass donation e-mail. Slow steady progress on increasing space for workunit storage. Doing some actual programming again (mostly ramping up on Jeff/Eric's work on the nitpicker and data recorder code to deal with the radar blanking signal). Nothing terribly exciting - more of the same. Yeah... hopefully this will be the last lab-wide power outage to deal with those long-standing breaker problems.



The Three Leaves Insect (Dec 05 2007)


This morning had a nitpicker (near time persistency checker) design review. Maybe we'll post the (rather cryptic) minutes somewhere soon. I did update the plans page - it's really hard for us to keep all these informative pages in sync and up to date. I do have a public SETI wiki ready to go but we're too busy to get it started (import the current pages, etc.). Usual manpower problems around here.


Gnat Attack (Dec 04 2007)


Ah... here's the catch - the definition of "promising." No single result is promising - it has to be matched with all the signals from previous results to determine if there is any interesting persistency. As of right now this process happens every 3-4 years. Redundancy/validation is much faster.

This is where the Near Time Persistency Checker comes in (still in development - news to come on that at some point), which will continually be checking persistency on incoming results. But the key here is "near time" - i.e. not right away. It may be a couple weeks behind on checking your result, maybe less, maybe more - so no significant gain if any on the current 2/2 scheme.
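
As a rough illustration of what "checking persistency" could mean in practice, here is a minimal Python sketch. It is not the actual NTPCKR algorithm, which these excerpts do not describe. The tuple layout (sky_cell, freq_bin, observation_id), the function name persistent_candidates, and the min_observations threshold are all assumptions made up for the example.

```python
from collections import defaultdict


def persistent_candidates(detections, min_observations=2):
    """Return {(sky_cell, freq_bin): count} for spots seen in enough separate observations.

    detections: iterable of (sky_cell, freq_bin, observation_id) tuples.
    All three fields are hypothetical stand-ins for real signal metadata.
    """
    observations_by_spot = defaultdict(set)
    for sky_cell, freq_bin, observation_id in detections:
        # A set ignores repeat detections within the same observation.
        observations_by_spot[(sky_cell, freq_bin)].add(observation_id)
    return {
        spot: len(observation_ids)
        for spot, observation_ids in observations_by_spot.items()
        if len(observation_ids) >= min_observations
    }


if __name__ == "__main__":
    sample = [
        ("cell-A", 10, "obs-1"),
        ("cell-A", 10, "obs-2"),
        ("cell-A", 10, "obs-2"),  # duplicate within one observation
        ("cell-B", 5, "obs-1"),
    ]
    print(persistent_candidates(sample))
```

The point of the sketch is only that a single detection never qualifies on its own; a spot becomes interesting when it shows up again in a different observation.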



Dismantling the Berlin Waltz (Nov 28 2007)


Regarding the near time persistency compute server on your xmas list, why does it have to be such a major platform? I don't recall what needs to be done. Is there a current link on this program?


This machine will be continually analyzing and randomly scanning a database about 1TB in size, which will require some computing heft to say the least. Jeff is working on this program - a writeup of current progress is coming soon. See the scientific newsletters for older writeups.



There is an indication that a machine is being "cobbled together" (sound familiar?) for the proof of concept. The first growing pains may not be visible for a bit, but work is underway.

What is needed is listed as Item Number 1 under Donate Hardware.

If I go looking at a white box AMD Opteron server with 4 Opteron 880 (2.4 GHz, 2 MB L2) CPUs, 16 GB of RAM (8 x 2 GB), and 3 x 1TB SATA II drives, what is needed could be had for a bit less than $8,000. Yes, it could suffer a bit from disk I/O.

Here is the link to the configurator SDR-AS1040C-TB
____________
Please consider a Donation to the Seti Project.

Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 691269 - Posted: 14 Dec 2007, 2:13:57 UTC - in response to Message 691176.
Last modified: 14 Dec 2007, 2:35:22 UTC


What is needed is listed as Item Number 1 under Donate Hardware.

If I go looking at a white box AMD Opteron server with 4 Opteron 880 (2.4 GHz, 2 MB L2) CPUs, 16 GB of RAM (8 x 2 GB), and 3 x 1TB SATA II drives, what is needed could be had for a bit less than $8,000. Yes, it could suffer a bit from disk I/O.

Here is the link to the configurator SDR-AS1040C-TB


Several concerns:

1) The "wish list" indicates 32-64GB memory. What you selected only has 16GB. Also max capacity is 32GB for that system. Expansion to 64GB would only be possible if larger memory modules come about and they are supported. Since this is a DDR-1 platform, I tend to doubt we'll see 4GB DDR-1 modules.

2) Why the selection of an older platform?

3) Why the selection of an AMD platform (Intel is showing up better in DB I/O now)?

[edit]
I'm seeing the struggle now when looking at that web site. The Intel 1U DDR2 options only offer 8 memory banks and only offer 2GB for each, thus a max of 16GB right now, and expansion to 32GB when 4GB modules become available...

Question 4) Does the system have to be 1U, or can it be 2U?
[/edit]
____________

Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 691275 - Posted: 14 Dec 2007, 2:39:51 UTC - in response to Message 691269.

Brian

The simple reply is something that is fairly modern, "new", and that might be affordable.

If I go to cutting edge, you are talking $25K, so I am recommending a compromise. Everyone is welcome to see what they can put together... I was in a hurry and did one-stop shopping.

Regards




The Gas Giant
Volunteer tester
Joined: 22 Nov 01
Posts: 1894
Credit: 2,622,049
RAC: 195
Australia
Message 691276 - Posted: 14 Dec 2007, 2:47:56 UTC
Last modified: 14 Dec 2007, 2:48:51 UTC

The acronym NTPCKR looks suspiciously like NITPICKER to me, which may be appropriate in the case of a near time persistency checker....

Live long and BOINC!
____________
Paul
(S@H1 8888)
And proud of it!

Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 691277 - Posted: 14 Dec 2007, 3:00:53 UTC - in response to Message 691276.

The acronym NTPCKR looks suspiciously like NITPICKER to me, which may be appropriate in the case of a near time persistency checker....


It is referred to as such all the time by Matt...

Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 691278 - Posted: 14 Dec 2007, 3:06:26 UTC - in response to Message 691275.
Last modified: 14 Dec 2007, 3:08:11 UTC

Brian

The simple reply is something that is fairly modern, "new", and that might be affordable.

If I go to cutting edge, you are talking $25K, so I am recommending a compromise. Everyone is welcome to see what they can put together... I was in a hurry and did one-stop shopping.


I understand. I don't think you need to go "cutting edge", but I'd be awful concerned about donating something that is not up to the task, then having to turn around and ask for more donations for the "right tool for the job" while fending off complaints about why you didn't get something better to start out with.

That said, you may be able to get by with less if this thing will not put a strain on the public-facing databases / infrastructure, i.e. if a lower powered system does not cause participants to experience "no work" messages or sluggish upload/download/web page performance, etc...

Also, what about my question about 1U vs. 2U, and, in fact, does it have to be a rackmount?

Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 691300 - Posted: 14 Dec 2007, 4:52:06 UTC - in response to Message 691276.

Matt has used both.

NTPCKR breaks down as N T P CKR (Near Time Persistency Checker), so that is the shortest form.


Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 691312 - Posted: 14 Dec 2007, 5:12:13 UTC - in response to Message 691278.

Brian

From my understanding, a segment of the science database will be copied (thus the large drives). So in that respect it should not put an extra load on the backend servers, as it does its own thing. About the only output would be when it finds something of interest and writes data...

A 1U is fine; a 2U could also work.

So that you know, I looked at quite a few server configurations. The 8 cores were my first concern.

For the cost of the one $25K server, three could be constructed (hmmm, 8 cores or 24 cores). So if it took one new machine 3 years to work through the 2 billion results, then three could possibly reduce that to a single year.


Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 691319 - Posted: 14 Dec 2007, 5:48:45 UTC - in response to Message 691312.
Last modified: 14 Dec 2007, 6:29:10 UTC

Brian

From my understanding, a segment of the science database will be copied (thus the large drives). So in that respect it should not put an extra load on the backend servers, as it does its own thing. About the only output would be when it finds something of interest and writes data...

A 1U is fine; a 2U could also work.

So that you know, I looked at quite a few server configurations. The 8 cores were my first concern.

For the cost of the one $25K server, three could be constructed (hmmm, 8 cores or 24 cores). So if it took one new machine 3 years to work through the 2 billion results, then three could possibly reduce that to a single year.


I'm not sure you'd get linear scaling like that. When I worked retail computer sales, I was not on commission. I always hesitated on suggesting the absolute bottom end for someone unless it was absolutely a stretch for them to even be affording that. I generally steered people towards having expansion capabilities rather than rock-bottom pricing...
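
To put rough numbers on the scaling doubt, here is a small Python sketch using Amdahl's law. The 3-year single-machine figure and the three machines come from the post being quoted. The serial_fraction values (the share of the work that cannot be spread across machines, for example time spent waiting on a shared database) are purely hypothetical and only there to show how quickly the "one year" estimate slips.

```python
def amdahl_speedup(machines: int, serial_fraction: float) -> float:
    """Speedup predicted by Amdahl's law for a given non-parallel share of the work."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / machines)


def years_to_finish(single_machine_years: float, machines: int, serial_fraction: float) -> float:
    """Wall-clock years for the whole job when spread over several machines."""
    return single_machine_years / amdahl_speedup(machines, serial_fraction)


if __name__ == "__main__":
    # serial_fraction values are illustrative assumptions, not measurements.
    for serial_fraction in (0.0, 0.2, 0.5):
        years = years_to_finish(3.0, 3, serial_fraction)
        print(f"serial fraction {serial_fraction:.1f}: about {years:.2f} years")
```

With no shared bottleneck the three machines do finish in one year; with any meaningful shared share of the work, they take noticeably longer.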

Anyway, if you want to stick with 1U and an AMD platform, then you might want to take a look at this system from Servers Direct instead. It will bump you up to the 82xx series and into DDR2 memory for essentially the same pricing. I chose 4 of the 8216 processors and filled it with 32GB memory. The DDR1 memory is more expensive than DDR2, that's what makes it the same price.

Oh, and yes, I'm aware that clock-for-clock the 880 (Socket 940) system will benchmark faster than the 8216 (Socket F). I think that difference can be offset by having double the amount of memory though. Also, you could bump up to the 8218 processors for about $800 more...

I haven't been able to find any "better" Intel 1U systems with the larger memory support. Another real "gotcha" is storage support. Many disks seem to be 320GB or less, even if being offered in SATA...

Brian Silvers
Joined: 11 Jun 99
Posts: 1681
Credit: 492,052
RAC: 0
United States
Message 691338 - Posted: 14 Dec 2007, 7:02:00 UTC

One other thing I forgot to mention is that the "under $8000" price seems to only have the default warranty, which is "1-year depot limited". That means that if the thing breaks, UCB will have to mail it / cart it off somewhere. This may not be a good choice, but if it is ok with UCB, then that's fine. The only other config that comes out to under 8K is if you do not include the management card and take the first 3-year on-site warranty. The problem with that is the warranty is set for specific pricing levels, of which the system would be in the "enterprise" range... Best price without the management card and no heatsinks on the 3rd and 4th processors with this warranty is $8279.69.
With heatsinks, the management card, and the 1-year warranty, but only 16GB memory: $7717.47

The quad-8216 Socket F platform can be had with heatsinks, management card, 32GB memory, and the 3-year warranty for $8251.53. Drop out the 3-year warranty on it and it goes down to $7589.03.

IMO, with what they're talking about doing, throwing more memory at it is the best thing that can be done, even if the processor is a bit weaker.

PhonAcq
Joined: 14 Apr 01
Posts: 1622
Credit: 21,959,701
RAC: 3,777
United States
Message 691379 - Posted: 14 Dec 2007, 13:49:30 UTC
Last modified: 14 Dec 2007, 13:50:04 UTC

1) Does the nitpicker algorithm exist, and has it been tested on existing equipment?
2) Would progress be made if they ran it on an existing server, even though it would be slow (not-so-near term picker?)?

Or are they saying that the hardware has to exist before any data is analyzed?

RandyC
Joined: 20 Oct 99
Posts: 714
Credit: 1,704,345
RAC: 0
United States
Message 691427 - Posted: 14 Dec 2007, 16:57:13 UTC

Just a thought...

Has any consideration been given to the feasibility of creating a BOINC project to do the analysis targeted for the NTPCKR???

I have no idea what the algorithm for doing the analysis would be, but if the data could be separated into specific locations in the sky, it should be possible to distribute that to users for comparison, analysis and feedback.

Just a thought...

popandbob
Volunteer tester
Joined: 19 Mar 05
Posts: 535
Credit: 1,896,421
RAC: 0
Canada
Message 691463 - Posted: 14 Dec 2007, 19:28:07 UTC - in response to Message 691427.

Just a thought...

Has any consideration been given to the feasibility of creating a BOINC project to do the analysis targeted for the NTPCKR???

I have no idea what the algorithm for doing the analysis would be, but if the data could be separated into specific locations in the sky, it should be possible to distribute that to users for comparison, analysis and feedback.

Just a thought...


The only problem is the sheer amount of data that would need to be downloaded by clients... A 1TB database split into 64 blocks (as I recall it was 64) means a 15.625GB download each...
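
The arithmetic behind that figure, as a quick Python sketch. The 1TB size and the 64 blocks are from the thread (with 1TB taken as 1000GB). The 1.5 Mbit/s link speed is an assumed example of a home connection, not a number from the project.

```python
DATABASE_SIZE_GB = 1000.0  # 1 TB, taken as 1000 GB (decimal units)
BLOCK_COUNT = 64           # block count as recalled in the post


def block_size_gb(total_gb: float = DATABASE_SIZE_GB, blocks: int = BLOCK_COUNT) -> float:
    """Size of one block if the database is split evenly."""
    return total_gb / blocks


def download_hours(size_gb: float, link_mbit_per_s: float) -> float:
    """Hours to download size_gb at a sustained link speed, ignoring overhead."""
    bits = size_gb * 1e9 * 8
    return bits / (link_mbit_per_s * 1e6) / 3600.0


if __name__ == "__main__":
    size = block_size_gb()
    print(f"per-block download: {size} GB")
    # 1.5 Mbit/s is a hypothetical home connection speed.
    print(f"time at 1.5 Mbit/s: about {download_hours(size, 1.5):.1f} hours")
```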

~BoB
____________


Do you Good Search for Seti@Home? http://www.goodsearch.com/?charityid=888957
Or Good Shop? http://www.goodshop.com/?charityid=888957

Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 691469 - Posted: 14 Dec 2007, 19:58:44 UTC - in response to Message 691379.

Phonacq

The original post shows that work on the software (as stated by Matt) has continued to progress. I am also aware of hardware being "cobbled" together to start the first live run. This is to find any problems with the software and confirm the proof of concept.

The second part is that, as the servers are potentially upgraded, NTPCKR would be run as the results return. The catch is whether to load the existing system further, or to start proving that NTPCKR works and hope for either a nice box that someone has sitting around (donated) or to get the users behind it.

Regards


Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 691471 - Posted: 14 Dec 2007, 20:00:01 UTC - in response to Message 691338.

Brian

Thank You

I will make sure this information gets to Eric and Matt.

Regards


Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 691474 - Posted: 14 Dec 2007, 20:31:37 UTC - in response to Message 691427.

RandyC

At one point it was discussed that a part of NTPCKR "could be" run as a Seti app or part of a Seti app. Whether that proves out, or is even feasible, has to wait until we see the start here shortly.



Eric Korpela
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1085
Credit: 8,290,496
RAC: 7,545
United States
Message 692649 - Posted: 18 Dec 2007, 22:59:52 UTC - in response to Message 691427.


Has any consideration been given to the feasibility of creating a BOINC project to do the analysis targeted for the NTPCKR???


I've thought quite a bit about doing just that: distributing each user all of the signals in an area of sky a few beam-widths across and having a distributed nitpicker. There are a few problems with this approach that we'll need to resolve before we could do this...

1. We're fairly certain that the speed of the nitpicker will be dominated by how fast signals can be pulled out of the database. A typical beamwidth on the sky will have a few hundred signals, so an area a few beamwidths across will have a few thousand signals. It'll take a few seconds to pull those out of the database, and then a few hundred milliseconds to run the nitpicker on them, and a few seconds to insert the results back into the database. In order to overcome this bottleneck we would need to devise a way of only sending out the signals that have arrived since the last time the host that handles that sky area had checked in.

2. The nitpicker might need to know things about the RFI environment at the time of the observation. We would need to distribute that info with the signals.

3. The nitpicker and RFI rejection are conceived of as an iterative process... i.e. we run the nitpicker, then run RFI rejection on the best candidates and then send them back through the nitpicker. That scheme would be more difficult to distribute without increasing the server-client communications or implementing RFI rejection in the distributed nitpicker.

4. It doesn't take any less effort to develop a distributed nitpicker than it does to develop a non-distributed nitpicker. In fact it will take more effort. The main delay in implementing the nitpicker is that, because we're short handed, Jeff is spending too much time on keeping the servers running and not enough time on the nitpicker development. (It's not his fault. It's just the way things are.) Trying to develop a distributed nitpicker might delay the nitpicker development more than concentrating on a non-distributed one would.

So for the time being we'll develop the simpler non-distributed nitpicker. If it turns out that it's not fast enough to keep up or if the server requirements grow too large, then we'll consider putting in the time and effort of developing a distributed version.
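
The database bottleneck from point 1 can be sketched in a few lines of Python. The timings below (3 s to fetch, 0.3 s to compute, 3 s to insert) are only placeholder readings of "a few seconds" and "a few hundred milliseconds", and the Signal fields and function names are hypothetical. The second function illustrates the idea of sending only signals that arrived since a host's last check-in; it is not how the project implements anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signal:
    signal_id: int
    sky_area: int      # hypothetical identifier for a patch of sky
    arrived_at: float  # arrival time in seconds since an arbitrary epoch


def database_share(fetch_s: float, compute_s: float, insert_s: float) -> float:
    """Fraction of one nitpicker pass spent on database reads and writes."""
    return (fetch_s + insert_s) / (fetch_s + compute_s + insert_s)


def new_signals_since(signals: List[Signal], sky_area: int, last_checkin: float) -> List[Signal]:
    """Keep only signals for this sky area that arrived after the last check-in."""
    return [s for s in signals if s.sky_area == sky_area and s.arrived_at > last_checkin]


if __name__ == "__main__":
    # Placeholder timings, assumed for illustration only.
    print(f"database share of a pass: {database_share(3.0, 0.3, 3.0):.0%}")
    backlog = [Signal(1, 7, 100.0), Signal(2, 7, 250.0), Signal(3, 8, 300.0)]
    print(new_signals_since(backlog, sky_area=7, last_checkin=200.0))
```

Under these assumed timings the computation itself is a small slice of each pass, which is why shrinking what has to be fetched matters more than adding processing power.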

Eric

____________

DJStarfox
Joined: 23 May 01
Posts: 1040
Credit: 539,206
RAC: 541
United States
Message 692669 - Posted: 19 Dec 2007, 1:46:32 UTC - in response to Message 692649.

So for the time being we'll develop the simpler non-distributed nitpicker. If it turns out that it's not fast enough to keep up or if the server requirements grow too large, then we'll consider putting in the time and effort of developing a distributed version.

Eric


I agree. I think your biggest bottleneck will actually be database access and I/O. Have you thought about (or have money for) fiber between the database and the NTPCKR server to be? Also, fiber (or a dedicated 1000M link) would be helpful between the primary and secondary databases, for replication performance.

This is probably too soon to look at performance, without the code being written yet! Maybe just keep this in the back of your mind when the prototype is working.

Pappa
Volunteer tester
Joined: 9 Jan 00
Posts: 2562
Credit: 12,301,681
RAC: 0
United States
Message 697138 - Posted: 3 Jan 2008, 23:57:29 UTC

One of the things that is happening, if we read Matt's post to the Tech News, is that they are working on the indexes to the Master Science Database to account for what is needed for NTPCKR. They have found a way to create the indexes without seriously reducing the amount of work available. Progress!




HTH
Volunteer tester
Joined: 8 Jul 00
Posts: 690
Credit: 835,288
RAC: 0
Finland
Message 697306 - Posted: 4 Jan 2008, 13:03:28 UTC

So, NTPCKR could be ready before the end of 2008? Cool! Thanks for the info, Pappa! :)
____________

Manned mission to Mars in 2019 Petition <-- Sign this, please.


Copyright © 2014 University of California