Posts by Mr. Kevvy


11) Message boards : Cafe SETI : The joke thread Part 4. (Message 1761422)
Posted 6 days ago by Mr. Kevvy (Project donor)
I have this strange sense of deja vu...
12) Message boards : Number crunching : Proposal/idea for a near-real-time, low-resource, inexpensive NTPCkr (Message 1761389)
Posted 6 days ago by Mr. Kevvy (Project donor)
Too late. Already on the way.


I doubt that is going to be used long-term as it would be prohibitively expensive (I had a gander at Amazon's rates recently). Also, as indicated, even with a run of a few weeks it's anything but "Near Time", i.e. being able to click on your recently completed work after a few minutes to see how it matches up with others' in NTPCkr.
13) Message boards : Number crunching : Proposal/idea for a near-real-time, low-resource, inexpensive NTPCkr (Message 1761382)
Posted 6 days ago by Mr. Kevvy (Project donor)
This thread is intended as a discussion to bat ideas around and get feedback. I've posted it in Number Crunching, rather than SETI@Home Science where the other NTPCkr thread is, because this is a computing rather than astrophysical topic, and this forum seems to be where the most technically savvy folks with the greatest knowledge of SETI@Home's infrastructure can be found.

I hope that it proves fruitful, as I think that a working NTPCkr or equivalent is very important to the future of this project. If it seems to pass muster, or if any flaws found with it seem to be reasonably solvable, it could certainly be proposed to the SETI@Home team for consideration.

Note: Please do not reply quoting this entire post unless you are inserting commentary throughout! I will become rather annoyed. :^p

First, we need some preliminaries, of course, for new volunteers:

What is NTPCkr?
NTPCkr is the Near-Time Persistency Checker. Simply, it's a server/computer that takes our crunched results and quickly matches them up (correlates them) with the billions of other results on file to see if multiple interesting signals have been found at one point in the sky. Ideally, we also could access it through our browsers and see where work units we've crunched are located on a sky map, and whether other interesting results have been found there.

Why is NTPCkr so important to SETI@Home?
NTPCkr's importance is two-fold:

1) Science: By virtue of being near-real-time, NTPCkr offers project scientists much faster responsiveness should a transient signal be detected, so that it can be investigated further in a minimal amount of time before it is gone. Responding is now easier, and has thus become a priority, thanks to SETI@Home's access to the Green Bank Telescope, which can be pointed at interesting locations. At Arecibo, by contrast, SETI@Home can only listen at whatever location the telescope happens to be pointing to for other research.

2) Project participation and long-term success: One of the statistics I've seen the project team use is the unfortunate 8% retention rate of SETI@Home volunteers. This means that 92% of everyone who ever did any project work is now gone, or at least isn't doing any more work. Over the years I've seen several BOINC and other distributed computing projects fail or stagger because of lack of participant enthusiasm, and the biggest contributing factor is always that participants don't think that their work is valuable to the project and being used. With the flood of new Breakthrough Listen data from Green Bank and possibly elsewhere requiring maximum participation, providing a captivating way of showing volunteers that their results are being analyzed and that they possibly contain interesting data would not only keep them from leaving, but may also convince those who have left to return.

What's passing for NTPCkr now?
Currently, NTPCkr requires the entire science database to be exported and then the job run on a supercomputer or cloud computing facility. This takes weeks, so it should be just called "PCkr" as it's anything but Near-Time. (Perhaps Far-Time, which would make it FTPCkr, so would ideally be run on the IBM computing cluster in Poughkeepsie... and props to anyone who gets the reference.) The cost estimate for a NTPCkr owned by and local to the SETI@Home facility to replace this was initially $30-40K. I hope to demonstrate that it can be done for far less cost.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

Synopsis
Doing a complete run of the entire SETI@Home science database to find persistent signals more than once is unnecessary and inefficient. Instead, only one initial run is required, because, as the result is a database itself, further runs can build on it incrementally (much like backing up a hard disk only copying new/changed files since the last backup) only adding new results since the last run, rather than starting from scratch. These further runs can be done on a much smaller/cheaper computer at the SETI@Home facility than initially estimated, and eventually this process would catch up to the newest results in the science database, so would fulfill the "Near Time" requirement of correlating results almost as soon as they are received.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

Methodology


paddym is the SETI@Home science server hosting the science database of MB (Multi-Beam) results which NTPCkr would be looking at. While I don't know the exact number of results it has, we know that the total credit in SETI@Home's results is about 3.2 x 10^11. This grows by about 1.5 x 10^8 credits per day. Thus, assuming the MB/AP credit ratio hasn't changed too drastically averaged over time, the science database contains (3.2 x 10^11) / (1.5 x 10^8) ≈ 2,133 days' worth of work. We can thus estimate the number of results paddym holds, knowing that about 92,000 results are returned per hour (which is about 2.2 x 10^6 per day, or roughly 25 writes/second to the science database). This works out to (2.2 x 10^6 per day x 2,133 days' worth) / 2 results per work unit ≈ 2.3 x 10^9 results in total.
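For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate above. It uses only the figures quoted in this post; the function and constant names are my own, and the division by two follows the "2 results per work unit" step as written.

```python
# Back-of-the-envelope size estimate for the science database,
# using only the figures quoted in the post (names are illustrative).
TOTAL_CREDIT = 3.2e11         # total credit in SETI@Home's results
CREDIT_PER_DAY = 1.5e8        # credit growth per day
RESULTS_PER_HOUR = 92_000     # results returned per hour
RESULTS_PER_WORK_UNIT = 2     # divisor used in the post


def estimate_database():
    """Return (days of work, results/day, writes/sec, estimated total)."""
    days_of_work = TOTAL_CREDIT / CREDIT_PER_DAY
    results_per_day = RESULTS_PER_HOUR * 24
    writes_per_second = RESULTS_PER_HOUR / 3600
    total = results_per_day * days_of_work / RESULTS_PER_WORK_UNIT
    return days_of_work, results_per_day, writes_per_second, total


if __name__ == "__main__":
    days, per_day, per_sec, total = estimate_database()
    print(f"{days:.0f} days of work, {per_day:.2e} results/day, "
          f"{per_sec:.1f} writes/sec, {total:.2e} in total")
```

The unrounded inputs give a total slightly above the rounded 2.3 x 10^9, which is close enough for sizing purposes.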



The current time/date is noted as the checkpoint time for use later, and the SETI@Home science database is exported and sent to a cloud or supercomputing facility to run the NTPCkr job. Note that in this time the SETI@Home science database has grown as it has been collecting results continuously (I show new results as halftone at the bottom.) This initial run needs to be done elsewhere as it is computationally infeasible to perform with SETI@Home's in-house computing resources (it also can't be distributed to volunteers, as each would need so large a portion of the science database in local storage that there is not enough network bandwidth to distribute it.) The actual time requirement for the job to run is 35 days. The processing requirement for this job is about 5,000 CPU days, i.e. a quad-core desktop would take about 5,000/4 = 1,250 days (roughly 3 years and 5 months) to complete it.
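As a rough illustration of what 5,000 CPU days means, the sketch below converts between core count and wall-clock time. It assumes perfect scaling across cores, which a real correlation job would not achieve, and the function names are made up for this example.

```python
CPU_DAYS = 5_000  # processing requirement quoted for the initial run


def wall_clock_days(cores, cpu_days=CPU_DAYS):
    """Days needed on `cores` cores, assuming perfect scaling."""
    return cpu_days / cores


def cores_needed(days, cpu_days=CPU_DAYS):
    """Cores needed to finish in `days` days, assuming perfect scaling."""
    return cpu_days / days


if __name__ == "__main__":
    print(wall_clock_days(4))   # quad-core desktop
    print(cores_needed(35))     # cores implied by a 35-day run
```

Under that assumption, a 35-day run implies on the order of 140 to 150 cores kept busy the whole time, which is why the initial pass belongs at a cloud or supercomputing facility.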



The resulting NTPCkr database from the run is returned to SETI@Home and installed in the NTPCkr computer. With the 35 day run time, plus an estimated week of transport and setup each way, 49 days have elapsed, with the SETI@Home science database growing continuously during this time.



The cloud or supercomputing facility is no longer required. The SETI@Home science database is then copied to NTPCkr so that the complete database is available locally as otherwise the correlations would be too slow.



NTPCkr is now started up and runs the same correlation job that was run on the cloud or supercomputing facility, correlating its local replica copy of the science database. The critical difference is that NTPCkr only runs the job on results that arrived between the checkpoint time/date noted above and the current time/date.
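The incremental idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the real correlation is far more involved than counting results per sky position, and `Result`, `IncrementalChecker` and `sky_cell` are hypothetical names, not anything from the SETI@Home code base.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    received: float  # arrival time, in any consistent unit
    sky_cell: str    # hypothetical label for a position in the sky


@dataclass
class IncrementalChecker:
    checkpoint: float                          # time of the last completed run
    hits: dict = field(default_factory=dict)   # sky_cell -> results seen so far

    def run(self, science_db, now):
        """Correlate only results received in (checkpoint, now]."""
        new = [r for r in science_db if self.checkpoint < r.received <= now]
        for r in new:
            # Stand-in for the real correlation step: count per sky cell.
            self.hits[r.sky_cell] = self.hits.get(r.sky_cell, 0) + 1
        self.checkpoint = now  # the next run starts where this one ended
        return len(new)

    def persistent(self, min_hits=2):
        """Sky cells with at least `min_hits` results."""
        return sorted(c for c, n in self.hits.items() if n >= min_hits)


if __name__ == "__main__":
    # `hits` plays the role of the database returned from the initial run.
    checker = IncrementalChecker(checkpoint=10.0, hits={"A": 1})
    db = [Result(5.0, "A"), Result(12.0, "A"), Result(15.0, "B"), Result(30.0, "B")]
    print(checker.run(db, now=20.0), checker.persistent())
    print(checker.run(db, now=40.0), checker.persistent())
```

The point of the design is that each run touches only the slice of results newer than the checkpoint, so its cost scales with what arrived since the last run rather than with the whole database.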



NTPCkr has now finished additions to its result database. However, it hasn't caught up with the science database yet, because in the time it took to do this plenty of new results arrived.



NTPCkr now sets the checkpoint time as the previous current time, and runs the job again. However, from this point forward, copying the entire science database over is no longer required. Instead, NTPCkr directly queries paddym for the new results since the checkpoint and adds them to its local copy of the science database, then correlates them into the NTPCkr database.
This process repeats with NTPCkr converging on the state of the science database with each iteration. The time to convergence can be easily determined... no calculus required!
We know in the initial run that 5,000 CPU days correlated 2,133 days' worth of science database results, so new work accumulates at the rate of 5,000/2,133 = 2.34 CPU days per day. Thus, a NTPCkr with only 3 or more CPU cores could keep up with the science database results as they arrive.
We need to allow for NTPCkr outages requiring catch-up to the last checkpoint, sub-optimal utilization and other issues, and that the NTPCkr computer is also expected to serve out local and web queries. I would therefore expect a minimum of a six-core Xeon.
There is initially a 49-day offset between the science database and the NTPCkr database. This is 49 x 2.34 = 114.66 CPU days of computation.
Every day the science database adds 2.34 CPU days of results to correlate, and NTPCkr's six-core Xeon works through 6 CPU days, a net reduction of 6 - 2.34 = 3.66 CPU days of backlog per day.
NTPCkr would thus catch up to the science database in 114.66/3.66 ≈ 31.3 days.
During this time it can be limited to querying the science database (6 / 2.34) x 25 = 64 queries/sec. (The 25 is the number of new results/second written to the science database.) This should not present an issue, considering the BOINC database receives 20x as many queries/sec. with fewer resources.
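The convergence arithmetic above can be wrapped in two small Python functions for trying other core counts. This is a sketch only: it uses the rounded 2.34 CPU days per day from this post, assumes every core is fully used, and the function names are mine.

```python
ARRIVAL_RATE = 2.34  # CPU days of new results per calendar day (5,000 / 2,133, rounded)


def catch_up_days(offset_days, cores, arrival_rate=ARRIVAL_RATE):
    """Calendar days to clear a backlog that built up over `offset_days`."""
    if cores <= arrival_rate:
        raise ValueError("too few cores: the backlog would never shrink")
    backlog = offset_days * arrival_rate       # CPU days waiting
    return backlog / (cores - arrival_rate)    # net CPU days cleared per day


def query_rate(cores, writes_per_second=25, arrival_rate=ARRIVAL_RATE):
    """Queries/sec against the science database while catching up."""
    return cores / arrival_rate * writes_per_second


if __name__ == "__main__":
    print(f"{catch_up_days(49, 6):.1f} days to converge on six cores")
    print(f"{query_rate(6):.0f} queries/sec while catching up")
```

Trying `catch_up_days(49, 3)` shows why three cores is only a theoretical minimum: with so little headroom the same 49-day offset would take roughly 174 days to clear.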



After the 31.3 days, NTPCkr would then be near-real-time, checkpointing perhaps once per minute or whatever is determined to be a proper interval. At this point, it could be linked with muarae1 to serve web queries from anyone interested in checking results for correlation. If NTPCkr crashed or was otherwise cut off from the science database, it would catch up in 2.34 / (6 - 2.34) ≈ 0.64 of the time offset (new results keep arriving while it works through the backlog), so for example if it was down for a day it would catch up in a little over fifteen hours.

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

Proposed Costing/parts list

The initial NTPCkr spec. indicated 10TB of SSD and 512GB RAM. Given the differing methodology, with only 64 queries/sec. to the science database it doesn't seem that SSDs would be required. (This would also be 180 Mbit/sec. even if complete work units were being extracted, which they aren't, so the built-in Gbit networking on the mainboard would be more than sufficient.) A single six-core CPU has been shown above to be enough, but eight-core CPUs are only a couple of hundred dollars more so would be worth the difference. As well, to facilitate further upgrades and mitigate a possible lack of resources, a dual-CPU board would be ideal. Not knowing how the requirements would change, the original 512GB RAM spec. should also be kept.

SuperMicro rackmount with dual-Xeon LGA2011 mainboard and 950W PSU: $1,850
Xeon eight-core CPU: $700
512GB memory (16 x 32GB DDR4 ECC server DIMMs @ $300 each): $4,800 (well over half the cost... it appears that the price-fixing convictions haven't stopped RAM from being way too expensive!)
12TB hard disk (5 x 3TB NAS SATA III HDs @ $120 each, one as a parity drive): $600

Total: $7,950. (This is between one-fifth and one-quarter the cost of full-correlation NTPCkr estimated at $30-40K. It also matches the actual cost of equivalent SETI@Home servers that I know of.)
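The total is easy to re-check or adjust if prices change. Here is a small Python sketch using the prices and quantities listed above; the `PARTS` structure and function name are just for illustration.

```python
# (description, unit price in USD, quantity), as listed in the post
PARTS = [
    ("SuperMicro rackmount, dual-Xeon LGA2011 mainboard, 950W PSU", 1850, 1),
    ("Xeon eight-core CPU", 700, 1),
    ("32GB DDR4 ECC server DIMM", 300, 16),
    ("3TB NAS SATA III hard disk", 120, 5),
]


def total_cost(parts=PARTS):
    """Sum of unit price times quantity over all parts."""
    return sum(price * qty for _, price, qty in parts)


if __name__ == "__main__":
    total = total_cost()
    print(f"Total: ${total:,}")
    # Share of the original $30-40K estimate
    print(f"{total / 40_000:.0%} to {total / 30_000:.0%} of the original estimate")
```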

Time estimate for raising this amount: One week with matching fundraising.
14) Message boards : Number crunching : Its breaking my Heart but I may have to bow out of SETI@Home .. (Message 1760803)
Posted 8 days ago by Mr. Kevvy (Project donor)
“It is the greatest of mistakes to do nothing because you can only do little. Do what you can.”
--- Sydney Smith
15) Message boards : SETI@home Science : First Signs of the NTPCkr (Message 1760743)
Posted 8 days ago by Mr. Kevvy (Project donor)
I have no doubt that Matt could easily lay out the specs for such a m/c, but where would the money come from to fund its purchase?


Already made it clear to the team several times that I'd be more than happy to raise the funds via matching fundraiser or equivalent as per Matt's spec. and cost estimate, so that isn't an issue. With the v8 transition, building instruments at Green Bank, integrating Breakthrough Listen and all of their other duties, I think that they're just too busy right now.

So, I will remind them from time to time.... gently. :^)

I'm working on a writeup of an idea for NTPCkr that I will shortly post in Number Crunching, perhaps this weekend...
16) Message boards : Cafe SETI : R.I.P. Abe Vigoda (for real this time) (Message 1759474)
Posted 12 days ago by Mr. Kevvy (Project donor)
Fish sleeps with the Tessios.

Abe Vigoda, the sad-faced actor who emerged from a workmanlike stage career to find belated fame in the 1970s as the earnest mobster Tessio in “The Godfather” and the dyspeptic Detective Phil Fish on the hit sitcom “Barney Miller,” died on Tuesday morning in Woodland Park, N.J. He was 94, having outlived by about 34 years an erroneous report of his death that made him a cult figure.

His daughter Carol Vigoda Fuchs, told The Associated Press that Mr. Vigoda had died in his sleep at her home.


I am working through the complete Barney Miller and just got through all of his episodes only days ago, and wondered when he would be gone for real. It's amazing that he was playing a retiree over forty years ago yet was still around. So long, Abe, for the last time.
17) Message boards : Number crunching : Abandoned Tasks (Message 1758628)
Posted 15 days ago by Mr. Kevvy (Project donor)
You have some marked 'Abandoned' I see.


Same thing here across several machines at the same time.
18) Message boards : Number crunching : AVG detecting IDP.AREW.Generic (Message 1758475)
Posted 16 days ago by Mr. Kevvy (Project donor)
Anybody else seeing that?


Yes, or similar false positive.

Edit: Options>Advanced Settings>Exceptions, Add Exception, select Folder and add the BOINC program folder.
19) Message boards : Number crunching : Lunatics Windows Installer v0.44 - new release for v8 (required upgrade) (Message 1758433)
Posted 16 days ago by Mr. Kevvy (Project donor)
After installing the client reports missing files, but I am crunching away normally.

22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/setiathome_8.00_windows_intelx86.exe not found
22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/setigraphics_8.00_windows_intelx86.exe not found
22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/setiathome_8.00_windows_intelx86__cuda50.exe not found
22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/setiathome_8.00_windows_intelx86__cuda42.exe not found
22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/cudart32_42_9.dll not found
22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/cufft32_42_9.dll not found
22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/setiathome_7.00_windows_intelx86__cuda42.exe not found
22/01/2016 7:30:47 PM | SETI@home | File projects/setiathome.berkeley.edu/setiathome_7.00_windows_intelx86__cuda50.exe not found


In order to clear out any old junk, I detached/reattached to clear the folder when the v8 GPU client was released and ran stock with no app_info.xml until installing Lunatics 0.44, so it's nothing I did (I think!), but it seems to be working.

Edit: After quitting/restarting the client (which I am doing often while tweaking app_info), it seems to have gone away.
20) Message boards : Number crunching : Lunatics Windows Installer v0.44 - new release for v8 (required upgrade) (Message 1758166)
Posted 16 days ago by Mr. Kevvy (Project donor)
And by lunchtime Friday. Very well done to those that had a hand in it, Richard, William, and others.


Absolutely... I thought I'd at least have to wait a few days. I guess it helps that the unofficial developers have become the official developers. :^) Excellent job and thanks!
21) Message boards : Number crunching : Virus warning in SETI through Boinc... (Message 1758163)
Posted 16 days ago by Mr. Kevvy (Project donor)
I have had that AVG "Identity Protection" component flag BOINC files (and others) as suspicious for years. It's also responsible for most of the reboot requirements whenever it updates. My conclusion: it's crap, so disable it.

As Wiggo noted, you'll have to restore that file under Options > Virus Vault.
22) Message boards : Number crunching : How far out is gpu processing weeks months ? (Message 1757768)
Posted 18 days ago by Mr. Kevvy (Project donor)
Tomorrow.

And not a minute too soon...
23) Message boards : Number crunching : Gripes and Kudos V (Message 1757107)
Posted 21 days ago by Mr. Kevvy (Project donor)
Gripe: People such as myself who miss the obviously posted rules and guidelines. Apologies.

Kudos: Our unpaid and oft unappreciated team of volunteer moderators, without whom I shudder to think what this forum would be like.
24) Message boards : Number crunching : Gripes and Kudos V (Message 1756854)
Posted 22 days ago by Mr. Kevvy (Project donor)
does "patiently" include those of us who are mumbling under our breaths as they watch their average count go down daily?... my shiny new 980Ti's are chomping at the bit to take off and run again... ;)


After about a day of similar mumbling, I attached all my GPU hosts to Beta and am helping out all I can. The more assistance we provide, the faster and better the v8 apps can be released. Of course, there are also plenty of other BOINC projects with GPU apps to keep us busy in the interim. The most popular secondary project around here is Einstein@Home, as it is both astrophysical and uses Arecibo data.

Certainly a better use of one's time than mumbling. :^)
25) Message boards : Number crunching : Dumb Question(s) About Seti v8 (Message 1755230)
Posted 29 days ago by Mr. Kevvy (Project donor)
If you only want v8 work, go to Beta then into Your Account and then SETI@home/AstroPulse Beta preferences. Click Edit Preferences and uncheck SETI@home v7 and AstroPulse v7, then click Update Preferences.

Edit: Going to drink beer and watch movies now so anything else will have to wait until tomorrow. :^D
26) Message boards : Number crunching : Dumb Question(s) About Seti v8 (Message 1755217)
Posted 29 days ago by Mr. Kevvy (Project donor)
Which CUDA?


Wasn't specified but I picked up CUDA42 and CUDA50 work units along with assoc. apps. Was just announced as "A big pile of new GPU apps for Windows, MacOS and Linux are now available."
27) Message boards : Number crunching : Dumb Question(s) About Seti v8 (Message 1755213)
Posted 29 days ago by Mr. Kevvy (Project donor)
We have apps in alpha, we will be moving to beta and hand them over to Eric for public release when we are confident they work properly.


v8 CUDA is now live on Beta.
28) Message boards : Number crunching : New work for Nvidia GPUs? When? (Message 1755110)
Posted 29 days ago by Mr. Kevvy (Project donor)

To get the latest and greatest app do I need to do anything or will they install automatically?

I am running as installed, no command line, app_info, or other changes. Should I be trying something? If so, I need directions,


That's all... you'll see posts about manually installed update apps, but you can just wait for the automatic updates through BOINC when the official betas are released. I am doing that and running absolutely stock, i.e. no app_info.xml etc.

It's recommended to read through the News thread to know the caveats (any important updates will also appear in your BOINC client), check your Tasks to see if there are excessive Invalids/Errors/Inconclusives and post if you find anything wrong that isn't covered.
29) Message boards : Number crunching : New app? (Message 1754750)
Posted 8 Jan 2016 by Mr. Kevvy (Project donor)
I will investigate the News thread and see just how technical I need to be to get this done properly. Since I have just reset everything today I want to make sure that everything is running smoothly before I try to help out.


You don't have to be very technical... don't let some of the development posts in News scare you off. :^) As noted just report any issues (not already known) if you have them.

Also it's an open/public beta so I'm just a cruncher rather than a concierge. :^D
30) Message boards : Number crunching : New app? (Message 1754729)
Posted 7 Jan 2016 by Mr. Kevvy (Project donor)
I may try to get my GPU back to work, but on another project. *:)


If you're that enthusiastic to keep your GPUs busy and crunch v8 early (can't blame you for that) and want to help the project, why not join us in Beta? Just connect BOINC to http://setiweb.ssl.berkeley.edu/beta/

It's recommended to read through the News thread to know the caveats (any important updates will also appear in your BOINC client), check your Tasks to see if there are excessive Invalids/Errors/Inconclusives and post if you find anything wrong that isn't covered.

(edit: and no tinkering... lol)

By helping there, you'll help get the v8 GPU clients released sooner here...



Copyright © 2016 University of California