Status Update (Nov 19, 2014)

Eric Korpela
Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1228
Credit: 19,300,261
RAC: 9,964
United States
Message 1602805 - Posted: 20 Nov 2014, 1:15:21 UTC

The AstroPulse database rebuild is continuing, with 31M rows left to go (as far as I can tell). If that's right, we should know by Friday morning (PST) whether it worked. Meanwhile, Matt and Jeff are scrounging the archives for data we've overlooked.

Unfortunately, I think the rebuild didn't work. There hasn't been an error message or any other error indication, but the look of the read/write statistics leads me to believe that it failed about 5 days ago and has spent the time since then undoing what it had done so far. Stopping it now would only make recovery harder. That's not how databases work.

If I'm right, we'll end up doing a data dump and reload. At this point we're sticking with Informix. I looked into other databases, and PostgreSQL was the only feature-complete database in the correct price range. With MySQL and its derivatives I would need to come up with a way to emulate defined types and LISTs. PostgreSQL has defined types, and its array support is very much like lists. The only thing we use that PostgreSQL is missing is synonyms. Annoyingly, "end" is a reserved word, so columns named "end" are forbidden, and that would have to change. It would probably only take me a few days of coding to write the interface layer to our database classes and to modify the schema_to_class compiler to parse PostgreSQL's schema syntax (it currently does Informix and MySQL), plus a couple of days to build and test all the server components. Of course those are full-time days, and I don't really have any of those in the next couple of weeks. So Informix it is, for the time being. I may peck at the PostgreSQL code during my down time, for future use.
@SETIEric

ID: 1602805

Ageless
Joined: 9 Jun 99
Posts: 14103
Credit: 3,388,595
RAC: 353
Netherlands
Message 1602814 - Posted: 20 Nov 2014, 1:28:04 UTC - in response to Message 1602805.  

Well, I'll keep sitting on my 3 days' worth of work then, only slowly crunching it. Not that my GPU gets much time in between my runs of Far Cry 4... :)

But it sucks if the rebuild didn't work. Ah computers, finicky things.
Jord

Ancient Astronaut Theorists suggest that in many ways, you can be considered an alien conspiracy!
ID: 1602814

Zalster
Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 3716
Credit: 170,188,785
RAC: 85,814
United States
Message 1602835 - Posted: 20 Nov 2014, 1:58:44 UTC - in response to Message 1602814.  

Thanks for the update. Terrible if the rebuild doesn't work. I think someone had quietly mentioned that possibility on the Panic thread. We'll keep sending positive thoughts your way. Good luck
ID: 1602835

mr.mac52
Joined: 18 Mar 03
Posts: 51
Credit: 182,379,593
RAC: 118,002
United States
Message 1602920 - Posted: 20 Nov 2014, 6:12:17 UTC

Thanks for keeping us updated, it is much appreciated from the trenches...
ID: 1602920

Bitman
Volunteer tester
Joined: 14 May 99
Posts: 9
Credit: 6,763,823
RAC: 3,876
China
Message 1602964 - Posted: 20 Nov 2014, 9:46:55 UTC - in response to Message 1602805.  

On some databases, reserved words can be used as column names if you enclose them in double-quotes (Oracle does this, I think). Perhaps PostgreSQL allows this.
Just a thought.
ID: 1602964

Bitman
Volunteer tester
Joined: 14 May 99
Posts: 9
Credit: 6,763,823
RAC: 3,876
China
Message 1602974 - Posted: 20 Nov 2014, 9:59:08 UTC - in response to Message 1602964.  

And it does!

http://stackoverflow.com/questions/7651417/escaping-keyword-like-column-names-in-postgres

You can use reserved words, like "end", as column names in PostgreSQL by enclosing them in double-quotes.

Hope this helps.
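
A minimal sketch of that quoting rule in Python, for anyone generating SQL from code. The helper names `quote_ident` and `select_columns` and the table name `ap_signal` are hypothetical, made up for illustration. The only behaviour assumed is the standard SQL one: wrap the identifier in double quotes and double any embedded double quote.

```python
def quote_ident(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def select_columns(table: str, columns: list) -> str:
    """Build a SELECT statement with every identifier quoted."""
    cols = ", ".join(quote_ident(c) for c in columns)
    return f"SELECT {cols} FROM {quote_ident(table)}"


# Hypothetical table with a column named after the reserved word "end".
print(select_columns("ap_signal", ["start", "end"]))
# prints: SELECT "start", "end" FROM "ap_signal"
```

One thing to watch: quoted identifiers are case-sensitive in PostgreSQL, so "End" and "end" would be two different columns.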
ID: 1602974

Dimly Lit Lightbulb 😀
Volunteer tester
Joined: 30 Aug 08
Posts: 14724
Credit: 3,834,362
RAC: 12,159
United Kingdom
Message 1602978 - Posted: 20 Nov 2014, 10:00:55 UTC

Thanks for the update Eric, I'll keep my fingers crossed that it's worked. I take it the data dump and reload would mean starting over?
ID: 1602978

Chris S
Crowdfunding Project Donor
Volunteer tester
Joined: 19 Nov 00
Posts: 39535
Credit: 29,238,600
RAC: 15,849
United Kingdom
Message 1603071 - Posted: 20 Nov 2014, 12:52:06 UTC

Great update Eric, many thanks. Ok so Informix it is for the time being then, fair enough. But I think it is excellent that other avenues have been explored, and looked at, and will be evaluated. But as you say it all boils down to time that you just don't have available. Clearly you and the team are on top of it all, as we all expected anyway, just good to hear some of the detail behind it.
ID: 1603071

Paris
Joined: 20 May 99
Posts: 103
Credit: 859,315
RAC: 151
United States
Message 1603105 - Posted: 20 Nov 2014, 14:08:52 UTC

Thanks to Dr. Korpela and the team. The work you folks do is phenomenal. I am in awe of the abilities demonstrated by everyone. Don't let a few whiners bug you. Thanks again for a fascinating project. I hope my meager contributions help the cause.

Plus SETI Classic = 21,082 WUs
ID: 1603105

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6382
Credit: 169,610,091
RAC: 57,380
United States
Message 1603132 - Posted: 20 Nov 2014, 15:38:12 UTC

Would doing it in chunks, as Matt described when this was last done for the SETI@home science database, be of benefit in this situation?
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1603132

David S
Volunteer tester
Joined: 4 Oct 99
Posts: 17827
Credit: 22,484,250
RAC: 4,211
United States
Message 1603141 - Posted: 20 Nov 2014, 15:59:11 UTC

As some character on MASH once said, "No news is frustrating news." So, even bad news is good to hear.

What that boils down to is, thank you for posting this, Eric.
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1603141

Juha
Volunteer tester
Joined: 7 Mar 04
Posts: 330
Credit: 721,896
RAC: 2,062
Finland
Message 1603246 - Posted: 20 Nov 2014, 20:29:19 UTC

If you don't mind me asking, what version of Informix do you use?
ID: 1603246

Eric Korpela
Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1228
Credit: 19,300,261
RAC: 9,964
United States
Message 1603328 - Posted: 20 Nov 2014, 23:32:36 UTC - in response to Message 1602978.  

Thanks for the update Eric, I'll keep my fingers crossed that it's worked. I take it the data dump and reload would mean starting over?


Yes, but I think a dump and reload would be faster than this reorg attempt was, because it can be done without indexes on the new table. It's usually faster to rebuild indexes than it is to update every entry (which is probably what is causing this to take so long).
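
A rough sketch of the load-first, index-afterwards idea, using Python's built-in sqlite3 module purely as a stand-in. The real system is Informix, and the table, column, and index names here (`pulse`, `chirp_rate`, `pulse_chirp_idx`) are made up for illustration.

```python
import sqlite3


def reload_without_indexes(rows):
    """Bulk-load rows into an unindexed table, then build the index once."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; no indexes exist while the data is loaded.
    conn.execute("CREATE TABLE pulse (id INTEGER, chirp_rate REAL)")
    conn.executemany("INSERT INTO pulse (id, chirp_rate) VALUES (?, ?)", rows)
    # One pass over the finished table builds the index, instead of
    # updating the index for every inserted row.
    conn.execute("CREATE INDEX pulse_chirp_idx ON pulse (chirp_rate)")
    conn.commit()
    return conn


conn = reload_without_indexes((i, i * 0.5) for i in range(1000))
count = conn.execute("SELECT COUNT(*) FROM pulse").fetchone()[0]
print(count)  # prints: 1000
```

How much time this saves depends on the database engine and the table size; the sketch only shows the order of operations.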
@SETIEric

ID: 1603328

Eric Korpela
Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1228
Credit: 19,300,261
RAC: 9,964
United States
Message 1603331 - Posted: 20 Nov 2014, 23:42:36 UTC - in response to Message 1603246.  

If you don't mind me asking, what version of Informix do you use?


IDS 10.00, unfortunately.
@SETIEric

ID: 1603331

musicplayer
Joined: 17 May 10
Posts: 1787
Credit: 842,842
RAC: 0
Message 1603520 - Posted: 21 Nov 2014, 9:31:33 UTC

Here is another quite important thing not to forget.

Being just one user among many others, I do in fact run on a quite big hard disc.

In fact it is a 3 TB SATA disc.

Such a disc is typically divided into segments called partitions.

Partitioning a disc comes first, before the partitions are formatted.

Compare it with the pull-out drawers of an office card-file cabinet or something similar.

You may perhaps recall these metallic grey boxes from old television series like Kojak (or Derrick for us Europeans).

In order to use a partition on a disc, it needs to be formatted first. Otherwise it is just a raw data partition.

Formatting is a process which creates a logical drive on a partition, which then becomes a volume.

In the end these drives become C: through Z:, although letters beyond J: or K: should not be very common.

The file system that results from the formatting process may have a name like FAT, FAT16, FAT32 or NTFS.

That is for the PC.

If you are running Unix or something similar, and the system is meant to be a big server, the file system on a disc and the process of setting up a database on that same disc become a different matter.

One workplace where I had a short assignment had a relational database containing sales data. They also had a Unix machine. Whether the database resided on the Unix machine or on another machine, which would require a local network, the database needed to communicate with the Unix machine and vice versa.

Which is not only a network thing, of course.

The interesting thing is that as long as a disc is physically in order, data may theoretically be written to it even if it has not been formatted yet. In that case the data may be written, but not necessarily read back. It becomes raw data instead.

That is the reason why partitioning and formatting work at all.

If you are dealing with a very large total amount of data, compression and decompression would not work very well and would take a considerable amount of time.

But accessing such a database is usually about a small number of records at a given time.

You would like to know the best results all the time and how certain records in the database relate to other ones. One record may be stored in one table. The other record may be found in another table.

Therefore the indexing of records is an important issue.

A relational database consists of tables. A single line (row) in such a table may contain several different record elements. I need to get back to you on their specific names and labels.

For example, both the gaussian table and the triplet table in the SMV contain the element "chirp_rate".

Before checking, I assume that these two numbers should be similar for a single task or WU, and that the total number of such chirp_rates corresponds to the number of lines representing the gaussians and triplets in such a task.

The problem is really that instead of dealing with some 4045 lines, a database is supposed to contain possibly millions of lines. Here the difference in size really shows up.

Easy for me, really.

But a difficult task to handle for those people including Dr. Korpela who are doing an excellent job on the servers.
ID: 1603520

Chris S
Crowdfunding Project Donor
Volunteer tester
Joined: 19 Nov 00
Posts: 39535
Credit: 29,238,600
RAC: 15,849
United Kingdom
Message 1603536 - Posted: 21 Nov 2014, 10:46:29 UTC

IDS 10.00, unfortunately.

Any Mileage in updating to V12?
ID: 1603536

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3703
Credit: 8,820,788
RAC: 652
Bulgaria
Message 1603571 - Posted: 21 Nov 2014, 12:00:35 UTC - in response to Message 1603331.  

If you don't mind me asking, what version of Informix do you use?


IDS 10.00, unfortunately.

And which Edition?
Some say e.g. "Resource limited to a single CPU core and 1GB of server memory"
http://en.wikipedia.org/wiki/IBM_Informix_Dynamic_Server



- ALF - "Find out what you don't do well ..... then don't do it!" :)
ID: 1603571

Dimly Lit Lightbulb 😀
Volunteer tester
Joined: 30 Aug 08
Posts: 14724
Credit: 3,834,362
RAC: 12,159
United Kingdom
Message 1603577 - Posted: 21 Nov 2014, 12:21:03 UTC - in response to Message 1603571.  

If you don't mind me asking, what version of Informix do you use?


IDS 10.00, unfortunately.

And which Edition?
Some say e.g. "Resource limited to a single CPU core and 1GB of server memory"
http://en.wikipedia.org/wiki/IBM_Informix_Dynamic_Server

That must be the problem then, we'll have to have a fundraiser so they can get a version that can use 2 CPU cores and 2GB of memory.
ID: 1603577

Eric Korpela
Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1228
Credit: 19,300,261
RAC: 9,964
United States
Message 1603671 - Posted: 21 Nov 2014, 17:56:52 UTC - in response to Message 1603536.  

IDS 10.00, unfortunately.

Any Mileage in updating to V12?


It has fewer constraints on table size, so there would be a long term benefit. But we would need to rebuild the tables to get those benefits, so short term we'd be in the same boat. The problem with upgrading is that the people at Informix/IBM who were enabling the donation of the software are long gone. I haven't checked IBM's licensing terms recently. They were unaffordable for us last time I checked. We certainly don't want to upgrade to anything that requires an annual license fee.
@SETIEric

ID: 1603671

Eric Korpela
Project Donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1228
Credit: 19,300,261
RAC: 9,964
United States
Message 1603674 - Posted: 21 Nov 2014, 17:59:50 UTC - in response to Message 1603571.  

If you don't mind me asking, what version of Informix do you use?


IDS 10.00, unfortunately.

And which Edition?
Some say e.g. "Resource limited to a single CPU core and 1GB of server memory"
http://en.wikipedia.org/wiki/IBM_Informix_Dynamic_Server


It would be the equivalent of Ultimate Edition: no limits on CPUs, number of users, or database size.
@SETIEric

ID: 1603674

©2017 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.