Status Update (Nov 19, 2014)
Author | Message |
---|---|
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
The AstroPulse database rebuild is continuing with 31M rows left to go (as far as I can tell). In which case, by Friday morning (PST) we should know whether it worked. Meanwhile Matt and Jeff are scrounging the archives for data we've overlooked. Unfortunately, I think the rebuild didn't work. There hasn't been an error message or any error indication, but the appearance of the read/write statistics leads me to believe that it failed about 5 days ago and has spent the time since then undoing what it had done so far. Stopping it now would only make recovering things worse. That's not how databases work. If I'm right, we'll end up doing a data dump and reload. At this point we're sticking with Informix. I looked into other databases, and PostgreSQL was the only feature-complete database in the correct price range. With MySQL and its derivatives I would need to come up with a way to emulate defined types and LISTs. PostgreSQL has defined types and its array support is very much like lists. The only thing we use that PostgreSQL is missing is synonyms. Annoyingly, "end" is a reserved word, so columns named "end" are forbidden, so that would have to change. It would probably only take me a few days of coding to write the interface layer to our database classes and to modify the schema_to_class compiler to parse PostgreSQL's schema syntax (it currently does Informix and MySQL). And a couple of days to build and test all the server components. Of course those are full-time days, and I don't really have any of those in the next couple of weeks. So Informix it is, for the time being. I may peck at the PostgreSQL code during my down time, for future use. @SETIEric@qoto.org (Mastodon) |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Well, I'll keep sitting on my 3 days' worth of work then, only slowly crunching it. Not that my GPU gets much time in between my runs of Far Cry 4... :) But it sucks if the rebuild didn't work. Ah computers, finicky things. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Thanks for the update. Terrible if the rebuild doesn't work. I think someone had quietly mentioned that possibility on the Panic thread. We'll keep sending positive thoughts your way. Good luck |
mr.mac52 Joined: 18 Mar 03 Posts: 67 Credit: 245,882,461 RAC: 0 |
Thanks for keeping us updated, it is much appreciated from the trenches... |
Bitman Joined: 14 May 99 Posts: 9 Credit: 9,918,653 RAC: 632 |
On some databases, reserved words can be used as column names if you enclose them in double-quotes (Oracle does this, I think). Perhaps PostgreSQL allows this. Just a thought. |
Bitman Joined: 14 May 99 Posts: 9 Credit: 9,918,653 RAC: 632 |
And it does! http://stackoverflow.com/questions/7651417/escaping-keyword-like-column-names-in-postgres You can use reserved words, like "end", as column names in PostgreSQL by enclosing them in double-quotes. Hope this helps. |
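For readers following along, here is a minimal sketch of the double-quoting rule described above. The helper name `quote_ident` and the table `observation` are made up for illustration and are not part of the SETI@home code. The quoting rule itself (wrap in double quotes, double any embedded double quote) is the standard SQL one that PostgreSQL follows; the demo runs against Python's built-in SQLite purely as a stand-in, on the assumption that no PostgreSQL server is at hand.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Quote an SQL identifier the standard way.

    Wrap the name in double quotes and double any embedded double
    quote, so reserved words such as "end" can be used as column names.
    """
    return '"' + name.replace('"', '""') + '"'


def demo_reserved_column() -> list:
    """Create a table with a column named "end" and read a value back.

    SQLite is only a stand-in here; the table and column layout are
    hypothetical.
    """
    conn = sqlite3.connect(":memory:")
    col = quote_ident("end")
    conn.execute(f"CREATE TABLE observation (id INTEGER PRIMARY KEY, {col} REAL)")
    conn.execute(f"INSERT INTO observation (id, {col}) VALUES (?, ?)", (1, 42.5))
    rows = conn.execute(f"SELECT {col} FROM observation WHERE id = 1").fetchall()
    conn.close()
    return rows
```

One thing to keep in mind with this approach: in PostgreSQL a quoted identifier is case-sensitive, so every query touching the column would need to spell it `"end"` with the quotes.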
Dimly Lit Lightbulb 😀 Joined: 30 Aug 08 Posts: 15399 Credit: 7,423,413 RAC: 1 |
Thanks for the update Eric, I'll keep my fingers crossed that it's worked. I take it the data dump and reload would mean starting over? Member of the People Encouraging Niceness In Society club. |
Paris Joined: 20 May 99 Posts: 110 Credit: 1,012,250 RAC: 0 |
Thanks to Dr. Korpela and the team. The work you folks do is phenomenal. I am in awe of the abilities demonstrated by everyone. Don't let a few whiners bug you. Thanks again for a fascinating project. I hope my meager contributions help the cause. Plus SETI Classic = 21,082 WUs |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Would the process of doing it in chunks, as Matt described when doing this for the SETI@home science database last time, be of benefit in this situation? SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
As some character on MASH once said, "No news is frustrating news." So, even bad news is good to hear. What that boils down to is, thank you for posting this, Eric. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Juha Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
If you don't mind me asking, what version of Informix do you use? |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
"Thanks for the update Eric, I'll keep my fingers crossed that it's worked. I take it the data dump and reload would mean starting over?" Yes, but I think a dump and reload would be faster than this reorg attempt was because it can be done without indexes on the new table. It's usually faster to rebuild indexes than it is to update every entry (which is probably what is causing this to take so long.) @SETIEric@qoto.org (Mastodon) |
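The dump-and-reload idea in the post above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in SQLite rather than Informix, and the table names `signal_old` and `signal_new`, the column `value`, and the function `dump_and_reload` are all hypothetical. The point is the ordering: load the rows into a new table that has no indexes, then build each index once at the end instead of updating it for every row.

```python
import sqlite3


def dump_and_reload(conn: sqlite3.Connection, n_rows: int = 1000) -> int:
    """Copy rows into an index-free table, then build the index once.

    Returns the number of rows in the reloaded table. All table and
    column names are made up for this sketch.
    """
    # Set up a small indexed source table to play the role of the old table.
    conn.execute("CREATE TABLE signal_old (id INTEGER, value REAL)")
    conn.execute("CREATE INDEX signal_old_value ON signal_old (value)")
    conn.executemany(
        "INSERT INTO signal_old VALUES (?, ?)",
        ((i, i * 0.5) for i in range(n_rows)),
    )

    # Step 1: dump the data out of the old table.
    rows = conn.execute("SELECT id, value FROM signal_old").fetchall()

    # Step 2: reload into a new table that has no indexes yet,
    # so each insert is a plain append with no index maintenance.
    conn.execute("CREATE TABLE signal_new (id INTEGER, value REAL)")
    conn.executemany("INSERT INTO signal_new VALUES (?, ?)", rows)

    # Step 3: build the index once, after all rows are in place.
    conn.execute("CREATE INDEX signal_new_value ON signal_new (value)")
    conn.commit()

    return conn.execute("SELECT COUNT(*) FROM signal_new").fetchone()[0]
```

Whether this ordering actually wins on a given system depends on the database engine and the data volume; the sketch only shows the sequence of steps, not a timing result.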
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
"If you don't mind me asking, what version of Informix do you use?" IDS 10.00, unfortunately. @SETIEric@qoto.org (Mastodon) |
musicplayer Joined: 17 May 10 Posts: 2442 Credit: 926,046 RAC: 0 |
Here is another quite important thing not to forget. Being just one user among many others, I do in fact run on quite a big hard disc, a 3 TB SATA disc. Such a disc is typically divided into segments called partitions. Partitioning a disc comes first, before the partitions are formatted. Compare it with the pull-out drawers of an office card holder or similar. You may perhaps recall these metallic grey boxes from old television series like Kojak (or Derrick for us Europeans). In order to use a partition on a disc, it needs to be formatted first. Otherwise it remains just a raw data partition. Formatting is a process which creates a logical drive on a partition, which then becomes a volume. In the end these drives become C: through Z:, although letters higher than J: or K: should not be very common. The file system which results from the formatting process may have names like FAT, FAT16, FAT32 and NTFS. That is for the PC. If you are running Unix or similar and the system is supposed to be a big server, the file system belonging to a disc and the process of setting up a database on the same disc become a different matter. One workplace where I had a short assignment had a relational database containing sales data. They also had a Unix machine. Whether the database was on the Unix machine or was located on another machine, requiring a local network, the database needed to communicate with the Unix machine and vice versa. Which is not only a network thing, of course. The interesting thing is that as long as a disc is physically in order, data may theoretically be written to it even if it has not been formatted yet. In such instances, data may be written, but not necessarily be read. They become raw data instead. That is the reason why partitioning and formatting work at all. 
If you are dealing with a very large amount of total data, compression and decompression would not work very well and would take a considerable amount of time. But accessing such a database is usually about a small number of records at a given time. You would like to know the best results all the time and how certain records in the database relate to other ones. One record may be stored in one table. The other record may be found in another table. Therefore the indexing of records should be an important issue. A relational database consists of tables. The individual elements in such a table which may be listed on a single line may contain different record elements. Their specific names and labeling I need to get back to. For example, both the gaussian table and the triplet table in the SMV contain the element "chirp_rate". Before checking, I assume that both these numbers should be similar for a single task or WU and that the total number of such chirp_rates corresponds to the lines representing the total number of gaussians and triplets in such a task. The problem is really that instead of dealing with some 4045 lines, a database is supposed to contain possibly millions of lines. Here the difference in size really shows up. Easy for me, really. But a difficult task to handle for those people, including Dr. Korpela, who are doing an excellent job on the servers. |
BilBg Joined: 27 May 07 Posts: 3720 Credit: 9,385,827 RAC: 0 |
"If you don't mind me asking, what version of Informix do you use?" And which edition? Some say e.g. "Resource limited to a single CPU core and 1GB of server memory" http://en.wikipedia.org/wiki/IBM_Informix_Dynamic_Server - ALF - "Find out what you don't do well ..... then don't do it!" :) |
Dimly Lit Lightbulb 😀 Joined: 30 Aug 08 Posts: 15399 Credit: 7,423,413 RAC: 1 |
"If you don't mind me asking, what version of Informix do you use?" That must be the problem then, we'll have to have a fundraiser so they can get a version that can use 2 CPU cores and 2GB of memory. Member of the People Encouraging Niceness In Society club. |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
"IDS 10.00, unfortunately." It has fewer constraints on table size, so there would be a long term benefit. But we would need to rebuild the tables to get those benefits, so short term we'd be in the same boat. The problem with upgrading is that the people at Informix/IBM who were enabling the donation of the software are long gone. I haven't checked IBM's licensing terms recently. They were unaffordable for us last time I checked. We certainly don't want to upgrade to anything that requires an annual license fee. @SETIEric@qoto.org (Mastodon) |
Eric Korpela Joined: 3 Apr 99 Posts: 1382 Credit: 54,506,847 RAC: 60 |
"If you don't mind me asking, what version of Informix do you use?" It would be the equivalent of Ultimate Edition. No CPU, number of users, or database size limitations. @SETIEric@qoto.org (Mastodon) |
Juha Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
"If you don't mind me asking, what version of Informix do you use?" So 2005-ish. I was half expecting something from '98/99... |
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.