Comedy (Jun 17 2009)



Message boards : Technical News : Comedy (Jun 17 2009)

1 · 2 · 3 · Next
Author Message
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 908453 - Posted: 17 Jun 2009, 20:16:11 UTC

I've been busy. Almost too much to write about, none of it all that interesting in the grand scheme of things, so I'll just stick to recent stuff.

Our main problem lately has been the MySQL database. Given the increased number of users, and the lack of Astropulse work (which tends to "slow things down"), the result table in the MySQL database is under constant heavy attack. Over the course of a week this table gets severely fragmented, resulting in more disk I/O to do the same selects/updates. This has always been a problem, which is why we "compress" the database every Tuesday. However, the increased use means a larger, more fragmented table, and it doesn't fit so easily into memory.

This is really a problem when the splitter comes along every ten minutes and checks to see if there's enough work available to send out (thus asking the question: should I bother generating more work?). This is a simple count on the result table, but if we're in a "bad state" this count, which normally takes a second, could take literally hours and stall all other queries (like the feeder), and therefore nobody can get any work. There are up to six splitters running at any given time, so multiply this problem by six.

We came up with several obvious solutions to this problem, all of which had non-obvious opposite results. Finally we had another thing to try, which was to make a tiny database table which contains these counts, and have a separate program that runs every so often to do these counts and populate the proper table. This way, instead of six splitters doing a count query every ten minutes, one program does a single count query every hour (and against the replica database). We made the necessary changes and fired it off yesterday after the outage.
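For illustration only, the "tiny counts table" idea described above might look roughly like this sketch, with sqlite3 standing in for the real MySQL setup; the table and column names here are invented, not the actual SETI@home schema.

```python
# Sketch of the cached-counts idea: one periodic program pays for the
# expensive COUNT, and the six splitters read a tiny table instead.
# sqlite3 stands in for MySQL; all names are invented for illustration.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE result (id INTEGER PRIMARY KEY, state INTEGER)")
db.execute("CREATE TABLE result_counts (state INTEGER PRIMARY KEY, n INTEGER)")
db.executemany("INSERT INTO result (state) VALUES (?)",
               [(s,) for s in (0, 0, 1, 2, 2, 2)])

def refresh_counts(conn):
    """Run the expensive count once (e.g. hourly, against the replica)
    and store the totals where the splitters can read them cheaply."""
    conn.execute("DELETE FROM result_counts")
    conn.execute("""INSERT INTO result_counts (state, n)
                    SELECT state, COUNT(*) FROM result GROUP BY state""")
    conn.commit()

def unsent_count(conn, unsent_state=0):
    # What each splitter now does: a primary-key lookup instead of a scan.
    row = conn.execute("SELECT n FROM result_counts WHERE state = ?",
                       (unsent_state,)).fetchone()
    return row[0] if row else 0

refresh_counts(db)
print(unsent_count(db))  # 2
```

The counts can be slightly stale, but for "should I bother generating more work?" a stale answer is fine.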

Of course it took forever to recover from the outage. When I checked in again at midnight last night I found the splitters finally got the call to generate more work... and were failing on science database inserts. I figured this was some kind of compile problem, so I fell back to the previous splitter version... but that one was failing reading the raw data files! Then I realized we were in a spate of raw data files that were deemed "questionable," so this wasn't a surprise. I let it go as it was late.

As expected, nature took its course and a couple hours later the splitter finally found files it could read and got to work. That is, until our main /home account server crashed! When that happens, it kills *everything*.

Jeff got in early and was already recovering that system before I noticed. He pretty much had it booted up just when I arrived. However, all the systems were hanging on various other systems due to our web of cross-automounts. I had to reboot 5 or 6 of them and do all the cleanup following that. In one lucky case I was able to clean up the automounter maps without having to reboot.

So we're recovering from all that now. Hopefully we can figure out the new splitter problems and get that working as well or else we'll start hitting those bad mysql periods really soon.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Johnney Guinness
Volunteer tester
Joined: 11 Sep 06
Posts: 3091
Credit: 2,460,392
RAC: 2,456
Ireland
Message 908464 - Posted: 17 Jun 2009, 20:51:15 UTC
Last modified: 17 Jun 2009, 20:52:57 UTC

Matt,
It sounds like the "ER" in there every day. It's surgery on the MySQL databases every few hours!



John.
____________

B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 908466 - Posted: 17 Jun 2009, 20:54:38 UTC

It looks like you are the classic little Dutch boy at the dike, running out of fingers to plug the holes. Good luck fixing all the issues.
____________

Vid Vidmar*
Volunteer tester
Joined: 19 Aug 99
Posts: 136
Credit: 1,830,317
RAC: 0
Slovenia
Message 908480 - Posted: 17 Jun 2009, 21:13:55 UTC - in response to Message 908453.

... We came up with several obvious solutions to this problem, all of which had non-obvious opposite results. Finally we had another thing to try, which was to make a tiny database table which contains these counts, and have a separate program that runs every so often to do these counts and populate the proper table...


Hey.
Thanks for updating us. Must have been hellish booting all those 'puters and making them work.
But what really interests me is: why did you choose a separate program instead of triggers?
BR,
____________
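For what it's worth, a trigger-maintained counter of the kind asked about above might look like the sketch below (sqlite3 standing in for MySQL; all names invented). The trade-off, and one plausible reason to prefer a periodic batch count, is that the triggers add write overhead to every insert and delete on the already-hot result table.

```python
# Trigger-based alternative: the counter is kept in step on every write,
# so no COUNT(*) is ever needed. sqlite3 stands in for MySQL here.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE result (id INTEGER PRIMARY KEY, state INTEGER);
CREATE TABLE result_counts (state INTEGER PRIMARY KEY, n INTEGER);
INSERT INTO result_counts (state, n) VALUES (0, 0);

-- Adjust the counter on every insert/delete of the tracked state.
CREATE TRIGGER result_ins AFTER INSERT ON result
BEGIN
    UPDATE result_counts SET n = n + 1 WHERE state = NEW.state;
END;
CREATE TRIGGER result_del AFTER DELETE ON result
BEGIN
    UPDATE result_counts SET n = n - 1 WHERE state = OLD.state;
END;
""")
db.executemany("INSERT INTO result (state) VALUES (?)", [(0,), (0,), (0,)])
db.execute("DELETE FROM result WHERE id = 1")
print(db.execute("SELECT n FROM result_counts WHERE state = 0").fetchone()[0])  # 2
```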

Profile ML1
Volunteer tester
Joined: 25 Nov 01
Posts: 8346
Credit: 4,093,766
RAC: 1,144
United Kingdom
Message 908492 - Posted: 17 Jun 2009, 21:42:33 UTC - in response to Message 908453.
Last modified: 17 Jun 2009, 21:43:39 UTC

I've been busy...

Sounds like quite a maelstrom!

Our main problem lately has been the MySQL database. Given the increased number of users, and the lack of Astropulse work (which tends to "slow things down")...

As in it "slows down" the flood of results from the users, and so lets the servers speed up! :-)

... hanging on various other systems due to our web of cross-automounts. I had to reboot 5 or 6 of them and do all the cleanup following that. ...

Any possibility of arranging the mounts to be in a tree structure rather than a random mapping?...

Or rearrange the data locations to avoid them in the first place?...

(I'm sure you're already on to that if at all possible. I've played with multiple network mounts. Very useful on guaranteed dedicated bandwidth, or as a temporary fix. Otherwise, it is a nightmare just waiting to happen...)


or else we'll start hitting those bad mysql periods really soon.

The central database is a recurring theme... Can the database itself be split up across multiple machines to ease the central bottleneck?

Could some operations run offline from updating the database in real-time to be instead batched through more efficiently?


Good luck,
Martin
____________
See new freedom: Mageia4
Linux Voice See & try out your OS Freedom!
The Future is what We make IT (GPLv3)

ra]in-man
Joined: 25 Aug 06
Posts: 2
Credit: 331,453
RAC: 0
United States
Message 908536 - Posted: 17 Jun 2009, 22:28:46 UTC

So I'm assuming this is why my completed SETI tasks are having issues uploading, and why getting new jobs is a pain.

DJStarfox
Joined: 23 May 01
Posts: 1040
Credit: 541,672
RAC: 182
United States
Message 908554 - Posted: 17 Jun 2009, 23:04:43 UTC - in response to Message 908453.

Matt,

I have a constructive idea.

Instead of deleting results every 24 hours, could you just schedule this operation late Monday night/Tue morning (weekly) before the Tuesday outage? That would avoid the fragmented table extents and indexes during the week. Before the outage, the validated results at least 24 hours old would be purged. During the outage, you can do your compress as normal.

The only downside is that this method will consume more disk space on the database server. But the performance should increase.

It might also help the database if the compress operation you do has the option to "reuse storage"; Oracle has this feature. This keeps the extents allocated for the table and index empty but locked for its table. That way, when more inserts happen, there is no need for the DB system to "find free space". You may have to drop index(es) before this delete operation and re-create them afterward for acceptable performance.

Please point the DBA (Jeff?) to this post. He could tell in about 10 seconds if either or both of these strategies will work or not. I hope it helps.
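As a sketch of the weekly-purge idea in this post (again with sqlite3 standing in for the real database, and invented table/column names): drop the secondary indexes, delete validated results older than 24 hours in one weekly pass before the Tuesday outage, then rebuild the indexes.

```python
# Weekly purge sketch: validated results at least min_age_secs old are
# deleted in one batch; indexes are dropped and rebuilt around the delete.
# sqlite3 stands in for MySQL; the real schema differs.
import sqlite3, time

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE result
              (id INTEGER PRIMARY KEY, validated INTEGER, received REAL)""")
db.execute("CREATE INDEX idx_received ON result (received)")

now = time.time()
rows = [(1, 1, now - 3 * 86400),   # old and validated -> purged
        (2, 1, now - 3600),        # validated but too recent -> kept
        (3, 0, now - 3 * 86400)]   # old but not validated -> kept
db.executemany("INSERT INTO result VALUES (?, ?, ?)", rows)

def weekly_purge(conn, min_age_secs=86400):
    """Run once, just before the Tuesday outage."""
    conn.execute("DROP INDEX idx_received")      # cheaper bulk delete
    conn.execute("DELETE FROM result WHERE validated = 1 AND received < ?",
                 (time.time() - min_age_secs,))
    conn.execute("CREATE INDEX idx_received ON result (received)")
    conn.commit()

weekly_purge(db)
print(db.execute("SELECT COUNT(*) FROM result").fetchone()[0])  # 2
```

As the post notes, the cost is carrying up to a week of purgeable rows on disk between passes.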

Profile perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 15,182,367
RAC: 11,861
United States
Message 908572 - Posted: 17 Jun 2009, 23:54:29 UTC

Matt, you should see this thread http://setiathome.berkeley.edu/forum_thread.php?id=54204 A lot of people are getting validate errors. (myself included)
____________


PROUD MEMBER OF Team Starfire World BOINC

Berserker
Volunteer tester
Joined: 2 Jun 99
Posts: 105
Credit: 5,386,463
RAC: 0
United Kingdom
Message 908580 - Posted: 18 Jun 2009, 0:00:37 UTC - in response to Message 908554.
Last modified: 18 Jun 2009, 0:03:37 UTC

The only downside is that this method will consume more disk space on the database server. But the performance should increase.

Far from the only downside, I'm afraid. You just replaced one problem (fragmentation) with another (seven times more records). The whole point of db_purge is to stop the tables getting so big that they won't fit in memory.

The optimal solution is to design the query such that it does not need to traverse all rows - an inherently expensive operation - but I think we can safely assume that isn't possible or it would have been done already. The next best is to reduce the number of times the query is used, for example by caching the result, which is what was attempted. After that comes keeping the table in memory to avoid using slow disks, which is what db_purge is supposed to do (but isn't doing very well right now).
____________
Stats site - http://www.teamocuk.co.uk - still alive and (just about) kicking.

Profile Keith Jones
Joined: 25 Oct 00
Posts: 17
Credit: 5,266,279
RAC: 0
United Kingdom
Message 908582 - Posted: 18 Jun 2009, 0:03:01 UTC - in response to Message 908554.

Hi Matt,

Sorry to hear about all the difficulties. Thanks for all the work.
Hang in there!

I wasn't sure what version of MySQL you're currently using (or even planning to use in the future), so at the risk of teaching you to suck eggs I thought I'd better mention that MySQL began shipping MySQL Cluster 7.0 in April. It supports clustering, in-memory tables, load balancing, fault tolerance, etc.

If you haven't upgraded to it, it might be worth a thought so you can spread the load around as per ML1's suggestion.

Best wishes,

Keith
____________

DJStarfox
Joined: 23 May 01
Posts: 1040
Credit: 541,672
RAC: 182
United States
Message 908607 - Posted: 18 Jun 2009, 1:36:33 UTC - in response to Message 908580.

The only downside is that this method will consume more disk space on the database server. But the performance should increase.

Far from the only downside I'm afraid. You just replaced one problem (fragmentation) with another (seven times more records). The whole point of db_purge is to stop the tables getting so big that they won't fit in memory.

The optimal solution is to design the query such that it does not need to traverse all rows - an inherently expensive operation - but I think we can safely assume that isn't possible or it would have been done already. The next best is to reduce the number of times the query is used, for example by caching the result, which is what was attempted. After that comes keeping the table in memory to avoid using slow disks, which is what db_purge is supposed to do (but isn't doing very well right now).


What query are you talking about that must traverse all rows? That's a bad query in any context.

The problem I was addressing relates to minimizing disk I/O. The table is obviously too big to fit completely into memory anyway (remember to include indexes). So, reducing I/O should also reduce the amount of steady-state memory usage for that table. Having contiguous records in an index also helps when searching, because the search algorithm goes from log(n)+(total_records / deleted_records) to log(n). By improving the index seek time, queries will run shorter on average. By optimizing the most common case, we're making the most impact on performance.

There are several other things they could do, including lowering the value of query_cache_min_res_unit to be the data size of one row from the result table (default is 4K). But without actually being there to work on this and see the performance metrics (e.g., Qcache_lowmem_prunes, etc.), I can only make thoughtful, constructive suggestions and research them before saying anything. It's up to Matt, Jeff, and the rest of them to find the time for discussion, decision, and implementation.
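Purely as an illustration of where that knob lives (the numbers below are placeholders, not recommendations; the right values depend on the Qcache_* metrics the post mentions):

```ini
# my.cnf fragment -- illustrative values only, tune against Qcache_* status
[mysqld]
query_cache_type         = 1
# The default allocation unit is 4K; shrinking it toward the size of one
# result-table row wastes less cache memory on many small result sets.
query_cache_min_res_unit = 512
query_cache_size         = 64M
```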

Performance vs. storage space has always been a hallmark trade-off in computer science, but here I am simply presenting an alternative to the status quo that, to the best of my professional experience, will accomplish what I said it would.

ra]in-man
Joined: 25 Aug 06
Posts: 2
Credit: 331,453
RAC: 0
United States
Message 908609 - Posted: 18 Jun 2009, 2:00:07 UTC

Sorry if this is the wrong section to post...but i am unable to upload completed seti@home work

6/17/2009 8:47:05 PM SETI@home Started upload of 14mr09ac.1821.2526.11.8.232_2_0
6/17/2009 8:47:22 PM Project communication failed: attempting access to reference site
6/17/2009 8:47:22 PM SETI@home Temporarily failed upload of 01mr09ae.31974.7025.7.8.170_3_0: connect() failed
6/17/2009 8:47:22 PM SETI@home Backing off 2 hr 14 min 37 sec on upload of 01mr09ae.31974.7025.7.8.170_3_0
6/17/2009 8:47:22 PM SETI@home Started upload of 10mr09ac.6383.762157.5.8.217_0_0
6/17/2009 8:47:23 PM Internet access OK - project servers may be temporarily down.
6/17/2009 8:47:27 PM Project communication failed: attempting access to reference site
6/17/2009 8:47:27 PM SETI@home Temporarily failed upload of 14mr09ac.1821.2526.11.8.232_2_0: connect() failed
6/17/2009 8:47:27 PM SETI@home Backing off 22 min 24 sec on upload of 14mr09ac.1821.2526.11.8.232_2_0
6/17/2009 8:47:27 PM SETI@home Started upload of 12mr09ac.9323.17250.5.8.227_1_0
6/17/2009 8:47:28 PM Internet access OK - project servers may be temporarily down.
6/17/2009 8:47:44 PM Project communication failed: attempting access to reference site
6/17/2009 8:47:44 PM SETI@home Temporarily failed upload of 10mr09ac.6383.762157.5.8.217_0_0: connect() failed
6/17/2009 8:47:44 PM SETI@home Backing off 1 hr 44 min 21 sec on upload of 10mr09ac.6383.762157.5.8.217_0_0
6/17/2009 8:47:45 PM Internet access OK - project servers may be temporarily down.
6/17/2009 8:47:49 PM Project communication failed: attempting access to reference site
6/17/2009 8:47:49 PM SETI@home Temporarily failed upload of 12mr09ac.9323.17250.5.8.227_1_0: connect() failed
6/17/2009 8:47:49 PM SETI@home Backing off 12 min 44 sec on upload of 12mr09ac.9323.17250.5.8.227_1_0
6/17/2009 8:47:50 PM Internet access OK - project servers may be temporarily down.
6/17/2009 8:49:34 PM SETI@home Started upload of 04mr09ac.12347.481.6.8.24_0_0
6/17/2009 8:49:55 PM SETI@home work fetch suspended by user
6/17/2009 8:49:55 PM Project communication failed: attempting access to reference site
6/17/2009 8:49:55 PM SETI@home Temporarily failed upload of 04mr09ac.12347.481.6.8.24_0_0: connect() failed
6/17/2009 8:49:55 PM SETI@home Backing off 3 min 44 sec on upload of 04mr09ac.12347.481.6.8.24_0_0
6/17/2009 8:49:56 PM Internet access OK - project servers may be temporarily down.
6/17/2009 8:53:39 PM SETI@home Started upload of 04mr09ac.12347.481.6.8.24_0_0
6/17/2009 8:54:01 PM Project communication failed: attempting access to reference site
6/17/2009 8:54:01 PM SETI@home Temporarily failed upload of 04mr09ac.12347.481.6.8.24_0_0: connect() failed
6/17/2009 8:54:01 PM SETI@home Backing off 10 min 3 sec on upload of 04mr09ac.12347.481.6.8.24_0_0

Profile Andy Lee Robinson
Joined: 8 Dec 05
Posts: 615
Credit: 40,370,430
RAC: 42,120
Hungary
Message 908614 - Posted: 18 Jun 2009, 2:26:41 UTC - in response to Message 908607.

Performance vs storage space has always been a hallmark trade-off in computer science applications, but here I am simply presenting an alternative to status quo, that to the best of my professional experience, will accomplish what I said it would.


Well, to avoid hitting the db, I'd just add another line to the code that would add +n or -n to a persistent key in memcache after the SQL update. It would only use a few bytes instead of causing such grief.
The memcache key could be reset daily or weekly with absolute values from one or more count(*) queries, and then just record deltas as the db is updated.

The values would be available immediately to all machines aware of the memcache pool, and not hit the db at all.
It could also be used for real time stats on the public website with no hit on the db either.

yum install memcached

See http://blogs.vinuthomas.com/2006/02/06/memcached-with-php/
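The delta-counter scheme in this post can be sketched as follows. A real deployment would talk to the memcached daemon through a client library; to keep the sketch self-contained, a tiny in-process class stands in for the client, and the key name is invented.

```python
# Memcached-style delta counter: reset the key occasionally from one real
# COUNT(*), then incr/decr it alongside every insert/hand-out, so the db
# is never asked for the count. FakeMemcache mimics the incr/decr API.
class FakeMemcache:
    def __init__(self):
        self.kv = {}
    def set(self, key, value):
        self.kv[key] = value
    def incr(self, key, delta=1):
        self.kv[key] = self.kv.get(key, 0) + delta
        return self.kv[key]
    def decr(self, key, delta=1):
        return self.incr(key, -delta)
    def get(self, key):
        return self.kv.get(key)

mc = FakeMemcache()

# Daily/weekly: seed the key with an absolute value from one COUNT(*).
mc.set("results_ready_to_send", 412345)

# On each code path that touches the result table, record the delta
# right after the SQL statement.
mc.incr("results_ready_to_send", 256)   # splitter added a batch
mc.decr("results_ready_to_send", 100)   # feeder handed out work

print(mc.get("results_ready_to_send"))  # 412501
```

As the post says, the same key could also feed real-time stats on the website without touching the db.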

Norwich Gadfly
Joined: 29 Dec 08
Posts: 100
Credit: 488,414
RAC: 0
United Kingdom
Message 908663 - Posted: 18 Jun 2009, 7:10:27 UTC - in response to Message 908492.
Last modified: 18 Jun 2009, 7:10:56 UTC


Could some operations run offline from updating the database in real-time to be instead batched through more efficiently?

Martin


Yes, that reminds me of the dark ages (well, the 1970s actually) when updates were glorified merges of transactions with the previous master to create the new generation of the master. Master files were copied to indexed-sequential when random access was required.

Profile [seti.international] Dirk SadowskiProject donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7047
Credit: 59,739,633
RAC: 21,562
Germany
Message 908756 - Posted: 18 Jun 2009, 14:36:28 UTC


Thanks Matt for the (daily) update! *thumb up*


Could someone guess/say when SETI@home (and -Beta Test) will be again running well?

My GPU cruncher is offline and I miss the noise of the fan.. ;-)

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Richard HaselgroveProject donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8431
Credit: 47,857,396
RAC: 56,672
United Kingdom
Message 908757 - Posted: 18 Jun 2009, 14:37:19 UTC
Last modified: 18 Jun 2009, 14:37:42 UTC

Matt,

1) I'm sure you've noticed, but just in case you run straight to the database when you get in - you've got another problem: no uploads since yesterday. That can't be a database problem, of course, and it's not a bandwidth problem either (been below 40 Mbit/s all day). So I presume it's one of those pesky mounts, or an underlying file system problem on the upload data store.

2) I hadn't realised quite how successful you'd been in recruiting active users recently:



If they are going to enjoy a satisfactory user experience (and not take their spare cycles away again), you've GOT to get that database under control: and if it's grown too big to fit in memory, I don't see how that's going to be possible.

Instead of trying to get it to work at the present (bloated) size, have you thought about trying to reduce it to a manageable size?

Size will be proportional to (active_users)*(tasks_per_user): you want to maximize the first term, so that would mean reducing the second.

There is an unfortunate tendency among crunchers to increase cache sizes to the maximum at the first sign of trouble. That solves the problem from their own personal perspective, but I think too few of them (us?) stop to think about the strain they impose on the project they purport to support.

I saw an interesting FAQ at GPUgrid this morning, while looking for an alternative CUDA project: they "... give an additional 25% [credit] for WUs returned within two days. This is useful for us to reduce latency of the results ..." - in other words, use a bit of economic social engineering instead of brute database force to solve the problem. Look at host 4947578: eight cores, 5 GPUs, and a turnround of 19.4 days. How much does that add to your query load?

Profile [seti.international] Dirk SadowskiProject donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7047
Credit: 59,739,633
RAC: 21,562
Germany
Message 908758 - Posted: 18 Jun 2009, 14:39:15 UTC


@ ra]in-man

Have a look in the NC forum:
http://setiathome.berkeley.edu/forum_forum.php?id=10

You are not alone with this prob..

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile [seti.international] Dirk SadowskiProject donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7047
Credit: 59,739,633
RAC: 21,562
Germany
Message 908762 - Posted: 18 Jun 2009, 14:49:42 UTC - in response to Message 908757.
Last modified: 18 Jun 2009, 14:50:21 UTC

...
Look at host 4947578: eight cores, 5 GPUs, and a turnround of 19.4 days. How much does that add to your query load?


AFAIK, for example, if you switch off the PC while claimed credits are waiting to be granted, this will increase this value.


BTW.
Normally it should be 6 GPUs (3 x GTX295).. maybe the user has problems enabling the last GPU..?

____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile Virtual Boss*
Volunteer tester
Joined: 4 May 08
Posts: 417
Credit: 6,177,710
RAC: 329
Australia
Message 908764 - Posted: 18 Jun 2009, 14:56:11 UTC - in response to Message 908453.

This is really a problem when the splitter comes along every ten minutes and checks to see if there's enough work available to send out (thus asking the question: should I bother generating more work). This is a simple count on the result table, but if we're in a "bad state" this count which normally takes a second could take literally hours, and stall all other queries, like the feeder, and therefore nobody can get any work. There are up to six splitters running at any given time, so multiply this problem by six.


Is it possible to have different settings for each of the splitters?

If you could set individual timing & turn-on/turn-off thresholds based on queue size.....

For example -

1 splitter - every 10 mins, turnon @ <90% full, turnoff @ >100% full.
2 splitters - every 30 minutes, turnon @ <80% full, turnoff @ >95% full.
2 splitters - every 60 minutes, turnon @ <70% full, turnoff @ >85% full.
Rest of splitters - every 120 minutes, turnon @ <60% full, turnoff @ >75% full.

This would reduce the query load considerably while maintaining a good supply of WUs ready to send.

PS: The values I suggested were only guesswork on my part, and would probably have to be adjusted for optimum performance.
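The tiered schedule proposed above can be sketched as a small table-driven check with hysteresis per tier (all thresholds are the poster's guesswork; the "queue fill" fraction would come from whatever count the project exposes):

```python
# Tiered splitter scheduling sketch: each tier turns on below its low
# mark, turns off above its high mark, and keeps its state in between.
TIERS = [  # (splitters, period_minutes, turn_on_below, turn_off_above)
    (1, 10,  0.90, 1.00),
    (2, 30,  0.80, 0.95),
    (2, 60,  0.70, 0.85),
    (1, 120, 0.60, 0.75),
]

def active_splitters(queue_fill, running):
    """Return how many splitters should run for a given queue fill
    (0.0-1.0). `running` holds each tier's current on/off state."""
    n = 0
    for i, (count, _period, on_below, off_above) in enumerate(TIERS):
        if queue_fill < on_below:
            n += count                  # tier switches (or stays) on
        elif queue_fill <= off_above and running[i]:
            n += count                  # between marks: keep prior state
    return n

print(active_splitters(0.50, [False] * 4))  # 6 (queue low: all tiers on)
print(active_splitters(0.99, [False] * 4))  # 0 (queue nearly full)
```

The staggered periods in the second column would further cut how often each tier even asks the question.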

1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 908791 - Posted: 18 Jun 2009, 16:22:39 UTC - in response to Message 908757.

If they are going to enjoy a satisfactory user experience (and not take their spare cycles away again), you've GOT to get that database under control: and if it's grown too big to fit in memory, I don't see how that's going to be possible.

Hmmm....

Admittedly, I learned about databases when 64k was a lot of memory (and it was core, not RAM), but you sure couldn't keep the whole database in memory then, and I'm not sure why it "must" fit now. I don't claim to be a MySQL expert by any means; I just assume that it's competent.

You have to measure to work this out, but scanning the whole database is likely rarely needed. There is probably an active region that'd be nice to have cached.

But the whole database? Seems unlikely, even in this day of much RAM.

I can definitely see that some operations (like getting a count) would be expensive -- and would tend to push the active records from the cache, and it sounds like they've got a start on preventing those.

It does sound like MySQL does not recover used space well on the scale that SETI adds and removes records. That's a (bigger) flaw.
____________


Copyright © 2014 University of California