The Geese Fly West (Dec 10 2007)


log in

Advanced search

Message boards : Technical News : The Geese Fly West (Dec 10 2007)

Author Message
Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar
Send message
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 690450 - Posted: 10 Dec 2007, 23:26:47 UTC

We had another batch of "fast" workunits this weekend. No big deal, except we did run out of a ready-to-send queue for a while there. To help alleviate panic I added a couple items to the server status page for your (and our) diagnostic pleasure: count of results returned over the past hour, and their average "turnround" time (i.e. "wall" time between workunit download and its result upload). It seems the current "normal" average is about 60 hours, during the weekend we were as low as 30. It would be be more meaningful to have median instead of average (as there are always slow computers that turnaround mere seconds before the deadline, thus skewing the averages), but mysql doesn't have a "median" function and it's not really worth implementing one of our own - we have so many other fish to fry.

Our air conditioner tech was in today to wrap up work on fixing the current (and hopefully last) coolant leak. No real news there, except it was fun to see our temperatures shoot up 6 degrees Celsius within a few minutes as the air conditioner was temporarily turned off.

I'm about to start the latest donation drive. This will wreak havoc on a few of our isolated servers which are dedicated for such large mass mailings. Hopefully this will happen without incident - people are understandably sensitive about what they perceive as spam.

- Matt

____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile Dr. C.E.T.I.
Avatar
Send message
Joined: 29 Feb 00
Posts: 15993
Credit: 690,597
RAC: 0
United States
Message 690457 - Posted: 10 Dec 2007, 23:34:55 UTC
Last modified: 10 Dec 2007, 23:38:54 UTC

. . . as usual - Thanks for the Update Matt

< and Best of Luck with the Donation Drive Sir!

ps - almost forgot - real nice: Sleepfall

- Anthony Phillips - 'The Geese And The Ghost'
____________
BOINC Wiki . . .

Science Status Page . . .

Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Avatar
Send message
Joined: 20 Dec 05
Posts: 1939
Credit: 10,078,773
RAC: 17,173
United States
Message 690571 - Posted: 11 Dec 2007, 4:25:45 UTC

I really hate to keep harping on this, but the "client connection stats" is broken - and has been since at least Friday - possibly Thursday (I don't recall!)
____________
.

DJStarfox
Send message
Joined: 23 May 01
Posts: 1040
Credit: 547,399
RAC: 229
United States
Message 690585 - Posted: 11 Dec 2007, 4:48:15 UTC - in response to Message 690450.

Median in mySQL? Easy:
SELECT x.val
FROM data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*))

But you should look at the query plans between the above and avg() before deciding to put it into production. Only works with MySQL > 5.0; I got the tip from here:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html

Jesse Viviano
Send message
Joined: 27 Feb 00
Posts: 95
Credit: 474,230
RAC: 0
United States
Message 690591 - Posted: 11 Dec 2007, 5:27:20 UTC - in response to Message 690585.

Median in mySQL? Easy:
SELECT x.val
FROM data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*))

But you should look at the query plans between the above and avg() before deciding to put it into production. Only works with MySQL > 5.0; I got the tip from here:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html

Next time if you want to include source code, you can use code tags, like what is shown on this page. The result produces this:
SELECT x.val
FROM data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*))

It also seems that the code tags seem useful to draw tables using ASCII on these forums if you need to do that because the code tags' contents render in a monospace font.

Profile ML1
Volunteer tester
Send message
Joined: 25 Nov 01
Posts: 8412
Credit: 4,131,263
RAC: 1,418
United Kingdom
Message 690671 - Posted: 11 Dec 2007, 15:27:13 UTC - in response to Message 690591.
Last modified: 11 Dec 2007, 15:27:46 UTC

...Only works with MySQL > 5.0; I got the tip from here:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html

Good quote but not accurate: Just one small but vital ommission...

It should read:

SELECT x.val
FROM data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*)+1)/2


Further comment:

assumes (COUNT(*)+1)/2 returns an int, if not
then use INT((COUNT(*)+1)/2)
There are variations for the varied defs of
median too.



That lot could also be done on a unix command pipe using sort, head, wc, and bc/dc for those wishing a challenge for a minimalist solution!

Regards,
Martin
____________
See new freedom: Mageia4
Linux Voice See & try out your OS Freedom!
The Future is what We make IT (GPLv3)

archae86
Send message
Joined: 31 Aug 99
Posts: 888
Credit: 1,572,688
RAC: 5
United States
Message 690727 - Posted: 11 Dec 2007, 23:33:40 UTC - in response to Message 690591.

SELECT x.val
FROM data x, data y
GROUP BY x.val
HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*))

It also seems that the code tags seem useful to draw tables using ASCII on these forums if you need to do that because the code tags' contents render in a monospace font.

Much thanks for pointing that out. They did not work the way one would want when last I tried them (probably between one and two years ago). Looks like someone fixed something.

____________

Profile Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Avatar
Send message
Joined: 1 Mar 99
Posts: 1389
Credit: 74,079
RAC: 0
United States
Message 690735 - Posted: 12 Dec 2007, 0:08:37 UTC
Last modified: 12 Dec 2007, 0:09:07 UTC

To clarify, I did see all these median functions/queries but (a) the current average query was already complex enough that wrapping this additional logic around it, however simple, was enough to make me move on to a more productive task and (b) I have, perhaps unfounded, fears that this would eat up more memory than an average query and that's enough to move on to something else.

My day is usaully a lot of doing constant triage, picking low hanging fruit, playing whack-a-mole, etc. so if anything relatively low priority that smacks of non-zero difficulty or raising non-zero concern it is immediately put to the bottom of the list.

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Profile ML1
Volunteer tester
Send message
Joined: 25 Nov 01
Posts: 8412
Credit: 4,131,263
RAC: 1,418
United Kingdom
Message 690740 - Posted: 12 Dec 2007, 0:43:13 UTC - in response to Message 690727.

[...]
Much thanks for pointing that out. They did not work the way one would want when last I tried them (probably between one and two years ago). Looks like someone fixed something.

It would have been less confusing not to have reposted the incorrect code...

Good find on the code, but looks like we stay on something simpler for the time being...

Anyone game for the unix solution to keep the database bashing minimalist?

Happy crunchin',
Martin

____________
See new freedom: Mageia4
Linux Voice See & try out your OS Freedom!
The Future is what We make IT (GPLv3)

Profile ML1
Volunteer tester
Send message
Joined: 25 Nov 01
Posts: 8412
Credit: 4,131,263
RAC: 1,418
United Kingdom
Message 690741 - Posted: 12 Dec 2007, 0:46:43 UTC - in response to Message 690671.

Reposting for clarity.

Also trying "pre" rather than "code" 'cos the "=" looks too much like "-" when using "code"!

...Only works with MySQL > 5.0; I got the tip from here:
http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html

Good quote but not accurate: Just one small but vital ommission...

It should read:

SELECT x.val FROM data x, data y GROUP BY x.val HAVING SUM(SIGN(1-SIGN(y.val-x.val))) = (COUNT(*)+1)/2


Further comment:

assumes (COUNT(*)+1)/2 returns an int, if not
then use INT((COUNT(*)+1)/2)
There are variations for the varied defs of
median too.



That lot could also be done on a unix command pipe using sort, head, wc, and bc/dc for those wishing a challenge for a minimalist solution!


Happy crunchin',
Martin

____________
See new freedom: Mageia4
Linux Voice See & try out your OS Freedom!
The Future is what We make IT (GPLv3)

Profile Uioped1
Volunteer tester
Avatar
Send message
Joined: 17 Sep 03
Posts: 50
Credit: 1,179,926
RAC: 0
United States
Message 690892 - Posted: 12 Dec 2007, 17:45:59 UTC

I seem to recall that S@H uses InnoDB tables, so this probably won't work for them, but I had to post this for completeness sake:

SET @medianRow = 0; SELECT @medianRow = INT(table_rows/2) + 1 FROM INFORMATION_SCHEMA.TABLES WHERE table_schema = '<dbName>' AND table_name = '<statsTable>'; SELECT <stat> FROM <statsTable> ORDER BY <stat> LIMIT @medianRow, @medianRow


A couple of notes:
A) This defines the median as the middle _result_ not the middle _value_
B) It's been a while since I used mysql, so I may have some syntax errors
C) As I alluded to above, innodb tables maintain the rowcount as an estimate only, thus the median will be only an estimate.
D) It might be possible to turn off statistics generation for tables, and if so S@H probably has...

Performance Explanation:
This statement may or may not be faster than the average aggregate, depending on the index structure of the table. The average aggregate must by definition touch every row in the table (At best an index scan) whereas the query outlined above must at worst perform a table scan and a sort. On the other hand the best case for the query above is one index lookup.

Message boards : Technical News : The Geese Fly West (Dec 10 2007)

Copyright © 2014 University of California