Where are the stats?

Message boards : Number crunching : Where are the stats?

Angus
Volunteer tester
Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 115050 - Posted: 26 May 2005, 4:35:16 UTC
Last modified: 26 May 2005, 4:50:31 UTC

How hard is it to produce the XML stats at least once a day? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?

The project stats are worthless - and the 3rd party stats are starved for data more often than not.

What kind of a circus is this?

Edit - I guess a simple rant is what it takes, the XML stats appeared immediately after this post.
ID: 115050
Paul D. Buck
Volunteer tester
Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 115056 - Posted: 26 May 2005, 5:23:29 UTC - in response to Message 115050.  

How hard is it to produce the XML stats at least once a day? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?

The project stats are worthless - and the 3rd party stats are starved for data more often than not.

What kind of a circus is this?

Edit - I guess a simple rant is what it takes, the XML stats appeared immediately after this post.


No, it was just their time to be published.

This is a project-dependent issue: some projects issue updates more often, others only once a day. Statistics were always intended to be a 3rd party effort, and this is not going to change. The basic stats pages are there so that a project can use them if it does not want to link to external sites, for example a project that is strictly internally supported by the organization but uses BOINC as part of the solution.

In the BOINC Community, rants seldom get anyone anywhere ...
ID: 115056
Angus
Volunteer tester
Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 115059 - Posted: 26 May 2005, 6:25:14 UTC
Last modified: 26 May 2005, 6:37:57 UTC

Well, they were published sometime around 19:00 yesterday, and 21:40 (PDT or GMT-7) or so today...

What kind of cron job is that? I don't recall any kind of randomization in cron.

How are these stats "project dependent"? XML stats

In case anyone has been underground without computer access for the last 6 years or so, STATS are the lifeblood of Distributed Computing. Without STATS - accurate and TIMELY stats - the majority of contributors will drift away, leaving only those who are in it for the "science".

Since it is too much strain on BIONIC to produce real-time stats, the least they could do is produce accurate & timely stats on a published schedule.
ID: 115059
Tigher
Volunteer tester
Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 115072 - Posted: 26 May 2005, 9:21:37 UTC - in response to Message 115050.  

How hard is it to produce the XML stats at least once a day? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?

The project stats are worthless - and the 3rd party stats are starved for data more often than not.

What kind of a circus is this?



Having used Informix Enterprise myself over a number of years I do not think you can describe it as "mickey-mouse hobby-class database engine". It is FAR from that for sure! I don't disagree that stats should be produced more often and they are important to these projects. But there is a balance to be struck here with getting the work in/out and running what must be demanding stats SQL queries. Perhaps you should start a thread to see how much support there is for whatever idea you have, critically discuss it and then approach the project with the output of that disquisition.


ID: 115072
Paul D. Buck
Volunteer tester
Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 115106 - Posted: 26 May 2005, 13:02:04 UTC - in response to Message 115059.  

Well, they were published sometime around 19:00 yesterday, and 21:40 (PDT or GMT-7) or so today...

What kind of cron job is that? I don't recall any kind of randomization in cron.


CRON can be suspended, restarted, modified, etc.


How are these stats "project dependent"? XML stats


The timing of the output for XML stats is based on project ability, need, and desire. The same applies to the "Top" pages and to the "Pending" page, which is OFF for SETI@Home but ON for other projects. If Dr. Anderson decided not to do the "Top" pages, they too could be turned off ...


In case anyone has been underground without computer access for the last 6 years or so, STATS are the lifeblood of Distributed Computing. Without STATS - accurate and TIMELY stats - the majority of contributors will drift away, leaving only those who are in it for the "science".

Since it is too much strain on BIONIC to produce real-time stats, the least they could do is produce accurate & timely stats on a published schedule.


Well, truth be told, I doubt that there is much danger of anyone leaving. Even now, the projects are straining to keep up with the demand for work, and there is still a large number of people running SETI@Home Classic. So, if this is such an issue for you, I would hate to see you leave; we shall miss you ...

ID: 115106
Angus
Volunteer tester
Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 115109 - Posted: 26 May 2005, 13:19:25 UTC - in response to Message 115072.  


Having used Informix Enterprise myself over a number of years I do not think you can describe it as "mickey-mouse hobby-class database engine". It is FAR from that for sure! I don't disagree that stats should be produced more often and they are important to these projects.


Classic is running on Informix - BIONIC is running on MySQL.

Angus
full-time IDS user

ID: 115109
Tigher
Volunteer tester
Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 115121 - Posted: 26 May 2005, 13:56:24 UTC - in response to Message 115109.  



Classic is running on Informix - BIONIC is running on MySQL.



Hmmm...well I stand corrected. Question of cash again I guess. Having said that, I use MySQL too for domestic stuff, but I'm not sure how well it scales. Hey, perhaps you could offer to help them?

ID: 115121
Digger
Volunteer tester
Joined: 4 Dec 99
Posts: 614
Credit: 21,053
RAC: 0
United States
Message 115146 - Posted: 26 May 2005, 16:27:33 UTC - in response to Message 115050.  
Last modified: 26 May 2005, 16:51:34 UTC

Edit - I guess a simple rant is what it takes, the XML stats appeared immediately after this post.



LOL@Angus

Don't flatter yourself, pal.

Nobody's going to go out of their way to do anything for a big crybaby, especially the likes of you. Not a single one of your posts since joining this board has had anything positive or constructive or helpful to say about BOINC in any way, despite the fact that most folks here like the new platform quite a bit. So... you just continue to sit in your little corner and throw things across the room, have your little tantrums, and go "waaaaa! waaaaa! waaaaa!" while the rest of us have a good hearty laugh at you. :)

Tired of your crap... put your biggie boy pants on and make a helpful suggestion for once, or keep your mouth shut.

EDIT: Everyone here knows that you spell it BIONIC on purpose like some petulant little kid trying to get under us adults' skin. Just how freaking childish ARE you?? LOL.


ID: 115146
Angus
Volunteer tester
Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 115245 - Posted: 27 May 2005, 0:58:48 UTC - in response to Message 115146.  

Edit - I guess a simple rant is what it takes, the XML stats appeared immediately after this post.



LOL@Angus

Don't flatter yourself, pal.

Nobody's going to go out of their way to do anything for a big crybaby, especially the likes of you. Not a single one of your posts since joining this board has had anything positive or constructive or helpful to say about BOINC in any way, despite the fact that most folks here like the new platform quite a bit. So... you just continue to sit in your little corner and throw things across the room, have your little tantrums, and go "waaaaa! waaaaa! waaaaa!" while the rest of us have a good hearty laugh at you. :)

Tired of your crap... put your biggie boy pants on and make a helpful suggestion for once, or keep your mouth shut.

EDIT: Everyone here knows that you spell it BIONIC on purpose like some petulant little kid trying to get under us adults' skin. Just how freaking childish ARE you?? LOL.




At least I don't stoop to personal attacks. That would get you thrown off or banned from most boards.

I will reserve the right to criticize the project decisions or strategies.

BOINC BOINC BOINC Happy now? It still rolls off the fingertips easier as BIONIC.

Here's a constructive suggestion. Convert to an Enterprise Class database engine so queries will run faster and more efficiently, thereby freeing up more CPU time for a 4 times daily stats run.


ID: 115245
Digger
Volunteer tester
Joined: 4 Dec 99
Posts: 614
Credit: 21,053
RAC: 0
United States
Message 115248 - Posted: 27 May 2005, 1:02:05 UTC - in response to Message 115245.  

Here's a constructive suggestion. Convert to an Enterprise Class database engine so queries will run faster and more efficiently, thereby freeing up more CPU time for a 4 times daily stats run.


There... now don't you feel a bit more like an adult now and a bit less like a little whiner? :)

Dig
ID: 115248
Digger
Volunteer tester
Joined: 4 Dec 99
Posts: 614
Credit: 21,053
RAC: 0
United States
Message 115250 - Posted: 27 May 2005, 1:06:27 UTC - in response to Message 115050.  

How hard is it to produce the XML stats at least once a day? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?

The project stats are worthless - and the 3rd party stats are starved for data more often than not.

What kind of a circus is this?

ID: 115250
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 115253 - Posted: 27 May 2005, 1:13:17 UTC - in response to Message 115245.  

It still rolls off the fingertips easier as BIONIC.

At least it didn't cost 6 million 1970's US Dollars. :)


ID: 115253
Bruno G. Olsen & ESEA @ greenholt
Volunteer tester
Joined: 15 May 99
Posts: 875
Credit: 4,386,984
RAC: 0
Denmark
Message 115259 - Posted: 27 May 2005, 1:21:20 UTC

"and the 3rd party stats are starved for data more often than not"

I wonder how much statistical data is available to support that statement...


ID: 115259
Angus
Volunteer tester
Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 115272 - Posted: 27 May 2005, 1:58:16 UTC - in response to Message 115259.  
Last modified: 27 May 2005, 2:05:33 UTC

"and the 3rd party stats are starved for data more often than not"

I wonder how much statistical data is available to support that statement...



http://setiweb.ssl.berkeley.edu/forum_thread.php?id=14596

http://setiweb.ssl.berkeley.edu/forum_thread.php?id=14724

http://setiweb.ssl.berkeley.edu/forum_thread.php?id=14366

Those would seem to cover most of this month.

I know boincsynergy didn't have SETI stats for most of that time.
ID: 115272
Bruno G. Olsen & ESEA @ greenholt
Volunteer tester
Joined: 15 May 99
Posts: 875
Credit: 4,386,984
RAC: 0
Denmark
Message 115280 - Posted: 27 May 2005, 2:26:28 UTC - in response to Message 115272.  

Yes, I see how that amounts to 50+% of the time ;)

And BTW, changing to a different db won't change the fact that hard drives have limited space and the replica is slow ;)


ID: 115280
Angus
Volunteer tester
Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 115281 - Posted: 27 May 2005, 2:29:15 UTC
Last modified: 27 May 2005, 2:30:34 UTC

But when the drives are not full, it would let the stats be run more than once a day.

Why would the DBA/Sys Admins allow the drives to get that way in the first place?
ID: 115281
Toby
Volunteer tester
Joined: 26 Oct 00
Posts: 1005
Credit: 6,366,949
RAC: 0
United States
Message 115321 - Posted: 27 May 2005, 4:51:49 UTC - in response to Message 115050.  

How hard is it to produce the XML stats at least once a day? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?


From what I have observed, it takes somewhere between 20 and 45 minutes to generate the XML for SETI. During this time the entire database is slowed down considerably, so doing it every 4 hours would degrade the performance of the servers for a good portion of the day. There is more going on than a simple select * from <table>.

I believe the biggest bottleneck right now is the hosts table. Since users can choose to hide their hosts, each host must be checked before it is dumped out to the XML. Right now the db_dump program, which generates the XML, runs a separate SQL query on the user table for each host to determine whether the userid should be included in the host record. That is a LOT of queries for any database. This problem was brought to my attention within the last couple of days: a user had been deleted, so the query wasn't returning a result, which corrupted the XML badly enough to crash XML parsers.

There IS a better way to do it; however, this isn't a top priority for the developers, since it is working (most of the time) and there are other more pressing bugs to fix. Some of us in the stats "business" (AthlonRob in particular) are looking into fixing this to make it more efficient, which will hopefully reduce the error rate and possibly allow SETI to export XML more often. But that is the beauty of an open source project like BOINC. Instead of just complaining, we (the community) can actually help to make it better.
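To make the per-host query problem concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the actual db_dump code, and the schema is an assumption made for illustration: the table names `user` and `host`, the `show_hosts` flag, and the sample rows are all hypothetical. The first function issues one extra query per host, as described above; the second gets the same answer from a single LEFT JOIN, which also copes with a host whose owner row has been deleted.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (id INTEGER PRIMARY KEY, show_hosts INTEGER NOT NULL);
        CREATE TABLE host (
            id INTEGER PRIMARY KEY,
            userid INTEGER NOT NULL,
            total_credit REAL NOT NULL
        );
        -- user 1 shows hosts, user 2 hides them, user 3 has been deleted
        INSERT INTO user VALUES (1, 1), (2, 0);
        INSERT INTO host VALUES (10, 1, 500.0), (11, 2, 300.0), (12, 3, 42.0);
        """
    )
    return conn


def export_hosts_per_row(conn):
    """Slow pattern: one lookup on the user table for every host row."""
    hosts = conn.execute(
        "SELECT id, userid, total_credit FROM host ORDER BY id"
    ).fetchall()
    rows = []
    for host_id, userid, credit in hosts:
        owner = conn.execute(
            "SELECT show_hosts FROM user WHERE id = ?", (userid,)
        ).fetchone()
        # A missing owner (deleted user) is treated the same as a hidden host.
        visible = owner is not None and owner[0] == 1
        rows.append((host_id, userid if visible else 0, credit))
    return rows


def export_hosts_joined(conn):
    """Single query: the join decides whether the userid is exported."""
    sql = """
        SELECT h.id,
               CASE WHEN u.show_hosts = 1 THEN h.userid ELSE 0 END,
               h.total_credit
        FROM host AS h
        LEFT JOIN user AS u ON u.id = h.userid
        ORDER BY h.id
    """
    return conn.execute(sql).fetchall()
```

With N hosts the first version sends N + 1 statements to the database and the second sends one. Whether that is the whole story for the real export is something only profiling db_dump would show.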
ID: 115321
Tigher
Volunteer tester
Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 115332 - Posted: 27 May 2005, 5:38:17 UTC

I think Angus's point is that if an Enterprise level DBMS were in use (and I thought Informix was, but apparently it is not), there would be no replication issue: it would replicate in damned near real time onto a secondary server, and the 4-hourly run to produce the XML stats would be carried out on that secondary, not impacting the primary server at all. That way there would be no delays on transactions for WUs in/out, credits, etc. It's still a question of cost, I guess... but I do ask how Informix is being used on Classic, and whether it can be taken into use when Classic ends. It is better than MySQL as I see it; haha, I don't even know what version they are using and I'm making judgements... anyhow, you get my drift.
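The idea of running the export against a secondary copy can be sketched as follows. This is only an illustration of the pattern, not how the project's servers are set up: SQLite's backup API stands in for real database replication, and the `user` table and its rows are hypothetical. The point is simply that the heavy read runs on the copy while the primary keeps taking writes.

```python
import sqlite3


def make_primary():
    """Stand-in for the live database that handles work in/out and credit."""
    primary = sqlite3.connect(":memory:")
    primary.executescript(
        """
        CREATE TABLE user (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            total_credit REAL NOT NULL
        );
        INSERT INTO user VALUES (1, 'alice', 100.0), (2, 'bob', 250.0);
        """
    )
    return primary


def refresh_replica(primary):
    """Copy the primary into a separate database (a stand-in for replication)."""
    replica = sqlite3.connect(":memory:")
    primary.backup(replica)
    return replica


def export_user_stats(conn):
    """The expensive read; pass it the replica, never the primary."""
    return conn.execute(
        "SELECT id, name, total_credit FROM user ORDER BY total_credit DESC"
    ).fetchall()
```

A stats run would call `refresh_replica(primary)` (or, with real replication, just connect to the secondary) and hand that connection to `export_user_stats`. Updates that reach the primary afterwards do not show up in the export until the next refresh, which is the trade-off for keeping the load off the primary.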

ID: 115332
Bruno G. Olsen & ESEA @ greenholt
Volunteer tester
Joined: 15 May 99
Posts: 875
Credit: 4,386,984
RAC: 0
Denmark
Message 115363 - Posted: 27 May 2005, 10:25:26 UTC - in response to Message 115281.  

[quote]But when the drives are not full, it would let the stats be run more than once a day.[/quote]

Why?


ID: 115363
Ian Thompson
Volunteer tester
Joined: 3 Jan 04
Posts: 35
Credit: 182,911
RAC: 0
United Kingdom
Message 115374 - Posted: 27 May 2005, 11:15:56 UTC

Hi All



I have made the following point on several occasions.

Just cleaning the database up would improve the performance significantly.


Take my Team name Borg Technologies Ltd.

If I type Borg into a search about 20 teams show.

Only 4 have members and WUs.

Most of the teams entered seem to be redirectors to all sorts of things, from dating sites to online shops.

If this is true for every team (I know it won't be), it would represent a fivefold increase in the speed of stat production and need only a 5th of the disk resource.

I expect that, more realistically, a 20% saving could be achieved.

This is not to be brushed aside lightly: it is free to implement, apart from a bit of admin or script writing.
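The arithmetic behind those figures can be checked with a few lines of Python. This is a back-of-the-envelope sketch that assumes the cost of a stats run scales linearly with the number of rows, which is a simplification; the function names are made up for the example.

```python
def cleanup_estimate(total_rows, active_rows):
    """Estimate the effect of deleting inactive rows, assuming linear cost."""
    if not 0 < active_rows <= total_rows:
        raise ValueError("need 0 < active_rows <= total_rows")
    kept = active_rows / total_rows
    return {
        "kept_fraction": kept,               # share of rows (and disk) left
        "saved_fraction": 1 - kept,          # share of rows removed
        "speedup": total_rows / active_rows, # how many times faster the run is
    }


def speedup_from_saving(saved_fraction):
    """Convert a fractional saving into a speedup factor."""
    if not 0 <= saved_fraction < 1:
        raise ValueError("saved_fraction must be in [0, 1)")
    return 1 / (1 - saved_fraction)
```

For the search above, `cleanup_estimate(20, 4)` keeps a fifth of the rows and gives a speedup factor of 5, while `speedup_from_saving(0.2)` shows that the more realistic 20% saving corresponds to a run about 1.25 times faster.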

ID: 115374


 