Where are the stats?
Author | Message |
---|---|
Angus Joined: 26 May 99 Posts: 459 Credit: 91,013 RAC: 0 |
How hard is it to produce the XML stats at least once a day ??? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day? The project stats are worthless - and the 3rd party stats are starved for data more often than not. What kind of a circus is this? Edit - I guess a simple rant is what it takes, the XML stats appeared immediately after this post. |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
"How hard is it to produce the XML stats at least once a day ??? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?" No, it was just their time to be published. This is a project-dependent issue: some projects issue updates more often, others do this only once a day. Statistics was always intended to be a 3rd party effort and this is not going to change. The basic stats pages are there to allow a project to use them if they do not want to link to external sites, for example: a project that is strictly internally supported by the organization, but uses BOINC as part of the solution. In the BOINC Community, rants seldom get anyone anywhere ... |
Angus Joined: 26 May 99 Posts: 459 Credit: 91,013 RAC: 0 |
Well, they were published sometime around 19:00 yesterday, and 21:40 (PDT or GMT-8) or so today... What kind of cron job is that? I don't recall any kind of randomization in cron. How are these stats "project dependent"? XML stats In case anyone has been underground without computer access for the last 6 years or so, STATS are the lifeblood of Distributed Computing. Without STATS - accurate and TIMELY stats - the majority of contributors will drift away, leaving only those who are in it for the "science". Since it is too much strain on BIONIC to produce real-time stats, the least they could do is produce accurate & timely stats on a published schedule. |
Tigher Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
"How hard is it to produce the XML stats at least once a day ??? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?" Having used Informix Enterprise myself over a number of years, I do not think you can describe it as a "mickey-mouse hobby-class database engine". It is FAR from that for sure! I don't disagree that stats should be produced more often and that they are important to these projects. But there is a balance to be struck here between getting the work in/out and running what must be demanding stats SQL queries. Perhaps you should start a thread to see how much support there is for whatever idea you have, critically discuss it and then approach the project with the output of that disquisition. |
Paul D. Buck Joined: 19 Jul 00 Posts: 3898 Credit: 1,158,042 RAC: 0 |
"Well, they were published sometime around 19:00 yesterday, and 21:40 (PDT or GMT-8) or so today..." CRON can be suspended, restarted, modified, etc.
The timing of the output for XML stats is based on project ability, need, and desire. The same is true of the "Top" pages and of the "Pending" page, which is OFF for SETI@Home but ON for other projects. If Dr. Anderson decided not to do the "Top" pages, they too could be turned off ...
Well, truth be told, I doubt that there is much danger of anyone leaving. And, even for the projects, right now they are straining to keep up with the demand for work. And there is still a large number of people running SETI@Home Classic. So, if this is such an issue for you, I hate to see you leave, we shall miss you ... |
Angus Joined: 26 May 99 Posts: 459 Credit: 91,013 RAC: 0 |
Classic is running on Informix - BIONIC is running on MySQL. Angus full-time IDS user |
Tigher Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
|
Digger Joined: 4 Dec 99 Posts: 614 Credit: 21,053 RAC: 0 |
"Edit - I guess a simple rant is what it takes, the XML stats appeared immediately after this post." LOL@Angus Don't flatter yourself, pal. Nobody's going to go out of their way to do anything for a big crybaby, especially the likes of you. Not a single one of your posts since joining this board has had anything positive or constructive or helpful to say about BOINC in any way, despite the fact that most folks here like the new platform quite a bit. So... you just continue to sit in your little corner and throw things across the room, have your little tantrums, and go "waaaaa! waaaaa! waaaaa!" while the rest of us have a good hearty laugh at you. :) Tired of your crap... put your biggie boy pants on and make a helpful suggestion for once, or keep your mouth shut. EDIT: Everyone here knows that you spell it BIONIC on purpose like some petulant little kid trying to get under us adults' skin. Just how freaking childish ARE you?? LOL. |
Angus Joined: 26 May 99 Posts: 459 Credit: 91,013 RAC: 0 |
"Edit - I guess a simple rant is what it takes, the XML stats appeared immediately after this post." At least I don't stoop to personal attacks. That would get you thrown off or banned from most boards. I will reserve the right to criticize the project decisions or strategies. BOINC BOINC BOINC Happy now? It still rolls off the fingertips easier as BIONIC. Here's a constructive suggestion. Convert to an Enterprise Class database engine so queries will run faster and more efficiently, thereby freeing up more CPU time for a 4 times daily stats run. |
Digger Joined: 4 Dec 99 Posts: 614 Credit: 21,053 RAC: 0 |
"Here's a constructive suggestion. Convert to an Enterprise Class database engine so queries will run faster and more efficiently, thereby freeing up more CPU time for a 4 times daily stats run." There... don't you feel a bit more like an adult now and a bit less like a little whiner? :) Dig |
Digger Joined: 4 Dec 99 Posts: 614 Credit: 21,053 RAC: 0 |
"How hard is it to produce the XML stats at least once a day ??? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?" |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"It still rolls off the fingertips easier as BIONIC." At least it didn't cost 6 million 1970s US dollars. :) |
Bruno G. Olsen & ESEA @ greenholt Joined: 15 May 99 Posts: 875 Credit: 4,386,984 RAC: 0 |
|
Angus Joined: 26 May 99 Posts: 459 Credit: 91,013 RAC: 0 |
"and the 3rd party stats are starved for data more often than not" http://setiweb.ssl.berkeley.edu/forum_thread.php?id=14596 http://setiweb.ssl.berkeley.edu/forum_thread.php?id=14724 http://setiweb.ssl.berkeley.edu/forum_thread.php?id=14366 Those would seem to cover most of this month. I know boincsynergy didn't have SETI stats for most of that time. |
Bruno G. Olsen & ESEA @ greenholt Joined: 15 May 99 Posts: 875 Credit: 4,386,984 RAC: 0 |
|
Angus Joined: 26 May 99 Posts: 459 Credit: 91,013 RAC: 0 |
But when the drives are not full, it would let the stats be run more than once a day. Why would the DBA/Sys Admins allow the drives to get that way in the first place? |
Toby Joined: 26 Oct 00 Posts: 1005 Credit: 6,366,949 RAC: 0 |
"How hard is it to produce the XML stats at least once a day ??? Why can't these be produced every 4 or 6 hours? Is it because the mickey-mouse hobby-class database engine is brought to its knees by a simple query run 4 times a day?" From what I have observed, it takes somewhere between 20 and 45 minutes to generate the XML for SETI. During this time the entire database will be slowed down considerably. Doing it every 4 hours would degrade the performance of the servers for a good portion of the day. There is more going on than a simple select * from <table>. I believe the biggest bottleneck right now is the hosts table. Since users can choose to hide their hosts, each host must be checked before it is dumped out to the XML. Right now the db_dump program which generates the XML is doing a separate SQL query on the user table for each host to determine whether the userid should be included in the host record or not. That is a LOT of queries for any database. This problem was brought to my attention within the last couple of days: a user had been deleted, so the query wasn't returning anything, and that was screwing up the XML badly enough to crash XML parsers. There IS a better way to do it; however, this isn't a top priority for the developers since it is working (most of the time) and there are other more pressing bugs to fix. Some of us in the stats "business" (AthlonRob in particular) are looking into fixing this to make it more efficient, which will hopefully reduce the error rate and possibly allow SETI to export XML more often. But that is the beauty of an open source project like BOINC. Instead of just complaining, we (the community) can actually help to make it better. A member of The Knights Who Say NI! For rankings, history graphs and more, check out: My BOINC stats site |
Tigher Joined: 18 Mar 04 Posts: 1547 Credit: 760,577 RAC: 0 |
I think Angus's point is that if an Enterprise level DBMS was in use (and I thought Informix was, but apparently it is not) then there would be no replication issue, as replication would happen in damned near real time onto a secondary server, and the 4 hourly run to produce XML stats would be carried out on this secondary, thereby not impacting the primary server at all. That way there would be no delays on transactions for WUs in/out, credits etc etc. It's still a question of cost I guess.....but I do ask then how Informix is being used on classic, and whether it can be taken into use when classic ends - it is better than MySQL as I see it; haha I don't even know what version they are using and I'm making judgements.....anyhow, you get my drift. |
Bruno G. Olsen & ESEA @ greenholt Joined: 15 May 99 Posts: 875 Credit: 4,386,984 RAC: 0 |
|
Ian Thompson Joined: 3 Jan 04 Posts: 35 Credit: 182,911 RAC: 0 |
Hi All, I have made the following point on several occasions. Just cleaning the database up would improve the performance significantly. Take my team name, Borg Technologies Ltd. If I type Borg into a search, about 20 teams show. Only 4 have members and WUs. Most of the teams entered seem to be redirectors to all sorts of things, from dating sites to online shops. If this is true for every team (I know it won't be), it would represent a 500% increase in the speed of stat production and use a 5th of the disk resource. I expect that, more realistically, a 20% saving could be achieved. This is not to be brushed aside lightly. It is free to implement apart from a bit of admin or script writing. |
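
For anyone curious what Toby's point about db_dump looks like in practice (one query on the user table for every host, versus letting the database do the matching), here is a minimal sketch in Python using an in-memory SQLite database instead of the project's MySQL. The table layout, the `show_hosts` column, and the function names are assumptions made up for illustration; this is not BOINC's actual schema or the real db_dump code.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema, not BOINC's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE user (id INTEGER PRIMARY KEY, show_hosts INTEGER NOT NULL);
        CREATE TABLE host (id INTEGER PRIMARY KEY,
                           userid INTEGER NOT NULL,
                           total_credit REAL NOT NULL);
        -- user 1 shows hosts, user 2 hides them, user 3 has been deleted
        INSERT INTO user VALUES (1, 1), (2, 0);
        INSERT INTO host VALUES (10, 1, 500.0), (11, 2, 250.0), (12, 3, 75.0);
        """
    )
    return conn


def export_per_host(conn):
    """Per-row pattern: one extra query on the user table for every host."""
    records = []
    hosts = conn.execute(
        "SELECT id, userid, total_credit FROM host ORDER BY id"
    ).fetchall()
    for host_id, userid, credit in hosts:
        row = conn.execute(
            "SELECT show_hosts FROM user WHERE id = ?", (userid,)
        ).fetchone()
        # A deleted user yields no row; treat that as "do not expose userid".
        visible = row is not None and row[0] == 1
        records.append((host_id, userid if visible else None, credit))
    return records


def export_joined(conn):
    """Single-query pattern: a LEFT JOIN does the same check in one pass."""
    return conn.execute(
        """
        SELECT h.id,
               CASE WHEN u.show_hosts = 1 THEN h.userid END,
               h.total_credit
        FROM host AS h
        LEFT JOIN user AS u ON u.id = h.userid
        ORDER BY h.id
        """
    ).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(export_per_host(db))
    print(export_joined(db))
```

Both functions return the same records, but the first issues one query per host plus one for the host list, while the second issues a single query however many hosts there are. The LEFT JOIN also keeps a host whose user has been deleted in the output with the userid left empty, which is one possible way to handle the deleted-user case Toby mentions; whether that is the behaviour the real exporter should have is a separate question.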