Wish: An alternative statistics folder (like /sah/stats_split), where the files are split in more, smaller segments



John McFo
Volunteer tester
Joined: 6 May 00
Posts: 4
Credit: 52,841
RAC: 0
Germany
Message 1129 - Posted: 24 Jun 2004, 12:38:48 UTC

I'm asking because I don't want to download very large files like user_id.gz (23-Jun-2003, 20:18: 154MB!) just to create a few statistics for my small team.
I think many small teams can't afford daily downloads and parsing of such huge files the way the larger ones (e.g. boinc.dk) can. With smaller segments, they could download just the segments that hold the information about their own team and users.

(Sorry about my English, I'm German :) )

Mark Gray
Joined: 3 Apr 99
Posts: 22
Credit: 0
RAC: 0
United Kingdom
Message 1136 - Posted: 24 Jun 2004, 12:51:17 UTC

I agree with you (and I'm from one of those larger teams - Overclockers UK). I understand that the old system couldn't handle the number of users/teams in the public project (that was evident from the 19,000+ stats files), but this has gone from one extreme to the other.

If lots of teams start downloading 150MB+ every day to update their stats, that can't be good for Berkeley's servers. That isn't going to happen though because most teams don't have the resources to download and process 150MB (compressed) worth of XML stats every day. That's well into dedicated server territory (there's no way the average hosting account could handle that).

I understand that this wasn't a particularly high priority in getting the project released, so I'm just expressing my opinion that the solution we have now isn't workable. I hope it's just a temporary 'solution' while the project gets on its feet.

I'll look at a dedicated server if I must, but I'd rather not - at least for now.

John McFo
Volunteer tester
Joined: 6 May 00
Posts: 4
Credit: 52,841
RAC: 0
Germany
Message 1145 - Posted: 24 Jun 2004, 13:02:45 UTC

Thx for your opinion, Mark.

If they could split them into pieces of ~2-3MB, that would make the problem a bit easier. But if the number of pieces runs into the thousands or tens of thousands, it would be better to spread them among several servers/mirrors. (First thousand here, this thousand there, that thousand... hmm... ah, over there... - and so on)
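
A minimal sketch of how a small team could work out which segments it needs under such a split. Everything here is an assumption for illustration: the segment size of 1000 records, the `user_id_<start>-<end>.xml.gz` naming scheme, and the helper names are hypothetical and not something the project provides.

```python
def segment_name(record_id: int, segment_size: int = 1000,
                 prefix: str = "user_id") -> str:
    """Return the (hypothetical) file name of the segment holding record_id."""
    if record_id < 1:
        raise ValueError("record ids are assumed to start at 1")
    # First id stored in the segment that contains record_id.
    start = ((record_id - 1) // segment_size) * segment_size + 1
    end = start + segment_size - 1
    return f"{prefix}_{start:08d}-{end:08d}.xml.gz"


def segments_needed(record_ids, segment_size: int = 1000,
                    prefix: str = "user_id") -> list[str]:
    """Return the distinct segment files covering all given ids, sorted."""
    return sorted({segment_name(i, segment_size, prefix) for i in record_ids})
```

With this scheme, a team whose members have ids 5, 999 and 2500 would fetch two segment files instead of the whole 154MB dump.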

fathertyme
Joined: 4 Feb 02
Posts: 7
Credit: 61,804
RAC: 0
United States
Message 1754 - Posted: 25 Jun 2004, 7:21:21 UTC

Actually, even 1-2 megs per file isn't a really good answer.

Shoot, I download 7 team stats once a day, and even then only at non-peak hours.

Those 7 team stats are mine, plus 3 on either side of me (to watch the competition).
Now if those 7 teams are not within x number of IDs of each other, I'm looking at 7-14 megs for those alone. And that's not counting the fact that there is no team_standings list that I can reference to even know which teams are around me. So that would mean I have to download ALL the files just to find that out.

And I won't even get INTO all the seti-banners out there that simply provide individual user stats. PHP won't parse a 158 meg file to grab 3 lines of stats.

Something has to be done or all those small stats/news/etc pages out there, which REALLY encourage their users, and make the whole project fun, are going to die out.

I think they still need stats divided into individual teams, perhaps with each 100 teams in a different directory, plus the large files for those LARGE sites that need all the data.

All I know right now is that the current system is NOT going to work.

(sorry, I've written the setinews for my team for almost 2 years, and I don't even have a clue where to start with the stuff I'm seeing now, and I know how many members of my team crunch simply because of the news (or so they tell me) )
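
For anyone stuck with the big file in the meantime, here is a rough sketch of reading it as a stream instead of loading it whole, so memory use stays small even at 158 megs (the download itself is still the full size). It is written in Python rather than PHP, and the XML layout is an assumption: a flat list of `<user>` elements directly under the root, each with child elements such as `<name>`, `<total_credit>` and `<teamid>`. The function name `iter_team_users` is made up for this example.

```python
import gzip
import xml.etree.ElementTree as ET


def iter_team_users(path, team_id):
    """Yield one dict per <user> whose <teamid> matches team_id.

    Assumes <user> elements sit directly under the root element.
    """
    with gzip.open(path, "rb") as fh:
        context = ET.iterparse(fh, events=("start", "end"))
        # The first event is the start of the root element.
        _, root = next(context)
        for event, elem in context:
            if event == "end" and elem.tag == "user":
                if elem.findtext("teamid") == str(team_id):
                    yield {child.tag: (child.text or "").strip()
                           for child in elem}
                # Drop finished records so the tree never grows.
                root.clear()
```

Usage would look like `for user in iter_team_users("user_id.gz", 1234): print(user["name"])`, where 1234 stands in for your own team id.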

John McFo
Volunteer tester
Joined: 6 May 00
Posts: 4
Credit: 52,841
RAC: 0
Germany
Message 1760 - Posted: 25 Jun 2004, 7:34:10 UTC - in response to Message 1754.

An SQL DB perhaps? (MySQL, mSQL, MS SQL, PostgreSQL, whatever)
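
To make the idea concrete, a small sketch using SQLite from Python's standard library (standing in for any of the databases above). The table layout, column names and function names are assumptions for illustration only, not an existing project schema.

```python
import sqlite3


def build_db(rows, path=":memory:"):
    """Create (if needed) a users table and load (id, name, credit, teamid) rows."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users ("
        "id INTEGER PRIMARY KEY, name TEXT, total_credit REAL, teamid INTEGER)"
    )
    # An index on teamid keeps per-team lookups cheap as the table grows.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_users_teamid ON users(teamid)")
    conn.executemany("INSERT OR REPLACE INTO users VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def team_members(conn, team_id):
    """Return (name, total_credit) for one team, highest credit first."""
    cur = conn.execute(
        "SELECT name, total_credit FROM users "
        "WHERE teamid = ? ORDER BY total_credit DESC",
        (team_id,),
    )
    return cur.fetchall()
```

With something like this behind a small query page, a team site would ask for its handful of rows instead of downloading and parsing the full dump.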


Copyright © 2014 University of California