Message boards :
Number crunching :
Requesting Results in XML Output Format
| Author | Message |
|---|---|
Pappa Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0
|
Lee John has not seen this and had a chance to look at all this information. Read On... what's all this bonfire business about then? anything in particular being burnt? the first thing that comes to mind is old, dead computers lol The wife is going to go visit Sister in April... I would be happy if you have tools to help rebuild the cylinders on the backhoe and buy More Beer, for the Bonfire... In Fact I would be "Honored" flying down for the Bonfire Seen From Space.. John I still have the wood and many more are prime candidates What UCB has tucked away that they can migrate has not fully been published. As a Silly IT Professional having watched where computers have been and where they are now... I would prefer that if John really wants something special that users could end up having to pay for... then John needs to define that... As for the Bonfire Seen From Space, click on my name and follow my posts back... The Short Story is that Tony had a small fit of depression and talked about sitting in front of a Bonfire and how it makes people feel "Good!" So while that Thread is buried... it would be good to resurrect... Currently there are many "New and Old" Users that need to feel "Good" about what they are doing here... With this I am Off to Bed! R/ Al Please consider a Donation to the Seti Project. |
Lee Carre Joined: 21 Apr 00 Posts: 1459 Credit: 58,485 RAC: 0
|
what's all this bonfire business about then? anything in particular being burnt? the first thing that comes to mind is old, dead computers lol The wife is going to go visit Sister in April... I would be happy if you have tools to help rebuild the cylinders on the backhoe and buy More Beer, for the Bonfire... In Fact I would be "Honored" flying down for the Bonfire Seen From Space.. John I still have the wood and many more are prime candidates also, pappa, what about the classic servers? |
Lee Carre Joined: 21 Apr 00 Posts: 1459 Credit: 58,485 RAC: 0
|
Lee this is weird! ok, even more weird, links in your most recent post don't appear either, something to do with the http://www.sun.com/products-n-solutions/edu/promotions/index.html URL, but there seems to be nothing wrong with your BBCode, i checked it character by character lol your... (The Nice thing is that there is an) "educational discount" link didn't work either Numbered 1-12 Tests (click "reply" to view the BBCode): 1 http://www.sun.com/products-n-solutions/edu/promotions/ 2 URL 3 http://www.sun.com/products-n-solutions/edu/promotions/ 4 "http://www.sun.com/products-n-solutions/edu/promotions/" 5 =http://www.sun.com/products-n-solutions/edu/promotions/] 6 [url=http://www.sun.com/products-n-solutions/edu/promotions/ 7 url=http://www.sun.com/products-n-solutions/edu/promotions/ 8 url=http://www.sun.com/products-n-solutions/edu/promotions/] 9 10 http://www.sun.com/products-n-solutions/edu/promotions/]link text 11 [url]http://www.sun.com/products-n-solutions/edu/promotions/[url] 12 URL[url] oooook, all display apart from the ones with correctly written link code :S (numbers 1 & 2) so it's gotta be something with that URL hmmm, ok, this must be a BBCode interpretation error, because the HTML source only has [br] tags, there's no [a] tags at all in lines 1 and 2 ! yet if i edit my original post, it remembered the BBCode that i entered :S Numbered 1-12 Tests (click "reply" to view the BBCode):<br /> 1 <br /> 2 <br /> [url=http://www.google.com/]Google ^^ but other links work, ok this doesn't make sense at all long URL (with link text) long URL with link text http://www.network-tools.com/nslook/Default.asp?domain=setiathome.berkeley.edu&type=255&server=adns1.berkeley.edu&class=255&port=53&timeout=3000&no_recurse=true&go.x=0&go.y=0 ^^^ and other long URLs work too so it's just http://www.sun.com/products-n-solutions/edu/promotions/ that's a problem |
Pappa Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0
|
Tony The wife is going to go visit Sister in April... I would be happy if you have tools to help rebuild the cylinders on the backhoe and buy More Beer, for the Bonfire... In Fact I would be "Honored" flying down for the Bonfire Seen From Space.. In my younger days I cussed at a 12 footer that I was "told" to rebuild... LOL... John Al Please consider a Donation to the Seti Project. |
Pappa Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0
|
Lee this is weird! I would hate for some User to decide that a Donation link was not working... or the link where John can find information to purchase a New Sun Server for Seti... It has been stated that Matt L "Sorta" needs some New hardware to help... UCB / Seti is a Sun Shop and in order for you to develop a Portal for the users there would need to be some mutual form of communications (access to the overworked database). So connectivity aside (now that seti classic has closed there is a bit more left over). It could be doable.. Thank You for Pointing that out... I reposted the links. Al Please consider a Donation to the Seti Project. |
Lee Carre Joined: 21 Apr 00 Posts: 1459 Credit: 58,485 RAC: 0
|
It has been stated that Matt L "Sorta" needs some New hardware to help... UCB / Seti is a Sun Shop and in order for you to develop a Portal for the users there would need to be some mutual form of communications (access to the overworked database). So connectivity aside (now that seti classic has closed there is a bit more left over). It could be doable.. Pappa, do you know if one of the classic servers is going to function as the slave DB server that was talked about ages ago, but promptly abandoned due to performance reasons? also the link you posted just before you quoted john doesn't appear for some reason, it didn't appear in my reply either, so i'm guessing it's BBCode being helpful again! |
Lee Carre Joined: 21 Apr 00 Posts: 1459 Credit: 58,485 RAC: 0
|
perhaps force users to "sign-up" for the service, so that the stats that need to be accessed are the only ones being requested, via XML of course. An XML version of the page that is, not an XML export (as this just wouldn't be feasible), by using page.php?otherattribute=othervalue&format=xml in very much the way Neil Munday's signature site is set up, the advantage of Neil's way rather than the open boincstats way is that each user can customise things, like which projects to display, and have a custom sig for each project rather than a single general one this would remove a lot of extra load, and with XML versions of pages i don't think it would make much of a difference to UCB, thus removing the need for yet more servers, because the other problem is where does UCB put everything? they don't have a proper server-room as can be seen in SETI Photos |
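A minimal sketch of the `&format=xml` idea described in this post, assuming a page honours that query parameter. Elsewhere in this thread only `show_user.php` is shown answering it; the helper name `xml_variant` is made up for illustration and is not part of BOINC.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def xml_variant(page_url):
    """Return page_url with format=xml set in its query string.

    Any existing 'format' parameter is replaced; other parameters are kept
    in their original order.
    """
    parts = urlsplit(page_url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "format"
    ]
    query.append(("format", "xml"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(xml_variant("http://setiathome.berkeley.edu/show_user.php?userid=2084190"))
```

The point of building the URL this way is that a third-party site asks for one user's page at a time instead of pulling a full export.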
|
Astro Joined: 16 Apr 02 Posts: 8026 Credit: 600,015 RAC: 0 |
John I still have the wood and many more are prime candidates |
Lee Carre Joined: 21 Apr 00 Posts: 1459 Credit: 58,485 RAC: 0
|
I don't think they are going to want to put the results info into the stats-dump. They currently dump the teams, users and hosts data as compressed (.gz) XML files. The teams.gz file is the smallest, currently at 3.5 MB. Users.gz is at 24.2 MB, and hosts.gz is a whopping 98.9 MB. IF they were to also generate a 'results.gz' file, it would likely be 100's of MB in size. Not only would it constipate the BOINC database while it was being generated, the bandwidth required to distribute it would quickly become an issue.
I wasn't suggesting that they do an export for results/workunits, i was just saying that they don't export that data to the XML DB dump, in case john wasn't aware. it's more efficient to serve just the page as XML because that's usually what the user wants, one page, rather than ALL the results in an XML file. all the talk about this is about serving various pages as XML rather than some kind of export
Now, all this said, the BEST (in my opinion) place for data-collection about results completed is client-side.
not if you want some public web-based service for users (it could also give a pending credit summary more efficiently too, john, have you thought about this too?)
Now, I know there are various logging-programs out there now, and maybe some of them would implement this (if it is not already being worked on):
this is exactly what BoincView does, if you enable the "logging" option
the problem still remains, by doing this it's no longer client side, because you need to access the server, so it would be far more efficient to either do it all on berkeley's servers (not ideal though) or have a 3rd party site do it (like john's), the other problem is that there should be an XML version of the results page(s) otherwise you're "scraping" stats from the HTML which is very inefficient, and has caused some sites to (rightly) get their "collecting" IP blocked, one of these was just being lazy and not using the XML exports, no excuse for that, especially considering the load it put on the web server!
again, much more efficient to have this all server-side, otherwise you run into all the bandwidth issues you mentioned earlier, or do it on a 3rd party site, otherwise you've got yet something else pestering the SETI servers trying to update yet another table in the already stressed DB, or another DB which would require time and money, more efficient for berkeley to generate the numbers needed (like total workunits, and total time)
I am thinking that this might be a great tool for teams to use. And it has the advantages of not involving ANY project-resources except, of course, for the once a day 'results grab' (to get granted-credit data for each result). This puts the burden of stats collection where it rightly belongs.
well a user who ran a simple script to collect stats for his team (only a few times a day) got his IP blocked, so imagine the load of 710,000 hosts all making requests that cause DB access (which would just kill it), this is a definite no-no. the burden would end up being on UCB, and wouldn't be the casual "once a day" request that it seems, also all that traffic, as you said yourself, would go through the UCB network, which i'm sure they wouldn't be too happy about. the SETI front-end has a hard enough job keeping up with demands for work, never mind any additional fluff related to credits or monitoring, which would kill the bandwidth, servers, and database, not good. also as a whole, using your system, how would i check my WU/day (for example) against others? i don't see any method to do that, hence a web-based solution being the better option, and as someone is actually willing to do it, well, great :) |
Pappa Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0
|
John Hmmmmm I really do like this! From my understanding of the size of the Database and the current hardware that is used to generate the XML output, which takes an hour of processing time... It has been stated that Matt L "Sorta" needs some New hardware to help... UCB / Seti is a Sun Shop and in order for you to develop a Portal for the users there would need to be some mutual form of communications (access to the overworked database). So connectivity aside (now that seti classic has closed there is a bit more left over). It could be doable.. From some of my earlier posts in finding out what really runs Seti and a couple of threads attempting to help get another "Sun V40z" through Donations Sorry I could not miss the chance... This with Matt's statement about needing 2+ terabytes to backup Databases (which would also mean Outages would decrease in time or disappear)... You might be able to get UCB/Seti to buy into this... Say a Sun 4 proc Opteron (dual core) and about two terabytes of storage... Then it could handle most of what is needed and then still have time for your hourly stat dump... The Nice thing is that there is an Educational Discount... I will be happy to go look up the hardware and post a link if you desire... To: Willy and Toby & other posters to this thread; My wife chastised me for both of my small donations, but it keeps me off the streets... When the Time Comes, I look forward to flying down to Meet Tony and helping with the Bonfire Seen From Space! R/ Al Please consider a Donation to the Seti Project. |
KWSN - MajorKong Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0
|
John, take a look at my post in your other thread regarding the matter, you would be best to contact Dr. David Anderson as the result(s) data isn't part of the exported XML stats I don't think they are going to want to put the results info into the stats-dump. They currently dump the teams, users and hosts data as compressed (.gz) XML files. The teams.gz file is the smallest, currently at 3.5 MB. Users.gz is at 24.2 MB, and hosts.gz is a whopping 98.9 MB. IF they were to also generate a 'results.gz' file, it would likely be 100's of MB in size. Not only would it constipate the BOINC database while it was being generated, the bandwidth required to distribute it would quickly become an issue. Remember, everything *except* result uploads/downloads still uses the UCB network. The reason *why* result uploads/downloads were moved onto the Cogent line (purchased for this very purpose) was that UCB's network was being adversely affected by the S@H traffic, and UCB threatened to bill them for it if it was not reduced back below a cap. Now maybe they COULD move the stats dump files over to the Cogent line now that Classic is shut-down... But even *IF* S@H decides to do this, there is still the big hit that the BOINC database would take during generation. Now, all this said, the BEST (in my opinion) place for data-collection about results completed is client-side. Now, I know there are various logging-programs out there now, and maybe some of them would implement this (if it is not already being worked on):
https://youtu.be/iY57ErBkFFE #Texit Don't blame me, I voted for Johnson(L) in 2016. Truth is dangerous... especially when it challenges those in power. |
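For anyone consuming the compressed dumps mentioned in this post (teams.gz, users.gz, hosts.gz), a hedged sketch of reading one record at a time instead of loading the whole file. The `<users>`/`<user>` layout of the demo data is an assumption modelled on the single-user XML shown elsewhere in this thread, and `iter_records` is a made-up helper name.

```python
import gzip
import io
import xml.etree.ElementTree as ET


def iter_records(fileobj, tag):
    """Yield one dict per <tag> element from a gzip-compressed XML stream."""
    with gzip.open(fileobj, "rb") as stream:
        for _event, elem in ET.iterparse(stream, events=("end",)):
            if elem.tag == tag:
                yield {child.tag: child.text for child in elem}
                elem.clear()  # discard the record's children once consumed


# Tiny stand-in for a users dump; real files are tens of MB.
demo_xml = (
    b"<users>"
    b"<user><id>1</id><total_credit>10.5</total_credit></user>"
    b"<user><id>2</id><total_credit>3.0</total_credit></user>"
    b"</users>"
)
demo_gz = io.BytesIO(gzip.compress(demo_xml))

total = sum(float(rec["total_credit"]) for rec in iter_records(demo_gz, "user"))
print(total)
```

Streaming keeps memory use roughly flat on the consumer's side; it does nothing about the database load of generating the dump, which is the concern raised above.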
Lee Carre Joined: 21 Apr 00 Posts: 1459 Credit: 58,485 RAC: 0
|
John, take a look at my post in your other thread regarding the matter, you would be best to contact Dr. David Anderson as the result(s) data isn't part of the exported XML stats you might also want to contact Andy K about how his Pending Credit "calculator" works, as i'm sure it would use a lot of the techniques you're interested in (but requesting UCB to publish an XML version of pages is the best option) |
Toby Joined: 26 Oct 00 Posts: 1005 Credit: 6,366,949 RAC: 0
|
If you want data for many/all users you must use the XML supplied in the /stats/ directory on project websites. Seti only updates these once/day (and sometimes not at all - like today). I believe it takes well over an hour to generate those files so an hourly export is out of the question. And realistically speaking, no one needs hourly stats updates. The credit granting process is so asynchronous that it doesn't make much sense (to me) to do anything more frequent than once or twice per day. Attempting to poll hundreds/thousands of user stats through the web site (be it HTML or XML) is what will get your IP blocked. But as MajorKong said, you will want to contact Dave for a project of this magnitude. A member of The Knights Who Say NI! For rankings, history graphs and more, check out: My BOINC stats site |
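A small sketch of the "once or twice per day" rule of thumb from this post, so a collector never asks the project server more often than that. The 12-hour interval and the function name `due_for_refresh` are assumptions for illustration, not a limit published by the project.

```python
from datetime import datetime, timedelta

# Assumption: "once or twice per day" read as at most one fetch per 12 hours.
MIN_INTERVAL = timedelta(hours=12)


def due_for_refresh(last_fetch, now, min_interval=MIN_INTERVAL):
    """True when nothing was fetched yet or the last fetch is old enough."""
    return last_fetch is None or now - last_fetch >= min_interval


now = datetime(2006, 1, 2, 12, 0)
print(due_for_refresh(None, now))
print(due_for_refresh(datetime(2006, 1, 2, 6, 0), now))
print(due_for_refresh(datetime(2006, 1, 1, 23, 0), now))
```

Checking the timestamp before every request means a misconfigured cron job cannot turn into the kind of polling that gets an IP blocked.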
KWSN - MajorKong Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0
|
Yes, what you propose is quite a bit more involved than what I had assumed (keeping a database of your OWN results, hitting the berkeley server once a week, or so). What you propose is quite similar to the new 'account manager' functionality being added; the first account manager (gridrepublic) is due to open shortly. Contact this person: Dr. David P. Anderson Director and architect. Contact him at davea at ssl.berkeley.edu. https://youtu.be/iY57ErBkFFE #Texit Don't blame me, I voted for Johnson(L) in 2016. Truth is dangerous... especially when it challenges those in power. |
JRL Joined: 6 Dec 01 Posts: 23 Credit: 4,206,402 RAC: 0
|
To: Willy and Toby & other posters to this thread; Thanks for your input. Here is the reason behind my question. I am the CEO of a software Company that would like to further the goals of Berkeley, BOINC, and their distributed computing projects. I am not a programmer myself, thus my very non-technical questions. I have written a high-level whitepaper on developing a web-based portal that would have a dedicated resource or resources devoted to it. The portal will take all of the information distributed by BOINC and give you (the user) the ability to create your own customized portal with several pages of data with the information YOU want to see. It will also (in the beginning) give you the ability to perform other analysis that is not available on any of the other third party web-sites I have researched. In order to finish the whitepaper for presentation to the Board and approval, I have been posting these rather simplistic questions. There will also be a new "patented" feature that I know will display certain information that (almost) all users will be excited about and attempt to reinstate some of the features that vanished with classic SETI. Now having said that, will there be a problem with a complete XML dump at hourly intervals for the purpose of updating each user's portal and allowing me to track donated CPU time? Perhaps I should contact Berkeley directly? Many Thanks, John Do this and your IP will surely be blocked by Berkeley. The XML outputs are here to prevent the scraping of HTML pages. John R. Lee, Jr. Akuratus Corporation, CEO |
Toby Joined: 26 Oct 00 Posts: 1005 Credit: 6,366,949 RAC: 0
|
Do this and your IP will surely be blocked by Berkeley. The XML outputs are here to prevent the scraping of HTML pages. Not if you are only doing a single user's results and only once or twice a day. The sites that were blocked were scraping thousands of user/team stats which are already available in XML format. However HTML scraping is still kind of a nasty way of getting data. How about BoincLogX? This is a client-side utility that logs information about work units as they are processed by your client. A member of The Knights Who Say NI! For rankings, history graphs and more, check out: My BOINC stats site |
|
[BOINCstats] Willy Joined: 4 Mar 01 Posts: 201 Credit: 152,243 RAC: 0
|
Well, I have just gone over the BOINC source code, and I just don't see this function as being available. Perhaps you could write a perl script that would download this page in html, strip out the html with regexps, then place the remaining data into a database. This is how I used to handle a stats site I ran. Do this and your IP will surely be blocked by Berkeley. The XML outputs are here to prevent the scraping of HTML pages. Join team BOINCstats |
KWSN - MajorKong Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0
|
Thanks for your response. I have this information in XML format, I am looking for the page that lists the total CPU time taken to return each work unit. The XML feeds this page: Well, I have just gone over the BOINC source code, and I just don't see this function as being available. Perhaps you could write a perl script that would download this page in html, strip out the html with regexps, then place the remaining data into a database. This is how I used to handle a stats site I ran. Sorry I couldn't be of more help. Oh, and greetings from Lewisville, Tx. https://youtu.be/iY57ErBkFFE #Texit Don't blame me, I voted for Johnson(L) in 2016. Truth is dangerous... especially when it challenges those in power. |
JRL Joined: 6 Dec 01 Posts: 23 Credit: 4,206,402 RAC: 0
|
Thanks for your response. I have this information in XML format, I am looking for the page that lists the total CPU time taken to return each work unit. The XML feeds this page: http://setiathome.berkeley.edu/results.php?userid=2084190 My goal is to track time WU was sent, CPU time to complete WU, and Time WU was returned to the project. Thanks for your help! John To all, John R. Lee, Jr. Akuratus Corporation, CEO |
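The three things this post wants to track (time sent, CPU time, time returned) fit in a small record type. This is only a sketch: the class and field names are hypothetical, the sample values are invented for illustration, and nothing here reflects how the results page actually labels its columns.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ResultRecord:
    """One work unit as tracked by a hypothetical stats portal."""
    sent: datetime        # when the WU was sent to the host
    returned: datetime    # when the result came back to the project
    cpu_seconds: float    # CPU time reported for the WU

    @property
    def turnaround(self) -> timedelta:
        """Wall-clock time between sending and return."""
        return self.returned - self.sent

    @property
    def cpu_share(self) -> float:
        """Fraction of the turnaround that was spent computing."""
        return self.cpu_seconds / self.turnaround.total_seconds()


# Invented sample values, not taken from any real result.
record = ResultRecord(
    sent=datetime(2006, 1, 1, 0, 0),
    returned=datetime(2006, 1, 3, 0, 0),
    cpu_seconds=17280.0,
)
print(record.turnaround, record.cpu_share)
```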
KWSN - MajorKong Joined: 5 Jan 00 Posts: 2892 Credit: 1,499,890 RAC: 0
|
To all, http://setiathome.berkeley.edu/show_user.php?userid=2084190&format=xml This what you are looking for? <user> <id>2084190</id> <cpid>8788594b6382858ff50920512465d0e3</cpid> <create_time>1007648011</create_time> <name>Team: Lee</name> <country>United States</country> <total_credit>278624.629705</total_credit> <expavg_credit>2861.471542</expavg_credit> <expavg_time>1136093760.3394</expavg_time> <teamid>118944</teamid> <url>www.akuratus.com</url> <has_profile>1</has_profile> </user> EDIT: Oh... you wanted the results... gimmie a bit. https://youtu.be/iY57ErBkFFE #Texit Don't blame me, I voted for Johnson(L) in 2016. Truth is dangerous... especially when it challenges those in power. |
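As a rough illustration of consuming that output, the `<user>` document quoted in this post can be read with Python's standard `xml.etree.ElementTree`. The sample below is copied from the post; `parse_user` and the choice of which fields to convert are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET

# Copied from the show_user.php?format=xml output quoted in the post.
SAMPLE = """<user>
  <id>2084190</id>
  <cpid>8788594b6382858ff50920512465d0e3</cpid>
  <create_time>1007648011</create_time>
  <name>Team: Lee</name>
  <country>United States</country>
  <total_credit>278624.629705</total_credit>
  <expavg_credit>2861.471542</expavg_credit>
  <expavg_time>1136093760.3394</expavg_time>
  <teamid>118944</teamid>
  <url>www.akuratus.com</url>
  <has_profile>1</has_profile>
</user>"""


def parse_user(xml_text):
    """Turn a <user> document into a dict with typed values."""
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "").strip() for child in root}
    return {
        "id": int(fields["id"]),
        "name": fields["name"],
        "country": fields["country"],
        "total_credit": float(fields["total_credit"]),
        "expavg_credit": float(fields["expavg_credit"]),
        "teamid": int(fields["teamid"]),
        "has_profile": fields["has_profile"] == "1",
    }


user = parse_user(SAMPLE)
print(user["name"], user["total_credit"])
```

Note that this covers account-level totals only; per-result data such as CPU time per work unit is not in this document, which is the gap the thread is about.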
©2020 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.