Devs - Suggestion for improved web performance

Profile Lee Carre
Volunteer tester

Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 228798 - Posted: 10 Jan 2006, 10:42:55 UTC

I remember reading a while ago that there were some problems with the SETI site, and given the various DB problems too, I'd like to suggest that web caching be implemented/enabled. By that I mean turning on the options in your web server config that add HTTP headers, so that country-level and local-level caches can work to help you.

I notice that no caching techniques are used at all. By enabling them you'll see a much improved web service, which users will appreciate too: pages will seem to load faster, and there will be less load on the BOINC DB, because cached versions of pages will be used instead of being re-requested in their entirety.

There is a notably good guide about web caching that is recommended by many "web development" sites.
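
To give a flavour of what I mean, here's a minimal sketch (my own illustration, not anything from the guide or from UCB's actual setup; the same effect can come from server config directives, but since the site is PHP I'll show it in PHP) of the kind of freshness headers a page could send for content regenerated every couple of hours:

<?php
// Illustrative sketch only: freshness headers for a page that is
// regenerated every 2 hours, sent from PHP before any other output.
$generated = time();   // when this copy of the page was built
$max_age   = 7200;     // two hours, in seconds

header('Cache-Control: public, max-age=' . $max_age);
header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $generated) . ' GMT');
header('Expires: ' . gmdate('D, d M Y H:i:s', $generated + $max_age) . ' GMT');
?>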
ID: 228798
Profile Tigher
Volunteer tester

Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 228802 - Posted: 10 Jan 2006, 10:52:05 UTC

Hi Lee.
I'm not sure how this would work, even having looked at your guide. Are you suggesting that caches would be established outside of UCB? Wherever they are, how do they work given the dynamic content of forums and credits etc.?
Thanks
Ian

ID: 228802
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 228806 - Posted: 10 Jan 2006, 11:39:15 UTC - in response to Message 228798.  

I notice that no caching techniques are used at all. By enabling them you'll see a much improved web service, which users will appreciate too: pages will seem to load faster, and there will be less load on the BOINC DB, because cached versions of pages will be used instead of being re-requested in their entirety.


Some of the pages are already cached. For example, if you look at SETI.Germany and check the lower-right corner you'll see "Generated 10 Jan 2006 10:49:20 UTC", and if you look at the account data for their top producer, Hans Dorn, you'll see "Generated 10 Jan 2006 9:49:54 UTC". Even his computer list is a cached copy.

But many of the web pages aren't being cached, like the info a user sees when logging in to his own page, and none of the result overviews are cached. This info can't really be cached, for if it were it wouldn't be very useful for tracking down whether there's a problem with a computer or something.

The same is the case with the forums: there's no good way to cache something that is nearly constantly changing.
ID: 228806
Profile Lee Carre
Volunteer tester

Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 228820 - Posted: 10 Jan 2006, 12:38:03 UTC - in response to Message 228802.  
Last modified: 10 Jan 2006, 12:38:34 UTC

I'm not sure how this would work, even having looked at your guide. Are you suggesting that caches would be established outside of UCB?
Web cache servers/systems are already established: many ISPs use transparent proxies as web caches to provide a faster service to users, and I'm sure ISPs also use caches at the peering points where they connect to each other, for better performance.

There are also caches that serve whole countries, and from what I've read they're needed: considering the amount of information transferred daily over the internet, it simply wouldn't be feasible without all these caches dotted about.

What I'm suggesting has nothing to do with additional hardware. I'm suggesting that UCB enable options in the server config that add the HTTP headers which allow these caches to work (currently they can't, because of the lack of cacheability info about the pages, so they can't cache anything).

Wherever they are, how do they work...
OK, I'm guessing you're after some basic/background info, so I'll start with the very basic stuff.
The internet is just a large network: lots of devices connected together that can all communicate with each other.

The web is another "layer" on top of the internet, an "internet application" if you will (similar to the way your web browser runs on top of your operating system).

When you request a web page (keeping it simple here) your browser finds the server, contacts it, and requests the page; the page is sent, and your browser displays it. Great.

That was fine in the early days, but as more and more people used the web (and the internet for that matter), more and more bandwidth and server resources were needed to serve content at a decent speed, so people worked on ideas to improve efficiency and reduce the bandwidth needed. One of those ideas is "caching", which in the context of the web means keeping copies of web pages locally.

"Locally" can mean on the computer the page is being viewed on (a browser cache), and/or on a network server that the browsers on the internal network contact and ask for content, most commonly called a "proxy server". A network proxy exploits many of the good things about caching, the main one being: if PC1 requests a page, and then PC2 requests the same page, the proxy can just send its locally stored copy over the (faster) internal network, rather than the page having to be retransmitted over the net. The result is an almost instant display of the page, which makes users happy because things are faster, and it lets the network get by with a slower internet connection than it would otherwise need.
I worked for a company that had a few hundred employees, each of whom had a computer, and most of whom used the web most of the day to do their work. With a properly configured proxy server they only needed a 1 Mbps internet connection, which was much faster than before they had a proxy, even with a ~5 Mbps connection ;)

For all this to work, any cache needs information about the web content it stores; the most basic principle is whether the copy it has stored locally is the same as the copy on the server, i.e. is the local copy "fresh"?

Now, I won't try to give a history lesson here, because many things have changed and it's going a bit off track too, but the current possibilities include being able to specify when a document was last changed (Last-Modified: "date and time"),

and when the document/file is no longer "fresh" (Expires: "date and time").
(You can use "conditional requests" with the above, which will only download the page if it's changed; but "Expires" is only accurate to a second, so for rapidly changing content (like a forum thread) it has some flaws, which will be discussed later...)

There's also how long a document is to be considered "fresh" before it should be checked on the server again (Cache-Control: "number of seconds from now"); for that check only the headers are downloaded, not the whole document, unless the document has changed, but finding that out is the purpose of the check. You can do a variety of very useful things with Cache-Control, and control how a cache handles your content much more precisely. The main one is the ability to indicate to caches how long you expect a file/document to remain fresh. Images, for example, should have a long time before they expire, because they're unlikely to change. As mentioned in the "tips" of the guide linked, if you need to "change" an image that still has a long time before it expires, you can point to a new image file instead: with a relatively short expiration time on the document referring to the image, the new image will be picked up quickly. If the change isn't important, you can still just replace the original image file and the new version will spread over time (good if you have server load/bandwidth restrictions and don't want lots of people downloading your new image at once).

There are a few other useful headers as well. One is called "ETag", which solves the problem of "Expires" only being accurate to a second. An ETag is different in that it's a hex string generated by the server which changes every time the document changes, so there's no dispute over whether a document has changed or not. If the content changes, the ETag changes, which allows more precise conditional requests to be made: "if the ETag has changed, then re-download the document".
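
To make that concrete, here's a minimal sketch (illustrative only; build_page() is a made-up stand-in for whatever actually produces the page) of a script deriving these validators for the content it's about to send:

<?php
// Illustrative sketch: derive an ETag and Last-Modified for the
// content this script is about to send. build_page() is a
// hypothetical stand-in for the real page generator.
function build_page() { return "<html><body>example page</body></html>"; }

$content = build_page();
$etag    = '"' . md5($content) . '"';   // changes whenever the bytes change

header('ETag: ' . $etag);
header('Last-Modified: ' . gmdate('D, d M Y H:i:s') . ' GMT');
echo $content;
?>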

Armed with all this information (and the more the better), a site can make good use of the caches already out there, and have a much more responsive site using far fewer resources. Google, for example, doesn't use any of these options, but they spend a ton of money deploying more servers around the world; seems kinda stupid, doesn't it?

...given the dynamic content of forums and credits etc.?
Right, dynamic content by its nature is harder to cache, because it changes so often/quickly, but that doesn't mean it's "uncacheable"; it just means you can't cache it for as long. And there are many old threads that reach a point where they don't change for months, if they ever change again; these would benefit most from caching. Think how many DB requests need to be made for a 300-post thread; quite a few, I'd imagine. Caching can help reduce the load on UCB's stressed DB server as well, which I'm sure will make Matt and co happier too.

Even a recent forum thread isn't *constantly* changing. And even if you do have constantly changing content on a server, having caching headers won't make things any worse than not having them, but by leaving them out you're removing any possibility of a cache helping. When the content changes, it's re-downloaded; while it doesn't change, it can be served by caches.

A thread will be static for at least some amount of time, and there are usually many more "views" than "posts" in a thread. Every view that can be served from a cache (because nobody has posted, so the thread hasn't changed) reduces the load on the whole system and makes for a faster experience for all users, because the web server and DB won't be so busy. I recall quite a few threads commenting on website/forum performance; caching will help speed everything up.

Even if caching weren't used for the forum (although there's no reason why it shouldn't be), it can be used for all the static files. The RSS feed, the front page, and all the images are the first things that come to mind, as these are probably the most frequently requested files, but any traditionally "static" (rather than "dynamic") content will benefit greatly from using all the caching headers. At present not even the home page has any of these options enabled; it's downloaded in its entirety every time someone wants to view it.

For PHP code it's a bit different, because the PHP file produces different content depending on the request made (all threads use "forum_thread.php", but to see different threads the "?id=26856" at the end changes to indicate the ID of the thread the user is asking for), so things have to be done internally as part of the PHP code. That's not too difficult, though, and details are given in the caching guide (link in original post).
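
As a minimal sketch of that internal approach (my illustration only; get_thread_mtime() is a hypothetical helper, and the real code would naturally differ):

<?php
// Illustrative sketch for a dynamic page like forum_thread.php.
// get_thread_mtime() is a hypothetical helper: one cheap DB query
// returning the time of the last post in the requested thread.
function get_thread_mtime($id) { return time() - 3600; /* stub */ }

$id    = (int) ($_GET['id'] ?? 0);
$mtime = get_thread_mtime($id);

header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $mtime) . ' GMT');

$since = $_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? '';
if ($since !== '' && strtotime($since) >= $mtime) {
    header('HTTP/1.1 304 Not Modified');
    exit;  // the client's copy is current: skip the expensive queries
}
// ...otherwise fall through to the usual queries and page output...
?>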

Suggestions: for all static pages (regular HTML, images, the RSS feed etc.) I'd suggest using all the options, to allow the best use of caches.

For frequently changing content such as the forums (lists of threads) and threads (lists of posts), I'd suggest leaving Cache-Control out and just using the "ETag" and "Last-Modified" headers. The browser will be able to use these to check whether a page has changed, rather than being handed an estimate of when it might change; you don't know when that will be, and the frequency of change drops over time, so telling it "this page is fresh for xxxx seconds" isn't a good idea, and could mess up the order of posts and such. So just use "ETag" and "Last-Modified".

(By the way, although the value for "Expires" needs to be an absolute date, you don't have to set it manually; you can get the server to generate it dynamically at, say, an hour from the time the content was served. You can also have separate values for Cache-Control and Expires: Cache-Control tells a cache how long it can safely serve content without checking its freshness with the server, while Expires specifies when content is definitely no longer fresh and should at least be checked on the server, and re-downloaded if changed.
Different caches handle this differently. To explain: "Expires" is the older header and "Cache-Control" the newer one, and best practice is to use both; that way you provide backwards compatibility for older caches, and enable the new features that Cache-Control brings.
Some caches consider "Expires" to mean "this content has expired" and will re-download it from the server; more intelligent caches will only check whether it's changed, and only download it again if it has. They'll also update the local copy with the header information received while checking, so it'll be "fresh" until the date/time specified in the headers from the most recent "check" request.
So if Cache-Control is set to an hour: when that hour is up, the cache checks whether the copy it has is still "fresh" compared to the one on the server (using Last-Modified and/or ETag). If nothing has changed, it carries on using the local copy for another hour, and this cycle repeats until the content changes or the date from "Expires" is reached; if a check finds the content has changed (a new Last-Modified and/or ETag), it re-downloads the content.)

I hope the above helps the non-tech folk make more sense of the caching guide in my original post.
If anyone would like further explanation, or has any questions, please feel free to fire away and I'll offer my advice.

Phew, what a long post; I'm off to grab a beer after all that ;)
ID: 228820
Profile Lee Carre
Volunteer tester

Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 228827 - Posted: 10 Jan 2006, 13:05:03 UTC - in response to Message 228806.  
Last modified: 10 Jan 2006, 13:09:30 UTC

Some of the pages are already cached. For example, if you look at SETI.Germany and check the lower-right corner you'll see "Generated 10 Jan 2006 10:49:20 UTC", and if you look at the account data for their top producer, Hans Dorn, you'll see "Generated 10 Jan 2006 9:49:54 UTC". Even his computer list is a cached copy.
Yes, I noticed that, but that's just page content.
Assuming the DB generates this every so often (rather than dynamically when the page is requested) and keeps it in some cache somewhere, it'll only reduce the load on the DB (though the DB still has to serve the request from its cache); it won't have any effect on serving the web page. The two are separate things: I'm focusing on sending web content out over the net, rather than on any internal improvements at UCB (not that internal caching is a bad thing, but the generated web page still has to be re-sent over the net).
Also, looking at the HTTP headers for these pages, there are some caching headers in use:
Cache-Control: public, max-age=7200
Last-Modified: Thu, 01 Jan 1970 00:00:00 GMT
Expires: Thu, 01 Jan 1970 02:00:00 GMT

The Cache-Control header is good: it says the page is "fresh" for 7200 seconds, which is 2 hours. But this should match the frequency with which the page is generated, and Expires should (ideally) be the date/time at which it'll next change. So if this page were generated at 00:00:00 on Monday and regenerated daily, then Cache-Control should be "max-age=86400" and Expires should be 00:00:00 on Tuesday.
As it stands, though, Last-Modified and Expires both specify a date in the past (the Unix epoch), so caches that honour Expires will treat the page as already expired, and a Last-Modified of 1970 makes revalidation meaningless; in practice the page just gets re-downloaded from the server. This needs to be fixed for caching to work, and as mentioned in my long post, the server can generate the dates dynamically, relative to when the content was served: say 6, 12, or 24 hours ahead, whichever the admins wish. Even a short Expires value will help greatly, though longer is preferable; if the pages are only generated daily, a 24-hour value would be ideal. This will need to be done in the PHP code, changing Last-Modified (and an ETag) whenever the page changes.
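
For illustration, if the page really was generated at 10:49:20 UTC and regenerated every 2 hours, a consistent set of headers would look something like this (dates invented to match the example above):

Cache-Control: public, max-age=7200
Last-Modified: Tue, 10 Jan 2006 10:49:20 GMT
Expires: Tue, 10 Jan 2006 12:49:20 GMT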

But many of the web pages aren't being cached, like the info a user sees when logging in to his own page, and none of the result overviews are cached.
As I've said, from a web point of view none of the pages are cacheable: the headers either aren't there, or they prevent caching (an Expires value in the past). But it's entirely possible, and having the caching headers included won't have any negative effects if the server is configured properly, even if the content is only static for 5 minutes (on a busy thread this will make a big difference). If the changes are applied to the whole site, I'm sure they'll see a large drop in bandwidth usage and CPU usage on the web server; and even if those aren't problems for UCB, doing so will give users a much more enjoyable experience. It takes a good minute to load a long thread even over a fast connection; with caching enabled this will be almost instant the majority of the time.
This info can't really be cached, for if it's cached it's not very useful for tracking down whether there's a problem with a computer or something. The same is the case with the forums: there's no good way to cache something that is nearly constantly changing.
Ah, maybe I didn't explain too well, so my apologies; but for something that you know is dynamic and will change often (especially if you don't know when, or how often, it will change) it's still more than possible to have efficient caching.
Even the most basic caching techniques don't just keep content for a length of time they decide and only check every day/hour or something (and Cache-Control lets you tell a cache to keep something for a length of time you specify, after which it should check again).
For the forums, result pages etc. (all dynamic content), as I suggested: leave out Cache-Control and just use Last-Modified, ETag and Expires. The browser will then always check freshness when the page is viewed, and if the page has changed it'll download the new one, so caching won't cause any of the "expired/out of date/old content" problems you mention.
ID: 228827
Profile Tigher
Volunteer tester

Joined: 18 Mar 04
Posts: 1547
Credit: 760,577
RAC: 0
United Kingdom
Message 228846 - Posted: 10 Jan 2006, 14:02:52 UTC

Lee, what a great explanation. A lot of effort from you in doing that, and thanks. I wish I had been a bit more sincere and said my question was "tongue in cheek", as I know about caching. I feel guilty now. Sorry to have put you to that trouble.

But as you say, many ISPs (most, I think) employ caching, so if it can be cached then surely it already is. Given how people here squeal when the very latest result upload doesn't show a credit straight away, I guess that trying to cache anything dynamic would cause very much louder squeals.

Again thanks.

ID: 228846
Profile Lee Carre
Volunteer tester

Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 228856 - Posted: 10 Jan 2006, 14:30:19 UTC - in response to Message 228846.  
Last modified: 10 Jan 2006, 14:35:41 UTC

Lee, what a great explanation. A lot of effort from you in doing that, and thanks. I wish I had been a bit more sincere and said my question was "tongue in cheek", as I know about caching. I feel guilty now. Sorry to have put you to that trouble.
Haha, no worries; that made me laugh. The info's there for others to read (hopefully "others" will include the devs ;) )

But as you say, many ISPs (most, I think) employ caching, so if it can be cached then surely it already is.
Well, my point was that a "caching infrastructure" already exists, but a cache will only keep and use local copies if it has the relevant info from the server (otherwise how is it meant to know how long to keep a copy, when it expires, or when it has changed?).
So unless the server gives out the info needed, no caching occurs (well, it shouldn't occur).
Hence my suggestion that the web server admins/devs take a look at this, because it's quite simple to do.

Also, if BOINC is aimed at projects without much money, then this will help reduce the bandwidth needed, and thus save even more money :)

Given how people here squeal when the very latest result upload doesn't show a credit straight away, I guess that trying to cache anything dynamic would cause very much louder squeals.
Nope, it won't be a problem as long as the server is configured properly. As long as you don't tell caches "keep this for xx amount of time" (using Cache-Control) and only use the Last-Modified, Expires and ETag headers, anyone viewing a page will always see the most recent version, because the browser will know when it's been updated (the headers will have changed, and without Cache-Control it checks freshness every time the page is viewed).

So the idea of "out of date" content isn't a problem at all: as soon as the page changes, the Last-Modified and ETag headers change; and even if that weren't the case, by using Expires you force the browser to at least re-check the freshness of the content at the specified date.
And as long as the content is the same, it can be served from caches, which will make everything faster :)
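
To illustrate the check (all the values here are invented): the browser revalidates with a conditional request, and if nothing has changed the server answers with a tiny 304 instead of the whole page:

GET /forum_thread.php?id=26856 HTTP/1.1
Host: setiathome.berkeley.edu
If-Modified-Since: Tue, 10 Jan 2006 14:30:19 GMT
If-None-Match: "a1b2c3d4"

HTTP/1.1 304 Not Modified
ETag: "a1b2c3d4"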
ID: 228856
Profile [B^S] Paul@home
Volunteer tester

Joined: 20 Dec 99
Posts: 121
Credit: 1,885,420
RAC: 0
Ireland
Message 228859 - Posted: 10 Jan 2006, 14:40:45 UTC - in response to Message 228856.  

Haha, no worries; that made me laugh. The info's there for others to read (hopefully "others" will include the devs ;) )


Well I'm no dev, but I read it and found it interesting! :)

Paul.

Wanna visit BOINC Synergy? Click my stats!

Join BOINC Synergy Team
ID: 228859
Profile Lee Carre
Volunteer tester

Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 228915 - Posted: 10 Jan 2006, 17:54:26 UTC - in response to Message 228859.  
Last modified: 10 Jan 2006, 18:33:12 UTC

Well I'm no dev, but I read it and found it interesting! :)

Paul.
Thanks Paul :) Glad to know half an hour of typing was worthwhile ;)
Would all those who agree with my first post give it a +1?

Another thought regarding the content-freshness concerns: if users really need to make sure they have the most current version, they can always manually refresh the page in their browser, which re-downloads the page and everything on it (all images, completely) from the server(s).
So again, there isn't an issue here; and if it's implemented properly (the PHP code changing the Last-Modified and ETag values when the page changes) you won't have to use your F5 key at all, because things will work as they should anyway. Like I said, for dynamic content don't use Cache-Control, so that the browser always checks for freshness and re-downloads the page if it has changed :)
And besides, even if it were an issue, the results pages don't change too often anyway, and there's always F5 just in case ;)

I've had a quick flick through a few pages of the Number Crunching forum, and on just the first page there are threads with fewer than 100 posts but at least 1000 views, and one with 70-something posts and nearly 2500 views; every one of those views has to be built from DB requests when it could have been served from a cache.
So even the forum could benefit greatly from caching; it's far from "uncacheable".
ID: 228915
Profile AndyK

Joined: 3 Apr 99
Posts: 280
Credit: 305,079
RAC: 0
Germany
Message 229000 - Posted: 10 Jan 2006, 21:47:20 UTC - in response to Message 228915.  

...Thanks Paul :) Glad to know half an hour of typing was worthwhile ;)

Yes, it was worth the effort!

Would all those who agree with my first post give it a +1?

So I did. :-)

AndyK
Want to know your pending credit?


The biggest bug is sitting 10 inch in front of the screen.
ID: 229000
Profile Lee Carre
Volunteer tester

Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 229010 - Posted: 10 Jan 2006, 22:09:21 UTC

Just so everyone knows, I've emailed Matt with a link to this thread, so that we're not all pestering him with multiple emails ;)
ID: 229010
Profile AthlonRob
Volunteer developer

Joined: 18 May 99
Posts: 378
Credit: 7,041
RAC: 0
United States
Message 229558 - Posted: 11 Jan 2006, 18:59:38 UTC - in response to Message 229010.  

Just so everyone knows, I've emailed Matt with a link to this thread, so that we're not all pestering him with multiple emails ;)

Matt mainly does SETI@home work; I think we're discussing more general BOINC stuff here, specifically the web/forum code. That's Janus' area of expertise. I help a bit, too... and David has done a bit of code there.

Give me a few minutes and I'll write up a response to your main point. :-)
Rob
ID: 229558
Profile AthlonRob
Volunteer developer

Joined: 18 May 99
Posts: 378
Credit: 7,041
RAC: 0
United States
Message 229577 - Posted: 11 Jan 2006, 19:16:57 UTC

Let me begin by saying I really like the idea of speeding up the forums, reducing server load, and doing things the Right Way(tm).

I hadn't looked at implementing any kind of caching in the forums for reasons already mentioned in this thread. The content is constantly changing, so why cache it? However, Lee makes a valid point: It's viewed a whole lot more than it's changed.

The first issue that popped into my head was: how do we know whether the page has changed (via a hash (ETag) or a remembered Last-Modified value) without generating the page? I don't think that's been addressed here. The resources we need to save are database resources; saving bandwidth is a minor issue (although an important one for dial-uppers) and more easily solved with simple compressed output.

Well, we're going to need another database query to find out when the page last changed. We can then report that to the browser/caching proxy/whatnot. But by the time we've done that query and output that information, our PHP script is already running... we can't just stop it, can we? Once it's running, it's going to grab the entire thread and go from there...

The more I think about this, the more I'm beginning to understand why Janus hadn't implemented this before. It's a can of worms...

However, I think we can utilize caching here in the forums to speed things up for people. From the code point of view, we'd need to modify Janus' server-side caching implementation to do some additional checks before it decided if it was to use a cached copy of the page or do the regular DB queries. It would need to have three versions of the page cached, server-side... one for regular logged in users, one for special logged in users (moderators), and one for users who aren't logged in. Ugh, no - user preferences change how the page is viewed... you may have images enabled while I have images disabled. You may have signatures enabled while I have signatures disabled. You may have "Show newest first" while I have things sorted to show the oldest post in a thread first.

How would we work around that?

The content is very dynamic, not only changing from one minute to the next, but changing from one user to the next. Yes, we *could* cache individual versions of the page server-side, one for each set of viewing preferences, and pick out the right one for the person viewing it (and then we'd have a real copy of the page to use an ETag on)... but that would be ugly and save relatively few DB queries.

It seems caching could save us some bandwidth... but projects generally have plenty of that to spare for things like web pages. The forums eat up virtually no bandwidth compared with things like workunit downloads or science application downloads. It's the database we need to save and I'm not convinced there's a caching scheme that will reduce the DB server load effectively.

I'm certainly open to suggestions, though! :-)
Rob
ID: 229577
Profile AthlonRob
Volunteer developer

Joined: 18 May 99
Posts: 378
Credit: 7,041
RAC: 0
United States
Message 229587 - Posted: 11 Jan 2006, 19:23:34 UTC - in response to Message 229577.  

Well, we're going to need another database query to find out when the page last changed. We can then report that to the browser/caching proxy/whatnot. But by the time we've done that query and output that information, our PHP script is already running... we can't just stop it, can we? Once it's running, it's going to grab the entire thread and go from there...

Actually, no, this point is addressed in the link Lee provided:
If you can’t do that, you’ll need to make the script generate a validator, and then respond to If-Modified-Since and/or If-None-Match requests. This can be done by parsing the HTTP headers, and then responding with 304 Not Modified when appropriate. Unfortunately, this is not a trivial task.

...which is doable and would work for single-host type caches. However, we're still doing authentication, and the output varies based on the results of that authentication, which makes sharing a cached copy between multiple users problematic.
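
As a rough sketch of that 304 dance (my illustration only; latest_post_id() is a hypothetical one-query helper, and the ETag scheme here is invented):

<?php
// Illustrative sketch: answer If-None-Match with a 304 before doing
// the heavy lifting. latest_post_id() is a hypothetical one-query
// helper; anything that varies the output (user, preferences) would
// have to be folded into the ETag as well.
function latest_post_id($thread) { return 229587; /* stub */ }

$thread = (int) ($_GET['id'] ?? 0);
$etag   = '"t' . $thread . '-p' . latest_post_id($thread) . '"';

header('ETag: ' . $etag);

if (($_SERVER['HTTP_IF_NONE_MATCH'] ?? '') === $etag) {
    header('HTTP/1.1 304 Not Modified');
    exit;  // yes, a PHP script can simply stop here
}
// ...the expensive thread queries and output happen only past this point...
?>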

I don't want you to get my cached copy - mine has stuff in it yours doesn't.
Rob
ID: 229587
Profile Paul D. Buck
Volunteer tester

Joined: 19 Jul 00
Posts: 3898
Credit: 1,158,042
RAC: 0
United States
Message 229608 - Posted: 11 Jan 2006, 19:40:23 UTC

Using a standard technique would likely not work. However, you could cache the basic content with placeholders for the various pieces that are potentially suppressed. But you would be saving database reads at the price of file reads, and I doubt, as you have said, that it would be worth the effort.

Especially with the content being as dynamic as it is.
ID: 229608
koldphuzhun
Volunteer tester

Joined: 19 Feb 01
Posts: 69
Credit: 288,938
RAC: 0
United States
Message 229635 - Posted: 11 Jan 2006, 20:07:13 UTC

One thing I think people are failing to point out is that by caching something, you are making a copy of it in a place that can be accessed faster and more easily (i.e. RAM). Caching works this way in all respects. So, by caching the websites and such, how many more resources will be needed? We have a server that has obviously been strained for resources in recent times; why start doubling the requirements on it? If it were something where only, say, 47% of total resources were being used, then I could see setting up some caching (possibly bringing the total to, say, 72%). But suppose we have a system already using 68% of its resources: if we brought it up to 91% by caching, look how little room there is for error. A system failure becomes more likely, there's a chance of data mixup, and how do they clear everything out in a quick and effective manner? Caching may be a good idea in some places, but in our situation I think it's something that needs to wait.
Matt
ID: 229635
koldphuzhun
Volunteer tester

Joined: 19 Feb 01
Posts: 69
Credit: 288,938
RAC: 0
United States
Message 229647 - Posted: 11 Jan 2006, 20:21:22 UTC
Last modified: 11 Jan 2006, 20:23:56 UTC

Another quick idea. Maybe instead of caching, one could use "smart programming". The posts are in the database as a permanent record, and each time a page is loaded the database is queried for that record; to post something, a person hits the database three times (once for the initial read, once for posting and once for the post-post reload). Why not write a program that latched onto certain "selected" threads and downloaded those threads to a client-side program (much like an RSS feed reader)? Whenever a person wanted to view the thread, they could just load the program and read it there. If the thread needed updating, some kind of notice could be sent out (much like Java chat programs do) that dynamically updated the client-side program with the one new post; or the person could refresh every so often, and the server (which already tracks what has/hasn't been viewed) would only send the new stuff. This would reduce server load, make viewing easier, and all around make the forums a somewhat more efficient place. Maybe while they were at it they could put in a live chat program, so that a person or group could get together and give help, clarify points or flame each other without it going into the forum database (maybe letting other people watch, like chat programs do). So, if anyone can code these:

Forum reader

  • Updates or actively watches threads
  • Has a convenient post window where a person can type their post and click a "post it" button, which posts it and pulls in any newer posts
  • Has built-in chat channel capabilities
  • Has built-in text editing (word-processor style)
  • Has built-in preference settings (to take some tracking load off the database)
  • Is all client-based (except for database updates)



Basically, a client-based GUI for the forums. Have fun!


Matt
ID: 229647
Profile AthlonRob
Volunteer developer

Joined: 18 May 99
Posts: 378
Credit: 7,041
RAC: 0
United States
Message 229658 - Posted: 11 Jan 2006, 20:48:58 UTC - in response to Message 229635.  

One thing I think people are failing to point out is that by caching something, you are making a copy of it in a place that can be accessed faster and more easily (i.e. RAM). Caching works this way in all respects. So, by caching the websites and such, how many more resources will be needed?

As I tried to explain earlier... the primary concern is the database server. The DB server is always under more load than it should be; the web servers are under a relatively light load (most of the time) by comparison.

If we came up with a solution that doubled the web server's load but halved the DB server's load, I think we'd all jump on it. :-)

So we don't care so much about the resources, except on the database server...
Rob
ID: 229658
Profile AthlonRob
Volunteer developer

Joined: 18 May 99
Posts: 378
Credit: 7,041
RAC: 0
United States
Message 229660 - Posted: 11 Jan 2006, 20:51:43 UTC - in response to Message 229647.  

Basically a client based GUI for the forums. Have fun!

Nice idea... and one that's been done with other types of forums... but to come up with something like that here, basically from scratch, would be a lot of work.

If somebody would like to do it, more power to 'em - but developers don't seem to be something we have an overabundance of.
Rob
ID: 229660
Profile Lee Carre
Volunteer tester

Joined: 21 Apr 00
Posts: 1459
Credit: 58,485
RAC: 0
Channel Islands
Message 229845 - Posted: 12 Jan 2006, 0:25:48 UTC - in response to Message 229635.  
Last modified: 12 Jan 2006, 0:26:28 UTC

One thing I think people are failing to point out is that by caching something, you are making a copy of it in a place that can be accessed faster and more easily (i.e. RAM). Caching works this way in all respects. So, by caching the websites and such, how many more resources will be needed? We have a server that has obviously been strained for resources in recent times; why start doubling the requirements on it?

When I say caching, I mean web caching: browsers and the dedicated machines out on the internet caching content, not caching on the server itself.

If a successful caching method is implemented it will actually reduce the load on the server, not increase it. With caches in play, fewer requests reach the server at all, and fewer of those are full GET requests; most will (or should) be conditional requests answered with a tiny "304 Not Modified", which is much easier on a server.
ID: 229845