Calm a Llama Down (Feb 13 2008)
Author | Message |
---|---|
Matt Lebofsky Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
I'm realizing the server status page is giving a slightly bogus picture of our current server setup, and it's actually too much work right now to fix the status script, so I'll just tell you now what the current situation is: our public web server is thinman, our scheduling server is ptolemy, our upload server is bruno, and our download server is bane. None of these currently has a redundant twin or a "hot" backup (but we have vader and maul all set up to be a replacement for any of the above if need be). More on that below. Our primary/secondary BOINC (mysql) database servers are jocelyn/sidious, and our primary/secondary SETI science (informix) database servers are thumper/bambi. Specs for all these are correctly noted on the status page. We have other systems employed for less interesting but important things, but that's basically the meat of it. If we could double the CPU/memory/disk space on everything we have we'd be set (for the time being). Anyway... things are looking better. Weekly outage recovery is still a little weird - I don't think our single download server (bane) can handle such crunch periods alone so we'll probably bring vader back into the fold for that. The other servers are super happy given the recent changes to reduce NFS traffic. I enacted some more such changes this morning. This tweaking, coupled with server ewen (where Eric does his Hydrogen work) crashing and hanging the network a bit, made for a slightly bumpy ride this morning. However, smoother seas, and perhaps running "update stats" on a couple of signal tables, made the assimilators much faster. We'll finally catch up on that queue in a couple of hours, I think. Due to fewer dropped connections on the scheduling/upload servers it seems that the router got more cycles to spend on downloads, and we reached almost 70Mbps last night. Still need to get that new router going... Other than that - more mail drudgery. 
As much as I like computers, I hate when perfectly good but nevertheless wonky solutions to small problems become the foundations for advanced development, thus amplifying the original wonky-ness. Oh yeah - Eric sent some graphs around. Looks like the radar blanking code is working. Neat. Jeff's working that code into the splitter now so we can retest that small data file and compare results. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Dr. C.E.T.I. Send message Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
|
[BAT]ptagi Send message Joined: 7 Mar 07 Posts: 4 Credit: 61,338 RAC: 0 |
I'm having problems uploading my finished work. It keeps telling me the project servers are unavailable. |
Kenn Benoît-Hutchins Send message Joined: 24 Aug 99 Posts: 46 Credit: 18,091,320 RAC: 31 |
SETI@home Wed 13 Feb 19:43:09 2008 Sending scheduler request: Requested by user. Requesting 283755 seconds of work, reporting 98 completed tasks SETI@home Wed 13 Feb 19:43:14 2008 Scheduler request succeeded: got 0 new tasks SETI@home Wed 13 Feb 19:43:14 2008 Message from server: No work sent SETI@home Wed 13 Feb 19:43:14 2008 Message from server: (reached daily quota of 2 results) What does that mean? I have my BOINC manager set to retain three days of work. Kenn Kenn What is left unsaid is neither heard, nor heeded. Ce qui est laissé inexprimé ni n'est entendu, ni est observé. |
DJStarfox Send message Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
Updating statistics and indexes is regular DBA maintenance, just like backups (though not as often). If updating a couple of tables helped this much, doing the rest of the DB should help significantly too. Hard to say over the internet what the best interval for maintenance should be. When it gets slow, the DBA just does it. :) That said, to keep the users off my case, I usually run it weekly or bi-weekly on hard-core production servers (e.g., Oracle, MSSQL). I run it monthly on less heavily used databases. Also, I found an empty splitter file 24ja07af (zero bytes) on the status page. Is this any concern? Is this what happens before the file is "filled up"? |
DJStarfox Send message Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
SETI@home Wed 13 Feb 19:43:09 2008 Sending scheduler request: Requested by user. Requesting 283755 seconds of work, reporting 98 completed tasks Kenn: This means your computer has been sent too much work without returning enough successful work units. Make sure your WUs complete successfully and your machine is not crashing. Over time, the SETI servers will let you download more work as your WUs are validated as successful (you have to wait for credit/validation). If you need more help, please search the forums or ask your question in the "Number Crunching" forum. |
Shane Meyer Send message Joined: 22 Jan 00 Posts: 126 Credit: 31,280,265 RAC: 42 |
Kenn, stop aborting WUs while they are downloading; just let them come through, they will eventually! The same goes for detaching. You need to complete some units for your download limit to be restored. |
Jesse Viviano Send message Joined: 27 Feb 00 Posts: 100 Credit: 3,949,583 RAC: 0 |
SETI@home Wed 13 Feb 19:43:09 2008 Sending scheduler request: Requested by user. Requesting 283755 seconds of work, reporting 98 completed tasks Because your computer seems to have wasted a bunch of work units due to aborted downloads, your computer's quota was reduced by one for each result that was wasted in this manner. BOINC throttles computers that generate invalid results mostly as a safety measure, so that a computer that is flaky due to overclocking, overheating, or bad hygiene (yes, computers can accumulate dust, so open them up and clean them out from time to time so they don't overheat) cannot cause too much damage to a project. This damage can cause good work units to be tossed out, because BOINC tosses out work units that accumulate too many errors. Each result successfully processed will double your quota until it reaches the administratively set maximum quota or goes beyond it, in which case it is forced back to just the maximum quota. |
Fred J. Verster Send message Joined: 21 Apr 04 Posts: 3252 Credit: 31,903,643 RAC: 0 |
I'm realizing the server status page is giving a slightly bogus picture of our current server setup, and it's actually too much work right now to fix the status script, so I'll just tell you now what the current situation is: our public web server is thinman, our scheduling server is ptolemy, our upload server is bruno, and our download server is bane. None of these currently has a redundant twin or a "hot" backup (but we have vader and maul all set up to be a replacement for any of the above if need be). More on that below. Our primary/secondary BOINC (mysql) database servers are jocelyn/sidious, and our primary/secondary SETI science (informix) database servers are thumper/bambi. Specs for all these are correctly noted on the status page. We have other systems employed for less interesting but important things, but that's basically the meat of it. If we could double the CPU/memory/disk space on everything we have we'd be set (for the time being). Thanks, Matt, for the update on the situation at SETI. It's quite tricky having no, or at least too little, backup equipment. Is your network limit a theoretical 100 Mbit/s? I hope to be able to donate in the future. |
KWSN Ekky Ekky Ekky Send message Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67 |
Sadly I seem to be in the same position I was in a couple of weeks ago. Nothing is actually being downloaded to my work computer. It was still fine at home this a.m. but that is a much slower machine. I am being told here that access to the servers succeeded but nothing gets any further. 14/02/2008 08:52:02|SETI@home|Started download of 13ja07ae.26417.23385.10.7.33 14/02/2008 08:52:09||Access to reference site succeeded - project servers may be temporarily down. 14/02/2008 08:53:33||Project communication failed: attempting access to reference site 14/02/2008 08:53:33|SETI@home|Temporarily failed download of 24fe07ab.14574.2117.15.7.77: http error 14/02/2008 08:53:33|SETI@home|Backing off 1 min 0 sec on download of 24fe07ab.14574.2117.15.7.77 14/02/2008 08:53:39|SETI@home|Started download of 13ja07ae.7364.1299.11.7.204 14/02/2008 08:53:47||Access to reference site succeeded - project servers may be temporarily down. Last time this happened, in my haste I detached from the project. When I tried again after the weekend, everything went sailing through happily and has continued thus until last night. Therefore the problem does not seem to be at my end or in the intervening space between here and SETI. If the server page is showing a bogus picture, is there in reality a problem with bane or is it something else? Meanwhile I shall be patient this time and not cause others problems by either aborting or detaching. Trouble is, I suspect this is why SETI gets deserters. My faith in the ultimate prize is tarnished but otherwise undiminished! |
AndyW Send message Joined: 23 Oct 02 Posts: 5862 Credit: 10,957,677 RAC: 18 |
|
KWSN Ekky Ekky Ekky Send message Joined: 25 May 99 Posts: 944 Credit: 52,956,491 RAC: 67 |
Thank goodness it's not just me! The whole site seemed to be down for the best part of half an hour just now so I suspect the problems are in California. I have similar messages on all my machines this morning, so it looks like either a server or connectivity issue somewhere. |
QSilver Send message Joined: 26 May 99 Posts: 232 Credit: 6,452,764 RAC: 0 |
Thank goodness it's not just me! The whole site seemed to be down for the best part of half an hour just now so I suspect the problems are in California. Anyone who has upload problems, processing problems, etc. would be better served by reading the Number Crunching forum. Typically, widespread problems will get noticed very quickly by the inhabitants of that forum. They will also be able to quickly diagnose local problems that may only affect your rig/farm/set-up. For instance, there's an Upload problems?? thread that was updated about 2 hours ago (from this posting). The threads in this forum are for the project managers to inform users of technical issues related to project administration. Issues related to uploads, downloads, and processing are best discussed and resolved in Number Crunching. Just my $1/50. QS |
lostcub Send message Joined: 2 May 03 Posts: 2 Credit: 1,122,746 RAC: 0 |
(SIGH)...At least I know NOW I didn't do something wrong... LOL ================================================= Thank goodness it's not just me! The whole site seemed to be down for the best part of half an hour just now so I suspect the problems are in California. |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
I've noticed that on the 14th of February you had over 20 tasks that had client errors. Have you changed something on your PC? Once you start returning completed work units your daily quota will start to increase again. ______ Speedy |
Kenn Benoît-Hutchins Send message Joined: 24 Aug 99 Posts: 46 Credit: 18,091,320 RAC: 31 |
I am now running properly. I uninstalled and reinstalled (after having tried reset unsuccessfully). Everything seems to be operating properly now. I had an update for my operating system, and that is the only thing that I can think of that may have fubarred the works. That update, though, did not affect any other programmes. Kenn Kenn What is left unsaid is neither heard, nor heeded. Ce qui est laissé inexprimé ni n'est entendu, ni est observé. |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
I'm pleased you got everything working. ______ Speedy |
Kenn Benoît-Hutchins Send message Joined: 24 Aug 99 Posts: 46 Credit: 18,091,320 RAC: 31 |
Message 712612 The above URL explains the probable cause of my problems. Kenn Kenn What is left unsaid is neither heard, nor heeded. Ce qui est laissé inexprimé ni n'est entendu, ni est observé. |
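
The daily-quota behaviour Jesse Viviano describes earlier in the thread (minus one per wasted result, doubling per successful result, capped at the project maximum) can be sketched roughly as below. This is only an illustration of that description, not BOINC's actual scheduler code: the function names, the `MAX_QUOTA` value of 100, and the floor of one result per day are all assumptions made for the example.

```python
from typing import Iterable


def update_daily_quota(quota: int, succeeded: bool, max_quota: int,
                       min_quota: int = 1) -> int:
    """Return a host's new daily quota after one reported result.

    Hypothetical helper: follows the rule as described in the thread,
    with an assumed floor (min_quota) so the quota never reaches zero.
    """
    if succeeded:
        # A successful result doubles the quota, forced back to the
        # administratively set maximum if it would go beyond it.
        return min(quota * 2, max_quota)
    # A wasted result (error or aborted download) costs one.
    return max(quota - 1, min_quota)


def replay(quota: int, outcomes: Iterable[bool], max_quota: int) -> int:
    """Apply a sequence of result outcomes (True = success) in order."""
    for succeeded in outcomes:
        quota = update_daily_quota(quota, succeeded, max_quota)
    return quota


if __name__ == "__main__":
    MAX_QUOTA = 100  # assumed value, not taken from the project's config

    # Twenty wasted results in a row pull a full quota down by twenty.
    print(replay(MAX_QUOTA, [False] * 20, MAX_QUOTA))

    # Starting from a quota of 2, count the successes needed to recover.
    quota, successes = 2, 0
    while quota < MAX_QUOTA:
        quota = update_daily_quota(quota, True, MAX_QUOTA)
        successes += 1
    print(successes)
```

Under these assumptions the recovery is much faster than the decline, which matches the advice in the thread: let the downloads come through and return good results, and the quota climbs back within a handful of validated work units.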
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.