Ricochet (Jun 02 2011)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Long time no speak. I've been out of town and/or busy and/or admittedly falling out of the habit of posting to the forums. So I was gone last week (camping in various remote corners of Utah, mostly) and like clockwork a lot of server problems hit the fan once I was out of contact. Among other things, the raw data storage server died (but has since been recovered), oscar wedged up for no reason (a power cycle fixed that) and Jeff's desktop had some issues as well (nothing a replacement power supply couldn't handle). Then we had the holiday weekend of course, but we all returned here yesterday and continued handling the fallout from all that, as well as the usual weekly outage stuff. We're still using thumper as the active raw data storage server and worf is now where we're keeping the science backups. Basically they switched roles for the time being, until we let this all incubate and decide what to do next, if anything. This morning we brought the projects down to replace some DIMMs (they have been sending complaints to the OS) on thumper. One thing I kinda loathe about professional computing in general is poor documentation - a problem compounded by chronic zero-index vs. one-index confusion, and physical hardware labels vs. how they are depicted in the software. Long story short, despite all kinds of effort to determine exactly which DIMMs were broken, it wasn't until after we did the surgery and brought everything back online that we found out we probably replaced the wrong ones. Oops. We'll have to do this again sometime soon. There are some broken astropulse results clogging one of the validators (which is why it shows up in red on the status page). We'll have to figure out an automated way to detect these results and push them through (it's a real pain to do by hand). In the meantime, this is causing our workunit storage server to be quite full, and might hamper other workunit development sooner rather than later. 
Gripes and server issues aside, there is continuing happy progress. I'm still tinkering with visualization stuff for web based analysis of our candidates (for private and potential public use), and we have tons of data from the Kepler mission arriving here any day now which will be fun to play with. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
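The zero-index vs. one-index confusion Matt mentions can be illustrated with a small sketch. Everything in it is hypothetical: the slot count, the label format, and the assumption that the software counts DIMMs from 0 while the board labels count from 1. The post does not describe thumper's actual layout.

```python
# Hypothetical sketch: translate a zero-based DIMM index reported by software
# into a one-based physical label. The slot count and label format are
# invented for illustration and are NOT thumper's real layout.

SLOTS_PER_CPU = 4  # assumption, not taken from the post


def os_index_to_label(os_index: int, slots_per_cpu: int = SLOTS_PER_CPU) -> str:
    """Map a zero-based software index to a one-based 'CPU<n> DIMM<m>' label."""
    if os_index < 0:
        raise ValueError("DIMM index must be non-negative")
    cpu, slot = divmod(os_index, slots_per_cpu)
    # Both parts shift by one when going from software numbering to labels.
    return f"CPU{cpu + 1} DIMM{slot + 1}"


if __name__ == "__main__":
    for index in (0, 3, 5):
        print(index, "->", os_index_to_label(index))
```

Forgetting the `+ 1` on either part points at a neighbouring slot, which is the kind of slip that could lead to pulling the wrong module.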
kittyman Joined: 9 Jul 00 Posts: 51478 Credit: 1,018,363,574 RAC: 1,004 |
Thank you for once again informing us of the trials and tribulations in the Seti server closet. "Time is simply the mechanism that keeps everything from happening all at once." |
SciManStev Joined: 20 Jun 99 Posts: 6658 Credit: 121,090,076 RAC: 0 |
Thank you for the news Matt! We love hearing what goes on at your end. The new data from the Kepler mission sounds very interesting. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website |
Claggy Joined: 5 Jul 99 Posts: 4654 Credit: 47,537,079 RAC: 4 |
Thanks for the update Matt, Claggy |
Bill G Joined: 1 Jun 01 Posts: 1282 Credit: 187,688,550 RAC: 182 |
Sounds like you are keeping a close rein on things. Thanks for the efforts. SETI@home classic workunits 4,019 SETI@home classic CPU time 34,348 hours |
Akio Joined: 18 May 11 Posts: 375 Credit: 32,129,242 RAC: 0 |
Thanks for the insight on what has been giving you guys trouble. It's quite interesting to hear all the curve balls you are thrown and how you are quickly able to adapt! Well done, and thanks for the update ;) |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
I'm beginning to learn how to read your news with a British accent using Rowan Atkinson's voice as I read it. Certainly makes things far more entertaining and a right bit hilarious as I read about the server issues. Hope you don't mind at all. |
Mike Joined: 17 Feb 01 Posts: 34381 Credit: 79,922,639 RAC: 80 |
Thanks for the update Matt. With each crime and every kindness we birth our future. |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
I'm beginning to learn how to read your news with a British accent using Rowan Atkinson's voice as I read it. LOL! How many people besides me tried it after reading this reply? I think I was drifting between British, Australian, and Boston, but yes it was interesting. Thanks for the update, Matt. And I must say, I really REALLY like the new stats in the header and footer of the All Tasks for Computer _______ pages. Since I got my new computer, it got extremely difficult to add these numbers up by hand. I gave up on it after a couple of days; now I know again. Maybe I should ask this in Number Crunching... I looked at my new machine's BOINC manager yesterday for the first time since the day I installed it, I think. Almost half of the tasks listed on the tasks tab had a status of "GPU missing; Ready to run." None of them showed any progress. Does this mean the computer's GPU failed? Obviously, I'm still getting video output. The card is an nvidia GT 440 (not the latest and greatest, but adequate to my primary need). I restarted the computer a couple of times while I was messing with it and did not check the BOINC manager again, so maybe it recovered and I don't know it. I have different preferences for this machine to allow it to use most of its potential, whereas I restrict my others a bit for their overall health and performance of other apps. David David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Maybe I should ask this in Number Crunching... I looked at my new machine's BOINC manager yesterday for the first time since the day I installed it, I think. Almost half of the tasks listed on the tasks tab had a status of "GPU missing; Ready to run." None of them showed any progress. Does this mean the computer's GPU failed? Obviously, I'm still getting video output. The card is an nvidia GT 440 (not the latest and greatest, but adequate to my primary need). I restarted the computer a couple of times while I was messing with it and did not check the BOINC manager again, so maybe it recovered and I don't know it. I have different preferences for this machine to allow it to use most of its potential, whereas I restrict my others a bit for their overall health and performance of other apps. Yes, it would be better to continue the conversation in Number Crunching if this doesn't answer the point. "GPU missing; ..." (in Windows) is most likely the result of using Fast User Switching or Remote Desktop without stopping and restarting BOINC: GPUs only run if the user who started the BOINC session, and the user currently active at the console, are one and the same. If that doesn't apply to you, ask again in NC. |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
A few things to work on, now that the major fires have been stomped out: 1) The "client connection stats" page hasn't been updated since before the big black-out last year. 2) The "Multi-Beam Data Recorder Status" shows "34206m ago" - that's ~ 24 days... These assume, of course, that the BOINC server software hasn't been changed so that they are unavailable. Also, if the "Pending Credit" page is permanently gone, could someone delete the link? . Hello, from Albany, CA!... |
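The "34206m ago" figure quoted above can be checked with a couple of lines of arithmetic. This is only a sketch of the unit conversion; the variable names are made up for illustration and nothing here reflects how the status page computes its value.

```python
from datetime import timedelta

# Value quoted from the "Multi-Beam Data Recorder Status" line in the post.
minutes_since_update = 34206

age = timedelta(minutes=minutes_since_update)
days = age / timedelta(days=1)  # dividing two timedeltas gives a float

print(f"{minutes_since_update} minutes = {days:.2f} days")
# prints "34206 minutes = 23.75 days"
```

So the "~ 24 days" estimate holds: 34206 / 1440 is about 23.75.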
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
1) The "client connection stats" page hasn't been updated since before the big black-out last year. For some reason this is a big pain to keep working (and obviously low priority to keep kicking back into working mode). Will try to look into that again soon. 2) The "Multi-Beam Data Recorder Status" shows "34206m ago" - that's ~ 24 days... Oh yeah that. There was a cluster of power/security concern issues at Arecibo a few weeks back and lots of things haven't been adjusted to work with new networking/security regimes yet. So we haven't gotten telescope info up here in real time for a while, hence the big delays. Also will look into that again soon. Also, if the "Pending Credit" page is permanently gone, could someone delete the link? Wait... what's the situation here (I have zero pending credit, so the page link works - it just says pending credit: 0.00 and shows no tasks)? - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Gary Charpentier Joined: 25 Dec 00 Posts: 31013 Credit: 53,134,872 RAC: 32 |
Also, if the "Pending Credit" page is permanently gone, could someone delete the link? There are two places for pending credit now. The one on your account page is now dead. I think a BOINC server update killed it. IIRC there is now a server side switch for it to be active. The other pending credit is on your tasks page and works. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Also, if the "Pending Credit" page is permanently gone, could someone delete the link? Back on or around 8 March (this year), David Anderson made a change in the back-end BOINC server code - specifically, changeset [trac]changeset:23118[/trac] for sched_result.cpp - which meant that no calculated value for "claimed credit" was put in the database when a result was reported - David wants us all to use CreditNew instead. The 'pending credit' list, as its name suggests, only shows pending tasks which have a non-zero claimed credit - so no newly-reported results have appeared on the pages since 8 March. Most people, like you, now have empty lists - we've been comparing notes in Pending Credit List Has Almost Disappeared - but I've still got five left..... Yes, the link is pretty much redundant now, and unless David has a major change of heart, won't be coming back. You might as well save the space. |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
Maybe I should ask this in Number Crunching... I looked at my new machine's BOINC manager yesterday for the first time since the day I installed it, I think. Almost half of the tasks listed on the tasks tab had a status of "GPU missing; Ready to run." None of them showed any progress. Does this mean the computer's GPU failed? Obviously, I'm still getting video output. <snip> I restarted the computer a couple of times while I was messing with it and did not check the BOINC manager again, so maybe it recovered and I don't know it.<snip> Thanks for the advice. I did check it again this morning and I still had all the "GPU missing"s. Based on your comments, what I'll do is remove BOINC as a scheduled task (even though it runs under the same user name), restart the computer, and start BOINC manually, then not log off (I log off for security and so I can use Remote Desktop Connection, but this computer has a Home version so I can't Remote in anyway). If this doesn't work, I'll ask in Crunching. David David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
David S Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
Maybe I should ask this in Number Crunching... I looked at my new machine's BOINC manager yesterday for the first time since the day I installed it, I think. Almost half of the tasks listed on the tasks tab had a status of "GPU missing; Ready to run." None of them showed any progress.<snip> The situation seems to be resolved. I removed the Scheduled task and stopped and restarted BOINC without restarting the computer. All the Seti tasks then showed ready to run. But then I got a bit stupid. I looked some more and found about 10 Einstein units due in 2-3 days (ONE of which was running in high-priority mode), so I suspended Seti so it would finish the Einsteins. What I failed to do was set Einstein to no new tasks, so when it finished the 10 and Seti was suspended, it downloaded 100+ new Einsteins, also with short deadlines. -sigh- How does BOINC decide what to work on, anyway? I see it nearly finished with tasks that are due in a month and a half, while others that are due in a week aren't started. David David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
How does BOINC decide what to work on, anyway? I see it nearly finished with tasks that are due in a month and a half, while others that are due in a week aren't started. Unless BOINC is under extreme deadline pressure, it runs tasks (within one project) in the order they're received from the project's servers. Any more explanation than that is for NC. Glad you got your GPU back, anyway. |
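The rule Richard describes (arrival order within a project, unless deadlines are at risk) might be sketched roughly as below. This is a toy model and not BOINC's actual scheduler: the `Task` fields, the single-CPU assumption, the "would any task finish late?" test, and the earliest-deadline-first fallback are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    received: float   # arrival order, e.g. hours since some reference time
    deadline: float   # hours from now until the report deadline
    remaining: float  # estimated hours of computation still needed


def pick_order(tasks: list[Task]) -> list[Task]:
    """Return tasks in the order this toy scheduler would run them.

    Normal case: first in, first out by arrival time.
    Deadline pressure: if running in arrival order would make any task
    finish after its deadline, switch to earliest-deadline-first.
    """
    fifo = sorted(tasks, key=lambda t: t.received)
    elapsed = 0.0
    for task in fifo:
        elapsed += task.remaining  # one task at a time, single CPU assumed
        if elapsed > task.deadline:
            return sorted(tasks, key=lambda t: t.deadline)
    return fifo


if __name__ == "__main__":
    relaxed = [
        Task("due-in-six-weeks", received=0, deadline=1000, remaining=5),
        Task("due-in-a-week", received=1, deadline=168, remaining=5),
    ]
    print([t.name for t in pick_order(relaxed)])
```

With both deadlines comfortably reachable, the six-week task runs first simply because it arrived first, which matches what David observed. Shrinking the second task's deadline below the combined run time flips the order in this sketch.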
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
1) The "client connection stats" page hasn't been updated since before the big black-out last year. I know both are low priority problems, that's why I don't bother posting them when there are bigger fish to fry... . Hello, from Albany, CA!... |
justsomeguy Joined: 27 May 99 Posts: 84 Credit: 6,084,595 RAC: 11 |
Hey Matt, Do you guys ever use the service processor on Thumper? I've seen erroneous errors thrown on the x4500's and x4600's. If you look in the logs in the service processor it will give exact dimms...if you have a hard memory failure there is a little button on the system board near the dimms...it will light up the failed dimm when this is pressed, it's a small cap and doesn't last long, but long enough. Let me know if I can be of assistance! Kevin "Two things are infinite: The universe and human stupidity; and I'm not sure about the universe." - Albert Einstein |
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.