Switcheroo (Mar 21 2007)
Author | Message |
---|---|
Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Just after I posted yesterday's tech news message we had to reboot kryten and penguin as they both lost NFS mounts. In fact, we had to boot kryten twice (as it came up immediately being unable to mount bruno's disks). I really wish I knew what was causing these to happen, but perhaps this problem will simply just "time out." The first technical issue for today was that the hill shuttle bus broke down, so I got in a few minutes later than expected. This at least afforded me an extra few minutes to complete a rather pesky sudoku puzzle. Take that, unruly numbers! So what happened with the replica yesterday? Turns out, for some (currently) inexplicable reason the .MYD files under data/mysql were all zero length. None of the other files were affected, just the .MYD files. Oddly, their time stamps were sane (they were rather old as they haven't been updated in a while). So what emptied out these very specific files but didn't update their time stamps? In any case, we're forced to recover the replica from scratch (not that big a deal). Bob was finally able to wiggle his way in to at least clean out the current database so we can drop everything and reload. We might have an outage soon to dump the current data for such a reload. Meanwhile, bruno progresses. Making it the new upload server was held up on being able to compile a working fastcgi-enabled file_upload_handler. Jeff finally got one to compile. So we embarked on what should have been a quick transition - basically just moving a cable from one jack to another and updating DNS. However, the file_upload_handler didn't work. Refusing to debug it, I suggested we just use a normal garden variety handler without the fastcgi hooks. All the fastcgi was buying us was reduced process spawning overhead. This was a major necessity on our old n' slow 3500, but bruno didn't even break a sweat once we fired it up. So bruno is now our upload server! But wait! After a half hour or so I noticed the traffic graphs were a bit "dampened." Why weren't we sending out as much data as before? After finding no obvious bottlenecks we dug out a gigabit switch and split the Hurricane link so both kryten and bruno could act as simultaneous upload servers. Sure enough, a third of our clients were still trying to connect to the kryten address. This is odd as the DNS entry has a 5-minute TTL (time to live). Perhaps we're seeing the effect of DNS caching (in Windows or otherwise). Fair enough - we'll leave both kryten and bruno up as "mirror" servers as DNS (hopefully) corrects itself over the coming days. I'll reflect the changes in the server status page eventually. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
Just after I posted yesterday's tech news message we had to reboot kryten and penguin as they both lost NFS mounts. In fact, we had to boot kryten twice (as it came up immediately being unable to mount bruno's disks). I really wish I knew what was causing these to happen, but perhaps this problem will simply just "time out." > good work @ Berkeley - Thanks for the Post Matt . . . note: 'IF' you have any time - look @ this Thread "Client error Aborted . . ." it would be appreciated - some strange 'anomalies' |
Joined: 25 Nov 01 Posts: 21704 Credit: 7,508,002 RAC: 20 |
... we had to reboot kryten and penguin as they both lost NFS mounts. ... but perhaps this problem will simply just "time out." Silly guesses time... An overheated or even an overloaded switch?... With too much simultaneous traffic, they can run out of table space or even saturate their backplane... And, any error packets reported anywhere by any machine's ifconfig? Good luck, Martin See new freedom: Mageia Linux Take a look for yourself: Linux Format The Future is what We all make IT (GPLv3) |
Joined: 28 Sep 02 Posts: 362 Credit: 16,590,653 RAC: 0 |
Well done Bob and Matt! Keep us updated if you find out what happened to the .MYD's... interesting... Matt: talking about the server status page - how is that done technically? Would be cool to have that for my servers too! Keep up the good work! mic. mic. |
Joined: 14 Jul 03 Posts: 3224 Credit: 4,603,826 RAC: 0 |
As I suspected, Matt is another Sudoku fan. Is the rest of the group there into it, also? I love the professional puzzles. *Most* of the time I finish them in 20-25 minutes, but there are a few that take a little longer. I now need to get into the Hex ones, or the 26 letter ones and see what I can do with those puppies. My movie https://vimeo.com/manage/videos/502242 |
Wander Saito Joined: 7 Jul 03 Posts: 555 Credit: 2,136,061 RAC: 0 |
Very good news indeed! Thanks for all your efforts. Let's hope that Bruno lives up to the expectation we all have for it. Congrats to all the people involved in the switch. Regards, Wander |
John McLeod VII Joined: 15 Jul 99 Posts: 24806 Credit: 790,712 RAC: 0 |
As I suspected, Matt is another Sudoku fan. Is the rest of the group there into it, also? I am a programmer at heart. I have written a 9x9 sudoku solver - that finds all the solutions for a given puzzle (some of them do indeed have more than one). It took me most of a weekend. BOINC WIKI |
Joined: 3 Apr 99 Posts: 1603 Credit: 2,700,523 RAC: 0 |
In case anyone's missed the obvious, a Google for 'zero length MYD files' does find several discussions re empty MYD files. |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
Matt, I don't know if anyone has made you or any member of your team aware of this, or if anything can be done about it, but yesterday and the day before there were a fairly considerable number of results marked as invalid. I'll guess that it was due to Kryten losing its mounts or a glitch during the switchover to Bruno, but whatever the cause I wanted to be sure you knew about it in case adjustments can be made. Many of these have been posted in Number Crunching in the "Validate Errors - Please post them" thread. Of course I know that you and your team don't have time to read all of the boards so I'm bringing this to your attention here in the hopes that it will be noticed. One issue is that the earliest of these have already been resent and reported, and so are in danger of being deleted. So, in the event that credit may be granted I guess time is becoming a factor. Whatever, thanks for your hard work and that of the rest of the Berkeley crew in the effort to keep this project going despite all of the limitations; hardware, manpower and otherwise. Regards, Gus Obermeyer (gomeyer) |
Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
Matt, > @ Gus - see Matt's NEW POST - Ups and Downs (Mar 22 2007) edit - sorry 'bout Link (fixed) |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
> @ Gus - see Matt's NEW POST - Ups and Downs (Mar 22 2007) Already saw it, but thanks. |
Bellator Joined: 3 Sep 04 Posts: 15 Credit: 50,270 RAC: 0 |
this was not my message and I have deleted it. |
Bellator Joined: 3 Sep 04 Posts: 15 Credit: 50,270 RAC: 0 |
Perhaps my problem has something to do with this, perhaps not. At any rate, my Seti completed task does not upload because it is "locked by file upload handler". I had this problem with the previous task and eventually I just detached and reattached, meaning I had lost all credit. I have also had problems with Climateprediction, i.e. unable to contact server. Are the problems related? Is there a solution for a poor computer semi-illiterate? |
Joined: 3 Apr 99 Posts: 9659 Credit: 251,998 RAC: 0 |
Very good news indeed! Thanks for all your efforts. Let's hope that Bruno lives up to the expectation we all have for it. Congrats to all the people involved in the switch. And a big thank you to all who have donated the parts for Bruno. "I'm trying to maintain a shred of dignity in this world." - Me |
Joined: 9 Jan 00 Posts: 2562 Credit: 12,301,681 RAC: 0 |
Bellator What Version of the BOINC Core Client are you using? With Problems with Seti and ClimatePrediction I would suspect that it is the BOINC Core. For Best Results/help if you post in the Number Crunching Forum it may get attention faster... Perhaps my problem has something to do with this, perhaps not. At any rate, my Seti completed task does not upload because it is "locked by file upload handler". I had this problem with the previous task and eventually I just detached and reattached, meaning I had lost all credit. Please consider a Donation to the Seti Project. |
Joined: 17 Sep 03 Posts: 50 Credit: 1,179,926 RAC: 0 |
Perhaps my problem has something to do with this, perhaps not. At any rate, my Seti completed task does not upload because it is "locked by file upload handler". I had this problem with the previous task and eventually I just detached and reattached, meaning I had lost all credit. I had that problem as well yesterday. Restarting the boinc service did not help, but a reboot did fix it. I was having intermittent internet access at the time, so I assumed it was related to that. |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
What is the retirement plan for the old suns? (No I don't mean 401k's) With the modern servers, do you really need all those others running? Are they going to be gutted for parts? Reducing complexity is always good. May this Farce be with You |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
What is the retirement plan for the old suns? (No I don't mean 401k's) Hmmm... and I was going to say black holes or white dwarfs. |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
What is the retirement plan for the old suns? (No I don't mean 401k's) Good one. I completely missed the opportunity to be clever and am glad you didn't! May this Farce be with You |
Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Regarding old suns: We basically need whatever we're using, if that makes any sense. I'm all for simplicity and reducing the number of machines, especially as servers become more powerful but occupy less space. We did recently retire a few Ultra 10's (mostly being used as desktops or backup web servers/file servers). Since the university officially owns these we basically hand them over to the UC system when we're done and they do their best to sell them and make a few bucks. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
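
Editor's sketch, not something from the thread: the opening post describes the replica's .MYD files under data/mysql turning up zero length while keeping old, sane-looking time stamps. For anyone who wants to check their own MySQL data directory for the same symptom, something along these lines might do. The function name `find_empty_myd` is made up for illustration, and the `data/mysql` path is simply the one mentioned in the post; nothing here reflects the scripts actually used at Berkeley.

```python
from datetime import datetime, timezone
from pathlib import Path


def find_empty_myd(data_dir):
    """Return (path, mtime) pairs for zero-length .MYD files under data_dir."""
    hits = []
    for path in sorted(Path(data_dir).rglob("*.MYD")):
        info = path.stat()
        # A zero-length data file with an old modification time is the
        # symptom described in the post.
        if path.is_file() and info.st_size == 0:
            mtime = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            hits.append((path, mtime))
    return hits


if __name__ == "__main__":
    # "data/mysql" is the directory named in the post; adjust for your layout.
    for path, mtime in find_empty_myd("data/mysql"):
        print(f"{path}\t0 bytes\tlast modified {mtime:%Y-%m-%d %H:%M} UTC")
```

This only reports the symptom. It says nothing about what truncated the files, which the thread leaves unanswered.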
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.