Ups and Downs (Aug 05 2008)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Today was another one of them "outage days" where we shut everything down to do basic weekly maintenance (database backup and whatnot). We had a particularly large task list this time around. A lot of it was fairly mundane - like moving/compressing files to make more room on various storage systems. The sidious crash the other day did in fact break the mysql replica again. No big deal, but that meant recreating the database from the master - a seemingly weekly occurrence. It's easy to do, just adds extra time to the whole operation. Also, we tried to fix that broken index on the science database. We found the corruption was actually not on the RAID system we thought (the one that required a drive replacement). Huh. Anyway.. the index repair on the whole table was taking too long. We might just go ahead and drop/rebuild the specific index later now that we are more sure what's what. We brought all our backend services (feeder, transitioner, validator, etc.) up to spec on current BOINC code for the first time in a long time, so we carefully turned these on one at a time to observe the logs/results and make sure nothing got all screwy with the updated code. So we're back up, more or less. The current mystery is why we are using so much bandwidth. Too many factors at play to make a clear determination - lots of known network bottlenecks, lots of database bottlenecks, unknown Astropulse behavior, etc. We'll give this a closer look tomorrow after (hopefully) some of the traffic jams disappear. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Dirk Sadowski Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
@ Matt Lebofsky I hope you noticed this thread in the Number_crunching area: AstroPulse Ghost WUs !!! I have received two AP WUs so far, even though I didn't change my app_info.xml to get AP WUs. #1 - 4 Aug 2008 22:47:48 UTC #2 - 5 Aug 2008 23:13:30 UTC When will the server be updated/fixed? When will we be able to choose Enhanced and/or Astropulse WUs in the preferences? EDIT: It would be nice if it were possible to choose (Enhanced and/or AP WUs) separately in the 'Computing preferences' for 'Primary (default) preferences', 'Home', 'School' and 'Work'. |
DaBrat and DaBear Joined: 13 Dec 00 Posts: 69 Credit: 191,564 RAC: 0 |
Thanks for keeping us up to date. Being unable to upload is driving me mad.... Maybe the bandwidth issue is due to so many WUs trying to upload. I started having issues sometime before midnight EST on the 4th... before the Tues outage. Oh well, off to bed... maybe by morning the 70 or so uploaders I have will have safely made it back to the arms of SETI. BTW.... hopefully they won't all end up in the pending pile....lol |
muddocktor Joined: 2 Aug 06 Posts: 12 Credit: 28,074,814 RAC: 0 |
I sure hope you can clear up all the server problems soon, Matt. I have several machines with work results that won't upload and I've also had problems with my machines getting new work too. Good luck on getting the issues sorted tomorrow. |
Rev. Tim Olivera Joined: 15 Jan 06 Posts: 20 Credit: 1,717,714 RAC: 0 |
Simple question, why has my BOINC gone from SETI@home to ASTRO PULSE?? I didn't sign on to any ASTRO PULSE... Tim Olivera |
Dirk Sadowski Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5 |
Simple question, why has my BOINC gone from SETI@home to ASTRO PULSE?? I didn't Everything is O.K. .. :-) You are a member of SETI@home and your PC gets Enhanced and/or Astropulse WUs. Have a look here: Astropulse FAQ. BTW, have a look at my profile for the optimized Enhanced applications.. :-) |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
Simple question, why has my BOINC gone from SETI@home to ASTRO PULSE?? I didn't Simple answer, the data recorded by the SETI@home project at Arecibo has much more potential information than we have been extracting with the original Spike and Gaussian searches, or the Pulse and Triplet finding added later. The AstroPulse application looks for microsecond pulses, either single or repeating, which are beyond the capability of setiathome_enhanced. Because the additional searching requires the full 2.5 MHz. recorded spectrum but less duration, a separate application is more appropriate than trying to combine it into a setiathome_double_enhanced. Joe |
Robert Gammon Joined: 29 Aug 01 Posts: 21 Credit: 1,573,250 RAC: 0 |
I sure hope you can clear up all the server problems soon, Matt. I have several machines with work results that won't upload and I've also had problems with my machines getting new work too. Good luck on getting the issues sorted tomorrow. I will note, from observing the server status over the last two days, that the back end processes db-purge and wu-purge (I may have the process names wrong) are working fine. In a few hours, certainly before Monday, all that back end work will be complete. Result analysis is clogged up and not making much progress. The tapes are not spitting out new WUs, or more precisely, any WUs read from the tapes are not making it to the outgoing queue. So there is more going on than simply "Can't upload completed WUs" |
Sumyunguyy Joined: 14 Jul 04 Posts: 12 Credit: 1,173,168 RAC: 0 |
I have about 25 WUs waiting to be uploaded. When do you think this issue might be fixed? |
Keith T. Joined: 23 Aug 99 Posts: 962 Credit: 537,293 RAC: 9 |
I have about 25 WUs waiting to be uploaded. When do you think this issue might be fixed? Project servers and staff are in CA, USA. Their local TZ is UTC-7. I don't think tomorrow is a public holiday in CA, so I would think work may start flowing again by around 18:00 UTC tomorrow. Then you can expect a big flood as everyone else tries to upload, download, report tasks etc. Things may be back to "normal" by around 01:00 UTC on 12th August. Sir Arthur C Clarke 1917-2008 |
Walter Schmidt Joined: 28 Aug 99 Posts: 1 Credit: 959,166 RAC: 0 |
Can't U/L or D/L at 080810.0858-4 |
Menno Vos Joined: 4 Jul 99 Posts: 1 Credit: 328,705 RAC: 0 |
Same here from the Netherlands, all uploads and downloads are stuck since noon Saturday 07-08 (local time, which is GMT +2) |
Bob Mahoney Design Joined: 4 Apr 04 Posts: 178 Credit: 9,205,632 RAC: 0 |
A moment-to-moment online service such as SETI will lose members if the members are not informed. It is likely that more members/visitors will see the "News" column on the setiathome.berkeley.edu home page than will dig into the message boards. I think SETI can enhance its excellent and generous reputation by performing much more frequent (and small) updates of the "News" column on the home page. This should take less staff time than posting multiple messages to the boards. There is no need to feel embarrassment about a service outage - this is bleeding-edge stuff and no pain is no gain. We all know SETI is on the edge and is driven by passion as much as anything else. Remember that when our crunching computers choke, we jump into action at our homes and offices and reload BOINC, reset our computers, do numerous Project Update commands, fiddle with our settings, Google for "SETI no new tasks" and read about an issue from 5 years ago, etc. In other words, if I could have found out that my idle computers were not a result of some mistake on my part, I would have been very sympathetic to all the fine scientists trying to keep SETI together with that magic baling wire and duct tape. Keep up the good work! But please save us users from the angst that comes with not knowing. |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
A moment-to-moment online service such as SETI will lose members if the members are not informed. It is likely that more members/visitors will see the "News" column on the setiathome.berkeley.edu home page than will dig into the message boards. Or, instead of going into panic mode and doing all that, you can just give it time. Even without an update, which I agree would be most helpful, there is no need to panic because of a few workunits that can't upload or download for a couple of days. I think people tend to panic too much and turn to micromanaging BOINC when there is a perceived problem. There are some computers on my account which belong to family members who have no idea whether there are server problems or not. They simply go about their day, using their computer as usual, never even knowing if anything is wrong. They do not panic, they do not start pressing buttons and they don't check in on the website if there appears to be a problem. And BOINC always recovers on its own. |
gugi Joined: 26 Mar 01 Posts: 1 Credit: 33,490 RAC: 0 |
Or, instead of going into panic mode and doing all that, you can just give it time. Even without an update, which I agree would be most helpful, there is no need to panic because of a few workunits that can't upload or download for a couple of days. If I see 10 or more workunits that won't upload for two days, and no news about server downtime, I assume it's an error on my side. If it is on MY side, it won't just magically go away in a few days, so I do a lot of micromanaging. I just lost 2 hours checking my connections, settings, restarting, upgrading BOINC to the new version, restarting a few more times, and just went crazy because the servers should be up and I cannot upload. All because the SETI@home team was too lazy to post one sentence in the News saying "servers down". I can see the latest News entry now, but as any journalist will tell you: news about something that happened yesterday is not really news. |
Pooh Bear 27 Joined: 14 Jul 03 Posts: 3224 Credit: 4,603,826 RAC: 0 |
If I see 10 or more workunits that won't upload for two days, and no news about server downtime, I assume it's an error on my side. You expect the handful of part-time paid people to work 24x7 just to tell you there is an issue? Please read the Number Crunching forum, which will give you information way before the team posts about the issues. The team knows about them but they have lives too. They do not babysit the systems, because they are not paid to. If they do work on weekends and stuff, it's on their own time, etc. This is a voluntary project for you. If you are doing work, good; if not, so what? It hurts nothing. The work will get done eventually. Why do people think they must be so pushy with the project people? These guys go above and beyond the small amount they are paid to work. They monitor the project 24x7, but sometimes are not in a spot to fix it right away. That's why they even recommend you crunch other projects, so if you want your CPU to stay warm it will. If you only want to crunch SETI, well then be that as it may, but if you run out of work, or have problems sending and receiving, the people that work on the project have already told us it will happen, and may happen more often as we get more people, larger work, etc. There are no unlimited funds at this project, and they work on some substandard machines, etc. I cannot understand why people who get nothing out of the project except a few credits to brag about are so rude to the admins on this project. |
speedimic Joined: 28 Sep 02 Posts: 362 Credit: 16,590,653 RAC: 0 |
... I just lost 2 hours checking my connections, settings, restarting, upgrading BOINC to the new version, restarting a few more times, and just went crazy because the servers should be up and I cannot upload. All because the SETI@home team was too lazy to post one sentence in the News saying "servers down". ... A quick look in the Number crunching forum could have saved you about 1 hour 58... mic. |