Rudy and Spider (Feb 28 2008)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
Fully recovered from the long outages earlier this week. I also employed more assimilators (and even more just now) to try to capitalize on periods of low I/O to help catch up on the big assimilator queue backlog. Seems to be working, sort of. We also changed the mount flags on the database volume to include "noatime" - we'll see if this actually makes a difference in performance. Jeff and I are still getting beyond the router config. One of our roadblocks was using cables that were gigabit capable mixed with ones that were not (once again it's cheap parts causing the headache). We might actually be ready to go except we have to upgrade the super-long cable going from our closet to the main lab server closet, which is inaccessible to us. Waiting on the appropriate parties to handle that. Regarding hardware/software RAID: We tend to shy away from hardware RAID as we've had many nightmares in the past regarding configuration and implementation. Namely, it takes forever to figure it out, and then drives fail spuriously and/or silently. The software RAID hit isn't enough to make us consider going hardware on our current systems any time soon. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
telluric Joined: 12 Feb 06 Posts: 9 Credit: 102,871 RAC: 0 |
The catchup that you refer to, does it have to do with bringing the database statistics and thus the related website statistics up to date? [I was here 2 yrs ago only briefly, so am new at this whole process] *Fully recovered from the long outages earlier this week. I also employed more assimilators (and even more just now) to try to capitalize on periods of low I/O to help catch up on the big assimilator queue backlog. Seems to be working, sort of. We also changed the mount flags on the database volume to include "noatime" - we'll see if this actually makes a difference in performance.* |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
*The catchup that you refer to, does it have to do with bringing the database statistics and thus the related website statistics up to date?* Nope. Check out the previous Tech News posts for the nitty gritty. Grant Darwin NT |
Dr. C.E.T.I. Joined: 29 Feb 00 Posts: 16019 Credit: 794,685 RAC: 0 |
Great work being done by each of you @ Berkeley - Keep it up all . . . Thanks for the Post Matt . . . iT's always Appreciated Sir!!! BOINC Wiki . . . Science Status Page . . . |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Umm, whatever else is going on, at this writing (0630 PST) uploads appear to be on the fritz... have tried with 4 different WU's on two different computers (different client builds on each) and no joy... the upload never starts. . Hello, from Albany, CA!... |
William Roeder Joined: 19 May 99 Posts: 69 Credit: 523,414 RAC: 0 |
*Umm, whatever else is going on, at this writing (0630 PST) uploads appear to be on the fritz... have tried with 4 different WU's on two different computers (different client builds on each) and no joy... the upload never starts.* Me too. System Connect |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
Back in business again... F. |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
they probably have that server now connected to the light switch; when someone came to work today, the lights went on and we all started to 'communicate' with the seti-borg. |
DJStarfox Joined: 23 May 01 Posts: 1066 Credit: 1,226,053 RAC: 2 |
*they probably have that server now connected to the light switch; when someone came to work today, the lights went on and we all started to 'communicate' with the seti-borg.* Good things: 1) BOINC tolerates some downtime with the project so no work is lost; 2) SETI team fixes everything every morning |
AllenIN Joined: 5 Dec 00 Posts: 292 Credit: 58,297,005 RAC: 311 |
*they probably have that server now connected to the light switch; when someone came to work today, the lights went on and we all started to 'communicate' with the seti-borg.* They don't fix everything every morning; I've got over 1000 more pending credit than usual today. |
Fred W Joined: 13 Jun 99 Posts: 2524 Credit: 11,954,210 RAC: 0 |
*they probably have that server now connected to the light switch; when someone came to work today, the lights went on and we all started to 'communicate' with the seti-borg.* C'mon... It is weekend and they gotta have a life as well!! F. |
AllenIN Joined: 5 Dec 00 Posts: 292 Credit: 58,297,005 RAC: 311 |
*they probably have that server now connected to the light switch; when someone came to work today, the lights went on and we all started to 'communicate' with the seti-borg.* What? You don't check your machines daily to see that they are working properly? Are we more devoted to the project than they are? I can't buy that. |
OzzFan Joined: 9 Apr 02 Posts: 15691 Credit: 84,761,841 RAC: 28 |
*they probably have that server now connected to the light switch; when someone came to work today, the lights went on and we all started to 'communicate' with the seti-borg.* They may check in on them, but most of them have lives outside of SETI. It's not a matter of devotion, it's a matter of having other things/obligations to do. For instance, Matt is also part of a band. If he's got a gig on stage and he happens to get a text message via his cell from the server (assuming he could even hear the ring or vibration) saying there's a problem, do you expect him to ditch his band to come in and fix the servers? These people are not sitting around at home with their families doing nothing but waiting for servers to go down. If one of the guys happens to be free and notices a problem, they come in if they can. If they can't, then there's nothing that can be done about it. This is the reason why BOINC is designed to hide server downtime through caching or utilizing other projects. 99.99% uptime is just not feasible and their manpower is very limited. |
PhonAcq Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1 |
All of this is true, but it certainly ruins the fun if the performance data is incorrect. I'm referring to scarecrow's graphs. At least, the statistics reporting scripts should automatically be disabled when the servers are belly-up. Just like I tell my divorce lawyer, I hate being lied to. |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Matt, the validators, even though the status page shows them as "running", are down, hung, in infinite loops, or otherwise not doing their jobs; and have been this way for at least 24 hours... [added] My pending, normally in the 1700-2900 range is now at 3700 and growing! [/add] . Hello, from Albany, CA!... |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
*they probably have that server now connected to the light switch; when someone came to work today, the lights went on and we all started to 'communicate' with the seti-borg.* make that: "every weekday morning" ;-) . Hello, from Albany, CA!... |
muddocktor Joined: 2 Aug 06 Posts: 12 Credit: 28,074,814 RAC: 0 |
*Matt, the validators, even though the status page shows them as "running", are down, hung, in infinite loops, or otherwise not doing their jobs; and have been this way for at least 24 hours...* Yeah, I was just coming here to see if there was an update on this myself. I saw my pending results balloon up from around 20k to around 55k credits over the weekend, with the majority of the ballooning happening since yesterday. |
Clyde C. Phillips, III Joined: 2 Aug 00 Posts: 1851 Credit: 5,955,047 RAC: 0 |
I noticed that my RAC nosedived about ten percent in the last 24 hours. I looked at my pendings and saw the level at over 9000, about 50% more than normal. Still I'm not alarmed. This will fix itself just like the many "Ready To Reports" that disappeared during the same time. |
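
For anyone curious about the "noatime" mount flag Matt mentions in the opening post: the sketch below shows one way to check whether a volume is mounted with it, assuming a Linux host that exposes its mount table in the usual `/proc/mounts` layout (device, mount point, filesystem type, options). The thread does not say how the project's servers are actually set up, so the sample table, the `/data/db` mount point, and the helper names `mount_options` and `has_noatime` are all made up for illustration.

```python
from typing import List, Optional


def mount_options(mounts_text: str, mount_point: str) -> Optional[List[str]]:
    """Return the option list for mount_point from /proc/mounts-style text.

    Returns None if the mount point is not listed. If it appears more than
    once, the last entry wins, since later mounts shadow earlier ones.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            result = fields[3].split(",")
    return result


def has_noatime(mounts_text: str, mount_point: str) -> bool:
    """True if mount_point is listed and its options include 'noatime'."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "noatime" in options


# Hypothetical mount table for illustration only; not taken from any real server.
SAMPLE = """\
/dev/md0 /data/db ext3 rw,noatime 0 0
/dev/sda1 / ext3 rw 0 0
"""

if __name__ == "__main__":
    for point in ("/data/db", "/"):
        print(point, has_noatime(SAMPLE, point))
```

On a real Linux machine the same helpers could be fed the contents of `/proc/mounts` instead of `SAMPLE`. With "noatime" set, the filesystem skips updating each file's access time on reads, which is why it can reduce write traffic on a busy database volume.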
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.