Stretch (Apr 22 2009)

Matt Lebofsky Volunteer moderator Project administrator Project developer Project scientist Send message Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0	Message 887410 - Posted: 22 Apr 2009, 22:33:18 UTC Looks like there were some beta project problems after the outage yesterday caused by a missing executable. That got replaced, and I think that everything should be okay now on that front. I heard rumors that regular users were seeing beta errors, but I'm hoping that was just confusion. I haven't heard anything since. Other than that today was more or less a day of system/web plumbing. The web stuff I'm working on is becoming a major kludge due to time constraints. It's actually a conglomeration of C code and perl, php, and C-shell scripts. You know, whatever works. I'm a big fan of getting things working as soon as possible, then making it pretty later. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude ID: 887410 ·

perryjay Volunteer tester Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0	Message 887412 - Posted: 22 Apr 2009, 22:44:34 UTC - in response to Message 887410. I don't think there were any actual errors but a few people reported getting switched to Beta. This is one of the threads started on it... http://setiathome.berkeley.edu/forum_thread.php?id=53205 PROUD MEMBER OF Team Starfire World BOINC ID: 887412 ·

Gary Charpentier Volunteer tester Send message Joined: 25 Dec 00 Posts: 30637 Credit: 53,134,872 RAC: 32	Message 887436 - Posted: 23 Apr 2009, 0:24:03 UTC - in response to Message 887410.
Looks like there were some beta project problems after the outage yesterday caused by a missing executable. That got replaced, and I think that everything should be okay now on that front. I heard rumors that regular users were seeing beta errors, but I'm hoping that was just confusion. I haven't heard anything since. Other than that today was more or less a day of system/web plumbing. The web stuff I'm working on is becoming a major kludge due to time constraints. It's actually a conglomeration of C code and perl, php, and C-shell scripts. You know, whatever works. I'm a big fan of getting things working as soon as possible, then making it pretty later. - Matt
Saw those things about failed BETA d/l's and wondered if a router got pointed in the wrong direction, if just for a moment, while the project was coming back up. As for making things work, well, there is never time to go back and make them pretty. If you have to maintain it, better do it right the first time or not close the ticket until you have finished the clean up. Thanks for all the good work to the whole staff! ID: 887436 ·

PhonAcq Send message Joined: 14 Apr 01 Posts: 1656 Credit: 30,658,217 RAC: 1	Message 887557 - Posted: 23 Apr 2009, 12:21:51 UTC I'd like to suggest that the balance between MB and AP wu generation be adjusted somehow. Looking at the last few entries of my log from last night, I see that the servers ran out of MB's to give me (I only run MB). But then 16 seconds later boinc 6.6.20 requests work again and gets 2. Then 10 mins later it requests work again and gets 1. Then 3 mins later it requests more, and gets 1. And then 2 mins later it requests and gets 1. If this goes on for 100,000 clients, no wonder life is tough at times for the servers. Wouldn't it make sense to produce a few more MB's per hour and then (hopefully) eliminate all these extra requests? (Or are multiple, sequential, redundant requests a 'feature' of boinc that would occur independent of the #wu's available? I would believe that too.) ID: 887557 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13727 Credit: 208,696,464 RAC: 304	Message 887667 - Posted: 23 Apr 2009, 18:09:26 UTC - in response to Message 887557.
Looking at the last few entries of my log from last night, I see that the servers ran out of MB's to give me (I only run MB).
It's a glitch of some sort. Scarecrow's Graphs show there's been plenty of MB work ready to send for over 4 days. Often my system will request work & get
23/04/2009 6:51:01 SETI@home Message from server: No work sent
23/04/2009 6:51:01 SETI@home Message from server: No work is available for SETI@home Enhanced
23/04/2009 6:51:01 SETI@home Message from server: No work available for the applications you have selected. Please check your settings on the web site.
Then a minute or so later when it makes another request, it gets some. Grant Darwin NT ID: 887667 ·

Gary Charpentier Volunteer tester Send message Joined: 25 Dec 00 Posts: 30637 Credit: 53,134,872 RAC: 32	Message 887668 - Posted: 23 Apr 2009, 18:09:53 UTC - in response to Message 887557.
I'd like to suggest that the balance between MB and AP wu generation be adjusted somehow. Looking at the last few entries of my log from last night, I see that the servers ran out of MB's to give me (I only run MB). But then 16 seconds later boinc 6.6.20 requests work again and gets 2. Then 10 mins later it requests work again and gets 1. Then 3 mins later it requests more, and gets 1. And then 2 mins later it requests and gets 1. If this goes on for 100,000 clients, no wonder life is tough at times for the servers. Wouldn't it make sense to produce a few more MB's per hour and then (hopefully) eliminate all these extra requests? (Or are multiple, sequential, redundant requests a 'feature' of boinc that would occur independent of the #wu's available? I would believe that too.)
Answering your last question first: yes. There is even a method to the madness. Your client asks for a given amount of work. The project sends some work units. It does not know how fast your machine crunches. It may send a single work unit that is twice (or more) the amount you requested, or that work unit might be only a fraction of the amount you requested. After your machine gets the work unit, it calculates whether it needs more. If it does, it sends another request. Now as to your specific request and getting none: in this case the splitters couldn't make work fast enough or, much more likely, the internal queue between the splitter and the server backed up and the server was itself waiting due to too much data being moved around. That backlog may have broken a few microseconds after your request arrived. And in the 16 seconds before your machine asked for work again, the backlog may have formed and gone away a few dozen times. IIRC Matt said it grabs work units from the splitter queue in 100-unit hunks.
Matt might want to tune this value so it is higher right after he brings the project back up if it has been offline for a while, to reduce thrashing, but he knows much better than I what the bottlenecks are, or whether it is even possible to dynamically tune this value. As to the ratio of M/B to A/P: unless the queues for those aren't being kept reasonably full, the ratio is correct. If it weren't, one would overflow and the other would be empty. ID: 887668 ·
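The request cycle described in the post above can be sketched as a simple loop. This is only an illustration under stated assumptions, not BOINC's actual client or scheduler code: the function name fetch_until_satisfied, the send_work callback, and the max_requests cap are all hypothetical, and the runtimes are made-up numbers chosen to mimic a "none, then 2, then 1, then 1" pattern.

```python
from typing import Callable, List


def fetch_until_satisfied(wanted_seconds: float,
                          send_work: Callable[[float], List[float]],
                          max_requests: int = 10) -> List[float]:
    """Request work until the estimated queued runtime covers wanted_seconds.

    send_work(shortfall) is a hypothetical stand-in for the project server.
    It returns the client-side runtime estimates (in seconds) of the work
    units it sent, which may add up to far more or far less than the
    shortfall that was asked for, or to nothing at all.

    Returns the shortfall that was requested on each round.
    """
    queued = 0.0
    shortfalls = []
    for _ in range(max_requests):
        shortfall = wanted_seconds - queued
        if shortfall <= 0:
            break  # the queue is full enough, so stop asking
        shortfalls.append(shortfall)
        queued += sum(send_work(shortfall))
    return shortfalls


# Scripted server replies: nothing, then 2 units, then 1, then 1 large unit.
scripted = iter([[], [1200.0, 1500.0], [900.0], [4000.0]])
print(fetch_until_satisfied(5000.0, lambda shortfall: next(scripted)))
```

Each round asks only for the remaining shortfall, so several small replies naturally lead to several follow-up requests, regardless of how much work is sitting on the server.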

John G Send message Joined: 29 Dec 01 Posts: 68 Credit: 10,932,850 RAC: 0	Message 887733 - Posted: 23 Apr 2009, 21:50:29 UTC Hey Guys and Gals, uploads and downloads just went down. Hopefully it's just preventative maintenance??? Regards ID: 887733 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.