Thousand Island (Feb 23 2009)

Author	Message
Madas91 Send message Joined: 7 Apr 04 Posts: 1 Credit: 127,253 RAC: 0	Message 868961 - Posted: 24 Feb 2009, 8:11:37 UTC Just finished my last CUDA unit :( Still have 14 complete to upload :( Don't let my graphics card drop below 85C or it will think it's holiday time. My upload list is finally looking smaller but no new work yet. Hope the indigestion goes soon.......keep up all the good work anyway. At least now that I know it's not just me, I'm happier......how does that work? ID: 868961 ·

KWSN Sir Clark Volunteer tester Send message Joined: 17 Aug 02 Posts: 139 Credit: 1,002,493 RAC: 8	Message 868966 - Posted: 24 Feb 2009, 8:40:33 UTC I'm doing my bit. Successfully crunching some WUs but got SETI and SETI beta on NNW until this dies down. I've got 17 other projects to crunch for (all on one PC) so I'm not gonna run out of work. Hope it gets sorted soon.... ID: 868966 ·

littlegreenmanfrommars Volunteer tester Send message Joined: 28 Jan 06 Posts: 1410 Credit: 934,158 RAC: 0	Message 868975 - Posted: 24 Feb 2009, 10:02:42 UTC Thanks for filling us in on the latest hiccup, Matt. I have a little to add, though: My laptop received a message to the effect that a newer version of BOINC was available. I downloaded the new version, and upon installing it, lost my entire cache of WUs, including a "long distance" AP unit which had almost finished. Not only that, I found my laptop had to be re-attached to the project. All the lost WUs will have to be resent, adding to the bottleneck. It might be a good idea to try two things here: 1) During times of high bandwidth usage, turn off the downloading to clients of updated BOINC versions. 2) Installing a newer version of BOINC never used to cause a PC to detach from the project. Can we please return to this state? Cheers lgm ID: 868975 ·

Josef W. Segur Volunteer developer Volunteer tester Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 868978 - Posted: 24 Feb 2009, 10:24:03 UTC - in response to Message 868911. gomeyer wrote: ... [edit] BTW, I still think these rollouts should only be launched earlier in the week. [/edit] Astropulse v5 was rolled out Wednesday February 11 at about 5 pm Berkeley time. No obvious problem for 9 days, then whammo! Joe ID: 868978 ·

Jord Volunteer tester Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3	Message 868987 - Posted: 24 Feb 2009, 10:49:28 UTC - in response to Message 868975. 2) Installing a newer version of BOINC never used to cause a PC to detach from the project. Can we please return to this state? Upgrading to a new BOINC does not detach you from your projects. But what can happen is that the data is not migrated to the correct place, or that you chose a location for the Data directory that BOINC is not allowed to read. Please post a request for help about that elsewhere; let's not do that in this thread. I am sure that the old work is still on your computer and it can possibly be recovered. ID: 868987 ·

Virtual Boss* Volunteer tester Send message Joined: 4 May 08 Posts: 417 Credit: 6,440,287 RAC: 0	Message 868993 - Posted: 24 Feb 2009, 11:19:18 UTC - in response to Message 868960. But then there's problem 2. An application download checksum error (a) doesn't cause exponential backoff and (b) causes all workunits also requested by this particular client to be errored out and resent. This would explain why I have processed virtually nothing for the past month while my computer has been running almost 24x7?! After a week of processing an AP WU it gets errored out?! @ Batman Not the same problem as yours. In the problem Matt was referring to, no processing could be done because the application which does the crunching had not been downloaded yet. If you did a week of processing then you must have already downloaded the application! ID: 868993 ·

Kaylie Send message Joined: 26 Jul 08 Posts: 39 Credit: 333,106 RAC: 0	Message 868999 - Posted: 24 Feb 2009, 12:13:02 UTC All uploaded but still having trouble downloading... Would it be possible to move the servers closer to the node? $80,000 would buy a small trailer to put the equipment in if there is a place to locate it. ( My dumb ideas are flowing far more freely than WUs lately. ) ID: 868999 ·

gomeyer Volunteer tester Send message Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0	Message 869022 - Posted: 24 Feb 2009, 14:23:16 UTC - in response to Message 868978. gomeyer wrote: ... [edit] BTW, I still think these rollouts should only be launched earlier in the week. [/edit] Astropulse v5 was rolled out Wednesday February 11 at about 5 pm Berkeley time. No obvious problem for 9 days, then whammo! Joe Thanks Joe. I certainly agree with the whammo bit, but what changed? Did we just hit some limit or tipping point in the system after 9 days that suddenly pushed things over the edge? Things just don't suddenly go bump in the night by themselves. Oh well, as has been correctly pointed out by many, there is no lack of other projects to run. ID: 869022 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 869025 - Posted: 24 Feb 2009, 14:42:38 UTC - in response to Message 869022. 'whammo' happened at about 23:00 lab time on Friday 20 February (07:00 UTC Saturday 21 Feb). The '-200' errors started much earlier - e.g. task 1166928853, 18 Feb 2009 1:45:56 UTC. So I suspect it wasn't a deliberate change, but a tipping point - the only other possibility is that I think I've seen an increase in the amount of Astropulse_v5 v5.03 being sent out, perhaps as Astropulse v5.00 WUs clear out of the database, which might have nudged the system into instability. ID: 869025 ·

perryjay Volunteer tester Send message Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0	Message 869026 - Posted: 24 Feb 2009, 14:55:43 UTC - in response to Message 869025. Maybe that was the length of time it took for the problem machines to get to them. When they tried to work on the new WUs and found the client missing and started returning them as -200. PROUD MEMBER OF Team Starfire World BOINC ID: 869026 ·

gomeyer Volunteer tester Send message Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0	Message 869032 - Posted: 24 Feb 2009, 15:06:52 UTC - in response to Message 869026. Last modified: 24 Feb 2009, 15:07:45 UTC Maybe that was the length of time it took for the problem machines to get to them. When they tried to work on the new WUs and found the client missing and started returning them as -200. Considering the number of 10 day queues out there, "9 days later" would make a lot of sense. Interesting observation! [edit] and Murphy would account for the fact that it happened on a weekend [/edit] ID: 869032 ·

Jesse Viviano Send message Joined: 27 Feb 00 Posts: 100 Credit: 3,949,583 RAC: 0	Message 869038 - Posted: 24 Feb 2009, 15:25:27 UTC Last modified: 24 Feb 2009, 16:09:02 UTC If CNS finishes drawing up the plan to get Gigabit Ethernet between Hurricane Electric and the Space Sciences Laboratory, maybe it could submit the plan to compete for the stimulus package money :-D . (The SONET speed closest to one gigabit would be OC-24, but that speed is apparently not commercially viable. The closest commercially viable speed apparently is OC-48.) However, it would be a long shot, because California probably needs all that money to avoid filing for bankruptcy due to the subprime mortgage crisis that started in California and Florida and is currently destroying the tax base. ID: 869038 ·

Neil Blaikie Volunteer tester Send message Joined: 17 May 99 Posts: 143 Credit: 6,652,341 RAC: 0	Message 869076 - Posted: 24 Feb 2009, 17:05:53 UTC Last modified: 24 Feb 2009, 17:08:00 UTC 2/24/2009 12:04:52 PM|SETI@home|Message from server: No work available for the applications you have selected. Please check your settings on the web site. Erm, not seen that one before. It took 3 update requests to finally get some new work, and even then some of them had HTTP errors and are download pending. Will be patient and wait and see what happens after the outage, which should be starting sometime soon California time. ID: 869076 ·

Swibby Bear Send message Joined: 1 Aug 01 Posts: 246 Credit: 7,945,093 RAC: 0	Message 869114 - Posted: 24 Feb 2009, 17:33:25 UTC - in response to Message 869022. Last modified: 24 Feb 2009, 17:37:16 UTC I certainly agree with the whammo bit, but what changed? In my view, the tipping point came right after Lunatics released the new optimised AP 5.03 app, then everyone started downloading new AP WUs to run on the new (r112) opt app. I myself am making a specialty of running only the older AP opt app (r103) and cleaning up many of the old-version WUs that are being returned late or aborted. I am keeping my 4-day cache filled easily. Three cheers and thanks to those wonderful, skilled programmers who can take full advantage of the CPU hardware features and get both MB and AP WUs to complete in half the time. THANK YOU !!! Whit ID: 869114 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 869128 - Posted: 24 Feb 2009, 22:55:59 UTC - in response to Message 869114. I certainly agree with the whammo bit, but what changed? In my view, the tipping point came right after Lunatics released the new optimised AP 5.03 app, then everyone started downloading new AP WUs to run on the new (r112) opt app. The release posts for Lunatics r112 (v5.03 equivalent) are timed at 23:32 UTC on 21 Feb (Lunatics site), and 00:13 UTC on 22 Feb (here) - at least 16 hours after 'whammo'. Although some of us have been running r112 since 13 February in an attempt to confirm accurate validations (unfortunately v5.03 was never tested at Beta, so no pre-release validation tests could be carried out there), I don't think you can say that the Lunatics release 'caused' this event. [We did think of claiming that the release timing was deliberate - "download from Lunatics if you can't get an official copy from Berkeley" - but that would have been untrue: the timing was pure coincidence]. BTW, congratulations on the effort to clean up v5.00 - but it is possible to run both versions together until the cleanup is complete. ID: 869128 ·

Westsail and Pyxey Volunteer tester Send message Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0	Message 869350 - Posted: 25 Feb 2009, 15:29:32 UTC Over 1500 Mahalos to You! Tanks yea for all you guys doing. Also hope we can help for make more bandwidth. Everything seems 5x5 now. Got all work uploaded and got nearly full caches on all 3 machines. Woo hoo, life is good. Take care everyone! -Brandon "The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov ID: 869350 ·

©2024 University of California

SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.