Thousand Island (Feb 23 2009)
Author | Message |
---|---|
Madas91 Joined: 7 Apr 04 Posts: 1 Credit: 127,253 RAC: 0 |
Just finished my last CUDA unit :( Still have 14 completed units to upload :( Don't let my graphics card drop below 85C or it will think it's holiday time. My upload list is finally looking smaller, but no new work yet. Hope the indigestion goes soon... keep up all the good work anyway. At least now I know it's not just me, I'm happier... how does that work? |
KWSN Sir Clark Joined: 17 Aug 02 Posts: 139 Credit: 1,002,493 RAC: 8 |
I'm doing my bit. Successfully crunching some WUs, but I've got SETI and SETI Beta on NNW (no new work) until this dies down. I've got 17 other projects to crunch for (all on one PC), so I'm not gonna run out of work. Hope it gets sorted soon... |
littlegreenmanfrommars Joined: 28 Jan 06 Posts: 1410 Credit: 934,158 RAC: 0 |
Thanks for filling us in on the latest hiccup, Matt. I have a little to add, though: my laptop received a message to the effect that a newer version of BOINC was available. I downloaded the new version and, upon installing it, lost my entire cache of WUs, including a "long distance" AP unit which had almost finished. Not only that, I found my laptop had to be re-attached to the project. All the lost WUs will have to be resent, adding to the bottleneck. It might be a good idea to try two things here: 1) During times of high bandwidth usage, turn off the downloading of updated BOINC versions to clients. 2) Installing a newer version of BOINC never used to cause a PC to detach from the project. Can we please return to this state? Cheers, lgm |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
gomeyer wrote: ... Astropulse v5 was rolled out Wednesday February 11 at about 5 pm Berkeley time. No obvious problem for 9 days, then whammo! Joe |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
"2) Installing a newer version of BOINC never used to cause a PC to detach from the project. Can we please return to this state?" Upgrading to a new BOINC version does not detach you from your projects. What can happen is that the data is not migrated to the correct place, or that you chose a location for the Data directory that BOINC is not allowed to read. Please ask for help about that in another thread; let's not do that in this one. I am sure that the old work is still on your computer and could possibly be recovered. |
Virtual Boss* Joined: 4 May 08 Posts: 417 Credit: 6,440,287 RAC: 0 |
"But then there's problem 2. An application download checksum error (a) doesn't cause exponential backoff and (b) causes all workunits also requested by this particular client to be errored out and resent." @Batman: Not the same problem as yours. In the problem Matt was referring to, no processing could be done because the application which does the crunching had not been downloaded yet. If you did a week of processing, then you must have already downloaded the application! |
Kaylie Joined: 26 Jul 08 Posts: 39 Credit: 333,106 RAC: 0 |
All uploaded but still having trouble downloading... Would it be possible to move the servers closer to the node? $80,000 would buy a small trailer to put the equipment in, if there is a place to locate it. (My dumb ideas are flowing far more freely than WUs lately.) |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
gomeyer wrote:... Thanks Joe. I certainly agree with the whammo bit, but what changed? Did we just hit some limit or tipping point in the system after 9 days that suddenly pushed things over the edge? Things just don't suddenly go bump in the night by themselves. Oh well, as has been correctly pointed out by many, there is no lack of other projects to run. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
'Whammo' happened at about 23:00 lab time on Friday 20 February (07:00 UTC Saturday 21 Feb). The '-200' errors started much earlier, e.g. task 1166928853, 18 Feb 2009 1:45:56 UTC. So I suspect it wasn't a deliberate change, but a tipping point. The only other possibility is that I think I've seen an increase in the amount of Astropulse_v5 v5.03 work being sent out, perhaps as Astropulse v5.00 WUs clear out of the database, which might have nudged the system into instability. |
perryjay Joined: 20 Aug 02 Posts: 3377 Credit: 20,676,751 RAC: 0 |
Maybe that was the length of time it took for the problem machines to get to them. That's when they tried to work on the new WUs, found the client missing, and started returning them as -200. PROUD MEMBER OF Team Starfire World BOINC |
gomeyer Joined: 21 May 99 Posts: 488 Credit: 50,370,425 RAC: 0 |
"Maybe that was the length of time it took for the problem machines to get to them. That's when they tried to work on the new WUs, found the client missing, and started returning them as -200." Considering the number of 10-day queues out there, "9 days later" would make a lot of sense. Interesting observation! [edit] And Murphy would account for the fact that it happened on a weekend. [/edit] |
Jesse Viviano Joined: 27 Feb 00 Posts: 100 Credit: 3,949,583 RAC: 0 |
If CNS finishes drawing up the plan to get Gigabit Ethernet between Hurricane Electric and the Space Sciences Laboratory, maybe it could submit the plan to compete for the stimulus package money :-D (The closest SONET speed to one gigabit would be OC-24, but that speed is apparently not commercially viable. The closest commercially viable speed apparently is OC-48.) However, it would be a long shot, because California probably needs all that money to avoid filing for bankruptcy due to the subprime mortgage crisis that started in California and Florida and is currently destroying the tax base. |
Neil Blaikie Joined: 17 May 99 Posts: 143 Credit: 6,652,341 RAC: 0 |
2/24/2009 12:04:52 PM|SETI@home|Message from server: No work available for the applications you have selected. Please check your settings on the web site. Erm, not seen that one before. It took 3 update requests to finally get some new work, and even then some of the downloads had HTTP errors and are pending. Will be patient and wait and see what happens after the outage, which should be starting sometime soon California time. |
Swibby Bear Joined: 1 Aug 01 Posts: 246 Credit: 7,945,093 RAC: 0 |
"I certainly agree with the whammo bit, but what changed?" In my view, the tipping point came right after Lunatics released the new optimised AP 5.03 app; then everyone started downloading new AP WUs to run on the new (r112) opt app. I myself am making a specialty of running only the older AP opt app (r103) and cleaning up many of the old-version WUs that are being returned late or aborted. I am keeping my 4-day cache filled easily. Three cheers and thanks to those wonderful, skilled programmers who can take full advantage of the CPU hardware features and get both MB and AP WUs to complete in half the time. THANK YOU !!! Whit |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
"I certainly agree with the whammo bit, but what changed?" The release posts for Lunatics r112 (v5.03 equivalent) are timed at 23:32 UTC on 21 Feb (Lunatics site), and 00:13 UTC on 22 Feb (here), at least 16 hours after 'whammo'. Although some of us have been running r112 since 13 February in an attempt to confirm accurate validations (unfortunately v5.03 was never tested at Beta, so no pre-release validation tests could be carried out there), I don't think you can say that the Lunatics release 'caused' this event. [We did think of claiming that the release timing was deliberate - "download from Lunatics if you can't get an official copy from Berkeley" - but that would have been untrue: the timing was pure coincidence.] BTW, congratulations on the effort to clean up v5.00, but it is possible to run both versions together until the cleanup is complete. |
Westsail and *Pyxey* Joined: 26 Jul 99 Posts: 338 Credit: 20,544,999 RAC: 0 |
Over 1500 Mahalos to You! Tanks yea for all you guys doing. Also hope we can help for make more bandwidth. Everything seems 5x5 now. Got all work uploaded and got nearly full caches on all 3 machines. Woo hoo, life is good. Take care everyone! -Brandon "The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov |
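
As a rough check of the SONET speeds mentioned in the thread (OC-24 and OC-48), the sketch below assumes the standard OC-n line rate of n × 51.84 Mbit/s. The function names and the list of candidate levels are illustrative only and are not taken from the thread; it also compares raw line rates, not usable payload.

```python
# Standard SONET base rate: OC-1 is 51.84 Mbit/s, and OC-n is n times that.
OC1_MBPS = 51.84


def oc_rate_mbps(n: int) -> float:
    """Line rate of an OC-n SONET link in Mbit/s."""
    return n * OC1_MBPS


def closest_oc_level(target_mbps: float, levels=(1, 3, 12, 24, 48, 192)) -> int:
    """Return the OC level from `levels` whose line rate is nearest the target.

    `levels` is an illustrative (assumed) list of candidate OC levels.
    """
    return min(levels, key=lambda n: abs(oc_rate_mbps(n) - target_mbps))


if __name__ == "__main__":
    for n in (12, 24, 48):
        print(f"OC-{n}: {oc_rate_mbps(n):.2f} Mbit/s")
    # Gigabit Ethernet carries 1000 Mbit/s of data.
    print(f"Closest to 1000 Mbit/s: OC-{closest_oc_level(1000.0)}")
```

Under these assumptions OC-24 comes out at about 1244 Mbit/s and OC-48 at about 2488 Mbit/s, which is consistent with the post: OC-24 is the nearest level to one gigabit, and OC-48 is the next step up.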
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.