Blop (Jun 25 2007)
Author | Message |
---|---|
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
No major failures to report today. Good. Maybe you noticed web servers going up and down today - I was upgrading versions just to keep up with security. You may also note the result queue draining a bit. I changed the ceiling from 500K to 200K. This is plenty high, and the lower ceiling will free up some extra breathing room so when multibeam workunits are created they won't fill up the download volume. I also fixed top_hosts.php again. I guess I didn't check my changes into SVN fast enough and they were overwritten with the previous buggy code. Should be okay now. I also took some time to upgrade my desktop machine to Fedora Core 7, just so I can start getting used to that process. Not sure when I'll get to working on "bane" again, but Intel in conjunction with Colfax International assembled and donated a master science database replica machine which was delivered at the very end of last week. It basically has the same specs as thumper and the plan is to use it as a replica on which we do some real scientific development and final analysis. I should try to get that rolling soon. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Pooh Bear 27 Joined: 14 Jul 03 Posts: 3224 Credit: 4,603,826 RAC: 0 |
Some nice hardware donations lately. I am glad to see the hardware getting a little closer to today's machines. Glad to hear that the day was a lot less of a rush and fix, and more trying to get some of the small things done. Keep up the grand work, all of you. Maybe we can get those pesky old orphans cleaned up that are still clogging the DB a little. Especially those results that no longer have a WU associated with them. Pointers have to be really funky with things like that hanging on. My movie https://vimeo.com/manage/videos/502242 |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Matt, The "http error", first mentioned in the "blip" thread, and as evidenced by: 6/25/2007 6:56:18 PM|SETI@home|Requesting 185928 seconds of new work, and reporting 1 completed tasks 6/25/2007 6:56:39 PM|SETI@home|Scheduler request failed: HTTP file not found 6/25/2007 6:56:39 PM|SETI@home|Sending scheduler request: To fetch work 6/25/2007 6:56:39 PM|SETI@home|Requesting 185928 seconds of new work, and reporting 1 completed tasks 6/25/2007 6:56:59 PM|SETI@home|Scheduler RPC succeeded [server version 509] 6/25/2007 6:56:59 PM|SETI@home|Deferring communication for 11 sec 6/25/2007 6:56:59 PM|SETI@home|Reason: requested by project 6/25/2007 6:56:59 PM|SETI@home|Deferring communication for 1 min 25 sec 6/25/2007 6:56:59 PM|SETI@home|Reason: no work from project is (obviously) still with us. (times PDT) I'm also not sure why that last line reads "no work from project" as there were about 200K WU's in the pipeline, ready to go... . Hello, from Albany, CA!... |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
I'm also not sure why that last line reads "no work from project" as there were about 200K WU's in the pipeline, ready to go... Definitely a glitch somewhere. For the last 4 hours I've been getting "No work from project" responses to requests for more Work Units. Grant Darwin NT |
Rob Joined: 25 Sep 06 Posts: 1 Credit: 199,258 RAC: 0 |
G'day. I've had no new work units for about 4 days now. Rob Adelaide SA |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
I've had no new work units for about 4 days now. Would be worth posting about the problem in Number Crunching or the appropriate Help forum (Windows, Linux etc). This problem has only just occurred today. Prior to that, downloading as required, first attempt every time. Grant Darwin NT |
Ingleside Joined: 4 Feb 03 Posts: 1546 Credit: 15,832,022 RAC: 13 |
I'm also not sure why that last line reads "no work from project" as there were about 200K WU's in the pipeline, ready to go... Just a "stab in the dark", but SETI@home is running 2 scheduling servers, where one of them handles all even results and the other all odd results. If one of the servers gets far enough ahead of the other, it's possible there aren't any odd or even results left, meaning one of the scheduling servers doesn't have any work at all. Only when the other scheduling server "catches up" enough will both scheduling servers have more work to send out... Now, in theory the odd/even scheduling server should be chosen randomly for each scheduler request, but it's possible this isn't really happening, so if you ask the "wrong" server you'll continue to ask the "wrong" server again and again... "I make so many mistakes. But then just think of all the mistakes I don't make, although I might." |
W-K 666 Joined: 18 May 99 Posts: 19013 Credit: 40,757,560 RAC: 67 |
In the last half hour my two computers have downloaded 10 results, all with even-numbered IDs. Andy [edit] Just noticed that splitters are now splitting work. They weren't when I was getting 'no work from project' [/edit] |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
In last half hour my two computers have downloaded 10 results all with even number ID's. Just got some work, all starting with even IDs. It's still asking for more work & now getting the "No work from project" messages again. EDIT: after about 6 automatic retries I finally got another Work Unit, odd this time. Grant Darwin NT |
Matt Lebofsky Joined: 1 Mar 99 Posts: 1444 Credit: 957,058 RAC: 0 |
There are two schedulers, running redundantly. However, to make sure they don't send out the same work twice they are given "control" of one half of the work, i.e. one sends out results with even id's, the other sends out odd. So, depending on which scheduler DNS randomly hands you at the time, you'll get either even or odd id'ed results to work on. - Matt -- BOINC/SETI@home network/web/science/development person -- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude |
Byron Leigh Hatch @ team Carl Sagan Joined: 5 Jul 99 Posts: 4548 Credit: 35,667,570 RAC: 4 |
There are two schedulers, running redundantly. However, to make sure they don't send out the same work twice they are given "control" of one half of the work, i.e. one sends out results with even id's, the other sends out odd. So, depending on which scheduler DNS randomly hands you at the time, you'll get either even or odd id'ed results to work on. Matt, thanks very much for this information. Kind Regards Byron |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
There are two schedulers, running redundantly. However, to make sure they don't send out the same work twice they are given "control" of one half of the work, i.e. one sends out results with even id's, the other sends out odd. So, depending on which scheduler DNS randomly hands you at the time, you'll get either even or odd id'ed results to work on. Matt, someone should check on the scheduler sending out the "odd" results, as I think that's the culprit that's handing out the "no new work" (even with 200k WU's ready to send) line... the http error is still with us, too; see following: 6/28/2007 6:49:50 AM|SETI@home|Sending scheduler request: To fetch work 6/28/2007 6:49:50 AM|SETI@home|Requesting 456833 seconds of new work 6/28/2007 6:49:55 AM|SETI@home|Scheduler request failed: HTTP file not found 6/28/2007 6:49:55 AM|SETI@home|Sending scheduler request: To fetch work 6/28/2007 6:49:55 AM|SETI@home|Requesting 456833 seconds of new work 6/28/2007 6:50:00 AM|SETI@home|Scheduler RPC succeeded [server version 509] 6/28/2007 6:50:00 AM|SETI@home|Deferring communication for 11 sec 6/28/2007 6:50:00 AM|SETI@home|Reason: requested by project 6/28/2007 6:50:00 AM|SETI@home|Deferring communication for 1 min 30 sec 6/28/2007 6:50:00 AM|SETI@home|Reason: no work from project . Hello, from Albany, CA!... |
Josef W. Segur Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0 |
There are two schedulers, running redundantly. However, to make sure they don't send out the same work twice they are given "control" of one half of the work, i.e. one sends out results with even id's, the other sends out odd. So, depending on which scheduler DNS randomly hands you at the time, you'll get either even or odd id'ed results to work on. While it's unclear why "odd" is so far behind "even", I think it is likely that the "no new work" replies are coming from "even". Take a look at wuid 137582495 and note that the results were sent within seconds of creation. That indicates "even" doesn't really have any "ready to send" and is sending as soon as it gets some from the splitters. When it has sent all those it really doesn't have any until another batch comes from a splitter or the transitioner does a resend. Also look at wuid 137443002 where "odd" was working at about the same 19:00 UTC time. It was sending work which had been created 16 hours earlier, the even result had been sent immediately. I'm not sure how to relate the wuid difference of about 140000 and the resultid difference of about 435000 to the 200000 target for ready to send. Joe |
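
The even/odd split discussed in this thread can be illustrated with a small simulation. This is only a sketch under stated assumptions, not BOINC code: `request_work` and `fetch_with_retries` are hypothetical names, the ready-to-send queue is modelled as a plain list of result ids, and a random pick of parity stands in for whichever scheduler DNS hands the client. It shows how one scheduler can answer "no work from project" while the shared queue still holds results, and why repeated automatic retries eventually succeed.

```python
import random

def request_work(ready, parity):
    """Return and remove the first ready result id with the given parity.

    parity 0 models the "even" scheduler, parity 1 the "odd" one.
    Returns None when no result of that parity is ready ("no work").
    """
    for rid in ready:
        if rid % 2 == parity:
            ready.remove(rid)  # safe: we return immediately after removing
            return rid
    return None

def fetch_with_retries(ready, rng, max_tries=10):
    """Retry a work request, landing on a random scheduler each attempt."""
    for attempt in range(1, max_tries + 1):
        parity = rng.choice([0, 1])  # assumption: stands in for DNS rotation
        rid = request_work(ready, parity)
        if rid is not None:
            return rid, attempt
    return None, max_tries

if __name__ == "__main__":
    # "Even" has drained its half; only odd-id results remain ready to send.
    ready = [101, 103, 105, 107]
    print(request_work(ready, parity=0))  # even scheduler: None, despite 4 ready
    print(request_work(ready, parity=1))  # odd scheduler: hands out 101
    print(fetch_with_retries(ready, random.Random()))  # (result id, attempts used)
```

In this model the total "ready to send" count says nothing about whether a given scheduler has work; only the count for its own parity matters, which is consistent with clients seeing "no work" while the status page shows a large queue.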