it's the AP Splitter processes killing the Scheduler |
Message boards : Number crunching : it's the AP Splitter processes killing the Scheduler
| Author | Message |
|---|---|
Grant you are a long playing record that has got stuck, and a very wrong one at that. Well, the last AP unit was produced at about 4:00 UTC on 11 Nov 2012 (at the weekend). About 24 hours later, Cricket started to drop... and there were no more server timeouts for users... | |
| ID: 1306173 · | |
Grant you are a long playing record that has got stuck, and a very wrong one at that. Grant will be right at home on these message boards, we're all long-playing records here. But actually, I'm with him here. My observations were that the scheduler was considerably freer, both faster to respond and more likely to allocate MB work (even when both requests and reports were combined in a single update), starting from the time when the last of the then-loaded tapes had its last AP tasks split (or when I got up on Monday morning, which was a few hours later). Now that the timeouts are almost certain again, I'm about to try a little experiment: sitting at a machine with dual monitors (BOINC Manager open on one, the same host's website task list on the other), I'm going to see how long the delay is between the scheduler request being made and the ghosts appearing on the website. From preliminary observations with two separate computers (when variations in local clock settings come into play), my guess is 'seconds at most'. Then, I may have to dig out the old Wireshark to see what packets appear on the line, and when. | |
| ID: 1306175 · | |
But actually, I'm with him here. I'm with him too. | |
| ID: 1306182 · | |
But actually, I'm with him here. +1 edit: Just ran out of MB - timeouts starting now! btw: Backup project PrimeGrid runs as it should! | |
| ID: 1306185 · | |
|
Ah well, Murphy strikes again. Just as I settle down in front of the dual monitors on host 2901600, it fetches three times in succession without a timeout - just topping up to the 100 quota level. And I can't get any more until the next one finishes.... | |
| ID: 1306202 · | |
|
Is it not possible to bypass the scheduler to get the already-assigned ghosts? | |
| ID: 1306216 · | |
|
Well, here's the first snippet of evidence from this session: 14/11/2012 21:00:48 | SETI@home | Sending scheduler request: To fetch work. Both the two old tasks reported, and the two new tasks assigned, got a server time stamp of 14 Nov 2012 | 21:00:52 UTC (I'd done a special clock synchronisation before I started, so the times should be pretty good). So, the scheduler's actual work was completed in under five seconds, but it took almost two more minutes for the reply to reach me. | |
| ID: 1306217 · | |
|
And then I got 14/11/2012 21:07:51 | SETI@home | Reporting 1 completed tasks Again, the scheduler marked the work completed/allocated at 14 Nov 2012 | 21:07:53 UTC / 14 Nov 2012 | 21:07:54 UTC respectively - so it did its job, just didn't tell me about it. | |
| ID: 1306218 · | |
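As a rough check on the two measurements above, the gap between the client sending its request and the server time stamp on the tasks can be worked out directly. This is only a sketch: the time stamps are transcribed by hand from the two posts into one common format, the client clock is assumed to be on UTC (UK time in November), and the function name `server_lag_seconds` is made up for illustration.

```python
from datetime import datetime

# Common format used for the hand-transcribed time stamps below (an assumption;
# the website shows them as e.g. "14 Nov 2012 | 21:00:52 UTC").
FMT = "%d/%m/%Y %H:%M:%S"


def server_lag_seconds(client_sent: str, server_stamp: str) -> float:
    """Seconds between the client's request line and the server time stamp."""
    sent = datetime.strptime(client_sent, FMT)
    stamped = datetime.strptime(server_stamp, FMT)
    return (stamped - sent).total_seconds()


# (client "Sending/Reporting" time, server time stamp) pairs from the posts.
observations = [
    ("14/11/2012 21:00:48", "14/11/2012 21:00:52"),  # fetch: report + assign
    ("14/11/2012 21:07:51", "14/11/2012 21:07:53"),  # work marked completed
    ("14/11/2012 21:07:51", "14/11/2012 21:07:54"),  # new work allocated
]

lags = [server_lag_seconds(sent, stamped) for sent, stamped in observations]

for (sent, stamped), lag in zip(observations, lags):
    print(f"request {sent} -> server stamp {stamped}: {lag:.0f} s")
```

On these figures the server-side work is done within a few seconds of the request; the rest of the wait is the reply failing to get back to the client.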
|
Could you do the same test with the AP-splitters stopped? And/or with the use of a proxy... that could be very interesting... | |
| ID: 1306219 · | |
Could you do the same test with the AP-splitters stopped? I'll try, but my arms aren't quite long enough to reach the off-switch from the UK.... Looks like the AP splitters will be with us for a while, so I'll try Wireshark after dinner. | |
| ID: 1306220 · | |
Could you do the same test with the AP-splitters stopped? Sorry, I forgot you are in the UK, not in the Lab, but keep that in mind when you have the opportunity to try. | |
| ID: 1306222 · | |
Could you do the same test with the AP-splitters stopped? And/or with the use of a proxy... that could be very interesting... What I'd like to see, as a test, is the scheduler run off the Campus Network. That would help prove whether the Hurricane link and associated routers (which are almost always heavily loaded) were the problem, or whether the problem was a bit more upstream. Claggy | |
| ID: 1306226 · | |
|
Well, my "ghosts-only" machine (Unimatrix02) has gotten down to about 700 ghosts (nothing in the machine itself - he did get some resent WUs rather sporadically since my last msg, but never got near 100 in the machine) and gets Timeouts all the time now on work requests...this sucks! | |
| ID: 1306229 · | |
|
I've found that using a proxy I can get the scheduler to answer, but then all the downloads fail... if I take out the proxy, then the downloads succeed but the scheduler fails... | |
| ID: 1306237 · | |
I've found that using a proxy I can get the scheduler to answer, but then all the downloads fail... if I take out the proxy, then the downloads succeed but the scheduler fails... That's why I'd like to see them try the Campus Network and ISP: using a proxy might be bypassing some or all of the Hurricane Network/ISP. Claggy | |
| ID: 1306238 · | |
I've found that using a proxy I can get the scheduler to answer, but then all the downloads fail... if I take out the proxy, then the downloads succeed but the scheduler fails... Try this proxy: 8.21.6.225 port 80, it works very fast in both directions... > 50Kbps | |
| ID: 1306239 · | |
I've found that using a proxy I can get the scheduler to answer, but then all the downloads fail... if I take out the proxy, then the downloads succeed but the scheduler fails... Yes, that's quite zippy: contacts complete without timeout now, but downloads are quite slow. Claggy | |
| ID: 1306250 · | |
Try this proxy: 8.21.6.225 port 80, it works very fast in both directions... > 50Kbps Working for me, too. I tried it, forced an Update, and immediately got 20 resends. D/l is slow, but I will try toggling as mentioned above and see what happens. Thanks for the proxy address!!! | |
| ID: 1306252 · | |
Grant you are a long playing record that has got stuck, and a very wrong one at that.
rob, there's something wrong at your end. I was waiting for the AP Splitters to stop to try to get to the scheduler with one of my computers that could not make a successful Scheduler contact to report many hours of work. When the AP Splitters stopped, after hours of having zero luck, I was able to do the following:
11/10/2012 7:28:14 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:28:14 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:28:31 PM | SETI@home | Scheduler request completed
11/10/2012 7:29:44 PM | SETI@home | update requested by user
11/10/2012 7:29:48 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:29:48 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:29:58 PM | SETI@home | Scheduler request completed
11/10/2012 7:30:07 PM | SETI@home | update requested by user
11/10/2012 7:30:10 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:30:10 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:30:32 PM | SETI@home | Scheduler request completed
11/10/2012 7:30:38 PM | SETI@home | update requested by user
11/10/2012 7:30:43 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:30:43 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:31:19 PM | SETI@home | Scheduler request completed
11/10/2012 7:31:21 PM | SETI@home | update requested by user
11/10/2012 7:31:24 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:31:24 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:31:59 PM | SETI@home | Scheduler request completed
11/10/2012 7:32:21 PM | SETI@home | update requested by user
11/10/2012 7:32:25 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:32:25 PM | SETI@home | Reporting 86 completed tasks, not requesting new tasks
11/10/2012 7:34:06 PM | SETI@home | Scheduler request completed
Your assertion that things did not get better is simply not true. It may be 100% true for you, which would point to a problem you continued to have, but for "the rest" of us there was a direct correlation between the AP Splitters running and our inability to report. As soon as the AP Splitters stopped running (meaning AP work was still in distribution, just not being split), things got miraculously better. | |
| ID: 1306255 · | |
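For anyone who wants numbers out of a log like the one quoted above, the round-trip time of each scheduler request can be pulled from the "Sending scheduler request" and "Scheduler request completed" pairs. A minimal sketch, assuming the client writes dates as month/day/year with a 12-hour clock (as the quoted log appears to) and keeping only the three lines per request that matter; the function name `request_durations` is made up for illustration.

```python
from datetime import datetime

# Assumed time stamp layout of the quoted log: month/day/year, 12-hour clock.
FMT = "%m/%d/%Y %I:%M:%S %p"

# Send / report / complete lines copied from the post above.
LOG = """\
11/10/2012 7:28:14 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:28:14 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:28:31 PM | SETI@home | Scheduler request completed
11/10/2012 7:29:48 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:29:48 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:29:58 PM | SETI@home | Scheduler request completed
11/10/2012 7:30:10 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:30:10 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:30:32 PM | SETI@home | Scheduler request completed
11/10/2012 7:30:43 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:30:43 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:31:19 PM | SETI@home | Scheduler request completed
11/10/2012 7:31:24 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:31:24 PM | SETI@home | Reporting 250 completed tasks, not requesting new tasks
11/10/2012 7:31:59 PM | SETI@home | Scheduler request completed
11/10/2012 7:32:25 PM | SETI@home | Sending scheduler request: Requested by user.
11/10/2012 7:32:25 PM | SETI@home | Reporting 86 completed tasks, not requesting new tasks
11/10/2012 7:34:06 PM | SETI@home | Scheduler request completed
"""


def request_durations(log_text):
    """Return (tasks_reported, seconds) for each completed scheduler request."""
    results = []
    started = None
    reported = 0
    for line in log_text.strip().splitlines():
        stamp, _project, message = [part.strip() for part in line.split("|", 2)]
        when = datetime.strptime(stamp, FMT)
        if message.startswith("Sending scheduler request"):
            started, reported = when, 0
        elif message.startswith("Reporting"):
            # "Reporting 250 completed tasks, ..." -> 250
            reported = int(message.split()[1])
        elif message.startswith("Scheduler request completed") and started:
            results.append((reported, (when - started).total_seconds()))
            started = None
    return results


durations = request_durations(LOG)
total_tasks = sum(tasks for tasks, _ in durations)

for tasks, seconds in durations:
    print(f"{tasks:4d} tasks reported in {seconds:.0f} s")
print(f"{total_tasks} tasks reported in total")
```

A request that times out would show up here as a "Sending" line with no matching "completed" line, so the same loop could be extended to count failures on a longer log.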
|
And this is what I get on my E8500/9800GTX+ when I report and ask at once when using the proxy: | |
| ID: 1306256 · | |
| Copyright © 2013 University of California |