Panic Mode On (100) Server Problems?
Author | Message |
---|---|
Jimbocous Joined: 1 Apr 13 Posts: 1853 Credit: 268,616,081 RAC: 1,349 |
According to David the routing problem is fixed. The BOINC route is OK. Still no go on reporting or downloading tasks. setiboinc.ssl.berkeley.edu is still lost in space ... |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I'm still having to hit the manual Update button more often than not, but it is connecting to the scheduler. The Win 8.1 Host got down to the last couple of APs, so I rebooted into a new Ubuntu system that only had 3 tasks remaining. It downloaded new tasks and continues to Report completed tasks and Download new ones. The traceroute looks the same as it did with OSX and Win 8.1, no problem finding the scheduler at setiboinc.ssl.berkeley.edu. The addresses in the client_state.xml trace fine, even http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi opens an xml file. I'm not sure what will happen when I go to bed and stop hitting the Update button though... |
Zombu2 Joined: 24 Feb 01 Posts: 1615 Credit: 49,315,423 RAC: 0 |
still not working here even pushing the update button like there's no tomorrow. I came down with a bad case of i don't give a crap |
OTS Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
still not working here even pushing the update button like there's no tomorrow. Same here. I would cobble together a bash script to update every 10 seconds if I thought it would help, but doing it many times manually has convinced me that it won't. Even killing and restarting boinc, which worked for les-helen-day, didn't help. 119 APs uploaded and not one acknowledgement, so no new WUs. The good news is that at least no new APs are being offered for download. That would really break my heart :). |
Oz Joined: 6 Jun 99 Posts: 233 Credit: 200,655,462 RAC: 212 |
If you think it will help. itcsshelp@berkeley.edu 510-664-9000, ext. 1 Member of the 20 Year Club |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
It seems like the folks who can't connect at all are mostly those who have built up large numbers of tasks waiting to report. I suspect that means that the scheduler requests are large and perhaps end up getting fragmented on the way to Berkeley, increasing the likelihood of failure. Most of my requests have been to report fewer than 5 tasks at a time and, although about half of those fail, the next attempt usually succeeds. Looking at some old threads regarding connection problems, I noticed that there's an option available in cc_config.xml for <max_tasks_reported>xx</max_tasks_reported> which essentially cuts the scheduler requests into smaller chunks. Perhaps that's something that would help here. Or perhaps not (but I think it might be worth a try). ;^) |
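
A minimal sketch of setting that option, assuming the element belongs inside the `<options>` section of cc_config.xml in the BOINC data directory. The helper name `set_max_tasks_reported` is hypothetical, and the value 5 is only an illustration:

```python
import xml.etree.ElementTree as ET


def set_max_tasks_reported(xml_text: str, limit: int) -> str:
    """Return cc_config.xml text with <max_tasks_reported> set to `limit`.

    Assumes the option lives under <cc_config><options>; any other
    settings already present in the file are left untouched.
    """
    # Start from an empty <cc_config> if the file does not exist yet.
    root = ET.fromstring(xml_text) if xml_text.strip() else ET.Element("cc_config")
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    node = options.find("max_tasks_reported")
    if node is None:
        node = ET.SubElement(options, "max_tasks_reported")
    node.text = str(limit)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Example: cap each scheduler request at 5 reported tasks.
    print(set_max_tasks_reported("<cc_config><options/></cc_config>", 5))
```

The client has to re-read its config files (or be restarted) before the change takes effect.
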
Oz Joined: 6 Jun 99 Posts: 233 Credit: 200,655,462 RAC: 212 |
It seems like the folks who can't connect at all are mostly those who have built up large numbers of tasks waiting to report. I suspect that means that the scheduler requests are large and perhaps end up getting fragmented on the way to Berkeley, increasing the likelihood of failure. Most of my requests have been to report fewer than 5 tasks at a time and, although about half of those fail, the next attempt usually succeeds. It may help some folks, but I am sitting on a laptop with ONE task to report - it has not managed to connect since 30/9/15 at 14:05UTC... I don't think Berkeley IT is aware of the problem as there is no mention of it on their Service Status page (http://systemstatus.berkeley.edu/) which begins with: The page will be updated whenever there is a change in system status that will affect users for more than 30 minutes. If you need assistance with a system or network problem, call Campus Shared Services IT at 510-664-9000 Option 1, 1, 1 - All Other Technology Requests. Member of the 20 Year Club |
Zombu2 Joined: 24 Feb 01 Posts: 1615 Credit: 49,315,423 RAC: 0 |
It seems like the folks who can't connect at all are mostly those who have built up large numbers of tasks waiting to report. I suspect that means that the scheduler requests are large and perhaps end up getting fragmented on the way to Berkeley, increasing the likelihood of failure. Most of my requests have been to report fewer than 5 tasks at a time and, although about half of those fail, the next attempt usually succeeds. yep i got about 600 wu's waiting to upload from all the machines. Funny enough, one of my machines has no issue at all; it's happily crunching away and reporting ... same WAN IP. I came down with a bad case of i don't give a crap |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
One of the old tricks when reporting a large number was to set BOINC Manager to No new tasks in the Projects tab and then hitting the Update button for around a dozen times. If that doesn't work I suppose it's hopeless. |
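
The same trick can be sketched from the command line, assuming `boinccmd` is on the PATH and can talk to the local client. The project URL below is an assumption (check it with `boinccmd --get_project_status`), and the helper names, attempt count, and pause are hypothetical:

```python
import subprocess
import time

# Assumed master URL for the project; verify it on your own host.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def build_commands(project_url, attempts=12):
    """Build the boinccmd calls: one 'No new tasks', then repeated updates."""
    commands = [["boinccmd", "--project", project_url, "nomorework"]]
    commands += [
        ["boinccmd", "--project", project_url, "update"] for _ in range(attempts)
    ]
    return commands


def run(commands, pause_seconds=10.0, dry_run=True):
    """Print the commands (dry run) or execute them with a pause between each."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=False)
            time.sleep(pause_seconds)


if __name__ == "__main__":
    # Dry run by default, so nothing is sent to the client.
    run(build_commands(PROJECT_URL))
```

Running the `allowmorework` operation on the same project afterwards turns work fetch back on.
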
OTS Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
It seems like the folks who can't connect at all are mostly those who have built up large numbers of tasks waiting to report. I suspect that means that the scheduler requests are large and perhaps end up getting fragmented on the way to Berkeley, increasing the likelihood of failure. Most of my requests have been to report fewer than 5 tasks at a time and, although about half of those fail, the next attempt usually succeeds. That was a very good thought and well worth trying but it doesn't seem to work for me even when set to reporting 1 task and updating many times. The results are all similar to this. 01-Oct-2015 23:15:27 [SETI@home] work fetch resumed by user 01-Oct-2015 23:15:29 [SETI@home] update requested by user 01-Oct-2015 23:15:31 [SETI@home] [sched_op] Starting scheduler request 01-Oct-2015 23:15:31 [SETI@home] Sending scheduler request: Requested by user. 01-Oct-2015 23:15:31 [SETI@home] Reporting 1 completed tasks 01-Oct-2015 23:15:31 [SETI@home] Requesting new tasks for CPU and NVIDIA 01-Oct-2015 23:15:31 [SETI@home] [sched_op] CPU work request: 3257848.37 seconds; 0.00 devices 01-Oct-2015 23:15:31 [SETI@home] [sched_op] NVIDIA work request: 500067.24 seconds; 0.00 devices 01-Oct-2015 23:15:35 [---] Project communication failed: attempting access to reference site 01-Oct-2015 23:15:35 [SETI@home] Scheduler request failed: Couldn't connect to server 01-Oct-2015 23:15:35 [SETI@home] [sched_op] Deferring communication for 2 hr 25 min 18 sec 01-Oct-2015 23:15:35 [SETI@home] [sched_op] Reason: Scheduler request failed |
OTS Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
You have two machines on a LAN network behind the same WAN IP address and one works and one doesn't. Is that correct? That really would be strange. Edit: If that is the case, I would be looking at the configs and anything else I could think of to determine why one works and one doesn't. |
OTS Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
One of the old tricks when reporting a large number was to set BOINC Manager to No new tasks in the Projects tab and then hitting the Update button for around a dozen times. If that doesn't work I suppose it's hopeless. Another good thought, but alas. 01-Oct-2015 23:31:27 [SETI@home] work fetch suspended by user 01-Oct-2015 23:31:29 [SETI@home] update requested by user 01-Oct-2015 23:31:31 [SETI@home] [sched_op] Starting scheduler request 01-Oct-2015 23:31:31 [SETI@home] Sending scheduler request: Requested by user. 01-Oct-2015 23:31:31 [SETI@home] Reporting 1 completed tasks 01-Oct-2015 23:31:31 [SETI@home] Not requesting tasks: scheduler RPC backoff 01-Oct-2015 23:31:31 [SETI@home] [sched_op] CPU work request: 0.00 seconds; 0.00 devices 01-Oct-2015 23:31:31 [SETI@home] [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices 01-Oct-2015 23:31:34 [---] Project communication failed: attempting access to reference site 01-Oct-2015 23:31:34 [SETI@home] Scheduler request failed: Couldn't connect to server 01-Oct-2015 23:31:34 [SETI@home] [sched_op] Deferring communication for 27 min 8 sec 01-Oct-2015 23:31:34 [SETI@home] [sched_op] Reason: Scheduler request failed 01-Oct-2015 23:31:36 [---] Internet access OK - project servers may be temporarily down. |
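
The backoff in event-log excerpts like the two above can be tallied with a small script. This is a sketch that assumes the log lines have exactly the shape shown in the posts and that deferrals are only ever printed in `hr`, `min`, and `sec`; the function names are hypothetical:

```python
import re

DEFER_RE = re.compile(r"Deferring communication for (?P<span>.+)$")
UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1}


def deferral_seconds(line):
    """Return the deferral length in seconds, or None if the line has none."""
    match = DEFER_RE.search(line.strip())
    if match is None:
        return None
    total = 0
    # Sum every "<number> <unit>" pair found after the marker text.
    for amount, unit in re.findall(r"(\d+)\s*(hr|min|sec)", match.group("span")):
        total += int(amount) * UNIT_SECONDS[unit]
    return total


def count_failed_requests(lines):
    """Count lines where the client reports a failed scheduler request."""
    return sum(1 for line in lines if "Scheduler request failed:" in line)


if __name__ == "__main__":
    sample = [
        "01-Oct-2015 23:15:35 [SETI@home] Scheduler request failed: Couldn't connect to server",
        "01-Oct-2015 23:15:35 [SETI@home] [sched_op] Deferring communication for 2 hr 25 min 18 sec",
    ]
    print(count_failed_requests(sample), deferral_seconds(sample[1]))
```
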
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
It seems like the folks who can't connect at all are mostly those who have built up large numbers of tasks waiting to report. I suspect that means that the scheduler requests are large and perhaps end up getting fragmented on the way to Berkeley, increasing the likelihood of failure. Most of my requests have been to report fewer than 5 tasks at a time and, although about half of those fail, the next attempt usually succeeds. Darn! I was trying to puzzle out what the commonality might be that divides those who can't connect at all and those who are having at least some modestly consistent success. Richard kind of shot down my thoughts about using TCP Optimizer earlier, and perhaps it's not the size of the scheduler request, either, based on your results and Oz's post. Oh, well, my machines are still plugging away. A scheduler request on my daily driver failed about an hour and a half ago, then the next one succeeded less than 2 minutes later. EDIT: Just had another successful request a couple minutes ago, reporting 2 completed tasks and downloading 2 new ones. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
yep i got about 600 wu's waiting to upload from all the machines For the one that's successfully reporting, about how many tasks is it reporting in each scheduler request? Also, do you happen to know if the machines have different MTU values? |
Jimbocous Joined: 1 Apr 13 Posts: 1853 Credit: 268,616,081 RAC: 1,349 |
According to David the routing problem is fixed. boinc.berkeley.edu is again unreachable, at least to me: et3-48.inr-311-ewdc.Berkeley.EDU [128.32.0.101] reports: Destination host unreachable. |
OTS Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
yep i got about 600 wu's waiting to upload from all the machines If changing the MTU is a possible cure, I can tell you 1500 is one value that is not working for me - and now all my APs are gone and I have only a few MBs left :(. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
et3-47.inr-311-ewdc.Berkeley.EDU 128.32.0.103 Works for me. Trace to setiboinc.ssl.berkeley.edu (Hop, Hostname, IP, Time): 6, lag-10.ear2.Miami2.Level3.net, 4.68.71.169, 21.090; 13, ae-1-60.ear1.LosAngeles1.Level3.net, 4.69.144.18, 72.656; 14, CENIC.ear1.LosAngeles1.Level3.net, 4.35.156.66, 72.503; 15, dc-svl-agg4--lax-agg6-100ge.cenic.net, 137.164.11.1, 79.911; 16, dc-oak-agg4--svl-agg4-100ge.cenic.net, 137.164.46.144, 83.179; 17, ucb--oak-agg4-10g.cenic.net, 137.164.50.31, 82.773; 18, t2-3.inr-201-sut.Berkeley.EDU, 128.32.0.37, 82.058; 19, et3-47.inr-311-ewdc.Berkeley.EDU, 128.32.0.103, 82.115; 20, et3-47.inr-311-ewdc.Berkeley.EDU, 128.32.0.103, 1278.088. et3-48.inr-311-ewdc.berkeley.edu (128.32.0.101) is 'download server 2 - vader'. Vader is Not the scheduler. |
Jeff Buck Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0 |
If changing the MTU is a possible cure, I can tell you 1500 is one value that is not working for me - and now all my APs are gone and I have only a few MBs left :(. That's really just a guess on my part, possibly one of the things that might differentiate those machines that are getting through and those that aren't. I know that before the move to the co-lo there were a lot of connection issues and running TCP Optimizer, which, among other things adjusted the MTU size, seemed to help a lot of people. For those that haven't been able to get through at all, it might be worth a shot. |
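
To put rough numbers on the MTU idea, the arithmetic below counts how many TCP segments a request of a given size needs. It is a sketch under stated assumptions: IPv4 and TCP headers of 20 bytes each with no options, and request sizes that are made up for illustration, since the thread does not give real ones. It models ordinary TCP segmentation, not IP-level fragmentation along the route:

```python
import math

IPV4_HEADER_BYTES = 20  # assumes no IP options
TCP_HEADER_BYTES = 20   # assumes no TCP options


def segments_needed(payload_bytes, mtu=1500):
    """Return how many TCP segments carry `payload_bytes` at the given MTU."""
    mss = mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    if mss <= 0:
        raise ValueError("MTU too small to carry any payload")
    return math.ceil(payload_bytes / mss)


if __name__ == "__main__":
    # Hypothetical request sizes: a small report versus a large backlog.
    for size in (1_000, 100_000):
        print(size, segments_needed(size, mtu=1500), segments_needed(size, mtu=1400))
```

A request that fits in one segment has only one packet to lose, which is consistent with the guess that small reports get through more often, though Oz's single-task laptop shows it cannot be the whole story.
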
Jimbocous Joined: 1 Apr 13 Posts: 1853 Credit: 268,616,081 RAC: 1,349 |
et3-47.inr-311-ewdc.Berkeley.EDU 128.32.0.103 Works for me. Consistently? Reason I ask is that I'm consistently getting: et3-47.inr-311-ewdc.Berkeley.EDU [128.32.0.103] reports: Destination host unreachable. |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
et3-47.inr-311-ewdc.Berkeley.EDU 128.32.0.103 Works for me. The only time I see et3-48.inr-311-ewdc.Berkeley.EDU [128.32.0.101] is when I trace for Vader (http://setiathome.berkeley.edu/forum_thread.php?id=77990&postid=1730852#1730852). et3-47.inr-311-ewdc.Berkeley.EDU 128.32.0.103 has been Synergy all day, and clicking on http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi gets you the scheduler. The machines that see setiboinc.ssl.berkeley.edu at et3-47.inr-311-ewdc.Berkeley.EDU (128.32.0.103) aren't having that much trouble, just an occasional manual update. Fri 02 Oct 2015 12:18:00 AM EDT | SETI@home | [sched_op] Starting scheduler request |