Problem connecting to server - status page shows no apparent problem

-Bert-

Joined: 23 Mar 02
Posts: 152
Credit: 412,754
RAC: 0
Netherlands
Message 848123 - Posted: 2 Jan 2009, 10:04:21 UTC - in response to Message 848119.  
Last modified: 2 Jan 2009, 10:07:32 UTC

The same here, although I got 1 MB and 2 AP tasks a few hours ago and the completed WUs were reported.
ID: 848123 · Report as offensive
DeMus
Joined: 5 Jan 08
Posts: 238
Credit: 1,765,862
RAC: 0
Netherlands
Message 848135 - Posted: 2 Jan 2009, 11:02:04 UTC - in response to Message 848123.  

It looks like there are problems again. After I received some 40 WUs this morning, it has now stopped again.
I can neither upload nor download anything. The strange thing is that when I press the upload button I read a message saying I am requesting 0 seconds of work. I set my cache to 4 days and still I request 0 seconds of work. What could be the reason for this?


______
DeMus


ID: 848135 · Report as offensive
AlphaLaser
Volunteer tester
Joined: 6 Jul 03
Posts: 262
Credit: 4,430,487
RAC: 0
United States
Message 848136 - Posted: 2 Jan 2009, 11:09:08 UTC - in response to Message 848135.  

Are any of your tasks running in high priority? AFAIK, if the cache is too big, BOINC might download more work than it thinks it can handle, which kicks in EDF (earliest deadline first) mode and also prevents any new downloads until enough tasks are cleared.
ID: 848136 · Report as offensive
DeMus
Joined: 5 Jan 08
Posts: 238
Credit: 1,765,862
RAC: 0
Netherlands
Message 848137 - Posted: 2 Jan 2009, 11:14:39 UTC - in response to Message 848136.  

No, things are running normally. No EDF mode. Any more ideas I can look into?


______
DeMus


ID: 848137 · Report as offensive
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 848138 - Posted: 2 Jan 2009, 11:24:52 UTC - in response to Message 848050.  
Last modified: 2 Jan 2009, 11:50:48 UTC

Message on the Home page:
December 31, 2008
Our scheduling server has crashed and so no work is being distributed. This may not get fixed for a day or two.


Affects reporting too.


I didn't find this page. Can you send me the link?

OK...here is the Server Status page......a page that tells the truth as it knows it....in other words, check the update time....if the servers have all stalled, and the update time is more than 30 minutes or so old...it may no longer be valid....

Then there is the Scarecrow Graphs.... wonderful for reference...but they are at the mercy of the servers too....if the servers are all down...Scarecrow gets no data to graph....

And the most reliable of all, the Cricket Graphs.......they show the bandwidth in and out of the Seti servers...and are not hosted on the Seti servers...so they are almost always accessible even when the Seti servers are on the fritz.........

Hope this helps you in the future...meow.


Couldn't this last post from M-Sattler be made a sticky on the forum?
Every time there is a network/server outage I wonder what is happening, and I seem to lose the links every time I reformat my computer.
Could an op please make a sticky with that information so we can all keep track of what's happening?

That would calm everyone down, I think.

Kind Regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 848138 · Report as offensive
DeMus
Joined: 5 Jan 08
Posts: 238
Credit: 1,765,862
RAC: 0
Netherlands
Message 848144 - Posted: 2 Jan 2009, 12:14:45 UTC - in response to Message 848119.  

Still unable to upload or report any results from any of my machines.


The server status page says all servers are up and running, still I get the message:

Fri Jan 2 13:13:40 2009||Project communication failed: attempting access to reference site

Does anyone know what is wrong, is it here with me or is it at the other side?



______
DeMus


ID: 848144 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 848147 - Posted: 2 Jan 2009, 12:22:32 UTC - in response to Message 848144.  

Just too many machines trying to up- and download work at the same time. The Cricket Graph tells the story. The pipes are clogged and patience is the only answer ATM.

F.
ID: 848147 · Report as offensive
-Bert-

Joined: 23 Mar 02
Posts: 152
Credit: 412,754
RAC: 0
Netherlands
Message 848152 - Posted: 2 Jan 2009, 12:35:45 UTC - in response to Message 848144.  

Don't worry, it's definitely at the other side. The status page doesn't seem to be very reliable. There is only one thing we can do: practice patience.
ID: 848152 · Report as offensive
DeMus
Joined: 5 Jan 08
Posts: 238
Credit: 1,765,862
RAC: 0
Netherlands
Message 848156 - Posted: 2 Jan 2009, 12:55:20 UTC - in response to Message 848152.  

OK, but it's not easy, I tell you that. :-)

______
DeMus


ID: 848156 · Report as offensive
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 848162 - Posted: 2 Jan 2009, 13:04:14 UTC - in response to Message 848156.  

OK, just think how long it takes to recover from the normal 3-4 hour outage. This last outage was around 36 hours, roughly 10 times longer, so don't panic.

Just keep watching the Cricket Graph. When you see the traffic start to fall then work will start to flow.

Bernie
ID: 848162 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 848169 - Posted: 2 Jan 2009, 13:24:22 UTC - in response to Message 848135.  

It looks like problems again. After having received some 40WU's this morning, it now stops again.
I can't upload and also not download anything. The strange thing is when I press the upload button I read a message saying I am requesting 0 seconds of work. I set my cache to 4 days and still I request 0 seconds of work. What could be the reason for this?


If a project has more than 2 * (number of CPUs) uploads either waiting or in progress, work requests to that project are blocked. This is primarily to make sure nobody lands in a situation where they are crunching work faster than they manage to upload the results, which would eventually lead to all work being reported after the deadline. It is also an advantage when the cause is temporary server problems or temporarily maxed-out server bandwidth, as currently for SETI@home, since newly downloaded work can quickly lead to more uploads and therefore higher load, and higher load is something a project definitely doesn't want when it can't handle the current load...
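
A minimal sketch of that rule, for illustration only. The function and parameter names are made up here and this is not the actual BOINC client code; it just restates the "more than 2 * CPUs" threshold described above:

```python
def work_request_allowed(pending_uploads: int, n_cpus: int) -> bool:
    """Return False once the upload backlog exceeds 2 * n_cpus.

    pending_uploads counts uploads that are waiting or in progress.
    Hypothetical helper, not a real BOINC function.
    """
    return pending_uploads <= 2 * n_cpus


# A 2-CPU host with 5 stalled uploads is over the limit (5 > 4),
# which would explain a request for 0 seconds of work.
print(work_request_allowed(pending_uploads=5, n_cpus=2))
print(work_request_allowed(pending_uploads=4, n_cpus=2))
```

Under this reading, a host stops asking for work as soon as its stuck uploads pass the threshold, and starts again once enough of them get through.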

"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 848169 · Report as offensive
KWSN imcrazynow
Joined: 15 Jan 00
Posts: 63
Credit: 1,163,256
RAC: 0
United States
Message 848198 - Posted: 2 Jan 2009, 14:29:28 UTC - in response to Message 848109.  

Message on the Home page:
December 31, 2008
Our scheduling server has crashed and so no work is being distributed. This may not get fixed for a day or two.


Affects reporting too.


I didn't find this page. Can you send me the link?


He stated it was on the home page.

Proof that those menu bars at the top and bottom are not working. Now remove them. Useless attribute on these forums anyway. Who ever uses them? ;-)


It was indeed on the Home Page. I saw it as well. I can't remember exactly what it said, but it was to the effect that the servers were down and might remain that way for a day or so.

ID: 848198 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 848223 - Posted: 2 Jan 2009, 15:59:51 UTC - in response to Message 848162.  

...
Just keep watching the Cricket Graph. When you see the traffic start to fall then work will start to flow.
...

The ~94 Mbits/sec of flow for over 11 hours has delivered something like 40 thousand AP WUs and 1.1 million MB WUs. I've often wondered if there's some specific timing relationship which makes it easy for some hosts to get work during those peak times, or if it is evenly distributed luck.
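
As a rough check on the scale of those figures, the arithmetic can be sketched as below. It assumes exactly 94 Mbit/s sustained for exactly 11 hours and treats all of that traffic as workunit downloads, which ignores uploads, retransmissions and protocol overhead. The per-workunit average also lumps AP and MB workunits together even though they differ in size, so it is only an order-of-magnitude figure:

```python
RATE_BITS_PER_S = 94_000_000      # ~94 Mbit/s, as quoted above
DURATION_S = 11 * 3600            # "over 11 hours", taken as exactly 11 h
WORKUNITS = 40_000 + 1_100_000    # AP + MB workunits quoted above

# Total volume moved in that window, in bytes.
total_bytes = RATE_BITS_PER_S * DURATION_S // 8

# Crude average, assuming every byte belonged to one of those workunits.
avg_bytes_per_wu = total_bytes / WORKUNITS

print(f"total transferred: {total_bytes / 1e9:.1f} GB")
print(f"average per WU:    {avg_bytes_per_wu / 1e3:.0f} kB")
```

That works out to a few hundred gigabytes in total, or a few hundred kilobytes per workunit on average under these assumptions.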

I agree that after the rate starts to drop it will become a lot easier to get work; not having to deal with dropped connections and such is much smoother.
Joe
ID: 848223 · Report as offensive
Iztok s52d (and friends)

Joined: 12 Jan 01
Posts: 136
Credit: 393,469,375
RAC: 116
Slovenia
Message 848239 - Posted: 2 Jan 2009, 16:32:12 UTC - in response to Message 848223.  


Hi!
Yes, indeed. Timers can help.

While seeking a connection (SYN), where you do not get a TCP connection,
the timeout can be shorter.

After you get one: a lot of uploads/reports fail because our default timeout
is too short. Make it longer and maybe you can finish the session successfully.

Anyhow, those timers are usually system-wide defaults for the computer, so changing them
changes the behaviour of other TCP/IP activity too.
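
To illustrate the two timers, here is a small sketch at application level, using Python's standard `socket` module rather than system-wide settings. It is not how the BOINC client is implemented; the function name and the default values are assumptions chosen for the example:

```python
import socket


def open_with_split_timeouts(host: str, port: int,
                             connect_timeout: float = 5.0,
                             transfer_timeout: float = 120.0) -> socket.socket:
    """Connect with a short timeout, then allow slow transfers on the open socket."""
    # The timeout passed here bounds each connection attempt (the SYN phase).
    sock = socket.create_connection((host, port), timeout=connect_timeout)
    # From now on, each blocking send/recv may take up to transfer_timeout seconds.
    sock.settimeout(transfer_timeout)
    return sock
```

A caller would give up quickly when the server does not answer the connection attempt, but would tolerate a slow upload once connected. Because the timeouts are set per socket, other TCP/IP activity on the machine is not affected.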

So, yes, it is possible. And no, it does not make sense: just wait a while ....

On the other hand, the servers can do something. For example, they could block SYNs at the firewalls in such a way that, once you do get through to the server, you can easily upload, download and report.

73 HNY Iztok





ID: 848239 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 848377 - Posted: 2 Jan 2009, 21:14:36 UTC

Well, I made it up to requesting 800,000 seconds of work and got 50 tasks, 11 of them APs. Still trying to upload 7 results, but things are moving a little.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 848377 · Report as offensive
1mp0£173
Volunteer tester

Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 848386 - Posted: 2 Jan 2009, 21:29:33 UTC - in response to Message 848239.  


On the other hand, the servers can do something. For example, they could block SYNs at the firewalls in such a way that, once you do get through to the server, you can easily upload, download and report.

73 HNY Iztok

... and there is one more possibility, that BOINC does not currently exploit.

There should be a way for a project's servers to publish (and "publish" means through some means other than those servers themselves) that they're busy, and tell the BOINC clients, "hey, everybody slow down."

Sure, blocking SYNs in a firewall would work, but it's even better to keep the SYNs from happening in the first place.
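
One way such a hint could look, purely as a sketch: BOINC has no such mechanism according to this post, so the JSON fields `busy` and `retry_after_s` and the function below are invented for illustration. The idea is that the client reads a small status document from somewhere other than the project's own servers and decides whether to connect at all:

```python
import json


def next_action(status_json: str, normal_delay_s: int = 60) -> tuple[bool, int]:
    """Return (contact_now, wait_seconds) from a hypothetical status document."""
    status = json.loads(status_json)
    if status.get("busy", False):
        # Project says it is overloaded: do not even send a SYN yet.
        # Fall back to a long default wait if no delay was published.
        return False, int(status.get("retry_after_s", normal_delay_s * 20))
    return True, 0


print(next_action('{"busy": true, "retry_after_s": 1200}'))
print(next_action('{"busy": false}'))
```

The point of the design is that the decision costs the overloaded servers nothing, since the status document is hosted elsewhere.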

ID: 848386 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 848394 - Posted: 2 Jan 2009, 21:48:42 UTC - in response to Message 848386.  


Yeah, there should be a mechanism that tells the clients during high-load times to double their retry wait clocks.

The first couple of retries are a 60-second wait, and then it goes to less than 10 minutes, and then about 20, and after that, it's usually 1+ hours. High load times should make it start with 20 minutes and go upwards of 4-6 hours for a retry.
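
A simplified sketch of that idea follows. It uses plain doubling with a cap, which only approximates the retry times described above and is not the BOINC client's actual schedule; the function name, the `high_load` flag and the caps are assumptions for the example:

```python
def retry_delay_s(attempt: int, high_load: bool = False) -> int:
    """Delay in seconds before retry number `attempt` (1-based).

    Normal mode starts at 60 s and doubles up to 1 hour.
    High-load mode starts at 20 minutes and doubles up to 6 hours.
    """
    base, cap = (20 * 60, 6 * 3600) if high_load else (60, 3600)
    return min(cap, base * 2 ** (attempt - 1))


for n in range(1, 8):
    print(n, retry_delay_s(n), retry_delay_s(n, high_load=True))
```

With the flag set, even the first retry waits 20 minutes, so far fewer clients hit the servers at once while they recover.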
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 848394 · Report as offensive
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.