Panic Mode On (42) Server problems

Message boards : Number crunching : Panic Mode On (42) Server problems

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1056695 - Posted: 16 Dec 2010, 18:05:30 UTC

Wow.. After days of trying to get APs and only getting one or two a day.. I go to sleep and wake up to find a task list full of them.

2010-12-16 07:50:54|SETI@home|Scheduler request succeeded: got 1 new tasks
2010-12-16 11:11:15|SETI@home|Scheduler request succeeded: got 15 new tasks
2010-12-16 11:33:12|SETI@home|Scheduler request succeeded: got 2 new tasks
2010-12-16 12:04:46|SETI@home|Scheduler request succeeded: got 2 new tasks
2010-12-16 12:25:53|SETI@home|Scheduler request succeeded: got 6 new tasks


I really need about 15 more, but due to some issue that can't be found/resolved at the moment, occasionally an AP will run 8-16 hours longer than normal. If I save that WU and run it in stand-alone mode, it takes the normal 23-25 hours. So right now my DCF is off by about 10 hours.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1056695
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13832
Credit: 208,696,464
RAC: 304
Australia
Message 1056705 - Posted: 16 Dec 2010, 18:34:11 UTC - in response to Message 1056591.  
Last modified: 16 Dec 2010, 18:36:06 UTC

All of which goes to prove that Ned Ludd knew what he was talking about when he said that the download bandwidth needed to be more intelligently managed.

Having it maxed out is not an intelligent solution.

Sure, the error correction and retry protocols will get there in the end - but all those retry packets merely add to the congestion, and reduce the space available for real data.

We probably got more actual work through the pipe last week at 60% - 80%, than we are doing right now at 93%.

I agree that 85% of available bandwidth would be better than 95%+, but looking at Scarecrow's Graphs the slope for Work in Progress is steeper this week than last. So as bad as things are when maxed out, it's better for feeding the demand than only using 60% or so of the available bandwidth.
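
The disagreement here comes down to goodput (useful data delivered) versus raw utilization. A minimal Python sketch of that trade-off, assuming a 100 Mbit link and made-up retry fractions; neither figure comes from the thread or from the project's graphs, and `goodput_mbit` and `break_even_retry_fraction` are hypothetical helper names:

```python
def goodput_mbit(capacity_mbit: float, utilization: float, retry_fraction: float) -> float:
    """Useful payload rate: carried traffic minus the share that is retransmission."""
    if not 0.0 <= utilization <= 1.0 or not 0.0 <= retry_fraction <= 1.0:
        raise ValueError("utilization and retry_fraction must be between 0 and 1")
    return capacity_mbit * utilization * (1.0 - retry_fraction)


def break_even_retry_fraction(high_util: float, low_util: float, low_retry: float) -> float:
    """Retry share at which the busier link delivers the same goodput as the quieter one."""
    return 1.0 - (low_util * (1.0 - low_retry)) / high_util


CAPACITY_MBIT = 100.0  # assumption, not a measured value

# Hypothetical retry shares: 5% on a lightly loaded link, 35% on a saturated one.
quiet = goodput_mbit(CAPACITY_MBIT, 0.70, 0.05)
saturated = goodput_mbit(CAPACITY_MBIT, 0.93, 0.35)
threshold = break_even_retry_fraction(0.93, 0.70, 0.05)

print(f"70% utilization: {quiet:.1f} Mbit of real data")
print(f"93% utilization: {saturated:.1f} Mbit of real data")
print(f"break-even retry share at 93%: {threshold:.1%}")
```

With these assumed numbers the saturated link only loses if more than roughly 28% of its traffic is retries, and that retry share is the one quantity nobody in the argument has actually measured.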
Grant
Darwin NT
ID: 1056705
Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1056712 - Posted: 16 Dec 2010, 19:39:00 UTC

I think the point may be proved that it's maxed-out bandwidth that causes ghosts. All my machines have had at least 40 resends in the last few hours. First time I've had (potential) ghosts since the project came back up. As I type this all are in "Project Backoff" with some units showing as much as 20 minutes elapsed.

T.A.
ID: 1056712
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1056714 - Posted: 16 Dec 2010, 19:47:38 UTC - in response to Message 1056705.  

All of which goes to prove that Ned Ludd knew what he was talking about when he said that the download bandwidth needed to be more intelligently managed.

Having it maxed out is not an intelligent solution.

Sure, the error correction and retry protocols will get there in the end - but all those retry packets merely add to the congestion, and reduce the space available for real data.

We probably got more actual work through the pipe last week at 60% - 80%, than we are doing right now at 93%.

I agree that 85% of available bandwidth would be better than 95%+, but looking at Scarecrow's Graphs the slope for Work in Progress is steeper this week than last. So as bad as things are when maxed out, it's better for feeding the demand than only using 60% or so of the available bandwidth.

"Work in progress" is not the same thing as feeding the demand.

One of my machines currently has 9 tasks ready to run, and 248 "in progress" that are still trying to download.

Ghosts are also counted in "work in progress", remember.
ID: 1056714
Frizz
Volunteer tester
Joined: 17 May 99
Posts: 271
Credit: 5,852,934
RAC: 0
New Zealand
Message 1056790 - Posted: 16 Dec 2010, 23:42:07 UTC - in response to Message 1056705.  
Last modified: 16 Dec 2010, 23:42:43 UTC

I agree that 85% of available bandwidth would be better than 95%+ ...


+1


I have stop-and-go traffic at the moment. This is what my log has looked like for hours:

...
17/12/2010 00:35:03 SETI@home Started download of ap_24se10ab_B4_P0_00189_20101216_10901.wu
17/12/2010 00:35:10 Project communication failed: attempting access to reference site
17/12/2010 00:35:10 SETI@home Temporarily failed download of ap_25se10aa_B5_P0_00352_20101216_14500.wu: HTTP error
17/12/2010 00:35:10 SETI@home Backing off 1 min 0 sec on download of ap_25se10aa_B5_P0_00352_20101216_14500.wu
17/12/2010 00:35:11 Internet access OK - project servers may be temporarily down.
17/12/2010 00:35:32 SETI@home Started download of ap_24se10ab_B4_P0_00198_20101216_10901.wu
17/12/2010 00:36:01 Project communication failed: attempting access to reference site
17/12/2010 00:36:01 SETI@home Temporarily failed download of ap_24se10ab_B4_P0_00189_20101216_10901.wu: HTTP error
17/12/2010 00:36:01 SETI@home Backing off 3 hr 8 min 0 sec on download of ap_24se10ab_B4_P0_00189_20101216_10901.wu
17/12/2010 00:36:01 SETI@home Started download of ap_10se10aa_B3_P0_00025_20101216_06398.wu
17/12/2010 00:36:02 Internet access OK - project servers may be temporarily down.
...
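
A log like the one above can be tallied rather than eyeballed. A rough Python sketch, assuming the message wording seen in this excerpt (other BOINC client versions may phrase these lines differently); `summarize` is a hypothetical helper, not part of BOINC:

```python
import re

# Patterns inferred from the excerpt above (assumption: wording is stable).
STARTED_RE = re.compile(r"Started download of (\S+)")
FAILED_RE = re.compile(r"Temporarily failed download of (\S+): ")
BACKOFF_RE = re.compile(
    r"Backing off (?:(\d+) hr )?(?:(\d+) min )?(\d+) sec on download of (\S+)"
)


def summarize(lines):
    """Collect started and failed downloads plus the latest backoff per file, in seconds."""
    started, failed, backoff = set(), set(), {}
    for line in lines:
        match = STARTED_RE.search(line)
        if match:
            started.add(match.group(1))
            continue
        match = FAILED_RE.search(line)
        if match:
            failed.add(match.group(1))
            continue
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, name = match.groups()
            backoff[name] = int(hours or 0) * 3600 + int(minutes or 0) * 60 + int(seconds)
    return {"started": started, "failed": failed, "backoff_seconds": backoff}


LOG = """\
17/12/2010 00:35:03 SETI@home Started download of ap_24se10ab_B4_P0_00189_20101216_10901.wu
17/12/2010 00:35:10 SETI@home Temporarily failed download of ap_25se10aa_B5_P0_00352_20101216_14500.wu: HTTP error
17/12/2010 00:35:10 SETI@home Backing off 1 min 0 sec on download of ap_25se10aa_B5_P0_00352_20101216_14500.wu
17/12/2010 00:35:32 SETI@home Started download of ap_24se10ab_B4_P0_00198_20101216_10901.wu
17/12/2010 00:36:01 SETI@home Temporarily failed download of ap_24se10ab_B4_P0_00189_20101216_10901.wu: HTTP error
17/12/2010 00:36:01 SETI@home Backing off 3 hr 8 min 0 sec on download of ap_24se10ab_B4_P0_00189_20101216_10901.wu
17/12/2010 00:36:01 SETI@home Started download of ap_10se10aa_B3_P0_00025_20101216_06398.wu
"""

summary = summarize(LOG.splitlines())
for name, seconds in sorted(summary["backoff_seconds"].items()):
    print(f"{name}: backing off {seconds} s")
```

Fed the SETI@home lines from this post, it shows the jump from a 1 minute backoff on one file to more than 3 hours on another within the same minute of log.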
ID: 1056790
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14672
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1056799 - Posted: 16 Dec 2010, 23:58:24 UTC - in response to Message 1056714.  

One of my machines currently has 9 tasks ready to run, and 248 "in progress" that are still trying to download.

In the four hours since I posted that, the score for that machine is: Crunched 9, downloaded 0. Still 248 waiting to download.

GPUGrid gets another timeslice ;-)
ID: 1056799
Lionel
Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1056808 - Posted: 17 Dec 2010, 0:16:04 UTC - in response to Message 1056799.  


just did a quick check on 1 box that is not downloading wus at the moment ...
571 wus waiting to be crunched
618 wus allegedly in progress according to seti
ghosts 47
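
The ghost figure is just the gap between what the server lists as in progress and what is actually on the host. A small Python sketch of that check, using the numbers above; `ghost_count` is a hypothetical helper name, and it assumes every task on the host is also counted by the server:

```python
def ghost_count(server_in_progress: int, on_host: int) -> int:
    """Tasks the server believes were sent but the host never received."""
    if server_in_progress < 0 or on_host < 0:
        raise ValueError("task counts cannot be negative")
    # A host holding more than the server lists is a different problem, not ghosts.
    return max(server_in_progress - on_host, 0)


print(ghost_count(server_in_progress=618, on_host=571))
```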


ID: 1056808
OTS
Volunteer tester
Joined: 6 Jan 08
Posts: 371
Credit: 20,533,537
RAC: 0
United States
Message 1056831 - Posted: 17 Dec 2010, 2:30:15 UTC - in response to Message 1056808.  

When I look at mine, I have 80 wu on my machine with SETI reporting 7 valid, 57 pending, and 2 in error. That totals 146 which is what SETI says I have. I don't think I have any ghosts at the moment unless I am counting something in the total I shouldn't, but it sure would be a coincidence if that was the case.
ID: 1056831
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1056832 - Posted: 17 Dec 2010, 2:35:23 UTC - in response to Message 1056514.  

I am not terribly worried: I've got 4 AP units that BOINC thinks will take 200 hours (experience tells me 18 - 20 hours is more accurate).

As a result, my computer isn't even asking for work. :P


If you are running optimized project apps, have you added <flops> to your app_info.xml file?

ID: 1056832
Lionel
Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1056842 - Posted: 17 Dec 2010, 3:22:12 UTC - in response to Message 1056832.  


It appears the limits are still in place...
ID: 1056842
-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1056851 - Posted: 17 Dec 2010, 4:11:24 UTC

Right now I have 167 cuda's and 2 MB's downloading to one machine. It is running at about 22kbps x 4 WU's at a time. Seems to be doing pretty good on my end.
Traveling through space at ~67,000mph!
ID: 1056851
Gundolf Jahn
Joined: 19 Sep 00
Posts: 3184
Credit: 446,358
RAC: 0
Germany
Message 1056885 - Posted: 17 Dec 2010, 7:18:12 UTC - in response to Message 1056851.  

Right now I have 167 cuda's and 2 MB's downloading to one machine...

CUDAs are MBs too. ;-)
(Sorry, couldn't stop myself:-)

Gruß,
Gundolf
ID: 1056885
KB7RZF
Volunteer tester
Joined: 15 Aug 99
Posts: 9549
Credit: 3,308,926
RAC: 2
United States
Message 1056891 - Posted: 17 Dec 2010, 8:02:25 UTC - in response to Message 1056885.  

Right now I have 167 cuda's and 2 MB's downloading to one machine...

CUDAs are MBs too. ;-)
(Sorry, couldn't stop myself:-)

Gruß,
Gundolf

But it's 2 different apps. :-P
LOL
ID: 1056891
-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1056900 - Posted: 17 Dec 2010, 8:31:59 UTC - in response to Message 1056885.  

Right now I have 167 cuda's and 2 MB's downloading to one machine...

CUDAs are MBs too. ;-)
(Sorry, couldn't stop myself:-)

Gruß,
Gundolf


I'm well aware of that, just figured you could decipher what I meant. Slight mishap on wording but the point got across.
Traveling through space at ~67,000mph!
ID: 1056900
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34345
Credit: 79,922,639
RAC: 80
Germany
Message 1056930 - Posted: 17 Dec 2010, 10:37:37 UTC


MB download is very fast but Astropulse takes forever.



With each crime and every kindness we birth our future.
ID: 1056930
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1056931 - Posted: 17 Dec 2010, 10:40:35 UTC - in response to Message 1056930.  


MB download is very fast but Astropulse takes forever.

That is because of the size of the download for AP tasks.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1056931
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34345
Credit: 79,922,639
RAC: 80
Germany
Message 1057158 - Posted: 17 Dec 2010, 19:50:17 UTC - in response to Message 1056931.  


MB download is very fast but Astropulse takes forever.

That is because of the size of the download for AP tasks.


No, it's because of the bandwidth.

On MB tasks I get 30 K/s
On AP only 1 - 3 K/s.



With each crime and every kindness we birth our future.
ID: 1057158
-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1057160 - Posted: 17 Dec 2010, 19:53:14 UTC

From what I have seen on my machine, I'm limited not per download but per computer in how fast I can download. Last night I was getting work units at 22kbps on one WU but 10 on another, etc. I seem to stay, combined, around 35-40kbps while downloading. Is there a limit placed on the speed from Seti or is this a bottlenecking issue? I've never paid attention to my download speeds until lately.
Traveling through space at ~67,000mph!
ID: 1057160
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1057227 - Posted: 17 Dec 2010, 21:37:29 UTC

When the pipe isn't saturated, I would be able to download MBs in ~1 second, so 300-400k/sec. I would have expected that when there is still 50mbit of unused throughput on the pipe, I could use all 18mbit of my download speed for APs. Unfortunately, my observation on this is that it will only run between 50-200k/sec for AP downloads.

I'm not complaining at all, but there does seem to be some kind of throttling somewhere. It takes between 60-180 seconds for each AP to download for me, but it isn't my hardware, because at any time, I can go download java from oracle's website and get 2.2MB/sec.
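
The units in this comparison are easy to mix up, so here is a small Python sketch of the arithmetic. It assumes "k/sec" means kilobytes per second and uses decimal units; the 8000 KB Astropulse file size is a placeholder for illustration, not a figure from the thread, and both function names are hypothetical:

```python
BITS_PER_BYTE = 8


def mbit_to_kbyte_per_s(mbit_per_s: float) -> float:
    """Convert a line rate in Mbit/s to kilobytes per second (decimal units)."""
    return mbit_per_s * 1000 / BITS_PER_BYTE


def transfer_seconds(size_kbyte: float, rate_kbyte_per_s: float) -> float:
    """Time to move a file of the given size at a steady rate."""
    if rate_kbyte_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_kbyte / rate_kbyte_per_s


line_rate = mbit_to_kbyte_per_s(18)  # the 18 Mbit connection mentioned above
AP_SIZE_KBYTE = 8000                 # assumed file size, for illustration only

for rate in (50, 200, line_rate):
    seconds = transfer_seconds(AP_SIZE_KBYTE, rate)
    print(f"{rate:7.0f} KB/s -> {seconds:6.1f} s")
```

An 18 Mbit line works out to 2250 KB/s, which agrees with the 2.2MB/sec seen from the other site, so the observed 50-200k/sec is somewhere between a tenth and a fortieth of what the connection can carry.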
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1057227
-BeNt-
Joined: 17 Oct 99
Posts: 1234
Credit: 10,116,112
RAC: 0
United States
Message 1057392 - Posted: 18 Dec 2010, 6:28:40 UTC - in response to Message 1057227.  

When the pipe isn't saturated, I would be able to download MBs in ~1 second, so 300-400k/sec. I would have expected that when there is still 50mbit of unused throughput on the pipe, I could use all 18mbit of my download speed for APs. Unfortunately, my observation on this is that it will only run between 50-200k/sec for AP downloads.

I'm not complaining at all, but there does seem to be some kind of throttling somewhere. It takes between 60-180 seconds for each AP to download for me, but it isn't my hardware, because at any time, I can go download java from oracle's website and get 2.2MB/sec.


Yeah, there is throttling indeed. Tonight I'm getting about 60kbps on my 30Mbit line. I should be getting much higher speeds, but your download speed is also affected by your routing to the server etc. If there is congestion in other places along the way, it slows things down. However, they are throttled on their end, and if they weren't you would possibly be getting worse speeds as everyone ate up everything! Not sure how accurate their Cricket graphs are, but the highest I've ever seen was about 97Mbps, so it kind of makes you wonder what QoS they do have running. Right now it's only at 79Mbps! Is the wave slowing down?
Traveling through space at ~67,000mph!
ID: 1057392


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.