Out of the fire and into the pit of sulfuric acid. (Feb 19, 2010)


Message boards : Technical News : Out of the fire and into the pit of sulfuric acid. (Feb 19, 2010)

Profile RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,635,618
RAC: 14,534
United States
Message 971925 - Posted: 20 Feb 2010, 0:22:36 UTC - in response to Message 971905.

maybe look at the configuration of your switches. no work is uploading, and scheduler requests aren't getting through either.
____________

Swibby Bear
Joined: 1 Aug 01
Posts: 236
Credit: 7,073,266
RAC: 975
United States
Message 971928 - Posted: 20 Feb 2010, 0:40:24 UTC
Last modified: 20 Feb 2010, 0:44:18 UTC

I'm not sure you understand -- All of my WUs are finally uploaded -- But I C A N - N O T CONNECT TO R E P O R T Them For More Than 48 Hours!!! I don't think this has anything to do with splitters being down.

Edge
Joined: 16 May 99
Posts: 19
Credit: 241,690,050
RAC: 6,599
United States
Message 971930 - Posted: 20 Feb 2010, 0:41:58 UTC - in response to Message 971899.

It's aggravating that some people simply aren't seeing this problem while others have been seeing this even before the AC crash and the first group is saying "all is well".


Fwiw, I've got more than a couple of computers that have been trying to send results over the last 72 to 96 hours. They managed to get about 75% through since the recovery from the A/C failure (but I was certainly having problems before then). The rest are sitting here trying to upload but getting HTTP errors and backing off, and when they can connect to download they're being told the project has no jobs available (that may be true given the onslaught of people trying to catch up, I don't know).

I'm in the 'something's still not quite right' camp, but just holding out, since there's obviously not much we can do.

If nothing else it's a good reminder to take a breath. Maybe some of us get tunnel vision for number crunching.... must get more!mustgetmore! or maybe that's just me. dunno yet.

Either way, hats off to the S@H team for giving it all they've got, and hope all have a great weekend.

Profile RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,635,618
RAC: 14,534
United States
Message 971933 - Posted: 20 Feb 2010, 0:48:57 UTC - in response to Message 971816.

... One of the raid arrays on thumper lost a drive...


you guys seem to lose way too many drives! how many do you have?



anybody got work?
____________

zoom314
Joined: 30 Nov 03
Posts: 44511
Credit: 35,392,088
RAC: 9,273
Message 971935 - Posted: 20 Feb 2010, 0:57:58 UTC - in response to Message 971933.

... One of the raid arrays on thumper lost a drive...


you guys seem to lose way too many drives! how many do you have?



anybody got work?

Until about 2am PST, sure, I've got work. After that, I doubt it highly. It's going to be colder than usual in the morning here (I leave the heater off at night since I'm the only living person around here anymore). Mom doesn't need anything as she's happy on her shelf in my closet for eternity. :D
____________

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 23702
Credit: 493,022
RAC: 134
United States
Message 971945 - Posted: 20 Feb 2010, 1:11:23 UTC - in response to Message 971881.

Superjoker: as explained elsewhere the backoffs are our friend as without them the servers would be flooded with requests & no-one would get anywhere. The longer the backoffs the better as it spreads the load more.

Backoffs are a perfect technique for spreading the load when the complete system is capable of handling (with an adequate safety margin) the aggregate anticipated demand averaged over an extended period of time. That's how SETI normally runs, and a few backoffs to shave the peaks and fill the troughs are exactly what's needed.

Backoffs do not help if the aggregate load exceeds - over an extended period - the system's capacity to absorb work. Then you have to take more drastic action, to reduce demand or increase supply.

For the last 4.5 days (only), SETI's capacity to absorb work has been below demand. I see no sign that demand has increased: instead, it seems to me that capacity has decreased (hopefully, temporarily).

No amount of smoothing (backoffs) will solve this. What is needed is to restore the status quo ante on the capacity side.
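The argument above can be checked with a toy queue model. This is only a sketch with made-up numbers: `simulate_backlog`, the demand figures and the capacity figures are all hypothetical and are not measurements from the SETI@home servers. Backoff is modelled in its ideal form, where every deferred request is simply retried in a later hour.

```python
def simulate_backlog(demand, capacity):
    """Return the number of unserved requests left at the end of each hour.

    demand   -- iterable of requests arriving in each hour (hypothetical numbers)
    capacity -- requests the server can absorb per hour (hypothetical number)
    Requests that are not served back off and are retried in a later hour.
    """
    backlog = 0
    history = []
    for arriving in demand:
        backlog += arriving                 # new results plus earlier retries
        backlog -= min(backlog, capacity)   # the server absorbs what it can
        history.append(backlog)
    return history


HOURS = 108  # 4.5 days

# Bursty demand whose average (100/hour) is below capacity (110/hour):
# backoffs shave the peaks and the backlog keeps returning to zero.
bursty = simulate_backlog([200, 0] * (HOURS // 2), capacity=110)
print(max(bursty), bursty[-1])   # prints: 90 0

# Steady demand above capacity: no retry schedule can drain the queue.
overloaded = simulate_backlog([100] * HOURS, capacity=80)
print(overloaded[-1])            # prints: 2160
```

In the second case the backlog grows by 20 requests every hour regardless of when the retries happen, which is the point being made: only more capacity (or less demand) fixes it.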

BOINC has a built in mechanism for this as well. If there are more than 2 * ncpus results waiting for uploads, no work will be requested. This was originally put in place to limit the growth on the client for a couple of projects where the upload time exceeded the crunch time. It also has the effect of limiting work being done if upload is slowed for an extended period.
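The rule described above can be written down as a one-line check. This is a minimal sketch in Python for illustration only, not the actual BOINC client code; the function name `should_request_work` and its parameters are hypothetical.

```python
import os


def should_request_work(results_waiting_upload, ncpus=None):
    """Hypothetical helper mirroring the rule as described in the post:
    if more than 2 * ncpus results are waiting to upload, request no work.
    """
    if ncpus is None:
        ncpus = os.cpu_count() or 1   # fall back to 1 if the count is unknown
    return results_waiting_upload <= 2 * ncpus


# On a 4-CPU host, 8 stuck uploads still allow a work request; 9 do not.
print(should_request_work(8, ncpus=4))   # prints: True
print(should_request_work(9, ncpus=4))   # prints: False
```

Under this rule a host whose uploads stay blocked stops fetching new tasks once the threshold is crossed, so it drains its cache instead of piling up more results.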
____________


BOINC WIKI

John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 23702
Credit: 493,022
RAC: 134
United States
Message 971947 - Posted: 20 Feb 2010, 1:13:24 UTC - in response to Message 971933.

... One of the raid arrays on thumper lost a drive...


you guys seem to lose way too many drives! how many do you have?



anybody got work?

Lots and lots of work from many projects (just not much from SETI).
____________


BOINC WIKI

Profile [seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 6966
Credit: 57,029,231
RAC: 22,515
Germany
Message 971951 - Posted: 20 Feb 2010, 1:19:08 UTC


I cannot upload thousands of results, and I cannot report them or request new work.

For days now, my PCs have been sitting idle.

The problem is not on my side: I have switched the DSL router off and on and rebooted the PCs.


Yes, sure, I'm being patient.
This is only for information.


____________
[Optimized project applications, for to increase your PC performance (double RAC)!][Overview of abbreviations, which are used often in forum and their meaning.]
____________
BR



>Das Deutsche Cafe. The German Cafe.<

Profile hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 971952 - Posted: 20 Feb 2010, 1:19:25 UTC

Guess this weekend sucks for seti....

2/19/2010 5:17:07 PM SETI@home Reporting 384 completed tasks, not requesting new tasks
2/19/2010 5:17:29 PM Project communication failed: attempting access to reference site
2/19/2010 5:17:30 PM Internet access OK - project servers may be temporarily down.
2/19/2010 5:17:32 PM SETI@home Scheduler request failed: Couldn't connect to server

____________
Official Abuser of Boinc Buttons...
And no good credit hound!

jravin
Joined: 25 Mar 02
Posts: 904
Credit: 85,857,448
RAC: 85,097
United States
Message 971962 - Posted: 20 Feb 2010, 1:30:49 UTC - in response to Message 971952.

Guess this weekend sucks for seti....

2/19/2010 5:17:07 PM SETI@home Reporting 384 completed tasks, not requesting new tasks
2/19/2010 5:17:29 PM Project communication failed: attempting access to reference site
2/19/2010 5:17:30 PM Internet access OK - project servers may be temporarily down.
2/19/2010 5:17:32 PM SETI@home Scheduler request failed: Couldn't connect to server


Dittos here - has anybody a clue as to why this is happening? It's been several days, and not a hint as to what is going on.....
____________

Profile RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,635,618
RAC: 14,534
United States
Message 971979 - Posted: 20 Feb 2010, 1:50:09 UTC - in response to Message 971935.
Last modified: 20 Feb 2010, 1:51:15 UTC

Until about 2am PST, sure, I've got work. After that, I doubt it highly. It's going to be colder than usual in the morning here (I leave the heater off at night since I'm the only living person around here anymore). Mom doesn't need anything as she's happy on her shelf in my closet for eternity. :D



we've got snow in the forecast for early next week, so it looks like i will be in the cold with you. it's just me and my doberman; i pity the person who tries to rip me off. i'm sure the staff went home, so this is it for the weekend: just noise on the inlet pipe from clients trying to upload, but no one is home.
____________

zoom314
Joined: 30 Nov 03
Posts: 44511
Credit: 35,392,088
RAC: 9,273
Message 971982 - Posted: 20 Feb 2010, 1:53:51 UTC - in response to Message 971962.

Guess this weekend sucks for seti....

2/19/2010 5:17:07 PM SETI@home Reporting 384 completed tasks, not requesting new tasks
2/19/2010 5:17:29 PM Project communication failed: attempting access to reference site
2/19/2010 5:17:30 PM Internet access OK - project servers may be temporarily down.
2/19/2010 5:17:32 PM SETI@home Scheduler request failed: Couldn't connect to server


Dittos here - has anybody a clue as to why this is happening? It's been several days, and not a hint as to what is going on.....

Nope, and it was happening before the last outage. When I said something, others said I was wrong; turns out I was right (told ya so). But there is still no idea where the blockage comes from: SETI doesn't seem to be getting anything, we don't get acks for uploads, and we can't report.
____________

Profile Lint trap
Joined: 30 May 03
Posts: 856
Credit: 24,545,883
RAC: 13,319
United States
Message 971988 - Posted: 20 Feb 2010, 2:09:45 UTC
Last modified: 20 Feb 2010, 2:10:25 UTC

Pathping is reporting some packet losses in the San Jose area.

The command in XP is

C:\>pathping 208.68.240.16 boinc2.ssl.berkeley.edu

The ip address is the upload server (from my hosts file). You need both parts - the ip and the name. Pathping /? for options.

The losses aren't huge pct-wise, but they are consistent.

Martin

Bob Giel
Volunteer tester
Joined: 11 Jan 04
Posts: 54
Credit: 4,976,676
RAC: 336
United States
Message 971993 - Posted: 20 Feb 2010, 2:15:44 UTC

Had over 100 uploads at 15:30 Central time. It is now 20:14 Central time and all uploads have completed. The "ready to upload" feature is still having problems connecting. I'm just letting the system do what it does.
____________

Profile Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 11732
Credit: 5,969,877
RAC: 0
United States
Message 971998 - Posted: 20 Feb 2010, 2:24:01 UTC

Eric:

Thanks for the update.

FYI: While the recovery from the A/C outage was happening, my Mac (10.5.8) was not having problems reporting work units, but my PC (XP Pro) was timing out. I suspect that the Mac has a better TCP/IP stack. I'm pointing this out because I don't know whether there is anything you can tune on your side to assist the recovery for the PC crowd. If both sides don't time out at about the same time, that will clog the pipes and make it more painful for everyone.

Hope the rain isn't too fast and furious up there. About to start here in LA LA land.

____________

zoom314
Joined: 30 Nov 03
Posts: 44511
Credit: 35,392,088
RAC: 9,273
Message 972024 - Posted: 20 Feb 2010, 3:57:14 UTC - in response to Message 971988.

Pathping is reporting some packet losses in the San Jose area.

The command in XP is

C:\>pathping 208.68.240.16 boinc2.ssl.berkeley.edu

The ip address is the upload server (from my hosts file). You need both parts - the ip and the name. Pathping /? for options.

The losses aren't huge pct-wise, but they are consistent.

Martin

Thanks Martin, I never knew about pathping, Here's My results using XP64:
Microsoft Windows [Version 5.2.3790]
(C) Copyright 1985-2003 Microsoft Corp.

C:\Documents and Settings\Administrator.PC1>pathping 208.68.240.16 boinc2.ssl.berkeley.edu

Tracing route to boinc2.ssl.berkeley.edu [208.68.240.18]
over a maximum of 30 hops:
  0  pc1.westell.com [192.168.1.45]
  1  dslrouter.westell.com [192.168.1.1]
  2  L100.LSANCA-DSL-35.verizon-gni.net [71.105.32.1]
  3  9-0-2935.LSANCA-LCR-09.verizon-gni.net [130.81.136.14]
  4  so-4-0-0-0.LAX01-BB-RTR1.verizon-gni.net [130.81.28.72]
  5  0.so-6-3-0.XT1.LAX9.ALTER.NET [152.63.10.153]
  6  0.ge-7-1-0.XL3.SJC7.ALTER.NET [152.63.48.254]
  7  POS6-0-0.GW4.SJC7.ALTER.NET [152.63.48.241]
  8  teliasonera-test-gw.customer.alter.net [157.130.215.70]
  9  hurricane-113209-sjo-bb1.c.telia.net [213.248.86.54]
 10  64.71.140.42
 11  208.68.243.254
 12  boinc2.ssl.berkeley.edu [208.68.240.18]

Computing statistics for 300 seconds...
            Source to Here   This Node/Link
Hop    RTT  Lost/Sent = Pct  Lost/Sent = Pct  Address
  0                                           pc1.westell.com [192.168.1.45]
                                0/ 100 =  0%   |
  1    1ms    0/ 100 =  0%      0/ 100 =  0%  dslrouter.westell.com [192.168.1.1]
                                0/ 100 =  0%   |
  2   37ms    0/ 100 =  0%      0/ 100 =  0%  L100.LSANCA-DSL-35.verizon-gni.net [71.105.32.1]
                                0/ 100 =  0%   |
  3   38ms    0/ 100 =  0%      0/ 100 =  0%  9-0-2935.LSANCA-LCR-09.verizon-gni.net [130.81.136.14]
                                0/ 100 =  0%   |
  4   40ms    2/ 100 =  2%      2/ 100 =  2%  so-4-0-0-0.LAX01-BB-RTR1.verizon-gni.net [130.81.28.72]
                                0/ 100 =  0%   |
  5   42ms    1/ 100 =  1%      1/ 100 =  1%  0.so-6-3-0.XT1.LAX9.ALTER.NET [152.63.10.153]
                                0/ 100 =  0%   |
  6   50ms    1/ 100 =  1%      1/ 100 =  1%  0.ge-7-1-0.XL3.SJC7.ALTER.NET [152.63.48.254]
                                0/ 100 =  0%   |
  7   48ms    0/ 100 =  0%      0/ 100 =  0%  POS6-0-0.GW4.SJC7.ALTER.NET [152.63.48.241]
                                0/ 100 =  0%   |
  8   59ms    1/ 100 =  1%      1/ 100 =  1%  teliasonera-test-gw.customer.alter.net [157.130.215.70]
                                0/ 100 =  0%   |
  9   54ms    1/ 100 =  1%      1/ 100 =  1%  hurricane-113209-sjo-bb1.c.telia.net [213.248.86.54]
                                0/ 100 =  0%   |
 10   57ms    0/ 100 =  0%      0/ 100 =  0%  64.71.140.42
                                5/ 100 =  5%   |
 11   57ms    5/ 100 =  5%      0/ 100 =  0%  208.68.243.254
                                8/ 100 =  8%   |
 12   55ms   13/ 100 = 13%      0/ 100 =  0%  boinc2.ssl.berkeley.edu [208.68.240.18]

Trace complete.

C:\Documents and Settings\Administrator.PC1>

____________

Profile RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,635,618
RAC: 14,534
United States
Message 972030 - Posted: 20 Feb 2010, 4:36:04 UTC
Last modified: 20 Feb 2010, 4:46:36 UTC


C:\>pathping 208.68.240.16 boinc2.ssl.berkeley.edu

Tracing route to boinc2.ssl.berkeley.edu [208.68.240.18]
over a maximum of 30 hops:
  0  SkullStation [192.168.1.113]
  1  ORION [192.168.1.1]
  2  cab1-1.1scom.net [66.182.x.x]
  3  DSL2-5.1scom.net [66.182.x.x]
  4  Texas-Independent-Energy6328-custidna.cust-rtr.swbell.net [151.164.76.189]
  5  *  bb2-p7-1.rcsntx.sbcglobal.net [151.164.190.248]
  6  ppp-151-164-52-78.rcsntx.swbell.net [151.164.52.78]
  7  *  asn6939-he.eqdltx.sbcgobal.net [151.164.248.150]
  8  10gigabitethernet1-2.core1.lax1.he.net [72.52.92.57]
  9  10gigabitethernet1-3.core1.pao1.he.net [72.52.92.21]
 10  *  64.71.140.42
 11  208.68.243.254
 12  boinc2.ssl.berkeley.edu [208.68.240.18]

Computing statistics for 300 seconds...
            Source to Here   This Node/Link
Hop    RTT  Lost/Sent = Pct  Lost/Sent = Pct  Address
  0                                           SkullStation [192.168.1.113]
                                0/ 100 =  0%   |
  1    0ms    0/ 100 =  0%      0/ 100 =  0%  ORION [192.168.1.1]
                                0/ 100 =  0%   |
  2   14ms    0/ 100 =  0%      0/ 100 =  0%  cab1-1.1scom.net [66.182.x.x]
                                0/ 100 =  0%   |
  3   15ms    0/ 100 =  0%      0/ 100 =  0%  DSL2-5.1scom.net [66.182.x.x]
                                0/ 100 =  0%   |
  4   12ms    0/ 100 =  0%      0/ 100 =  0%  Texas-Independent-Energy6328-custidna.cust-rtr.swbell.net [151.164.76.189]
                                0/ 100 =  0%   |
  5    ---  100/ 100 =100%    100/ 100 =100%  bb2-p7-1.rcsntx.sbcglobal.net [151.164.190.248]
                                0/ 100 =  0%   |
  6    ---  100/ 100 =100%    100/ 100 =100%  ppp-151-164-52-78.rcsntx.swbell.net [151.164.52.78]
                                0/ 100 =  0%   |
  7   13ms    0/ 100 =  0%      0/ 100 =  0%  asn6939-he.eqdltx.sbcgobal.net [151.164.248.150]
                                0/ 100 =  0%   |
  8   51ms    0/ 100 =  0%      0/ 100 =  0%  10gigabitethernet1-2.core1.lax1.he.net [72.52.92.57]
                                0/ 100 =  0%   |
  9   59ms    0/ 100 =  0%      0/ 100 =  0%  10gigabitethernet1-3.core1.pao1.he.net [72.52.92.21]
                                0/ 100 =  0%   |
 10   59ms    0/ 100 =  0%      0/ 100 =  0%  64.71.140.42
                                6/ 100 =  6%   |
 11   64ms    6/ 100 =  6%      0/ 100 =  0%  208.68.243.254
                                1/ 100 =  1%   |
 12   62ms    7/ 100 =  7%      0/ 100 =  0%  boinc2.ssl.berkeley.edu [208.68.240.18]

Trace complete.

line 5 and 6 don't look good, looks like some loss at the upload server and at the last hop.
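One way to read output like this is to look only at loss that persists all the way to the target. As far as I know, an intermediate router that shows 100% while later hops show 0% is usually just not answering the probes rather than dropping traffic. The sketch below makes that reading mechanical. It is an illustration only: `parse_hops`, `first_persistent_loss` and the regular expression are hypothetical helpers written for the Windows pathping layout, and `SAMPLE` is a subset of the per-hop lines from the run above.

```python
import re

# Matches a per-hop statistics line from Windows pathping, for example:
#  11   64ms    6/ 100 =  6%      0/ 100 =  0%  208.68.243.254
HOP_LINE = re.compile(
    r"^\s*(?P<hop>\d+)\s+(?P<rtt>\d+ms|---)\s+"
    r"(?P<lost>\d+)/\s*(?P<sent>\d+)\s*=\s*\d+%\s+"   # "Source to Here" column
    r"\d+/\s*\d+\s*=\s*\d+%\s+"                       # "This Node/Link" column
    r"(?P<address>.+?)\s*$"
)


def parse_hops(text):
    """Return (hop, lost, sent, address) for each per-hop statistics line."""
    hops = []
    for line in text.splitlines():
        m = HOP_LINE.match(line)
        if m:
            hops.append((int(m["hop"]), int(m["lost"]), int(m["sent"]), m["address"]))
    return hops


def first_persistent_loss(hops):
    """Return the first hop from which loss continues through to the last hop,
    or None if the last hop shows no loss at all."""
    start = None
    for hop, lost, _sent, _address in reversed(hops):
        if lost == 0:
            break
        start = hop
    return start


SAMPLE = """\
  4   12ms    0/ 100 =  0%      0/ 100 =  0%  Texas-Independent-Energy6328-custidna.cust-rtr.swbell.net [151.164.76.189]
  5    ---  100/ 100 =100%    100/ 100 =100%  bb2-p7-1.rcsntx.sbcglobal.net [151.164.190.248]
  6    ---  100/ 100 =100%    100/ 100 =100%  ppp-151-164-52-78.rcsntx.swbell.net [151.164.52.78]
  7   13ms    0/ 100 =  0%      0/ 100 =  0%  asn6939-he.eqdltx.sbcgobal.net [151.164.248.150]
 10   59ms    0/ 100 =  0%      0/ 100 =  0%  64.71.140.42
 11   64ms    6/ 100 =  6%      0/ 100 =  0%  208.68.243.254
 12   62ms    7/ 100 =  7%      0/ 100 =  0%  boinc2.ssl.berkeley.edu [208.68.240.18]
"""

print(first_persistent_loss(parse_hops(SAMPLE)))   # prints: 11
```

On this sample the 100% figures at hops 5 and 6 are skipped because hop 7 is clean, and the loss that actually reaches the upload server is first seen at hop 11.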
____________

zoom314
Joined: 30 Nov 03
Posts: 44511
Credit: 35,392,088
RAC: 9,273
Message 972031 - Posted: 20 Feb 2010, 4:38:41 UTC - in response to Message 972030.
Last modified: 20 Feb 2010, 4:54:29 UTC

C:\>pathping 208.68.240.16 boinc2.ssl.berkeley.edu

Tracing route to boinc2.ssl.berkeley.edu [208.68.240.18]
over a maximum of 30 hops:
  0  SkullStation [192.168.1.113]
  1  ORION [192.168.1.1]
  2  cab1-1.1scom.net [66.182.x.x]
  3  DSL2-5.1scom.net [66.182.x.x]
  4  Texas-Independent-Energy6328-custidna.cust-rtr.swbell.net [151.164.76.189]
  5  *  bb2-p7-1.rcsntx.sbcglobal.net [151.164.190.248]
  6  ppp-151-164-52-78.rcsntx.swbell.net [151.164.52.78]
  7  *  asn6939-he.eqdltx.sbcgobal.net [151.164.248.150]
  8  10gigabitethernet1-2.core1.lax1.he.net [72.52.92.57]
  9  10gigabitethernet1-3.core1.pao1.he.net [72.52.92.21]
 10  *  64.71.140.42
 11  208.68.243.254
 12  boinc2.ssl.berkeley.edu [208.68.240.18]

Computing statistics for 300 seconds...
            Source to Here   This Node/Link
Hop    RTT  Lost/Sent = Pct  Lost/Sent = Pct  Address
  0                                           SkullStation [192.168.1.113]
                                0/ 100 =  0%   |
  1    0ms    0/ 100 =  0%      0/ 100 =  0%  ORION [192.168.1.1]
                                0/ 100 =  0%   |
  2   14ms    0/ 100 =  0%      0/ 100 =  0%  cab1-1.1scom.net [66.182.x.x]
                                0/ 100 =  0%   |
  3   15ms    0/ 100 =  0%      0/ 100 =  0%  DSL2-5.1scom.net [66.182.x.x]
                                0/ 100 =  0%   |
  4   12ms    0/ 100 =  0%      0/ 100 =  0%  Texas-Independent-Energy6328-custidna.cust-rtr.swbell.net [151.164.76.189]
                                0/ 100 =  0%   |
  5    ---  100/ 100 =100%    100/ 100 =100%  bb2-p7-1.rcsntx.sbcglobal.net [151.164.190.248]
                                0/ 100 =  0%   |
  6    ---  100/ 100 =100%    100/ 100 =100%  ppp-151-164-52-78.rcsntx.swbell.net [151.164.52.78]
                                0/ 100 =  0%   |
  7   13ms    0/ 100 =  0%      0/ 100 =  0%  asn6939-he.eqdltx.sbcgobal.net [151.164.248.150]
                                0/ 100 =  0%   |
  8   51ms    0/ 100 =  0%      0/ 100 =  0%  10gigabitethernet1-2.core1.lax1.he.net [72.52.92.57]
                                0/ 100 =  0%   |
  9   59ms    0/ 100 =  0%      0/ 100 =  0%  10gigabitethernet1-3.core1.pao1.he.net [72.52.92.21]
                                0/ 100 =  0%   |
 10   59ms    0/ 100 =  0%      0/ 100 =  0%  64.71.140.42  <<Hurricane Electric
                                6/ 100 =  6%   |
 11   64ms    6/ 100 =  6%      0/ 100 =  0%  208.68.243.254  <<Seti@Home
                                1/ 100 =  1%   |
 12   62ms    7/ 100 =  7%      0/ 100 =  0%  boinc2.ssl.berkeley.edu [208.68.240.18]

Trace complete.


I also had to remove the [quote] tags.

I thought I'd make It more readable, So I added a couple of [pre] and [size] tags to Your output.
____________

zoom314
Joined: 30 Nov 03
Posts: 44511
Credit: 35,392,088
RAC: 9,273
Message 972033 - Posted: 20 Feb 2010, 4:48:45 UTC

I just had a thought and I could be wrong, But could somebody be throttling the project like We're a bunch of P2P file transfers?
____________

Profile RottenMutt
Joined: 15 Mar 01
Posts: 992
Credit: 207,635,618
RAC: 14,534
United States
Message 972045 - Posted: 20 Feb 2010, 5:09:19 UTC - in response to Message 972033.

is there any way to encrypt the transactions so they aren't blocked?
____________


Copyright © 2014 University of California