Panic Mode On (82) Server Problems?

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1340632 - Posted: 25 Feb 2013, 4:45:44 UTC

Now that we are mostly recovered from the weekend outage, let's start a new thread.

ID: 1340632 · Report as offensive
Floyd
Joined: 19 May 11
Posts: 524
Credit: 1,870,625
RAC: 0
United States
Message 1340640 - Posted: 25 Feb 2013, 5:51:08 UTC

Sounds good to me.
Are the validators having problems, or just struggling with the sudden mass of WUs being thrown at them after the 2-day outage?
ID: 1340640 · Report as offensive
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1340665 - Posted: 25 Feb 2013, 9:52:44 UTC

Are the validators having problems, or just struggling with the sudden mass of WUs being thrown at them after the 2-day outage?

Maybe. I checked my first 20 pendings and found 4 where both results are in but validation has not happened. Here's one:

http://setiathome.berkeley.edu/workunit.php?wuid=1175653331
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1340665 · Report as offensive
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1340669 - Posted: 25 Feb 2013, 10:43:50 UTC

Are the validators having problems, or just struggling with the sudden mass of WUs being thrown at them after the 2-day outage?

Maybe. I checked my first 20 pendings and found 4 where both results are in but validation has not happened. Here's one:

Disregard, those workunits have cleared now. Probably just another nap.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1340669 · Report as offensive
TRuEQ & TuVaLu
Volunteer tester
Joined: 4 Oct 99
Posts: 505
Credit: 69,523,653
RAC: 10
Sweden
Message 1340729 - Posted: 25 Feb 2013, 16:00:27 UTC

I have had 5 WUs in transfer since yesterday.
ID: 1340729 · Report as offensive
James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1340742 - Posted: 25 Feb 2013, 16:44:40 UTC
Last modified: 25 Feb 2013, 16:45:18 UTC

I just looked at my computers. Oh my. One has 138 timeouts. The other one won't download anything. I see 'em, but they just sit there blinking at me, stuck at various % of downloadedness :)

Old James
ID: 1340742 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22149
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1340766 - Posted: 25 Feb 2013, 17:21:23 UTC

The validators will be struggling with the number of reported tasks, which will be far higher than after a normal weekly outage.
Downloads have really taken a hit; it's taking hours to get an MB down, and as for APs....
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1340766 · Report as offensive
William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1340770 - Posted: 25 Feb 2013, 17:24:46 UTC - in response to Message 1340742.  

I just looked at my computers. Oh my. One has 138 timeouts. The other one won't download anything. I see 'em, but they just sit there blinking at me, stuck at various % of downloadedness :)


When you run a BOINC version above 6.10, you need something to periodically reset the backoffs...
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1340770 · Report as offensive
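For anyone wondering why the retry delays keep growing until something resets them: a client that backs off after each failed transfer usually waits longer after every consecutive failure, with some randomness mixed in. The sketch below is a generic randomized exponential backoff in Python. It is not BOINC's actual algorithm; `backoff_seconds`, the 60-second base, and the 4-hour cap are assumptions chosen only for illustration.

```python
import random


def backoff_seconds(failures, base=60.0, cap=4 * 3600.0):
    """Return a randomized retry delay (seconds) after `failures` consecutive failures.

    Hypothetical parameters: `base` is the ceiling after the first failure,
    `cap` is the largest ceiling the delay is allowed to reach.
    """
    # The ceiling doubles with every consecutive failure, up to the cap.
    ceiling = min(cap, base * (2 ** max(failures - 1, 0)))
    # Pick a random point in the upper half of the window so that many
    # clients hitting the same server do not all retry at the same moment.
    return random.uniform(ceiling / 2, ceiling)


def simulate(consecutive_failures):
    """List the delays a client would wait across a run of failures."""
    return [backoff_seconds(n) for n in range(1, consecutive_failures + 1)]


if __name__ == "__main__":
    for attempt, delay in enumerate(simulate(8), start=1):
        print(f"failure {attempt}: wait about {delay / 60:.1f} min")
```

Resetting the backoff amounts to setting the failure count back to zero, which is what a manual retry effectively does in this model.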
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1340771 - Posted: 25 Feb 2013, 17:26:31 UTC - in response to Message 1340766.  
Last modified: 25 Feb 2013, 17:29:57 UTC

The validators will be struggling with the number of reported tasks, which will be far higher than after a normal weekly outage.
Downloads have really taken a hit; it's taking hours to get an MB down, and as for APs....

And I don't see things easing up much until at least the end of the week, even IF they decide to forgo the usual maintenance outage tomorrow, seeing as how everything was just rebooted.
They may still need it, however, if they did not do their usual db cleanups and compression.

And unfortunately, the shorties being sent out are not enhancing the recovery.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1340771 · Report as offensive
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1340781 - Posted: 25 Feb 2013, 18:04:19 UTC
Last modified: 25 Feb 2013, 18:37:50 UTC

I gave up on connecting my XP machine to Ubuntu via Firewire. I finally broke down and installed Squid in Ubuntu. I actually managed to get Squid to handle the SETI downloads on my XP machine. No More Download Stalls with XP. I do need to look into configuring Squid to handle the SETI uploads and Scheduler requests though. Just like with many of the HTTP proxies, Squid doesn't do Uploads & Scheduler requests. Kinda makes you think many of those HTTP proxies are running Squid. My new HTTP Proxy is 192.168.1.4:5555.

Oh, no more of those 'Permanent HTTP Download Errors' either...

2/25/2013 1:21:52 PM |  | Suspending network activity - user request
2/25/2013 1:22:00 PM |  | Using proxy info from GUI
2/25/2013 1:22:00 PM |  | Using HTTP proxy 192.168.1.4:5555
2/25/2013 1:22:05 PM |  | Resuming network activity
2/25/2013 1:22:05 PM | SETI@home | Started download of ap_16my12aa_B2_P1_00172_20130225_23241.wu
2/25/2013 1:22:05 PM | SETI@home | Started download of ap_16my12aa_B2_P0_00181_20130225_21583.wu
2/25/2013 1:22:05 PM | SETI@home | Started download of ap_16my12aa_B1_P1_00214_20130225_21107.wu
2/25/2013 1:22:05 PM | SETI@home | Started download of ap_16my12aa_B2_P0_00206_20130225_21583.wu
2/25/2013 1:22:05 PM | SETI@home | Started download of ap_16my12aa_B2_P1_00198_20130225_23241.wu
2/25/2013 1:26:40 PM | SETI@home | Sending scheduler request: To fetch work.
2/25/2013 1:26:40 PM | SETI@home | Requesting new tasks for CPU and NVIDIA and ATI
2/25/2013 1:26:41 PM | SETI@home | Scheduler request failed: Error 417
2/25/2013 1:26:49 PM | SETI@home | Computation for task 18se12ab.27382.24607.8.10.49_2 finished
2/25/2013 1:26:49 PM | SETI@home | Starting task 18se12ab.27382.24607.8.10.108_2 using setiathome_enhanced version 609 (cuda23) in slot 3
2/25/2013 1:26:51 PM | SETI@home | Started upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:26:53 PM | SETI@home | Project file upload handler is missing
2/25/2013 1:26:53 PM | SETI@home | Backing off 3 min 39 sec on upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:28:01 PM | SETI@home | Started upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:28:04 PM | SETI@home | Project file upload handler is missing
2/25/2013 1:28:04 PM | SETI@home | Backing off 5 min 19 sec on upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:30:31 PM | SETI@home | Started upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:30:33 PM | SETI@home | Project file upload handler is missing
2/25/2013 1:30:33 PM | SETI@home | Backing off 14 min 58 sec on upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:32:53 PM | SETI@home | Finished download of ap_16my12aa_B2_P1_00172_20130225_23241.wu
2/25/2013 1:32:53 PM | SETI@home | Finished download of ap_16my12aa_B2_P0_00181_20130225_21583.wu
2/25/2013 1:33:01 PM | SETI@home | Started upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:33:04 PM | SETI@home | Finished upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:33:04 PM | SETI@home | Sending scheduler request: To fetch work.
2/25/2013 1:33:04 PM | SETI@home | Reporting 1 completed tasks
2/25/2013 1:33:04 PM | SETI@home | Requesting new tasks for CPU and NVIDIA and ATI
2/25/2013 1:33:05 PM | SETI@home | Scheduler request failed: Error 417
2/25/2013 1:34:02 PM | SETI@home | Finished download of ap_16my12aa_B2_P1_00198_20130225_23241.wu
2/25/2013 1:34:04 PM | SETI@home | Finished download of ap_16my12aa_B1_P1_00214_20130225_21107.wu
2/25/2013 1:34:46 PM | SETI@home | Finished download of ap_16my12aa_B2_P0_00206_20130225_21583.wu
2/25/2013 1:35:01 PM |  | Using proxy info from GUI
2/25/2013 1:35:01 PM |  | Not using a proxy
...
ID: 1340781 · Report as offensive
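The event log above can be summarized mechanically, for example to see how the upload backoffs grew and how often the scheduler request failed through the proxy. This is a minimal Python sketch, assuming the `timestamp | project | message` layout and the "Backing off N min N sec" wording seen in that log. `summarize`, the two regular expressions, and `SAMPLE` are illustrative names and are not part of BOINC.

```python
import re
from collections import Counter

# Patterns are based only on the message wording visible in the pasted log.
BACKOFF = re.compile(r"Backing off (?:(\d+) min )?(\d+) sec on (upload|download) of (\S+)")
FAILED = re.compile(r"Scheduler request failed: (.+)")

# A few lines copied from the log, used as sample input.
SAMPLE = """\
2/25/2013 1:22:00 PM |  | Using HTTP proxy 192.168.1.4:5555
2/25/2013 1:26:41 PM | SETI@home | Scheduler request failed: Error 417
2/25/2013 1:26:53 PM | SETI@home | Backing off 3 min 39 sec on upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:28:04 PM | SETI@home | Backing off 5 min 19 sec on upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:30:33 PM | SETI@home | Backing off 14 min 58 sec on upload of 18se12ab.27382.24607.8.10.49_2_0
2/25/2013 1:33:05 PM | SETI@home | Scheduler request failed: Error 417
"""


def summarize(log_text):
    """Return (backoffs, failures) extracted from BOINC-style log text.

    backoffs is a list of (file name, delay in seconds) in log order;
    failures counts each distinct scheduler failure reason.
    """
    backoffs = []
    failures = Counter()
    for line in log_text.splitlines():
        # Split into timestamp, project, and message; skip anything else.
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue
        message = parts[2]
        hit = BACKOFF.search(message)
        if hit:
            minutes = int(hit.group(1) or 0)
            backoffs.append((hit.group(4), minutes * 60 + int(hit.group(2))))
        fail = FAILED.search(message)
        if fail:
            failures[fail.group(1).strip()] += 1
    return backoffs, failures


if __name__ == "__main__":
    backoffs, failures = summarize(SAMPLE)
    for name, seconds in backoffs:
        print(f"{name}: backed off {seconds} s")
    for reason, count in failures.items():
        print(f"{reason}: {count} failed scheduler request(s)")
```

Lines that do not split into three fields, such as the trailing "..." in the paste, are simply skipped.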
James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1340785 - Posted: 25 Feb 2013, 18:20:57 UTC - in response to Message 1340770.  

I just looked at my computers. Oh my. One has 138 timeouts. The other one won't download anything. I see 'em, but they just sit there blinking at me, stuck at various % of downloadedness :)


When you run a BOINC version above 6.10, you need something to periodically reset the backoffs...

I have one. It's called moving my mouse over the retry button and clicking it till my finger cramps :)

Old James
ID: 1340785 · Report as offensive
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1340787 - Posted: 25 Feb 2013, 18:34:06 UTC
Last modified: 25 Feb 2013, 18:57:14 UTC

Somethin' funny going on....
The server status page has not updated for an hour.
That's usually not a good sign.
Mebbe just generating the daily stats dump?

EDIT...
Finally updated...hopefully that's the end of that particular panic.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1340787 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22149
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1340792 - Posted: 25 Feb 2013, 19:05:22 UTC

Blame the yellow fluffy thing, that wot I say, blame the yellow fluffy thing....
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1340792 · Report as offensive
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1340799 - Posted: 25 Feb 2013, 19:38:16 UTC

This morning's panic wasn't a panic - it was me upgrading the OS on the master science database server. The whole successful operation took about 90 minutes, during which some machines/processes hung for a bit. Expect a few more of those, which I can't always schedule during the normal Tuesday outages if I want to finish these upgrades already.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
ID: 1340799 · Report as offensive
Floyd
Joined: 19 May 11
Posts: 524
Credit: 1,870,625
RAC: 0
United States
Message 1340804 - Posted: 25 Feb 2013, 20:01:49 UTC - in response to Message 1340799.  

Upgrades are GOOD. Keep on keeping on, Matt!
Does your statement mean that there is still going to be a Tuesday outage after last weekend's downtime?
ID: 1340804 · Report as offensive
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1340826 - Posted: 25 Feb 2013, 20:50:33 UTC
Last modified: 25 Feb 2013, 20:50:58 UTC

Question: why did the Server Problems (81) thread get locked? The last post showed an awesome machine, so I will comment here. WOW, WOW, WOW, I want one. I was wondering how he had 32 threads going out of an 8-core. Did not expect two CPUs. It just looks so clean.
Now back to server problems.
All I have been doing for 2 days is babysitting this machine. Here is how it goes: download 2 k and wait five minutes, stall, shut down the connection, restart the connection. Download 2 k, stall. Repeat.
I have tried to flush the DNS and no joy there.
What will probably fix this is a proxy. I have tried several, but now I can't seem to get one going. Does anyone have a good proxy address that I can use?
Siv has given me some help, but it's the stalls that are beginning to give me more gray hair.

Thanks in advance.
Michael Miles
ID: 1340826 · Report as offensive
Mike Davis
Volunteer tester
Joined: 17 May 99
Posts: 240
Credit: 5,402,361
RAC: 0
Isle of Man
Message 1340830 - Posted: 25 Feb 2013, 21:06:29 UTC

http://www.hidemyass.com/proxy-list/

Select USA and search. Currently 147.31.182.137, port 80, is working quite well, but it is not a secure proxy.


ID: 1340830 · Report as offensive
Kevin Benfield
Joined: 29 Dec 03
Posts: 39
Credit: 30,085,439
RAC: 0
United Kingdom
Message 1340867 - Posted: 25 Feb 2013, 22:26:52 UTC

I have a bunch of GPU units trying to download. So far it has taken a couple of hours to download about 6, and they are only short ones.

Looks like it's going to be a long time before things are back to the so-called normal level.
ID: 1340867 · Report as offensive
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1340880 - Posted: 25 Feb 2013, 23:05:35 UTC - in response to Message 1340867.  
Last modified: 25 Feb 2013, 23:06:36 UTC

I have a bunch of GPU units trying to download. So far it has taken a couple of hours to download about 6, and they are only short ones.

Looks like it's going to be a long time before things are back to the so-called normal level.

Strange, I have 3 machines crunching S@H, and when I got up this morning at 08:00 UTC all three had downloaded their maximum WUs.

I have been out most of the day and all 3 have maintained maximum WUs!!

So as far as I am concerned there are no problems. In fact, I am surprised at how fast it has recovered from the weekend.
ID: 1340880 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1340881 - Posted: 25 Feb 2013, 23:06:26 UTC

@ pending vs. validated:

I've noticed that the tasks are getting validated fairly quickly, but it can be a few minutes, at least for APs anyway. I reported 3 of them this morning and went and checked the tasks page within a minute and they were pending, and I waited 30 seconds and hit refresh and they were validated.

Same thing just happened again for reporting two more APs a few minutes ago, but it took about 4 minutes instead.

Before the weekend's downtime, it was nearly instant. It may work itself out soon.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1340881 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.