Panic Mode On (115) Server Problems?

Jimbocous Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 1982765 - Posted: 1 Mar 2019, 7:40:12 UTC - in response to Message 1982762.  

I found a little trick that seems to be working once you can get downloads started.
In cc_config.xml, set the max file transfers to 1:
<max_file_xfers>1</max_file_xfers>
Then have BOINC re-read the config file.
Once it gets going, it will only download one task at a time.
Seems to be working at the moment.

Meow!

Meow yourself. :)
No luck here with that ...
Strange that some folks are actually getting anything and for others it seems to be down hard ...
ID: 1982765

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13854
Credit: 208,696,464
RAC: 304
Australia
Message 1982766 - Posted: 1 Mar 2019, 7:44:10 UTC - in response to Message 1982765.  
Last modified: 1 Mar 2019, 7:45:23 UTC

No luck here with that ...

Nope.
Same for playing with the hosts file.
Things are very borked.


Berkeley campus System Status shows all green, so it's (most likely) just a Seti issue again.
Grant
Darwin NT
ID: 1982766

kittyman Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51478
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1982767 - Posted: 1 Mar 2019, 7:44:31 UTC - in response to Message 1982765.  

It's working purrrrrrrrrrrfectly for me.
You might have to hit the retry button for a while to get things rolling.
But I am now downloading tasks 1 at a time with good results.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1982767

Grant (SSSF)
Message 1982770 - Posted: 1 Mar 2019, 8:11:46 UTC

I notice that the Master Database is copping a hiding at the moment as well.
Grant
Darwin NT
ID: 1982770

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9958
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1982771 - Posted: 1 Mar 2019, 8:23:41 UTC

A little bit of babysitting here (spamming the retry button on the last waiting task) and I managed to get both machines to full caches.

To be fair it was only about 50 tasks across 2 machines.
ID: 1982771

Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1982772 - Posted: 1 Mar 2019, 8:24:57 UTC

I can get away with <max_file_xfers>2</max_file_xfers>

But you have to get one download started to begin with, then they take off on their own.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1982772

Jimbocous
Message 1982773 - Posted: 1 Mar 2019, 8:31:00 UTC - in response to Message 1982772.  
Last modified: 1 Mar 2019, 8:32:39 UTC

I can get away with <max_file_xfers>2</max_file_xfers>

But you have to get one download started to begin with, then they take off on their own.

Yeah, I finally got 1 box "in the door" and got ~100 tasks.
Looks like it's just everybody contending for open ports on the one d/l server that is up and running.
max_file_xfers probably doesn't matter; it's just a question of getting a socket.

max_file_xfers at 8 is probably pretty reasonable, given that max_file_xfers_per_project at 2 will throttle it lower anyway.
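To make the two limits concrete, here is a rough sketch of both options side by side in cc_config.xml, using the values mentioned in this post. The option names are max_file_xfers (overall) and max_file_xfers_per_project (per project), as they appear elsewhere in this thread; the rest of the file is assumed to be empty.

```xml
<cc_config>
  <options>
    <!-- cap on simultaneous file transfers across all attached projects -->
    <max_file_xfers>8</max_file_xfers>
    <!-- cap on simultaneous file transfers for any single project -->
    <max_file_xfers_per_project>2</max_file_xfers_per_project>
  </options>
</cc_config>
```

With only one project downloading, the per-project cap of 2 is the limit that actually bites, as long as the overall cap is 2 or more.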
ID: 1982773

kittyman
Message 1982774 - Posted: 1 Mar 2019, 8:31:03 UTC - in response to Message 1982772.  

I can get away with <max_file_xfers>2</max_file_xfers>

But you have to get one download started to begin with, then they take off on their own.

In the land of work arounds...............LOL.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1982774

kittyman
Message 1982776 - Posted: 1 Mar 2019, 8:53:38 UTC - in response to Message 1982775.  

Morning Mark,

In the land of work arounds...............LOL.

Or J.M. Barrie's famous place :-)

Hiya, Chris.
I don't think I know that place.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1982776

Jimbocous
Message 1982777 - Posted: 1 Mar 2019, 8:57:34 UTC - in response to Message 1982775.  

Morning Mark,

In the land of work arounds...............LOL.

Or J.M. Barrie's famous place :-)

Lol!
ID: 1982777

kittyman
Message 1982780 - Posted: 1 Mar 2019, 9:02:36 UTC - in response to Message 1982779.  
Last modified: 1 Mar 2019, 9:04:25 UTC

Peter Pan?

p.s. mornin' Jim

Ahh,,,,,,,,,,,,,,,,
When one believes, it can be so.
I might be due for a pixie haircut soon.

Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1982780

Ghia
Joined: 7 Feb 17
Posts: 238
Credit: 28,911,438
RAC: 50
Norway
Message 1982781 - Posted: 1 Mar 2019, 9:03:49 UTC

Arrrghh...woke up to an eerily quiet living room...no cozy fan hum means S@H trouble.
And trouble there was....I'm down to a few CPU tasks left in cache.
Can't get the downloads moving, "Project communication failed: attempting access to reference site".
Oh well....good time to get some backup work done.

...Ghia...
Humans may rule the world...but bacteria run it...
ID: 1982781

Jimbocous
Message 1982784 - Posted: 1 Mar 2019, 9:29:40 UTC - in response to Message 1982780.  

Peter Pan?

p.s. mornin' Jim

Ahh,,,,,,,,,,,,,,,,
When one believes, it can be so.
I might be due for a pixie haircut soon.

Meow.

Or is it just that the servers are out in Never Never Land? :)
Morning, Chris.
ID: 1982784

Wiggo
Joined: 24 Jan 00
Posts: 36761
Credit: 261,360,520
RAC: 489
Australia
Message 1982786 - Posted: 1 Mar 2019, 9:44:15 UTC

Everything is going just dandy here and that's not to mention over 160 AP's picked up today.

Cheers.
ID: 1982786

Grant (SSSF)
Message 1982787 - Posted: 1 Mar 2019, 9:52:14 UTC - in response to Message 1982786.  
Last modified: 1 Mar 2019, 9:52:41 UTC

Everything is going just dandy here and that's not to mention over 160 AP's picked up today.

Cheers.

Making me think this could be some sort of campus network issue.
I've tried your fix, I've tried other suggested fixes, I've tried both, & various combinations.

Still no joy.
Grant
Darwin NT
ID: 1982787

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14679
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1982789 - Posted: 1 Mar 2019, 9:59:04 UTC

Well, like everybody else, I woke up to stalled downloads across all machines. All now fully reloaded and downloaded, with no downtime. Conclusions:

Smaller, less powerful machines can hold their caches for longer.
For this current (28 February/1 March 2019) problem, messing with the hosts file won't help. If you didn't 'reset to auto' after the last DNS failure, you're probably shooting yourself in the foot.
Setting 'max_file_xfers_per_project' to 1 does help. It won't help you get your first connection, but failures on the second channel won't send you into "project backoff". The single channel you've got seems to keep plodding away until finished.
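For anyone wanting to try this, a minimal sketch of how that setting might look in cc_config.xml. It assumes the file has no other options yet; if it does, add the one line inside the existing <options> block instead of replacing the file.

```xml
<cc_config>
  <options>
    <!-- limit BOINC to one simultaneous file transfer per project -->
    <max_file_xfers_per_project>1</max_file_xfers_per_project>
  </options>
</cc_config>
```

BOINC has to re-read the file before the change takes effect, for example with `boinccmd --read_cc_config` or by restarting the client.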
ID: 1982789

-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1982791 - Posted: 1 Mar 2019, 10:13:32 UTC - in response to Message 1982789.  


Setting 'max_file_xfers_per_project' to 1 does help. It won't help you get your first connection, but failures on the second channel won't send you into "project backoff". The single channel you've got seems to keep plodding away until finished.


Do you think they have perhaps modified their firewall to allow only one concurrent connection? If multiple connections come from the same IP, then the host gets sent to a blocking firewall rule?!

That's the only explanation that comes to mind for why it works when max_file_xfers_per_project is set to 1 instead of the default of 2.

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 1982791

Jimbocous
Message 1982792 - Posted: 1 Mar 2019, 10:17:18 UTC - in response to Message 1982789.  

Smaller, less powerful machines can hold their caches for longer.

Interestingly, my least capable machine is the only one that seems able to get DLs, and now does so consistently. The others, no soap after several hours of manual trying.
All have 'max_file_xfers_per_project' set to 1.
Like a lot of other things, all comes down to the luck of the draw.
ID: 1982792

Richard Haselgrove
Message 1982793 - Posted: 1 Mar 2019, 10:28:16 UTC - in response to Message 1982791.  

No. There's no point, they haven't got time to mess about with things like that.

My reasoning for my advice was twofold:

1) At their end. We seem to have a download server which can't handle as many connections as we would all like it to service at the same time. Usually, when you hit it, it's too busy, and tells you to go away. But if you persist, every so often you'll hit it at just the right microsecond to get its attention.

2) At your end. Once you get a connection, you want to go on using it until your backlog is cleared. If you have a second set of connection attempts running at the same time, there's a chance that the second connection will hit at the wrong microsecond. Three of those in a row, and BOINC will stop trying. Then you go to the back of the scrum again.
ID: 1982793

-= Vyper =-
Message 1982797 - Posted: 1 Mar 2019, 10:38:57 UTC - in response to Message 1982793.  

Seems like it. Once I updated cc_config to accept only one at a time again, it seems to go through now.
I needed to nag the retry (in boinccmd it's a mess), but once it got going it seems to let the others through too, at a slow pace.

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 1982797