Panic Mode On (78) Server Problems?

Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1301867 - Posted: 3 Nov 2012, 23:42:58 UTC - in response to Message 1301815.  

Thank you, Juan!

Now we know that they know, and we all know that means they will fix the problem as soon as they can, as they have always done before. :)
ID: 1301867
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1301871 - Posted: 4 Nov 2012, 0:00:59 UTC - in response to Message 1301828.  

They will be acknowledged as report right now only if you select "No New Tasks" on the projects tab.

I wish that were true.
I'll mention it again- even with only a couple of tasks to report & No New Tasks set the Scheduler still usually times out. There's 1 time in about 20 where it doesn't.
Uploads have been slow for much of this time as well.

Well it does on my system. Task gets done, it uploads and then the scheduler reports it, it goes through and the task vanishes from my client's task list.

Don't know what to say.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1301871
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1301872 - Posted: 4 Nov 2012, 0:02:10 UTC - in response to Message 1301858.  

Something other than just bandwidth saturation is at play here.

That's my feeling.
There have been many times in the past where network traffic has been maxed out, and downloads are pretty much impossible, but you are still able to contact the Scheduler to report work & get more work allocated.

The fact is that even now with the network traffic maxed out, if you do (somehow) manage to get some work, it's downloading fairly quickly. Certainly much, much faster than in the past, and when you were still able to get a response from the Scheduler.

Over the last few months we've had issues with Scheduler timeouts, but not for nearly as long as this time, nor nearly as severe- from memory i would get a response about 1 in 5 to 7 attempts. Now i'm lucky if it's 1 in 20, No New Tasks set or not.
Hence i suspect it's a system configuration/load problem, not a network load one.
Grant
Darwin NT
ID: 1301872
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301875 - Posted: 4 Nov 2012, 0:05:24 UTC - in response to Message 1301871.  

They will be acknowledged as report right now only if you select "No New Tasks" on the projects tab.

I wish that were true.
I'll mention it again- even with only a couple of tasks to report & No New Tasks set the Scheduler still usually times out. There's 1 time in about 20 where it doesn't.
Uploads have been slow for much of this time as well.

Well it does on my system. Task gets done, it uploads and then the scheduler reports it, it goes through and the task vanishes from my client's task list.

Don't know what to say.

It gets really hard to quantify it when I have 9 rigs trying to report 1000s of WUs. Hits and misses go by unnoticed by me. Until I check the stats page and I see some rigs have not reported for hours.
That page is usually my barometer for the rigs, if I see one has not reported for a while, I suspect a crash and check it out.

Not a reliable barometer at the moment.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301875
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1301878 - Posted: 4 Nov 2012, 0:07:17 UTC - in response to Message 1301871.  

They will be acknowledged as report right now only if you select "No New Tasks" on the projects tab.

I wish that were true.
I'll mention it again- even with only a couple of tasks to report & No New Tasks set the Scheduler still usually times out. There's 1 time in about 20 where it doesn't.
Uploads have been slow for much of this time as well.

Well it does on my system. Task gets done, it uploads and then the scheduler reports it, it goes through and the task vanishes from my client's task list.

Don't know what to say.

My client_state is 2.4MB in size, my sched_request_setiathome.berkeley.edu is 450kB in size. I suspect yours are a lot smaller. You're in the US, i'm a few thousand kms away.
End result- you may be able to get work, i'm lucky if i can even report work- even after 30min of endless Update clicking with No New Tasks set.
Grant
Darwin NT
ID: 1301878
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301880 - Posted: 4 Nov 2012, 0:14:38 UTC - in response to Message 1301878.  

They will be acknowledged as report right now only if you select "No New Tasks" on the projects tab.

I wish that were true.
I'll mention it again- even with only a couple of tasks to report & No New Tasks set the Scheduler still usually times out. There's 1 time in about 20 where it doesn't.
Uploads have been slow for much of this time as well.

Well it does on my system. Task gets done, it uploads and then the scheduler reports it, it goes through and the task vanishes from my client's task list.

Don't know what to say.

My client_state is 2.4MB in size, my sched_request_setiathome.berkeley.edu is 450kB in size. I suspect yours are a lot smaller. You're in the US, i'm a few thousand kms away.
End result- you may be able to get work, i'm lucky if i can even report work- even after 30min of endless Update clicking with No New Tasks set.

Grant, don't feel special.
The kitties are equally fukayed from the midwest USA.....
I could be on the Berk campus and still be screwed as bad as you right now.

"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301880
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1301887 - Posted: 4 Nov 2012, 0:26:29 UTC - in response to Message 1301880.  


Just to add to the present fun, i'm now getting some "Couldn't connect to server" messages in response to a Scheduler request.
Grant
Darwin NT
ID: 1301887
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301888 - Posted: 4 Nov 2012, 0:28:22 UTC - in response to Message 1301887.  


Just to add to the present fun, i'm now getting some "Couldn't connect to server" messages in response to a Scheduler request.

Been getting those for the last 24 hours or more.
#2 rig has not been able to get through for an hour and a half.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301888
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1301899 - Posted: 4 Nov 2012, 1:10:33 UTC

I'm having no trouble reporting 1-5 tasks every couple of hours. My cache got a little over-filled when I started hoarding APs, so I'm not asking for more work presently, which I suspect is nearly the equivalent of NNT, since both are effectively "not asking for more work."

When I was downloading the APs, they were coming in at around 25KB/sec for each one, sometimes I had 5-8 of them going at a time.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1301899
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1301901 - Posted: 4 Nov 2012, 1:12:11 UTC

my linux boxes are reporting and obtaining normal work. but my one windoz box shows nothing but scheduler timeout. it seems to upload ok .. slowly but ok.

ID: 1301901
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301903 - Posted: 4 Nov 2012, 1:18:22 UTC

You folks don't have a clue about hosts processing and trying to return 100s of results an hour.

Not the same as saying....oh, my rig did 2 tasks last hour, and both got reported just fine.

Yah, right. The kitties got YOUR back, bud.

Not dissing anybody, but it's a different class of problems here.

The kitties need access to the servers 24/7.


"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301903
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1301905 - Posted: 4 Nov 2012, 1:29:07 UTC - in response to Message 1301899.  

When I was downloading the APs, they were coming in at around 25KB/sec for each one, sometimes I had 5-8 of them going at a time.

Which is more support for the Scheduler issues being server related, not network traffic.
Grant
Darwin NT
ID: 1301905
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1301912 - Posted: 4 Nov 2012, 1:42:33 UTC

msattler wrote:
The kitties need access to the servers 24/7.


They ought to just send you 500gb unsplit raw drives to crunch :-)
Then we'd all have some network bandwidth to spare.
ID: 1301912
betreger
Joined: 29 Jun 99
Posts: 11354
Credit: 29,581,041
RAC: 66
United States
Message 1301914 - Posted: 4 Nov 2012, 1:51:19 UTC - in response to Message 1301903.  

I can't believe it is Karma. It is just that the big guys are constipating the system. Everybody knows that. Until a politically acceptable and economically doable solution is proposed this is just venting.
ID: 1301914
betreger
Joined: 29 Jun 99
Posts: 11354
Credit: 29,581,041
RAC: 66
United States
Message 1301916 - Posted: 4 Nov 2012, 1:54:01 UTC - in response to Message 1301912.  

Tron, that gets away from the concept of distributed computing.
ID: 1301916
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1301917 - Posted: 4 Nov 2012, 1:55:18 UTC - in response to Message 1301912.  

msattler wrote:
The kitties need access to the servers 24/7.


They ought to just send you 500gb unsplit raw drives to crunch :-)
Then we'd all have some network bandwidth to spare.

Hey, that might work....LOL.

I'll broach the subject with Eric in our next chat.

I'd have to install the splitting software. I think the GPUUG has sent enough HDs for shuttle service. That would be a grand solution, I think.

They would have to trust the kitties' science trail....err, tails.

Something tells me that letting raw data out of the house would not work scientifically.

The kitties are simply scintillated by the thought, though.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1301917
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1301919 - Posted: 4 Nov 2012, 1:59:16 UTC - in response to Message 1301916.  

Tron, that gets away from the concept of distributed computing.



They still distribute the work, and only, say, the top 25 machines would participate in a HD exchange program.
I don't think the work would need to be split either.. just one long stream with nominal checkpointing.
ID: 1301919
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1301920 - Posted: 4 Nov 2012, 1:59:25 UTC - in response to Message 1301903.  

You folks don't have a clue about hosts processing and trying to return 100s of results an hour.

Not the same as saying....oh, my rig did 2 tasks last hour, and both got reported just fine.

Yah, right. The kitties got YOUR back, bud.

Not dissing anybody, but it's a different class of problems here.

The kitties need access to the servers 24/7.


I wasn't trying to say that it was a situation of "it must just be a problem on your end," I was merely pointing out that I don't have any/many scheduler contact issues, even when only reporting a very small number of tasks. Others are having connection issues when reporting a small number of tasks, and so are those who are reporting a large number.

Consider my message as a data point on a graph, or a breadcrumb for trying to pin-point the actual problem.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1301920
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1301958 - Posted: 4 Nov 2012, 4:54:15 UTC


...and to rub salt into the wounds, on the one occasion where the Scheduler responded to a request for work, i got 1 WU.
1,000 would be nice, a couple of thousand would be better. Probably at least half as many again to fill my caches.
1 WU!
Grant
Darwin NT
ID: 1301958
Cherokee150
Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1301975 - Posted: 4 Nov 2012, 5:34:07 UTC

Perhaps this might be a situation where there is more than one problem occurring at the same time. That, as most of us know from experience, makes diagnosis exceedingly difficult, which seems to fit the current crisis.

Could this be a possibility? What do you think?
ID: 1301975
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.