Panic Mode On (81) Server Problems?

Kathy
Joined: 5 Jan 03
Posts: 338
Credit: 27,877,436
RAC: 0
United States
Message 1335459 - Posted: 7 Feb 2013, 15:28:36 UTC

Haven't been able to connect for 3 days now, and get the same messages every time:

2/7/2013 10:25:47 AM | SETI@home | Reporting 81 completed tasks, requesting new tasks for CPU and ATI
2/7/2013 10:26:11 AM | SETI@home | Scheduler request failed: Couldn't connect to server
2/7/2013 10:26:15 AM | | Project communication failed: attempting access to reference site
2/7/2013 10:26:16 AM | | Internet access OK - project servers may be temporarily down.

ID: 1335459
James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1335460 - Posted: 7 Feb 2013, 15:30:55 UTC
Last modified: 7 Feb 2013, 15:32:00 UTC

Yes, it seems lately we have had a lot of problems: no work, ghost work units, can't report, can't download, can't upload. Or download speeds so slow you could go to the lab and get them faster in person.

I can assure you that no one likes it. Not even the lab crew. But when you are the number 1 BOINC project in terms of active members, and underfunded, it's understandable.

I do other projects when I have no work. I'd rather not, but I do just to keep my computers busy. And they are worthy projects in their own right.

Old James
ID: 1335460
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1335483 - Posted: 7 Feb 2013, 16:34:53 UTC

Don't know what happened, just woke up from dozing off for about an hour and noticed that the scheduler has dumped almost a full fuel load into my fastest rig, finishing as I opened my eyes. Now if it would only bless my other rig, both machines will quit bitc(&^$$^%%$inhg.


I don't buy computers, I build them!!
ID: 1335483
Gone
Joined: 31 May 99
Posts: 150
Credit: 125,779,206
RAC: 0
United Kingdom
Message 1335485 - Posted: 7 Feb 2013, 16:43:45 UTC - in response to Message 1335483.  

Don't know what happened, just woke up from dozing off for about an hour and noticed that the scheduler has dumped almost a full fuel load into my fastest rig, finishing as I opened my eyes. Now if it would only bless my other rig, both machines will quit bitc(&^$$^%%$inhg.





Similar thing just happened to me, except I was only dreaming.

When I awoke all was still broken ...

:)
ID: 1335485
Cliff Harding
Volunteer tester
Joined: 18 Aug 99
Posts: 1432
Credit: 110,967,840
RAC: 67
United States
Message 1335490 - Posted: 7 Feb 2013, 16:55:38 UTC - in response to Message 1335485.  

Don't know what happened, just woke up from dozing off for about an hour and noticed that the scheduler has dumped almost a full fuel load into my fastest rig, finishing as I opened my eyes. Now if it would only bless my other rig, both machines will quit bitc(&^$$^%%$inhg.





Similar thing just happened to me, except I was only dreaming.

When I awoke all was still broken ...

:)


My luck is still holding, just got 2 more OpenCL units since my initial post.


I don't buy computers, I build them!!
ID: 1335490
fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1335495 - Posted: 7 Feb 2013, 17:09:51 UTC

Sure would be nice to have a place to get some information as to what is being done to correct this issue. Or is there and I just don't know where to look?
ID: 1335495
KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 1335496 - Posted: 7 Feb 2013, 17:19:32 UTC - in response to Message 1335495.  

Sure would be nice to have a place to get some information as to what is being done to correct this issue. Or is there and I just don't know where to look?


This is the place.
Welcome to Room 101.

ID: 1335496
fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1335500 - Posted: 7 Feb 2013, 17:27:22 UTC - in response to Message 1335496.  

Sure would be nice to have a place to get some information as to what is being done to correct this issue. Or is there and I just don't know where to look?


This is the place.
Welcome to Room 101.


Thanks. Guess I need a crash course in reading comprehension. :)
ID: 1335500
BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 1335501 - Posted: 7 Feb 2013, 17:28:16 UTC - in response to Message 1335495.  
Last modified: 7 Feb 2013, 17:47:59 UTC

I had been running NNT (No New Tasks) to trim out the existing work units. I don't run GPU work units from SETI for a number of reasons, so I suppose my frustration level is lower than many. These days, as the project repeatedly bounces its collective head against the AP mass traffic tie-up (something which has happened many times over the past few months without a resolution), when I encounter the 'Dead SETI' scenario, I simply suspend the project on the handful of workstations still running SETI and feed projects that don't suffer from this sort of persistent reliability problem.

Perhaps at some point the folks back at the project will figure out that running the high traffic volume AP work units along with the worthless less-than-one-minute CPU work units only serves to reduce the project's effective work and thus might be best avoided.

Until then, I figure to watch closely, run and report in remaining work during the increasingly rare full functionality periods and once that clears, simply watch to see if a project learning process is demonstrated.

Alternatively, it might be part of a grand plan at the project to reduce problems by pushing away users until the traffic levels are low enough to be sustained.
ID: 1335501
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1335513 - Posted: 7 Feb 2013, 17:59:36 UTC - in response to Message 1335501.  


Woke up this morning to find both systems rapidly running out of work. Large backoffs due to not being able to connect to the Scheduler when it tries to do so.
Grant
Darwin NT
ID: 1335513
KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 944
Credit: 52,956,491
RAC: 67
United Kingdom
Message 1335515 - Posted: 7 Feb 2013, 18:07:27 UTC - in response to Message 1335513.  


Woke up this morning to find both systems rapidly running out of work. Large backoffs due to not being able to connect to the Scheduler when it tries to do so.

This is obviously the start of a 1920s Blues song.
"Woke up this morning,
Found I was nearly out of work.
Woke up this morning,
etc..."

ID: 1335515
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1335519 - Posted: 7 Feb 2013, 18:13:19 UTC
Last modified: 7 Feb 2013, 18:13:40 UTC

Can't report tasks, scheduler not responding. Anyone seeing the same now?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1335519
kittyman (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1335523 - Posted: 7 Feb 2013, 18:18:48 UTC - in response to Message 1335519.  

Can't report tasks, scheduler not responding. Anyone seeing the same now?

Been extremely hard to contact the scheduler for days now.....
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1335523
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1335531 - Posted: 7 Feb 2013, 18:36:53 UTC - in response to Message 1335519.  
Last modified: 7 Feb 2013, 18:44:51 UTC

Can't report tasks, scheduler not responding. Anyone seeing the same now?

For about 3-4 days now.


EDIT: and if you are able to contact the Scheduler & don't get a "Failure when receiving data from the peer" message, it takes 2-5 min to get a response.
Grant
Darwin NT
ID: 1335531
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1335544 - Posted: 7 Feb 2013, 19:05:48 UTC
Last modified: 7 Feb 2013, 19:44:55 UTC

Now here is a possible cause that super crunchers won't likely realize. Because the 100 unit max limit is less than what I would normally have based on queue size, the client is requesting more units every five minutes, only to be (when I get through) rebuffed due to the 100 unit limit. Now super crunchers are likely to actually have completed units to report every five minutes, but for me it's maybe only 2-3 units per hour on average. So 9 or 10 more requests than are required are sent to the server each hour. Now multiply that by all the hosts that are crunching as "slow" or slower than me (daily RAC ranks my host in the top 4,500-5,000 range) and that adds up to a lot of meaningless requests to the server.

Could that be causing a problem?
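
For anyone who wants to check the arithmetic above, here is a rough back-of-envelope sketch. It assumes exactly what the post states (a request every 5 minutes, 2-3 completed units per hour) and further assumes that only one request per completed unit is actually needed; the function name is made up for illustration and is not part of BOINC.

```python
# Back-of-envelope estimate of "pointless" scheduler requests per host.
# Assumptions (taken from the post, not measured): the client contacts the
# scheduler every 5 minutes, and a slower host completes 2-3 units per hour.


def redundant_requests_per_hour(request_interval_min, units_per_hour):
    """Requests per hour that have no completed unit to report.

    Hypothetical helper: assumes one request per completed unit is enough.
    """
    requests_per_hour = 60 / request_interval_min
    return max(requests_per_hour - units_per_hour, 0)


if __name__ == "__main__":
    for units in (2, 3):
        extra = redundant_requests_per_hour(5, units)
        print(f"{units} units/hour -> {extra:.0f} extra requests/hour")
```

With a 5 minute interval that is 12 requests per hour, so 2-3 finished units leave 10 or 9 requests with nothing to report, which matches the figure in the post.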
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1335544
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13715
Credit: 208,696,464
RAC: 304
Australia
Message 1335548 - Posted: 7 Feb 2013, 19:15:57 UTC - in response to Message 1335544.  

Could that be causing a problem?

It wouldn't be helping with the bandwidth issues, but it wouldn't be causing a problem in its own right.
We've had Scheduler issues on & off for months now, but the current one began about 4 days ago.
Prior to that, for a while at least, Scheduler responses were nice & quick. Now they take forever, if you can connect & if you don't get an error once you've done so.

Grant
Darwin NT
ID: 1335548
S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 212
Credit: 43,822,095
RAC: 0
Netherlands
Message 1335551 - Posted: 7 Feb 2013, 19:18:34 UTC

Sometime this afternoon the server "granted" my main rig 200 WUs ... only to discover that all the GPU ones were shorties... increasing the server load even more :S
ID: 1335551
ExchangeMan
Volunteer tester
Joined: 9 Jan 00
Posts: 115
Credit: 157,719,104
RAC: 0
United States
Message 1335558 - Posted: 7 Feb 2013, 19:41:36 UTC - in response to Message 1335551.  

Sometime this afternoon the server "granted" my main rig 200 WUs ... only to discover that all the GPU ones were shorties... increasing the server load even more :S

I hate it when I get tons of shorties. Too much bandwidth for too little credit.

ID: 1335558
BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 1335560 - Posted: 7 Feb 2013, 19:50:47 UTC - in response to Message 1335558.  

Given the evidence of the past many months, it seems that there is no solution envisioned or affordable for the clear bandwidth choke point, so perhaps instead some software filtering solution might be a reasonable approach to mitigating the obvious problem.

Of course, there is another approach, which takes no effort at all: as people realize that the project reliability has in fact been seriously compromised due to traffic levels, they may do as Greeley suggested ('go elsewhere, dear user'). There are in fact many other more reliable and interesting projects out there.



Sometime this afternoon the server "granted" my main rig 200 WUs ... only to discover that all the GPU ones were shorties... increasing the server load even more :S

I hate it when I get tons of shorties. Too much bandwidth for too little credit.

ID: 1335560
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1335561 - Posted: 7 Feb 2013, 19:51:14 UTC - in response to Message 1335548.  
Last modified: 7 Feb 2013, 19:51:54 UTC

Could that be causing a problem?

It wouldn't be helping with the bandwidth issues, but it wouldn't be causing a problem in its own right.
We've had Scheduler issues on & off for months now, but the current one began about 4 days ago.
Prior to that, for a while at least, Scheduler responses were nice & quick. Now they take forever, if you can connect & if you don't get an error once you've done so.


I know, but it's acting a lot like a memory leak. Everything stays responsive right up to the point the application starts hitting the swap space hard and it becomes slow and unresponsive.

Not saying it's a memory leak, just that it's acting as if some buffer that isn't being emptied quite as fast as it is being filled finally overflows, and it becomes a crapshoot whether the next item actually gets into the buffer or is lost.
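
To illustrate the kind of behaviour described above, here is a toy model of a fixed-size buffer that fills slightly faster than it drains. It is only a sketch of the general idea, not a model of the actual SETI@home scheduler; the function name, the capacity, and the rates are all made-up values chosen for illustration.

```python
# Toy model: a bounded buffer filled slightly faster than it is emptied.
# Nothing here reflects the real scheduler; all numbers are hypothetical.


def simulate_buffer(capacity, arrivals_per_tick, served_per_tick, ticks):
    """Return (first tick with a dropped item or None, total items dropped)."""
    depth = 0
    dropped = 0
    first_drop_tick = None
    for tick in range(ticks):
        depth += arrivals_per_tick
        if depth > capacity:
            # Anything beyond capacity is lost rather than queued.
            dropped += depth - capacity
            depth = capacity
            if first_drop_tick is None:
                first_drop_tick = tick
        # Drain what the server can handle this tick.
        depth = max(depth - served_per_tick, 0)
    return first_drop_tick, dropped


if __name__ == "__main__":
    # Balanced load: the buffer never fills and nothing is lost.
    print(simulate_buffer(100, 10, 10, 200))
    # One extra arrival per tick: fine for a long while, then steady losses.
    print(simulate_buffer(100, 11, 10, 200))
```

In the second run nothing is dropped for the first 90 ticks, and after that an item is lost on every tick, which is the "responsive right up to the point it isn't" pattern described in the post.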
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1335561


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.