Panic Mode On (57) Server problems?

Message boards : Number crunching : Panic Mode On (57) Server problems?
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1158065 - Posted: 2 Oct 2011, 0:25:49 UTC

Server status.....


ID: 1158065
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158090 - Posted: 2 Oct 2011, 1:34:15 UTC - in response to Message 1158065.  


Still no response from the Scheduler.
Grant
Darwin NT
ID: 1158090
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1158095 - Posted: 2 Oct 2011, 1:43:20 UTC

The Scheduler responds for me, but I'm getting nothing but "no tasks available." What I find sketchy is that the ready-to-send queue is building and the splitters are steaming right along, yet Cricket is pretty calm and there are lots of "no tasks available" replies. Did they drop the feeder cache to something like 10 every 2 seconds?
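That guessed-at feeder rate is easy to put in numbers. A quick sketch (note that "10 every 2 seconds" is the poster's speculation, not a confirmed server setting):

```python
# Back-of-envelope for the speculated feeder rate above.
# Both figures are the poster's guess, not confirmed server settings.

slots_per_refill = 10      # hypothetical feeder slots refilled per cycle
refill_interval_s = 2      # hypothetical refill interval in seconds

max_tasks_per_hour = slots_per_refill / refill_interval_s * 3600
print(max_tasks_per_hour)  # 18000.0 - an upper bound on hourly task issue
```

If those numbers were right, at most 18,000 tasks could go out per hour no matter how full the ready-to-send queue is, which would match a calm Cricket graph alongside busy splitters.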
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving up)
ID: 1158095
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1158097 - Posted: 2 Oct 2011, 1:51:52 UTC - in response to Message 1158095.  

Well, I'm connecting and receiving work again, so I'm happy.

Cheers.
ID: 1158097
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158100 - Posted: 2 Oct 2011, 1:55:42 UTC - in response to Message 1158097.  

> Well, I'm connecting and receiving work again, so I'm happy.
>
> Cheers.

Tried it again just then. This time it connected.
The project still has no tasks available, though.
Grant
Darwin NT
ID: 1158100
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158101 - Posted: 2 Oct 2011, 1:57:43 UTC - in response to Message 1158100.  
Last modified: 2 Oct 2011, 2:04:00 UTC

> > Well, I'm connecting and receiving work again, so I'm happy.
> >
> > Cheers.
>
> Tried it again just then. This time it connected.
> The project still has no tasks available, though.

EDIT: just tried it on the other machine & no response from the Scheduler.
Grant
Darwin NT
ID: 1158101
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1158103 - Posted: 2 Oct 2011, 1:59:25 UTC - in response to Message 1158101.  
Last modified: 2 Oct 2011, 2:03:46 UTC

> > > Well, I'm connecting and receiving work again, so I'm happy.
> > >
> > > Cheers.
> >
> > Tried it again just then. This time it connected.
> > The project still has no tasks available, though.
>
> EDIT: just tried it on the other machine & no response from the Scheduler.

EDIT again: just tried it again & not only got a response but got 30 WUs.

Things really are seriously screwed. Not just the flaky router but the servers as well.

Yet another EDIT: the 1st machine got a "server returned nothing (no headers, no data)" response from the Scheduler.
The next time it tried, it got a couple of WUs.
Grant
Darwin NT
ID: 1158103
Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1158134 - Posted: 2 Oct 2011, 3:44:32 UTC

If I were the admin in this situation, I would be completely incoherent. These are the times that sysadmins live on Cheetos and Mountain Dew.

Or Kit Kats and Coca-Cola (sugar and caffeine, mmmmmmmm).
ID: 1158134
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1158162 - Posted: 2 Oct 2011, 7:43:18 UTC

> > Team,
> >
> > The following messages are typical today...
> >
> >     10/1/2011 7:40:19 PM | SETI@home | update requested by user
> >     10/1/2011 7:40:21 PM | SETI@home | Sending scheduler request: Requested by user.
> >     10/1/2011 7:40:21 PM | SETI@home | Requesting new tasks for NVIDIA GPU
> >     10/1/2011 7:40:25 PM | SETI@home | Scheduler request completed: got 0 new tasks
> >     10/1/2011 7:40:25 PM | SETI@home | No tasks sent
> >     10/1/2011 7:40:25 PM | SETI@home | No tasks are available for SETI@home Enhanced
> >     10/1/2011 7:40:25 PM | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
> >
> > Jeff
>
> I wonder if that means that there are a lot of VLARs around (VLARs aren't sent to Nvidia GPUs).

Of the 88 units I've downloaded overnight, there were only 6 VLARs.
Most were VHARs, plus a few mid-range units.

With each crime and every kindness we birth our future.
ID: 1158162
soft^spirit
Joined: 18 May 99
Posts: 6497
Credit: 34,134,168
RAC: 0
United States
Message 1158175 - Posted: 2 Oct 2011, 9:05:17 UTC

Scheduler attempts are flaky at best. Some go through, others hit various errors or fail to connect; of those that go through, some yield units and some don't. All in all, the cache is building slowly, and uploads and downloads are flying through for me.


Janice
ID: 1158175
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1158177 - Posted: 2 Oct 2011, 9:21:00 UTC - in response to Message 1158162.  

> > > Team,
> > >
> > > The following messages are typical today...
> > >
> > >     10/1/2011 7:40:19 PM | SETI@home | update requested by user
> > >     10/1/2011 7:40:21 PM | SETI@home | Sending scheduler request: Requested by user.
> > >     10/1/2011 7:40:21 PM | SETI@home | Requesting new tasks for NVIDIA GPU
> > >     10/1/2011 7:40:25 PM | SETI@home | Scheduler request completed: got 0 new tasks
> > >     10/1/2011 7:40:25 PM | SETI@home | No tasks sent
> > >     10/1/2011 7:40:25 PM | SETI@home | No tasks are available for SETI@home Enhanced
> > >     10/1/2011 7:40:25 PM | SETI@home | Tasks for CPU are available, but your preferences are set to not accept them
> > >
> > > Jeff
> >
> > I wonder if that means that there are a lot of VLARs around (VLARs aren't sent to Nvidia GPUs).
>
> Of the 88 units I've downloaded overnight, there were only 6 VLARs.
> Most were VHARs, plus a few mid-range units.

I think Jeff should check his project preferences, make sure that SETI@home Enhanced is still selected, and then report back.

Claggy
ID: 1158177
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158201 - Posted: 2 Oct 2011, 11:13:18 UTC

Ooh, ohhh - I think there is trouble ahead. I have never seen these values at this level: Results waiting for db purging: 10,049,030 MB#, 113,856 AP#.
And db_purge.x86_64 on vader is at "Disabled".
How many WUs can this database handle?

__W__
_______________________________________________________________________________
ID: 1158201
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1158209 - Posted: 2 Oct 2011, 12:04:16 UTC - in response to Message 1158201.  

> Ooh, ohhh - I think there is trouble ahead. I have never seen these values at this level: Results waiting for db purging: 10,049,030 MB#, 113,856 AP#.
> And db_purge.x86_64 on vader is at "Disabled".
> How many WUs can this database handle?
>
> __W__

That's a good question. From what I know of databases in general, I doubt we'll hit any problems with the absolute numbers - databases are designed to grow until they fill all the available disk space.

But what happens is that they get slower and slower as they get bigger. In particular for SETI, Matt has mentioned in the past that when the database gets too big for key tables (and indexes) to be held in RAM, performance is really hammered by having to swap pages at disk I/O speed. I think I'm seeing that already whenever I look at the full task list for one of my hosts.
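The effect is easy to see in a back-of-envelope sketch. Every figure here (entry size, buffer size, I/O costs) is an illustrative assumption, not a measured SETI@home number:

```python
# Rough model: cost of index lookups once a B-tree no longer fits in RAM.
# All constants are illustrative assumptions, not real server figures.

ROWS = 10_000_000               # order of the "waiting for db purging" count
BYTES_PER_ENTRY = 64            # assumed key + row pointer + B-tree overhead
BUFFER_POOL = 512 * 1024 ** 2   # assume 512 MB of RAM cache for this index

index_bytes = ROWS * BYTES_PER_ENTRY
fits_in_ram = index_bytes <= BUFFER_POOL

# ~100 ns for a RAM-cached page vs ~10 ms for a random disk seek
lookup_s = 100e-9 if fits_in_ram else 10e-3
print(f"index: {index_bytes / 1024 ** 2:.0f} MB, fits in RAM: {fits_in_ram}")
print(f"1000 random lookups: ~{1000 * lookup_s:.1f} s")
```

With these assumed numbers the index is roughly 610 MB, spills out of the 512 MB cache, and a thousand random lookups jump from microseconds of RAM time to around 10 seconds of disk seeks - the "hammered at disk I/O speed" behaviour described above.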
ID: 1158209
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1158211 - Posted: 2 Oct 2011, 12:07:22 UTC - in response to Message 1158209.  

> > Ooh, ohhh - I think there is trouble ahead. I have never seen these values at this level: Results waiting for db purging: 10,049,030 MB#, 113,856 AP#.
> > And db_purge.x86_64 on vader is at "Disabled".
> > How many WUs can this database handle?
> >
> > __W__
>
> That's a good question. From what I know of databases in general, I doubt we'll hit any problems with the absolute numbers - databases are designed to grow until they fill all the available disk space.
>
> But what happens is that they get slower and slower as they get bigger. In particular for SETI, Matt has mentioned in the past that when the database gets too big for key tables (and indexes) to be held in RAM, performance is really hammered by having to swap pages at disk I/O speed. I think I'm seeing that already whenever I look at the full task list for one of my hosts.

I have noticed the slow response when viewing task lists as well.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1158211
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158212 - Posted: 2 Oct 2011, 12:13:32 UTC - in response to Message 1158209.  

This may be the reason why some crunchers could not get WUs up to the limits, despite Cricket not being at a high level - at least the connection to Berkeley is now good and fast from my point of view.

__W__
_______________________________________________________________________________
ID: 1158212
JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1158214 - Posted: 2 Oct 2011, 12:32:20 UTC

> So.. I keep hearing about this limit of 50/CPU... my single-core machine has 99 in progress and the messages tab says nothing about a limit. Hm. Now it's 100, still no message. Either it's a glitch, or it's because a 10-day cache on this machine is somewhere between 90-110 MBs on average, at least with the weird estimates and most of them being shorties.

I keep hitting the limit on 2 PCs. What I don't understand is why BOINC still keeps asking for CPU work; just now it asked for new CPU work 5 times in a row. Wouldn't it be better if it waited until a CPU task finished before asking for more?
ID: 1158214
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158220 - Posted: 2 Oct 2011, 12:40:23 UTC - in response to Message 1158214.  

> What I don't understand is why BOINC still keeps asking for CPU work; just now it asked for new CPU work 5 times in a row. Wouldn't it be better if it waited until a CPU task finished before asking for more?

Your cache is set to a higher level, so BOINC has to ask for more WUs. The 50/400 WU limits are set server-side, so BOINC doesn't "know" about them and can't take them into account.
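The mismatch can be sketched as a toy model (hypothetical numbers, not real BOINC code): the client only compares its cache against its own target, while the limit lives entirely on the server:

```python
# Toy model of the client/server mismatch described above - not real BOINC code.

SERVER_LIMIT = 50  # hypothetical server-side per-CPU in-progress limit

def client_request(cache_target, in_progress):
    """The client asks for whatever would fill its cache; it cannot
    see SERVER_LIMIT, so it keeps asking even when already capped."""
    return max(0, cache_target - in_progress)

def server_reply(in_progress, requested):
    """The server grants work only while the host is under its limit."""
    return min(requested, max(0, SERVER_LIMIT - in_progress))

in_progress = 50                         # host already at the limit
want = client_request(100, in_progress)  # a big cache still wants 50 more
got = server_reply(in_progress, want)    # the server sends nothing
print(want, got)                         # 50 0
```

Because `client_request` never sees `SERVER_LIMIT`, the loop of "ask for 50, get 0" repeats until either the cache target shrinks or a finished task drops `in_progress` below the limit.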

__W__
_______________________________________________________________________________
ID: 1158220
Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1158221 - Posted: 2 Oct 2011, 12:46:49 UTC - in response to Message 1158214.  

> > So.. I keep hearing about this limit of 50/CPU... my single-core machine has 99 in progress and the messages tab says nothing about a limit. Hm. Now it's 100, still no message. Either it's a glitch, or it's because a 10-day cache on this machine is somewhere between 90-110 MBs on average, at least with the weird estimates and most of them being shorties.
>
> I keep hitting the limit on 2 PCs. What I don't understand is why BOINC still keeps asking for CPU work; just now it asked for new CPU work 5 times in a row. Wouldn't it be better if it waited until a CPU task finished before asking for more?

BOINC hasn't been programmed to take any notice of the reason why no work is issued - even when the reason is stated (it isn't always).

The newer BOINC v6.12.xx versions are programmed to back off and ask less frequently when no work is forthcoming - but the backoff is reset to zero when a task completes, and BOINC allows time for the just-completed task to upload and be available for reporting before requesting new work. So its behaviour is quite close to what you're suggesting: if you have reached the quota limit, there's a fair likelihood that your next scheduler request will be 'report one, get one replacement'.

You, on the other hand, are running BOINC v6.10.58/60 - that version simply knows how much work you've said you'd like to have, and keeps asking 'more, more, more' even when the server is replying 'no, no, no'. If you are concerned about the strain your repeated fruitless requests place on the servers, you could consider upgrading to BOINC v6.12.34, or temporarily reduce your cache size to something that matches the current (temporary) quota limit more closely.
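The back-off-and-reset behaviour described for v6.12 can be sketched like this (a simplification; the bounds and plain doubling are assumptions, not BOINC's actual constants):

```python
# Sketch of the back-off-and-reset behaviour described above.
# MIN_S/MAX_S and plain doubling are assumptions, not BOINC's real constants.

class WorkFetchBackoff:
    MIN_S = 60           # assumed shortest wait after an empty reply
    MAX_S = 4 * 3600     # assumed cap on the wait

    def __init__(self):
        self.delay = 0   # seconds until the next work request

    def request_failed(self):
        """No work granted: double the wait, clamped to [MIN_S, MAX_S]."""
        self.delay = min(self.MAX_S, max(self.MIN_S, self.delay * 2))

    def task_completed(self):
        """A finished task frees quota, so the next request goes out at once."""
        self.delay = 0

b = WorkFetchBackoff()
for _ in range(3):
    b.request_failed()
print(b.delay)      # 240: three empty replies -> 60, 120, 240 seconds
b.task_completed()
print(b.delay)      # 0: backoff reset by the completed task
```

The reset on completion is what makes the effective behaviour "report one, get one replacement": while work is flowing, the backoff never grows, and it only stretches out during genuine droughts.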
ID: 1158221
JohnDK
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1158226 - Posted: 2 Oct 2011, 12:55:49 UTC

I don't like v6.12.* so I will reduce my cache setting for now :)
ID: 1158226
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1158227 - Posted: 2 Oct 2011, 13:03:08 UTC - in response to Message 1158221.  

> BOINC hasn't been programmed to take any notice of the reason why no work is issued - even when the reason is stated (it isn't always).
>
> The newer BOINC v6.12.xx versions are programmed to back off and ask less frequently when no work is forthcoming - but the backoff is reset to zero when a task completes, and BOINC allows time for the just-completed task to upload and be available for reporting before requesting new work. So its behaviour is quite close to what you're suggesting: if you have reached the quota limit, there's a fair likelihood that your next scheduler request will be 'report one, get one replacement'.
>
> You, on the other hand, are running BOINC v6.10.58/60 - that version simply knows how much work you've said you'd like to have, and keeps asking 'more, more, more' even when the server is replying 'no, no, no'. If you are concerned about the strain your repeated fruitless requests place on the servers, you could consider upgrading to BOINC v6.12.34, or temporarily reduce your cache size to something that matches the current (temporary) quota limit more closely.

That's the much more detailed explanation, thanks ;-)
__W__

_______________________________________________________________________________
ID: 1158227
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.