Panic Mode On (78) Server Problems?


Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 731
Credit: 22,048,905
RAC: 23,705
United States
Message 1302768 - Posted: 6 Nov 2012, 8:49:06 UTC
Last modified: 6 Nov 2012, 9:26:06 UTC

If I remember correctly, the last time it was max. 50 WUs/CPU-thread and 400 WUs/GPU in BOINC.

So my Intel Core2 Duo E7600 with NVIDIA GeForce GTX260 should get (50 x 2) + 400 = 500 WUs - maybe also this time.


Not even sure it is 50 per core. I'm down to 230 CPU tasks for my hexcore, and still get the limit message. That's with a CPU-only request. I'm only using 5 for crunching, but if that value is used I'd still get 250. I'm way over 400 on GPU tasks, and hate to see a limit there. I won't be able to handle much more than a one day outage, especially if there are a lot of shorties. New card is faster than the one I was running the last time we had these limits.
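
For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the limits really are 50 per CPU thread and 400 per GPU as remembered above; nobody at the lab has confirmed those numbers for this round, and the function name and defaults are made up for illustration.

```python
def expected_task_limit(cpu_threads: int, gpus: int,
                        per_cpu: int = 50, per_gpu: int = 400) -> int:
    """Return the in-progress cap for one host under the remembered limits.

    per_cpu and per_gpu are the values recalled in this thread, not
    confirmed project settings.
    """
    return cpu_threads * per_cpu + gpus * per_gpu


print(expected_task_limit(2, 1))  # Core2 Duo plus one GTX260
print(expected_task_limit(6, 0))  # hexcore, CPU tasks only
print(expected_task_limit(5, 0))  # same host if only 5 cores count
```

With those assumptions the three calls give 500, 300 and 250, which matches the figures quoted in the posts. Being stuck at 230 CPU tasks is below even the 250 case, so either the per-core value is lower than 50 or something else is being counted.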

Haven't seen more timeouts like I reported earlier in the thread.
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 731
Credit: 22,048,905
RAC: 23,705
United States
Message 1302774 - Posted: 6 Nov 2012, 9:02:31 UTC - in response to Message 1302761.
Last modified: 6 Nov 2012, 9:27:50 UTC

I don't know how it is this time (no admin announced it).

All I know is that Richard Haselgrove posted this earlier in this thread:
(Message 1302257)

I've just had a note back from Eric:

I've stopped the splitters and doubled the httpd timeout...

I think we're going to need to at least temporarily go back
to restricting workunits in progress on a per host basis and per RPC
basis, regardless of what complaints we get about people being unable
to keep their hosts busy.

Of course more work was done Monday, but I don't know what was done.
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,923,904
RAC: 13,577
United Kingdom
Message 1302777 - Posted: 6 Nov 2012, 9:24:02 UTC

Oh dear. I think they've turned on too much, too quickly. I've just created 18 new ghosts.

Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 731
Credit: 22,048,905
RAC: 23,705
United States
Message 1302780 - Posted: 6 Nov 2012, 9:50:02 UTC

Oh dear. I think they've turned on too much, too quickly. I've just created 18 new ghosts.
Still doing the ghost thing? Getting timeouts? +1 on the too much too quick comment.

I've said this before, but the runaway hosts and large number of "Results" in the field last weekend were because Scheduler was assigning new work to hosts that had ghosts that needed to be resent. I don't recall it doing that in the past. Must have been a recent change, so maybe they can find it (but scheduler code is probably a tangled web by now). Go back to "ghosts first" and you don't need steps like these limits.
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

Bernie Vine
Volunteer tester
Joined: 26 May 99
Posts: 6601
Credit: 22,380,512
RAC: 14,808
United Kingdom
Message 1302783 - Posted: 6 Nov 2012, 10:03:20 UTC - in response to Message 1302777.
Last modified: 6 Nov 2012, 10:03:47 UTC

Oh dear. I think they've turned on too much, too quickly. I've just created 18 new ghosts.

Yes. It also might have been nice to get just a few words from someone at the lab about what they have done and why.

I realise that only the people who come here would see, but there is a hardcore who would like to know if their efforts are worth it. I feel ignored here.

Unfortunately I think the project is trying to do too much with not enough staff.

Sadly, I feel my time and electricity are better used elsewhere.
____________


Today is life, the only life we're sure of. Make the most of today.

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,923,904
RAC: 13,577
United Kingdom
Message 1302791 - Posted: 6 Nov 2012, 10:38:04 UTC - in response to Message 1302780.

Oh dear. I think they've turned on too much, too quickly. I've just created 18 new ghosts.

Still doing the ghost thing? Getting timeouts? +1 on the too much too quick comment.

I've said this before, but the runaway hosts and large number of "Results" in the field last weekend were because Scheduler was assigning new work to hosts that had ghosts that needed to be resent. I don't recall it doing that in the past. Must have been a recent change, so maybe they can find it (but scheduler code is probably a tangled web by now). Go back to "ghosts first" and you don't need steps like these limits.

That particular host didn't have any ghosts before the experiment, which is why I tried it first. I did get all 18 of them resent at the next attempt.

I have the beginnings of another theory. Matt has commented in the past that database performance drops off dramatically when the table size grows beyond what can fit in memory (so data has to be fetched from the physical disks when called for). If a particular host asks for work for the first time in a while, the request takes a long time because of all the disk thrashing. But if you ask again a few minutes later, the host records are still in memory and haven't been overwritten by other hosts. So the second attempt has a better chance of succeeding.

After compaction of the database during maintenance tonight, and a few more days of quota, we might see an improvement. If not, back to the drawing board...
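
To make the theory concrete, here is a toy model only, not a description of how the project's database actually caches rows. It assumes a simple least-recently-used policy and a made-up `HostCache` class: the first request for a host goes to disk, an immediate retry is served from memory, and the row is pushed out again once enough other hosts have been through.

```python
from collections import OrderedDict


class HostCache:
    """Toy LRU cache standing in for database rows held in memory."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.rows = OrderedDict()

    def fetch(self, host_id: int) -> str:
        """Return 'memory' on a hit, or 'disk' on a miss (then cache the row)."""
        if host_id in self.rows:
            self.rows.move_to_end(host_id)  # mark as most recently used
            return "memory"
        self.rows[host_id] = True
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)   # evict the least recently used row
        return "disk"


cache = HostCache(capacity=3)
first = cache.fetch(42)    # cold request: slow path
second = cache.fetch(42)   # retry a few minutes later: still cached
for other_host in (1, 2, 3):
    cache.fetch(other_host)  # other hosts crowd the cache
later = cache.fetch(42)    # long gap: row has been evicted again
print(first, second, later)
```

In this model the quick retry is the only request that avoids the disk, which is the behaviour the theory predicts for a second scheduler request made shortly after a timeout.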

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1302795 - Posted: 6 Nov 2012, 10:46:20 UTC

Another newbie question.
Is there a way to easily see how many ghost tasks I have?

Frank

Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 731
Credit: 22,048,905
RAC: 23,705
United States
Message 1302798 - Posted: 6 Nov 2012, 11:02:32 UTC

Another newbie question.
Is there a way to easily see how many ghost tasks I have?


Easiest way is BOINC Tasks. It will give you task counts and the sum of estimated completion times for each device on the Projects tab, and a split by application /status on the Tasks tab. Compare the cpu + gpu counts to the website counts to determine how many ghosts you have. This is how I can tell at a glance that I'm below the CPU limit. It enables you to monitor and control all of your computers from one host using the IP addresses. Test drive it on one of your hosts - think you'll like the improved user interface.

There are also some commands to count the tasks in BOINC Mgr, but I've forgotten them. Anyone?
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1302803 - Posted: 6 Nov 2012, 11:11:44 UTC - in response to Message 1302798.

Another newbie question.
Is there a way to easily see how many ghost tasks I have?


Easiest way is BOINC Tasks. It will give you task counts and the sum of estimated completion times for each device on the Projects tab, and a split by application /status on the Tasks tab. Compare the cpu + gpu counts to the website counts to determine how many ghosts you have. This is how I can tell at a glance that I'm below the CPU limit. It enables you to monitor and control all of your computers from one host using the IP addresses. Test drive it on one of your hosts - think you'll like the improved user interface.

There are also some commands to count the tasks in BOINC Mgr, but I've forgotten them. Anyone?


I am using Boinc Tasks and really like it. Unless I am missing something, I still have to go to the web site to get the total in progress count.

Thanks...Frank

[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1133
Credit: 3,085,419
RAC: 3,452
United Kingdom
Message 1302805 - Posted: 6 Nov 2012, 11:19:20 UTC
Last modified: 6 Nov 2012, 12:08:58 UTC

For me, it is mainly downloads that I am having trouble with now. I had 10 waiting to download, mainly with HTTP errors, so hopefully it will sort itself out. In the end I had to hit the retry button, and they just downloaded very quickly.
____________

Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 731
Credit: 22,048,905
RAC: 23,705
United States
Message 1302807 - Posted: 6 Nov 2012, 11:19:55 UTC - in response to Message 1302803.

.

I am using Boinc Tasks and really like it. Unless I am missing something, I still have to go to the web site to get the total in progress count..

No, you have to check the website and compare to what BOINC Tasks shows. BOINC doesn't know anything about ghosts - it never got word of the assignment and instructions to download them. So, you have to compare BT's numbers with the project's numbers on the website.
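
The comparison is just a subtraction per device. A minimal sketch, with made-up counts and a hypothetical helper name; you read the website numbers off your host's task page and the local numbers off BoincTasks:

```python
def ghost_counts(website: dict, local: dict) -> dict:
    """Website in-progress count minus local count, per device.

    Negative differences are clamped to zero, since the local client can
    briefly hold tasks the website page has not caught up with.
    """
    return {device: max(0, website[device] - local.get(device, 0))
            for device in website}


# Hypothetical numbers for illustration only.
website = {"cpu": 230, "gpu": 418}
local = {"cpu": 230, "gpu": 400}
ghosts = ghost_counts(website, local)
print(ghosts)
```

Here the CPU counts agree and the GPU side shows 18 tasks that the server thinks were sent but the client never received.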
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 731
Credit: 22,048,905
RAC: 23,705
United States
Message 1302821 - Posted: 6 Nov 2012, 13:03:14 UTC

I'm now down to 204 CPU tasks vs. the 300 that we think is the limit for 6 cores, but I'm still getting the limits message. That's true whether the work request is CPU only or CPU+GPU. Is anyone else experiencing this?
____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1302824 - Posted: 6 Nov 2012, 13:14:20 UTC - in response to Message 1302821.

I'm now down to 204 CPU tasks vs. the 300 that we think is the limit for 6 cores, but I'm still getting the limits message. That's true whether the work request is CPU only or CPU+GPU. Is anyone else experiencing this?


More or less. Sitting here trying to figure it out and not having much luck at it.

S@NL - John van Gorsel
Volunteer tester
Joined: 5 Jul 99
Posts: 188
Credit: 136,399,573
RAC: 6,569
Netherlands
Message 1302828 - Posted: 6 Nov 2012, 13:35:25 UTC - in response to Message 1302821.

I'm now down to 204 CPU tasks vs. the 300 that we think is the limit for 6 cores, but I'm still getting the limits message.


I assume that the limit is based on the number of tasks as reported on the account page, so including "ghosts".
Chances are that you still get the "limit reached" message when you are completely out of work...
____________


Seti@Netherlands website

Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 731
Credit: 22,048,905
RAC: 23,705
United States
Message 1302831 - Posted: 6 Nov 2012, 13:43:16 UTC

I'm now down to 204 CPU tasks vs. the 300 that we think is the limit for 6 cores, but I'm still getting the limits message.


I assume that the limit is based on the number of tasks as reported on the account page, so including "ghosts".
Chances are that you still get the "limit reached" message when you are completely out of work...

I don't have any ghosts, results pending report, or downloads in progress. Both BoincTasks and the website's task page show 1661 tasks in progress. But I agree if someone has ghosts, the enforcement of the limit would be based on what Scheduler thinks you should have, whether downloaded or not.

____________
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 3567
Credit: 97,865,566
RAC: 78,767
United States
Message 1302835 - Posted: 6 Nov 2012, 14:03:48 UTC - in response to Message 1302798.
Last modified: 6 Nov 2012, 14:06:12 UTC

Another newbie question.
Is there a way to easily see how many ghost tasks I have?


Easiest way is BOINC Tasks. It will give you task counts and the sum of estimated completion times for each device on the Projects tab, and a split by application /status on the Tasks tab. Compare the cpu + gpu counts to the website counts to determine how many ghosts you have. This is how I can tell at a glance that I'm below the CPU limit. It enables you to monitor and control all of your computers from one host using the IP addresses. Test drive it on one of your hosts - think you'll like the improved user interface.

There are also some commands to count the tasks in BOINC Mgr, but I've forgotten them. Anyone?

I don't know about the manager but you can get your task count from the command line.
1) From the command line, type: boinccmd --get_tasks
2) Then compare the number of tasks it lists to the "In progress" count on the host's task page on the website.
http://www.hal6000.com/seti/images/check_ghost.png
If your local count is lower than the server count you should call the Ghostbusters.
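
If you don't want to count by eye, the two steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not an official tool: it assumes each task block in the `boinccmd --get_tasks` output contains exactly one line that starts with "name:" once leading spaces are stripped, so check that against what your own client prints. The helper names and the sample text are made up.

```python
import subprocess


def count_tasks(get_tasks_output: str) -> int:
    """Count task entries in text produced by `boinccmd --get_tasks`.

    Assumption: one stripped line starting with "name:" per task.
    Lines such as "WU name:" do not match and are not counted.
    """
    return sum(1 for line in get_tasks_output.splitlines()
               if line.strip().startswith("name:"))


def local_task_count() -> int:
    """Run boinccmd (it must be on PATH) and count the tasks it lists."""
    result = subprocess.run(["boinccmd", "--get_tasks"],
                            capture_output=True, text=True, check=True)
    return count_tasks(result.stdout)


# Hypothetical, heavily trimmed sample used only to exercise the parser.
sample = """\
1) -----------
   name: task_a_0
   WU name: task_a
2) -----------
   name: task_b_1
   WU name: task_b
"""
print(count_tasks(sample))
```

Call `local_task_count()` on the host itself and subtract the result from the website's In progress figure. Note that the command lists tasks for every attached project, so if you crunch for more than SETI@home the total will need filtering before you compare.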

EDIT: I guess I have been lucky. Power didn't go out even though we are right in the middle of the path that Hurricane Sandy took. No real problems uploading, downloading, reporting, or requesting work.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8275
Credit: 44,923,904
RAC: 13,577
United Kingdom
Message 1302837 - Posted: 6 Nov 2012, 14:11:15 UTC - in response to Message 1302831.

I'm now down to 204 CPU tasks vs. the 300 that we think is the limit for 6 cores, but I'm still getting the limits message.

I assume that the limit is based on the number of tasks as reported on the account page, so including "ghosts".
Chances are that you still get the "limit reached" message when you are completely out of work...

I don't have any ghosts, results pending report, or downloads in progress. Both BoincTasks and the website's task page show 1661 tasks in progress. But I agree if someone has ghosts, the enforcement of the limit would be based on what Scheduler thinks you should have, whether downloaded or not.

Actually, that's not the way it worked last time we were running a quota. My message 1161932 was terse:

One blessing is that ghosts don't count towards the 'tasks in progress' quota limit.

but I remember checking carefully before I posted. At that time - not saying it's necessarily the same now - the quota was calculated on the basis of the tasks that the host reported that it had.

Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,236,055
RAC: 0
United States
Message 1302900 - Posted: 6 Nov 2012, 23:36:02 UTC

Tue 06 Nov 2012 06:33:18 PM EST | SETI@home | update requested by user
Tue 06 Nov 2012 06:33:21 PM EST | SETI@home | Sending scheduler request: Requested by user.
Tue 06 Nov 2012 06:33:21 PM EST | SETI@home | Reporting 72 completed tasks
Tue 06 Nov 2012 06:33:21 PM EST | SETI@home | Requesting new tasks for CPU and NVIDIA
Tue 06 Nov 2012 06:34:05 PM EST | SETI@home | Scheduler request failed: HTTP internal server error


ut ohh

BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 11,442,882
RAC: 5,651
United States
Message 1302906 - Posted: 6 Nov 2012, 23:54:46 UTC - in response to Message 1302900.

You might consider not requesting new work until the folks back at the farm figure out what has been in 'mangled condition' regarding the scheduler for the past week or so.

That's what other BOINC projects are for <smile>


Tue 06 Nov 2012 06:33:18 PM EST | SETI@home | update requested by user
Tue 06 Nov 2012 06:33:21 PM EST | SETI@home | Sending scheduler request: Requested by user.
Tue 06 Nov 2012 06:33:21 PM EST | SETI@home | Reporting 72 completed tasks
Tue 06 Nov 2012 06:33:21 PM EST | SETI@home | Requesting new tasks for CPU and NVIDIA
Tue 06 Nov 2012 06:34:05 PM EST | SETI@home | Scheduler request failed: HTTP internal server error


ut ohh

Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,236,055
RAC: 0
United States
Message 1302915 - Posted: 7 Nov 2012, 0:11:52 UTC

You might consider not requesting new work until the folks back at the farm figure out what has been in 'mangled condition' regarding the scheduler for the past week or so.


They had all day to figure that out. Looks like they haven't ...
Internal server error is an important message that justifies posting.
Your suggestion has been considered <roll>


Copyright © 2014 University of California