School (Feb 22 2011)


Message boards : Technical News : School (Feb 22 2011)

Profile Jeff Mercer
Joined: 14 Aug 08
Posts: 90
Credit: 162,139
RAC: 0
United States
Message 1081667 - Posted: 26 Feb 2011, 16:42:40 UTC

Uh oh... I think that I started something here! :>

Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1991
Credit: 10,993,484
RAC: 9,361
United States
Message 1081708 - Posted: 26 Feb 2011, 18:24:19 UTC

Umm, isn't it time to consider cutting down the mandatory wait time after a scheduler update? (to something less than 5 minutes...)
____________
.

rob smith
Project donor
Volunteer tester
Joined: 7 Mar 03
Posts: 8755
Credit: 61,654,715
RAC: 33,365
United Kingdom
Message 1081962 - Posted: 27 Feb 2011, 9:11:53 UTC

Hmm, someone must be calling into the lab over the weekend and loading new tapes as there are both MB and AP units available, and a few more of each to be split from loaded tapes. May I thank that person for going "above and beyond".
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Profile Kibble (KB7TIB)
Joined: 6 Dec 99
Posts: 21
Credit: 1,844,384
RAC: 2,222
United States
Message 1081980 - Posted: 27 Feb 2011, 9:38:32 UTC - in response to Message 1081708.

Umm, isn't it time to consider cutting down the mandatory wait time after a scheduler update? (to something less than 5 minutes...)


I'm not sure, but I suspect the mandatory wait time is there for limiting bandwidth issues. Regardless, it's something I can live with. (Don't want to hammer the servers too much while things are running as smoothly as they are now, after all.)

____________

Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,902,797
RAC: 257
Netherlands
Message 1082332 - Posted: 28 Feb 2011, 11:09:17 UTC - in response to Message 1081980.

Umm, isn't it time to consider cutting down the mandatory wait time after a scheduler update? (to something less than 5 minutes...)


I'm not sure, but I suspect the mandatory wait time is there for limiting bandwidth issues. Regardless, it's something I can live with. (Don't want to hammer the servers too much while things are running as smoothly as they are now, after all.)


Well, you just hit the nail on the head. Changing the 11 seconds(?) to 5 minutes was, I think, done partly to decrease the input and output of the upload and download servers!

100,000 or more hosts hammering these servers every few seconds was one of the reasons to change this. Those short 11 seconds, together with some older BOINC versions being changed to report results directly, amount to some kind of DDoS attack, IMHO!

And with the still-growing CUDA/CAL/OpenCL GPU processing, this will certainly be reviewed sometime in the (near) future.

I'm not an ICT specialist, but with the ever-increasing (Moore's Law) demand for faster and more efficient hardware and software, it's a matter of time before the new servers can't keep up with the demand for new work anymore and another expansion is needed!
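
For a rough sense of scale, the arithmetic behind the 11-second versus 5-minute comparison can be sketched as below. This is a worst-case illustration only: it assumes all 100,000 hosts contact the scheduler as often as the delay allows, which real hosts would not do, and the function name is made up for this example.

```python
def peak_requests_per_second(hosts: int, min_delay_seconds: float) -> float:
    """Upper bound on scheduler requests per second, assuming every host
    contacts the server once per allowed interval (worst case)."""
    return hosts / min_delay_seconds


HOSTS = 100_000  # figure quoted in the post above

rate_11s = peak_requests_per_second(HOSTS, 11)       # old 11-second delay
rate_5min = peak_requests_per_second(HOSTS, 5 * 60)  # 5-minute delay

print(f"11 s delay:  at most {rate_11s:,.0f} requests/s")
print(f"5 min delay: at most {rate_5min:,.0f} requests/s")
print(f"reduction:   about {rate_11s / rate_5min:.0f}x")
```

Under that assumption the longer delay cuts the ceiling on request traffic by roughly the ratio of the two intervals, whatever the real host count is.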

____________

Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1991
Credit: 10,993,484
RAC: 9,361
United States
Message 1082387 - Posted: 28 Feb 2011, 17:09:08 UTC
Last modified: 28 Feb 2011, 17:14:23 UTC

I was not suggesting a return to the 11 second delay - just somethin' shorter than 5 minutes! (like 2-3 minutes...)

IIRC, the 5 minute delay was implemented when the project was taking 3 day outages, to (yes) decrease the number of simultaneous connections after the restoration of downloading... but now that the three day outage weeks are over, (we hope!) it should be time to cut back that delay.
____________
.

Profile Kibble (KB7TIB)
Joined: 6 Dec 99
Posts: 21
Credit: 1,844,384
RAC: 2,222
United States
Message 1082412 - Posted: 28 Feb 2011, 18:33:02 UTC - in response to Message 1082387.
Last modified: 28 Feb 2011, 18:34:43 UTC

The procedure to increase the delay was kind of successful in that it decreased the entropy on the machinery as well as allowed for bandwidth to be opened up. Newer machines have been added to the mix in the lab, but the bandwidth is, if I'm not mistaken, the same. The planned three day outages may be a thing of the past, but occasional outages will still occur for various reasons. Don't forget that not all the servers were changed out, and plenty of stuff is happening in the background. It's not a good idea to fix something that isn't broken anyway. I consider the delay a real non-issue. If you really want to connect sooner you can always force the issue manually with the update button.
____________

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2327
Credit: 8,869,285
RAC: 683
United States
Message 1082427 - Posted: 28 Feb 2011, 19:17:51 UTC

The 5-minute delay was to reduce the load on jocelyn when we were waiting for the new servers to be spec'ed, ordered, and installed.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8767
Credit: 52,716,920
RAC: 16,500
United Kingdom
Message 1082458 - Posted: 28 Feb 2011, 20:56:44 UTC - in response to Message 1082427.

The 5-minute delay was to reduce the load on jocelyn when we were waiting for the new servers to be spec'ed, ordered, and installed.

But given the load we're putting on the system at the moment - with the cricket graph maxed out for seven of the last eight hours - I'd suggest that it would be wise to keep things tamped down for the time being.

Remember that the problem was scheduler request files fighting their way through the upload and download traffic, and the result - at that time - was ghost WUs, which certainly caused more problems than they were worth. Now, with the more powerful servers proving themselves capable of handling the limited number of 'resend lost results', things are running a lot smoother - but I see no benefit in increasing the number of lost results needing to be resent.

And I haven't seen any sign myself, or heard any complaints from the boards, suggesting that a five-minute delay is too long. Most regular posters will be running caches measured in days, not even hours - having to re-request two or three times isn't going to make any noticeable difference to them.

Profile Wiggo
Joined: 24 Jan 00
Posts: 7990
Credit: 98,335,002
RAC: 23,348
Australia
Message 1082460 - Posted: 28 Feb 2011, 21:02:49 UTC - in response to Message 1082458.

No complaints from me about a lousy 5mins.

Cheers.
____________

B-Man
Volunteer tester
Joined: 11 Feb 01
Posts: 253
Credit: 147,366
RAC: 0
United States
Message 1082474 - Posted: 28 Feb 2011, 21:57:46 UTC

I say leave be on the 5 Min delay for another 3-4 weeks to get a better feel on how things work when we have no problems. We have no need to rush into things. I say we take it slow before making changes.
____________

Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,332,781
RAC: 0
United States
Message 1082494 - Posted: 28 Feb 2011, 22:54:22 UTC - in response to Message 1082458.

The 5-minute delay was to reduce the load on jocelyn when we were waiting for the new servers to be spec'ed, ordered, and installed.

But given the load we're putting on the system at the moment - with the cricket graph maxed out for seven of the last eight hours - I'd suggest that it would be wise to keep things tamped down for the time being.

Remember that the problem was scheduler request files fighting their way through the upload and download traffic, and the result - at that time - was ghost WUs, which certainly caused more problems than they were worth. Now, with the more powerful servers proving themselves capable of handling the limited number of 'resend lost results', things are running a lot smoother - but I see no benefit in increasing the number of lost results needing to be resent.

And I haven't seen any sign myself, or heard any complaints from the boards, suggesting that a five-minute delay is too long. Most regular posters will be running caches measured in days, not even hours - having to re-request two or three times isn't going to make any noticeable difference to them.


This is my feeling as well. The 5 minute delays certainly haven't hurt me. My 5 machines have a combined RAC of over 80,000, putting me at over 1000 WU's a day on average. I'm having no problem at all keeping their caches full. Even Todd with his 600,000+ RAC (Easily over 7500 WU's a day) never has complained about keeping the cache topped off, even with the 5 min delays. The only thing limited by that 5 minutes is requesting new work.
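
The figures in this post can be checked with a few lines of arithmetic. The sketch below treats RAC as roughly credits per day and derives an implied credit per WU from the post's own numbers (80,000 RAC for about 1000 WUs a day); that 80-credit figure is an estimate from this post, not a project constant, and all variable names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MIN_DELAY_SECONDS = 5 * 60  # the 5-minute delay under discussion

# Implied by "RAC of over 80,000 ... over 1000 WU's a day" (estimate only).
credit_per_wu = 80_000 / 1_000

# Same conversion applied to the 600,000+ RAC figure.
wus_per_day_at_600k = 600_000 / credit_per_wu

# How often one host may ask for work if it asks at every opportunity.
requests_per_host_per_day = SECONDS_PER_DAY // MIN_DELAY_SECONDS

# 1000 WUs a day spread over 5 machines.
wus_per_machine_per_day = 1_000 / 5
wus_needed_per_request = wus_per_machine_per_day / requests_per_host_per_day

print(f"implied credit per WU:        {credit_per_wu:.0f}")
print(f"WUs/day at 600,000 RAC:       {wus_per_day_at_600k:.0f}")
print(f"request slots per host/day:   {requests_per_host_per_day}")
print(f"WUs needed per request slot:  {wus_needed_per_request:.2f}")
```

On these rough numbers a machine doing about 200 WUs a day needs less than one WU per allowed request, which is consistent with the caches staying full.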
____________

Profile Jeff Mercer
Joined: 14 Aug 08
Posts: 90
Credit: 162,139
RAC: 0
United States
Message 1082495 - Posted: 28 Feb 2011, 22:57:42 UTC - in response to Message 1082474.

I say leave be on the 5 Min delay for another 3-4 weeks to get a better feel on how things work when we have no problems. We have no need to rush into things. I say we take it slow before making changes.



I agree. I don't have any problems; however, I'm only running one little computer. I don't know about the people that are running several machines. I get enough work to keep me going for about 5 days, and so far, it's running nice and smooth.

Profile Zapped Sparky
Project donor
Volunteer moderator
Volunteer tester
Joined: 30 Aug 08
Posts: 8976
Credit: 1,321,433
RAC: 552
United Kingdom
Message 1082525 - Posted: 1 Mar 2011, 0:29:03 UTC

Another here running only a little computer, and things couldn't be better. Before, BOINC would try to connect to the server what seemed like once a minute for HOURS at a time with no success; now, if it can't connect to the server, it will try again five minutes later and I get another task. Woo and yay.

Josef W. Segur
Project donor
Volunteer developer
Volunteer tester
Send message
Joined: 30 Oct 99
Posts: 4336
Credit: 1,113,795
RAC: 779
United States
Message 1082550 - Posted: 1 Mar 2011, 2:11:54 UTC - in response to Message 1082412.

Kibble (KB7TIB) wrote:
...
If you really want to connect sooner you can always force the issue manually with the update button.

Yes, but a request for work forced that way will get a "Not sending work - last request too recent: xxx sec" message if there is work available.

A few of the top computers are able to do several tasks in 5 minutes, but not more than they are likely to get from successful requests at that interval. As hardware improves there will indeed come a time when top computers won't be able to be fully productive without a reduction of that setting. But IMO the time to consider reducing it will be after the available download bandwidth is increased; even with the 5 minute interval, the 100 Mbps download link we have now is often saturated.
Joe
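
To illustrate the behaviour described above, here is a minimal sketch of a minimum-interval check. It is not BOINC's actual scheduler code: the function and constant names are hypothetical, and it assumes the number in the message is the time elapsed since the host's previous request.

```python
from typing import Optional

MIN_INTERVAL_SECONDS = 5 * 60  # the 5-minute setting discussed in this thread


def check_work_request(now: float, last_request: Optional[float]) -> Optional[str]:
    """Return a refusal message if the host asked too recently, else None.

    Hypothetical helper for illustration; times are in seconds.
    """
    if last_request is None:
        return None  # first contact from this host, nothing to compare against
    elapsed = now - last_request
    if elapsed < MIN_INTERVAL_SECONDS:
        return f"Not sending work - last request too recent: {int(elapsed)} sec"
    return None


# A manual update 60 seconds after the previous request is refused ...
print(check_work_request(now=1000.0, last_request=940.0))
# ... while one made 400 seconds later is allowed through.
print(check_work_request(now=1000.0, last_request=600.0))
```

The point of the sketch is that pressing the update button does not bypass the interval: the request still reaches the server, but no work is sent until enough time has passed.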

tbret
Project donor
Volunteer tester
Joined: 28 May 99
Posts: 2897
Credit: 218,381,660
RAC: 21,868
United States
Message 1082904 - Posted: 2 Mar 2011, 10:19:39 UTC - in response to Message 1082494.

My 5 machines have a combined RAC of over 80,000, putting me at over 1000 WU's a day on average. I'm having no problem at all keeping their caches full. Even Todd with his 600,000+ RAC (Easily over 7500 WU's a day) never has complained about keeping the cache topped off, even with the 5 min delays. The only thing limited by that 5 minutes is requesting new work.


I notice that you are running a number of GTX 460s.

Are you happy with them, generally? Looks like you are clocking-in at about 20k RAC, which is higher than I thought that card's production would be.

____________

Profile Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,902,797
RAC: 257
Netherlands
Message 1082914 - Posted: 2 Mar 2011, 11:30:04 UTC - in response to Message 1082904.
Last modified: 2 Mar 2011, 11:35:58 UTC

I'm very happy with the 470 & 480 Fermis. You can run more than 1 WU at a time, depending on the number of usable 'cores'. I run 3 on the 480 without it getting too hot or producing errors, and 2 on the 470; even 3 are possible, but it gets quite hot doing so.

Some other projects, like GPUgrid, can (only?) be run on Fermis. (Looking at some results, unfortunately, most of the 200-series cards are producing a lot of errors!) I took my GTS250 off GPUgrid, as I did not get any work. Even when running SETI MB using the Lunatics V0.37 installer, it gets very hot and needs an additional fan and one side off its case.

As for the extended, 5-minute delay: I haven't noticed getting fewer WUs or any other problems. There is always the Update button, but BOINC 6.10.56 & 58 (XP64) is perfectly capable of doing its job. (If you just let it run, which is just 1 of its many functions.)

I also haven't noticed any problems reporting, uploading, or downloading, and I haven't been out of work. I'm still using a 4-day cache and only displaying active tasks, since the task display can put some extra load on the CPU, sometimes 5% or more, depending on the number of WUs.
(A CPDN WU, FAMOUS or otherwise, takes 4 minutes to upload after zipping, at 5 or 6 Kbit/sec, and is therefore also suited to dial-up connections. My laptop still has an 'old style' modem, which can be very useful; it also has Ethernet and WLAN, of course.)
____________

Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1991
Credit: 10,993,484
RAC: 9,361
United States
Message 1082987 - Posted: 2 Mar 2011, 18:41:37 UTC
Last modified: 2 Mar 2011, 19:07:00 UTC

Here's my situation, and why I'd like that 5 min delay reduced to 2 (or less ;-) )

I only connect to SETI once a day, with three (soon to be 4...) computers in my morning. (which happens to be the same as morning at the SETI labs...) If I get the dreaded "no work sent" when I request work, that means that I'm twiddling my thumbs waiting out the 5 minutes when I could be doing somethin' productive, in a BOINC sense...

This occasionally has made my computer(s) miss getting any WU's on Tuesdays. (yes, I run that close to 9 AM Pacific time... and sometimes the lab starts the outage early, like this week!)

This whole Method of Operations started 5 years ago, when I was connecting to the web (and SETI...) with a dial-up.
____________
.

Blake Bonkofsky
Volunteer tester
Joined: 29 Dec 99
Posts: 617
Credit: 46,332,781
RAC: 0
United States
Message 1083027 - Posted: 2 Mar 2011, 20:47:30 UTC - in response to Message 1082904.

My 5 machines have a combined RAC of over 80,000, putting me at over 1000 WU's a day on average. I'm having no problem at all keeping their caches full. Even Todd with his 600,000+ RAC (Easily over 7500 WU's a day) never has complained about keeping the cache topped off, even with the 5 min delays. The only thing limited by that 5 minutes is requesting new work.


I notice that you are running a number of GTX 460s.

Are you happy with them, generally? Looks like you are clocking-in at about 20k RAC, which is higher than I thought that card's production would be.


For the money, I don't think they can be beat. Maybe once we see FERMI specific opt apps come out it'll spread the field, but for now I'm very pleased. The cards seem to be good for about 16-20k each. My triple machine was up over 50k and climbing before the outage a few weeks ago, still trying to get back to that point. My single machine (Q8300) has been stable around 22k for a while, I believe about 4k of that coming from the CPU. I'd expect my triple machine to peak around the upper 50k's.
____________

N9JFE David S
Project donor
Volunteer tester
Joined: 4 Oct 99
Posts: 12522
Credit: 14,826,344
RAC: 2,956
United States
Message 1083216 - Posted: 3 Mar 2011, 16:05:57 UTC - in response to Message 1082987.

Here's my situation, and why I'd like that 5 min delay reduced to 2 (or less ;-) )

I only connect to SETI once a day, with three (soon to be 4...) computers in my morning. (which happens to be the same as morning at the SETI labs...) If I get the dreaded "no work sent" when I request work, that means that I'm twiddling my thumbs waiting out the 5 minutes when I could be doing somethin' productive, in a BOINC sense...

This occasionally has made my computer(s) miss getting any WU's on Tuesdays. (yes, I run that close to 9 AM Pacific time... and sometimes the lab starts the outage early, like this week!)

This whole Method of Operations started 5 years ago, when I was connecting to the web (and SETI...) with a dial-up.

Surely you're not still on dial-up, are you??? Do you have some reason, other than habit, for not allowing your computers to connect to SETI whenever they feel the need? (I know some people don't like to leave their internet connection on when they're not actively using it. My modem and router are in the basement, stuck up in the floor joists, because it's the most convenient place for connections to power, the outside feed, and internal network cables, but it means they're on 24/7... which is fine for the machine running BOINC and my Radio Reference feed.)

My only suggestion, and I'm sure I don't need to make it to you, would be to increase your cache size so that if you miss a day, you'll still have enough work to carry you through for another day.

David
____________
David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.



Copyright © 2014 University of California