Panic Mode On (75) Server problems?



Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 311
Credit: 130,846,438
RAC: 244,275
Australia
Message 1239223 - Posted: 1 Jun 2012, 1:34:59 UTC

I suppose I was just considering Folding@home's philosophy of timeliness of results vs the bulk of work that can be done. I appreciate the improvements in the hardware with regards to WU generation/transfer/validation/etc, but I was thinking that with smaller caches, WUs (from a global project perspective) could be processed and validated faster, freeing resources in the databases and other server processes and generally allowing S@h to run more efficiently.

I admit I don't follow the forums here that thoroughly (some users have been less than friendly), so I don't know if this is an idea that's been raised previously.
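A rough way to see the database side of that argument is Little's law: the number of result rows the server has to keep "in flight" is roughly the creation rate times the average turnaround, and turnaround is dominated by how long tasks sit in client caches. A minimal sketch, with made-up rates (none of these are actual SETI@home figures):

```python
# Little's-law estimate of how average cache size drives the number of result
# rows the project database has to track. All numbers are illustrative assumptions.

WUS_PER_DAY = 100_000   # hypothetical workunit creation rate
CRUNCH_DAYS = 0.5       # hypothetical average crunch-and-report delay
REPLICATION = 2         # results issued per workunit (typical BOINC quorum)

def rows_in_flight(avg_cache_days: float) -> float:
    """Approximate outstanding result rows: throughput x average time in system."""
    turnaround_days = avg_cache_days + CRUNCH_DAYS
    return WUS_PER_DAY * REPLICATION * turnaround_days

for cache in (1, 5, 10):
    print(f"{cache:>2}-day caches: ~{rows_in_flight(cache):,.0f} result rows in flight")
```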
____________
Soli Deo Gloria

tbret
Volunteer tester
Joined: 28 May 99
Posts: 2631
Credit: 192,423,405
RAC: 537,534
United States
Message 1239269 - Posted: 1 Jun 2012, 3:15:45 UTC - in response to Message 1239223.

I suppose I was just considering Folding@home's philosophy of timeliness of results vs the bulk of work that can be done. I appreciate the improvements in the hardware with regards to WU generation/transfer/validation/etc, but I was thinking that with smaller caches, WUs (from a global project perspective) could be processed and validated faster, freeing resources in the databases and other server processes and generally allowing S@h to run more efficiently.

I admit I don't follow the forums here that thoroughly (some users have been less than friendly), so I don't know if this is an idea that's been raised previously.


You ought to hang out more!

Yeah, it's talked about from time to time. I'd rather have the work I turn in error out or validate quickly, but... I kind of like having work to do when there's a power outage or a RAID corruption or something, too.

Just over this little outage I had a computer run out of CPU work (though for some reason it had plenty of GPU work in reserve).

And for some reason we don't always seem to go FIFO, so the last thing you download may be the next thing you crunch while older tasks sit.

So... what do you do?

I'm trying a happy 5-day medium, but apparently this whole misreporting issue is messing with the estimated run times, so what I set isn't necessarily what I get.
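For what it's worth, here is a simplified sketch of how the cache setting interacts with those runtime estimates (this is not the actual BOINC client code, and the numbers are made up): the client asks for enough seconds of work to top its queue up to the cache target using its own estimates, so inflated estimates make the queue look fuller and shrink the request.

```python
# Simplified sketch of BOINC-style work fetch (not the real client algorithm).
# If runtime estimates are inflated, the queue looks "fuller" than it really
# is and the client requests less work than the cache setting suggests.

SECONDS_PER_DAY = 86_400

def work_request_seconds(cache_days, queued_estimates, estimate_inflation=1.0):
    """queued_estimates: estimated remaining runtime (seconds) for each task.
    estimate_inflation > 1 models estimates that are too high."""
    target = cache_days * SECONDS_PER_DAY
    queued = sum(queued_estimates) * estimate_inflation
    return max(0.0, target - queued)

tasks = [1800.0] * 100  # hypothetical: 100 queued tasks, ~30 minutes each
print("accurate estimates:   ", work_request_seconds(5, tasks), "seconds requested")
print("estimates 3x too high:", work_request_seconds(5, tasks, 3.0), "seconds requested")
```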

Misfit
Volunteer tester
Joined: 21 Jun 01
Posts: 21790
Credit: 2,510,901
RAC: 0
United States
Message 1239290 - Posted: 1 Jun 2012, 5:07:20 UTC - in response to Message 1239220.

The difference is you aren't bragging about having several days' worth available to crunch while complaining your uploads won't go through. For the people who do, those are not words of encouragement to those of us who are running on empty.
____________

bill
Joined: 16 Jun 99
Posts: 859
Credit: 22,390,253
RAC: 16,297
United States
Message 1239327 - Posted: 1 Jun 2012, 7:17:31 UTC - in response to Message 1239290.

You choose to run on empty. Why should that be a concern to anyone else?

msattler
Volunteer tester
Joined: 9 Jul 00
Posts: 38336
Credit: 561,534,262
RAC: 647,473
United States
Message 1239329 - Posted: 1 Jun 2012, 7:22:40 UTC

Come now, kitties......

The subject of cache sizes has long been debated.
The fact remains that it is up to individual users to choose what cache size they wish to maintain, within the choices made available by this project.

Let's not stir the pot please.


____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Lionel
Joined: 25 Mar 00
Posts: 544
Credit: 217,707,709
RAC: 209,411
Australia
Message 1239346 - Posted: 1 Jun 2012, 8:05:05 UTC - in response to Message 1239329.


Getting Scheduler request failed: HTTP Internal Server Error

I don't believe that this problem is fixed ...

cheers
____________

Link
Joined: 18 Sep 03
Posts: 824
Credit: 1,545,480
RAC: 290
Germany
Message 1239367 - Posted: 1 Jun 2012, 9:07:00 UTC - in response to Message 1239216.
Last modified: 1 Jun 2012, 9:17:40 UTC

If everyone who holds a ten day cache dropped to something more reasonable, there'd be more WUs to share around...

No, the splitters stop when about 250,000 WUs are ready to send, so once they run out of tapes (for whatever reason), there are usually no more than 250,000 WUs to send out, regardless of what people have in their caches.

Larger caches actually make the servers generate a larger work buffer, which is then stored in the cache of each client, so if the servers are down, the clients can still do a lot of work for the project and return it after the outage. If we all had just a one-day cache, processing for S@H would stop completely after about 24 hours; with larger caches we process the WUs like nothing happened. The current servers are powerful enough to catch up and restore our caches a while after the outage anyway.
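To put the same point in toy-model form (the only figure taken from the post is the roughly 250,000 ready-to-send cap; every other number is an assumption): the scheduler can only hand out what is in the ready-to-send queue, while work already sitting in client caches keeps hosts busy for as long as the caches last when the servers are down.

```python
# Toy model of the split/send/cache pipeline described above. Only the
# ~250,000 ready-to-send cap comes from the post; the rest is assumed.

READY_TO_SEND_CAP = 250_000     # splitters pause once this many WUs are queued
FLEET_DEMAND_PER_DAY = 400_000  # hypothetical fleet-wide crunch rate, WUs/day

def sendable(ready_to_send: int, requested: int) -> tuple[int, int]:
    """Server side: the scheduler can only hand out what is ready to send,
    no matter how large the combined client caches are."""
    sent = min(ready_to_send, requested)
    return sent, ready_to_send - sent

def crunching_days_during_outage(avg_cache_days: float, outage_days: float) -> float:
    """Client side: with the servers down, hosts keep working until their
    local caches run dry."""
    return min(avg_cache_days, outage_days)

sent, left = sendable(READY_TO_SEND_CAP, FLEET_DEMAND_PER_DAY)
print(f"Tapes run dry: {sent:,} WUs go out, {left:,} remain in the queue.")
print("1-day caches cover", crunching_days_during_outage(1, 3), "day of a 3-day outage")
print("5-day caches cover", crunching_days_during_outage(5, 3), "days of a 3-day outage")
```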
____________
.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2240
Credit: 8,458,724
RAC: 4,095
United States
Message 1239506 - Posted: 1 Jun 2012, 16:50:49 UTC

My single-core machine was down to one MB (MultiBeam) task left of its 2.5-day cache, but on the first scheduler contact after it all came back up, it reported all the completed tasks and got 8 new ones to fill the cache back up.

Main cruncher is AP-only and reported what it completed during the outage, and hasn't gotten any new APs yet. Just a little less than a day before I run out. Not worried, nor complaining. I'll get more work eventually.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5698
Credit: 56,493,166
RAC: 48,976
Australia
Message 1239961 - Posted: 2 Jun 2012, 4:42:41 UTC


Noticed the network traffic has dropped off, so I had a look in my messages tab and it's getting a lot of "Project has no tasks available" and "No tasks sent" messages. It's trying to get work; it's just not receiving any.
____________
Grant
Darwin NT.

msattler
Volunteer tester
Joined: 9 Jul 00
Posts: 38336
Credit: 561,534,262
RAC: 647,473
United States
Message 1239968 - Posted: 2 Jun 2012, 4:50:14 UTC - in response to Message 1239961.


Noticed the network traffic has dropped off, so I had a look in my messages tab and it's getting a lot of "Project has no tasks available" and "No tasks sent" messages. It's trying to get work; it's just not receiving any.

The main reason the bandwidth dropped off is that the AP splitters are not running at the moment.
____________
*********************************************
Embrace your inner kitty...ya know ya wanna!

I have met a few friends in my life.
Most were cats.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3336
Credit: 19,084,648
RAC: 19,729
Sweden
Message 1241263 - Posted: 4 Jun 2012, 19:35:59 UTC

Oops, someone turned on AP splitting, and as usual the message board became sluggish, like surfing in molasses. The Cricket graph confirms that we have a slow evening ahead of us, too.

____________

clive G1FYE
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 23,054,144
RAC: 0
United Kingdom
Message 1241305 - Posted: 4 Jun 2012, 20:42:19 UTC

As of now there is only one AP splitter running, and not a lot of data for it to chew on.
Looks like it will be 'that' kind of evening on this side of the pond.

'More coal in the boiler, the ship is slowing down.'

rob smith
Volunteer tester
Joined: 7 Mar 03
Posts: 8149
Credit: 52,888,924
RAC: 75,661
United Kingdom
Message 1241312 - Posted: 4 Jun 2012, 20:56:23 UTC

...and by the time I read your post the coal in the AP splitting boiler had run out...
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Zapped Sparky
Volunteer moderator
Volunteer tester
Joined: 30 Aug 08
Posts: 6720
Credit: 1,200,844
RAC: 46
United Kingdom
Message 1241382 - Posted: 4 Jun 2012, 22:41:07 UTC

I missed some Astropulses being split? Oh man, I'm currently crunching the last one in my cache. Time to panic, methinks.
____________
In an alternate universe, it was a ZX81 that asked for clothes, boots and motorcycle.

Client error 418: I'm a teapot

Tropical Goldfish Fish 13: You're not crazy if you crunch for Seti :)

SciManStev
Volunteer tester
Joined: 20 Jun 99
Posts: 4792
Credit: 79,827,771
RAC: 36,224
United States
Message 1241410 - Posted: 4 Jun 2012, 23:23:13 UTC

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3232
Credit: 31,585,541
RAC: 0
Netherlands
Message 1241422 - Posted: 4 Jun 2012, 23:39:23 UTC - in response to Message 1241382.
Last modified: 4 Jun 2012, 23:57:10 UTC

I missed some Astropulses being split? Oh man, I'm currently crunching the last one in my cache. Time to panic, methinks.


Not really a panic, but I am doing a Spock raised eyebrow.....
Steve.


Well, you'll survive; you also have 4 times the throughput of my hosts...
Since I'm still running BOINC 7.00.25, almost all SETI work, MB as well as AstroPulse, runs at High Priority... MW, too. Merely cosmetic?

I've just crunched a few on ATI 5870 GPUs, 3 to 4 hours runtime:
run time 9,779.41 s, CPU time 6,056.19 s, pending*, AstroPulse v6, Anonymous platform (ATI GPU).
The high CPU time is due to the high % of blanking.
*Checked, no 2nd result yet (I'm waiting for my wingman).

I could do with some more AstroPulse work, though.
Panic? No way; besides, that's what back-up projects are for ;-).
____________


Knight Who Says Ni N!, OUT numbered.................

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8377
Credit: 46,802,568
RAC: 23,892
United Kingdom
Message 1241427 - Posted: 4 Jun 2012, 23:50:01 UTC - in response to Message 1241410.

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve

Oh good, they must be splitting shorties again - I could do with a few of them.

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 3597
Credit: 47,401,500
RAC: 5,857
United States
Message 1241430 - Posted: 4 Jun 2012, 23:53:07 UTC - in response to Message 1241410.

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve


I am on 2600 wu's with my single GTX560 and a 5 day cache.
____________

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3232
Credit: 31,585,541
RAC: 0
Netherlands
Message 1241439 - Posted: 5 Jun 2012, 0:08:09 UTC - in response to Message 1241430.
Last modified: 5 Jun 2012, 0:24:48 UTC

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve


I am on 2600 wu's with my single GTX560 and a 5 day cache.


I have 1,784 SETI@home Enhanced (MB) tasks; 831 of them are crunched, reported and pending (waiting for a canonical result), and 3 AstroPulse WUs have been reported.
Fewer than 1,000 at the moment on my i7-2600 + 2x ATI 5870 GPUs, running 12 at a time; I'd have to set a larger cache, but I'm never out of work, so I let it be.

Most of the regular posters, volunteer developers and testers have almost 10-fold my throughput, more or less, since they use CUDA (Fermi/Kepler) and OpenCL 1.2 (ATI-AMD SDK 2.4)! Look at these results:
AstroPulse WU 99732719
____________


Knight Who Says Ni N!, OUT numbered.................

Kevin Olley
Joined: 3 Aug 99
Posts: 368
Credit: 35,181,467
RAC: 2,439
United Kingdom
Message 1241450 - Posted: 5 Jun 2012, 0:28:02 UTC - in response to Message 1241410.

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve


I've had the same: a run of "fast" regular WUs, then a few crunchy ones. It upsets (increases) the estimated completion time, and with a larger cache that's enough to cause BOINC to panic.

Sometimes it will jump in and out of high priority mode. If you have a bunch of longer-running WUs, finishing one will kick it into high priority mode and it starts doing a bunch of shorties; then, as the estimated completion time drops, it starts back on the longer-running ones until it completes another one and kicks back into high priority mode again.

There don't seem to be a lot of VLARs or VHARs around, but there seems to be a lot of variation (runtime-wise) in the regular WUs.
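That in-and-out behaviour matches, in broad strokes, how the client decides to panic: it simulates finishing the queued tasks in deadline order using its current runtime estimates, and switches to earliest-deadline-first ("high priority") when something looks like it would miss its deadline. A simplified sketch, not the actual BOINC scheduler, with made-up task numbers:

```python
# Simplified sketch of the "panic mode" trigger described above (not the
# actual BOINC client scheduler; all deadlines and estimates are made up).

from dataclasses import dataclass

@dataclass
class Task:
    name: str
    est_remaining_h: float   # estimated hours of crunching left
    deadline_h: float        # hours from now until the task's deadline

def needs_high_priority(tasks, parallel_slots=1, safety_margin=0.9):
    """Walk the queue in deadline order and flag panic if any task would
    finish after (a safety fraction of) its deadline."""
    clock = 0.0
    for t in sorted(tasks, key=lambda t: t.deadline_h):
        clock += t.est_remaining_h / parallel_slots
        if clock > t.deadline_h * safety_margin:
            return True, t.name
    return False, None

cache = [Task("shorty", 0.2, 24), Task("crunchy-1", 30.0, 48), Task("crunchy-2", 30.0, 48)]
print(needs_high_priority(cache, parallel_slots=1))  # inflated estimates -> panic
print(needs_high_priority(cache, parallel_slots=4))  # more throughput -> relaxed again
```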



____________
Kevin



