Panic Mode On (75) Server problems?



Message boards : Number crunching : Panic Mode On (75) Server problems?

Wedge009
Volunteer tester
Joined: 3 Apr 99
Posts: 414
Credit: 174,414,350
RAC: 133,496
Australia
Message 1239223 - Posted: 1 Jun 2012, 1:34:59 UTC

I suppose I was just considering Folding@home's philosophy of timeliness of results vs the bulk of work that can be done. I appreciate the improvements in the hardware with regards to WU generation/transfer/validation/etc, but I was thinking that with smaller caches, WUs (from a global project perspective) could be processed and validated faster, freeing resources in the databases and other server processes and generally allowing S@h to run more efficiently.

I admit I don't follow the forums here that thoroughly (some users have been less than friendly), so I don't know if this is an idea that's been raised previously.
____________
Soli Deo Gloria

tbret
Volunteer tester
Joined: 28 May 99
Posts: 2934
Credit: 221,106,286
RAC: 28,333
United States
Message 1239269 - Posted: 1 Jun 2012, 3:15:45 UTC - in response to Message 1239223.

I suppose I was just considering Folding@home's philosophy of timeliness of results vs the bulk of work that can be done. I appreciate the improvements in the hardware with regards to WU generation/transfer/validation/etc, but I was thinking that with smaller caches, WUs (from a global project perspective) could be processed and validated faster, freeing resources in the databases and other server processes and generally allowing S@h to run more efficiently.

I admit I don't follow the forums here that thoroughly (some users have been less than friendly), so I don't know if this is an idea that's been raised previously.


You ought to hang out more!

Yeah, it's talked about from time to time. I'd rather have the work I turn in error out or validate quickly, but... I kind of like having work to do when there's a power outage or a RAID corruption or something, too.

Just over this little outage I had a computer run out of CPU work (but for some reason it had plenty of GPU work in reserve).

And for some reason we don't seem to always go FIFO, so the last thing you download may be the next thing you crunch while older tasks sit.

So... what do you do?

I'm trying a happy 5-day medium, but apparently this whole misreporting issue is messing with the estimated run times so what I set isn't necessarily what I get.

Misfit
Volunteer tester
Joined: 21 Jun 01
Posts: 21790
Credit: 2,510,901
RAC: 0
United States
Message 1239290 - Posted: 1 Jun 2012, 5:07:20 UTC - in response to Message 1239220.

The difference is you aren't bragging about having several days' worth available to crunch while complaining your uploads won't go through. For those of us who are running on empty, those are not words of encouragement.
____________

Join BOINC Synergy!

bill
Joined: 16 Jun 99
Posts: 861
Credit: 26,116,958
RAC: 10,012
United States
Message 1239327 - Posted: 1 Jun 2012, 7:17:31 UTC - in response to Message 1239290.

You choose to run on empty. Why should that be a concern to anyone else?

msattler (Project donor)
Volunteer tester
Joined: 9 Jul 00
Posts: 40848
Credit: 665,892,081
RAC: 361,766
United States
Message 1239329 - Posted: 1 Jun 2012, 7:22:40 UTC

Come now, kitties......

The subject of cache sizes has long been debated.
Fact remains, it is up to the individual users to choose what cache size they wish to try to maintain within the choices made available by this project.

Let's not stir the pot please.


____________
*********************************************
Behold the power of kitty!!

I have met a few friends in my life.
Most were cats.

Lionel
Joined: 25 Mar 00
Posts: 601
Credit: 269,979,535
RAC: 183,978
Australia
Message 1239346 - Posted: 1 Jun 2012, 8:05:05 UTC - in response to Message 1239329.


Getting Scheduler request failed: HTTP Internal Server Error

I don't believe that this problem is fixed ...

cheers
____________

Link
Joined: 18 Sep 03
Posts: 791
Credit: 1,585,926
RAC: 88
Germany
Message 1239367 - Posted: 1 Jun 2012, 9:07:00 UTC - in response to Message 1239216.
Last modified: 1 Jun 2012, 9:17:40 UTC

If everyone who holds a ten day cache dropped to something more reasonable, there'd be more WUs to share around...

No, the splitters stop when about 250,000 WUs are ready to send, so once they run out of tapes (for whatever reason), there are usually no more than 250,000 WUs to send out, regardless of what people have in their caches.

Larger caches actually force the servers to generate a larger work buffer, which is then stored in the cache of each client, so if the servers are down, the clients can still do a lot of work for the project and return it after the outage. If we all had just a one-day cache, processing for S@H would stop completely after about 24 hours; with larger caches we process the WUs as if nothing had happened, and the current servers are powerful enough to catch up and restore our caches a while after the outage anyway.
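A rough sketch of the behaviour described above, in Python (the cap value is the one mentioned in the post; the function names and structure are purely illustrative, not taken from the actual SETI@home server code):

```python
# Illustrative model of the splitter throttle: splitters only produce
# new WUs while raw data ("tapes") remains and the ready-to-send queue
# is below a fixed cap (~250,000 WUs, per the post above).

READY_TO_SEND_CAP = 250_000  # approximate cap mentioned in the post

def splitter_should_run(ready_to_send: int, tapes_available: int) -> bool:
    """True while the splitters would keep generating work."""
    return tapes_available > 0 and ready_to_send < READY_TO_SEND_CAP

def hours_of_work_in_outage(cache_days: float) -> float:
    """With the servers down, a client can only crunch what it cached."""
    return cache_days * 24
```

Under that model a one-day cache idles the host roughly 24 hours into an outage, while a five-day cache rides out anything shorter.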
____________
.

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2461
Credit: 9,389,367
RAC: 2,834
United States
Message 1239506 - Posted: 1 Jun 2012, 16:50:49 UTC

My single-core machine was down to one MB task left of its 2.5-day cache, but on the first scheduler contact after it all came back up, it reported all the completed tasks and got 8 new ones to fill the cache back up.

Main cruncher is AP-only and reported what it completed during the outage, and hasn't gotten any new APs yet. Just a little less than a day before I run out. Not worried, nor complaining. I'll get more work eventually.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Grant (SSSF)
Joined: 19 Aug 99
Posts: 6228
Credit: 69,419,110
RAC: 48,652
Australia
Message 1239961 - Posted: 2 Jun 2012, 4:42:41 UTC


Noticed the network traffic has dropped off, so I had a look in my messages tab and it's getting a lot of "Project has no tasks available" and "No tasks sent" messages. It's trying to get work, it's just not receiving it.
____________
Grant
Darwin NT.

msattlerProject donor
Volunteer tester
Joined: 9 Jul 00
Posts: 40848
Credit: 665,892,081
RAC: 361,766
United States
Message 1239968 - Posted: 2 Jun 2012, 4:50:14 UTC - in response to Message 1239961.


Noticed the network traffic has dropped off, so I had a look in my messages tab and it's getting a lot of "Project has no tasks available" and "No tasks sent" messages. It's trying to get work, it's just not receiving it.

The main reason the bandwidth dropped off is the AP splitters are not running at the moment.
____________
*********************************************
Behold the power of kitty!!

I have met a few friends in my life.
Most were cats.

Card-Carrying Grumpyologist and Communistologist
Volunteer tester
Joined: 1 Nov 08
Posts: 4847
Credit: 23,527,227
RAC: 4,497
Sweden
Message 1241263 - Posted: 4 Jun 2012, 19:35:59 UTC

Oops, someone turned on AP splitting, and as usual the message board became sluggish like surfing in molasses. Cricket graph confirms that we have a slow evening ahead of us too.

____________
I'm only running one computer. Using 2 cores of an old Q8200 CPU for CPU tasks, and 2 cores feeding a single Mid-range GPU, ATI HD7870.
Look at the RAC folks, and ask yourselves why it beats so many multi GPU monster computers :-)

.clair.
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 29,693,171
RAC: 11,845
United Kingdom
Message 1241305 - Posted: 4 Jun 2012, 20:42:19 UTC

As of now, there is only one AP splitter running and not a lot of data for it to chew on.
Looks like it will be `that` kind of evening on this side of the pond.

`More coal in the boiler, the ship is slowing down`

rob smith (Project donor)
Volunteer tester
Joined: 7 Mar 03
Posts: 9487
Credit: 76,898,513
RAC: 112,728
United Kingdom
Message 1241312 - Posted: 4 Jun 2012, 20:56:23 UTC

...and by the time I read your post the coal in the AP splitting boiler had run out...
____________
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?

Zapped Sparky (Project donor)
Volunteer moderator
Volunteer tester
Joined: 30 Aug 08
Posts: 13226
Credit: 1,492,452
RAC: 914
United Kingdom
Message 1241382 - Posted: 4 Jun 2012, 22:41:07 UTC

I missed some Astropulses being split? Oh man, I'm currently crunching the last one in my cache. Time to panic, methinks.
____________
In an alternate universe, it was a ZX81 that asked for clothes, boots and motorcycle.

Client error 418: I'm a teapot

Tropical Goldfish Fish 15: Squeaky bras 'R us

Illusions of normality sufferer

SciManStev
Volunteer tester
Joined: 20 Jun 99
Posts: 5035
Credit: 89,465,974
RAC: 39,851
United States
Message 1241410 - Posted: 4 Jun 2012, 23:23:13 UTC

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve
____________
Warning, addicted to SETI crunching!
Crunching as a member of GPU Users Group.
GPUUG Website

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1241422 - Posted: 4 Jun 2012, 23:39:23 UTC - in response to Message 1241382.
Last modified: 4 Jun 2012, 23:57:10 UTC

I missed some Astropulses being split? Oh man, I'm currently crunching the last one in my cache. Time to panic, methinks.


Not really a panic, but I am doing a Spock raised eyebrow.....
Steve.


Well, you'll survive; you also have 4 times the throughput of my hosts...
Since I'm still running BOINC 7.00.25, almost all SETI work, MB as well as Astropulse, runs at High Priority... MW, too. Merely cosmetic?

I've just crunched a few on ATI 5870 GPUs, 3 to 4 hours runtime:
9,779.41 s run time, 6,056.19 s CPU time, pending validation*, AstroPulse v6,
Anonymous platform (ATI GPU).
The high CPU time is due to the high % of blanking.
*Checked: no 2nd result yet (I'm waiting for my wingman).

I could do with some more AstroPulse work, though.
Panic? No way; besides, that's what back-up projects are for ;-).
____________

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 9207
Credit: 58,026,017
RAC: 35,958
United Kingdom
Message 1241427 - Posted: 4 Jun 2012, 23:50:01 UTC - in response to Message 1241410.

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve

Oh good, they must be splitting shorties again - I could do with a few of them.

arkayn (Project donor)
Volunteer tester
Joined: 14 May 99
Posts: 3809
Credit: 48,779,443
RAC: 2
United States
Message 1241430 - Posted: 4 Jun 2012, 23:53:07 UTC - in response to Message 1241410.

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve


I am on 2600 wu's with my single GTX560 and a 5 day cache.
____________

Fred J. Verster
Volunteer tester
Joined: 21 Apr 04
Posts: 3252
Credit: 31,903,643
RAC: 0
Netherlands
Message 1241439 - Posted: 5 Jun 2012, 0:08:09 UTC - in response to Message 1241430.
Last modified: 5 Jun 2012, 0:24:48 UTC

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve


I am on 2600 wu's with my single GTX560 and a 5 day cache.


I've got SETI@home Enhanced: (1784) MB crunched and reported, (831) pending* and waiting for a canonical result. 3 Astropulse WUs reported.
Less than 1000 a.t.m. on my i7-2600 + 2x ATI 5870 GPUs, 12 at a time. I'd have to set a larger cache, but I'm never out of work, so I let it be.

Most of the regular posters, volunteer developers and testers have an almost 10-fold throughput, since they use CUDA (Fermi/Kepler) and OpenCL 1.2 (ATI-AMD SDK 2.4)!
More or less. Look at these results:
Astropulse
WU 99732719
.
____________

Kevin Olley
Joined: 3 Aug 99
Posts: 375
Credit: 38,393,371
RAC: 12,337
United Kingdom
Message 1241450 - Posted: 5 Jun 2012, 0:28:02 UTC - in response to Message 1241410.

My rig just jumped into high priority mode and is leaving work unfinished all over the place. I can't even remember the last time that happened. I do have about 6900 wu's on board, which seems a wee high for a 5 day cache. Even with that amount, I should have no trouble crunching them.

Not really a panic, but I am doing a Spock raised eyebrow.....

Steve


I've had the same: a run of "fast" regular WUs, then a few crunchy ones. That upsets (increases) the estimated completion time, and with a larger cache it's enough to cause BOINC to panic.

Sometimes it will jump in and out of high-priority mode. If you have a bunch of longer-running WUs, when one finishes it will kick into high-priority mode and start doing a bunch of shorties; then, as the estimated completion time drops, it will start back on the longer-running ones until it completes another one and kicks back into high-priority mode again.

There don't seem to be a lot of VLARs or VHARs around, but there does seem to be a lot of variation (runtime-wise) in the regular WUs.
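A toy version of the deadline check behind that jumping in and out of high priority (a one-CPU, run-in-deadline-order simplification; the real BOINC client's simulation is more elaborate): when inflated runtime estimates make some cached task look like it would miss its deadline, the client goes high priority, and it calms down again once the estimates drop.

```python
# Toy earliest-deadline-first check: walk the cache in deadline order
# and see whether any task's projected finish time slips past its
# deadline. Simplified to one CPU with no overlap; names are
# illustrative, not BOINC's actual API.

def would_panic(tasks, now=0.0):
    """tasks: list of (estimated_runtime, deadline) pairs.
    Returns True if the simulated schedule misses a deadline,
    which is roughly when a BOINC-style client enters
    high-priority mode."""
    t = now
    for est_runtime, deadline in sorted(tasks, key=lambda task: task[1]):
        t += est_runtime
        if t > deadline:
            return True
    return False

# A few crunchy WUs inflating the estimates can flip the result:
calm = [(5.0, 10.0), (5.0, 12.0)]      # finishes at t=5 and t=10: fits
inflated = [(7.0, 10.0), (7.0, 12.0)]  # finishes at t=7 and t=14 > 12
```

Which is why a run of slow WUs that pushes the estimates up can tip a big cache into panic even though nothing about the deadlines changed.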



____________
Kevin




Copyright © 2015 University of California