Panic Mode On (70) Server problems?

kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1203623 - Posted: 8 Mar 2012, 8:30:29 UTC

Well, I see nothing really got fixed today with dear ol' Synergy in its new duties. The crickets are still slouching, even though kibble bowls are not full.

I have sloooooooowly gained a bit on cache, but still am about 3000 WUs shy of the limits, and have several rigs doing GPU only, because of the silly server bug that seems to avoid sending any CPU work unless the GPU is up to the limits.
And so, the CPUs sit idle.

Vexing bit of a perfect storm.

But the kitties pad along.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1203623
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1203625 - Posted: 8 Mar 2012, 8:34:33 UTC - in response to Message 1203495.  

Well, I think I found a way around the problem of miscommunication between BOINC and its peer.

Hit the update button until it succeeds.

I had 19 tasks to report, didn't want more WU, and the update failed. So I hit update again [ignoring the timeout] and it failed again, so I hit update again and it made contact OK and reported OK.

It looks to my admittedly untrained eye as if there is a problem with the router at Berkeley, perhaps memory or a software glitch. Or, of course, the pipe feeding the site.
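
The "hit update until it succeeds" workaround can be sketched as a bounded retry loop. This is only an illustration: `try_update` is a hypothetical stand-in for whatever triggers a scheduler contact, not a BOINC API, and the attempt limit and delay are assumed values.

```python
import time
from typing import Callable


def update_until_success(try_update: Callable[[], bool],
                         max_attempts: int = 5,
                         delay_seconds: float = 0.0) -> int:
    """Call try_update until it returns True; return the number of attempts used.

    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if try_update():
            return attempt
        time.sleep(delay_seconds)  # pause before the next attempt
    raise RuntimeError(f"update failed after {max_attempts} attempts")


# Simulated contact that fails twice and then succeeds, as in the post.
outcomes = iter([False, False, True])
attempts_used = update_until_success(lambda: next(outcomes))
print(attempts_used)
```

Capping the attempts keeps the loop from running forever if the server stays unreachable.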

Anyway my tuppence worth.

Cheers
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1203625
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1203627 - Posted: 8 Mar 2012, 8:37:34 UTC - in response to Message 1203625.  

Well, I think I found a way around the problem of miscommunication between BOINC and its peer.

Hit the update button until it succeeds.

I had 19 tasks to report, didn't want more WU, and the update failed. So I hit update again [ignoring the timeout] and it failed again, so I hit update again and it made contact OK and reported OK.

It looks to my admittedly untrained eye as if there is a problem with the router at Berkeley, perhaps memory or a software glitch. Or, of course, the pipe feeding the site.

Anyway my tuppence worth.

Cheers

I think that Synergy is just overloaded, or the Apache settings are not quite right yet.....dunno, just a guess. Synergy is trying to do a yeoman's duty here....
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1203627
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1203628 - Posted: 8 Mar 2012, 8:42:57 UTC - in response to Message 1203627.  

Hi Mark,
Well I had another 2 results to report and it went into its usual routine, so I did what I posted I'd done last time and once again 3rd time lucky.

If it's Synergy being overloaded then I guess this won't be sorted out until bane is reconfigured :-/

And methinks that may take Matt a fair bit of time since he's also going to be dealing with the new servers too.

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1203628
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1203629 - Posted: 8 Mar 2012, 8:46:16 UTC - in response to Message 1203628.  

Hi Mark,
Well I had another 2 results to report and it went into its usual routine, so I did what I posted I'd done last time and once again 3rd time lucky.

If it's Synergy being overloaded then I guess this won't be sorted out until bane is reconfigured :-/

And methinks that may take Matt a fair bit of time since he's also going to be dealing with the new servers too.

Regards,

Hopefully the new download server shall take some load off of Synergy.....or however they choose to distribute the work load. Certainly I would think that the new server would handle more than just downloads, just as Synergy is now handling much more than it was originally commissioned for.

I suspect good things are indeed on the horizon....and not just the headlight of an approaching train. LOL.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1203629
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1203636 - Posted: 8 Mar 2012, 9:01:08 UTC - in response to Message 1203629.  

Dunno so much about those good things; it seems we have a massive solar storm en route to us that is due to hit later this evening.

Apparently the northern lights will be seen at lower latitudes than usual, and power grids may be affected, as will some flight paths.

Other than the potential power outage, I guess things will be better when the new kit comes online, once it's fully configured of course.

So about 2 weeks or so?

Regards,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1203636
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1203637 - Posted: 8 Mar 2012, 9:02:47 UTC - in response to Message 1203636.  

Dunno so much about those good things; it seems we have a massive solar storm en route to us that is due to hit later this evening.

Apparently the northern lights will be seen at lower latitudes than usual, and power grids may be affected, as will some flight paths.

Other than the potential power outage, I guess things will be better when the new kit comes online, once it's fully configured of course.

So about 2 weeks or so?

Regards,

I just posted an article about the solar flares in the Cafe.
Hope it is not as disruptive as they are saying it could be.

"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1203637
Wiggo
Joined: 24 Jan 00
Posts: 34747
Credit: 261,360,520
RAC: 489
Australia
Message 1203661 - Posted: 8 Mar 2012, 9:45:54 UTC - in response to Message 1203638.  

I'm sorry but I still can't complain.

Cheers.
ID: 1203661
shizaru
Volunteer tester
Joined: 14 Jun 04
Posts: 1130
Credit: 1,967,904
RAC: 0
Greece
Message 1203673 - Posted: 8 Mar 2012, 10:56:12 UTC - in response to Message 1203623.  

Well, I see nothing really got fixed today with dear ol' Synergy in its new duties. The crickets are still slouching, even though kibble bowls are not full.

I have sloooooooowly gained a bit on cache, but still am about 3000 WUs shy of the limits, and have several rigs doing GPU only, because of the silly server bug that seems to avoid sending any CPU work unless the GPU is up to the limits.
And so, the CPUs sit idle.
Vexing bit of a perfect storm.

But the kitties pad along.


Thank you for that.

From an old post I made, I can see I've had this problem for at least 12 days now... Surprised I missed it being mentioned. It was driving me crazy 'cause I was sure I messed something up when I signed up for Beta.
ID: 1203673
LadyL
Volunteer tester
Joined: 14 Sep 11
Posts: 1679
Credit: 5,230,097
RAC: 0
Message 1203674 - Posted: 8 Mar 2012, 11:03:14 UTC - in response to Message 1203673.  

Well, I see nothing really got fixed today with dear ol' Synergy in its new duties. The crickets are still slouching, even though kibble bowls are not full.

I have sloooooooowly gained a bit on cache, but still am about 3000 WUs shy of the limits, and have several rigs doing GPU only, because of the silly server bug that seems to avoid sending any CPU work unless the GPU is up to the limits.
And so, the CPUs sit idle.
Vexing bit of a perfect storm.

But the kitties pad along.


Thank you for that.

From an old post I made, I can see I've had this problem for at least 12 days now... Surprised I missed it being mentioned. It was driving me crazy 'cause I was sure I messed something up when I signed up for Beta.


You could try using the website setting to get no GPU work for some time, so your CPU cache fills, and then re-enable GPU work fetch.
I'm not the Pope. I don't speak Ex Cathedra!
ID: 1203674
David S
Volunteer tester
Joined: 4 Oct 99
Posts: 18352
Credit: 27,761,924
RAC: 12
United States
Message 1203744 - Posted: 8 Mar 2012, 16:10:49 UTC - in response to Message 1203627.  

I think that Synergy is just overloaded, or the Apache settings are not quite right yet.....dunno, just a guess. Synergy is trying to do a yeoman's duty here....

As stated on the SSP, the purpose of the feeder is "Fills up the scheduler work queue with workunits ready to be sent. The scheduler is usually too busy handling client transactions to maintain such a queue itself."

So is that effect being negated by the scheduling server, scheduler process, and feeder all being on the same machine?
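
A rough picture of the feeder's role as quoted from the SSP: one process keeps a small ready queue topped up so the scheduler only has to take from it. The sketch below is a toy model with hypothetical names (`ReadyQueue`, `refill`, `take`) and an assumed capacity; it is not BOINC's actual feeder, which this thread does not describe in detail.

```python
from collections import deque


class ReadyQueue:
    """Toy model: a feeder tops up a bounded queue that a scheduler drains."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.slots = deque()

    def refill(self, backlog: list) -> int:
        """Feeder step: move workunits from the backlog until the queue is full."""
        moved = 0
        while backlog and len(self.slots) < self.capacity:
            self.slots.append(backlog.pop(0))
            moved += 1
        return moved

    def take(self, wanted: int) -> list:
        """Scheduler step: hand out up to `wanted` ready workunits."""
        count = min(wanted, len(self.slots))
        return [self.slots.popleft() for _ in range(count)]


backlog = [f"wu_{i}" for i in range(10)]
queue = ReadyQueue(capacity=4)
queue.refill(backlog)          # feeder fills the 4 slots
sent = queue.take(wanted=6)    # scheduler can only send what is ready
print(sent)
```

In this model a request can never receive more than the queue holds, so if the feeder falls behind, requests come back short even though the backlog is not empty.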

Whatever, my i7 has gained some on its quota. It's now over 700 out of its limit of 800, up from 6xx yesterday and around 500 last week. It's also devoid of Einsteins (probably paying off debt to Seti accrued when work wasn't available).

David
Sitting on my butt while others boldly go,
Waiting for a message from a small furry creature from Alpha Centauri.

ID: 1203744
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1203768 - Posted: 8 Mar 2012, 17:06:30 UTC - in response to Message 1203674.  
Last modified: 8 Mar 2012, 17:08:29 UTC

Well, I see nothing really got fixed today with dear ol' Synergy in its new duties. The crickets are still slouching, even though kibble bowls are not full.

I have sloooooooowly gained a bit on cache, but still am about 3000 WUs shy of the limits, and have several rigs doing GPU only, because of the silly server bug that seems to avoid sending any CPU work unless the GPU is up to the limits.
And so, the CPUs sit idle.
Vexing bit of a perfect storm.

But the kitties pad along.


Thank you for that.

From an old post I made, I can see I've had this problem for at least 12 days now... Surprised I missed it being mentioned. It was driving me crazy 'cause I was sure I messed something up when I signed up for Beta.


You could try using the website setting to get no GPU work for some time, so your CPU cache fills, and then re-enable GPU work fetch.

Thank you so much for the suggestion, worked like a charm!
The kitties now have their CPUs crunching away again.

Now....you might persuade D.A. to amend the scheduler programming so this work around is not required. Instead of totally ignoring the slower resource on a computer, perhaps it could try to give just a few percent to the slower resource while still heavily favoring the faster GPUs. Instead of starving the slower CPUs altogether. Just a thought...LOL.
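
One way to picture this suggestion is a split that reserves a small floor for the slower resource. This is a hypothetical sketch, not the BOINC scheduler's logic; `split_tasks`, `cpu_floor`, and the numbers are assumptions chosen for illustration.

```python
from typing import Tuple


def split_tasks(available: int, cpu_wanted: int, gpu_wanted: int,
                cpu_floor: float = 0.05) -> Tuple[int, int]:
    """Split `available` tasks between CPU and GPU.

    The GPU is favoured, but the CPU is reserved roughly `cpu_floor` of the
    batch (never less than one task) whenever it asked for work.
    Returns (cpu_tasks, gpu_tasks).
    """
    cpu_reserved = 0
    if cpu_wanted > 0 and available > 0:
        cpu_reserved = min(cpu_wanted, max(1, int(available * cpu_floor)))
    gpu_tasks = min(gpu_wanted, available - cpu_reserved)
    # Anything the GPU did not need goes to the CPU.
    cpu_tasks = min(cpu_wanted, available - gpu_tasks)
    return cpu_tasks, gpu_tasks


print(split_tasks(available=20, cpu_wanted=100, gpu_wanted=400))
```

With a batch of 20 and a 5% floor, the GPU still takes almost everything, but a mixed request no longer leaves the CPU with nothing.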

(Now I just have to remember to set it back before I go to work...or I'll really have to kick myself...)
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1203768
cliff
Joined: 16 Dec 07
Posts: 625
Credit: 3,590,440
RAC: 0
United Kingdom
Message 1203779 - Posted: 8 Mar 2012, 17:41:42 UTC - in response to Message 1203758.  

Don't tell me the sun has hidden his head again? Anyway, feel free to whinge as much as you please; nearly everyone else is :-)

How long do you think it will take to start putting out APs again? Days, weeks, months? That's as good a place as any to start.

Cheers,
Cliff,
Been there, Done that, Still no damm T shirt!
ID: 1203779
Kevin Olley
Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1203780 - Posted: 8 Mar 2012, 17:42:56 UTC - in response to Message 1203768.  


Now....you might persuade D.A. to amend the scheduler programming so this work around is not required. Instead of totally ignoring the slower resource on a computer, perhaps it could try to give just a few percent to the slower resource while still heavily favoring the faster GPUs. Instead of starving the slower CPUs altogether. Just a thought...LOL.


Now that sounds like a very good idea. If the download server cache is getting clogged with VLARs and the server is only trying to hand out GPU WUs, that could slow things down; we need the servers running as smoothly as possible.



Kevin


ID: 1203780
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1203784 - Posted: 8 Mar 2012, 18:00:23 UTC - in response to Message 1203780.  


Now....you might persuade D.A. to amend the scheduler programming so this work around is not required. Instead of totally ignoring the slower resource on a computer, perhaps it could try to give just a few percent to the slower resource while still heavily favoring the faster GPUs. Instead of starving the slower CPUs altogether. Just a thought...LOL.


Now that sounds like a very good idea. If the download server cache is getting clogged with VLARs and the server is only trying to hand out GPU WUs, that could slow things down; we need the servers running as smoothly as possible.



Of course. Don't get me wrong, the idea of prioritizing workfetch for the best computer resource is a good one. It just should not be done with the TOTAL exclusion of repeated requests for CPU work.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1203784
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1203795 - Posted: 8 Mar 2012, 18:26:43 UTC
Last modified: 8 Mar 2012, 18:27:20 UTC

As far as I understand it the cruncher decides what it is going to ask for and the server decides if it can satisfy that request.
Thus if you are requesting "GPU only" and there are no suitable WU, you will get the dreaded "no tasks available" message, or one of its variants. That said, if you are asking for "any WU I can crunch", preference will be given to satisfying that particular cruncher's "best" processor's needs.
At the server level there is no such thing as a "GPU WU", but there are types of WU that a particular GPU cannot handle; this is managed by a combination of the cruncher and the server.
In general crunchers running "more modern" versions of BOINC will be sending back their current cache states so the servers know if the limits are being infringed or not.
Assuming that a cruncher is running a reasonably recent version of BOINC the servers will only be reacting to requests for WU that can be crunched, and only enough to satisfy either the imposed limits, or the locally defined cache, and only WU that the cruncher is capable of processing will be delivered...
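
The request-and-reply model described in this post can be written as a simple cap. This is a toy sketch under stated assumptions: `tasks_to_send` and its parameters are hypothetical names, and the real server logic is not shown anywhere in this thread.

```python
def tasks_to_send(requested: int, in_progress: int, limit: int, ready: int) -> int:
    """Number of tasks a server could send for one request (toy model).

    The reply is capped by what the host asked for, by the room left under
    the per-host limit, and by how many suitable tasks are ready to send.
    """
    room_under_limit = max(0, limit - in_progress)
    return max(0, min(requested, room_under_limit, ready))


# A host holding 700 tasks under a limit of 800 asks for 150 more.
granted = tasks_to_send(requested=150, in_progress=700, limit=800, ready=1000)
print(granted)
```

Under this model the host is granted only the 100 tasks that fit under its limit, regardless of how many it requested.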
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1203795
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1203796 - Posted: 8 Mar 2012, 18:30:59 UTC - in response to Message 1203795.  

As far as I understand it the cruncher decides what it is going to ask for and the server decides if it can satisfy that request.
Thus if you are requesting "GPU only" and there are no suitable WU, you will get the dreaded "no tasks available" message, or one of its variants. That said, if you are asking for "any WU I can crunch", preference will be given to satisfying that particular cruncher's "best" processor's needs.
At the server level there is no such thing as a "GPU WU", but there are types of WU that a particular GPU cannot handle; this is managed by a combination of the cruncher and the server.
In general crunchers running "more modern" versions of BOINC will be sending back their current cache states so the servers know if the limits are being infringed or not.
Assuming that a cruncher is running a reasonably recent version of BOINC the servers will only be reacting to requests for WU that can be crunched, and only enough to satisfy either the imposed limits, or the locally defined cache, and only WU that the cruncher is capable of processing will be delivered...

Not quite true....
My top host was making work requests for CPU AND GPU. And yet getting NOTHING for the CPU at all. The CPUs were idle. It was not until I unchecked GPUs in the preferences that BOINC made work requests for CPU only, and the work then started coming in. The mixed requests were not getting any work for the CPU whatsoever.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1203796
Kevin Olley
Joined: 3 Aug 99
Posts: 906
Credit: 261,085,289
RAC: 572
United Kingdom
Message 1203797 - Posted: 8 Mar 2012, 18:35:09 UTC - in response to Message 1203796.  


Not quite true....
My top host was making work requests for CPU AND GPU. And yet getting NOTHING for the CPU at all. The CPUs were idle. It was not until I unchecked GPUs in the preferences that BOINC made work requests for CPU only, and the work then started coming in. The mixed requests were not getting any work for the CPU whatsoever.


+1


Kevin


ID: 1203797
Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1203799 - Posted: 8 Mar 2012, 18:38:11 UTC - in response to Message 1203796.  
Last modified: 8 Mar 2012, 18:41:51 UTC

My top host was making work requests for CPU AND GPU. And yet getting NOTHING for the CPU at all. The CPUs were idle. It was not until I unchecked GPUs in the preferences that BOINC made work requests for CPU only, and the work then started coming in. The mixed requests were not getting any work for the CPU whatsoever.

I'm observing the same annoying behavior on my host. Combined CPU+GPU requests only start getting some CPU work when cache is near full, otherwise I'm getting GPU work only.
When Astropulse was still operational, I sometimes got 1-2 AP CPU workunits during those combined requests, even when cache was nearly empty.

Also, to make matters much, much worse, if you let your host "bounce" off the limits, the cache never gets full enough and you get no CPU work at all.
ID: 1203799
musicplayer
Joined: 17 May 10
Posts: 2430
Credit: 926,046
RAC: 0
Message 1203800 - Posted: 8 Mar 2012, 18:38:50 UTC
Last modified: 8 Mar 2012, 18:43:09 UTC

I have noticed that with the two most recent versions of BOINC Manager I am using, 6.10.58 and particularly 6.12.34, when attaching to a project (be it Seti@home or PrimeGrid), the server gives me both CUDA-based and non-CUDA tasks. (PrimeGrid has more options than Seti@home for setting preferences on the types of tasks being run.)

Anyway, I get both types and especially 6.12.34 towards PrimeGrid gives me a whole bunch of new tasks.

It is only when I set the CUDA option active, after finishing setup and making sure it works, that these tasks eventually start running.

My guess is that the server looks at the version of BOINC Manager being used first, then next what the capabilities of the PC being used are, including the processor.
ID: 1203800


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.