Panic Mode On (83) Server Problems?

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1364556 - Posted: 4 May 2013, 20:28:38 UTC

It looks like there have been no new AP WUs for 3 days?

Is there a reason why the AP splitters are:
Not Running
Program failed or ran out of work
(or the project is down)


Maybe the current tapes have no useful AP data on them for crunching?


* Best regards! :-) * Philip J. Fry, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1364556 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1364557 - Posted: 4 May 2013, 20:29:28 UTC - in response to Message 1364550.  

if you can't even get CPU work units (and I am having the same problem here) where are all the work units going.


When Who or What makes the decision to assign a task to

1. CPU
2. ATI
3. Nvidia

Round-robin, based on request, random, or some persnickety algorithm?

Your BOINC Manager makes the decision and it will fill your fastest resources first before your slowest ones.

Cheers.

I thought it was the Scheduler that assigned the tasks to the appropriate resource, depending on what was requested by your BOINC. If both CPU and GPU tasks are requested, usually the GPU is the more efficient resource so that request gets filled first, then the CPU request, to the extent there are tasks available in the feeder. The one exception is VLARs are not sent to NVidia GPUs, since many of the older models do not handle those tasks very well, and the Scheduler has not been updated to make that distinction.
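To make that description concrete, here is a minimal sketch of the fill order as I understand it. It is not the actual BOINC scheduler code; the `Task` class, the `fill_requests` function and the `FILL_ORDER` tuple are all made-up names for illustration, and the only rules modelled are the two mentioned above (GPU requests before CPU requests, no VLARs to NVIDIA).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    is_vlar: bool = False  # very-low-angle-range task


# Hypothetical priority order: GPU requests are filled before CPU requests.
FILL_ORDER = ("nvidia", "ati", "cpu")


def fill_requests(feeder, requests):
    """Hand out tasks from the feeder to the requested resources.

    feeder:   list of Task objects currently available
    requests: dict mapping resource name to number of tasks requested
    Returns a dict mapping resource name to the list of tasks assigned.
    """
    available = list(feeder)
    assigned = {resource: [] for resource in requests}
    for resource in FILL_ORDER:
        wanted = requests.get(resource, 0)
        for task in list(available):
            if len(assigned.get(resource, [])) >= wanted:
                break  # this request is satisfied (or was never made)
            if resource == "nvidia" and task.is_vlar:
                continue  # VLARs are not sent to NVIDIA GPUs
            assigned[resource].append(task)
            available.remove(task)
    return assigned
```

With a feeder holding two VLARs and two normal tasks, a host asking for two NVIDIA and two CPU tasks would get the normal tasks on the GPU and the VLARs on the CPU, to the extent the feeder has them.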
Donald
Infernal Optimist / Submariner, retired
ID: 1364557 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364559 - Posted: 4 May 2013, 20:31:25 UTC - in response to Message 1364549.  

As I understand it, the splitters are running at about the same rate as they were before the move,

Nah, not even close.
Before the move (and for a week or 2 after it) the splitters were able to produce 50+/s. That would allow them to provide work as well as build up a ready-to-send buffer. Once the buffer was full (around 300,000 WUs) they'd shut down. For the last couple of weeks, they've only been able to produce 30/s (which is barely enough when the work is predominantly VLARs; put just a few shorties in there & it's nowhere near enough. Shorty storm: forget it).
There was actually a spike there today of 60/s, but that was just a glitch. 30/s (or less) has been it for a couple of weeks now.
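
As a rough back-of-the-envelope check on those numbers, the sketch below works out how long the ready-to-send buffer takes to fill. The 50/s and 30/s production rates and the 300,000 WU buffer are the figures above; the demand of 30/s is only an assumption, inferred from 30/s being "barely enough", and the function name `hours_to_fill` is made up.

```python
def hours_to_fill(buffer_size, production_per_s, demand_per_s):
    """Hours needed to fill the ready-to-send buffer from empty.

    Returns None when production does not exceed demand, because the
    buffer then never fills.
    """
    net_per_s = production_per_s - demand_per_s
    if net_per_s <= 0:
        return None
    return buffer_size / net_per_s / 3600


# Assumed demand of 30 WU/s (not a measured figure).
print(hours_to_fill(300_000, 50, 30))  # about 4.2 hours
print(hours_to_fill(300_000, 30, 30))  # None: the buffer never builds up
```

So at 50/s the buffer would rebuild in a few hours, while at 30/s there is nothing left over to build it at all, which fits what we are seeing.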
Grant
Darwin NT
ID: 1364559 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364561 - Posted: 4 May 2013, 20:33:23 UTC - in response to Message 1364549.  

Actually all kidding aside on this, if you can't even get CPU work units (and I am having the same problem here) where are all the work units going. The outbound pipe is running at twice the capacity of before yet many of us are starved for work. Anyone care to speculate or am I missing something obvious here?

there are other servers, non seti, connected to that switch so the outbound traffic is meaningless now...

In this message, Josef Segur says otherwise.

Depends what people mean by inbound & outbound. Is it from our perspective, or the routers?

Same as the last router- outbound on the Cricket graphs is inbound to the project (Scheduler requests, return of completed WUs). Inbound on the Cricket graphs is outbound from the project (our downloads).

My point was to counter rottenmutt's statement that there are non-seti servers on that router - Joe says not so.

Ah.
I'll go with Joe, it's just Seti traffic.

Grant
Darwin NT
ID: 1364561 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1364563 - Posted: 4 May 2013, 20:33:55 UTC - in response to Message 1364556.  
Last modified: 4 May 2013, 20:34:17 UTC

It looks like there have been no new AP WUs for 3 days?

Is there a reason why the AP splitters are:
Not Running
Program failed or ran out of work
(or the project is down)


Maybe the current tapes have no useful AP data on them for crunching?

Yes, per the Server Status Page, all the tapes available to the splitters have already been processed for AP tasks. Until more tapes are loaded, only AP resends will be available.
Donald
Infernal Optimist / Submariner, retired
ID: 1364563 · Report as offensive
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1364565 - Posted: 4 May 2013, 20:35:58 UTC

Please help! I need some GPU WUs to crunch, it's cold here and the GPUs are running empty...
ID: 1364565 · Report as offensive
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1364566 - Posted: 4 May 2013, 20:40:47 UTC - in response to Message 1364559.  

As I understand it, the splitters are running at about the same rate as they were before the move,

Nah, not even close.
Before the move (and for a week or 2 after it) the splitters were able to produce 50+/s. That would allow them to provide work as well as build up a ready-to-send buffer. Once the buffer was full (around 300,000 WUs) they'd shut down. For the last couple of weeks, they've only been able to produce 30/s (which is barely enough when the work is predominantly VLARs; put just a few shorties in there & it's nowhere near enough. Shorty storm: forget it).
There was actually a spike there today of 60/s, but that was just a glitch. 30/s (or less) has been it for a couple of weeks now.

Okay, I haven't been paying that close attention to the production rate. Maybe that is one of the "throttles" Matt spoke of. Yes, I do wish he or Jeff or Eric would confirm that, so we can stop speculating and arguing about things we can't be certain about ....
Donald
Infernal Optimist / Submariner, retired
ID: 1364566 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1364569 - Posted: 4 May 2013, 20:45:27 UTC - in response to Message 1364563.  
Last modified: 4 May 2013, 20:46:57 UTC

Philip J. Fry wrote:
It looks like there have been no new AP WUs for 3 days?

Is there a reason why the AP splitters are:
Not Running
Program failed or ran out of work
(or the project is down)


Maybe the current tapes have no useful AP data on them for crunching?

Donald L. Johnson wrote:
Yes, per the Server Status Page, all the tapes available to the splitters have already been processed for AP tasks. Until more tapes are loaded, only AP resends will be available.

But it looks strange.
Every time I look at the server status page, the AP splitters are 'Not Running'.
My BOINC hasn't received any new AP WUs for 3 days. Normally it gets ~5 AP WUs every day.


* Best regards! :-) * Philip J. Fry, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1364569 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364570 - Posted: 4 May 2013, 20:46:38 UTC - in response to Message 1364559.  
Last modified: 4 May 2013, 20:47:37 UTC

As I understand it, the splitters are running at about the same rate as they were before the move,

Nah, not even close.
Before the move (and for a week or 2 after it) the splitters were able to produce 50+/s. That would allow them to provide work as well as build up a ready-to-send buffer. Once the buffer was full (around 300,000 WUs) they'd shut down. For the last couple of weeks, they've only been able to produce 30/s (which is barely enough when the work is predominantly VLARs; put just a few shorties in there & it's nowhere near enough. Shorty storm: forget it).
There was actually a spike there today of 60/s, but that was just a glitch. 30/s (or less) has been it for a couple of weeks now.


Unfortunately the resolution of the long term graphs makes it impossible to see the spikes & troughs that used to be the splitter output (run till buffer full, then shut down).
But what can be seen is the average turnaround time. Can anyone remember when the server-side limits came in?
The average turn around time used to be 80-100 hours. Around Dec it dropped to around 50hrs. Lately it's been around 30hrs.

In the past, those sustained 50+/s outputs from the splitters were only for a couple of hours, then the buffer would be full & they'd shut down. Not knowing the underlying mechanism for splitting work, it's possible the present sustained load is resulting in the splitters choking (as someone posted previously, there are issues with disk I/O limitations). But even allowing for that, to me it still doesn't explain why we don't at least get several hours of high output after the outages.
Grant
Darwin NT
ID: 1364570 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364573 - Posted: 4 May 2013, 20:49:48 UTC - in response to Message 1364569.  

But it looks strange.
Every time I look at the server status page, the AP splitters are 'Not Running'.
My BOINC hasn't received any new AP WUs for 3 days. Normally it gets ~5 AP WUs every day.

You will only get work if work is being produced.
For work to be produced the splitters have to be running.
They can't run if there isn't any data for them to process.
There isn't any data for them to process, so they're not running.

Grant
Darwin NT
ID: 1364573 · Report as offensive
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1364578 - Posted: 4 May 2013, 20:58:43 UTC - in response to Message 1364573.  

Philip J. Fry wrote:
But it looks strange.
Every time I look at the server status page, the AP splitters are 'Not Running'.
My BOINC hasn't received any new AP WUs for 3 days. Normally it gets ~5 AP WUs every day.

Grant (SSSF) wrote:
You will only get work if work is being produced.
For work to be produced the splitters have to be running.
They can't run if there isn't any data for them to process.
There isn't any data for them to process, so they're not running.

And why isn't there any data for them to process? ;-)

The tapes haven't had any useful AP data on them for 3 days?


* Best regards! :-) * Philip J. Fry, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1364578 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364583 - Posted: 4 May 2013, 21:12:54 UTC - in response to Message 1364578.  

The tapes haven't had any useful AP data on them for 3 days?

Nope.
New ones have yet to be loaded.

Grant
Darwin NT
ID: 1364583 · Report as offensive
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1364584 - Posted: 4 May 2013, 21:13:31 UTC
Last modified: 4 May 2013, 21:13:54 UTC

AP splitters 2 and 3 are running, but all tapes are already marked as done.
That means they were sent as soon as they were split.


With each crime and every kindness we birth our future.
ID: 1364584 · Report as offensive
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1364586 - Posted: 4 May 2013, 21:17:16 UTC - in response to Message 1364578.  

Philip J. Fry wrote:
But it looks strange.
Every time I look at the server status page, the AP splitters are 'Not Running'.
My BOINC hasn't received any new AP WUs for 3 days. Normally it gets ~5 AP WUs every day.

Grant (SSSF) wrote:
You will only get work if work is being produced.
For work to be produced the splitters have to be running.
They can't run if there isn't any data for them to process.
There isn't any data for them to process, so they're not running.

And why isn't there any data for them to process? ;-)

The tapes haven't had any useful AP data on them for 3 days?

Philip, for all the time that you have been here I thought that you would be able to read the Server Status page and therefore understand the reason without having to ask.

APs are split off the same files that MBs are, but the APs are split off much faster than the MBs, since there are a lot fewer APs coming from 1 file than there are MBs.

If you had looked at the files to be split over the last few days, you would have noticed a lot of files there that had already finished splitting APs but were still waiting to complete MB splitting.

Now that those files are almost finished, I expect that new files will start to be loaded again, and APs will be split very quickly and will have to wait for the MBs to catch up again.

Cheers.
ID: 1364586 · Report as offensive
rob smith
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22190
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1364587 - Posted: 4 May 2013, 21:19:28 UTC

Each tape holds the same theoretical number of APs, but the AP splitters get through a tape much faster than the MB splitters for a number of reasons, including that the split is simpler (there is substantially less overlap, or none?) and the bandwidth per WU is much higher (which means fewer AP WUs are split from each tape). The new MB splitters appear to be much slower than the old ones - are they stress testing them?
The last tape load was "quite substantial", and there haven't been any new tapes loaded for a few days - much the same happened last week, where the splitters were down to the very tail of the last tape before any new tapes were loaded. At least with the 24/7 support in the coloc there should be someone around to load the new tapes when the time is due (provided the project has sent a few down the hill).
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1364587 · Report as offensive
Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1364595 - Posted: 4 May 2013, 21:30:16 UTC - in response to Message 1364587.  

Personally I think that they could drop 2 AP splitters and move them over to MB splitting which would help balance the production of the 2 a bit better.

Cheers.
ID: 1364595 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364598 - Posted: 4 May 2013, 21:52:30 UTC - in response to Message 1364595.  

Personally I think that they could drop 2 AP splitters and move them over to MB splitting which would help balance the production of the 2 a bit better.

Good idea.

Grant
Darwin NT
ID: 1364598 · Report as offensive
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1364602 - Posted: 4 May 2013, 22:05:32 UTC
Last modified: 4 May 2013, 22:10:39 UTC

This last outage of mine was partially caused by me draining my queues to update to .64, which finished after the well went dry. When I did get units, I got CPU topped off first and then GPU several hours later. Usually it's the other way around, but I believe I know why it went this way in this case.

I believe this happened because 1) some units can only be done on CPU since the GPU apps have problems; 2) GPU units are in very high demand due to the current trend of everyone using GPUs to crunch; and 3) the reserve of units is gone and 30/s doesn't cut it. This leads to a situation where there may be CPU-only units available while GPU work is always in extremely short supply.

As it stands now, I need a GPU unit only every 40-50 minutes on average based on the mix of long and short units, so I am usually capped off at my 100. However, if your needs are considerably higher, as in one every few minutes, your GPU is likely to run low/out while CPU can still be filled by CPU-only units.
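
To put rough numbers on that, here is a small sketch of how long a full GPU cache lasts. The limit of 100 tasks and the 40-50 minute figure are from my own host as described above; the 3 minutes per task for a fast host is just an assumed stand-in for "a few minutes", and `cache_hours` is a made-up helper name.

```python
def cache_hours(task_limit, minutes_per_task):
    """Hours of GPU work held in a full cache of task_limit tasks."""
    return task_limit * minutes_per_task / 60


# My host: about 45 minutes per GPU task, capped at 100 tasks.
print(cache_hours(100, 45))  # 75.0 hours

# A much faster host, assuming 3 minutes per task (illustrative only).
print(cache_hours(100, 3))   # 5.0 hours
```

With the same 100-task limit, the slow host can ride out days of short supply while the fast one is dry within hours.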
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1364602 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364607 - Posted: 4 May 2013, 22:21:59 UTC - in response to Message 1364606.  

Yeah well, I can always switch to Beta if I see ice forming on my CPU and GPU :-)

Isn't it meant to be almost summer there in the North?

Grant
Darwin NT
ID: 1364607 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1364614 - Posted: 4 May 2013, 22:53:37 UTC - in response to Message 1364608.  

Yeah well, I can always switch to Beta if I see ice forming on my CPU and GPU :-)

Isn't it meant to be almost summer there in the North?


Yes indeed, the ice, was just a metaphor for a very cold computer.

Anything below 25°C is very cold.
Grant
Darwin NT
ID: 1364614 · Report as offensive
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.