Panic Mode On (83) Server Problems?


Message boards : Number crunching : Panic Mode On (83) Server Problems?

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,643,575
RAC: 47,602
Australia
Message 1364598 - Posted: 4 May 2013, 21:52:30 UTC - in response to Message 1364595.

Personally I think that they could drop 2 AP splitters and move them over to MB splitting which would help balance the production of the 2 a bit better.

Good idea.

____________
Grant
Darwin NT.

Keith White
Joined: 29 May 99
Posts: 370
Credit: 2,898,665
RAC: 2,449
United States
Message 1364602 - Posted: 4 May 2013, 22:05:32 UTC
Last modified: 4 May 2013, 22:10:39 UTC

During this last outage of mine, which was partially caused by me draining my queues to update to .64 and which finished after the well went dry, when I did get units I got CPU topped off first and then GPU several hours later. Usually it's the other way around, but I believe I know why it happened this way in this case.

I believe this happened because 1) some units can only be done on CPU since the GPU apps have problems; 2) GPU units are in very high demand due to the current trend of everyone using GPUs to crunch; and 3) the reserve of units is gone and 30/s doesn't cut it. This leads to a situation where there may be CPU-only units available while GPU is always in extremely short supply.

As it stands now, I need a GPU unit only every 40-50 minutes on average, based on the mix of long and short units, so I am usually capped off at my 100. However, if your needs are considerably higher, as in one every few minutes, your GPU is likely to run low or out while CPU can still be filled by CPU-only units.
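For anyone wanting to put numbers on that, here is a rough sketch of how long a full cache lasts at a given consumption rate. The function name is made up for illustration, and the 3-minute figure is only an assumed stand-in for "a few minutes"; the 100-task cap and the 40-50 minute range are the ones mentioned above.

```python
def cache_hours(task_limit: int, minutes_per_task: float) -> float:
    """Hours a full cache lasts if one task is consumed every minutes_per_task."""
    return task_limit * minutes_per_task / 60.0


# 100-task cap, one GPU unit consumed every 40 to 50 minutes
print(f"40 min/unit: {cache_hours(100, 40):.1f} h")
print(f"50 min/unit: {cache_hours(100, 50):.1f} h")

# Assumed faster host: one GPU unit every 3 minutes (hypothetical figure)
print(f" 3 min/unit: {cache_hours(100, 3):.1f} h")
```

So a slow consumer can ride out a dry spell of a few days on a full cache, while a host that burns through a unit every few minutes is empty within hours.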
____________
"Life is just nature's way of keeping meat fresh." - The Doctor

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3590
Credit: 20,783,091
RAC: 23,953
Sweden
Message 1364606 - Posted: 4 May 2013, 22:19:33 UTC

Well, this ain't looking good. No more tapes are being added, no AP's available for days now. In a few hours my main cruncher will be bone dry, if no more tapes are added, and if they don't put a turbo on the sluggish splitters.

Yeah well, I can always switch to Beta if I see ice forming on my CPU and GPU :-)
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,643,575
RAC: 47,602
Australia
Message 1364607 - Posted: 4 May 2013, 22:21:59 UTC - in response to Message 1364606.

Yeah well, I can always switch to Beta if I see ice forming on my CPU and GPU :-)

Isn't it meant to be almost summer there in the North?

____________
Grant
Darwin NT.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3590
Credit: 20,783,091
RAC: 23,953
Sweden
Message 1364608 - Posted: 4 May 2013, 22:28:55 UTC - in response to Message 1364607.

Yeah well, I can always switch to Beta if I see ice forming on my CPU and GPU :-)

Isn't it meant to be almost summer there in the North?


Yes indeed, the ice was just a metaphor for a very cold computer.
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,643,575
RAC: 47,602
Australia
Message 1364614 - Posted: 4 May 2013, 22:53:37 UTC - in response to Message 1364608.

Yeah well, I can always switch to Beta if I see ice forming on my CPU and GPU :-)

Isn't it meant to be almost summer there in the North?


Yes indeed, the ice was just a metaphor for a very cold computer.

Anything below 25°C is very cold.
____________
Grant
Darwin NT.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3590
Credit: 20,783,091
RAC: 23,953
Sweden
Message 1364615 - Posted: 4 May 2013, 22:54:19 UTC
Last modified: 4 May 2013, 22:55:41 UTC

Oh my G, all of a sudden a whole truckload of new files to split came online. MB and AP, to no end.

Edit: Life is good again.

LOL
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,643,575
RAC: 47,602
Australia
Message 1364621 - Posted: 4 May 2013, 23:17:00 UTC - in response to Message 1364615.


Network traffic went from 186Mb/s up to over 300Mb/s.
____________
Grant
Darwin NT.

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3590
Credit: 20,783,091
RAC: 23,953
Sweden
Message 1364624 - Posted: 4 May 2013, 23:22:58 UTC - in response to Message 1364621.


Network traffic went from 186Mb/s up to over 300Mb/s.


Yeah, but so far it hasn't become any easier to get any new work, MB or AP. I'm afraid that with all the new super GPU's, we have reached the point where the demand is far far greater than what the project can ever supply. I fear that this is the new "normal".


____________

Lionel
Joined: 25 Mar 00
Posts: 576
Credit: 236,244,302
RAC: 231,922
Australia
Message 1364630 - Posted: 4 May 2013, 23:36:52 UTC - in response to Message 1364533.

Is anyone getting any work? Or am I alone in this respect?


no you're not ... getting zip here myself mate ...



____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,643,575
RAC: 47,602
Australia
Message 1364647 - Posted: 5 May 2013, 0:43:49 UTC - in response to Message 1364630.

Is anyone getting any work? Or am I alone in this respect?


no you're not ... getting zip here myself mate ...

As mentioned before, I'm getting work, but only on every 3rd to 10th request.

____________
Grant
Darwin NT.

Starman
Joined: 15 May 99
Posts: 134
Credit: 38,439,891
RAC: 35,455
Canada
Message 1364656 - Posted: 5 May 2013, 2:40:29 UTC

Not getting too many here, especially AP v6 OpenCL. Those are the main diet for the main cruncher, and that cupboard has been bare since Thursday; it has only received 3 since Apr. 30. One of my other crunchers has managed to get 11 in the past 2 days and has 38 of them. I guess it's all in the timing of the request.
____________

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5868
Credit: 60,643,575
RAC: 47,602
Australia
Message 1364658 - Posted: 5 May 2013, 3:05:23 UTC - in response to Message 1364656.

I guess it's all in the timing of the request.

Yep.
____________
Grant
Darwin NT.

ExchangeMan
Volunteer tester
Joined: 9 Jan 00
Posts: 113
Credit: 143,584,722
RAC: 206,609
United States
Message 1364659 - Posted: 5 May 2013, 3:06:53 UTC

In the last half hour my wish was finally granted. Got a nice load of CPU, GPU and AP! Now that AP splitters are running, maybe things will get back to normal.
____________

Donald L. Johnson
Project donor
Joined: 5 Aug 02
Posts: 6265
Credit: 740,672
RAC: 1,171
United States
Message 1364660 - Posted: 5 May 2013, 3:08:43 UTC - in response to Message 1364621.

Network traffic went from 186Mb/s up to over 300Mb/s.

Yeah, probably due to APs being split again, and sent out as soon as they hit the feeder.....
____________
Donald
Infernal Optimist / Submariner, retired

Donald L. Johnson
Project donor
Joined: 5 Aug 02
Posts: 6265
Credit: 740,672
RAC: 1,171
United States
Message 1364661 - Posted: 5 May 2013, 3:17:42 UTC - in response to Message 1364586.

Philip J. Fry wrote:
But it looks strange.
Every time I look at the server status page, the AP splitters are 'Not Running'.
My BOINC hasn't received any new AP WUs for 3 days. Normally it gets ~5 AP WUs every day.

Grant (SSSF) wrote:
You will only get work if work is being produced.
For work to be produced the splitters have to be running.
They can't run if there isn't any data for them to process.
There isn't any data for them to process, so they're not running.

And why isn't there any data for them to process? ;-)

The tapes haven't had useful AP data in them for 3 days?

Philip, for all the time that you have been here, I thought that you would be able to read the Server Status page and therefore understand the reason without having to ask.

AP's are split off the same files that MB's are, but the AP's are split off much faster than the MB's, since far fewer AP's than MB's come from one file.

If you had looked at the files to be split over the last few days, you would have noticed a lot of files there that had already finished splitting AP's but were still waiting to complete MB splitting.

Now that those files are almost finished, I expect that new files will start to be loaded again, AP's will be split very quickly again, and they will then have to wait for the MB's to catch up again.

Cheers.

And, just for the record, those data files are shown on the right-hand side of the Server Status page, which shows for each file whether it is being split for MB or AP, is waiting, or is Done.
____________
Donald
Infernal Optimist / Submariner, retired

Cosmic_Ocean
Joined: 23 Dec 00
Posts: 2292
Credit: 8,816,987
RAC: 3,997
United States
Message 1364687 - Posted: 5 May 2013, 6:58:35 UTC

I've gotten 19 APs today with probably about 100 requests. Was able to fill up my 10-day cache. Then I was updating my spreadsheet and noticed that I got assigned my 2500th AP for r557 on this CPU.
____________

Linux laptop uptime: 1484d 22h 42m
Ended due to UPS failure, found 14 hours after the fact

Mark Lybeck
Joined: 9 Aug 99
Posts: 209
Credit: 103,502,342
RAC: 78,777
Finland
Message 1364690 - Posted: 5 May 2013, 7:09:31 UTC

I do not seem to be getting any work for one of my computers.
Is BOINC able to detect and send work only to the fastest hosts if you have multiple hosts?

No GPU work. Recently my other computer got 97 new tasks...

05/05/2013 09:50:43 | SETI@home | [sched_op] Starting scheduler request
05/05/2013 09:50:43 | SETI@home | Sending scheduler request: To fetch work.
05/05/2013 09:50:43 | SETI@home | Reporting 1 completed tasks
05/05/2013 09:50:43 | SETI@home | Requesting new tasks for CPU
05/05/2013 09:50:43 | SETI@home | [sched_op] CPU work request: 84087.83 seconds; 0.00 devices
05/05/2013 09:50:43 | SETI@home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
05/05/2013 09:50:46 | SETI@home | Scheduler request completed: got 4 new tasks
05/05/2013 09:50:46 | SETI@home | [sched_op] Server version 701
05/05/2013 09:50:46 | SETI@home | Project requested delay of 303 seconds
05/05/2013 09:50:46 | SETI@home | [sched_op] estimated total CPU task duration: 33068 seconds
05/05/2013 09:50:46 | SETI@home | [sched_op] estimated total NVIDIA task duration: 0 seconds
05/05/2013 09:50:46 | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 03mr13ad.25526.11928.11.11.138_0
05/05/2013 09:50:46 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
05/05/2013 09:50:46 | SETI@home | [sched_op] Reason: requested by project
05/05/2013 09:50:48 | SETI@home | Started download of 02my11ae.26493.3909.10.11.84.vlar
05/05/2013 09:50:48 | SETI@home | Started download of 02my11ae.26493.3909.10.11.214.vlar
05/05/2013 09:50:48 | SETI@home | Started download of 02my11ae.26493.3909.10.11.120.vlar
05/05/2013 09:50:48 | SETI@home | Started download of 02my11ae.26493.3909.10.11.220.vlar
05/05/2013 09:50:52 | SETI@home | Finished download of 02my11ae.26493.3909.10.11.84.vlar
05/05/2013 09:50:52 | SETI@home | Finished download of 02my11ae.26493.3909.10.11.214.vlar
05/05/2013 09:50:52 | SETI@home | Finished download of 02my11ae.26493.3909.10.11.120.vlar
05/05/2013 09:50:52 | SETI@home | Finished download of 02my11ae.26493.3909.10.11.220.vlar
05/05/2013 09:55:51 | SETI@home | [sched_op] Starting scheduler request
05/05/2013 09:55:51 | SETI@home | Sending scheduler request: To fetch work.
05/05/2013 09:55:51 | SETI@home | Requesting new tasks for CPU
05/05/2013 09:55:51 | SETI@home | [sched_op] CPU work request: 52269.95 seconds; 0.00 devices
05/05/2013 09:55:51 | SETI@home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
05/05/2013 09:55:54 | SETI@home | Scheduler request completed: got 0 new tasks
05/05/2013 09:55:54 | SETI@home | [sched_op] Server version 701
05/05/2013 09:55:54 | SETI@home | No tasks sent
05/05/2013 09:55:54 | SETI@home | No tasks are available for SETI@home Enhanced
05/05/2013 09:55:54 | SETI@home | No tasks are available for Astropulse v505
05/05/2013 09:55:54 | SETI@home | No tasks are available for SETI@home v7
05/05/2013 09:55:54 | SETI@home | No tasks are available for AstroPulse v6
05/05/2013 09:55:54 | SETI@home | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
05/05/2013 09:55:54 | SETI@home | This computer has reached a limit on tasks in progress
05/05/2013 09:55:54 | SETI@home | Project has no tasks available
05/05/2013 09:55:54 | SETI@home | Project requested delay of 303 seconds
05/05/2013 09:55:54 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
05/05/2013 09:55:54 | SETI@home | [sched_op] Reason: requested by project
05/05/2013 10:02:42 | SETI@home | update requested by user
05/05/2013 10:02:46 | SETI@home | [sched_op] Starting scheduler request
05/05/2013 10:02:46 | SETI@home | Sending scheduler request: Requested by user.
05/05/2013 10:02:46 | SETI@home | Requesting new tasks for CPU and NVIDIA
05/05/2013 10:02:46 | SETI@home | [sched_op] CPU work request: 53948.89 seconds; 0.00 devices
05/05/2013 10:02:46 | SETI@home | [sched_op] NVIDIA work request: 233280.00 seconds; 2.00 devices
05/05/2013 10:02:49 | SETI@home | Scheduler request completed: got 0 new tasks
05/05/2013 10:02:49 | SETI@home | [sched_op] Server version 701
05/05/2013 10:02:49 | SETI@home | Project has no tasks available
05/05/2013 10:02:49 | SETI@home | Project requested delay of 303 seconds
05/05/2013 10:02:49 | SETI@home | [sched_op] Deferring communication for 5 min 3 sec
05/05/2013 10:02:49 | SETI@home | [sched_op] Reason: requested by project
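
A log like this is easier to read if each scheduler request is boiled down to "what was asked for, what came back". Below is a minimal sketch that does this for the `[sched_op]` lines shown above. The function name and the regular expressions are my own and assume the line format seen in this log; other BOINC client versions may word these lines differently.

```python
import re

# Patterns based only on the log lines quoted above (an assumption, not a BOINC spec)
REQUEST_RE = re.compile(r"\[sched_op\] (\w+) work request: ([\d.]+) seconds")
RESULT_RE = re.compile(r"Scheduler request completed: got (\d+) new tasks")


def summarize(log_text):
    """Return one (seconds_requested_by_resource, tasks_received) pair per request."""
    summaries = []
    current = {}
    for line in log_text.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            current[match.group(1)] = float(match.group(2))
            continue
        match = RESULT_RE.search(line)
        if match:
            summaries.append((current, int(match.group(1))))
            current = {}
    return summaries


SAMPLE = """\
05/05/2013 09:50:43 | SETI@home | [sched_op] CPU work request: 84087.83 seconds; 0.00 devices
05/05/2013 09:50:43 | SETI@home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
05/05/2013 09:50:46 | SETI@home | Scheduler request completed: got 4 new tasks
05/05/2013 10:02:46 | SETI@home | [sched_op] CPU work request: 53948.89 seconds; 0.00 devices
05/05/2013 10:02:46 | SETI@home | [sched_op] NVIDIA work request: 233280.00 seconds; 2.00 devices
05/05/2013 10:02:49 | SETI@home | Scheduler request completed: got 0 new tasks
"""

for requested, received in summarize(SAMPLE):
    print(requested, "->", received, "tasks")
```

Read this way, the log shows that the 09:50 and 09:55 requests asked for 0.00 seconds of NVIDIA work, so only CPU work was being requested at that point. Only the manual update at 10:02 asked for GPU work, and the reply to that one was "Project has no tasks available".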

____________

Mark Lybeck
Joined: 9 Aug 99
Posts: 209
Credit: 103,502,342
RAC: 78,777
Finland
Message 1364693 - Posted: 5 May 2013, 7:14:53 UTC

Hooray, just got 5 GPU WUs. Too bad that will only last for about 3 minutes.
____________

Sten-Arne
Volunteer tester
Joined: 1 Nov 08
Posts: 3590
Credit: 20,783,091
RAC: 23,953
Sweden
Message 1364726 - Posted: 5 May 2013, 9:42:17 UTC

I have almost filled my cache to the max (100) with APs now, for my Q8200/ ATI HD7870 machine. At the moment I have 73 APs onboard. 100 APs won't last long though, since the HD7870 is doing 2 AP's in around 1 hour, depending on % of blanking of course.

The Q8200 CPU is doing Beta only, and on Beta there are no problems getting WUs.

____________

Copyright © 2014 University of California