Panic Mode On (105) Server Problems?

Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1862669 - Posted: 21 Apr 2017, 8:40:25 UTC - in response to Message 1862667.  

You will get the 'Reached Limit' message if ANY one of your caches is full. It's a server bug.
ID: 1862669 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1862671 - Posted: 21 Apr 2017, 8:54:41 UTC - in response to Message 1862669.  

You will get the 'Reached Limit' message if ANY one of your caches is full. It's a server bug.

It's also the message you get when requesting GPU work and only Arecibo VLARs are available.
Grant
Darwin NT
ID: 1862671 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862681 - Posted: 21 Apr 2017, 11:40:42 UTC - in response to Message 1862596.  
Last modified: 21 Apr 2017, 11:51:23 UTC

Currently have been in an Arecibo VLAR storm for the past couple of hours. Been unable to maintain my allotment of GPU tasks with the restrictions on sending to Nvidia GPUs. So, my question is: why are we restricting Arecibo VLARs from being sent to Nvidia GPUs when it is entirely OK to send BLC VLARs to Nvidia GPUs? Same low angle ranges. What's the difference?


. . As a contrast, guppis take about 50 to 100% longer than an Arecibo normal AR task on a GPU, but Arecibo VLARs take even longer. Running SoG 2 at a time on my 1050ti, a NARA would take about 20 to 22 mins, BLC2/3/02/03/13/6/7 would take about 32 mins and BLC5/05 would take about 36 to 38 mins, but Arecibo VLARs would take roughly 40 to 44 mins. But as I understand it, in the "old days" Arecibo VLARs would play havoc with the earlier versions of CUDA and cause massive disruptions on the rigs running them, so they were blocked from being sent to Nvidia GPUs. Now with SoG there seems to be no such problem, well none that I have observed, nor Zalster or indeed others I believe. But the old block is still in place. Since there are still many rigs out there with older GPUs still running CUDA42 and CUDA50, I guess they will have to leave that block in place for a while longer yet. But I have in the past, with the help of reschedulers, moved such tasks onto the GPU during WU famines to keep them crunching without issues. So if you want to set up a CPU queue to get a cache of such tasks, there is no reason you cannot move them onto your GPUs running SoG.

Has anybody tested Arecibo VLARs on the SoG app at Beta, for example? Maybe the SoG app doesn't have the issues with Arecibo VLARs that the CUDA apps used to have. I really would like the restriction removed if possible.


. . The heck with Beta, I have run a few dozen of them on my GPUs with SoG in main. With complete success. Raistmer has tested them extensively with SoG I believe, and I recall Zalster saying he has crunched many of them as well. In fact if I could get the rescheduler to work under Linux I would be emptying my CPU queue by moving them all into the GPU queue and ending CPU crunching on Mi-Burrito, because they also work OK under CUDA80 I believe. But I think you are whistling in the wind if you want that restriction removed, not while there are still so many "passive" hosts out there still running older apps. The effects would be very counterproductive from what I understand.

Stephen

:)
ID: 1862681 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34748
Credit: 261,360,520
RAC: 489
Australia
Message 1862682 - Posted: 21 Apr 2017, 11:44:01 UTC

No probs getting work here. ;-)

But it would be nicer if I could get paired with faster wingmen. :-D

Cheers.
ID: 1862682 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862686 - Posted: 21 Apr 2017, 11:55:32 UTC - in response to Message 1862610.  

Raistmer would know. You could try sending a PM and see what he says.


. . Hmmm I can recall you telling me that you had run many Arecibo VLARs on your GPUs using SoG, or was that all just hot air? :)

Stephen

?? :)
ID: 1862686 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862687 - Posted: 21 Apr 2017, 12:02:29 UTC - in response to Message 1862617.  
Last modified: 21 Apr 2017, 12:03:04 UTC

Last time I checked on any Beta work, they were trying to tailor the SoG app for low end GPUs. I didn't do many of those work units since I gave away my last 750 some time back. There was a thought to release uncompressed work units and we tested those. No problem, but they decided to continue with current sized work units.


. . I think the doubling of the transfer load on schedulers was the consideration that terminated that effort. While those tasks (4 bit WUs) crunched in about the same time as 2 bit units, the d/l files were twice the size. I was disappointed because the theory behind them was that they would allow a doubling of the precision/resolution of the crunching process. But network limitations will prevail.

Stephen

:(
ID: 1862687 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862688 - Posted: 21 Apr 2017, 12:22:57 UTC - in response to Message 1862647.  


I'm actually looking for GBT VLARs to crunch. I want them for the Ryzen CPU since they crunch the fastest. I actively reschedule the BLCs that get assigned to the GPUs over to the CPU, and take whatever normal Arecibo tasks get assigned to the CPU and move them to the GPUs. That way each processor gets to work on the most efficient type of task.

. . With you all the way on that one. I have even considered changing my nick to Stephen the Heretic :)

I can do the BLC VLARs on the Ryzen in under an hour, compared to the Arecibo CPU tasks, which take 1-1/4 to 1-1/2 hours. The BLC tasks on the GPU take 12-15 minutes two up, while the Arecibo normal AR tasks get done in 4-8 minutes two up.

. . On my i5 with the GTX950 I can do BLC tasks on the CPU in 50 to 58 mins (except for BLC5/05s that take about 65 mins), Arecibo VLARs in about 72 to 75 mins and normal AR Arecibo tasks in about 78 to 83 mins. So I am also lamenting the lack of guppis for that machine. On the 950 (2 up) NARAs take 22 to 25 mins, BLCs take 30 to 35 (BLC5s take about 40) and Arecibo VLARs take about 45 and up, so I reserve that for times of desperation.

But without any BLC tasks on my system, there is nothing to reschedule. I'm down to 80 GPU tasks on Numbskull now. Should be 200.


. . We can only work with what the schedulers send us :(

Stephen

<shrug>
ID: 1862688 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862689 - Posted: 21 Apr 2017, 12:31:23 UTC - in response to Message 1862659.  

Yes, still having issues getting work. Down to under 70 tasks now on Numbskull. I couldn't pinpoint the exact time that this problem became very noticeable. It may have been after the outage. I got tasks in small dribs and drabs, not the usual 20 tasks in one shot that I get after an outage once the request queue gets reduced as all the crunchers slowly get refilled. It has become a case of attrition where there seems to be nothing but Arecibo VLARs coming out of the splitters, and they don't get assigned to Nvidia GPUs. CPUs have been at full cache level since the outage since there is no restriction of type.


. . If it cheers you up, Mr Kevvy has indicated he was contemplating modifying the rescheduler to allow the movement of Arecibo VLARs and guppis from the CPU queue to the GPU queue. Of course, he did add the caveat about finding the time ... :(

Stephen

??
ID: 1862689 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862690 - Posted: 21 Apr 2017, 12:38:09 UTC - in response to Message 1862682.  

No probs getting work here. ;-)

But it would be nicer if I could get paired with faster wingmen. :-D

Cheers.


. . Curses on slow wingmen, those pending queues are growing out of control :)

Stephen

:)
ID: 1862690 · Report as offensive
Profile JaundicedEye
Joined: 14 Mar 12
Posts: 5375
Credit: 30,870,693
RAC: 1
United States
Message 1862717 - Posted: 21 Apr 2017, 15:26:18 UTC - in response to Message 1862690.  

No probs getting work here. ;-)

But it would be nicer if I could get paired with faster wingmen. :-D

Cheers.


. . Curses on slow wingmen, those pending queues are growing out of control :)

Stephen

:)
Slow wingmen are at least returning results; a larger problem, I find, is the 'ghost' wingmen. They appear, are assigned work, then disappear.........

Some, I'm sure, are innocent newbies who are new to BOINC and try SETI but don't continue for whatever reason (it slows processing due to incorrect settings, etc.). Another group, I suspect, are testing new builds or installing on public machines without permission, and when discovered are removed.

These ghosts cause timeouts, delaying consensus results and slowing overall processing efficiency. Nothing can be done; they're just another nuisance that exists in everyone's work queue.

"Sour Grapes make a bitter Whine." <(0)>
ID: 1862717 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1862723 - Posted: 21 Apr 2017, 15:43:17 UTC - in response to Message 1862686.  

Raistmer would know. You could try sending a PM and see what he says.


. . Hmmm I can recall you telling me that you had run many Arecibo VLARs on your GPUs using SoG, or was that all just hot air? :)

Stephen

?? :)


I probably did but that was more than a year ago by now, hard to remember what we were doing back then. I'm sure if I spent the time going thru the thread I could find them but I have more important things to do this weekend, lol. Whatever the reason was, they decided not to allow them on GPUs on Main. Maybe that will change but for now it is what it is....
ID: 1862723 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1862730 - Posted: 21 Apr 2017, 17:08:34 UTC - in response to Message 1862681.  


. . The heck with Beta, I have run a few dozen of them on my GPUs with SoG in main. With complete success. Raistmer has tested them extensively with SoG I believe, and I recall Zalster saying he has crunched many of them as well. In fact if I could get the rescheduler to work under Linux I would be emptying my CPU queue by moving them all into the GPU queue and ending CPU crunching on Mi-Burrito, because they also work OK under CUDA80 I believe. But I think you are whistling in the wind if you want that restriction removed, not while there are still so many "passive" hosts out there still running older apps. The effects would be very counterproductive from what I understand.

Stephen

:)

Well that answers my question I guess. Still haven't heard anything back yet from my PM to Raistmer. I guess I can only hope that Mr. Kevvy does indeed find the time to modify the rescheduler to allow moving of tasks both ways.

My issue is compounded on Numbskull. Even though I have been told that the schedulers don't make atomic moves based on specific platforms, and that what the splitters have been spewing out is simple coincidence, I still suspect that what Numbskull has been sent has in fact been tailored to the Ryzen processor. Maybe it is so new the schedulers don't know about it or how to handle it or something. I have been receiving an almost 100% mix of Arecibo non-VLAR or VLARs since it went online. I also seem to get a preponderance of XEON processor wingmen.

But I have had to run the rescheduler against that machine every couple of hours, compared to twice a day normally on the FX systems. But without any BLC VLARs getting assigned to the GPUs, there is nothing to move.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1862730 · Report as offensive
Ghia
Joined: 7 Feb 17
Posts: 238
Credit: 28,911,438
RAC: 50
Norway
Message 1862740 - Posted: 21 Apr 2017, 18:39:05 UTC

Have received a handful of BLC VLARs throughout today....some assigned to the CPU and some to the GPU. Atm, there are 21 assigned to SoG and 4 to v8.22 (stock apps) in progress and 10 more done. Maybe the 780Ti counts as a low end GPU, and therefore gets the BLC VLARs?
Humans may rule the world...but bacteria run it...
ID: 1862740 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1862749 - Posted: 21 Apr 2017, 19:10:18 UTC

Down to 50 tasks on Numbskull's dual 1070s. Unless something changes with the servers, I will be out of GPU tasks by the end of the day.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1862749 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1862750 - Posted: 21 Apr 2017, 19:13:34 UTC

Still no sign of any BLC tasks in any download. All requests for the last hour have received only non-VLAR Arecibo tasks.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1862750 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1862764 - Posted: 21 Apr 2017, 21:00:20 UTC

Raistmer says yes the Arecibo VLARs work fine on the Nvidia GPU with some fine tuning to allow for PC usability.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1862764 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1862765 - Posted: 21 Apr 2017, 21:06:08 UTC

Hallelujah, the floodgates have finally opened. Received almost the full quota of GPU tasks. Almost entirely Arecibo non-VLAR. I wonder if my last toggle in Preferences to shut off AP did the trick or just a lucky hit to the splitter buffers over a half hour period.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1862765 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1862798 - Posted: 21 Apr 2017, 23:14:55 UTC - in response to Message 1862765.  
Last modified: 21 Apr 2017, 23:28:36 UTC

Hallelujah, the floodgates have finally opened. Received almost the full quota of GPU tasks. Almost entirely Arecibo non-VLAR. I wonder if my last toggle in Preferences to shut off AP did the trick or just a lucky hit to the splitter buffers over a half hour period.

Just checked my systems this morning. The faster one, which had been keeping its cache full since I installed AP (after always running low before), is down to about 20 GPU WUs left in the cache. First time since installing the AP application.
Playing with the application settings now.


EDIT- and bang! just like that, 50 WUs on the next Scheduler request after getting the changed settings in the Manager.

I also notice there is no GBT work; I think all the GBT splitters have frozen. I didn't make a note last night of which files were how far through, but the page looks pretty much the same now as it did then: 3 files at the top just started, one at the bottom almost finished.
I think Centurion is in need of a re-boot.
Grant
Darwin NT
ID: 1862798 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862803 - Posted: 21 Apr 2017, 23:53:51 UTC - in response to Message 1862717.  

No probs getting work here. ;-)

But it would be nicer if I could get paired with faster wingmen. :-D

Cheers.


. . Curses on slow wingmen, those pending queues are growing out of control :)

Stephen

:)
Slow wingmen are at least returning results; a larger problem, I find, is the 'ghost' wingmen. They appear, are assigned work, then disappear.........

Some, I'm sure, are innocent newbies who are new to BOINC and try SETI but don't continue for whatever reason (it slows processing due to incorrect settings, etc.). Another group, I suspect, are testing new builds or installing on public machines without permission, and when discovered are removed.

These ghosts cause timeouts, delaying consensus results and slowing overall processing efficiency. Nothing can be done; they're just another nuisance that exists in everyone's work queue.


. . The sad thing is the numbers of them, they pop up all over the place. But like a sore toe you do tend to notice them more than the other 9, or 90,000, so their numbers probably seem greater than they actually are :)

Stephen

:)
ID: 1862803 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1862804 - Posted: 22 Apr 2017, 0:00:34 UTC - in response to Message 1862723.  

Raistmer would know. You could try sending a PM and see what he says.


. . Hmmm I can recall you telling me that you had run many Arecibo VLARs on your GPUs using SoG, or was that all just hot air? :)

Stephen

?? :)


I probably did but that was more than a year ago by now, hard to remember what we were doing back then. I'm sure if I spent the time going thru the thread I could find them but I have more important things to do this weekend, lol. Whatever the reason was, they decided not to allow them on GPUs on Main. Maybe that will change but for now it is what it is....


. . Yes time does slip past us ...

. . But getting older by the day I tend to write things down now because I know I will probably not remember otherwise :)

. . I know what you mean about finding old messages out of the many in the threads, I can never find them myself, and it takes way too long to try.

. . But as long as there are old GPUs, and participants who take little active role in their machines' behaviour and so run older apps, that block will probably need to stay. :(

Stephen

<shrug>
ID: 1862804 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.