GBT ('guppi') .vlar tasks will be sent to GPUs, what do you think about this?

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1787233 - Posted: 13 May 2016, 8:41:59 UTC - in response to Message 1787230. They are taking about 10min each for 1 instance on my Fury X. Current stock or r3430? ID: 1787233 ·

RueiKe Volunteer tester Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785	Message 1787234 - Posted: 13 May 2016, 8:56:40 UTC - in response to Message 1787233. They are taking about 10min each for 1 instance on my Fury X. Current stock or r3430? My Fury X system is using stock apps. My Nano system is running optimized apps, R3330, and is taking about 9.5min for 1 instance at a time. Is it time to upgrade to r3430? GitHub: Ricks-Lab Instagram: ricks_labs ID: 1787234 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1787240 - Posted: 13 May 2016, 9:30:23 UTC - in response to Message 1787234. They are taking about 10min each for 1 instance on my Fury X. Current stock or r3430? My Fury X system is using stock apps. My Nano system is running optimized apps, R3330, and is taking about 9.5min for 1 instance at a time. Is it time to upgrade to r3430? So some additional improvement is possible. I expect better performance from the latest rev in the VLAR area. And yes, r3430 can be used w/o limitations. ID: 1787240 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13731 Credit: 208,696,464 RAC: 304	Message 1787242 - Posted: 13 May 2016, 9:47:46 UTC - in response to Message 1787232. Curiosity pure & simple, about why an application running on a GPU is so dependent on CPU resources. Cause GPU and CPU are different devices and GPU constitutes coprocessor, not Central Processing Unit. I'm aware of and understand that. However I remember the days of CPUs without FP units, and then came along the maths coprocessor. The CPU would offload the work to the coprocessor, the coprocessor would do the work, then return the result. Current architecture is somewhat more complicated than that of the 8086/88 days, but the principle is the same: offload the work to the coprocessor, return the result when done. EDIT: and cause "application running on GPU" is misleading jargon. Application is running on CPU always. That's the way modern PC architecture works and that's why only one device in PC is called CPU. That may be so, but it seemed odd to me that the GPU is unable to be set up to receive a WU, then return the result. What I didn't realise was just how limited current GPUs are when it comes to programme flow & control, and that the CPU application is initialising the GPU, sending work, getting it back, then sending more, etc, etc till the WU is done. It's not able to programme the GPU to process the WU to completion, then return the final result (I can see now why the Xeon Phi is so powerful in the applications that suit it). I see that Kepler made big advances over Fermi (being able to launch its own threads based on the results of data processed), but its actual programmability is still extremely limited so the CPU application can't just set up the GPU, provide the WU & get the finished result back in one go. 
Hence its heavy need for CPU resources, and the more it needs those resources (the more powerful the GPU hardware) the bigger the negative impact of the latencies in GPU-CPU communications. Grant Darwin NT ID: 1787242 ·
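
To put rough numbers on the point above (the faster the GPU, the more the CPU-GPU round trips hurt), here is a minimal back-of-envelope sketch. It assumes a fully synchronous loop in which the GPU sits idle during every round trip; the function names and all figures (100,000 round trips, 1 ms each, 900 s vs 300 s of pure GPU compute) are hypothetical illustrations, not measurements from any SETI@home app.

```python
def wall_time_s(gpu_compute_s, round_trips, latency_s):
    """Total run time when every CPU<->GPU round trip stalls the GPU."""
    return gpu_compute_s + round_trips * latency_s


def overhead_fraction(gpu_compute_s, round_trips, latency_s):
    """Share of the wall time spent waiting on CPU<->GPU communication."""
    overhead_s = round_trips * latency_s
    return overhead_s / (gpu_compute_s + overhead_s)


# Hypothetical work unit: same number of round trips on both cards,
# only the pure compute time differs.
ROUND_TRIPS = 100_000
LATENCY_S = 0.001

slow_gpu = overhead_fraction(900.0, ROUND_TRIPS, LATENCY_S)
fast_gpu = overhead_fraction(300.0, ROUND_TRIPS, LATENCY_S)

print(f"slower GPU: {slow_gpu:.0%} of wall time is communication overhead")
print(f"faster GPU: {fast_gpu:.0%} of wall time is communication overhead")
```

Under these assumptions the fixed communication cost is the same on both cards, so it takes a larger share of the run on the faster one.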

Wiggo Send message Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489	Message 1787263 - Posted: 13 May 2016, 13:05:00 UTC It's good that I'm just running stock ATM as I can see that once I go Lunatic's (where is that MB V7 free version that we should've had to start with?) and start doing multiple tasks I can see these causing me the same problems that "Arecibo" VLAR's do. :-( Cheers. ID: 1787263 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1787285 - Posted: 13 May 2016, 14:29:51 UTC - in response to Message 1787225. Why is it that the GPU's ability to process work is so heavily dependent on the CPU? In very short: cause there are parts of control code that should be executed serially. And cause currently they are executed on CPU. So all the crunching work should be done on the GPU, but at present it isn't. And then, when communication CPU<=>GPU really decreased, "my BOINC doesn't tick" appears... So ideally that communication should be asynchronous. Every 100th or 10th of a second (or half, even a whole second) the GPU outputs its progress on the WU until it's complete, at which time it sends out the result file and requests another WU. The GPU isn't waiting on any CPU responses (until it requests another WU), and the CPU can do with the progress updates whatever it needs to (update the progress bar or ignore it as too recent since last update or whatever). If while the WU is being processed the CPU doesn't receive any updates on the progress of the WU, then it initiates a call to the GPU to find out what's going on. Ah, you strongly know how it "should be", fine, we have a new optimization team member it seems, with fresh and great ideas. Good, you can start here: https://setisvn.ssl.berkeley.edu/trac/browser/branches/sah_v7_opt All Aboard! "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1787285 ·
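
For readers trying to picture the asynchronous scheme quoted above, here is a toy sketch of the control flow only. It uses a plain Python thread as a stand-in for the device, so it says nothing about what current GPUs or the SETI@home apps can actually do; `worker`, `run_work_unit` and the timeout value are hypothetical names and numbers chosen for illustration.

```python
import queue
import threading


def worker(total_steps, updates):
    """Stand-in for the device: runs the whole work unit without waiting on the host."""
    for step in range(1, total_steps + 1):
        # ... the real crunching for this step would happen here ...
        updates.put(("progress", step / total_steps))
    updates.put(("done", 1.0))


def run_work_unit(total_steps=10, silence_timeout_s=1.0):
    """Host side: consume progress updates; only ask for status after a silence."""
    updates = queue.Queue()
    thread = threading.Thread(target=worker, args=(total_steps, updates))
    thread.start()

    progress_log = []
    status_checks = 0
    while True:
        try:
            kind, value = updates.get(timeout=silence_timeout_s)
        except queue.Empty:
            # No update arrived in time: this is where the host would
            # query the device to find out what is going on.
            status_checks += 1
            continue
        if kind == "done":
            break
        progress_log.append(value)

    thread.join()
    return progress_log, status_checks


log, checks = run_work_unit(total_steps=4)
print("progress updates:", log)
print("status checks needed:", checks)
```

The host never blocks the worker here; it just drains a queue. Whether the serial control code mentioned in the reply can be moved off the CPU is a separate question this sketch does not address.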

RueiKe Volunteer tester Send message Joined: 14 Feb 16 Posts: 492 Credit: 378,512,430 RAC: 785	Message 1787287 - Posted: 13 May 2016, 14:47:04 UTC - in response to Message 1787240. They are taking about 10min each for 1 instance on my Fury X. Current stock or r3430? My Fury X system is using stock apps. My Nano system is running optimized apps, R3330, and is taking about 9.5min for 1 instance at a time. Is it time to upgrade to r3430? So some additional improvement is possible. I expect better performance from the latest rev in the VLAR area. And yes, r3430 can be used w/o limitations. Is there an installer available? Last time I tried to update an app manually, I messed things up... GitHub: Ricks-Lab Instagram: ricks_labs ID: 1787287 ·

Chris Adamek Volunteer tester Send message Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236	Message 1787298 - Posted: 13 May 2016, 15:56:36 UTC - in response to Message 1787287. For anyone running 2013 Mac Pros with D500 or D700, definitely run 3 Green Bank WUs at a time with the SoG app. There is almost no difference in completion time running 2 at a time and only a small change running 3 at a time. The D700 completes 3 in around 30 min (~10 min each) and the D500 does the same in around 35-40 min (~13 min each). Conversely, running 1 at a time takes almost 30 minutes by itself on the D500 and 20 min on the D700. I still need to try 4 at a time on the D700 to see if there is any benefit there. This probably applies to the normal AMD 7970 (32 CUs) down to whatever model has 24 CUs as well. Chris ID: 1787298 ·
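
The arithmetic behind those figures, as a small sketch. The timings are the approximate ones quoted in the post above; the 37.5 min value is simply the midpoint of the stated 35-40 min range and the 30 min single-task figure rounds "almost 30", so both are assumptions, and the function names are made up for this example.

```python
def effective_minutes_per_task(batch_minutes, concurrent_tasks):
    """Wall-clock minutes per finished task when several run side by side."""
    return batch_minutes / concurrent_tasks


def tasks_per_hour(batch_minutes, concurrent_tasks):
    """Finished tasks per hour for a given batch time and concurrency."""
    return 60.0 * concurrent_tasks / batch_minutes


# Approximate timings from the post (minutes).
figures = {
    "D700": {"single_min": 20.0, "batch_of_3_min": 30.0},
    "D500": {"single_min": 30.0, "batch_of_3_min": 37.5},  # midpoint of 35-40
}

for card, t in figures.items():
    one_at_a_time = tasks_per_hour(t["single_min"], 1)
    three_at_a_time = tasks_per_hour(t["batch_of_3_min"], 3)
    per_task = effective_minutes_per_task(t["batch_of_3_min"], 3)
    gain = three_at_a_time / one_at_a_time
    print(f"{card}: {per_task:.1f} min per task at 3x, "
          f"{three_at_a_time:.1f} vs {one_at_a_time:.1f} tasks/hour "
          f"({gain:.1f}x throughput)")
```

On these numbers, running 3 at a time roughly doubles throughput on both cards, even though each individual task takes longer to finish.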

zoom3+1=4 Volunteer tester Send message Joined: 30 Nov 03 Posts: 65738 Credit: 55,293,173 RAC: 49	Message 1787507 - Posted: 14 May 2016, 16:35:14 UTC On My gpu a PNY LC GTX 580 @ 857 MHz, a GBT takes about 58 minutes to do with Lunatics on cuda 42, that's close to 4 times as long as a normal gpu wu takes to do here. Temps with a GBT about 52C, with a normal gpu wu, about 62C. I wish the GBT wu's would either run faster, or just be on the cpu, at least until the times are somehow brought down by those more knowledgeable than I. The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's ID: 1787507 ·

Chris Oliver Send message Joined: 4 Jul 99 Posts: 72 Credit: 134,288,250 RAC: 15	Message 1787558 - Posted: 14 May 2016, 20:42:42 UTC I believe that releasing these workunits into the community knowing full well that software that we are issued with by the project does not play nice with these units is a spectacular cluster f*ck of a decision. When Seti's software or WU's starts impeding the ability of my PC to do other tasks I have to say enough is enough. For now I am inspecting WU's waiting to be processed and aborting any vlar's assigned to the GPU. Looking forward if this is going to be the way the project is going then I'm afraid we will be parting ways. There are plenty of other worthy projects that will run quietly in the background and not bring my PC to its knee's. ID: 1787558 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1787560 - Posted: 14 May 2016, 20:53:37 UTC As big of a pita these work units are, IMO they are the ones with the best chance so far of finding our goal. My goal is not to process the most but to find ETI. ID: 1787560 ·

Bernie Vine Volunteer moderator Volunteer tester Send message Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328	Message 1787563 - Posted: 14 May 2016, 21:24:32 UTC Well I have several machines that I sometimes use as "crunchers only" So I don't really notice any slowdowns My main machine, which has a GTX 970, is also my daily use machine. Now, as in the past if I find a program that is slowed down by SETI@Home I make it an exclusive, so SETI@Home does not run when I am using that program. I have about 10 exclusive, programs. Simple, it is the way SETI@Home was designed to work. I leave the machine on 24/7 so it crunches when not in use, seems the best way to me. Even on my dedicated crunchers, if I am doing updates or anything different, first thing I do is deactivate BOINC, always have done. ID: 1787563 ·

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34256 Credit: 79,922,639 RAC: 80	Message 1787575 - Posted: 14 May 2016, 22:05:13 UTC - in response to Message 1787558. I believe that releasing these workunits into the community knowing full well that software that we are issued with by the project does not play nice with these units is a spectacular cluster f*ck of a decision. When Seti's software or WU's starts impeding the ability of my PC to do other tasks I have to say enough is enough. For now I am inspecting WU's waiting to be processed and aborting any vlar's assigned to the GPU. Looking forward if this is going to be the way the project is going then I'm afraid we will be parting ways. There are plenty of other worthy projects that will run quietly in the background and not bring my PC to its knee's. You just need to check other threads to realize it's not that hard to get good results on your card. It's always easier to blame others. What did you do to change that fact? With each crime and every kindness we birth our future. ID: 1787575 ·

Chris Oliver Send message Joined: 4 Jul 99 Posts: 72 Credit: 134,288,250 RAC: 15	Message 1787578 - Posted: 14 May 2016, 22:10:37 UTC - in response to Message 1787560. Hi betreger, You may be absolutely right and I have no objections to WU's running for hours but I do object to not being able to do simple tasks such as browsing a web page without major lag. The project had this issue previously and all vlar's were filtered to process on CPU's only. betreger wrote: As big of a pita these work units are, IMO they are the ones with the best chance so far of finding our goal. My goal is not to process the most but to find ETI. ____________ ID: 1787578 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1787584 - Posted: 14 May 2016, 22:29:53 UTC - in response to Message 1787578. Last modified: 14 May 2016, 22:44:57 UTC I would like to remind those who are not too comfortable with the BOINC interface and find it much easier to switch projects than to change the BOINC setup: there is a "don't use GPU while PC in use" option that removes any GPU-related lags once and for all. Of course, it would be preferable to specify the conditions of the observed lags and attempt to correct them, but I understand that not all can correctly communicate even in their own language. ID: 1787584 ·

betreger Send message Joined: 29 Jun 99 Posts: 11361 Credit: 29,581,041 RAC: 66	Message 1787585 - Posted: 14 May 2016, 22:33:04 UTC - in response to Message 1787578. Last modified: 14 May 2016, 22:33:18 UTC I would like to remind those who are not too comfortable with the BOINC interface and find it much easier to switch projects than to change the BOINC setup: there is a "don't use GPU while PC in use" option that removes any GPU-related lags once and for all. As Raistmer stated, that is the easiest way to solve your problem. ID: 1787585 ·

palmer59 Volunteer tester Send message Joined: 20 May 99 Posts: 4 Credit: 93,393,182 RAC: 66	Message 1787636 - Posted: 15 May 2016, 4:09:48 UTC For now I'm stopping my main system from getting any GPU WU. What frustrates me is that this is not the first time that something is released on this project that does not play well. You figure that it should be tried in BETA and not released until all the kinks have been worked out. Crunching four simultaneous WU on my system used to take 6-15 minutes each, now even running just two at a time takes over an hour each taxing my OC 780Ti to 99%. Mike ID: 1787636 ·

Brent Norman Volunteer tester Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835	Message 1787658 - Posted: 15 May 2016, 6:57:52 UTC I posted this in a different thread. I wonder if it would be possible to program the scheduler to keep Guppie's off of Nvidia cards UNLESS you are requesting more than (example) 2 hours of work. That way we wouldn't run dry, but be working on more tasks that are more compliant with our cards. But run them if we have to. ID: 1787658 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1787666 - Posted: 15 May 2016, 8:57:10 UTC - in response to Message 1787636. For now I'm stopping my main system from getting any GPU WU. What frustrates me is that this is not the first time that something is released on this project that does not play well. You figure that it should be tried in BETA and not released until all the kinks have been worked out. Crunching four simultaneous WU on my system used to take 6-15 minutes each, now even running just two at a time takes over an hour each taxing my OC 780Ti to 99%. Mike So what? Do you experience lags? Do you experience driver restarts? In that case, describe the conditions and the app you use. Or do you experience a failure of your own expectations instead? Is it so hard to understand that there are different kinds of work that should be done on this project? Allowing VLAR tasks on at least some GPU devices has a reason. It will soon be the only work that remains. Also, for what reason was your account granted the "beta tester" tag?? Where are your complaints and reports on beta, where VLAR has been allowed for definitely more than a month already? As some old philosopher stated, start with yourself... ID: 1787666 ·

Raistmer Volunteer developer Volunteer tester Send message Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121	Message 1787667 - Posted: 15 May 2016, 8:58:10 UTC - in response to Message 1787590. but I understand that not all can correctly communicate even in their own language. LOL Raistmer!! At least now we all know that your day job is not as a diplomat :-) LoL :) But not so old and not SO grumpy ;) ID: 1787667 ·


SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.