Help making my 1070 rig up its RAC to above my 1060 one
Author | Message |
---|---|
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
I built a couple of crunchers over the last 4-5 months, and they are sitting in the back bedroom happily crunching away. Well, I thought happily, till I took the time to pay attention to their individual performances. The machines are ID: 8064025 X58-DualGTX1060 and ID: 8170251 DualGTX1070. They are pretty similar machines: the 1070 box has 8 procs at 3.5GHz, the 1060 box 12 procs at 3.33GHz. The 1070's are 4 GB cards, the 1060's 6 GB versions. Both are running Lunatics and have been running 24x7 for at least a month, so they have stabilized pretty well by this point. And... the 1070 RAC: about 36k. The 1060 RAC: about 44k. Could someone take a look at them both and see if there is anything obvious that I have missed, because I sure would think that 2 1070's should handily outperform 2 1060's. Shouldn't they? Thanks for any ideas, guys! |
Wiggo Joined: 24 Jan 00 Posts: 34744 Credit: 261,360,520 RAC: 489 |
That's easy, you're running the outdated CUDA app on those 1070's while you're running SoG on the 1060's (it's outdated also, BTW). Get the latest Lunatics beta installer and try again (on both please). ;-) Cheers. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Agree with Wiggo... Should look at the new beta installer |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Thanks guys, I was hoping it was something relatively simple like that. I've been up to my arse in alligators for the last 2-3 months, and haven't been paying nearly as close attention to my SETI hobby as I had been for the last year. Is there anything special I need to do to uninstall them to prepare them for reinstallation? And I presume I should run SoG on both? Thanks! |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
The most recent Lunatics Beta 6 installer has versions that are older than the current stock GPU apps. The stock Nvidia/Radeon 8.22/8.23 apps are r3584. Those apps can be downloaded from the SETI@home servers or from Raistmer's cloud storage. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
First thing that caught my attention is your statement of the 1070 only having 4 GB. That made me scramble to Google to see who produces a 4 GB card. No one does. All the 1070's came with 8 GB. The 4 GB you see on SETI is just down to its inability to report more than 4 GB of memory for graphics cards. Your 1070's actually have 8 GB, which can be verified with GPU-Z, for example. You don't have to do anything special to update your apps. Just run the Lunatics Beta-06 installer and choose the SoG application for MB, since you are already running Anonymous platform. You can also download the latest SoG app r3584 from Mike's World. I find that easier than trying to find Raistmer's download site in a message in the huge support thread. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Bruce Joined: 15 Mar 02 Posts: 123 Credit: 124,955,234 RAC: 11 |
Here is Raistmer's Download Page. It has most of the new apps. Like Keith says, you can get them at Mike's World also. Good Luck. Bruce |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
It's been a long time, but my memory is telling me that BOINC only displays up to 4 GB for Nvidia GPUs because of a limitation in the CUDA detection. Perhaps in the future BOINC can be made to use OpenCL GPU detection for Nvidia GPUs, like it does for Intel and Radeon GPUs, so that limitation will no longer be an issue. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13732 Credit: 208,696,464 RAC: 304 |
I would also suggest some very aggressive command line settings to get as much out of the SOG application and GTX 1070s as possible. And probably worth doing the same for your GTX 1080/980Ti system. Grant Darwin NT |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Thanks for the thoughts guys, I've been in and out of town last week and this week for training and a trade show, so it's probably going to have to wait for the weekend, but I will be updating them. Are they both the same programs from either Mike's World or Raistmer's Download Page? Any suggestions on aggressive command line settings for both? I do use the 1080/980 system for occasional work (mostly web browsing stuff), but I suppose I can always suspend BOINC when I use this computer, because I believe it is set up to resume after 10-15 minutes if I don't manually do it. I do have Keith's rescheduling program on the 1080 system, but not on the others, so if I am running SoG, is his program not necessary, or even desirable? As for the 1060 and 1070 systems, I don't care about lagging response, as they are just sitting and crunching, so aggressive command lines are fine for those 2 systems. I just want to maximize their output and see how they compare. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
It's not my rescheduling program; rather, it is Jimbocous who has polished the front-end to Mr. Kevvy's rescheduler. I think I get good results from this MB command line argument for my 1070's. I don't really have too noticeable system lag when I run Guppies. Only when two Guppies exit and reload on the same card do I notice some keyboard input lag. You can de-tune the command line a bit by removing the -high_perf argument, and that will reduce the input lag until it is unnoticeable. <cmdline>-sbs 2048 -period_iterations_num 2 -tt 1500 -high_perf -spike_fft_thresh 4096 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 64 -oclfft_tune_cw 64 -high_prec_timer</cmdline> I also run this command line argument for AP work. Of course, that has been a seldom-used case in the recent past.
<cmdline>-unroll 24 -oclFFT_plan 256 16 256 -ffa_block 16384 -ffa_block_fetch 8192 -tune 1 64 8 1 -tune 2 64 8 1</cmdline> These are the command lines in my app_config.xml files under the appropriate app sections. I just use the default app_info.xml written by the Lunatics installer underneath this. I control CPU and GPU usage in app_config.xml. It's simpler, and it doesn't need to change when new apps are installed. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Keith, how many work units per GPU with this command line? 2048 seems a bit overaggressive. I thought Raistmer limited the max size to 1024. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Nope, no limit as far as I can tell. I'm running two tasks per card, which uses up about 4.5 GB of video memory on average. The 1070's have 8 GB available. I looked at the tune parameter readout in stderr.txt on completed work units and saw that, with my command line arguments, the best memory buffer for BLC tasks is on average 2048. Sometimes a 4096 buffer would be best with two tasks running, but that wouldn't fit. It's definitely overkill for the shorty Arecibo tasks though. The FFT tune parameters for Arecibo shorties need only about 512-1024 kB for optimization. I run the 970's at a 1024 buffer because they only have 4 GB on board. Here's an example of a BLC task FFT tune readout.
Fftlength=512,pass=3:Tune: sum=10832.1(ms); min=8.97(ms); max=182.8(ms); mean=10.78(ms); s_mean=17.19; sleep=15(ms); delta=1; N=1005; usual
Fftlength=1024,pass=3:Tune: sum=12192.7(ms); min=4.68(ms); max=254.1(ms); mean=6.069(ms); s_mean=10.52; sleep=0(ms); delta=1; N=2009; usual
Fftlength=2048,pass=3:Tune: sum=4807.21(ms); min=1.072(ms); max=1.536(ms); mean=1.197(ms); s_mean=1.225; sleep=0(ms); delta=1; N=4017; usual
Fftlength=4096,pass=3:Tune: sum=3123.5(ms); min=0.3614(ms); max=0.4547(ms); mean=0.3888(ms); s_mean=0.3833; sleep=0(ms); delta=1; N=8033; usual
Fftlength=8192,pass=3:Tune: sum=1764.68(ms); min=0.1025(ms); max=0.1239(ms); mean=0.1098(ms); s_mean=0.112; sleep=0(ms); delta=1; N=16065; usual
Here's an example of an Arecibo shorty FFT tune readout.
Fftlength=128,pass=3:Tune: sum=325.808(ms); min=1.776(ms); max=6.626(ms); mean=5.716(ms); s_mean=5.755; sleep=0(ms); delta=1; N=57; usual
Fftlength=256,pass=3:Tune: sum=155.991(ms); min=1.237(ms); max=1.536(ms); mean=1.356(ms); s_mean=1.363; sleep=0(ms); delta=1; N=115; usual
Fftlength=512,pass=3:Tune: sum=87.9097(ms); min=0.3574(ms); max=0.4557(ms); mean=0.3839(ms); s_mean=0.3846; sleep=0(ms); delta=1; N=229; usual
Fftlength=1024,pass=3:Tune: sum=66.9434(ms); min=0.1228(ms); max=6.683(ms); mean=0.1465(ms); s_mean=0.1323; sleep=0(ms); delta=1; N=457; usual
The tuning goes all the way up to an 8 GB buffer. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Keith, how did you figure that 2048 is the best setting? I can't make heads or tails of this. |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
As I said, I can't run the 4096 buffer with two tasks concurrently. The 1070 only has 8192 MB onboard, so there is not enough memory unless I drop to 1 task running. The stderr.txt shows that with my 2048 setting, it will typically use about 2136 MB of buffer. You just look at the task timings in each tune line and look for the minimum ms runs with a delta of 1 and the highest N value. In the example above, 8192 has the fastest completion times, but there is not enough memory on the card to use it. The same goes for the 4096 runs: again, not enough memory to run two up on the card. So I settle for a 2048 MB buffer. Raistmer explained the tuning runs at Lunatics in his post detailing the parameter choices and what they accomplish. Go have a read. |
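
For anyone else who can't make heads or tails of the tune lines, the way Keith reads them can be sketched in a few lines of Python. This is only an illustration: the regular expression and the helper names `parse_tune` and `fastest` are made up for this example, and it simply ranks the lines by their `mean` field, as the post describes. It knows nothing about how much memory a setting needs.

```python
import re

# BLC readout copied from the post above.
SAMPLE = """
Fftlength=512,pass=3:Tune: sum=10832.1(ms); min=8.97(ms); max=182.8(ms); mean=10.78(ms); s_mean=17.19; sleep=15(ms); delta=1; N=1005; usual
Fftlength=1024,pass=3:Tune: sum=12192.7(ms); min=4.68(ms); max=254.1(ms); mean=6.069(ms); s_mean=10.52; sleep=0(ms); delta=1; N=2009; usual
Fftlength=2048,pass=3:Tune: sum=4807.21(ms); min=1.072(ms); max=1.536(ms); mean=1.197(ms); s_mean=1.225; sleep=0(ms); delta=1; N=4017; usual
Fftlength=4096,pass=3:Tune: sum=3123.5(ms); min=0.3614(ms); max=0.4547(ms); mean=0.3888(ms); s_mean=0.3833; sleep=0(ms); delta=1; N=8033; usual
Fftlength=8192,pass=3:Tune: sum=1764.68(ms); min=0.1025(ms); max=0.1239(ms); mean=0.1098(ms); s_mean=0.112; sleep=0(ms); delta=1; N=16065; usual
"""

# Hypothetical pattern: one match per "Fftlength=... usual" entry.
TUNE_RE = re.compile(r"Fftlength=(\d+),pass=(\d+):Tune:\s*(.*?)\s*usual")


def parse_tune(text):
    """Turn each tune entry into a dict of numeric fields."""
    rows = []
    for match in TUNE_RE.finditer(text):
        row = {"fftlength": int(match.group(1)), "pass": int(match.group(2))}
        for field in match.group(3).split(";"):
            field = field.strip()
            if not field:
                continue
            key, _, value = field.partition("=")
            # Drop the "(ms)" unit suffix before converting.
            row[key.strip()] = float(value.replace("(ms)", ""))
        rows.append(row)
    return rows


def fastest(rows):
    """Return the entry with the lowest mean time."""
    return min(rows, key=lambda row: row["mean"])


best = fastest(parse_tune(SAMPLE))
print(best["fftlength"], best["mean"])  # prints: 8192 0.1098
```

On the BLC sample this picks the 8192 line, which matches what Keith says is fastest before memory is taken into account.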
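
Keith's reason for settling on 2048 comes down to simple arithmetic, sketched here in Python. The figures 8192 MB, 2048 and 2136 MB come from the posts. The assumption that the extra 88 MB per task stays the same at other buffer sizes is only a guess made for this example, as are the function names. Real cards also lose some memory to the desktop and driver, so treat this as a rough check and nothing more.

```python
CARD_MB = 8192          # GTX 1070 memory, from the thread
OBSERVED_SBS_MB = 2048  # the -sbs value Keith runs
OBSERVED_USE_MB = 2136  # per-task usage he reports from stderr.txt

# Assumption (not from the thread): the overhead on top of the buffer is constant.
OVERHEAD_MB = OBSERVED_USE_MB - OBSERVED_SBS_MB


def per_task_mb(sbs_mb):
    """Estimated memory one task needs at a given -sbs value."""
    return sbs_mb + OVERHEAD_MB


def fits(sbs_mb, tasks, card_mb=CARD_MB):
    """True if `tasks` concurrent tasks fit on the card at this buffer size."""
    return tasks * per_task_mb(sbs_mb) <= card_mb


for sbs in (1024, 2048, 4096):
    for tasks in (1, 2):
        print(sbs, tasks, tasks * per_task_mb(sbs), fits(sbs, tasks))
```

Under that assumption, two tasks at 4096 would need 8368 MB, which is more than the 8192 MB on the card, while two tasks at 2048 need 4272 MB and a single task at 4096 fits.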
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.