The Saga Begins (LotsaCores 2.0)

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814915 - Posted: 4 Sep 2016, 0:56:50 UTC - in response to Message 1814914.  

Just wondering, on my machine with the 2 1060's, the current command line is

-sbs 256 -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32

and as it's been running for a week or so now and seems to be fairly stable with its current settings, would it make sense to tweak the -sbs setting a little? I just bumped it up to 768 on the 56 core machine a few minutes ago, and it did make a difference in its 'lagginess', but all that machine usually does is sit there and crunch, other than when I need to make the occasional post here in the forum from it, so I find that lag an acceptable tradeoff as long as it increases performance.

Not sure with this CPU and these cards if bumping it up a notch or 2 is wise, but please let me know your thoughts. Thanks!

*Edit* Also am only running one task per card on it, would it make sense to move it to 2, or does SoG like 1 at a time better?



You can bump it up to 2 if you like. If the 1060 machine isn't doing anything but crunching, you can also try the -sbs 768 on it as well. Let it run for a week and see how it does
ID: 1814915

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814917 - Posted: 4 Sep 2016, 1:14:53 UTC - in response to Message 1814915.  

Ok, adding the cc_config right now, and should I also add the -hp in behind the 768 too? Lastly, SoG wouldn't benefit from a whole CPU core (instead of 0.04) assigned to each GPU, or would it? Thanks Zalster!

ID: 1814917

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814918 - Posted: 4 Sep 2016, 1:17:40 UTC - in response to Message 1814917.  
Last modified: 4 Sep 2016, 1:17:50 UTC

Al,

What are you adding the cc_config.xml for?

app_config.xml is where you can increase the number of work units run at a time, and if you like you can also put the command line there.
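
For anyone following along, here is a minimal sketch of generating such an app_config.xml for running more than one task per card. The `<app>`, `<name>`, `<gpu_versions>`, `<gpu_usage>` and `<cpu_usage>` elements are the standard BOINC ones; the app name `setiathome_v8` and the helper name `build_app_config` are assumptions made for illustration, so check the app name your own client reports before using it.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, tasks_per_gpu: int, cpu_per_task: float) -> str:
    """Return app_config.xml text that runs `tasks_per_gpu` tasks on each GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # gpu_usage is the fraction of a GPU each task claims: 0.5 means 2 per card.
    ET.SubElement(gpu, "gpu_usage").text = str(1 / tasks_per_gpu)
    # cpu_usage is the number of CPU cores reserved for each GPU task.
    ET.SubElement(gpu, "cpu_usage").text = str(cpu_per_task)
    return ET.tostring(root, encoding="unicode")


# "setiathome_v8" is an assumed app name; substitute the one your client lists.
print(build_app_config("setiathome_v8", tasks_per_gpu=2, cpu_per_task=1.0))
```

The output would be saved as app_config.xml in the project's folder under the BOINC data directory, and the client told to re-read its config files (or restarted) before the change takes effect.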

The addition of the -hp -high_perf -high_perc_timer would probably be beneficial
ID: 1814918

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814919 - Posted: 4 Sep 2016, 1:25:36 UTC - in response to Message 1814918.  
Last modified: 4 Sep 2016, 1:26:43 UTC

Duh, yep, will edit the app_config...

BTW, my standard cc_config file is

<cc_config> 
 <options> 
  <use_all_gpus>1</use_all_gpus>
  <save_stats_days>10000</save_stats_days> 
  <max_event_log_lines>0</max_event_log_lines>
  <max_stdout_file_size>50000000</max_stdout_file_size>
 </options>
</cc_config>

as I like the chart on the stats tab to run pretty much forever. It also keeps my event log up on the screen till I restart BOINC or the computer, and lets the stdout file grow much larger than normally allowed before it is overwritten.


So, do I add the -hp -high_perf -high_perc_timer right in behind the 768 in the command line? So it would read

-sbs 768 -hp -high_perf -high_perc_timer -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32

?

ID: 1814919

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814920 - Posted: 4 Sep 2016, 1:29:05 UTC - in response to Message 1814919.  
Last modified: 4 Sep 2016, 1:32:58 UTC

You can try this

Edited..

-sbs 768 -hp -high_perf -high_perc_timer -period_iterations_num 1


and see how it does.

If it does ok, leave it. If it doesn't, then you can remove -high_perf -high_perc_timer -period_iterations_num 1 so it looks like



-sbs 768 -hp -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 -oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32

ID: 1814920

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814922 - Posted: 4 Sep 2016, 1:32:31 UTC - in response to Message 1814920.  

Ok, will give that a shot. Any thoughts about CPU cores and SoG?

ID: 1814922

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814923 - Posted: 4 Sep 2016, 1:34:41 UTC - in response to Message 1814922.  

Al, I corrected the commandline, forgot -period_iterations_num 1

You should have 1 CPU core for each SoG work unit plus 2 free, so if 2 per card and 2 cards then you should have 6 cores total. 4 for the 4 GPU work units, 1 for the OS and 1 to feed the cards.
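
The core-count rule above can be sketched in a few lines of Python. The function names and the default of 2 spare cores (1 for the OS, 1 to feed the cards) are only a restatement of the rule of thumb in this post, not anything BOINC itself enforces.

```python
def cores_needed(cards: int, tasks_per_card: int, spare_cores: int = 2) -> int:
    """One core per GPU work unit, plus spare cores for the OS and feeding the cards."""
    if cards < 0 or tasks_per_card < 0 or spare_cores < 0:
        raise ValueError("counts must not be negative")
    return cards * tasks_per_card + spare_cores


def fits(cards: int, tasks_per_card: int, available_cores: int) -> bool:
    """True if the machine has enough cores for the requested GPU load."""
    return cores_needed(cards, tasks_per_card) <= available_cores


# 2 cards running 2 work units each: 4 cores for the GPU tasks + 2 spare.
print(cores_needed(cards=2, tasks_per_card=2))  # 6
```

Whatever cores are left over after that are what remains for CPU work units.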

How many cores does that chip have?
ID: 1814923

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814924 - Posted: 4 Sep 2016, 1:39:11 UTC - in response to Message 1814923.  

Chip has 6 physical, 12 HT cores. I will bump it up to 2 tasks per card, one core per task.

Do I put the -period_iterations_num 1 at the end of the command line? Does it matter where it goes?

ID: 1814924

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814925 - Posted: 4 Sep 2016, 1:42:34 UTC - in response to Message 1814924.  
Last modified: 4 Sep 2016, 1:43:30 UTC

put it before the -high_perf -high_perc_timer

Al, you are still running an older SoG on that machine. Have you considered updating it to r3500?
ID: 1814925

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814926 - Posted: 4 Sep 2016, 1:48:55 UTC - in response to Message 1814925.  

OK, it's now

-sbs 768 -hp -period_iterations_num 1 -high_perf -high_perc_timer -spike_fft_thresh 2048 -tune 1 64 1 4 -oclfft_tune_gr 256 -oclfft_tune_lr 16 - oclfft_tune_wg 256 -oclfft_tune_ls 512 -oclfft_tune_bn 32 -oclfft_tune_cw 32

That look good?

ID: 1814926

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1814927 - Posted: 4 Sep 2016, 1:56:26 UTC - in response to Message 1814926.  

Just keep an eye on things. When I tried running more than 1 WU at a time on my GTX 750Ti & GTX 1070 on my i7 I got a "Finish file present too long" error. No problems with only 1 WU at a time.
You've already had several on the CPU WUs, so just keep an eye out that you don't end up getting even more.
Grant
Darwin NT
ID: 1814927

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814928 - Posted: 4 Sep 2016, 1:56:54 UTC - in response to Message 1814926.  

Looks like it worked,

9/3/2016 8:50:59 PM |  | Starting BOINC client version 7.6.33 for windows_x86_64
9/3/2016 8:50:59 PM |  | log flags: file_xfer, sched_ops, task
9/3/2016 8:50:59 PM |  | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
9/3/2016 8:50:59 PM |  | Data directory: C:\ProgramData\BOINC
9/3/2016 8:50:59 PM |  | Running under account Booter
9/3/2016 8:50:59 PM |  | CUDA: NVIDIA GPU 0: GeForce GTX 1060 6GB (driver version 368.81, CUDA version 8.0, compute capability 6.1, 4096MB, 3050MB available, 4698 GFLOPS peak)
9/3/2016 8:50:59 PM |  | CUDA: NVIDIA GPU 1: GeForce GTX 1060 6GB (driver version 368.81, CUDA version 8.0, compute capability 6.1, 4096MB, 3050MB available, 4698 GFLOPS peak)
9/3/2016 8:50:59 PM |  | OpenCL: NVIDIA GPU 0: GeForce GTX 1060 6GB (driver version 368.81, device version OpenCL 1.2 CUDA, 6144MB, 3050MB available, 4698 GFLOPS peak)
9/3/2016 8:50:59 PM |  | OpenCL: NVIDIA GPU 1: GeForce GTX 1060 6GB (driver version 368.81, device version OpenCL 1.2 CUDA, 6144MB, 3050MB available, 4698 GFLOPS peak)
9/3/2016 8:50:59 PM | SETI@home | Found app_info.xml; using anonymous platform
9/3/2016 8:51:00 PM |  | Host name: X58-DualGTX1060
9/3/2016 8:51:00 PM |  | Processor: 12 GenuineIntel Intel(R) Core(TM) i7 CPU       X 980  @ 3.33GHz [Family 6 Model 44 Stepping 2]
9/3/2016 8:51:00 PM |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes syscall nx lm vmx tm2 pbe
9/3/2016 8:51:00 PM |  | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
9/3/2016 8:51:00 PM |  | Memory: 5.99 GB physical, 11.98 GB virtual
9/3/2016 8:51:00 PM |  | Disk: 148.95 GB total, 118.19 GB free
9/3/2016 8:51:00 PM |  | Local time is UTC -5 hours
9/3/2016 8:51:00 PM |  | Config: event log limit disabled
9/3/2016 8:51:00 PM |  | Config: use all coprocessors
9/3/2016 8:51:00 PM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8064025; resource share 100
9/3/2016 8:51:05 PM | SETI@home | General prefs: from SETI@home (last modified 03-Apr-2013 23:59:56)
9/3/2016 8:51:05 PM | SETI@home | Computer location: home
9/3/2016 8:51:05 PM | SETI@home | General prefs: no separate prefs for home; using your defaults
9/3/2016 8:51:05 PM |  | Preferences:
9/3/2016 8:51:05 PM |  | max memory usage when active: 3067.59MB
9/3/2016 8:51:05 PM |  | max memory usage when idle: 5828.42MB
9/3/2016 8:51:05 PM |  | max disk usage: 74.47GB
9/3/2016 8:51:05 PM |  | (to change preferences, visit a project web site or select Preferences in the Manager)
9/3/2016 8:51:05 PM | SETI@home | Started upload of ap_21ja16aa_B5_P1_00084_20160903_26472.wu_0_0
9/3/2016 8:51:07 PM | SETI@home | Finished upload of ap_21ja16aa_B5_P1_00084_20160903_26472.wu_0_0
9/3/2016 8:51:11 PM | SETI@home | Sending scheduler request: To fetch work.
9/3/2016 8:51:11 PM | SETI@home | Reporting 1 completed tasks
9/3/2016 8:51:11 PM | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
9/3/2016 8:51:13 PM | SETI@home | Scheduler request completed: got 0 new tasks
9/3/2016 8:51:13 PM | SETI@home | Not sending work - last request too recent: 109 sec


This is my startup log. Looking good so far, and not impossibly laggy, just annoyingly so, but I don't care, as I don't use this machine other than to crunch. :-)

ID: 1814928

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814929 - Posted: 4 Sep 2016, 1:58:05 UTC - in response to Message 1814927.  

Ok, will this appear pretty quickly do you think?

ID: 1814929

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814930 - Posted: 4 Sep 2016, 1:59:43 UTC - in response to Message 1814929.  

Should know after the first few work units
ID: 1814930

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814931 - Posted: 4 Sep 2016, 2:01:11 UTC - in response to Message 1814930.  

K, I will keep my eyes on it, bummer that the WU's take 3+ hours... lol

ID: 1814931

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814932 - Posted: 4 Sep 2016, 2:03:55 UTC - in response to Message 1814931.  

You have a run of AP going through, so once those clear we should see how the MB work units do.

The work units on the GPU shouldn't be taking 3 hours.
ID: 1814932

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1814933 - Posted: 4 Sep 2016, 2:04:19 UTC - in response to Message 1814929.  
Last modified: 4 Sep 2016, 2:05:36 UTC

Ok, will this appear pretty quickly do you think?

Too random to say.
You might get a few early on, or it might not be till it's been crunching away for most of a day.
The count for that machine at the moment is 6.
1 on Sept 1
2 on Sept 2
3 on Sept 3

I'd check it 24 hours from now & see if there are any more, and if there are more than 3.
If you're not getting any more errors than now, and running 2 at a time gives you more WUs per hour than running 1 WU, then I'd stick with it.
If you're getting more errors, you'd need to figure out if the loss of those credits is offset by the increase in work done per hour.
Until this particular problem with BOINC is fixed, that's about all you can do.
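
A rough way to put numbers on that trade-off, sketched in Python. The function name, the run times and the error fraction below are all made up for illustration; plug in the run times and error counts from your own task list.

```python
def valid_tasks_per_hour(concurrent: int, run_time_min: float,
                         error_fraction: float = 0.0) -> float:
    """Work units finished per hour on one card, discounting those lost to errors.

    concurrent      -- work units running at the same time on the card
    run_time_min    -- wall-clock minutes each work unit takes at that setting
    error_fraction  -- share of work units that error out (0.0 to 1.0)
    """
    if run_time_min <= 0:
        raise ValueError("run_time_min must be positive")
    if not 0.0 <= error_fraction <= 1.0:
        raise ValueError("error_fraction must be between 0 and 1")
    return concurrent * 60.0 / run_time_min * (1.0 - error_fraction)


# Hypothetical figures: 10 min per WU alone, 16 min each when running 2 at once,
# with 5% of the doubled-up work units lost to errors.
one_at_a_time = valid_tasks_per_hour(1, 10.0)
two_at_a_time = valid_tasks_per_hour(2, 16.0, error_fraction=0.05)
print("stick with 2" if two_at_a_time > one_at_a_time else "go back to 1")
```

This only counts work units, so it treats every WU as equal; mixing in different types (e.g. VLAR vs. non-VLAR) would need separate run times for each.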
Grant
Darwin NT
ID: 1814933

Zalster · Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1814934 - Posted: 4 Sep 2016, 2:12:26 UTC - in response to Message 1814931.  
Last modified: 4 Sep 2016, 2:20:54 UTC

K, I will keep my eyes on it, bummer that the WU's take 3+ hours... lol


http://setiathome.berkeley.edu/result.php?resultid=5135014869
Run time 8 min 27 sec
CPU time 6 min 2 sec

can't tell if you are running 2 at a time or not

edit
http://setiathome.berkeley.edu/result.php?resultid=5135021876
Run time 10 min 56 sec
CPU time 10 min 38 sec


BLC VLAR
http://setiathome.berkeley.edu/result.php?resultid=5134994540
Run time 20 min 48 sec
CPU time 20 min 29 sec

Edit 2

So running 2 at a time, it looks like you shaved about 3-4 minutes combined off BLC. For the non-VLAR, maybe 2 minutes combined time.

What do you think, Al?
ID: 1814934

Al · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1814956 - Posted: 4 Sep 2016, 5:20:51 UTC - in response to Message 1814934.  
Last modified: 4 Sep 2016, 5:25:10 UTC

Hmm, well, first of all, the CPU tasks were/are about 3-4 hours, at least that is what the elapsed/remaining timer said. The GPU side definitely looks improved, at the expense of some system responsiveness, but I don't care about that since the machine is unmanned 24 hours a day most days. As long as it is processing well, I am more than happy with the changes.

From your post, it appears that things have markedly improved since I added the additional commands to the file, bumped it from 1 to 2 tasks per card, and assigned a full CPU instead of 0.04 CPUs to each task. My RAC was still climbing on it, so I will watch for the curve to hopefully get a little steeper for a while, and see how far it advances before it levels off. Thanks much for your help tonight, Zalster. I'll let it run for a week or so, keeping an eye out for unhappy tasks piling up, but otherwise it appears well worth the work this evening. :-) Now let's see what a pair of those 1060's can really do!

ID: 1814956

BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1815006 - Posted: 4 Sep 2016, 13:39:48 UTC - in response to Message 1814925.  
Last modified: 4 Sep 2016, 14:03:18 UTC

put it before the -high_perf -high_perc_timer

-high_prec_timer : Windows-only. Attempts to improve Windows multimedia timer resolution. May result in smaller Sleep quantum.
In turn this would allow finer-grain sleep with less performance degradation (if any) with -use_sleep option.

(I always wonder why people type cmdline switches instead of Copy/Paste from ReadMe ...)


Unfortunately the app doesn't tell you e.g. "Unrecognised switch ignored: -high_perc_timer"
http://setiathome.berkeley.edu/result.php?resultid=5136314645


You can also search/scan/find in the .exe itself if some switch is present.
This is direct Copy/Paste from inside of MB8_win_x86_SSE3_OpenCL_NV_SoG_r3500.exe (as seen by F3 View in Total Commander):

-v Verbose level set to:%d
-v N incorrect option use:argument needed
-hp -HP -rtp -RTP -high_prec_timer System timer will be set in high resolution mode
-cpu_lock -CPU_lock CPU affinity adjustment enabled
-no_cpu_lock CPU affinity adjustment disabled
-cpu_lock_fixed_cpu -CPU_lock_fixed_CPU CPU affinity adjustment enabled, fixed CPU %d will be used
-persistent_threads


You can see, e.g., that some switches are accepted in several forms (different case) like -cpu_lock -CPU_lock, but others are not (-no_cpu_lock).
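
Since the app stays silent about switches it does not recognise, a small check before saving the command line can catch typos like the one above. This is only a sketch in Python: `KNOWN_SWITCHES` holds just the switches quoted above from the r3500 executable, so it is deliberately incomplete and would need extending from the ReadMe of your own build, and `unknown_switches` is a made-up helper name.

```python
import difflib

# Switches quoted in the post above from the r3500 executable. This list is
# deliberately incomplete; extend it from the ReadMe shipped with your build.
KNOWN_SWITCHES = (
    "-v", "-hp", "-HP", "-rtp", "-RTP", "-high_prec_timer",
    "-cpu_lock", "-CPU_lock", "-no_cpu_lock",
    "-cpu_lock_fixed_cpu", "-CPU_lock_fixed_CPU", "-persistent_threads",
)


def unknown_switches(cmdline, known=KNOWN_SWITCHES):
    """Return (switch, closest known switch or None) for each unrecognised switch."""
    problems = []
    for token in cmdline.split():
        # Only tokens that start with '-' followed by a letter count as switches;
        # numeric arguments such as "768" are skipped.
        if len(token) < 2 or token[0] != "-" or not token[1].isalpha():
            continue
        if token in known:
            continue
        # The comparison is case-sensitive, matching the point made above.
        close = difflib.get_close_matches(token, list(known), n=1)
        problems.append((token, close[0] if close else None))
    return problems


print(unknown_switches("-hp -high_perc_timer -cpu_lock"))
# [('-high_perc_timer', '-high_prec_timer')]
```

A switch flagged with `None` is not necessarily wrong; it may simply be missing from the list.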
 
 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1815006