The Saga Begins (LotsaCores 2.0)

Message boards : Number crunching : The Saga Begins (LotsaCores 2.0)

Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812439 - Posted: 25 Aug 2016, 11:00:42 UTC - in response to Message 1812349.  

That's pretty much my situation right now too, the duct up to what is essentially a bedroom is just not sized properly for this. For a bedroom, no issues, for 2 space heaters like these, well, not so much. ;-) So I can understand completely what you did, makes total sense to me!

ID: 1812439 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812584 - Posted: 25 Aug 2016, 21:42:19 UTC
Last modified: 25 Aug 2016, 22:11:59 UTC

I just got off the phone with Supermicro and have seemingly solved the initial BIOS issues, but then immediately ran into other problems, both within Windows itself and in the Precision X16 software. The suggested BIOS change apparently freed up more motherboard resources, and the PCI-E error no longer appears, but when Windows booted up, it came up with a couple of issues. First of all, Precision came up saying "Too many devices", so it appears to throw up with just 6 recognized. Which brings up the other issue: GPU-Z showed all 7 cards again, but this time 6 of the 7 had info in them instead of just 2, and only one showed 0. I went into Device Manager, and when I pulled up the video cards, one had a ! next to it and said that there weren't enough resources to enable it.

The guy at Supermicro said that it was most likely an internal Windows issue, as the BIOS error is no longer popping up during bootup, and he didn't have any idea what I might want to disable to clear out more room in memory to at least get all 7 seen by the OS. I was a little skeptical, seeing as the board has 10 identical slots and is designed to operate with them all full. So, I know I'm pushing the envelope here, but does anyone have any experience with this type of error? The bigger bummer is that I can't control them all in Precision. I'd like to bump up the fan speed on them and OC them a little bit if possible.

*edit* Did a bit of Googling around, and the best I could come up with in a quick search was "Getting 7 GPU's to Work with Windows 7 Rig" on Reddit. And he was running AMD cards. Supposedly Windows 8 will support it, with a performance hit. He finally had to go to Linux, along with other tweaks, to get it running. No time for such foolishness now. I will have to do more searching before I decide either to grudgingly admit defeat and just run with 6, or to try to push the envelope further.

ID: 1812584 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1812622 - Posted: 26 Aug 2016, 0:06:13 UTC - in response to Message 1812584.  

Sounds like progress ... maybe two steps forward and one back. I would suggest you contact Red-Ray, the author of System Information Viewer (SIV), and ask him whether SIV can control the fans on 7 Nvidia GTX 750s. I like using it for my system fan control. I'm positive it can do at least 4 cards since I've seen his system screenshots with 4 cards installed. A quick look online shows posts in BitCoin and Folding forums saying up to 8 Nvidia cards are possible in x64 or x86. Only 4 AMD cards possible.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1812622 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812664 - Posted: 26 Aug 2016, 3:19:33 UTC - in response to Message 1812622.  

Thanks Keith. I just got the word from the EVGA support a few minutes ago, about the ability to control more than 4, even if it was in a completely unsupported version. Here is their reply:

Hello Alan,

As per your request, I went ahead and asked a senior level technician, and one of the software engineers about your special case. Both confirmed that the software will not work with anything more than four cards. We do apologize for any inconvenience, but this software is intended for a maximum of four cards.

Please let us know if you have any questions or concerns.

Regards,

EVGA Support


Oh well, I tried. Oh, and here are a couple pics of the system, I finally got it moved downstairs.



ID: 1812664 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1812681 - Posted: 26 Aug 2016, 4:11:27 UTC

I'm impressed. Quite a nice definition of a "Frankenstein". What about trying other programs to control the cards? I'm not sure if you are trying to overclock them or just control the fans. Can Speedfan fulfill your fan control needs? Can MSI Afterburner? Give SIV a try, for example.

The issue with Windows is where it maps the video card apertures into available system memory. You might just run out of address space. You would have to look at one 750's system properties and the Resources tab to see how much memory footprint one card takes. Then multiply by seven and see if it can fit in the Windows memory space. Are you having to use the resistor loading trick on the cards to get them seen by Windows?
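
As a rough sketch of that multiply-by-seven arithmetic: the per-card ranges below are placeholders to be replaced with whatever the Resources tab actually shows, and the 2048 MiB window is likewise an assumed figure, since the space really available for device apertures depends on the BIOS settings and the OS. The function names are made up for illustration.

```python
# Rough estimate of how much address space N identical cards need.
# All numbers are placeholders, not measurements from this system.

MIB = 1024 * 1024

def total_aperture_bytes(per_card_ranges_mib, card_count):
    """Sum one card's memory ranges (given in MiB) and scale by the card count."""
    per_card = sum(per_card_ranges_mib) * MIB
    return per_card * card_count

def fits(per_card_ranges_mib, card_count, window_mib):
    """True if all cards fit inside an assumed address window of window_mib MiB."""
    return total_aperture_bytes(per_card_ranges_mib, card_count) <= window_mib * MIB

# Hypothetical ranges for one card, as they might be read off the Resources tab.
example_ranges_mib = [16, 256, 32]
assumed_window_mib = 2048  # assumption; the real limit is system-specific

for cards in (6, 7, 8):
    needed_mib = total_aperture_bytes(example_ranges_mib, cards) // MIB
    print(cards, "cards need", needed_mib, "MiB; fits:",
          fits(example_ranges_mib, cards, assumed_window_mib))
```

If the real numbers come out close to the limit, that would at least be consistent with one card being left without resources.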
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1812681 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1812690 - Posted: 26 Aug 2016, 5:12:39 UTC - in response to Message 1812681.  

The issue with Windows is where it maps the video card apertures into available system memory. You might just run out of address space. You would have to look at one 750's system properties and the Resources tab to see how much memory footprint one card takes.

I just ran into a problem like this on one of my machines when I tried to upgrade a GTX 660 to a GTX 960. Being a 32-bit Win7 box, it technically only has about 3.5GB of memory available to begin with, and it appears the GPUs are requiring a lot of "Hardware Reserved" memory for some reason. (None of my other boxes have that issue.) The GTX 960 increased that reserved amount by 256MB, just enough to shrink my available memory to the point where some of the S@H apps started competing with each other for that limited resource. I went back to the 660, at least until I can do some more research.

Now, your system is vastly different from mine, but it might be useful for you to run Resource Monitor and take a look at the Hardware Reserved amount shown on the Memory tab, even if it just eliminates that as a possible problem.
ID: 1812690 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812709 - Posted: 26 Aug 2016, 6:25:18 UTC - in response to Message 1812681.  

Thanks. I think. lol Well, as Precision is a bust, I ended up installing Afterburner, and that allows me to change things, though I haven't found a way to switch between the different cards; there is no drop-down to change them. Also, I added a cc_config.xml file to the BOINC dir so it will use all the GPUs that the system initialized. It is:
<cc_config> 
  <options> 
   <start_delay>120</start_delay>
   <Use_All_GPUs>1</Use_All_GPUs> 
  </options> 
</cc_config>


and when it starts up, I get an error message. I thought this format was correct for the use-all-GPUs option; am I missing something? For giggles, I tossed in a 980 Ti to see what happened when I had 8 cards in there. After enabling and disabling a number of combos of cards once Windows came up (I watched in Device Manager as it tried to do its thing, which was kind of interesting to see), it ended up disabling 2 750s and left 5 750s and the 980 Ti running.

8/26/2016 1:15:33 AM |  | Unrecognized tag in cc_config.xml: <Use_All_GPUs>
8/26/2016 1:15:33 AM |  | Starting BOINC client version 7.6.33 for windows_x86_64
8/26/2016 1:15:33 AM |  | log flags: file_xfer, sched_ops, task
8/26/2016 1:15:33 AM |  | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
8/26/2016 1:15:33 AM |  | Data directory: C:\ProgramData\BOINC
8/26/2016 1:15:33 AM |  | Running under account Coreman
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 0: GeForce GTX 980 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.2, 4096MB, 3066MB available, 7271 GFLOPS peak)
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 1 (not used): GeForce GTX 750 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.0, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 2 (not used): GeForce GTX 750 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.0, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 3 (not used): GeForce GTX 750 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.0, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 4 (not used): GeForce GTX 750 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.0, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 5 (not used): GeForce GTX 750 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.0, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | OpenCL: NVIDIA GPU 0: GeForce GTX 980 Ti (driver version 372.54, device version OpenCL 1.2 CUDA, 6144MB, 3066MB available, 7271 GFLOPS peak)
8/26/2016 1:15:37 AM |  | OpenCL: NVIDIA GPU 1 (ignored by config): GeForce GTX 750 Ti (driver version 372.54, device version OpenCL 1.2 CUDA, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | OpenCL: NVIDIA GPU 2 (ignored by config): GeForce GTX 750 Ti (driver version 372.54, device version OpenCL 1.2 CUDA, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | OpenCL: NVIDIA GPU 3 (ignored by config): GeForce GTX 750 Ti (driver version 372.54, device version OpenCL 1.2 CUDA, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | OpenCL: NVIDIA GPU 4 (ignored by config): GeForce GTX 750 Ti (driver version 372.54, device version OpenCL 1.2 CUDA, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | OpenCL: NVIDIA GPU 5 (ignored by config): GeForce GTX 750 Ti (driver version 372.54, device version OpenCL 1.2 CUDA, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM | SETI@home | Found app_info.xml; using anonymous platform
8/26/2016 1:15:37 AM |  | Host name: LotzaCores2dot0
8/26/2016 1:15:37 AM |  | Processor: 56 GenuineIntel Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz [Family 6 Model 63 Stepping 2]
8/26/2016 1:15:37 AM |  | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 dca pbe fsgsbase bmi1 smep bmi2
8/26/2016 1:15:37 AM |  | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00)
8/26/2016 1:15:37 AM |  | Memory: 31.89 GB physical, 63.78 GB virtual
8/26/2016 1:15:37 AM |  | Disk: 447.03 GB total, 369.44 GB free
8/26/2016 1:15:37 AM |  | Local time is UTC -5 hours
8/26/2016 1:15:37 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8073456; resource share 100
8/26/2016 1:15:42 AM | SETI@home | General prefs: from SETI@home (last modified 03-Apr-2013 23:59:56)
8/26/2016 1:15:42 AM | SETI@home | Computer location: home
8/26/2016 1:15:42 AM | SETI@home | General prefs: no separate prefs for home; using your defaults
8/26/2016 1:15:42 AM |  | Preferences:
8/26/2016 1:15:42 AM |  | max memory usage when active: 16327.17MB
8/26/2016 1:15:42 AM |  | max memory usage when idle: 31021.62MB
8/26/2016 1:15:42 AM |  | max disk usage: 100.00GB
8/26/2016 1:15:42 AM |  | (to change preferences, visit a project web site or select Preferences in the Manager)
8/26/2016 1:15:42 AM |  | Suspending computation - initial delay
8/26/2016 1:15:43 AM | SETI@home | Started upload of blc2_2bit_guppi_57403_69166_HIP11048_0004.3478.416.21.44.144.vlar_1_0
8/26/2016 1:15:46 AM | SETI@home | Finished upload of blc2_2bit_guppi_57403_69166_HIP11048_0004.3478.416.21.44.144.vlar_1_0
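
The GPU lines in a startup log like this one can be picked apart with a small script to see at a glance which cards the client is skipping. This is only a sketch: the regular expression is inferred from the lines shown here, not from any documented BOINC log format, and `summarize_gpus` is a made-up helper name.

```python
import re

# Pattern inferred from the log lines in this post; not an official BOINC format.
GPU_LINE = re.compile(
    r"\|\s*(?P<api>CUDA|OpenCL): NVIDIA GPU (?P<index>\d+)"
    r"(?: \((?P<status>[^)]*)\))?: (?P<name>[^(]+?) \("
)

def summarize_gpus(log_text):
    """Return (api, index, name, status) tuples; status is 'used' when no flag is shown."""
    rows = []
    for line in log_text.splitlines():
        m = GPU_LINE.search(line)
        if m:
            rows.append((m["api"], int(m["index"]), m["name"].strip(),
                         m["status"] or "used"))
    return rows

# Three lines copied from the log above.
sample = """\
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 0: GeForce GTX 980 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.2, 4096MB, 3066MB available, 7271 GFLOPS peak)
8/26/2016 1:15:37 AM |  | CUDA: NVIDIA GPU 1 (not used): GeForce GTX 750 Ti (driver version 372.54, CUDA version 8.0, compute capability 5.0, 2048MB, 1967MB available, 1606 GFLOPS peak)
8/26/2016 1:15:37 AM |  | OpenCL: NVIDIA GPU 1 (ignored by config): GeForce GTX 750 Ti (driver version 372.54, device version OpenCL 1.2 CUDA, 2048MB, 1967MB available, 1606 GFLOPS peak)
"""

for row in summarize_gpus(sample):
    print(row)
```

Any row whose status is not "used" is a card the client detected but will not schedule work on.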


ID: 1812709 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812717 - Posted: 26 Aug 2016, 6:35:02 UTC - in response to Message 1812690.  

Jeff, here is a screenshot of my resource monitor on that machine



Do you see anything out of the ordinary? I did try disabling things like extra LAN ports, both COM ports, the audio that is installed with each card, that kind of thing, but it didn't seem to help.

ID: 1812717 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1812718 - Posted: 26 Aug 2016, 6:36:27 UTC - in response to Message 1812709.  

<cc_config> 
  <options> 
   <start_delay>120</start_delay>
   <use_all_gpus>1</use_all_gpus> 
  </options> 
</cc_config>


Try that and see if it works
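
The only change from the earlier file is the case of the tag name, which is what the "Unrecognized tag" message in the log was complaining about. A quick way to catch this before restarting the client is to scan the file for element names containing capitals. The sketch below assumes cc_config.xml tag names are expected in all lowercase; `find_non_lowercase_tags` is a hypothetical helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET

def find_non_lowercase_tags(xml_text):
    """Return element names that contain uppercase letters, in document order."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if el.tag != el.tag.lower()]

# The config as originally posted, with the mixed-case tag.
bad_config = """
<cc_config>
  <options>
    <start_delay>120</start_delay>
    <Use_All_GPUs>1</Use_All_GPUs>
  </options>
</cc_config>
"""

# The same config with the tag lowercased.
good_config = bad_config.replace("Use_All_GPUs", "use_all_gpus")

print(find_non_lowercase_tags(bad_config))
print(find_non_lowercase_tags(good_config))
```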
ID: 1812718 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1812719 - Posted: 26 Aug 2016, 6:38:09 UTC - in response to Message 1812709.  


8/26/2016 1:15:33 AM | | Unrecognized tag in cc_config.xml: <Use_All_GPUs>


Syntax is wrong. No caps in any cc_config.xml parameter. Should be:

<use_all_gpus>1</use_all_gpus>
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1812719 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1812720 - Posted: 26 Aug 2016, 6:42:19 UTC

Without the cc_config option, BOINC will always just use the most powerful card in the system and ignore the rest.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1812720 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812722 - Posted: 26 Aug 2016, 6:57:15 UTC - in response to Message 1811856.  

Cliff, I just added that command line to the mb_ file, I'll let you know how it goes. Should I be running only one task at a time now or should I load them up with 2-3?

ID: 1812722 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1812723 - Posted: 26 Aug 2016, 6:57:19 UTC

Looks like a pull down selection for cards at the top of the MSI Afterburner properties, General tab. I see a fan tab too. Right underneath the card selection pull down, there is a "synchronize settings for similar graphics processors" which is exactly the case you have with all GTX 750's.


Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1812723 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812725 - Posted: 26 Aug 2016, 6:59:51 UTC - in response to Message 1812720.  

Wow, didn't know that it was so _particular_.. ;-) Changed it, now it is seeing all of them! Excellent Smithers! Thanks

ID: 1812725 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812727 - Posted: 26 Aug 2016, 7:02:55 UTC - in response to Message 1812723.  
Last modified: 26 Aug 2016, 7:05:26 UTC

Oh, ok, gotcha, it lists a 750 on the front screen, but I noticed that it is downclocking the 980 to the same speed as the 750s. I OC'ed them a little, but want to do a separate OC for the 980, can I unlink it from them? I know it's possible to be that granular in Precision, but as that doesn't work for me here, this is what I have to work with.




ID: 1812727 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812731 - Posted: 26 Aug 2016, 7:13:20 UTC - in response to Message 1812723.  
Last modified: 26 Aug 2016, 7:20:28 UTC

Got the 980 OC'd independently. Thanks! Oh, and here is the current SIV screenshot as it is running right now. Notice how my MegaCruncher breaks the screen? lol It isn't set up for so many cores at once I guess, and I couldn't make the window any bigger so it just ran off the side. Maybe I have to crank my res way up? This is running off the on board video, but I suppose it should support 1600x1200, though that is so small, kinda a pain to read, compared to 1280x1024.

ID: 1812731 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1812733 - Posted: 26 Aug 2016, 7:17:58 UTC

Al, I don't see the downclock in your GPU-Z captures. The cards are running at their respective GPU Boost speeds .... which you have no control over. The cards run at whatever speed they think they can while staying within their engineering limits. The main things that limit the amount of GPU Boost are temps and voltage. To raise the card engineering limits, you have to use a program to raise the power target and temperature limits and bump the card voltage in incremental voltage bin slices.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1812733 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1812736 - Posted: 26 Aug 2016, 7:22:25 UTC - in response to Message 1812722.  

Should I be running only one task at a time now or should I load them up with 2-3?

I found with my GTX 750 Tis, using the supplied command line, that while there was a boost in the number of WUs crunched per hour when running 2 or 3 at a time, it was only a very slight one and not really worth it, as running an Arecibo and a Guppie on the same card can result in a big slowdown for the Arecibo WU and a somewhat longer runtime for the Guppie.
I also had a "Finish file present too long" error when running multiple WUs, and not one when running just 1 WU at a time.
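
To put numbers on that kind of comparison, WUs per hour can be estimated from how many tasks run at once and how long each one takes. The runtimes below are invented purely for illustration and are not measurements from these cards; `work_units_per_hour` is just a helper name for this sketch.

```python
def work_units_per_hour(concurrent_tasks, runtime_minutes):
    """Throughput of one GPU running `concurrent_tasks` at once,
    each task taking `runtime_minutes` to finish."""
    return concurrent_tasks * 60.0 / runtime_minutes

# Hypothetical runtimes: two at once makes each task take nearly twice as long.
single = work_units_per_hour(1, 20.0)
double = work_units_per_hour(2, 38.0)

print(f"1 at a time: {single:.2f} WU/h, 2 at a time: {double:.2f} WU/h")
```

With figures like these the gain from doubling up is only a few percent, which is easy to lose again to a mixed Arecibo and Guppie pairing or an occasional errored task.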
Grant
Darwin NT
ID: 1812736 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812737 - Posted: 26 Aug 2016, 7:24:37 UTC - in response to Message 1812733.  

Hmm, Here is the shot off of the 980 in my other cruncher



I guess it isn't hugely different. I must have been thinking about the 1080, which is running at 2k. I thought the 980 was up there a bit, but I didn't check before I posted, and thought it was significantly higher than the 'lowly' 750... ;-)

ID: 1812737 · Report as offensive
Al Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Joined: 3 Apr 99
Posts: 1682
Credit: 477,343,364
RAC: 482
United States
Message 1812738 - Posted: 26 Aug 2016, 7:25:59 UTC - in response to Message 1812736.  
Last modified: 26 Aug 2016, 7:29:55 UTC

Ok, thanks, currently running one task at a time, I'll leave it that way and see how things progress.

*edit* good gawd, it's after 2, I gotta hit the hay! lol lots to do tomorrow, err, today. Thanks for all your help everyone, I think it's running about as well as it's going to, and I just checked the Kill-a-Watt meter: with all of that running, it's pulling 825 watts out of the wall. It will be interesting to see what its RAC stabilizes at in a week or 2.
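
For a sense of scale, here is what an 825 watt draw works out to per day if it stays constant around the clock. The electricity price is a placeholder assumption, so substitute your own rate.

```python
def daily_energy_kwh(watts):
    """Energy used in 24 hours at a constant draw, in kilowatt-hours."""
    return watts * 24 / 1000

def daily_cost(watts, price_per_kwh):
    """Cost of 24 hours at a constant draw, in whatever currency the price uses."""
    return daily_energy_kwh(watts) * price_per_kwh

wall_draw_watts = 825          # Kill-a-Watt reading from the post
assumed_price_per_kwh = 0.12   # hypothetical rate, not from the thread

print(daily_energy_kwh(wall_draw_watts), "kWh per day")
print(round(daily_cost(wall_draw_watts, assumed_price_per_kwh), 2), "per day at the assumed rate")
```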

ID: 1812738 · Report as offensive


 