LotzaCores and a GTX 1080 FTW
Author | Message |
---|---|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
The daily credit values for the past few days on their 48 core host are: 34,506 36,640 46,096 62,035 35,202 39,718 35,010 35,898 44,786. Which averages out to 41,099 for the past 9 days. And those single-day boosts to credit would have been a result of AP splitting. I noticed when Al got his first batch of AP work, there were MB & AP WUs that each took approx 11,000 secs to crunch. The MB WU paid just under 100, the AP WUs paid just on 500 (although the usual difference between AP & MB was more like 4 to 1). Grant Darwin NT |
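As a quick check of the arithmetic above, here is a minimal Python sketch that recomputes the 9-day average from the quoted figures. The median is an addition not mentioned in the post; it is included only to show how much the single-day spikes pull the average upward.

```python
from statistics import mean, median

# Daily credit values quoted in the post above (9 days).
daily_credit = [34_506, 36_640, 46_096, 62_035, 35_202,
                39_718, 35_010, 35_898, 44_786]

average = mean(daily_credit)         # arithmetic mean over the 9 days
median_value = median(daily_credit)  # middle value, less sensitive to spikes

print(f"mean over {len(daily_credit)} days: {average:.0f}")
print(f"median over {len(daily_credit)} days: {median_value:.0f}")
```

The mean matches the 41,099 figure in the post, while the median sits noticeably lower, which is consistent with a few high-credit days lifting the average.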
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Looking at the stats screen in BOINC, I'm currently sitting at around 26,000 on this machine, but I have had a couple of hiccups over the last week: first the project downtime, and second when I temporarily had to borrow this machine's network cable for another computer and forgot to switch it back for about 18 or so hours. Hence this guy sat idling for the majority of that time; the cache was probably empty within 2-3 hours, as I've noticed has been the case during the maint outages. The graph has leveled off a bit, but hopefully it will keep increasing slowly as we get past those two 'breaks'. No more loafing! On another note, this machine has been running nonstop since 5/29 per the Event log. Not sure how many lines that is, but it hasn't crashed it yet; it will be interesting to see just how unlimited 'unlimited' really is... :-) |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . Thanks for that clarification. I guess I missed the halcyon days of MB6 :( |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Well, a couple of quick updates on this machine: it looks like it is leveling off at about 33k RAC, which is a little lower than I had hoped for, but it is what it is. Also, I just got the confirmation today that the 1080 FTW shipped. I will be getting it Friday, so I will be able to install it this weekend. This brings up the question of which client to run. It looks like the beta client is working wonders for people, but the config is a little confusing. If I were to go that route, is there a typical setup yet when installing it? Or would I be able to get some suggestions as to the config files? I would like to process work as efficiently as possible, and this 48 core machine should be a very good candidate to do it on. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
If I were to go that route, is there a typical setup yet when installing it? Or would I be able to get some suggestions as to the config files? I would like to process work as efficiently as possible, and this 48 core machine should be a very good candidate to do it on. Hopefully others will be along that know, but I'm pretty sure the GTX 1080 has 20 CUs (Compute Units in OpenCL terms), and for maximum performance at this stage the best option is the OpenCL SoG application, giving it 1 CPU core for each GPU WU being crunched. I'd suggest 3 WUs to start with & see how it goes. There's a good chance 4 or even 5 may actually give more work per hour, but start with 3 to get a baseline. Mike is the one for help with configuration settings, and I suspect the settings that are best for the GTX 980Ti would probably be best (or very close to it) for the GTX 1080 for getting the most from the SoG application. Grant Darwin NT |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Well, had a small incident a few minutes ago, I accidentally kicked the cord out of the wall on this machine, and after it rebooted, my event log shows this: 6/23/2016 8:25:35 AM | | Starting BOINC client version 7.6.22 for windows_x86_64 6/23/2016 8:25:35 AM | | log flags: file_xfer, sched_ops, task 6/23/2016 8:25:35 AM | | Libraries: libcurl/7.45.0 OpenSSL/1.0.2d zlib/1.2.8 6/23/2016 8:25:35 AM | | Data directory: C:\ProgramData\BOINC 6/23/2016 8:25:35 AM | | Running under account Flash 6/23/2016 8:25:35 AM | | No usable GPUs found 6/23/2016 8:25:35 AM | SETI@home | Found app_info.xml; using anonymous platform 6/23/2016 8:25:35 AM | | Host name: LotzaCores 6/23/2016 8:25:35 AM | | Processor: 48 GenuineIntel Intel(R) Xeon(R) CPU E5-2692 v2 @ 2.20GHz [Family 6 Model 62 Stepping 4] 6/23/2016 8:25:35 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes f16c rdrandsyscall nx lm avx vmx smx tm2 dca pbe fsgsbase smep 6/23/2016 8:25:35 AM | | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00) 6/23/2016 8:25:35 AM | | Memory: 31.97 GB physical, 63.93 GB virtual 6/23/2016 8:25:35 AM | | Disk: 424.70 GB total, 347.19 GB free 6/23/2016 8:25:35 AM | | Local time is UTC -5 hours 6/23/2016 8:25:35 AM | | Config: GUI RPCs allowed from: 6/23/2016 8:25:35 AM | | Zoom-PC 6/23/2016 8:25:35 AM | | Config: event log limit disabled 6/23/2016 8:25:35 AM | | Config: use all coprocessors 6/23/2016 8:25:35 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8012837; resource share 100 6/23/2016 8:25:40 AM | SETI@home | General prefs: from SETI@home (last modified 03-Apr-2013 23:59:56) 6/23/2016 8:25:40 AM | SETI@home | Computer location: home 6/23/2016 8:25:40 AM | SETI@home | General prefs: no separate prefs for home; using your defaults 6/23/2016 8:25:40 AM | | Preferences: 6/23/2016 8:25:40 AM | | max memory usage 
when active: 16367.02MB 6/23/2016 8:25:40 AM | | max memory usage when idle: 31097.34MB 6/23/2016 8:25:40 AM | | max disk usage: 100.00GB 6/23/2016 8:25:40 AM | | (to change preferences, visit a project web site or select Preferences in the Manager) 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file 6/23/2016 8:25:40 AM | SETI@home | Sending scheduler request: To fetch work. 
6/23/2016 8:25:40 AM | SETI@home | Requesting new tasks for CPU 6/23/2016 8:25:41 AM | SETI@home | Scheduler request completed: got 0 new tasks 6/23/2016 8:25:41 AM | SETI@home | Not sending work - last request too recent: 114 sec 6/23/2016 8:29:58 AM | SETI@home | Message from task: 0 6/23/2016 8:29:58 AM | SETI@home | Computation for task 13dc10ae.17649.23799.6.33.68_1 finished 6/23/2016 8:29:58 AM | SETI@home | Starting task 13jn10aa.30773.1707.5.32.172_0 6/23/2016 8:30:00 AM | SETI@home | Started upload of 13dc10ae.17649.23799.6.33.68_1_0 6/23/2016 8:30:10 AM | SETI@home | Finished upload of 13dc10ae.17649.23799.6.33.68_1_0 6/23/2016 8:30:46 AM | SETI@home | Sending scheduler request: To fetch work. 6/23/2016 8:30:46 AM | SETI@home | Reporting 1 completed tasks 6/23/2016 8:30:46 AM | SETI@home | Requesting new tasks for CPU 6/23/2016 8:30:48 AM | SETI@home | Scheduler request completed: got 1 new tasks 6/23/2016 8:30:50 AM | SETI@home | Started download of 13dc10ae.10893.21345.10.37.167 6/23/2016 8:30:52 AM | SETI@home | Finished download of 13dc10ae.10893.21345.10.37.167 I shut down and restarted BOINC and the errors are still appearing during startup, but it is running tasks currently and looks to have downloaded a couple as well. So am I safe to assume that this is just a one-off event, or is trouble brewing due to this accident? Thanks, guys. |
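When a startup log repeats the same error many times, it can help to tally the errors rather than read them one by one. The following is a rough Python sketch, assuming the Event Log text has been copied out with one entry per line in the `timestamp | project | message` layout seen above. The function name `summarize_event_log` is hypothetical and not part of BOINC.

```python
from collections import Counter

def summarize_event_log(lines):
    """Count '[error]' messages per (project, message) pair.

    Assumes each entry looks like 'timestamp | project | message';
    lines that do not split into three fields are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = [part.strip() for part in line.split("|", 2)]
        if len(parts) != 3:
            continue
        _timestamp, project, message = parts
        if message.startswith("[error]"):
            counts[(project, message)] += 1
    return counts

# A few entries copied from the log above, one per line.
sample = [
    "6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file",
    "6/23/2016 8:25:40 AM | SETI@home | [error] no project URL in task state file",
    "6/23/2016 8:25:40 AM | SETI@home | Sending scheduler request: To fetch work.",
    "6/23/2016 8:25:35 AM |  | No usable GPUs found",
]

error_counts = summarize_event_log(sample)
for (project, message), count in error_counts.items():
    print(f"{count}x {project}: {message}")
```

Comparing the tally across restarts would show whether the number of affected task entries shrinks as old work is reported, or stays constant.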
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
I have seen this many times with a computer crash. For me, I usually get disconnected from SETI and need to reattach the project. BOINC should sort it all out in 2 hours and give you a full cache again. |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Yeah, just took a quick look at it, and it appears that everything is running normally again. One thing that caught my eye was that the temps on the CPUs appear to be running 3-4 degrees cooler than previously. From around 40 to 50, with most 45 and below. I don't understand how a reboot could have affected that; maybe it's because it's cooler today? Or maybe the moon is aligned with Mars rising, and the fact that I dropped something on my foot and hopped around a few times may have added to the voodoo? Who knows, I'll take it though. |
Brent Norman Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
It's because you were running around the room moving wires and creating extra air flow to your open box. You should add this into your morning exercise routine :)) |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Ahh, _that's_ what it was. I think I'll find another way to exercise, do this too many times and things will get corrupted and then bad juju for me... ;-) I actually am getting more than enough exercise right now, I'm building my brewery/shed, and that keeps me more than physically busy enough. Especially pouring the concrete, that'll whip you into shape, or kill you... lol It's not that I don't already have enough on my plate, I can't help myself, sadly. Summer is so short up here, I need to get done as much as I can cram in, and I have to say that SETI for me I think will be more of a wintertime endeavor, though I seem to be managing to fit it in after dark often. Sleep? Naa, it's optional, right? :-) |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Well, today's the day. I just checked UPS, and it says it's out for delivery. Now is the time to figure out how I will proceed. Hopefully I will be able to get some good advice from Mike as was mentioned previously, as I'd really like to get the most I can out of this setup. Looks like driver version 368.39, which has been out since 6/7, is the latest and greatest. Any reason not to go with that one? Should be an exciting day! |
Bernie Vine Joined: 26 May 99 Posts: 9954 Credit: 103,452,613 RAC: 328 |
Hopefully I will be able to get some good advice from Mike as was mentioned previously, Mike is currently on what you guys would probably call a "road trip" across your fair land. Whilst he does check in you may have to wait a little longer than usual. And unless he carries all those command lines in his head.... ;-) |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Gotcha, then I'll be patient, and maybe see if I can fumble thru it myself. I mean, how hard can it be, right? ;-) |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Well, after a couple of false starts, I got the card recognized, and as I had cleared my cache I installed the beta Lunatics client. It started running tasks, so I am up and running. It processed a few tasks really quickly. Here is my startup log file; not sure if there are any issues with it, so if you see anything let me know. Once it has run for a little bit, I think I want to load it up with at least 2-3 concurrent tasks and see how it goes. 6/25/2016 12:47:57 AM | | Starting BOINC client version 7.6.22 for windows_x86_64 6/25/2016 12:47:57 AM | | log flags: file_xfer, sched_ops, task 6/25/2016 12:47:57 AM | | Libraries: libcurl/7.45.0 OpenSSL/1.0.2d zlib/1.2.8 6/25/2016 12:47:57 AM | | Data directory: C:\ProgramData\BOINC 6/25/2016 12:47:57 AM | | Running under account Flash 6/25/2016 12:47:58 AM | | CUDA: NVIDIA GPU 0: GeForce GTX 1080 (driver version 368.39, CUDA version 8.0, compute capability 6.1, 4096MB, 3044MB available, 9523 GFLOPS peak) 6/25/2016 12:47:58 AM | | OpenCL: NVIDIA GPU 0: GeForce GTX 1080 (driver version 368.39, device version OpenCL 1.2 CUDA, 8192MB, 3044MB available, 9523 GFLOPS peak) 6/25/2016 12:47:58 AM | SETI@home | Found app_info.xml; using anonymous platform 6/25/2016 12:47:58 AM | | Host name: LotzaCores 6/25/2016 12:47:58 AM | | Processor: 48 GenuineIntel Intel(R) Xeon(R) CPU E5-2692 v2 @ 2.20GHz [Family 6 Model 62 Stepping 4] 6/25/2016 12:47:58 AM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes f16c rdrandsyscall nx lm avx vmx smx tm2 dca pbe fsgsbase smep 6/25/2016 12:47:58 AM | | OS: Microsoft Windows 7: Ultimate x64 Edition, Service Pack 1, (06.01.7601.00) 6/25/2016 12:47:58 AM | | Memory: 31.97 GB physical, 63.93 GB virtual 6/25/2016 12:47:58 AM | | Disk: 424.70 GB total, 342.74 GB free 6/25/2016 12:47:58 AM | | Local time is UTC -5 hours 6/25/2016 12:47:58 AM | | Config: GUI RPCs allowed from: 
6/25/2016 12:47:58 AM | | Zoom-PC 6/25/2016 12:47:58 AM | | Config: event log limit disabled 6/25/2016 12:47:58 AM | | Config: use all coprocessors 6/25/2016 12:47:58 AM | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 8012837; resource share 100 6/25/2016 12:47:58 AM | SETI@home | General prefs: from SETI@home (last modified 03-Apr-2013 23:59:56) 6/25/2016 12:47:58 AM | SETI@home | Computer location: home 6/25/2016 12:47:58 AM | SETI@home | General prefs: no separate prefs for home; using your defaults 6/25/2016 12:47:58 AM | | Reading preferences override file 6/25/2016 12:47:58 AM | | Preferences: 6/25/2016 12:47:58 AM | | max memory usage when active: 16367.02MB 6/25/2016 12:47:58 AM | | max memory usage when idle: 31097.34MB 6/25/2016 12:47:58 AM | | max disk usage: 100.00GB 6/25/2016 12:47:58 AM | | (to change preferences, visit a project web site or select Preferences in the Manager) 6/25/2016 12:47:58 AM | | Can't resolve hostname in remote_hosts.cfg: Zoom-PC Here is the link to this computer if you'd like to take a closer look at it. |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Took a look this morning at the rig, and everything appears to be working properly, so I just created an app_config.xml file, assigned one core per WU, and am running 4 right now to see how well it goes. It is running its first 4 SoG tasks; they just passed 25% at 6 mins 30 secs. Not sure how this compares to others running it, but I will let it run like this for a while to see how it shakes out. If anyone has suggestions about optimizing it now that I have the card installed, I am all ears! :-) |
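For anyone following along, here is a hedged Python sketch that generates an app_config.xml along the lines described above (4 tasks on one GPU, one CPU core per task). The element names follow the general BOINC app_config.xml layout; the application name `setiathome_v8` is an assumption and should be checked against the app names in your own client, and `build_app_config` is a hypothetical helper, not something the post's author used.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name, tasks_per_gpu, cpus_per_task):
    """Return app_config.xml text for running several tasks on one GPU.

    gpu_usage is the fraction of a GPU each task claims, so
    4 tasks per GPU means gpu_usage = 0.25.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = f"{1 / tasks_per_gpu:g}"
    ET.SubElement(gpu_versions, "cpu_usage").text = f"{cpus_per_task:g}"
    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")

# Assumed app name; 4 concurrent GPU tasks, 1 CPU core reserved for each.
config_text = build_app_config("setiathome_v8", tasks_per_gpu=4, cpus_per_task=1)
print(config_text)
```

The printed text would go into app_config.xml in the project's folder under the BOINC data directory, after which the client needs to re-read its config files (or be restarted) for the change to take effect.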
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
One last thing: I just installed GPU-Z and am checking it out. Here is a reading from it: Date: 2016-06-25 10:09:20; GPU Core Clock [MHz]: 2037.5; GPU Memory Clock [MHz]: 1133.2; GPU Temperature [°C]: 36.0; Fan Speed (%) [%]: 100; Fan Speed (RPM) [RPM]: 2691; Memory Used [MB]: 1524; GPU Load [%]: 99; Memory Controller Load [%]: 52; Video Engine Load [%]: 0; Bus Interface Load [%]: 8; Power Consumption [% TDP]: 34.1; PerfCap Reason []: 4; VDDC [V]: 1.0500. I guess my only question seeing this would be about the GPU load bouncing between 97-100% running 4 tasks, though all other stats look pretty good. I have it overclocked to 2037 with no voltage tweaks, and it's running at 36 degrees, which is WAY better than my other cards, especially my 980Ti, which is about 25-30 degrees warmer than that. Should I be concerned about the GPU load when all the other stats are looking so good? |
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
I just set it this way so I can keep it under optimal temps, at least for the time being. I will tone it down a little bit in a few days and see how the temps react. I'm pretty surprised at how quiet this thing is even when running at 100%. Either one of my 950s or my 750s (can't remember which one), when cranked up to 100%, which I think was up to 4500 rpm, and Man, now That is loud. Sounds like a mosquito in your ear, but louder. That's why I really didn't even notice this one running flat out, being as quiet as it is. But, if you read their marketing blurb: EVGA ACX 3.0 fans use double ball bearings, which offer 4X longer lifespan than the sleeve bearing fans used by competitors. The oil that is used in sleeve bearing fans makes them vulnerable and prone to failure after time when the oil dries up. Upgrade to EVGA ACX 3.0 and your card will go the distance! So, it will apparently last pretty much forever! lol |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13722 Credit: 208,696,464 RAC: 304 |
EVGA ACX 3.0 fans use double ball bearings, which offer 4X longer lifespan than the sleeve bearing fans used by competitors. Yeah, ball bearings are so much better than sleeve bearings. But given the GPU temp is 36°C(!) I'd probably run the fans at 75% & see what temperatures you get. 55°C or lower would be good. I guess my only question seeing this would be about the GPU load bouncing between 97-100% running 4 tasks I wouldn't worry about it. Generally running at 97% or so load will give more throughput than when running at 100%. The fact it hits 100% occasionally indicates it's probably running the optimal number of WUs for crunching the most work per hour. Give it a few days, keep an eye on the run times to see what's normal for each type of WU, then give 5 WUs a go & see if the increase in crunching time gives more WUs per hour or less. Give it a week or 2 to settle down & make sure everything is stable, then try a few of the suggested optimisations, or wait for Mike to return from his trip. Grant Darwin NT |
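The "more WUs per hour or less" comparison suggested above comes down to simple arithmetic: tasks running at once, divided by how long each one takes. A minimal Python sketch follows. The baseline uses Al's earlier observation (4 tasks at 25% after 6 min 30 s) and assumes progress is linear, which it may not be; the 2,000-second runtime for 5 concurrent tasks is purely hypothetical.

```python
def tasks_per_hour(concurrent_tasks, seconds_per_task):
    """Throughput when `concurrent_tasks` run side by side, each taking `seconds_per_task`."""
    return concurrent_tasks * 3600 / seconds_per_task

# Baseline: 25% done after 6 min 30 s, so roughly 26 minutes per task
# if progress is linear (an assumption).
estimated_seconds = (6 * 60 + 30) / 0.25
baseline = tasks_per_hour(4, estimated_seconds)

# Hypothetical runtime observed after switching to 5 concurrent tasks.
candidate = tasks_per_hour(5, 2000)

# Runtime at which 5 concurrent tasks would exactly match the 4-task baseline.
break_even_seconds = estimated_seconds * 5 / 4

print(f"4 at a time: {baseline:.2f} tasks/hour")
print(f"5 at a time: {candidate:.2f} tasks/hour")
print(f"5 at a time only wins if each task finishes in under {break_even_seconds:.0f} s")
```

Since run times vary a lot between WU types, the comparison is only meaningful when both figures are averaged over the same kind of work.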
Al Joined: 3 Apr 99 Posts: 1682 Credit: 477,343,364 RAC: 482 |
Took a look at its stats this morning, and noticed that it had Validation inconclusive (37), which I don't believe is a huge deal, but more concerning is the Error (2) that has now shown up. Is this something to be concerned about, or might it be a one-off issue? Probably hard to say, but I will be keeping a closer eye on it, because I don't remember having many of these errors on my other machines, and from the day this one was set up until I installed the new 1080, it had 0 errors. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Well, that depends on what the errors were, doesn't it? Starting from Error tasks for computer 8012837, you can click on Workunit 2194316903 and Workunit 2194241877. Of the 7 replications of those two jobs currently visible, six are allocated to Raistmer's r3430 (stock) builds, or your r3472 (anonymous platform) build. Three have failed with "ERROR: Possible wrong computation state on GPU, host needs reboot or maintenance". In Beta message 58680 Eric Korpela wrote: "I'm beginning to think we should double the size of the 'GPU sanity check' threshold on subsequent versions." and Raistmer wrote: "I'll block this check in next build." That exchange was about a week ago, and AFAIK there hasn't been a 'next build' yet. In other words, the OpenCL SoG/sah pair are still unfinished business, and as yet unproven, especially on hardware we don't have full crunching experience with yet. Expect the unexpected, and be ready to investigate it. I'll be watching for the outcome of the sole CUDA representative in the second WU, to see what it makes of that job. |