Posts by jason_gee

1) Message boards : Number crunching : Microsoft Monthly Rollups (Message 1862003)
Posted 13 days ago by Profile jason_gee
Post:
Actually they started that with Win7, after the Vista disaster... and that worked. What they are doing now is clearly worshipping at the shrine of 'Big Data', which, according to the book 'The Seven Pillars of Statistical Wisdom', is folly. In this context I believe what they are doing, similar to the CIA/NSA, is guaranteeing their own demise.
2) Message boards : Number crunching : Why are the benchmark results under Ubuntu/linux are nearby doubled ? (Message 1861993)
Posted 13 days ago by Profile jason_gee
Post:
Thank you very much, Jason, for explaining the differences in benchmark profiling between modern Linux and Windows systems. It makes sense. It also explains why the apt derogatory nickname of "Credit_Screw" is so richly deserved. :-}


Lol, since stumbling across that little chestnut (which amounts to abuse of Whetstone, generating numbers for estimates that are meaningless by a factor of 2-8), I'll get my little dig in at every opportunity. Can't wait for AVX512, when I will pull out the beer and popcorn :D
3) Message boards : Number crunching : Why are the benchmark results under Ubuntu/linux are nearby doubled ? (Message 1861988)
Posted 13 days ago by Profile jason_gee
Post:
The BOINC profiling benchmarks are solely CPU based. I've never heard a reason why the very same processor benchmarks 50% faster on a Linux system compared to a Windows system. I've only concluded that the Linux system is more efficient at running CPU benchmarks than a Windows system, simply because of fewer background tasks or simply better core processing.


For technical background, it helped to have looked at the source code for the bench, along with the history of Whetstone, in the context of the hardware and OS changes since the Boinc client was originally devised.

- Whetstone was originally FPU-serial only.
- It was designed and refined to defeat common compiler 'tricks' that remove code or avoid loops.
- SIMD floating point became common with the Pentium 3 and SSE, and required hand vectorisation to use.
- Some dedicated SIMD versions of Whetstone were developed, though they were never put into the Boinc client code.
- 64-bit Windows requires at least SSE2, so the compiler is geared to use that*.
---> (*This caused quite some grief making 64-bit Windows applications in the past, since a lot of code used to be 80-bit FPU based.)
- Compilers (especially GCC on Linux) have only relatively recently become good at vectorising automatically, either optionally or by default, depending on compiler options.
--------------------
- The Boinc client code suggests the naive approach was used, namely leaving optimisation to the compiler, meaning the bench compiles differently on every compiler+platform combination [a bad idea for any bench code].
- For a 'true' Whetstone, it should either have enforced a serial scalar implementation even on SIMD hardware, or
- alternatively had a way to differentiate SIMD vector length for client and apps on the server, or
- run a bench per functional vector length.

The different vector-length performance of Whetstone, 'proper' serial vs SIMD at various vector lengths, is what drives the discrepancy. You can see this by running SiSoft Sandra Lite single-threaded Whetstone benchmarks at each of the vector lengths.
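
To make that concrete, here's a rough sketch of the kind of loop involved (my own illustration, file name and constants made up, NOT the Boinc bench source): compiled strictly scalar it behaves like classic Whetstone, but let a modern compiler auto-vectorise it and the reported figure can inflate by up to roughly the SIMD width, which is exactly the platform/compiler-dependent factor described above.

/* whetlike.cpp -- illustrative only, not the BOINC benchmark source.
   A Whetstone-style serial FP module that a modern compiler will
   happily auto-vectorise unless told not to. */
#include <cstdio>
#include <ctime>

#define ITERS 100000000L

int main()
{
    double x[8] = {1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7};
    const double c = 0.49999975;          /* Whetstone-ish constant */
    clock_t start = clock();

    for (long i = 0; i < ITERS; i++)
        for (int j = 0; j < 8; j++)       /* independent lanes: vectorisable */
            x[j] = (x[j] + 1.0) * c;

    double secs = (double)(clock() - start) / CLOCKS_PER_SEC;
    double mops = (double)ITERS * 8.0 * 2.0 / secs / 1e6;  /* 1 add + 1 mul per step */
    printf("%.1f Mops/s (checksum %g)\n", mops, x[0] + x[7]);
    return 0;
}

/* Build it both ways and compare the reported number:
     g++ -O2 -fno-tree-vectorize -fno-tree-slp-vectorize whetlike.cpp -o whet_scalar
     g++ -O3 -march=native whetlike.cpp -o whet_simd
   The ratio between the two is the same kind of factor the client
   bench ends up reporting differently across compilers/platforms. */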

This is one of the main three or four design flaws in the estimate machinery that break CreditNew. It showed up as noticeable irregularities when efficient SIMD implementations were added to the stock CPU application code, then again more recently when AVX was added.
4) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1859923)
Posted 23 days ago by Profile jason_gee
Post:
Cufft failures like that have a lot of possible causes that only some real monitoring can diagnose further. Fingers crossed the nvml support (as used by the nvidia-smi command) will work under the new drivers for Pascal when they materialise. If so I'll just whip up a lightweight (cross-platform) standalone monitoring utility, and put some basic monitoring in the apps. There are some basic failures, due to bugs or otherwise, that may be caught with the sort of things Petri and I discussed in the Linux thread; however there is still some juggling likely across all platforms now, since Vulkan (and Vulkan over Metal) is starting to show its teeth in places, and supposedly (rumoured) Apple wants VR support, which demands high-end GPUs.
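
For reference, the sort of thing I have in mind is just a thin wrapper over NVML, along these lines (a bare-bones sketch of my own, not the planned utility, assuming the driver actually exposes NVML for the card):

/* monitor_sketch.cpp -- illustrative only.
   Polls basic health figures via NVML, the same source nvidia-smi uses. */
#include <cstdio>
#include <nvml.h>

int main()
{
    if (nvmlInit() != NVML_SUCCESS) {
        fprintf(stderr, "NVML not available under this driver\n");
        return 1;
    }

    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex(0, &dev) == NVML_SUCCESS) {
        unsigned int temp = 0, mW = 0;
        nvmlUtilization_t util = {0, 0};
        nvmlDeviceGetTemperature(dev, NVML_TEMPERATURE_GPU, &temp);  /* degrees C */
        nvmlDeviceGetPowerUsage(dev, &mW);                           /* milliwatts */
        nvmlDeviceGetUtilizationRates(dev, &util);                   /* percentages */
        printf("GPU0: %u C, %.1f W, %u%% core / %u%% memory\n",
               temp, mW / 1000.0, util.gpu, util.memory);
    }

    nvmlShutdown();
    return 0;
}
/* Build (Linux): g++ monitor_sketch.cpp -o monitor_sketch -lnvidia-ml */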
5) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1859891)
Posted 24 days ago by Profile jason_gee
Post:
Nvidia is finally releasing Pascal drivers for the Mac. Guess it's time to start thinking about an upgrade...

https://blogs.nvidia.com/blog/2017/04/06/titan-xp/

Chris


Wooo, that'll make things way easier.
6) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1859160)
Posted 28 days ago by Profile jason_gee
Post:
Hi,
Here is an interesting one: http://setiathome.berkeley.edu/workunit.php?wuid=2488511762
The SoG has the same kind of error that my version has. I'd like to know if R. finds a cure for that - it might help me too.

Petri


I used to run a 560ti, and early factory OC models were shipped with insufficient core voltage by default. They also tend to get pretty toasty. My feeling is we'll eventually have to embed some monitoring (e.g. NVML sensors during the run) and possibly some lightweight spot checks.

Another potentially handy thing, where you use padding, might be to fill it with 0xDEADDEAD instead of zeroes, then throw in some extra threads with a conditional, such that the extras look for the hex value and either set a flag or throw an exception when corruption is detected. Not exactly rigorous, but low cost and better than nothing.
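
Something like this is what I'm picturing (a rough sketch of my own; the kernel names, flag name and layout are made up for illustration, not from the actual app):

/* pad_sentinel.cu -- illustrative sketch only.
   Fill the padding with 0xDEADDEAD, then let a few spare threads
   scan it and raise a device-side flag if anything overwrote it. */
#include <cstdio>
#include <cuda_runtime.h>

#define PAD_SENTINEL 0xDEADDEADu

__device__ unsigned int d_padCorrupt = 0;

__global__ void fillPadding(unsigned int *pad, size_t words)
{
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x;
         i < words; i += gridDim.x * blockDim.x)
        pad[i] = PAD_SENTINEL;
}

__global__ void checkPadding(const unsigned int *pad, size_t words)
{
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x;
         i < words; i += gridDim.x * blockDim.x)
        if (pad[i] != PAD_SENTINEL)
            atomicOr(&d_padCorrupt, 1u);    /* cheap: just flag it */
}

/* Host side, after the kernels under suspicion have run:
     unsigned int flag = 0;
     cudaMemcpyFromSymbol(&flag, d_padCorrupt, sizeof(flag));
     if (flag) fprintf(stderr, "padding corrupted - result suspect\n");  */

Cheap enough to leave enabled, and it narrows a corruption down to 'since the last check' rather than finding out at validation time.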

[I plan something along those lines for the generic version, more oriented to the automated tuning, however that's further off.]
7) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1858906)
Posted 29 days ago by Profile jason_gee
Post:
Hi, nearly.
Run time with 1080 is 140+ seconds and with ti it is 107 seconds. That is for vlar. See my results.


Once we get to 12 seconds then we're obsolete, since that is the task observation time, and it might be cheaper to crowdfund 1080tis to Berkeley for realtime analysis, for Arecibo/multibeam anyway.
8) Message boards : Number crunching : Panic Mode On (105) Server Problems? (Message 1858873)
Posted 29 days ago by Profile jason_gee
Post:
Looks like I'm out of beer.... Going to go stock up to ride this through.
9) Message boards : Number crunching : Panic Mode On (105) Server Problems? (Message 1858858)
Posted 29 days ago by Profile jason_gee
Post:
Lol, well, best of plans... I had been trying to get the last tasks on my Linux/Mac host to report, so I can switch hardware and do things. It seems they finally reported a few minutes ago. I guess the server hamsters got the delivery of replacement rubber bands OK.
10) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1858503)
Posted 31 Mar 2017 by Profile jason_gee
Post:
Slightly closer to diving in. Will see if this weekend gets as throttled as the last...
11) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857917)
Posted 27 Mar 2017 by Profile jason_gee
Post:
Heh, spike overflows at 0 chirp. That's weird, since that's normally just a copy, FFT and power spectrum. Perhaps there's a silent failure or incomplete kernel launch. That will give me something to poke at early in the piece.
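
If anyone wants to poke at it before I do, the checks that would expose that sort of silent or incomplete launch look roughly like this (sketch only; buffer names, sizes and the powerSpectrum kernel are made up, not the app's actual code path):

/* fftcheck.cu -- illustrative only.
   Copy -> FFT -> power spectrum, with the error checks that would
   surface a silent cufft failure or a kernel launch that never ran. */
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

__global__ void powerSpectrum(const cufftComplex *freq, float *power, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        power[i] = freq[i].x * freq[i].x + freq[i].y * freq[i].y;
}

int main()
{
    const int n = 1024 * 1024;            /* made-up size */
    cufftComplex *d_data = nullptr;
    float *d_power = nullptr;
    cudaMalloc(&d_data, n * sizeof(cufftComplex));
    cudaMalloc(&d_power, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(cufftComplex));   /* stands in for the copy */

    cufftHandle plan;
    cufftPlan1d(&plan, n, CUFFT_C2C, 1);
    cufftResult fr = cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    if (fr != CUFFT_SUCCESS)
        fprintf(stderr, "cufftExecC2C failed: %d\n", (int)fr);

    powerSpectrum<<<(n + 255) / 256, 256>>>(d_data, d_power, n);
    cudaError_t launchErr = cudaGetLastError();       /* bad config, wrong arch, etc. */
    cudaError_t execErr   = cudaDeviceSynchronize();  /* faults during execution */
    if (launchErr != cudaSuccess || execErr != cudaSuccess)
        fprintf(stderr, "powerspectrum: %s / %s\n",
                cudaGetErrorString(launchErr), cudaGetErrorString(execErr));

    cufftDestroy(plan);
    cudaFree(d_data);
    cudaFree(d_power);
    return 0;
}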
12) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857898)
Posted 27 Mar 2017 by Profile jason_gee
Post:
Hmmm, will need to remember to check whether the pulsefind launches have a tail on the unroll, in case the number of periods is not divisible by pfPeriodsPerLaunch * unroll. If it's missing a tail, or there is a bug in it, then yes, a lower value would be less likely to cause an issue. Ultimately we'll need to stick some timers around those launches and aim for < ~20 ms or so between synchs. Probably Linux/Mac will adopt the Windows-style driver optimisations (such as kernel fusion in the streams), which means automatic scaling there may become necessary sooner rather than later.
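
Something like the following is what I mean by the tail and the timing. pfPeriodsPerLaunch and unroll are the names from above; the kernel here is just a stub, and the launch configuration and numbers are made up, so treat it as a sketch rather than Petri's code:

/* pulselaunch.cu -- illustrative sketch only.
   Walk the period list in chunks of pfPeriodsPerLaunch * unroll,
   launch a tail for any remainder, and time each launch. */
#include <cstdio>
#include <cuda_runtime.h>

__global__ void pulseFindKernel(int firstPeriod, int periodsThisLaunch)
{
    /* stub standing in for the real pulsefind work */
}

void runPulseFind(int numPeriods, int pfPeriodsPerLaunch, int unroll)
{
    const int perLaunch = pfPeriodsPerLaunch * unroll;
    cudaEvent_t t0, t1;
    cudaEventCreate(&t0);
    cudaEventCreate(&t1);

    for (int first = 0; first < numPeriods; first += perLaunch) {
        int count = numPeriods - first;              /* the tail...          */
        if (count > perLaunch) count = perLaunch;    /* ...or a full chunk   */

        cudaEventRecord(t0);
        pulseFindKernel<<<64, 256>>>(first, count);  /* made-up launch config */
        cudaEventRecord(t1);
        cudaEventSynchronize(t1);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, t0, t1);
        if (ms > 20.0f)        /* aim for < ~20 ms between synchs */
            fprintf(stderr, "launch at period %d took %.1f ms - lower unroll?\n",
                    first, ms);
    }
    cudaEventDestroy(t0);
    cudaEventDestroy(t1);
}

int main()
{
    runPulseFind(100000, 100, 16);   /* made-up numbers */
    return 0;
}

That timing check is also where automatic scaling could hook in, e.g. halving the unroll for the next chunk when a launch blows the budget, rather than hard-coding a value per card.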
13) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857858)
Posted 27 Mar 2017 by Profile jason_gee
Post:
The usual delays my end, though I am running down the Linux/Mac host at the moment for the GPU switchover and passthrough experiments. Will likely end up updating the alpha in svn to Petri's latest supplied sources, at least once the 980 is operational natively in Ubuntu and Sierra and/or El Capitan. Cross-platform comparison and migration to Gradle automation will occur gradually, with a pulsefind reduction patch the likely priority before integration (depending on the inner changes).
14) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857438)
Posted 24 Mar 2017 by Profile jason_gee
Post:
Hmm, interesting theory. Sure, it has EFI rather than a BIOS, though setting the system clock should conceivably do the same thing. Most likely I'll end up virtualising both Win10 and OSX (some version), and passing through alternate GPUs at leisure anyway. A lot will depend on what happens with this weekend's round of fiddling. I quite like Sierra and El Capitan (for what they are) and will preserve their native installs. It's really only the development automation where it would be nice to be able to use near-identical setups on each OS simultaneously. That'll hopefully work when passing through the 980; we'll see. Worst case, failure will just mean running 3 separate machines with different quirky hardware. Either way, nothing insurmountable.
15) Message boards : Number crunching : Panic Mode On (105) Server Problems? (Message 1857334)
Posted 24 Mar 2017 by Profile jason_gee
Post:
This whole discussion is moot. It's been rehashed time and time again. The only way it will ever change is IF and when Dr. A makes the decision.


. . That is so very true. All the discussion in the world will change nothing until Dr A decides to do so.

Stephen

+1


Well, yes and no. Since the estimate mechanisms (which drive the whole thing) are based on an outdated programming model, at some point it's inevitable that a widely needed technological change will break the whole system, requiring some more band-aids. Those events aren't in Dr A's control, as happened for example with the introduction of GPUs. The two upcoming examples I'm thinking of would be convolutional neural network approaches, and local heterogeneous clustering. The first case would have arbitrary runtimes compared to the classical model, and the second anything from similar to nonsensical. As it stands, two different-speed GPUs of the same class [in the same system] are already problematic for estimates.
16) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857133)
Posted 23 Mar 2017 by Profile jason_gee
Post:
Yes, will be a juggle here as well. At least I preserved my El Capitan install alongside Sierra. Unfortunately I did not preserve the Yosemite install that came on the machine, and don't have installation media for that. Are media for older versions available via Apple? Bootcamp on El Capitan and Sierra seems not to work or be supported on this older Mac Pro, so I have been using rEFInd instead (fine). Would not mind having some physical media in case of emergency anyhow (whichever versions).
The last time I dealt with that situation it was easy to download the Older OS Installer if you are registered as having "purchased" your Free copy earlier. Just go to the App Store App and Download it from your Purchased Tab. If you never Downloaded the Older Installer back when it was current it's worse than pulling teeth to get a copy now. That was well over a year ago. I also found they have Time Limits on the Installers, some of my older Downloaded OS Installers don't work anymore, I had to download a new copy.


Oh well, yeah will cross that bridge if and when I need to (probably won't need to bother). For now I'm surprised that I got the IOMMU cooperating under Linux, so enough juggling for the weekend.
17) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857127)
Posted 23 Mar 2017 by Profile jason_gee
Post:
...Can double-check the legals and look at hosting the large files on the weekend (compressed or otherwise). Wrestling with work, though will likely want to be comparing Win/Linux/Mac performance with the 980 myself in due course.
I just checked at C.A. and there is a copy of libcufft.7.5.dylib in the Alpha section bundled with the old "Baseline App". I think it has the RT file included. I haven't tested it yet, perhaps that file can be used even if it does have the old App included. The way I remember, the only problem I had running the Special App in Yosemite with the 7.5 driver was a few Overflows with the 750Ti. I don't have the 750Ti in the Mac now, so, I don't expect any problems in Yosemite. I was going to swap over to Yosemite later tonight.


Yes, will be a juggle here as well. At least I preserved my El Capitan install alongside Sierra. Unfortunately I did not preserve the Yosemite install that came on the machine, and don't have installation media for that. Are media for older versions available via Apple? Bootcamp on El Capitan and Sierra seems not to work or be supported on this older Mac Pro, so I have been using rEFInd instead (fine). Would not mind having some physical media in case of emergency anyhow (whichever versions).
18) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857112)
Posted 22 Mar 2017 by Profile jason_gee
Post:
Well.... I do have a nicely working copy of x41p_zi3k+ for OSX. In fact, it seems to be working slightly better than the Linux version. Just finishing testing on the CUDA 7.5 version so it will work with Yosemite. The problem is it will need the CUDA 7.5 Libraries, there is a noticeable difference between the CUDA 6.5 and 7.5 Libraries with the App. So, unless someone can Host the 7.5 Libraries, everyone is going to have to download the 7.5 Toolkit and Extract the Libraries. Right now setiathome_x41p_zi3k+_x86_64-apple-darwin_cuda75 is working well in El Capitan with the CUDA 8.0 driver.


Can double-check the legals and look at hosting the large files on the weekend (compressed or otherwise). Wrestling with work, though will likely want to be comparing Win/Linux/Mac performance with the 980 myself in due course.
19) Message boards : Number crunching : You have to love early Xmas and B'day presents (even if you have to pay for them yourself). (Message 1857106)
Posted 22 Mar 2017 by Profile jason_gee
Post:
Any idea if the integer performance of the GP100 trickled through to the 1080/1080 Ti (GP104/GP102) variants (despite no HBM)? Just updated my Matlab license for other stuff I do, and this time it came with Fixed-Point Designer...
20) Message boards : Number crunching : Panic Mode On (105) Server Problems? (Message 1857103)
Posted 22 Mar 2017 by Profile jason_gee
Post:
The one Linux host I currently have running (actually a Mac Pro running Ubuntu with a 1050 Ti) was stone cold, with no tasks, when I awoke an hour ago. Looking just now, it has picked up a mix of Arecibo and GBT work, CPU and GPU, without intervention. Looks like it was dry for a pretty long time though (rifling through the logs).

