Posts by jason_gee

1) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857438)
Posted 2 days ago by Profile jason_gee
Post:
Hmm, interesting theory. Sure it has EFI rather than BIOS, though setting the system clock should conceivably do the same thing. Most likely I'll end up virtualising both Win10 and OSX (some version), and passing through alternate GPUs at leisure anyway. A lot will depend on what happens with this weekend's round of fiddling. I quite like Sierra and El Capitan (for what they are) and will preserve their native installs. It's really only the development automation where it would be nice to be able to use near identical setups on each OS simultaneously. That'll hopefully work passing through the 980, we'll see. Worst case on failure will just mean running 3 separate machines with different quirky hardware. Either way nothing insurmountable.
2) Message boards : Number crunching : Panic Mode On (105) Server Problems? (Message 1857334)
Posted 2 days ago by Profile jason_gee
Post:
This whole discussion is moot. It's been rehashed time and time again. The only way it will ever change is IF and when Dr. A makes the decision.


. . That is so very true. All the discussion in the world will change nothing until Dr A decides to do so.

Stephen

+1


Well yes and no. Since the estimate mechanisms (which drive the whole thing) are based on an outdated programming model, at some point it's inevitable that a widely needed technological change will break the whole system, requiring some more bandaids. Those events aren't in Dr A's control, as for example happened with the introduction of GPUs. The two upcoming examples I'm thinking of would be Convolutional Neural Network approaches, and local heterogeneous clustering. The first case would have arbitrary runtimes compared to the classical model, and the second case anywhere from similar to nonsensical. As it stands, two different speed GPUs of the same class [in the same system] are already problematic for estimates.
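To put made-up numbers on that last point, here is a minimal sketch. It assumes the estimator keeps one averaged processing rate for an application on a host, which is a simplification for illustration and not the actual BOINC code. The task size, GPU names and speeds are all hypothetical.

```python
# Toy model only: one shared rate estimate covering two unequal GPUs.
# All numbers are hypothetical; this is not the real BOINC estimator.

TASK_FLOPS = 30e12          # hypothetical work per task (floating point ops)
GPU_RATES = {               # hypothetical sustained rates, in FLOPS
    "fast_gpu": 300e9,
    "slow_gpu": 100e9,
}

def actual_runtime(gpu: str) -> float:
    """Seconds the task really takes on the named GPU."""
    return TASK_FLOPS / GPU_RATES[gpu]

def shared_estimate() -> float:
    """Seconds predicted when a single averaged rate covers both GPUs."""
    mean_rate = sum(GPU_RATES.values()) / len(GPU_RATES)
    return TASK_FLOPS / mean_rate

def estimate_error(gpu: str) -> float:
    """Ratio of estimated to actual runtime (1.0 would be a perfect estimate)."""
    return shared_estimate() / actual_runtime(gpu)

if __name__ == "__main__":
    for gpu in GPU_RATES:
        print(f"{gpu}: actual {actual_runtime(gpu):.0f}s, "
              f"estimated {shared_estimate():.0f}s, "
              f"ratio {estimate_error(gpu):.2f}")
```

With these invented figures the single estimate sits between the two real runtimes, so it overshoots on the faster card and undershoots on the slower one, and no single value can be right for both.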
3) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857133)
Posted 3 days ago by Profile jason_gee
Post:
Yes, will be a juggle here as well. At least I preserved my el Capitan install alongside Sierra. Unfortunately I did not preserve the Yosemite install that came on the machine, and don't have installation media for that. Are media available for older versions via Apple? Bootcamp on el Capitan and Sierra seems to not work or be supported on this older Mac Pro, so have been using rEFInd instead (fine). Would not mind having some physical media in case of emergency anyhow (whichever versions).
The last time I dealt with that situation it was easy to download the Older OS Installer if you are registered as having "purchased" your Free copy earlier. Just go to the App Store App and Download it from your Purchased Tab. If you never Downloaded the Older Installer back when it was current it's worse than pulling teeth to get a copy now. That was well over a year ago. I also found they have Time Limits on the Installers, some of my older Downloaded OS Installers don't work anymore, I had to download a new copy.


Oh well, yeah will cross that bridge if and when I need to (probably won't need to bother). For now I'm surprised that I got the IOMMU cooperating under Linux, so enough juggling for the weekend.
4) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857127)
Posted 4 days ago by Profile jason_gee
Post:
...Can double check the legals and look at hosting the large files on the weekend (compressed or otherwise). Wrestling with work, though will likely want to be comparing Win/Linux/Mac performance with the 980 myself in due course.
I just checked at C.A. and there is a copy of libcufft.7.5.dylib in the Alpha section bundled with the old "Baseline App". I think it has the RT file included. I haven't tested it yet, perhaps that file can be used even if it does have the old App included. The way I remember, the only problem I had running the Special App in Yosemite with the 7.5 driver was a few Overflows with the 750Ti. I don't have the 750Ti in the Mac now, so, I don't expect any problems in Yosemite. I was going to swap over to Yosemite later tonight.


Yes, will be a juggle here as well. At least I preserved my el Capitan install alongside Sierra. Unfortunately I did not preserve the Yosemite install that came on the machine, and don't have installation media for that. Are media available for older versions via Apple? Bootcamp on el Capitan and Sierra seems to not work or be supported on this older Mac Pro, so have been using rEFInd instead (fine). Would not mind having some physical media in case of emergency anyhow (whichever versions).
5) Message boards : Number crunching : I've Built a Couple OSX CUDA Apps... (Message 1857112)
Posted 4 days ago by Profile jason_gee
Post:
Well.... I do have a nicely working copy of x41p_zi3k+ for OSX. In fact, it seems to be working slightly better than the Linux version. Just finishing testing on the CUDA 75 version so it will work with Yosemite. The problem is it will need the CUDA 7.5 Libraries, there is a noticeable difference between the CUDA 6.5 and 7.5 Libraries with the App. So, unless someone can Host the 7.5 Libraries, everyone is going to have to download the 7.5 Toolkit and Extract the Libraries. Right now setiathome_x41p_zi3k+_x86_64-apple-darwin_cuda75 is working well in El Capitan with the CUDA 8.0 driver.


Can double check the legals and look at hosting the large files on the weekend (compressed or otherwise). Wrestling with work, though will likely want to be comparing Win/Linux/Mac performance with the 980 myself in due course.
6) Message boards : Number crunching : You have to love early Xmas and B'day presents (even if you have to pay for them yourself). (Message 1857106)
Posted 4 days ago by Profile jason_gee
Post:
Any idea if the Integer performance of the GP100 trickled through to the 1080/1080ti GP102 variants (despite no HBM)? Just updated my Matlab license for other stuff I do, and this time it came with fixed point designer...
7) Message boards : Number crunching : Panic Mode On (105) Server Problems? (Message 1857103)
Posted 4 days ago by Profile jason_gee
Post:
The one Linux host I currently have running (Actually a Mac Pro running Ubuntu with 1050ti) was stone cold, with no tasks when I awoke an hour ago. Looking just now it's picked up a mix of Arecibo and GBT work, CPU and GPU, without intervention. Looks like it was dry for a pretty long time though (rifling through the logs).
8) Message boards : Number crunching : GPU FLOPS: Theory vs Reality (Message 1856959)
Posted 5 days ago by Profile jason_gee
Post:
lel, my wallet won't allow that quite yet, but certainly will be something on the agenda down the road. I'm waiting to see if AMD / motherboard partners manage to resolve the IOMMU/virtualisation issues first. If they do, then some plans will need rejigging, and gumtree might start getting some of my old hardware.
9) Message boards : Number crunching : GPU FLOPS: Theory vs Reality (Message 1856954)
Posted 5 days ago by Profile jason_gee
Post:
Hmm, I suppose if there are ever enough 1080ti's then they might edge out top spot. Well out of my pricerange though. Could probably retire most of my machines and just beef up the Mac Pro for virtualisation.... hmmm
10) Message boards : Number crunching : From FX to Ryzen (Message 1856870)
Posted 6 days ago by Profile jason_gee
Post:
Back in SETI V7 days, we had a choice to use the SSSE3 or SSE4.1 option for AMD CPU in the Lunatics installer. The SSE4.1 ran much faster on my FX's compared to SSE3 or SSSE3. I would assume that is likely true of Ryzen too. I see that all R7 CPU models show up in the Statistics page now. No way to know whether they are running on Windows or Linux though. Yes, it would be interesting to know how they run under Linux.


Way back when I made the Windows build under v7, the difference on 45nm Core2 having SSE4.1 was small (a couple of percent) over SSSE3, though consistent and reproducible on the Core2 Duos of the time. I have not yet managed to afford an AVX capable system myself for development. Since the adoption of AVX, and the Intel compiler becoming unusable, the situation changed a fair bit in that time (close to 10 years overall since I started), and Joe Segur hand coded AVX into stock. He's disappeared since then, and no-one has been able to precisely tell me what's happened, so we're probably looking at finding or making new CPU app resources from scratch, more or less.
11) Message boards : Number crunching : From FX to Ryzen (Message 1856724)
Posted 6 days ago by Profile jason_gee
Post:
Some of that is tweaks to parameters and systems, and some of that is system bloat that may or may not apply to you. In general though, RAC is a pretty sloppy measure of performance.
12) Message boards : Number crunching : Varying GPU loads on an entry-level video card (Message 1856385)
Posted 8 days ago by Profile jason_gee
Post:
Being a lower end GPU, the display itself might be a factor (if being used), though most likely IMO the substantial driver bloat could have some impact, in particular the hidden network streaming and telemetry components that have been introduced relatively recently. These can be disabled via services, and the program 'autoruns' (Google nvidia autoruns telemetry for a guide). The system looks to be capable otherwise, so not sure I can think of other factors, though I'm sure others will chip in with more parameters if they can help with loading.
13) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856303)
Posted 8 days ago by Profile jason_gee
Post:
Greetings Jason

I seem to be having trouble with the BLC workunits, I have bumped the fan up.

I am also going to reboot and see if that helps.

The invalids/errors were on a previous build.

Regards


Great. Fingers crossed nothing serious. The 780's are troopers.
14) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856300)
Posted 8 days ago by Profile jason_gee
Post:
The second host there looks like it broke entirely. It's unclear why yours and the first host mismatch, though your 30 pulses might mean you need a reboot or something.

[Edit:] inconclusives/Invalids seem very high, even for the experimental apps. I'd look at doing the coolbits thing to get the fan up, and keep temps in check.
15) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856294)
Posted 8 days ago by Profile jason_gee
Post:
Maybe a better alternative description: The pulses aren't at one single 'spot', and the pulsefinding algorithm is looking for the nicest 'blob'. If you did it visually it might be less like looking through a static image, and more like an 'Enhance...Pan', like Harrison Ford in the Blade Runner intro.

[Edit:] https://youtu.be/qHepKd38pr0
16) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856292)
Posted 8 days ago by Profile jason_gee
Post:
Hi,
here is an error in the software: https://setiathome.berkeley.edu/result.php?resultid=5594593817. It happens, just not so often..

Pulse finding detects multiple 'pulses' at certain time. Something gets overwritten I guess. Hard to debug since I do not have that wu on my computer.


Superficially, it looks like some of those pulses would [normally] coalesce in the serial CPU case, possibly across where you unroll periods. If so, that's likely the missing reduction I mentioned. If the issue disappears without unroll (unroll 1), then it's just a reproduction of the race condition I've been describing. The trick will probably be to drag pulse reporting out of the kernels, and either reduce from multiple pulse tables, or alternatively put atomics on the pulse results and only update if the score is higher. Not sure which is easier, but pulling it all out will probably be better for the next generation, which will be even bigger.
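A rough sketch of the "reduce from multiple pulse tables" option, in Python rather than CUDA so it can be run anywhere. The `Pulse` record, the chunking by `unroll` and the tie-break rule are assumptions made for illustration, not the application's real data structures. The idea is that each unrolled chunk writes only to its own table entry, and a single reduction afterwards picks the winner with the same rule a serial scan would apply.

```python
# Sketch: deterministic reduction over per-chunk pulse tables.
# Names and data layout are hypothetical, not the application's real structures.
from typing import List, NamedTuple, Optional, Sequence

class Pulse(NamedTuple):
    score: float
    index: int      # position in the serial scan order

def better(a: Optional[Pulse], b: Pulse) -> Pulse:
    """Keep the higher score; on a tie keep the earlier index, as a
    serial 'update only if strictly higher' scan would."""
    if a is None or b.score > a.score or (b.score == a.score and b.index < a.index):
        return b
    return a

def serial_best(scores: Sequence[float]) -> Optional[Pulse]:
    """Reference: one pass in order, updating only on a strictly higher score."""
    best = None
    for i, s in enumerate(scores):
        if best is None or s > best.score:
            best = Pulse(s, i)
    return best

def chunk_best(scores: Sequence[float], start: int, stop: int) -> Optional[Pulse]:
    """Best pulse within one unrolled chunk [start, stop)."""
    best = None
    for i in range(start, stop):
        best = better(best, Pulse(scores[i], i))
    return best

def reduced_best(scores: Sequence[float], unroll: int,
                 arrival_order: Optional[List[int]] = None) -> Optional[Pulse]:
    """Each chunk fills its own table entry; one reduction picks the winner.
    arrival_order simulates chunks finishing in an arbitrary order."""
    n = len(scores)
    if n == 0:
        return None
    size = -(-n // unroll)  # ceiling division
    partials = [chunk_best(scores, s, min(s + size, n)) for s in range(0, n, size)]
    if arrival_order is not None:
        partials = [partials[k] for k in arrival_order]
    best = None
    for p in partials:
        if p is not None:
            best = better(best, p)
    return best
```

Because `better` has no dependence on arrival order, the result matches `serial_best` whichever chunk finishes first. The atomics alternative would need the same comparison applied inside a compare-and-swap style update, which is not shown here.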
17) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856285)
Posted 8 days ago by Profile jason_gee
Post:

To me an unmatched overflow is that my version reports 7 autocorr and 23 spikes and the wingman reports 25 spikes and 5 autocorr or vice versa, both 30 signals, i.e. a bad packet..


Good, keep that to work with and please email also. No reason not to match CPU reference other than dirty parallelism with no reductions. [or faulty reductions...]
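A toy model of how counts like 7 autocorr + 23 spikes versus 25 spikes + 5 autocorr could come from the same noisy packet. It assumes reporting simply stops at the 30-signal cap in whatever order signals are found. The packet contents, the two discovery orders and the `serial_key` are all invented for illustration.

```python
# Sketch: why a capped report depends on discovery order.
# Signal kinds, packet contents and ordering key are illustrative assumptions.
from collections import Counter

CAP = 30  # per the thread, an overflow is declared at 30 signals

def report(signals_in_discovery_order, cap=CAP):
    """Report signals as found and stop at the cap, as an overflow would."""
    return signals_in_discovery_order[:cap]

def report_in_serial_order(signals, serial_key, cap=CAP):
    """Re-order by serial scan position before applying the cap."""
    return sorted(signals, key=serial_key)[:cap]

def counts(reported):
    """Tally reported signals by kind."""
    return Counter(kind for kind, _ in reported)

def make_packet(n=60):
    """Hypothetical noisy packet: (kind, serial position) pairs in serial order."""
    return [("autocorr" if pos % 6 == 5 else "spike", pos) for pos in range(n)]

packet = make_packet()
spikes = [s for s in packet if s[0] == "spike"]
autocorrs = [s for s in packet if s[0] == "autocorr"]

# Hypothetical parallel run: the autocorrelation queue gets ahead of the spikes.
parallel_order = autocorrs[:7] + spikes + autocorrs[7:]

if __name__ == "__main__":
    print("serial:  ", dict(counts(report(packet))))
    print("parallel:", dict(counts(report(parallel_order))))
    fixed = report_in_serial_order(parallel_order, serial_key=lambda s: s[1])
    print("parallel, re-ordered:", dict(counts(fixed)))
```

In this model both runs report 30 signals, but a different 30, and ordering the candidates by serial position before applying the cap brings the parallel result back to the serial one.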
18) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856279)
Posted 8 days ago by Profile jason_gee
Post:
There is a Crunchers Anonymous link in the first post of this thread, the App can be found there.

Seems the recent x41p_zi3t1b build is producing unmatched Overflows on both platforms with the Arecibo tasks. So far I haven't seen any problems with the x41p_zi3k+ build, so, I guess we'll be staying with the x41p_zi3k+ build for now.


An overflow is an overflow (noisy packet). It may contain 60 spikes, autocorrelations and pulses. They are searched in different work queues on GPU. Results are checked and reported. A different implementation (parallel) can find them in any order. No order can be said best.

The rate of inconclusive results can be found here: http://setiathome.berkeley.edu/results.php?hostid=7475713&offset=0&show_names=0&state=4&appid=29

Petri

What I meant by Unmatched Overflow is that the zi3t1b App overflowed but the WingPeople didn't. Sorta like the problems with the last few versions.
http://setiathome.berkeley.edu/results.php?hostid=6796479&state=5
https://setiathome.berkeley.edu/results.php?hostid=7769537&state=5
http://setiathome.berkeley.edu/workunit.php?wuid=2471201891
Three different machines, same problem. The difference is the previous versions Overflowed on the VLARs whereas these are Arecibos.


Both Petri and I know exactly what it is, and I believe we have an understanding of how it's going to go.
19) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856276)
Posted 8 days ago by Profile jason_gee
Post:

You are wrong. Make your code match serial CPU on overflows or I will abandon it entirely. Your Choice.


:)


Good to see we're on the same page ;)
20) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1856274)
Posted 8 days ago by Profile jason_gee
Post:
As always, start with reproducing under bench with precisely the same parameters against the stock/win32 CPU reference, then again with unroll set to 1. Assuming you reproduce the mismatch with higher unroll, and it then matches without unroll, you just confirm what we already know. The application is fast but broken, as it does not match the serial variants. Anyone that claims it is OK to choose whatever signals they like from the full dataset has no idea what they are talking about, and you should point them out so I can ridicule them with computer science.


a) If a packet is full of crap and the decision is made solely based on number of signals found, say 30, then it is completely irrelevant what sort of crap the packet contains.
b) If a packet has an acceptable amount of signals they must be reported correctly.
c) If a packet does not have reportable signals, finding best non reportable wastes cycles on cosmetics.

Petri


You are wrong. Make your code match serial CPU on overflows or I will abandon it entirely. Your Choice.


©2017 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.