Posts by ivan


21) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1564988)
Posted 30 Aug 2014 by Profile ivan
Similar spec to my new server -- I got one half of it running today. Here's a view from the back showing the two redundant power supplies in the middle and one of the server slides. Hasn't got any GPUs (yet... :-) but there's 128 GB of RAM in each server.

Second half is now up and running. I turned hyperthreading off to see what the difference might amount to.

I will be interested to see how it compares to my 12c/24t machine. I limit it to 20 tasks at a time.

Actually, I am running a 12-core machine without Hyperthreading (it's the server for my Xeon Phi, which is unfortunately unsuited to s@h). Its RAC is about 30% less than yours, which is probably partly explained by the 17% lower clock but also by Linux/Windows differences. With two identical machines, down to the smallest detail, running one with and one without HT perhaps I can convince myself which way is best. At the moment, the second machine is one day younger and a bit behind -- but it managed to grab a large bunch of AP jobs and at one point was crunching 18 of them at once; at 15 hours or so each and 14 still pending, that left a bit of a hole in its earnings so far.
22) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1563772)
Posted 28 Aug 2014 by Profile ivan
Similar spec to my new server -- I got one half of it running today. Here's a view from the back showing the two redundant power supplies in the middle and one of the server slides. Hasn't got any GPUs (yet... :-) but there's 128 GB of RAM in each server.

Second half is now up and running. I turned hyperthreading off to see what the difference might amount to.
23) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1563267)
Posted 27 Aug 2014 by Profile ivan
I guess this is the spot to brag! I brought this beast online a few weeks ago after years of laptop-only crunching. Included are two Xeon E5-2690 v2s (20 cores, 40 threads), a 1250 W Seasonic PSU, two 8 GB Crucial ECC DIMMs, two EVGA GTX 780 Ti Superclocked GPUs in SLI, two Samsung 850 Pro SSDs in RAID 1, and a Blu-ray burner, all running on an ASUS Z9PE-D8 WS motherboard.

Similar spec to my new server -- I got one half of it running today. Here's a view from the back showing the two redundant power supplies in the middle and one of the server slides. Hasn't got any GPUs (yet... :-) but there's 128 GB of RAM in each server.
24) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1562226)
Posted 25 Aug 2014 by Profile ivan
Commiserations.
I just went in on a holiday Monday to uncover my new beast. Pictures tomorrow I hope, it's currently lying naked on the floor after I ripped off its covering of cardboard and polystyrene... I'm concurrently retiring a few machines whose GFlop/Watt rating is falling way below what newer kit provides. 40 Xeon cores should boost my RAC a bit! :-)
25) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1559119)
Posted 19 Aug 2014 by Profile ivan
Now the SSD I was using to offload the stress on the limited onboard storage has stopped working, so that project is out for the week (I have to go to a meeting Up North).

Turns out not to be the SSD, I just lost the mount points for it. I can manually mount the partitions -- I'll sort out a more permanent arrangement next week. Then I couldn't do name-server lookups so time was way out and no connections to s@h -- fixed that by manually configuring /etc/resolv.conf, ubuntu is doing some weird things in dnsmasq.
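For anyone following along, the fix amounts to something like the following; the device name, mount point and name-server address here are placeholders for illustration, not the actual ones on that box.

# Re-mount the SSD partition by hand (device /dev/sda1 and mount point /mnt/ssd are examples only).
sudo mkdir -p /mnt/ssd
sudo mount /dev/sda1 /mnt/ssd

# More permanent arrangement: an fstab entry keyed on the partition UUID,
# so the mount survives reboots and device renumbering.
sudo blkid /dev/sda1    # note the UUID it reports
echo 'UUID=<uuid-from-blkid>  /mnt/ssd  ext4  defaults,noatime  0 2' | sudo tee -a /etc/fstab

# Work around the dnsmasq oddity by pointing resolv.conf at a known resolver
# directly (8.8.8.8 is just an example).
echo 'nameserver 8.8.8.8' | sudo tee /etc/resolv.conf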
26) Message boards : Number crunching : Panic Mode On (89) Server Problems? (Message 1558730)
Posted 18 Aug 2014 by Profile ivan
What could be wrong in my config; where should I look? What can I do to get GPU WUs? Ubuntu 14.04, two GTX 780s with CUDA 6; the system works well with F@H. That one is fully stopped, so the system is dedicated to SETI right now.

There isn't a "stock" CUDA MB build for Linux. There is an AP build for it; if you've just signed up and are not using "anonymous platform" you will get AP jobs when they are available, but as you can see there aren't any available at the moment.
27) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1558710)
Posted 18 Aug 2014 by Profile ivan
Ivan,

Have you thought about compiling the apps to run on the CPU, to get some performance values for the individual Cortex-A15 CPUs?

**Never mind, I see you already have some CPU-based WUs starting to process on it now**

I haven't had any luck with it. Runs OK standalone, but when I fire it up under BOINC it errors out.
Now the SSD I was using to offload the stress on the limited onboard storage has stopped working, so that project is out for the week (I have to go to a meeting Up North).
28) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1555911)
Posted 12 Aug 2014 by Profile ivan
You just need some external PCIe enclosures for those servers.

That might be just a bit hard to sell to TPTB! Maybe if I win Euromillions.
29) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1555892)
Posted 12 Aug 2014 by Profile ivan
Just took delivery of these boxes. Each box contains a 2U rack-mount chassis. Each chassis contains four independent servers. Each server contains two 10-core Xeons (with hyperthreading) and 128 GB of RAM. So, 4x4x2x10x2=640 threads...
Unfortunately I can't run s@h on them, they're for our particle-physics GRID-computing data centre.
But watch this space.
There's got to be a way to hang a bunch of GPUs off of them. Send one my way and I'll figure it out. I'll pay for shipping and the electric bill somehow.

Not easily. Cramming four sleds into a 2U rack-unit means there's no space for a double-width PCI-e slot. But fear not, I said to watch this space and today I got word that on Friday I'll take delivery of essentially the same chassis, but with two of these instead, with the same basic specs, and they are definitely designed to take GPUs or coprocessors. And it will be totally under my control; I typically run "stress-testing" software on such systems with the full knowledge of my manager. I'll be retiring two older systems as I bring this online, but they have a total RAC of only 1,500 between them (my Jetson is already up to 630 after only seven days). I'm also eyeing a disk server I've just inherited (out of warranty) and installed as a (20 TB) data-store and archive unit, which has eight cores currently mostly idle...
30) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1555563)
Posted 12 Aug 2014 by Profile ivan
Just took delivery of these boxes. Each box contains a 2U rack-mount chassis. Each chassis contains four independent servers. Each server contains two 10-core Xeons (with hyperthreading) and 128 GB of RAM. So, 4x4x2x10x2=640 threads...
Unfortunately I can't run s@h on them, they're for our particle-physics GRID-computing data centre.
But watch this space.
31) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1555069)
Posted 11 Aug 2014 by Profile ivan
Anyway, to illustrate what I've been posting in the low power thread, here are a couple of shots of my Jetson development kit, before and after adding an external SSD.

How many of those pins are GPIO, and how many are i2c and SPI? I haven't found the docs on that yet.

You should be able to get to the Jetson TK1 Support page. The PM375 Module Specification document gives the names of all the pins; I haven't found a more-verbose description yet but the schematics are there too, if you can read them. Note there are several other interfaces brought out here too, such as LVDS for an LCD panel, touch input, etc.

The price is certainly right.

That sounds like a seriously cool board. The Raspberry Pi has its uses, but this sounds like it's useful for more than just home automation stuff done on the cheap (a Raspberry Pi B+ for $35 is really hard to beat).
32) Message boards : Number crunching : a lot of pending WUs...... (Message 1554215)
Posted 9 Aug 2014 by Profile ivan
Wow, 1,200! The ghost of Wiggo will be around long after he kicks the bucket, and there I thought I had heaps.

4,906 pending here, buddy.

I've only got 1,445. :-{(>
33) Message boards : Number crunching : Show and tell your machine. Here's mine. (Message 1553587)
Posted 8 Aug 2014 by Profile ivan
LOL...no, no nixies in the computer.
Although the color in the pic does evoke their glow. Those are just some LED illuminated push button switches on the edge of the mobo.

I do own one of these, though.....

I remember even earlier dekatrons, where the glowing cathodes were arranged in a circle, 0-9, and guide cathodes in between could "increment" the glowing cathode quite simply -- a counting circuit in a not-vacuum tube.
Anyway, to illustrate what I've been posting in the low power thread, here are a couple of shots of my Jetson development kit, before and after adding an external SSD.
(Sorry, embedding the images didn't work for some reason, so you'll have to click on the lynx.)
34) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1553356)
Posted 7 Aug 2014 by Profile ivan
The first WU has just finished; Run time 50 min 57 sec, CPU time 21 min 32 sec. Not validated yet. Run time is just about twice what I'm currently achieving with the 750 Ti, but that's running two at once.

Interestingly, I decided about midnight last night to test whether there's a memory bandwidth problem within the Jetson by changing to run two WUs at a time. The interesting part is that for the next few WUs, from the same tape and for about the same credit, the run-time per WU increased by 50% (not 100%) but the reported CPU time fell to 50%. I take the sublinear increase in real time to mean a bottleneck that's alleviated by crunching another thread while memory transfers(?) stall a thread (remember this GPU doesn't have separate RAM, it uses part of system memory AFAICT). The decrease in CPU time perhaps implies some busy-waiting -- the CPU time for two instances is the same as for a single one.
The run-times are starting to fluctuate as more varied WUs arrive. I think I'll let it stabilise for a while and then explore three simultaneous WUs, but cynicism and experience suggest that there's little more to be gained there.
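For anyone wondering how the "two at a time" is arranged: with a self-compiled app under anonymous platform the GPU share per task is typically set via the <count> element inside the <app_version> block of app_info.xml. A minimal sketch of that fragment (the surrounding app name and file references are omitted here):

<coproc>
    <type>CUDA</type>
    <count>0.5</count>    <!-- each task claims half the GPU, so two run side by side -->
</coproc>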
35) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1553101)
Posted 7 Aug 2014 by Profile ivan
This assumes you've installed the CUDA SDK and added the appropriate locations to your PATH and LD_LIBRARY_PATH environment variables, but that's well-covered in the Nvidia documentation.

As I've just been reminded, it's a good idea also to set up the system-wide pointer to the libs with ldconfig, in case you forget to set up LD_LIBRARY_PATH or start up boinc from an account/shell that doesn't set it automatically -- in these cases the libraries aren't found and the jobs all die...
ubuntu@tegra-ubuntu:~/BOINC$ ldd projects/setiathome.berkeley.edu/setiathome_x41zc_armv7l-unknown-linux-gnu_cuda60
        libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0xb66cd000)
        libcudart.so.6.0 => not found
        libcufft.so.6.0 => not found
        libstdc++.so.6 => /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 (0xb6621000)
        libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0xb65b5000)
        libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0xb6594000)
        libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0xb64ac000)
        /lib/ld-linux-armhf.so.3 (0xb6704000)
ubuntu@tegra-ubuntu:~/BOINC$ sudo nano /etc/ld.so.conf.d/cuda.conf
... [edit file here]
ubuntu@tegra-ubuntu:~/BOINC$ cat /etc/ld.so.conf.d/cuda.conf
# cuda default configuration
/usr/local/cuda/lib
ubuntu@tegra-ubuntu:~/BOINC$ sudo ldconfig
ubuntu@tegra-ubuntu:~/BOINC$ ldd projects/setiathome.berkeley.edu/setiathome_x41zc_armv7l-unknown-linux-gnu_cuda60
        libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0xb6700000)
        libcudart.so.6.0 => /usr/local/cuda/lib/libcudart.so.6.0 (0xb66b6000)
        libcufft.so.6.0 => /usr/local/cuda/lib/libcufft.so.6.0 (0xb4b7f000)
        libstdc++.so.6 => /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 (0xb4ad4000)
        libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0xb4a68000)
        libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0xb4a47000)
        libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0xb495f000)
        /lib/ld-linux-armhf.so.3 (0xb6738000)
        libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0xb4954000)
        librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0xb4946000)
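For completeness, the environment-variable route mentioned above is just a couple of exports; the paths below assume the default CUDA 6.0 install location, matching the /usr/local/cuda/lib entry in the transcript.

# Make the CUDA toolchain and its libraries visible to this shell; putting
# these in ~/.bashrc (for whichever account starts boinc) makes them stick.
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib:$LD_LIBRARY_PATH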
36) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1553089)
Posted 6 Aug 2014 by Profile ivan
I also like what you talked about with the Baytrail-D system. If you don't mind me asking which ones do you have?

Mine is just something eBuyer had on special a couple of months ago for £130: Acer Aspire XC-603 Desktop PC. It took a bit of effort to find something that would boot on it -- I ended up with CentOS 7. I could have tried to put corporate Windows 7 on it, but several comments said that it was difficult to get the drivers right. I wouldn't mind putting the wattmeter on it too. As I mentioned earlier, I've not found out if it's possible to get the iGPU crunching too under Linux -- BOINC says there's no GPU.
37) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1553058)
Posted 6 Aug 2014 by Profile ivan
That is pretty awesome performance for a system that Nvidia says is using 5 watts under real workloads.

The more I look at Nvidia's support page, the less sense this makes. I don't think this applies to pushing the CUDA cores to their limit. It would be interesting to get a power meter on it to see what its usage is.

The Jetson docs I was reading yesterday said that total at-the-wall consumption was (IIRC) 10.something W. Then it went through the chain describing the inefficiencies (20% loss in the power brick, etc.). It did stress that because it is a development board the peripherals hadn't been chosen for low power consumption. Next time I power down my home system I'll remove my power meter and apply it to the Jetson instead.
Remember, though, that the Jetson runs its cores at a lower frequency than many PCI-e video cards and the memory bus is narrower, which drops power consumption.
/home/ubuntu/CUDA-SDK/NVIDIA_CUDA-6.0_Samples/bin/armv7l/linux/release/gnueabihf/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GK20A"
  CUDA Driver Version / Runtime Version          6.0 / 6.0
  CUDA Capability Major/Minor version number:    3.2
  Total amount of global memory:                 1746 MBytes (1831051264 bytes)
  ( 1) Multiprocessors, (192) CUDA Cores/MP:     192 CUDA Cores
  GPU Clock rate:                                852 MHz (0.85 GHz)
  Memory Clock rate:                             924 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 131072 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size   (x,y,z):   (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           0 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.0, CUDA Runtime Version = 6.0, NumDevs = 1, Device0 = GK20A
Result = PASS
38) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1552992)
Posted 6 Aug 2014 by Profile ivan

The first WU has just finished; Run time 50 min 57 sec, CPU time 21 min 32 sec. Not validated yet. Run time is just about twice what I'm currently achieving with the 750 Ti, but that's running two at once.

From your stderr out:

setiathome enhanced x41zc, Cuda 6.00


Where did you get this version from?

As mavrrick says, I compiled it myself. However, the hard part isn't s@h; the hard part is compiling BOINC. It has so many prerequisites. The basic instructions are here.
git clone git://boinc.berkeley.edu/boinc-v2.git boinc
cd boinc
git tag    [Note the version corresponding to the latest recommendation.]
git checkout client_release/<required release>; git status
./_autosetup
./configure --disable-server --enable-manager
make -j n    [where n is the number of cores/threads at your disposal]

The problems you will have are, first, finding the libraries and utilities that _autosetup wants, then ensuring that you have g++ installed, and then finding all the libraries and development packages that configure wants (you need the -dev packages for the header definition files). The final hurdle, if you want to use the boincmgr graphical interface, is getting wxWidgets. It tends not to be included in the repositories of modern distributions now, so you have to try to compile it yourself -- which I haven't managed lately, as BOINC wants an old version which was (apparently) badly coded and gives lots of problems with the newest, smartest gcc/g++ compilers. You may need to just learn how to use the boinccmd command-line controller...
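To give a rough idea of the hunt, something like the following covers the usual suspects on an Ubuntu-flavoured system, plus the handful of boinccmd calls that stand in for the manager. The package names are from memory and may differ between releases, so treat this as a sketch rather than a verified recipe.

# Typical build prerequisites (names vary slightly by distribution/release).
sudo apt-get install build-essential git automake autoconf libtool pkg-config \
                     libcurl4-openssl-dev libssl-dev

# Driving the client without boincmgr: the boinccmd basics.
boinccmd --project_attach http://setiathome.berkeley.edu/ <account_key>   # key from your account page
boinccmd --get_state    # projects, apps and work buffer
boinccmd --get_tasks    # per-task status and progress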
The simplest way to then compile s@h was detailed back in January, in this thread.
cd <directory your boinc directory is in>
svn checkout -r1921 https://setisvn.ssl.berkeley.edu/svn/branches/sah_v7_opt/Xbranch
cd Xbranch
[edit client/analyzeFuncs.h and add the line '#include <unistd.h>']
sh ./_autosetup
sh ./configure BOINCDIR=../boinc --enable-sse2 --enable-fast-math
make -j n

This assumes you've installed the CUDA SDK and added the appropriate locations to your PATH and LD_LIBRARY_PATH environment variables, but that's well-covered in the Nvidia documentation. As I alluded to above, you will probably have to edit the configure file too, to make sure obsolete gencode entries are removed and appropriate ones for your kit are included. Oh, and drop the --enable-sse2 if you're compiling for other than Intel/AMD CPUs.
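For reference, the edit in question boils down to replacing the stale -gencode entries with one that matches your device; for the Tegra K1's GK20A (compute capability 3.2, per the deviceQuery output above) it would look something like this, though the exact flag variable in the configure script may be named differently.

# Target only the Jetson TK1's GK20A (compute capability 3.2); drop stale
# entries such as compute_13 that current nvcc builds reject.
NVCCFLAGS="-gencode arch=compute_32,code=sm_32"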
39) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1552955)
Posted 6 Aug 2014 by Profile ivan
Ah, there it is!
Well, I got both my hologram reconstruction and s@h compiled and running on the Jetson today. The holograms run about 10x slower than on my GTX 750 Ti (1.5 frames/sec for a 4Kx4K reconstruction). No real problems with the s@h, just the missing include I reported last January, and I had to edit the config file to remove the old compute capabilities that nvcc didn't like and put in 3.2 for the Tegra.
The first WU has just finished; Run time 50 min 57 sec, CPU time 21 min 32 sec. Not validated yet. Run time is just about twice what I'm currently achieving with the 750 Ti, but that's running two at once.
40) Message boards : Number crunching : Energy Efficiency of Arm Based sysetms over x86 or GPU based systems (Message 1552687)
Posted 5 Aug 2014 by Profile ivan

05-Aug-2014 16:03:28 [---] This computer is not attached to any projects
05-Aug-2014 16:03:28 [---] Visit http://boinc.berkeley.edu for instructions
05-Aug-2014 16:03:29 Initialization completed
05-Aug-2014 16:03:29 [---] Suspending GPU computation - computer is in use
05-Aug-2014 16:04:00 [---] Received signal 2
05-Aug-2014 16:04:01 [---] Exit requested by user

Now to try to attach to S@H since the project is up again!

05-Aug-2014 20:31:55 [---] Suspending GPU computation - computer is in use
05-Aug-2014 20:39:33 [---] Running CPU benchmarks
05-Aug-2014 20:39:33 [---] Suspending computation - CPU benchmarks in progress
05-Aug-2014 20:39:33 [---] Running CPU benchmarks
05-Aug-2014 20:39:33 [---] Running CPU benchmarks
05-Aug-2014 20:39:33 [---] Running CPU benchmarks
05-Aug-2014 20:39:33 [---] Running CPU benchmarks
05-Aug-2014 20:40:05 [---] Benchmark results:
05-Aug-2014 20:40:05 [---] Number of CPUs: 4
05-Aug-2014 20:40:05 [---] 966 floating point MIPS (Whetstone) per CPU
05-Aug-2014 20:40:05 [---] 6829 integer MIPS (Dhrystone) per CPU
05-Aug-2014 20:40:06 [---] Resuming computation
05-Aug-2014 20:40:12 [http://setiathome.berkeley.edu/] Master file download succeeded
05-Aug-2014 20:40:17 [---] Number of usable CPUs has changed from 4 to 1.
05-Aug-2014 20:40:17 [http://setiathome.berkeley.edu/] Sending scheduler request: Project initialization.
05-Aug-2014 20:40:17 [http://setiathome.berkeley.edu/] Requesting new tasks for CPU and NVIDIA
05-Aug-2014 20:40:22 [SETI@home] Scheduler request completed: got 0 new tasks
05-Aug-2014 20:40:22 [SETI@home] This project doesn't support computers of type armv7l-unknown-linux-gnueabihf
05-Aug-2014 20:40:24 [SETI@home] Started download of arecibo_181.png
05-Aug-2014 20:40:24 [SETI@home] Started download of sah_40.png
05-Aug-2014 20:40:27 [SETI@home] Finished download of arecibo_181.png
05-Aug-2014 20:40:27 [SETI@home] Finished download of sah_40.png
05-Aug-2014 20:40:27 [SETI@home] Started download of sah_banner_290.png
05-Aug-2014 20:40:27 [SETI@home] Started download of sah_ss_290.png
05-Aug-2014 20:40:29 [SETI@home] Finished download of sah_banner_290.png
05-Aug-2014 20:40:29 [SETI@home] Finished download of sah_ss_290.png
05-Aug-2014 20:43:41 [---] Resuming GPU computation
05-Aug-2014 20:44:27 [---] Suspending GPU computation - computer is in use
:-)
Ah, there it is!
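(That "doesn't support computers of type armv7l" message is why the self-compiled build has to be declared to the client via anonymous platform. A minimal sketch of the app_info.xml that sits in the project directory -- the app name and version number here are assumptions for illustration; only the executable name is taken from the transcript above.)

<app_info>
    <app>
        <name>setiathome_v7</name>
    </app>
    <file_info>
        <name>setiathome_x41zc_armv7l-unknown-linux-gnu_cuda60</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>setiathome_v7</app_name>
        <version_num>700</version_num>
        <coproc>
            <type>CUDA</type>
            <count>1</count>
        </coproc>
        <file_ref>
            <file_name>setiathome_x41zc_armv7l-unknown-linux-gnu_cuda60</file_name>
            <main_program/>
        </file_ref>
    </app_version>
</app_info>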

