SAH On Linux

Message boards : Number crunching : SAH On Linux
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1236889 - Posted: 26 May 2012, 2:38:00 UTC
Last modified: 26 May 2012, 2:44:50 UTC

As part of the ongoing experiments with "The Monster" I've installed 64bit Mageia Linux as the OS.

I installed V275 Nvidia drivers from the Mageia repo (the latest available) and the Lunatics x41g Linux app.

The installation went reasonably smoothly and it is now up and crunching.

On first boot I found the crunching times nearly doubled. Investigation showed that the Linux app has roughly double the CPU usage of the Windows version and the poor E7200 just did not have enough grunt to run the system.

I swapped it out for a Q6600 quad. Getting a bit more horsepower under the bonnet helped, but crunching times are still around 5 minutes longer for a "standard" unit than on the same hardware under Windows, and the crunching time for "shorties" has nearly doubled.

I am aware this is a very heavily loaded system. Experiments with other installations have shown that the speed of the x41 Linux app is comparable with the Windows version on a "normal" machine despite the higher CPU usage (4% to 8% per task compared with 1% to 2% for the Win version). Average CPU usage has gone from around 15% to nearly 40%. Memory usage is around 900MB under both OSes.

I would guess that the higher CPU usage indicates that the CPU is also talking to the GPUs more often, placing a higher load on the PCIe bus and thus compounding the problem.

My question is: what can I do to speed things up, either by reducing the CPU time or by finding some way to reduce PCIe bus loading?

This is an experimental machine, so I'm prepared to experiment with various fixes.

For those who are interested, the Linux box is 6654123 and the Windows version is 6629977. Both machines are exactly the same hardware; the only difference is the OS.

T.A.
ID: 1236889 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1237267 - Posted: 26 May 2012, 9:20:43 UTC - in response to Message 1236889.  

Yet the turn-around time for the Linux system shows as being much faster than when you ran Windows on there...

The real decider is what RAC you achieve, plus or minus the variable RAC error and fuzziness :-(


Your GTS250 only has 512MBytes of GPU RAM... You can give a bit more workspace by reducing the default virtual desktops down to 1 or 2 rather than the often default 4.

Another trick to give more GPU processing to compute operations is to tick the vsync box in the nVidia control utility so that screen updates are limited to being no faster than your vertical refresh rate for your monitor.

And the real story for utilisation is shown by what temps you get for your CPU and GPU. The most extreme example I've seen has been with PrimeGrid...


Mageia 2 is just out if you want to try a more recent kernel ;-)

Happy fast computing,
Martin

See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1237267 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1237277 - Posted: 26 May 2012, 10:11:45 UTC - in response to Message 1237267.  

Yet the turn-around time for the Linux system shows as being much faster than when you ran Windows on there...

So he has a smaller cache?

Turnround includes the waiting time between download and starting to run...
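
A rough illustration of that point, using made-up numbers rather than anything measured on either host: turnaround is the time a task waits in the cache plus the time it takes to crunch, so a host with a smaller cache can show a faster turnaround even if each task runs slower. The function name and the figures are purely hypothetical.

```python
def turnaround_hours(queue_wait_hours, run_hours):
    """Turnaround as reported: time waiting in the cache plus time crunching."""
    return queue_wait_hours + run_hours


# Hypothetical hosts: B crunches each task slower but keeps a much smaller cache.
host_a = turnaround_hours(queue_wait_hours=48.0, run_hours=0.25)
host_b = turnaround_hours(queue_wait_hours=6.0, run_hours=0.5)

print(f"host A: {host_a} h, host B: {host_b} h")
```

With these invented figures host B reports the shorter turnaround despite doubling its run time, which is why turnaround alone says little about crunching speed.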
ID: 1237277 · Report as offensive
Profile skildude
Joined: 4 Oct 00
Posts: 9541
Credit: 50,759,529
RAC: 60
Yemen
Message 1237344 - Posted: 26 May 2012, 13:50:54 UTC - in response to Message 1237277.  

Let's also not forget that Linux OSes are really built like a tank. It may be a bit slower but it's unlikely to stop running, ever.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1237344 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1237345 - Posted: 26 May 2012, 14:01:51 UTC

All of the shorties I looked at for Windows had an AR of around 2.7, whereas the shorties that have run on Linux are around 1.3. I'm not sure if about 3 times as long is par for those AR values, but it might help explain things.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1237345 · Report as offensive
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1237359 - Posted: 26 May 2012, 14:40:43 UTC

@Martin
Take no notice of the turnaround times. I ran a hundred units through Windows last night to get a benchmark with the new CPU but didn't report them until 0930 my time (around 00:00 UTC). The Linux machine has been online since then.

The OS is as stripped down as I can make it, with BOINC shut down it only uses 250MB of memory. I'm running the LXDE desktop with only 2 screens. BOINC is the only task using any significant CPU time.

The devil is in the detail: if you compare the crunching times of similar units, you will see that Linux is taking nearly twice as long to crunch a "shorty" and at least 5 minutes longer for units in the 0.4 AR range.

I'm pretty sure this is due to the higher CPU usage of the Linux app and probably bus congestion because of this. I've run Linux crunchers before and the crunching times have been comparable to Windows on less heavily loaded systems.

It's a problem that needs to be looked at, as systems with 6 and 8 GPUs (multiple GTX 590s etc.) are already running. I'd like to know how to make Linux competitive in such circumstances.

T.A.
ID: 1237359 · Report as offensive
Profile ivan
Volunteer tester
Joined: 5 Mar 01
Posts: 783
Credit: 348,560,338
RAC: 223
United Kingdom
Message 1237381 - Posted: 26 May 2012, 15:37:24 UTC - in response to Message 1237344.  

Let's also not forget that Linux OSes are really built like a tank. It may be a bit slower but it's unlikely to stop running, ever.

My Linux machine quit running last month, just after I noticed it had been up for slightly more than 365 days... The power supply exploded. I never found exactly what shorted, but the PSU was absolutely filled with dust. (I built the machine 5 or 6 years ago, with a Hi-Tek PSU that had three fans, so a lot of air had been through it in its lifetime.)
ID: 1237381 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1237382 - Posted: 26 May 2012, 15:38:10 UTC

In your windows tasks I see:

Priority of process raised successfully
Priority of worker thread raised successfully

In your linux tasks I do not.

It could be that the Linux app does not record that information, or that the tasks are running at a lower priority.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1237382 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1237412 - Posted: 26 May 2012, 16:39:43 UTC - in response to Message 1237382.  

In your windows tasks I see:

Priority of process raised successfully
Priority of worker thread raised successfully

In your linux tasks I do not.

It could be that the Linux app does not record that information, or that the tasks are running at a lower priority.

From what I understand of a conversation I had at Einstein some years ago, Linux developers don't have access to any tool or setting equivalent to the 'worker thread priority' - they can only adjust priority at the process level. So much so that developers who cut their teeth on Linux code aren't even aware that Windows has threads which can be individually adjusted.

Also, BOINC should raise the process priority of a CUDA app on launch, so there shouldn't be any need for a message about the app adjusting itself later. So, I wouldn't worry about the non-equivalence of the messages, though it might well be something for T.A. to look at next time he's running each of the two OSs.

NB - you'd need to use something like Process Explorer to get the Windows thread priorities - Task Manager can only display/control the process priority.
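
On the Linux side, the process-level priority discussed above can be read back directly. The following is only a minimal sketch, assuming a Unix system with Python 3 available; the helper name `niceness_of` is made up for illustration, and the PID of the CUDA task would have to be looked up separately (for example with `top` or `ps`).

```python
import os


def niceness_of(pid):
    """Return the process-level nice value (-20 = highest priority, 19 = lowest)."""
    return os.getpriority(os.PRIO_PROCESS, pid)


# Demonstrated on this script's own process; substitute the PID of the
# CUDA task to see what priority it is actually running at.
print(niceness_of(os.getpid()))
```

Comparing that value for the CUDA task against the other BOINC tasks would show whether it really is running at a lower priority than expected.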
ID: 1237412 · Report as offensive
Profile Khangollo
Joined: 1 Aug 00
Posts: 245
Credit: 36,410,524
RAC: 0
Slovenia
Message 1237509 - Posted: 26 May 2012, 19:05:48 UTC - in response to Message 1236889.  

On first boot I found the crunching times nearly doubled. Investigation showed that the Linux app has roughly double the CPU usage of the Windows version and the poor E7200 just did not have enough grunt to run the system

In my experiments, the Linux app has exactly the same CPU usage as when I tested it in wintendo 7. What tool were you using to measure CPU usage?
Please note that "top" shows 0-100+% per process, where 100% is maximum use of a single core. Task Manager in Windows shows 0-50%, where 50% is the maximum of a single core on a dual core (or 0-25% on a quad).
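
To put the two tools on the same scale, the conversion is just a division by the core count. This is a small sketch under the assumptions stated in the post (top reporting per-core percentages, Task Manager reporting percentages of the whole machine); the function names are made up for illustration.

```python
def per_core_to_whole_machine(per_core_percent, n_cores):
    """Convert a top-style figure (100% = one full core) to a
    Task Manager-style figure (100% = every core fully busy)."""
    return per_core_percent / n_cores


def whole_machine_to_per_core(machine_percent, n_cores):
    """Convert a Task Manager-style figure back to a top-style figure."""
    return machine_percent * n_cores


# 8% in top on a quad core is the same load as 2% in Task Manager.
print(per_core_to_whole_machine(8.0, 4))   # 2.0
# 25% in Task Manager on a quad core is one fully loaded core.
print(whole_machine_to_per_core(25.0, 4))  # 100.0
```

So a reading of 4% to 8% per task in top on a quad core would correspond to 1% to 2% per task in Task Manager, if the figures were taken with those two tools.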

I swapped it out for a Q6600 quad. Getting a bit more horsepower under the bonnet helped but crunching times are still around 5 minutes longer for a "standard" unit than the same hardware under Windows, the crunching time for "shorties" has nearly doubled.

As for the lower performance, the default process priority (niceness) at which the optimized app runs is the culprit.
Are you running that script to lower the niceness of CUDA processes to 0? It is absolutely mandatory for now; otherwise CUDA takes a huge performance hit.
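
The script itself is not shown in this thread, so the following is only a hedged sketch of what such a helper might look like, not the actual script. It assumes a Linux system with /proc, that the CUDA app can be recognised by the string "cuda" in its command line (an assumption about the executable name), and that it runs as root, since lowering a niceness value normally requires elevated privileges. The function names are hypothetical.

```python
import os
import re


def matching_pids(pattern, proc_root="/proc"):
    """Return sorted PIDs whose command line (from <proc_root>/<pid>/cmdline)
    matches the given regular expression."""
    regex = re.compile(pattern)
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited or is not readable
        # Arguments are NUL-separated; join them with spaces for matching.
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace")
        if regex.search(cmdline):
            pids.append(int(entry))
    return sorted(pids)


def renice_pids(pids, niceness=0):
    """Set the niceness of each PID; return the PIDs that were changed."""
    changed = []
    for pid in pids:
        try:
            os.setpriority(os.PRIO_PROCESS, pid, niceness)
            changed.append(pid)
        except (PermissionError, ProcessLookupError):
            pass  # insufficient privileges, or the process has gone away
    return changed
```

Something like `renice_pids(matching_pids(r"cuda"))` would need to be repeated periodically (from cron or a loop, say), because each new task starts as a fresh process at the default niceness.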

Anyway, I'm running all my CUDA crunching in Linux, and RAC is perfectly nominal for that hardware. I would never switch to wintendo. No silly problems with drivers, no problems with monitor sleep, no problems with dummy plugs and desktop settings, no downclocking... etc... Both my CUDA machines run headless, with only power and network cable attached, no X server running.
ID: 1237509 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1237579 - Posted: 26 May 2012, 20:40:40 UTC - in response to Message 1237345.  

All of the shorties I looked at for Windows had an AR of around 2.7, whereas the shorties that have run on Linux are around 1.3. I'm not sure if about 3 times as long is par for those AR values, but it might help explain things.

The AR 1.3 tasks might take 10% longer than the AR 2.7 tasks, all else being exactly the same. It's a factor, but small enough that it's usually hidden by other causes for variation.
                                                                   Joe
ID: 1237579 · Report as offensive
doug
Volunteer tester

Joined: 10 Jul 09
Posts: 202
Credit: 10,828,067
RAC: 0
United States
Message 1238526 - Posted: 28 May 2012, 17:21:01 UTC

You know I gotta think that somewhere out there someone hasn't built the ultimate SETI@HOME OS which is essentially no OS. Start by building a Linux kernel with practically nothing in it. Eliminate the scheduler and make it real time. All of this stuff is pretty much off the shelf. Now just move all the Lunatics apps down into the kernel. Not for the general public, but for the hard core I think it would be a fun project. Probably end up maintaining it for the rest of your life though.

Doug
ID: 1238526 · Report as offensive
doug
Volunteer tester

Joined: 10 Jul 09
Posts: 202
Credit: 10,828,067
RAC: 0
United States
Message 1238529 - Posted: 28 May 2012, 17:26:16 UTC - in response to Message 1238526.  

Kind of makes me wonder how fast things would run if you booted Windows in safe mode. I would think you might get some improvement.
ID: 1238529 · Report as offensive
Profile Area 51
Joined: 31 Jan 04
Posts: 965
Credit: 42,193,520
RAC: 0
United Kingdom
Message 1238531 - Posted: 28 May 2012, 17:31:56 UTC - in response to Message 1238526.  
Last modified: 28 May 2012, 17:33:27 UTC

You know I gotta think that somewhere out there someone hasn't built the ultimate SETI@HOME OS which is essentially no OS. Start by building a Linux kernel with practically nothing in it. Eliminate the scheduler and make it real time. All of this stuff is pretty much off the shelf. Now just move all the Lunatics apps down into the kernel. Not for the general public, but for the hard core I think it would be a fun project. Probably end up maintaining it for the rest of your life though.

Doug



There is a RAM based linux distro here:

http://www.dotsch.de/boinc/Dotsch_UX.html

specifically for BOINC which can be configured to run from secondary storage.
ID: 1238531 · Report as offensive
doug
Volunteer tester

Joined: 10 Jul 09
Posts: 202
Credit: 10,828,067
RAC: 0
United States
Message 1238542 - Posted: 28 May 2012, 17:46:22 UTC - in response to Message 1238531.  

You know I gotta think that somewhere out there someone hasn't built the ultimate SETI@HOME OS which is essentially no OS. Start by building a Linux kernel with practically nothing in it. Eliminate the scheduler and make it real time. All of this stuff is pretty much off the shelf. Now just move all the Lunatics apps down into the kernel. Not for the general public, but for the hard core I think it would be a fun project. Probably end up maintaining it for the rest of your life though.

Doug



There is a RAM based linux distro here:

http://www.dotsch.de/boinc/Dotsch_UX.html

specifically for BOINC which can be configured to run from secondary storage.

It's a good start, but I'm talking all the way. Something so hard core that nothing will interrupt it from its I/O and computing activities. Something so perverse that its very existence is for SETI. I've got some spare disks. I might just do this at least up to the point of moving the Lunatics source into the kernel.
ID: 1238542 · Report as offensive
Profile Area 51
Joined: 31 Jan 04
Posts: 965
Credit: 42,193,520
RAC: 0
United Kingdom
Message 1238548 - Posted: 28 May 2012, 17:50:50 UTC - in response to Message 1238542.  

You know I gotta think that somewhere out there someone hasn't built the ultimate SETI@HOME OS which is essentially no OS. Start by building a Linux kernel with practically nothing in it. Eliminate the scheduler and make it real time. All of this stuff is pretty much off the shelf. Now just move all the Lunatics apps down into the kernel. Not for the general public, but for the hard core I think it would be a fun project. Probably end up maintaining it for the rest of your life though.

Doug



There is a RAM based linux distro here:

http://www.dotsch.de/boinc/Dotsch_UX.html

specifically for BOINC which can be configured to run from secondary storage.

It's a good start, but I'm talking all the way. Something so hard core that nothing will interrupt it from its I/O and computing activities. Something so perverse that its very existence is for SETI. I've got some spare disks. I might just do this at least up to the point of moving the Lunatics source into the kernel.



....so, even more bare bones than that. Presumably more bones than anything then!!!!

I'm not sure how much you'd gain by binning the entire GUI and literally only compiling in the bits absolutely necessary - unless it was being run on a CPU with shed loads of cores so that all those little gains really multiplied out, but yes, it would be an interesting little aside!

ID: 1238548 · Report as offensive
Profile Ex: "Socialist"
Volunteer tester
Joined: 12 Mar 12
Posts: 3433
Credit: 2,616,158
RAC: 2
United States
Message 1238670 - Posted: 29 May 2012, 0:06:47 UTC

Yeah, modern processors run an OS with very little overhead. My server's Linux install uses less than 0.5% of my proc when idle. It basically doesn't even register its use, it just shows as 100% idle. LOL

More interesting, I'd still like to know why BOINC crunches SETI@home slower in Linux than Windows...

(And before someone comes in and says it would be faster if it was compiled differently, tell us HOW. Also, there are some people here who have Linux machines performing as well as Windows; again, I'd like to know HOW.)
#resist
ID: 1238670 · Report as offensive
Profile ML1
Volunteer moderator
Volunteer tester

Joined: 25 Nov 01
Posts: 20265
Credit: 7,508,002
RAC: 20
United Kingdom
Message 1238674 - Posted: 29 May 2012, 0:17:16 UTC - in response to Message 1238548.  
Last modified: 29 May 2012, 0:19:39 UTC

You know I gotta think that somewhere out there someone hasn't built the ultimate SETI@HOME OS which is essentially no OS...


There is a RAM based linux distro here:

http://www.dotsch.de/boinc/Dotsch_UX.html

specifically for BOINC which can be configured to run from secondary storage.

It's a good start, but I'm talking all the way. Something so hard core that nothing will interrupt it from its I/O and computing activities. Something so perverse that its very existence is for SETI. I've got some spare disks. I might just do this at least up to the point of moving the Lunatics source into the kernel.


....so, even more bare bones than that. Presumably more bones than anything then!!!!

I'm not sure how much you'd gain by binning the entire GUI and literally only compiling in the bits absolutely necessary - unless it was being run on a CPU with shed loads of cores so that all those little gains really multiplied out, but yes, it would be an interesting little aside!

It would be a very curious aside...

The OS has very little overhead and offers a huge benefit in tried and tested services that make running user-side applications very easy and efficient.

For a super-stripped-down system, then you could try looking at how configurable Gentoo is. You can compile and run that with no graphical display at all if you wished. You could see how far that could be stripped down...


However, for the time and effort of in effect replicating OS functions to add to the Boinc + s@h code, I strongly suspect that effort would produce far greater gains if you were to instead help with improving GPU optimisation. You would get far more compute power to exercise.

Also note: You can consider a GPU to be a computer without an OS as we usually assume one to be... New super-compute as never before seen for s@h?


Happy fast crunchin',
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)
ID: 1238674 · Report as offensive
doug
Volunteer tester

Joined: 10 Jul 09
Posts: 202
Credit: 10,828,067
RAC: 0
United States
Message 1238684 - Posted: 29 May 2012, 0:53:17 UTC - in response to Message 1238674.  

You have a point. Not to mention that all you have to do is wait a short bit of time for your compute potential to double anymore. I guess it's just the point. I hate inefficiency and it bothers me that I now need a GB of RAM to run an OS. Where's it all going? Just chalk it all up to software friction?
ID: 1238684 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1238701 - Posted: 29 May 2012, 3:50:51 UTC - in response to Message 1238529.  

Kind of makes me wonder how fast things would run if you booted Windows in safe mode. I would think you might get some improvement.

While waiting to set up several of my servers, I ran BOINC from WinPE until I had the time to set them up, which was a few months. I made a custom PE build for the PowerEdge boxes I have, so it loaded up the embedded ATI GPU and SCSI controllers. While none of them have the GPU hardware to do anything, I imagine it could be configured to work. Windows Server 2008 & R2 have an install option to run w/o a GUI in basically the same manner.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1238701 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.