LINUX priority (nice) and boinc...

Jean-David Beyer
Joined: 10 Jun 99
Posts: 60
Credit: 1,301,105
RAC: 1
United States
Message 1883 - Posted: 25 Jun 2004, 13:44:32 UTC

I run Red Hat Enterprise Linux 3 ES on a machine with two Intel Xeon Hyperthreaded processors, the ones with a 1 Megabyte L3 cache. I also run IBM's DB2 UDB V8.1.5 relational DBMS, which sets up a lot of servers for different aspects of its work.

Now boinc itself runs at nice level 0, which is of no interest because it uses so little processor time. The setiathome processes run at nice level 19 and a priority that is usually 39; i.e., they are extremely unlikely to get processor time if any other process wants the processor. The setiathome processes are compute limited and use very little IO once they start running (and not all that much to begin with).
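
For anyone who wants to check the nice level of a process on their own box, here is a minimal Python sketch. It relies on the standard `os.getpriority` call (Unix only, Python 3.3 or later); the helper name `nice_of` is made up for illustration, and this is not how BOINC itself sets or reads priorities. The priority figure that `top` shows (the 39 mentioned above) is reported separately by the kernel and is not computed here.

```python
import os

def nice_of(pid=0):
    """Return the nice value of a process; pid 0 means the calling process."""
    return os.getpriority(os.PRIO_PROCESS, pid)

if __name__ == "__main__":
    # A setiathome worker started by BOINC would be expected to report 19;
    # an ordinary interactive process normally reports 0.
    print("nice value of this process:", nice_of())
```

Passing the PID of a setiathome process instead of 0 should report the value the scheduler actually sees for it.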

DB2, OTOH, tends to be IO limited, with rare exceptions. The servers and client take about 65% of one CPU on a machine that acts as though it has four CPUs, so I have enough compute power. There are 4096 megabytes of RAM on the machine, and a setiathome process takes only 0.3% of it. Similarly, while DB2 uses much more memory, it rarely takes over 10% of it. There is essentially no paging going on.

Yet a DB2 job that takes about 30 minutes when no setiathome processes are running takes about 45 minutes when they are. I am trying to figure out what the bottleneck is. It seems it cannot be compute power, since about 75% of the compute resource is running nice. It should not be IO power, since setiathome does essentially no IO. It cannot be memory "power" since there is so much of it that about 3/4 of it is used as input buffers that can be discarded at any time by the OS if memory is needed for something else.

About the only thing left is memory bandwidth. (It is amusing to note that it seems to me that all machines have the same memory bandwidth according to BOINC, which cannot really be the case. I cannot believe my 166MHz Pentium machine has the same memory bandwidth as my Dual 3.06GHz Xeon machine with E7501 chipset.)

In other words, is setiathome dirtying the L1, L2, and L3 caches so badly that the DB2 software has to fetch everything from memory, whereas when setiathome is not running, most of what DB2 needs is in the caches?
ID: 1883
MAOJC
Joined: 31 Jan 00
Posts: 11
Credit: 991,339
RAC: 0
United States
Message 1895 - Posted: 25 Jun 2004, 14:00:29 UTC - in response to Message 1883.  


>
> In other words, is setiathome dirtying the L1, L2, and L3 caches so badly that
> the DB2 software has to fetch everything from memory, whereas when setiathome
> is not running, most of what DB2 needs is in the caches?
>
>

Nail hit firmly on the head, I suspect. If the processes page in and out, then the caches get flushed. Even a momentary lag in DB2 allows seti to come on strong. Just watch the progress seti makes while the DB2 stuff is running and you will see that the seti process still makes headway.
ID: 1895
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 1901 - Posted: 25 Jun 2004, 14:06:22 UTC - in response to Message 1883.  

> About the only thing left is memory bandwidth. (It is amusing to note that it
> seems to me that all machines have the same memory bandwidth according to
> BOINC, which cannot really be the case. I cannot believe my 166MHz Pentium
> machine has the same memory bandwidth as my Dual 3.06GHz Xeon machine with
> E7501 chipset.)

Testing memory bandwidth was removed one or two revisions ago, but cleaning up the reporting of that has not happened yet.
ID: 1901
Jean-David Beyer
Joined: 10 Jun 99
Posts: 60
Credit: 1,301,105
RAC: 1
United States
Message 2645 - Posted: 29 Jun 2004, 13:37:03 UTC - in response to Message 1895.  

> > In other words, is setiathome dirtying the L1, L2, and L3 caches so badly
> > that the DB2 software has to fetch everything from memory, whereas when
> > setiathome is not running, most of what DB2 needs is in the caches?
>
> Nail hit firmly on the head, I suspect. If the processes page in and out, then
> the caches get flushed. Even a momentary lag in DB2 allows seti to come
> on strong. Just watch the progress seti makes while the DB2 stuff is running
> and you will see that the seti process still makes headway.

Well, I have so much memory that the processes are not paging in and out. But if a db2 server waits for an IO operation to complete, setiathome gets the processor until it does. If IO operations take 2 milliseconds to complete, setiathome can totally replace the contents of the 1 Megabyte L3 cache on a processor by the time the IO completes and DB2 wants the processor again. So it must fetch its working set from RAM again. The E7501 chipset uses two banks of 266 MHz DDR memory modules in parallel to simulate a single 533MHz memory system, but that is still slow compared to the 3.06GHz processors.
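
A rough back-of-the-envelope check of that 2 millisecond argument, as a Python sketch. The cache size, I/O wait, and clock speed come from the paragraph above. The memory figure is an assumption: the theoretical peak of two DDR266 channels, each 8 bytes wide, which real code will not reach. So this gives a lower bound on the refill time, not a measurement.

```python
# Figures taken from the post above.
CACHE_BYTES = 1 * 1024 * 1024   # 1 megabyte L3 cache
IO_WAIT_S = 2e-3                # 2 ms I/O wait
CPU_HZ = 3.06e9                 # 3.06 GHz processor clock

# ASSUMPTION: peak rate of two DDR266 channels, 8 bytes per transfer each.
MEM_BYTES_PER_S = 2 * 266e6 * 8

def cache_fill_time(cache_bytes, bytes_per_s):
    """Seconds needed to stream cache_bytes from RAM at bytes_per_s."""
    return cache_bytes / bytes_per_s

fill_s = cache_fill_time(CACHE_BYTES, MEM_BYTES_PER_S)
cycles_in_wait = CPU_HZ * IO_WAIT_S      # CPU cycles available during one I/O wait
refills_per_wait = IO_WAIT_S / fill_s    # how many times the cache could turn over

if __name__ == "__main__":
    print(f"time to refill L3 at peak bandwidth: {fill_s * 1e3:.3f} ms")
    print(f"CPU cycles during one I/O wait:      {cycles_in_wait:,.0f}")
    print(f"possible cache turnovers per wait:   {refills_per_wait:.1f}")
```

Under these assumptions the refill takes roughly a quarter of a millisecond, so even at a fraction of peak bandwidth a 2 ms wait leaves room for the whole L3 to be replaced.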

Absolutely: setiathome gets about 75% or more of the CPU time available even when DB2 is running as hard as it can. The question was: if DB2 needs compute time, running at higher priority than setiathome, should it not get it? And if it does, why does setiathome impede its progress?

The answer seems to be that DB2 gets all the CPU time it asks for, but that it is not as well employed when setiathome is running because setiathome keeps the caches (L3 especially) dirty so DB2 is essentially running without the benefit of the cache.
ID: 2645
Keck_Komputers
Volunteer tester
Joined: 4 Jul 99
Posts: 1575
Credit: 4,152,111
RAC: 1
United States
Message 2713 - Posted: 29 Jun 2004, 16:18:00 UTC - in response to Message 2645.  

> > > In other words, is setiathome dirtying the L1, L2, and L3 caches so badly
> > > that the DB2 software has to fetch everything from memory, whereas when
> > > setiathome is not running, most of what DB2 needs is in the caches?
> >
> > Nail hit firmly on the head, I suspect. If the processes page in and out,
> > then the caches get flushed. Even a momentary lag in DB2 allows seti to
> > come on strong. Just watch the progress seti makes while the DB2 stuff is
> > running and you will see that the seti process still makes headway.
>
> Well, I have so much memory that the processes are not paging in and out. But
> if a db2 server waits for an IO operation to complete, setiathome gets the
> processor until it does. If IO operations take 2 milliseconds to complete,
> setiathome can totally replace the contents of the 1 Megabyte L3 cache on a
> processor by the time the IO completes and DB2 wants the processor again. So
> it must fetch its working set from RAM again. The E7501 chipset uses two banks
> of 266 MHz DDR memory modules in parallel to simulate a single 533MHz memory
> system, but that is still slow compared to the 3.06GHz processors.
>
> Absolutely: setiathome gets about 75% or more of the CPU time available even
> when DB2 is running as hard as it can. The question was: if DB2 needs compute
> time, running at higher priority than setiathome, should it not get it? And if
> it does, why does setiathome impede its progress?
>
> The answer seems to be that DB2 gets all the CPU time it asks for, but that it
> is not as well employed when setiathome is running because setiathome keeps
> the caches (L3 especially) dirty so DB2 is essentially running without the
> benefit of the cache.

I have something to add here. Since you are running on HT CPUs, the OS may not realize that seti is a lower priority, especially if you are running an older version. You may want to try limiting BOINC to the number of real processors and see if that clears up your problem.
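
One way to find the number of real processors on Linux is to compare logical CPUs against physical packages in /proc/cpuinfo. The Python sketch below is only an illustration: the function name `count_cpus` and the sample text are made up, it assumes the kernel reports `processor` and `physical id` lines, and counting packages matches "real processors" only on single-core chips such as Hyperthreaded Xeons (multi-core chips would also need the `core id` field).

```python
def count_cpus(cpuinfo_text):
    """Return (logical_cpus, physical_packages) from /proc/cpuinfo-style text."""
    logical = 0
    packages = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines have no "key : value" form
        key = key.strip()
        value = value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            packages.add(value)
    # If "physical id" is not reported, fall back to the logical count.
    physical = len(packages) if packages else logical
    return logical, physical

# Hypothetical excerpt for two Hyperthreaded single-core processors.
SAMPLE = """\
processor : 0
physical id : 0
processor : 1
physical id : 0
processor : 2
physical id : 3
processor : 3
physical id : 3
"""

if __name__ == "__main__":
    logical, physical = count_cpus(SAMPLE)
    print(f"{logical} logical CPUs on {physical} physical processors")
```

On a real machine the text would come from `open("/proc/cpuinfo").read()`, and the second number is the one to use as the processor limit in BOINC's preferences.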

John Keck
BOINCing since 2002/12/08
ID: 2713
Jean-David Beyer
Joined: 10 Jun 99
Posts: 60
Credit: 1,301,105
RAC: 1
United States
Message 3128 - Posted: 1 Jul 2004, 13:08:23 UTC - in response to Message 2713.  

[snip]

> > if a db2 server waits for an IO operation to complete, setiathome gets the
> > processor until it does. If IO operations take 2 milliseconds to complete,
> > setiathome can totally replace the contents of the 1 Megabyte L3 cache on a
> > processor by the time the IO completes and DB2 wants the processor again. So
> > it [DB2] must fetch its working set from RAM again. The E7501 chipset uses
> > two banks of 266 MHz DDR memory modules in parallel to simulate a single
> > 533MHz memory system, but that is still slow compared to the 3.06GHz
> > processors.
> >
> > Absolutely: setiathome gets about 75% or more of the CPU time available even
> > when DB2 is running as hard as it can. The question was: if DB2 needs compute
> > time, running at higher priority than setiathome, should it not get it? And
> > if it does, why does setiathome impede its progress?
> >
> > The answer seems to be that DB2 gets all the CPU time it asks for, but that
> > it is not as well employed when setiathome is running because setiathome
> > keeps the caches (L3 especially) dirty so DB2 is essentially running without
> > the benefit of the cache.
>
> I have something to add here. Since you are running on HT CPUs the OS may not
> realize that seti is a lower priority.

No, that is not the problem. My OS is Red Hat Enterprise Linux 3 ES, kept up to date using their up2date system. It is quite clear that the OS kernel recognizes that the setiathome processes are at minimum priority.

> Especially if you are running an older
> version.

I am running the most up-to-date version of this current distribution I can get: kernel 2.4.21-15.0.2.ELsmp. This kernel has many of the features of 2.6 kernels in it.

> You may want to try limiting BOINC to the number of real processors
> and see if that clears up your problem.

It does help, but only turning BOINC off entirely clears up the problem completely.
But I do not want to do that. I will let BOINC run four instances of setiathome and put up with the dirty caches.

In fact, this is an interesting test of how much a cache is worth in a real environment. When setiathome is running, the DB2 job takes about 50% longer (45 minutes instead of 30, roughly).
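
The 30 versus 45 minute figures can be read two ways, and a short Python sketch makes the difference explicit. The function name `slowdown` is made up for illustration; the only inputs are the two run times quoted above.

```python
def slowdown(baseline_min, loaded_min):
    """Return (extra wall-clock time, throughput loss) as fractions."""
    extra_time = loaded_min / baseline_min - 1.0        # how much longer the job takes
    throughput_loss = 1.0 - baseline_min / loaded_min   # how much less work per minute
    return extra_time, throughput_loss

extra, loss = slowdown(30, 45)

if __name__ == "__main__":
    print(f"job takes {extra:.0%} longer")
    print(f"DB2 gets {loss:.0%} less work done per minute")
```

So the same measurement is a 50% increase in run time, or equivalently about a one-third drop in DB2 throughput.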
ID: 3128



 