One WU not respecting nice...

Samuel Blomqvist
Joined: 14 May 99
Posts: 8
Credit: 492,504
RAC: 0
Sweden
Message 306479 - Posted: 15 May 2006, 16:59:11 UTC

This is strange... I have a dual-processor system with a total of four cores, and I suddenly saw one of the CPU meters indicating 100% load (the meter shouldn't show "nice" load). It turns out that one particular WU (30mr99aa.29392.16146.492302.3.196_3) is causing this. I have no idea whether this has any real effect on my computer, but all the other WUs are running fine at nice 19 and are not showing up on the CPU-load meter.

Has anybody got an idea?

BTW, I'm running BOINC under Debian GNU/Linux with a 2.6.16 kernel.

/Samuel

ML1
Volunteer moderator
Volunteer tester
Joined: 25 Nov 01
Posts: 21668
Credit: 7,508,002
RAC: 20
United Kingdom
Message 306559 - Posted: 15 May 2006, 18:45:02 UTC - in response to Message 306479.  

Samuel Blomqvist wrote:
> This is strange... I have a dual-processor system with a total of four cores, and I suddenly saw one of the CPU meters indicating 100% load (the meter shouldn't show "nice" load). It turns out that one particular WU...
>
> Has anybody got an idea?
>
> BTW, I'm running BOINC under Debian GNU/Linux with a 2.6.16 kernel.

Never seen that happen here.

Can you check what the process priorities actually are for that process and any associated processes? (Use "top" in a terminal window?)

Can you change that process priority down to "nice 19"? (Use "renice"?)

Or have you got strange settings for that one CPU meter? (Double check its options/properties?)
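
For anyone who would rather script the first two checks than use top and renice by hand, here is a minimal sketch in Python. It assumes a Unix-like system, and the helper name `ensure_nice` is made up for illustration (it is not part of BOINC). It relies only on the standard-library calls `os.getpriority` and `os.setpriority`.

```python
import os


def ensure_nice(pid: int, target: int = 19) -> int:
    """Return the nice value of `pid`, raising it to `target` if it is lower.

    `ensure_nice` is a hypothetical helper, not part of BOINC.
    Raising the nice value (i.e. lowering the priority) of your own
    process needs no special privileges; changing a process owned by
    another user normally requires root.
    """
    current = os.getpriority(os.PRIO_PROCESS, pid)
    if current < target:
        os.setpriority(os.PRIO_PROCESS, pid, target)
        # Read the value back so the caller sees what the kernel accepted.
        current = os.getpriority(os.PRIO_PROCESS, pid)
    return current
```

Calling `ensure_nice(11380)` as root, with the PID taken from top, would be roughly equivalent to checking the NI column and then running renice on that process.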

Let us know what you find,

Good luck,
Martin
See new freedom: Mageia Linux
Take a look for yourself: Linux Format
The Future is what We all make IT (GPLv3)

Samuel Blomqvist
Message 306615 - Posted: 15 May 2006, 20:10:09 UTC - in response to Message 306559.  
Last modified: 15 May 2006, 20:11:03 UTC

ML1 wrote:
> Can you check what the process priorities actually are for that process and any associated processes? (Use "top" in a terminal window?)
>
> Can you change that process priority down to "nice 19"? (Use "renice"?)
>
> Or have you got strange settings for that one CPU meter? (Double check its options/properties?)
>
> Let us know what you find,
>
> Good luck,
> Martin



Here is part of what top shows:
Tasks: 148 total,   2 running, 146 sleeping,   0 stopped,   0 zombie
Cpu(s): 26.9% us,  0.6% sy, 72.3% ni,  0.0% id,  0.0% wa,  0.1% hi,  0.2% si
Mem:   2075172k total,  2006832k used,    68340k free,    37300k buffers
Swap:  2634620k total,      160k used,  2634460k free,  1237836k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
11380 boinc     34  19 59240  37m 3420 S  100  1.9 167:47.01 setiathome-5.12
  955 boinc     34  19 60240  38m 3420 S   99  1.9 414:59.22 setiathome-5.12
11808 boinc     39  19 18968  15m 1252 R   97  0.8 129:52.83 setiathome_4.02
  956 boinc     34  19 59144  37m 3420 S   94  1.9 415:46.37 setiathome-5.12
14347 samuel    16   0 28556  14m  11m S    5  0.7   0:03.49 boincmgr


Taking into account that the process in question has about 2 h 50 min of CPU time, I'd say it's the topmost one (PID 11380).

The 100% CPU load moves between the cores, so I don't think it has anything to do with the settings of the CPU meters (I'm using gkrellm).

I shut down the processes one by one, and when I shut down that one the 100% load disappeared.


/LinuxSam

Samuel Blomqvist
Message 308443 - Posted: 17 May 2006, 11:59:42 UTC - in response to Message 306615.  

I now have another WU misbehaving...

It's the WU 07ja99aa.20918.7201.686074.3.57

It shows up on the CPU-load meters as not being "nice", but top shows it with nice 19.

/Samuel


Hans Dorn
Volunteer developer
Volunteer tester
Joined: 3 Apr 99
Posts: 2262
Credit: 26,448,570
RAC: 0
Germany
Message 308451 - Posted: 17 May 2006, 12:09:25 UTC - in response to Message 308443.  
Last modified: 17 May 2006, 12:11:03 UTC

Samuel Blomqvist wrote:
> I now have another WU misbehaving...
>
> It's the WU 07ja99aa.20918.7201.686074.3.57
>
> It shows up on the CPU-load meters as not being "nice", but top shows it with nice 19.
>
> /Samuel



A nice level of 19 means "idle priority". There's something weird going on.

Try increasing top's update rate with the "s" command; maybe you can spot something.

If you log in as root, you can specify sub-second update rates like "0.1" or even "0.02".

Use the "i" command to only display active tasks.

Regards Hans

Samuel Blomqvist
Message 308475 - Posted: 17 May 2006, 12:30:03 UTC - in response to Message 308451.  





Hans Dorn wrote:
> A nice level of 19 means "idle priority". There's something weird going on.
>
> Try increasing top's update rate with the "s" command; maybe you can spot something.
>
> If you log in as root, you can specify sub-second update rates like "0.1" or even "0.02".
>
> Use the "i" command to only display active tasks.
>
> Regards Hans


I don't see anything strange (I tried increasing the update rate but didn't see anything then either). The only process that shows up as running is the 4.02 SETI version; Seti_enhanced always shows up as sleeping for some reason.

/Samuel

ML1
Message 308973 - Posted: 17 May 2006, 23:29:52 UTC - in response to Message 308475.  
Last modified: 17 May 2006, 23:31:32 UTC

Samuel Blomqvist wrote:
> I don't see anything strange (I tried increasing the update rate but didn't see anything then either). The only process that shows up as running is the 4.02 SETI version; Seti_enhanced always shows up as sleeping for some reason.
>
> /Samuel

Looking at that top output, all looks normal. Your "samuel" user process is a little suspicious for using so much CPU time unless you have some heavyweight task running there.

Even at the lowest priority of "nice 19", a process can take 100% of a CPU if there is nothing else (of higher priority) for that processor to do. You have a multi-processor system, so you could well have periods where only one CPU is needed for the rest of your system and the other CPU(s) are free to run Boinc at 100%.

Also note: The process state flags are a "snapshot" at some instant. You can expect most processes to be sleeping waiting for something new to happen. Also, the Linux kernel process scheduler will try to maintain CPU affinity for a process, hence a lucky Boinc process could get (and keep) one processor all to itself for a few seconds at a time or so...
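
To watch this for a single task, one option is to read its entry under /proc instead of relying on a top snapshot. The following is a rough sketch, assuming a Linux system with /proc mounted; the function names are made up for illustration. The field positions (14 and 15 for user and system ticks, 19 for the nice value, 39 for the CPU the task last ran on) are taken from the proc(5) manual page.

```python
from pathlib import Path


def parse_proc_stat(text: str) -> dict:
    """Extract a few fields from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' first and only then
    on whitespace. Field numbers are 1-based as in proc(5).
    """
    head, _, tail = text.rpartition(")")
    pid, _, comm = head.partition(" (")
    rest = tail.split()  # rest[0] is field 3 (state)
    return {
        "pid": int(pid),
        "comm": comm,
        "state": rest[0],
        "utime_ticks": int(rest[11]),  # field 14
        "stime_ticks": int(rest[12]),  # field 15
        "nice": int(rest[16]),         # field 19
        "last_cpu": int(rest[36]),     # field 39
    }


def read_proc_stat(pid: int) -> dict:
    """Read and parse /proc/<pid>/stat (Linux only)."""
    return parse_proc_stat(Path(f"/proc/{pid}/stat").read_text())
```

Calling `read_proc_stat(pid)` once a second for one of the setiathome PIDs and printing `nice` and `last_cpu` should show whether the nice value ever changes and how often the task hops between cores.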

All looks normal, with good utilisation.

Happy crunchin',
Martin

Hans Dorn
Message 309021 - Posted: 18 May 2006, 0:20:32 UTC

Hi Martin,

The strange thing is that he's seeing about 25% us (user) load and only about 75% ni (nice),
while all BOINC tasks are at nearly 100% CPU.

There really should be just about 100% ni in the second line.
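
The us/ni split can also be checked independently of top and gkrellm by sampling the kernel's own counters. Here is a minimal sketch, assuming Linux: it takes two copies of the first line of /proc/stat, captured a second or so apart, and computes the same percentages top shows in its Cpu(s) line. The function name `cpu_split` is made up for illustration. As far as I know, the kernel books user-mode time in the "nice" counter whenever the running task has a positive nice value.

```python
def cpu_split(before: str, after: str) -> dict:
    """Percentage split of CPU time between two aggregate `cpu` lines.

    Each argument is the first line of /proc/stat, which has the form
    "cpu  user nice system idle iowait irq softirq ...".
    Only the first seven counters are used, matching the seven columns
    of the Cpu(s) line quoted earlier in this thread.
    """
    names = ["us", "ni", "sy", "id", "wa", "hi", "si"]
    a = [int(x) for x in before.split()[1:8]]
    b = [int(x) for x in after.split()[1:8]]
    delta = [y - x for x, y in zip(a, b)]
    total = sum(delta)
    if total <= 0:
        raise ValueError("no CPU time elapsed between the two samples")
    return {n: round(100.0 * d / total, 1) for n, d in zip(names, delta)}
```

If this also reports roughly 25% us with only niced BOINC tasks running, the odd accounting is coming from the kernel rather than from the monitoring tools.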


Regards Hans



Samuel Blomqvist
Message 309170 - Posted: 18 May 2006, 2:02:57 UTC - in response to Message 309021.  

Hans Dorn wrote:
> Hi Martin,
>
> The strange thing is that he's seeing about 25% us (user) load and only about 75% ni (nice),
> while all BOINC tasks are at nearly 100% CPU.
>
> There really should be just about 100% ni in the second line.
>
> Regards Hans




Exactly. I know all the other stuff, and BOINC is using almost 100% x 4 of the CPU time, which is fine with me. However, the tasks should all be running at nice 19, and according to top they are, yet in gkrellm some WUs show up as not being nice. And in top there is only 75% or less nice when there should really be almost 100% nice, since nothing much else is running. When running at nice 19, BOINC shouldn't generate "user load", should it?

I just got another one: it is the WU 22fe99aa.2560.546.959636.3.85

/LinuxSam

ML1
Message 309618 - Posted: 18 May 2006, 16:42:17 UTC - in response to Message 309170.  

Hans Dorn wrote:
> The strange thing is that he's seeing about 25% us (user) load and only about 75% ni (nice), while all BOINC tasks are at nearly 100% CPU.
>
> There really should be just about 100% ni in the second line.

Samuel Blomqvist wrote:
> Exactly. I know all the other stuff, and BOINC is using almost 100% x 4 of the CPU time, which is fine with me. However, the tasks should all be running at nice 19...

Have you somehow got one of the Boinc processes running under your own user account, samuel?...

Or is this indeed a bug that you have found?

Can you post the output from "pstree -p" and "export TERM='xterm' ; top -b -n 1"? (Link to a txt file on a webpage somewhere? It will likely be unintelligible on here!)

Comparing the pstree process numbers and top output should show all!
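
Once the batch output of top is saved to a file, the comparison can be partly automated. The sketch below is only an illustration: it assumes the default column layout that Samuel posted earlier (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND), and both function names are made up.

```python
def parse_top_processes(text: str) -> list:
    """Parse the process table from `top -b -n 1` style output.

    Assumes the default columns shown earlier in this thread:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND.
    Returns one dict per process, keyed by column name (string values).
    """
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if header is None:
            # Skip the summary lines until the column header appears.
            if fields[:2] == ["PID", "USER"]:
                header = fields
            continue
        if len(fields) < len(header):
            continue  # blank or truncated line
        # The command may contain spaces, so rejoin everything past the
        # last fixed column.
        last = len(header) - 1
        values = fields[:last] + [" ".join(fields[last:])]
        rows.append(dict(zip(header, values)))
    return rows


def not_nice(rows: list, min_cpu: float = 50.0, nice: int = 19) -> list:
    """Rows using at least `min_cpu` percent CPU with a nice value below `nice`."""
    return [r for r in rows
            if float(r["%CPU"]) >= min_cpu and int(r["NI"]) < nice]
```

An empty result from `not_nice(parse_top_processes(text))` would mean that no heavy CPU user is running below nice 19 at that instant; the PID values of any rows it does return can then be looked up in the `pstree -p` output to see which parent started them.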

Regards,
Martin