WU data doesn't match expectations. Explain the discrepancies, please.

marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984706 - Posted: 12 Mar 2019, 9:19:35 UTC
Last modified: 12 Mar 2019, 9:32:01 UTC

Using the Lunatics optimized app (0.45b), my results show an average CPU time / run time ratio of 0.85 (14 results). Many have run times of 36 seconds but report 217 seconds of CPU time: https://setiathome.berkeley.edu/result.php?resultid=7494014591. Others seem more normal but still show a 650 sec run time with 350 CPU seconds: https://setiathome.berkeley.edu/result.php?resultid=7494014031.
I know the CPU report is wrong because I watched several WUs and they never take more than 25% of a single core (holding steady at ~4.2% CPU in Process Hacker, out of 6 cores total), and the GPU is fully occupied for most of the run time.

A run time of 26 seconds with 250+ CPU seconds is a multi-threaded result, and these WUs are not multi-threading. The other cores are nearly fully occupied with prime searches.

(EDIT: Made my machine visible so you all can see the results)

-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1984707 - Posted: 12 Mar 2019, 9:29:48 UTC - in response to Message 1984706.  
Last modified: 12 Mar 2019, 9:31:33 UTC

The only thing that comes to mind is that the CPU is doing extremely heavy task switching. (I still can't see the system from S@H.)

SETI CPU tasks have their priority level set to lowest, and most other tasks have normal.
So when the O/S decides it's time for your process to have a go, it only allows it a short period of time, and so on.

So the actual cpu time spent on this task is 26 seconds or so, but in reality it took 250+ seconds due to taskswitching (it's a costly thing performance-wise), and different O/Ses behave better or worse.
When more cores (or virtual ones) come into play, they all compete for internal bus time to and from cache, memory, etc.

That's a simplified version of actual process time and real time spent.

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group

marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984710 - Posted: 12 Mar 2019, 9:47:18 UTC - in response to Message 1984707.  
Last modified: 12 Mar 2019, 9:49:52 UTC

(OK, I changed the setting to hide the computers then went in again and unhid but if you click on my name still shows hidden. Not sure how long that will take to change.)


So the actual cpu time spent on this task is 26 seconds or so, but in reality it took 250+ seconds due to taskswitching

If all the CPU times were longer than the run times, I'd assume it was a reporting error with the CPU and run times swapped, but some run times are actually longer.

I was thinking some of the same things as you, so I used Process Hacker to add the Lunatics app to a list of priority adjustments and forced the app to 'above normal'. It has held its 4.2% CPU steadily against the CPU projects for the last few hours (its internal threads are all 'normal priority'), but 2 of the last 3 WU results still have the odd characteristics.

This one looks normal:
https://setiathome.berkeley.edu/result.php?resultid=7494014179
Since you probably can't see it yet:
Task ID: 7494014179
Work unit ID: 3385719454
Computer ID: 8245453
Sent: 11 Mar 2019, 9:17:53 UTC
Reported: 12 Mar 2019, 2:17:19 UTC
Status: Completed and validated
Run time: 4,877.47 sec
CPU time: 337.63 sec
Credit: 66.74
Application: SETI@home v8, Anonymous platform (ATI GPU)

The other 2 are: https://setiathome.berkeley.edu/result.php?resultid=7494014031
Run time: 614.76 sec
CPU time: 304.97 sec
Credit: 86.51

and https://setiathome.berkeley.edu/result.php?resultid=7494014660
Run time: 20.66 sec
CPU time: 32.84 sec
Credit: 35.50

The regular app results were all similar to: https://setiathome.berkeley.edu/result.php?resultid=7490394814
Run time: 4,755.01 sec
CPU time: 204.88 sec
Credit: 105.96
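
For anyone who wants to check the figures quoted in this post, here is a minimal Python sketch that computes the CPU time / run time ratio for the four results above and flags any where the CPU time exceeds the run time. The numbers are copied from the post; the names `results`, `cpu_to_run_ratio` and `flagged` are made up for illustration and are not part of BOINC or the Lunatics app.

```python
# Run time and CPU time in seconds, as quoted in the post, keyed by result ID.
results = {
    7494014179: (4877.47, 337.63),  # Lunatics app, looks normal
    7494014031: (614.76, 304.97),   # Lunatics app
    7494014660: (20.66, 32.84),     # Lunatics app
    7490394814: (4755.01, 204.88),  # regular (stock) app
}


def cpu_to_run_ratio(run_time, cpu_time):
    """Return CPU seconds consumed per second of elapsed run time."""
    return cpu_time / run_time


# Results whose CPU time is larger than their elapsed run time.
flagged = [rid for rid, (run, cpu) in results.items() if cpu > run]

for rid, (run, cpu) in results.items():
    note = "  <-- CPU time exceeds run time" if rid in flagged else ""
    print(f"{rid}: ratio {cpu_to_run_ratio(run, cpu):.2f}{note}")
```

Of these four, only 7494014660 ends up flagged; the others spend well under one CPU second per second of run time.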

-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 1984712 - Posted: 12 Mar 2019, 9:58:18 UTC

All the results I can see look as intended.

But this one was abnormal :) https://setiathome.berkeley.edu/result.php?resultid=7494014660
Quite cool to have more CPU time than actual time, but I guess it's due to what I wrote about: task switching combined with other random stuff that I can't put my finger on.
But other than that it all looks fine. Many GPU results vary in the CPU time needed to complete the workunit, so it's all correct.


Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984714 - Posted: 12 Mar 2019, 10:05:28 UTC

Every task listed in this thread so far has been calculated on an ATI GPU.

The difference between CPU time and run time is entirely normal and indeed desirable.

A GPU is a co-processor. The idea is for the CPU to hand off the computation load to its partner, and then be free to do something else. BOINC doesn't measure or report the run-time for the GPU separately, but it is likely to be close to the run time for the task as a whole.

When using a GPU, it's actually inefficient for the CPU to be active on the same tasks, but some tasks/hardware/applications/programming languages require it.

Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 1984716 - Posted: 12 Mar 2019, 10:26:19 UTC - in response to Message 1984710.  

Hi Marmot,

(OK, I changed the setting to hide the computers then went in again and unhid but if you click on my name still shows hidden. Not sure how long that will take to change.)

Here's a dumb question: Do you go to the bottom of the page to apply the changes made to your settings? Some users forget to do that. ;) I have on occasion when I was new to BOINC.

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath

marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984725 - Posted: 12 Mar 2019, 11:52:44 UTC - in response to Message 1984714.  
Last modified: 12 Mar 2019, 12:25:05 UTC


The difference between CPU time and run time is entirely normal and indeed desirable.


No, these run time differences are not normal. Many look like multi-threaded results when they are not advertised as such.

A GPU is a co-processor. The idea is for the CPU to hand off the computation load to its partner, and then be free to do something else. BOINC doesn't measure or report the run-time for the GPU separately, but it is likely to be close to the run time for the task as a whole.


That is completely off topic; it's information we all know already and has nothing to do with the errors seen. The run time is simply the ending time minus the start time, i.e. the time from when the WU starts until it finishes. It's the upper limit of GPU seconds and by definition must be greater than the CPU time unless the WU is multi-threaded. It also must be 4 times longer (or more) than the CPU time, since the Lunatics WU is holding itself to 25% of a CPU core.

The run times seem to be misreported.


marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984726 - Posted: 12 Mar 2019, 11:53:36 UTC - in response to Message 1984716.  
Last modified: 12 Mar 2019, 11:54:06 UTC

Hi Marmot,

(OK, I changed the setting to hide the computers then went in again and unhid but if you click on my name still shows hidden. Not sure how long that will take to change.)

Here's a dumb question: Do you go to the bottom of the page to apply the changes made to your settings? Some users forget to do that. ;) I have on occasion when I was new to BOINC.

Have a great day! :)

Siran

Damn man... Look at my S@H birthdate before asking something like that... I joined BEFORE you did...

marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984728 - Posted: 12 Mar 2019, 12:14:28 UTC - in response to Message 1984712.  
Last modified: 12 Mar 2019, 12:23:00 UTC

All the results I can see look as intended.

But other than that it all looks fine. Many GPU results vary in the CPU time needed to complete the workunit, so it's all correct.


The data says otherwise:

9/20 or 45% of the WUs have run times shorter than the CPU time, which is only possible with multi-threaded WUs.
These are not multi-threaded; I observed their function and probed their threads. There is one calculation thread (amdocl.dll!clGetPipeInfo+0x18af0) taking up CPU cycles. To multi-thread, the WU would have to start several copies of that thread.

Edit: AH HA!

The wrapper DID start multiple calculation threads; there are currently 2 (amdocl.dll!clGetPipeInfo+0x18af0) executing CPU cycles and 4 available.

So the answer to these reporting anomalies is that the Lunatics WU is multi-threading.
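
The reasoning here can be put as a small model: a process's reported CPU time is the sum of the CPU seconds used by all of its threads, so it can only exceed the elapsed run time when more than one thread is busy at once. The Python sketch below is an illustration under that assumption; `expected_cpu_seconds` and `implied_busy_cores` are hypothetical helper names, not BOINC functions.

```python
def expected_cpu_seconds(run_time, thread_utilizations):
    """CPU seconds accumulated over run_time seconds of wall time.

    thread_utilizations holds, for each thread, the fraction of the
    wall time that the thread spent executing on a core (0.0 to 1.0).
    """
    return run_time * sum(thread_utilizations)


def implied_busy_cores(run_time, cpu_time):
    """Average number of cores kept busy, inferred from a reported result."""
    return cpu_time / run_time


# One thread held to 25% of a core: CPU time is a quarter of the run time.
print(expected_cpu_seconds(100.0, [0.25]))

# Two fully busy threads: CPU time is twice the run time.
print(expected_cpu_seconds(100.0, [1.0, 1.0]))

# Result 7494014660 from earlier in the thread (20.66 s run, 32.84 s CPU)
# implies more than one core's worth of work on average.
print(round(implied_busy_cores(20.66, 32.84), 2))
```

A single thread can never push `implied_busy_cores` above 1.0, which is why a value above 1.0 points to several threads accumulating CPU time together.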

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984732 - Posted: 12 Mar 2019, 12:52:03 UTC

Yes, the Lunatics SETI app is multi-threaded. This is the Process Explorer view of the NVidia build, but all versions are built from the same code base.



Most of the threads are called from the driver/language DLL, but there are three threads called directly from the Lunatics executable.

Two of these are background threads managed by the BOINC library API, for tasks like timing and message-passing which are required for an application to run under the BOINC environment. As you see here, they normally occupy a tiny fraction of a single CPU core. The 23.75% figure (one full core) is specific to the NVidia OpenCL programming environment, and represents the spin-wait synchronisation loop required by NVidia's implementation of OpenCL - it won't apply on your AMD cards.

So I would say that where you see CPU time exceeding GPU time, there is indeed something unexpected happening on your computer, that is not designed into the Lunatics apps.

Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 1984744 - Posted: 12 Mar 2019, 13:25:14 UTC - in response to Message 1984726.  

Hi Marmot,

(OK, I changed the setting to hide the computers then went in again and unhid but if you click on my name still shows hidden. Not sure how long that will take to change.)

Here's a dumb question: Do you go to the bottom of the page to apply the changes made to your settings? Some users forget to do that. ;) I have on occasion when I was new to BOINC.

Have a great day! :)

Siran

Damn man... Look at my S@H birthdate before asking something like that... I joined BEFORE you did...

Big deal so frakking what. So you joined 7 days before me. You don't need to get snippy about it. I was just trying to help.

I went to your page and saw the computers hidden. What I posted was the first thing I thought of. I just went again now and now they are not. Kinda seems like my post was relevant. :|

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984747 - Posted: 12 Mar 2019, 13:34:04 UTC - in response to Message 1984745.  

And Btw, Richard is the accepted expert upon Lunatics installers.
Yes, I wrote the installers, and I have personal responsibility for their contents.

But I did not write the payload they deliver - the actual binary executable programs themselves.

I did test many of the applications myself, though not the AMD variants - I've never owned an AMD/ATI compute-capable GPU. But you don't work alongside developers and test their products over a number of years without learning something of what goes on under the hood.

Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 1984748 - Posted: 12 Mar 2019, 13:34:42 UTC - in response to Message 1984745.  

And Btw, Richard is the accepted expert upon Lunatics installers.

Was I giving him advice on the Lunatics installer? No! So STFU!

Tom M
Volunteer tester
Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 1984775 - Posted: 12 Mar 2019, 15:08:46 UTC - in response to Message 1984728.  


So the answer to these reporting anomalies is that the Lunatics WU is multi-threading.


I have had Windows 10 sometimes install in such a way that BOINC thought it was running on 2 GPUs, when in fact it was running two threads on 1 GPU. This happened with both Nvidia and AMD GPU drivers under Windows 10.

I noticed it when I looked at the task manager: one GPU was busy but the other wasn't. Since the time to process the GPU tasks was headed for more than double the time it normally took, I infer something was running 2 tasks on one GPU.

The fix that seems to work is to backdate to a significantly earlier driver. I have been using 3.97 something or other.

I am not sure if this post is relevant or not. If it isn't please ignore it.

Tom
A proud member of the OFA (Old Farts Association).

Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1984781 - Posted: 12 Mar 2019, 15:20:18 UTC - in response to Message 1984775.  

I think the "two tasks on one GPU" problem may be related to recent reports that the same GPU may be detected twice at startup and listed twice in the initial display of detected devices.

This appears to be related to two different OpenCL runtime stacks, perhaps with different version numbers and from different suppliers, being present in the operating system and associated with the NVidia hardware. That possibility could be investigated by referring to the opening lines of the BOINC Event Log at startup.

marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984884 - Posted: 13 Mar 2019, 2:54:13 UTC - in response to Message 1984744.  

You don't need to get snippy about it. I was just trying to help.

I went to your page and saw the computers hidden. What I posted was the first thing I thought of. I just went again now and now they are not. Kinda seems like my post was relevant. :|


Reverse the situation and imagine someone accusing you, a nearly 20 year veteran, of making the most rookie mistake, especially after you mentioned that you adjusted the preferences twice.

It was insulting and deserved clap back.

It takes a website a period of time to unhide computers; I'd forgotten the lag seen when unhiding at YAFU and Collatz.

Thanks for the other aid.

marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984885 - Posted: 13 Mar 2019, 2:58:46 UTC - in response to Message 1984775.  
Last modified: 13 Mar 2019, 3:11:16 UTC

Tom M:

So the answer to these reporting anomalies is that the Lunatics WU is multi-threading.


I have had Windows 10 sometimes install in such a way that BOINC thought it was running on 2 GPUs, when in fact it was running two threads on 1 GPU. This happened with both Nvidia and AMD GPU drivers under Windows 10.

I noticed it when I looked at the task manager: one GPU was busy but the other wasn't. Since the time to process the GPU tasks was headed for more than double the time it normally took, I infer something was running 2 tasks on one GPU.

The fix that seems to work is to backdate to a significantly earlier driver. I have been using 3.97 something or other.

I am not sure if this post is relevant or not. If it isn't please ignore it.

Tom


Your input might be relevant, at least for another issue.
One of the issues from 2 days ago is that MSI Afterburner reports 4 video cards: 1 & 2 are the Phenom II's internal, 3 is the RX 550 and 4 is the nVidia 1060.
Collatz won't download and run nVidia WUs (it ran them fine when the 1060 was the only coproc) but sends RX 550 work, and none of the attempted fixes helped. GPUGrid, Enigma and SETI are working fine on the ATI/nVidia setup, though.

Richard Haselgrove:
I think the "two tasks on one GPU" problem may be related to recent reports that the same GPU may be detected twice at startup and listed twice in the initial display of detected devices.

This appears to be related to two different OpenCL runtime stacks, perhaps with different version numbers and from different suppliers, being present in the operating system and associated with the NVidia hardware. That possibility could be investigated by referring to the opening lines of the BOINC Event Log at startup.


Thanks, I'll look into that, might be related to the Collatz issue.

I'll also take a screen shot of the WU threads for the ATI card so you can compare to your above picture.

Be back with an update by tomorrow.

marmot
Joined: 15 May 99
Posts: 144
Credit: 1,220,664
RAC: 0
United States
Message 1984900 - Posted: 13 Mar 2019, 5:04:24 UTC - in response to Message 1984732.  
Last modified: 13 Mar 2019, 5:05:12 UTC

Yes, the Lunatics SETI app is multi-threaded. This is the Process Explorer view of the NVidia build, but all versions are built from the same code base.



Most of the threads are called from the driver/language DLL, but there are three threads called directly from the Lunatics executable.

Two of these are background threads managed by the BOINC library API, for tasks like timing and message-passing which are required for an application to run under the BOINC environment. As you see here, they normally occupy a tiny fraction of a single CPU core. The 23.75% figure (one full core) is specific to the NVidia OpenCL programming environment, and represents the spin-wait synchronisation loop required by NVidia's implementation of OpenCL - it won't apply on your AMD cards.

So I would say that where you see CPU time exceeding GPU time, there is indeed something unexpected happening on your computer, that is not designed into the Lunatics apps.



Here is the WU running on the RX 550.
There are 4 amdocl calc threads started, but I have only seen 2 using CPU at once, and the one with the lower TID always shows ~50% more usage than the second. Makes me wonder if it's possible (or advantageous) to get all 4 amdocl threads operating with cmdline options.
This is Lunatics 0.45b with cmdline -hp (sets the wrapper thread to high priority, useful when many CPU WUs are running).
The process MB8_win_xxx takes a maximum of 4.6% (of 6 cores) under load but will use 16.3% (1 core) while loading the data set.
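
The percentages quoted here are fractions of the whole 6-core machine, which is easy to misread next to per-core figures. As a rough illustration, this Python sketch converts a whole-machine CPU percentage into the equivalent number of cores in use; `cores_in_use` is a hypothetical helper name and the 6-core count is taken from the posts above.

```python
def cores_in_use(total_percent, logical_cores):
    """Convert a whole-machine CPU percentage into cores' worth of load."""
    return total_percent / 100.0 * logical_cores


LOGICAL_CORES = 6  # the machine described in this thread

# Percentages quoted in the thread: steady state, peak under load, data loading.
for label, percent in [("steady", 4.2), ("max under load", 4.6), ("loading data", 16.3)]:
    print(f"{label}: {percent}% of {LOGICAL_CORES} cores = "
          f"{cores_in_use(percent, LOGICAL_CORES):.2f} cores")
```

On this basis the steady 4.2% corresponds to roughly a quarter of one core, and 16.3% to roughly one full core.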



Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 1984914 - Posted: 13 Mar 2019, 10:33:14 UTC - in response to Message 1984884.  

You don't need to get snippy about it. I was just trying to help.

I went to your page and saw the computers hidden. What I posted was the first thing I thought of. I just went again now and now they are not. Kinda seems like my post was relevant. :|


Reverse the situation and imagine someone accusing you, a nearly 20 year veteran, of making the most rookie mistake, especially after you mentioned that you adjusted the preferences twice.

It was insulting and deserved clap back.

It takes a website a period of time to unhide computers; I'd forgotten the lag seen when unhiding at YAFU and Collatz.

Thanks for the other aid.

You know what? There is obviously a BIG difference between us when it comes to people helping. If the situation was reversed I would not have gotten all pissy about the suggestion. It was a reminder that there is an apply button to change settings. I would have said something like: "Yeah I thought of that too and made sure I hit the button. Can't mistake it though, with that 'select all pictures with...' that they use for making sure we are all humans. ;)"

Well I'll be darned. I just checked and you are correct in that, at least with that particular type of setting, there is a delay before it takes effect. Every other setting I have changed was an instant change, except that one. And, there's no "gottcha" to prove we are human. Hmmm...

My apologies for trying to help and insulting you. It was not my intent to insult. I did not see that you joined 8 days before me and are more proficient than I. I will not attempt to help again.

Situation dropped.

Have a great day! :)

Siran

Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1984921 - Posted: 13 Mar 2019, 12:17:52 UTC - in response to Message 1984914.  

It's called browser cache. Your browser caches the last state until the cache is refreshed. Or you can force a cache refresh with CTRL+F5 (Firefox) or whatever possible equivalent for other browsers.

People read things that aren't there and are much too easily offended these days, and always feel they have the right to rip the helper a new one. Their loss, as they're the ones asking for help; if they react like that often enough, no one will offer it again.