Is the S@H server biased at sending more guppis to NV GPUs?

Message boards : Number crunching : Is the S@H server biased at sending more guppis to NV GPUs?
Profile Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1805216 - Posted: 27 Jul 2016, 22:32:04 UTC

I am swapping tasks differently on 2 almost identical rigs.

On one, I am sending nonVLARs from the CPU to the GPU, and the GPU queue keeps increasing (on most days) into the hundreds.

On the other, I am using my semi-automated script (not Mr Kevvy's script) to send Guppis from the GPU to the CPU, and after 3 days of doing so, the CPU queue has gone from 100 to slightly more than 200.

The latter scenario had already occurred for my sole alpha tester on a Core2Duo rig with a GTX 9x0, but I had assumed it was normal since the GPU:CPU ratio was greater than 10:1.

But with my 2 almost identical rigs, I am left wondering:
- Is this just an anomaly, given that the server sometimes sends far more of one type of task; or
- Is the server actually sending a different ratio of nonVLAR:guppi to the GPU than the CPU queues?
    Additional details
    ==================
    On my NV_SoG app rig with a GTX 750 Ti, I am using Mr Kevvy's script to swap nonVLARs to the GPU and Guppis to the CPU.
    It is processing almost twice as many nonVLARs per day on the GPU (~100 tasks/day) as Guppis on the CPU (~54 tasks/day).
    My GPU queue is increasing on most days well into the hundreds and, if I didn't manage it in other ways, could top 1,000.

    On my Cuda50 app rig, also with a GTX 750 Ti, I am using my semi-automated script (made prior to Mr K's release) to move Guppis from the GPU to the CPU.
    It is processing twice as many nonVLARs per day on the GPU (~110 tasks/day) as Guppis on the CPU (~54 tasks/day).
    During the last 3 days, my CPU queue has increased from 100 to slightly over 200.
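
For anyone curious, the swapping boils down to telling the two kinds of tasks apart by name. A minimal sketch of that test (assuming, as on my hosts, that GBT/guppi workunit names start with "blc" and Arecibo VLAR names carry a ".vlar" suffix; the example names are made up, and the real scripts do a lot more than this):

    def task_kind(workunit_name: str) -> str:
        # Rough heuristic: classify a workunit by its name.
        name = workunit_name.lower()
        if name.startswith("blc"):
            return "guppi"
        if name.endswith(".vlar"):
            return "VLAR"
        return "nonVLAR"

    print(task_kind("blc4_2bit_guppi_example_wu"))           # guppi   (hypothetical name)
    print(task_kind("01ja07ab.12345.6789.10.33.100.vlar"))   # VLAR    (hypothetical name)
    print(task_kind("01ja07ab.12345.6789.10.33.101"))        # nonVLAR (hypothetical name)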


Any thoughts on whether the server might be sending different ratios of tasks?
Rob ;-)

ID: 1805216 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1805232 - Posted: 28 Jul 2016, 0:14:52 UTC - in response to Message 1805216.  

No, not really. Since GUPPIs are being split more often, if you look at the SSP you'll see more GUPPI tapes with progress than Arecibo ones.
The scheduler sends out WUs from the queue on a first-come, first-served basis. So if GUPPI tapes finish splitting first and the Arecibo ones finish seconds later, the scheduler will send GUPPI WUs since those finished first, and who knows how many WUs per tape GUPPI yields.
(Conjecture) I might be wrong about the amounts for both, but Arecibo should yield ~352k WUs per tape, and GUPPI tapes, being the same size, should also yield ~352k WUs. (Conjecture end)
ID: 1805232 · Report as offensive
Profile Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1805242 - Posted: 28 Jul 2016, 1:13:43 UTC - in response to Message 1805232.  

Hey Kiska,
Thanks for your reply.
From what I've observed since I started swapping (manually about a month ago, then semi-automated, then automated), when my GPU queue is over 100 tasks, many more nonVLARs than guppis are sent to be assigned to the CPU...and I then transfer those nonVLARs to the GPU queue.

From recollection, I'd estimate that about half the time I get tasks of only one type (either all Guppis or all nonVLARs). The rest of the time, I get a mix of nonVLARs and Guppis.
So to reconcile my experience with your reply:
No, not really. Since GUPPIs are being split more often, if you look at the SSP you'll see more GUPPI tapes with progress than Arecibo ones.
The scheduler sends out WUs from the queue on a first-come, first-served basis. So if GUPPI tapes finish splitting first and the Arecibo ones finish seconds later, the scheduler will send GUPPI WUs since those finished first, and who knows how many WUs per tape GUPPI yields.

the splitting of both types would have to be happening in parallel about half the time, interleaving in the FIFO queue, for it to match what I have been experiencing.

Looking at the S@h server status page, I can't reconcile my experience of getting many more nonVLARs during the last month with what I'm reading on that page, since it suggests there should be more guppis in the field than nonVLARs.

So all this to say I'm perplexed since my experience doesn't match up with your reply and the SSP.
I'm guessing I'm not seeing an important variable.
Any idea what it might be?
ID: 1805242 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1805270 - Posted: 28 Jul 2016, 4:19:27 UTC - in response to Message 1805242.  

An important variable is that other people are also requesting tasks. Right now, for example, my GPU got a bunch of Arecibo tasks, which are nonVLAR.
ID: 1805270 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22200
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1805287 - Posted: 28 Jul 2016, 7:00:29 UTC

Currently the distributed file size is the same. There have been some trials on Beta recently with "double size" files - those I currently have carry "Arecibo type" names, and have taken about twice as long to run as standard-length files. I guess Eric has been trying something on the splitters over there.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1805287 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1805293 - Posted: 28 Jul 2016, 8:03:19 UTC - in response to Message 1805287.  
Last modified: 28 Jul 2016, 8:04:32 UTC

Currently the distributed file size is the same. There have been some trials on Beta recently with "double size" files - those I currently have carry "Arecibo type" names, and have taken about twice as long to run as standard-length files. I guess Eric has been trying something on the splitters over there.

If they run twice as long, check your host setup. Execution times did not change because the length of telescope data inside the file remained the same; each data point just takes twice as many bits. {And this change applies to both Arecibo and GBT data}
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1805293 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1805307 - Posted: 28 Jul 2016, 9:32:20 UTC

I now have time to sit down and fully read your message. Back to the splitting ratio: the current ratio is 7:4 (GBT:Arecibo), looking at the tapes in progress on the SSP. So for every Arecibo WU there will be 1.75 GBT WUs, which means you are most likely going to receive roughly 64:36 (GBT:Arecibo).
Now, not every Arecibo WU will be VHAR (nonVLAR), since we piggyback on whatever science project the telescope is running, whilst GBT data will almost always be VLAR since it is dedicated to scanning the sky for us.
Remember that regardless of where the data comes from, the designations stay the same: VLAR, midAR, nonVLAR (VHAR).
(Conjecture) VLAR is when the true angle range is < 0.15, midAR is when the angle range is between 0.15 and 1.00, and VHAR (nonVLAR) is > 1.00. (Conjecture end)
Also note that other people are requesting tasks, and whilst you might be getting just one type of unit, others might not be.
Today, for example, I've just been getting VHARs, with no GUPPIs or VLARs in sight.
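
As a rough sketch of that arithmetic and of the conjectured cut-offs (the 0.15 and 1.00 thresholds are my guess above, not official project values):

    def classify_ar(true_angle_range: float) -> str:
        # Bucket a WU by its true angle range, using the conjectured thresholds.
        if true_angle_range < 0.15:
            return "VLAR"
        if true_angle_range <= 1.00:
            return "midAR"
        return "VHAR (nonVLAR)"

    gbt_tapes, arecibo_tapes = 7, 4   # in-progress tapes as read off the SSP right now
    gbt_share = gbt_tapes / (gbt_tapes + arecibo_tapes)
    print(f"expected mix: {gbt_share:.0%} GBT vs {1 - gbt_share:.0%} Arecibo")  # ~64% vs ~36%
    print(classify_ar(0.44291), classify_ar(0.005536))  # midAR VLAR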
ID: 1805307 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1805309 - Posted: 28 Jul 2016, 10:00:55 UTC - in response to Message 1805293.  

Currently the distributed file size is the same. There have been some trials on Beta recently with "double size" files - those I currently have carry "Arecibo type" names, and have taken about twice as long to run as standard-length files. I guess Eric has been trying something on the splitters over there.

If they run twice as long, check your host setup. Execution times did not change because the length of telescope data inside the file remained the same; each data point just takes twice as many bits. {And this change applies to both Arecibo and GBT data}


With more bits per complex sample we will get somewhat more results for essentially the same compute time, plus 1 or 2 minutes. I am very happy with the new complex samples and will continue to test.
ID: 1805309 · Report as offensive
Onno Hommes

Joined: 18 Jul 16
Posts: 2
Credit: 47,606
RAC: 0
United States
Message 1805763 - Posted: 30 Jul 2016, 4:13:57 UTC

In the last two weeks I have been assigned 5 Astropulse tasks, which are the only ones actually using my GPU on Linux. My GPU tasks complete in a fraction of the time of the CPU tasks (i.e. about ten times faster), yet I barely get any GPU tasks assigned.

Is there a bias in assigning tasks? Why don't I get more GPU tasks, especially since I can crunch much faster on that hardware? The GPU is totally underutilized.

Any comments?
ID: 1805763 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1805773 - Posted: 30 Jul 2016, 5:12:55 UTC - in response to Message 1805763.  

Why don't I get more GPU tasks, especially since I can crunch much faster on that hardware?

Do you have an application on the system to crunch MB tasks?
Grant
Darwin NT
ID: 1805773 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22200
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1805775 - Posted: 30 Jul 2016, 6:04:09 UTC

As you only have the GPU application for processing AstroPulse tasks, you only get AstroPulse tasks for the GPU, and they are not "split" as often as "MultiBeam" tasks. This low availability is for a number of reasons, including:
- a "tape" doesn't contain as many AstroPulse tasks as it does MultiBeam tasks;
- currently only tapes from the "Arecibo" telescope are being split for AstroPulse tasks;
- many of the "Arecibo" tapes have already had their AstroPulse tasks split; these tapes are having their "normal" tasks split a second time because the new MB splitter is a lot more sensitive than the old one.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1805775 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1805809 - Posted: 30 Jul 2016, 11:19:16 UTC - in response to Message 1805763.  

Astropulse is currently only split from Arecibo data, and the ratio is roughly 1 AP WU per 30 MB tasks. Also, AP tasks usually grant more credit and are therefore more 'lucrative' to some people.
ID: 1805809 · Report as offensive
Onno Hommes

Joined: 18 Jul 16
Posts: 2
Credit: 47,606
RAC: 0
United States
Message 1805844 - Posted: 30 Jul 2016, 13:30:04 UTC - in response to Message 1805775.  

Thanks for the feedback. I didn't realize this, and as a result I kept adjusting my compute settings to see if that made a difference. I'll re-enable CPU processing too, since I only disabled it as an experiment to see if it had any impact.
ID: 1805844 · Report as offensive
Profile Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1805993 - Posted: 31 Jul 2016, 4:49:43 UTC - in response to Message 1805216.  
Last modified: 31 Jul 2016, 4:57:37 UTC

Additional details
==================
On my NV_SoG app rig with a GTX 750 Ti, I am using Mr Kevvy's script to swap nonVLARs to the GPU and Guppis to the CPU.
It is processing almost twice as many nonVLARs per day on the GPU (~100 tasks/day) as Guppis on the CPU (~54 tasks/day).
My GPU queue is increasing on most days well into the hundreds and, if I didn't manage it in other ways, could top 1,000.

On my Cuda50 app rig, also with a GTX 750 Ti, I am using my semi-automated script (made prior to Mr K's release) to move Guppis from the GPU to the CPU.
It is processing twice as many nonVLARs per day on the GPU (~110 tasks/day) as Guppis on the CPU (~54 tasks/day).

During the last 3 days, my CPU queue has increased from 100 to slightly over 200.
Any thoughts on whether the server might be sending different ratios of tasks?

...that is, comparing the incoming nonVLAR:guppi ratio of CPU-assigned tasks with that of GPU-assigned tasks.

Gents,
My NV_SoG rig is still receiving many more nonVLARs than Guppis.
The GPU queue of nonVLARs reassigned from the CPU has even grown by another 100 tasks since my original posting and is now reaching 450!
Keep in mind that I am also processing nonVLARs on the GPU and guppis on the CPU at roughly a 3:2 rate. So if we were really getting more guppis than nonVLARs, my GPU queue should be decreasing over time...which it is not (see the rough arithmetic at the end of this post).
I know I have only 2 rigs to compare, so one of them could simply be receiving an atypical supply for over a week...but:
Is it possible the S@H server is sending more nonVLARs to CPUs?
...since that is what I have experienced during the last month with my task swapping, which after a few days ends up sending solely nonVLARs to the GPU.

[edit] For future ref: I currently have 81 guppis in the CPU queue and 450 nonVLARs in the GPU queue.
If I filled the CPU queue to 94, it would involve pressing Update, then transferring a few (if not all) of the new tasks to the GPU...and repeating every 5 minutes until I got a big batch of guppis.[/edit]
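
To put the "should be decreasing" argument into rough numbers (the figures are rounded from my posts above, the flat CPU queue is an assumption, and none of this comes from the server):

    def implied_arrivals(processed_per_day, queue_growth_per_day):
        # If every task of a type ends up on one device after swapping, then
        # arrivals/day = tasks crunched/day + daily growth of that device's queue.
        return processed_per_day + queue_growth_per_day

    nonvlar_in = implied_arrivals(100, 25)  # GPU: ~100 nonVLARs/day crunched, queue up ~100 in ~4 days
    guppi_in = implied_arrivals(54, 0)      # CPU: ~54 guppis/day crunched, queue assumed roughly flat
    share = nonvlar_in / (nonvlar_in + guppi_in)
    print(f"implied incoming mix: {share:.0%} nonVLAR vs {1 - share:.0%} guppi")
    # ~70% nonVLAR vs ~30% guppi on these rough numbers -- the opposite of the
    # ~64:36 GBT:Arecibo split read off the SSP earlier in this thread.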
ID: 1805993 · Report as offensive
Profile Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806734 - Posted: 3 Aug 2016, 5:55:04 UTC - in response to Message 1805993.  
Last modified: 3 Aug 2016, 5:56:43 UTC

The issue is getting worse on both rigs:
- rig1: the CPU queue keeps growing with Guppis as I move them off the GPU queue
- rig2: the GPU queue keeps growing with nonVLARs as I move them off the CPU queue

The only explanation I've come up with for this pattern is:

When Guppis were introduced on NV GPUs earlier this year, the plan might have been to limit the ratio of Guppi to nonVLAR as compared to what was distributed for CPUs.
But a typo was made in a config file on the server, which ended up sending a higher % of guppis to NV GPUs than to CPUs (instead of a lower %).

If anyone is:
- using BoincTasks and logging the History.csv, and
- NOT doing any task rescheduling (such as with the GUPPIrescheduler.exe),
you should be able to confirm or refute my peculiar situation.

PM me if you want me to analyse your BoincTasks History.csv file.
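
Alternatively, if you'd rather run the tally yourself, here is roughly the count I'd do. A sketch only: I'm assuming the history export is comma-separated, with one column holding the task name and one holding the application (whose text says whether it ran on CPU or GPU); adjust the column names to whatever your BoincTasks version actually writes.

    import csv
    from collections import Counter

    NAME_COL = "Task"            # assumed: column with the task/workunit name
    APP_COL = "Application"      # assumed: column naming the app / plan class

    counts = Counter()
    with open("History.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            name = row.get(NAME_COL, "").lower()
            app = row.get(APP_COL, "").lower()
            device = "GPU" if ("cuda" in app or "opencl" in app or "gpu" in app) else "CPU"
            kind = "guppi" if name.startswith("blc") else ("VLAR" if ".vlar" in name else "nonVLAR")
            counts[(device, kind)] += 1

    for (device, kind), n in sorted(counts.items()):
        print(device, kind, n)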
ID: 1806734 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1806743 - Posted: 3 Aug 2016, 6:27:20 UTC - in response to Message 1806734.  

When Guppis were introduced on NV GPUs earlier this year, the plan might have been to limit the ratio of Guppi to nonVLAR as compared to what was distributed for CPUs.

AFAIK the only plan was to split work, the ratio between Guppies & Arecibo being dependent on how much of each is available.
Grant
Darwin NT
ID: 1806743 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1806749 - Posted: 3 Aug 2016, 7:10:09 UTC - in response to Message 1806734.  

Another thing is that until the tapes are split, they won't know the angle range. Also:
Guppi = Green Bank telescope data
nonVLAR = (can be) Green Bank telescope and Arecibo data
VLAR = (can be) Arecibo and Green Bank telescope data
I have seen blc units with angle ranges of 0.44291, and I have seen Arecibo units with angle ranges of 0.005536. So until they are split they won't know what the angle range is, and by the time they do know, it's already in the workunit queue.
ID: 1806749 · Report as offensive
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1806828 - Posted: 3 Aug 2016, 17:18:41 UTC - in response to Message 1806743.  

When Guppis were introduced on NV GPUs earlier this year, the plan might have been to limit the ratio of Guppi to nonVLAR as compared to what was distributed for CPUs.

AFAIK the only plan was to split work, the ratio between Guppies & Arecibo being dependent on how much of each is available.

I believe Eric had stated that at some point 90% of the work may be coming from GBT, as Arecibo hasn't been very active recently. It might be having budget/closure issues again.
SETI@home classic workunits: 93,865 · CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 1806828 · Report as offensive
Kiska
Volunteer tester

Joined: 31 Mar 12
Posts: 302
Credit: 3,067,762
RAC: 0
Australia
Message 1806915 - Posted: 4 Aug 2016, 0:07:19 UTC - in response to Message 1806734.  

I will be able to gather that data once the PrimeGrid challenge ends.
ID: 1806915 · Report as offensive
Profile Stubbles
Volunteer tester
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1809257 - Posted: 15 Aug 2016, 9:33:39 UTC

Finally, a solid answer from Richard in another thread:

https://setiathome.berkeley.edu/forum_thread.php?id=79954&postid=1809255

No need to do anything, Kiska.

Cheers,
Rob :-)
ID: 1809257 · Report as offensive