Is the S@H server biased at sending more guppis to NV GPUs?

Message boards : Number crunching : Is the S@H server biased at sending more guppis to NV GPUs?

To post messages, you must log in.

1 · 2 · Next

Author Message
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1809280 - Posted: 15 Aug 2016, 11:49:03 UTC - in response to Message 1809257.

Spreadsheet, you cannot edit it :)
I was just going to give you the same spreadsheet as I did in the thread I started. You can see a workunit counter on the side.
ID: 1809280 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1809257 - Posted: 15 Aug 2016, 9:33:39 UTC

Finally a solid answer from Richard in another thread:

https://setiathome.berkeley.edu/forum_thread.php?id=79954&postid=1809255

No need to do anything Kiska

Cheers,
Rob :-)
ID: 1809257 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1806915 - Posted: 4 Aug 2016, 0:07:19 UTC - in response to Message 1806734.

I will be able to gather the data once the PrimeGrid challenge ends
ID: 1806915 · Report as offensive
Profile HAL9000
Volunteer tester
Avatar

Send message
Joined: 11 Sep 99
Posts: 6533
Credit: 196,805,888
RAC: 57
United States
Message 1806828 - Posted: 3 Aug 2016, 17:18:41 UTC - in response to Message 1806743.

When Guppis were introduced on NV GPUs earlier this year, the plan might have been to limit the ratio of Guppi to nonVLAR as compared to what was distributed for CPUs.

AFAIK the only plan was to split work, the ratio between Guppies & Arecibo being dependent on how much of each is available.

I believe Eric had stated that at some point 90% of the work may be coming from GBT, as Arecibo hasn't been very active recently. It might be having budget/closure issues again.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1806828 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1806749 - Posted: 3 Aug 2016, 7:10:09 UTC - in response to Message 1806734.

Another thing is that until the tapes are split, they won't know the angle range. Also:
Guppi = Greenbank telescope data
nonVLAR = (can be) Greenbank telescope and Arecibo data
VLAR = (can be) Arecibo and Greenbank telescope data
I have seen blc units with angle ranges of 0.44291 and Arecibo units with angle ranges of 0.005536. So until they are split they won't know what the angle range is, and by the time they do know, it's already in the workunit queue.
ID: 1806749 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13012
Credit: 208,696,464
RAC: 304
Australia
Message 1806743 - Posted: 3 Aug 2016, 6:27:20 UTC - in response to Message 1806734.

When Guppis were introduced on NV GPUs earlier this year, the plan might have been to limit the ratio of Guppi to nonVLAR as compared to what was distributed for CPUs.

AFAIK the only plan was to split work, the ratio between Guppies & Arecibo being dependent on how much of each is available.
Grant
Darwin NT
ID: 1806743 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1806734 - Posted: 3 Aug 2016, 5:55:04 UTC - in response to Message 1805993.
Last modified: 3 Aug 2016, 5:56:43 UTC

The issue is getting worse on both rigs:
-rig1: CPU queue keeps getting bigger with Guppis when moving them from GPU queue
-rig2: GPU queue keeps getting bigger with nonVLARs when moving them from CPU queue

The only explanation I've come up with that can explain this pattern is:

When Guppis were introduced on NV GPUs earlier this year, the plan might have been to limit the ratio of Guppi to nonVLAR as compared to what was distributed for CPUs.
But a typo was made in a server config file that ended up sending a higher % of guppis to NV GPUs than to CPUs (instead of a lower %).

If anyone is:
- using BoincTasks and logging History.csv, and
- NOT doing any task rescheduling (such as with GUPPIrescheduler.exe)
you should be able to confirm or refute my peculiar situation.

PM me if you want me to analyse your BoincTasks History.csv file
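For anyone who wants to try this kind of check themselves, the counting is straightforward. Below is a minimal sketch: the column names (`task`, `device`) and sample rows are made up for illustration and will not match a real BoincTasks export exactly, and the name-based classification is only a rough heuristic (guppi/GBT units typically carry `guppi` or a `blc` prefix in the name, Arecibo VLARs a `.vlar` suffix).

```python
import csv
import io
from collections import Counter

# Hypothetical History.csv extract. A real BoincTasks export has more
# columns and different headers; only "task" and "device" are used here.
SAMPLE = """task,device
blc2_2bit_guppi_57451_12345_0,GPU
23mr10ab.12345.678.15.44.101.vlar_1,CPU
23mr10ab.12345.678.15.44.102_0,GPU
blc4_2bit_guppi_57403_67890_1,CPU
"""

def classify(name: str) -> str:
    """Rough task classification from the workunit name (heuristic)."""
    if "guppi" in name or name.startswith("blc"):
        return "guppi"
    if ".vlar" in name:
        return "vlar"
    return "nonVLAR"

counts = Counter()
for row in csv.DictReader(io.StringIO(SAMPLE)):
    counts[(row["device"], classify(row["task"]))] += 1

for (device, kind), n in sorted(counts.items()):
    print(device, kind, n)
```

Run against a full month of history, the per-device ratios would show directly whether GPUs are being handed proportionally more guppis than CPUs.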
ID: 1806734 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1805993 - Posted: 31 Jul 2016, 4:49:43 UTC - in response to Message 1805216.
Last modified: 31 Jul 2016, 4:57:37 UTC

Additional details
==================
On my NV_SoG app rig with GTX 750 Ti, I am using Mr Kevvy's script to swap nonVLARs to GPU with Guppis to CPU.
It is processing almost twice as many nonVLARs on GPU (~100 tasks/day) as Guppis on CPU (~54 tasks/day).
My GPU queue is increasing on most days, well into the hundreds, and if I didn't manage it in other ways it could top 1k.

On my Cuda50 app rig, also with a GTX 750 Ti, I am using my semi-automated script (made prior to Mr K's release) to move Guppis from GPU to CPU.
It is processing twice as many nonVLARs on GPU (~110 tasks/day) as Guppis on CPU (~54 tasks/day).

During the last 3 days, my CPU queue has increased from 100 to slightly over 200.
Any thoughts on whether the server might be sending different ratios of tasks?

...as compared between CPU-assigned incoming ratios and GPU-assigned incoming ratios.

Gents,
My NV_SoG rig is still receiving many more nonVLARs than Guppis.
The nonVLARs reassigned from the CPU to the GPU queue have even gone up by another 100 tasks since my original posting and are reaching 450 now!
Keep in mind that I am also processing about 3:2 nonVLARs on GPU as compared to guppis on CPU. So if we should be getting more guppis than nonVLARs, my GPU queue should be decreasing over time...which it is not.
I know I have only 2 rigs to compare, so one of them could be constantly receiving an atypical supply over 1 week...but:
Is it possible the S@H server is sending more nonVLARs to CPU?
...since that is what I have experienced during the last month with my task swapping, which after a few days ends up sending solely nonVLARs to GPU.

[edit] for future ref: I currently have 81 guppis in the CPU queue and 450 nonVLARs in the GPU queue.
If I filled the CPU queue to 94, it would involve pressing Update, then transferring a few (if not all) new tasks to GPU...and repeating every 5 minutes until I got a big batch of guppis.[/edit]
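The queue-drift argument in this post can be sanity-checked with back-of-the-envelope arithmetic. A sketch, assuming (as the swapping scripts arrange) that all incoming nonVLARs end up on the GPU and all guppis on the CPU; the daily processing rates are the ones quoted in the post, while the server's guppi fraction is the unknown being probed:

```python
# Toy model of queue drift under task swapping.
# gpu_rate / cpu_rate: tasks processed per day (taken from the post).
# guppi_fraction: the server's (unknown) share of guppis in new work.

def queue_drift(total_per_day, guppi_fraction, gpu_rate=100, cpu_rate=54):
    """Return the daily change in (GPU nonVLAR queue, CPU guppi queue)."""
    guppi_in = total_per_day * guppi_fraction
    nonvlar_in = total_per_day - guppi_in
    return nonvlar_in - gpu_rate, guppi_in - cpu_rate

# If ~154 tasks/day arrive and guppis dominate (say 64%, the GBT:Arecibo
# split discussed later in the thread), the GPU nonVLAR queue shrinks --
# the opposite of the growth actually observed.
print(queue_drift(154, 0.64))
```

The observed growth of the GPU nonVLAR queue is therefore only consistent with a guppi fraction well below the split ratio suggested by the server status page, which is exactly the puzzle the post raises.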
ID: 1805993 · Report as offensive
Onno Hommes

Send message
Joined: 18 Jul 16
Posts: 2
Credit: 47,606
RAC: 0
United States
Message 1805844 - Posted: 30 Jul 2016, 13:30:04 UTC - in response to Message 1805775.

Thanks for the feedback. I didn't realize this, and as a result I kept adjusting my compute settings to see if that made a difference. I'll re-enable CPU processing too, since I only disabled it as an experiment to see if it had any impact.
ID: 1805844 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1805809 - Posted: 30 Jul 2016, 11:19:16 UTC - in response to Message 1805763.

Astropulse is currently only split from Arecibo data, and the ratio is roughly 1 AP task per 30 MB (MultiBeam) tasks. Also, AP usually grants more credit and is therefore more 'lucrative' to some people.
ID: 1805809 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 18752
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1805775 - Posted: 30 Jul 2016, 6:04:09 UTC

As you only have the GPU application for processing AstroPulse tasks, you only get AstroPulse tasks for the GPU, and they are not "split" as often as "MultiBeam" tasks. This low availability is for a number of reasons, including:
- a "tape" doesn't contain as many AstroPulse tasks as it does MultiBeam tasks;
- currently only tapes from the "Arecibo" telescope are being split for AstroPulse tasks;
- many of the "Arecibo" tapes have already had their AstroPulse tasks split; these tapes are having their "normal" tasks split a second time because the new MB splitter is a lot more sensitive than the old one.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1805775 · Report as offensive
Grant (SSSF)
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 13012
Credit: 208,696,464
RAC: 304
Australia
Message 1805773 - Posted: 30 Jul 2016, 5:12:55 UTC - in response to Message 1805763.

Why don't I get more GPU tasks especially since I can crunch much faster on that hardware.

Do you have an application on the system to crunch MB tasks?
Grant
Darwin NT
ID: 1805773 · Report as offensive
Onno Hommes

Send message
Joined: 18 Jul 16
Posts: 2
Credit: 47,606
RAC: 0
United States
Message 1805763 - Posted: 30 Jul 2016, 4:13:57 UTC

In the last two weeks I have been assigned 5 Astropulse tasks, which are the only ones actually using my GPU on Linux. My GPU tasks complete in a fraction of the time of the CPU tasks (roughly ten times faster), yet I barely get any GPU tasks assigned.

Is there a bias in assigning tasks? Why don't I get more GPU tasks especially since I can crunch much faster on that hardware. The GPU is totally underutilized.

Any comments?
ID: 1805763 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1805309 - Posted: 28 Jul 2016, 10:00:55 UTC - in response to Message 1805293.

Currently the distributed file size is the same. There have been some trials on Beta recently with "double size" files - those I currently have are "Arecibo type" names, and have taken about twice as long to run as standard length files. I guess Eric has been trying something on the splitters over there.

If they run twice as long, check your host setup. Execution times did not change, because the length of telescope data inside the file remained the same; each point just takes twice as many bits. {And this change applied both to Arecibo and GBT data}


With more bits per complex sample we will get somewhat more results for essentially the same compute time, plus 1 or 2 minutes. I am very happy with the new complex samples and will continue to test.
ID: 1805309 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1805307 - Posted: 28 Jul 2016, 9:32:20 UTC

I now have time to sit down and fully read your message. Again with the ratio of splitting: the current ratio is 7:4 (GBT:Arecibo), looking at the "in progress" figures on the SSP. So for every Arecibo WU there will be 1.75 GBT WUs, which means you are most likely going to get roughly 64:36 (GBT:Arecibo) in terms of what you receive.
Now, not every Arecibo WU will be VHAR (nonVLAR), since we piggyback on whatever science project the telescope is running, whilst GBT data will almost always be VLAR since it is dedicated to scanning the sky for us.
Remember that regardless of where the data comes from, the designations stay the same: VLAR, midAR, nonVLAR (VHAR).
(Conjecture) VLAR is when the true angle range is < 0.15, midAR is when the angle range is between 0.15 and 1.00, and VHAR (nonVLAR) is > 1.00. (Conjecture end)
Also note that other people are requesting tasks, and whilst you might be seeing just one type of unit, others might not be.
Like today I've just been getting VHARs, no GUPPIs or VLARs in sight.
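Kiska's conjectured thresholds can be written down as a tiny classifier. This is a sketch of the conjecture only: the cutoffs 0.15 and 1.00 come from the post and are explicitly not confirmed server values.

```python
def classify_angle_range(ar: float) -> str:
    """Classify a workunit by true angle range, using the conjectured
    cutoffs from the post (0.15 and 1.00 -- unconfirmed)."""
    if ar < 0.15:
        return "VLAR"
    if ar <= 1.00:
        return "midAR"
    return "nonVLAR (VHAR)"

# The two angle ranges quoted earlier in the thread:
print(classify_angle_range(0.44291))   # blc (GBT) unit -> midAR
print(classify_angle_range(0.005536))  # Arecibo unit -> VLAR
```

Note how the two real examples quoted earlier land in different bands, which is the point being made: the telescope of origin does not determine the designation.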
ID: 1805307 · Report as offensive
Profile Raistmer
Volunteer developer
Volunteer tester
Avatar

Send message
Joined: 16 Jun 01
Posts: 6259
Credit: 106,370,077
RAC: 121
Russia
Message 1805293 - Posted: 28 Jul 2016, 8:03:19 UTC - in response to Message 1805287.
Last modified: 28 Jul 2016, 8:04:32 UTC

Currently the distributed file size is the same. There have been some trials on Beta recently with "double size" files - those I currently have are "Arecibo type" names, and have taken about twice as long to run as standard length files. I guess Eric has been trying something on the splitters over there.

If they run twice as long, check your host setup. Execution times did not change, because the length of telescope data inside the file remained the same; each point just takes twice as many bits. {And this change applied both to Arecibo and GBT data}
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1805293 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 18752
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1805287 - Posted: 28 Jul 2016, 7:00:29 UTC

Currently the distributed file size is the same. There have been some trials on Beta recently with "double size" files - those I currently have are "Arecibo type" names, and have taken about twice as long to run as standard length files. I guess Eric has been trying something on the splitters over there.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1805287 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1805270 - Posted: 28 Jul 2016, 4:19:27 UTC - in response to Message 1805242.

An important variable is that other people are also requesting tasks. Like right now, for my GPU I got a bunch of Arecibo tasks that are nonVLAR.
ID: 1805270 · Report as offensive
Profile Stubbles
Volunteer tester
Avatar

Send message
Joined: 29 Nov 99
Posts: 358
Credit: 5,909,255
RAC: 0
Canada
Message 1805242 - Posted: 28 Jul 2016, 1:13:43 UTC - in response to Message 1805232.

Hey Kiska,
Thanks for your reply.
From what I've observed (since I started swapping manually about a month ago, then semi-automated, then automated), when my GPU queues are over 100 tasks, many more nonVLARs than guppis are sent to be assigned to the CPU...and then I transfer the nonVLARs to the GPU queue.

From recollection, I'd estimate that about half the time I get all tasks of one type (either all Guppi or all nonVLAR). The other times, I get a mix of nonVLARs and Guppis.
So to reconcile my experience with your reply:
No not really. Since GUPPIs are being split more often, if you look on SSP you'll see more GUPPI tapes with progress than Arecibo.
The scheduler sends out WUs in the queue in a first-come, first-served method. So if GUPPI tapes finish splitting first and seconds later the Arecibo ones finish, the scheduler will send GUPPI WUs since those finished first, and who knows how many WUs per tape for GUPPI.

about half the time the splitting of both should be happening in parallel, creating an interleaved FIFO queue, which matches what I have been experiencing.

From looking at the S@h server status page, I can't reconcile my experience of getting many more nonVLARs during the last month with what I'm interpreting on that page, since it seems there should be more guppis in the field than nonVLARs.

So all this to say I'm perplexed, since my experience doesn't match up with your reply and the SSP.
I'm guessing I'm not seeing an important variable.
Any idea what it might be?
ID: 1805242 · Report as offensive
Kiska
Volunteer tester

Send message
Joined: 31 Mar 12
Posts: 295
Credit: 3,067,762
RAC: 0
Australia
Message 1805232 - Posted: 28 Jul 2016, 0:14:52 UTC - in response to Message 1805216.

No not really. Since GUPPIs are being split more often, if you look on SSP you'll see more GUPPI tapes with progress than Arecibo.
The scheduler sends out WUs in the queue in a first-come, first-served method. So if GUPPI tapes finish splitting first and seconds later the Arecibo ones finish, the scheduler will send GUPPI WUs since those finished first, and who knows how many WUs per tape for GUPPI.
(Conjecture) I might be wrong about the amount for both, but Arecibo should yield 352k WUs per tape; GUPPI, being the same size, should also yield 352k WUs. (Conjecture end)
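The first-come, first-served behaviour described here is what produces bursts of a single task type: whichever tape finishes splitting first dominates the head of the shared queue. A toy model of that effect (the batch sizes and the four-task request are made up for illustration):

```python
from collections import deque

# Toy model: the splitter finishes whole batches, which enter one
# shared FIFO; hosts then draw tasks from the head of that queue.
queue = deque()
queue.extend(["GUPPI"] * 5)    # a GUPPI tape finishes splitting first
queue.extend(["Arecibo"] * 3)  # an Arecibo tape finishes seconds later

# A host requesting 4 tasks gets a pure GUPPI burst, even though the
# overall mix sitting in the queue is 5:3.
batch = [queue.popleft() for _ in range(4)]
print(batch)
```

So a single host can easily see an all-GUPPI (or all-Arecibo) day without the server applying any per-GPU bias; the long-run ratio only emerges across many requests from many hosts.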
ID: 1805232 · Report as offensive
1 · 2 · Next



 
©2020 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.