Anything relating to AstroPulse tasks

Message boards : Number crunching : Anything relating to AstroPulse tasks

OTS
Volunteer tester

Send message
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1711819 - Posted: 12 Aug 2015, 17:13:02 UTC - in response to Message 1711603.  

I finally got 2 full GPUs. Whew.

Has anyone else noticed that you never get AP CPU tasks until the GPU is full? I always see that.

The only time I get CPU tasks is when it is a resend, if the GPU is not full.



I see the opposite: the CPU has to be full before the GPU gets any work.


What I see is that my ATI GPU has to be full before the Nvidia GPUs get any work, and after that the CPU.

Annoying, that ATI-first schedule; many times my dual 750 Tis are empty while waiting for the ATI to fill up.



What I do is set the "Store at least" value to a fraction of a day, usually 0.4, and the "Store up to an additional" value to 0 days. When the GPU hits about 30 tasks in the queue, it starts filling the CPU queue. When the CPU queue has enough tasks to run for half a day or so, I increase the "Store at least" value so the GPU fills to the maximum 100 tasks while the CPU chugs along. You could probably do something similar so both the GPUs and the CPU have some work before trying to max out the queues.
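That two-step routine can be sketched as a tiny heuristic. A rough sketch only: the 30-task and half-day triggers are from the post, while the raised 10-day setting is a hypothetical stand-in for "increase the value":

```python
# Sketch of the two-step cache tuning described above. The trigger
# values (30 GPU tasks, ~0.5 day of CPU work) come from the post;
# the raised 10-day setting is a hypothetical stand-in.
def store_at_least(gpu_tasks_queued, cpu_queue_days):
    if gpu_tasks_queued < 30 or cpu_queue_days < 0.5:
        return 0.4   # small buffer lets CPU work start trickling in
    return 10.0      # then raise it so the GPU fills to its 100-task cap

print(store_at_least(10, 0.0))   # still filling: keep the buffer small
print(store_at_least(60, 0.6))   # both queues primed: open it up
```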
ID: 1711819 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1711965 - Posted: 12 Aug 2015, 21:27:46 UTC
Last modified: 12 Aug 2015, 21:29:05 UTC

Not getting AP tasks (when they are available) is down to your SMALL cache size. An AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you're almost right out of MB tasks; indeed, you may not get any even then.

With a small "store at least" setting you will have a small cache, probably well below the 100 tasks allowed for the CPU; set it higher and you will find that you don't have to tune it so often.
A zero "extra days" setting will not always work. Setting it just above zero ensures that you make regular requests for work, rather than at the default interval set server side; again, this ensures you request a replacement task every time you return one.

My cache setting is 6 days, plus 0.01, and all four of my rigs have full caches (well, bouncing off the limits) most of the time and I don't have to worry about spending time increasing and decreasing the settings. (One of mine has a pile of ghosts that will vanish one day...)
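As a quick sanity check on those numbers, here is a back-of-envelope sketch; the 0.5-day AP and 0.1-day MB runtimes are the rough estimates from the post, not measured values:

```python
# Runtimes in minutes, from the rough figures above: an AP runs
# ~0.5 days (720 min), an MB ~0.1 days (144 min) -- estimates only.
AP_MIN, MB_MIN = 720, 144

def tasks_that_fit(buffer_min, task_min):
    """Whole tasks of a given length that fit in the work buffer."""
    return buffer_min // task_min

# A 0.5-day (720 min) buffer is trivially filled by MB work alone,
# so a single half-day AP rarely makes the cut:
print(tasks_that_fit(720, MB_MIN))   # 5 MBs fill the buffer
print(tasks_that_fit(720, AP_MIN))   # only 1 AP would fit
# A 6-day buffer (8640 min) leaves real headroom for APs:
print(tasks_that_fit(8640, AP_MIN))
```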
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1711965 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1711967 - Posted: 12 Aug 2015, 21:28:38 UTC - in response to Message 1711817.  

ATM there are no APs left to split, oh well it was a good run.


Ah well, down to picking up the rejects again....
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1711967 · Report as offensive
Profile Todderbert
Avatar

Send message
Joined: 17 Jun 99
Posts: 221
Credit: 53,153,779
RAC: 0
United States
Message 1711969 - Posted: 12 Aug 2015, 21:30:55 UTC - in response to Message 1711965.  

Not getting AP tasks (when they are available) is down to your SMALL cache size. An AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you're almost right out of MB tasks; indeed, you may not get any even then.

With a small "store at least" setting you will have a small cache, probably well below the 100 tasks allowed for the CPU; set it higher and you will find that you don't have to tune it so often.
A zero "extra days" setting will not always work. Setting it just above zero ensures that you make regular requests for work, rather than at the default interval set server side; again, this ensures you request a replacement task every time you return one.

My cache setting is 6 days, plus 0.01, and all four of my rigs have full caches (well, bouncing off the limits) most of the time and I don't have to worry about spending time increasing and decreasing the settings. (One of mine has a pile of ghosts that will vanish one day...)


Good information right there.
ID: 1711969 · Report as offensive
Profile Louis Loria II
Volunteer tester
Avatar

Send message
Joined: 20 Oct 03
Posts: 259
Credit: 9,208,040
RAC: 24
United States
Message 1712010 - Posted: 12 Aug 2015, 22:28:35 UTC - in response to Message 1711969.  

Not getting AP tasks (when they are available) is down to your SMALL cache size. An AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you're almost right out of MB tasks; indeed, you may not get any even then.

With a small "store at least" setting you will have a small cache, probably well below the 100 tasks allowed for the CPU; set it higher and you will find that you don't have to tune it so often.
A zero "extra days" setting will not always work. Setting it just above zero ensures that you make regular requests for work, rather than at the default interval set server side; again, this ensures you request a replacement task every time you return one.

My cache setting is 6 days, plus 0.01, and all four of my rigs have full caches (well, bouncing off the limits) most of the time and I don't have to worry about spending time increasing and decreasing the settings. (One of mine has a pile of ghosts that will vanish one day...)


Good information right there.


I'm not understanding the nuances of all of these settings. I am set for 6 stored and 4 additional days of work (I hate it when the servers are down), and I have received at least 140 APs in the last two days. Is your calculated workload based on benchmarks/performance?

I run an app-config which allows 3 WUs per GPU and an additional 4 WUs for the CPU. I do however suspend all other tasks when I catch an inrush of APs. My rig will process them with no other lost WUs due to deadlines missed or otherwise.
ID: 1712010 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1712159 - Posted: 13 Aug 2015, 5:12:14 UTC

Your 4 additional days will not ensure you get a regular supply of work, rather a very lumpy delivery. I'm running 3 MB or 2 AP per GPU and 4 or 6 CPU tasks at a time, so much the same as you.
I don't bother with suspending tasks; I let BOINC do what it is designed to do, and in the last burst of APs I managed to have about 700 total on board at one time.

My settings have been generated by trial and error, and they deliver a good constant throughput. I rarely miss a deadline on any project due to BOINC misbehavior (I do miss the odd one when one of these PCs has to go offline to do a big CFD model or the like - they can take days to run on a pair of GTX 980s!!!)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1712159 · Report as offensive
Cosmic_Ocean
Avatar

Send message
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1712164 - Posted: 13 Aug 2015, 5:31:57 UTC

I remember the cache settings for BOINC 7+ changed from how they were before, but I have no problems with 6.10.58 getting tasks when asking for a tiny amount of work. For example, yesterday I had "requesting 7.45 seconds of work" and got assigned an AP with an estimated duration of ~37,700 seconds. My cache settings are "connect every 0.01 days, additional work buffer 10.00 days."

But generally, yes. If you have your work buffer set for a small value, you won't get more APs than you can reasonably process in that time period. So if you have the work buffer set for 0.5 days, and you can do 10 APs/day, you probably won't get more than about 5 of them at a time.

It's one of those weird catch-22s. You don't want to load up on a maximum cache of MBs for when APs start feeding again, and then not be able to get many of them because you're already near or at the server-side limits for number of tasks, so you go with a small buffer... but when APs come around, you then have to manually change it to basically 10 days to load up on as many as you can get, change it back to 0.5 once the feeding frenzy ends, and then repeat that cycle.
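The rule of thumb above (0.5-day buffer, 10 APs/day, about 5 at a time) is just buffer times throughput. A minimal sketch, assuming the scheduler caps grants at roughly what fits in the buffer window (a simplification of the real server logic):

```python
def expected_ap_grant(buffer_days, ap_per_day):
    # Simplification: the scheduler sends roughly what you can chew
    # through within the buffer window, ignoring server-side task caps.
    return int(buffer_days * ap_per_day)

print(expected_ap_grant(0.5, 10))   # ~5 APs at a time
print(expected_ap_grant(10.0, 10))  # ~100 -- why people crank the buffer up
```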
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1712164 · Report as offensive
WezH
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1712824 - Posted: 14 Aug 2015, 13:32:42 UTC - in response to Message 1711631.  

I finally got 2 full GPUs. Whew.

Has anyone else noticed that you never get AP CPU tasks until the GPU is full? I always see that.

The only time I get CPU tasks is when it is a resend, if the GPU is not full.



I see the opposite: the CPU has to be full before the GPU gets any work.


What I see is that my ATI GPU has to be full before the Nvidia GPUs get any work, and after that the CPU.

Annoying, that ATI-first schedule; many times my dual 750 Tis are empty while waiting for the ATI to fill up.


Interestingly enough, the worst-performing GPU (Intel HD4600) gets work first for me. It's even slower than the CPU.
Then the 980 gets work, and last the CPU gets filled with work.
That's how it works for AP for me, but with MB it works more correctly.


Does anyone have a good explanation for this?

My slow, integrated ATI APU GPUs in two hosts get AP tasks first. The two 750 Tis in the first host are waiting, and the 660 in the other host is waiting for those tasks.

Same problem with Grumpy Swede's host: the worst-performing Intel GPU gets work first.

Is it up to S@H server configuration, or is it in BOINC configuration???

Alphabetical order? Ati, Intel, Nvidia? :D
ID: 1712824 · Report as offensive
Profile Jord
Volunteer tester
Avatar

Send message
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1712887 - Posted: 14 Aug 2015, 16:30:43 UTC - in response to Message 1712824.  

Alphabetical order? Ati, Intel, Nvidia? :D

Source code for work_fetch.cpp says Nvidia, AMD, Intel.
(Lines 405 and further)
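In effect the client walks its resource list in a fixed order when building work requests. A minimal sketch of that idea (the names and function below are illustrative, not actual BOINC identifiers, and per the replies this is client-side ordering only, not the server's decision):

```python
# Fixed client-side resource ordering, per the reading of
# work_fetch.cpp above. Names are illustrative, not BOINC's own.
RESOURCE_ORDER = ["nvidia_gpu", "amd_gpu", "intel_gpu", "cpu"]

def fetch_priority(idle_resources):
    """Idle resources are considered in list order, so an earlier
    entry gets its work request in before a later one."""
    return [r for r in RESOURCE_ORDER if r in idle_resources]

print(fetch_priority({"intel_gpu", "nvidia_gpu"}))
# the Nvidia request leads, the Intel one trails
```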
ID: 1712887 · Report as offensive
Rasputin42
Volunteer tester

Send message
Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1712888 - Posted: 14 Aug 2015, 16:32:16 UTC

From my observations:
intel gpu, nvidia gpu, cpu.
ID: 1712888 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1712893 - Posted: 14 Aug 2015, 16:42:34 UTC - in response to Message 1712887.  

Alphabetical order? Ati, Intel, Nvidia? :D

Source code for work_fetch.cpp says Nvidia, AMD, Intel.
(Lines 405 and further)

No, that's client code - might be the order in which the client decides to add each request to the single sched_request...xml file.

But the questions are about how the server decides to respond to those combined requests. I suspect you might have to start looking around handle_request.cpp - that's scheduler (server) code, not client code.

And if the behaviour is different between MB and AP, it might be as simple as the order in which the various plan_class specifications are encountered in the plan_class_spec.xml file. Maybe oldest first?
ID: 1712893 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1712896 - Posted: 14 Aug 2015, 16:45:24 UTC

In my case it's ATI, nVidia, CPU...at present. This is much better than it was late last year when it was CPU first.
Back then my 3 ATI GPUs would sit there without work while the server sent dozens of tasks to the CPUs.
ID: 1712896 · Report as offensive
Profile Jord
Volunteer tester
Avatar

Send message
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1713020 - Posted: 14 Aug 2015, 20:22:40 UTC
Last modified: 14 Aug 2015, 20:37:01 UTC

Yips, lucky shot there. I just saw my HD7870 GPU temperature hovering around 80C, while normal temp under load is high 50s, low 60s centigrade. Fan speed was stuck at 30%.

It's that stupid Catalyst Control Center again. Once every so many months it resets something internally, and then the fan speed is stuck at 30%. Doesn't matter if I run Speedfan at the same time, this CCC setting overrides all.

I have to up the speed in Speedfan, then go into CCC, enable manual fan speed there, put it to something like 70% or above, Apply.
Then immediately the actual fan speed drops to ~50%, ignoring both the manual speed set in CCC and the speed set in Speedfan.
Next I can disable the manual fan speed again, Apply and that fixes it.

Until it happens again in the future.

Edit: here, running 15 minutes at 60% fan speed and temperature is at 55C. Way better than 80!
ID: 1713020 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1713048 - Posted: 14 Aug 2015, 21:22:47 UTC
Last modified: 14 Aug 2015, 21:39:22 UTC

For those with NV cards: (an optimisation) (optimize -- my ear says optimization)

1) Open your AP_clFFTplan_GeForceGTX*bin* file with a text editor. {a * means any text}
2) Find the lines saying (in plain English assembler code) bar.sync.
3) Of those "bar.sync" lines, comment out (add // at the beginning of the line) the ones that are followed by NO more ld.s... lines. There should be 2 lines saying bar.sync before a return (ret) with no ld.s between.

I have commented out (//) two lines: L 1049 and L 1565.

Feel free to try.

... like this ...
	---
	ld.shared.f32 	%f161, [%rd4+12336];
	ld.shared.f32 	%f163, [%rd4+13364];
	ld.shared.f32 	%f165, [%rd4+14392];
	ld.shared.f32 	%f167, [%rd4+15420];
//	bar.sync 	0;
	add.s32 	%r70, %r10, %r1;
	add.s32 	%r71, %r70, %r4;
	mul.wide.s32 	%rd15, %r71, 8;
	add.s64 	%rd16, %rd5, %rd15;
	st.global.v2.f32 	[%rd16], {%f104, %f137};
	st.global.v2.f32 	[%rd16+2048], {%f106, %f139};
	st.global.v2.f32 	[%rd16+4096], {%f108, %f141};
	st.global.v2.f32 	[%rd16+6144], {%f110, %f143};
	st.global.v2.f32 	[%rd16+8192], {%f112, %f145};
	st.global.v2.f32 	[%rd16+10240], {%f114, %f147};
	st.global.v2.f32 	[%rd16+12288], {%f116, %f149};
	st.global.v2.f32 	[%rd16+14336], {%f118, %f151};
	st.global.v2.f32 	[%rd16+16384], {%f120, %f153};
	st.global.v2.f32 	[%rd16+18432], {%f122, %f155};
	st.global.v2.f32 	[%rd16+20480], {%f124, %f157};
	st.global.v2.f32 	[%rd16+22528], {%f126, %f159};
	st.global.v2.f32 	[%rd16+24576], {%f128, %f161};
	st.global.v2.f32 	[%rd16+26624], {%f130, %f163};
	st.global.v2.f32 	[%rd16+28672], {%f132, %f165};
	st.global.v2.f32 	[%rd16+30720], {%f134, %f167};
	ret;
}



For my AMD and an Intel a bin file is a bin file. I'd do the same if I knew how to.
(for the generating code leave out the last open.cl.BARRIER.or.something please)
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1713048 · Report as offensive
Rasputin42
Volunteer tester

Send message
Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1713096 - Posted: 14 Aug 2015, 22:22:33 UTC
Last modified: 14 Aug 2015, 22:23:21 UTC

// bar.sync 0;
st.shared.f32 [%r7], %f7;
st.shared.f32 [%r7+512], %f8;
st.shared.f32 [%r7+1024], %f9;
st.shared.f32 [%r7+1536], %f10;
st.shared.f32 [%r7+2048], %f11;
st.shared.f32 [%r7+2560], %f12;
st.shared.f32 [%r7+3072], %f13;
st.shared.f32 [%r7+3584], %f14;

Would that be correct? (spaces are not showing)
ID: 1713096 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1713098 - Posted: 14 Aug 2015, 22:29:09 UTC - in response to Message 1713096.  
Last modified: 14 Aug 2015, 22:31:45 UTC

// bar.sync 0;
st.shared.f32 [%r7], %f7;
st.shared.f32 [%r7+512], %f8;
st.shared.f32 [%r7+1024], %f9;
st.shared.f32 [%r7+1536], %f10;
st.shared.f32 [%r7+2048], %f11;
st.shared.f32 [%r7+2560], %f12;
st.shared.f32 [%r7+3072], %f13;
st.shared.f32 [%r7+3584], %f14;

Would that be correct? (spaces are not showing)



Definitely not.

Just the two places where there are no ld.shared... or st.shared... lines before a "ret;".
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1713098 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1713153 - Posted: 15 Aug 2015, 0:11:35 UTC - in response to Message 1713048.  

For those with NV cards: (an optimisation) (optimize -- my ear says optimization)

1) Open your AP_clFFTplan_GeForceGTX*bin* file with a text editor. {a * means any text}
2) Find the lines saying (in plain English assembler code) bar.sync.
3) Of those "bar.sync" lines, comment out (add // at the beginning of the line) the ones that are followed by NO more ld.s... lines. There should be 2 lines saying bar.sync before a return (ret) with no ld.s between.

I have commented out (//) two lines: L 1049 and L 1565.

Feel free to try.

... like this ...
	---
	ld.shared.f32 	%f161, [%rd4+12336];
	ld.shared.f32 	%f163, [%rd4+13364];
	ld.shared.f32 	%f165, [%rd4+14392];
	ld.shared.f32 	%f167, [%rd4+15420];
//	bar.sync 	0;
	add.s32 	%r70, %r10, %r1;
	add.s32 	%r71, %r70, %r4;
	mul.wide.s32 	%rd15, %r71, 8;
	add.s64 	%rd16, %rd5, %rd15;
	st.global.v2.f32 	[%rd16], {%f104, %f137};
	st.global.v2.f32 	[%rd16+2048], {%f106, %f139};
	st.global.v2.f32 	[%rd16+4096], {%f108, %f141};
	st.global.v2.f32 	[%rd16+6144], {%f110, %f143};
	st.global.v2.f32 	[%rd16+8192], {%f112, %f145};
	st.global.v2.f32 	[%rd16+10240], {%f114, %f147};
	st.global.v2.f32 	[%rd16+12288], {%f116, %f149};
	st.global.v2.f32 	[%rd16+14336], {%f118, %f151};
	st.global.v2.f32 	[%rd16+16384], {%f120, %f153};
	st.global.v2.f32 	[%rd16+18432], {%f122, %f155};
	st.global.v2.f32 	[%rd16+20480], {%f124, %f157};
	st.global.v2.f32 	[%rd16+22528], {%f126, %f159};
	st.global.v2.f32 	[%rd16+24576], {%f128, %f161};
	st.global.v2.f32 	[%rd16+26624], {%f130, %f163};
	st.global.v2.f32 	[%rd16+28672], {%f132, %f165};
	st.global.v2.f32 	[%rd16+30720], {%f134, %f167};
	ret;
}



For my AMD and an Intel a bin file is a bin file. I'd do the same if I knew how to.
(for the generating code leave out the last open.cl.BARRIER.or.something please)



Petri, what does this do?
ID: 1713153 · Report as offensive
Speedy
Volunteer tester
Avatar

Send message
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1713261 - Posted: 15 Aug 2015, 5:45:33 UTC
Last modified: 15 Aug 2015, 5:49:08 UTC

Please read post above this. Apologies for the double post
ID: 1713261 · Report as offensive
Speedy
Volunteer tester
Avatar

Send message
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1713262 - Posted: 15 Aug 2015, 5:45:38 UTC
Last modified: 15 Aug 2015, 5:47:05 UTC

Task 4311596089 is interesting to me: it was created 4 seconds after it was sent.
Created 12 Aug 2015, 7:52:17 UTC Sent 12 Aug 2015, 7:52:13 UTC
ID: 1713262 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1713284 - Posted: 15 Aug 2015, 7:06:09 UTC - in response to Message 1713153.  

For those with NV cards: (an optimisation) (optimize -- my ear says optimization)

1) Open your AP_clFFTplan_GeForceGTX*bin* file with a text editor. {a * means any text}
2) Find the lines saying (in plain English assembler code) bar.sync.
3) Of those "bar.sync" lines, comment out (add // at the beginning of the line) the ones that are followed by NO more ld.s... lines. There should be 2 lines saying bar.sync before a return (ret) with no ld.s between.

I have commented out (//) two lines: L 1049 and L 1565.

Feel free to try.

... like this ...
	---
	ld.shared.f32 	%f161, [%rd4+12336];
	ld.shared.f32 	%f163, [%rd4+13364];
	ld.shared.f32 	%f165, [%rd4+14392];
	ld.shared.f32 	%f167, [%rd4+15420];
//	bar.sync 	0;
	add.s32 	%r70, %r10, %r1;
	add.s32 	%r71, %r70, %r4;
	mul.wide.s32 	%rd15, %r71, 8;
	add.s64 	%rd16, %rd5, %rd15;
	st.global.v2.f32 	[%rd16], {%f104, %f137};
	st.global.v2.f32 	[%rd16+2048], {%f106, %f139};
	st.global.v2.f32 	[%rd16+4096], {%f108, %f141};
	st.global.v2.f32 	[%rd16+6144], {%f110, %f143};
	st.global.v2.f32 	[%rd16+8192], {%f112, %f145};
	st.global.v2.f32 	[%rd16+10240], {%f114, %f147};
	st.global.v2.f32 	[%rd16+12288], {%f116, %f149};
	st.global.v2.f32 	[%rd16+14336], {%f118, %f151};
	st.global.v2.f32 	[%rd16+16384], {%f120, %f153};
	st.global.v2.f32 	[%rd16+18432], {%f122, %f155};
	st.global.v2.f32 	[%rd16+20480], {%f124, %f157};
	st.global.v2.f32 	[%rd16+22528], {%f126, %f159};
	st.global.v2.f32 	[%rd16+24576], {%f128, %f161};
	st.global.v2.f32 	[%rd16+26624], {%f130, %f163};
	st.global.v2.f32 	[%rd16+28672], {%f132, %f165};
	st.global.v2.f32 	[%rd16+30720], {%f134, %f167};
	ret;
}



For my AMD and an Intel a bin file is a bin file. I'd do the same if I knew how to.
(for the generating code leave out the last open.cl.BARRIER.or.something please)



Petri, what does this do?


It may give some speed.

It gives a GPU core permission to continue calculations after reading from shared memory.
Since those loads are preceded by a bar.sync (a wait) and no writes are done to shared memory after the loads, it is not necessary to wait for all reads to finish before continuing. Nothing can alter the state of the shared memory while all cores are doing load operations.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1713284 · Report as offensive



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.