Anything relating to AstroPulse tasks

Message boards : Number crunching : Anything relating to AstroPulse tasks

OTS
Volunteer tester

Send message
Joined: 6 Jan 08
Posts: 369
Credit: 20,533,537
RAC: 0
United States
Message 1711819 - Posted: 12 Aug 2015, 17:13:02 UTC - in response to Message 1711603.  

I finally got 2 full GPUs. Whew.

Has anyone else noticed that you never get AP CPU tasks until the GPU is full? I always see that.

The only time I get CPU tasks is when it is a resend, if the GPU is not full.



I see the opposite: the CPU has to be full before the GPU gets any work.


What I see is that my ATI GPU has to be full before the Nvidia GPUs get any work, and after that the CPU.

Annoying, that ATI-first schedule; many times my dual 750 Tis are empty while waiting for the ATI to fill up.



What I do is set the "Store at least" value to a fraction of a day, usually 0.4, and the "Store up to an additional" value to 0 days. When the GPU hits about 30 tasks in the queue, it starts filling the CPU queue. When the CPU queue has enough tasks to run for half a day or so, I increase the "Store at least" value so the GPU fills to the maximum 100 tasks while the CPU chugs along. You could probably do something similar so both the GPUs and the CPU have some work before trying to max out the queues.
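That two-step routine can be sketched as a tiny heuristic. A rough sketch only: the 30-task and half-day triggers are from the post, while the raised 10-day setting is a hypothetical stand-in for "increase the value":

```python
# Sketch of the two-step cache tuning described above. The trigger
# values (30 GPU tasks, ~0.5 day of CPU work) come from the post;
# the raised 10-day setting is a hypothetical stand-in.
def store_at_least(gpu_tasks_queued, cpu_queue_days):
    if gpu_tasks_queued < 30 or cpu_queue_days < 0.5:
        return 0.4   # small buffer lets CPU work start trickling in
    return 10.0      # then raise it so the GPU fills to its 100-task cap

print(store_at_least(10, 0.0))   # still filling: keep the buffer small
print(store_at_least(60, 0.6))   # both queues primed: open it up
```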
ID: 1711819 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1711965 - Posted: 12 Aug 2015, 21:27:46 UTC
Last modified: 12 Aug 2015, 21:29:05 UTC

Not getting AP tasks (when they are available) is down to your SMALL cache size. An AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you're almost right out of MB tasks; indeed, you may not get any even then.

With a small "store at least" setting you will have a small cache, probably well below the 100 tasks allowed for the CPU; set it higher and you will find that you don't have to tune it so often.
A zero "extra days" setting will not always work. Setting it just above zero ensures that you make regular requests for work, rather than at the default interval set server side; again, this ensures you request a replacement task every time you return one.

My cache setting is 6 days, plus 0.01, and all four of my rigs have full caches (well, bouncing off the limits) most of the time and I don't have to worry about spending time increasing and decreasing the settings. (One of mine has a pile of ghosts that will vanish one day...)
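As a quick sanity check on those numbers, here is a back-of-envelope sketch; the 0.5-day AP and 0.1-day MB runtimes are the rough estimates from the post, not measured values:

```python
# Runtimes in minutes, from the rough figures above: an AP runs
# ~0.5 days (720 min), an MB ~0.1 days (144 min) -- estimates only.
AP_MIN, MB_MIN = 720, 144

def tasks_that_fit(buffer_min, task_min):
    """Whole tasks of a given length that fit in the work buffer."""
    return buffer_min // task_min

# A 0.5-day (720 min) buffer is trivially filled by MB work alone,
# so a single half-day AP rarely makes the cut:
print(tasks_that_fit(720, MB_MIN))   # 5 MBs fill the buffer
print(tasks_that_fit(720, AP_MIN))   # only 1 AP would fit
# A 6-day buffer (8640 min) leaves real headroom for APs:
print(tasks_that_fit(8640, AP_MIN))
```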
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1711965 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1711967 - Posted: 12 Aug 2015, 21:28:38 UTC - in response to Message 1711817.  

ATM there are no APs left to split, oh well it was a good run.


Ah well, down to picking up the rejects again....
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1711967 · Report as offensive
Profile Todderbert
Avatar

Send message
Joined: 17 Jun 99
Posts: 221
Credit: 53,153,779
RAC: 0
United States
Message 1711969 - Posted: 12 Aug 2015, 21:30:55 UTC - in response to Message 1711965.  

Not getting AP tasks (when they are available) is down to your SMALL cache size. An AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you're almost right out of MB tasks; indeed, you may not get any even then.

With a small "store at least" setting you will have a small cache, probably well below the 100 tasks allowed for the CPU; set it higher and you will find that you don't have to tune it so often.
A zero "extra days" setting will not always work. Setting it just above zero ensures that you make regular requests for work, rather than at the default interval set server side; again, this ensures you request a replacement task every time you return one.

My cache setting is 6 days, plus 0.01, and all four of my rigs have full caches (well, bouncing off the limits) most of the time and I don't have to worry about spending time increasing and decreasing the settings. (One of mine has a pile of ghosts that will vanish one day...)


Good information right there.
ID: 1711969 · Report as offensive
Profile Louis Loria II
Volunteer tester
Avatar

Send message
Joined: 20 Oct 03
Posts: 259
Credit: 9,208,040
RAC: 24
United States
Message 1712010 - Posted: 12 Aug 2015, 22:28:35 UTC - in response to Message 1711969.  

Not getting AP tasks (when they are available) is down to your SMALL cache size. An AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you're almost right out of MB tasks; indeed, you may not get any even then.

With a small "store at least" setting you will have a small cache, probably well below the 100 tasks allowed for the CPU; set it higher and you will find that you don't have to tune it so often.
A zero "extra days" setting will not always work. Setting it just above zero ensures that you make regular requests for work, rather than at the default interval set server side; again, this ensures you request a replacement task every time you return one.

My cache setting is 6 days, plus 0.01, and all four of my rigs have full caches (well, bouncing off the limits) most of the time and I don't have to worry about spending time increasing and decreasing the settings. (One of mine has a pile of ghosts that will vanish one day...)


Good information right there.


I'm not understanding the nuances of all of these settings. I am set for 6 stored and 4 additional days of work (I hate it when the servers are down), and I have received at least 140 APs in the last two days. Is your calculated workload based on benchmarks/performance?

I run an app-config which allows 3 WUs per GPU and an additional 4 WUs for the CPU. I do however suspend all other tasks when I catch an inrush of APs. My rig will process them with no other lost WUs due to deadlines missed or otherwise.
ID: 1712010 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Send message
Joined: 7 Mar 03
Posts: 22202
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1712159 - Posted: 13 Aug 2015, 5:12:14 UTC

Your 4 additional days will not ensure you get a regular supply of work, rather a very lumpy delivery. I'm running 3 MB or 2 AP per GPU and 4 or 6 CPU tasks at a time, so much the same as you.
I don't bother with suspending tasks; I let BOINC do what it is designed to do, and in the last burst of APs I managed to have about 700 total on board at one time.

My settings have been generated by trial and error, and they deliver a good constant throughput. I rarely miss a deadline on any project due to BOINC misbehavior (I do miss the odd one when one of these PCs has to go offline to do a big CFD model or the like - they can take days to run on a pair of GTX 980s!!!)
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1712159 · Report as offensive
Cosmic_Ocean
Avatar

Send message
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1712164 - Posted: 13 Aug 2015, 5:31:57 UTC

I remember the cache settings for BOINC 7+ changed from how they were before, but I have no problems with 6.10.58 getting tasks when asking for a tiny amount of work. For example, yesterday I had "requesting 7.45 seconds of work" and got assigned an AP with an estimated duration of ~37,700 seconds. My cache settings are "connect every 0.01 days, additional work buffer 10.00 days."

But generally, yes. If you have your work buffer set for a small value, you won't get more APs than you can reasonably process in that time period. So if you have the work buffer set for 0.5 days, and you can do 10 APs/day, you probably won't get more than about 5 of them at a time.

It's one of those weird catch-22s. You don't want to load up on a maximum cache of MBs for when APs start feeding again, and then not be able to get many of them because you're already near or at the server-side limits for number of tasks, so you go with a small buffer... but when APs come around, you then have to manually change it to basically 10 days to load up on as many as you can get, change it back to 0.5 once the feeding frenzy ends, and then repeat that cycle.
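The rule of thumb above (0.5-day buffer, 10 APs/day, about 5 at a time) is just buffer times throughput. A minimal sketch, assuming the scheduler caps grants at roughly what fits in the buffer window (a simplification of the real server logic):

```python
def expected_ap_grant(buffer_days, ap_per_day):
    # Simplification: the scheduler sends roughly what you can chew
    # through within the buffer window, ignoring server-side task caps.
    return int(buffer_days * ap_per_day)

print(expected_ap_grant(0.5, 10))   # ~5 APs at a time
print(expected_ap_grant(10.0, 10))  # ~100 -- why people crank the buffer up
```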
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1712164 · Report as offensive
WezH
Volunteer tester

Send message
Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1712824 - Posted: 14 Aug 2015, 13:32:42 UTC - in response to Message 1711631.  

I finally got 2 full GPUs. Whew.

Has anyone else noticed that you never get AP CPU tasks until the GPU is full? I always see that.

The only time I get CPU tasks is when it is a resend, if the GPU is not full.



I see the opposite: the CPU has to be full before the GPU gets any work.


What I see is that my ATI GPU has to be full before the Nvidia GPUs get any work, and after that the CPU.

Annoying, that ATI-first schedule; many times my dual 750 Tis are empty while waiting for the ATI to fill up.


Interestingly enough, the worst-performing GPU (Intel HD4600) gets work first for me. It's even slower than the CPU.
Then the 980 gets work, and last the CPU gets filled with work.
That's how it works for AP for me, but with MB it works more correctly.


Does anyone have a good explanation for this?

My slow, integrated ATI APU GPUs in two hosts get AP tasks first. The two 750 Tis in the first host are waiting, and the 660 in the other host is waiting for those tasks.

Same problem with Grumpy Swede's host: the worst-performing Intel GPU gets work first.

Is it up to S@H server configuration, or is it in BOINC configuration???

Alphabetical order? Ati, Intel, Nvidia? :D
ID: 1712824 · Report as offensive
Profile Jord
Volunteer tester
Avatar

Send message
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1712887 - Posted: 14 Aug 2015, 16:30:43 UTC - in response to Message 1712824.  

Alphabetical order? Ati, Intel, Nvidia? :D

Source code for work_fetch.cpp says Nvidia, AMD, Intel.
(Lines 405 and further)
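In effect the client walks its resource list in a fixed order when building work requests. A minimal sketch of that idea (the names and function below are illustrative, not actual BOINC identifiers, and per the replies this is client-side ordering only, not the server's decision):

```python
# Fixed client-side resource ordering, per the reading of
# work_fetch.cpp above. Names are illustrative, not BOINC's own.
RESOURCE_ORDER = ["nvidia_gpu", "amd_gpu", "intel_gpu", "cpu"]

def fetch_priority(idle_resources):
    """Idle resources are considered in list order, so an earlier
    entry gets its work request in before a later one."""
    return [r for r in RESOURCE_ORDER if r in idle_resources]

print(fetch_priority({"intel_gpu", "nvidia_gpu"}))
# the Nvidia request leads, the Intel one trails
```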
ID: 1712887 · Report as offensive
Rasputin42
Volunteer tester

Send message
Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1712888 - Posted: 14 Aug 2015, 16:32:16 UTC

From my observations:
intel gpu, nvidia gpu, cpu.
ID: 1712888 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1712893 - Posted: 14 Aug 2015, 16:42:34 UTC - in response to Message 1712887.  

Alphabetical order? Ati, Intel, Nvidia? :D

Source code for work_fetch.cpp says Nvidia, AMD, Intel.
(Lines 405 and further)

No, that's client code - might be the order in which the client decides to add each request to the single sched_request...xml file.

But the questions are about how the server decides to respond to those combined requests. I suspect you might have to start looking around handle_request.cpp - that's scheduler (server) code, not client code.

And if the behaviour is different between MB and AP, it might be as simple as the order in which the various plan_class specifications are encountered in the plan_class_spec.xml file. Maybe oldest first?
ID: 1712893 · Report as offensive
TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1712896 - Posted: 14 Aug 2015, 16:45:24 UTC

In my case it's ATI, nVidia, CPU...at present. This is much better than it was late last year when it was CPU first.
Back then my 3 ATI GPUs would sit there without work while the server sent dozens of tasks to the CPUs.
ID: 1712896 · Report as offensive
Profile Jord
Volunteer tester
Avatar

Send message
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1713020 - Posted: 14 Aug 2015, 20:22:40 UTC
Last modified: 14 Aug 2015, 20:37:01 UTC

Yips, lucky shot there. I just saw my HD7870 GPU temperature hovering around 80C, while normal temp under load is high 50s, low 60s centigrade. Fan speed was stuck at 30%.

It's that stupid Catalyst Control Center again. Once every so many months it resets something internally, and then the fan speed is stuck at 30%. Doesn't matter if I run Speedfan at the same time, this CCC setting overrides all.

I have to up the speed in Speedfan, then go into CCC, enable manual fan speed there, put it to something like 70% or above, Apply.
Then immediately the actual fan speed drops to ~50%, ignoring both the manual speed set in CCC and the speed set in Speedfan.
Next I can disable the manual fan speed again, Apply and that fixes it.

Until it happens again in the future.

Edit: here, running 15 minutes at 60% fan speed and temperature is at 55C. Way better than 80!
ID: 1713020 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1713048 - Posted: 14 Aug 2015, 21:22:47 UTC
Last modified: 14 Aug 2015, 21:39:22 UTC

For those with NV cards: (an optimisation) (optimize -- my ear says optimization)

1) Open your AP_clFFTplan_GeForceGTX*bin* file with a text editor. {a * means any text}
2) Find the lines saying (in plain English assembler code) bar.sync.
3) Of those "bar.sync" lines, comment out (add // at the beginning of the line) the ones that are followed by NO more ld.s... lines. There should be 2 lines saying bar.sync before a return (ret) with no ld.s between.

I have commented out (//) two lines: L 1049 and L 1565.

Feel free to try.

... like this ...
	---
	ld.shared.f32 	%f161, [%rd4+12336];
	ld.shared.f32 	%f163, [%rd4+13364];
	ld.shared.f32 	%f165, [%rd4+14392];
	ld.shared.f32 	%f167, [%rd4+15420];
//	bar.sync 	0;
	add.s32 	%r70, %r10, %r1;
	add.s32 	%r71, %r70, %r4;
	mul.wide.s32 	%rd15, %r71, 8;
	add.s64 	%rd16, %rd5, %rd15;
	st.global.v2.f32 	[%rd16], {%f104, %f137};
	st.global.v2.f32 	[%rd16+2048], {%f106, %f139};
	st.global.v2.f32 	[%rd16+4096], {%f108, %f141};
	st.global.v2.f32 	[%rd16+6144], {%f110, %f143};
	st.global.v2.f32 	[%rd16+8192], {%f112, %f145};
	st.global.v2.f32 	[%rd16+10240], {%f114, %f147};
	st.global.v2.f32 	[%rd16+12288], {%f116, %f149};
	st.global.v2.f32 	[%rd16+14336], {%f118, %f151};
	st.global.v2.f32 	[%rd16+16384], {%f120, %f153};
	st.global.v2.f32 	[%rd16+18432], {%f122, %f155};
	st.global.v2.f32 	[%rd16+20480], {%f124, %f157};
	st.global.v2.f32 	[%rd16+22528], {%f126, %f159};
	st.global.v2.f32 	[%rd16+24576], {%f128, %f161};
	st.global.v2.f32 	[%rd16+26624], {%f130, %f163};
	st.global.v2.f32 	[%rd16+28672], {%f132, %f165};
	st.global.v2.f32 	[%rd16+30720], {%f134, %f167};
	ret;
}



For my AMD and an Intel a bin file is a bin file. I'd do the same if I knew how to.
(for the generating code leave out the last open.cl.BARRIER.or.something please)
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1713048 · Report as offensive
Rasputin42
Volunteer tester

Send message
Joined: 25 Jul 08
Posts: 412
Credit: 5,834,661
RAC: 0
United States
Message 1713096 - Posted: 14 Aug 2015, 22:22:33 UTC
Last modified: 14 Aug 2015, 22:23:21 UTC

// bar.sync 0;
st.shared.f32 [%r7], %f7;
st.shared.f32 [%r7+512], %f8;
st.shared.f32 [%r7+1024], %f9;
st.shared.f32 [%r7+1536], %f10;
st.shared.f32 [%r7+2048], %f11;
st.shared.f32 [%r7+2560], %f12;
st.shared.f32 [%r7+3072], %f13;
st.shared.f32 [%r7+3584], %f14;

Would that be correct? (spaces are not showing)
ID: 1713096 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1713098 - Posted: 14 Aug 2015, 22:29:09 UTC - in response to Message 1713096.  
Last modified: 14 Aug 2015, 22:31:45 UTC

// bar.sync 0;
st.shared.f32 [%r7], %f7;
st.shared.f32 [%r7+512], %f8;
st.shared.f32 [%r7+1024], %f9;
st.shared.f32 [%r7+1536], %f10;
st.shared.f32 [%r7+2048], %f11;
st.shared.f32 [%r7+2560], %f12;
st.shared.f32 [%r7+3072], %f13;
st.shared.f32 [%r7+3584], %f14;

Would that be correct? (spaces are not showing)



Definitely not.

Just the two places where there are no ld.shared... or st.shared... lines before a "ret;".
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1713098 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1713153 - Posted: 15 Aug 2015, 0:11:35 UTC - in response to Message 1713048.  

For those with NV cards: (an optimisation) (optimize -- my ear says optimization)

1) Open your AP_clFFTplan_GeForceGTX*bin* file with a text editor. {a * means any text}
2) Find the lines saying (in plain English assembler code) bar.sync.
3) Of those "bar.sync" lines, comment out (add // at the beginning of the line) the ones that are followed by NO more ld.s... lines. There should be 2 lines saying bar.sync before a return (ret) with no ld.s between.

I have commented out (//) two lines: L 1049 and L 1565.

Feel free to try.

... like this ...
	---
	ld.shared.f32 	%f161, [%rd4+12336];
	ld.shared.f32 	%f163, [%rd4+13364];
	ld.shared.f32 	%f165, [%rd4+14392];
	ld.shared.f32 	%f167, [%rd4+15420];
//	bar.sync 	0;
	add.s32 	%r70, %r10, %r1;
	add.s32 	%r71, %r70, %r4;
	mul.wide.s32 	%rd15, %r71, 8;
	add.s64 	%rd16, %rd5, %rd15;
	st.global.v2.f32 	[%rd16], {%f104, %f137};
	st.global.v2.f32 	[%rd16+2048], {%f106, %f139};
	st.global.v2.f32 	[%rd16+4096], {%f108, %f141};
	st.global.v2.f32 	[%rd16+6144], {%f110, %f143};
	st.global.v2.f32 	[%rd16+8192], {%f112, %f145};
	st.global.v2.f32 	[%rd16+10240], {%f114, %f147};
	st.global.v2.f32 	[%rd16+12288], {%f116, %f149};
	st.global.v2.f32 	[%rd16+14336], {%f118, %f151};
	st.global.v2.f32 	[%rd16+16384], {%f120, %f153};
	st.global.v2.f32 	[%rd16+18432], {%f122, %f155};
	st.global.v2.f32 	[%rd16+20480], {%f124, %f157};
	st.global.v2.f32 	[%rd16+22528], {%f126, %f159};
	st.global.v2.f32 	[%rd16+24576], {%f128, %f161};
	st.global.v2.f32 	[%rd16+26624], {%f130, %f163};
	st.global.v2.f32 	[%rd16+28672], {%f132, %f165};
	st.global.v2.f32 	[%rd16+30720], {%f134, %f167};
	ret;
}



For my AMD and an Intel a bin file is a bin file. I'd do the same if I knew how to.
(for the generating code leave out the last open.cl.BARRIER.or.something please)



Petri, what does this do?
ID: 1713153 · Report as offensive
Speedy
Volunteer tester
Avatar

Send message
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1713261 - Posted: 15 Aug 2015, 5:45:33 UTC
Last modified: 15 Aug 2015, 5:49:08 UTC

Please read post above this. Apologies for the double post
ID: 1713261 · Report as offensive
Speedy
Volunteer tester
Avatar

Send message
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1713262 - Posted: 15 Aug 2015, 5:45:38 UTC
Last modified: 15 Aug 2015, 5:47:05 UTC

Task 4311596089 is interesting to me: it was created 4 seconds after it was sent.
Created 12 Aug 2015, 7:52:17 UTC Sent 12 Aug 2015, 7:52:13 UTC
ID: 1713262 · Report as offensive
Profile petri33
Volunteer tester

Send message
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1713284 - Posted: 15 Aug 2015, 7:06:09 UTC - in response to Message 1713153.  

For those with NV cards: (an optimisation) (optimize -- my ear says optimization)

1) Open your AP_clFFTplan_GeForceGTX*bin* file with a text editor. {a * means any text}
2) Find the lines saying (in plain English assembler code) bar.sync.
3) Of those "bar.sync" lines, comment out (add // at the beginning of the line) the ones that are followed by NO more ld.s... lines. There should be 2 lines saying bar.sync before a return (ret) with no ld.s between.

I have commented out (//) two lines: L 1049 and L 1565.

Feel free to try.

... like this ...
	---
	ld.shared.f32 	%f161, [%rd4+12336];
	ld.shared.f32 	%f163, [%rd4+13364];
	ld.shared.f32 	%f165, [%rd4+14392];
	ld.shared.f32 	%f167, [%rd4+15420];
//	bar.sync 	0;
	add.s32 	%r70, %r10, %r1;
	add.s32 	%r71, %r70, %r4;
	mul.wide.s32 	%rd15, %r71, 8;
	add.s64 	%rd16, %rd5, %rd15;
	st.global.v2.f32 	[%rd16], {%f104, %f137};
	st.global.v2.f32 	[%rd16+2048], {%f106, %f139};
	st.global.v2.f32 	[%rd16+4096], {%f108, %f141};
	st.global.v2.f32 	[%rd16+6144], {%f110, %f143};
	st.global.v2.f32 	[%rd16+8192], {%f112, %f145};
	st.global.v2.f32 	[%rd16+10240], {%f114, %f147};
	st.global.v2.f32 	[%rd16+12288], {%f116, %f149};
	st.global.v2.f32 	[%rd16+14336], {%f118, %f151};
	st.global.v2.f32 	[%rd16+16384], {%f120, %f153};
	st.global.v2.f32 	[%rd16+18432], {%f122, %f155};
	st.global.v2.f32 	[%rd16+20480], {%f124, %f157};
	st.global.v2.f32 	[%rd16+22528], {%f126, %f159};
	st.global.v2.f32 	[%rd16+24576], {%f128, %f161};
	st.global.v2.f32 	[%rd16+26624], {%f130, %f163};
	st.global.v2.f32 	[%rd16+28672], {%f132, %f165};
	st.global.v2.f32 	[%rd16+30720], {%f134, %f167};
	ret;
}



For my AMD and an Intel a bin file is a bin file. I'd do the same if I knew how to.
(for the generating code leave out the last open.cl.BARRIER.or.something please)



Petri, what does this do?


It may give some speed.

It gives a GPU core permission to continue calculations after reading from shared memory.
Since those loads are preceded by a bar.sync (a wait) and no writes are done to shared memory after the loads, it is not necessary to wait for all reads to finish before continuing. Nothing can alter the state of the shared memory while all cores are doing load operations.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1713284 · Report as offensive



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.