Message boards :
Number crunching :
Anything relating to AstroPulse tasks
Author | Message |
---|---|
OTS Send message Joined: 6 Jan 08 Posts: 369 Credit: 20,533,537 RAC: 0 |
I finally got 2 full GPUs. Whew. What I do is set the "Store at least" value to a fraction of a day, usually .4, and the "Store up to an additional" value to 0 days. When the GPU hits about 30 tasks in the queue, it starts filling the CPU queue. When the CPU queue has sufficient tasks to run for half a day or so, I increase the "Store at least" value so the GPU will fill to the maximum 100 tasks while the CPU chugs along. You could probably do something similar so both GPUs and the CPU have some work before trying to max out the queues. |
rob smith Send message Joined: 7 Mar 03 Posts: 22222 Credit: 416,307,556 RAC: 380 |
Not getting AP tasks (when they are available) is down to your SMALL cache size - an AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you are almost right out of MB tasks; indeed, you may not get any even then. With a small "store at least" setting you will have a small cache, probably well below the 100 tasks allowed for the CPU; set it higher and you will find that you don't have to tune it so often. A zero "extra days" setting will not always work; setting it just above zero ensures that you make regular requests for work rather than waiting for the default interval set server-side, so you request a replacement task every time you return one. My cache setting is 6 days plus 0.01, and all four of my rigs have full caches (well, bouncing off the limits) most of the time, and I don't have to spend time increasing and decreasing the settings. (One of mine has a pile of ghosts that will vanish one day...) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
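The cache arithmetic rob describes can be sketched quickly. This is a hypothetical calculation using the ballpark run times quoted in the post (~0.5 days per AP, ~0.1 days per MB), not measured values; task lengths are expressed in tenths of a day to keep the division exact.

```python
# Rough cache-fill arithmetic using the estimates quoted above.
# Times are in tenths of a day: AP ~0.5 days = 5, MB ~0.1 days = 1.
AP_TENTHS = 5
MB_TENTHS = 1

def tasks_that_fit(cache_tenths, task_tenths, server_limit=100):
    """How many tasks of a given length fit in the work buffer,
    capped by the per-host server-side task limit."""
    return min(cache_tenths // task_tenths, server_limit)

print(tasks_that_fit(5, AP_TENTHS))    # a 0.5-day cache holds just 1 AP
print(tasks_that_fit(5, MB_TENTHS))    # ...or 5 MBs
print(tasks_that_fit(60, MB_TENTHS))   # rob's 6-day setting: up to 60 MBs
```

This is why a 0.5-day buffer rarely attracts APs: a single MB already nearly fills it, so the client seldom asks for half a day's worth of extra work at once.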
rob smith Send message Joined: 7 Mar 03 Posts: 22222 Credit: 416,307,556 RAC: 380 |
ATM there are no APs left to split - oh well, it was a good run. Ah well, down to picking up the rejects again.... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Todderbert Send message Joined: 17 Jun 99 Posts: 221 Credit: 53,153,779 RAC: 0 |
Not getting AP tasks (when they are available) is down to your SMALL cache size - an AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you are almost right out of MB tasks; indeed, you may not get any even then. Good information right there. |
Louis Loria II Send message Joined: 20 Oct 03 Posts: 259 Credit: 9,208,040 RAC: 24 |
Not getting AP tasks (when they are available) is down to your SMALL cache size - an AP takes about 0.5 days to run, while an MB takes about 0.1 days, so if your cache is set to 0.5 days plus 0, you will only ever get APs when you are almost right out of MB tasks; indeed, you may not get any even then. I'm not understanding the nuances of all of these settings. I am set for 6 stored and 4 additional days of work (I hate it when the servers are down), and I have received at least 140 APs in the last two days. Is your calculated workload based on benchmarks/performance? I run an app_config which allows 3 WUs per GPU and an additional 4 WUs for the CPU. I do, however, suspend all other tasks when I catch an inrush of APs. My rig will process them with no other WUs lost to missed deadlines or otherwise. |
rob smith Send message Joined: 7 Mar 03 Posts: 22222 Credit: 416,307,556 RAC: 380 |
Your 4 additional days will not ensure you get a regular supply of work, rather a very lumpy delivery. I'm running 3 MB or 2 AP per GPU and 4 or 6 CPU tasks at a time, so much the same as you. I don't bother with suspending tasks; I let BOINC do what it is designed to do, and in the last burst of APs I managed to have about 700 total on board at one time. My settings have been generated by trial and error, and deliver a good constant throughput, and I rarely miss a deadline on any project due to BOINC misbehavior. (I do miss the odd one when one of these PCs has to go offline to do a big CFD model or the like - they can take days to run on a pair of GTX 980s!!!) Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I remember the cache settings for boinc 7+ changed from how they were before.. but I have no problems with 6.10.58 with getting tasks when asking for a tiny amount of work. For example, yesterday, I had "requesting 7.45 seconds of work" and got assigned an AP with an estimated duration of ~37,700 seconds. My cache settings are "connect every 0.01 days, additional work buffer 10.00 days." But generally, yes. If you have your work buffer set for a small value, you won't get more APs than you can reasonably process in that time period. So if you have the work buffer set for 0.5 days, and you can do 10 APs/day, you probably won't get more than about 5 of them at a time. It's one of those weird catch-22s, because you don't want to load up on a maximum cache of MBs for when APs start feeding again and then not be able to get many of them because you're already near/at the server-side limits for number of tasks, so you go with a small buffer.. but when APs come around, you'll then just have to manually change it to basically 10 days to load up on as many as you can get, and then change it back to 0.5 once the feeding frenzy ends, and then repeat that cycle. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
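Cosmic_Ocean's rule of thumb (0.5-day buffer, 10 APs/day, so roughly 5 APs at a time) can be written out as arithmetic. This is illustrative only; the real server-side scheduler logic is considerably more involved.

```python
# The rule of thumb above, as arithmetic: the scheduler won't hand out
# more APs than the host can plausibly finish inside its work buffer.
def expected_ap_assignment(buffer_days, aps_per_day):
    """Crude estimate of how many APs a host might be sent at once."""
    return int(buffer_days * aps_per_day)

print(expected_ap_assignment(0.5, 10))   # ~5 APs, as in the post above
print(expected_ap_assignment(10.0, 10))  # raise the buffer to load up: ~100
```

This is the catch-22 in miniature: the same buffer setting that limits MB hoarding also caps how many APs arrive during a feeding frenzy, which is why people toggle the setting up and back down.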
WezH Send message Joined: 19 Aug 99 Posts: 576 Credit: 67,033,957 RAC: 95 |
I finally got 2 full GPUs. Whew. Interestingly enough, the worst performing GPU (Intel HD4600) gets work first for me. It's even slower than the CPU. Does anyone have a good explanation for this? My slow, integrated ATI APU GPUs in two hosts get AP tasks first. Two 750 Tis in the first host are waiting, and a 660 in another host is waiting for those tasks. Same problem with Grumpy Swede's host: the worst performing Intel GPU gets work first. Is it down to S@H server configuration, or is it in the BOINC configuration??? Alphabetical order? Ati, Intel, Nvidia? :D |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Alphabetical order? Ati, Intel, Nvidia? :D Source code for work_fetch.cpp says Nvidia, AMD, Intel. (Lines 405 and further) |
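A toy sketch of what a fixed resource-type ordering like the one Jord quotes would look like. This is NOT real BOINC code, just an illustration of the idea that the client walks its resource types in a hard-coded sequence when deciding what to ask work for.

```python
# Hypothetical illustration (not actual work_fetch.cpp logic): the
# client considers its processor types in a fixed order when building
# a work request, so whichever resource appears first in the list and
# has a shortfall gets asked for first.
RESOURCE_ORDER = ["nvidia", "amd", "intel_gpu", "cpu"]

def first_resource_needing_work(shortfall):
    """Return the first resource (in fetch order) with a work shortfall,
    where `shortfall` maps resource name to buffered-seconds needed."""
    for rsc in RESOURCE_ORDER:
        if shortfall.get(rsc, 0) > 0:
            return rsc
    return None

print(first_resource_needing_work({"amd": 3.5, "intel_gpu": 1.0}))  # amd
```

How the server then prioritizes those combined requests is a separate question, as Richard points out below in the thread.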
Rasputin42 Send message Joined: 25 Jul 08 Posts: 412 Credit: 5,834,661 RAC: 0 |
From my observations: intel gpu, nvidia gpu, cpu. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14653 Credit: 200,643,578 RAC: 874 |
Alphabetical order? Ati, Intel, Nvidia? :D No, that's client code - might be the order in which the client decides to add each request to the single sched_request...xml file. But the questions are about how the server decides to respond to those combined requests. I suspect you might have to start looking around handle_request.cpp - that's scheduler (server) code, not client code. And if the behaviour is different between MB and AP, it might be as simple as the order in which the various plan_class specifications are encountered in the plan_class_spec.xml file. Maybe oldest first? |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
In my case it's ATI, nVidia, CPU...at present. This is much better than it was late last year when it was CPU first. Back then my 3 ATI GPUs would sit there without work while the server sent dozens of tasks to the CPUs. |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Yips, lucky shot there. I just saw my HD7870 GPU temperature hovering around 80C, while normal temp under load is high 50s, low 60s centigrade. Fan speed was stuck at 30%. It's that stupid Catalyst Control Center again. Once every so many months it resets something internally, and then the fan speed is stuck at 30%. Doesn't matter if I run Speedfan at the same time, this CCC setting overrides all. I have to up the speed in Speedfan, then go into CCC, enable manual fan speed there, put it to something like 70% or above, Apply. Then immediately the actual fan speed drops to ~50%, ignoring both the manual speed set in CCC and the speed set in Speedfan. Next I can disable the manual fan speed again, Apply and that fixes it. Until it happens again in the future. Edit: here, running 15 minutes at 60% fan speed and temperature is at 55C. Way better than 80! |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
For those with NV cards: (an optimisation) (optimize -- my ear says optimization)
1) Open your AP_clFFTplan_GeForceGTX*bin* file with a text editor. {a * means any text}
2) Find the lines saying (in plain English assembler code): bar.sync
3) Of those "bar.sync" lines, comment out (add // at the beginning of the line) the ones that are not followed by any more ld.s... lines. There should be 2 such lines saying bar.sync before a return (ret), with no ld.s between. I have commented out (//) two lines: L 1049 and L 1565. Feel free to try.
... like this ...
---
ld.shared.f32 %f161, [%rd4+12336];
ld.shared.f32 %f163, [%rd4+13364];
ld.shared.f32 %f165, [%rd4+14392];
ld.shared.f32 %f167, [%rd4+15420];
// bar.sync 0;
add.s32 %r70, %r10, %r1;
add.s32 %r71, %r70, %r4;
mul.wide.s32 %rd15, %r71, 8;
add.s64 %rd16, %rd5, %rd15;
st.global.v2.f32 [%rd16], {%f104, %f137};
st.global.v2.f32 [%rd16+2048], {%f106, %f139};
st.global.v2.f32 [%rd16+4096], {%f108, %f141};
st.global.v2.f32 [%rd16+6144], {%f110, %f143};
st.global.v2.f32 [%rd16+8192], {%f112, %f145};
st.global.v2.f32 [%rd16+10240], {%f114, %f147};
st.global.v2.f32 [%rd16+12288], {%f116, %f149};
st.global.v2.f32 [%rd16+14336], {%f118, %f151};
st.global.v2.f32 [%rd16+16384], {%f120, %f153};
st.global.v2.f32 [%rd16+18432], {%f122, %f155};
st.global.v2.f32 [%rd16+20480], {%f124, %f157};
st.global.v2.f32 [%rd16+22528], {%f126, %f159};
st.global.v2.f32 [%rd16+24576], {%f128, %f161};
st.global.v2.f32 [%rd16+26624], {%f130, %f163};
st.global.v2.f32 [%rd16+28672], {%f132, %f165};
st.global.v2.f32 [%rd16+30720], {%f134, %f167};
ret;
}
For my AMD and an Intel a bin file is a bin file. I'd do the same if I knew how to. (for the generating code leave out the last open.cl.BARRIER.or.something please) To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
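The manual edit petri33 describes could be scripted. The sketch below is a hypothetical helper, not petri33's method: it comments out each bar.sync line that has no ld.shared or st.shared traffic between it and the next "ret;" (the criterion petri33 gives later in the thread). Back up the .bin file before trying anything like this, and verify results against known-good tasks.

```python
# Hypothetical helper automating the edit described above: comment out
# "bar.sync" lines with no shared-memory loads/stores (ld.shared /
# st.shared) between them and the next "ret;". Untested against real
# clFFT plan files; use entirely at your own risk.
import re
import sys

def patch_ptx(text):
    lines = text.splitlines()
    out = list(lines)
    for i, line in enumerate(lines):
        if "bar.sync" not in line or line.lstrip().startswith("//"):
            continue
        removable = None
        for nxt in lines[i + 1:]:
            if "ld.shared" in nxt or "st.shared" in nxt:
                removable = False  # shared-memory traffic: barrier needed
                break
            if re.search(r"\bret\s*;", nxt):
                removable = True   # hit ret with no shared traffic
                break
        if removable:
            out[i] = "// " + line
    return "\n".join(out)

if __name__ == "__main__":
    # usage: python patch_ptx.py AP_clFFTplan_GeForceGTX...bin
    src = open(sys.argv[1]).read()
    open(sys.argv[1], "w").write(patch_ptx(src))
```

Note the rule matches Rasputin42's mistake below: a bar.sync followed by st.shared stores is left alone, while one followed only by st.global stores and a ret gets commented out.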
Rasputin42 Send message Joined: 25 Jul 08 Posts: 412 Credit: 5,834,661 RAC: 0 |
// bar.sync 0;
st.shared.f32 [%r7], %f7;
st.shared.f32 [%r7+512], %f8;
st.shared.f32 [%r7+1024], %f9;
st.shared.f32 [%r7+1536], %f10;
st.shared.f32 [%r7+2048], %f11;
st.shared.f32 [%r7+2560], %f12;
st.shared.f32 [%r7+3072], %f13;
st.shared.f32 [%r7+3584], %f14;
Would that be correct? (spaces are not showing) |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
// bar.sync 0; Definitely no. Just the two places where there are no ld.shared... or st.shared.. lines before a "ret;" To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
For those with NV cards: (an optimisation) (optimize -- my ear says optimization) Petri, what does this do? |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Please read the post above this one. Apologies for the double post. |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Task 4311596089 is interesting to me. It was created 4 seconds after it was sent:
Created 12 Aug 2015, 7:52:17 UTC
Sent 12 Aug 2015, 7:52:13 UTC |
petri33 Send message Joined: 6 Jun 02 Posts: 1668 Credit: 623,086,772 RAC: 156 |
For those with NV cards: (an optimisation) (optimize -- my ear says optimization) It may give some speed. It gives a GPU core permission to continue calculations after reading from shared memory. Since these loads are preceded by bar.sync (a wait), and no writes are made to shared memory after these loads, it is not necessary to wait for all reads to finish before continuing. Nothing can alter the state of the shared memory while all cores are doing load operations. To overcome Heisenbergs: "You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.