Message boards :
Number crunching :
Average Credit Decreasing?
Author | Message |
---|---|
Ulrich Metzner Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
And if I move to Mexico and get a job as a janitor my paycheque in pesos will be larger than it is currently in dollars. I knew this was coming, so I'm all the more pleased - thanks a lot! :þ :)) Aloha, Uli |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Perhaps some of our resident smart guys can fix it? I'm looking at you, Jason!!!! Oh, I appreciate the 'faith'; I'd only warn that the time I have at the moment is pretty patchy/limited. Between all the apps being experimented with at the moment by others, and the knowledge we have all accumulated, I'm sure we'll find the better options. For now just assume it's going to take time, and there may not be simple one-size-fits-all solutions here. My contributions for the short term will probably be limited to testing some theories (with some special test pieces). For one thing, it looks like speed/throughput on these is only a part of the problem, while scaling/load-balancing on vastly different machines may be a bigger part of the issue than before. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
Down, down, and down it goes. Where it stops, nobody knows. Grant Darwin NT |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Ah well, it's quite clear now. With the introduction of GBT work, my RAC is falling like a stone on MB work. Since AP is just a "once per week" thing nowadays, my RAC doesn't have time to recover. . . Some things are funny: the Guppies MB tasks on my i5 run quicker than Arecibo WUs and get a better ratio of credits/time, but on the GPU they take at least twice as long to run as Arecibo WUs and are only now beginning to show some proportionate increase in credit awarded, leaving them still way behind other tasks. |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Yeah, the transient response on CreditNew is pretty poor, and overshoot+oscillation is likely. [i.e. an untuned system] "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
IIRC we had the same conversation a few years ago when V7 appeared and the difference between V7 and AP came to a big number. . . Time for a revolution brother ??? :) |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
. . Time for a revolution brother ??? :) Revolution isn't the right word, we need a new one for 'Didn't realise what they made so now need to accept the future is not in their control.' "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Perhaps some of our resident smart guys can fix it? I'm looking at you, Jason!!!! Hey, J, you can do it. As I see it, you are correct, the problem with CN will not be solved easily. BUT that is not what is causing RACs to fall like a stone now. It is the fact that (given so many GPU MBs are vlars) the MB app has a bug in it that, while it still works, takes several times longer to complete for vlars than for non-vlars. Putting the effort into finding and fixing THAT problem will do a lot more than fixing CN. And should work across all machines, I think. So a more fruitful place to get credit "fixes". SO: even if CN was giving "better" numbers, it can't do it for GPU vlars because, after all, they are really not several times the work, so the time consumed destroys the credit that machines like mine receive. Let me give 2 examples, based on (very) rough estimates of what my machines were/are doing. One ("Machine A") is an i7-4790k running with HT off (for reasons of temperature control) running 2 threads of CPU. The other ("Machine B") is a Xeon E5-2670 running HT on, using 14 of 16 threads for CPU work. Both are running 2 x GTX 980 with 3 threads per card. So A has 8 threads total, and B has 20. Before vlars, on A, an MB on CPU took about 1 hour; on the GPU, 15 minutes. So A could do roughly 24x2 + 6x4x24 MBs/day, or 624. On B, it was ~1.5 hours on CPU and ~20 minutes on GPU, so B could do 18x14 + 3x6x24 = 684, so slightly more than A. And that's what I saw - A running a bit under 50K credit/day, B slightly over 50K/day. But now, with almost entirely vlars on the 980s, which take about 4 times as long, we have the following situation: A -> 24x2 + (6x4x24)/4 = 192/day, B -> 18x14 + (3x6x24)/4 = 360/day. So even if CN was not a factor, Machine A would see his credit drop more than 2/3, and Machine B, almost 50%. Horrific. 
That's the source of the current credit crash, and why I am going to move Machine A's pair of 980s to a new machine (see the thread about cheap 32 core machines) with 2 E5-2670s running with HT on, to get 30 CPU threads, so I can have some decent credit granted despite the bad MB app. Anybody want to buy an i7-4790k and motherboard? |
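The throughput estimate in the post above can be reproduced with a short sketch. It uses only the rough figures quoted there (thread counts, run times, and the 4x VLAR slowdown on the GPUs); the helper names are hypothetical, and the numbers are the poster's estimates rather than measurements.

```python
# Rough daily task throughput for the two machines described in the post.
# All inputs are the poster's rough estimates; helper names are hypothetical.

def tasks_per_day(threads: int, hours_per_task: float) -> float:
    """Tasks finished per day by `threads` concurrent workers."""
    return threads * 24.0 / hours_per_task


def daily_total(cpu_threads, cpu_hours, gpu_threads, gpu_hours, gpu_slowdown=1.0):
    """CPU plus GPU tasks per day; gpu_slowdown stretches the GPU run time."""
    cpu = tasks_per_day(cpu_threads, cpu_hours)
    gpu = tasks_per_day(gpu_threads, gpu_hours * gpu_slowdown)
    return cpu + gpu


machines = {
    # 2 CPU threads at ~1 h, 2 cards x 3 tasks at ~15 min
    "A": dict(cpu_threads=2, cpu_hours=1.0, gpu_threads=6, gpu_hours=0.25),
    # 14 CPU threads at ~1.5 h, 2 cards x 3 tasks at ~20 min
    "B": dict(cpu_threads=14, cpu_hours=1.5, gpu_threads=6, gpu_hours=1 / 3),
}

for name, m in machines.items():
    before = daily_total(**m)
    after = daily_total(**m, gpu_slowdown=4.0)  # VLARs taking ~4x as long on GPU
    drop = 100.0 * (1.0 - after / before)
    print(f"Machine {name}: {before:.0f} -> {after:.0f} tasks/day ({drop:.0f}% drop)")
```

For Machine A this gives 624 tasks/day before and 192 after, matching the post. For Machine B, taking the CPU time as exactly 1.5 hours yields 16 tasks per thread per day (656 and 332 in total) rather than the 18 used in the post's rough figures, but the drop still comes out just under 50%.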
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Down, down, and down it goes. Grant, Coles stopped using that for their ads, mate: "Down, down, prices are down!" Or, as I say, "Down, down, food quality is down." Or "Down, down, CreditNew is down" and screwing things up. |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
How about just adding a time component to CreditNew: an extra 80-100 credits if a unit takes longer than 1 hr, and again for every hour after that, for all units, CPU and GPU. If the GPU can do it in under 1 hr, no extra credit. If it takes longer, you get the extra credits. This would solve the problem of slower machines not being treated fairly, and it wouldn't penalise faster machines like it is doing now. You wouldn't need to change much, as you can add an extra formula to the score once CreditNew has determined what your score should be without the extra credits for taking a long time: CreditNew score + 80 for every hour past 1 hr. Simple. It should solve the VLAR problem by not PENALISING fast machines, it will give slower ones a little extra, and it should balance out much better. |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Extra formula for CreditNew: UT/3600=ET; If ET<1 then ET=0; ET*80=EC; CN+EC=FS. Here UT = unit time (seconds), ET = extra time (hours), CN = CreditNew, EC = extra credit, FS = final score. I'm sure adding those lines into whatever code should not be too complicated, Mr Anderson |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
You can add this too (fill in whatever goes in the brackets): If U = (however you ID the units) then (goto, goback, jump)(program line) else (goto, goback, jump to)(program line), where U = unit. Use it if you wish to exclude CPU units altogether and make it only for GPUs, and repeat the line for APs, whether GPU or CPU. That should make it even fairer (for those who may whinge about slower computers getting more credit, and about those with a Xeon chip or a chip with more than 4 cores getting too much extra credit) |
Darth Beaver Joined: 20 Aug 99 Posts: 6728 Credit: 21,443,075 RAC: 3 |
Extra formula for credit new Edit: I got it wrong the first time, so the line ET*80=EC should be ET+80=EC |
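The time bonus proposed in the posts above can be sketched as follows. This is only an illustration of the idea as described ("CreditNew score + 80 for every hour past 1 hr"), not anything in the actual BOINC code. The function name and constants are assumptions, and because ET is in hours after dividing by 3600, the one-hour cutoff is compared against 1 hour here.

```python
# Sketch of the proposed "extra time" bonus on top of a CreditNew score.
# BONUS_PER_HOUR, THRESHOLD_HOURS and final_score are illustrative assumptions.

BONUS_PER_HOUR = 80.0    # the posts suggest 80-100 credits per hour
THRESHOLD_HOURS = 1.0    # no bonus for tasks that finish in under an hour


def final_score(credit_new: float, unit_time_s: float) -> float:
    """Return FS = CN + EC, where EC rewards run times of an hour or more."""
    extra_hours = unit_time_s / 3600.0            # ET = UT / 3600 (hours)
    if extra_hours < THRESHOLD_HOURS:             # under an hour: no bonus
        extra_hours = 0.0
    extra_credit = extra_hours * BONUS_PER_HOUR   # EC = ET * 80
    return credit_new + extra_credit              # FS = CN + EC


print(final_score(100.0, 1800))   # 30-minute task: no bonus
print(final_score(100.0, 7200))   # 2-hour task: bonus for both hours
```

The sketch follows the multiplicative form (ET*80), since that matches the "80 for every hour" description. The later edit's ET+80 would instead add a roughly flat 80 credits plus the number of hours, which is a different rule.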
Ulrich Metzner Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
"Credit" should reflect the real work done, so every machine - regardless of its speed - should "earn" the same amount (for this unit!). If a machine is slower, it gets less credit just because it can't crunch as many units as a faster machine - simple as that. Same work - same credit. Unfortunately, at this time credit is "magically guessed" by some highly scientifically blown up "wannabe all in one" algorithm that is in reality simply a magnificently complicated random number generator - really good for nothing at all. Aloha, Uli |
Cruncher-American Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
"Credit" should reflect the real work done, so every machine - regardless of its speed - should "earn" the same amount (for this unit!). If a machine is slower, it gets less credit just because it can't crunch as many units as a faster machine - simple as that. Same work - same credit. That's what I was saying in my post a few above this one - and the slowdown caused by GPU vlars IS being faithfully represented by CN (within its limits that we all know). The current credit crash is being caused by the stretchout of vlars on GPUs, which points to a faulty app that should be fixed a.s.a.p. and doesn't have all the issues debated for CN. It just needs someone to find and fix the bug, which is (I assume) a design problem. |
Zalster Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
The app works fine. It was refined to deal with the more complex data coming out of Green Bank. You could turn off the credit system and the apps would work like they are supposed to. It's the credit system that is the problem: it's based off some imaginary number that gets multiplied by some correction factor and then gets spit out. So what we need is a fixed credit for each work unit. Then it would properly reflect how well one's computers do. |
Chris Adamek Joined: 15 May 99 Posts: 251 Credit: 434,772,072 RAC: 236 |
The app does not have a "bug" in it as you keep espousing. The VLAR WUs are not the same type of signal as mid-range or very high angle range WUs. Both of those types are moving across the sky, to put it very simply: the larger the number, the "faster" it passes a point in the sky. The MB app has several types of "apps" (kernels) within it to search for signals, and the share of the overall computation time each of these kernels takes is driven in part by the angle range. In the case of VLARs, where the telescope is looking at a pinpoint location in the sky for a long time, there's a whole lot of time spent looking for pulses. There's a limit to how parallelized you can make those pulse searches. That's the high-level gist of how it all works, and it is working as designed. There's no bug in the app, but some architectures do pulse finding more efficiently than others. The current CUDA app struggles with it, making the computer laggy, and that is why for a very long time VLARs were kept off the GPUs, even though AMD cards running OpenCL were not adversely impacted by VLARs. The developers recently released an OpenCL app for the NVIDIA cards to overcome the lag issue, making it viable to reintroduce VLARs to the GPUs on the main project. It doesn't really change the fact they take longer to run, but that's not a bug; that's due to the reasons I described above. Further optimization will likely be possible over time, but in the meantime it was better to release an app that allowed the entire community to contribute to the Green Bank data, as it will soon make up the majority of the work we have to do. The only reason you are seeing a drop in RAC is that CN does not correctly give credit for work done, and in introducing the new WUs we are going to see a lot of oscillation in the credit granted on any given WU. I've seen VLARs get 60 points and others get 200 for the same amount of work. 
Eventually that variation will likely settle down, but our overall RAC will almost certainly always be lower the more optimized our apps get, because it is what is flawed... I rambled on way more than I intended, but I didn't want other people having the misconception that there was a "bug" in the app that was causing the credit drop. That is categorically incorrect. Chris |
Ulrich Metzner Joined: 3 Jul 02 Posts: 1256 Credit: 13,565,513 RAC: 13 |
+1 CreditScrew is the bug, not the application! Aloha, Uli |
Mr. Kevvy Joined: 15 May 99 Posts: 3797 Credit: 1,114,826,392 RAC: 3,319 |
To add to what Chris wrote: at first there was much complaining that there was never enough work... we were running out every Tuesday and not catching up until the next day, and plenty of other times as well. Solved: the SETI@Home scientists built the GBT receiver and a new splitter and got on the Breakthrough Listen initiative, and now we have enough work (possibly even too much!). Next, there was much complaining (including from myself) that CUDA GPUs weren't being assigned the new GBT work, though it's possibly far more likely to contain a candidate signal, and thus find something. Solved: the same group beta-tested the impact on system stability/usability of these work units being run on CUDA GPUs and then released them (this was also probably necessary, because there is so much work and CUDA GPUs make up such a significant fraction of the machines available to do it). Now, there is much complaining that this work is causing a reduction of credit. Sure, it's sad that our RAC is dropping, but it isn't anyone's "fault" except the NVIDIA engineers who designed the memory architecture of the CUDA-enabled cards, and it's the unintended side effect of something that was asked for. It isn't CreditNew or Dr. David Anderson's doing. Fixed credit won't solve it, as 3x the completion time still equals 1/3 the fixed credit over time. It needs a possibly complex, and possibly impossible, workaround in the SETI@Home client itself. So, in the interim, be patient, enjoy the new science, and be happy that those GUPPI work units trade off three times the compute time for... I dunno... thousands of times the chance of actually finding something, and I'll take that tradeoff any day! |
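The point about fixed credit can be checked with a quick calculation. The credit value and run times below are made-up illustration values, and the function name is hypothetical; the only claim being shown is that, with a fixed award per work unit, tripling the run time cuts the credit earned per day to a third.

```python
# Credit earned per day under a fixed credit per work unit.
# The credit value and run times are made-up illustration values.

SECONDS_PER_DAY = 86_400


def credit_per_day(fixed_credit: float, run_time_s: float, concurrent: int = 1) -> float:
    """Daily credit when every finished work unit earns the same fixed amount."""
    return concurrent * fixed_credit * SECONDS_PER_DAY / run_time_s


normal = credit_per_day(100.0, 900)       # hypothetical 15-minute task
slow = credit_per_day(100.0, 3 * 900)     # same credit, 3x the run time
print(normal, slow, slow / normal)        # the ratio comes out at one third
```

So a fixed award would make credit per work unit predictable, but the daily rate (and hence RAC) would still fall in proportion to how much longer the work units take.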
Richard A. Van Dyke Joined: 17 May 99 Posts: 73 Credit: 318,717,859 RAC: 4,214 |
The app works fine. It was refined to deal with the more complex data coming out of Green Bank. Just a thought... Perhaps going back to the days of SETI Classic is the answer: One WU completed equals one credit (Cobblestone). It is still an indicator of how much work your machine has completed without the imaginary number multiplied by some correction factor. You get credit for completing the WU regardless of whether you have the latest and greatest hardware or something that is 10 years old (just like we do now). I imagine there would be some resistance from the user community to going back to the 1 WU = 1 credit system, but it would put an end to the CreditNew/CreditScrew problem. Besides, after crunching for 17 years now, I still haven't been able to gain enough credits for that darn toaster. :-) |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.