Panic Mode On (84) Server Problems?
Author | Message |
---|---|
Lionel Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597 |
Well, look at it from their side, why should they do it? Bill, in case you haven't seen this, I have copied it in from another thread. Look at the data. The picture is clear. There is an issue with the logic of determining v7 credits. I’ve had a look at the data around v7 and v6 WUs as well as AP WUs and below is a quick observational analysis of my data and the impact of v7 in relation to granted credits. Under v6, I was roughly averaging 100 credits per Work Unit (WU). Under v7, it seems that the average is sitting around 75-80 credits per WU. In looking at run-times and taking the outliers out, cpu run time was around 600-660 seconds (10-11 minutes) per WU for v6, and appears to be around 800-1100 seconds (13+ to 18+ minutes) for a v7 WU. CPU time seems to have gone up by a factor of 2-3 from 50-60 seconds for v6 to 90-180 seconds for v7. So doing a quick Back of the Envelope (10.5/15.83=0.66) shows that, in terms of WU processing/throughput capability, I can expect to do roughly 66% of the volume of WUs that I did before (for example, if I was doing 400 WUs per day under v6, I can now expect to do around 264 WUs per day under v7). Looking at the impact on credit gives 0.66*0.775 = 0.514 or 51.4%. In essence I can expect that daily credit for v7 will drop to circa 51% of what I was getting under v6. I am aware of the comments that "the system needs time to settle down" and that "it thinks all the WUs coming back at the moment are easy, hence the low credit"; however, if the system continues to perform as is, then I can expect to see no change from the current trajectory. To test the assumption, I have looked at credit per day pre v7 and post v7 implementation (the data is contained below). The average daily credit prior to migration was 221,878. Following migration on 1st June, the average daily credit is showing as 102,376, which is circa 46.1% of the previous daily average under v6.
Up to and including 22 June, I saw no real change in daily total even though there was comment that the credit system had been tweaked. As part of my thinking, I decided to "benchmark" v7 credits against Astropulse credits. I started this on 23 June with the migration of a single box. Over the next day or two, I noticed what appeared to be an increase in daily totals, so decided to migrate the other two boxes to AP only to see what the full impact would be. The assumption that I was running with was that pre v7, v6 and AP should have been fairly well benchmarked against each other and that the granting of credits would be in approximate equilibrium on a daily basis. If this was the case, then I would expect to see a rise in daily totals from circa 102k to circa 220k and for the daily total to remain around 220k. From 23 June onwards, totals increased on a daily basis towards the peak of 222k on 29 June. Unfortunately there has been a lack of AP WUs starting around 30 June and so daily totals have declined over the last few days. My observation here (albeit based on a small amount of data) is that v6 and AP seemed to be fairly well benchmarked against each other. The issue is with v7. V7 is not well benchmarked against v6 WUs, nor is it well benchmarked against AP WUs. In fact, one can almost double one's existing daily run rate by processing only AP WUs.
Daily run rates:
2013.05.16: 244,130
2013.05.17: 220,168
2013.05.18: 231,098
2013.05.19: 226,353
2013.05.20: 224,723
2013.05.21: 210,477
2013.05.22: 0
2013.05.23: 431,485
2013.05.24: 229,312
2013.05.25: 228,767
2013.05.26: 239,021
2013.05.27: 231,271
2013.05.28: 231,050
2013.05.29: 0
2013.05.30: 392,635
2013.05.31: 209,556
Migration of all boxes to v7 on "day 1"
2013.06.01: 123,072
2013.06.02: 94,061
2013.06.03: 102,333
2013.06.04: 99,896
2013.06.05: 65,653
2013.06.06: 112,209
2013.06.07: 102,538
2013.06.08: 110,760
2013.06.09: 89,757
2013.06.10: 96,018
2013.06.11: 111,653
2013.06.12: 90,091
2013.06.13: 119,848
2013.06.14: 99,884
2013.06.15: 104,561
2013.06.16: 110,566
2013.06.17: 110,603
2013.06.18: 102,856
2013.06.19: 85,268
2013.06.20: 140,694
2013.06.21: 70,247
2013.06.22: 109,698
Migration of all boxes away from v7 to AP only (over period of ~4 days)
2013.06.23: 126,873
2013.06.24: 141,847
2013.06.25: 169,625
2013.06.26: 169,517
2013.06.27: 183,693
2013.06.28: 206,019
2013.06.29: 222,126
2013.06.30: 211,468
(Lack of AP WUs from here)
2013.07.01: 182,394 |
Joined: 29 Jun 99 Posts: 11450 Credit: 29,581,041 RAC: 66 |
Mike, "Sciman Steve's host is how it should work with nvidia." is not fair; Steve is running a one-off, heavily OC'd system vs AFAIK "stock" hardware in a production environment. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Mike, "Sciman Steve's host is how it should work with nvidia." is not fair; Steve is running a one-off, heavily OC'd system vs AFAIK "stock" hardware in a production environment. LOL - That's what I say, I'm a poor poor BFB cruncher who can only run "stock" hardware... now just the #5 at SETI Top participants... with a great help of creditnew of course! |
bill Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0 |
Understood your point perfectly. What you seem to be missing is that anybody's work will be put to the use that the 'powers that be' consider more productive. Increasing the RAC for those that are dissatisfied is so far down on the list that it's beyond comprehension. Sorry, but there it is. I wish you luck. |
Lionel Joined: 25 Mar 00 Posts: 680 Credit: 563,640,304 RAC: 597 |
Understood your point perfectly. What you seem to Bill, I am not asking for development of a differential credit system where those that are dissatisfied get more credit than those that are not. This in itself would lead to further dissatisfaction. I have provided data and commentary that supports the proposition that there is a problem with the credit system. The numbers above are indicative of what many others have also seen. I am sure that if you looked at daily RAC, say for the top 1,000 users pre and post the introduction of v7, you would see a similar impact. This is not the grumbling of a single user, but the angst of a reasonable proportion of users, created by an ill-tested system. What I find hard to accept is the lack of acknowledgement, investigation and ownership that Korpela, Anderson and possibly others are showing towards this issue. We are not solving for unknown particles here; all that needs to be done is "root cause analysis", and I am in a way staggered by what appears to be a fingers in the ear la la la approach, hoping that the problem goes away. If you talk to them as you indicate, then you are in a prime position to effect an investigation and get them to be proactive, albeit somewhat late. |
bill Joined: 16 Jun 99 Posts: 861 Credit: 29,352,955 RAC: 0 |
Understood your point perfectly. What you seem to I agree that your perception is definitely different from the folks in the lab. I suggest that you try e-mailing Dr. Anderson for his take on all this. His e-mail address is not that hard to find. I know I found it with very little looking. I'm sure a direct dialogue would be much better than relaying through a third party. That way any misconceptions are minimized and I have no interest in playing pivot man in any such conversation. |
tbret Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40 |
Ah, I see your problem, Lionel! You think "science" has something to do with observations of outcomes predicted by a theory. No. No. NO! Bad, Lionel! Bad, bad, Lionel! Science is about whatever our equations predict and whatever inadequacy we can find in our observations that dismisses the falsification of our theory. Nobody is going to listen to you! You are surrounded by Marsupials and we all know that people who live around Marsupials are, uh, upside-down and possibly backward. Living in a less-evolved world as you do, and standing on your head (from a planetary perspective -- well, at least the way globes are usually labeled) has obviously made all of the blood stagnate in your brain. I'm sorry to have to point out these things, but you obviously have no point. The credits that you are receiving are exactly the same as they were for v6 of the application. No, really. You think you see something different? That's because you're looking. Remember that "credits" could be waves or particles. When you aren't looking they are like waves... the waves you get when someone is leaving and receding into the distance. They are just as large as they ever were, but only appear smaller from your perspective, and since there is no preferred perspective, you are wrong. So, if you don't look - RAC goes bye-bye and you don't notice. But if you DO look, then you will see your RAC as particles, and with your frame of reference, it can appear that your credits are going back in time and choosing to present themselves as being small from the beginning, even though they would have been waving if you hadn't looked. Nobody has ever heard of a large particle. If they were large, they'd be balls, and nobody would call them particles. So, you have to realize that by looking you're making your balls into particles and they appear small. This is known as the duelality of credits.
Juan thinks the whole thing is like a random number generator, but he only thinks that because of the Uncertainty Principle. You see, Juan can either have a high degree of confidence in the speed with which his RAC declines, OR he can have an idea of what his RAC is, but he cannot know both with any degree of certainty and then only from his perspective. So, at the time he looks at his credit he "fixes" that credit in time. The other person who owns the rig against which his work unit validated may or may not have looked at THEIR work unit. This is a double-blind, single-blind, dual double-slit experiment where the person who is blind might get the same credit for more or less work depending on whether Juan looks at his result while sitting on his up-quark. If Juan's is up, then we know the other validator's credit must be down. That's why when you are sitting in a blind you might hear something go quark-quark before you look. I plan on taking two identical laptops and copying a work unit onto each. I will then climb a water tower and put one on top and let the other crunch at ground-level. The one near the top should crunch the work unit in less time compared to the one on the ground. If it doesn't, I will throw it off of the top of the water tower and count until it hits the ground, which will tell me how fast my RAC fell. And no matter what I observe, someone will tell me that I didn't. The real problem with the RAC we're playing with is that it is too small. Why do I say that? Because if I have a RAC of 125,000 and I add a GTX 670 to it, I *expect* to do more to my RAC than to go to 145,000. My RAC feels like it got bigger by too little. If I have four GPUs and my RAC is 60,000 and I go buy another $400 GPU and my RAC only increases to 75,000 I haven't had a "psychologically rewarding experience." If it went from 600,000 to 750,000, well, that's a whole 150,000!!! Look at my RAC climb!! AND really low RAC makes it hard as hell to see a small improvement.
You have a 10,000 RAC. You make a 3% improvement, you have a RAC of 10,300. BUT, if you have a RAC of 1,000,000 a 3% improvement is 30,000!!! Hey, HEY!! We all know it's easier to see 30,000 than 300 of something. Nobody cares. It doesn't change the "SCIENCE!" of shoveling sand into a bucket and sifting it and taking the bucket of non-sand to the scientist so he can see if the stuff in the bottom that isn't sand might be something interesting. And he hasn't said a word about whether we've brought him anything interesting or not in a long time. We're arguing about whether our bucket holds 5 or 40. 5 or 40 "what" doesn't matter. BUT, if we are being *paid* by the "x", then whether our bucket holds 5 of them or 40 of them does matter. All anyone around here wants is a consistent "x" and I'm tellin' you (preaching to the choir), as long as someone *believes* their formula is correct, all of the observation in the world isn't going to shake that belief. I also notice that people who have a small RAC are the most likely to say size doesn't matter. {yes, I am very tired and a little loopy} |
Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 54 |
I see that it's the ones with the bigger RAC that seem to cry the most. O woe is me I went from #23 to #24. Give it a rest, Dr.A could care less what anyone's RAC is. What with all the tear jerking calls for a better way to calculate RAC, I wish he'd say screw it and go back to the way it was. Crunch one you get a 1. That would be fair right? Will we lose folks who crunch? Maybe, but we get them all the time for various reasons. I see you are all still here. Old James |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13904 Credit: 208,696,464 RAC: 304 |
I wish he'd say screw it and go back to the way it was. Crunch one you get a 1. That would be fair right? Nope. Hence the Cobblestone. Just a shame credit new borked it. Grant Darwin NT |
Joined: 17 Feb 01 Posts: 34464 Credit: 79,922,639 RAC: 80 |
Juan. First of all no offense meant. Looking at this unit for example http://setiathome.berkeley.edu/result.php?resultid=3076512395 This is a VHAR unit and took 200 seconds longer than other VHARs on the same host. It indicates the CPU did not have enough free cycles at startup. 4 690's are physically 8 680's so in theory you'd need 8 free cores to feed the GPUs properly. At least 4 cores. As I can see, your host with 3 670's has only 5K RAC less, which shows this one is perfect. One 690 should produce more than 5K. All I want to say is that this has nothing to do with credit new. Each host with 4+ cards has the same issue. And I don't talk about overclocked hardware. With each crime and every kindness we birth our future. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
First of all no offense meant. That I clearly understand, and it is the same from my side. My question was only aimed at learning how to optimize the host; really, the numbers bug my mind. Even with the slow problem (I will make some tests during this week to see what I could do), the difference in credit on the already processed WUs (on the same host) seems totally weird; more processing time getting less credit is what I can't understand. The 3x670 host actually is a 690+670 host. A 670 is faster than 1/2 690 so the RAC differences are not so big. The APR on this host is 193.14636426995, about 10% faster than the 2x690 host. I will change the configuration on the 2x690 host later (the host is at a remote location) and free 4 cores to feed the GPUs; let's see what I get. But that raises the question: why did it work fine before with V6 and not with V7? |
Joined: 17 Feb 01 Posts: 34464 Credit: 79,922,639 RAC: 80 |
First of all no offense meant. It's not a does work or does not work situation. It's an optimizing thing. It works but not perfectly. V7 uses more CPU cycles than V6 does. So if 4 GPU instances start at the same time there is simply a lack in feeding the GPUs. But this is a long term thing. I also freed one more CPU core since the V7 release and as you can see I don't suffer anything. You need patience and try for weeks not days. With each crime and every kindness we birth our future. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13904 Credit: 208,696,464 RAC: 304 |
You need patience and try for weeks not days. Almost (but not quite) 2 months now. RAC is still falling. Grant Darwin NT |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
So if 4 GPU instances start at the same time there is simply a lack in feeding the GPUs. That is very unlikely; this host runs 24/7, so if that is happening it is surely very rare, and it must be a hell of a coincidence that we looked at exactly that WU. I also freed one CPU core more since V7 release and as you can see I don't suffer anything. I could understand that, but you don't have 2 very fast GPUs on the same host, so it is easy not to suffer too much; the faster your GPUs, the bigger the fall. That is what I see in the field, just look at the posts. Those who run mostly CPU work suffer a lot less than those of us who depend on GPU work. You need patience and try for weeks not days. I know that, and totally agree; to run SETI patience is always needed, with too many problems and too little support/feedback from the lab. The only thing that already helps is the volunteers like you who try to help the community. Thanks for your help, let's see what I get. |
Joined: 17 Feb 01 Posts: 34464 Credit: 79,922,639 RAC: 80 |
Like you said, I only have 1 GPU, and it happens at least 4 times a day that 3 instances start at the same time. I only noticed that because of those occurrences. With each crime and every kindness we birth our future. |
juan BFP Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
Besides when I do a reboot of the host (something very rare, normally just when Windows or NVIDIA releases some update), I don't really see more than 2 WUs starting at any given time, and it normally takes a few seconds to load the data from the HD to the GPU to crunch. That could be because I use an Intel CPU, different from the AMD you use, and my CPU load for other tasks (besides the BOINC tasks themselves) is very, very low (normally less than 2%, 5% tops); I could say this is almost a crunching-only host. But I will try to free 2 more cores (actually 2 cores are already free) ASAP when I arrive at the remote location. I don't like to make any changes to the configuration remotely; you know sh*** happens... and our friend Murphy is always around. |
tbret Joined: 28 May 99 Posts: 3380 Credit: 296,162,071 RAC: 40 |
Yes. You know I don't personally care. One can take their 14 years-worth of credits and try to buy a cup of coffee and see where that gets them. My $600 electric bill (last month, one location) is proof positive that credits don't have value; they cost. You know the "yard stick" I really liked was the number of hours. But, with an hour's worth of work being very different between very different hardware... I don't guess it matters, does it? |
rob smith Joined: 7 Mar 03 Posts: 22721 Credit: 416,307,556 RAC: 380 |
Locked, getting a bit long. Please make your way over to Panic Mode On (85) Server Problems? Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
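
For anyone who wants to re-check the arithmetic in Lionel's post earlier in this thread, here is a minimal Python sketch. The daily totals are copied from that post; the variable names and the choice to use range midpoints (630 s and 950 s run time, 77.5 credits per WU) are assumptions made here to match the 10.5/15.83 and 0.775 figures he quotes, not anything the project itself publishes.

```python
# Daily credit totals copied from Lionel's post (zero-credit days included).
v6_daily = [244130, 220168, 231098, 226353, 224723, 210477, 0, 431485,
            229312, 228767, 239021, 231271, 231050, 0, 392635, 209556]
v7_daily = [123072, 94061, 102333, 99896, 65653, 112209, 102538, 110760,
            89757, 96018, 111653, 90091, 119848, 99884, 104561, 110566,
            110603, 102856, 85268, 140694, 70247, 109698]


def mean(values):
    """Plain arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# Observed averages before and after the v7 migration.
v6_avg = mean(v6_daily)
v7_avg = mean(v7_daily)
observed_ratio = v7_avg / v6_avg

# Back-of-the-envelope prediction, using range midpoints (an assumption):
# v6 run time 600-660 s -> 630 s (10.5 min); v7 800-1100 s -> 950 s (15.83 min).
throughput_ratio = 630 / 950
# v7 credit per WU 75-80 -> 77.5, against roughly 100 under v6.
credit_per_wu_ratio = 77.5 / 100
predicted_ratio = throughput_ratio * credit_per_wu_ratio

print(f"v6 average daily credit: {v6_avg:,.0f}")
print(f"v7 average daily credit: {v7_avg:,.0f}")
print(f"observed v7/v6 ratio:    {observed_ratio:.3f}")
print(f"predicted v7/v6 ratio:   {predicted_ratio:.3f}")
```

The two zero days in May are each followed by a roughly double-sized day, so leaving them in the list keeps the v6 average honest; dropping them would overstate it.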
©2025 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.