Boinc juggling Cuda versions, examples from Juan

Author	Message
jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1374789 - Posted: 1 Jun 2013, 15:16:32 UTC Last modified: 1 Jun 2013, 15:17:12 UTC for analyses The data is really flowing again, but some hosts are still receiving the wrong cuda version. Did anyone else have the same problem? let's analyse a specific host Juan ? which host(s) ? (maybe start a thread for it) Sorry, I was sleeping.... here are some examples (all running V7 from the beginning): This host is a 590+2x580: http://setiathome.berkeley.edu/host_app_versions.php?hostid=5264653 it receives only cuda50, expected cuda42 since all the GPUs are Fermis. This is a 690+670 host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6690764 it still receives cuda32 (a few WUs, something like 1 in 10; the rest are cuda50), expected cuda50 since all the GPUs are Keplers. All GPUs are running at stock speeds and the Keplers are EVGA Classified/FTW. I will check my other hosts for any abnormalities. Hope that helps to clarify what I'm trying to show. I'm still thinking the old way, where we choose the best apps, is better. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1374789 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1374792 - Posted: 1 Jun 2013, 15:19:26 UTC Last modified: 1 Jun 2013, 15:35:12 UTC first one: This host is a 590+2x580: http://setiathome.berkeley.edu/host_app_versions.php?hostid=5264653 it receives only cuda50, expected cuda42 since all the GPUs are Fermis. This host has 'completed' mostly Cuda5, ~1000+ completed tasks Vs 254 total for the older versions. Statistically the Cuda5 average should be 10x more accurate with the current work mix, and indicate that app is a lot faster than the older versions with little data. I would argue Cuda 4.2 and Cuda 3.2 need more data to give accurate numbers, as ~100 tasks probably doesn't represent much of a work mix. Let's hope that the scheduler decides to probe with the older versions. It's entirely possible both ways, that Cuda 4.2 would be a better choice (needs more to run), or that Cuda5 on average is doing 'something better' not easily visible benching a few tasks. The funny thing about statistics is they can look 'wacky' with small numbers, then suddenly look sensible with enough data. It remains to be seen whether the server will make sensible probe tests with the other versions. The averages so far suggest it could go either way. Maybe the dynamics are different with 2 at a time too. Which is better? 3 slow ones, two fast ones, one really fast one & less power ? (I can't answer that. Taking control yourself is installer territory)

SETI@home v7 7.00 windows_intelx86 (cuda32)
Number of tasks completed: 108
Max tasks per day: 213
Number of tasks today: 0
Consecutive valid tasks: 113
Average processing rate: 157.79027950233
Average turnaround time: 0.10 days

SETI@home v7 7.00 windows_intelx86 (cuda42)
Number of tasks completed: 146
Max tasks per day: 258
Number of tasks today: 0
Consecutive valid tasks: 158
Average processing rate: 148.0548789089
Average turnaround time: 0.08 days

SETI@home v7 7.00 windows_intelx86 (cuda50)
Number of tasks completed: 1017
Max tasks per day: 1203
Number of tasks today: 455
Consecutive valid tasks: 1040
Average processing rate: 171.612778602
Average turnaround time: 0.12 days

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1374792 ·
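As a rough check on the sample-size argument in this post, here is a minimal sketch. It assumes the per-task rates are independent with similar variance, so that the uncertainty of an average shrinks roughly with the square root of the number of tasks; that assumption is not something the thread or the server confirms. Under it, about ten times as many tasks makes the average roughly three times tighter rather than ten times. The task counts are the ones listed for this host; the function name is made up for illustration.

```python
import math

# Completed-task counts copied from this host's application details.
completed = {"cuda32": 108, "cuda42": 146, "cuda50": 1017}


def relative_precision(n_ref: int, n_other: int) -> float:
    """How many times tighter an average over n_ref samples is than one over
    n_other samples, ASSUMING independent samples with equal variance
    (standard error proportional to 1/sqrt(n))."""
    return math.sqrt(n_ref / n_other)


for name in ("cuda32", "cuda42"):
    ratio = relative_precision(completed["cuda50"], completed[name])
    print(f"cuda50 average is about {ratio:.1f}x tighter than the {name} average")
```

The point of the sketch is only that the cuda32 and cuda42 averages rest on far less data than the cuda50 one, which is the reason given above for wanting more tasks before drawing conclusions.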

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1374803 - Posted: 1 Jun 2013, 15:36:29 UTC - in response to Message 1374789. Last modified: 1 Jun 2013, 15:44:00 UTC Second one: This is a 690+670 host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6690764 it still receives cuda32 (a few WUs, something like 1 in 10; the rest are cuda50), expected cuda50 since all the GPUs are Keplers. Looks like the server is 'measuring' this one. The APRs are close, but more tasks are needed all around IMO. Do you trust long term averages better? or a synthetic bench with 1-4 shortened pretend tasks ? I have no idea if the server will converge on sensible solutions, or if they will match the common wisdom. It'll be interesting to see if our prejudices turn out correct, or we should drink more beer & leave the statistics to the server.

SETI@home v7 7.00 windows_intelx86 (cuda32)
Number of tasks completed: 71
Max tasks per day: 173
Number of tasks today: 36
Consecutive valid tasks: 73
Average processing rate: 100.29694958426
Average turnaround time: 0.19 days

SETI@home v7 7.00 windows_intelx86 (cuda42)
Number of tasks completed: 321
Max tasks per day: 453
Number of tasks today: 9
Consecutive valid tasks: 316
Average processing rate: 99.605971421119
Average turnaround time: 0.25 days

SETI@home v7 7.00 windows_intelx86 (cuda50)
Number of tasks completed: 183
Max tasks per day: 255
Number of tasks today: 284
Consecutive valid tasks: 144
Average processing rate: 107.70633060762
Average turnaround time: 0.28 days

"Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1374803 ·
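To put a number on "the APRs are close", the following sketch compares the three average processing rates reported for this host. It is plain arithmetic on the posted figures; it does not model how the scheduler actually picks a version, and the helper name is made up for illustration.

```python
# Average processing rates (APR) copied from this host's application details.
apr = {
    "cuda32": 100.29694958426,
    "cuda42": 99.605971421119,
    "cuda50": 107.70633060762,
}


def percent_gap(a: float, b: float) -> float:
    """Percentage by which rate a exceeds rate b."""
    return (a - b) / b * 100.0


best = max(apr, key=apr.get)  # version with the highest reported APR
for name, rate in apr.items():
    if name != best:
        print(f"{best} APR is {percent_gap(apr[best], rate):.1f}% above {name}")
```

On these figures the spread between versions is under ten percent, which is small enough that a run of unusual tasks could plausibly reorder them.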

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874	Message 1374807 - Posted: 1 Jun 2013, 15:44:31 UTC - in response to Message 1374803. Second one: This is a 690+670 host: http://setiathome.berkeley.edu/show_host_detail.php?hostid=6690764 it still receives cuda32 (a few WUs, something like 1 in 10; the rest are cuda50), expected cuda50 since all the GPUs are Keplers. Watch out for that one getting its APRs in a twist, if one app_version happens to pick up a substantial run of VLARs. I think we'd better try to get Eric to define CUDA VLARs as outliers, pronto - see Beta. ID: 1374807 ·

SciManStev Volunteer tester Send message Joined: 20 Jun 99 Posts: 6652 Credit: 121,090,076 RAC: 0	Message 1374816 - Posted: 1 Jun 2013, 15:54:03 UTC Although I have been way too busy to check specifics, my 480's have been given mostly Cuda 5.0, by a huge margin. For now, I'll crunch anything that comes my way. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website ID: 1374816 ·

Gatekeeper Send message Joined: 14 Jul 04 Posts: 887 Credit: 176,479,616 RAC: 0	Message 1374830 - Posted: 1 Jun 2013, 16:21:24 UTC My 590's are all getting 4.2, as expected, but the 580's are getting almost 100% 5.0. Like Steve, I'll take all the work I can get. ID: 1374830 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1374834 - Posted: 1 Jun 2013, 16:24:12 UTC My dual 680 rig is 4.2 on everything in the cache now. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1374834 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1374837 - Posted: 1 Jun 2013, 16:29:29 UTC - in response to Message 1374834. My dual 680 rig is 4.2 on everything in the cache now. My advice would be to wait a week or so before forming a posse. If things don't stabilise I'll be right there with ya. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1374837 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004	Message 1374848 - Posted: 1 Jun 2013, 16:43:16 UTC - in response to Message 1374837. Last modified: 1 Jun 2013, 17:13:37 UTC My dual 680 rig is 4.2 on everything in the cache now. My advice would be to wait a week or so before forming a posse. If things don't stabilise I'll be right there with ya. No troubles here, mate. All rigs full on, flat out, full bore. Same sh*t, different day. That's all. My little RAC weenie has just shrunk a bit. ROFLMAO....... It's actually just a little schnitzel of itself right now. "Freedom is just Chaos, with better lighting." Alan Dean Foster ID: 1374848 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1374922 - Posted: 1 Jun 2013, 20:11:34 UTC - in response to Message 1374837. My dual 680 rig is 4.2 on everything in the cache now. My advice would be to wait a week or so before forming a posse. If things don't stabilise I'll be right there with ya. And the effect of people aborting everything & anything they don't like in favour of the things they do like? Grant Darwin NT ID: 1374922 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1374926 - Posted: 1 Jun 2013, 20:41:38 UTC - in response to Message 1374803. or we should drink more beer & leave the statistics to the server. Sorry, I was out for work; I hate working on Saturdays. I always do what our master guru Jason says, so I'm going to drink some beers and leave the stats to the servers. Anyway, I need a few beers because my RAC dropped from 440K to less than 392K, kicked out of the 400k RAC club. Let's see what we get in a few days. ID: 1374926 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1375184 - Posted: 2 Jun 2013, 7:38:12 UTC Last modified: 2 Jun 2013, 8:15:46 UTC Not exactly related to this topic, but I just saw this: http://setiathome.berkeley.edu/result.php?resultid=3025225878 an old V6 WU just now processed (a re-load?) in 537.36 sec with credit of 130.66 on a 590, against for example: http://setiathome.berkeley.edu/result.php?resultid=3024873098 a new V7 processed in 562.89 sec (almost the same time) with credit of 33.10 on a 690. So it's clear: for the same processing time a V7 WU receives about 1/4 of the credit of a V6 WU, or am I wrong? Even if you consider the 690 is a little faster than the 590. Another host that receives the "wrong" cuda version WUs: http://setiathome.berkeley.edu/show_host_detail.php?hostid=5280419 it's a single 580 (Fermi, expected cuda42) host and only receives cuda50. ID: 1375184 ·
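The "about 1/4" figure can be checked with simple arithmetic on the two results quoted in this post. The sketch below only divides credit by run time. It says nothing about whether the two work units are comparable, since they ran on different GPUs and may differ in angle range.

```python
# (credit granted, run time in seconds) as quoted in the post.
v6_credit, v6_seconds = 130.66, 537.36   # old V6 WU on a 590
v7_credit, v7_seconds = 33.10, 562.89    # new V7 WU on a 690


def credit_per_second(credit: float, seconds: float) -> float:
    """Credit earned per second of processing time."""
    return credit / seconds


v6_rate = credit_per_second(v6_credit, v6_seconds)
v7_rate = credit_per_second(v7_credit, v7_seconds)
ratio = v6_rate / v7_rate

print(f"V6: {v6_rate:.4f} credit/s, V7: {v7_rate:.4f} credit/s, ratio {ratio:.2f}")
```

For these two results the V6 task earned a little over four times as much credit per second as the V7 task, which matches the estimate in the post.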

Mike Volunteer tester Send message Joined: 17 Feb 01 Posts: 34258 Credit: 79,922,639 RAC: 80	Message 1375216 - Posted: 2 Jun 2013, 8:12:46 UTC - in response to Message 1375184. Not exactly related to this topic, but I just saw this: http://setiathome.berkeley.edu/result.php?resultid=3025225878 an old V6 WU just now processed (a re-load?) in 537.36 sec with credit of 130.66 on a 590, against for example: http://setiathome.berkeley.edu/result.php?resultid=3024873098 a new V7 processed in 562.89 sec (almost the same time) with credit of 33.10 on a 690. So it's clear: for the same processing time a V7 WU receives about 1/4 of the credit of a V6 WU, or am I wrong? Even if you consider the 690 is a little faster than the 590. You can't compare those 2 units. The V6 unit was midrange AR whilst the V7 unit is a VHAR. With each crime and every kindness we birth our future. ID: 1375216 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1375219 - Posted: 2 Jun 2013, 8:17:06 UTC - in response to Message 1375216. Last modified: 2 Jun 2013, 8:19:15 UTC You can't compare those 2 units. The V6 unit was midrange AR whilst the V7 unit is a VHAR. I'm not comparing the type of WU, just the processing time vs credit, or am I wrong? I always believed the credit was related to the processing time the WU needs to be crunched. More time more credit, less time less credit. ID: 1375219 ·

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1375221 - Posted: 2 Jun 2013, 8:21:29 UTC - in response to Message 1375219. I always believed the credit was related to the processing time the WU needs to be crunched. It was related to the amount of work required to process a WU- that's why stock, optimised & GPU based applications could take different times to process a given WU, but all would get the same credit for it. Unfortunately Credit New screwed that up; now credit is more of a random thing than based on work done. Or it at least appears that way. Grant Darwin NT ID: 1375221 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1375222 - Posted: 2 Jun 2013, 8:26:12 UTC - in response to Message 1375221. Last modified: 2 Jun 2013, 8:26:57 UTC I always believed the credit was related to the processing time the WU needs to be crunched. It was related to the amount of work required to process a WU- that's why stock, optimised & GPU based applications could take different times to process a given WU, but all would get the same credit for it. Unfortunately Credit New screwed that up; now credit is more of a random thing than based on work done. Or it at least appears that way. I could understand that, but on almost the same GPU (the 590-690 difference is about 20%) I always believed the time to process is what gives us the credit that we get. So I expect 2 different WUs crunched in almost the same time to receive almost the same credit; at least common sense makes us believe that. ID: 1375222 ·

rob smith Volunteer moderator Volunteer tester Send message Joined: 7 Mar 03 Posts: 22200 Credit: 416,307,556 RAC: 380	Message 1375231 - Posted: 2 Jun 2013, 8:37:15 UTC "New Credit" is calculated by an (overly) complex algorithm. Crudely it makes an estimate from the two agreeing reported data sets of the amount of work required to do a particular task. It then takes the lower of the two values it "guesses" and that is the value used to calculate the credit awarded to both crunchers. One thing to note is that the calculation is based on the information held on the server about the target processor on the target cruncher, so if you indulge in rescheduling, and move lots of tasks from a humble CPU to a mega GPU you will be seen as returning the task quickly from the CPU, and so the credit will be significantly lower than the task warrants. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? ID: 1375231 ·
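The description in this post can be sketched as a toy model. This is not the actual CreditNew algorithm, only an illustration of the two points made above: both crunchers receive the lower of the two work estimates, and an estimate depends on the device the server has on record for the task. The functions, the formula (run time multiplied by the recorded device speed) and all numbers are hypothetical.

```python
def work_estimate(elapsed_seconds: float, recorded_speed_gflops: float) -> float:
    """Toy estimate of work done: run time multiplied by the speed the server
    has on record for the device the task was assigned to (hypothetical)."""
    return elapsed_seconds * recorded_speed_gflops


def granted_estimate(estimate_a: float, estimate_b: float) -> float:
    """Both crunchers are granted credit based on the lower of the two estimates."""
    return min(estimate_a, estimate_b)


# Hypothetical numbers, for illustration only.
wingman = work_estimate(600.0, 100.0)      # GPU task, recorded against a GPU
rescheduled = work_estimate(600.0, 10.0)   # same run time, but recorded against a slow CPU

print(granted_estimate(wingman, rescheduled))
```

In this toy model the rescheduled task looks like a small amount of work because a short run time is paired with a slow recorded device, and taking the minimum lowers the figure for both crunchers.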

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304	Message 1375237 - Posted: 2 Jun 2013, 8:42:59 UTC - in response to Message 1375222. So I expect 2 different WUs crunched in almost the same time to receive almost the same credit; at least common sense makes us believe that. If the 2 different WU were similar types of WU then they would normally get the same credit, but as you know shorties, mid range & VLARs are very different types of WU. So regardless of how similar the crunching times might be between 2 different types of WU they won't get the same credit. The only way to compare them accurately is if they are the same angle range, being processed by the same device with the same application on the same hardware. Then the run times would be the same (or very, very close). But different applications on different hardware will give different run times, although the amount of processing done is the same. Grant Darwin NT ID: 1375237 ·

juan BFP Volunteer tester Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799	Message 1375245 - Posted: 2 Jun 2013, 8:48:18 UTC - in response to Message 1375233. Last modified: 2 Jun 2013, 8:55:20 UTC Read this about CreditNew, and then you will understand that it is totally random, and that you as well as everyone else here, doesn't understand anything about CreditNew :-) http://boinc.berkeley.edu/trac/wiki/CreditNew Now I understand why I can't understand how CreditNew works, it's total nonsense! That reminds me: why make something simple and easy to understand if you could make it complicated and totally incomprehensible? I need a beer! But it's 5:45 AM here, so I'm going for a coffee. @Grant give or take all the differences, 4x less credit for the same processing time on about the same hardware? That makes little or no sense at all... Anyway, thanks for all the answers. Now back to the old question: why are the servers still sending cuda50 to the 580 (Fermi) hosts? ID: 1375245 ·

rob smith Volunteer moderator Volunteer tester Send message Joined: 7 Mar 03 Posts: 22200 Credit: 416,307,556 RAC: 380	Message 1375249 - Posted: 2 Jun 2013, 8:55:23 UTC Simple answer - the servers don't think they have enough VALIDATED results to see the error of their ways. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? ID: 1375249 ·
