Average Credit Decreasing?
Author | Message |
---|---|
Kevin Olley Joined: 3 Aug 99 Posts: 906 Credit: 261,085,289 RAC: 572 |
I have looked a couple of times at the task names and have seen a few _2's; most are inconclusives or errors while computing, and only one has been aborted. Hopefully this practice will not become widespread. Kevin |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13822 Credit: 208,696,464 RAC: 304 |
More and more top users are now aborting GBT VLARs from their GPUs, en masse. It's thousands and thousands of WUs that have to be sent out again. Maybe if they hold off sending out any more Arecibo WUs till all the VLAR aborters have left, then start sending out the Arecibo work again. And maybe they need to reset the resend number on WUs so that an Abort of a WU doesn't count against the number of replications. Otherwise WUs will end up not being processed. Grant Darwin NT |
Mr. Kevvy Joined: 15 May 99 Posts: 3791 Credit: 1,114,826,392 RAC: 3,319 |
Perhaps it's just my ignorance showing again, but to somewhat improve the situation, would it not be possible to undo the misassignment of GUPPI VLARs to GPUs while CPUs are getting Arecibo MB tasks? It appears that the file that controls this is `sched_request_setiathome.berkeley.edu.xml` in the project folder. As well, there are `boinc_task_state.xml` state files while they are processing. GPU-assigned work units show `<app_version>1</app_version> <plan_class>opencl_nvidia_sah</plan_class>`, whereas CPU-assigned ones show simply `<app_version>0</app_version>`. It would appear that this file could be loaded, the instances of each counted, and then switched wherever there were both GPU "guppi" units and non-guppi CPU units (as the count of each has to remain the same... 100 max. per CPU and GPU), ensuring that the switched units weren't present in any of the `boinc_task_state.xml` files. Problem is, if this was being done constantly we'd wind up in the same situation, because eventually they would all be slow guppis, the faster non-guppis having been reassigned to GPUs and completed. So, maybe every few hours would be ideal. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13822 Credit: 208,696,464 RAC: 304 |
Perhaps it's just my ignorance showing again, but to somewhat improve the situation, would it not be possible to undo the misassignment of GUPPI VLARs to GPUs while CPUs are getting Arecibo MB tasks? It would just be another temporary workaround IMHO. Once all the Arecibo data is gone, all that's left would be Guppies. I figure once they start splitting Guppie AP WUs then all those that left will come back for the Credit orgy that is AP. Grant Darwin NT |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Perhaps it's just my ignorance showing again, but to somewhat improve the situation, would it not be possible to undo the misassignment of GUPPI VLARs to GPUs while CPUs are getting Arecibo MB tasks? Especially since stock AP cpu doesn't include AVX optimisations that break creditNew normalisation (more than what SSE-SSE4.1 do anyway), and any GBT AP implementations are bound to require working in trickles or similar. Everything I look at seems to be telling me that the designer(s) has/have no idea that the normalisation step completely reverses the best of intentions, when the stock CPU application receives optimisations. At the same time I really find it difficult to comprehend that only the users would notice multiple stepwise credit drops for the same work (in the energy sense), despite escalating processing throughput and improving application efficiency. Something fishy about that. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
George 254 Joined: 25 Jul 99 Posts: 155 Credit: 16,507,264 RAC: 19 |
Forgive me guys, but as the one who started this thread with a simple enquiry about RACs falling (where the feedback/discussion for a non-techie like me has been very informative) let me pitch in. Many of the comments make me wonder why we are doing this BOINC stuff. Is it for science or personal glory? FWIW I prefer to make my own stats of Tasks completed daily. Aborting WUs because they don't generate as much credit (or Brownie Points as we might say here in the UK) doesn't help crunch the data which gets us closer to actually finding SETI. |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Forgive me guys, but as the one who started this thread with a simple enquiry about RACs falling (where the feedback/discussion for a non-techie like me has been very informative) let me pitch in. I think most of us agree with that sentiment (at least as far as I can tell), but at the same time I think change is harder for some than others. There are legitimate challenges for the project, and for us volunteer developers. I suspect if the cherrypicking starts to adversely affect results noticeably, then the project may do something (an option/switch was mentioned as being looked at). As a developer, the only bad option at the moment would be to remove processing of these across the board. That's just because a handful of testers and machines can only reveal so much about the problems and options. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Credit" should reflect the real work done, so every machine - regardless of its speed - should "earn" the same amount (for this unit!). If a machine is slower, it gets less credit just because it can't crunch as many units as a faster machine - simple as that. Same work - same credit. . . That seems to be the major consensus of opinion (heavily borne out by the facts). |
kittyman Joined: 9 Jul 00 Posts: 51473 Credit: 1,018,363,574 RAC: 1,004 |
People who abort work just because they don't like it........ Screw them. They do not give a shit about this project and its goals. I am very angry with them. You be on notice now.... I have one exception, and he knows who he is. I granted him his bit, because he was working to a certain goal. I have always regarded my RAC to be a barometer of how much I have contributed to this project. It seems that it no longer properly reflects how much. And I can accept that. The kitties are crunching all work sent, without exception. It ain't pretty, but it is my commitment to this project. If it needs to be analyzed, I shall do it. Any bunghole that cherry picks their work to make themselves proud....take a hike to another project, please. "Time is simply the mechanism that keeps everything from happening all at once." |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
The app does not have a "bug" in it as you keep espousing. The VLAR WUs are not the same type of signal as mid-range or very high angle range WUs. Both of those types are moving across the sky, to put it very simply; the larger the number, the "faster" it passes a point in the sky. The MB app has several types of "apps" within it to search for signals, and the share of the overall computation time that each of these kernels takes is in part driven by the angle range. In the case of VLARs, where the telescope is looking at a pinpoint location in the sky for a long time, there's a whole lot of time spent looking for pulses. There's a limit to how parallelized you can make those pulse searches. That's the high level gist of how it all works, and it is working as designed. . . As you mentioned one of the reasons ATI/AMD cards are less affected is because they run OpenCL tasks and not CUDA. But Nvidia cards also run OpenCL, and VLARs run as OpenCL tasks on my Nvidia cards have about half the time dilation effect of VLARs run as CUDA tasks. So in the interest of efficiency surely they should be distributed to Nvidia equipped hosts as OpenCL too, not as CUDA?? . . And both issues form the complete reason why RACs are crashing so badly, bloated run times exacerbated by inconsistent credit ratings. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Forgive me guys, but as the one who started this thread with a simple enquiry about RACs falling (where the feedback/discussion for a non-techie like me has been very informative) let me pitch in. . . So I am not the only one who calls them that! :) . . But I am a fan of efficiency, and tactics that create or cause longer processing time for no valid reason offend me. Since making VLARs (particularly Guppis) run as CUDA tasks can reduce the output of the Nvidia resources available to SETI by extending runtimes by about 400%, it is ridiculous to press on down that path. The eccentricities of Nvidia cards that make that happen are not going to change, but they do also run OpenCL, and while still slower than normal AR work, OpenCL tasks cause only about a 100% increase in runtimes. Which makes the use of CUDA apps to run Guppis (VLARs) seem particularly ridiculous. |
kittyman Joined: 9 Jul 00 Posts: 51473 Credit: 1,018,363,574 RAC: 1,004 |
There is a valid reason. The Guppi tasks simply contain more work than previously issued. That should be a good thing for us all....more information processed. This is why we are here, buddy. We are here, most of us, for the duration of the project. It does not matter to us how long it takes really, given the eons it has taken for this information to get here. "Time is simply the mechanism that keeps everything from happening all at once." |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14666 Credit: 200,643,578 RAC: 874 |
There is a valid reason. Actually, Stephen's point is valid. The guppi VLAR tasks 'contain' - or perhaps more properly, 'require' - less work to process than Arecibo VLAR tasks, as can be seen by running both types of tasks with the same CPU app on the same CPU. I'm not sure quite why that should be the case - I wish we could have some input from somebody like Joe Segur on the subject - but it's a consistent observation here. Aborting tasks still isn't the long-term solution to the inefficiency, though. (personally, I call them Gollum Points - "Mine, all mine, my precious") |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Are you determining more or less work based on elapsed time? Because the CPU code is exceptionally efficient for that long/deep pulsefinding, making the communications (memory access) cost relatively low compared to our familiar Arecibo Shorties that tend to thrash CPU cache. It *could* clarify some things, if you're able to monitor CPU core temperature running a single Guppi VLAR, versus core temp with an Arecibo VHAR, then compare cpu_time/deltaTfromIdle. Could be tricky to get something consistent/meaningful, but maybe worthwhile for the sake of confirming the true work. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14666 Credit: 200,643,578 RAC: 874 |
Are you determining more or less work based on elapsed time? Because the CPU code is exceptionally efficient for that long/deep pulsefinding, making the communications (memory access) cost relatively low compared to our familiar Arecibo Shorties that tend to thrash CPU cache. No - not that, I said nothing about shorties or VHAR. The comparison - yes, on elapsed time - was between guppi VLAR and Arecibo VLAR - which I was expecting, a priori, to be comparable. And of course, when I call up Valid tasks for computer 5828732, the difference is nothing like as striking as I remember..... Edit - on that current data, guppi averages about 7.5% quicker than Arecibo. |
kittyman Joined: 9 Jul 00 Posts: 51473 Credit: 1,018,363,574 RAC: 1,004 |
There is a valid reason. I miss Joe's insights. Anybody ever find out where Joe went? I would like to think that he still haunts this place and understands that he is missed. "Time is simply the mechanism that keeps everything from happening all at once." |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Are you determining more or less work based on elapsed time? Because the CPU code is exceptionally efficient for that long/deep pulsefinding, making the communications (memory access) cost relatively low compared to our familiar Arecibo Shorties that tend to thrash CPU cache. What I'm getting at is, if GuppiVLAR gives lower-time/higherDeltaT (i.e. run hotter for shorter), and AreciboVLAR gives longer-time/lowerDeltaT (cooler for longer), and both ratios are similar, then there is a search efficiency impact, rather than a useful computation work difference. If the ratios come out vastly different, then it'd be a parameter+data driven difference rather than an efficiency one. My curveball of throwing in the VHAR was that we know fundamentally VHAR are less computationally efficient on CPU than VLAR, due to memory accesses, therefore a lower ratio on these of cpu_time/deltaT would be expected, therefore confirmation/validation if something sensible or weird and unexpected showed up.... Like say the Guppis ran faster without generating more heat but did more operations. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
Edit - on that current data, guppi averages about 7.5% quicker than Arecibo. Ah, that narrow. A degree or two hotter? "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14666 Credit: 200,643,578 RAC: 874 |
As chance would have it, I've got no Arecibo VLARs in the cache on that machine at the moment. But I do have TThrottle running (in display-only mode, not actively throttling), so I'll make notes if I see any passing through. |
jason_gee Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0 |
As chance would have it, I've got no Arecibo VLARs in the cache on that machine at the moment. But I do have TThrottle running (in display-only mode, not actively throttling), so I'll make notes if I see any passing through. Cheers, positive or negative results could end up unimportant, or have some clues buried in there. [One of those things that triggers a 'that's odd' feeling] "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. |
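
Mr. Kevvy's rebalancing idea from earlier in the thread (swap GPU-assigned guppi tasks with CPU-assigned non-guppi tasks, one for one, skipping anything already running) might be sketched roughly as below. This is only an illustration of the pairing logic on in-memory records. The `Task` class, its fields, the `rebalance` function, and the task names are all hypothetical, and nothing here reads or writes any BOINC file, since the thread does not confirm how those files would have to be edited.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Task:
    """Hypothetical in-memory record for one cached work unit."""
    name: str              # work-unit name; "guppi" in the name is assumed to mark GBT data
    device: str            # "cpu" or "gpu"
    started: bool = False  # assumed True once a task state file exists, i.e. it is already processing


def is_guppi(task: Task) -> bool:
    # Assumption: guppi tasks can be recognised from the task name alone.
    return "guppi" in task.name.lower()


def rebalance(tasks: List[Task]) -> List[Task]:
    """Swap unstarted GPU guppi tasks with unstarted CPU non-guppi tasks.

    Tasks are swapped strictly in pairs, so the number of tasks assigned
    to each device stays the same (the per-device cap is respected).
    """
    gpu_guppi = [i for i, t in enumerate(tasks)
                 if t.device == "gpu" and is_guppi(t) and not t.started]
    cpu_other = [i for i, t in enumerate(tasks)
                 if t.device == "cpu" and not is_guppi(t) and not t.started]

    result = list(tasks)
    # zip() stops at the shorter list, so leftovers keep their assignment.
    for g, c in zip(gpu_guppi, cpu_other):
        result[g] = replace(tasks[g], device="cpu")
        result[c] = replace(tasks[c], device="gpu")
    return result
```

As the original post notes, running something like this continuously would leave only slow guppi tasks in the cache once the non-guppi work had been moved to the GPU and completed, so it would only make sense at intervals of a few hours.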
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.