Monitoring inconclusive GBT validations and harvesting data for testing

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1810665 - Posted: 20 Aug 2016, 7:48:18 UTC Having this 'pause' in GBT splitting is really useful. Normally, the vast majority of WUs pass straight through the system and down the pan at the first attempt. But a period of 'resends only' allows the real turds to float to the top for screening. Once all my 'initial split' tasks (_0 and _1) are out of the way - a few more hours yet - I might harvest the remaining data files so we have a representative crop for use in future application testing. ID: 1810665 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1810669 - Posted: 20 Aug 2016, 7:54:01 UTC - in response to Message 1810665. Having this 'pause' in GBT splitting is really useful. Normally, the vast majority of WUs pass straight through the system and down the pan at the first attempt. But a period of 'resends only' allows the real turds to float to the top for screening. Once all my 'initial split' tasks (_0 and _1) are out of the way - a few more hours yet - I might harvest the remaining data files so we have a representative crop for use in future application testing. Good thanks. Will be delighted to stand up for my portion of the floaters, and justify my growing saltiness. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1810669 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1810688 - Posted: 20 Aug 2016, 10:00:22 UTC Down to three Messier tapes at 09:40 UTC, so one of the GBT splitters will have switched back to HIPs - and it won't take long for the rest to follow. With a reduced RTS, and faster returns for pure Arecibo work, I reckon the optimum time for a floater harvest will be around 15:00 UTC. I'll try and clear the pipes by then, if I can work around yet another hardware failure. Single most critical (and frequent) failure point for my little shrubbery is the £5 wall-wart that powers the KVM switch. This one's "lifetime warranty" lasted about six months - and I need it for real-world time sensitive work tomorrow Sunday. Bah. ID: 1810688 ·

Kiska Volunteer tester Send message Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0	Message 1810695 - Posted: 20 Aug 2016, 11:18:49 UTC - in response to Message 1810688. Are there any particular files that need harvesting? I can see if I can get some downloaded and uploaded to Google Drive ID: 1810695 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1810700 - Posted: 20 Aug 2016, 12:00:50 UTC - in response to Message 1810695. Are there any particular files that need harvesting? I can see if I can get some downloaded and uploaded to Google Drive Thanks, but I don't think that will be necessary. I'm not planning to download anything - simply to copy the datafiles which are already sitting on my home machines, waiting to be processed. But choosing specifically those tasks where there's already some evidence of a validation difficulty. Then, I'll be following the progress of those workunits, and looking for patterns. Not for evidence of individual badly-maintained or over-stressed hosts, but of systemic errors in particular application builds. Top of the list will be Petri's "special" code, because that is still under active development out here in the volunteer community, and shows great promise - if only the inaccuracies can be ironed out. I'm hoping that having a stock of WUs known to trigger the 'inconclusive' outcome will allow the users and developers affected - perhaps after the challenge is over - to run the harvested WUs offline under bench conditions, and find out exactly what the differences in the result files are. That's the first step in responsible debugging. I'll also be keeping an eye open for examples of the other turd-droppers listed by Jason - such as those awful stock apple-darwin apps - but most of them are less amenable to fixing by the external community. All assuming I can get my monitor to light up again. Off into town now to try and source a replacement for that wall-wart - probably end up paying £20 at Maplin for a £5 part. ID: 1810700 ·

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1810741 - Posted: 20 Aug 2016, 15:22:14 UTC - in response to Message 1810700. Are there any particular files that need harvesting? I can see if I can get some downloaded and uploaded to Google Drive Thanks, but I don't think that will be necessary. I'm not planning to download anything - simply to copy the datafiles which are already sitting on my home machines, waiting to be processed. But choosing specifically those tasks where there's already some evidence of a validation difficulty. Then, I'll be following the progress of those workunits, and looking for patterns. Not for evidence of individual badly-maintained or over-stressed hosts, but of systemic errors in particular application builds. Top of the list will be Petri's "special" code, because that is still under active development out here in the volunteer community, and shows great promise - if only the inaccuracies can be ironed out. I'm hoping that having a stock of WUs known to trigger the 'inconclusive' outcome will allow the users and developers affected - perhaps after the challenge is over - to run the harvested WUs offline under bench conditions, and find out exactly what the differences in the result files are. That's the first step in responsible debugging. I'll also be keeping an eye open for examples of the other turd-droppers listed by Jason - such as those awful stock apple-darwin apps - but most of them are less amenable to fixing by the external community. All assuming I can get my monitor to light up again. Off into town now to try and source a replacement for that wall-wart - probably end up paying £20 at Maplin for a £5 part. I would suggest you concentrate on the older Apps that have already been identified as troublesome rather than Apps that are updated regularly. This has already been posted; So, Darwin 15.4, 15.5.
Ok, this match perfectly with what Urs supplied to me yesterday. Will try to get exclusion of these OS versions. You can add Darwin 15.6 & 16.0 to that list now. For the Laptops the list would be 15.0-16.0. Here are a few machines I've got marked, http://setiathome.berkeley.edu/results.php?hostid=1575265 http://setiathome.berkeley.edu/results.php?hostid=6787046 http://setiathome.berkeley.edu/results.php?hostid=6134063 There are many others. The problems with the AVX CPUs in Darwin 11.4.2 have been known about since the MBv7 days. I'm still trying to figure out why a handful of ATI HD4 cards can't be excluded from being sent the HD5 App. Those few items would cure many problems with Inconclusive results. Then there are all those Intel iGPUs... ID: 1810741 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1810747 - Posted: 20 Aug 2016, 15:39:34 UTC - in response to Message 1810741. I would suggest you concentrate on the older Apps that have already been identified as troublesome rather than Apps that are updated regularly. I'll gladly identify task data files which appear to trigger the instabilities, and make them available to testers with offline capability. But could you help me identify active developers who would be capable of fixing the code? If we can demonstrate that we have supplied better-performing applications (measured by percentage of first-time validation, nothing to do with speed) - perhaps through anonymous platform running - I'd feel more confident about approaching Eric and the other lab staff to seek a stock upgrade. ID: 1810747 ·
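
The metric proposed above, percentage of first-time validation per application, is easy to compute once task outcomes have been tabulated. A minimal sketch in Python, assuming a hypothetical list of (application, validated-at-first-attempt) pairs; the application names and outcomes below are placeholders, not real project data:

```python
from collections import Counter

def first_time_validation_rate(outcomes):
    """Return {app: percent validated at first attempt}.

    outcomes: iterable of (app_name, validated_first_time) pairs,
    where validated_first_time is a bool. This input layout is an
    assumption made for illustration.
    """
    total = Counter()
    valid = Counter()
    for app, ok in outcomes:
        total[app] += 1
        if ok:
            valid[app] += 1
    return {app: 100.0 * valid[app] / total[app] for app in total}

# Placeholder data: app_a validates 1 of 2 tasks, app_b validates 2 of 2.
sample = [("app_a", True), ("app_a", False), ("app_b", True), ("app_b", True)]
rates = first_time_validation_rate(sample)
```

Speed plays no part in this figure, which matches the intent stated in the post: only the share of tasks that validate without needing a tie-breaker is counted.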

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1810750 - Posted: 20 Aug 2016, 15:57:42 UTC - in response to Message 1810747. Last modified: 20 Aug 2016, 16:01:10 UTC I would suggest you concentrate on the older Apps that have already been identified as troublesome rather than Apps that are updated regularly. I'll gladly identify task data files which appear to trigger the instabilities, and make them available to testers with offline capability. But could you help me identify active developers who would be capable of fixing the code? If we can demonstrate that we have supplied better-performing applications (measured by percentage of first-time validation, nothing to do with speed) - perhaps through anonymous platform running - I'd feel more confident about approaching Eric and the other lab staff to seek a stock upgrade. I believe you are aware of the developers. Look at these results, http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=73656 Note how the one App was producing Valid results and then the machine hit two opencl_nvidia_mac tasks resulting in two Inconclusive results. There are a few of those machines at Beta, unfortunately there was never an announcement on the App's availability. Currently there are few people testing the App at Beta. The SSSE3 CPU App that could run on those Darwin 11.4.2 AVX CPUs was offered to Beta at the same time as the CUDA Apps, the CPU App didn't make it. ID: 1810750 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1810754 - Posted: 20 Aug 2016, 16:05:14 UTC - in response to Message 1810747. I would suggest you concentrate on the older Apps that have already been identified as troublesome rather than Apps that are updated regularly. I'll gladly identify task data files which appear to trigger the instabilities, and make them available to testers with offline capability. But could you help me identify active developers who would be capable of fixing the code? If we can demonstrate that we have supplied better-performing applications (measured by percentage of first-time validation, nothing to do with speed) - perhaps through anonymous platform running - I'd feel more confident about approaching Eric and the other lab staff to seek a stock upgrade. Also still wrestling with the challenges of being down a key man especially with the CPU builds. No solutions from my direction yet, though might come across something during x42 Cuda that could aid some of the issues. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1810754 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1810758 - Posted: 20 Aug 2016, 16:16:15 UTC - in response to Message 1810754. Also still wrestling with the challenges of being down a key man ... Two key men if you include Charlie Fenton, who would be Eric's go-to man for stock Mac builds, but was lost in the BOINC NSF cull - he hung around for a while, but I haven't seen him posting since early May. ID: 1810758 ·

jason_gee Volunteer developer Volunteer tester Send message Joined: 24 Nov 06 Posts: 7489 Credit: 91,093,184 RAC: 0	Message 1810761 - Posted: 20 Aug 2016, 16:32:41 UTC - in response to Message 1810758. ... BOINC NSF cull ... Ugh, oh yeah that... might be something worth bouncing off the committee then. Maybe some compute based project with Mac developers would be amenable to some exchange ? just musing. "Living by the wisdom of computer science doesn't sound so bad after all. And unlike most advice, it's backed up by proofs." -- Algorithms to live by: The computer science of human decisions. ID: 1810761 ·

Jeff Buck Volunteer tester Send message Joined: 11 Feb 00 Posts: 1441 Credit: 148,764,870 RAC: 0	Message 1810776 - Posted: 20 Aug 2016, 17:39:01 UTC - in response to Message 1810642. So far, a Windows CPU, a Mac NVIDIA GPU and a Windows Intel GPU all disagree, while a Windows Cuda50 timed out and a Mac ATI GPU crapped out. My Win7 host is next in line and will probably run it as Cuda50 sometime tomorrow, unless I reschedule it to the CPU or to SoG. Let's see....what might produce the most interesting result? Hmmm... Just a quick follow-up on WU 2192117866. I chose to run my _5 task with SoG this morning. The result agreed with the Windows CPU (_0 host). The Mac NVIDIA and the Intel GPU were close enough to also get validated. Richard, if this is the sort of mixed-results WU you're looking to archive, I have it saved, just in case. ID: 1810776 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1810851 - Posted: 20 Aug 2016, 22:37:56 UTC - in response to Message 1810776. So far, a Windows CPU, a Mac NVIDIA GPU and a Windows Intel GPU all disagree, while a Windows Cuda50 timed out and a Mac ATI GPU crapped out. My Win7 host is next in line and will probably run it as Cuda50 sometime tomorrow, unless I reschedule it to the CPU or to SoG. Let's see....what might produce the most interesting result? Hmmm... Just a quick follow-up on WU 2192117866. I chose to run my _5 task with SoG this morning. The result agreed with the Windows CPU (_0 host). The Mac NVIDIA and the Intel GPU were close enough to also get validated. Richard, if this is the sort of mixed-results WU you're looking to archive, I have it saved, just in case. Sorry, took a bit of a breather while my machines flushed the remainder of the uninteresting caches (i.e. not resends). A couple are still doing that - may have to wait until morning. I think you've got the right idea - make a note of what the original reason for the inconclusiveness was, and return later to see how many validate in the end. The most interesting one I've got so far is WU 2239728586, which is a triple inconclusive between stock CPU vs. stock nvidia_mac vs. petri special being run by -= Vyper =-. I've got the tie-breaker on an optimised AVX. The Mac and the Petri both record one more pulse than the stock, but must differ somewhere - they both have printed signal summaries, so if no-one beats me to it, I can try a visual comparison in the morning. (That phrase "stock nvidia_mac" is appearing far too often in my working notes already) ID: 1810851 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1811304 - Posted: 22 Aug 2016, 9:05:17 UTC Last modified: 22 Aug 2016, 9:51:36 UTC Placeholder post for thread separation - please don't post here until threads rebuilt. Edit - OK, thread separation complete, feel free to join the conversation. ID: 1811304 ·

Richard Haselgrove Volunteer tester Send message Joined: 4 Jul 99 Posts: 14654 Credit: 200,643,578 RAC: 874	Message 1811307 - Posted: 22 Aug 2016, 9:53:26 UTC I think it would only be fair to link back to Jeff Buck's message 1810642, which sparked the whole idea off. ID: 1811307 ·

Kiska Volunteer tester Send message Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0	Message 1811335 - Posted: 22 Aug 2016, 12:55:13 UTC I have harvested some workunits that have _2 or more and have them in a google drive folder. Folder is set to read only, but I can adjust that if needed. Included is the data file and the results file. Drive ID: 1811335 ·
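
Picking out tasks "that have _2 or more" comes down to reading the numeric suffix on the task name, given that the initial split is _0 and _1 as noted at the top of the thread. A minimal sketch in Python, assuming task names are the workunit name followed by an underscore and an integer; the helper names are hypothetical, and the example appends an illustrative suffix to a workunit name quoted later in the thread:

```python
import re

# Task name = workunit name + "_" + replication index (assumed layout).
SUFFIX = re.compile(r"^(?P<wu>.+)_(?P<n>\d+)$")

def split_task_name(task_name):
    """Return (workunit_name, replication_index) for a task name."""
    m = SUFFIX.match(task_name)
    if m is None:
        raise ValueError(f"no numeric suffix in {task_name!r}")
    return m.group("wu"), int(m.group("n"))

def is_resend(task_name, initial_replication=2):
    """True if the task is beyond the initial split (_0 and _1)."""
    return split_task_name(task_name)[1] >= initial_replication

# Illustrative task name; the "_2" suffix is added for the example.
example = "blc5_2bit_guppi_57451_69387_HIP117559_0023.13978.0.18.27.23.vlar_2"
wu, n = split_task_name(example)
```

A resend only shows that the workunit needed more than two results; the reason (inconclusive, timeout, error) still has to be checked on the workunit page.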

Kiska Volunteer tester Send message Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0	Message 1811336 - Posted: 22 Aug 2016, 13:03:23 UTC Last modified: 22 Aug 2016, 13:17:33 UTC Workunit analysis via stderr.
Pulse: peak=1.450148, time=45.82, period=2.1, d_freq=1209298291.57, score=1.002, chirp=-48.902, fft_len=256
Pulse: peak=6.229137, time=45.99, period=16.37, d_freq=1209299147.79, score=1.012, chirp=-54.71, fft_len=4k
These pulses appear on the AVX CPU build but not on the GPU.
Pulse: peak=1.284491, time=45.82, period=1.556, d_freq=1209295817.29, score=1.005, chirp=-56.07, fft_len=256
Pulse: peak=7.644391, time=45.9, period=22.43, d_freq=1209299911.74, score=1.003, chirp=60.614, fft_len=2k
Pulse: peak=5.469792, time=45.9, period=13.96, d_freq=1209300442.47, score=1.003, chirp=95.306, fft_len=2k
These pulses appear on the GPU but not on the AVX CPU.
5 differing pulses on both sides not recorded. I am only analysing by taking the frequency of each pulse/triplet.
Phenom running SSE3: Spike count: 0 Autocorr count: 0 Pulse count: 24 Triplet count: 1 Gaussian count: 0
Mac OpenCL Nvidia: Spike count: 0 Autocorr count: 0 Pulse count: 25 Triplet count: 1 Gaussian count: 0
Petri special code: Spike count: 0 Autocorr count: 0 Pulse count: 25 Triplet count: 1 Gaussian count: 0
AVX optimised app: Spike count: 0 Autocorr count: 0 Pulse count: 24 Triplet count: 1 Gaussian count: 0
Looks like the CPUs have one less pulse than the GPUs. All CPUs agree with 0/0/24/1/0, and all GPUs agree with 0/0/25/1/0. Hmmm..... interesting ID: 1811336 ·
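
The comparison described in this post, matching pulses between two hosts by frequency alone, can be sketched in a few lines of Python. This assumes each signal sits on its own stderr line in the "Pulse: key=value, ..." form quoted above. The function names and the 0.01 Hz tolerance are illustrative assumptions, not the project validator's rules:

```python
import re

# Captures key=value pairs such as "d_freq=1209298291.57" or "fft_len=4k".
PAIR = re.compile(r"(\w+)=([^,\s]+)")

def parse_signals(stderr_text, kind="Pulse"):
    """Return one dict of string fields per '<kind>: ...' line."""
    signals = []
    for line in stderr_text.splitlines():
        line = line.strip()
        if line.startswith(kind + ":"):
            signals.append(dict(PAIR.findall(line)))
    return signals

def unmatched(a, b, tol_hz=0.01):
    """Signals in a with no signal in b within tol_hz of their d_freq."""
    freqs_b = [float(s["d_freq"]) for s in b]
    return [s for s in a
            if not any(abs(float(s["d_freq"]) - f) <= tol_hz for f in freqs_b)]

# Two of the lines quoted in the post above, one from each side.
cpu_text = ("Pulse: peak=1.450148, time=45.82, period=2.1, "
            "d_freq=1209298291.57, score=1.002, chirp=-48.902, fft_len=256")
gpu_text = ("Pulse: peak=1.284491, time=45.82, period=1.556, "
            "d_freq=1209295817.29, score=1.005, chirp=-56.07, fft_len=256")
cpu_only = unmatched(parse_signals(cpu_text), parse_signals(gpu_text))
gpu_only = unmatched(parse_signals(gpu_text), parse_signals(cpu_text))
```

Matching on d_freq alone is the same simplification the post makes; two signals at the same frequency could still differ in peak, period or chirp, so an empty result here does not prove the hosts agree.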

TBar Volunteer tester Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768	Message 1811340 - Posted: 22 Aug 2016, 13:34:01 UTC Here's an interesting one; blc5_2bit_guppi_57451_69387_HIP117559_0023.13978.0.18.27.23.vlar All three Hosts report; Spike count: 3 Autocorr count: 1 Pulse count: 22 Triplet count: 4 Gaussian count: 0 All three Hosts are Inconclusive. ID: 1811340 ·
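
This case shows that identical summary counts do not guarantee validation: the counts are a quick first check, and the individual signals still have to be compared. A minimal Python sketch of that first check, assuming the "Spike count: N Autocorr count: N ..." summary appears in each host's stderr as quoted above; the function names and host labels are hypothetical:

```python
import re

COUNT = re.compile(r"(Spike|Autocorr|Pulse|Triplet|Gaussian) count:\s*(\d+)")

def signal_counts(stderr_text):
    """Return e.g. {'Spike': 3, 'Autocorr': 1, ...} from a stderr summary."""
    return {name: int(n) for name, n in COUNT.findall(stderr_text)}

def counts_agree(hosts):
    """True if every host reports the same count for every signal type."""
    dicts = list(hosts.values())
    if not dicts:
        return True
    return all(d == dicts[0] for d in dicts[1:])

# Summary quoted in the post above; all three hosts reported the same line.
summary = ("Spike count: 3 Autocorr count: 1 Pulse count: 22 "
           "Triplet count: 4 Gaussian count: 0")
hosts = {label: signal_counts(summary)
         for label in ("host_1", "host_2", "host_3")}
agree = counts_agree(hosts)
```

Here `agree` is true even though all three results were marked Inconclusive, so the disagreement must lie in the parameters of individual signals rather than in how many were reported.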

Kiska Volunteer tester Send message Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0	Message 1811347 - Posted: 22 Aug 2016, 13:47:52 UTC - in response to Message 1811340. Analysis of workunit via stderr
Pulse: peak=3.748531, time=45.99, period=8.545, d_freq=1163343856.57, score=1.051, chirp=4.6176, fft_len=4k
Pulse: peak=5.39325, time=45.99, period=14.5, d_freq=1163343830.48, score=1.013, chirp=5.3259, fft_len=4k
Pulse: peak=5.36132, time=45.99, period=14.05, d_freq=1163343598.39, score=1.008, chirp=11.761, fft_len=4k
ATI had these pulses but the Intel one doesn't
Pulse: peak=5.636717, time=45.86, period=13, d_freq=1163348386.64, score=1.012, chirp=83.249, fft_len=1024
Pulse: peak=7.938285, time=45.86, period=22.85, d_freq=1163353871.61, score=1.016, chirp=-83.742, fft_len=1024
Pulse: peak=2.082857, time=45.84, period=3.294, d_freq=1163350491.92, score=1.036, chirp=84.358, fft_len=512
Intel had these pulses but ATI doesn't
CPU cannot be analysed because it doesn't print out where it found results ID: 1811347 ·

Kiska Volunteer tester Send message Joined: 31 Mar 12 Posts: 302 Credit: 3,067,762 RAC: 0	Message 1811354 - Posted: 22 Aug 2016, 14:08:06 UTC Last modified: 22 Aug 2016, 14:08:20 UTC Hey look, another one where 3 PCs do not agree with each other. There is unfortunately no stderr output for where it found its results. Workunit
Cuda42 says: Spike count: 30 Autocorr count: 0 Pulse count: 0 Triplet count: 0 Gaussian count: 0
Intel Xeon E3-1230 v3 says: Spike count: 0 Autocorr count: 1 Pulse count: 5 Triplet count: 4 Gaussian count: 0
Unknown Nvidia GPU says: Spike count: 0 Autocorr count: 1 Pulse count: 3 Triplet count: 4 Gaussian count: 0
So many differences....... ID: 1811354 ·
