Setting up Linux to crunch CUDA90 and above for Windows users
Author | Message |
---|---|
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Whatever. The simple fact is that after a reboot on this machine, https://setiathome.berkeley.edu/show_host_detail.php?hostid=8097309, usually two NV GPUs will miss all pulses until you move the monitor cable from the Intel GPU to the offending GPUs long enough to display a desktop, and then return the cable to the Intel GPU. After that, the machine will run for months without missing any pulses on the three NV GPUs. The same thing happens in Linux on the first task after a reboot. After the first task the Linux system 'self-corrects' and doesn't require moving the cables, whereas on a Mac it will continue until you move the cables. This started with the 0.9x series; the older zi3v app doesn't have this problem when using the same drivers. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Somebody left out a synchronisation call? What's the diff between the last zi3v and the first 0.9x? |
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Maybe it's a problem with using the iGPU? have you tried just driving the monitor off the nvidia card all the time? or maybe just a MacOS problem? I've never seen my systems cause any missed pulses tied to a system reboot. but i also run a high checkpoint value *shrug* but about the checkpointing, i'll echo my comments from another thread "I really don't see the need to mess around with this anyway, not sure why it's such a hot topic now. the checkpoint "issue" isn't a big issue. if you reboot your system, you might have a couple tasks that didn't restart properly and give a bad result, the server handles this like it does all inconclusive results. it's also easily mitigated by changing the checkpoint settings in the boinc manager compute settings. just set the checkpoint longer than any task would run and it will never checkpoint." Maybe Richard can dig up some insight to the following: I set my systems to 600, which would effectively stop any checkpointing. i played around with super huge numbers and it just defaulted back to '0'. not sure if '0' here means never checkpoint, or always checkpoint. Seti@Home classic workunits: 29,492 CPU time: 134,419 hours |
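For anyone who prefers a file to the Manager dialog, here is a minimal sketch of the "set the checkpoint longer than any task would run" idea. It assumes the Manager's checkpoint setting is stored as the `disk_interval` element of `global_prefs_override.xml` in the BOINC data directory; the helper name `checkpoint_override_xml` is made up for illustration. It only builds the XML text and does not touch any file, because writing it out as-is would replace any other overrides already in place. It says nothing about what a value of '0' does.

```python
import xml.etree.ElementTree as ET


def checkpoint_override_xml(seconds):
    """Build override XML asking tasks to checkpoint at most every `seconds`.

    Assumption: `disk_interval` is the stored form of the Manager's
    checkpoint interval setting.
    """
    root = ET.Element("global_preferences")
    ET.SubElement(root, "disk_interval").text = f"{seconds:.6f}"
    return ET.tostring(root, encoding="unicode")


# 600 seconds is the value quoted in the post above.
override = checkpoint_override_xml(600)
print(override)
```

The client would still need to re-read its local preferences (or be restarted) before a change like this could take effect.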
Buckeye4LF Send message Joined: 19 Jun 00 Posts: 173 Credit: 54,916,209 RAC: 833 |
I did a fresh install of Linux, as was suggested from the beginning of my problems, but went with Mint, and it is already working better with my setup. I am not sure whether it was the Threadripper or the dual cards causing the problem in Ubuntu. I am maxed out now and my tower might as well be a space heater :) |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
Fair comment, and the loss of a couple of tasks on a monthly reboot is trivial at the speed of the 'special sauce' app. But that applies only if you use GPUs for SETI, and nothing else for anything else. I run a different project on the CPUs in my 'special sauce' crunchers. Why should they have to sacrifice restart time to the Great God SETI? Again, as I said in the original post, preventing the problem at source (assuming you can't write a proper checkpointing routine) will be more robust, easier for users, and save some time in the app. And it'll save (a little) server wear and tear processing the resends sent to computers which could be processing first-run work. It just feels like a more elegant solution, all round. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
I still think it's a timing problem, with the code not waiting for a result to be returned before using it.very possible. some kind of idiosyncrasy with timing on the MacOS version of the app. https://setiathome.berkeley.edu/forum_thread.php?id=81271&postid=2010859#2010859 Message 2010859 - Posted: 6 Sep 2019, 3:14:34 UTC Last modified: 6 Sep 2019, 3:22:44 UTC I'm going to Post this just to have a Record of the App Missing ALL PULSES after a reboot...in Linux. This is the same problem that exists with the Mac version, except, the Linux version only misses All Pulses on the first task after a reboot. After the first task the Linux version then finds the Pulses on the following tasks, on the Mac you have to cycle the monitor cable to have the App find Pulses after a reboot and then not have the monitor change states. Validate state: Invalid https://setiathome.berkeley.edu/result.php?resultid=8018382337 <core_client_version>7.14.2</core_client_version> <![CDATA[ <stderr_txt> setiathome_CUDA: Found 4 CUDA device(s): Device 1: GeForce GTX 980, 4043 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 1, pciSlotID = 0 Device 2: GeForce GTX 980, 4040 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 2, pciSlotID = 0 Device 3: GeForce GTX 980, 4043 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 3, pciSlotID = 0 Device 4: GeForce GTX 980, 4043 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 4, pciSlotID = 0 In cudaAcc_initializeDevice(): Boinc passed DevPref 2 setiathome_CUDA: CUDA Device 2 specified, checking... Device 2: GeForce GTX 980 is okay SETI@home using CUDA accelerated device GeForce GTX 980 Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... 
WU true angle range is : 0.023625 Sigma 107 Sigma > GaussTOffsetStop: 107 > -43 Thread call stack limit is: 1k Pulse: peak=10.10095, time=45.9, period=24.25, d_freq=2258849760.33, score=1.047, chirp=0.32874, fft_len=2k Pulse: peak=10.11994, time=45.9, period=24.25, d_freq=2258849762.2, score=1.049, chirp=0.36936, fft_len=2k Pulse: peak=7.70986, time=45.9, period=21.53, d_freq=2258849750.39, score=1.013, chirp=-0.86184, fft_len=2k Pulse: peak=3.249045, time=45.86, period=6.163, d_freq=2258845024.72, score=1.002, chirp=-1.3124, fft_len=1024 Pulse: peak=11.1925, time=45.9, period=23.71, d_freq=2258849742.43, score=1.161, chirp=-1.8874, fft_len=2k Pulse: peak=9.696304, time=45.9, period=26.49, d_freq=2258849778.12, score=1.002, chirp=2.4205, fft_len=2k Pulse: peak=4.606018, time=45.9, period=9.336, d_freq=2258849780.11, score=1.065, chirp=2.5855, fft_len=2k Pulse: peak=6.338041, time=45.9, period=17.58, d_freq=2258849799.88, score=1.001, chirp=4.9641, fft_len=2k Pulse: peak=6.376744, time=45.9, period=17.22, d_freq=2258849823.7, score=1.008, chirp=7.7959, fft_len=2k Spike: peak=24.16184, time=40.09, d_freq=2258843704.6, chirp=7.815, fft_len=128k Spike: peak=24.25899, time=40.09, d_freq=2258843704.6, chirp=7.8238, fft_len=128k Pulse: peak=4.411401, time=45.9, period=9.723, d_freq=2258849667.12, score=1.019, chirp=-10.832, fft_len=2k Pulse: peak=4.688462, time=45.82, period=9.485, d_freq=2258854180.64, score=1.011, chirp=14.114, fft_len=256 Autocorr: peak=19.04472, time=74.45, delay=5.5501, d_freq=2258847490.31, chirp=-14.527, fft_len=128k Autocorr: peak=18.22151, time=74.45, delay=5.5501, d_freq=2258847490.12, chirp=-14.529, fft_len=128k REBOOT Here, you see the App was finding Pulses up until now setiathome_CUDA: Found 4 CUDA device(s): Device 1: GeForce GTX 980, 4043 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 1, pciSlotID = 0 Device 2: GeForce GTX 980, 4040 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 2, pciSlotID = 0 Device 
3: GeForce GTX 980, 4043 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 3, pciSlotID = 0 Device 4: GeForce GTX 980, 4043 MiB, regsPerBlock 65536 computeCap 5.2, multiProcs 16 pciBusID = 4, pciSlotID = 0 In cudaAcc_initializeDevice(): Boinc passed DevPref 4 setiathome_CUDA: CUDA Device 4 specified, checking... Device 4: GeForce GTX 980 is okay SETI@home using CUDA accelerated device GeForce GTX 980 Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 9.00 special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.023625 Sigma 107 Sigma > GaussTOffsetStop: 107 > -43 Thread call stack limit is: 1k Spike: peak=24.16184, time=40.09, d_freq=2258843704.6, chirp=7.815, fft_len=128k Spike: peak=24.25899, time=40.09, d_freq=2258843704.6, chirp=7.8238, fft_len=128k Autocorr: peak=19.04472, time=74.45, delay=5.5501, d_freq=2258847490.31, chirp=-14.527, fft_len=128k Autocorr: peak=18.22151, time=74.45, delay=5.5501, d_freq=2258847490.12, chirp=-14.529, fft_len=128k Triplet: peak=11.6631, time=17.81, period=10.18, d_freq=2258852321.28, chirp=28.558, fft_len=512 Triplet: peak=11.28005, time=17.81, period=10.18, d_freq=2258852324.2, chirp=28.721, fft_len=512 Triplet: peak=11.53578, time=67.51, period=10.83, d_freq=2258845420.53, chirp=37.748, fft_len=1024 Best spike: peak=24.25899, time=40.09, d_freq=2258843704.6, chirp=7.8238, fft_len=128k Best autocorr: peak=19.04472, time=74.45, delay=5.5501, d_freq=2258847490.31, chirp=-14.527, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=11.6631, time=17.81, period=10.18, d_freq=2258852321.28, chirp=28.558, fft_len=512 Spike count: 2 
Autocorr count: 2 Pulse count: 0 Triplet count: 3 Gaussian count: 0 14:38:27 (1945): called boinc_finish(0) </stderr_txt> The Correct result; Best pulse: peak=11.19249, time=45.9, period=23.71, d_freq=2258849742.43, score=1.161, chirp=-1.8874, fft_len=2k Spike count: 2 Autocorr count: 2 Pulse count: 18 Triplet count: 3 Gaussian count: 0 |
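The comparison in the post above (invalid result versus the correct one) can be done mechanically. This is a rough sketch, not part of the app or of BOINC: it pulls the "... count: N" summary values out of a stderr dump and reports which ones differ from a reference result. The function names are hypothetical; the two sample strings are the summary lines quoted in the post.

```python
import re

# Matches summary entries such as "Pulse count: 18" in the app's stderr output.
COUNT_RE = re.compile(r"(Spike|Autocorr|Pulse|Triplet|Gaussian) count:\s*(\d+)")


def summary_counts(stderr_text):
    """Return the last reported count for each signal type."""
    counts = {}
    for kind, value in COUNT_RE.findall(stderr_text):
        counts[kind] = int(value)
    return counts


def count_mismatches(suspect, reference):
    """Return {kind: (suspect_count, reference_count)} for counts that differ."""
    return {
        kind: (suspect.get(kind), ref)
        for kind, ref in reference.items()
        if suspect.get(kind) != ref
    }


invalid = summary_counts(
    "Spike count: 2 Autocorr count: 2 Pulse count: 0 Triplet count: 3 Gaussian count: 0"
)
correct = summary_counts(
    "Spike count: 2 Autocorr count: 2 Pulse count: 18 Triplet count: 3 Gaussian count: 0"
)
mismatches = count_mismatches(invalid, correct)
print(mismatches)  # {'Pulse': (0, 18)}
```

Only the pulse count differs here, which matches the description: spikes, autocorrelations and triplets were reported identically and only the pulses were lost after the reboot.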
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
What were the initial conditions for the re-run? Was there a checkpoint restart file (state.sah) present? |
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
I was never able to replicate that on my systems. *shrug* |
Ian&Steve C. Send message Joined: 28 Sep 99 Posts: 4267 Credit: 1,282,604,591 RAC: 6,640 |
Richard, do you know what effect a value of '0' has on the checkpoint setting? default is 60 seconds. but zero appears to be a valid entry. would zero mean always or never? i don't see anything about this in the wiki pages, just the same description about what the setting is for. |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
No, I don't - I'll look in the morning. I did once do some extensive running at an interval of 1 second, so I could capture the declared 'progress age%' and point it out to Raistmer. There was a problem with one of his apps recording progress in two separate ways, with different values. We got it fixed in the end. Edit - come to think of it, that could be related to a small gripe I have with the 'special sauce' app. When running mid-AR tasks (neither VHAR nor VLAR), it quits at just over 56% progress, never registering that it has processed everything it's due to process. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"I was never able to replicate that on my systems. *shrug*" I can pop one off in a heartbeat. All I have to do is restart the 9 GPU GDDR5 machine and usually at least one GPU will Miss All Pulses on the first task. *shrug* |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"What were the initial conditions for the re-run? Was there a checkpoint restart file (state.sah) present?" This problem has nothing to do with checkpoints. It's simply a problem of starting BOINC after the machine has been restarted. If you don't restart the machine, you don't have the problem. Oh, I'd also suggest using a machine where the GPUs use GDDR5 RAM.... |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"The problem with missed pulses after boot has been narrowed down to the type of memory on the card, seems it disappears when you go from a 1070 to a 1080Ti. Unfortunately, most here don't have 1080s, and everything lower has the problem. At least time should fix that one, when people stop using GDDR5 GPUs." . . Sadly that will be a while I suspect, there are a lot of good cards out there using GDDR5 ram. Like my 1060s which will hopefully be around a while yet ... . . Maybe when RTX cards are a little more attractive in price :) Stephen < shrug > |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
@ Richard We have thought of a way to remove the checkpoint; the question now is how we can test whether it works. For now the modification was made to the 10.2 mutex builds (the ones I use), but if it works it will be easy to replicate in the other versions. |
Tom M Send message Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
"It just feels like a more elegant solution, all round." Often an elegant solution turns out to also be more efficient. Not always but often enough that it really is a good rule of thumb. A proud member of the OFA (Old Farts Association). |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
First task after a restart on this machine; https://setiathome.berkeley.edu/results.php?hostid=6813106&offset=4060 They will show up when the Replica catches up. SETI@home using CUDA accelerated device GeForce GTX 1070 Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.423278 Sigma 3 Thread call stack limit is: 1k Autocorr: peak=18.29169, time=33.55, delay=3.7766, d_freq=1418750195.17, chirp=5.8164, fft_len=128k Autocorr: peak=20.05877, time=33.55, delay=3.7766, d_freq=1418750195.38, chirp=5.8228, fft_len=128k Triplet: peak=10.797, time=54.81, period=2.672, d_freq=1418749256.09, chirp=-8.005, fft_len=32 Triplet: peak=11.36246, time=55.58, period=2.286, d_freq=1418754024.39, chirp=12.008, fft_len=32 Triplet: peak=9.814651, time=14.04, period=0.1081, d_freq=1418750496.05, chirp=-19.012, fft_len=64 Triplet: peak=11.09979, time=54.81, period=2.672, d_freq=1418749294.14, chirp=-24.015, fft_len=32 Spike: peak=24.20125, time=33.55, d_freq=1418749809.13, chirp=-29.352, fft_len=128k Triplet: peak=11.25733, time=54.81, period=2.672, d_freq=1418749270.25, chirp=-30.019, fft_len=32 Best spike: peak=24.20125, time=33.55, d_freq=1418749809.13, chirp=-29.352, fft_len=128k Best autocorr: peak=20.05877, time=33.55, delay=3.7766, d_freq=1418750195.38, chirp=5.8228, fft_len=128k Best gaussian: peak=4.216119, mean=0.5433921, ChiSq=1.271613, time=69.63, d_freq=1418751609.9, score=1.012289, null_hyp=2.226028, chirp=-56.946, fft_len=16k Best pulse: peak=0, time=-2.123e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=11.36246, time=55.58, period=2.286, d_freq=1418754024.39, chirp=12.008, fft_len=32 Spike count: 1 Autocorr count: 2 Pulse count: 0 Triplet count: 5 Gaussian count: 0 18:51:35 
(2724): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>29jn14ab.26608.310113.16.43.128</wu_name> Device 3: GeForce GTX 1070 is okay SETI@home using CUDA accelerated device GeForce GTX 1070 Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.008741 Sigma 155 Sigma > GaussTOffsetStop: 155 > -91 Thread call stack limit is: 1k Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Spike: peak=24.56085, time=100.7, d_freq=1421220730.71, chirp=-4.6583, fft_len=128k Spike: peak=24.51882, time=100.7, d_freq=1421220730.71, chirp=-4.662, fft_len=128k Spike: peak=24.01455, time=20.13, d_freq=1421216589.75, chirp=8.5161, fft_len=128k Spike: peak=24.16371, time=20.13, d_freq=1421216589.75, chirp=8.5198, fft_len=128k Triplet: peak=11.93469, time=98.4, period=5.397, d_freq=1421220767.86, chirp=-11.747, fft_len=64 Triplet: peak=12.0822, time=98.4, period=5.397, d_freq=1421220757.72, chirp=-14.952, fft_len=64 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Triplet: peak=11.17857, time=98.4, period=5.397, d_freq=1421220747.66, chirp=-18.155, fft_len=64 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... 
err = 1 Triplet: peak=10.08963, time=64.12, period=30.54, d_freq=1421221790.23, chirp=-37.779, fft_len=512 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Triplet: peak=11.34613, time=28.4, period=22.48, d_freq=1421225446.21, chirp=-74.758, fft_len=64 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Best spike: peak=24.56085, time=100.7, d_freq=1421220730.71, chirp=-4.6583, fft_len=128k Best autocorr: peak=16.94834, time=20.13, delay=3.4291, d_freq=1421220887.21, chirp=9.1437, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.122e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=0, time=-2.122e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=12.0822, time=98.4, period=5.397, d_freq=1421220757.72, chirp=-14.952, fft_len=64 Spike count: 4 Autocorr count: 0 Pulse count: 0 Triplet count: 5 Gaussian count: 0 18:51:15 (2725): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>22oc12ab.27414.885.7.34.125.vlar</wu_name> Device 4: GeForce GTX 1070 is okay SETI@home using CUDA accelerated device GeForce GTX 1070 Unroll autotune 1. Overriding Pulse find periods per launch. 
Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.059568 Sigma 12 Thread call stack limit is: 1k Spike: peak=24.61784, time=74.45, d_freq=7813283963.74, chirp=8.0206, fft_len=128k Spike: peak=24.94963, time=74.45, d_freq=7813283963.75, chirp=8.0218, fft_len=128k Spike: peak=24.12242, time=74.45, d_freq=7813283963.76, chirp=8.0231, fft_len=128k Spike: peak=24.66986, time=74.45, d_freq=7813283963.73, chirp=8.0345, fft_len=128k Spike: peak=25.24916, time=74.45, d_freq=7813283963.74, chirp=8.0358, fft_len=128k Spike: peak=24.6944, time=74.45, d_freq=7813283963.75, chirp=8.0371, fft_len=128k Autocorr: peak=17.99974, time=28.63, delay=4.6295, d_freq=7813281343.32, chirp=-23.369, fft_len=128k Spike: peak=24.37449, time=51.54, d_freq=7813284140.3, chirp=28.145, fft_len=128k Triplet: peak=10.35954, time=39.73, period=10.25, d_freq=7813283025.56, chirp=-33.292, fft_len=1024 Best spike: peak=25.24916, time=74.45, d_freq=7813283963.74, chirp=8.0358, fft_len=128k Best autocorr: peak=17.99974, time=28.63, delay=4.6295, d_freq=7813281343.32, chirp=-23.369, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=10.35954, time=39.73, period=10.25, d_freq=7813283025.56, chirp=-33.292, fft_len=1024 Spike count: 7 Autocorr count: 1 Pulse count: 0 Triplet count: 1 Gaussian count: 0 18:51:04 (2726): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>blc66_2bit_guppi_58838_01073_TIC434234955_0016.13680.0.20.29.239.vlar</wu_name> Device 6: GeForce GTX 1070 is okay SETI@home using CUDA accelerated device GeForce GTX 1070 Unroll autotune 1. Overriding Pulse find periods per launch. 
Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.011931 Sigma 95 Sigma > GaussTOffsetStop: 95 > -31 Thread call stack limit is: 1k Spike: peak=24.13465, time=83.04, d_freq=3958481533.46, chirp=-7.3263, fft_len=64k Spike: peak=25.9573, time=83.04, d_freq=3958481533.45, chirp=-7.3517, fft_len=64k Spike: peak=25.269, time=83.04, d_freq=3958481533.48, chirp=-7.3618, fft_len=64k Spike: peak=24.95727, time=83.04, d_freq=3958481533.44, chirp=-7.377, fft_len=64k Spike: peak=26.07322, time=83.04, d_freq=3958481533.47, chirp=-7.3872, fft_len=64k Spike: peak=24.08016, time=83.04, d_freq=3958481533.45, chirp=-7.4126, fft_len=64k Spike: peak=24.05482, time=85.9, d_freq=3958472594.67, chirp=9.2734, fft_len=128k Triplet: peak=11.56924, time=33.89, period=13.65, d_freq=3958477227.9, chirp=-16.942, fft_len=512 Best spike: peak=26.07322, time=83.04, d_freq=3958481533.47, chirp=-7.3872, fft_len=64k Best autocorr: peak=16.06669, time=17.18, delay=4.9547, d_freq=3958476873.6, chirp=-8.508, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=11.56924, time=33.89, period=13.65, d_freq=3958477227.9, chirp=-16.942, fft_len=512 Spike count: 7 Autocorr count: 0 Pulse count: 0 Triplet count: 1 Gaussian count: 0 18:50:59 (2728): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>blc75_2bit_guppi_58693_09523_HIP98825_0145.13617.409.22.45.41.vlar</wu_name> Device 7: GeForce GTX 1070 is okay SETI@home using CUDA accelerated device GeForce GTX 1070 Unroll autotune 1. Overriding Pulse find periods per launch. 
Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.423278 Sigma 3 Thread call stack limit is: 1k Spike: peak=24.32879, time=73.82, d_freq=1419065508.6, chirp=17.083, fft_len=128k Spike: peak=24.48889, time=73.82, d_freq=1419065508.6, chirp=17.084, fft_len=128k Spike: peak=25.03698, time=100.7, d_freq=1419065599.2, chirp=19.726, fft_len=128k Spike: peak=25.83367, time=100.7, d_freq=1419065599.2, chirp=19.729, fft_len=128k Spike: peak=25.09234, time=100.7, d_freq=1419065599.2, chirp=19.733, fft_len=128k Triplet: peak=11.44017, time=85.58, period=2.123, d_freq=1419064050.91, chirp=33.279, fft_len=256 Triplet: peak=11.74501, time=85.58, period=2.123, d_freq=1419064055.55, chirp=33.779, fft_len=256 Triplet: peak=10.14546, time=85.58, period=2.123, d_freq=1419064048.21, chirp=35.03, fft_len=256 Triplet: peak=10.67206, time=85.58, period=2.123, d_freq=1419064052.85, chirp=35.53, fft_len=256 Best spike: peak=25.83367, time=100.7, d_freq=1419065599.2, chirp=19.729, fft_len=128k Best autocorr: peak=17.70459, time=60.4, delay=6.45, d_freq=1419061255.92, chirp=-20.598, fft_len=128k Best gaussian: peak=3.76466, mean=0.5651146, ChiSq=1.324526, time=101.5, d_freq=1419063510.21, score=-1.005524, null_hyp=2.145819, chirp=-54.777, fft_len=16k Best pulse: peak=0, time=-2.123e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=11.74501, time=85.58, period=2.123, d_freq=1419064055.55, chirp=33.779, fft_len=256 Spike count: 5 Autocorr count: 0 Pulse count: 0 Triplet count: 4 Gaussian count: 0 18:51:34 (2729): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>29jn14ab.26608.310113.16.43.160</wu_name> Device 8: GeForce GTX 1070 is okay SETI@home using CUDA accelerated device GeForce GTX 1070 Unroll autotune 1. 
Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.423278 Sigma 3 Thread call stack limit is: 1k Autocorr: peak=17.87708, time=6.711, delay=4.2779, d_freq=1421044980.86, chirp=8.7897, fft_len=128k Autocorr: peak=19.29076, time=6.711, delay=4.2779, d_freq=1421044980.87, chirp=8.7906, fft_len=128k Autocorr: peak=19.60522, time=6.711, delay=4.2779, d_freq=1421044980.87, chirp=8.7916, fft_len=128k Autocorr: peak=18.69166, time=6.711, delay=4.2779, d_freq=1421044980.88, chirp=8.7925, fft_len=128k Triplet: peak=10.96123, time=36.7, period=1.07, d_freq=1421042660.39, chirp=-20.045, fft_len=32 Triplet: peak=10.88522, time=36.7, period=1.07, d_freq=1421042660.39, chirp=-20.045, fft_len=32 Spike: peak=24.11472, time=23.49, d_freq=1421043362.72, chirp=-20.678, fft_len=64k Triplet: peak=11.05557, time=36.7, period=1.07, d_freq=1421042671.34, chirp=-28.063, fft_len=32 Triplet: peak=10.96822, time=36.7, period=1.07, d_freq=1421042671.34, chirp=-28.063, fft_len=32 Triplet: peak=10.82123, time=36.7, period=1.07, d_freq=1421042682.28, chirp=-36.081, fft_len=32 Triplet: peak=10.80066, time=36.7, period=1.07, d_freq=1421042682.28, chirp=-36.081, fft_len=32 Triplet: peak=10.30143, time=36.7, period=1.07, d_freq=1421042693.22, chirp=-44.099, fft_len=32 Triplet: peak=10.3704, time=36.7, period=1.07, d_freq=1421042693.22, chirp=-44.099, fft_len=32 Triplet: peak=9.604724, time=36.7, period=1.07, d_freq=1421042704.16, chirp=-52.117, fft_len=32 Triplet: peak=9.705168, time=36.7, period=1.07, d_freq=1421042704.16, chirp=-52.117, fft_len=32 Spike: peak=24.01481, time=32.72, d_freq=1421041611.9, chirp=-73.846, fft_len=16k Spike: peak=24.51206, time=32.72, d_freq=1421041611.92, chirp=-73.882, fft_len=16k Spike: peak=24.33436, time=32.72, 
d_freq=1421041611.88, chirp=-73.901, fft_len=16k Spike: peak=24.67686, time=32.72, d_freq=1421041611.96, chirp=-73.917, fft_len=16k Spike: peak=25.09182, time=32.72, d_freq=1421041611.92, chirp=-73.936, fft_len=16k Spike: peak=24.07961, time=32.72, d_freq=1421041612, chirp=-73.952, fft_len=16k Spike: peak=25.01396, time=32.72, d_freq=1421041611.88, chirp=-73.956, fft_len=16k Spike: peak=25.06525, time=32.72, d_freq=1421041611.96, chirp=-73.971, fft_len=16k Spike: peak=24.41496, time=32.72, d_freq=1421041611.84, chirp=-73.975, fft_len=16k Spike: peak=25.60272, time=32.72, d_freq=1421041611.92, chirp=-73.991, fft_len=16k Spike: peak=24.27016, time=32.72, d_freq=1421041612.01, chirp=-74.006, fft_len=16k Spike: peak=25.63157, time=32.72, d_freq=1421041611.88, chirp=-74.01, fft_len=16k Spike: peak=25.38454, time=32.72, d_freq=1421041611.97, chirp=-74.026, fft_len=16k Spike: peak=24.18738, time=32.72, d_freq=1421041611.82, chirp=-74.031, fft_len=16k Spike: peak=26.07127, time=32.72, d_freq=1421041611.9, chirp=-74.046, fft_len=16k SETI@Home Informational message -9 result_overflow NOTE: The number of results detected equals the storage space allocated. 
Best spike: peak=26.07127, time=32.72, d_freq=1421041611.9, chirp=-74.046, fft_len=16k Best autocorr: peak=19.60522, time=6.711, delay=4.2779, d_freq=1421044980.87, chirp=8.7916, fft_len=128k Best gaussian: peak=3.367055, mean=0.5832998, ChiSq=1.380842, time=36.07, d_freq=1421041563.59, score=-3.467275, null_hyp=2.042459, chirp=35.49, fft_len=16k Best pulse: peak=0, time=-2.123e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=11.05557, time=36.7, period=1.07, d_freq=1421042671.34, chirp=-28.063, fft_len=32 Spike count: 16 Autocorr count: 4 Pulse count: 0 Triplet count: 10 Gaussian count: 0 18:51:24 (2730): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>29jn14ab.26608.310113.16.43.107</wu_name> Device 9: GeForce GTX 1070 is okay SETI@home using CUDA accelerated device GeForce GTX 1070 Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... 
WU true angle range is : 0.423278 Sigma 3 Thread call stack limit is: 1k Autocorr: peak=18.26439, time=46.98, delay=5.8973, d_freq=1420721417.92, chirp=-26.361, fft_len=128k Gaussian: peak=4.364394, mean=0.5521422, ChiSq=1.414316, time=67.95, d_freq=1420720131.57, score=2.20463, null_hyp=2.370589, chirp=-86.789, fft_len=16k Best spike: peak=23.67932, time=20.13, d_freq=1420719958.74, chirp=-10.438, fft_len=128k Best autocorr: peak=18.26439, time=46.98, delay=5.8973, d_freq=1420721417.92, chirp=-26.361, fft_len=128k Best gaussian: peak=4.364394, mean=0.5521422, ChiSq=1.414316, time=67.95, d_freq=1420720131.57, score=2.20463, null_hyp=2.370589, chirp=-86.789, fft_len=16k Best pulse: peak=0, time=-2.123e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=0, time=-2.123e+11, period=0, d_freq=0, chirp=0, fft_len=0 Spike count: 0 Autocorr count: 1 Pulse count: 0 Triplet count: 0 Gaussian count: 1 18:51:34 (2731): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>29jn14ab.26608.310113.16.43.74</wu_name> Device 10: GeForce GTX 1060 3GB is okay SETI@home using CUDA accelerated device GeForce GTX 1060 3GB Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.014356 Sigma 50 Sigma > GaussTOffsetStop: 50 > 14 Thread call stack limit is: 1k Triplet: peak=11.80147, time=74.84, period=4.131, d_freq=7864775380.76, chirp=9.05, fft_len=8 Triplet: peak=11.05247, time=57.66, period=14.33, d_freq=7864785494.14, chirp=-17.534, fft_len=128 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... 
err = 1 Triplet: peak=11.73985, time=74.84, period=4.131, d_freq=7864775304.7, chirp=27.149, fft_len=8 Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... err = 1 Triplet: peak=11.00238, time=53.69, period=24.7, d_freq=7864778867.72, chirp=-82.073, fft_len=8k Triplet: peak=12.01505, time=57.65, period=8.937, d_freq=7864785485.41, chirp=-97.564, fft_len=256 Triplet: peak=11.23825, time=57.65, period=8.937, d_freq=7864785481.24, chirp=-98.412, fft_len=256 Best spike: peak=23.86442, time=62.99, d_freq=7864784776.11, chirp=21.208, fft_len=128k Best autocorr: peak=16.8412, time=74.45, delay=2.8409, d_freq=7864779465.29, chirp=-12.898, fft_len=128k Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0 Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 Best triplet: peak=12.01505, time=57.65, period=8.937, d_freq=7864785485.41, chirp=-97.564, fft_len=256 Spike count: 0 Autocorr count: 0 Pulse count: 0 Triplet count: 6 Gaussian count: 0 18:51:20 (2732): called boinc_finish(0) </stderr_txt> ]]> </stderr_out> <wu_name>blc66_2bit_guppi_58838_04586_TIC311183180_0027.14939.0.19.28.131.vlar</wu_name> Device 11: GeForce GTX 1060 3GB is okay SETI@home using CUDA accelerated device GeForce GTX 1060 3GB Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1 setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements. Work Unit Info: ............... WU true angle range is : 0.014356 Sigma 50 Sigma > GaussTOffsetStop: 50 > 14 Thread call stack limit is: 1k Find triplets Cuda kernel encountered too many triplets, or bins above threshold, reprocessing this PoT on CPU... 
err = 1
Triplet: peak=11.21719, time=47.06, period=27.66, d_freq=7864767769.25, chirp=-59.952, fft_len=64
Best spike: peak=23.75414, time=62.99, d_freq=7864767046.04, chirp=13.25, fft_len=128k
Best autocorr: peak=16.83648, time=40.09, delay=3.0515, d_freq=7864769159.72, chirp=4.4476, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0
Best triplet: peak=11.21719, time=47.06, period=27.66, d_freq=7864767769.25, chirp=-59.952, fft_len=64
Spike count: 0
Autocorr count: 0
Pulse count: 0
Triplet count: 1
Gaussian count: 0
18:51:15 (2733): called boinc_finish(0)
</stderr_txt>
]]>
</stderr_out>
<wu_name>blc66_2bit_guppi_58838_04586_TIC311183180_0027.14939.0.19.28.130.vlar</wu_name>
Device 12: GeForce GTX 1060 3GB is okay
SETI@home using CUDA accelerated device GeForce GTX 1060 3GB
Unroll autotune 1. Overriding Pulse find periods per launch. Parameter -pfp set to 1
setiathome v8 enhanced x41p_V0.98b1, Cuda 10.2 Special Modifications done by petri33, compiled by TBar
Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is : 0.011931
Sigma 96
Sigma > GaussTOffsetStop: 96 > -32
Thread call stack limit is: 1k
Spike: peak=24.09869, time=85.9, d_freq=3955480165.28, chirp=-20.353, fft_len=128k
Spike: peak=24.11948, time=5.727, d_freq=3955482387.33, chirp=21.452, fft_len=128k
Spike: peak=24.45933, time=5.727, d_freq=3955482387.34, chirp=21.453, fft_len=128k
Triplet: peak=11.53761, time=61.46, period=13.13, d_freq=3955483935.99, chirp=-30.666, fft_len=16
Triplet: peak=10.1629, time=14.32, period=3.288, d_freq=3955483317.03, chirp=40.569, fft_len=512
Triplet: peak=11.42607, time=25.55, period=7.416, d_freq=3955475392.47, chirp=78.261, fft_len=256
Triplet: peak=10.35243, time=25.55, period=7.416, d_freq=3955475396.73, chirp=80.178, fft_len=256
Best spike: peak=24.45933, time=5.727, d_freq=3955482387.34, chirp=21.453, fft_len=128k
Best autocorr: peak=16.8982, time=74.45, delay=0.65562, d_freq=3955480357.46, chirp=22.691, fft_len=128k
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0, score=-12, null_hyp=0, chirp=0, fft_len=0
Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0
Best triplet: peak=11.53761, time=61.46, period=13.13, d_freq=3955483935.99, chirp=-30.666, fft_len=16
Spike count: 3
Autocorr count: 0
Pulse count: 0
Triplet count: 4
Gaussian count: 0
18:51:14 (2734): called boinc_finish(0)
</stderr_txt>
]]>
</stderr_out>
<wu_name>blc75_2bit_guppi_58693_09523_HIP98825_0145.13640.409.22.45.35.vlar</wu_name>
That was a nasty one. Usually it's only a couple. The way I get around this is to let it run for about 15 seconds then restart BOINC. It doesn't happen after that. Oh.....This is NOT a Mac. |
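For anyone who wants to check their own first-after-reboot results for the same symptom, here is a rough Python sketch. It assumes that a task which missed all pulses shows up as `Best pulse: peak=0` in its stderr, as in the logs posted above; that reading of the logs is an assumption, and the function name `missed_pulses` is made up for illustration.

```python
import re

# Matches the "Best pulse" summary line near the end of a task's stderr.
# The regex is not anchored to line starts, so it also works on flattened logs.
BEST_PULSE = re.compile(r"Best pulse: peak=([-+0-9.eE]+)")


def missed_pulses(stderr_text):
    """Return True if the stderr reports a best pulse with a peak of exactly 0.

    Assumption: a zero peak means the pulse search found nothing at all.
    Returns False when no "Best pulse" line is present.
    """
    match = BEST_PULSE.search(stderr_text)
    return match is not None and float(match.group(1)) == 0.0
```

Paste the stderr of a single task into a string and pass it in; a `True` result marks a task worth comparing against its wingman.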
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Fair comment, and the loss of a couple of tasks on a monthly reboot is trivial at the speed of the 'special sauce' app. . . OOh, aaah! Maybe I have to relinquish part of my user name to Richard ... LOL. Stephen :) |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
The Problem is, Most SETI users Don't run their machines for months on end. Just because You do, doesn't mean others do. This is a problem Every Time you reboot, and was on the List along with the DIFFERENT PROBLEM with Checkpoints, and BAD BEST Pulses. These will need to be solved if the App is ever going to make it to Beta. |
Oddbjornik Send message Joined: 15 May 99 Posts: 220 Credit: 349,610,548 RAC: 1,728 |
"I did a fresh install of Linux, as was suggested from the beginning of my problems, but went with Mint; it is already working better with my setup. I am not sure if it is the Threadripper or the dual cards that were causing the problem in Ubuntu. I am maxed out now and my tower might as well be a space heater."
Your GPU tasks fail, most likely because you need a newer driver for the CUDA 10.2 program to work. |
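A quick way to sanity-check the driver is to compare its version against the minimum that CUDA 10.2 needs. The sketch below uses 440.33 as the Linux minimum, which is the figure from NVIDIA's CUDA 10.2 release notes; the function names are hypothetical and the check says nothing about whether the app itself is healthy.

```python
# Minimum Linux driver for CUDA 10.2, per NVIDIA's toolkit release notes.
MIN_LINUX_DRIVER_CUDA_10_2 = (440, 33)


def parse_driver_version(text):
    """Turn a version string such as '440.33.01' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports_cuda_10_2(version_text):
    """Return True if the driver version is at least 440.33."""
    return parse_driver_version(version_text) >= MIN_LINUX_DRIVER_CUDA_10_2
```

The installed version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and passed in as a string.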
Buckeye4LF Send message Joined: 19 Jun 00 Posts: 173 Credit: 54,916,209 RAC: 833 |
I noticed that, but thanks for checking. I was going to let it run for a day as is to make sure it was not random. It seemed the 440 drivers worked last round. |