Posts by RueiKe

1) Message boards : Number crunching : Main Computer Down...again (Message 1914614)
Posted 5 hours ago by RueiKe (Special Project $250 donor)
Post:
I am happy with the results I am getting from my XSPC Raystorm Pro CPU waterblock.


Forgot to mention that the coverage problem is specific to Threadripper. EK just reused their original design, which doesn't give good coverage of Threadripper's multi-die layout.
2) Message boards : Number crunching : Main Computer Down...again (Message 1914611)
Posted 6 hours ago by RueiKe (Special Project $250 donor)
Post:
I'm 2 days into a 2-week trip to the US and see that my main system is down. I suspect the weather warmed up a bit, the room I moved this machine to got too warm, and the system became unstable. I probably need to upgrade the CPU waterblock, as the one from EK doesn't have good coverage. Also, auto reboot and restart of BOINC would be good. Does anyone have a setup to reboot, auto-login, and restart BOINC in Linux?
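Something along these lines is what I have in mind: a cron @reboot entry for the crunching account that relaunches the client once the machine comes back up (an untested sketch, assuming a standalone client installed in ~/BOINC; auto-login itself would be set in the desktop's display manager, and the BIOS "restore on AC power loss" option covers hard power cuts):

# crontab -e for the crunching user, then add this line (untested sketch, path is an assumption):
@reboot sleep 60 && cd "$HOME/BOINC" && ./boinc --daemon >> startup.log 2>&1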
3) Message boards : Number crunching : A very steep decline in Average Credits!!! (Message 1913454)
Posted 5 days ago by RueiKe (Special Project $250 donor)
Post:
If somebody has some hard data showing just what the impact of rescheduling is on granted credits, or can run some new tests to generate a comparison, I think it would be very useful. When I first experimented with rescheduling in June of 2016, there were some people who said it did affect credit and others who said that was a myth that had already been put to rest long before.

So, just to make sure that my own rescheduling wasn't messing up other people's credit, I did some fairly extensive comparisons. My results were posted in Message 1799300. The conclusion I reached, based on those results, was that rescheduling had "no more impact to the credits than is caused by the random number generator that assigns them in the first place."

Rescheduling at that time simply meant moving Guppi VLARs that were originally assigned to the GPUs over to the CPUs, and moving non-VLAR Arecibo tasks that were originally assigned to the CPUs over to the GPUs. So, yes, tasks were being run on a different device than what they were originally assigned to, which is the issue that is being raised again here.

Now, perhaps things have changed in some way in the last year and a half, such that my previous conclusion is no longer valid. If so, I think new testing and documented results would be needed to demonstrate it.

My results also show that work rescheduled from GPU to CPU gets normal, if not higher, credit. See this example: 6317203128. My observation is that non-rescheduled WUs that ran after the rescheduling event get lower credit. This could simply be because the WUs after the outage are very different from the WUs before it, but I was concerned that something is going on with the credit calculation after rescheduling. Did the rescheduled work somehow change the reference used for credit calculation of new WUs? Can information be extracted for the two WUs I referenced to verify this?
4) Message boards : Number crunching : A very steep decline in Average Credits!!! (Message 1913394)
Posted 5 days ago by RueiKe (Special Project $250 donor)
Post:
No, that is NOT the cause. CreditScrew is the cause. Think about it. You moved tasks assigned originally to the gpus on your system. The scheduler took into account the APR for the gpu application. You moved the gpu tasks temporarily to the cpu for bunkering. The scheduler and server have no knowledge of this action. You then move your gpu tasks temporarily stored in the cpu cache back to the gpu cache, where you process them during the outage.

What has changed? Nothing. You processed the originally assigned gpu tasks on the gpu as intended. You get 50 credits per task. Thank you CreditScrew.

In my case, I am not doing "bunkering". I moved a bunch of WUs to the CPU and left them there. I was not trying to get more tasks, only to keep my CPU fully loaded during the outage. I only run SETI and LHC, and LHC doesn't have GPU tasks, so my plan was to move tasks from GPU to CPU to keep the CPU loaded during the outage and use the GPUs for mining. But if this is messing up credit calculations for work done, then I won't do it.

Consistently low credit is not an issue for me. What I like about LHC is that credit there is also very low, perhaps even harder to earn than at SETI, which makes the competitive computing aspect even more meaningful. I only raised the concern in this thread since some of the observations after rescheduling seemed extreme. Some tasks even came in below 20 credits, so I am still concerned that rescheduling is a factor. In this case there was also a shift in work unit types, so I am still uncertain what happened.
5) Message boards : Number crunching : A very steep decline in Average Credits!!! (Message 1913341)
Posted 6 days ago by RueiKe (Special Project $250 donor)
Post:
If you go through your results, you'll see quite a few like that.
It's all to do with Credit New & the way it determines Credit.

My WAG (Wild Arse Guess): your GPU APR (Average Processing Rate) is only 183.57 GFLOPS, and I suspect the theoretical value for the GPU is much, much higher (check out the first few startup lines in the log to see what the claimed FLOPS for that particular card is). For that WU your device peak FLOPS is 8,192.00 GFLOPS.
Yet your actual processing time is very quick, much faster than your APR would indicate.
Credit New considers your card to be extremely inefficient (big discrepancy between APR & benchmark FLOPS): such a high device peak FLOPS with such a low APR means, in its interpretation, that the card is either really, really inefficient or the numbers have been fudged.
Credit New makes all sorts of assumptions, and if your figures don't meet those assumptions then it considers your figures to be a result of poor efficiency, or cheating, or both.
Final end result: bugger all credit for work done. By design.


I watch my machines quite closely and have not noticed sub-50 credit awards like this in the past. I did try something different in the latest work shortage: I decided to try the rescheduler for the first time. I used it to move several hundred GPU tasks to the CPU. Since I knew the machine would completely run out of work, I decided on a strategy to keep it fully loaded. I moved enough GPU tasks to the CPU to make sure the CPU would stay fully loaded while I slept Sunday evening, then enabled mining on the GPUs. In the morning, I stopped mining and un-suspended the GPUs. Everything looked normal. The only thing that doesn't make sense is that it is not the rescheduled work that is getting the lower credit; it is the work that actually ran afterward on the GPUs. Does anyone think this is the cause?
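Side note on the numbers quoted above: an APR of 183.57 GFLOPS against a device peak of 8,192 GFLOPS implies an efficiency of only 183.57 / 8192 ≈ 2.2%, which is presumably what trips Credit New's assumptions.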
6) Message boards : Number crunching : A very steep decline in Average Credits!!! (Message 1913307)
Posted 6 days ago by RueiKe (Special Project $250 donor)
Post:
This example of credit granted for a normal WU definitely looks off:
2819745779
Only 26 credits for 360 s of GPU work, or 3,935 s of CPU work from the wingman, seems well beyond a Credit New effect. Perhaps there is a problem somewhere?
7) Message boards : Number crunching : ubuntu Install (Message 1912722)
Posted 9 days ago by RueiKe (Special Project $250 donor)
Post:
Hi RueiKe!

Message 1896117 on the previous page in this thread.

Perhaps you meant to say "but I can not get to advanced view" here?

Except for that, becomes the same word twice, but still perhaps not the same.


That was so long ago that I cannot remember the details of what I was doing. But all of the issues I was having were definitely fixed by TBar's latest Linux build of BOINCmgr.
8) Message boards : Number crunching : ubuntu Install (Message 1912721)
Posted 9 days ago by RueiKe (Special Project $250 donor)
Post:
Guess we should be running the SSE41 app on the BLC05 cpu tasks.

It's a wash between the r3345 AVX app and the r3711 SSE41 app. But the r3345 AVX app is 23% faster than the r3712 AVX2 app.


I have confirmed that my actual performance improved when I switched from AVX to SSE42, as my benchmarks indicated. It is probably a good idea to re-validate the optimization choice when WU characteristics change.
9) Message boards : Number crunching : ubuntu Install (Message 1912720)
Posted 9 days ago by RueiKe (Special Project $250 donor)
Post:
From my experience it depends on the tasks and the CPU load.
On my benches AVX2 was slower than SSE4.1 in most cases on my Ryzen 1800X.


I have definitely found that to be the case. My approach now is to free up 1 core on my machine for the benchmark runs and make sure it doesn't stop BOINCmgr from continuing to run tasks. I am still concerned that results may be influenced by what app type is running on the rest of the cores (will testing AVX be influenced by other cores running SSE42?).
10) Message boards : Number crunching : ECC Memory Correction Rate (Message 1912134)
Posted 12 days ago by RueiKe (Special Project $250 donor)
Post:
During the downtime I increased the memory voltage from 1.2V to 1.21V, and after 25 hours I only have 8 CE. I will bump it up another 10 mV next time I reboot.
11) Message boards : Number crunching : ubuntu Install (Message 1912005)
Posted 13 days ago by RueiKe (Special Project $250 donor)
Post:
Since I recently found out about the availability of Linux AVX2 apps, I decided to revisit the work I did in this posting on performance comparisons of the different app versions and testing methods. Here are the new results; the previous results are further down in this thread.


My approach was to use the same single-core Linux VM on my 1950X Win10 desktop that I had used previously. The only difference is that this time the CPU was only 70% loaded. This showed AVX2 was best, but strangely invalidated my previous conclusion that AVX was better than SSE41. I suspect this is the result of running with the system 70% utilized. I got similar results using the new Win10 Linux subsystem. I released AVX2 on Eos and found that processing times increased about 10%. Then I decided to test directly on Eos, my main contributor to SETI. I set CPU usage to 97%, which left only 1 thread idle. I then ran one of the original test WUs and found that the results were not that far off from Nemesis, with the AVX2 app significantly faster than AVX. I then used one of the current WUs (second column of data) and found that AVX2 was worse and SSE42 was best. I released SSE42 on my system and processing times decreased by 10%.

It seems the WUs coming in now are a bit different from what we received previously, so the earlier optimizations no longer apply.
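For anyone who wants to repeat the comparison, the timing loop itself is simple. A rough sketch only: the reference WU name and the SSE41/AVX2 binary names below are placeholders, and it assumes the MB apps can be run standalone, reading work_unit.sah from the current directory the way the old benchmark scripts did.

#!/bin/bash
# Rough sketch; binary and WU names are examples, not the exact files I used.
WU=reference_blc05.wu
for APP in ./MBv8_8.05r3345_avx_linux64 ./MBv8_sse41_example ./MBv8_avx2_example; do
    cp "$WU" work_unit.sah                                # standalone mode input (assumed)
    /usr/bin/time -f "$APP: %e s elapsed" "$APP" > /dev/null
done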
12) Message boards : Number crunching : Postponed: Waiting to acquire lock (Message 1911743)
Posted 14 days ago by RueiKe (Special Project $250 donor)
Post:
Thanks again. Just wanted to be certain. So, MBv8_8.05r3345_avx_linux64 needs to go under the microscope.


One more item to point out. Even though those 3 tasks were still active, I did not observe the "Waiting to acquire lock" error. Actually, I have only observed that error the one time I posted here. I was only raising these observations as being potentially relevant.
13) Message boards : Number crunching : Postponed: Waiting to acquire lock (Message 1911737)
Posted 14 days ago by RueiKe (Special Project $250 donor)
Post:
Thanks - that's very clear about which app to focus on, too.

Just to be absolutely clear, you are aware that BOINC Manager (boincmgr) doesn't need to be running for the BOINC Client (boinc) to do its work? There is an option "Stop running tasks when exiting the BOINC Manager": if that option is unchecked, it will behave - deliberately - as you are describing.

The option is contained in the Exit Confirmation dialog: if that doesn't appear, enable it from the Options --> Other options... menu in BOINC Manager.


Yes, I am aware of that option; I checked it and indicated it should remember my choice, so it should stop all tasks each time I quit. Plus, all but 3 MB processes did exit.
14) Message boards : Number crunching : Postponed: Waiting to acquire lock (Message 1911735)
Posted 14 days ago by RueiKe (Special Project $250 donor)
Post:
Not sure if this observation is relevant, but on my Linux system I have always had an issue where, if I started boincmgr too soon after exiting it, it would not connect to the project and I would have to terminate it and try again. To avoid this issue I would always monitor the MB processes in the system monitor and wait for all of them to finish before starting boincmgr again. It usually takes a while (~1 min) of the processes sitting idle before they stop running. I did this just now and found 8 processes still listed after 3 min:


Now it has been over 10 min and 3 of those 8 are still listed in the system monitor. I just restarted boincmgr, and after more than 30 min those 3 processes still show up as active:


18836   1696  0 Jan06 pts/18   00:00:17 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
18837   1696  0 Jan06 pts/18   00:00:17 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
18838   1696  0 Jan06 pts/18   00:00:17 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59633  59606 96 17:45 pts/18   00:21:42 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59635  59606 98 17:45 pts/18   00:22:07 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59637  59606 95 17:45 pts/18   00:21:36 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59638  59606 96 17:45 pts/18   00:21:47 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59640  59606 97 17:45 pts/18   00:22:00 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59642  59606 96 17:45 pts/18   00:21:50 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59645  59606 96 17:45 pts/18   00:21:48 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59646  59606 96 17:45 pts/18   00:21:50 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59648  59606 97 17:45 pts/18   00:21:54 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59650  59606 96 17:45 pts/18   00:21:48 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59652  59606 96 17:45 pts/18   00:21:43 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59654  59606 96 17:45 pts/18   00:21:40 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59656  59606 95 17:45 pts/18   00:21:33 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59659  59606 96 17:45 pts/18   00:21:40 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59661  59606 97 17:45 pts/18   00:21:58 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59663  59606 96 17:45 pts/18   00:21:38 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59665  59606 97 17:45 pts/18   00:22:02 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59667  59606 96 17:45 pts/18   00:21:44 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59669  59606 96 17:45 pts/18   00:21:41 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59670  59606 97 17:45 pts/18   00:21:58 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59671  59606 95 17:45 pts/18   00:21:25 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59673  59606 95 17:45 pts/18   00:21:37 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59681  59606 97 17:45 pts/18   00:22:01 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59685  59606 97 17:45 pts/18   00:22:00 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59687  59606 96 17:45 pts/18   00:21:45 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59825  59606 97 17:54 pts/18   00:13:40 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59954  59606 94 18:03 pts/18   00:04:39 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59973  59606 95 18:04 pts/18   00:03:29 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
59977  59606 97 18:04 pts/18   00:03:27 ../../projects/setiathome.berkeley.edu/MBv8_8.05r3345_avx_linux64
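The manual waiting could just as easily be scripted; a minimal sketch using the process name from the listing above:

#!/bin/bash
# Wait (up to ~5 minutes) for all MultiBeam processes to exit, then relaunch the manager.
for i in $(seq 1 60); do
    pgrep -f MBv8_8.05r3345_avx_linux64 > /dev/null || break
    sleep 5
done
if pgrep -f MBv8_8.05r3345_avx_linux64 > /dev/null; then
    echo "MB processes still running after 5 minutes" >&2
else
    boincmgr &
fi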
15) Message boards : Number crunching : Postponed: Waiting to acquire lock (Message 1911646)
Posted 15 days ago by RueiKe (Special Project $250 donor)
Post:
Then stop rescheduling, so that that can be taken out of the equation to see if BOINC and the apps operate normally without changing the client_state file on restarts.

But RueiKe has the same issue, and I'm almost sure he does not use the same rescheduler program I use, but it's better to ask him.

@RueiKe Could you tell us if you do rescheduling, and if so, what program or script you use to do it?

<edit> I PMed him and asked him to help us with the answer and post it here. Let's wait.


I don't do any rescheduling.
16) Message boards : Number crunching : ECC Memory Correction Rate (Message 1911442)
Posted 15 days ago by RueiKe (Special Project $250 donor)
Post:
I can't remember seeing a screenshot of the ZE LLC pages or settings. I have load-line calibration settings for both cpu and memory on my Prime Pro. I also have power phase control for both. In both the Prime Pro and CH6H threads, it is always recommended to use the Extreme power phase control settings for both cpu and memory and set it to 140%. That seems to be the most stable for memory.

I would use the spec 1.2V for your memory at minimum, and would suggest bumping the memory with an offset to get it into the 1.25V range. Then monitor your corrected errors and see if the rate goes down or stays the same. With all 8 memory slots occupied, you have to deliver a good amount of current, with the resultant voltage drop that ensues. I doubt you really have 1.2V at all slots. Having the current set to 140% would certainly help.

When you change the power phase delivery to Extreme, you prevent the BIOS from doing any phase shedding, which prevents voltage droop and keeps the voltage more stable. All the VRM phases are kept active at all times. Since this is a BOINC workstation, it's not as if you want or need any power management that keeps power levels down.


Thanks for the recommendations. I have the LLC for Vcore at standard, but I had checked with a meter and found Vcore solid at 1.233V when fully loaded. If I remember correctly, memory LLC is extreme by default, but I will check it out when I bring the machine down in the next SETI outage.
17) Message boards : Number crunching : ECC Memory Correction Rate (Message 1911433)
Posted 15 days ago by RueiKe (Special Project $250 donor)
Post:
Found this article. You might want to skip to the part explaining how the EDAC system works. Would be helpful to determine exactly which stick is throwing the errors.

Monitoring-Memory-Errors


Hi Keith, thanks for the reference. It quotes Google's rate of 2,000-6,000 CE/GB-yr. I think that translates to about 5 CE/GB-day, which is much higher than what I am seeing. But this doesn't seem right, so I will need to continue researching it.

When I use the BIOS defaults for this memory, it sets the voltage to 1.155V, even though it is spec'ed at 1.2V. For this run, I had manually set it to 1.2V. Also, I had pushed my OC so that Vcore is just enough to prevent crashing. Perhaps I should bump it up one more VRM increment.
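For anyone else tracking ECC rates, the EDAC counters the article describes are exposed through sysfs, so a quick check looks something like this (the exact sysfs layout varies by kernel and driver):

# Corrected / uncorrected error totals per memory controller
grep . /sys/devices/system/edac/mc/mc*/ce_count /sys/devices/system/edac/mc/mc*/ue_count
# Per-DIMM counters (per-csrow on older kernels), which help identify the stick throwing the errors
grep . /sys/devices/system/edac/mc/mc*/dimm*/dimm_ce_count 2>/dev/null
grep . /sys/devices/system/edac/mc/mc*/csrow*/ce_count 2>/dev/null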
18) Message boards : Number crunching : Postponed: Waiting to acquire lock (Message 1911406)
Posted 15 days ago by RueiKe (Special Project $250 donor)
Post:
I looked at the timestamp of /etc/hosts and found I made the modification that fixed the download issue at 17:04 6-Jan. Here is the log of postponed messages:

06-Jan-2018 17:44:00 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
.....
.....
07-Jan-2018 07:35:13 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
Take a look in the Event Log again and see when BOINC last restarted prior to that 17:44:00 timestamp. There should be a line that has "Starting BOINC client version" as part of the text. My guess would be that the timestamp on that line would be much closer to 17:44 than the 17:04 timestamp on your hosts file.


Yes, my last start of boincmgr was at 17:43:17. The previous entry was suspension of computation at 17:42:09. I typically make sure all relevant processes stop before starting boincmgr again. I suspect this was when I changed app_config back to its original settings.
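For the record, lining the two up is easy if you pull both events out of the client log (stdoutdae.txt / stdoutdae.old); a quick sketch, run from the BOINC data directory:

grep -h -e 'Starting BOINC client version' \
     -e 'Waiting to acquire slot directory lock' \
     stdoutdae.old stdoutdae.txt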
19) Message boards : Number crunching : Postponed: Waiting to acquire lock (Message 1911392)
Posted 15 days ago by RueiKe (Special Project $250 donor)
Post:
Jeff, thanks for asking this question, as it made me think of exactly what I did last night before calling it a day. As I mentioned, yesterday my system was not getting any CPU tasks, so I modified my hosts file and solved that problem. In doing that, I found it looked like GPU tasks were taking longer with CPU tasks running. I made some changes to app_config, increasing and decreasing the CPU allocation for GPU tasks. So I did go through several starts and stops. I ended up returning the system to its original configuration before going to sleep, and I did check the system to make sure all was OK. So perhaps my activity from yesterday evening did trigger the problem. I will continue to observe and report any recurrence here.
You should be able to pinpoint when the lockfile issue first showed up by taking a look at your BOINC Event Log (stored in "stdoutdae.txt" and "stdoutdae.old"). You should start seeing "task postponed 600.000000 sec: Waiting to acquire lock" at some point shortly after a restart.


I looked at the timestamp of /etc/hosts and found I made the modification that fixed the download issue at 17:04 6-Jan. Here is the log of postponed messages:

06-Jan-2018 17:44:00 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 17:44:01 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 17:44:02 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 18:48:27 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 18:49:02 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 18:49:37 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:02:03 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:02:25 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:02:28 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:22:19 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:22:33 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:22:54 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:55:37 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:56:13 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 19:56:48 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 20:09:40 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 20:09:44 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 20:09:46 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 20:28:39 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 20:29:08 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
06-Jan-2018 20:29:14 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
***message repeats all night****
07-Jan-2018 06:08:21 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:08:53 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:09:28 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:19:03 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:19:31 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:20:06 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:29:39 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:30:11 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:30:42 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:40:29 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:41:05 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:41:18 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:51:10 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:51:45 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 06:52:20 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:01:53 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:02:25 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:03:01 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:12:46 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:13:09 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:13:45 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:23:57 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:23:58 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:24:22 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:35:11 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:35:12 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
07-Jan-2018 07:35:13 [SETI@home] task postponed 600.000000 sec: Waiting to acquire slot directory lock.  Another instance may be running.
20) Message boards : Number crunching : Postponed: Waiting to acquire lock (Message 1911372)
Posted 15 days ago by RueiKe (Special Project $250 donor)
Post:
I woke today to the same problem described by the OP. Yesterday I was not getting any CPU tasks, so I updated the hosts file and it seemed that all was fine; I got a full cache of CPU work. This morning I found only 3 CPU tasks, all "Waiting to acquire lock". I stopped boincmgr and deleted the lockfile. Tried again and got errors waiting for the slots lockfile, so I deleted those and tried again. Now the 3 CPU tasks are running and, while I am typing, CPU WUs have started to download. Everything seems fine now, but I am not sure of the original cause. It is happening on this host: Eos
Definitely good to get another report. Did you have a BOINC shutdown and restart immediately preceding the first appearance of the lockfile messages?

It appears that you're running "AVXxjf Linux64 Build 3345", while Juan is running a newer app, "AVX2jf Linux64 Build 3712". So, if it is an app-specific issue, it may not be confined to a single build.


Jeff, thanks for asking this question, as it made me think of exactly what I did last night before calling it a day. As I mentioned, yesterday my system was not getting any CPU tasks, so I modified my hosts file and solved that problem. In doing that, I found it looked like GPU tasks were taking longer with CPU tasks running. I made some changes to app_config, increasing and decreasing the CPU allocation for GPU tasks. So I did go through several starts and stops. I ended up returning the system to its original configuration before going to sleep, and I did check the system to make sure all was OK. So perhaps my activity from yesterday evening did trigger the problem. I will continue to observe and report any recurrence here.
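For reference, the app_config changes were only the CPU reservation for the GPU tasks, along the lines of this sketch (path and values are examples, not my exact settings):

# Example only: the project path and values are illustrative.
cat > ~/BOINC/projects/setiathome.berkeley.edu/app_config.xml <<'EOF'
<app_config>
  <app>
    <name>setiathome_v8</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.5</cpu_usage>  <!-- the value I was raising and lowering -->
    </gpu_versions>
  </app>
</app_config>
EOF
# Then Options -> Read config files in boincmgr (or restart the client) to apply it.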

