Posts by JStateson

1) Message boards : Number crunching : Setting up Linux to crunch CUDA90 and above for Windows users (Message 2002716)
Posted 1 day ago by Profile JStateson
I would be interested in results of this special app in a full x16 slot.

No difference, or so little as to be insignificant.
Bandwidth between the CPU & GPU, even for high end video cards, isn't a factor for Seti work.
A low clock speed CPU may limit GPU production by not being able to keep up with the GPUs' needs.

I quickly found that the "two thread" Celeron (G1840) that came with the used TB85 mining motherboard was unable to keep up and I replaced it with an E3-1230 Xeon supporting 8 threads.

I did run some tests using that stock Celeron, and it worked OK on Milkyway with several AMD RX560 boards, but for any other project, or with NVidia boards, there was not enough CPU to go around to feed everything and run the 18.04 desktop.

I tested a 4-in-1 riser under Windows on a socket 1366 dual Xeon (slots: x8, x4, x8, x4), and there was a performance hit for GPUGrid when going from the x8 slot to the riser. When I added additional gtx1060s to fill slots 2, 3, and 4 of the 4-in-1 riser the degradation increased, but it was not divided by 2, 3, or 4: it went from about 75% utilization for a single board to 45% for each of 3 boards. Windows choked on a 4th board in that 4-in-1 riser, so I moved all the boards to the TB85, where they will stay. When GPUGrid releases their Linux app (the old one was withdrawn due to a serious bug) I will try running it on my rig.
2) Message boards : Number crunching : Setting up Linux to crunch CUDA90 and above for Windows users (Message 2002599)
Posted 2 days ago by Profile JStateson
Just switched to that Seti special app. The performance boost is incredible. I have only been running the special app for an hour on a new Linux box with four gtx1060s and a pair of 1070s, all on x1 risers. I appreciate the work that was done to get this app working!

Some graphics of elapsed time:

The histogram shows the gtx1070 to the left of the gtx1060. Looks like about a 45 second difference in completion time (sorry for the missing tics).

Graph shows differences between the 4 Linux versions that have been running on the same system.

I would be interested in results of this special app in a full x16 slot.

3) Message boards : Number crunching : Bitcoin Mining Machines good for BOINC / SETI? (Message 1999939)
Posted 19 days ago by Profile JStateson
Have had mixed success expanding older motherboards to house more GPUs. Not sure if the problem is Windows, the power supply, the motherboard, or the 4-in-1 adapter. My DIY watt meter shows draw 100 and 200 watts under the ratings of the 750 and 850 watt power supplies, respectively.

I am thinking that the risers may not get enough power from the SATA or 4 pin Molex.

Windows does not seem to like 4 NVidia boards. I have a pair of ATI (on motherboard risers) and three NVidia on a 4-in-1 riser, but I can't put the remaining gtx1060 on either the 4-in-1 or the remaining slot on the motherboard, even with 200 watts to spare. Device Manager loses an existing gtx1060 in addition to the new one. I assume this is driver related. The system crunches SETI just fine with 5 boards.

The Ubuntu system with the 750 watt supply has 5 RX560s which get all their power from the riser. Putting an RX570 on that causes stability problems, and I assume the problem is power even though there are 100 watts to spare.
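Total watt-meter headroom may not be the whole story: each riser's SATA or Molex feed has its own current limit, separate from the supply's total rating. A back-of-the-envelope sketch; the connector limits and headroom factor below are common rule-of-thumb figures, not measurements:

```python
# Check whether a riser-fed card stays within its power feed's rating.
# Connector limits are rule-of-thumb figures (12 V rail only), not specs
# I have verified for any particular riser.
CONNECTOR_LIMIT_W = {
    "sata": 54,    # ~4.5 A on the 12 V rail
    "molex": 132,  # ~11 A on the 12 V rail
    "pcie_6pin": 75,
}

def riser_ok(card_tdp_w, feed, headroom=0.8):
    """A riser-fed card draws its slot power (up to 75 W) through the
    riser's feed connector; stay well under that connector's rating."""
    slot_draw = min(card_tdp_w, 75)  # PCIe slot power tops out at 75 W
    return slot_draw <= CONNECTOR_LIMIT_W[feed] * headroom

print(riser_ok(75, "sata"))    # → False: borderline card on a SATA feed
print(riser_ok(150, "molex"))  # → True: riser sees only the 75 W slot share
```

By this sketch a full 75 W slot draw over a single SATA feed is already over a comfortable margin, which would fit the "risers may not get enough power from the SATA" theory above; an RX570 pulls the rest from its own PCIe power plug, so the riser feed is not necessarily the bottleneck there.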

Looking for ideas.
4) Message boards : Number crunching : have more GPUs than actually exist (Message 1999047)
Posted 25 days ago by Profile JStateson
I'm guessing 2671.3 uses the key in Software and 2766.5 uses the key in CurrentControlSet, that something went wrong with the transition, and now you have both; while the ICD loader has some code to handle multiple instances of the same driver, it isn't smart enough to handle this situation. Remove the key from Software and see what happens.

Started looking at this again. I bought a 4-in-1 riser with the idea of using 1xHD7950 + 3xS9000 + 1xS9100 + 2xRX560, as these 7 fit within the 850 watt supply rating (it will be close). On first boot I had 10 GPUs, which I expected (twice as many as actually exist).

Have not got to the RX560s yet, but I have a stable*** system with the HD7950 & S9x00 boards. I did find a problem, though.

I tried two AMD drivers; each generated a slightly different Software key.

That key cannot be deleted. Even with BOINC not running, deleting the key (I exported it first) causes the blue screen where Windows gathers information to send back. When the system reboots the key is back in the registry, but that might be because the system died before the registry could be updated. I looked in the Event Viewer to see what was using that key but didn't see anything of value, just the normal "last reboot was unexpected" error or whatever.

I cannot find the other key you mentioned, the "CurrentControlSet" key. Must be in another thread?? I can try deleting that. Currently the system is working with 5 boards using my trick of editing that coproc_info file and then making it read only.

*** I won't know for sure how stable the system is, as I am running only 2 concurrent tasks (Milkyway) per board until I find out why I cannot run 5 per board for very long.

[EDIT] Found the problem: the d0 GPU (one of the S9000s) has not been assigned a task. Instead, the two tasks it was to get were assigned to another GPU. The d0 (per GPU-Z) shows 300MHz (idle). I have seen this before: phantom tasks that never complete. In the coproc_info I edited, I simply deleted the last 5 devices, as I assumed they were duplicates. I will have to go back to the original coproc_info and find the correct d0 board. The board indexes I usually go with are 0.0 … 4.4, but I suspect the 4-in-1 riser messes with the index, and the d0 board may be 1.0 or 5.5 instead of 0.0. Plus there is no telling if the first board listed in coproc_info is d0 or not. The net effect is the index is not correct, one of the GPUs never gets assigned a task, and the two "phantom" tasks never complete.

6			6/21/2019 9:02:34 AM	Failed to delete old coproc_info.xml. error code -110	
7			6/21/2019 9:02:53 AM	OpenCL: AMD/ATI GPU 0: AMD FirePro S9000 (driver version 2841.5, device version OpenCL 1.2 AMD-APP (2841.5), 6144MB, 6144MB available, 3154 GFLOPS peak)	
8			6/21/2019 9:02:53 AM	OpenCL: AMD/ATI GPU 1: AMD FirePro S9000 (driver version 2841.5, device version OpenCL 1.2 AMD-APP (2841.5), 6144MB, 6144MB available, 3154 GFLOPS peak)	
9			6/21/2019 9:02:53 AM	OpenCL: AMD/ATI GPU 2: AMD FirePro S9100 (driver version 2841.5, device version OpenCL 2.0 AMD-APP (2841.5), 12288MB, 12288MB available, 4506 GFLOPS peak)	
10			6/21/2019 9:02:53 AM	OpenCL: AMD/ATI GPU 3: AMD FirePro S9000 (driver version 2841.5, device version OpenCL 1.2 AMD-APP (2841.5), 6144MB, 6144MB available, 3154 GFLOPS peak)	
11			6/21/2019 9:02:53 AM	OpenCL: AMD/ATI GPU 4: AMD Radeon HD 7900 Series (driver version 2841.5, device version OpenCL 1.2 AMD-APP (2841.5), 3072MB, 3072MB available, 3604 GFLOPS peak)
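For the index guessing, one starting point is to dump the coproc_info.xml entries in file order and compare against the log above. A sketch; the <coproc_opencl>/<name> tag names are assumptions based on my own file, so check yours, and matching file order to BOINC's d0..dN assignment still needs a GPU-Z cross-check:

```python
# Sketch: list GPU entries from BOINC's coproc_info.xml in file order.
# Tag names <coproc_opencl> and <name> are assumptions; verify against
# your own coproc_info.xml before relying on this.
import xml.etree.ElementTree as ET

def list_gpus(xml_text):
    root = ET.fromstring(xml_text)
    return [(i, dev.findtext("name"))
            for i, dev in enumerate(root.iter("coproc_opencl"))]

# Trimmed sample mirroring the boards in the log above
SAMPLE = """<coprocs>
  <coproc_opencl><name>AMD FirePro S9000</name></coproc_opencl>
  <coproc_opencl><name>AMD FirePro S9100</name></coproc_opencl>
  <coproc_opencl><name>AMD Radeon HD 7900 Series</name></coproc_opencl>
</coprocs>"""

for idx, name in list_gpus(SAMPLE):
    print(f"d{idx}: {name}")
```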

5) Message boards : Number crunching : Can different machines be this different? (Message 1998436)
Posted 16 Jun 2019 by Profile JStateson
I looked up that work unit and its "name" at the location.

All completed results are inconclusive, and it is pretty obvious there are huge differences in the results.
The GPU task had info about OpenCL and your CPU one was simpler.

It will be interesting to see what the ATI system returns; other systems will probably receive the task if the ATI result is also much different.
6) Message boards : Number crunching : Bitcoin Mining Machines good for BOINC / SETI? (Message 1998421)
Posted 16 Jun 2019 by Profile JStateson
Since you aren't using the CPU for anything other than feeding the GPUs, you can also try underclocking the CPU. Set the core ratios to something low like 30 (3.0GHz); it will reduce power use and temps. I did this on my 7700k when it was near maxed out with 7 GPUs (w/ nobs) and an inferior cooler, though I didn't try clocking it down quite that far, just a 200MHz underclock from 4.2GHz to 4.0GHz.

This worked out nicely for me. My open frame systems have their PS2 connector blocked. One motherboard, an X8DTL with dual Xeon X5675, does not enable the USB ports at the start of POST, so it was not possible to get into the BIOS to make speed changes without disassembly. The processors, X5675s, do not support Thermal Monitoring and were running too hot even after removing the protective film from the water block (another story). TThrottle helped, but the temps were erratic and borderline. I got into the Windows power plan and set the maximum processor state to keep the multiplier at 20, which made a huge difference in temps. GRC mining pays CPU and GPU projects equally, so I always mine on CPU projects. I would not know how to do this on my Ubuntu miner, but it does not have a heating problem and its USB ports work during POST.
7) Message boards : Cafe SETI : The joke thread Part 4. (Message 1996925)
Posted 5 Jun 2019 by Profile JStateson
Over on the Number Crunching forum there is a thread about

"How many gpus can you run on an AMD AM4 socket motherboard"

I spent a long time thinking about making a joke out of that, but I gave up as I have only Intel motherboards.
8) Message boards : Number crunching : Bitcoin Mining Machines good for BOINC / SETI? (Message 1996826)
Posted 4 Jun 2019 by Profile JStateson
I picked up a used Biostar TB85 with a G1840 Celeron for under $80 total on eBay. It has 6 slots, and I had 5 RX-560s working fine on Seti & Milkyway with a 430 watt supply (since upgraded). The CPU has only two threads, but that was OK for the RX-560s on those two apps. Last time I looked on eBay I noticed the price has dropped to $60 and RAM is now included. It is running Ubuntu 18.04 with the latest AMD driver and used DDR3 RAM from retired systems.

If I had it to do over, I would have just bought the board by itself for under $40 and added the CPU. I ended up getting an i7-4790S later, as I have other plans for it in the future. That chip was not cheap, and I spent a lot of time looking for alternatives.

Unfortunately, CPU prices for socket 1150 are nowhere near as cheap as 1366 CPUs. If anyone knows of any 1366 miners, let me know.
9) Message boards : Number crunching : Performance tool for BoincTasks users (Message 1996096)
Posted 31 May 2019 by Profile JStateson
I put together a program that uses the history files produced by BoincTasks to do some performance analysis. It only works with BoincTasks. There is a description of the program here (Fred created a 3rd party forum for add-ins), and the sources and executables are on GitHub.

Here is a sample plot of Elapsed Time for various SETI GPU apps.

10) Message boards : Number crunching : have more GPUs than actually exist (Message 1996052)
Posted 31 May 2019 by Profile JStateson
The next time someone has a problem with too many OpenCL devices detected:

Download Oblomov's clinfo and run it in Terminal / Command Prompt. Count the number of devices reported carefully. Note that the report includes both GPU and CPU devices (if you have drivers for those.)

If clinfo reports too many devices, then something has gone wrong with the driver install and it's the vendor who needs to fix things. Go to the vendor's website and open a support ticket. You'll need to tell them what driver version you installed and how exactly you installed it. Include the clinfo output in the ticket, as well as a link to the clinfo website so that the vendor can easily find it for in-house retesting (though they probably already know about it).
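A quick way to do the counting is to tally the "Device Name" lines in clinfo's output. The label below matches stock clinfo output, but field labels can vary between versions, so treat this as a sketch:

```python
def count_devices(clinfo_text):
    """Count the 'Device Name' lines clinfo prints, one per device."""
    return sum(1 for line in clinfo_text.splitlines()
               if line.strip().startswith("Device Name"))

# Trimmed sample of clinfo output: two real cards, each reported twice
SAMPLE = """  Platform Name    AMD Accelerated Parallel Processing
  Device Name      Baffin
  Device Name      Baffin
  Device Name      Baffin
  Device Name      Baffin
"""
print(count_devices(SAMPLE))  # → 4, twice the number of real cards

# On a live system, feed it the captured stdout of running `clinfo`.
```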

And btw, it turns out HKEY_LOCAL_MACHINE\SOFTWARE\Khronos\OpenCL\Vendors is for old drivers. Now it's more complicated, and in the future it will be still more complicated.

I ran clinfo and posted some info back Feb 11 here

The problem then, and I assume now, is that the BOINC GPU detect is finding different drivers and, I am guessing, assumes there is a GPU attached to each, and ends up thinking there are 2x as many GPUs as exist.

As you can see in the "message log", driver 2766.5 is on devices 0 & 1 and driver 2671.3 (the older driver) is on devices 2 and 3. In reality there were exactly two RX-560 cards at that time.

Note that Windows Device Manager sees only 2 GPUs, and TechPowerUp's GPU-Z clearly shows only 2. My guess is that the GPU detect looks for GPUs two different ways (OpenCL and CAL); CAL returns the older, unused drivers, OpenCL returns the new drivers, and the GPU detect program thinks there are 4 GPUs where in reality there are only 2. That is just a guess. I would offer to help debug the problem using VS2017, but the kitchen sink et al. would have to be removed from the client. That is as unlikely as hell freezing over. The program that reads coproc_info.xml has access to Windows runtime modules (unlike OpenCL) and could easily enumerate the GPUs and compare the results to what OpenCL (or CAL) found, blessed, and stored in that coproc_info file.
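If the double counting really is one entry per installed driver, the missing step would be something like the following: for each device name, keep only the entries reported under its newest driver. This is a sketch of my guess, not actual client code; the device tuples and version strings are made up to mirror the log above:

```python
def dedup(devices):
    """For each device name, keep only entries from its newest driver
    version, preserving the count of physical boards."""
    newest = {}
    for name, driver in devices:
        # string max is lexicographic; real code would parse versions
        newest[name] = max(newest.get(name, driver), driver)
    return [(n, d) for n, d in devices if d == newest[n]]

# Two real RX 560s, each detected once per installed driver = 4 entries
detected = [("RX 560", "2766.5"), ("RX 560", "2766.5"),
            ("RX 560", "2671.3"), ("RX 560", "2671.3")]
print(dedup(detected))  # → [('RX 560', '2766.5'), ('RX 560', '2766.5')]
```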
11) Message boards : News : 20 years and counting! (Message 1995029)
Posted 24 May 2019 by Profile JStateson
SETI started publicly on May 17, 1999. It took me 10 days to figure out how to get it working on an unused Apollo workstation and (later) on a "used" SPARCstation. An engineer called to tell me to change the "nice" setting, but otherwise it was OK. I added more systems as the years went by and watched as other users ran up huge scores that turned out to be fake. Some of the cheaters were members of otherwise good clubs that didn't bother to check their own members for faking results.

Will be 20 years exactly on the 27th of this month.
12) Message boards : Number crunching : have more GPUs than actually exist (Message 1994937)
Posted 23 May 2019 by Profile JStateson
The system is presently working nicely (thanks to your workaround), but it would be nice to have a fix, not a kludge to get around the problem.

I have been successful using Revo Uninstaller with the advanced scan enabled. I make sure the driver I want is ready to be installed, and when the system starts to reboot I pull the ethernet cable to make sure Windows does not fetch an old driver. I tried DDU and also tried the so-called "clean" install that is an AMD option. I assume Revo will work just as well on NVidia as AMD. It also removes leftovers whose uninstall entries are missing.

Revo is free. It worked so well (the free one) that I bought the portable version so I could just connect a USB drive to do a clean uninstall of anything. I did not bother with the uninstall this last time, as I routinely save the coproc_info file that works and it is convenient to copy and paste it into the BOINC folder.
13) Message boards : Number crunching : Hacking Seti (Message 1994930)
Posted 23 May 2019 by Profile JStateson
Thanks, that was an interesting read. How did you come across it?
14) Message boards : Number crunching : has anyone retired old hardware for seti or boinc (Message 1994870)
Posted 23 May 2019 by Profile JStateson
I posted on "free for computer science students" and got rid of Core 2 systems with 7850s, a gtx670, and a 650Ti, as well as old Antminers. I had no problem selling single slot gtx460s, as there are a lot of Gateway users that have only single slot real estate. I also sold some triple slot gtx570s. I was amazed anyone wanted them. I donated a lot of really old stuff to Goodwill, Opterons & Opteron server boards. I was asked if it was old computer stuff, and I said it was all working when I removed it. While true, that was not the answer to their question, but they took the stuff.

I started crunching Seti on an Apollo workstation and later a SPARC. Berkeley dropped support for the Apollo, and when I complained they offered to send me the sources, but I had to agree to maintain them and distribute them to other users. I did not take them up on that, made a note to myself to never complain again, and later junked the unused MicroVAXes that could have been used.
15) Message boards : Number crunching : have more GPUs than actually exist (Message 1994861)
Posted 23 May 2019 by Profile JStateson
Sorry, just saw this reply.

I have had a number of systems and have never seen a problem with any NVidia boards. This problem (it still exists) seems to be "owned" by Windows and AMD, and first showed up with the RX and S series boards. I had not seen it on HD7950s or 7850s.

I am guessing that the search for drivers in the client (gpu_detect.cpp, gpu_amd.cpp … gpu_opencl.cpp) uses OpenCL and CAL and (bigger guess) both of those return a driver, and the algorithm selects both instead of just one. I was thinking of building a VS utility program that just used those modules to see if I could debug it. IMHO this program has had to support so many platforms that it is difficult to pull out a few modules and build a test unit. I did notice recently there are some sample programs on GitHub, including an "openclapp.cpp", and the full source for "clinfo" is always available, which could be used as a starter.

The most recent example of the problem was this:

Three S9000s and one S9100 were working fine, but the fan fell off the back of one S9000. The fan holder was a 3D-printed POS. I pulled the S9100 to see if I could do a better job of securing the fan. The system ran fine with three S9000s.

I secured the fan and rebooted with the "taped on fan". Device Manager showed four S9000s, which is incorrect. I selected the first board and instructed Windows to update the driver using the AMD one it had been using. After the update the correct GPUs were identified; unfortunately, the BOINC client now saw 8 GPUs: two S9100s and six S9000s. There is clearly an AMD driver problem somewhere, but the client should not be using "phantom" GPUs. They were assigned tasks and seemed to be running, but from prior experience I know the tasks never finish, so I did my trick of editing that OpenCL info file and then making it read only. FWIW, I have multiple AMD boards under Ubuntu with AMD drivers and have not seen this problem. It seems restricted to Windows & AMD.
16) Message boards : Number crunching : Cannot get any work with 3 GPU, no queue size, one GPU sometimes idle (Message 1992318)
Posted 2 May 2019 by Profile JStateson
Thanks Keith, Richard!

Yes, after changing the priority at BAM! and doing a sync, about 5 minutes later I got a boatload of tasks. I was unaware of the "0" but did know about the problem of low priority tasks ending up with a lot of WUs that never complete.

This is what I have been working on:

I have GPUs with extremely fast double precision (S9x00, HD79x0); they work best on Milkyway.

All other GPUs (RX5x0, GTX1070) have superior single precision over the above AMD boards but really suck at double precision, typically a 1:16 ratio. A waste of electricity on Milkyway.

Milkyway & Seti go offline for maintenance regularly. I want priority on science projects with fallback to non-science projects when those are offline. Not all projects have ATI apps; most have NVidia, so fallback to Asteroids cannot be on the ATI systems (for example).

There is a problem with Milkyway in that they have work but do not supply it for some reason. Some type of bug; a perfect example is HERE.
During those 10-15 minute gaps my secondary projects suck up work units, and if I set their priority too low there will be real problems later near their deadlines.

I am thinking that I cannot use BAM! or the BOINC client general preferences and need to use project preferences. Not sure how to do this, or if it is even possible while keeping BAM! as account manager. I am not sure Milkyway even looks at preferences like the .1 and .25 cache settings; I seem to get exactly 200 WUs for each GPU.

I cannot change what Milkyway is doing. The best I can do to avoid idle time is to fall back on Seti or Einstein on those double precision AMD boards. I will try a low resource share for Seti & Einstein on those AMD boards. My other GPUs do not run Milkyway, nor do I plan for them to, other than gathering statistics for various studies I am doing.

If you have any suggestions let me know. Maybe there should be a wiki about this, and also about the cost (kWh) of running various projects.
17) Message boards : Number crunching : Cannot get any work with 3 GPU, no queue size, one GPU sometimes idle (Message 1992310)
Posted 2 May 2019 by Profile JStateson
You must have some configuration conflicting with resource allocation. The host has more GPU work for other projects, and Seti only gets the last little slice of GPU allocation. Or you have mistakenly put a decimal point in the wrong place in your usage in Preferences.

What does setting the sched_ops_debug flag in Logging options show for work request? It will show the number of seconds of work requested for both cpu and gpu. You could also set work_fetch_debug and look at its more detailed report.

With even a 0.5 day work cache, you should get 100 tasks for each gpu and another 100 tasks for the cpu.

OK, set those debug flags.
Results are here
If the above does not work, remove the www. I have no idea which sites use which protocol. It would be nice if all BOINC projects upgraded to allow storage in the cloud like newer forums / communities.

Going to make a guess after looking at the chatter.

I have the resource share set to 0 because I want Seti to run behind all other GPU tasks.
There are no other GPU tasks on this system, nor do I plan on any, but that might change.

Maybe that is the problem?
18) Message boards : Number crunching : Cannot get any work with 3 GPU, no queue size, one GPU sometimes idle (Message 1992304)
Posted 2 May 2019 by Profile JStateson
======brought this over from the BOINC forum as maybe the problem is SETI???========
I noticed for some time that SETI had exactly 1 task running on each of the 3 GPUs. There is no queue depth. Since there are generally 100,000 or so tasks ready at the project, something is wrong.

All my systems use the account manager BAM!, but it seems that preferences at BAM! are not used (they show .1 and .25, the same as the local client preferences, according to BoincTasks).
I went to SETI and set preferences there for a .25 day queue with .50 additional (used to be .1 and .25) just to see what happened.

Did an update, as that was required by the project, and the event log reported:

    3347 SETI@home 5/2/2019 10:53:52 AM update requested by user
    3348 SETI@home 5/2/2019 10:53:52 AM Sending scheduler request: Requested by user.
    3349 SETI@home 5/2/2019 10:53:52 AM Not requesting tasks: don't need (CPU: ; AMD/ATI GPU: )
    3350 SETI@home 5/2/2019 10:53:54 AM Scheduler request completed
    3351 SETI@home 5/2/2019 10:53:54 AM General prefs: from SETI@home (last modified 02-May-2019 10:53:54)
    3352 SETI@home 5/2/2019 10:53:54 AM Host location: none
    3353 SETI@home 5/2/2019 10:53:54 AM General prefs: using your defaults
    3354 5/2/2019 10:53:54 AM Reading preferences override file
    3355 5/2/2019 10:53:54 AM Preferences:
    3356 5/2/2019 10:53:54 AM max memory usage when active: 6139.56 MB
    3357 5/2/2019 10:53:54 AM max memory usage when idle: 11051.20 MB
    3358 5/2/2019 10:53:54 AM max disk usage: 116.17 GB
    3359 5/2/2019 10:53:54 AM max CPUs used: 20
    3360 5/2/2019 10:53:54 AM (to change preferences, visit a project web site or select Preferences in the Manager)

As far as I could tell, not only did the increase have no effect, but I actually lost a work unit: the update asked for data too soon, which (I am guessing) caused the project to ask for a backoff. So an UPDATE needs to happen before the preferences get updated, and during an UPDATE the client asks for more data? Is this correct?

After 5-6 minutes (the backoff is 300 seconds as I recall) I finally got an extra work unit and all three of my RX560s are busy.

However, what happened to the request for the additional buffer? Are the project preferences being overridden by the general client preferences? In any event, exactly 1 work unit per GPU is a queue of exactly ZERO. Where is the original .1 day, or the new .25?

What has control over preferences? The client? BAM!? The project?

Maybe this should be asked over at SETI??

I just changed the local (client) preferences and read the following, which indicates I need to go to the project (which I did earlier):


    3603 SETI@home 5/2/2019 11:42:31 AM General prefs: from SETI@home (last modified 02-May-2019 10:53:55)
    3604 SETI@home 5/2/2019 11:42:31 AM Host location: none
    3605 SETI@home 5/2/2019 11:42:31 AM General prefs: using your defaults
    3606 5/2/2019 11:42:31 AM Reading preferences override file
    3607 5/2/2019 11:42:31 AM Preferences:
    3608 5/2/2019 11:42:31 AM max memory usage when active: 6139.56 MB
    3609 5/2/2019 11:42:31 AM max memory usage when idle: 11051.20 MB
    3610 5/2/2019 11:42:31 AM max disk usage: 116.17 GB
    3611 5/2/2019 11:42:31 AM max CPUs used: 20
    3612 5/2/2019 11:42:31 AM (to change preferences, visit a project web site or select Preferences in the Manager)

In any event nothing happened, though I did not lose a work unit because no update was actually done.

Here is an image from BOINC Manager (not BoincTasks). It shows only 2 tasks running, one GPU idle, and no queue size.

19) Message boards : Number crunching : have more GPUs than actually exist (Message 1991731)
Posted 27 Apr 2019 by Profile JStateson
This will save 2-4 seconds of idle time in the task loading transaction for each task. Over an hour or a day, that will allow more tasks to be crunched, and production will go up.

Not sure how useful this would be, but I have a Windows program that reads the BoincTasks history file and shows idle time for various projects and systems. It is at and would have to be built with VS2017.

Here is a sample output that shows an idle problem on milkyway
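The measurement itself reduces to: group completed tasks by device, sort by start time, and sum the gaps between one task's finish and the next task's start. A sketch with invented (device, start, finish) tuples; the real BoincTasks history format is different, and the program above handles the parsing:

```python
from collections import defaultdict

def idle_seconds(tasks):
    """Sum per-device gaps between one task's finish and the next start.
    Each task is an invented (device, start_s, finish_s) tuple."""
    by_dev = defaultdict(list)
    for dev, start, finish in tasks:
        by_dev[dev].append((start, finish))
    idle = {}
    for dev, spans in by_dev.items():
        spans.sort()
        idle[dev] = sum(max(0, s - prev_f)
                        for (_, prev_f), (s, _) in zip(spans, spans[1:]))
    return idle

tasks = [("gpu0", 0, 100), ("gpu0", 104, 200), ("gpu0", 200, 290)]
print(idle_seconds(tasks))  # → {'gpu0': 4}
```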
20) Message boards : Number crunching : Opinions requested from home Linux users (Message 1989929)
Posted 13 Apr 2019 by Profile JStateson
Have run Ubuntu on a few systems but stick mostly to Windows 10. Why?

I cannot always get temperatures back from AMD, NVidia, and occasionally the CPU, and am not competent enough to figure out which sensor driver is missing, nor how to control GPU fans as a function of temps. Plus the minimal (server) install lacks all the desktop features (not needed for BOINC anyway). Several years ago I put together a task that sent temps back to BoincTasks, but it required the full desktop and broke on every update. I gave up on that project.
Benefits: free download of the latest release; you don't have to install an SLIC to get the most recent version.

Benefits: AMD & NVidia temps can be controlled and, depending on manufacturer (Dell, etc.), CPU & case fans can be controlled automatically as a function of temp.
Cost: free Windows 10 if your system has an SLIC 2.1 in the BIOS, as there is still a free upgrade path to Win10. Most used computers have SLIC, and the majority of the ones that do not can easily have the SLIC installed, so just about any used motherboard from eBay qualifies (HP Z400, Lenovo S20). Generally if it has a 1366 socket, Win10 can go in with no hassle. Socket 775 and a few early 1366 boards that came with Vista need SLIC 2.1 to get Win10.


©2019 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.