Posts by Raistmer

1) Message boards : Number crunching : Building on Arm Linux (Message 1931233)
Posted 17 hours ago by Raistmer
I'm going to commit an initial SIMDe emulation of SSE on ARM.

But you need to modify the configure line (let's omit -DUSE_OPENCL for now; the CPU code needs to be dealt with first): add -DUSE_ARMV7 to properly select the build path. In this path it will attempt to replace the already written SSE code with its emulation via SIMDe.
Also check whether SIMDE_SSE_NEON and similar defines are enabled in the build path - this will allow NEON to be used where possible instead of plain SIMD emulation.
It seems SIMDe is able to handle emulation even w/o NEON support, but the resulting code could be unacceptably slow.
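One way to check which of these defines actually reach a given source file is to compile a tiny probe with the same flags as the real build. The sketch below is only an illustration: the function name active_simd_defines is made up, USE_ARMV7 and SIMDE_SSE_NEON are the defines named above, and __ARM_NEON / __SSE__ are the usual compiler-provided macros.

```cpp
#include <string>

// Returns a space-separated list of the SIMD-related macros that are defined
// for this translation unit, or "none" if none of them is defined.
// The function name is hypothetical; it is not part of the SETI codebase.
std::string active_simd_defines() {
    std::string out;
#ifdef USE_ARMV7
    out += "USE_ARMV7 ";        // set by hand on the configure line
#endif
#ifdef SIMDE_SSE_NEON
    out += "SIMDE_SSE_NEON ";   // the SIMDe define named in the post
#endif
#ifdef __ARM_NEON
    out += "__ARM_NEON ";       // compiler-provided when NEON is enabled
#endif
#ifdef __SSE__
    out += "__SSE__ ";          // compiler-provided on x86 with SSE
#endif
    if (out.empty()) return "none";
    out.pop_back();             // drop the trailing space
    return out;
}
```

Printing the result once from a normal build and once from a build with -DUSE_ARMV7 added shows whether the flag really arrives at the compiler.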

It would be good if you attempt to build from the new rev and post the new errors you get. The emulation is far from complete currently, but it would be good to know whether it's the right path to go or a completely wrong one.


The goal is to select the same path I know to be a working one and emulate all SSE usage on it via SIMDe.

EDIT2: At revision: 3750
Now awaiting results of your attempt.
2) Message boards : Number crunching : Building on Arm Linux (Message 1931121)
Posted 1 day ago by Raistmer
It looks like, to use SIMDe, all intrinsics have to be rewritten by prefixing them.
So either a separate path is needed or an additional re-defining layer.
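A minimal sketch of what such a re-defining layer could look like. To keep it self-contained it does not use SIMDe itself: the emu_ names below are hypothetical stand-ins for a prefixed emulation API (SIMDe's own names are spelled with a simde prefix, e.g. simde_mm_add_ps), implemented here in plain scalar code. The layer only works if the real xmmintrin.h is not included in the same translation unit.

```cpp
#include <cassert>

// Hypothetical prefixed emulation API: a 4-float vector and three operations.
struct emu_m128 { float f[4]; };

inline emu_m128 emu_mm_loadu_ps(const float* p) {
    assert(p != nullptr);
    return {{p[0], p[1], p[2], p[3]}};
}

inline emu_m128 emu_mm_add_ps(emu_m128 a, emu_m128 b) {
    emu_m128 r;
    for (int i = 0; i < 4; ++i) r.f[i] = a.f[i] + b.f[i];
    return r;
}

inline void emu_mm_storeu_ps(float* p, emu_m128 v) {
    for (int i = 0; i < 4; ++i) p[i] = v.f[i];
}

// The re-defining layer: map the SSE spellings onto the prefixed ones so that
// code written against SSE compiles without being touched.
#define __m128        emu_m128
#define _mm_loadu_ps  emu_mm_loadu_ps
#define _mm_add_ps    emu_mm_add_ps
#define _mm_storeu_ps emu_mm_storeu_ps

// Existing "SSE" code, unchanged: it only ever sees the unprefixed names.
inline void add_arrays4(const float* a, const float* b, float* out) {
    __m128 sum = _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
    _mm_storeu_ps(out, sum);
}
```

The alternative, a separate code path, would mean writing the prefixed names directly into a copy of each SSE function; the macro layer avoids the copy at the price of hiding which implementation is really in use.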
3) Message boards : Number crunching : Building on Arm Linux (Message 1931110)
Posted 1 day ago by Raistmer
Well, the ARM path for the opt codebase isn't complete:
#ifdef __ARM_NEON
  sah_complex* cx_DataArray,
  sah_complex* cx_ChirpDataArray,
  int chirp_rate_ind,
  double chirp_rate,
  int  ul_NumDataPoints,
  double sample_rate
) {
; //TODO: neon port of SSE chirping

so SSE emulation would indeed be the preferable way to go (especially because the goal is to build the OpenCL part, not just to get a faster CPU build).
4) Message boards : Number crunching : Building on Arm Linux (Message 1931004)
Posted 1 day ago by Raistmer
BTW, looking at arm_neon.h from VS, it seems __n128 plays the same role there as __m128 does for x86.
5) Message boards : Number crunching : Building on Arm Linux (Message 1931002)
Posted 1 day ago by Raistmer
f_recip() is used only for lcgf(), so for now you could try to replace the SIMD lcgf with its double scalar version:
/* Log of the complement of the incomplete gamma function
 * log(1-P(a,x)) valid only for (a+1)<x
 */
double lcgf(double a, double x) {
  call_stack.enter("double lcgf()");
  int i;
  const double EPS=std::numeric_limits<double>::epsilon();
  const double FPMIN=std::numeric_limits<double>::min()/EPS;
  double an,b,c,d,del,h,gln=gammln(a),rv;

  // assert(x>=(a+1));
  for (i=1;i<=ITMAX;i++) {
    an = -i*(i-a);
    b += 2.0;
    if (fabs(d)<FPMIN) d=FPMIN;
    if (fabs(c)<FPMIN) c=FPMIN;
    if (fabs(del-1.0)<EPS) break;
  // assert(i<ITMAX);
  return rv;

from stock codebase.
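The excerpt above is missing several lines, so here is a self-contained sketch of a scalar version that can be compiled and checked on its own. It is not the stock code: the missing steps are filled in with the standard modified Lentz continued-fraction evaluation, std::lgamma stands in for gammln(), the name lcgf_scalar is made up, and the ITMAX value is an assumption since the excerpt does not show it.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Iteration cap for the continued fraction. The stock value is not visible in
// the excerpt, so 10000 is an assumption.
static const int ITMAX = 10000;

// log(1 - P(a,x)) via the modified Lentz continued fraction; valid for x >= a+1.
// std::lgamma replaces the codebase's gammln() here.
double lcgf_scalar(double a, double x) {
    const double EPS = std::numeric_limits<double>::epsilon();
    const double FPMIN = std::numeric_limits<double>::min() / EPS;
    assert(x >= a + 1.0);

    double b = x + 1.0 - a;
    double c = 1.0 / FPMIN;
    double d = 1.0 / b;
    double h = d;
    int i;
    for (i = 1; i <= ITMAX; i++) {
        const double an = -i * (i - a);
        b += 2.0;
        d = an * d + b;
        if (std::fabs(d) < FPMIN) d = FPMIN;   // guard against division by ~0
        c = b + an / c;
        if (std::fabs(c) < FPMIN) c = FPMIN;
        d = 1.0 / d;
        const double del = d * c;
        h *= del;
        if (std::fabs(del - 1.0) < EPS) break; // converged
    }
    assert(i <= ITMAX);  // the fraction converged within the cap
    return std::log(h) - x + a * std::log(x) - std::lgamma(a);
}
```

A quick sanity check: for a = 1 the function should return -x, since 1 - P(1,x) = exp(-x).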
6) Message boards : Number crunching : Building on Arm Linux (Message 1931000)
Posted 1 day ago by Raistmer
Well, you are on the SSE SIMD code path currently, so some NEON SIMD analogue would have to be provided...
Perhaps fixing the path selection (so it doesn't step onto the SSE one) would be better than dealing with all SSE-based functions one by one.
That means stepping back to the __m128 errors and finding out why those lines are reached.
I'm not sure the opt codebase contains a non-SIMD path at all, though; I need to read more on that point.
7) Message boards : Number crunching : Building on Arm Linux (Message 1930576)
Posted 4 days ago by Raistmer

The file analyzeFuncs.h is a header file in the client sub-directory. If I am not mistaken, the missing types are related to SSE code for the x86 architecture, which is not available on arm. I have compared analyzeFuncs.h of the sah7_opt branch to the stock client and it seems that it has been heavily pimped. I suspect the code simply does not support arm neon architecture yet?

Adjusting the code for arm neon is definitely beyond my skills. So unless this can be fixed with a few compiler flag settings or adjustments in the source code I believe I have reached the point where I need to give up.

Maybe a last idea: I have come across this web page [1] during some searches. Could it make sense to emulate SSE on ARM using simde library instead of rewriting the code?

As always any hints welcome! I am running out of ideas here.


Thanks for the link and the description of your experiment.
I never attempted to build the opt codebase for ARM, so I need some time to evaluate all this.
8) Message boards : Nebula : Progress (Message 1929895)
Posted 7 days ago by Raistmer
Thanks for the update!
9) Message boards : Number crunching : Should BOINC/Seti be stratified? (Message 1929894)
Posted 7 days ago by Raistmer
AFAIK there was a SETIv10 initiative some time ago (skipping v9 is the new trend, it seems).
Not much info since then, or I just missed it. But v10 tasks are expected to be more computationally heavy.
10) Message boards : Number crunching : Why is my host trashing workunits? (Message 1929893)
Posted 7 days ago by Raistmer

I'm trying to return BOINC to user mode, but getting a consistent installer error 2203: The process cannot access the file because another process has locked a portion of the file. Eh? Never come across that one before. I've had a google round all the usual places, but nothing recognisable turns up. I'll sleep on it, but unless some bright spark turns up with a new idea, I'll probably give up on this machine - the speed of processing (best part of a day per WU) isn't worth the electricity.

If it's XP, attempting to boot into safe mode could help.
Apparently some process keeps a lock on some of the files BOINC needs during setup.
Process Explorer/Disk Monitor could provide further hints.
11) Message boards : Number crunching : Odd Result with 03ap18 WU in SoG - had to abort (Message 1929892)
Posted 7 days ago by Raistmer
The ideal app would run only a single instance per device. The more slowdown you get from running 2 tasks simultaneously (on the same hardware), the better optimized a single app instance is.
That means AP OpenCL uses the GPU much better than OpenCL MB does.
Running 2 instances per device always has inherent overhead (flushing all kinds of caches; context switching). So it is preferable only if free computational resources exist: either partially loaded CUs or idle time intervals big enough to offset that inherent overhead.
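The trade-off can be put into numbers. A small sketch, with made-up function names and timings: if one task alone takes t1 seconds and each of two simultaneous tasks takes t2 seconds, running two instances only pays off when 2/t2 is larger than 1/t1, i.e. when t2 is less than 2*t1.

```cpp
#include <cassert>

// Tasks finished per second of wall-clock time when `instances` copies run
// side by side and each copy needs `seconds_per_task` to finish.
double throughput(int instances, double seconds_per_task) {
    assert(instances > 0 && seconds_per_task > 0.0);
    return instances / seconds_per_task;
}

// True when two simultaneous instances finish more work per second than one.
// t1: seconds per task when run alone; t2: seconds per task when two run together.
bool two_instances_pay_off(double t1, double t2) {
    return throughput(2, t2) > throughput(1, t1);
}
```

With hypothetical timings of 100 s alone and 150 s each when doubled up, two instances win; at 210 s each the overhead has eaten the gain and a single instance is the better choice.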
12) Message boards : Number crunching : Building on Arm Linux (Message 1929206)
Posted 11 days ago by Raistmer

I am uncertain now what to do. Is this an error in the configuration script? Or is there still something wrong with the parameters given to the configure script? I am grateful for any hints.

Congrats on getting this far!

Apparently some misconfiguration is taking place.
I would start by fixing the particular compiler error (maybe it's not the "correct" way, but I inherently don't trust "too smart" tools and *nix autoconfigure scripts that "know better" what capabilities your system has; this armv8 issue falls into that category).

So, try to establish where exactly the -march=armv8 option comes from and why the compiler string gets this option.
Find the place where it is assigned and hack it to manually assign armv7 or native, no matter what autoconfig thinks it should be.
13) Message boards : Number crunching : Building on Arm Linux (Message 1927677)
Posted 20 days ago by Raistmer
Until someone with fresher knowledge steps in, try looking through this thread: (ARM processor) app and alternatives
14) Message boards : Number crunching : Mali-T628 GPUs (Message 1927675)
Posted 20 days ago by Raistmer

It seems that libboinc.a and libboinc_api.a are missing despite the fact that I installed the boinc-dev package.
I have tried to contact Urs. He has not replied yet. Is there any active mailing list for developers?

1) I never attempted to use any BOINC dev package.
The usual way to go is just to download the BOINC sources, place them:
and attempt to build SETI against them.

2) Yes, there is a dev list:
But I would not say it's very active. It could help to ask there, though.
15) Message boards : Number crunching : Titan V and validation error rates (Message 1927499)
Posted 21 days ago by Raistmer
Well, it may be just a sync issue, as with our apps.
This generation takes parallelism further, so a new missing sync point can surface.
AMBER can have such bugs too.
And sync errors can indeed result in non-reproducible deviations from correct answers.
One should be very careful when pointing to "add" errors instead.
16) Message boards : Number crunching : Mali-T628 GPUs (Message 1927185)
Posted 23 days ago by Raistmer
Well, besides getting in touch with Urs, I would recommend a comparison approach.
Check out the stock codebase and attempt to build the CPU ARM app from it. If that succeeds, there will be some ground for comparing what is wrong with the opt codebase config.
So far I have done an ARM build from the stock codebase (on Parallella) but never tried with the opt one (nor with OpenCL, which is based on the optimized codebase).
17) Message boards : Number crunching : Mali-T628 GPUs (Message 1927113)
Posted 23 days ago by Raistmer
So, are you able to build an ARM binary from the OpenCL codebase?
If you can, try to get a binary with the same defines as in
put the binary, workunit.sah and init_data.xml into the same dir and try to run it.
What does stderr.txt contain?
18) Message boards : Number crunching : Mali-T628 GPUs (Message 1926845)
Posted 25 days ago by Raistmer
What I can offer is to test code and provide documentation. I would also be willing to sponsor an Odroid XU4 or similar device if someone sufficiently knowledgeable is keen to add Mali GPU support to Seti@Home.



That's good too.
Get in touch with Urs Echternacht, as he is the most appropriate person for OpenCL+Linux/Android combos.
If you are able to build executables for your device by yourself (provided you have sources to build from), we could try to test my OpenCL code's compatibility with Mali together.

Some modifications will be needed because at startup there is a check for particular vendors (OpenCL portability, just like any other portability, is a myth, at least if one wants a properly and fast working app).
So a new vendor should be added. Urs is working in that direction (and AFAIK the MESA driver currently produces invalid results), so I would rather aim at some other OpenCL stack if available.
The CLinfo posted in this thread (by Kiska) looks OK at first glance. The device is w/o local memory, but there is an old code path that can handle such devices too.
The local-memory-oriented path will suffer performance-wise but should work too.
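To make the startup logic concrete, here is a rough sketch of a vendor check followed by a code-path choice. Everything in it is hypothetical: the struct, the enum, the function names and the vendor list are stand-ins, and no OpenCL call is made. In a real app the two fields would come from clGetDeviceInfo() (CL_DEVICE_VENDOR and CL_DEVICE_LOCAL_MEM_TYPE).

```cpp
#include <string>

// Stand-in for the values an app would read through clGetDeviceInfo().
// The struct and its field names are hypothetical.
struct DeviceInfo {
    std::string vendor;        // CL_DEVICE_VENDOR string
    bool dedicated_local_mem;  // CL_DEVICE_LOCAL_MEM_TYPE == CL_LOCAL
};

enum class KernelPath { Unsupported, LocalMemory, NoLocalMemory };

// Illustrative vendor allow-list; the real app's list is not shown in the post.
inline bool vendor_is_known(const std::string& vendor) {
    static const char* const known[] = {
        "NVIDIA", "Advanced Micro Devices", "Intel", "ARM"};
    for (const char* k : known)
        if (vendor.find(k) != std::string::npos) return true;
    return false;
}

// Startup decision: refuse unknown vendors, otherwise pick the kernel path
// that matches the device's local-memory capability.
inline KernelPath choose_path(const DeviceInfo& dev) {
    if (!vendor_is_known(dev.vendor)) return KernelPath::Unsupported;
    return dev.dedicated_local_mem ? KernelPath::LocalMemory
                                   : KernelPath::NoLocalMemory;
}
```

Adding a new vendor then amounts to extending the list and verifying results on that vendor's driver, which is the part that takes the actual work.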

P.S. codebase for OpenCL app lives here:
19) Message boards : Number crunching : Mali-T628 GPUs (Message 1926734)
Posted 26 days ago by Raistmer
The latter are said to support OpenCL 1.2.

What does the CLinfo utility report?
20) Message boards : Number crunching : M$ crazyness (Message 1925479)
Posted 20 Mar 2018 by Raistmer
Yep, hard-coding the system drive as C: is an issue too, but here it's a more peculiar case.
They detected that my D: drive is an SD card, so "removable". And though I use it as constantly attached (as in smartphones), they refuse to install on this drive.
What smartass behavior :).
Of course I have circumvented this already, but having to spend time on such things.... bad-bad-bad M$...

Most astonishing is that VS 15.4 installed freely on the same "removable" D: before. And with 15.6 they decided it should not work...

©2018 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.