SETI orphans

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073130 - Posted: 12 Apr 2021, 10:53:32 UTC - in response to Message 2073127.  

So you need to install Intel SDK.
"I may be some time"
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073131 - Posted: 12 Apr 2021, 10:54:32 UTC - in response to Message 2073128.  

But remember the final 'compilation for BOINC' is done by IBM, not Scripps.

I haven't even attached to this project :) So I have absolutely no idea how, who, or what :)
But yes, it seems these are "those" sources.

#include "autostop.hpp"
#include "performdocking.h"
#include "stringify.h"
#include "correct_grad_axisangle.h"

Familiar name :)
SETI apps news
We're not gonna fight them. We're gonna transcend them.
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073133 - Posted: 12 Apr 2021, 11:02:58 UTC - in response to Message 2073130.  

So you need to install Intel SDK.
"I may be some time"


Time ... it's the limiting factor indeed.
Could you zip an offline-runnable bench and put it somewhere for download?
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073134 - Posted: 12 Apr 2021, 11:18:05 UTC - in response to Message 2073133.  

Could you zip an offline-runnable bench and put it somewhere for download?
It's on GoogleDrive already. I'll PM you a link.

I also need to find and copy my notes to the original beneficiary - give me a few minutes.
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073139 - Posted: 12 Apr 2021, 11:29:49 UTC - in response to Message 2073129.  
Last modified: 12 Apr 2021, 11:39:41 UTC

It seems VTune has OpenCL support now. It's a very venerable tool, going back to, I would say, the Pentium era or even earlier...
https://software.intel.com/content/www/us/en/develop/tools/oneapi/base-toolkit/download.html?operatingsystem=window&distributions=webdownload&options=offline

3.5GB installer :)


Huh? Is the Intel C++ compiler included? In a free download? Remember that scandal over using Intel's optimizing compiler for the free SETI project?...
Well... But right now we only need VTune from this pack.

EDIT: So many years have passed and they still can't provide an adequate installer. Mighty Intel...
If one selects only VTune (and the interface allows it!), installation fails, complaining about the compiler...
Attempt two...
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073149 - Posted: 12 Apr 2021, 12:23:08 UTC

C:\__Test>wcgrid_beta29_autodockgpu_7.28_windows_x86_64__opencl_intel_gpu_102 -jobs OPNG_0000025_00056.job -input OPNG_0000025_00056.zip -seed 160279976 -wcgruns 1700 -wcgdpf 34
Running 1 jobs in pipeline mode
Warning: value of -devnum argument ignored. Value must be an integer between 1 and 65536.
AutoDock-GPU version: 51800118f2c7e78ac6794e087a956e50737c5d85-dirty

Kernel source file: ./device/calcenergy.cl
Kernel compilation flags: -I ./device -I ./common -DN128WI -cl-mad-enable
OpenCL device: Intel(R) HD Graphics 500
(Thread 0 is setting up Job #1)

Running Job #1:
Fields from: receptor.maps.fld
Ligands from: ZINC000309335454-ACR2.13_RX1--fr2266benz_001--CYS114.pdbqt
Using heuristics: (capped) number of evaluations set to 2818671
Local-search chosen method is: ADADELTA (ad)

Executing docking runs, stopping automatically after either reaching 0.15 kcal/mol standard deviation of
the best molecules of the last 4 * 5 generations, 27000 generations, or 2818671 evaluations:

Generations | Evaluations  | Threshold        | Average energy of best 10%   | Samples | Best energy
------------+--------------+------------------+------------------------------+---------+-------------------
          0 |          100 |    2.90 kcal/mol |       0.30 +/- 1.04 kcal/mol |       4 | -0.82 kcal/mol
          5 |        34051 |    2.90 kcal/mol |      -1.73 +/- 1.33 kcal/mol |     978 | -5.86 kcal/mol
         10 |        63164 |   -1.71 kcal/mol |      -4.94 +/- 0.41 kcal/mol |      70 | -5.87 kcal/mol
         15 |        90893 |   -4.88 kcal/mol |      -5.76 +/- 0.18 kcal/mol |      23 | -6.28 kcal/mol
         20 |       118377 |   -5.68 kcal/mol |      -6.01 +/- 0.23 kcal/mol |      19 | -6.78 kcal/mol

It runs offline, at least.
What if I cut the jobs list down to just a single entry?
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073150 - Posted: 12 Apr 2021, 12:24:23 UTC - in response to Message 2073149.  

-cl-mad-enable and iGPU.... danger :))))
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073151 - Posted: 12 Apr 2021, 12:28:24 UTC - in response to Message 2073150.  
Last modified: 12 Apr 2021, 12:30:01 UTC



And it just finished the first job. It seems my HD 500 is powerful enough to get through without triggering the watchdog.
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073152 - Posted: 12 Apr 2021, 12:38:33 UTC - in response to Message 2073150.  

-cl-mad-enable and iGPU.... danger :))))
I've told him that. At the moment, they're requiring validation against same device class only - so it says it's OK, but might not mean it. He's been warned about that too - all those faulty drivers polluting the database here by cross-validating.
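
To illustrate why a flag such as -cl-mad-enable matters for validation, here is a minimal sketch in plain Python (double precision on the CPU, not OpenCL). The same a * b + c gives different answers depending on whether the product is rounded before the addition. The values are picked purely for illustration, and the sketch says nothing about how any particular iGPU driver actually evaluates mad.

```python
from fractions import Fraction

# Illustrative values only; all three are exactly representable as doubles.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

# Two roundings: the product is rounded to a double, then the sum is rounded.
two_step = a * b + c

# One rounding: evaluate a*b + c exactly with rationals, then round once at the end.
# This is the value a fused multiply-add would return.
single_rounding = float(Fraction(a) * Fraction(b) + Fraction(c))

print(two_step, single_rounding)
```

The OpenCL specification describes -cl-mad-enable as allowing a * b + c to be replaced by a mad computed with reduced accuracy, so differences of this kind between devices are permitted behaviour rather than a driver bug in themselves.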
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073153 - Posted: 12 Apr 2021, 12:40:40 UTC - in response to Message 2073149.  

What if I cut the jobs list down to just a single entry?
Haven't tried to find exactly how to do that. Feel free - it would certainly be quicker.

But CPU load between sub-jobs needs optimising too.
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073154 - Posted: 12 Apr 2021, 12:56:56 UTC - in response to Message 2073153.  
Last modified: 12 Apr 2021, 13:38:34 UTC

What if I cut the jobs list down to just a single entry?
Haven't tried to find exactly how to do that. Feel free - it would certainly be quicker.

But CPU load between sub-jobs needs optimising too.

It works as expected: it just optimizes the first atom config and exits.

But.... different energies! Quite different!

Finished evaluation after reaching -6.68 +/- 0.13 kcal/mol combined.
45 samples, best energy -6.94 kcal/mol.


I'll try running the test a few more times and will edit this post with the values.

Richard's run:
Finished evaluation after reaching -8.37 +/- 0.06 kcal/mol combined.
51 samples, best energy -8.49 kcal/mol.

My second run:
Finished evaluation after reaching -6.92 +/- 0.07 kcal/mol combined.
48 samples, best energy -7.09 kcal/mol.

One more run:
Finished evaluation after reaching -6.91 +/- 0.15 kcal/mol combined.
23 samples, best energy -7.07 kcal/mol.

So they DEFINITELY didn't reach the real minimum in those runs.

An interesting task: validating two results against each other when each new run, even on the very same host/device/initial setup, gives different data :))))

EDIT:
Genetic algorithms tend to give different results between runs because random "mutations" are included, but usually the final value is approximately the same; only the length and trajectory of the path to it differ from run to run.
Here the situation is quite different: the final optimized energy value differs between runs, and quite strongly. [Hence, assuming they know what they are doing, the result value is not the final energy minimum, and some additional stage to find the true minimum will be done on the servers.]

But from a "number crunching" point of view, it's not quite clear how to validate such results correctly, especially if -cl-mad-enable can add subtle errors to the calculations. Such errors can be much smaller than the deviation between runs...
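
One way to put numbers on this, as a minimal sketch: parse the "Finished evaluation" lines quoted in this post and compare the run-to-run spread with the uncertainty each run reports for itself. The helper name and the regular expression are illustrative assumptions written against these four log lines only; they are not part of AutoDock-GPU or of any WCG validator. Note that one of the four runs comes from a different host.

```python
import re
import statistics

# Hypothetical pattern, matched only against the log lines quoted in this post.
PATTERN = re.compile(r"reaching\s+(-?\d+\.\d+)\s+\+/-\s+(\d+\.\d+)\s+kcal/mol combined")

def parse_runs(log_text):
    """Return a list of (energy, reported_sigma) tuples, one per finished run."""
    return [(float(e), float(s)) for e, s in PATTERN.findall(log_text)]

LOG = """
Finished evaluation after reaching -6.68 +/- 0.13 kcal/mol combined.
Finished evaluation after reaching -8.37 +/- 0.06 kcal/mol combined.
Finished evaluation after reaching -6.92 +/- 0.07 kcal/mol combined.
Finished evaluation after reaching -6.91 +/- 0.15 kcal/mol combined.
"""

runs = parse_runs(LOG)
energies = [energy for energy, _ in runs]
spread = statistics.stdev(energies)               # sample std dev across runs
largest_sigma = max(sigma for _, sigma in runs)   # largest per-run reported value

print(f"runs: {len(runs)}")
print(f"run-to-run std dev: {spread:.2f} kcal/mol")
print(f"largest per-run sigma: {largest_sigma:.2f} kcal/mol")
```

For these four runs the spread between runs is several times larger than any single run's own +/- figure, which is the core of the validation problem: a tolerance wide enough to accept honest runs would also hide small arithmetic errors.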
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073155 - Posted: 12 Apr 2021, 13:29:33 UTC - in response to Message 2073152.  

all those faulty drivers polluting the database here by cross-validating.


And non-reproducible results add to that complexity.
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073170 - Posted: 12 Apr 2021, 15:41:06 UTC - in response to Message 2073155.  
Last modified: 12 Apr 2021, 15:50:35 UTC

OK, got some initial profile data


separate spikes....

Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073171 - Posted: 12 Apr 2021, 16:31:04 UTC - in response to Message 2073170.  



Some unbelievably low numbers ://
Is the sampling overhead too big?...
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073179 - Posted: 12 Apr 2021, 18:31:53 UTC - in response to Message 2073171.  
Last modified: 12 Apr 2021, 18:33:14 UTC

Seems I found adequate metrics and config:




That gradient_minAD kernel takes 4 s per launch.
Why TDR tolerates that, I have no idea; maybe I disabled it some time ago...
The launch space (global size) is huge: 640000 per kernel call.

It may be worth decreasing it to avoid crashes.
That's all for today, Richard; at least it's something to start with.
There are 19 launches caught in the log, running one after another. Maybe the dev could launch twice as many, but smaller... (19 is an arbitrary number, just the length of the data collection, while 4 s per launch and 640000 items per launch characterize the kernel itself.)
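
As a rough sketch of the "more but smaller launches" idea: if run time scaled linearly with the number of work-items (an assumption, not something measured here), the number of pieces needed to keep each launch under a time budget could be estimated as below. The 2-second budget is the default Windows TDR delay; the function name and the safety factor are hypothetical.

```python
import math

def launches_needed(seconds_per_launch, budget_seconds, safety_factor=0.5):
    """Smallest number of equal pieces so that each piece takes at most
    budget_seconds * safety_factor, assuming time scales linearly with size."""
    target = budget_seconds * safety_factor
    return max(1, math.ceil(seconds_per_launch / target))

# Figure from the profile above: about 4 s per gradient_minAD launch.
pieces = launches_needed(seconds_per_launch=4.0, budget_seconds=2.0)
print(pieces)
```

With no safety margin (safety_factor=1.0) the same estimate gives two pieces, which matches the "twice as many" suggestion; the margin is there because slower devices than the HD 500 would take longer per launch.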
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073180 - Posted: 12 Apr 2021, 18:48:32 UTC - in response to Message 2073179.  

Thanks for that. I'm about to go down and see if my little Celeron can provide any matching figures for that. Not sure yet if I'll be able to pass anything on today - partly depends if Uplinger comes back with anything after his weekend. Otherwise, tomorrow.
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24877
Credit: 3,081,182
RAC: 7
Ireland
Message 2073221 - Posted: 13 Apr 2021, 8:17:59 UTC

What I find fascinating is the variations in numbers that crunching throws our way.
For example
The teams stats for Sun
11/04/2021 49 12 59 7 843,522 679
For Mon
12/04/2021 49 7 3 50 1,043,506 681
A few more of those 2 wu's please. :-)
Yet for 300 fewer results
01/04/2021 57 19 32 20 312,045 375
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073229 - Posted: 13 Apr 2021, 11:52:25 UTC - in response to Message 2073179.  

Finally got mine to work - I think the first run stalled, but I didn't see any messages about driver restarts. It's now free to carry on as long as it wants.



Pretty much confirms your result. So, what's the best way to convey this to WCG/Scripps? I can't see any way of capturing the report except by screen-grabbing.

(This data came from a full run of the first job in the data file. Many parts of the report say "No data to show. The collected data is not sufficient." I'll run it again on the full-length task, and see if that changes anything. Test machine details are here as host 8888441)
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073245 - Posted: 13 Apr 2021, 14:29:09 UTC - in response to Message 2073229.  

I think this picture is enough to point to the main problem - there is a kernel that runs too long on low-end hardware.
And there is a good chance the issue can be solved very easily - just split the single call into a few. A global size of 640000 easily allows that. There are 640k / 128 = 5000 workgroups, which will fill almost any GPU, even the biggest ones. So the kernel could even be split unconditionally (not only for low-end devices).
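
A minimal sketch of how such a split could be laid out, assuming a local size of 128 and a kernel that indexes its data through the global ID. Each (offset, size) pair would be passed as global_work_offset and global_work_size to a separate clEnqueueNDRangeKernel call. The helper name is hypothetical, and whether AutoDock-GPU's kernels index by global ID or by group ID is not known here (a global offset does not shift group IDs, so a kernel indexing by group ID would need the offset passed in some other way).

```python
def split_launch(global_size, local_size, pieces):
    """Split global_size work-items into up to `pieces` contiguous chunks,
    each a whole number of work-groups. Returns a list of (offset, size)."""
    if global_size % local_size != 0:
        raise ValueError("global size must be a multiple of the local size")
    groups = global_size // local_size
    pieces = min(pieces, groups)
    base, extra = divmod(groups, pieces)
    chunks, offset = [], 0
    for i in range(pieces):
        # The first `extra` chunks get one additional work-group each.
        size = (base + (1 if i < extra else 0)) * local_size
        chunks.append((offset, size))
        offset += size
    return chunks

chunks = split_launch(global_size=640000, local_size=128, pieces=4)
for offset, size in chunks:
    print(offset, size)
```

Splitting into four gives 1250 work-groups (160000 work-items) per call, still far more groups than any current GPU has compute units, so occupancy should not suffer from the split.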
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073298 - Posted: 14 Apr 2021, 9:48:53 UTC - in response to Message 2073229.  

This data came from a full run of the first job in the data file. Many parts of the report say "No data to show. The collected data is not sufficient."
So, the full run (34 jobs) also failed - not enough disk space for analysis. I've freed some up, and I'm trying again with 15 jobs - hoping that's the Goldilocks setting.

Otherwise, we're going to need a bigger disk...


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.