Posts by -= Vyper =-

1) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1890432)
Posted 2 days ago by Profile -= Vyper =-Project Donor
I agree as well with both TBar's and petri33's statements.

A noisy overflowed WU is what it is. There is a reason this limit was set, and if they change it in the future, they will modify the original code tree to allow perhaps 60, 100 or even 500 signals. Who knows?
But as they both say: if a task finds thirty signals within a few seconds, consider it "aborted" in everyone's minds instead; then the terminology of "30 signals found, overflow triggered" would be easier to understand.

My 2 cents!
2) Message boards : Number crunching : Tesla K80 on Google Cloud platform - not recognised (Message 1882113)
Posted 4 Aug 2017 by Profile -= Vyper =-Project Donor
Try TeamViewer. It works flawlessly.
3) Message boards : Number crunching : Ryzen on Linux (Message 1880059)
Posted 24 Jul 2017 by Profile -= Vyper =-Project Donor
My Ryzen isn't running very hot.
I usually see 51°C to 55°C running six instances on the CPU plus the GPU.
I have a Noctua D15 installed, and it's not even running at full load at the moment.
Of course your 1700 is overclocked, which is different.

I have Linux running on a 64 GB USB 3 stick. That was just 60 bucks.
It runs smoothly and very fast.
You just need some time to read through everything.

A small warning is in order here.

BOINC and the SETI app both write to disk (the USB stick). BOINC writes just as it is told to in the user interface (about once a minute).
The (GPU) app writes checkpoints hundreds of times a second unless the source code is modified not to do so.
Please make a backup for your own sake.

My NV code will not write any checkpoints in the next version. A restarted task will start over from 0%, but that is not such a big loss, is it?
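For illustration only: the generic way to tame such writes, without dropping checkpoints entirely, is to gate them behind a minimum interval instead of writing on every iteration. A self-contained sketch (this is not the special app's actual code, and the 60-second interval is just an example):

```python
import time

class CheckpointGate:
    """Allow a state write at most once per `interval_s` seconds,
    instead of on every compute-loop iteration (which hammers a
    USB stick with writes)."""
    def __init__(self, interval_s=60.0):
        self.interval_s = interval_s
        self._last = float("-inf")  # so the first call is always due

    def due(self, now=None):
        """Return True (and arm the timer) when a checkpoint is allowed."""
        now = time.monotonic() if now is None else now
        if now - self._last >= self.interval_s:
            self._last = now
            return True
        return False

# Usage with explicit timestamps, for clarity:
gate = CheckpointGate(interval_s=60.0)
assert gate.due(now=0.0)        # first call: checkpoint is due
assert not gate.due(now=30.0)   # 30 s later: skipped
assert gate.due(now=61.0)       # past the interval: due again
```

In a real compute loop you would call `gate.due()` with no argument each iteration and only serialise state when it returns True.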

Best regards,

This is solved by reading this and checking the section for Approach 2!
4) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1879431)
Posted 21 Jul 2017 by Profile -= Vyper =-Project Donor
Hi TBar..

Just wanted you to know that I've switched my quad 750 Ti machine to your application now, in case you want a machine on which to monitor misbehaviour in the executable; but I presume that won't be much of an issue.

Thanks for keeping it up!
5) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875361)
Posted 27 Jun 2017 by Profile -= Vyper =-Project Donor
How do we know that the CPU portion of the latest code isn't affected by sporadic errors when running with Hyper-Threading enabled?
Has anybody run tests with and without HT on Skylake and Kaby Lake computers?
6) Message boards : Number crunching : Debian Project Warns: Turn off Hyperthreading with Skylake and Kaby Lake (Message 1875360)
Posted 27 Jun 2017 by Profile -= Vyper =-Project Donor
Thank you for the heads-up! +1
7) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1875358)
Posted 27 Jun 2017 by Profile -= Vyper =-Project Donor
BTW, can Petri's app run on mobile GPUs such as these?

NVIDIA GeForce 940MX
NVIDIA GeForce 820M

This is what I've found out:

"The executable is version zi3t2b and it can be run on sm_35, 50, 52, and 61 (750, 780, 980, 1080 and the like).
With 1 GB of GPU RAM you need -unroll 1. Others can use -unroll autotune.
Use -bs to reduce CPU usage.
Set -pfb to 8, 16 or 32."

EDIT: The 820M seems out of luck (CC 2.1 only), but the 940MX (CC 5.0) seems to work.
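For anonymous-platform setups, command-line flags like the ones quoted above normally go in the `<cmdline>` element of BOINC's app_info.xml. A fragment sketch only; the surrounding version and file-reference elements are elided, and the app name and flag values shown are assumptions for illustration:

```xml
<app_version>
    <app_name>setiathome_v8</app_name>
    <!-- ... version_num, file_ref entries, plan_class, etc. ... -->
    <cmdline>-unroll autotune -bs -pfb 16</cmdline>
</app_version>
```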
8) Message boards : Number crunching : CES 2017 -- AMD RYZEN CPU (Message 1847796)
Posted 10 Feb 2017 by Profile -= Vyper =-Project Donor

Oh, is that so! If this is true, this could be a CPU cruncher's wet dream..
9) Message boards : Number crunching : Transferring work units from a dead laptop (Message 1841366)
Posted 11 Jan 2017 by Profile -= Vyper =-Project Donor
Was it a Windows XP computer?
10) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1836823)
Posted 18 Dec 2016 by Profile -= Vyper =-Project Donor

But I'm using 16.04 on one of my hosts. It worked like a charm!
Though I started by installing only the server version, with nothing more than the OpenSSH server, and continued from there.
Good that more people are jumping on the Linux bandwagon to eke the most out of their hardware at the moment.
11) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1834772)
Posted 8 Dec 2016 by Profile -= Vyper =-Project Donor
TBar: Have you compared the speed of your build against Petri's different builds? Good that the invalid rates are down, but as we all know by now, we can't eliminate the way the validator works either.
The faster a host produces results, the higher its inconclusive ratio seems to be, until it tapers off.

What I write below is my theory:

By that I mean: if you have a slow host that doesn't process that many WUs per day, you tend to end up crunching units that your wingman has already crunched. If the validator compares a (I'll call it "Petri CUDA") result against one that has already been crunched, you get a validation pass, both get rewarded credits, and the WU is soon cleared from the system ("Cannot find the WU"), as we can see once they have been processed; thus the invalid ratio is low!

In the opposite case, an ultra-speedy system that crunches thousands of WUs per day gets more inconclusives, because that machine is so fast that it returns its work first of all and then waits for the other computers to catch up. When they start to return WUs and the overflowed results come pouring in, that speedy machine's inconclusive ratio rises faster than the others' as well.

/End of Theory

What actually matters, of course, is that the code does the work properly! Q ratio as high as possible in various tasks (GBT, high/low AR, etc.); you all know that part. But the value TBar refers to as "consecutive valid tasks" is the main thing to keep track of, in my mind, not the inconclusive count, because the more parallel the code, the more inconclusives we will get, whether it's a CPU, GPU, FPGA, PS4, or whatever.

Thanks for your work, TBar, and thank you, Petri, for going the brute-force route of taking advantage of the newer hardware that made this leap possible. The latest SoG is also speedy as hell! My 1080 is utilized better now than when running multiple parallel CUDA tasks! Thank you Raistmer, Jason, Urs and all you Alpha/Beta testers and others who have contributed to getting us where we are at the moment! The list of people would get long.
12) Message boards : Number crunching : Spammers (Message 1830835)
Posted 16 Nov 2016 by Profile -= Vyper =-Project Donor
Delete: Just tested. From: -= Vyper =- (Is this for real, with extreme numbers of letters allowed in the name? Why is it even possible in the first place? Who could have figured that you could write a novel here, with so many characters allowed in the name? Lol)
13) Message boards : Number crunching : Spammers (Message 1830834)
Posted 16 Nov 2016 by Profile -= Vyper =-Project Donor
Omg! I tested and it worked! Lol! :D
14) Message boards : Number crunching : The wonders of Micro$oft Windoze (Message 1829935)
Posted 11 Nov 2016 by Profile -= Vyper =-Project Donor
And this then?
15) Message boards : Politics : Hillary Clinton - the next president of America? (Message 1829266)
Posted 9 Nov 2016 by Profile -= Vyper =-Project Donor
It seems to be over for Hillary now!

Trump is Trumphiant :P
16) Message boards : Politics : Donald Trump for President? (Message 1829265)
Posted 9 Nov 2016 by Profile -= Vyper =-Project Donor
Lol! It seems done now!

The World just got more Trumpified! :D
17) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1827674)
Posted 31 Oct 2016 by Profile -= Vyper =-Project Donor
So, why is the Intel build faster on the nVidia cards?

Nice find!
Maybe Intel crippling is back again, or something? I don't know, I can only guess. They've done it in the past and may very well do so again :)
18) Message boards : Number crunching : Moore's Law illustrated (Message 1827036)
Posted 27 Oct 2016 by Profile -= Vyper =-Project Donor
Lol, I programmed in assembly and loved it back in my Amiga days. Workbench-friendly assembly with correct calls to the libraries! :)
19) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1825193)
Posted 18 Oct 2016 by Profile -= Vyper =-Project Donor
And here are the CPU code vs. two SoG results that are inconclusive:

Task 5221951752 | computer 8108844 | sent 16 Oct 2016 10:45:17 UTC | reported 17 Oct 2016 11:24:57 UTC | Completed, validation inconclusive | run 1,617.93 s | CPU 1,571.58 s | credit pending | SETI@home v8, Anonymous platform (CPU)
Task 5221951753 | computer 7804722 | sent 16 Oct 2016 10:45:20 UTC | reported 16 Oct 2016 15:08:30 UTC | Completed, validation inconclusive | run 625.20 s | CPU 603.70 s | credit pending | SETI@home v8 v8.10 (opencl_nvidia_SoG)
Task 5224825595 | computer 8060288 | sent 17 Oct 2016 16:47:42 UTC | reported 18 Oct 2016 3:35:46 UTC | Completed, validation inconclusive | run 826.10 s | CPU 815.27 s | credit pending | SETI@home v8 v8.19 (opencl_nvidia_SoG)
20) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1823219)
Posted 10 Oct 2016 by Profile -= Vyper =-Project Donor
Sweet, sweet. Well, we can now conclude that we can bin the precision questions I've raised lately.

Spending a lot more time to accomplish nothing is not justifiable! Thanks, Richard and Raistmer, for this.
Shall we enter the next phase, then? I'm venting an idea now.

Should there be a double-precision variant of the CPU executable, and a set of WUs that would of course be slow as hell to calculate, but with so much precision that it would set the "gold standard" (Q=100) in the .res files, making them the reference values that every other optimised production executable tries to get as close to Q=100 as possible against?
That application is not meant for users; I'm talking about this being the best result that can ever be calculated for each WU out there, used by you optimisers and the S@H crew themselves as the origin and definition of what "perfect" would be!

Ok, part two then:
Then we have the validator to address, by inventing a gold standard for how and in what manner the returned data is sorted!
As it is today, it "seems like" that if we take a garbled WU and send it to a CPU, an older GPU code, and a newer GPU code, the results differ.
We all know that we have that limit on allowed storage space (30 signals).
If we imagine that we remove that limit and dig through the whole WU, we might find, say, 78 spikes, 112 pulses, 7 triplets, etc.
As it is today, the calculation stops when it reaches 30 detections and sends the result back.
The CPU starts processing from 0%, progresses linearly to 100%, and along the way finds, for example, 20 pulses, 8 spikes and 2 triplets, in this order:

PPSPSSPPPTPPPSSPPPSPPPPTPPSSPP... boom, 1,870 seconds spent on the linear CPU.

Now take the old GPU code, which is sped up significantly but is still "serial" even though it can calculate portions faster; it produces:

PPSPSSPPPTPPPSSPPPSPPPPTPPSSPP... boom, 165 seconds in it stops with the same result, since it is a straight CPU-to-GPU port and the code hasn't evolved beyond a regular port.

Ok, then let's move on to the other kind of executable, the newly sped-up one:

PPSPSTPPPSPSSSTPPPSSPPSPPSPSPP... boom, 45 seconds in it stops and sends this back.

Now this looks wrong to the validator, because it differs so much in the numbers found and in their order. But in reality, if we removed the 30-signal limit and let every code variant crunch through the whole WU, they would all find the same totals (78 spikes, 112 pulses, 7 triplets), just in a different order for the faster executable, and the values at every measurement point would be correct.
As it stands today, the last executable's returned data gets an "inconclusive" mark, and its inconclusive rate is of course higher.
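The argument above can be sketched in a few lines. This is a toy illustration, not the real SETI validator: two apps that find the same full set of signals in a different discovery order can disagree once only the first 30 signals are kept. The 78/112/7 totals are the hypothetical numbers from the text.

```python
import random
from collections import Counter

LIMIT = 30  # the storage limit discussed above

# Hypothetical full contents of one WU: 78 spikes, 112 pulses, 7 triplets.
full = ["S"] * 78 + ["P"] * 112 + ["T"] * 7

rng = random.Random(1)
serial_order = full[:]    # discovery order of the serial (CPU-like) app
rng.shuffle(serial_order)
parallel_order = full[:]  # discovery order of the parallel (GPU-like) app
rng.shuffle(parallel_order)

# Over the whole WU, both apps agree exactly on what was found:
assert Counter(serial_order) == Counter(parallel_order)

# But each app stops after LIMIT signals, so the reported prefixes can
# differ in order, and even in the mix of signal types:
a, b = serial_order[:LIMIT], parallel_order[:LIMIT]
print("order-sensitive match:", a == b)
print("signal-type counts match:", Counter(a) == Counter(b))
```

An order-insensitive comparison of the full results would always agree; a comparison of the truncated, order-dependent prefixes often will not, which is the claimed source of the extra inconclusives.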

Until someone makes a multicore version of the S@H CPU executable, exactly the way Petri seems to have done with the GPU version, these "inconclusive result" numbers will stay high.
If that were done, and BOINC knew about it, a 12-core CPU would start only one task, but it would process it much faster, with 100% utilisation on all cores and short finishing times; and the validator would match the latest CPU code against the latest GPU code, because the processing pattern is the same, and inconclusives would drop to perhaps 10/1000 instead of the 150/1000 we see today.

The more parallel the execution, the more diversity in inconclusives there will be.
Can this disparity be fixed before the CPU code catches up and goes multicore? I don't know; only you optimizers do!


©2017 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.