Posts by jason_gee


1) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812340)
Posted 5 hours ago by Profile jason_gee
Yeah, the choice of which task and result files would be good to dig for and poke at will probably come down to the individual problems, and how Richard+Raistmer plan to look at them.

For Cuda mysteries I prefer lab-condition runs with actual files, though that's just my approach and others may not need that, depending on the problem. Many can be ruled out as flaky hosts/GPUs pretty easily (with stock Cuda).

In the case of Petri Special, we already have a good head start with the one Pulses example currently under the microscope, because the result files say more than the stderr prints. We'll probably be looking for more after some things have been nailed down with that, but extensive collection for the Cuda special builds probably won't be necessary short term.
2) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812194)
Posted 15 hours ago by Profile jason_gee
Well, Kiska's original inconclusive against Petri's older build (on WezH's Linux machine) confirmed a lot of what we knew, and possibly holds some hints for development to sift through, so I will bundle it up and send it to Petri.

Note that my Windows zkpr3 test build, which I consider broken and which is already outdated, probably reflects similar characteristics to the Linux one WezH's running, and precedes the last changes in alpha. Will probably get to attempting a revised test build after having a wisdom tooth pulled later today (we'll see).

C:\Users\Jason\Downloads\[alpha]\kiska\Comparison>rescmpv4.exe cuda42_result_from_kiska.res ref-setiathome_8.00_windows_intelx86.exe-kiska_guppi.vlar.wu.res
Result : Strongly similar, Q= 99.22%

C:\Users\Jason\Downloads\[alpha]\kiska\Comparison>rescmpv4.exe jason-result-Lunatics_x41zj_win32_cuda50.exe-kiska_guppi.vlar.wu.res ref-setiathome_8.00_windows_intelx86.exe-kiska_guppi.vlar.wu.res
Result : Strongly similar, Q= 99.22%

C:\Users\Jason\Downloads\[alpha]\kiska\Comparison>rescmpv4.exe jason-result-Lunatics_x41zj_win32_cuda50.exe-kiska_guppi.vlar.wu.res cuda42_result_from_kiska.res
Result : Strongly similar, Q= 100.0%

C:\Users\Jason\Downloads\[alpha]\kiska\Comparison>rescmpv4.exe result-Lunatics_x41zkpr3_winx64_cuda65.exe-kiska_guppi.vlar.wu.res ref-setiathome_8.00_windows_intelx86.exe-kiska_guppi.vlar.wu.res
                ----- R1:R2 ------  ----- R2:R1 ------
                Tight  Good  Bad    Tight  Good  Bad
Spike               0     0    0        0     0    0
Autocorr            1     1    0        1     1    0
Gaussian            0     0    0        0     0    0
Pulse               3     3    0        3     3    2
Triplet             4     4    0        4     4    0
Best Spike          1     1    0        1     1    0
Best Autocorr       1     1    0        1     1    0
Best Gaussian       1     1    0        1     1    0
Best Pulse          0     0    1        0     0    1
Best Triplet        1     1    0        1     1    0
                 ----  ----  ----    ----  ----  ----
                   12    12    1       12    12    3

Unmatched signal(s) in R1 at line(s) 554
Unmatched signal(s) in R2 at line(s) 383 471 606
Result : Weakly similar.
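
When collecting many comparisons like the ones above, the verdict lines could be pulled out of a saved console log with a small script. This is only a sketch in Python: it assumes one "Result :" line per log line, in the two shapes shown in this post, and the helper name `parse_verdict` is made up for illustration (it is not part of rescmpv4).

```python
import re

# Matches "Result : <verdict>" with an optional ", Q= <number>%" part,
# as seen in the output above. Other output shapes are not handled.
RESULT_LINE = re.compile(
    r"Result\s*:\s*(?P<verdict>[A-Za-z ]+?)(?:,\s*Q=\s*(?P<q>[\d.]+)%)?\.?\s*$"
)

def parse_verdict(line):
    """Return (verdict, q_percent or None) for a result line, else None."""
    match = RESULT_LINE.search(line)
    if match is None:
        return None
    q = match.group("q")
    return match.group("verdict").strip(), (float(q) if q else None)

if __name__ == "__main__":
    for text in ("Result : Strongly similar, Q= 99.22%", "Result : Weakly similar."):
        print(parse_verdict(text))
```

Lines that are not result lines (the command prompt, the signal table) simply return None, so the whole log can be fed through line by line.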
3) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812154)
Posted 18 hours ago by Profile jason_gee
Got it. Will probably pass out while the CPU run is still going, but will have multiple results to compare there. Will probably have enough on that one by the morning to work out if it might be useful to Petri, then package everything up to email him.
4) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812135)
Posted 19 hours ago by Profile jason_gee
Ok. Well, anyway, I expect my result file and your CPU + Cuda to be ~99% similar or even matching. It's too bad I can't get the Windows build.....


I think that'll be the case. Sadly takes 2 hours on crappy CPU here :D (not so long for the Cuda result)

[Edit:] Just got Petri's sources updated in svn... one small step at a time
5) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812123)
Posted 19 hours ago by Profile jason_gee
I can run the Petri special app manually; I just need to know where to download it.

My manually run one, according to stderr, has the same counts of pulses, etc.


For Linux, you'd need to ask Petri (if he's still working on it toward these validation issues he may say no). For Windows, my build's somewhat broken and needs updating, so it's not an option to go handing out. TBar has a Mac version though.
6) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812120)
Posted 19 hours ago by Profile jason_gee
I am also running CUDA as well, I'll do 4.2 then 5.0

Though it shouldn't have issues, I ran it stock


Looking at the other hosts in that first one, more than likely it could be one Petri will want, so will confirm and then email it to him. The first host was a broken Cuda, second was your CPU, third was Petri special.

(If my CPU + Cuda match it under bench, + yours manually, he'll likely find some useful hints in it)
7) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812117)
Posted 19 hours ago by Profile jason_gee
More inconclusives whether they are GBT or Arecibo

blc5_2bit_guppi_57451_69044_HIP117559_OFF_0022.7520.416.18.27.221.vlar
Datafile

29au10ab.13767.17658.13.40.192
Datafile


running the first guppi [you supplied before] against reference Win32 stock CPU, and Cuda50, now; then will manually compare to your CPU result. Could turn out yours is ok or not, we'll see.
8) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812115)
Posted 20 hours ago by Profile jason_gee
I didn't check host validity, so some might be known bad hosts, that produce these invalids


That's fine. With so many new apps in circulation, a lot of the reasons aren't known.
9) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812113)
Posted 20 hours ago by Profile jason_gee
I have another inconclusive, stock CPU vs stock CPU.
Linux CPU vs Windows CPU.
Workunit

Datafile


That Linux host has a pretty high invalid count (for whatever reasons)

[Edit:] So does the Windows host :D (that's yours, not sure why)
10) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812097)
Posted 20 hours ago by Profile jason_gee
Jason I have results from CPU for that previous one.
Result CPU

Cheers, off to grab some beverages, and will see if I spot anything odd.
11) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812089)
Posted 21 hours ago by Profile jason_gee
had a quick try, no success yet. Will poke at it a bit more ---> this is Richard's department ;)

Or alternatively the Beta site, where completed work is kept for longer.


Gotcha, yeah figured it was deleted already after noticing the valid Cuda and Intel GPU. Don't know enough about the other apps/systems to comment further on that one.
12) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812084)
Posted 21 hours ago by Profile jason_gee
had a quick try, no success yet. Will poke at it a bit more ---> this is Richard's department ;)
13) Message boards : Number crunching : Philosophy: To DeviceQueueOptimize or NOT (with a focus on: is it "micro managing"?) (Message 1812060)
Posted 23 hours ago by Profile jason_gee
Micro management doesn't mean lots of interventions. Even one is a case of micro management.


Well I see both sides saying the same thing: Direct 'User' micromanagement is awkward/inefficient/not-generally accepted, and external tools to facilitate such perhaps a little clumsy, maybe not.

'In application' is probably the best place for them, because the application has domain specific knowledge about the tasks, project, and user needs.

By the same token, I see no functional difference between [some tool] doing the job by switching out apps wired differently, the user using tools to shovel tasks one way or another, or manual editing of client state.

There are obvious advantages to the level of integration/automation we see in stock CPU, and the GPU third party apps do this to some extent (OpenCL compiles its own binaries at runtime, Cuda chooses from a selection of embedded ones or builds its own via the driver). The possible remaining exception then is AKv8 CPU, which doesn't use explicit inbuilt dispatch for the main codepath, but instead leaves it to the user to select the SSE level of the application (apart from the fftw library, which has full internal runtime dispatch).

So all I'm getting at, is that it isn't anything special shovelling code to the right device/code. It's Boinc, yet again, making assumptions about things it has no control of.

Why Boinc should need to know anything about how I run, other than possibly providing valid results [within deadline], is completely at odds with its design principle that users' hosts are inherently untrustworthy. [ <--- That's micromanagement]

Lieutenant Tuvok: It is illogical to dwell on situations beyond your control. It will only serve to heighten your anxiety, which, if I may say so, is heightened enough.
Sklar: Oh. Well, thank you, for the reassurance.
14) Message boards : Number crunching : Philosophy: To DeviceQueueOptimize or NOT (with a focus on: is it "micro managing"?) (Message 1812049)
Posted 23 hours ago by Profile jason_gee
OK, so let's go with that. So Stock CPU app automatically changing from FPU code to SSE, SSE2, SSE3, AVX etc (which are different devices) is micromanagement->

Incorrect.
It is not micro management. How can it be? It is doing, what it has been programmed to, according to the settings it has been given through the Web/locally.
Someone changing those settings back and forth in order to change how BOINC does its job, that would be micro management- whether they do it manually or automate it.
refer back to the definition-
micromanagement is a management style whereby a user closely observes or controls the BOINC manager.
The user changing the Manager's decision about which code to use would be micro management- which is what I did on Beta when the manager picked the slowest of 4 possible applications to use as a result of the work crunched when it was figuring out which was the best application to use. BOINC determining which code is best to use isn't micro management.



So we're saying micromanagement/rescheduling breaks boinc,

Who is, where? If so, is it true or not?
And that in itself is a completely different discussion.


Well let me try to understand your rationale here. You're saying if a user does it it's micromanagement, or a tool they employ does it, still micromanagement, but if they make an application that does it internally, then it's not ?
15) Message boards : Number crunching : Philosophy: To DeviceQueueOptimize or NOT (with a focus on: is it "micro managing"?) (Message 1812044)
Posted 1 day ago by Profile jason_gee
Nah Jim, that question was directed at Grant, trying to establish where the line between user micromanagement and automation lies.


BOINC is the worker- we are the Managers. We tell it what we want it to do & how to do it, then let it get on with the job.
Us changing things above & beyond that, are perfect examples of micro management, whether it's automated & runs once weekly, or it's someone endlessly changing the web and/or local host settings to get it to do things in a particular way (why FIFO bothers some people I have no idea).

BOINC having the ability to do rescheduling would just be BOINC doing its job, the way we ask it to (through the web/local host settings). Of course when people then decide they don't like the way BOINC is handling that, and do their own {whatever}, then they are micro managing.


OK, so let's go with that. So Stock CPU app automatically changing from FPU code to SSE, SSE2, SSE3, AVX etc (which are different devices) is micromanagement->rescheduling, and in thinking so you'd be in good company with Boinc developers, since Boinc assumes all x86 CPUs run at single threaded FPU Whetstone speed (which happens to be provably false, especially for stock CPU AVX).

So we're saying micromanagement/rescheduling breaks boinc, since it doesn't support the needs of even stock CPU applications.
16) Message boards : Number crunching : Philosophy: To DeviceQueueOptimize or NOT (with a focus on: is it "micro managing"?) (Message 1812040)
Posted 1 day ago by Profile jason_gee
So does this example of automated hardware and efficiency based dispatch meet your definition of micromanagement, despite no user intervention ?

Nope.
And it's not my definition, it's the definition.

For those that missed it the first time around-

From the Wikipedia,
Micromanagement
In business management, micromanagement is a management style whereby a manager closely observes or controls the work of subordinates or employees.


Then I would argue the provided (stock CPU 'dispatcher') example precisely matches that definition.
17) Message boards : Number crunching : Philosophy: To DeviceQueueOptimize or NOT (with a focus on: is it "micro managing"?) (Message 1812037)
Posted 1 day ago by Profile jason_gee
Is it micromanaging? Not to my definition, because to me micromanaging implies an excessive amount of effort relative to the results obtained.

Which differs from the actual definition of micro management.


So does this example of automated hardware and efficiency based dispatch meet your definition of micromanagement, despite no user intervention ?

Optimal function choices:
--------------------------------------------------------
name timing error
--------------------------------------------------------
v_BaseLineSmooth (no other)
.
.
.

AK SSE folding 0.001666 0.00000 choice

Test duration 6.33 seconds


Either way, looks to me like heuristic strategies are encouraged by the stock CPU implementation.
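
The timing/error table above hints at a benchmark-then-choose dispatcher. A minimal sketch of that idea in Python follows. It is not the stock app's code: the names (`pick_fastest`, `sum_loop`, `sum_builtin`, `sum_broken`), the error threshold, and the toy workload are all assumptions made for illustration.

```python
import time

def pick_fastest(candidates, reference, args, max_error=1e-6, repeats=5):
    """Time each candidate and return (name, seconds, error) for the fastest
    one whose result stays within max_error of the reference, or None."""
    expected = reference(*args)
    best = None
    for name, func in candidates.items():
        error = abs(func(*args) - expected)
        if error > max_error:
            continue  # too inaccurate: never eligible, however fast
        start = time.perf_counter()
        for _ in range(repeats):
            func(*args)
        seconds = (time.perf_counter() - start) / repeats
        if best is None or seconds < best[1]:
            best = (name, seconds, error)
    return best

def sum_loop(values):
    # Plain reference implementation.
    total = 0.0
    for v in values:
        total += v
    return total

def sum_builtin(values):
    # Alternative implementation with the same result.
    return float(sum(values))

def sum_broken(values):
    # Deliberately wrong, to show the error check rejecting a candidate.
    return float(sum(values)) + 1.0

data = [0.5] * 1000
candidates = {"loop": sum_loop, "builtin": sum_builtin, "broken": sum_broken}
choice = pick_fastest(candidates, sum_loop, (data,))
print("choice:", choice[0])
```

The point of the sketch is that the selection happens inside the program, from measurements taken on the host it is running on, with no user intervention.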


Wouldn't have a clue, dude. I am not, and have not held myself out to be, a programmer on that level. Which I think you were well aware of. I have a tremendous respect for those like you and Raistmer who are. I think that probably doesn't get said enough.

Not sure what point you were trying to make, other than perhaps to say "go away and let the experts deal with it"? Don't think that's where you were coming from ... hopefully.
But folks wanting to do what we're doing shouldn't be misconstrued as an insult, I would hope ... Lotsa prickly folks here ...
Later ...


Nah Jim, that question was directed at Grant, trying to establish where the line between user micromanagement and automation lies.

We've got a situation where the technology and users 'want' more sophisticated capabilities to make life simpler and more efficient, while the status quo resists change.

It's not something I'd pretend to know the absolute best solutions for, though at the same time I can point to an example of how it happens already, and say 'What's wrong with that?'.

If there's nothing wrong with it, then there's the answer --> if there's something wrong with it, then Boinc and the project need to look at it too.

If it were all black and white, like 'rescheduling bad, accepting poor scheduling good', then I think the discussion would be moot. In this case I think the discussion leads to questions about suppression of technology, and assumptions made under the current mechanisms, that probably should be discussed.
18) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812028)
Posted 1 day ago by Profile jason_gee
OK I have the file now. And the file name matches the workunit

Here is the google drive link to the workunit


Have a few copies myself. For interest's sake I will probably run CPU and baseline Cuda, then, if nothing weird shows up, forward it to Petri for analysis of that app.
19) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812026)
Posted 1 day ago by Profile jason_gee
Must have pasted wrong address:
http://boinc2.ssl.berkeley.edu/sah/download_fanout/374/blc5_2bit_guppi_57451_69044_HIP117559_OFF_0022.7362.831.18.27.38.vlar

What I get for having too many tabs open ;)

[Edit] subfolder being 6th, 7th, and 8th digits of the md5 checksum of the filename:
http://md5.gromweb.com/?string=blc5_2bit_guppi_57451_69044_HIP117559_OFF_0022.7362.831.18.27.38.vlar ... 374
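
Taking the description above at face value, the subfolder could be derived like this. It is only a sketch of the rule as stated in this post: the helper name `fanout_subfolder` is hypothetical, and the real BOINC server code may apply a further reduction (such as a modulo by the fanout count) that is not modelled here.

```python
import hashlib

def fanout_subfolder(filename):
    """Return the 6th, 7th and 8th hex digits of the md5 of the filename.

    Assumption: the md5 is taken over the bare filename (no path),
    encoded as ASCII, and written as lowercase hex.
    """
    digest = hashlib.md5(filename.encode("ascii")).hexdigest()
    return digest[5:8]  # 0-based slice covering digits 6..8

if __name__ == "__main__":
    name = "blc5_2bit_guppi_57451_69044_HIP117559_OFF_0022.7362.831.18.27.38.vlar"
    print(fanout_subfolder(name))
```

The printed value would then be appended to the download_fanout URL ahead of the filename, as in the address given above.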
20) Message boards : Number crunching : Monitoring inconclusive GBT validations and harvesting data for testing (Message 1812022)
Posted 1 day ago by Profile jason_gee
Yeah looks like the md5 hash is only basing the fanout folder on the text 'blc5' instead of the whole filename.
[Edit:] abort, checking what I got



Copyright © 2016 University of California