Posts by Raistmer


1) Message boards : Number crunching : AstroPulse Work Fetch Thread (Message 1594319)
Posted 2 hours ago by Profile Raistmer
I see no issue that I need or want to report to the project team (or to BOINC), and I'll bow out of the conversation here.


Could you please point to the place in the provided logs that made you think so?
2) Message boards : Number crunching : AstroPulse Work Fetch Thread (Message 1594318)
Posted 2 hours ago by Profile Raistmer
Well, I see some counter-productive behavior in the provided log:

Wed Oct 29 21:56:33 2014 | SETI@home | [sched_op] estimated total ATI task duration: 0 seconds
Wed Oct 29 21:56:33 2014 | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task ap_02no11aa_B2_P1_00253_20141029_17532.wu_0
Wed Oct 29 21:56:33 2014 | SETI@home | [work_fetch] backing off ATI 783 sec


That is, the client will hold off asking for GPU work for LONGER than the server-imposed delay.

That is not wise behavior when the device sits idle. It is acceptable only if the GPU has enough work in cache to survive a few more empty work fetches. Is that the case?
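The concern above can be put into numbers with a minimal sketch. It is a hypothetical model, not BOINC's actual work-fetch code: the struct, its fields and both functions are invented for illustration, and the 300-second server delay used in the note below is an assumption based on the roughly 5-minute delay mentioned elsewhere in this thread.

```cpp
#include <algorithm>

// Hypothetical model for illustration only; none of these names come from BOINC.
struct GpuFetchState {
    double server_delay_sec;    // minimum gap between requests imposed by the server
    double client_backoff_sec;  // per-resource backoff chosen by the client
    double cached_work_sec;     // estimated runtime of GPU tasks already in cache
};

// Seconds until the client next asks for GPU work: the longer of the two waits.
double next_gpu_request_sec(const GpuFetchState& s) {
    return std::max(s.server_delay_sec, s.client_backoff_sec);
}

// Seconds the GPU would sit idle before that next request if the cache runs dry.
double idle_before_next_request_sec(const GpuFetchState& s) {
    return std::max(0.0, next_gpu_request_sec(s) - s.cached_work_sec);
}
```

With the logged backoff of 783 seconds, an assumed server delay of 300 seconds and an empty cache, this model gives 783 seconds of idle time instead of 300. The backoff costs nothing only when the cached work outlasts it.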
3) Message boards : Number crunching : AstroPulse Work Fetch Thread (Message 1594315)
Posted 2 hours ago by Profile Raistmer
I think we've checked that out thoroughly, and seen no evidence of a bug of that nature. Any imbalance in work allocation is due to problems of supply and demand - there isn't enough AP work in existence to fill every request on every occasion. On that basis, I see no issue that I need or want to report to the project team (or to BOINC)


Good if so. That means the server is able to fulfill a CPU and GPU work request in a single communication transaction, provided there is work available to send for both devices, right?

But it also means the client should keep asking the server (every 5 minutes currently) whenever the GPU work queue holds fewer than 100 tasks. Right? Or are there some additional backoffs involved that I didn't account for?
4) Message boards : Number crunching : AstroPulse Work Fetch Thread (Message 1594304)
Posted 3 hours ago by Profile Raistmer
1) I don't see how a 10-day cache setting could prevent the GPU from getting work if no bugs are involved. If the project adds extra limits, like the current 100 tasks per device, the cache should fill up to 100 tasks and be kept near 100 tasks at all times; that's what a 10/10 setting should imply. And it definitely should not bring any complications regarding receiving CPU + GPU work in a single request.
Let's not throw everything into one heap.

2) There are different reasons for a user to have such settings. In addition to those already mentioned: to simplify performance logging and analysis. For example, my own hosts where I monitor performance always run in this mode, to accumulate many results per client_state copy for subsequent processing.
In short, the option is at the user's discretion. If it doesn't work, it doesn't work, and it is worth fixing rather than arguing that it should not be used.

3) The OP is well known as an active volunteer tester who has already helped in the detection and eradication of many issues. And I find the constant accusations directed at him regarding credit-related motives quite irritating.
5) Message boards : Number crunching : Lunatics Windows Installer v0.43 Release Notes (Message 1593998)
Posted 17 hours ago by Profile Raistmer
I added a config file with per-device configurations that override the command-line options.
Now owners of multi-GPU hosts can configure each GPU separately (the 3 most important options for now, more will follow).
6) Message boards : Politics : Goodbye to more freedoms, liberties and privacy (Message 1593809)
Posted 1 day ago by Profile Raistmer
but there comes a point when to enjoy our freedoms we must accept the state has to take certain, controlled, actions in order to protect the state and it's citizens.


Who will control, and who will control the controller?... Any additional right to take unjustified actions, once given to some organization (a government organization or department), just increases corruption inside that organization. A powerful organization starts to protect its own power, and the interests of the ordinary citizens for whom all that power was granted never enter the equation. And next we have falsified reports about nuclear or bio-weapons in some countries... or multi-million submarine-hunting quests just before new budget hearings.
7) Message boards : Number crunching : AP V7 (Message 1593805)
Posted 1 day ago by Profile Raistmer

Did anyone ever submit a reasonable reason for not just having Plain APs as you suggested a while back? I can't think of any reason to have device restrictions.


What do you mean??? There is no difference between CPU and GPU AP tasks at all. There are not even scheduler rules like those for VLAR MB...
8) Message boards : Number crunching : Lunatics Windows Installer v0.43 Release Notes (Message 1593561)
Posted 1 day ago by Profile Raistmer
Regarding RTFM, experimenting with the period num iterations option had already been done on my APUs a long while ago, with no apparent change in the frequency of stuck MB tasks. (For the record, I think I got to the point of setting the parameter in the order of thousands.)

I'm not complaining, just clarifying the circumstances.

There is no need for values in the thousands. 100-200 should be enough, if it works at all.

Try applying -sbs 128 or -sbs 64 along with a period num of ~100.
[And the next suggestion will be a driver change. I recall a debugging session of more than a month with AMD support, where they offered a whole spectrum of suggestions regarding the stability of my hardware, the quality of my OS and so on and so forth (MB gave a BSoD periodically), everything except a simple driver change. I did change the driver, and the BSoDs disappeared completely.]
9) Message boards : Number crunching : AP V7 (Message 1593465)
Posted 2 days ago by Profile Raistmer
I have to say that the last few dozen posts have made this thread absolutely useless for monitoring AP7 issues. So if someone has something to say (and wants to be heard, of course), please start a new thread or post in the APv7 issues & errors one.

Regarding the new thread topic (BOINC fetch issues): well, I have also noticed that the BOINC client often asks for both CPU and GPU work but (by my impression) almost never gets work for both types in a single request (though it asked for both).

But I usually try to fill separate CPU MB and GPU AP queues by micromanagement, so I do not have many, or consistent enough, observations on this point.
10) Message boards : Number crunching : Lunatics Windows Installer v0.43 Release Notes (Message 1593462)
Posted 2 days ago by Profile Raistmer
Before messing with TDRDelays, I would recommend trying the app's own tuning capabilities, listed in the ReadMe file specifically for such cases [it seems OpenCL ATi MB caused the restarts in that particular case].

RTFM ;) And only if that does not help, raise the issue again.



-period_iterations_num N: Splits single PulseFind kernel call to N calls for longest PulseFind calls. Can be used to reduce GUI lags or
to prevent driver restarts. Can affect performance. Experimentation required. Default value for v6/v7 task is N=20. N should be positive integer.

Other listed options could be helpful too (like the -sbs one).
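To illustrate what splitting one long call into N shorter calls means, here is a minimal sketch. It is not the app's real code: `split_range` is a hypothetical helper, and how the actual PulseFind kernel divides its work is not described in the ReadMe excerpt. The sketch only shows the general idea that each sub-call covers a smaller slice, so each one returns control to the driver sooner.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split the index range [0, total) into at most n
// contiguous, nearly equal chunks, each returned as a [begin, end) pair.
std::vector<std::pair<std::size_t, std::size_t>>
split_range(std::size_t total, std::size_t n) {
    assert(n > 0);  // N should be a positive integer, as the ReadMe states
    std::vector<std::pair<std::size_t, std::size_t>> chunks;
    const std::size_t base = total / n;  // minimum chunk length
    const std::size_t rem = total % n;   // the first `rem` chunks get one extra item
    std::size_t begin = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t len = base + (i < rem ? 1 : 0);
        if (len == 0) break;  // fewer items than chunks: stop early
        chunks.emplace_back(begin, begin + len);
        begin += len;
    }
    return chunks;
}
```

Each returned pair would correspond to one shorter kernel launch. A larger N means shorter individual launches but more launch overhead, which fits the ReadMe's warning that the option can affect performance.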
11) Message boards : Number crunching : Some questions about BOINC for Android... (Message 1592880)
Posted 3 days ago by Profile Raistmer
Sooo... except for the small issue I wrote about before, it works OK and all results were valid so far.

What I don't understand... now that all app versions have completed at least 10 WUs, so the server knows how fast each app is, it decided to send me only WUs to the SLOWEST app, i.e. the app with the lowest GFLOPS. My records of completed WUs confirm that this app is the slowest, in particular on "normal" AR WUs; VLARs and VHARs are about the same for all apps as far as I can say with the WUs I got until now.

Was the idea of sending all apps to a device not to find the fastest app for this device and then send WUs only (or at least mostly) to this app?

As you can see on the application details page the two apps with the highest GFLOPS just got the standard 10 WUs each, the two slower apps much more (and the slowest app got most of course just to confirm, that the server likes it most).


It is worth filing a complaint on the BOINC forum or the alpha/dev lists, because this has nothing to do with the project apps... or with the volunteers who will read your message here. It's a question for the BOINC devs, and it will never be answered if posted in the wrong forum.
12) Message boards : Number crunching : "Zombie" AP tasks - still alive in AP v7 (Message 1592655)
Posted 3 days ago by Profile Raistmer
Hm... your approach seems to be more about BOINC development, not the app.
But I agree the app should exit if no BOINC client has governed it for long enough. Hence killing boinc.exe should stop all its tasks after some (not too big) amount of time.

I'll try to reproduce the boinc.exe task kill and then raise the issue on the BOINC dev list if the case is reproducible.

I was just thinking "oh no!". If there is code telling the science app to exit if it can not find the BOINC client. Then would it also exit when running in offline test mode. Such as doing a benchmark/debugging test run, or does that not apply when there is no parent application?


Being launched standalone is different from being orphaned/abandoned.
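The distinction can be sketched as follows. This is a hypothetical model, not code from the app or from the BOINC API: `LaunchMode`, `HeartbeatWatchdog` and the 30-second timeout used in the note below are all invented for illustration. The point is only that a standalone run never expects a client, while a run started under a client should give up once the client has been silent for too long.

```cpp
#include <chrono>

// Hypothetical: how the science app was started.
enum class LaunchMode { Standalone, UnderClient };

// Hypothetical watchdog tracking how long the controlling client has been silent.
class HeartbeatWatchdog {
public:
    HeartbeatWatchdog(LaunchMode mode, std::chrono::seconds timeout)
        : mode_(mode), timeout_(timeout), silent_for_(0) {}

    // Call whenever a sign of life arrives from the client.
    void on_heartbeat() { silent_for_ = std::chrono::seconds(0); }

    // Call periodically with the time elapsed since the previous call.
    void on_tick(std::chrono::seconds elapsed) { silent_for_ += elapsed; }

    // Standalone runs (benchmarks, debugging) never exit for lack of a client;
    // orphaned runs exit once the silence reaches the timeout.
    bool should_exit() const {
        if (mode_ == LaunchMode::Standalone) return false;
        return silent_for_ >= timeout_;
    }

private:
    LaunchMode mode_;
    std::chrono::seconds timeout_;
    std::chrono::seconds silent_for_;
};
```

With a 30-second timeout, a task started under a client would report `should_exit()` after 30 seconds without a heartbeat, while an offline benchmark run would keep going indefinitely.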
13) Message boards : Number crunching : iGPU tuning (Message 1592582)
Posted 3 days ago by Profile Raistmer
1) cache thrashing,
2) memory bus saturating,
3) automatic frequency adjustment to fit into the power budget.

The last one can be checked thoroughly with the CPU-Z + GPU-Z tools to rule it out.

Are there tools to monitor cache thrashing or memory bus saturation?


Yes, but "heavy" ones like Intel's VTune.
If you are ready to use them, you will definitely learn something new about your system and other things. But learning how to use them will probably be required too.

EDIT: maybe some more common tools, like Windows' built-in perfmon, can provide some overview too, BTW. Check the perfmon documentation.

Intel even offers a trial version so I don't have to put on an eye patch to download it. :) Hopefully the learning curve is not much higher than other system monitoring tools I have used previously at work when breaking software. Having my BayTrail system that doesn't exhibit this slowdown to compare may also prove useful.

A fresh OS install on a spare HDD & then making a ghost image of the system before installing tools is probably also in order.


Yes, you are on the right track IMHO :)
14) Message boards : Number crunching : "Zombie" AP tasks - still alive in AP v7 (Message 1592565)
Posted 3 days ago by Profile Raistmer
Yes, it's reproducible.
15) Message boards : Number crunching : "Zombie" AP tasks - still alive in AP v7 (Message 1592558)
Posted 3 days ago by Profile Raistmer
Hm... your approach seems to be more about BOINC development, not the app.
But I agree the app should exit if no BOINC client has governed it for long enough. Hence killing boinc.exe should stop all its tasks after some (not too big) amount of time.

I'll try to reproduce the boinc.exe task kill and then raise the issue on the BOINC dev list if the case is reproducible.
16) Message boards : Number crunching : iGPU tuning (Message 1592552)
Posted 3 days ago by Profile Raistmer
1) cache thrashing,
2) memory bus saturating,
3) automatic frequency adjustment to fit into the power budget.

The last one can be checked thoroughly with the CPU-Z + GPU-Z tools to rule it out.

Are there tools to monitor cache thrashing or memory bus saturation?


Yes, but "heavy" ones like Intel's VTune.
If you are ready to use them, you will definitely learn something new about your system and other things. But learning how to use them will probably be required too.

EDIT: maybe some more common tools, like Windows' built-in perfmon, can provide some overview too, BTW. Check the perfmon documentation.
17) Message boards : Number crunching : Lunatics Windows Installer v0.43 Release Notes (Message 1592396)
Posted 4 days ago by Profile Raistmer
-oclFFT_plan is case sensitive.

Maybe it is a good idea to change that; all the other switches use lower-case letters.


Like Raistmer said, it's for advanced users.
Everybody can snip it out of the read me.

Sorry for the typo.

My mistake.


Well, that option is in the advanced area not because it is hard to type, of course :) but because not all combinations will work and there is no foolproofing when using it.

But indeed, there is an inconsistency in option naming. FFT is an abbreviation, and FFA is just the same kind of abbreviation.

Hence there should be -FFA_block and -FFA_block_fetch.
In the next builds the app will understand both the "correct" option naming (case-sensitive, where upper case can occur) and the lower-case "unix-style" one.
(-oclfft_plan and -oclFFT_plan will both work, along with -FFA_block.)
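One way such dual naming could be accepted is sketched below. This is an assumption about a possible approach, not the app's actual parser: `to_lower` and `option_matches` are hypothetical helpers. An argument matches if it is either the canonical mixed-case spelling or that spelling converted entirely to lower case.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: return a lower-case copy of an ASCII option name.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical matcher: accept the canonical mixed-case name or its
// all-lower-case "unix-style" form, and nothing else.
bool option_matches(const std::string& arg, const std::string& canonical) {
    return arg == canonical || arg == to_lower(canonical);
}
```

Under this rule both `-oclFFT_plan` and `-oclfft_plan` match the canonical `-oclFFT_plan`, while other capitalizations such as `-OCLFFT_PLAN` are still rejected.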
18) Message boards : Number crunching : "Zombie" AP tasks - still alive in AP v7 (Message 1592392)
Posted 4 days ago by Profile Raistmer
The current function that checks the exit condition looks like this:

inline void ExitCheck(){
check_repeat:
    if (boinc_status.quit_request || boinc_status.abort_request || !canRun) {
        /* fprintf(stderr,"DEBUG: polled for exit/suspend request: exit needed. Flags are: boinc_status.quit_request=%d, \
           boinc_status.abort_request=%d, canRun=%d\n",
           boinc_status.quit_request,boinc_status.abort_request,canRun);*/
        DoSyncExit();
    }else if(boinc_status.suspended){
        Sleep(100); //R:await in sleep 100ms
        /* fprintf(stderr,"DEBUG: polled for exit/suspend request: sleep needed. Flags are: boinc_status.quit_request=%d, \
           boinc_status.abort_request=%d, canRun=%d\n",
           boinc_status.quit_request,boinc_status.abort_request,canRun);*/
        goto check_repeat;//R: check again if exit required or sleep continues
    }else{
        /* fprintf(stderr,"DEBUG: polled for exit/suspend request: exit NOT needed. Flags are: boinc_status.quit_request=%d, \
           boinc_status.abort_request=%d, canRun=%d\n",
           boinc_status.quit_request,boinc_status.abort_request,canRun);*/
    }
}


DoSyncExit() calls DoSync(), which in turn is quite verbose about its actions and spams stderr. If there are no corresponding lines in stderr for OpenCL AP/MB (they share the exit check code), then DoSyncExit() was not reached.

If recent BOINC behavior changes require checking some different flags that I'm not aware of, let me know.

Also, please describe the exact test case needed to reproduce this "zombie" issue.
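For experimenting with the polling logic outside the app, the decision flow of the exit check can be modelled as a plain loop. This is a hypothetical, self-contained sketch and not a replacement for the real function: `Status`, `Action`, `exit_check`, `poll` and `nap` are invented names, the status is obtained through a callback instead of the global boinc_status, and the caller decides what to do on exit instead of calling DoSyncExit().

```cpp
#include <functional>

// Hypothetical stand-in for the three flags the real check reads.
struct Status {
    bool quit_request = false;
    bool abort_request = false;
    bool suspended = false;
};

enum class Action { Exit, Continue };

// Poll until the task is either told to exit or allowed to continue.
// `poll` returns a fresh status snapshot; `nap` is called once per
// iteration while suspended (the real code sleeps for 100 ms there).
Action exit_check(const std::function<Status()>& poll,
                  const std::function<void()>& nap,
                  bool can_run) {
    for (;;) {
        const Status s = poll();
        if (s.quit_request || s.abort_request || !can_run) return Action::Exit;
        if (!s.suspended) return Action::Continue;
        nap();  // still suspended: wait, then check again
    }
}
```

Feeding it a scripted sequence of statuses (for example, suspended three times and then a quit request) makes it easy to confirm that an exit request arriving during suspension is still honoured.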
19) Message boards : Number crunching : Lunatics Windows Installer v0.43 Release Notes (Message 1592380)
Posted 4 days ago by Profile Raistmer
Thanks for the trust :D

Some improvement has indeed already been achieved: http://lunatics.kwsn.net/12-gpu-crunching/opencl-ap-v7-memory-consumption.msg57241.html#msg57241

Ideas for a more radical solution may require non-trivial code changes, so perhaps by next weekend; we'll see...
20) Message boards : Number crunching : iGPU tuning (Message 1592363)
Posted 4 days ago by Profile Raistmer
1) cache thrashing,
2) memory bus saturating,
3) automatic frequency adjustment to fit into the power budget.

The last one can be checked thoroughly with the CPU-Z + GPU-Z tools to rule it out.



Copyright © 2014 University of California