Posts by Jeff Buck


1) Message boards : Number crunching : How Does This Make Sense? (CUDA42 vs CUDA50 on Similar Machines) (Message 1763698)
Posted 2 days ago by Profile Jeff Buck
I have been running stock since v8 came out. When the GPU versions arrived, I expected both of my crunchers to settle on the CUDA50 versions, since both have dual GTX 980s (thanks, Craigslist!). (One is an i7-4820K, the other an i7-4790K.) NOTE: For about 10-12 days, both ran 2 WUs/GPU, then I bumped them both to 3/GPU, which is what they had been running before v8.

But Nooooo!

The i7-4790K fairly quickly settled on CUDA50, but the i7-4820K, after almost 3 weeks of running both CUDA50 and -42, has finally settled on CUDA42. How is this possible? Same graphics cards but different CUDA flavors seems rather strange to me. Can anyone explain how this could happen? I mean, Maxwell is Maxwell, and definitely not Kepler, right?

It may not be as "settled" as you think. You had Cuda50 running as recently as yesterday, it appears. One thing that I found can be a problem is a mixed workload of AP and MB tasks. When an AP and an MB run together, the MB run times suffer, which drives down the APR for whichever Cuda flavor is currently favored. That causes the scheduler to tilt toward the other flavor for a while. If the APs then stop for a while, the scheduler sticks with the most recently favored Cuda... until the next burst of APs comes along. Then the flip-flopping tends to resume. It's kind of like musical chairs. When the [AP] music stops, one Cuda gets a seat and the other is left out (temporarily).
2) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1763607)
Posted 2 days ago by Profile Jeff Buck
Google Chrome still sits here on version 7 :)

It was not that simple to stop its desire to update (at any time, without asking permission, even if it hadn't been started for months).

It has 3-4 ways to start the Updater:
From "Run" in the Registry (when the user logs in)
By "Scheduled Tasks" - 2 tasks with multiple triggers
By Services
(?) When you start Google Chrome (not sure about that, it was long ago) or when looking at "About Google Chrome" (it was a window, not a tab, if you remember)

If you forget to disable one of them, the Updater starts up and "fixes" all the places you disabled, re-enabling them again.

I used Autoruns to disable them all.

You're right about the "Run" entry, but I found that I already had that disabled, too. I couldn't find a "Services" entry.

FWIW, I actually just updated Chrome about 10 days ago (for the first time since 2011) because when I tried to use it (due to some Firefox issue I was having on a particular web page) I discovered that the version I had was so old that it wouldn't work right, either.

I suppose I should update my Autoruns, too, since I'm still using v9.02, from 2008. On the other hand, if it ain't broke, I don't usually try to fix it! :^)
3) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1763490)
Posted 3 days ago by Profile Jeff Buck
They do not update automatically like Chrome does (the auto-update was the main reason I stopped using Google Chrome years ago).

Although I don't often use Google Chrome, I do keep it on my system for those rare occasions when I run into a problem with Firefox on a particular web page. However, I have the Chrome update function disabled through the Task Scheduler. There seem to be two Google tasks in the Task Scheduler Library, "GoogleUpdateTaskMachineCore" and "GoogleUpdateTaskMachineUA". I've got both disabled.
4) Message boards : Number crunching : Panic Mode On (102) Server Problems? (Message 1763183)
Posted 4 days ago by Profile Jeff Buck
Um. How about this?

Changeset 3355:

    // check for VLAR workunits
-   if (wugrp.data_desc.true_angle_range < 0.12) {
+   if (wugrp.data_desc.true_angle_range < 1.2*wugrp.receiver_cfg->beam_width) {
        group_is_vlar=true;

If we plug in that old Arecibo value of 0.05 for the beam width, the cutoff point comes out as 0.06 (1.2 × 0.05), rather than the previous 0.12.

That would make sense. I'm guessing the change didn't get implemented until the outage of 26 Jan, since my archives hold a v8 VLAR from a WU created 26 Jan 2016, 4:11:04 UTC, which has an AR of 0.088912. Since then, the highest AR I can find for a VLAR is 0.036281, whereas I know I've gotten a non-VLAR with an AR of 0.063161, so 0.06 could very well be the cutoff.
5) Message boards : Number crunching : Panic Mode On (102) Server Problems? (Message 1763161)
Posted 4 days ago by Profile Jeff Buck
Whatever the cutoff was under v7, it definitely has changed for v8. I just checked my archives for December and found a VLAR (17ja11af.10230.44838.4.12.50.vlar_0) with an AR of 0.106412. That was the only one above 0.1, although I saw others in the 0.015-0.048 range.

EDIT: And in November, I found 06mr11af.21497.9235.10.12.23.vlar_0 with an AR of 0.115419, so the originally posted cutoff of 0.12 looks like it was probably correct for v7.
6) Message boards : Number crunching : Panic Mode On (102) Server Problems? (Message 1763146)
Posted 4 days ago by Profile Jeff Buck
VLARs start at AR 0.012, not 0.12.

That doesn't seem to be quite right, either. I just checked a recent VLAR on my daily driver, WU 2052285866, 10oc15ab.18223.809244.7.34.117.vlar, and see that it has an AR of 0.021689. So the cutoff must be above that AR, at least.
7) Message boards : Number crunching : Panic Mode On (102) Server Problems? (Message 1763072)
Posted 4 days ago by Profile Jeff Buck

"Work Units fall into 3 Angle Rate (AR) ranges - Very Low (VLAR, <0.12), Mid-Range (0.12 - 0.99) and Very High (VHAR, aka "Shorties", >1.0). Those numbers are approximate."

That's what I thought I had remembered. I had three of these ~0.099 AR tasks and they were NOT marked as .vlar. By the <0.12 definition, they should have been. I inspected them because I saw they had been awarded pretty high credit by CREDIT_New, and that caught my attention. Long gone by now... just wondering.

Interesting. I hadn't noticed those, and there aren't any in my current validated tasks list, but I just looked in my archives and found one from a couple days ago with an AR of 0.078860 and credit of 157.53. It took nearly 40 minutes to run on a GTX660. And a day earlier, there was one with an AR of 0.096775 and credit of 151.43 that took an hour and 2 minutes to run on a GTX750Ti.

EDIT: Further archive diving turned up an AR of 0.063161, credit of 195.74, and an hour and 2 minutes on a GTX660. (BTW, all of these times are with 2 tasks per GPU. However, if one of those tasks happens to be an AP, the MB run times suffer no matter what the AR.)
8) Message boards : Number crunching : Cuda50 Task Invalid Against Two Other Cuda50s (Message 1763057)
Posted 4 days ago by Profile Jeff Buck
I've come to expect the occasional odd Invalid when tasks run on different platforms, but I don't think I've ever run across a case where the same, apparently identical app produces an Invalid on one host while validating on the other two.

In this case, WU 2051291982, all three hosts ran "setiathome enhanced x41zi (baseline v8), Cuda 5.00". For my host, the one that got the Invalid, it was run under Anonymous Platform, while the other two ran stock.

All three hosts returned a -9 Overflow with 30 spikes. It was not an "instant" Invalid, but first was marked Inconclusive until the third host reported. My task ran on a GTX 750Ti, while the initial Inconclusive was against a Quadro K620. The deciding vote was cast by a GTX 650Ti.

I don't see anything in the Stderr outputs that would indicate any differences in configuration parameters, or any other glaring differences, for that matter. So.......is this some odd manifestation of this new "increased precision", or are cosmic rays at work again?

Of course, as a -9 overflow that resulted in a grand total credit of 0.40 for the valid tasks, there's really not much at stake with this particular WU. :^) However, any circumstance where Cuda50 doesn't agree with Cuda50 seems like it might be a concern.
9) Message boards : Number crunching : Lunatics Windows Installer v0.44 - new release for v8 (required upgrade) (Message 1761080)
Posted 11 days ago by Profile Jeff Buck
I think a feature only used by a fraction of users is not worth implementing.
Also, the app_info is already very large.

If a feature is an integral part of the app, it shouldn't matter what fraction of users take advantage of it. After all, there are multiple cmdline files and an mbcuda.cfg file that are already included in the Lunatics installer, even though very few people actually use them (or even know about them, for that matter). I would think that the device-specific config files should only be omitted if there's a technical roadblock preventing their inclusion.

As to the size of the app_info file, its very size and complexity are what already make it so difficult to manually edit, which means that anything the installer can do to facilitate the use of a feature would be most welcome. After all, more people might use a feature if they can do so without messing around with the guts of an app_info file.
10) Message boards : Number crunching : Lunatics Windows Installer v0.44 - new release for v8 (required upgrade) (Message 1760955)
Posted 11 days ago by Profile Jeff Buck
I just discovered that the Astropulse device-specific configuration files (for those with multiple, disparate GPUs) that Raistmer added back in the fall of 2014 apparently still haven't found their way into the app_info.xml file generated by the Lunatics installer. For Nvidia GPUs, that would be 'AstroPulse_NV_config.xml'. I didn't notice it until I started checking results for the first major batch of APs that my xw9400 just started returning following the v0.44 upgrade. Some of the timings were quite a bit out of line from my expectations.

Richard, is that a deliberate omission or just an oversight? I didn't see any sort of advisory in the release notes. I know there probably aren't many of us using that device-specific configuration, but a "heads-up" would be helpful if those files are going to be omitted from the installer.

For Astropulse, there are quite a few app_version sections (8, by my count) where I had to pull <file_ref> blocks from the "oldApp_backup" app_info file. Also, one <file_info> entry. Keeping my fingers crossed that I didn't mess up on my cut-n-pasting. ;^)
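
For anyone attempting the same surgery, here's a rough sketch of the shape of those pieces, assuming the standard BOINC app_info.xml layout (the version_num and plan_class values below are placeholders, not the real ones):

<file_info>
    <name>AstroPulse_NV_config.xml</name>
</file_info>
...
<app_version>
    <app_name>astropulse_v7</app_name>
    <version_num>708</version_num>                <!-- placeholder -->
    <plan_class>opencl_nvidia_100</plan_class>    <!-- placeholder -->
    <!-- the app_version's existing <file_ref> entries stay as they are -->
    <file_ref>
        <file_name>AstroPulse_NV_config.xml</file_name>
    </file_ref>
</app_version>

The single <file_info> entry registers the file with BOINC, while the extra <file_ref> inside each of those app_version sections is what makes the file available to that version of the app.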
11) Message boards : Number crunching : APR Question... (Message 1760822)
Posted 12 days ago by Profile Jeff Buck
In trying to figure out the best number of WUs to run on any given machine, I've always used the APR as a rough guideline for estimating overall system throughput. If you run one WU, the number is higher, but I have always assumed that if you are running 2 or 3 work units at a time, you take the reported APR and multiply it by the number of instances you are running on the GPU. I've never seen anyone say that explicitly, but it seemed to make sense. Is my assumption correct?

Thanks,

Chris

While APR does, indeed, decline when you run multiple tasks per GPU, I don't know that you can apply a direct multiplier such as you suggest and come up with a meaningful result. As rob smith recommended, manually comparing run times (at different angle ranges) is likely to be much more accurate.

APR can actually be fairly volatile, as I found when I tried to use it to make some app selection decisions of my own a few years ago. Take a look at a thread that I started back then, titled What, exactly, does APR measure, and is it meaningful?. In particular, I think you might find the very knowledgeable replies of Richard, Claggy, and Joe Segur to be quite informative.
12) Message boards : Number crunching : Panic Mode On (102) Server Problems? (Message 1759345)
Posted 16 days ago by Profile Jeff Buck
Has the size of MB V8s been increased today by a factor of 4 or 5?

The file size, or the crunching time?
Crunching times (at least for GPU work) are the same, although having a look in my data directory, there are some files that are way smaller than usual - 12-16 kB instead of the usual 358 kB.

If those are files that end in _0, they're actually nothing new, just the Result files for tasks that are currently running or waiting to upload.
13) Message boards : Number crunching : Lunatics Windows Installer v0.44 - new release for v8 (required upgrade) (Message 1759143)
Posted 17 days ago by Profile Jeff Buck
Actually.....we're all late to the party. Here's the same window from v0.43b.



I guess none of us really pay attention to details when we're in a hurry to install the latest and greatest! ;^)
14) Message boards : Number crunching : Seti@home v8 (Message 1758942)
Posted 18 days ago by Profile Jeff Buck
BTW, I see I have some APs coming up and they are running on the 730. The config file has the GPU and the CPU usage both at .5, which is different from the non-AP units at .5 on GPU and .04 on CPU. Will this be a problem when it goes to run the APs? I noticed that they are opencl nvidia 100 too.

Personally, I prefer not to run 2 APs at a time on a single GPU, although 1 AP plus 1 MB is usually fine. To allow for either 2 MBs at a time or 1 AP plus 1 MB, change your GPU usage to 0.51 for Astropulse tasks and 0.49 for the MultiBeam (setiathome_v8 and _v7).

As far as CPU usage goes, there are strong feelings around here that a full core should be reserved for each AP task. I tend not to subscribe fully to that philosophy, but on my multi-GPU systems, where an AP might be running on each of my GPUs at the same time, I use that 0.5 figure for AP CPU usage, effectively reserving a full core only when at least 2 APs are running. The CPU usage value only comes into play when the total value of all running tasks equals or exceeds an integer amount (1, 2, 3, etc.). In your current setup, you'd only reach that threshold if you did, in fact, let 2 APs run at the same time.

If you do choose to reserve a full core for AP, simply change your cpu_usage value to 1.0. What you'll notice, in that case, is that one of your running CPU tasks will drop into a "Waiting to run" state, and will remain that way either until the AP task or another CPU task finishes, at which point it will automatically resume.
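
For reference, here's a minimal app_config.xml sketch of that 0.51/0.49 arrangement (repeat the MultiBeam section for setiathome_v7 if you still receive v7 work, and substitute 1.0 for the AP cpu_usage if you'd rather reserve a full core per AP):

<app_config>
    <app>
        <name>astropulse_v7</name>
        <gpu_versions>
            <gpu_usage>0.51</gpu_usage>   <!-- 0.51 + 0.51 > 1, so two APs never share a GPU -->
            <cpu_usage>0.5</cpu_usage>    <!-- two running APs together reserve one full core -->
        </gpu_versions>
    </app>
    <app>
        <name>setiathome_v8</name>
        <gpu_versions>
            <gpu_usage>0.49</gpu_usage>   <!-- 0.51 + 0.49 = 1, so one AP plus one MB still fits -->
            <cpu_usage>0.04</cpu_usage>
        </gpu_versions>
    </app>
</app_config>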
15) Message boards : Number crunching : Seti@home v8 (Message 1758935)
Posted 18 days ago by Profile Jeff Buck
Allen, I see both your 730 and 650 have cuda 32/42/50 tasks, so just leave it alone for 24 hours and see what application BOINC settles in with and wants to run.

Will do. I noticed the same and was wondering why I had all of them, since I selected the 50s in the install.

Thanks!

Your task list is just showing you what the tasks were originally assigned when they were downloaded as "stock", before you switched to Lunatics. If you look at the Stderr for any that have run since you switched, you'll find they were run under "setiathome enhanced x41zi (baseline v8), Cuda 5.00", in other words, Cuda50, regardless of the original assignment.
16) Message boards : Number crunching : Seti@home v8 (Message 1758905)
Posted 18 days ago by Profile Jeff Buck
Sorry for the poor information. It's only on my two nVidia machines. As soon as I dumped the config files and restarted with only one instance of GPU work, everything returned to normal. I wonder if there is something wrong with the config file. This is the one I used.....

<app_config>
<app>
<name>astropulse_v7</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>.5</gpu_usage>
<cpu_usage>.5</cpu_usage>
</gpu_versions>
</app>
<app>
<name>setiathome_v8</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>0.50</gpu_usage>
<cpu_usage>0.04</cpu_usage>
</gpu_versions>
</app>
<app>
<name>setiathome_v7</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>0.50</gpu_usage>
<cpu_usage>0.04</cpu_usage>
</gpu_versions>
</app>

I think your problem is the "<max_concurrent>2</max_concurrent>". I believe that tells BOINC that only 2 total instances of all setiathome_v8 tasks can run, regardless of whether they're GPU or CPU. I'd recommend removing that line altogether.


Then what indicates how many instances you wish to run? I thought that was it.

The <gpu_usage>0.50</gpu_usage> tells BOINC that each task should use half of a GPU, thereby allowing 2 tasks per GPU. The <max_concurrent> is pretty much useless except in very limited situations.
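
In other words, a trimmed-down sketch of the setiathome_v8 section, with the max_concurrent line dropped, would look like this (same idea for the other apps):

<app>
    <name>setiathome_v8</name>
    <gpu_versions>
        <gpu_usage>0.50</gpu_usage>   <!-- each task claims half a GPU, so two run per GPU -->
        <cpu_usage>0.04</cpu_usage>
    </gpu_versions>
</app>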
17) Message boards : Number crunching : Seti@home v8 (Message 1758895)
Posted 18 days ago by Profile Jeff Buck
Sorry for the poor information. It's only on my two nVidia machines. As soon as I dumped the config files and restarted with only one instance of GPU work, everything returned to normal. I wonder if there is something wrong with the config file. This is the one I used.....

<app_config>
<app>
<name>astropulse_v7</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>.5</gpu_usage>
<cpu_usage>.5</cpu_usage>
</gpu_versions>
</app>
<app>
<name>setiathome_v8</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>0.50</gpu_usage>
<cpu_usage>0.04</cpu_usage>
</gpu_versions>
</app>
<app>
<name>setiathome_v7</name>
<max_concurrent>2</max_concurrent>
<gpu_versions>
<gpu_usage>0.50</gpu_usage>
<cpu_usage>0.04</cpu_usage>
</gpu_versions>
</app>

I think your problem is the "<max_concurrent>2</max_concurrent>". I believe that tells BOINC that only 2 total instances of all setiathome_v8 tasks can run, regardless of whether they're GPU or CPU. I'd recommend removing that line altogether.
18) Message boards : Number crunching : Panic Mode On (102) Server Problems? (Message 1758875)
Posted 18 days ago by Profile Jeff Buck
One way to get a good look at your runtimes is to simply disable your network activity for a while; then you can see what is in your upload queue.

But that won't tell you what the AR is for each task, which is the only way you can be sure you're comparing apples to apples. You still have to dig for that info.

EDIT: You also have to know whether a task was a -9 overflow or not.
19) Message boards : Number crunching : Panic Mode On (102) Server Problems? (Message 1758851)
Posted 18 days ago by Profile Jeff Buck
Dang, locked out the old thread while I was composing a reply! ;^)

Okay, reply to Message 1758837:

Both the GTX cards are running on Haswell rigs with the Intel GPU also running 3 jobs and the CPU running 7 jobs. The Xeon machine is running 3 CPU jobs. The Intel Haswell GPU rigs are averaging 2.5 to 2.79 turnaround days.

Turnaround days are like Credits - they are a lagging indicator, and it will take several days for the numbers to reflect reality.
That's why I prefer to use actual run times.

It also seems to me that "Average turnaround time" can be affected by numerous factors unrelated to what you're trying to observe. Such as the size of your work buffer (especially if you happen to change it), whether you run less than 24/7, other BOINC projects that may use GPU time, or the normally changing mix of angle ranges for the tasks you receive (with "normal" ARs taking longer to turn around than the "shortie" higher ARs that you often see mentioned here).

I agree with Grant that manually recording actual run times (spreadsheet, database, scratch pad, napkin, or whatever) gives you the best picture of performance. Just pick two or three common Angle ranges (I like tasks around 0.41nnnn for a normal AR and perhaps 2.72nnnn for higher ones), check your recent tasks for Stderrs with those ARs which don't show an overflow condition, and record them. You should find that the run times and CPU times for any given AR will stay pretty consistent, with only an occasional one that's abnormally high.
20) Message boards : Number crunching : Seti Disregarding Count Changes In ap_info After V0.44 (Message 1758775)
Posted 18 days ago by Profile Jeff Buck
The section about starting or re-starting SETI tasks?

Unless he's got [cpu_sched] set, I don't think BOINC 7.6.9 is going to show task restarts. (One of my gripes with recent BOINC versions. :^))

