Posts by Jeff Buck


1) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1706560)
Posted 1 day ago by Jeff Buck (Project donor)
Well, that was less than stellar!!

Having already installed the 353.30 drivers from the nvidia web site, all was fine

Foolishly I then checked for updates and Windows decided to install the 353.62 drivers, not really sure what happened but I just got a black screen, luckily Team Viewer worked fine and I was able to reboot, remove all traces of all Nvidia drivers, then run the update blocker from MS and reinstall the Nvidia drivers.

So there is still a problem, however the update blocker seems to be a way round. I suspect because of this we will eventually see a way to block specific updates rather than an "add on".

So, did you try simply changing the Device Installation Settings that I referenced in my earlier post this morning?


Well it was a bit late by then as they are installation settings, as far as I can see they stop Windows trying to install a generic driver on a new equipment installation, like the first time you power up after installing something new, nothing to do with updates.

Could be wrong, but that is how I read it.

Interesting, I took it as just the opposite, both from the other web site comments that led me to those examples, and from the "Never install driver software from Windows Update" option. In any event, since I'm planning to use my xw9400 as a Win 10 guinea pig, perhaps I'll find out how it really works in a day or so.
2) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1706413)
Posted 1 day ago by Jeff Buck (Project donor)
Well, that was less than stellar!!

Having already installed the 353.30 drivers from the nvidia web site, all was fine

Foolishly I then checked for updates and Windows decided to install the 353.62 drivers, not really sure what happened but I just got a black screen, luckily Team Viewer worked fine and I was able to reboot, remove all traces of all Nvidia drivers, then run the update blocker from MS and reinstall the Nvidia drivers.

So there is still a problem, however the update blocker seems to be a way round. I suspect because of this we will eventually see a way to block specific updates rather than an "add on".

So, did you try simply changing the Device Installation Settings that I referenced in my earlier post this morning?
3) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1706289)
Posted 1 day ago by Jeff Buck (Project donor)
Strange, but I thought that hysteria over the device driver updates that popped up over the weekend was debunked fairly quickly. I found a couple references to already existing "Device Installation Settings" in Win 10, either from System Properties, Computer Properties, or Devices and Printers. I don't have Win 10 yet on any of my boxes, but here's a couple screen shots that I saw a few days ago.



4) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1706028)
Posted 2 days ago by Jeff Buck (Project donor)
The outage started before I got a chance to edit my post and add that running "netplwiz" might also help you reset passwords.
5) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1706020)
Posted 2 days ago by Jeff Buck (Project donor)
I only still have to figure out the Administrator password and reset the user's password. Because we don't know either, and the latter is auto-login

Try rummaging around in HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon and see if you can pick up anything in there.
6) Message boards : Number crunching : Syntax Error...For GPU to do 3 WU (Message 1705869)
Posted 2 days ago by Jeff Buck (Project donor)
Crunching on 3 wu in the GPU.

Count me as one of those that tends to think that 3 per GPU is excessive for a 750Ti. I have a couple of EVGA 750Ti cards on my host 6980751 (along with a couple of GTX 660s). I run 2 per GPU and all 4 cards run pretty consistently at 98-99% utilization. Running 3 at a time wouldn't increase that, but would increase the overall contention on the system. However, everybody's setup is different, so the best thing for you to do would be to try it both ways and see which configuration gives you the best throughput.

Keep in mind that run times for MB tasks are highly dependent on the Angle Range of the task, so be sure when you look at run times you compare those with similar angle ranges. Apples to apples, and guavas to guavas.

For a comparison to what my 750Ti GPUs do, running 2 at a time, here are some samples from last night's tasks. Although both cards are EVGA, they're different models with slightly different clock rates, so there are samples for each. The Angle Ranges are more or less grouped as normal, high, and very high. (Somewhere there's an actual definition for AR groups, but I don't have it handy at the moment.) These times are for completed tasks that did not overflow (i.e., less than 30 signals reported).

Clock Rate / Angle Range / Run Time / CPU Time
1162 MHz / 0.426543 / 22 min 43 sec / 3 min 58 sec
1162 MHz / 0.437118 / 23 min 16 sec / 3 min 52 sec
1162 MHz / 0.438701 / 22 min 56 sec / 3 min 57 sec

1162 MHz / 0.846425 / 17 min 18 sec / 3 min 14 sec
1162 MHz / 0.912683 / 17 min 5 sec / 3 min 15 sec

1162 MHz / 1.130179 / 13 min 24 sec / 2 min 41 sec
1162 MHz / 4.566340 / 12 min 56 sec / 2 min 34 sec
1162 MHz / 11.820134 / 13 min 49 sec / 2 min 35 sec

----------

1320 MHz / 0.426208 / 22 min 26 sec / 4 min
1320 MHz / 0.426543 / 22 min 20 sec / 4 min 2 sec
1320 MHz / 0.437118 / 21 min 39 sec / 3 min 51 sec

1320 MHz / 0.846425 / 16 min 29 sec / 3 min 15 sec
1320 MHz / 0.912683 / 16 min 42 sec / 3 min 11 sec

1320 MHz / 1.130179 / 13 min 1 sec / 2 min 39 sec
1320 MHz / 4.270934 / 14 min 28 sec / 2 min 53 sec
1320 MHz / 10.063679 / 12 min 35 sec / 2 min 32 sec
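
To compare configurations on throughput rather than raw run time, the samples above can be turned into a tasks-per-hour figure. This is only a minimal Python sketch: it assumes throughput is simply the number of concurrent tasks divided by the mean run time, it uses the three 1162 MHz samples near AR 0.43, and the function names are made up for illustration.

```python
def to_seconds(minutes, seconds=0):
    """Convert a 'min / sec' run time into seconds."""
    return minutes * 60 + seconds


def tasks_per_hour(run_times_sec, concurrent):
    """Approximate throughput: tasks running at once divided by mean run time.

    Assumes every concurrent task takes about the same time, which only
    holds when the tasks have similar angle ranges.
    """
    mean = sum(run_times_sec) / len(run_times_sec)
    return concurrent * 3600 / mean


# 1162 MHz card, AR around 0.43, running 2 at a time (samples from the table)
normal_ar = [to_seconds(22, 43), to_seconds(23, 16), to_seconds(22, 56)]
rate = tasks_per_hour(normal_ar, concurrent=2)
print(f"{rate:.2f} tasks/hour")
```

Running 3 at a time would only be a win if the same calculation, with concurrent=3 and run times actually measured in that configuration for similar angle ranges, gave a higher number.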

Also, keep in mind that since you're running stock applications, some of yours may run as Cuda42 and some as Cuda50, while the scheduler tries to figure out which is best for you. Since I'm running Lunatics, all mine run as Cuda50, which I think is best for the 660 and 750Ti cards. (Locking in a specific app is, I think, the only actual benefit to running Lunatics at present, since the stock apps and Lunatics apps are currently the same.)

Good luck figuring it all out!
7) Message boards : Number crunching : Why this CPU task invalid so soon? (Message 1705368)
Posted 4 days ago by Jeff Buck (Project donor)
You also may need to check with any very young ones that might be around, if they played 'setup' recently. That happens too :-O

No such critters around here!
8) Message boards : Number crunching : Why this CPU task invalid so soon? (Message 1705357)
Posted 4 days ago by Jeff Buck (Project donor)
Bios settings can go askew sometimes for similar arcane reasons to memory.

Oh, good...so I'm not just losing my mind! I don't normally care about sound on my xw9400, since it's usually a crunch-only box. However, one time in about March or so, I tried to watch a video and found out I had no sound, not even through headphones. Neither reinstalling nor upgrading the driver had any effect, and I just let it go, since that momentary need for sound had long passed. Then on Friday evening, after upgrading that box to Win 7 and reinstalling the drivers (again) to no effect, I suddenly thought to interrupt one of the many restarts (necessary for installing the 200+ Win 7 updates) and check the BIOS setup. Well, whadda ya know...there was a "Use integrated audio device" setting that had mysteriously been turned off...and I sure as heck don't remember changing that, ever! Turned it back on and now the xw9400 has sound again. Perhaps someday I'll actually need it. ;^)
9) Message boards : Number crunching : Stderr Truncations (Message 1704782)
Posted 6 days ago by Jeff Buck (Project donor)
Hey Jeff, do you need help testing on an AMD 4200 and a 750Ti?

The only testing I did was specific to the actual Stderr truncations and the "fix" that was added to BOINC v7.6.6. I think my testing, and that of the others who were specifically focused on that aspect, went fine. The broader testing for v7.6.6 is what's going on over at BOINC Alpha, which I'm not involved in.
10) Message boards : Number crunching : Stderr Truncations (Message 1704777)
Posted 6 days ago by Jeff Buck (Project donor)
And someone writing the documentation, manual and wiki. I haven't had much time yet the past weeks, hope to do better this weekend. :)

Ah...didn't see a line for that on the chart. I guess the target #results, then, would be 1. ;^)
11) Message boards : Number crunching : Stderr Truncations (Message 1704772)
Posted 6 days ago by Jeff Buck (Project donor)
I'm not sure I understand your concerns. The 'fix' has been available for over a week.

BOINC 7.6.6 HERE

Development version
(MAY BE UNSTABLE - USE ONLY FOR TESTING)

What are the known issues?

I don't know about any known issues, but according to the Test results summaries chart it appears that they're 71% complete. The chart seems to show that a lack of XP and MAC testers is the primary holdup.
12) Message boards : Number crunching : Since when did BOINC stop... (Message 1704397)
Posted 7 days ago by Jeff Buck (Project donor)
In the mean time, anyone with the stock applications running on a post-5.8.2 BOINC, let us hear from you. :)

My T7400 still runs stock, but the only time it's been operational lately was during the truncated Stderr testing. Here's some event log samples from my archives:

BOINC client version 7.2.42 for windows_x86_64
14-Oct-2014 08:06:50 [SETI@home] Starting task 19mr14aa.26912.5384.438086664200.12.19_1
14-Oct-2014 08:06:50 [SETI@home] [cpu_sched] Starting task 19mr14aa.26912.5384.438086664200.12.19_1 using setiathome_v7 version 700 (cuda42) in slot 8


BOINC client version 7.6.2 for windows_x86_64
11-Jul-2015 20:18:09 [SETI@home] Starting task 31ja15aa.1364.13319.438086664201.12.97_1
11-Jul-2015 20:18:09 [SETI@home] [cpu_sched] Starting task 31ja15aa.1364.13319.438086664201.12.97_1 using setiathome_v7 version 700 (cuda42) in slot 0


BOINC client version 7.6.6 for windows_x86_64
14-Jul-2015 20:06:05 [SETI@home] Starting task 02jn15ag.19205.7189.438086664199.12.19_0
14-Jul-2015 20:06:05 [SETI@home] [cpu_sched] Starting task 02jn15ag.19205.7189.438086664199.12.19_0 using setiathome_v7 version 700 (cuda42) in slot 16


No changes over that span that I can see.

EDIT: Found an earlier one, too.

BOINC client version 7.2.33 for windows_x86_64
24-Feb-2014 18:51:24 [SETI@home] Starting task 25no13aa.13535.171373.438086664199.12.8_1 using setiathome_v7 version 700 (cuda42) in slot 12

...which also demonstrates why I stopped updating my other hosts after 7.2.33. No need for <cpu_sched> to add another line to the log just to get the slot numbers.
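
For pulling the application, plan class and slot number out of event log lines like the samples above, a small Python sketch follows. The regular expression is an assumption based only on those sample lines, not on any documented BOINC log format, and parse_start is a hypothetical helper name.

```python
import re

# Pattern inferred from the sample log lines only (an assumption).
START_RE = re.compile(
    r"Starting task (?P<task>\S+) using (?P<app>\S+) "
    r"version (?P<version>\d+) \((?P<plan>[^)]+)\) in slot (?P<slot>\d+)"
)


def parse_start(line):
    """Return task/app/version/plan/slot as a dict, or None if absent."""
    m = START_RE.search(line)
    if m is None:
        return None
    info = m.groupdict()
    info["version"] = int(info["version"])
    info["slot"] = int(info["slot"])
    return info


# 7.2.33 style: slot number on the plain "Starting task" line
old_style = ("24-Feb-2014 18:51:24 [SETI@home] Starting task "
             "25no13aa.13535.171373.438086664199.12.8_1 using setiathome_v7 "
             "version 700 (cuda42) in slot 12")
# Later style: slot number only on the [cpu_sched] line
new_style = ("14-Jul-2015 20:06:05 [SETI@home] [cpu_sched] Starting task "
             "02jn15ag.19205.7189.438086664199.12.19_0 using setiathome_v7 "
             "version 700 (cuda42) in slot 16")
# Later style: the plain line carries no app or slot details
short_line = ("14-Jul-2015 20:06:05 [SETI@home] Starting task "
              "02jn15ag.19205.7189.438086664199.12.19_0")

for line in (old_style, new_style, short_line):
    print(parse_start(line))
```

The same pattern matches both the older single-line form and the later [cpu_sched] line, and returns None for the shorter line that carries no slot number.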
13) Message boards : Number crunching : Windows 10 - Yea or Nay? (Message 1703857)
Posted 8 days ago by Jeff Buck (Project donor)
Found a great copy of Windows NT 3.51 on there for cheap.


Need any NT 4.0?

If I still have it and can find it.

Already got it! Along with DOS 6.22 and Windows 3.11.



At least they're not on 5.25" disks. (I still have a bunch of that stuff, too!)
14) Message boards : Number crunching : Stderr Truncations (Message 1702891)
Posted 12 days ago by Jeff Buck (Project donor)
99% of the time it'll probably be the contents of stderr.txt on the first retry when the error condition is triggered. The extra four retries are to deal with the possibility we have to wait on various forms of anti-malware and content indexers to have had their way with the file as well.

Special purpose software such as virus scanners and anti-malware usually have kernel mode components that make sure they get first dibs on files after they have been released by the apps that created them.

Normal Windows software hooks into the file system notification API and lets Windows tell it when it is its turn to fiddle with the file. BOINC, being a cross-platform async polling app, instead relies on checking to see if it can successfully access the file.

For those couple days that I was running with the Process Monitor logging turned on, it never required more than one re-try to be successful. However, the AV on those boxes (Security Essentials on the xw9400 and Windows Defender on the T7400) didn't seem to be taking any interest in those files. It was certainly informative, however, to see that Windows Search indexing was accounting for a significant amount of activity on the T7400 (Message 1701586). Seeing no purpose in indexing such transient files, I was pretty quick to turn the indexing off for the whole slots tree. So, another efficiency improvement resulting from the testing. :^)


Security Essentials is MsMpEng.exe.

Ah, okay, thank you. I hadn't caught on to that. Well, at least it seemed to get in and out very quickly.
15) Message boards : Number crunching : Stderr Truncations (Message 1702887)
Posted 12 days ago by Jeff Buck (Project donor)
99% of the time it'll probably be the contents of stderr.txt on the first retry when the error condition is triggered. The extra four retries are to deal with the possibility we have to wait on various forms of anti-malware and content indexers to have had their way with the file as well.

Special purpose software such as virus scanners and anti-malware usually have kernel mode components that make sure they get first dibs on files after they have been released by the apps that created them.

Normal Windows software hooks into the file system notification API and lets Windows tell it when it is its turn to fiddle with the file. BOINC, being a cross-platform async polling app, instead relies on checking to see if it can successfully access the file.

For those couple days that I was running with the Process Monitor logging turned on, it never required more than one re-try to be successful. However, the AV on those boxes (Security Essentials on the xw9400 and Windows Defender on the T7400) didn't seem to be taking any interest in those files. It was certainly informative, however, to see that Windows Search indexing was accounting for a significant amount of activity on the T7400 (Message 1701586). Seeing no purpose in indexing such transient files, I was pretty quick to turn the indexing off for the whole slots tree. So, another efficiency improvement resulting from the testing. :^)
16) Message boards : Number crunching : Stderr Truncations (Message 1702874)
Posted 12 days ago by Jeff Buck (Project donor)
Now, having just looked at that code block again with my non-C++ trained eyes, I'm wondering whether I've still misinterpreted what will happen if that 5-second grace period expires. Does the Sharing Violation actually end up causing that Error Code 32 to be reported, or does the normal code path simply resume, presumably still producing a truncated stderr.txt? I dunno, some expert is gonna hafta 'splain that to me! ;^)

To my (also untrained) eye, it looks like it simply falls out of the bottom, five seconds later. So it does what it was going to do anyway, but slower.

If that's the case, then it will definitely be a significant improvement for S@h, too, even if it is more of a patch than a true fix. And to use Jason's raft analogy ("Fingers crossed this raft never gets used on the open ocean."), you may not want to launch a raft with this sort of patch, but if you're already in the middle of the ocean, it sure will be helpful to patch the pinholes any way you can, until you can get the raft back to shore and build a whole new one!


Followed up on this with clear eyeballs as promised. I have no issues with the method or logic. It'll just keep trying to open the file for up to 5 seconds (rather than simply a fixed delay). The comment doesn't really say why the magic number is 5 seconds (not 3, 9 or 42), and it only tries once per second for access, but it should be way better, as your results would suggest.

Thanks for the follow-up, Jason. So, if even the 5-second grace period isn't sufficient, BOINC might still produce the truncated stderr.txt but won't throw that new Sharing Violation back at the user as an "Error while computing"? Then that should make the truncations and by extension, the "instant" Invalids, much, much rarer, even if not entirely extinct. A major improvement!
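
The behaviour discussed in this exchange (keep trying to open the file about once per second, then give up quietly and carry on) can be illustrated with a short Python sketch. This is not BOINC's C++ code; the function name, the attempt count and the injectable sleep parameter are all assumptions made for the example.

```python
import os
import tempfile
import time


def read_when_available(path, attempts=5, delay=1.0, sleep=time.sleep):
    """Try to read a file, pausing `delay` seconds between failed attempts.

    Returns the contents, or None if every attempt fails, in which case
    the caller carries on exactly as it would have without the retries.
    """
    for attempt in range(attempts):
        try:
            with open(path, "rb") as f:
                return f.read()
        except OSError:
            # Any OSError is retried here, including a missing file and
            # a Windows sharing violation (reported as PermissionError).
            if attempt < attempts - 1:
                sleep(delay)
    return None


# Demo: record the requested waits instead of really sleeping.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "stderr.txt")
    waits = []
    missing = read_when_available(path, sleep=waits.append)  # no file yet
    with open(path, "wb") as f:
        f.write(b"done\n")
    found = read_when_available(path, sleep=waits.append)    # first try works

print(missing, found, waits)
```

A polling loop like this returns as soon as the file becomes readable, which is the advantage over a fixed delay; the cost is that a file locked for longer than the whole window is still missed.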
17) Message boards : Number crunching : Anything relating to AstroPulse tasks (Message 1702640)
Posted 13 days ago by Jeff Buck (Project donor)

Perhaps it would be helpful to add a note to the ReadMe for the device-specific configuration to suggest that users recheck their device assignments following a driver upgrade.

Yes, such a warning is appropriate, along with a note that the same result is possible after a physical slot change.

That would be good, thanks!
18) Message boards : Number crunching : Anything relating to AstroPulse tasks (Message 1702637)
Posted 13 days ago by Jeff Buck (Project donor)

Is there any way to have a different method of identifying the individual GPUs in the device-specific configuration other than Device 0, Device 1, etc., to avoid this sort of shuffling when upgrading drivers?


Who could better know what card is what than their own device driver?
If the device driver decides to change the enumeration, we must follow ;)
Another recognition method would require AI to implement :D

Okay, just thought I'd ask. ;^) It caught me by surprise because I don't think I'd seen that happen in earlier driver updates.

Perhaps it would be helpful to add a note to the ReadMe for the device-specific configuration to suggest that users recheck their device assignments following a driver upgrade.
19) Message boards : Number crunching : Anything relating to AstroPulse tasks (Message 1702586)
Posted 13 days ago by Jeff Buck (Project donor)
Per-device config supported.

Speaking of that device-specific configuration, I should mention that I got a bit of a surprise several weeks ago after I upgraded drivers on my xw9400. The device numbers got shuffled!

OLD ORDER
OpenCL: NVIDIA GPU 0: GeForce GTX 660 (driver version 337.88, device version OpenCL 1.1 CUDA, 1536MB, 1498MB available, 2047 GFLOPS peak)
OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 337.88, device version OpenCL 1.1 CUDA, 2048MB, 2010MB available, 2082 GFLOPS peak)
OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 337.88, device version OpenCL 1.1 CUDA, 2048MB, 2003MB available, 2409 GFLOPS peak)
OpenCL: NVIDIA GPU 3: GeForce GTX 660 (driver version 337.88, device version OpenCL 1.1 CUDA, 2048MB, 2013MB available, 2132 GFLOPS peak)

NEW ORDER
OpenCL: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 350.12, device version OpenCL 1.2 CUDA, 2048MB, 2009MB available, 1388 GFLOPS peak)
OpenCL: NVIDIA GPU 1: GeForce GTX 660 (driver version 350.12, device version OpenCL 1.2 CUDA, 2048MB, 2012MB available, 2132 GFLOPS peak)
OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 350.12, device version OpenCL 1.2 CUDA, 2048MB, 2003MB available, 1606 GFLOPS peak)
OpenCL: NVIDIA GPU 3: GeForce GTX 660 (driver version 350.12, device version OpenCL 1.2 CUDA, 1536MB, 1497MB available, 2047 GFLOPS peak)

Only Device 2 stayed the same. I didn't notice the change until I went to check the results of the device-specific tuning and found that 3 of the 4 GPUs had been running with the wrong parameters. Nothing harmed, because I don't have that big a spread between the 4 devices, but the resulting run times were far enough off my expectations that it finally caught my eye.

Is there any way to have a different method of identifying the individual GPUs in the device-specific configuration other than Device 0, Device 1, etc., to avoid this sort of shuffling when upgrading drivers?
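
One way to catch this kind of shuffle after a driver upgrade is to compare the startup log's device lines before and after. The Python sketch below does that for the two listings above. It is only an illustration: the regular expression is inferred from these eight lines, parse_devices is a hypothetical helper, and matching on model plus total memory is an assumption that cannot tell two identical cards apart.

```python
import re

# Pattern inferred from the listings above (an assumption, not a spec).
GPU_RE = re.compile(r"GPU (?P<index>\d+): (?P<model>.+?) \(.*?, (?P<mem>\d+)MB,")


def parse_devices(lines):
    """Map device index -> (model, total memory in MB)."""
    devices = {}
    for line in lines:
        m = GPU_RE.search(line)
        if m:
            devices[int(m["index"])] = (m["model"], int(m["mem"]))
    return devices


old_log = [
    "OpenCL: NVIDIA GPU 0: GeForce GTX 660 (driver version 337.88, device version OpenCL 1.1 CUDA, 1536MB, 1498MB available, 2047 GFLOPS peak)",
    "OpenCL: NVIDIA GPU 1: GeForce GTX 750 Ti (driver version 337.88, device version OpenCL 1.1 CUDA, 2048MB, 2010MB available, 2082 GFLOPS peak)",
    "OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 337.88, device version OpenCL 1.1 CUDA, 2048MB, 2003MB available, 2409 GFLOPS peak)",
    "OpenCL: NVIDIA GPU 3: GeForce GTX 660 (driver version 337.88, device version OpenCL 1.1 CUDA, 2048MB, 2013MB available, 2132 GFLOPS peak)",
]
new_log = [
    "OpenCL: NVIDIA GPU 0: GeForce GTX 750 Ti (driver version 350.12, device version OpenCL 1.2 CUDA, 2048MB, 2009MB available, 1388 GFLOPS peak)",
    "OpenCL: NVIDIA GPU 1: GeForce GTX 660 (driver version 350.12, device version OpenCL 1.2 CUDA, 2048MB, 2012MB available, 2132 GFLOPS peak)",
    "OpenCL: NVIDIA GPU 2: GeForce GTX 750 Ti (driver version 350.12, device version OpenCL 1.2 CUDA, 2048MB, 2003MB available, 1606 GFLOPS peak)",
    "OpenCL: NVIDIA GPU 3: GeForce GTX 660 (driver version 350.12, device version OpenCL 1.2 CUDA, 1536MB, 1497MB available, 2047 GFLOPS peak)",
]

old = parse_devices(old_log)
new = parse_devices(new_log)

# Device numbers whose (model, memory) pair is different after the upgrade
changed = [i for i in sorted(old) if old[i] != new.get(i)]
print(changed)  # [0, 1, 3]
```

Any device number that shows up in the changed list is a cue to recheck the per-device settings for that slot.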
20) Message boards : Number crunching : Stderr Truncations (Message 1702348)
Posted 14 days ago by Jeff Buck (Project donor)
If I could just interject one thing (without understanding much other than CUDA in that post), it would be that truncated Stderr is not unique to the NVIDIA GPUs. It happens on ATI cards and CPUs as well.

EDIT: See ancient Message 1469381 in "Strange Invalid MB Overflow tasks with truncated Stderr outputs..." for an example and discussion. Also in other messages in that thread.



Copyright © 2015 University of California