Posts by Jeff Buck

1) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874379)
Posted 17 hours ago by Jeff Buck
Post:
Yeah, certainly seems like they're fairly rare, and since they ultimately all seem to validate, probably not a show-stopper for the app.

Anyway, FWIW, I did go ahead and generate a new list from my Inconclusives and skimmed through it looking for the proverbial low-hanging fruit (tasks that were non-overflow, with matching counts for all signals, no iGPU involvement, etc.) and just came up with two new candidates. There might be a few more among the 63 total Inconclusives I currently have for the Cuda 8.0 special, but it didn't seem worth digging any further. You can add them to your testing stash or ignore them, as you wish.
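FWIW, the screening described above could be sketched as a simple filter. This is purely a hypothetical illustration: the field names and record layout are invented for the example, not any actual SETI@home server schema.

```python
# Hypothetical sketch of the screening criteria described above:
# keep only inconclusive workunits that are non-overflow, have matching
# signal counts (S/A/P/T/G) across both tasks, and involve no iGPU.
# Field names and record layout are illustrative only.

SIGNALS = ("S", "A", "P", "T", "G")

def is_candidate(wu):
    t1, t2 = wu["tasks"]
    if wu.get("overflow"):                       # skip overflow (-9) results
        return False
    if any("igpu" in t["app"].lower() for t in (t1, t2)):
        return False                             # no iGPU involvement
    # signal counts must match exactly between the two tasks
    return all(t1["counts"][s] == t2["counts"][s] for s in SIGNALS)

inconclusives = [
    {"overflow": False,
     "tasks": ({"app": "x41p_zi3t2b Cuda 8.00",
                "counts": {"S": 3, "A": 2, "P": 1, "T": 0, "G": 3}},
               {"app": "v8.22 opencl_nvidia_SoG",
                "counts": {"S": 3, "A": 2, "P": 1, "T": 0, "G": 3}})},
]

candidates = [wu for wu in inconclusives if is_candidate(wu)]
```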

Workunit 2581227344 (09no16aa.18442.2116.6.33.31)
Task 5821860064 (S=3, A=2, P=1, T=0, G=3) x41p_zi3t2b, Cuda 8.00 special
Task 5821860065 (S=3, A=2, P=1, T=0, G=3) v8.22 (opencl_nvidia_SoG) windows_intelx86

Cuda 8.00 special - Best gaussian: peak=6.385563, mean=0.5961245, ChiSq=1.203035, time=67.95, d_freq=1420305212.47,
score=2.122812, null_hyp=2.212901, chirp=-62.462, fft_len=16k
v8.22 SoG - Best gaussian: peak=5.745589, mean=0.583622, ChiSq=1.414207, time=59.56, d_freq=1420299457.13,
score=2.057846, null_hyp=2.326043, chirp=90.136, fft_len=16k

Workunit 2581784900 (11no16aa.15419.21379.7.34.219)
Task 5823048772 (S=7, A=0, P=0, T=0, G=0) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5823048773 (S=7, A=0, P=0, T=0, G=0) x41p_zi3t2b, Cuda 8.00 special

Cuda 8.00 special - Best gaussian: peak=5.887481, mean=0.6405677, ChiSq=1.381727, time=29.36, d_freq=1419641581.37,
score=-0.6105146, null_hyp=2.163218, chirp=52.627, fft_len=16k
v8.22 SoG - Best gaussian: peak=6.274475, mean=0.6576355, ChiSq=1.411415, time=10.91, d_freq=1419640238.56,
score=-0.6067085, null_hyp=2.1823, chirp=-69.28, fft_len=16k
2) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874372)
Posted 18 hours ago by Jeff Buck
Post:
I ran the six problem WUs in Linux and the results are a tie. I ran it again on the Mac and, as before, 23se08ac.6875.22968.6.33.135 was a success, making it 4 - 2.
So, does that provide any clues as to what might be going on with the Best gaussians, or is that going to require a larger sample size? Or are you indicating that with zi3v the issue might be solved? I can generate a new list of my Inconclusives each evening and/or switch my Linux boxes over to zi3v. (I also hope to have Linux running on my other xw9400 in a day or so.)
3) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874215)
Posted 1 day ago by Jeff Buck
Post:
Latency: 37 ms

Pretty high level of latency there. Way better than satellite, but still rather high.
Is that a wired/fibre connection, or a wireless connection?
Wired. Just DSL through an old AT&T telephone line that was originally installed/buried in about 1979, in a semi-rural area. I'm lucky I don't have to use semaphore flags. ;^)
4) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874211)
Posted 1 day ago by Jeff Buck
Post:
a couple of hours later I had 40/100Mbs NBN.
Sigh, and the best I can get is 620k/5Mb :(
Just ran a speed test on my line and got:
Download Speed: 3.77 Mbps (471.3 KB/sec transfer rate)
Upload Speed: 0.5 Mbps (62.5 KB/sec transfer rate)
Latency: 37 ms

Considering that my nominal D/L speed is only supposed to be 3.0 Mbps, anything over about 2.4 is turbocharged for me!
5) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874208)
Posted 1 day ago by Jeff Buck
Post:
Here's how it plays out on my Mac, I'll run it in Linux shortly. Of course, it needs to be run a couple of times to determine if it's consistent.
...
...
CUDA Wins 4 to 2
So...that's twice as good.
LOL...or...you could look at it as "CUDA gets it right 2/3 of the time". ;^)

Seriously, though, do you see any sort of pattern in those WUs, as to what characteristics favor the CUDA8.0 result vs. the SoG result? I know it's a small sample, but they look kind of random to me.
6) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874196)
Posted 1 day ago by Jeff Buck
Post:
Workunit 2576907391 (26mr17aa.25216.7429.14.41.247)
Task 5812811343 (S=0, A=1, P=2, T=0, G=0) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5812811344 (S=0, A=1, P=2, T=0, G=0) x41p_zi3t2b, Cuda 8.00 special

Cuda 8.00 special - Best gaussian: peak=3.681468, mean=0.5749262, ChiSq=1.387973, time=34.39, d_freq=1419914966.94,
score=-1.986706, null_hyp=2.127978, chirp=-74.326, fft_len=16k
v8.22 SoG - Best gaussian: peak=3.074103, mean=0.5448458, ChiSq=1.296905, time=99.82, d_freq=1419915193.65,
score=-1.983227, null_hyp=2.072506, chirp=-74.55, fft_len=16k
Hmm....looks like the tiebreaker for this WU agreed with SoG. All got credit, though.

SETI@home v8 v8.05 windows_x86_64 - Best gaussian: peak=3.074107, mean=0.5448451, ChiSq=1.296909, time=99.82, d_freq=1419915193.65,
score=-1.983146, null_hyp=2.072513, chirp=-74.55, fft_len=16k
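A rough way to see how closely the SoG and CPU tiebreaker reports above agree, while the Cuda report stands apart, is a field-by-field comparison with a relative tolerance. The 1% tolerance here is purely illustrative and is not the project validator's actual acceptance criterion; the values are taken from the reports above.

```python
import math

# Rough field-by-field comparison of two "Best gaussian" reports.
# The 1% relative tolerance is illustrative only -- not the project
# validator's actual criterion.
FIELDS = ("peak", "mean", "chisq", "time", "d_freq", "score", "null_hyp", "chirp")

def gaussians_match(a, b, rel_tol=0.01):
    return all(math.isclose(a[f], b[f], rel_tol=rel_tol) for f in FIELDS)

# Values from the Workunit 2576907391 reports above.
sog  = dict(peak=3.074103, mean=0.5448458, chisq=1.296905, time=99.82,
            d_freq=1419915193.65, score=-1.983227, null_hyp=2.072506, chirp=-74.55)
cpu  = dict(peak=3.074107, mean=0.5448451, chisq=1.296909, time=99.82,
            d_freq=1419915193.65, score=-1.983146, null_hyp=2.072513, chirp=-74.55)
cuda = dict(peak=3.681468, mean=0.5749262, chisq=1.387973, time=34.39,
            d_freq=1419914966.94, score=-1.986706, null_hyp=2.127978, chirp=-74.326)

print(gaussians_match(sog, cpu))   # True: SoG and the CPU tiebreaker agree
print(gaussians_match(sog, cuda))  # False: the Cuda app found a different gaussian
```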
7) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874113)
Posted 2 days ago by Jeff Buck
Post:
Did you run the tasks with your CPU to see which matched better? My CPUs would take about two hours apiece on them.
No, I've never taken the time to set up to run stand-alone tasks. I've tended to leave that to the developers, if any of the WUs I've identified look sufficiently interesting.
8) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1874108)
Posted 2 days ago by Jeff Buck
Post:
Are these the sorts of "Best gaussian" mismatches that you're looking into?

Workunit 2573263722 (23se08ac.6875.22968.6.33.135)
Task 5805117074 (S=3, A=0, P=1, T=3, G=0) x41p_zi3t2b, Cuda 8.00 special
Task 5805117075 (S=3, A=0, P=1, T=3, G=0) v8.22 (opencl_nvidia_SoG) windows_intelx86

Cuda 8.00 special - Best gaussian: peak=3.252388, mean=0.5397108, ChiSq=1.344394, time=14.26, d_freq=1418816790.11,
score=-1.169299, null_hyp=2.144445, chirp=-39.071, fft_len=16k
v8.22 SoG - Best gaussian: peak=3.76217, mean=0.5480909, ChiSq=1.226871, time=39.43, d_freq=1418822660.68,
score=-1.169124, null_hyp=2.078196, chirp=43.425, fft_len=16k

Workunit 2573397376 (23se08ac.6117.29512.7.34.110)
Task 5805397040 (S=9, A=1, P=0, T=1, G=2) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5805397041 (S=9, A=1, P=0, T=1, G=2) x41p_zi3t2b, Cuda 8.00 special

Cuda 8.00 special - Best gaussian: peak=3.803741, mean=0.5273897, ChiSq=1.256179, time=51.17, d_freq=1421074725.21,
score=1.100755, null_hyp=2.216236, chirp=46.042, fft_len=16k
v8.22 SoG - Best gaussian: peak=3.719964, mean=0.5246363, ChiSq=1.375913, time=52.85, d_freq=1421074802.65,
score=1.091865, null_hyp=2.283048, chirp=46.046, fft_len=16k

Workunit 2576907391 (26mr17aa.25216.7429.14.41.247)
Task 5812811343 (S=0, A=1, P=2, T=0, G=0) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5812811344 (S=0, A=1, P=2, T=0, G=0) x41p_zi3t2b, Cuda 8.00 special

Cuda 8.00 special - Best gaussian: peak=3.681468, mean=0.5749262, ChiSq=1.387973, time=34.39, d_freq=1419914966.94,
score=-1.986706, null_hyp=2.127978, chirp=-74.326, fft_len=16k
v8.22 SoG - Best gaussian: peak=3.074103, mean=0.5448458, ChiSq=1.296905, time=99.82, d_freq=1419915193.65,
score=-1.983227, null_hyp=2.072506, chirp=-74.55, fft_len=16k

Workunit 2577622008 (03my17ab.4903.11519.16.43.91)
Task 5814309939 (S=0, A=2, P=0, T=7, G=0) v8.22 (opencl_nvidia_SoG) windows_intelx86
Task 5814309940 (S=0, A=2, P=0, T=7, G=0) x41p_zi3t2b, Cuda 8.00 special

Cuda 8.00 special - Best gaussian: peak=8.449782, mean=0.7001557, ChiSq=1.376348, time=9.227, d_freq=1420886348.79,
score=-1.853296, null_hyp=2.074755, chirp=-14.824, fft_len=16k
v8.22 SoG - Best gaussian: peak=7.661397, mean=0.6757016, ChiSq=1.342818, time=71.3, d_freq=1420892014.37,
score=-1.852513, null_hyp=2.053577, chirp=-93.509, fft_len=16k

Workunit 2580203063 (28mr17ac.1412.331287.5.32.153)
Task 5819707063 (S=0, A=0, P=2, T=1, G=1) x41p_zi3t2b, Cuda 8.00 special
Task 5819707064 (S=0, A=0, P=2, T=1, G=1) v8.20 (opencl_ati5_SoG_mac) x86_64-apple-darwin

Cuda 8.00 special - Best gaussian: peak=3.722415, mean=0.5437903, ChiSq=1.365831, time=37.75, d_freq=1418989730.74,
score=0.4701011, null_hyp=2.247322, chirp=-45.357, fft_len=16k
v8.20 ATI SoG Mac - Best gaussian: peak=3.699681, mean=0.5424874, ChiSq=1.395795, time=39.43, d_freq=1418989654.65,
score=0.3231449, null_hyp=2.257735, chirp=-45.357, fft_len=16k

I spotted a couple more in my Inconclusives list, both also against v8.22 SoG.
9) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1873255)
Posted 6 days ago by Jeff Buck
Post:
Was thinking, grab an old molex connector with a female pin, pop it out of the case, and it can be used to insert/twist on the bad MB pin to clean it up.
Just tried a modified version of your idea and it appears to have worked. The Molex pin would have been too short for me to make effective use of, so I rooted around in a box of old ballpoint pens and found a dried-up metal refill. I just cut it in half and chucked it into my Dremel rotary. It drilled out most of the gunk in both sockets in just a few seconds. The pins in the middle appear to still be intact, though whether they're still functional or not will have to await the arrival of the new PSU, which isn't due until Friday. Keeping fingers crossed!
The PSU got here a day early. I got it installed and plugged into the existing mobo and ran SoG on Windows for about an hour. No problems, so now it's running Linux and the Cuda 8.0 Special App. Just on the two GTX 960s that are plugged directly into the board, though, since I don't yet have the powered riser cables for the other two. We'll see how it goes overnight. Still keeping fingers crossed. If things go well, the new mobo will go directly into the spare parts bin once it gets here tomorrow or Saturday.
10) Message boards : Number crunching : Panic Mode On (106) Server Problems? (Message 1872977)
Posted 7 days ago by Jeff Buck
Post:
Heh. I have an old IBM Thinkpad which runs 24/7. In the nearly 3 years since AP v7 came along, it has been privileged to process just 24 AP tasks. However, in all that time, only 7 of those have actually qualified toward the 11 "tasks completed" figure that's used to compute estimated run times. So the AP that's now sitting in that machine's queue has an estimated run time of 17d 17:03:29. In reality, it should only take around 48 hours to run, but until the "remaining" time estimate drops below the 2.5 days that I have the buffer set for, the AP will be the only thing in the queue. At that point the AP will really only have about 6 hours left to run.

If I'm lucky, perhaps I'll reach the 11 task threshold sometime in 2019. :^(
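The situation above can be sketched roughly as follows. This is a simplified illustration of the gating effect described, not BOINC's actual duration-correction logic, which is considerably more involved.

```python
# Simplified sketch of the situation described above: the client won't
# request more work while the queued task's (inflated) remaining-time
# estimate already exceeds the cache setting.  BOINC's real runtime
# estimation logic is considerably more involved.

HOURS_PER_DAY = 24

def will_fetch_work(remaining_estimate_h, buffer_days=2.5):
    return remaining_estimate_h < buffer_days * HOURS_PER_DAY

estimate_h = 17 * HOURS_PER_DAY + 17   # the ~17d 17h estimate from the post
actual_remaining_h = 48                # the realistic runtime

print(will_fetch_work(estimate_h))     # False: the AP stays the only thing queued
# Once the estimate ticks down below 60h (2.5 days), fetching resumes --
# and by then, at the ~48h/425h actual-to-estimated ratio, the AP really
# has only about 6 hours left to run.
print(will_fetch_work(59))             # True
```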
11) Message boards : Number crunching : Panic Mode On (106) Server Problems? (Message 1872962)
Posted 7 days ago by Jeff Buck
Post:
My guess, and that's all it is, is that the "other" option isn't intended to keep your queue filled, but rather is geared to preventing you from actually running out. So, the scheduler might not send "other" work unless your supply of primary work gets below some specified threshold (which only the scheduler knows), rather than every time it drops below the maximum requested.

At least, that would seem logical to me, and would be the way that I'd want it to work if I had my options set that way. Then, if a plethora of the work that I preferred suddenly became available, I'd have plenty of available space in my queue to accept it, rather than having the queue already stuffed with "other" work.

Somebody who actually understands the scheduler code would really have to take a look at it to see if that's really what's happening.
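The guess above could be expressed something like this. To be clear, this is a sketch of the speculated behavior only, not the actual scheduler code, and the threshold value is entirely hypothetical.

```python
# Sketch of the guessed scheduler behavior above (pure speculation, not
# actual BOINC scheduler code): "other" work is only sent once the supply
# of preferred work drops below some internal threshold, rather than
# whenever the queue is merely below its requested maximum.

def send_other_work(preferred_secs, queue_max_secs,
                    other_threshold_secs=3600):   # threshold is hypothetical
    if preferred_secs >= queue_max_secs:
        return False                  # queue already full of preferred work
    # only a near-empty supply of preferred work triggers "other" work
    return preferred_secs < other_threshold_secs

print(send_other_work(preferred_secs=50000, queue_max_secs=216000))  # False
print(send_other_work(preferred_secs=1800,  queue_max_secs=216000))  # True
```

This matches the behavior described: plenty of queue space stays available for preferred work, and "other" work only arrives when you're about to run dry.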
12) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1872286)
Posted 12 days ago by Jeff Buck
Post:
Was thinking, grab an old molex connector with a female pin, pop it out of the case, and it can be used to insert/twist on the bad MB pin to clean it up.
Just tried a modified version of your idea and it appears to have worked. The Molex pin would have been too short for me to make effective use of, so I rooted around in a box of old ballpoint pens and found a dried-up metal refill. I just cut it in half and chucked it into my Dremel rotary. It drilled out most of the gunk in both sockets in just a few seconds. The pins in the middle appear to still be intact, though whether they're still functional or not will have to await the arrival of the new PSU, which isn't due until Friday. Keeping fingers crossed!
13) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1872132)
Posted 12 days ago by Jeff Buck
Post:
15.7 Amps ain't much. Seems these boards just aren't wired for very much PCIe power.
Yeah, max is supposed to be 18.0 Amps, which should allow for some occasional higher peaks, but with S@h we're pretty much running flat out at whatever our cards need all the time.

The problem with using external power connectors is that the graphics cards are designed to pull the first 75 watts from the slot.
So the cards will still be trying to pull 300 watts from slots wired for 188 watts. I don't think that will work...for very long.
The benefit of the powered risers is that they feed the additional power through a Molex connector wired into the riser's PCI-e connector. Theoretically, that should mean that while the card may still be drawing 75W from the riser cable, not all the power is actually coming from the slot itself. Theoretically. :^)
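The arithmetic behind the numbers in this exchange works out as follows, assuming the slot power all rides on the 12 V circuit being discussed:

```python
# Back-of-envelope check of the numbers above, assuming the PCIe slot
# power all comes off the single 12 V circuit being discussed.
VOLTS = 12.0

def watts(amps):
    return amps * VOLTS

print(watts(15.7))   # ~188 W -- the "wired for 188 watts" figure
print(watts(18.0))   # 216 W  -- the rated 18 Amp maximum
print(4 * 75)        # 300 W  -- four cards each pulling 75 W from the slot
```

So even at the rated maximum, four cards pulling their full 75 W slot allowance would exceed the circuit's capacity, which is why offloading draw to the risers' Molex feeds matters.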
14) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1872117)
Posted 12 days ago by Jeff Buck
Post:
Here are the relevant sections from the Technical Reference. You can write to HP if you want to challenge them. :^)

As far as I know, no specific slot is the cause, or is itself fried.

The fried pins are numbers 10 and 11.

And that "12 V-B" circuit seems to handle all the "PCI, fans, onboard logic, and audio regulator". It's not fragmented by slot.
15) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1872091)
Posted 12 days ago by Jeff Buck
Post:
...I've been there, I've seen that, I've experienced the same. My MoBo connectors were fried, the cables from my PSU to The MoBo were fried, The GPUs were fine.
A lesson learned: Get a GPU with a power connector or two! Or more of them GPUs but make sure they have their own power.
The MoBo cannot supply 4x75W to GPUs. I was lucky a few times and got a new MoBo and a couple of PSUs with a 7-year warranty. Things may not end that lucky....
My experience has been just the opposite. The only time I've experienced burnt MB pins was when using a 250 watt 6970, which has 6 & 8 pin external connectors. I've never had that problem with GPUs that use 75 watts or less. It seems reasonable that if the card isn't capable of pulling more power than the pins can handle, you won't have the problem. I don't know about Jeff's 960, but mine has a single 8 pin connector....sometimes I worry about it since it's using a 6 pin adapter. If for whatever reason the card tries to pull much more than 75 watts through the MB pin, it will probably burn.
Actually, Petri's pretty much nailed it for my xw9400s. The User Manual and User Guide never specified the max power draw for the PCIe slots and, since I've been running 4 GPUs on my first xw9400 since November 2013 without problems, I assumed that all 4 slots were rated for 75W. They're not. I finally dug through an xw9400 Technical Manual today and found that the max slot power for the two x16(x8) slots is supposed to be 25W. Apparently that doesn't mean they can't supply more, but they really shouldn't be expected to.

Both of my xw9400s are running cards on riser cables, but they're not powered risers. That means that the two 750Ti's on my original xw9400 are drawing their full power, which I believe is in the 65-70W range running Windows SoG, from slots that should only be good for 25W. I just tried getting a look at the 24-pin connector on that box and can see some light browning on the underside of pins 10 and 11, but can't easily separate the plug from the socket, so I'm not going to try to get a closer look. I suspect that they're probably fused....but still working okay after nearly 4 years of simmering.

On the rig that just got fried, the two 960s on the x16(x8) slots are also on unpowered risers, but they have supplemental 6-pin connectors which, theoretically, should be providing any power that the slots themselves cannot. Of the two 960s that are mounted directly in the x16(x16) slots, one has a 6-pin connector and the other has an 8-pin.

In any event, new MB and PSU are on the way, but I think I'm going to replace all the riser cables in both boxes with powered ones, so the cards can hopefully draw extra power from a Molex or SATA power connector, instead of overloading the slots. BTW, the pins that fried are indeed the two 12V ones for the circuit that's used for "PCI, fans, onboard logic, and audio regulator", according to the technical manual. Supposedly that circuit can handle a maximum of 18 amps, but I have no idea what it's actually drawing.
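A rough loading check for that shared 12 V circuit might look like the sketch below. The per-card draw figures are this post's own estimates; the fans-and-logic allowance is a hypothetical number, and treating all slot power as 12 V is a simplifying assumption.

```python
# Rough loading check for the shared 12 V "PCI, fans, onboard logic,
# and audio regulator" circuit described above.  Card draws are the
# post's own estimates; the fans/logic allowance is hypothetical, and
# treating all slot power as 12 V is a simplifying assumption.
CIRCUIT_MAX_AMPS = 18.0
VOLTS = 12.0

slot_draws_w = [70, 70, 75, 75]   # e.g. two 750Ti-class + two 960-class cards
fans_and_logic_w = 30             # hypothetical allowance for everything else

total_w = sum(slot_draws_w) + fans_and_logic_w
total_a = total_w / VOLTS
print(total_a, total_a <= CIRCUIT_MAX_AMPS)   # well over 18 A -> trouble
```

Which is consistent with the conclusion above: moving card power onto powered risers fed from Molex/SATA connectors takes that load off the overstressed circuit.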
16) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1871986)
Posted 13 days ago by Jeff Buck
Post:
Was thinking, grab an old molex connector with a female pin, pop it out of the case, and it can be used to insert/twist on the bad MB pin to clean it up.
Hmmm...might try that. I was just playing around with it a few minutes ago and felt like I needed a dental pick or similar tool. All that black in the worst-looking socket is melted gunk from the PSU connector, and no amount of cleaning fluid, or digging with a stiff wire, is going to get it out. The pin in the middle of the socket still looks to be intact, though.

I've ordered a new mobo, just in case. For 20 bucks, I can afford to have a spare handy.
17) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1871979)
Posted 13 days ago by Jeff Buck
Post:
. . Some Freon and a Qtip should show up how much damage has been done.
Have plenty of QTips, but no Freon. I'll see what I can do tomorrow.


. . Do you have any denatured alcohol (Isopropyl) such as maybe tape head cleaner? That would do the trick. Just to clean that mobo ATX socket enough to set the extent of the damage.

Stephen

..
Sure,...have plenty of that stuff around. An old dinosaur has to keep his 8-track players (yes, that's plural) in good working order, after all. ;^)
18) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1871878)
Posted 13 days ago by Jeff Buck
Post:
All those little plugs for the fans, etc., and, oh yeah, a big honkin' water cooler ....
Those are both +12V pins, so if you can move some of the 12V fan draw to SATA/Molex connectors it would help lessen the MB loading. That is, they are likely at full speed anyway, so they don't really need to be on the MB other than for monitoring.
Yeah, the cooler itself has 3 little plugs running off in different directions. Both Molex connectors are currently being used to feed the supplemental connector on one of the GTX 960s, as are a couple of the SATA connectors for another 960. There should still be a couple more SATA power connectors free, but I really need to map out what rails are feeding what before I go trying to shift stuff around.

I'm sure your PSU is fine, just put a new plug on it from an old PSU and maybe patch in 2 replacement ends.
You may very well be right about that, although I assume the existing PSU is the original one and is getting a bit long in the tooth, anyway. And unfortunately, the only old PSU I have lying around is from an old AT rig and the plug just ain't quite what I need. :^)

Anyway, if replacing the PSU will get the box going again, I can track down a new plug somewhere later on and perhaps fix that up. That might give me a 1050W upgrade for the 800W PSU that's currently driving my other xw9400.

Oh well, too much to think about tonight. I'll face it all tomorrow.
19) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1871874)
Posted 13 days ago by Jeff Buck
Post:
. . Some Freon and a Qtip should show up how much damage has been done. Did you say you were waiting on a new MoBo? Is there any sign of tracking on the underside of the mobo (arcing between adjacent printed circuit rails)? Anyone willing to bet against it being the 12V lines?

Stephen

??
Have plenty of QTips, but no Freon. I'll see what I can do tomorrow. I have a replacement PSU ordered (as of a couple hours ago), but will hold off on the MB until I can get a better look at the damage. Amazon has refurbished ones for $19.98, so they're cheap enough, but it's a real PITA to pull the old one out and put a new one in. All those little plugs for the fans, etc., and, oh yeah, a big honkin' water cooler and fresh thermal paste for the CPUs. I'd rather not tackle that project if I don't have to. ;^)

@TBar
Deja Vu!
Dude, that's exactly how my xw4600 looked after the bout with the ATI 6970. Exact same two pins. In my case, after I picked at it, it only worked as long as there wasn't a card in the #1 slot. The 'new' $22 xw4600 seems to work fine...with a different Power Supply. Seems the two melted pins on the old PS don't work correctly with an ATI 6870, but seems to work in another machine with just an ATI 7750 on an ECS board. The PS I'm using now on the new xw4600 is too small for both ATI cards, and the new PS won't be here 'till Saturday. I think I'll try swapping the cards again after then.
It'll probably be at least a week before a replacement PSU arrives here, so there's probably not much I can do in the meantime. It's a 1050W model, and the box never drew more than about 675W, so I don't think the overall power draw was the issue. However, it's a multi-rail PSU and I suspect I need to figure out just which rail was overloaded. On my T7400, I had to do a lot of juggling of supplemental connectors to get it to support 3 GPUs. I never had that problem with either of my xw9400s before, but it could be that the extra 40-50 watts the Special App was drawing was just enough to cross the line somewhere. I'll start diagnosing tomorrow.
20) Message boards : Number crunching : Linux CUDA 'Special' App finally available, featuring Low CPU use (Message 1871866)
Posted 13 days ago by Jeff Buck
Post:
Just when my xw9400 w/ 4 GTX 960s was starting to level off at around 80K RAC running the Special App, it appears it will have to take some time off.


It first crashed early Wednesday morning, then has only run intermittently since. I figured the PSU was dying, so I first tried switching back to Windows SoG processing, to see if that lessened the load, then disconnected one GPU and then a second. No real help. I also reduced the CPU load, with no improvement. So, tonight I started pulling things apart. Ugghh!

Keeping my fingers crossed that the connector on the MB isn't too fried and that a PSU replacement will do the trick. But not until I figure out what those 2 pins were supplying. That's a job for tomorrow. :^)


©2017 University of California
SETI@home and AstroPulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.