Posts by HAL9000


1) Message boards : Number crunching : Power bill challenge (Message 1576644)
Posted 7 hours ago by Profile HAL9000
Just for fun, does anyone want to take a stab at working out the monthly power bill for the total computing power contributed to SETI monthly?

You could take the computing power for the project and then figure some kind of average watt per flop.
Well... you could if the credit system that was meant to be a measure of computing power was actually that. Since it isn't, that makes the task quite impossible.
The drastic difference between MB & AP credit, when the power consumption for a computer is the same in either case, would also make it difficult.
2) Message boards : Number crunching : Max temp for getting the normal life out of a video card? (Message 1576598)
Posted 8 hours ago by Profile HAL9000
Sorry Hal, I thought about that during the outage and remembered that he has ATI... The person before commented about his NV and that got stuck in my head...

I was hoping that it did work, as there do not seem to be as many tools for customizing non-NV hardware. I am not sure if MSI Afterburner or Sapphire TriXX give fan remapping options.
3) Message boards : Number crunching : Max temp for getting the normal life out of a video card? (Message 1576591)
Posted 8 hours ago by Profile HAL9000
Merle,

You are using Precision X? If so, did you customize your fan curve? What is the max temp on the curve so you get 100% fan usage? You can also set your target temp for the GPU in Precision X and it will throttle back when it hits that, so as to maintain that max temp. I wouldn't go as high as the 80s or 90s °C. I think the max listed is somewhere near 90, but you are asking for trouble up that high. I would try to stick around 68 max if you can. You can push it higher, but then you have to look at the trade-offs.

Zalster

Precision X works with ATI cards now?
4) Message boards : Number crunching : Max temp for getting the normal life out of a video card? (Message 1576536)
Posted 15 hours ago by Profile HAL9000
I will also add that all of my cards are from Sapphire, which has had a reputation for making quality cards since the days of ATI building their own cards.
I think they would be the ATI equivalent of EVGA, if that makes sense to you.
5) Message boards : Number crunching : 269 pending - what will happen?? (Message 1576532)
Posted 15 hours ago by Profile HAL9000
I found a computer that has 156 tasks "in progress". I found it because I was trying to figure out why I had a task (370648968) still pending validation from a month ago. I traced it to a computer that has had 156 tasks "in progress" for almost a month now. Looks like I'll be waiting for that task to time out and get resent to an active account. :/

I doubt it will time out. The deadline for that task is 20 October. Plus your wingperson is running an Intel Xeon with 8 cores. His GPU is a GeForce 310.

My i7 3770s with a 550 Ti GPU can burn through 200 AP tasks in about 4 or 5 days.

I think you are safe even with the 10 timeouts his GPU has had.

Given that the host hasn't contacted the servers in about 3.5 weeks, I would expect their task to time out & get sent to a 3rd host, per normal operation.
6) Message boards : Number crunching : Max temp for getting the normal life out of a video card? (Message 1576531)
Posted 15 hours ago by Profile HAL9000
Hal,

I leave the fan on my HD6870 on auto. On a hot day with the windows open the fan spins up to about 55% from 40-45%, while keeping the GPU at the same temp as when the A/C is on. I would have to check the temp logs, but I think it stays well under 60ºC all of the time.
I leave the fan on auto to hopefully get as much life out of the fan as possible, because a dead fan is a fairly likely way to have a dead GPU.


Do you always use auto? When I left my r7 265 on auto, I had a temp as high as 76C which bothers me a little.

Given that the options for the fan on the GPU are auto or a single fixed speed, I use auto. Then the fan runs at the percent specified by the card manufacturer. There are probably apps to tune the fan speeds at various temps, like I can do with my MB for the CPU and chassis fans, but I haven't bothered to go that route.
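
For anyone curious what "tuning the fan speeds at various temps" amounts to, here is a minimal sketch of a fan curve: a handful of (temperature, fan %) points with linear interpolation in between. The curve values and the names FAN_CURVE and fan_percent are made up for illustration and are not taken from any particular tool or card.

```python
from bisect import bisect_right

# Hypothetical curve: (GPU temp in deg C, fan speed in %).
# These points are illustrative assumptions, not manufacturer values.
FAN_CURVE = [(40, 40), (60, 55), (75, 75), (85, 100)]


def fan_percent(temp_c, curve=FAN_CURVE):
    """Return the fan speed (%) for a temperature by linear interpolation."""
    temps = [t for t, _ in curve]
    # Below the first point or above the last one, clamp to the end values.
    if temp_c <= temps[0]:
        return float(curve[0][1])
    if temp_c >= temps[-1]:
        return float(curve[-1][1])
    # Find the segment that contains temp_c and interpolate along it.
    i = bisect_right(temps, temp_c)
    (t0, f0), (t1, f1) = curve[i - 1], curve[i]
    return f0 + (f1 - f0) * (temp_c - t0) / (t1 - t0)
```

With this made-up curve, fan_percent(50) lands halfway between the 40% and 55% points. A real tool would then hand that number to the card's driver, which is the part that differs between NV and ATI hardware.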

I recently bought a 2nd 6870 off of ebay for $40 to put in my HTPC. That card runs a bit warmer than the one in my gaming machine. I am seeing up to 74ºC with the fan running around 60%. I tested various manual settings in the HTPC to see how much of an effect they would have. At 100% it would drop to about 62ºC, but at that point the fan noise is rather terrible. Anything over 75% I didn't care for in the way of noise. I'm OK with the temperature of the GPU. So long as it runs without generating errors.
I may try turning the fan speed down and letting the temp go up to see at what point it will generate errors. Then I will have a good measure of "this is too hot".

Next summer I may swap the 5750 back in the system or get a newer more efficient GPU to keep the noise down for that system. Because I don't want to try and watch a movie over the sound of a jet trying to take off in the living room.
7) Message boards : Number crunching : Well it had to happen... (Message 1576225)
Posted 1 day ago by Profile HAL9000
I've been watching my inconclusives and made an observation. Since ramping up production as winter approaches my pending WUs have doubled whereas my inconclusives have gone up by a factor of 5.
Assuming that I'm not generating invalids (and when I click through on my inconclusive wingman I see lots of invalids so it's probably not me) I guess that means the invalids have a comparatively fast turnaround time. Does that mean they are more likely to validate against each other?

In a way the hosts trashing work do have a higher chance of finding another host trashing work to generate a false valid result. However, with the large number of hosts out there it doesn't happen very often. The two machines listed in the OP are running a rate of about 3% valid. I didn't check to see if any of their valid tasks are truly valid, but given the low numbers, of 8 & 28, I expect not.
8) Message boards : Number crunching : Well it had to happen... (Message 1576168)
Posted 1 day ago by Profile HAL9000
About as bad as those NV card users that return nothing but -9 overflows.

Actually, this is much worse because they're validating bad results against each other, causing good results to be thrown out.

Exact same issue as before. When it was brought up a few years ago it was two NV GPUs that trashed their work validating against each other, with the good CPU result being tossed out. Which does still happen btw.
9) Message boards : Number crunching : headless computer? (Message 1576097)
Posted 1 day ago by Profile HAL9000
Thanks OzzFan
Yes, sorry 300Mbps

I would guess that is several times faster than your internet connection. Unless you have GB fiber. In which case I don't want to know!
>.<
la la la la la la

At work I have 25 machines sharing a 100Mb pipe to the internet & that is more than enough bandwidth. The only issue I could see from using a wireless adapter would be if it acts weird and you can't access the machine, which is an issue I have with one of my notebooks. It continues to have internet access fine, but won't talk to any other device on the network. I've chalked it up to being an older Atheros wireless adapter.
10) Message boards : Number crunching : Well it had to happen... (Message 1576093)
Posted 1 day ago by Profile HAL9000
About as bad as those NV card users that return nothing but -9 overflows.
11) Message boards : Number crunching : 750Ti Power Consumption (Message 1575895)
Posted 1 day ago by Profile HAL9000
Given that a PCIe x16 slot is rated for 75w the 60w cards should be fine. At least on paper. Whether or not the MB was designed to take a continuous load is a separate question.
If your P35 board is built anything like the P45 board I bought I expect it will be fine. You may even be pulling less from the PCIe slots than you were with the older 116w cards.
12) Message boards : Number crunching : Phantom Triplets (Message 1575798)
Posted 2 days ago by Profile HAL9000
Having something that consistently fails is indeed much nicer, which is why Jason mentioned trying to force an error. With an error rate of 1 in 800 that would put it at 1 about every 16 days. You said most of the errors happen late at night, but have they happened on the same day(s)? It makes me think there could be some kind of large industrial plant powering up or down every 2 weeks, causing just enough of a fluctuation in the line voltage to make your GPU go a little nuts.

Ah, if only it was that consistent! Although it may average out to once every 16 days, the intervals have actually ranged from 2 days up to 35 days, the most recent interval being just 4 days. Not happening the same day or time, either, I'm afraid. Let's see, I have (in order) a Monday at ~1:35 AM, a Wednesday at ~4:40 AM, a Tuesday at ~6:05 PM, a Tuesday at ~10:50 PM, and a Saturday at ~3:20 PM. No industrial plants nearby, either. I live in a fairly rural area. That's not to say PG&E's electric service is entirely predictable, though. Sometimes they make California feel like a third world country, with random outages that seem to have no external cause (i.e., perfectly bright, sunny day or calm, clear night, no car meeting power pole, but suddenly, no juice). However, that's been true for as long as I've lived here (28+ years) and this thing with this one GPU is quite new. I'm keeping the suggestions on voltages, both yours and Jason's, in reserve for now, pending any worsening of the situation.

I'd still like to try running a GPU memory test first, if I can find one similar to memtest86, just to test the memory, not stress test the whole GPU.

This seems to be one of the few tools out there to test video memory.
http://mikelab.kiev.ua/index_en.php?page=PROGRAMS/vmt_en
13) Message boards : Number crunching : Phantom Triplets (Message 1575765)
Posted 2 days ago by Profile HAL9000
About every 4-6 weeks the 8500 GT I'm using throws a fit and starts trashing work, despite it being in a lab with temperature, humidity, & power regulation. I tried a few things like shutting down the system weekly to see if it would help, but nothing I have tried has helped so far.
I decided it was rare enough that I didn't want to spend any more time looking into it.

Actually, I think if my 550Ti actually went off the rails like that, continuing to throw Invalids for an extended period after it got the first one, I'd understand it better, or at least be more understanding of it, than the way it just pharts once and then resumes as if there's nothing wrong. :^) At least then, if cleaning it or reseating it or increasing the fan speed, etc., didn't make the problem go away, I wouldn't feel hesitant to replace it. Under the present circumstances, though, I certainly wouldn't want to do that.

Speaking of replacing a card, a couple months ago I replaced an 8600 GT in my old IBM Thinkcentre with one of the (relatively) new ASUS GT 630 1GB cards. It's passively cooled and only has a maximum draw of 25w versus the 47w max for the 8600 GT. The actual power draw seems to be running only about 13w. The card only cost me $33.00 USD delivered (new, on eBay) and I figure it saves me over $4/month in electricity, so should pay for itself in about 8 months. Oh, and it provides about an 18% boost in production (at least as measured by Credits). Might be worth considering for your cranky 8500 GT.
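
The payback estimate in the quote above is simple division, and it can be sketched as below. The $33 card price and the $4/month saving come from the post; the helper names, the 730 hours per month, and any electricity rate or wattage passed in are assumptions for illustration, not figures from the thread.

```python
def monthly_saving_usd(watts_saved, usd_per_kwh, hours_per_month=730):
    """Estimate the monthly saving for a machine that crunches 24/7.

    watts_saved and usd_per_kwh are whatever the caller measures or pays.
    730 is an assumed average number of hours in a month.
    """
    return watts_saved / 1000 * hours_per_month * usd_per_kwh


def payback_months(card_cost_usd, saving_per_month_usd):
    """Months until the saving on the power bill covers the card's price."""
    return card_cost_usd / saving_per_month_usd


# Figures from the post: a $33.00 card saving about $4 per month.
months = payback_months(33.00, 4.00)  # 8.25, i.e. "about 8 months"
```

Since the post says the saving is "over" $4/month, the real payback would be a bit shorter than this.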

Having something that consistently fails is indeed much nicer, which is why Jason mentioned trying to force an error. With an error rate of 1 in 800 that would put it at 1 about every 16 days. You said most of the errors happen late at night, but have they happened on the same day(s)? It makes me think there could be some kind of large industrial plant powering up or down every 2 weeks, causing just enough of a fluctuation in the line voltage to make your GPU go a little nuts.

In my case the 8500 GT is in a work machine I use for testing. Hardware is kept the same in order to have consistent test platforms or for instances when something needs to be regressed. I did try a few years ago to get some GT 430s for several of the systems, so the systems could correctly support Windows Aero, but that never happened. I had also made a proposal to replace all of the monitors in the lab that were using CRTs with LCDs. I even included the amount of time it would take to recoup their cost in electricity savings. Sometimes our bean counters are not the brightest.
14) Message boards : Number crunching : Phantom Triplets (Message 1575738)
Posted 2 days ago by Profile HAL9000
Thanks for the insights, Jason. Thus far, I haven't tinkered with the voltage or clocks on that card at all. It's just running at all the defaults, with no overclocking. Probably, unless the frequency of this little anomaly increases significantly, I'll try to avoid altering any of those settings. The idea of actually trying to increase the Invalid rate in order to possibly get a clue to diagnose the existing low failure rate doesn't really appeal to me at the moment! ;^)

- The inherent susceptibility of 'consumer grade' hardware to soft error
http://en.wikipedia.org/wiki/Soft_error#Causes_of_soft_errors, especially,

IBM estimated in 1996 that one error per month per 256 MiB of ram was expected for a desktop computer


That's some really interesting stuff...definitely a scary door! I had no idea that "soft" errors were so prevalent and could be caused so easily. Alpha particles and cosmic rays and thermal neutrons, oh my! I can't wait for my bank to blame the next data breach on a thermal neutron.

Seriously, though, could a soft error possibly be consistent enough to cause the sort of rare, yet consistent, hiccup that I'm seeing where only Triplets (and lots of them) are being incorrectly identified where none apparently exist?

About every 4-6 weeks the 8500 GT I'm using throws a fit and starts trashing work, despite it being in a lab with temperature, humidity, & power regulation. I tried a few things like shutting down the system weekly to see if it would help, but nothing I have tried has helped so far.
I decided it was rare enough that I didn't want to spend any more time looking into it.
15) Message boards : Number crunching : Phantom Triplets (Message 1575686)
Posted 2 days ago by Profile HAL9000
Is the 550 Ti one of the Ti cards prone to issues with power?
If it is, you could log the voltages and see if there are any drops on the 12v supply that coincide with the running time window for any future invalid tasks.
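
A rough sketch of that check, assuming the voltage log has already been read into (timestamp, volts) pairs by whatever monitoring tool is in use. The log format and the name dips_in_window are hypothetical. The default floor is the ATX tolerance of 5% below 12 V, roughly 11.4 V, but a tighter floor may be more useful for spotting small sags.

```python
from datetime import datetime

NOMINAL_12V = 12.0
TOLERANCE = 0.05  # ATX allows +/-5% on the +12 V rail


def dips_in_window(samples, start, end, floor=NOMINAL_12V * (1 - TOLERANCE)):
    """Return the samples inside [start, end] whose voltage is below floor.

    samples: iterable of (datetime, volts) pairs (assumed log format).
    start, end: the run-time window of the invalid task.
    """
    return [(ts, volts) for ts, volts in samples
            if start <= ts <= end and volts < floor]
```

An empty result for every invalid task would point away from the 12v supply. Keep in mind that software voltage readings are often coarse, so a short dip between two samples would be missed.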
16) Message boards : Number crunching : venting (Message 1575669)
Posted 2 days ago by Profile HAL9000
Prevent the aliens or government from controlling or reading your thoughts......LOL

In a funny twist it would actually act as an antenna. Making one more receptive to such mind controlling devices.
For a Faraday cage to work correctly it must completely surround the object. So a tinfoil suit is a much better idea. :)
17) Message boards : Number crunching : headless computer? (Message 1575158)
Posted 3 days ago by Profile HAL9000
tbret or anybody

Team Viewer seems cheap and easy. What is this about push the power button less than 4 seconds? Wouldn't you just shutdown over Team Viewer?

The default action for Windows is to start shutting down when the power button is pressed. This gives you a clean system shutdown.
If you hold the power button for more than 4 seconds, the ATX spec is to shut off the PSU, which Windows will treat as an unexpected shutdown.
18) Message boards : Number crunching : headless computer? (Message 1575130)
Posted 3 days ago by Profile HAL9000
Headless Computer:

Is it without a monitor, keyboard and mouse? How do you get it setup? How do you shut it down? I assume it will just be running seti/lunatics/win8.1.

There are many options. A few are:
1) You could get an inexpensive KVM switch to go between your machines.
2) Get a network KVM device so you can remote into it and have full control over the whole machine just like a KVM, but across the network. However, these are often expensive.
3) Connect your current monitor and input devices. Set up the system and install remote software such as VNC when done.
4) Connect your current monitor and input devices. Set up the system, then control it remotely over the network with command line options.

I mostly do 3 or 4 myself.
Once you have the system set up & BOINC configured to launch on start up, you can control BOINC through BOINC Manager from your main system.
On most of my systems I don't even have BOINC Manager running, since I access the systems remotely via boinccmd when I need to do anything.
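
As a minimal sketch of driving boinccmd from the main system, the helper below builds and runs a command against a remote host. The names boinccmd_args and run_boinccmd, the address, and the password are placeholders. It assumes boinccmd is on the PATH and that the remote client has been set up to accept remote GUI RPC connections (remote_hosts.cfg and the password in gui_rpc_auth.cfg).

```python
import subprocess


def boinccmd_args(host, password, *command):
    """Build the argument list for one boinccmd call against a remote client."""
    return ["boinccmd", "--host", host, "--passwd", password, *command]


def run_boinccmd(host, password, *command):
    """Run boinccmd and return its text output.

    Raises CalledProcessError if boinccmd exits with an error.
    """
    result = subprocess.run(boinccmd_args(host, password, *command),
                            capture_output=True, text=True, check=True)
    return result.stdout


# Example with placeholder values: list the tasks on a headless cruncher.
# print(run_boinccmd("192.168.1.20", "secret", "--get_tasks"))
```

Running boinccmd with --help lists the other available commands.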

As far as shutting the system down, you can do that remotely via a Windows command line option. Open a command line prompt and type shutdown /? to get the full help instructions.
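
For illustration, here is a small sketch that assembles the Windows shutdown command for a remote machine. The helper name shutdown_args and the machine name are hypothetical. It uses only the /s, /r, /t and /m switches that shutdown /? lists, and a remote shutdown also needs suitable permissions on the target machine.

```python
def shutdown_args(computer=None, delay_seconds=30, restart=False):
    """Build a Windows shutdown command line as an argument list.

    computer: network name of the remote machine, or None for the local one.
    delay_seconds: grace period before the shutdown starts (/t).
    restart: use /r (restart) instead of /s (shut down).
    """
    args = ["shutdown", "/r" if restart else "/s", "/t", str(delay_seconds)]
    if computer:
        # /m takes the target in UNC form, i.e. \\name
        args += ["/m", rf"\\{computer}"]
    return args


# Example with a placeholder machine name; pass the list to subprocess.run
# on a Windows system to actually issue the command.
# shutdown_args("htpc", delay_seconds=60)
```

Typing shutdown /a on the target during the delay aborts a pending shutdown.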
19) Message boards : Number crunching : Different cache for AP/MB (Message 1575126)
Posted 3 days ago by Profile HAL9000
Is it possible to set different cache values for AP and MB? Like 10 days for AP and 1 day for MB?

Cache preferences are set globally for BOINC rather than per project or per application within a project.
20) Message boards : Number crunching : The ultimate build (Message 1575056)
Posted 3 days ago by Profile HAL9000
The GTX750Ti is a dual slot wide card, right?
Then the max is 4 cards/mobo.
Or low profile cards are available?

Instead of two GTX750Ti one e.g. GTX780 / maybe ~ same RAC output.

AFAIK, BOINC can manage up to 8 GPU chips (4x dual, or 8 separate cards).

You could use a 14/20 slot chassis or use PCIe extensions to have the cards further apart. The main issue would be having enough PCIe lanes for all the cards, so a multi-socket MB would be a good start.
I think BOINC has always supported more than 8 GPUs. However, the scheduler wouldn't believe the client if it reported more than 8. It would only give it the # of tasks for 8 GPUs. I seem to recall that this limit was increased to a higher value. I want to say 40, but that may not be correct.



Copyright © 2014 University of California