Posts by Andy Lee Robinson

41) Message boards : Number crunching : Scarecrow's Graphs no longer being updated? (Message 1182312)
Posted 2 Jan 2012 by Profile Andy Lee Robinson
Post:
That really isn't easy at the moment because of the Munin templates; too much hacking is involved, and it would make an already fragile system more so.
Each graph has its own HTML page anyway, showing month and year.

I've separated out the purges, though they appear at the top because there's no way to override the alphabetical sorting without making fixed templates.

All the info is there anyway.. :-)
42) Message boards : Number crunching : Scarecrow's Graphs no longer being updated? (Message 1182292)
Posted 2 Jan 2012 by Profile Andy Lee Robinson
Post:
The value of "WUs awaiting purging" is so high that all the others are just a line along the bottom axis.


I agonized over this too, in my desire to keep the number of graphs down and logically related, as they are generated on the fly on a busy web server (though also cached for 5 minutes).

The values are also viewable as text - max, min, average... but I'll have a play and see what I can do.
43) Message boards : Number crunching : Scarecrow's Graphs no longer being updated? (Message 1182271)
Posted 1 Jan 2012 by Profile Andy Lee Robinson
Post:
I made a stats page too if you can't get your fixes!

http://setistats.haveland.com/

Cheers,
Andy.
44) Message boards : Number crunching : HE connection problems thread (Message 1163135)
Posted 17 Oct 2011 by Profile Andy Lee Robinson
Post:
NB: We should also correct the grammar of any question I pose before sending it, so the human race doesn't look stupid :)


The human race *is* stupid, but at least, most of us here are less stupid than average!
45) Message boards : Number crunching : ~ Temporary Fix for HE connection problems ~ (Message 1159293)
Posted 6 Oct 2011 by Profile Andy Lee Robinson
Post:
Yes, I asked HE a few weeks ago and posted the result here

http://setiathome.berkeley.edu/forum_thread.php?id=64652&nowrap=true#1145686


The problem appears to be within seti@home. The pao hop is the last hop
before we hand traffic off to them. After that is where things are dying.

You would need to contact seti@home about the issue.

Regards,


Is there a chance that anyone can fix this routing/firewall problem for good?
It is probably something just irritatingly simple.

I'm a Linux system administrator, and can sympathise that sometimes things go wrong and need fixing, but this continuous downtime does not look at all good from the outside.

I try to ensure that when things break, the cause is identified and remedied quickly, and if possible, write some code to attempt an automatic fix so the systems can better look after themselves.
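A minimal self-healing sketch of that idea (the check and recovery commands here are hypothetical placeholders, not anything from a real system):

```shell
# Minimal watchdog sketch: if a health check fails, run a recovery
# action and note it. Both commands are placeholder assumptions.
watchdog() {
    check_cmd=$1
    fix_cmd=$2
    if ! eval "$check_cmd"; then
        echo "watchdog: check failed, attempting recovery"
        eval "$fix_cmd"
    fi
}

# Example: recreate a marker file if it has gone missing.
rm -f /tmp/healthy
watchdog "test -f /tmp/healthy" "touch /tmp/healthy"
```

Run from cron every few minutes, even something this simple covers a surprising number of 3 a.m. failures.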

After a while, there should be little for humans to do except feed hard disks and gaze at the processes while munching pizza. :-)
46) Message boards : Number crunching : Downloading WUs Requires Numerous Retrys (Message 1151502)
Posted 12 Sep 2011 by Profile Andy Lee Robinson
Post:
The problem with capping the bandwidth is that valuable connections are held open longer, and there may not be enough to go around.

A while ago I posted a solution using iptables with "negative subnets" as a choke that could be opened incrementally to allow IP ranges to connect, such as:

 &00         = 1/4 address space
 &00|&01     = 1/2 address space
 &00|&01|&10 = 3/4 address space

A million clients downloading 95% of a WU, only to barf and try again from the beginning, is a crying waste of time and bandwidth, so there must be a better way.

Physically ban some clients from even reaching the server until saturation is reduced, so that everyone can connect once demand has eased.
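A rough sketch of that choke (the rule details are my assumptions: port 80, and the iptables u32 match on the low two bits of the source address; printed as a dry run rather than applied):

```shell
# Dry-run sketch of the incremental "negative subnet" choke.
# Offset 12 in the IP header is the source address, so "12&0x3"
# extracts the low two bits of its last octet.
# STAGE=1 admits 1/4 of the address space, 2 admits 1/2, 3 admits 3/4.
STAGE=1
RULES=""
for bits in 1 2 3; do
    if [ "$bits" -ge "$STAGE" ]; then
        RULES="${RULES}iptables -A INPUT -p tcp --dport 80 -m u32 --u32 '12&0x3=0x$bits' -j DROP
"
    fi
done
printf '%s' "$RULES"   # pipe to sh as root to apply for real
```

Raising STAGE and deleting the corresponding DROP rule opens another quarter of the address space without touching clients already connected.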
47) Message boards : Number crunching : HE connection problems thread (Message 1145686)
Posted 27 Aug 2011 by Profile Andy Lee Robinson
Post:
I sent a report to he.net and got this reply:

The problem appears to be within seti@home. The pao hop is the last hop
before we hand traffic off to them. After that is where things are dying.

You would need to contact seti@home about the issue.

Regards,


So... back to seti...
48) Message boards : Number crunching : HE connection problems thread (Message 1145628)
Posted 27 Aug 2011 by Profile Andy Lee Robinson
Post:
Probs here too, one of my servers can reach:

 2  catv-89-133-254-254.catv.broadband.hu (89.133.254.254)  7.878 ms  8.218 ms  20.340 ms
 3  catv-89-135-220-46.catv.broadband.hu (89.135.220.46)  9.868 ms  9.876 ms  9.881 ms
 4  84.116.240.1 (84.116.240.1)  107.933 ms  109.890 ms  110.739 ms
 5  at-vie15a-rd1-xe-4-2-0.aorta.net (84.116.134.5)  111.757 ms at-vie15a-rd1-xe-1-0-0.aorta.net (84.116.134.1)  111.765 ms at-vie15a-rd1-xe-4-2-0.aorta.net (84.116.134.5)  111.754 ms
 6  uk-lon01b-rd1-xe-1-0-2.aorta.net (84.116.130.253)  110.341 ms uk-lon01b-rd1-xe-1-0-1.aorta.net (84.116.132.37)  107.655 ms uk-lon01b-rd1-xe-1-0-0.aorta.net (84.116.132.33)  109.041 ms
 7  84.116.132.226 (84.116.132.226)  108.997 ms 84.116.132.234 (84.116.132.234)  109.016 ms  109.004 ms
 8  us-nyc01b-rd1-pos-12-0.aorta.net (213.46.160.242)  108.993 ms  109.806 ms  108.993 ms
 9  us-nyc01b-ri1-xe-3-1-0.aorta.net (213.46.190.94)  110.992 ms  107.878 ms  107.879 ms
10  core1.nyc4.he.net (198.32.118.57)  108.115 ms  108.120 ms  108.087 ms
11  10gigabitethernet10-1.core1.sjc2.he.net (184.105.213.173)  177.703 ms  177.303 ms 10gigabitethernet10-2.core1.sjc2.he.net (184.105.213.197)  177.673 ms
12  10gigabitethernet3-2.core1.pao1.he.net (72.52.92.69)  183.094 ms  182.394 ms  182.405 ms
13  64.71.140.42 (64.71.140.42)  186.712 ms  186.721 ms  188.548 ms
14  208.68.243.254 (208.68.243.254)  232.318 ms *  229.344 ms
15  * * *

ping 208.68.240.13
PING 208.68.240.13 (208.68.240.13) 56(84) bytes of data.
64 bytes from 208.68.240.13: icmp_req=1 ttl=53 time=266 ms


and another one can't:

 1  GE-V25.core0.interware.hu (195.70.32.193)  1.105 ms  1.101 ms  1.094 ms
 2  GE-0-0-12.border0.interware.hu (195.70.32.4)  1.061 ms  1.059 ms  1.054 ms
 3  194.149.1.85 (194.149.1.85)  90.807 ms  90.811 ms  90.802 ms
 4  TenGE-7-2.huvie.datanet.hu (194.149.20.10)  107.804 ms  107.806 ms *
 5  194.149.19.14 (194.149.19.14)  7.627 ms  7.629 ms  7.621 ms
 6  10gigabitethernet1-1.core1.lon1.he.net (195.66.224.21)  35.718 ms  34.981 ms  43.207 ms
 7  10gigabitethernet6-3.core1.ash1.he.net (72.52.92.137)  120.001 ms  120.007 ms  119.987 ms
 8  10gigabitethernet7-4.core1.pao1.he.net (184.105.213.177)  193.815 ms  190.656 ms  190.647 ms
 9  * * *

ping 208.68.240.13
PING 208.68.240.13 (208.68.240.13) 56(84) bytes of data.
^C
--- 208.68.240.13 ping statistics ---
512 packets transmitted, 0 received, 100% packet loss, time 510998ms


SNAFU looks to be on 10gigabitethernet7-4.core1.pao1.he.net
49) Message boards : Number crunching : All hail GPU computing!! The CPU cruncher is dead! (Message 1081919)
Posted 27 Feb 2011 by Profile Andy Lee Robinson
Post:
I'm just experimenting with GPU crunching, and it looks like my main test/development system can leap from a RAC of about 1,000 using the CPU to something like 200,000 using an nVidia GPU, for about the same cost. At that rate, the machine should out-compute in mere days everything it's ever done over the previous years...

Kinda makes CPU crunching look futile when there is a GPU application available.


Martin, I'm surprised you've only just realised this!

I got a 5850 last year, and it was doing in 2 minutes the same WUs that my i7
did in 10 hours! 300x improvement! Totally insane. 1440 processors vs 4 for
similar energy consumption.

Yes, it is kind of disappointing when years' work costing perhaps $1000 a year
in energy can now be done for less than a tenth of the cost, but that is progress.

As for economic sense: We could save a lot of energy, and still do much more
science in the longer term if we don't use our CPUs now, and instead put
aside what we save to buy GPUs for when our favourite projects do have proper
GPU support.

Unlike a distributed medical application that has some urgency, seti@home isn't
a case of finding ET as soon as possible. We are very unlikely to have any
positive results in our lifetimes, but we have at last started to put some
coefficients into the Drake Equation.
50) Message boards : Number crunching : OMG......I did it. (Message 1081916)
Posted 27 Feb 2011 by Profile Andy Lee Robinson
Post:
Congratulations Mark!
100,000,000 seti creds is a milestone certainly worth celebrating!
51) Message boards : Technical News : School (Feb 22 2011) (Message 1080516)
Posted 23 Feb 2011 by Profile Andy Lee Robinson
Post:
Try running rsync over ssh instead of NFS?
52) Message boards : Technical News : It's Raining Again (Feb 17 2011) (Message 1080208)
Posted 22 Feb 2011 by Profile Andy Lee Robinson
Post:
the air in the 'closet' should be dry with the AC constantly running, as
long as they keep the door shut. if you duct the hot air from behind the racks
to the AC return duct you will dramatically improve efficiency! it is easier to
remove heat from hot air.


If the room is sealed then humidity won't be a problem, but it would be far better
to unload and bypass the AC and duct the air straight outside, or to wherever it may
be useful elsewhere on campus, for example helping to heat a swimming pool, or going
into a heat sink that can be drawn upon in winter.
A continuous 10 kW+ load, plus the AC costs, is not a small amount of energy to just throw away.
53) Message boards : Technical News : It's Raining Again (Feb 17 2011) (Message 1078715)
Posted 18 Feb 2011 by Profile Andy Lee Robinson
Post:
Air Conditioners cool the air by removing the moisture from it.
Thus, on rainy days there is more in the air (hence A/C cannot remove
moisture fast enough to cool properly).


err.. air conditioners work by removing *heat* from the air. Moist air
has a higher specific heat capacity, thus requires more energy.
Dehumidification is a byproduct as the cooled air can't hold as much
water vapour and it condenses out. This releases latent heat which
also requires energy to remove.

Getting a dehumidifier is almost the same as adding more air conditioning,
unless you use a few buckets of silica gel.

If it's raining outside, the temperature outside is usually lower and
should therefore make the a/c more efficient, as the refrigerant can
give up its heat more effectively as it condenses back to its liquid
phase.
I'm not sure that the difference in efficiency amounts to more than a
couple of percent.
54) Message boards : Technical News : Hills and Valleys (Feb 10 2011) (Message 1076497)
Posted 12 Feb 2011 by Profile Andy Lee Robinson
Post:
Sobering thought... the amount of work you crunched in 1999 can be done in
less than a day now. You spent 300x the electricity it would cost now.

If we wait another ten years, we can do 300x more work for the same cost,
and not waste money on electricity now!

Just that the results (if any) would come in later rather than sooner...
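A back-of-envelope check of that figure, assuming throughput per unit cost doubles roughly every 15 months (an assumption consistent with the ~300x seen over the previous decade):

```shell
# 120 months / 15 months-per-doubling = 8 doublings over ten years.
awk 'BEGIN { printf "%.0f\n", 2 ^ (120 / 15) }'
```

That gives 256x, the same order of magnitude as the 300x above.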
55) Message boards : Technical News : Lost in Boston (Feb 08 2011) (Message 1075592)
Posted 9 Feb 2011 by Profile Andy Lee Robinson
Post:
I've been down this route a couple of times, though with not so many drives.
I copped out and used a pendrive. There was only so much pain I could
take before seeing the light!

/boot is only needed once in a blue moon, and any source (pendrive, USB,
CD-ROM, HD, network) can be used to get the system up.
Having a separate, interchangeable boot source removes the dependency
on the RAID array if a drive fails.
Pendrives just don't break, at least for the purpose of booting, and they
can be mirrored with mdadm too, for the really paranoid!
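A sketch of the mdadm side (device names are assumptions; this only prints the commands, since they need root and real hardware to run):

```shell
# Mirror /boot across two pendrives with RAID1. Dry run: swap 'echo'
# for real execution once the device names match your system.
run() { echo "+ $*"; }
run mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1
run mkfs.ext2 -L boot /dev/md0
run mount /dev/md0 /boot
```

Either pendrive can then die without losing the ability to boot, and replacing one is just an --add away.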
56) Message boards : Number crunching : 4 GTX295 Hydros on a EVGA X58 Classified 4-Way SLI (Message 1075337)
Posted 8 Feb 2011 by Profile Andy Lee Robinson
Post:
Joe, thanks for the clarification.
57) Message boards : Number crunching : 4 GTX295 Hydros on a EVGA X58 Classified 4-Way SLI (Message 1075134)
Posted 7 Feb 2011 by Profile Andy Lee Robinson
Post:
I don't deny that there must be corrugations and obstacles to
generate turbulence to maximise heat transfer, which creates
resistance and pressure differences but the inescapable fact
is that flow rate in == flow rate out!
58) Message boards : Number crunching : 4 GTX295 Hydros on a EVGA X58 Classified 4-Way SLI (Message 1075108)
Posted 7 Feb 2011 by Profile Andy Lee Robinson
Post:
As the water passes through each component, there will be a drop in the
flow rate, naturally, and most of the waterblocks have a recommended flow rate.


I think you meant the heat transfer rate will drop through each unit as the
water warms up at each stage, assuming a serial circuit. The water flow rate
will be the same everywhere, as water is incompressible.

If that is a problem, then a custom solution could use 4 circuits
running through one larger radiator, or even better, an insulated
underground water tank, as that heat could be recovered later.
25 kWh per day is not a small amount of energy just to throw away!
59) Message boards : Number crunching : AVX Extensions - Ongoing development? (Message 1074834)
Posted 6 Feb 2011 by Profile Andy Lee Robinson
Post:
Definitely! I'm getting a new SB soon to replace my aging colo q6600

If running more than 2 SATA devices, keep in mind the likely failure of the 3Gb/s SATA controller. The 6Gb/s controller isn't affected. Boards with the fixed chipset aren't expected to make it to retail probably until late March, early April.


Sure Grant, I will wait until the bug has been ironed out, but I will use 6Gb/s anyway; the server does over a million hits a day, so it has to be as meaty as possible. As it is, the db and web server are separate machines.
60) Message boards : Number crunching : AVX Extensions - Ongoing development? (Message 1074431)
Posted 5 Feb 2011 by Profile Andy Lee Robinson
Post:
Is there any interest in a linux version?


Definitely! I'm getting a new SB soon to replace my aging colo q6600 webserver, which also uses 3 cores for crunching. As the elec is built into the rental, I would give it a couple of gpus too if there were any decent linux gpu apps.

I got a big i7 to replace it last year, but it became too useful at home!


©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.