Panic Mode On (84) Server Problems?

Message boards : Number crunching : Panic Mode On (84) Server Problems?
Tom*

Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1391087 - Posted: 15 Jul 2013, 21:47:31 UTC - in response to Message 1391063.  

Mike,

Juan is a TOTAL PROPONENT OF FREEING CORES TO PROCESS GPU TASKS MORE EFFICIENTLY

Oops, more efficiently. Thanks for the knowledge that angle ranges above 1.0

do less science; I suspected that. But your "Unknown ATI", likely due to BOINC version 6.12, seems to confound BOINC, with an APR of

SETI@home v7 (anonymous platform, ATI GPU)

Average processing rate 2315.981175065

Just how do you get that fantastic APR?

Juan takes just as long to process a VHAR as a normal angle range. Is this an issue between CUDA and OpenCL? If so, great: Jason has something to work on.
My GTX 660 doesn't take quite 800 secs to process a VHAR, but it still takes 650 to 700 secs.

PS - Did your RAC drop more than 30% (new length of tasks) when moving to V7?

I don't remember V6 taking a credit dump on VHARs, but that might be due to
no autocorrelations?


ID: 1391087 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1391095 - Posted: 15 Jul 2013, 22:03:50 UTC

The APR is a server issue.

I didn't lose any of my RAC, but that's simply because I upgraded from an HD 5850 to an HD 7970 just a few days before the V7 release.

Juan's times are not very good, especially on VHARs.
This indicates not enough CPU time available.
At least at WU start sometimes.
It's definitely not the apps.
I check hundreds of hosts a week.
Sciman Steve's host is how it should work with NVIDIA.




With each crime and every kindness we birth our future.
ID: 1391095 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1391108 - Posted: 15 Jul 2013, 23:16:16 UTC
Last modified: 15 Jul 2013, 23:53:41 UTC

Sorry, I can't really understand what you're saying: do you need to free a core to crunch MB? I know that happens with AP, but I have never seen it on MB with Intel CPUs. This particular host has 2x690 and runs 2 WUs at a time on each GPU, and it already reaches 98-99% GPU utilization. Freeing more cores makes no difference: I tried freeing 1, 2 or even 3, and the times are almost the same. I don't run any additional heavy tasks on that host; without BOINC running, the workload of the host never goes above 5% at most. Both GPUs are EVGA Classified and I use no OC. So please, someone tell me what is wrong, since you say the times are not very good. Before V7 this particular host was #5 to #7 in the SETI top computers (running cuda50 x41zc from Jason) with about 100-120K RAC, crunching only MB, and I made no modifications to its configuration; I just took out one 670 to help with the hot air flow.

SETI@home v7 (anonymous platform, nvidia GPU)
Number of tasks completed 1757
Max tasks per day 3288
Number of tasks today 351
Consecutive valid tasks 879
Average processing rate 180.40354863605
Average turnaround time 0.15 days

Are those bad times?

Check these 2 WUs: almost the same AR (0.4...).

WU 1 - http://setiathome.berkeley.edu/result.php?resultid=3076351507
WU 2 - http://setiathome.berkeley.edu/result.php?resultid=3076460895

        Time (s), GPU + CPU     Credit

WU 1    1,278.78 + 179.52        97.61
WU 2      905.52 +  97.06       103.22

WU 1 took more time to process and got less credit than WU 2, with almost the same AR.
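
For anyone who wants to put a number on that, here is a minimal Python sketch that turns the two results above into credit per GPU hour. The figures are copied from the two results above; the function name `credit_per_gpu_hour` is made up for illustration, and ignoring CPU time is an assumption for this comparison, not how the server actually grants credit.

```python
# Figures copied from the two results above: (GPU seconds, CPU seconds, credit).
results = {
    "WU 1": (1278.78, 179.52, 97.61),
    "WU 2": (905.52, 97.06, 103.22),
}


def credit_per_gpu_hour(gpu_seconds, cpu_seconds, credit):
    """Credit granted per hour of GPU run time.

    cpu_seconds is accepted only so the tuples can be unpacked directly;
    it is deliberately ignored in this rough comparison (an assumption).
    """
    return credit / gpu_seconds * 3600.0


rates = {name: credit_per_gpu_hour(*values) for name, values in results.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} credits per GPU hour")

# How much better WU 2 was paid than WU 1 for the same kind of work.
ratio = rates["WU 2"] / rates["WU 1"]
print(f"WU 2 earned {ratio:.2f}x the credit rate of WU 1")
```

On these two results WU 2 comes out at roughly 1.5 times the credit rate of WU 1, despite the nearly identical AR.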

That all really bugs my mind... I need a beer...

BTW, I will PM Steve for some help; I'm sure he will give me a hand...
ID: 1391108 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1391114 - Posted: 15 Jul 2013, 23:45:32 UTC - in response to Message 1390819.  

Well, look at it from their side, why should they do it?

It costs them nothing to do nothing.

Edit: Cost is not only in dollars. There's also time and effort.


Bill, your comment indicates to me that you missed the last sentence: As to cost, I'm quite sure they could get a bunch of 2nd or 3rd year computer science students to do the analysis, write new code and test it for the appropriate credit. I just want to make sure that you do understand what I am saying through this...

ID: 1391114 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1391118 - Posted: 15 Jul 2013, 23:54:05 UTC - in response to Message 1391062.  

Well, look at it from their side, why should they do it?


Because some people (not me, not you) might leave SETI for other projects to crunch and get more credits.


Some people have left, not enough to make a difference
in the amount of science done. You would probably need 75%+
of the crunchers to leave to get something accomplished and that's not
going to happen.

I doubt that 1% even noticed the difference in reduced credits
per work unit crunched. Kind of like telling a politician you're
not going to vote for him and a lobby telling a politician they
have 500,000 votes that aren't going to vote for him. Who do you
think he's going to pay attention to.

<cynic> The guy who just donated $500,000 to his campaign;
that's who.</cynic>


I know, I know, it is not about credits.
But for some it may be.
When I look at the general (BOINC-wide) ranking, I see people with 3e6 RAC who are not in SETI and who in SETI would get much less, but would still do a fair amount of work. If THEY are after credit, they are lost to SETI.
Is this good?

We must not introduce inflation into the credit system, but here we are causing the same deflation effect that is affecting the economy in the real world.
Which is not good all the same.

Happy crunching!

Sleepy
_____________________________________________________
100 g of ham is 100 g of ham and must be paid as such.


Actually I think the whole credit thing is a poor idea.

I think something more tangible would be better, IMO: for example, for every
xyz work units you would get a password to download an astronomical picture,
suitable for desktop wallpaper/screen savers, that wasn't available
to the general Internet.



Bill, in case you haven't seen this, I have copied it in from another thread. Look at the data. The picture is clear. There is an issue with the logic of determining v7 credits.

I’ve had a look at the data around v7 and v6 WUs as well as AP WUs and below is a quick observational analysis of my data and the impact of v7 in relation to granted credits.

Under v6, I was roughly averaging 100 credits per Work Unit (WU).
Under v7, it seems that the average is sitting around 75-80 credits per WU.
In looking at run-times and taking the outliers out, cpu run time was around 600-660 seconds (10-11 minutes) per WU for v6, and appears to be around 800-1100 seconds (13+ to 18+ minutes) for a v7 WU. CPU time seems to have gone up by a factor of 2-3 from 50-60 seconds for v6 to 90-180 seconds for v7.

So doing a quick Back of the Envelope (10.5/15.83=0.66) shows that from a WU processing/throughput capability, I can expect to do roughly 66% of the volume of WUs that I did before (for example, if I was doing 400 WUs per day under v6, I can now expect to do around 264 WUs per day under v7).

Looking at the impact on credit gives 0.663*0.775 = 0.514 or 51.4% (using the unrounded throughput ratio and 77.5 credits per v7 WU against 100 under v6). In essence I can expect that daily credit for v7 will drop to circa 51% of what I was getting under v6.
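
The back-of-the-envelope above can be reproduced with a few lines of Python. This is only a sketch: it assumes the midpoints of the quoted ranges (10.5 and 15.83 minutes per WU, 77.5 credits per v7 WU against 100 under v6), and the variable names are illustrative.

```python
# Midpoints of the ranges quoted above (assumed to be representative).
v6_runtime_min = 10.5      # roughly 600-660 s per WU under v6
v7_runtime_min = 15.83     # roughly 800-1100 s per WU under v7
v6_credit_per_wu = 100.0   # rough v6 average
v7_credit_per_wu = 77.5    # midpoint of the 75-80 seen under v7

# Fraction of the old WU volume that fits into a day under v7.
throughput_ratio = v6_runtime_min / v7_runtime_min

# Fraction of the old credit granted per WU under v7.
credit_ratio = v7_credit_per_wu / v6_credit_per_wu

# Expected daily credit under v7 relative to v6.
daily_credit_ratio = throughput_ratio * credit_ratio

print(f"WU throughput vs v6: {throughput_ratio:.1%}")
print(f"Credit per WU vs v6: {credit_ratio:.1%}")
print(f"Daily credit vs v6:  {daily_credit_ratio:.1%}")

# Example from the text: a host doing 400 WUs per day under v6.
print(f"Expected WUs per day under v7: {400 * throughput_ratio:.0f}")
```

Changing the midpoints to your own host's run times and credits gives the equivalent estimate for that host.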

I am aware of the comments around “that the system needs time to settle down” and that “it thinks all the WUs coming back at the moment are easy, hence the low credit” however, if the system continues to perform as is, then I can expect to see no change from current trajectory.

To test the assumption, I have looked at credit per day pre v7 and post v7 implementation (the data is contained below).

The average daily credit prior to migration was 221,878. Following migration on 1st June, the average daily credit is showing as 102,376, which is circa 46.1% of the previous daily average under v6. Up to and including 22 June, I saw no real change in daily total even though there was comment that the credit system had been tweaked.

As part of my thinking, I decided to “benchmark” v7 credits against Astropulse credits. I started this on 23 June with the migration of a single box. Over the next day or two, I noticed what appeared to be an increase in daily totals, so decided to migrate the other two boxes to AP only to see what the full impact would be.

The assumption that I was running with was that pre v7, v6 and AP should have been fairly well benchmarked against each other and that the granting of credits would be in approximate equilibrium based on a daily basis.

If this was the case, then I would expect to see a rise in daily totals from circa 102k to circa 220k and for the daily total to remain around 220k. From 23 June onwards, totals increased on a daily basis towards the peak of 222k on 29 June. Unfortunately there has been a lack of AP WUs starting around 30 June and so daily totals have declined over the last few days.

My observation in here (albeit based on a small amount of data) is that v6 and AP seemed to be fairly well benchmarked against each other. The issue is with v7. V7 is not well benchmarked against v6 WUs, nor is it well benchmarked against AP WUs. In fact, one can almost double their existing daily run rate through only processing AP WUs.


Daily run rates:

2013.05.16 – 244,130
2013.05.17 – 220,168
2013.05.18 – 231,098
2013.05.19 – 226,353
2013.05.20 – 224,723
2013.05.21 – 210,477
2013.05.22 – 0
2013.05.23 – 431,485
2013.05.24 – 229,312
2013.05.25 – 228,767
2013.05.26 – 239,021
2013.05.27 – 231,271
2013.05.28 – 231,050
2013.05.29 – 0
2013.05.30 – 392,635
2013.05.31 – 209,556

Migration of all boxes to v7 on “day 1”

2013.06.01 – 123,072
2013.06.02 – 94,061
2013.06.03 – 102,333
2013.06.04 – 99,896
2013.06.05 – 65,653
2013.06.06 – 112,209
2013.06.07 – 102,538
2013.06.08 – 110,760
2013.06.09 – 89,757
2013.06.10 – 96,018
2013.06.11 – 111,653
2013.06.12 – 90,091
2013.06.13 – 119,848
2013.06.14 – 99,884
2013.06.15 – 104,561
2013.06.16 – 110,566
2013.06.17 – 110,603
2013.06.18 – 102,856
2013.06.19 – 85,268
2013.06.20 – 140,694
2013.06.21 – 70,247
2013.06.22 – 109,698

Migration of all boxes away from v7 to AP only (over period of ~4 days)

2013.06.23 – 126,873
2013.06.24 – 141,847
2013.06.25 – 169,625
2013.06.26 – 169,517
2013.06.27 – 183,693
2013.06.28 – 206,019
2013.06.29 – 222,126
2013.06.30 – 211,468 (Lack of AP wus from here)
2013.07.01 – 182,394
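
The averages quoted earlier can be checked directly against the daily figures. A minimal Python sketch, assuming the day lists are copied exactly as posted (zero days included) and using made-up variable names:

```python
from statistics import mean

# Daily credit 2013.05.16 to 2013.05.31 (v6), copied from the list above.
v6_daily = [
    244130, 220168, 231098, 226353, 224723, 210477, 0, 431485,
    229312, 228767, 239021, 231271, 231050, 0, 392635, 209556,
]

# Daily credit 2013.06.01 to 2013.06.22 (all boxes on v7).
v7_daily = [
    123072, 94061, 102333, 99896, 65653, 112209, 102538, 110760,
    89757, 96018, 111653, 90091, 119848, 99884, 104561, 110566,
    110603, 102856, 85268, 140694, 70247, 109698,
]

# Daily credit 2013.06.23 to 2013.07.01 (boxes moving to AP only).
ap_daily = [
    126873, 141847, 169625, 169517, 183693, 206019, 222126, 211468, 182394,
]

v6_average = mean(v6_daily)
v7_average = mean(v7_daily)
v7_vs_v6 = v7_average / v6_average
ap_peak = max(ap_daily)

print(f"v6 average daily credit: {v6_average:,.0f}")
print(f"v7 average daily credit: {v7_average:,.0f}")
print(f"v7 as a share of v6:     {v7_vs_v6:.1%}")
print(f"AP-only peak day:        {ap_peak:,}")
```

Run against the figures as posted, this gives the 221,878 and 102,376 averages and the circa 46.1% ratio quoted above.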



ID: 1391118 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1391124 - Posted: 16 Jul 2013, 0:03:15 UTC - in response to Message 1391095.  

Mike,
"Sciman Steve's host is how it should work with nvidia."
is not fair; Steve is running a one-off, heavily OC'd system vs. AFAIK "stock" hardware in a production environment.
ID: 1391124 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1391130 - Posted: 16 Jul 2013, 0:25:31 UTC - in response to Message 1391124.  
Last modified: 16 Jul 2013, 0:49:55 UTC

Mike,
"Sciman Steve's host is how it should work with nvidia."
is not fair; Steve is running a one-off, heavily OC'd system vs. AFAIK "stock" hardware in a production environment.

LOL - That's what I say: I'm a poor, poor BFP cruncher who can only run "stock" hardware... now just #5 in the SETI top participants... with great help from CreditNew, of course!
ID: 1391130 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1391147 - Posted: 16 Jul 2013, 2:18:10 UTC - in response to Message 1391114.  

Understood your point perfectly. What you seem to
be missing is that anybody's work will be put to
whatever use the 'powers that be' consider more productive.

Increasing the RAC for those that are dissatisfied is
so far down on the list that it's beyond comprehension.

Sorry, but there it is. I wish you luck.
ID: 1391147 · Report as offensive
Lionel

Joined: 25 Mar 00
Posts: 680
Credit: 563,640,304
RAC: 597
Australia
Message 1391155 - Posted: 16 Jul 2013, 3:02:19 UTC - in response to Message 1391147.  

Understood your point perfectly. What you seem to
be missing is that anybody's work will be put to
whatever use the 'powers that be' consider more productive.

Increasing the RAC for those that are dissatisfied is
so far down on the list that it's beyond comprehension.

Sorry, but there it is. I wish you luck.



Bill, I am not asking for development of a differential credit system where those that are dissatisfied get more credit than those that are not. This in itself would lead to further dissatisfaction.

I have provided data and commentary that supports the proposition that there is a problem with the credit system. The numbers above are indicative of what many others have also seen.

I am sure that if you looked at daily RAC, say for the top 1,000 users pre and post the introduction of v7, you would see a similar impact. This is not the grumbling of a single user, but the angst that an ill-tested system has created among a reasonable proportion of users.

What I find hard to accept is the lack of acknowledgement, investigation and ownership that Korpela, Anderson and possibly others are showing towards this issue. We are not solving for unknown particles here; all that needs to be done is "root cause analysis", and I am in a way staggered by what appears to be a fingers-in-the-ears, la-la-la approach, hoping that the problem goes away.

If you talk to them as you indicate, then you are in a prime position to effect an investigation and get them to be proactive, albeit somewhat late.







ID: 1391155 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1391169 - Posted: 16 Jul 2013, 3:32:07 UTC - in response to Message 1391155.  

Understood your point perfectly. What you seem to
be missing is that anybody's work will be put to
whatever use the 'powers that be' consider more productive.

Increasing the RAC for those that are dissatisfied is
so far down on the list that it's beyond comprehension.

Sorry, but there it is. I wish you luck.



Bill, I am not asking for development of a differential credit system where those that are dissatisfied get more credit than those that are not. This in itself would lead to further dissatisfaction.

I have provided data and commentary that supports the proposition that there is a problem with the credit system. The numbers above are indicative of what many others have also seen.

I am sure that if you looked at daily RAC, say for the top 1,000 users pre and post the introduction of v7, you would see a similar impact. This is not the grumbling of a single user, but the angst that an ill-tested system has created among a reasonable proportion of users.

What I find hard to accept is the lack of acknowledgement, investigation and ownership that Korpela, Anderson and possibly others are showing towards this issue. We are not solving for unknown particles here; all that needs to be done is "root cause analysis", and I am in a way staggered by what appears to be a fingers-in-the-ears, la-la-la approach, hoping that the problem goes away.

If you talk to them as you indicate, then you are in a prime position to effect an investigation and get them to be proactive, albeit somewhat late.



I agree that your perception is definitely different from the folks
in the lab. I suggest that you try e-mailing Dr. Anderson for his take
on all this. His e-mail address is not that hard to find. I know I found
it with very little looking.

I'm sure a direct dialogue would be much better than relaying
through a third party. That way any misconceptions are minimized
and I have no interest in playing pivot man in any such conversation.

ID: 1391169 · Report as offensive
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1391200 - Posted: 16 Jul 2013, 5:32:38 UTC - in response to Message 1391118.  
Last modified: 16 Jul 2013, 5:37:29 UTC



<snip> below is a quick observational analysis of my data and the impact of v7 in relation to granted credits.



Ah, I see your problem, Lionel!

You think "science" has something to do with observations of outcomes predicted by a theory.

No. No. NO!

Bad, Lionel! Bad, bad, Lionel!

Science is about whatever our equations predict and whatever inadequacy we can find in our observations that dismisses the falsification of our theory.

Nobody is going to listen to you! You are surrounded by Marsupials and we all know that people who live around Marsupials are, uh, upside-down and possibly backward. Living in a less-evolved world as you do, and standing on your head (from a planetary perspective -- well, at least the way globes are usually labeled) has obviously made all of the blood stagnate in your brain.

I'm sorry to have to point out these things, but you obviously have no point.

The credits that you are receiving are exactly the same as they were for v6 of the application. No, really. You think you see something different? That's because you're looking.

Remember that "credits" could be waves or particles. When you aren't looking they are like waves... the waves you get when someone is leaving and receding into the distance. They are just as large as they ever were, but only appear smaller from your perspective, and since there is no preferred perspective, you are wrong. So, if you don't look - RAC goes bye-bye and you don't notice.

But if you DO look, then you will see your RAC as particles, and with your frame of reference, it can appear that your credits are going back in time and choosing to present themselves as being small from the beginning, even though they would have been waving if you hadn't looked. Nobody has ever heard of a large particle. If they were large, they'd be balls, and nobody would call them particles. So, you have to realize that by looking you're making your balls into particles and they appear small.

This is known as the duelality of credits.

Juan thinks the whole thing is like a random number generator, but he only thinks that because of the Uncertainty Principle. You see, Juan can either have a high degree of confidence in the speed with which his RAC declines, OR he can have an idea of what his RAC is, but he cannot know both with any degree of certainty and then only from his perspective.

So, at the time he looks at his credit he "fixes" that credit in time. The other person who owns the rig against which his work unit validated may or may not have looked at THEIR work unit.

This is a double-blind, single-blind, dual double-slit experiment where the person who is blind might get the same credit for more or less work depending on whether Juan looks at his result while sitting on his up-quark. If Juan's is up, then we know the other validator's credit must be down. That's why when you are sitting in a blind you might hear something go quark-quark before you look.

I plan on taking two identical laptops and copying a work unit onto each. I will then climb a water tower and put one on top and let the other crunch at ground-level. The one near the top should crunch the work unit in less time compared to the one on the ground. If it doesn't, I will throw it off of the top of the water tower and count until it hits the ground, which will tell me how fast my RAC fell.

And no matter what I observe, someone will tell me that I didn't.

The real problem with the RAC we're playing with is that it is too small.

Why do I say that?

Because if I have a RAC of 125,000 and I add a GTX 670 to it, I *expect* to do more to my RAC than to go to 145,000. My RAC feels like it got bigger by too little. If I have four GPUs and my RAC is 60,000 and I go buy another $400 GPU and my RAC only increases to 75,000 I haven't had a "psychologically rewarding experience." If it went from 600,000 to 750,000, well, that's a whole 150,000!!! Look at my RAC climb!!

AND really low RAC makes it hard as hell to see a small improvement. You have a 10,000 RAC. You make a 3% improvement, you have a RAC of 10,300. BUT, if you have a RAC of 1,000,000 a 3% improvement is 30,000!!! Hey, HEY!! We all know it's easier to see 30,000 than 300 of something.

Nobody cares. It doesn't change the "SCIENCE!" of shoveling sand into a bucket and sifting it and taking the bucket of non-sand to the scientist so he can see if the stuff in the bottom that isn't sand might be something interesting. And he hasn't said a word about whether we've brought him anything interesting or not in a long time.

We're arguing about whether our bucket holds 5 or 40. 5 or 40 "what" doesn't matter.

BUT, if we are being *paid* by the "x", then whether our bucket holds 5 of them or 40 of them does matter.

All anyone around here wants is a consistent "x" and I'm tellin' you (preaching to the choir), as long as someone *believes* their formula is correct, all of the observation in the world isn't going to shake that belief.

I also notice that people who have a small RAC are the most likely to say size doesn't matter.

{yes, I am very tired and a little loopy}
ID: 1391200 · Report as offensive
Profile James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1391216 - Posted: 16 Jul 2013, 6:51:13 UTC

I see that it's the ones with the bigger RAC that seem to cry the most. O woe is me, I went from #23 to #24.

Give it a rest; Dr. A couldn't care less what anyone's RAC is. What with all the tear-jerking calls for a better way to calculate RAC, I wish he'd say screw it and go back to the way it was. Crunch one, you get a 1. That would be fair, right?

Will we lose folks who crunch? Maybe. But we get them all the time for various reasons. I see you are all still here.

Old James
ID: 1391216 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1391217 - Posted: 16 Jul 2013, 6:53:10 UTC - in response to Message 1391216.  

I wish he'd say screw it and go back to the way it was. Crunch one, you get a 1. That would be fair, right?

Nope.
Hence the Cobblestone.
Just a shame credit new borked it.

Grant
Darwin NT
ID: 1391217 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1391223 - Posted: 16 Jul 2013, 7:36:51 UTC

Juan.

First of all, no offense meant.

Looking at this unit for example: http://setiathome.berkeley.edu/result.php?resultid=3076512395

This is a VHAR unit and it took 200 seconds longer than other VHARs on the same host.
It indicates the CPU did not have enough free cycles at startup.
4 690s are physically 8 680s, so in theory you'd need 8 free cores to feed the GPUs properly.
At least 4 cores.

As far as I can see, your host with 3 670s has only 5K less RAC, which shows that one is perfect.
One 690 should produce more than 5K.

All I want to say is that this has nothing to do with CreditNew.
Every host with 4+ cards has the same issue.
And I'm not talking about overclocked hardware.




With each crime and every kindness we birth our future.
ID: 1391223 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1391227 - Posted: 16 Jul 2013, 7:51:23 UTC - in response to Message 1391223.  
Last modified: 16 Jul 2013, 8:03:38 UTC

First of all no offense meant.

That I clearly understand, and it is the same from my side.

My question was only meant to learn how to optimize the host. Really, the numbers bug my mind: even with the slowness problem (I will run some tests this week to see what I can do), the difference in credit on the already processed WUs (on the same host) seems totally weird. More processing time getting less credit is what I can't understand.

The 3x670 host is actually a 690+670 host. A 670 is faster than 1/2 of a 690, so the RAC differences are not so big. The APR on this host is 193.14636426995, about 7% faster than the 2x690 host.

I will change the configuration on the 2x690 host later (the host is at a remote location) and free 4 cores to feed the GPUs; let's see what I get. But that raises the question: why did it work fine before with V6 and not with V7?
ID: 1391227 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1391232 - Posted: 16 Jul 2013, 8:28:35 UTC - in response to Message 1391227.  

First of all no offense meant.

That I clearly understand, and it is the same from my side.

My question was only meant to learn how to optimize the host. Really, the numbers bug my mind: even with the slowness problem (I will run some tests this week to see what I can do), the difference in credit on the already processed WUs (on the same host) seems totally weird. More processing time getting less credit is what I can't understand.

The 3x670 host is actually a 690+670 host. A 670 is faster than 1/2 of a 690, so the RAC differences are not so big. The APR on this host is 193.14636426995, about 7% faster than the 2x690 host.

I will change the configuration on the 2x690 host later (the host is at a remote location) and free 4 cores to feed the GPUs; let's see what I get. But that raises the question: why did it work fine before with V6 and not with V7?


It's not a does-work or does-not-work situation.
It's an optimizing thing.
It works, but not perfectly.
V7 uses more CPU cycles than V6 does.
So if 4 GPU instances start at the same time, there is simply a shortfall in feeding the GPUs.
But this is a long-term thing.
I also freed one more CPU core after the V7 release, and as you can see I don't suffer at all.
You need patience; try for weeks, not days.



With each crime and every kindness we birth our future.
ID: 1391232 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13731
Credit: 208,696,464
RAC: 304
Australia
Message 1391242 - Posted: 16 Jul 2013, 9:43:37 UTC - in response to Message 1391232.  

You need patience and try for weeks not days.

Almost (but not quite) 2 months now.
RAC is still falling.
Grant
Darwin NT
ID: 1391242 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1391280 - Posted: 16 Jul 2013, 13:10:48 UTC - in response to Message 1391232.  
Last modified: 16 Jul 2013, 13:11:26 UTC

So if 4 GPU instances start at the same time there is simply a lack in feeding the GPU`s.

That is very unlikely. This host runs 24/7, so if that is happening it is surely very rare, and it must be a hell of a coincidence that we looked at exactly that WU.

I also freed one CPU core more since V7 release and as you can see i dont suffer anything.

I can understand that, but you don't have 2 very fast GPUs on the same host, so it is easy not to suffer too much. The faster your GPUs, the bigger the fall; that is what I see in the field, just look at the posts. Those who run mostly CPU work suffer a lot less than those of us who depend on GPU work.

You need patience and try for weeks not days.

I know that, and totally agree: to run SETI, patience is always needed. Too many problems and too little support/feedback from the lab; the only thing that really helps is the volunteers like you who try to help the community.

Thanks for your help, let's see what I get.
ID: 1391280 · Report as offensive
Profile Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34257
Credit: 79,922,639
RAC: 80
Germany
Message 1391288 - Posted: 16 Jul 2013, 13:24:24 UTC


Like you said, I only have 1 GPU, and it happens at least 4 times a day that 3 instances start at the same time.
I only noticed that because of those occurrences.





With each crime and every kindness we birth our future.
ID: 1391288 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1391300 - Posted: 16 Jul 2013, 13:57:40 UTC - in response to Message 1391288.  


Like you said, I only have 1 GPU, and it happens at least 4 times a day that 3 instances start at the same time.
I only noticed that because of those occurrences.

Except when I reboot the host (something very rare, normally just when Windows or NVIDIA pushes some update), I don't really see more than 2 WUs starting at any given time, and it normally takes a few seconds to load the data from the HD to the GPU to crunch. That could be because I use an Intel CPU, different from the AMD you use, and my CPU load for other tasks (besides the BOINC tasks themselves) is very, very low (normally less than 2%, 5% at most). I would say this is almost a crunching-only host.

But I will try to free 2 more cores (2 cores are already free) as soon as I arrive at the remote location. I don't like to make any changes to the configuration remotely; you know sh*** happens... and our friend Murphy is always around.



ID: 1391300 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.