Panic Mode On (78) Server Problems?

Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1304039 - Posted: 9 Nov 2012, 16:24:31 UTC

Missed the edit window, but the more I think about it, the gpu limit may be total gpu rather than per gpu. Was a quick and dirty implementation. Can anyone confirm?
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1304039 · Report as offensive
Cherokee150

Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1304044 - Posted: 9 Nov 2012, 16:34:36 UTC - in response to Message 1304039.  

While I cannot confirm it, I will say that, based on the way most SETI logic has been written, the limits are most likely 100 CPU units per host and 100 GPU units per host. I know this is not what any of us want to hear, but I suspect it is the case. :(
ID: 1304044 · Report as offensive
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1304045 - Posted: 9 Nov 2012, 16:35:12 UTC - in response to Message 1304039.  

Missed the edit window, but the more I think about it, the gpu limit may be total gpu rather than per gpu. Was a quick and dirty implementation. Can anyone confirm?

I can't...
The kitties are still burning off cache. And I don't use any 3rd party software to monitor the 9 rigs, so it's hard for me to know exactly how what I actually have on hand compares to what the servers think I have on hand.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1304045 · Report as offensive
Profile zoom3+1=4
Volunteer tester
Joined: 30 Nov 03
Posts: 65709
Credit: 55,293,173
RAC: 49
United States
Message 1304048 - Posted: 9 Nov 2012, 16:40:33 UTC - in response to Message 1303752.  

Tried that route as well. It works (SOMETIMES)

I surely hope this problem gets ironed out.

Me too, maybe an Acme anvil dropped on it would flatten the problem. ;)

Just kidding of course.
The T1 Trust, PRR T1 Class 4-4-4-4 #5550, 1 of America's First HST's
ID: 1304048 · Report as offensive
Profile Tron

Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1304113 - Posted: 9 Nov 2012, 18:30:43 UTC

Methinks an anvil might not be adequate:
ID: 1304113 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1304145 - Posted: 9 Nov 2012, 19:15:44 UTC

APs are NOT the whole problem. One problem is the crazy long back-off that BOINC throws up. Any reasonable model of a try/re-try/back-off delivery system shows that the very worst thing you can do is back off for a long time - all it does is make things worse, far worse. The best solution is actually to set the re-try/back-off time to around 50-75% of the server swap time (remember the download servers swap over every five minutes or so). This time must be properly random, and not the "high end biased" random that is currently in use.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1304145 · Report as offensive
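A minimal sketch of the retry timing rob smith describes above, assuming a five-minute server swap interval and a plain uniform distribution as the "properly random" choice. The names `SERVER_SWAP_SECONDS` and `retry_delay` are hypothetical and are not taken from the BOINC client; this is an illustration of the suggestion, not how BOINC actually computes its back-off.

```python
import random

# Assumption from the post: download servers swap "every five minutes or so".
SERVER_SWAP_SECONDS = 5 * 60


def retry_delay(swap_seconds=SERVER_SWAP_SECONDS, rng=random):
    """Return a retry delay drawn uniformly from 50-75% of the swap interval."""
    low = 0.50 * swap_seconds
    high = 0.75 * swap_seconds
    # Uniform sampling avoids any bias toward the high end of the range.
    return rng.uniform(low, high)


if __name__ == "__main__":
    rng = random.Random()
    print([round(retry_delay(rng=rng)) for _ in range(5)])
```

With a 300-second swap interval, every delay falls between 150 and 225 seconds, so a client that failed against one server would usually retry before or shortly after the next swap rather than waiting hours.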
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1304161 - Posted: 9 Nov 2012, 19:49:41 UTC - in response to Message 1304039.  

Missed the edit window, but the more I think about it, the gpu limit may be total gpu rather than per gpu. Was a quick and dirty implementation. Can anyone confirm?

Looks like that is what it is.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1304161 · Report as offensive
MikeN

Joined: 24 Jan 11
Posts: 319
Credit: 64,719,409
RAC: 85
United Kingdom
Message 1304172 - Posted: 9 Nov 2012, 20:30:36 UTC

Apologies if I have missed something, been busy this week, but is there not a major logic problem here? I have just run Boinc Rescheduler on my No2 cruncher and it tells me I have 324 tasks in progress whilst my SETI account page for this PC shows 426 in progress, so presumably 102 ghosts. Cruncher No2 cannot currently get any more tasks off SETI as it is over the newly imposed limit, so it cannot download these ghosts and get them off the system. For this cruncher the situation will presumably eventually resolve itself as it has a GPU and so should be allowed a total of 200 WUs. So when the SETI total in progress drops below 200 I assume the ghosts will be resent.

However, by analogy, my No1 cruncher with a GTX460 graphics card probably has around 1000 ghosts out of 4000 WU in total at present. It will never be able to get down to 200 WU as it has more than that many ghosts and SETI will not give it the ghosts until it has fewer than 200 in total! Eventually, therefore, it will end up doing nothing (or more likely crunching Einstein) and the 1000 ghosts will stay in the SETI system for ever.
ID: 1304172 · Report as offensive
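MikeN's arithmetic above can be written out as a short sketch. It rests on his stated assumption that ghosts are only resent once the server-side total drops below the per-host limit, and on a limit of 200 (100 CPU plus 100 GPU) as guessed elsewhere in this thread. A later post reports ghosts slowly downloading while at the limit, so the assumption may not hold in practice. The function names are hypothetical.

```python
# Assumed per-host limit: 100 CPU + 100 GPU tasks, as discussed in the thread.
HOST_LIMIT = 200


def ghost_count(server_in_progress, local_in_progress):
    """Tasks the server lists as in progress that the client does not hold."""
    return server_in_progress - local_in_progress


def can_ever_fetch(server_in_progress, local_in_progress, limit=HOST_LIMIT):
    """True if the server-side total can eventually fall below the limit.

    Local tasks get crunched and reported, but ghosts cannot be, so under
    the stated assumption the server-side total never drops below the
    ghost count.
    """
    floor = ghost_count(server_in_progress, local_in_progress)
    return floor < limit


if __name__ == "__main__":
    print(ghost_count(426, 324))        # cruncher No2: 102 ghosts
    print(can_ever_fetch(426, 324))     # True: 102 is below 200
    print(can_ever_fetch(4000, 3000))   # False: about 1000 ghosts exceed 200
```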
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1304176 - Posted: 9 Nov 2012, 20:36:09 UTC - in response to Message 1304039.  

Missed the edit window, but the more I think about it, the gpu limit may be total gpu rather than per gpu. Was a quick and dirty implementation. Can anyone confirm?

Can't talk about GPU, but the CPU limit is 100. Running a new* installation of BOINC on an AMD 6-core processor with 100 tasks.

*A new install of BOINC 7.0.31 on a machine that previously used the 7.0.28 client.

And now... out of 100, 99 tasks are ghosts.
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1304176 · Report as offensive
Profile Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1304192 - Posted: 9 Nov 2012, 21:13:03 UTC

With so many of our loyal SETI community upset and frustrated over the current situation, perhaps it would be a good time for one of the SETI staff to take a few minutes to let us know what their ideas are about the problem, why they have limited us so severely, and if they have been able to determine a way to fix things and release the restrictions.

+1. I'm sure they are working on it, but...


Too late for me; I have given up waiting for any meaningful communication from the project. I can no longer justify running 4 crunchers without really having any idea what is going on.

Whatever the reasons, it is not good enough.
ID: 1304192 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1304195 - Posted: 9 Nov 2012, 21:23:48 UTC
Last modified: 9 Nov 2012, 21:31:51 UTC

Because of the inability to cure the disease, they are choosing to kill the patient.

This is very, very sad...
ID: 1304195 · Report as offensive
bill

Joined: 16 Jun 99
Posts: 861
Credit: 29,352,955
RAC: 0
United States
Message 1304199 - Posted: 9 Nov 2012, 21:37:56 UTC

I figure, in about 4 hours, our little yellow
feathered friend will show up to join the party.
ID: 1304199 · Report as offensive
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1304201 - Posted: 9 Nov 2012, 21:42:29 UTC - in response to Message 1303999.  

Questions:

1. Would it be possible for Eric (or someone) to flush all the ghosts out of the scheduler? (I.e., somehow identify them as ghosts and deassign them so the scheduler isn't trying to keep track of so many.)

2. If it's possible, would it help the situation any to do it?

In theory that would be possible if the download server logs are detailed enough. The general idea would be that for a task marked sent at a particular time, there ought to be a corresponding download of the WU. The download server would only know the IP address of the system which asked for the file, hopefully that would most times match the address in the host record. If not, perhaps simply checking that there were as many completed downloads as tasks assigned for a WU would be an adequate fallback.

Whether it's practical is another matter. I guess it would take at least a day of programming effort trying to foresee all the possible problems, and another day testing with copies of the download server logs and data extracted from the BOINC database to see if the list of probable ghosts produced makes sense.

I have too foggy a view of what the problem really is to try to guess whether the effort would help. Eric and Jeff of course have much better knowledge of what's going on with the servers.
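The matching idea Joe outlines above could be sketched roughly as follows. The record layouts (`sent_tasks` as `(task_id, wu_name, host_ip)` tuples, `downloads` as `(wu_name, client_ip)` tuples) and the function name `probable_ghosts` are assumptions made for illustration; the real download server logs and BOINC database schema are not shown in this thread and will differ.

```python
from collections import Counter


def probable_ghosts(sent_tasks, downloads):
    """Return task ids that have no plausible matching download.

    sent_tasks: iterable of (task_id, wu_name, host_ip) marked as sent.
    downloads:  iterable of (wu_name, client_ip) for completed downloads.
    """
    downloads = list(downloads)
    exact = Counter(downloads)                        # (wu, ip) -> downloads left
    per_wu = Counter(wu for wu, _ip in downloads)     # wu -> downloads left

    # Pass 1: match on workunit and IP address, the preferred check.
    unmatched = []
    for task_id, wu, ip in sent_tasks:
        if exact[(wu, ip)] > 0:
            exact[(wu, ip)] -= 1
            per_wu[wu] -= 1
        else:
            unmatched.append((task_id, wu))

    # Pass 2 (fallback): accept any leftover download of the same workunit,
    # since the logged IP may not match the address in the host record.
    ghosts = []
    for task_id, wu in unmatched:
        if per_wu[wu] > 0:
            per_wu[wu] -= 1
        else:
            ghosts.append(task_id)
    return ghosts


if __name__ == "__main__":
    sent = [(1, "wuA", "10.0.0.1"), (2, "wuA", "10.0.0.2"), (3, "wuB", "10.0.0.1")]
    logs = [("wuA", "10.0.0.1"), ("wuA", "10.9.9.9")]
    print(probable_ghosts(sent, logs))  # task 3 has no download at all
```

The fallback pass is deliberately lenient: it only flags a task when no download of that workunit remains unaccounted for, which errs on the side of not deassigning work a host may really have.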

------------------------

The most puzzling aspect I see is that if the lost work checking is performed for every work request, it ought to be impossible for one host to have more than ~200 ghosts. That is, as soon as there are any ghosts no new tasks should be assigned until those ghosts are turned into real live tasks the host can report as other_results in its scheduler requests. Richard Haselgrove's post to boinc_dev last Sunday made it clear enough, but perhaps Dr. Anderson failed to look into it once Eric had implemented damage containment changes.
Joe
ID: 1304201 · Report as offensive
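Joe's point about lost work checking can be illustrated with a small sketch: if every work request first compares the server's in-progress list with the tasks the host reports as `other_results`, ghosts are resent before anything new is assigned, so they cannot pile up beyond what one request hands out. The function `handle_work_request` and its data shapes are hypothetical; this is not the BOINC scheduler code.

```python
def handle_work_request(server_in_progress, other_results, limit=200):
    """Decide what to send for one scheduler request.

    server_in_progress: task names the server believes the host holds.
    other_results:      task names the host reports actually holding.
    Returns (tasks_to_resend, number_of_new_tasks_allowed).
    """
    ghosts = set(server_in_progress) - set(other_results)
    if ghosts:
        # Resend lost tasks first and assign nothing new until they are real.
        return sorted(ghosts), 0
    # No ghosts: top the host up to the (assumed) per-host limit.
    return [], max(0, limit - len(set(server_in_progress)))


if __name__ == "__main__":
    print(handle_work_request({"a", "b", "c"}, {"a"}))  # (['b', 'c'], 0)
    print(handle_work_request({"a"}, {"a"}))            # ([], 199)
```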
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1304203 - Posted: 9 Nov 2012, 21:46:31 UTC - in response to Message 1304161.  

Missed the edit window, but the more I think about it, the gpu limit may be total gpu rather than per gpu. Was a quick and dirty implementation. Can anyone confirm?

Looks like that is what it is.

A shame if it is the case.

Would be better if they could have just overridden people's cache settings. Set it to 2 days for now, that will get us through most outages.
Although given the difference in work fetch between v6.x & v7.x clients that probably wouldn't work too well either.

The present settings will mean I'll run out of GPU work during the weekly outages. Probably CPU work as well on my i7 if they're mostly shorties.
Grant
Darwin NT
ID: 1304203 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1304236 - Posted: 9 Nov 2012, 23:01:07 UTC - in response to Message 1304201.  

The most puzzling aspect I see is that if the lost work checking is performed for every work request, it ought to be impossible for one host to have more than ~200 ghosts. That is, as soon as there are any ghosts no new tasks should be assigned until those ghosts are turned into real live tasks the host can report as other_results in its scheduler requests. Richard Haselgrove's post to boinc_dev last Sunday made it clear enough, but perhaps Dr. Anderson failed to look into it once Eric had implemented damage containment changes.
Joe

Joe, you have some extra background by email.
ID: 1304236 · Report as offensive
Profile Brother Frank

Joined: 10 Dec 11
Posts: 26
Credit: 15,142,410
RAC: 0
United States
Message 1304254 - Posted: 10 Nov 2012, 1:30:30 UTC

What SETI Really Needs And Our Current Data Problems $350 Gift On The Way

By Frank Elliott, Brother Frank on Seti@Home

It's very discouraging to be purchasing computers especially designed or manufactured to give my wife and me a high level of Seti@Home signal crunching and read very little about any efforts being made to solve problems undercutting our ability to do work for the project. However, I think the problem is not likely that of the dedicated staff and scientists at Seti not wanting to communicate with us. I think instead it is a funding problem at the personnel level and at the equipment and communication level too. I think too that many of us may want to and be able to step in for funding help. The difficulty I see is that I really don't know what would be needed to get the project running well in terms of handling the huge amount of data that needs to be analyzed. We probably need an interim funding goal and a continued yearly funding goal. I don't know either if there are gifted people willing to step in, but who absolutely must have it be a paid job so that they can support themselves and their families. There is a severe loss in productivity, and high turnover, if the project relies on too many part-time and volunteer positions at headquarters. Continuity and staff knowledge and skills are so important in the medium and long term.

I am able to do the following. I have about 200 dollars in one of my bonus accounts for a credit card and can use it to pay the bill for that account. I'll use it this November and send the roughly 200 I didn't have to pay to Seti. I also have another 150 bucks from a bank bonus account for reaching some goals. I will send that to Seti too. I hope some of you or maybe many of you who can will send money you have in bonus or credit giving card accounts to Seti. So, the next business day, next Monday, I will send $350 to Seti. I am wanting Seti to dedicate $175 of that money toward staff and $175 toward equipment if they can. However, this first year, you may use this money in a general account toward any of your needs as you see fit.

When my wife and I lived in Toledo almost thirty years ago, we belonged to an Astronomy Society. I believe there were about 100 of us. We had a fund drive to build a rather large 25-inch reflecting telescope, the mounting, and an observatory. You wouldn't believe how much money we raised in that small group. Owens Corning donated the 25-inch mirror blank to us and we ground it ourselves. Other people stepped in with a land donation. We all worked helping to build the telescope and the observatory and then mount the big rotating dome on top of the building.

I hope that Seti can raise several million dollars a year. We should be able to with the dedicated volunteers we have. We will need a much larger fund raising group and many contacts with local and national media to spread the word that we are serious about finding a way to continue funding the project. Then, other large donors like Paul Allen and many medium and small ones will step in too. We need the help of a dedicated group of small donors year after year with sustaining pledges like Public Television does, with one-at-a-time pledges, and with large donations that do special things. I have been taught in fundraising that most large donors don't want to supply ongoing funding to a project and that it is very important for a project to have a self-sustaining base. Then that attracts donors who want to help with the extras that are very important but not quite essential, and with start-up funds like Paul Allen provided for the Allen Telescope Array. So $350 is my immediate pledge.

I hope many of you out there will donate as much as you comfortably can to Seti's fundraising for this year and on a regular basis if you can. I will make further donations as I am able, but the majority of that is contingent on Seti really deciding what it wants to do on a yearly basis and what that would cost.

I want that to be realistic. Not a low ball figure that counts on all kinds of good possibilities happening. Not a high ball figure that goes far beyond making the staff adequate and solving some very severe problems we are having now like getting data into and out of the volunteers. I have watched as things have gone downhill with data delivery and data input problems since about last February. It is now about as bad as it can get for very high level crunchers, perhaps in the top 200. For me at 313 now in the top 1,000 participants (about 34,300 RAC) it is a big issue running around to my computers and turning no new tasks on and off and then remembering to turn new tasks on so I can get some data to crunch.

Please, please SETI. Let us know what you need and make a reasonable budget. I know many, many of us are willing and able to help. One time small donations of $10 or more are also going to help, but just think what having 10,000 donors who are willing to send $200 to $500 per year and above to the project would do. The project would be on firm footing. What is $350 average times 10,000? It is 3.5 million dollars. My wife and I have donated a few hundred to $1,000 per year to many projects over the years. I don't understand why Seti could not be one of those for us and perhaps many of us who are committed to astronomy and space exploration, and the search for intelligent civilizations in our galaxy and its star systems that we can target as likely locations for life. That would work for the project, wouldn't it? I am thinking it would make people like Paul Allen very happy too. My first donation of $350 will be sent to you online or via the web on Monday.
Frank Elliott,Member of Carepages.com,a chronic illness support site. Was FrankLivingFully there.Free user name & pw needed. My Google+ Profile is:
https://profiles.google.com/u/0/10871372137584 Science,SF,Space,Astronomy,Medicine,Psyc Topics.
ID: 1304254 · Report as offensive
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1304261 - Posted: 10 Nov 2012, 2:10:43 UTC - in response to Message 1304203.  

Missed the edit window, but the more I think about it, the gpu limit may be total gpu rather than per gpu. Was a quick and dirty implementation. Can anyone confirm?

Looks like that is what it is.

A shame if it is the case.

Would be better if they could have just overridden people's cache settings. Set it to 2 days for now, that will get us through most outages.
Although given the difference in work fetch between v6.x & v7.x clients that probably wouldn't work too well either.

The present settings will mean I'll run out of GPU work during the weekly outages. Probably CPU work as well on my i7 if they're mostly shorties.

I am certain that is how it is; in fact it counts WUs in progress as well as WUs waiting to run. Between my cache and running jobs, the downloads bring the total up to 200. Down to 3100 ghosts which are slowly downloading.

I would normally never have that many WUs on my computer....it is a shame that the programming kept trying to send me WUs even when there were ghosts waiting to download.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1304261 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11359
Credit: 29,581,041
RAC: 66
United States
Message 1304264 - Posted: 10 Nov 2012, 2:31:40 UTC - in response to Message 1304254.  

Your Karma is good.
ID: 1304264 · Report as offensive
Profile Team BDM

Joined: 11 Jul 99
Posts: 9
Credit: 11,331,106
RAC: 8
Bahrain
Message 1304265 - Posted: 10 Nov 2012, 2:46:21 UTC - in response to Message 1303875.  

Mine are being delivered, but never show up in the listing once uploaded. This has been happening for some time now. For me it's not about the numbers or credit, it is about whether the CPU time is serving a bigger picture. Not sure if there is an issue that needs to be looked at if results are being dropped and not entered into the database.
ID: 1304265 · Report as offensive
Profile Fred E.
Volunteer tester

Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1304335 - Posted: 10 Nov 2012, 8:04:10 UTC

Mine are being delivered, but never show up in the listing once uploaded. This has been happening for some time now. For me it's not about the numbers or credit, it is about whether the CPU time is serving a bigger picture. Not sure if there is an issue that needs to be looked at if results are being dropped and not entered into the database.


Interesting - the validated tasks purge in about 24 hours now, so it may be the timing of when you look after reporting. Check your account at BOINCStats (use link on your account page). Your account there has a tab to show credit granted in the last 40 days. If you're getting no credit, open a thread for troubleshooting help. Haven't seen any other reports of a problem with reported tasks not showing up on the task page.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1304335 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.