Continued server problems.


Message boards : News : Continued server problems.

Eric Korpela
Project donor
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 3 Apr 99
Posts: 1085
Credit: 8,597,759
RAC: 8,880
United States
Message 1307567 - Posted: 19 Nov 2012, 1:02:54 UTC

We're continuing to have issues due to a database problem early last week and a botched attempt to fix it.

The problem is that the result and host tables in the database have grown large enough, and hosts have gotten fast enough, that the lookup of results in process for a host and the enumeration of new results to send don't finish before the web connection times out on either the server or the client side. This resulted in hosts being assigned large numbers of results to compute without the transaction that tells them about these results being completed. The host, thinking it had received no results, would then contact the server for more results, which it would again not receive.

This isn't a hardware problem. The database currently fits in memory and the processors are fast. We've just crossed a threshold where each host computes fast enough that host queues and the result table have become large enough to cause this problem. To solve it, we've put per-host limits on results in process back in place. But hosts that are having this problem will probably continue to have it until the average number of results per host has fallen to a workable level. That could take weeks.
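
To illustrate the failure loop and the effect of a per-host limit, here is a minimal sketch. It is not the actual scheduler code: the `Host` class, the linear cost model, and every number in it are hypothetical, chosen only to show how a timed-out reply leaves results assigned on the server that the host never learns about.

```python
from dataclasses import dataclass


@dataclass
class Host:
    in_process: int  # results the server believes this host holds
    received: int    # results the host actually knows about


def scheduler_request(host, batch, timeout_s, cost_per_row_s, per_host_limit=None):
    """Simulate one scheduler request; return True if the reply reached the host."""
    if per_host_limit is not None:
        # With a cap, never assign past the limit (possibly assign nothing).
        batch = max(0, min(batch, per_host_limit - host.in_process))
    # Hypothetical cost model: lookup time grows with the rows held for this host.
    elapsed = (host.in_process + batch) * cost_per_row_s
    host.in_process += batch  # the server records the assignment
    if elapsed > timeout_s:
        return False          # connection timed out; the host never hears about it
    host.received += batch
    return True


def simulate(rounds, per_host_limit=None):
    """Run several requests from one host that already has a large queue."""
    host = Host(in_process=900, received=900)  # hypothetical starting queue
    for _ in range(rounds):
        scheduler_request(host, batch=100, timeout_s=10.0,
                          cost_per_row_s=0.011, per_host_limit=per_host_limit)
    return host


if __name__ == "__main__":
    for limit in (None, 100):
        h = simulate(5, per_host_limit=limit)
        print(limit, h.in_process, h.received, h.in_process - h.received)
```

Under these assumed numbers, the uncapped host gains 100 unacknowledged results on every request while its own count never moves. The capped host is assigned nothing new, so its row count stops growing, but it also stays above the limit until existing results are returned or expire, which matches the slow recovery described above.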

For a more permanent fix, we plan to do more work in each result by quadrupling the size of the workunits. But that fix will probably take months to implement and test.
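
As a rough back-of-the-envelope sketch of why larger workunits help: if each result takes four times as long to compute, a host needs about a quarter as many results to fill the same queue. The function name, queue length, and run times below are hypothetical, not measured project figures.

```python
def results_needed(queue_days, seconds_per_result):
    """Results a host must hold to cover `queue_days` of continuous work."""
    return int(queue_days * 86400 // seconds_per_result)


# Hypothetical host: 5-day queue, 10 minutes per result today.
current = results_needed(queue_days=5, seconds_per_result=600)
# Same host with workunits that take four times as long.
quadrupled = results_needed(queue_days=5, seconds_per_result=2400)

print(current, quadrupled)  # 720 180
```

Fewer rows per host means fewer rows in the result table overall, so the per-request lookups would have less to scan.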
____________

WinterKnight
Volunteer tester
Joined: 18 May 99
Posts: 8630
Credit: 23,707,991
RAC: 18,932
United Kingdom
Message 1307592 - Posted: 19 Nov 2012, 2:07:18 UTC - in response to Message 1307567.

Thanks for the news, we will struggle through.

As this problem only seems to occur when AP tasks are available, might it be possible to balance the rates of splitting? At the moment the AP tasks are being split much faster than the normal MB tasks.

[seti.international] Dirk Sadowski
Project donor
Volunteer tester
Joined: 6 Apr 07
Posts: 7061
Credit: 59,984,231
RAC: 21,344
Germany
Message 1307609 - Posted: 19 Nov 2012, 3:52:15 UTC - in response to Message 1307567.

Eric, thanks for the news!


By the way ..
In the *panic thread* in the NC subforum, some members report that they have *better* server contact via PROXY usage.
Maybe a S@h router needs a reboot?


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________
BR



>Das Deutsche Cafe. The German Cafe.<

tbret
Project donor
Volunteer tester
Joined: 28 May 99
Posts: 2720
Credit: 208,378,559
RAC: 519,999
United States
Message 1307610 - Posted: 19 Nov 2012, 3:57:57 UTC

Thanks Eric.

Knowing that you are looking into it, on a Sunday night no less, is gratifying.

KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1922
Credit: 9,734,168
RAC: 16,037
United States
Message 1307619 - Posted: 19 Nov 2012, 5:16:11 UTC

Does that explain why the Schedulers are currently (2115, 11/18/12 Berkeley time) down?
____________
.

Astro-AL
Joined: 31 Mar 00
Posts: 17
Credit: 50,079,244
RAC: 65,455
United States
Message 1307622 - Posted: 19 Nov 2012, 5:51:24 UTC

I have 3 machines with more than 600 completed WUs that I can't report. Does this latest problem mean that these finished WUs will be considered expired, since the server will not accept them by the expected due date? Should I just turn off my machines that have completed WUs and wait for news as to when they will be accepted? I am wasting a lot of CPU power and electricity trying to get more work and report finished WUs. I have been solely interested in ET research since 1999 and do not wish to do other Boinc projects.

I wish there were more explanations like this latest post. If there were, it would take a lot of the frustration out, and crunchers would stop B*tch**g so much.

I rarely have anything to say but I read the blogs. I would like to have a more firm date as to when the project is expected to run as efficiently as it did before Boinc software.
____________

Rae of Quakesville,CHCH NZ
Joined: 30 Nov 01
Posts: 56
Credit: 195,657
RAC: 0
New Zealand
Message 1307629 - Posted: 19 Nov 2012, 6:16:36 UTC

Ah! So that's what's going on. Yesterday, after reinstalling BOINC, it took me 3 hrs to get any work units to download, the last of which completed downloading almost 24 hrs later... At the moment my laptop has 5 results ready to report and my desktop has a few waiting as well. Looking forward to testing Chrome's remote desktop while on holiday next week to check on my desktop's BOINC progress.
____________
A city destroyed by an earthquake is an opportunity to rebuild, redesign & make it a better place to be. Better, stronger, faster like the 6 Million Dollar Man

Sp@ceNv@der
Project donor
Joined: 10 Jul 05
Posts: 41
Credit: 81,560,118
RAC: 113,882
Belgium
Message 1307642 - Posted: 19 Nov 2012, 7:34:16 UTC - in response to Message 1307567.

THX Eric, for your time to post and for the time it will take to bring a fix eventually. Using proxies doesn't change a lot over here either. The two main crunchers have run dry: one will save on electricity, and the best one has been unleashed upon WCG, a very fine project also. I'll leave S@H on autopilot for now; checking the homepage of the project and the message boards once a day will do. Crunching is a pastime, not a basic necessity.

Kind Belgian Regards ;)
____________
To boldly crunch ...

{BDC} Thomas Dupont
Project donor
Volunteer tester
Joined: 9 Dec 11
Posts: 3726
Credit: 1,310,582
RAC: 780
France
Message 1307654 - Posted: 19 Nov 2012, 7:54:21 UTC

Thanks, Eric, for this news.
Good luck to all the technical staff at Berkeley from the French team "BRIGADE DU COSMOS".
Hopefully there will not be too many disgruntled users.
We must never forget that SETI@home is a non-profit scientific project,
so be indulgent with this kind of technical problem.
Hope everything returns to normal fairly quickly.
____________
Team Founder BRIGADE DU COSMOS




BRIGADE DU COSMOS is proudly sponsored by Zenovia Digital Exchange

Draconian
Volunteer tester
Joined: 16 Mar 03
Posts: 21
Credit: 1,809,058
RAC: 0
United States
Message 1307657 - Posted: 19 Nov 2012, 7:57:40 UTC

Confirms a theory I had as to why proxy servers work - they send the data stream slower as they are sending to multiple other systems as well.
Should be able to configure my router QOS to send, say, 30K / sec max for seti program. For now at least though, finally found a good proxy.
Good luck with the fix - you are burdened by your success!
____________

Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 6904
Credit: 25,732,033
RAC: 39,199
United Kingdom
Message 1307670 - Posted: 19 Nov 2012, 8:29:55 UTC
Last modified: 19 Nov 2012, 8:32:21 UTC

Thank you Eric.

Can we all now just wait and not start filling these boards with "When is this going to be fixed" threads?

If it takes months, it takes months. I for one will be happy to wait.
____________


Today is life, the only life we're sure of. Make the most of today.

Chris S
Project donor
Volunteer tester
Joined: 19 Nov 00
Posts: 31449
Credit: 12,157,201
RAC: 28,445
United Kingdom
Message 1307677 - Posted: 19 Nov 2012, 9:13:18 UTC
Last modified: 19 Nov 2012, 9:17:29 UTC

Eric, thank you very much for that comprehensive report explaining what is going on. That is exactly the sort of information that everybody needed to be told, the scope of the problem, and what is being done about it.

I agree with Bernie, let's not fill up the boards with "when" posts, but I would also suggest that we don't all add "suggestions" on how to fix it. Eric and the team at the Lab have years of experience with this project, and will know what has to be done and how to do it. They already know it isn't a hardware problem, and the fix will be down to man-hours, which with Matt's current absence is going to take time.

Most Boinc projects were scoped out in the days of CPU computing, and the advent of GPU power has brought about this crossing of the threshold that Eric mentions. In past times the man with the quad was seen as god, these days crunchers with 4 or 6 GPU cards are not that unusual.

Again, many thanks Eric, and I am happy to wait a couple of months for it all to settle down, as I know it will given time. There is always Seti Beta of course.

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5791
Credit: 57,989,957
RAC: 47,934
Australia
Message 1307682 - Posted: 19 Nov 2012, 9:27:35 UTC - in response to Message 1307567.

We're continuing to have issues due to a database problem early last week and a botched attempt to fix it.

My concern is that the Scheduler timeouts started 4 weeks ago, and became a major issue about 3 weeks ago.
And that if you can find a good proxy, you don't get Scheduler timeouts.

____________
Grant
Darwin NT.

S@NL Blue Angel
Joined: 11 May 03
Posts: 224
Credit: 4,544,373
RAC: 0
Netherlands
Message 1307686 - Posted: 19 Nov 2012, 10:13:13 UTC

Thank you Eric for the message, although I got 19 tasks to do last night. I thought, haha, the problem is fixed, but then I read this message. I hope some wonder will take place so that everything runs normally.
I will tell my hubby (XP_Freak) when he comes home from work, because he is moderator for Seti@Home Netherlands.

Greetings Angel Wynton
____________

clive G1FYE
Volunteer moderator
Joined: 4 Nov 04
Posts: 1300
Credit: 23,054,144
RAC: 0
United Kingdom
Message 1307706 - Posted: 19 Nov 2012, 12:58:51 UTC
Last modified: 19 Nov 2012, 13:02:17 UTC

Can we have a new switch in `SETI@home preferences` for VLAR on GPU,
in the `Run only the selected applications` section,
that would override the default setting and so reduce the calls to the servers
for those like me that can crunch them on ATI (or any other GPU that can handle them)?

Michael W.F. Miles
Joined: 24 Mar 07
Posts: 237
Credit: 27,999,099
RAC: 18,687
Canada
Message 1307805 - Posted: 19 Nov 2012, 18:35:11 UTC

So we are going to be having this same problem for months eh!!!.

Wow, at least we have an explanation but I am not sure I like the answers.
Couple months of this and I will be ready for the hospital... LOONEY BIN

Michael Miles

Tcarey
Joined: 20 Aug 99
Posts: 26
Credit: 31,720,491
RAC: 18,215
United States
Message 1307825 - Posted: 19 Nov 2012, 18:56:52 UTC

Eric, Thanks for the update and explanation of the problem. Understanding the issues reduces my frustration considerably.

I do hope that it won't be months before my fast machine gets any more work units.

dancer42
Volunteer tester
Joined: 2 Jun 02
Posts: 436
Credit: 1,093,724
RAC: 852
United States
Message 1307851 - Posted: 19 Nov 2012, 19:53:51 UTC - in response to Message 1307805.

So we are going to be having this same problem for months eh!!!.

Wow, at least we have an explanation but I am not sure I like the answers.
Couple months of this and I will be ready for the hospital... LOONEY BIN

Michael Miles


The bottom line is that while we, in our tens of thousands, have built a new system, gotten a new video card, or just found time to tweak our systems, SETI cannot afford to hire a full-time programmer to fix things and for the most part is running the same equipment it had last year. The new equipment is not nearly enough, and no amount of tweaking is going to fix it for long.
With Green Bank coming online soon, it will become thousands of times more likely for SETI to find a signal. Yet the funding through donations and endowments wouldn't pay the salaries for a good-sized McDonald's.
For those complaining: donate. A dollar a month is cheap to support any hobby. The minimum donation at the SETI site is $10, so save up if you have to. It is going to stay broke until SETI can catch up with new equipment.

____________

Ronald R CODNEY
Joined: 19 Nov 11
Posts: 87
Credit: 420,497
RAC: 0
United States
Message 1307853 - Posted: 19 Nov 2012, 19:59:42 UTC - in response to Message 1307851.

I agree with Dancer. If you haven't put up, then hush up and wait.

Eric and the rest of the frustrated Berkeley Vols: Thanks for the info and praying your expertise wins out.

S@NL Etienne Dokkum
Volunteer tester
Joined: 11 Jun 99
Posts: 159
Credit: 15,821,156
RAC: 17,498
Netherlands
Message 1307864 - Posted: 19 Nov 2012, 20:36:51 UTC - in response to Message 1307805.

So we are going to be having this same problem for months eh!!!.

Wow, at least we have an explanation but I am not sure I like the answers.
Couple months of this and I will be ready for the hospital... LOONEY BIN

Michael Miles


If you want to help Seti along for the time being, why don't you run Beta? You'll still be crunching signals, and it's for the good of the future of this system...

I do the same, it's no use complaining about things out of our (and the boys at the lab) control.

On topic : thanks Eric for the info, we'll wait it out and see what happens. Set my rigs to NNT and if it reports and I see the servers up I will surely be back !
____________


Copyright © 2014 University of California