Panic Mode On (81) Server Problems?



Message boards : Number crunching : Panic Mode On (81) Server Problems?

arkayn (Project donor)
Volunteer tester
Joined: 14 May 99
Posts: 3689
Credit: 48,727,742
RAC: 6,443
United States
Message 1333541 - Posted: 1 Feb 2013, 15:19:55 UTC

Scheduler MIA once again!!
____________

Tim
Volunteer tester
Joined: 19 May 99
Posts: 205
Credit: 247,927,372
RAC: 163,532
Greece
Message 1333550 - Posted: 1 Feb 2013, 15:37:39 UTC

Someone needs to kick a machine at the lab, maybe, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines go to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea: maybe they could feed us many WUs at once, and have BOINC connect not every 5 minutes but every 10 or 15.
That way we would not use the bandwidth so often, with so many computers, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim

____________

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 922
Credit: 12,047,096
RAC: 13,812
United Kingdom
Message 1333556 - Posted: 1 Feb 2013, 15:47:01 UTC - in response to Message 1333550.

Tim wrote:
> Someone needs to kick a machine at the lab, maybe, and PLEASE RAISE THE LIMITS.
>
> 100+100 is ridiculous. At times like this, all the machines go to sleep. Give us 1000 for GPU and 100 for CPU.
>
> Just an idea: maybe they could feed us many WUs at once, and have BOINC connect not every 5 minutes but every 10 or 15.
> That way we would not use the bandwidth so often, with so many computers, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.
>
> Tim

I second those two motions.
All in favour say "Aye"?
Passed nem con.
____________

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4435
Credit: 118,971,504
RAC: 138,778
United States
Message 1333560 - Posted: 1 Feb 2013, 15:52:05 UTC - in response to Message 1333550.

Tim wrote:
> Someone needs to kick a machine at the lab, maybe, and PLEASE RAISE THE LIMITS.
>
> 100+100 is ridiculous. At times like this, all the machines go to sleep. Give us 1000 for GPU and 100 for CPU.
>
> Just an idea: maybe they could feed us many WUs at once, and have BOINC connect not every 5 minutes but every 10 or 15.
> That way we would not use the bandwidth so often, with so many computers, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.
>
> Tim

The limits were put in place to deal with a database issue, not bandwidth.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

[seti.international] Dirk Sadowski (Project donor)
Volunteer tester
Joined: 6 Apr 07
Posts: 7101
Credit: 60,913,779
RAC: 17,172
Germany
Message 1333561 - Posted: 1 Feb 2013, 15:52:14 UTC

Problems again (scheduler contact not possible) .. :-(


IIRC, in the past SAH had a limit of 50 WUs per CPU thread and 400 WUs per GPU.
.. and it worked much better than it does now.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

Tim
Volunteer tester
Joined: 19 May 99
Posts: 205
Credit: 247,927,372
RAC: 163,532
Greece
Message 1333566 - Posted: 1 Feb 2013, 16:08:32 UTC - in response to Message 1333560.

I am not an expert, but I cannot accept that we have had a database issue for so long and it still isn't fixed.
Do other projects have the same database problems?

Tim

____________

Richard Haselgrove (Project donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 8633
Credit: 51,543,133
RAC: 48,318
United Kingdom
Message 1333570 - Posted: 1 Feb 2013, 16:18:57 UTC - in response to Message 1333566.

Tim wrote:
> I am not an expert, but I cannot accept that we have had a database issue for so long and it still isn't fixed.
> Do other projects have the same database problems?
>
> Tim

Yes, Einstein does:

> It seems that the large number of tasks (3.7M) is seriously stressing our databases (not only the replica).
>
> For the time being I changed the configuration of the project such that the time "finished" workunits and tasks are kept in the DB is reduced from 7d to 5d.
>
> This means you have less time to inspect your tasks e.g. after validation, but it should reduce the number of tasks we need to keep in our DB in total and make the DBs more responsive.

BM
Technical News, 23 January 2013

(Einstein has 38,869 active volunteers, only a quarter of the 147,408 active volunteers here - figures from BOINCstats)

juan BFB (Project donor)
Volunteer tester
Joined: 16 Mar 07
Posts: 5399
Credit: 306,074,622
RAC: 323,497
Brazil
Message 1333576 - Posted: 1 Feb 2013, 16:57:27 UTC

Maybe an intermediate solution: 100 for the CPUs and 300 for the GPUs, or at least 100 per GPU on the host, for example 2 GPUs = 200 WUs, 4 GPUs = 400 WUs?

Not too much, but it would at least give us 4 to 6 hrs before running dry when a problem happens.
____________

ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 1333590 - Posted: 1 Feb 2013, 17:17:37 UTC - in response to Message 1333576.
Last modified: 1 Feb 2013, 17:18:06 UTC

Had about 50 AP WUs that finally D/Led this morning at 50 kb/s each, which is an improvement over the 0.5 I was getting over the last few days.

Sadly, I can't get the results to upload now.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

{BDC} Thomas Dupont (Project donor)
Volunteer tester
Joined: 9 Dec 11
Posts: 3895
Credit: 1,325,539
RAC: 202
France
Message 1333592 - Posted: 1 Feb 2013, 17:19:32 UTC

Hi all,
WU transfers have been a problem for a few hours now...
Does anybody have information on this?
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
THX
____________
Founder of team BRIGADE DU COSMOS
Ranked 55th !

KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1961
Credit: 10,478,524
RAC: 10,626
United States
Message 1333593 - Posted: 1 Feb 2013, 17:20:07 UTC

Scheduler needs to be kicked again - 7 contacts (from 3 different computers) with no luck on any of them. 64 WUs waiting to report!
____________
.

TPCBF
Joined: 18 May 99
Posts: 50
Credit: 1,187,896
RAC: 3,902
United States
Message 1333601 - Posted: 1 Feb 2013, 17:29:28 UTC - in response to Message 1333566.

Tim wrote:
> I am not an expert, but I cannot accept that we have had a database issue for so long and it still isn't fixed.
> Do other projects have the same database problems?

Yes, WCG had a two-day scheduled outage last month to fix a specific performance-affecting issue with the main MySQL database.
And I personally think the overall issues are database related rather than bandwidth related. The bandwidth issues are IMHO just a symptom...

petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 382
Credit: 69,867,719
RAC: 127,816
Finland
Message 1333611 - Posted: 1 Feb 2013, 17:58:20 UTC - in response to Message 1333592.

{BDC} Thomas Dupont wrote:
> Hi all,
> WU transfers have been a problem for a few hours now...
> Does anybody have information on this?
> http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
> THX



To me, the graph actually looks quite all right (NOW) :D

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
____________

{BDC} Thomas Dupont (Project donor)
Volunteer tester
Joined: 9 Dec 11
Posts: 3895
Credit: 1,325,539
RAC: 202
France
Message 1333621 - Posted: 1 Feb 2013, 19:04:42 UTC - in response to Message 1333611.

Yes Petri, NOW ! :p
____________
Founder of team BRIGADE DU COSMOS
Ranked 55th !

[seti.international] Dirk Sadowski (Project donor)
Volunteer tester
Joined: 6 Apr 07
Posts: 7101
Credit: 60,913,779
RAC: 17,172
Germany
Message 1333647 - Posted: 1 Feb 2013, 20:28:59 UTC

My BOINC can reach the SAH scheduler again.

.. the DL of the WUs is very slow - but they are mostly in a waiting loop.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________
BR

SETI@home Needs your Help ... $10 & U get a Star!

Team seti.international

Das Deutsche Cafe. The German Cafe.

Rolf
Joined: 16 Jun 09
Posts: 114
Credit: 7,817,146
RAC: 1
Switzerland
Message 1333654 - Posted: 1 Feb 2013, 20:48:42 UTC - in response to Message 1333647.

[seti.international] Dirk Sadowski wrote:
> My BOINC can reach the SAH scheduler again.
>
> .. the DL of the WUs is very slow - but they are mostly in a waiting loop.


What else?

Iona
Joined: 12 Jul 07
Posts: 567
Credit: 2,914,113
RAC: 2,270
United Kingdom
Message 1333658 - Posted: 1 Feb 2013, 21:11:57 UTC

Hmmm. Validation seems to have taken a break. Now that I've said it, the ones that are currently awaiting validation (several hours in one case) will now promptly validate!



____________
Don't take life too seriously, as you'll never come out of it alive!

Claggy (Project donor)
Volunteer tester
Joined: 5 Jul 99
Posts: 4140
Credit: 33,573,172
RAC: 26,136
United Kingdom
Message 1333667 - Posted: 1 Feb 2013, 21:50:43 UTC
Last modified: 1 Feb 2013, 21:51:00 UTC

Uploads aren't going through now, at either Seti Main or Seti Beta.

Claggy

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1333669 - Posted: 1 Feb 2013, 21:54:08 UTC - in response to Message 1333667.

Claggy wrote:
> Uploads aren't going through now, at either Seti Main or Seti Beta.
>
> Claggy


And cricket's blue line is headed south
FRANK

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5864
Credit: 60,549,147
RAC: 47,675
Australia
Message 1333672 - Posted: 1 Feb 2013, 21:58:03 UTC - in response to Message 1333669.
Last modified: 1 Feb 2013, 22:00:33 UTC

EDIT - I spoke too soon.
Uploads have started piling up here.
____________
Grant
Darwin NT.


Copyright © 2014 University of California