Panic Mode On (81) Server Problems?


Message boards : Number crunching : Panic Mode On (81) Server Problems?

arkayn
Volunteer tester
Joined: 14 May 99
Posts: 3588
Credit: 47,338,019
RAC: 427
United States
Message 1333541 - Posted: 1 Feb 2013, 15:19:55 UTC

Scheduler MIA once again!!
____________

Tim
Volunteer tester
Joined: 19 May 99
Posts: 196
Credit: 231,964,552
RAC: 112,738
Greece
Message 1333550 - Posted: 1 Feb 2013, 15:37:39 UTC

Someone at the lab may need to kick a machine, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines go to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea: maybe they could feed us many WUs, but have BOINC connect every 10 or 15 minutes instead of every 5.
That way, so many computers would not be using the bandwidth so often, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim

____________

KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 922
Credit: 10,835,905
RAC: 12,867
United Kingdom
Message 1333556 - Posted: 1 Feb 2013, 15:47:01 UTC - in response to Message 1333550.

Someone at the lab may need to kick a machine, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines go to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea: maybe they could feed us many WUs, but have BOINC connect every 10 or 15 minutes instead of every 5.
That way, so many computers would not be using the bandwidth so often, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim

I second those two motions.
All in favour say "Aye"?
Passed nem con.
____________

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 3841
Credit: 106,346,866
RAC: 90,496
United States
Message 1333560 - Posted: 1 Feb 2013, 15:52:05 UTC - in response to Message 1333550.

Someone at the lab may need to kick a machine, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines go to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea: maybe they could feed us many WUs, but have BOINC connect every 10 or 15 minutes instead of every 5.
That way, so many computers would not be using the bandwidth so often, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim

The limits were put in place to deal with a database issue, not a bandwidth issue.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7018
Credit: 59,130,236
RAC: 20,638
Germany
Message 1333561 - Posted: 1 Feb 2013, 15:52:14 UTC

Problems again (scheduler contact not possible) .. :-(

IIRC, in the past SAH had limits of 50 WUs per CPU thread and 400 WUs per GPU.
.. and it worked much better than it does now.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________
BR



>Das Deutsche Cafe. The German Cafe.<

Tim
Volunteer tester
Joined: 19 May 99
Posts: 196
Credit: 231,964,552
RAC: 112,738
Greece
Message 1333566 - Posted: 1 Feb 2013, 16:08:32 UTC - in response to Message 1333560.

I am not an expert, but I cannot accept that we have had a database issue for so long and it still isn't fixed.
Do other projects have the same database problems?

Tim

____________

Richard Haselgrove
Volunteer tester
Joined: 4 Jul 99
Posts: 8373
Credit: 46,544,441
RAC: 13,827
United Kingdom
Message 1333570 - Posted: 1 Feb 2013, 16:18:57 UTC - in response to Message 1333566.

I am not an expert, but I cannot accept that we have had a database issue for so long and it still isn't fixed.
Do other projects have the same database problems?

Tim

Yes, Einstein does:

It seems that the large number of tasks (3.7M) is seriously stressing our databases (not only the replica).

For the time being I changed the configuration of the project such that the time "finished" workunits and tasks are kept in the DB is reduced from 7d to 5d.

This means you have less time to inspect your tasks e.g. after validation, but it should reduce the number of tasks we need to keep in our DB in total and make the DBs more responsive.

BM
Technical News, 23 January 2013

(Einstein has 38,869 active volunteers, only a quarter of the 147,408 active volunteers here - figures from BOINCstats)

juan BFB
Volunteer tester
Joined: 16 Mar 07
Posts: 4923
Credit: 267,156,257
RAC: 337,439
Brazil
Message 1333576 - Posted: 1 Feb 2013, 16:57:27 UTC

Maybe an intermediate solution: 100 for the CPUs and 300 for the GPUs, or at least 100 per GPU on the host, for example 2 GPUs = 200 WUs, 4 GPUs = 400 WUs?

Not too much, but it would at least give us 4 to 6 hours before running dry when a problem happens.
____________

ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,274
RAC: 0
Korea, North
Message 1333590 - Posted: 1 Feb 2013, 17:17:37 UTC - in response to Message 1333576.
Last modified: 1 Feb 2013, 17:18:06 UTC

I had about 50 AP WUs that finally downloaded this morning at 50kb/s each, which is an improvement over the 0.5 I was getting over the last few days.

Sadly, I can't get the results to upload now.
____________
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

{BDC} Thomas Dupont
Volunteer tester
Joined: 9 Dec 11
Posts: 3601
Credit: 1,287,942
RAC: 693
France
Message 1333592 - Posted: 1 Feb 2013, 17:19:32 UTC

Hi all,
There has been a problem with WU transfers for a few hours...
Does anybody have information on this?
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
THX
____________
Team Founder BRIGADE DU COSMOS




BRIGADE DU COSMOS is proudly sponsored by Zenovia Digital Exchange

KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 1893
Credit: 9,081,266
RAC: 9,279
United States
Message 1333593 - Posted: 1 Feb 2013, 17:20:07 UTC

Scheduler needs to be kicked again - 7 contact attempts (from 3 different computers) with no luck on any of them. 64 WUs waiting to report!
____________
.

TPCBF
Joined: 18 May 99
Posts: 50
Credit: 918,306
RAC: 1,931
United States
Message 1333601 - Posted: 1 Feb 2013, 17:29:28 UTC - in response to Message 1333566.

I am not an expert, but I cannot accept that we have had a database issue for so long and it still isn't fixed.
Do other projects have the same database problems?

Yes, WCG had a two-day scheduled outage last month to fix a specific performance-affecting issue with the main MySQL database.
And I personally think the overall issues are database related rather than bandwidth related. The bandwidth issues are IMHO just a symptom...

petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 372
Credit: 66,600,345
RAC: 86,169
Finland
Message 1333611 - Posted: 1 Feb 2013, 17:58:20 UTC - in response to Message 1333592.

Hi all,
There has been a problem with WU transfers for a few hours...
Does anybody have information on this?
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
THX



To me, the graph actually looks quite all right (NOW) :D

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
____________

{BDC} Thomas Dupont
Volunteer tester
Joined: 9 Dec 11
Posts: 3601
Credit: 1,287,942
RAC: 693
France
Message 1333621 - Posted: 1 Feb 2013, 19:04:42 UTC - in response to Message 1333611.

Yes Petri, NOW ! :p
____________

[seti.international] Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7018
Credit: 59,130,236
RAC: 20,638
Germany
Message 1333647 - Posted: 1 Feb 2013, 20:28:59 UTC

My BOINC can reach the SAH scheduler again.

.. the DL of the WUs is very slow - mostly they are in a waiting loop.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
____________

Rolf
Joined: 16 Jun 09
Posts: 114
Credit: 7,807,521
RAC: 297
Switzerland
Message 1333654 - Posted: 1 Feb 2013, 20:48:42 UTC - in response to Message 1333647.

My BOINC can reach the SAH scheduler again.

.. the DL of the WUs is very slow - mostly they are in a waiting loop.


What else?

Iona
Joined: 12 Jul 07
Posts: 549
Credit: 2,690,420
RAC: 304
United Kingdom
Message 1333658 - Posted: 1 Feb 2013, 21:11:57 UTC

Hmmm. Validation seems to have taken a break. Now that I've said it, the ones that are currently awaiting validation (several hours in one case) will now promptly validate!



____________
Don't take life too seriously, as you'll never come out of it alive!

Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4039
Credit: 32,691,192
RAC: 761
United Kingdom
Message 1333667 - Posted: 1 Feb 2013, 21:50:43 UTC
Last modified: 1 Feb 2013, 21:51:00 UTC

Uploads aren't going through now, at either Seti Main or Seti Beta.

Claggy

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1333669 - Posted: 1 Feb 2013, 21:54:08 UTC - in response to Message 1333667.

Uploads aren't going through now, at either Seti Main or Seti Beta.

Claggy


And cricket's blue line is headed south
FRANK

Grant (SSSF)
Joined: 19 Aug 99
Posts: 5683
Credit: 56,077,656
RAC: 49,959
Australia
Message 1333672 - Posted: 1 Feb 2013, 21:58:03 UTC - in response to Message 1333669.
Last modified: 1 Feb 2013, 22:00:33 UTC

EDIT - I spoke too soon.
Uploads have started piling up here.
____________
Grant
Darwin NT.

Copyright © 2014 University of California