Panic Mode On (81) Server Problems?

arkayn (Project Donor)
Volunteer tester
Joined: 14 May 99
Posts: 4184
Credit: 53,049,088
RAC: 56
United States
Message 1333541 - Posted: 1 Feb 2013, 15:19:55 UTC

Scheduler MIA once again!!

ID: 1333541 · Report as offensive
Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,575,000
RAC: 0
Greece
Message 1333550 - Posted: 1 Feb 2013, 15:37:39 UTC

Someone at the lab must kick a machine, maybe, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines are going to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea… Maybe they could feed us many WUs at once, and BOINC would connect not every 5 minutes but every 10 or 15.
That way so many computers would not use the bandwidth so often, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim

ID: 1333550 · Report as offensive
KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 937
Credit: 24,683,166
RAC: 8,412
United Kingdom
Message 1333556 - Posted: 1 Feb 2013, 15:47:01 UTC - in response to Message 1333550.  

Someone at the lab must kick a machine, maybe, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines are going to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea… Maybe they could feed us many WUs at once, and BOINC would connect not every 5 minutes but every 10 or 15.
That way so many computers would not use the bandwidth so often, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim

I second those two motions.
All in favour say "Aye"?
Passed nem con.

ID: 1333556 · Report as offensive
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6466
Credit: 175,901,183
RAC: 56,290
United States
Message 1333560 - Posted: 1 Feb 2013, 15:52:05 UTC - in response to Message 1333550.  

Someone at the lab must kick a machine, maybe, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines are going to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea… Maybe they could feed us many WUs at once, and BOINC would connect not every 5 minutes but every 10 or 15.
That way so many computers would not use the bandwidth so often, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim

The limits were put in place to deal with a database issue, not bandwidth.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1333560 · Report as offensive
[SETI.Germany] Sutaru Tsureku (aka Dirk :-)
Volunteer tester
Joined: 6 Apr 07
Posts: 7085
Credit: 135,065,623
RAC: 125,888
Germany
Message 1333561 - Posted: 1 Feb 2013, 15:52:14 UTC

Problems again (scheduler contact not possible) .. :-(


IIRC, in the past SAH had limits of 50 WUs per CPU thread and 400 WUs per GPU.
.. and it worked much better than it does now.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1333561 · Report as offensive
Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,575,000
RAC: 0
Greece
Message 1333566 - Posted: 1 Feb 2013, 16:08:32 UTC - in response to Message 1333560.  

I am not an expert, but I cannot deal with the fact that we have had a database issue for so long and it still isn't fixed.
Do other projects have the same database problems?

Tim

ID: 1333566 · Report as offensive
Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 11516
Credit: 106,277,485
RAC: 70,947
United Kingdom
Message 1333570 - Posted: 1 Feb 2013, 16:18:57 UTC - in response to Message 1333566.  

I am not an expert, but I cannot deal with the fact that we have had a database issue for so long and it still isn't fixed.
Do other projects have the same database problems?

Tim

Yes, Einstein does:

It seems that the large number of tasks (3.7M) is seriously stressing our databases (not only the replica).

For the time being I changed the configuration of the project such that the time "finished" workunits and tasks are kept in the DB is reduced from 7d to 5d.

This means you have less time to inspect your tasks e.g. after validation, but it should reduce the number of tasks we need to keep in our DB in total and make the DBs more responsive.

BM
Technical News, 23 January 2013

(Einstein has 38,869 active volunteers, only a quarter of the 147,408 active volunteers here - figures from BOINCstats)
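
As a rough illustration of why shortening the retention window helps, here is a minimal Python sketch. It assumes a steady completion rate, so the number of finished tasks held in the DB is simply rate times retention time. The rate of 100,000 per day is a made-up figure, not an Einstein or SETI number, and the function name is hypothetical. It also checks the "a quarter" volunteer ratio against the BOINCstats figures quoted above.

```python
def retained_finished_tasks(finished_per_day: float, retention_days: float) -> float:
    """Finished tasks held in the DB at steady state (assumed model: rate * retention)."""
    return finished_per_day * retention_days


# Hypothetical completion rate, used only to make the arithmetic concrete.
RATE_PER_DAY = 100_000

before = retained_finished_tasks(RATE_PER_DAY, 7)  # old setting: 7 days
after = retained_finished_tasks(RATE_PER_DAY, 5)   # new setting: 5 days
reduction = 1 - after / before                     # fraction of finished rows no longer kept

# Active volunteer counts quoted from BOINCstats in the post above.
einstein_volunteers = 38_869
seti_volunteers = 147_408
ratio = einstein_volunteers / seti_volunteers

if __name__ == "__main__":
    print(f"finished-task backlog shrinks by {reduction:.1%}")
    print(f"Einstein / SETI active volunteers: {ratio:.2f}")
```

Under that assumption the finished-task backlog shrinks by 2/7 whatever the rate is. The 3.7M total would presumably shrink by less, since it also counts tasks that are still in progress.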
ID: 1333570 · Report as offensive
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 6059
Credit: 350,069,032
RAC: 119,622
Panama
Message 1333576 - Posted: 1 Feb 2013, 16:57:27 UTC

Maybe an intermediate solution: 100 for the CPUs and 300 for the GPUs, or at least 100 per GPU on the host, for example 2 GPUs = 200 WUs, 4 GPUs = 400 WUs?

Not too much, but it would at least give us 4 to 6 hours before running dry when a problem happens.
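
The per-GPU idea above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the SETI@home scheduler is configured. The function names, the 6 minutes per task, and the 4 tasks running at once are all hypothetical.

```python
def proposed_limits(num_gpus: int, cpu_limit: int = 100, per_gpu_limit: int = 100) -> dict:
    """Task caps for one host under the '100 per GPU' proposal."""
    if num_gpus < 0:
        raise ValueError("num_gpus must be non-negative")
    return {"cpu": cpu_limit, "gpu": per_gpu_limit * num_gpus}


def buffer_hours(task_limit: int, minutes_per_task: float, tasks_in_parallel: int) -> float:
    """Hours a full cache lasts if no new work arrives (simple estimate)."""
    if tasks_in_parallel <= 0:
        raise ValueError("tasks_in_parallel must be positive")
    return task_limit * minutes_per_task / tasks_in_parallel / 60.0


if __name__ == "__main__":
    for gpus in (2, 4):
        print(gpus, "GPUs ->", proposed_limits(gpus))
    # Hypothetical host: 2 GPUs, 6 minutes per task, 4 tasks running at once.
    print(buffer_hours(proposed_limits(2)["gpu"], 6, 4), "hours of GPU work")
```

With those made-up numbers a 2-GPU host would hold 200 GPU tasks, which lasts 5 hours. Faster cards or more tasks in parallel would shorten that.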
ID: 1333576 · Report as offensive
skildude
Joined: 4 Oct 00
Posts: 9529
Credit: 44,436,947
RAC: 0
Burma
Message 1333590 - Posted: 1 Feb 2013, 17:17:37 UTC - in response to Message 1333576.  
Last modified: 1 Feb 2013, 17:18:06 UTC

I had about 50 AP WUs that finally downloaded this morning at 50 kb/s each, which is an improvement over the 0.5 I was getting over the last few days.

Sadly, I can't get the results to upload now.
In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope
ID: 1333590 · Report as offensive
Thomas
Volunteer tester
Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1333592 - Posted: 1 Feb 2013, 17:19:32 UTC

Hi all,
There has been a problem transferring WUs for a few hours now...
Does anybody have information on this?
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
THX
ID: 1333592 · Report as offensive
KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 2816
Credit: 43,289,796
RAC: 27,696
United States
Message 1333593 - Posted: 1 Feb 2013, 17:20:07 UTC

Scheduler needs to be kicked again - 7 contacts (from 3 different computers) with no luck on any of them. 64 WUs waiting to report!
.

Hello, Bangkok!
ID: 1333593 · Report as offensive
TPCBF
Joined: 18 May 99
Posts: 54
Credit: 3,557,096
RAC: 2,289
United States
Message 1333601 - Posted: 1 Feb 2013, 17:29:28 UTC - in response to Message 1333566.  

I am not an expert, but I cannot deal with the fact that we have had a database issue for so long and it still isn't fixed.
Do other projects have the same database problems?

Yes, WCG had a two-day scheduled outage last month to fix a specific performance-affecting issue with the main MySQL database.
And I personally think the overall issues are database-related rather than bandwidth-related. The bandwidth issues are IMHO just a symptom...
ID: 1333601 · Report as offensive
petri33 (Project Donor)
Volunteer tester
Joined: 6 Jun 02
Posts: 1465
Credit: 269,689,699
RAC: 302,237
Finland
Message 1333611 - Posted: 1 Feb 2013, 17:58:20 UTC - in response to Message 1333592.  

Hi all,
There has been a problem transferring WUs for a few hours now...
Does anybody have information on this?
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
THX



To me, the graph actually looks quite all right (NOW) :D

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1333611 · Report as offensive
Thomas
Volunteer tester
Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1333621 - Posted: 1 Feb 2013, 19:04:42 UTC - in response to Message 1333611.  

Yes Petri, NOW ! :p
ID: 1333621 · Report as offensive
[SETI.Germany] Sutaru Tsureku (aka Dirk :-)
Volunteer tester
Joined: 6 Apr 07
Posts: 7085
Credit: 135,065,623
RAC: 125,888
Germany
Message 1333647 - Posted: 1 Feb 2013, 20:28:59 UTC

My BOINC can reach the SAH scheduler again.

.. the WU downloads are very slow, and mostly sitting in the waiting loop.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1333647 · Report as offensive
Rolf
Joined: 16 Jun 09
Posts: 114
Credit: 7,817,146
RAC: 0
Switzerland
Message 1333654 - Posted: 1 Feb 2013, 20:48:42 UTC - in response to Message 1333647.  

My BOINC can reach the SAH scheduler again.

.. the WU downloads are very slow, and mostly sitting in the waiting loop.


What else?
ID: 1333654 · Report as offensive
Iona
Joined: 12 Jul 07
Posts: 739
Credit: 8,250,663
RAC: 22,641
United Kingdom
Message 1333658 - Posted: 1 Feb 2013, 21:11:57 UTC

Hmmm. Validation seems to have taken a break. Now that I've said it, the ones that are currently awaiting validation (several hours in one case) will now promptly validate!



Don't take life too seriously, as you'll never come out of it alive!
ID: 1333658 · Report as offensive
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,181,650
RAC: 386
United Kingdom
Message 1333667 - Posted: 1 Feb 2013, 21:50:43 UTC
Last modified: 1 Feb 2013, 21:51:00 UTC

Uploads aren't going through now, at either Seti Main or Seti Beta.

Claggy
ID: 1333667 · Report as offensive
fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1333669 - Posted: 1 Feb 2013, 21:54:08 UTC - in response to Message 1333667.  

Uploads aren't going through now, at either Seti Main or Seti Beta.

Claggy


And cricket's blue line is headed south
FRANK
ID: 1333669 · Report as offensive
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 8887
Credit: 115,151,630
RAC: 70,428
Australia
Message 1333672 - Posted: 1 Feb 2013, 21:58:03 UTC - in response to Message 1333669.  
Last modified: 1 Feb 2013, 22:00:33 UTC

EDIT: I spoke too soon.
Uploads have started piling up here.
Grant
Darwin NT
ID: 1333672 · Report as offensive