Panic Mode On (81) Server Problems?

arkayn (Project Donor)
Volunteer tester
Joined: 14 May 99
Posts: 4097
Credit: 51,576,090
RAC: 1,593
United States
Message 1333541 - Posted: 1 Feb 2013, 15:19:55 UTC

Scheduler MIA once again!!



Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,573,354
RAC: 256
Greece
Message 1333550 - Posted: 1 Feb 2013, 15:37:39 UTC

Maybe someone at the lab needs to kick a machine, and PLEASE RAISE THE LIMITS.

100+100 is ridiculous. At times like this, all the machines are going to sleep. Give us 1000 for GPU and 100 for CPU.

Just an idea… Maybe they could feed us more WUs per request, and BOINC would connect every 10 or 15 minutes instead of every 5.
That way so many computers would not use the bandwidth so often, because uploads are much smaller than downloads, even if we have to upload 50 or 100 at a time.

Tim


KWSN Ekky Ekky Ekky
Joined: 25 May 99
Posts: 937
Credit: 20,516,552
RAC: 8,164
United Kingdom
Message 1333556 - Posted: 1 Feb 2013, 15:47:01 UTC - in response to Message 1333550.

I second those two motions.
All in favour say "Aye"?
Passed nem con.

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6085
Credit: 154,977,524
RAC: 47,104
United States
Message 1333560 - Posted: 1 Feb 2013, 15:52:05 UTC - in response to Message 1333550.

The limits were put in place to deal with a database issue, not a bandwidth issue.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7066
Credit: 100,897,470
RAC: 60,150
Germany
Message 1333561 - Posted: 1 Feb 2013, 15:52:14 UTC

Problems again (scheduler contact not possible) .. :-(


IIRC, in the past SAH had limits of 50 WUs per CPU thread and 400 WUs per GPU,
.. and it worked much better than it does now.


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *


Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,573,354
RAC: 256
Greece
Message 1333566 - Posted: 1 Feb 2013, 16:08:32 UTC - in response to Message 1333560.

I am not an expert, but I cannot accept that we have had a database issue for so long and it still isn’t fixed.
Do other projects have the same database problems?

Tim


Richard Haselgrove (Project Donor)
Volunteer tester
Joined: 4 Jul 99
Posts: 11136
Credit: 83,501,681
RAC: 41,126
United Kingdom
Message 1333570 - Posted: 1 Feb 2013, 16:18:57 UTC - in response to Message 1333566.

Yes, Einstein does:

It seems that the large number of tasks (3.7M) is seriously stressing our databases (not only the replica).

For the time being I changed the configuration of the project such that the time "finished" workunits and tasks are kept in the DB is reduced from 7d to 5d.

This means you have less time to inspect your tasks e.g. after validation, but it should reduce the number of tasks we need to keep in our DB in total and make the DBs more responsive.

BM
Technical News, 23 January 2013

(Einstein has 38,869 active volunteers, only a quarter of the 147,408 active volunteers here - figures from BOINCstats)

juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 5847
Credit: 330,512,932
RAC: 7,686
Panama
Message 1333576 - Posted: 1 Feb 2013, 16:57:27 UTC

Maybe an intermediate solution: 100 for the CPUs and 300 for the GPUs, or at least 100 per GPU on the host, for example 2 GPUs = 200 WUs, 4 GPUs = 400 WUs?

Not too much, but it would at least give us 4 to 6 hours before running dry when a problem happens.


ignorance is no excuse
Joined: 4 Oct 00
Posts: 9529
Credit: 44,433,321
RAC: 0
Korea, North
Message 1333590 - Posted: 1 Feb 2013, 17:17:37 UTC - in response to Message 1333576.
Last modified: 1 Feb 2013, 17:18:06 UTC

I had about 50 AP WUs that finally downloaded this morning at 50kb/s each, which is an improvement over the 0.5 I was getting over the last few days.

Sadly, I can't get the results to upload now.


In a rich man's house there is no place to spit but his face.
Diogenes Of Sinope

End terrorism by building a school

Thomas
Volunteer tester
Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1333592 - Posted: 1 Feb 2013, 17:19:32 UTC

Hi all,
WU transfers have been having problems for a few hours...
Does anybody have information on this?
http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets
THX


KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 2577
Credit: 34,649,310
RAC: 20,075
United States
Message 1333593 - Posted: 1 Feb 2013, 17:20:07 UTC

Scheduler needs to be kicked again - 7 contacts (from 3 different computers) with no luck on any of them. 64 WU's waiting to report!


.

TPCBF
Joined: 18 May 99
Posts: 54
Credit: 2,726,393
RAC: 920
United States
Message 1333601 - Posted: 1 Feb 2013, 17:29:28 UTC - in response to Message 1333566.

Yes, WCG had a two-day scheduled outage last month to fix a specific performance-affecting issue with the main MySQL database.
And I personally think the overall issues are database related rather than bandwidth related. The bandwidth issues are IMHO just a symptom...

petri33 (Project Donor)
Volunteer tester
Joined: 6 Jun 02
Posts: 1184
Credit: 165,785,767
RAC: 157,370
Finland
Message 1333611 - Posted: 1 Feb 2013, 17:58:20 UTC - in response to Message 1333592.

To me, the graph actually looks quite all right (NOW) :D

http://fragment1.berkeley.edu/newcricket/grapher.cgi?target=/router-interfaces/inr-250/gigabitethernet2_3&ranges=d%3Aw&view=Octets

Thomas
Volunteer tester
Joined: 9 Dec 11
Posts: 1499
Credit: 1,345,576
RAC: 0
France
Message 1333621 - Posted: 1 Feb 2013, 19:04:42 UTC - in response to Message 1333611.

Yes Petri, NOW ! :p


Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7066
Credit: 100,897,470
RAC: 60,150
Germany
Message 1333647 - Posted: 1 Feb 2013, 20:28:59 UTC

My BOINC can reach the SAH scheduler again.

.. the DL of the WUs is very slow - and most of them are stuck in the waiting loop.




Rolf
Joined: 16 Jun 09
Posts: 114
Credit: 7,817,146
RAC: 0
Switzerland
Message 1333654 - Posted: 1 Feb 2013, 20:48:42 UTC - in response to Message 1333647.

What else?

Iona
Joined: 12 Jul 07
Posts: 625
Credit: 5,015,755
RAC: 0
United Kingdom
Message 1333658 - Posted: 1 Feb 2013, 21:11:57 UTC

Hmmm. Validation seems to have taken a break. Now that I've said it, the ones that are currently awaiting validation (several hours in one case) will now promptly validate!



Don't take life too seriously, as you'll never come out of it alive!

Claggy (Project Donor)
Volunteer tester
Joined: 5 Jul 99
Posts: 4622
Credit: 46,334,291
RAC: 2,939
United Kingdom
Message 1333667 - Posted: 1 Feb 2013, 21:50:43 UTC
Last modified: 1 Feb 2013, 21:51:00 UTC

Uploads aren't going through now, at either Seti Main or Seti Beta.

Claggy

fscheel
Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1333669 - Posted: 1 Feb 2013, 21:54:08 UTC - in response to Message 1333667.

And cricket's blue line is headed south
FRANK

Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 7474
Credit: 90,865,618
RAC: 45,141
Australia
Message 1333672 - Posted: 1 Feb 2013, 21:58:03 UTC - in response to Message 1333669.
Last modified: 1 Feb 2013, 22:00:33 UTC

EDIT - I spoke too soon.
Uploads have started piling up here.


Grant
Darwin NT



 