Panic Mode On (109) Server Problems?

Tutankhamon
Volunteer tester
Joined: 1 Nov 08
Posts: 6871
Credit: 43,231,377
RAC: 14,560
Sweden
Message 1906273 - Posted: 10 Dec 2017, 22:13:42 UTC
Last modified: 10 Dec 2017, 22:17:16 UTC

And again, download is terrible.
But it recovered after 5 minutes at a snail's pace.
ID: 1906273
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9171
Credit: 118,609,630
RAC: 51,676
Australia
Message 1906374 - Posted: 11 Dec 2017, 7:17:06 UTC
Last modified: 11 Dec 2017, 7:19:39 UTC

To add to the download issues: the Validation/Assimilation backlog has been clearing, slowly. Very slowly. However, there is now a backlog of WUs (AP and MB) awaiting deletion building up. And the replica, after falling behind and catching up again, is now back to falling behind, even further behind than previously. Oh, and the Ready-to-send buffer has been rather up and down as the splitters have been unable to crank out the work again.
The system isn't looking too healthy.
And if it's all to do with the current noise bombs resulting in the record received-last-hour numbers, it doesn't bode well for the future if we do get a significant boost in active cruncher numbers to process the new work.

EDIT: I forgot to mention that the application-settings-related issue that began late last Dec, and that affects some but not others, is still rearing its unwelcome head. Had to triple update (again) as my cache was running down (again).
Grant
Darwin NT
ID: 1906374
Keith Myers
Volunteer tester
Joined: 29 Apr 01
Posts: 3051
Credit: 203,018,937
RAC: 288,288
United States
Message 1906388 - Posted: 11 Dec 2017, 8:18:21 UTC

I noticed on the SSP too that the numbers are not in very healthy shape for some parameters and are getting better for others. The site is slow as molasses and you can't look at any hosts' tasks now without the connection timing out.

Hope they can clear the cobwebs out of the system on Tuesday and that the project stays up for Monday without further damage.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1906388
Advent42
Joined: 23 Mar 17
Posts: 160
Credit: 2,290,519
RAC: 9,573
Ireland
Message 1906444 - Posted: 11 Dec 2017, 17:42:28 UTC - in response to Message 1906388.  

So far so good...:-)
ID: 1906444
Gary Charpentier
Volunteer tester
Joined: 25 Dec 00
Posts: 21435
Credit: 32,446,494
RAC: 30,620
United States
Message 1906574 - Posted: 12 Dec 2017, 1:50:13 UTC

Just a reminder, when you panic while the outage is on tomorrow you are welcome at https://boinc.berkeley.edu/dev/forum_thread.php?id=8105.
ID: 1906574
Wiggo "Socialist"
Joined: 24 Jan 00
Posts: 12936
Credit: 173,644,846
RAC: 63,020
Australia
Message 1906597 - Posted: 12 Dec 2017, 2:53:11 UTC - in response to Message 1906574.  

Just a reminder, when you panic while the outage is on tomorrow you are welcome at https://boinc.berkeley.edu/dev/forum_thread.php?id=8105.

I've got washing down 2 more rooms booked for tomorrow, but you never know.

Cheers.
ID: 1906597
betreger
Project Donor
Joined: 29 Jun 99
Posts: 6748
Credit: 16,366,448
RAC: 18,311
United States
Message 1906636 - Posted: 12 Dec 2017, 5:38:40 UTC - in response to Message 1906634.  

Yep
ID: 1906636
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 9171
Credit: 118,609,630
RAC: 51,676
Australia
Message 1906639 - Posted: 12 Dec 2017, 7:09:22 UTC

Splitter output varies from good, to great, to mostly insufficient. Luckily the brief periods of health are long enough to offset the longer periods where they're barely putting out work.
The MB Validation/Assimilation backlog has almost cleared; unfortunately the MB & AP deletion backlogs are reaching record levels.
And the Replica is at its 3rd biggest lag for the year, and heading for the second biggest.

And I got home from work to find my main system almost out of work. The Scheduler really is having greater than usual issues. The triple update got me some work, but nowhere near a full cache, with the weekly outage only hours away.
Grant
Darwin NT
ID: 1906639
Stephen "Heretic"
Volunteer tester
Joined: 20 Sep 12
Posts: 2849
Credit: 54,859,611
RAC: 90,688
Australia
Message 1906675 - Posted: 12 Dec 2017, 12:31:03 UTC - in response to Message 1906639.  

Splitter output varies from good, to great, to mostly insufficient. Luckily the brief periods of health are long enough to offset the longer periods where they're barely putting out work.
The MB Validation/Assimilation backlog has almost cleared; unfortunately the MB & AP deletion backlogs are reaching record levels.
And the Replica is at its 3rd biggest lag for the year, and heading for the second biggest.

And I got home from work to find my main system almost out of work. The Scheduler really is having greater than usual issues. The triple update got me some work, but nowhere near a full cache, with the weekly outage only hours away.


. . Yep, of all times to be getting "no tasks available" messages :(

Stephen

:(
ID: 1906675
Richard Haselgrove
Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 11694
Credit: 110,041,336
RAC: 57,775
United Kingdom
Message 1906688 - Posted: 12 Dec 2017, 12:55:37 UTC

Feels a bit like message 1900400:

For some reason the query that fills the "ready to send" queue is running about 10 times slower than it normally does.
Maybe we'll find out after the outrage. Last time, it was too little memory allocated. We might be able to afford some extra memory now :-)
ID: 1906688
betreger
Project Donor
Joined: 29 Jun 99
Posts: 6748
Credit: 16,366,448
RAC: 18,311
United States
Message 1906703 - Posted: 13 Dec 2017, 14:17:43 UTC

Splitter output is not working:
Current result creation rate ** 0/sec 0.0236/sec 0.0563/sec
ID: 1906703
Kissagogo27

Joined: 6 Nov 99
Posts: 111
Credit: 4,242,411
RAC: 4,312
France
Message 1906708 - Posted: 13 Dec 2017, 14:33:45 UTC
Last modified: 13 Dec 2017, 14:34:24 UTC


13-Dec-2017 15:32:23 [SETI@home] Sending scheduler request: To fetch work.
13-Dec-2017 15:32:23 [SETI@home] Requesting new tasks for CPU and AMD/ATI GPU
13-Dec-2017 15:32:45 [SETI@home] Scheduler request failed: Couldn't connect to server
13-Dec-2017 15:32:48 [---] Project communication failed: attempting access to reference site
13-Dec-2017 15:32:49 [---] Internet access OK - project servers may be temporarily down.


Some lag accessing the forum too ;)
ID: 1906708
Mike
Volunteer tester
Joined: 17 Feb 01
Posts: 30787
Credit: 59,194,228
RAC: 25,373
Germany
Message 1906716 - Posted: 13 Dec 2017, 15:01:22 UTC - in response to Message 1906708.  


13-Dec-2017 15:32:23 [SETI@home] Sending scheduler request: To fetch work.
13-Dec-2017 15:32:23 [SETI@home] Requesting new tasks for CPU and AMD/ATI GPU
13-Dec-2017 15:32:45 [SETI@home] Scheduler request failed: Couldn't connect to server
13-Dec-2017 15:32:48 [---] Project communication failed: attempting access to reference site
13-Dec-2017 15:32:49 [---] Internet access OK - project servers may be temporarily down.


Some lag accessing the forum too ;)


It always takes a few hours until the servers catch up with demand.
With each crime and every kindness we birth our future.
ID: 1906716
Dr.Diesel

Joined: 14 May 99
Posts: 27
Credit: 14,393,872
RAC: 162,271
United States
Message 1906722 - Posted: 13 Dec 2017, 15:12:39 UTC - in response to Message 1906716.  


It always takes a few hours until the servers catch up with demand.


Check out the "As of" time under Database/file status. The servers might be up, or at least showing as up, but the project is still down/stuck/messed up.
ID: 1906722
Chris S
Volunteer tester
Joined: 19 Nov 00
Posts: 40355
Credit: 38,155,441
RAC: 54,251
United Kingdom
Message 1906723 - Posted: 13 Dec 2017, 15:12:54 UTC

Quite happy to stay with Milkyway until they sort it out :-))
ID: 1906723
betreger
Project Donor
Joined: 29 Jun 99
Posts: 6748
Credit: 16,366,448
RAC: 18,311
United States
Message 1906724 - Posted: 13 Dec 2017, 15:13:46 UTC - in response to Message 1906716.  
Last modified: 13 Dec 2017, 15:14:27 UTC

It always takes a few hours until the servers catch up with demand.

For that to happen they will have to start splitting new data.
ID: 1906724
Richard Haselgrove
Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 11694
Credit: 110,041,336
RAC: 57,775
United Kingdom
Message 1906725 - Posted: 13 Dec 2017, 15:14:05 UTC
Last modified: 13 Dec 2017, 15:33:58 UTC

If you have big backlogs, try reporting fewer tasks at a time.

13/12/2017 15:05:07 | SETI@home | Reporting 150 completed tasks
13/12/2017 15:06:24 | SETI@home | Scheduler request failed: HTTP internal server error

13/12/2017 15:11:38 | SETI@home | Reporting 64 completed tasks
13/12/2017 15:12:17 | SETI@home | Scheduler request completed
and on another machine,

13/12/2017 15:27:28 | SETI@home | Reporting 191 completed tasks
13/12/2017 15:28:24 | SETI@home | Scheduler request failed: HTTP internal server error
May be a particular problem if you run Raistmer's apps with the huge stderr_txt entries. That file was 2.4 MB.

13/12/2017 15:32:01 | SETI@home | Reporting 64 completed tasks
was only 845 KB, and got through.
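
For anyone who wants to check this against their own event log, here is a rough Python sketch (not part of BOINC; the helper name `pair_reports` is made up for illustration). It pairs each "Reporting N completed tasks" line with the scheduler outcome that follows it, and assumes the "date time | project | message" layout of the log lines quoted above.

```python
import re
from typing import List, Optional, Tuple

# Event log lines copied from the post above (date time | project | message).
LOG = """\
13/12/2017 15:05:07 | SETI@home | Reporting 150 completed tasks
13/12/2017 15:06:24 | SETI@home | Scheduler request failed: HTTP internal server error
13/12/2017 15:11:38 | SETI@home | Reporting 64 completed tasks
13/12/2017 15:12:17 | SETI@home | Scheduler request completed
"""

REPORT = re.compile(r"Reporting (\d+) completed tasks")
OUTCOME = re.compile(r"Scheduler request (failed|completed)")


def pair_reports(log: str) -> List[Tuple[int, Optional[str]]]:
    """Pair each 'Reporting N completed tasks' line with the next outcome line.

    Returns (task_count, outcome) tuples, where outcome is 'failed',
    'completed', or None if no outcome line followed the report.
    """
    results: List[Tuple[int, Optional[str]]] = []
    pending: Optional[int] = None  # task count still waiting for an outcome
    for line in log.splitlines():
        # Keep only the message part after the second " | " separator.
        message = line.split(" | ", 2)[-1]
        report = REPORT.search(message)
        if report:
            if pending is not None:
                # A new report arrived before the previous one was resolved.
                results.append((pending, None))
            pending = int(report.group(1))
            continue
        outcome = OUTCOME.search(message)
        if outcome and pending is not None:
            results.append((pending, outcome.group(1)))
            pending = None
    if pending is not None:
        results.append((pending, None))
    return results


if __name__ == "__main__":
    for count, outcome in pair_reports(LOG):
        print(f"{count:>4} tasks -> {outcome or 'no outcome logged'}")
```

If the larger batches keep failing while the smaller ones get through, that supports reporting fewer tasks per request. As far as I recall, the BOINC client configuration file (cc_config.xml) has a max_tasks_reported option for capping this; check the client configuration documentation before relying on it.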
ID: 1906725
Brent Norman
Volunteer tester

Joined: 1 Dec 99
Posts: 1962
Credit: 130,321,624
RAC: 355,006
Canada
Message 1906729 - Posted: 13 Dec 2017, 15:30:19 UTC

JACKPOT!
ASUS i7 960 SETI@home 13/12/2017 9:26:43 AM Scheduler request completed: got 17 new tasks

Of course my slow computer first, LOL
ID: 1906729
Kissagogo27

Joined: 6 Nov 99
Posts: 111
Credit: 4,242,411
RAC: 4,312
France
Message 1906730 - Posted: 13 Dec 2017, 15:31:13 UTC

Server is back a few minutes later ;)


13-Dec-2017 15:35:26 [SETI@home] Sending scheduler request: To fetch work.
13-Dec-2017 15:35:26 [SETI@home] Requesting new tasks for CPU and AMD/ATI GPU
13-Dec-2017 15:36:06 [SETI@home] Scheduler request completed: got 0 new tasks
13-Dec-2017 15:36:06 [SETI@home] Project has no tasks available


0 AP to grab .. normal ;)
ID: 1906730
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 6426
Credit: 359,190,498
RAC: 153,102
Panama
Message 1906739 - Posted: 13 Dec 2017, 15:48:55 UTC - in response to Message 1906729.  

JACKPOT!
ASUS i7 960 SETI@home 13/12/2017 9:26:43 AM Scheduler request completed: got 17 new tasks

Of course my slow computer first, LOL

NO, NADA, nothing down here. LOL
ID: 1906739


 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.