Panic Mode On (111) Server Problems?

kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1927420 - Posted: 31 Mar 2018, 1:55:09 UTC

Something's afoot right now.
Got about 10 consecutive 'Project has no tasks available' work request responses.
Loading my 'all tasks' page taking forever.
Hope it's a transient thingy and not a weekend problem.

Meowsigh.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1927420
Keith Myers · Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1927421 - Posted: 31 Mar 2018, 1:55:57 UTC - in response to Message 1927391.  

I've posted my work_fetch_debug outputs for you in the past.
Well, I popped your ID and 'work_fetch_debug' into an advanced search, and checked the last 12 months: 11 mentions (including tonight), but no actual logs. It's too late on this side of the pond to start work on it now, but it's still a useful tool, worth trying sometime.

I know I have in the past. It may have been more than 12 months ago. I am not going to bother searching through my posts. Easier to just wait for the usual work fetch problem and then invoke the debug so I can get new output.

I use work_fetch_debug all the time as part of my 'kick the servers' tricks. Sometimes it works, sometimes it has no effect. As Stephen mentioned, what works for one person doesn't necessarily work for another.

I also know that if I really wanted to avoid all these issues I could just revert to 6.10.58 or 6.10.60, as those revisions don't EVER have issues, as Wiggo will vehemently attest. I've been at that point multiple times and have backed away at the last minute. I know that I would have to revert to controlling with app_info instead of app_config, but that is easily done. Just have to be careful in editing.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1927421
Brent Norman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1927430 - Posted: 31 Mar 2018, 2:38:51 UTC

On the upside, with little work going out the RTS is building up nicely :)
Hopefully when the replica lag gets to 0 the scheduler will kick back in ... hopefully.
ID: 1927430
Keith Myers · Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1927433 - Posted: 31 Mar 2018, 2:47:29 UTC

If the caches go dry, it's going to get cold unless I turn off the A/C. All machines are down at least 50% now. The last time I got any work on any machine was just shy of 2 hours ago.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1927433
juan BFP · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1927440 - Posted: 31 Mar 2018, 3:23:20 UTC

Fri 30 Mar 2018 10:21:00 PM EST | SETI@home | Project has no tasks available


Again... Every weekend is the same. :(

Caches down to less than 300 WU.
ID: 1927440
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1927442 - Posted: 31 Mar 2018, 3:40:23 UTC

I was able to contact Eric, and made him aware of the servers not sending out work.
Meow.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1927442
Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 1927444 - Posted: 31 Mar 2018, 4:23:01 UTC

Is it just me or is there a lag at the moment?
ID: 1927444
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1927445 - Posted: 31 Mar 2018, 4:33:27 UTC - in response to Message 1927444.  

Eric just rebooted the servers.
It seems to be settling down now.
Still not getting any work though.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1927445
Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 1927446 - Posted: 31 Mar 2018, 4:39:17 UTC

Thanks for the heads up :)
ID: 1927446
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1927449 - Posted: 31 Mar 2018, 5:06:33 UTC

No work for a while, then glitches in all the server graphs, then no response from the Scheduler for a couple of requests. Now the Scheduler is responding (after a minute or more), and the response is: "Project has no tasks available".
Grant
Darwin NT
ID: 1927449
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1927450 - Posted: 31 Mar 2018, 5:14:57 UTC - in response to Message 1927420.  

Something's afoot right now.
Got about 10 consecutive 'Project has no tasks available' work request responses.
Loading my 'all tasks' page taking forever.
Hope it's a transient thingy and not a weekend problem.

Meowsigh.


. . I am getting a mixture of "cannot contact server, project may be down" and "project has no work available". Either way my fastest machine is now devoid of work and the slowest machine is devouring its way through what remains of its allotment. The middle machine hit the jackpot and actually refilled its cache on one particularly well favoured request, but sadly it has been a one-of-a-kind event (for the last few hours).

Stephen

:(
ID: 1927450
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1927452 - Posted: 31 Mar 2018, 5:23:37 UTC - in response to Message 1927442.  

I was able to contact Eric, and made him aware of the servers not sending out work.
Meow.


. . Poor Eric, his dedication is impressive but a big imposition for him. It always irks me when some user questions the commitment of Eric and his co-workers. Their main priority is to keep the system running (and deal with any upgrades - like Parkes :) - not to play Dear Dorothy. It would always be nice to know what is happening, but nicer to have issues resolved.

Stephen

:)
ID: 1927452
Jimbocous · Project Donor
Volunteer tester
Joined: 1 Apr 13
Posts: 1856
Credit: 268,616,081
RAC: 1,349
United States
Message 1927455 - Posted: 31 Mar 2018, 5:55:14 UTC

Definitely got some trouble going on ...
ID: 1927455
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9956
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1927459 - Posted: 31 Mar 2018, 7:29:50 UTC - in response to Message 1927455.  

Definitely got some trouble going on ...


Eric has posted:

https://setiathome.berkeley.edu/forum_thread.php?id=82775#1927456
ID: 1927459
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1927461 - Posted: 31 Mar 2018, 7:38:07 UTC

My Android phone now gets "Scheduler request failed: Failure when receiving data from the peer".

So probably not going to be better after the DB restart.
ID: 1927461
petri33
Volunteer tester

Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1927471 - Posted: 31 Mar 2018, 9:24:48 UTC
Last modified: 31 Mar 2018, 9:25:33 UTC

Sat 31 Mar 2018 12:23:57 PM EEST | SETI@home | Project has no tasks available
Results ready to send 0 3,301 1,203,726 23m
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1927471
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1927472 - Posted: 31 Mar 2018, 9:35:00 UTC - in response to Message 1927452.  
Last modified: 31 Mar 2018, 9:47:08 UTC

I was able to contact Eric, and made him aware of the servers not sending out work.
Meow.


. . Poor Eric, his dedication is impressive but a big imposition for him. It always irks me when some user questions the commitment of Eric and his co-workers. Their main priority is to keep the system running (and deal with any upgrades - like Parkes :) - not to play Dear Dorothy. It would always be nice to know what is happening, but nicer to have issues resolved.

Stephen

:)

Yes, Eric actually thanked me last night for letting him know that things were amiss.
He IS a dedicated man.
And it also irks me when some question how much 'the staff' cares about the project.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1927472
kittyman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1927473 - Posted: 31 Mar 2018, 9:36:51 UTC - in response to Message 1927459.  

Definitely got some trouble going on ...


Eric has posted:

https://setiathome.berkeley.edu/forum_thread.php?id=82775#1927456

Reading that post, and given the way things are working poorly after Eric's efforts last night, looks like we could be facing being down for most of the day today.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1927473
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13835
Credit: 208,696,464
RAC: 304
Australia
Message 1927474 - Posted: 31 Mar 2018, 9:43:56 UTC - in response to Message 1927473.  

Eric has posted:

https://setiathome.berkeley.edu/forum_thread.php?id=82775#1927456

Reading that post, and given the way things are working poorly after Eric's efforts last night, looks like we could be facing being down for most of the day today.

Yep.
A reboot didn't help, so it looks like it's going to require some significant housekeeping to clean up the mess and get things working again.
Grant
Darwin NT
ID: 1927474
Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 1927475 - Posted: 31 Mar 2018, 9:44:06 UTC

Thankfully, with my slow ass laptop I could go until next month with no new work and still have work. Hope you big crunchers get what you need!!
ID: 1927475
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.