The Server Issues / Outages Thread - Panic Mode On! (117)

Ian&Steve C.
Joined: 28 Sep 99
Posts: 4261
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2024120 - Posted: 21 Dec 2019, 16:23:07 UTC

I have to assume that porting the beta code over to main has to be a mistake on someone's part and that they didn't really intend to do that.

just seems crazy to make that kind of change when the issues at Beta weren't even resolved.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2024120 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14504
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2024122 - Posted: 21 Dec 2019, 16:24:55 UTC

Sigh. What we really need is a project that allows access to the server logs (like Einstein), but uses standard BOINC code (not like Einstein).
ID: 2024122 · Report as offensive
Profile Siran d'Vel'nahr
Volunteer tester
Joined: 23 May 99
Posts: 7373
Credit: 44,181,323
RAC: 238
United States
Message 2024126 - Posted: 21 Dec 2019, 16:31:06 UTC

Greetings,

I have a total of 28 CPU WUs left to work on (I counted them and verified with BT) and 0 GPU WUs. My task list says I have 533 WUs in progress. This, on this host. What gives?

I have set my main to NNT per Richard's "recipe". I suppose I will do this with my other Linux PC as well. I haven't a clue about archiving the project directory. My app_info.xml file is in the project directory. Makes no sense to "delete" after archiving. :\

Have a great day and Merry Christmas to all! :)

Siran
CAPT Siran d'Vel'nahr XO - L L & P _\\//
USS Vre'kasht NCC-33187
Winders 10 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2024126 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13154
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2024127 - Posted: 21 Dec 2019, 16:32:19 UTC - in response to Message 2024078.  

I had 55 tasks to report on anonymous platform and setting NNT did the trick. Whew.

I set it to report 16, set NNT, and only got:

Serenity

16407 SETI@home 12/21/2019 8:24:10 AM Scheduler request failed: HTTP service unavailable
16408 SETI@home 12/21/2019 8:24:10 AM [sched_op] Deferring communication for 02:10:33
16409 SETI@home 12/21/2019 8:24:10 AM [sched_op] Reason: Scheduler request failed
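
The deferral in the log excerpt above is given as HH:MM:SS. Below is a minimal Python sketch for pulling that back-off out of an event-log line and converting it to seconds. The function name and the regular expression are illustrative assumptions based only on the three lines shown here, not on any documented BOINC log format.

```python
import re
from typing import Optional

# Pattern assumed from the log lines quoted above; not an official format.
DEFER_RE = re.compile(r"Deferring communication for (\d+):(\d{2}):(\d{2})")


def deferral_seconds(line: str) -> Optional[int]:
    """Return the back-off in seconds, or None if the line has no deferral."""
    match = DEFER_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


line = "16408 SETI@home 12/21/2019 8:24:10 AM [sched_op] Deferring communication for 02:10:33"
print(deferral_seconds(line))  # 7833
```

So the client in this excerpt would wait a little over two hours before its next automatic scheduler contact.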

I still have over 2000 to report.

Extremely bad decision to change to server software that doesn't support anonymous platform. That means hundreds of hosts will not be able to run Seti anymore.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2024127 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2024130 - Posted: 21 Dec 2019, 16:38:11 UTC - in response to Message 2024119.  
Last modified: 21 Dec 2019, 16:43:58 UTC

From looking at the SSP it's obvious most people are receiving and returning work. I'm also receiving and returning work after renaming the app_info.xml so the Host runs as Stock.
Everything on Main is now just the way BETA was working when I couldn't get the BETA Server to work under Anonymous platform on numerous machines. Eric may wish to review all those PMs I sent him about Anonymous platform not working at BETA...

To be honest you're not meant to run anonymous at Beta. Unless you are testing new apps.
Most of the Mac Apps at BETA are from me, and a number of them are on Main. The latest ones are from here:
https://setiathome.berkeley.edu/forum_thread.php?id=84805
https://setiathome.berkeley.edu/forum_thread.php?id=84927
Unfortunately the CUDA Special App 0.98b1 CUDA90 wasn't accepted, it would be nice to have the CUDA Special App on Main about now.

BTW, Eric said he looked the Server code over a couple of times and said he couldn't find any reason why Anonymous platform wouldn't work. I wouldn't hold my breath if I were waiting for it to be fixed.
ID: 2024130 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14504
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2024133 - Posted: 21 Dec 2019, 16:41:40 UTC - in response to Message 2024127.  

Extremely bad decision to change to server software that doesn't support anonymous platform. That means hundreds of hosts will not be able to run Seti anymore.
Agreed, but we haven't got to the bottom of how/why it was changed yet. I'd be 99.9% certain it wasn't deliberate: I keep a close eye on code changes, and if I'd seen a sniff of a suggestion like that, I'd have pounced on it. Nada.

What is certainly true is that when David is fiddling with the server code, he sometimes forgets to update the anonymous platform path as well. I'm looking for code changes that might suffer that defect, and testing one as we speak. If that fails, I'm going to build an app_info for one of my other projects which is also reporting server version 715, to see if that fails too (we still don't know if this is a general BOINC coding problem, or something specific to the Berkeley-BOINC-SETI experimental environment).
ID: 2024133 · Report as offensive
Sleepy
Volunteer tester
Joined: 21 May 99
Posts: 219
Credit: 98,947,784
RAC: 28,360
Italy
Message 2024137 - Posted: 21 Dec 2019, 16:52:05 UTC
Last modified: 21 Dec 2019, 16:52:31 UTC

I made some tests on a spare host.
It was on anonymous as well, therefore I renamed all the anonymous related xml files, then started it.

Nothing particular happened: again "Project has no tasks".

Then I said: let's reset it. Still no joy (maybe US hosts are asking for tasks).
In the process, all anonymous configuration files, executable, tasks, everything went into the dustbin.
No problem for me, I can reconfigure that host in minutes and the tasks were few. Not much harm inflicted on the database.

But I stress the warning already given by Richard: save in some safe place your current Seti directory before resetting!!
In case the habit has made you forget this little detail.

Do not reset from stock configuration while you have your anonymous configuration files only in your active Seti directory. It will vanish, together with the effort to optimise everything (probably there is no hope for the tasks, though. I do not remember and I am not going to try again).
ID: 2024137 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 813
Credit: 2,361,516
RAC: 22
United States
Message 2024138 - Posted: 21 Dec 2019, 16:52:10 UTC

I totally agree about WHEN you introduce new code to the main server... not on a Friday, and not during a big holiday season. This is madness that could have been prevented.

Here are my silly questions:
- Could not allowing anonymous platform help with the issue of bad machines validating each other and polluting the db??
- I would think without beta running that main should run better, not worse. This makes no sense to me. Without beta's load, shouldn't main run better? Why is main dependent on beta??
ID: 2024138 · Report as offensive
Profile Mr. Kevvy Crowdfunding Project Donor*Special Project $250 donor
Volunteer moderator
Volunteer tester
Joined: 15 May 99
Posts: 3562
Credit: 1,114,826,392
RAC: 3,319
Canada
Message 2024139 - Posted: 21 Dec 2019, 16:53:06 UTC - in response to Message 2024138.  
Last modified: 21 Dec 2019, 17:10:06 UTC

- Could not allowing anonymous platform help with the issue of bad machines validating each other and polluting the db??


Nope, they are all running the "stock" Windows SETI@Home client.

This actually makes it worse: they can now get even more work and trash it by cross-validating, work that would otherwise have been completed accurately by a locked-out host!

Ah well, I could always find something else to do, perhaps start early with my Christmas visits to people less fortunate than I am.


I think some XMas shopping and a good lunch with the missus will be a good day and then I can return and find this is resolved. Enjoy. :^)
ID: 2024139 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14504
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2024146 - Posted: 21 Dec 2019, 17:04:47 UTC

Not making much progress here. My test to see if one of the server code changes had been responsible didn't reveal anything new.

Setting up anonymous platform on another project running server 715 gave me a stern red message:

This project doesn't support computers of type anonymous

Which in a way is reassuring: if it was deliberate, we'd know about it by now.

Just gone 9am in Berkeley: I think I'll start drafting an email to Eric.
ID: 2024146 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2024152 - Posted: 21 Dec 2019, 17:36:36 UTC - in response to Message 2024146.  
Last modified: 21 Dec 2019, 18:00:08 UTC

Setting up anonymous platform on another project running server 715 gave me a stern red message:

This project doesn't support computers of type anonymous

Which in a way is reassuring: if it was deliberate, we'd know about it by now.

Sad, very sad. I think it's time to power down our babies; at least we'll save some electric power.

And the timing was perfect: on a long Christmas holiday and in the middle of a fundraiser.

<edit> Did anyone notice that our badges disappeared too?
ID: 2024152 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14504
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2024154 - Posted: 21 Dec 2019, 17:38:01 UTC

I see Eric has just posted in the Flakey AMD/ATI GPU thread, so he's around. I've sent him an email about Anonymous Platform, with details of the host which got stock work after I removed app_info, and timings for a server log search, hopefully.
ID: 2024154 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2024155 - Posted: 21 Dec 2019, 18:09:17 UTC

I'm able to report 256 tasks at a time if I set NNT. If I'm requesting new work at the same time, I get "Scheduler request failed." This is the case for all 3 of my PCs. The NNT is working every time now; it was not a few hours ago.

Of course, I am just about out of tasks to report at this point. The surest way to get this fixed is to go download a bunch of tasks for Einstein...
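
For anyone who would rather script the "set NNT, then report" workaround than click through the Manager, here is a minimal Python sketch that only builds the `boinccmd` command lines (`nomorework` to set No New Tasks, `update` to contact the scheduler). The project URL and the function name are assumptions for illustration; use the URL your own client lists for the project.

```python
from typing import List

# Assumed master URL for the project; check the URL your own client shows.
PROJECT_URL = "http://setiathome.berkeley.edu/"


def report_only_commands(project_url: str = PROJECT_URL) -> List[List[str]]:
    """Build boinccmd invocations: set No New Tasks, then contact the scheduler."""
    return [
        ["boinccmd", "--project", project_url, "nomorework"],  # set NNT
        ["boinccmd", "--project", project_url, "update"],      # report completed tasks
    ]


for argv in report_only_commands():
    print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` on a machine where `boinccmd` is on the path; depending on the setup, host and password options may also be needed. `allowmorework` reverses the NNT setting later.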
ID: 2024155 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2024157 - Posted: 21 Dec 2019, 18:17:06 UTC - in response to Message 2024155.  
Last modified: 21 Dec 2019, 18:22:39 UTC

I'm able to report 256 tasks at a time if I set NNT. If I'm requesting new work at the same time, I get "Scheduler request failed." This is the case for all 3 of my PCs. The NNT is working every time now; it was not a few hours ago.

I confirm this. Nice finding.

If you do not set NNT, you receive: Sat 21 Dec 2019 01:16:04 PM EST | SETI@home | Scheduler request failed: HTTP internal server error
ID: 2024157 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14504
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2024158 - Posted: 21 Dec 2019, 18:26:23 UTC - in response to Message 2024105.  

I removed app_info from an empty machine, reset the project, and allowed new work. Got new tasks at the first attempt - some cuda50 for an ageing GTX 670, requesting NV tasks only.
But since then, the downgraded host hasn't got any more work. It's finished the last task from that batch, and is just waiting to report it.

I see from the SSP that RTS has reached high water mark and inhibited the splitters. So there's not much work going out to the mass stock crew, either.
ID: 2024158 · Report as offensive
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 2024164 - Posted: 21 Dec 2019, 18:51:15 UTC - in response to Message 2024161.  

Ah well, I went down into my wine cellar, and now I feel much better.
Cheers, and bottoms up.
:-)

Wine does solve many problems. One stops caring, no problem. :)
ID: 2024164 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2024165 - Posted: 21 Dec 2019, 19:03:07 UTC

I finally threw up my cpus and activated some long-dormant backup projects (Rosetta@Home and Mind Modeling).

The resource allocation is still 1,000 for SETI@home and 1 for Rosetta, but at least something is processing.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2024165 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2024166 - Posted: 21 Dec 2019, 19:03:12 UTC - in response to Message 2024164.  

Wine does solve many problems.

Totally agree!
ID: 2024166 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 17724
Credit: 40,757,560
RAC: 67
United Kingdom
Message 2024168 - Posted: 21 Dec 2019, 19:06:42 UTC - in response to Message 2024110.  

... out of desperation "reset" the project, now the msg is "No tasks available" but it downloaded all the *.png files successfully.
If that's an anonymous platform host, my recipe was

  • report all completed work
  • set NNT
  • archive (zip/7z) the entire remaining contents of the SETI project folder, so you can put it back when this is over
  • delete app_info.xml
  • restart the BOINC client
  • reset the project
  • allow new work


Well that worked.

For about 10 mins.
Got a few tasks, can't see them on a/c task list, crunched, returned ok. But all I've had since is "No tasks available"
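
The archive step in the recipe quoted above can also be scripted. Here is a minimal Python sketch using the standard library's `shutil.make_archive`. The function name, the `_backup` suffix, and the folder names are hypothetical; the demo runs on a throwaway folder, so point `project_dir` at your real SETI project folder (and `backup_dir` somewhere outside it) before relying on it.

```python
import shutil
import tempfile
from pathlib import Path


def archive_project_dir(project_dir: Path, backup_dir: Path) -> Path:
    """Zip the whole project folder into backup_dir and return the archive path."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    base = backup_dir / (project_dir.name + "_backup")
    return Path(shutil.make_archive(str(base), "zip", root_dir=str(project_dir)))


# Demo on a temporary folder standing in for the real project directory.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "setiathome.berkeley.edu"
    project.mkdir()
    (project / "app_info.xml").write_text("<app_info/>")
    archive = archive_project_dir(project, Path(tmp) / "backups")
    print(archive.name)
```

Only after the archive exists and has been checked would the later steps (delete app_info.xml, restart the client, reset the project) be safe to run, since a reset empties the project folder.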
ID: 2024168 · Report as offensive
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2024172 - Posted: 21 Dec 2019, 19:12:28 UTC - in response to Message 2024164.  

Ah well, I went down into my wine cellar, and now I feel much better.
Cheers, and bottoms up.
:-)

Wine does solve many problems. One stops caring, no problem. :)


+1
A proud member of the OFA (Old Farts Association).
ID: 2024172 · Report as offensive


 