Panic Mode On (113) Server Problems?

Message boards : Number crunching : Panic Mode On (113) Server Problems?


juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1959754 - Posted: 11 Oct 2018, 16:15:29 UTC - in response to Message 1959752.  

Bah, I crunch just a few hours every evening, and nothing is available now.

Every attempt to get tasks just ends up with:

SETI@home 2018-10-11 17:43:21 Sending scheduler request: To fetch work.
SETI@home 2018-10-11 17:43:21 Requesting new tasks for NVIDIA GPU
SETI@home 2018-10-11 17:43:27 Scheduler request completed: got 0 new tasks
SETI@home 2018-10-11 17:43:27 Project has no tasks available

Something is not right........again.....or still.....

And Beta totally down.


The same here.

But, RTS buffer shows: Results ready to send 590,336

Yes, but that was 40 minutes (and counting) ago..... (As of*)
Agreed, but a 590 k buffer should be able to cover requests from all of the SETI hosts for hours (they consume about 100-150 k WUs per hour)
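Back-of-envelope, the quoted figures can be checked directly (my arithmetic only, not official server statistics):

```python
# Rough check of the claim above: how long a ~590 k ready-to-send buffer
# lasts at the quoted drain rates of 100-150 k WUs per hour.
buffer_wus = 590_336            # "Results ready to send" figure quoted above
for drain_per_hour in (100_000, 150_000):
    hours = buffer_wus / drain_per_hour
    print(f"At {drain_per_hour:,} WU/hr the buffer lasts ~{hours:.1f} hours")
```

So even at the high end of the quoted drain rate, the buffer should cover roughly four hours of requests.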
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13159
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1959756 - Posted: 11 Oct 2018, 16:27:12 UTC

Falling . . . falling . . . . . falling
The problems with the servers were never fixed before the outage, and the servers still are not right. Too many tasks not validated. Impossible to look at any of your hosts because the task list never refreshes; you only see the spinner.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1959757 - Posted: 11 Oct 2018, 16:35:55 UTC - in response to Message 1959756.  
Last modified: 11 Oct 2018, 16:38:05 UTC

And it is back working again.
Sending tasks, that is.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14531
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1959759 - Posted: 11 Oct 2018, 16:37:10 UTC - in response to Message 1959756.  

Falling . . . falling . . . . . falling
The problems with the servers were never fixed before the outage, and the servers still are not right. Too many tasks not validated. Impossible to look at any of your hosts because the task list never refreshes; you only see the spinner.
You mean the human being - probably singular - driving maintenance didn't do it. Not enough donations means not enough staff.
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1959763 - Posted: 11 Oct 2018, 16:54:58 UTC

Yesterday about this time of day there was an outage as well.
Possibly something to do with the heavy assimilation load plus an additional daily script?
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 21172
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1959780 - Posted: 11 Oct 2018, 18:37:14 UTC

It's not dead, it's just slumbering in a dark corner....
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
Profile KWSN THE Holy Hand Grenade!
Volunteer tester
Joined: 20 Dec 05
Posts: 3187
Credit: 57,163,290
RAC: 0
United States
Message 1959786 - Posted: 11 Oct 2018, 18:59:08 UTC
Last modified: 11 Oct 2018, 19:04:00 UTC

For everyone bellyaching about the "No tasks available" message when there are tasks shown as available:

That full number (500 k+) of WUs is not in a table in memory - the table in memory that the scheduler uses to fill requests for work is much smaller (around 400 tasks, last time I checked...) and refills once a second (at least, that's how it was explained to me the last time someone described it...). When you get the "no tasks available" message, it's this 400-WU buffer that has run out... and with the long (5 minutes plus a few seconds) wait between honored requests for work, you may run into it repeatedly if demand for WUs is high...

In short, SETI is showing its NORMAL behavior when there is high demand for WUs...
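The feeder behaviour described above can be sketched as a toy simulation (illustrative only: the 400-slot size and once-per-second refill come from the post, the request pattern is invented, and this is not the real BOINC scheduler code):

```python
import random

# Toy model: the scheduler can only hand out tasks from a small in-memory
# table (~400 slots, per the post above), refilled once per second from the
# huge on-disk ready-to-send queue.
FEEDER_SLOTS = 400        # tasks held in the scheduler's in-memory table

def simulate(ticks, mean_requests_per_tick, tasks_per_request=4, seed=1):
    """Count requests that hit an empty feeder over `ticks` one-second ticks."""
    rng = random.Random(seed)
    available = FEEDER_SLOTS
    refused = 0
    for _ in range(ticks):
        # Random number of host requests this second, averaging the mean.
        for _ in range(rng.randint(0, 2 * mean_requests_per_tick)):
            if available >= tasks_per_request:
                available -= tasks_per_request
            else:
                refused += 1   # host sees "Project has no tasks available"
        available = FEEDER_SLOTS   # feeder refills the table once a second
    return refused

# Low demand: the small buffer comfortably covers every request.
# High demand: many hosts get refused even though the on-disk
# ready-to-send queue (500 k+) is nowhere near empty.
print("low demand refusals: ", simulate(60, mean_requests_per_tick=20))
print("high demand refusals:", simulate(60, mean_requests_per_tick=200))
```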

Hello, from Albany, CA!...
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1959800 - Posted: 11 Oct 2018, 21:05:45 UTC

No new work for 30+ mins, panic?
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13159
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1959801 - Posted: 11 Oct 2018, 21:07:06 UTC

The last work requests from any of my hosts that netted anything other than 0 were an hour ago. Caches falling fast.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1959805 - Posted: 11 Oct 2018, 21:40:39 UTC - in response to Message 1959801.  

The last work requests from any of my hosts that netted anything other than 0 were an hour ago. Caches falling fast.


All is working fine from here. Receiving new work as normal.
Profile lunkerlander
Joined: 23 Jul 18
Posts: 82
Credit: 1,353,232
RAC: 4
United States
Message 1959810 - Posted: 11 Oct 2018, 22:36:43 UTC - in response to Message 1959782.  
Last modified: 11 Oct 2018, 22:41:17 UTC

Ah well, it's not as if they're analyzing the tasks we crunch, so there's millions and millions of finished tasks just sitting there for years and years.....
ET may have been found years ago, but I seriously doubt we'll ever know.
Nebula too, like NTPCkr will likely just disappear from the discussions.

So, it's really only the credit hounds that are in dire need of an endless flow of new tasks.
Not for scientific reasons, only for credit reasons :-)


I've thought about this too! With the amount of time, effort, and money that has been put into this project for so long, the data that have been created haven't been put to the best use. I think it would benefit the project more scientifically to invest resources into Nebula than into buying more GPUs to crunch more work units.

How many people are working on Nebula? Is it just one person in his spare time? Perhaps a team of people could collaborate and troubleshoot problems with Nebula. You guys have been a great resource for everyone here in the forums when it comes to helping get computer and BOINC issues solved. If even 1% of users here have some programming knowledge, they could offer support. If others would donate half of what a new GPU would cost towards finishing the Nebula back end, it would do much more for the project than just producing new data from workunits that aren't being analysed.

I'm not trying to sound negative. I'm very interested in this project and would like to see it succeed. I just think a lot more emphasis needs to be given to nebula, perhaps even via discussions here on this forum.
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13159
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1959813 - Posted: 11 Oct 2018, 22:57:19 UTC - in response to Message 1959810.  

Unfortunately, we as contributors don't have an explicit say in where our donations are used. We don't even know whether specific fundraising programs, like last year's hardware purchase donation event, actually get used to buy hardware or not. The project is not very transparent in that regard. Occasionally we get little tidbits of information about past fundraising. I just hope that my donation last year actually bought hardware for Parkes as was proposed. Now I suppose we should start a hardware purchase fundraiser for the servers, since they are the ones constantly having issues this year.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1959835 - Posted: 12 Oct 2018, 4:38:20 UTC

I don't think we have enough files loaded to get us through the weekend. The SETI team is great about giving us more on weekends, but maybe they should load the data on Friday - then, hopefully, the system will behave and they can have a weekend without worrying about it.
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13396
Credit: 208,696,464
RAC: 304
Australia
Message 1959840 - Posted: 12 Oct 2018, 6:20:03 UTC - in response to Message 1959786.  

In short, SETI is showing the NORMAL behavior for itself when there is high demand for WU's...

No, it's not.
Generally after an outage, if there is work in the Ready-to-send buffer, a request will result in some work, even after extended outages. It's actually very unusual not to get work when there are WUs ready to go. The days of getting multiple "Project has no tasks available" messages are in the distant past.
Given the present low demand for work due to the long GBT WU runtimes & lack of Arecibo work, there is some other issue causing the Scheduler to not issue work, or to not have work to issue.
It is not normal behaviour.
Grant
Darwin NT
Profile Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 711
Credit: 8,032,827
RAC: 62
France
Message 1959870 - Posted: 12 Oct 2018, 11:41:01 UTC

Hardware changes have increased the demand for tasks...

The server still assumes the old profile of a "normal user with a normal crunching machine".

New hardware with old rules won't work!

It's time for a major upgrade! Nah! That's all.

But have the minds changed too?

We have the money... some, or more, but who makes the decisions?

The same old-minded ones?

It's time to give the anthill a kick..

Sorry for any inconvenience this may cause, but this must be done! Not to evolve is to disappear, indeed.
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1959873 - Posted: 12 Oct 2018, 12:15:23 UTC

The server had been working fine since the last change allowing Arecibo VLARs on GPUs. Recently some problem has developed that causes the server to lose contact with the RTS buffer and a few other items, obvious when the As Of* times start rising. When that happens tasks are not sent to hosts, and that is NOT normal. If you look right now the As Of* times are a little off; that is not normal either, but it isn't causing problems right now. It will probably cause problems later though; something just isn't right.

The other day the splitters were running when contact was lost. They kept running until contact was restored, and had split enough that the total was around 1.2 million tasks by then. When you see those impossible Creation Rate numbers, it's because when contact is restored and there is a large difference, the numbers have to be large enough to make up for the difference. That's how you can get a 156/sec number - it wasn't really 156/sec, but it had to make up for the difference between 600 k and 1,200 k. This is NOT normal.

Hopefully the problem will be identified and solved soon. Looking at the SSP right now, I see As Of* numbers off. I'm expecting the thing to croak at any time.
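The "impossible creation rate" arithmetic above can be checked directly. A minimal sketch, where the ~64-minute gap is inferred from the figures in the post (600 k extra tasks at the reported 156/sec), not taken from server logs:

```python
# Sketch of how a stats gap produces an "impossible" creation rate: after
# losing contact, the next SSP update averages the whole gap into one number.
def reported_rate(old_total, new_total, gap_seconds):
    return (new_total - old_total) / gap_seconds

# RTS count jumped from ~600 k to ~1.2 million while contact was lost:
rate = reported_rate(600_000, 1_200_000, 64 * 60)
print(f"reported creation rate: {rate:.2f}/sec")
```

The splitters never ran that fast; the averaging over the gap makes the single reported figure spike.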
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14531
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1959875 - Posted: 12 Oct 2018, 12:29:36 UTC - in response to Message 1959873.  

The SSP isn't updated continuously - there's normally a 10 minute or 20 minute interval between updates.

Also, not every value is updated simultaneously. It's normal to see one or more stale As of* values between the main page updates. I wouldn't take any notice of an As of* value under 20 minutes - and at the moment, the largest is 6 minutes.
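That rule of thumb can be written down as a trivial check (the 20-minute threshold is just the update interval quoted above; the function name is mine):

```python
# With SSP updates every 10-20 minutes and fields refreshing at different
# moments, an "As of" age only signals trouble once it clearly exceeds a
# full update interval.
def as_of_is_worrying(age_minutes, update_interval_minutes=20):
    return age_minutes > update_interval_minutes

print(as_of_is_worrying(6))    # largest value at the time of the post
print(as_of_is_worrying(45))   # stuck well past an update cycle
```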
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1959876 - Posted: 12 Oct 2018, 12:33:17 UTC - in response to Message 1959875.  

The thing will croak later today, probably just after everyone leaves for the weekend...watch.
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1959885 - Posted: 12 Oct 2018, 12:49:12 UTC - in response to Message 1959876.  

The thing will croak later today, probably just after everyone leaves for the weekend...watch.

Murphy's law could explain that!
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1959904 - Posted: 12 Oct 2018, 13:41:17 UTC

Something is definitely wrong with the system still. I'd like to think it's related to handling the last mess, and thus won't happen once things are back to normal, but it's probably something else.

It really bothers me that the throttle on splitting isn't working. If the machine is going to take a break from handing out WUs, then the splitters should stop. It really shouldn't be taking a 30-minute break from handing out WUs anyway.

In good news, Results returned and awaiting validation is now a healthier 4 million instead of 10 million. It could probably still go lower. Results waiting for db purge is at 5 million, and hopefully will come down once the other numbers are lower and the system doesn't have to spend time cleaning up the backlog from the last mess.



©2022 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.