Message boards :
Number crunching :
Panic Mode On (113) Server Problems?
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14667 Credit: 200,643,578 RAC: 874 |
"Falling . . . falling . . . falling" You mean the human being - probably singular - doing the maintenance didn't do it. Not enough donations means not enough staff. |
Brent Norman Send message Joined: 1 Dec 99 Posts: 2786 Credit: 685,657,289 RAC: 835 |
Yesterday about this time of day there was an outage as well. Possibly something to do with the heavy assimilation load plus an additional daily script? |
rob smith Send message Joined: 7 Mar 03 Posts: 22401 Credit: 416,307,556 RAC: 380 |
It's not dead, it's just slumbering in a dark corner.... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
KWSN THE Holy Hand Grenade! Send message Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
For everyone bellyacheing about the "No tasks available" tag when there are tasks shown as available: That full number (500k+) of WU's are not in a table in memory - the table in memory that the scheduler uses to fill requests for work is much smaller, (like 400 tasks, last time I checked...) and refills once a second. (at least, the last time someone explained this to me it was "refills once a second"…) When you get the "no tasks available" message, it's this 400 WU buffer that is out... and with the long (5 minute plus a few seconds) wait between honored requests for work, you may run into it repeatedly if demand for WU's is high... In short, SETI is showing the NORMAL behavior for itself when there is high demand for WU's... . Hello, from Albany, CA!... |
JohnDK Send message Joined: 28 May 00 Posts: 1222 Credit: 451,243,443 RAC: 1,127 |
No new work for 30+ mins, panic? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
The last work request from any of my hosts that netted anything other than 0 was an hour ago. Caches falling fast. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"The last work request from any of my hosts that netted anything other than 0 was an hour ago. Caches falling fast." All is working fine from here. Receiving new work as normal. |
lunkerlander Send message Joined: 23 Jul 18 Posts: 82 Credit: 1,353,232 RAC: 4 |
"Ah well, it's not as if they're analyzing the tasks we crunch, so there's millions and millions of finished tasks just sitting there for years and years....." I've thought about this too! With the amount of time, effort, and money that has been put into this project for so long, the data that have been created haven't been put to the best use. I think it would benefit the project scientifically more to invest resources into Nebula than to buy more GPUs to crunch more work units. How many people are working on Nebula? Is it just one person in his spare time? Perhaps a team of people could collaborate and troubleshoot problems with Nebula. You guys have been a great resource for everyone here in the forums when it comes to helping get computer and BOINC issues solved. If even 1% of us users had some programming knowledge, they could offer support. If others would donate half of what a new GPU would cost towards finishing the Nebula backend, it would do much more for the project than just producing new data from workunits that aren't being analysed. I'm not trying to sound negative. I'm very interested in this project and would like to see it succeed. I just think a lot more emphasis needs to be given to Nebula, perhaps even via discussions here on this forum. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Unfortunately, we as contributors don't have an explicit say in where our donations are used. We don't even know whether specific fundraising programs like last year's hardware purchase donation event actually got used to buy hardware or not. The project is not very transparent in that regard. Occasionally we get little tidbits of information about past fundraising. I just hope that my donation last year actually bought hardware for Parkes as was proposed. Now I suppose we should start a hardware purchase fundraiser for the servers, since they are the ones constantly having issues this year. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Unixchick Send message Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
I don't think we have enough files loaded to get us through the weekend. While the SETI team is great about giving us more on the weekends, maybe they should load the data on Friday; hopefully the system will behave, and then they can have a weekend without worrying about it. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13824 Credit: 208,696,464 RAC: 304 |
"In short, SETI is showing the NORMAL behavior for itself when there is high demand for WU's..." No, it's not. Generally after an outage, if there is work in the Ready-to-send buffer, a request will result in some work, even after extended outages. It's actually very unusual to not get work when there are WUs ready to go. The days of getting multiple "Project has no tasks available" messages are in the distant past. Given the present low demand for work, due to the long GBT WU runtimes and the lack of Arecibo work, there is some other issue causing the Scheduler to not issue work, or to not have work to issue. It is not normal behaviour. Grant Darwin NT |
Kissagogo27 Send message Joined: 6 Nov 99 Posts: 716 Credit: 8,032,827 RAC: 62 |
Hardware changes have made for more demand for tasks... The server still uses the old assumptions about a "normal user with a normal crunching machine". New hardware with old rules won't work! It's time for a major upgrade! But have the minds changed too? We have the money - some, or more - but who makes the decisions? The same old-minded ones? It's time to give the anthill a kick. Sorry for any inconvenience this may cause, but this must be done! Not to evolve is to disappear, indeed. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
The Server had been working fine since the last change allowing Arecibo VLARs on GPUs. Recently some problem has developed that causes the Server to lose contact with the RTS buffer and a few other items; when that happens, tasks are not sent to Hosts and the As Of* times start increasing. This is NOT Normal. If you look right now, the As Of* times are a little off; that isn't causing problems at the moment, but it probably will later - something just isn't right. The other day the Splitters were running when contact was lost; they kept running until contact was restored, and had split enough that the total was around 1.2 million tasks when contact came back. When you see those impossible Creation Rate numbers, it's because once contact is restored and there is a large difference, the numbers have to be large enough to make up for it. That's how you can get a 156/sec number. It wasn't really 156/sec; it had to make up for the difference between 600k and 1,200k. This is NOT Normal. Hopefully the problem will be identified and solved soon. Looking at the SSP right now, I see the As Of numbers are off. I'm expecting the thing to croak at any time. |
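The "impossible" 156/sec figure described above is just catch-up arithmetic: a stale counter jumps by the whole backlog in one update window, and the page divides that jump by the window length. A minimal sketch of that arithmetic, assuming a hypothetical update window chosen purely so the numbers from the post (600k to 1.2M) reproduce the 156/sec figure:

```python
def displayed_rate(old_count, new_count, seconds_between_updates):
    """Apparent creation rate when a whole backlog lands in one update window."""
    return (new_count - old_count) / seconds_between_updates

# 600k tasks of backlog reported at once over an assumed ~64-minute window:
rate = displayed_rate(600_000, 1_200_000, 64 * 60)
print(round(rate, 2))  # 156.25 tasks/sec - not a real splitting rate
```

Nothing is actually splitting at 156/sec; the number only means the counter moved 600k between two snapshots.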
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14667 Credit: 200,643,578 RAC: 874 |
The SSP isn't updated continuously - there's normally a 10 or 20 minute interval between updates. Also, not every value is updated simultaneously, so it's normal to see one or more As of* values drift between the main page updates. I wouldn't take any notice of an As of* under 20 minutes - and at the moment, the largest is 6 minutes. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
The thing will croak later today, probably just after everyone leaves for the weekend...watch. |
juan BFP Send message Joined: 16 Mar 07 Posts: 9786 Credit: 572,710,851 RAC: 3,799 |
"The thing will croak later today, probably just after everyone leaves for the weekend...watch." Murphy's law could explain that! |
Unixchick Send message Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
Something is definitely wrong with the system still. I'd like to think it's related to handling the last mess, and thus won't happen once things are back to normal, but it's probably something else. It really bothers me that the throttle on splitting isn't working: if the machine is going to take a break from handing out WUs, then the splitters should stop. It shouldn't be taking a 30-minute break from handing out WUs anyway. In good news, Results returned and awaiting validation is now a healthier 4 million instead of 10 million. It could probably still go lower. Results waiting for db purging is at 5 million, and hopefully will drop once the other numbers are lower and the system doesn't have to spend time cleaning up the backlog from the last mess. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"And back again at 'Project has no tasks available'." . . You are so very cynical ... Stephen :( |
RickToTheMax Send message Joined: 22 May 99 Posts: 105 Credit: 7,958,297 RAC: 0 |
It is holding up so far.. wonder if we will survive Saturday and Sunday tho..! And we've got enough tapes for the weekend now. |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
Is anybody else noticing a higher than usual number of results pending validation? Nothing looks out of kilter on the SSP. It could just be down to my having a new GPU. |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.