Panic Mode On (105) Server Problems?
Author | Message |
---|---|
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Patience rewarded. Caches are filling back up again after the toggle. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
Patience rewarded. Caches are filling back up again after the toggle. Same here. Grant Darwin NT |
Wiggo Send message Joined: 24 Jan 00 Posts: 36387 Credit: 261,360,520 RAC: 489 |
Patience rewarded. Caches are filling back up again after the toggle. My pendings are shrinking while my current valids have gone through the roof. Cheers. |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
My pendings are shrinking while my current valids have gone through the roof. Likewise. I'm expecting another boost in about 10 days as another batch of outstanding work gets re-issued. Grant Darwin NT |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
What can I say? Once again, the kitties have had no trouble getting their caches full, with no meowing around with settings. I still have no clue why some of you are having such difficulties. "Time is simply the mechanism that keeps everything from happening all at once." |
UniMatrixZ Send message Joined: 2 Feb 01 Posts: 102 Credit: 30,826,065 RAC: 3 |
My Linux machine was having trouble this morning, getting the "Project has no tasks available" message. It was down to 11 GPU WUs, then all of a sudden, boom, cache full. Strange problem! "SETI is probably the most important quest of our time, and it amazes me that governments and corporations are not supporting it sufficiently."- Arthur C. Clarke 2006 |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
News about the full-Amazon outage of last Tuesday: it was a big one. At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process. Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended. Removing a significant portion of the capacity caused each of these systems to require a full restart. While these subsystems were being restarted, S3 was unable to service requests. Other AWS services in the US-EAST-1 Region that rely on S3 for storage, including the S3 console, Amazon Elastic Compute Cloud (EC2) new instance launches, Amazon Elastic Block Store (EBS) volumes (when data was needed from a S3 snapshot), and AWS Lambda were also impacted while the S3 APIs were unavailable. I wonder what the team member will have to talk about at his/her next review, if he/she's still working there. :) |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
Hoping some new tapes will get tossed towards the splitters before today is done, else it will be a long cold weekend... Last BLC is about done. |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Heads up messages sent to Eric and Jeff.................. All I can do except crunch. Meow. "Time is simply the mechanism that keeps everything from happening all at once." |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Jeff says that new data should be appearing over the next few hours. So, there should be no panic. Meow. "Time is simply the mechanism that keeps everything from happening all at once." |
Jord Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Well, there's a problem with the tasks named 14dc10ab.26018.24607.5.32.*. I have had several that run to 0.003% and then seemingly get stuck on my new RX470. They crash after 4 minutes, or immediately when I exit & restart BOINC. |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Well, there's a problem with the tasks named 14dc10ab.26018.24607.5.32.*. I have had several that run to 0.003% and then seemingly get stuck on my new RX470. They crash after 4 minutes, or immediately when I exit & restart BOINC. Maybe run nVidia? I am sorry if I am not really able to diagnose with all NV cards online. Maybe somebody else can step in. "Time is simply the mechanism that keeps everything from happening all at once." |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
Well, there's a problem with the tasks named 14dc10ab.26018.24607.5.32.*. I have had several that run to 0.003% and then seemingly get stuck on my new RX470. They crash after 4 minutes, or immediately when I exit & restart BOINC. Just forced my Manager to run them, SoG and GTX 1070s. 5min 5-10sec run time. Edit- 5min 2-15sec outliers. How long are they running to get to the 0.003% point? Generally the first 15-20 seconds on my system are the CPU setting up the WU and the percentage done stays at 0%, then it starts counting up as the GPU starts processing. It looks like yours are failing at the point the GPU starts to crunch. What application are you running? Have you tried a reboot? Any recent driver changes? Grant Darwin NT |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
I run anything the servers send. And I do not cry about it if some things do not get me creds at the same rate as others. There is a scientific point for every WU the project sends out, and that is what I signed up for. The rest of you should get on board and stop whining. I am currently running 8 computers, each with at least 2 GPUs on board. And I spend a lot of time on the boards. So IF the servers are coughing up furballs, I am very commonly the first one to notice it, and advise the authorities. If the project is not sending out work to everybody that requests it, I step in quickly. And usually within a short time of my notifying them, the admins step in and fix it. This is a volunteer project. Nobody ever promised all work, all creds, all the time. And as long as the servers are up, I am rather tired of the vomiting here at times. "Time is simply the mechanism that keeps everything from happening all at once." |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Thanks Mark for being the liaison between the NC forums members and the scientists. Looks like Jeff is on top of the issue. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
I am rather tired of the vomiting here at times. And some of us are rather tired of you abusing people who are just pointing out an issue, no whining or bleating involved. Since late December, there have been issues with getting work, even when there is plenty available. Sometimes it's not as bad as others, and other times it's a major hassle, so we point it out. There are times the servers are up, but the web site, forums & Scheduler have been either missing in action, or extremely slow to respond, so we point it out. We were running low on work to split, so someone pointed that out. Someone is having issues with some work units, and mentioned it. It's not whinging, whining, bleating or vomiting. It's just pointing out an issue. But if you really want someone to start carrying on, then keep abusing those that are just pointing out issues and I'm sure they'll give you something to really complain about. Or get the issues fixed, then there will be no need to point them out in the first place. I'm in favour of the second option. Grant Darwin NT |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Thanks Mark for being the liaison between the NC forums members and the scientists. Looks like Jeff is on top of the issue. The software gets tangled up at times. Did any one of you ever have to reboot a computer to solve a problem? I shall bet you have. The Seti servers are not Google level servers. The project does not have that kind of money, and as much as I wish, I do not have that much to donate to the project. The Seti project does not have Google level backups, though they deserve it. The project is run on some simple rack level servers in a remote location. They are prone to errors like my own are. EXCEPT............. When they have a minute of down time, half the world notices it. Nobody notices if the kitties are down. BTW, I have two rigs down at the moment, and am not in a big hurry to pick them back up. So, the moral of the story is this............. We need a new Panic Mode thread, please. And everybody should not use it unless the project is in danger mode. There are other threads, and you can even create your own if you think your situation requires individual attention. Could you PLEASE stop using the 'Panic Mode' thread for any other purposes? "Time is simply the mechanism that keeps everything from happening all at once." |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13835 Credit: 208,696,464 RAC: 304 |
We need a new Panic Mode thread, please. No, we don't need a new Panic Mode thread, and it is not for when the project is in danger mode. It's for when there are issues with the project, for reporting them and discussing them. About the only off-topic posts here are about the Amazon outage, you complaining about people mentioning project issues (in the very thread that was started for them), Keith thanking you, and me responding to you. 4 out of 59 posts is bugger all in my book. Grant Darwin NT |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
What Grant, I, and others have pointed out ..... is that the software is tangled up ALL the time, and has been ever since December, when the project put in the fix for the 8.22 ATI app users. That screwed up things for the Nvidia users. What I find infuriating is the project not acknowledging that there is an issue. As Grant eloquently wrote, that is not whining or whinging. It is simply pointing out there is a problem. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
kittyman Send message Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
I shall never report a problem to Eric again. You can get your own avenues. As I am one of the top users on Seti, I have a path through his many filters. Good luck. "Time is simply the mechanism that keeps everything from happening all at once." |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.