Panic Mode On (115) Server Problems?

TBar
Volunteer tester

Send message
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1986953 - Posted: 24 Mar 2019, 21:50:46 UTC - in response to Message 1986952.  
Last modified: 24 Mar 2019, 22:27:38 UTC

Try looking at who #28 is listed as belonging to, https://setiathome.berkeley.edu/top_hosts.php?sort_by=expavg_credit&offset=20
ID: 8682108
#28 Magne 171,989.52 8,889,326 7.8.3
GenuineIntel Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz [Family 6 Model 60 Stepping 3] (8 processors)
NVIDIA GeForce GTX 1060 6GB (4095MB) driver: 418.43 OpenCL: 1.2
Linux Ubuntu Ubuntu 18.04.2 LTS [4.15.0-46-generic]

Now look at #8, https://setiathome.berkeley.edu/top_hosts.php
ID: 8690734
#8 Magne 363,269.95 18,080,802 7.4.44
GenuineIntel Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz [Family 6 Model 60 Stepping 3] (8 processors)
[5] NVIDIA GeForce GTX 1060 6GB (4095MB) driver: 418.43
Linux 4.15.0-46-generic

Now note Magne only has One Linux Machine, https://setiathome.berkeley.edu/hosts_user.php?userid=118177
You should also note the RAC on his User page, Recent average credit 199,322.93
Most people would conclude it is the SAME machine.

You can also look at Juan's page and note the Credit; https://setiathome.berkeley.edu/show_user.php?userid=8606388
Total credit: 450,616,983
But when you look at his host, https://setiathome.berkeley.edu/hosts_user.php?userid=8606388
ID: 8662921: Total credit: 1,375,790,492
So, how does a Total of 450,616k translate to a Host of 1,375,790k?
Strange things around here...
ID: 1986953 · Report as offensive
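The mismatch TBar describes can be put as a quick arithmetic check. This is only a sketch: it assumes a user's total credit should be at least as large as the total of any host listed under that user, which is the expectation implied in the post and not a documented rule. The function name `hosts_exceeding_user` is made up for illustration; the figures are the ones quoted above.

```python
from typing import Dict

def hosts_exceeding_user(user_total: int, host_totals: Dict[int, int]) -> Dict[int, int]:
    """Map host ID -> excess credit for hosts whose total is above the user's total."""
    return {
        host_id: credit - user_total
        for host_id, credit in host_totals.items()
        if credit > user_total
    }

if __name__ == "__main__":
    # Figures quoted in the post: user total and the single host total.
    user_total = 450_616_983
    hosts = {8662921: 1_375_790_492}
    print(hosts_exceeding_user(user_total, hosts))
```

With the quoted numbers the host total is above the user total by 925,173,509, which is the gap being asked about.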
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1986954 - Posted: 24 Mar 2019, 21:55:17 UTC - in response to Message 1986919.  

Results received in last hour ** 0 1,478 146,664

The number of results received in the last hour is higher than our usual 120k. It is surprising to me that the number is higher than usual even with the high amount of AP results returned as well. Are we getting garbage WUs right now, or have more fast machines joined the cause?? I'm really happy that the seti system seems to be handling things well and that the replica is also keeping up.


. . I have had some large numbers of noise bombs so I am guessing it is the former ...

Stephen

<shrug>
ID: 1986954 · Report as offensive
Ian&Steve C.
Avatar

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1986956 - Posted: 24 Mar 2019, 22:04:46 UTC - in response to Message 1986953.  
Last modified: 24 Mar 2019, 22:32:14 UTC

Ah I see now. I had only clicked the links you provided and they showed different Owners.

But the stats on the second page haven’t updated yet, which is why it’s still there. (And it shows anonymous when you actually click on it from the second page.)

The anonymous setting must be what happens when a host gets deleted. I believe when this happened to Juan also, it deleted his old host too. I’m aware of the contradiction between Juan’s host and user stats. I posted about that some time ago in another thread.

Edit: found the old post where it was discussed previously.

https://setiathome.berkeley.edu/forum_thread.php?id=83833&postid=1978139#1978139
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1986956 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1987021 - Posted: 25 Mar 2019, 13:16:52 UTC

. . Aaahh! Here we go again, 24 hours (or so) to maintenance outage and the uploads are playing up .... :(

Stephen

:(
ID: 1987021 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Avatar

Send message
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 1987024 - Posted: 25 Mar 2019, 13:24:46 UTC - in response to Message 1987021.  

. . Aaahh! Here we go again, 24 hours (or so) to maintenance outage and the uploads are playing up .... :(

Stephen

:(

Just like a switch was flipped ...
ID: 1987024 · Report as offensive
Profile Tom M
Volunteer tester

Send message
Joined: 28 Nov 02
Posts: 5126
Credit: 276,046,078
RAC: 462
Message 1987025 - Posted: 25 Mar 2019, 13:34:49 UTC - in response to Message 1987024.  

. . Aaahh! Here we go again, 24 hours (or so) to maintenance outage and the uploads are playing up .... :(

Stephen

:(

Just like a switch was flipped ...


Mon 25 Mar 2019 08:32:57 AM CDT | SETI@home | Temporarily failed upload of blc23_2bit_guppi_58405_84300_HIP86137_0023.13785.818.21.44.234.vlar_0_r1167924088_0: transient HTTP error


Or maybe it is a "Friday issue". The servers get "congested" right before maintenance?

Tom
A proud member of the OFA (Old Farts Association).
ID: 1987025 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1987030 - Posted: 25 Mar 2019, 14:07:43 UTC

Hopeful that the elves in the lab notice the upload server is acting up and fix it today when they get in.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1987030 · Report as offensive
Richard Haselgrove Project Donor
Volunteer tester

Send message
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1987031 - Posted: 25 Mar 2019, 14:20:51 UTC

For those of you who never bother to ask yourselves which transient error:

25/03/2019 14:17:58 | SETI@home | [http] [ID#3033] Info:  Failed to connect to setiboincdata.ssl.berkeley.edu port 80: Timed out
That's one we can't do anything about - don't mess with any files.
ID: 1987031 · Report as offensive
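For anyone who would rather pull the underlying cause out of the Event Log than read it by eye, here is a minimal Python sketch. It assumes log lines have the `timestamp | project | message` shape shown in the posts above, and that the `[http]` debug line ends in `Failed to connect to HOST port N: REASON`. The function name `parse_event_line` and the regular expression are illustrative only and are not part of BOINC.

```python
import re
from typing import Optional

# Matches the connection-failure text seen in the [http] debug line quoted above.
CONNECT_FAILURE = re.compile(
    r"Failed to connect to (?P<host>\S+) port (?P<port>\d+): (?P<reason>.+)$"
)

def parse_event_line(line: str) -> Optional[dict]:
    """Split 'timestamp | project | message'; add failure details when present."""
    parts = [p.strip() for p in line.split("|", 2)]
    if len(parts) != 3:
        return None  # not in the expected three-field shape
    timestamp, project, message = parts
    record = {"timestamp": timestamp, "project": project, "message": message}
    match = CONNECT_FAILURE.search(message)
    if match:
        record["host"] = match.group("host")
        record["port"] = int(match.group("port"))
        record["reason"] = match.group("reason").strip()
    return record

if __name__ == "__main__":
    sample = (
        "25/03/2019 14:17:58 | SETI@home | [http] [ID#3033] Info:  "
        "Failed to connect to setiboincdata.ssl.berkeley.edu port 80: Timed out"
    )
    print(parse_event_line(sample))
```

A plain "transient HTTP error" line parses into the three basic fields but carries no host, port, or reason, which is why the `[http]` debug line is the one worth looking at.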
Profile betreger Project Donor
Avatar

Send message
Joined: 29 Jun 99
Posts: 11451
Credit: 29,581,041
RAC: 66
United States
Message 1987048 - Posted: 25 Mar 2019, 15:35:33 UTC

Uploads have started working again but it is a painful process.
ID: 1987048 · Report as offensive
Ian&Steve C.
Avatar

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1987051 - Posted: 25 Mar 2019, 15:54:30 UTC
Last modified: 25 Mar 2019, 15:54:52 UTC

i can mostly stay on top of my uploads on the 2 slower machines, but the beast is just too fast. uploads go slower than the machine can process the WUs. the pending uploads are just growing and growing. this is with max file transfers set to its max of 8.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1987051 · Report as offensive
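The growing backlog follows from simple rate arithmetic: if a host finishes results faster than its uploads complete, the pending count rises by the difference every hour. A rough sketch, assuming constant rates; the function name `pending_uploads` and the rates in the example are hypothetical and not measurements from any host in this thread.

```python
def pending_uploads(hours: float, finished_per_hour: float,
                    uploaded_per_hour: float, start: float = 0.0) -> float:
    """Estimate pending uploads after `hours`, never dropping below zero."""
    # Net change per hour is results finished minus uploads completed.
    net_per_hour = finished_per_hour - uploaded_per_hour
    return max(0.0, start + net_per_hour * hours)

if __name__ == "__main__":
    # Hypothetical rates: 300 results finished per hour, 200 uploads completing.
    for h in (1, 3, 6):
        print(h, pending_uploads(h, finished_per_hour=300, uploaded_per_hour=200))
```

Raising the number of simultaneous transfers only helps if it raises the completed-uploads rate; when the server is the bottleneck, the difference stays positive and the queue keeps growing.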
Ian&Steve C.
Avatar

Send message
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1987085 - Posted: 25 Mar 2019, 18:01:29 UTC

I'm also getting a lot of my uploads stuck at 100% while others time out and just go into retry loops.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1987085 · Report as offensive
Profile Unixchick Project Donor
Avatar

Send message
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1987087 - Posted: 25 Mar 2019, 18:20:53 UTC

PANIC! The ready to send queue is low, and people are having time out issues for a while, something is wrong!!

Can we tell if it is a Berkeley connection issue or a seti one?? is it a problem with the upload machine??

Time for someone to check the cables or kick the machine.
ID: 1987087 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1987089 - Posted: 25 Mar 2019, 18:24:58 UTC - in response to Message 1987087.  
Last modified: 25 Mar 2019, 18:25:19 UTC

PANIC! The ready to send queue is low, and people are having time out issues for a while, something is wrong!!

Can we tell if it is a Berkeley connection issue or a seti one?? is it a problem with the upload machine??

Time for someone to check the cables or kick the machine.

Richard already posted that the issue is with the upload server.

25/03/2019 14:17:58 | SETI@home | [http] [ID#3033] Info: Failed to connect to setiboincdata.ssl.berkeley.edu port 80: Timed out
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1987089 · Report as offensive
Profile Siran d'Vel'nahr
Volunteer tester
Avatar

Send message
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 1987092 - Posted: 25 Mar 2019, 18:35:22 UTC

Greetings,

I don't get it! WUs are at 100% progress and then go into resend mode. How the heck does THAT work? The only thing I can think of, which doesn't sound logical, is that the data was sent 100% from the host yet the recipient has not gotten it so the WU goes back into resend mode.

Have a nice day, if you can. ;)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1987092 · Report as offensive
Cherokee150

Send message
Joined: 11 Nov 99
Posts: 192
Credit: 58,513,758
RAC: 74
United States
Message 1987098 - Posted: 25 Mar 2019, 19:11:02 UTC - in response to Message 1987092.  
Last modified: 25 Mar 2019, 19:19:09 UTC

Hi Siran.
From the technical side of the Internet, this most likely means either the receiver never sent the acknowledgement that the final packet in the file was received properly, or that the last packet was not received before the time limit was reached. Of course, there could be a number of reasons why the file was not received correctly or within the required time frame. That is most likely what someone is, or soon will be, researching at Berkeley.

I hope that they will find the problem quickly, and that the fix is easy to do.

I have found today that (if you have the time) going into the Transfer tab, selecting all the units trying to upload, then clicking the "Retry Now" button each time all have backed off, will get them completed within a few minutes of retrying. This only works when the cause of the failed uploads is related to certain types of technical problems. Unfortunately, many times the issue at hand during one of these events will preclude the success of this retry procedure. Today, though, we are in luck, i.e., the retry does work.

I hope this helps.
ID: 1987098 · Report as offensive
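The "100% and then back to retrying" behaviour described above can be shown with a toy model. This is a sketch only, assuming the progress figure counts bytes the client has sent, while completion additionally needs the server's confirmation to arrive in time. The names `UploadResult` and `upload` are invented for illustration and do not come from the BOINC client.

```python
from dataclasses import dataclass

@dataclass
class UploadResult:
    progress_pct: float  # share of the file's bytes the client has sent
    completed: bool      # True only if the server confirmed receipt in time

def upload(file_size: int, chunk: int, ack_received: bool) -> UploadResult:
    """Send every byte in chunks, then require a final acknowledgement."""
    sent = 0
    while sent < file_size:
        sent += min(chunk, file_size - sent)
    progress = 100.0 * sent / file_size
    # All bytes are out the door, but the transfer only counts as done
    # when the confirmation comes back before the timeout.
    return UploadResult(progress_pct=progress, completed=ack_received)

if __name__ == "__main__":
    print(upload(file_size=1000, chunk=256, ack_received=False))
```

With `ack_received=False` the result shows 100% progress and `completed=False`, which is the state where a client would have to schedule the transfer again.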
Profile Siran d'Vel'nahr
Volunteer tester
Avatar

Send message
Joined: 23 May 99
Posts: 7381
Credit: 44,181,323
RAC: 238
United States
Message 1987102 - Posted: 25 Mar 2019, 19:20:09 UTC - in response to Message 1987098.  

Hi Siran.
From the technical side of the Internet, this most likely means either the receiver never sent the acknowledgement that the final packet in the file was received properly, or that the last packet was not received before the time limit was reached. Of course, there could be a number of reasons why the file was not received correctly or within the required time frame. That is most likely what someone is, or soon will be, researching at Berkeley.

I hope that they will find the problem quickly, and that the fix is easy to do.

I have found today that (if you have the time) going into the Transfer tab, selecting all the units trying to upload, then clicking the "Retry Now" button each time all have backed off, will get them completed within a few minutes of retrying. This only works when the cause of the failed uploads is related to certain types of technical problems. Unfortunately, many times the issue at hand during one of these events will preclude the success of this retry procedure. Today, though, we are in luck, i.e., the retry does work.

I hope this helps.

Hi Cherokee,

Yeah, that's about what I was getting at. I guess it was more logical than I thought. ;)

I do try the "Retry Now" and some do get through. I don't like to do it too often though since there are thousands of us trying to upload our finished work. :)

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 1987102 · Report as offensive
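The concern about thousands of hosts retrying at once is the usual argument for randomized exponential backoff. Below is a generic sketch of that technique; it is not taken from the BOINC client, and the name `backoff_delays` and the `base` and `cap` values are assumptions chosen for the example.

```python
import random
from typing import List, Optional

def backoff_delays(attempts: int, base: float = 60.0, cap: float = 3600.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Return one randomized wait in seconds for each retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles with each attempt but never exceeds the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # A random point below the ceiling keeps clients from retrying in lockstep.
        delays.append(rng.uniform(0.0, ceiling))
    return delays

if __name__ == "__main__":
    for i, delay in enumerate(backoff_delays(6, rng=random.Random(2019))):
        print(f"retry {i + 1}: wait up to this long: {delay:.0f} s")
```

Pressing "Retry Now" by hand overrides whatever wait the client had scheduled, so doing it sparingly keeps the load on a struggling upload server closer to what the randomized schedule was meant to produce.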
Profile Unixchick Project Donor
Avatar

Send message
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1987104 - Posted: 25 Mar 2019, 19:27:46 UTC

splitting isn't happening properly either from the looks of it. It is around lunchtime for the Seti crew. Does anyone know if they are aware there is an issue??
ID: 1987104 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Avatar

Send message
Joined: 1 Apr 13
Posts: 1859
Credit: 268,616,081
RAC: 1,349
United States
Message 1987105 - Posted: 25 Mar 2019, 19:36:12 UTC - in response to Message 1987092.  

I don't get it! WUs are at 100% progress and then go into resend mode. How the heck does THAT work?


That last Ack/Nak to finish the transfer either didn't get sent or got eaten by a router. Quite a few also die in mid-stream. All it takes is to time out on retries looking for that last handshake.
ID: 1987105 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1987106 - Posted: 25 Mar 2019, 19:40:13 UTC - in response to Message 1987104.  

I haven't seen any notice or post that anyone has contacted staff. So they might not be aware of the issue if they haven't looked at the forums or the server stats.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1987106 · Report as offensive
Profile betreger Project Donor
Avatar

Send message
Joined: 29 Jun 99
Posts: 11451
Credit: 29,581,041
RAC: 66
United States
Message 1987109 - Posted: 25 Mar 2019, 19:55:22 UTC
Last modified: 25 Mar 2019, 19:56:56 UTC

My avg. work done, as reported in BOINC Manager, has been stuck for several hrs. This is not normal. Here on the web site it moves around as normal.
ID: 1987109 · Report as offensive

 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.