Panic Mode On (80) Server Problems?

tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1327054 - Posted: 12 Jan 2013, 8:36:06 UTC - in response to Message 1326191.  
Last modified: 12 Jan 2013, 8:55:35 UTC


I don't believe they have the time and resources to do what they want but to say the guys in the lab "don't care" is to me and insult.


Bernie, I don't know what to suggest except that you either read what I wrote and accept that I used my words correctly, or that you just be insulted or offended all over the place.

I said that "I fear," meaning that I am afraid of that, not that I have evidence of that.

I fear one of my children may die before me, because I would find that a traumatic event.

In a similar, but far, far less important way, I would hate to learn that the guys in the lab don't care.

I'm hardly a coward. My word choice was not a clever way of saying something without saying it.

If I meant, "The guys in the lab at SETI don't care about this project anymore since they are busy with their high-speed spectrometer and other data they've recently acquired from Green Bank," I would have no problem, at all, saying that, on the record, right here.

If I held that opinion, it certainly wouldn't be pointed at, or meant to offend, or insult, you.

That isn't even what I said. Read it again with a different inflection.

Perhaps we are having a U.S. vs. U.K. usage of words problem?

"and insult" to whom, Bernie?
ID: 1327054
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1327056 - Posted: 12 Jan 2013, 8:41:54 UTC - in response to Message 1326414.  

That they've gone to the effort to inform the fund raisers of their requirements would indicate to most people that they do care.



Hey Grant,

I may know more about that than you do, and it's entirely possible that I was the very first person to contribute to that fundraiser and made the first public plea for others to do likewise --- before it was even announced.

BUT, I don't need, or appreciate, you or anyone else putting words in my mouth.

The word is "fear" and denotes anxiety.

If I need someone to tell me what I meant, I'll be sure to call for help.
ID: 1327056
Mark Lybeck
Joined: 9 Aug 99
Posts: 245
Credit: 216,677,290
RAC: 173
Finland
Message 1327058 - Posted: 12 Jan 2013, 8:44:42 UTC

Very close to the 3-minute TCP timeout for the scheduler. The scheduler transactions seem to be taking some time again:

12/01/2013 10:17:20 | SETI@home | update requested by user
12/01/2013 10:17:24 | SETI@home | [sched_op] Starting scheduler request
12/01/2013 10:17:24 | SETI@home | Sending scheduler request: Requested by user.
12/01/2013 10:17:24 | SETI@home | Reporting 36 completed tasks, not requesting new tasks
12/01/2013 10:17:24 | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
12/01/2013 10:17:24 | SETI@home | [sched_op] NVIDIA work request: 0.00 seconds; 0.00 devices
12/01/2013 10:17:47 | SETI@home | Scheduler request failed: Couldn't connect to server
12/01/2013 10:17:47 | SETI@home | [sched_op] Deferring communication for 3 hr 3 min 49 sec
12/01/2013 10:17:47 | SETI@home | [sched_op] Reason: Scheduler request failed
12/01/2013 10:17:57 | SETI@home | work fetch resumed by user
12/01/2013 10:36:51 | SETI@home | update requested by user
12/01/2013 10:36:55 | SETI@home | [sched_op] Starting scheduler request
12/01/2013 10:36:55 | SETI@home | Sending scheduler request: Requested by user.
12/01/2013 10:36:55 | SETI@home | Reporting 36 completed tasks, requesting new tasks for CPU and NVIDIA
12/01/2013 10:36:55 | SETI@home | [sched_op] CPU work request: 202266.66 seconds; 0.00 devices
12/01/2013 10:36:55 | SETI@home | [sched_op] NVIDIA work request: 224640.00 seconds; 2.00 devices
12/01/2013 10:39:50 | SETI@home | Scheduler request completed: got 128 new tasks
12/01/2013 10:39:50 | SETI@home | [sched_op] Server version 701
12/01/2013 10:39:50 | SETI@home | Project requested delay of 303 seconds
12/01/2013 10:39:50 | SETI@home | [sched_op] estimated total CPU task duration: 206174 seconds
12/01/2013 10:39:50 | SETI@home | [sched_op] estimated total NVIDIA task duration: 88822 seconds
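
As a rough check on the "very close to 3 minutes" remark, the elapsed time of each scheduler request can be read off the timestamps in the log above. The sketch below is only an illustration: the `request_durations` helper is hypothetical (it is not part of BOINC), and it assumes the `date | project | message` layout and day/month/year timestamps seen in the pasted lines.

```python
from datetime import datetime

# Four lines copied from the event log above.
LOG = """\
12/01/2013 10:17:24 | SETI@home | Sending scheduler request: Requested by user.
12/01/2013 10:17:47 | SETI@home | Scheduler request failed: Couldn't connect to server
12/01/2013 10:36:55 | SETI@home | Sending scheduler request: Requested by user.
12/01/2013 10:39:50 | SETI@home | Scheduler request completed: got 128 new tasks
"""


def request_durations(log_text):
    """Pair each 'Sending scheduler request' line with the next outcome line.

    Returns a list of (outcome, elapsed_seconds) tuples.
    """
    results = []
    started = None
    for line in log_text.splitlines():
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) != 3:
            continue  # not a "date | project | message" line
        stamp, _project, message = parts
        when = datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")
        if message.startswith("Sending scheduler request"):
            started = when
        elif message.startswith("Scheduler request") and started is not None:
            outcome = "completed" if "completed" in message else "failed"
            results.append((outcome, int((when - started).total_seconds())))
            started = None
    return results


durations = request_durations(LOG)
for outcome, seconds in durations:
    print(f"{outcome}: {seconds} s")
```

On these lines the failed request gave up after 23 seconds, while the successful one took 175 seconds, just under the 180-second timeout mentioned above.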

ID: 1327058
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1327084 - Posted: 12 Jan 2013, 10:52:00 UTC - in response to Message 1327056.  
Last modified: 12 Jan 2013, 10:54:20 UTC

BUT, I don't need, or appreciate, you or anyone else putting words in my mouth.

I didn't put any words there, I used the exact words you posted.


I did say my "fear" is that nobody cares, and by "nobody" I meant anyone who could do anything about server issues.

Right from the post you made: http://setiathome.berkeley.edu/forum_thread.php?id=70431&postid=1326112


I said that "I fear," meaning that I am afraid of that, not that I have evidence of that.

I fear one of my children may die before me, because I would find that a traumatic event.

In a similar, but far, far less important way, I would hate to learn that the guys in the lab don't care.

The only reason to fear something as you describe it there, is because you consider it possible.
No, you didn't come out & say outright that they don't care, you just implied it.


If I need someone to tell me what I meant, I'll be sure to call for help.

I have no idea what you meant, I can only go by what you post & what you posted implies that those running the project don't care about the issues they are having.
Grant
Darwin NT
ID: 1327084
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1327099 - Posted: 12 Jan 2013, 11:56:51 UTC - in response to Message 1327084.  

The only reason to fear something as you describe it there, is because you consider it possible.

... what you posted implies that those running the project don't care about the issues they are having.

Non sequitur. Saying something is possible doesn't turn it into a certainty.

"When you have eliminated the impossible, whatever remains, however improbable, must be the truth" (Sherlock Holmes). But that doesn't apply here - there are a lot more possibilities left than the highly unlikely one that the project staff don't care, as evidenced by Matt's New Year post.
ID: 1327099
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 1327116 - Posted: 12 Jan 2013, 14:15:35 UTC
Last modified: 12 Jan 2013, 14:24:58 UTC

[panic on]
Last successful scheduler server contact at 12 Jan 2013, 12:34:52 UTC.
Since then:
Scheduler request failed: Couldn't connect to server
Scheduler request failed: Failure when receiving data from the peer

[/panic on]

[panic off]
Just now: OK, contact again at 14:14 UTC. ;-)
[/panic off]

[EDIT:
[panic on]
There was just one successful contact, and since then again:
Scheduler request failed: Couldn't connect to server

[/panic on]]


* Best regards! :-) * Sutaru Tsureku, team seti.international founder. * Optimize your PC for higher RAC. * SETI@home needs your help. *
ID: 1327116
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1327125 - Posted: 12 Jan 2013, 15:32:46 UTC

No work or scheduler connects for over 24 hours now per BOINCTasks. Have been out of GPU work for about 18 hours, crunching Einstein on zero resource share to stay busy.

I'm puzzled by the Cricket Graph. It still shows downloads are maxed out, suggesting that many are getting work. Is this another issue like the old HE connection issue where only certain computers or IP addresses are blocked?
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1327125
Tom*
Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1327155 - Posted: 12 Jan 2013, 18:48:40 UTC

Finally got thru - logjam is slightly abating as the incoming traffic
has finally made it above 10
Cur: 10.12 Mbits/sec
First Contact in over 24 hours with or without proxy.
lost tasks were mostly shorties
funny thing was I got one of those red BOINC messages saying I lost contact
with the internet during the request that finally made it thru??
ID: 1327155
James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1327156 - Posted: 12 Jan 2013, 18:58:32 UTC

I'm just going to let my machines go to other projects. Two already are; one still has work for maybe 8 hours. They are going to shut down for maintenance anyway, so I'm not going to fight an uphill battle trying to get work.

Old James
ID: 1327156
.clair.
Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1327163 - Posted: 12 Jan 2013, 19:37:09 UTC

If I could download as many work units as I get `scheduler list` and `master file` fetches, I would be crunching fine :((

All I am getting is:

12/01/2013 17:01:47 | SETI@home | Fetching scheduler list
12/01/2013 17:01:49 | SETI@home | Master file download succeeded
12/01/2013 17:01:55 | SETI@home | Sending scheduler request: To report completed tasks.
12/01/2013 17:01:55 | SETI@home | Reporting 100 completed tasks, not requesting new tasks
12/01/2013 17:02:17 | SETI@home | Scheduler request failed: Couldn't connect to server
12/01/2013 17:02:20 | | Project communication failed: attempting access to reference site
12/01/2013 17:02:21 | | Internet access OK - project servers may be temporarily down.

Haz some one stolen the server
ID: 1327163
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1327169 - Posted: 12 Jan 2013, 19:44:02 UTC - in response to Message 1327099.  
Last modified: 12 Jan 2013, 19:46:00 UTC

Non sequitur. Saying something is possible doesn't turn it into a certainty.

I agree.
Just because someone believes something doesn't make it so. But believing something to be possible indicates that you consider it likely and implies that you consider it to be the case.

But that doesn't apply here - there are a lot more possibilities left than the highly unlikely one that the project staff don't care, as evidenced by Matt's New Year post.

Yet another indication that they do care.
Grant
Darwin NT
ID: 1327169
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1327173 - Posted: 12 Jan 2013, 20:04:23 UTC


Back to the panic-

If you do manage to get some work it won't last long, they appear to be all shorties.
Grant
Darwin NT
ID: 1327173
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1327182 - Posted: 12 Jan 2013, 21:12:58 UTC - in response to Message 1327173.  


And for some time now the splitters have been unable to do more than 15/s, and the Ready to Send buffer, which had been steadily declining as a result of the limited splitter output, is now, thanks to the shorties, falling like a stone...
Grant
Darwin NT
ID: 1327182
Tom*
Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1327187 - Posted: 12 Jan 2013, 21:20:29 UTC

More ammo to have shorties handled like VLARs, especially when APs are being split!!!

ID: 1327187
rob smith (Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor)
Volunteer moderator
Volunteer tester
Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1327191 - Posted: 12 Jan 2013, 21:49:02 UTC

Wouldn't really help much. The problem with VLARs is a computational one: many GPUs just freak out and so produce wrong answers, or get stalled in near-infinite loops. The problem with shorties is that everyone, be they a slow CPU cruncher or a mega-fast GPU cruncher, gets through them about 5 times faster than a normal WU, so the demand for WUs goes up by a factor of five. Perhaps the "cure" is actually the reverse: only distribute shorties when the fastest crunchers are stuffed up to the gills with APs.....
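
The factor-of-five point can be put as a bit of arithmetic. This is only a sketch under stated assumptions: the 3000-second runtime for a normal WU is a made-up illustrative figure, `tasks_per_day` is a hypothetical helper, and the only number taken from the post is the "about 5 times faster" ratio.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tasks_per_day(task_runtime_s, devices=1):
    """Tasks a host must fetch per day to keep `devices` continuously busy."""
    return devices * SECONDS_PER_DAY / task_runtime_s


# Hypothetical runtime of a normal WU on some device (an assumption, not measured).
normal_runtime_s = 3000
# A shorty finishing "about 5 times faster", per the post.
shorty_runtime_s = normal_runtime_s / 5

normal = tasks_per_day(normal_runtime_s)
shorty = tasks_per_day(shorty_runtime_s)
print(f"normal WUs per day:  {normal:.1f}")
print(f"shorties per day:    {shorty:.1f}")
print(f"demand multiplier:   {shorty / normal:.1f}")
```

Whatever runtime is assumed, the multiplier depends only on the speed ratio, which is why slow CPU hosts and fast GPU hosts alike ask for about five times as many tasks.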
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1327191
Tom*
Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1327200 - Posted: 12 Jan 2013, 22:29:26 UTC - in response to Message 1327191.  

Rob,

I agree that only distributing shorties when APs are not being distributed would help.

The point I was making is that if you turn shorties into VLARs (should be fairly easy if they do it for the real VLARs), then the CPU shorties would take 5 times as long as the GPU shorties and not ask for more, more, more, like they do now.
ID: 1327200
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1327270 - Posted: 13 Jan 2013, 2:54:14 UTC - in response to Message 1327099.  

The only reason to fear something as you describe it there, is because you consider it possible.

... what you posted implies that those running the project don't care about the issues they are having.

Non sequitur. Saying something is possible doesn't turn it into a certainty.

"When you have eliminated the impossible, whatever remains, however improbable, must be the truth" (Sherlock Holmes). But that doesn't apply here - there are a lot more possibilities left than the highly unlikely one that the project staff don't care, as evidenced by Matt's New Year post.


Inferences are out of my control.

What's worse, we've changed the subject and are in danger of running through the entire Oxford English Dictionary defining "care."

Since the discussion isn't fun, I suppose I'll give it a rest.

I will, however, point out that I know that Eric interrupted turkey basting to go kick a server and someone has seen to some issues on a Sunday.

In light of those facts, and the reader's knowing I am aware of those facts, make whatever of my statement of fear "thou wilt," whosoever has the desire to make anything of it at all.

I surrender.

In the meantime, we can all resume complaining about and enlightening others about our interactions with the servers, as we have with regularity for several months.
ID: 1327270
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1327272 - Posted: 13 Jan 2013, 3:40:37 UTC
Last modified: 13 Jan 2013, 3:41:21 UTC

I punched the update button 14 times in the past 30 minutes and finally got some work reported. Most of the failed contacts were "couldn't connect to server" after 10-30 seconds. The successful contact took 104 seconds to complete.

Uploads go through in less than 5 seconds on the first try. I have yet to see one of those fail.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1327272
Keith White
Joined: 29 May 99
Posts: 392
Credit: 13,035,233
RAC: 22
United States
Message 1327274 - Posted: 13 Jan 2013, 3:45:04 UTC

Well it appears that they will be running out of MB units soon so without those clogging the pipes the reports should start getting through.

Sort of a good news/bad news situation since that means our queues will run dry. But since the AC is scheduled to be worked on Monday/Tuesday it was going to happen anyways.
"Life is just nature's way of keeping meat fresh." - The Doctor
ID: 1327274
tbret
Volunteer tester
Joined: 28 May 99
Posts: 3380
Credit: 296,162,071
RAC: 40
United States
Message 1327285 - Posted: 13 Jan 2013, 5:00:51 UTC - in response to Message 1327274.  

But since the AC is scheduled to be worked on Monday/Tuesday it was going to happen anyways.


But you know, in a way the AC repair is a thing to be celebrated. The last time I can remember that the AC needed to be repaired it had unexpectedly died and a bunch of stuff overheated and stopped.

I'm also glad that the resources exist to repair it.

ID: 1327285

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.