Panic Mode On (108) Server Problems?

Message boards : Number crunching : Panic Mode On (108) Server Problems?

Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1898801 - Posted: 3 Nov 2017, 1:28:20 UTC - in response to Message 1898798.  

It's looking like things might be clearing up now; 4 boxes have received a partial load.
ID: 1898801 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898802 - Posted: 3 Nov 2017, 1:35:47 UTC
Last modified: 3 Nov 2017, 1:36:26 UTC

My crunch-only machines shut down for their weekday siesta before they ran out of work. However, my daily driver has been struggling, too, but occasionally gets a sudden burst of work. About 15 minutes ago, it got 27 tasks, all Arecibo non-VLAR, spread across CPU and GPU. Checking my event log on that machine, I noticed that I haven't gotten any BLC or Arecibo VLAR tasks for the last 11 hours.
ID: 1898802 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898803 - Posted: 3 Nov 2017, 1:36:27 UTC - in response to Message 1898773.  

I have no clue why the servers are calculating that I am requesting that much work. It should be 2 days worth. I have 92 CPU tasks on board now. Zero GPU tasks.

The Host ID is 8306366
It's not the servers that calculate that - it's your own client doing the requesting. Asking for two days of work for each of 8 CPUs - that would be 16 days. You must have had some left.

I'm pretty sure that isn't how BOINC counts processors. I have ONE CPU in that machine. It gets a quota of 100 tasks at any one time. It has just ONE CPU with 8 threads. It should be asking for 172800 seconds of CPU work if the CPU cache is empty, based on your formula.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898803 · Report as offensive
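
For reference, the two figures being argued over in the post above can be reproduced with a few lines of arithmetic. This is only a sketch of the numbers quoted in the thread, assuming the request is simply "buffer days times number of instances"; the function name and that formula are illustrative assumptions, not BOINC's actual work-fetch code.

```python
SECONDS_PER_DAY = 86_400

def requested_seconds(buffer_days, instances):
    """Seconds of work needed to fill an empty cache.

    Hypothetical helper: assumes one full buffer per instance,
    which is the reading disputed in the thread.
    """
    return buffer_days * SECONDS_PER_DAY * instances

# Keith's reading: one CPU, two days of cache.
one_cpu = requested_seconds(2, 1)

# The other reading: two days for each of 8 logical CPUs.
eight_threads = requested_seconds(2, 8)

print(one_cpu, eight_threads, eight_threads / SECONDS_PER_DAY)
```

Under the first reading the request is 172800 seconds; under the second it is eight times that, which is where the "16 days" figure comes from.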
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898805 - Posted: 3 Nov 2017, 1:44:44 UTC

Wow! Talk about chutzpah. Just received the donor request for more funds. Why should I continue to contribute my monies to the project when they can't maintain it properly? I have contributed considerable funds to the project for the past several years. I am having a hard time reconciling my emotions right now with continuing support of the project when they can't send work to my expensive crunchers.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898805 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1898807 - Posted: 3 Nov 2017, 1:51:31 UTC - in response to Message 1898778.  

That's why I suspect there's something odd about the way Keith's Linux client is doing the requesting, which causes the server to fall over with an error. And of course if the server daemon falls over, it has to restart and re-cache whatever it held in memory - that'll slow things down.

Keith's log contained:

690			11/2/2017 13:31:02	[http] [ID#0] Sent header to server: ÿ	
702	SETI@home	11/2/2017 13:31:02	[http] [ID#1] Sent header to server: t (x86_64-pc-linux-gnu 7.8.3)
704	SETI@home	11/2/2017 13:31:02	[http] [ID#1] Sent header to server: Ac


. . Would that be his BOINC client adding those characters?

Stephen

??
ID: 1898807 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1898808 - Posted: 3 Nov 2017, 1:59:53 UTC - in response to Message 1898805.  

Wow! Talk about chutzpah. Just received the donor request for more funds. Why should I continue to contribute my monies to the project when they can't maintain it properly? I have contributed considerable funds to the project for the past several years. I am having a hard time reconciling my emotions right now with continuing support of the project when they can't send work to my expensive crunchers.


. . Maybe that is why they need more donations, to solve the problem with the download servers :)

Stephen

In for the long haul??
ID: 1898808 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898809 - Posted: 3 Nov 2017, 2:25:45 UTC - in response to Message 1898808.  

Wow! Talk about chutzpah. Just received the donor request for more funds. Why should I continue to contribute my monies to the project when they can't maintain it properly? I have contributed considerable funds to the project for the past several years. I am having a hard time reconciling my emotions right now with continuing support of the project when they can't send work to my expensive crunchers.


. . Maybe that is why they need more donations, to solve the problem with the download servers :)

Stephen

In for the long haul??

No I am seriously thinking about giving that $250 donation so I can get a face-to-face with the project scientists and ask them what would it take to get them properly funded for staff and equipment. I thought that some of the $100M that the Breakthrough Listen initiative received would go towards the SETI project. That didn't happen, did it? I wonder if they would even be forthcoming to an individual peon such as myself. I would probably have more success if I started some committee or something that would appear on their radar.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898809 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898811 - Posted: 3 Nov 2017, 2:32:22 UTC

Looks like the scientists twisted the right knobs and got work flowing again. Down to the normal 600K level in the RTS buffer. Only Numbskull is still getting no work when requested. I might have to resort to the server kick method. The Linux cruncher is back to full caches. Only the Win 10 machine is down to under 100 total tasks. It blows through the CPU tasks pretty quickly, being a Ryzen 1700X.

If I didn't have to go to town for groceries this afternoon, it would have been a good time to rip apart the Linux cruncher and start its rebuild to the Ryzen 1800X platform since I wasn't getting any work for it. Project for tomorrow when I can get an earlier start.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898811 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898814 - Posted: 3 Nov 2017, 2:46:51 UTC - in response to Message 1898809.  

No I am seriously thinking about giving that $250 donation so I can get a face-to-face with the project scientists and ask them what would it take to get them properly funded for staff and equipment.
I remember seriously considering that last year, when the web site downgrade fiasco was in progress. Then I decided that it wouldn't be very nice to get my dander up at a holiday party, because no matter how polite and low-key I might start out, it seemed likely that my frustrations would get the better of me. ;^)
ID: 1898814 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Avatar

Send message
Joined: 1 Apr 13
Posts: 1857
Credit: 268,616,081
RAC: 1,349
United States
Message 1898815 - Posted: 3 Nov 2017, 3:14:13 UTC - in response to Message 1898795.  
Last modified: 3 Nov 2017, 3:16:34 UTC

Sheesh! The RTS buffer is up over 800K tasks! And nobody is getting any of them. The splitters have run amok. You would think they have a process that tells the splitters to back off and stop once you reach a prescribed buffer threshold.

Something is definitely hosed when there's 855k rts but no work available, and yet eventually with no intervention the flow will resume..
Makes Eric's fund raiser news blast comment today about how there aren't enough folks crunching to keep up with the available work a bit much ...
When it takes a 10 hour outage each week to do a backup and file maintenance, there's something basic that needs to be addressed.
When it takes several days each week to stagger back into full operation after the outage, there's something basic that needs to be addressed.
ID: 1898815 · Report as offensive
Profile Jimbocous Project Donor
Volunteer tester
Avatar

Send message
Joined: 1 Apr 13
Posts: 1857
Credit: 268,616,081
RAC: 1,349
United States
Message 1898816 - Posted: 3 Nov 2017, 3:16:05 UTC - in response to Message 1898805.  

Wow! Talk about chutzpah. Just received the donor request for more funds. Why should I continue to contribute my monies to the project when they can't maintain it properly? I have contributed considerable funds to the project for the past several years. I am having a hard time reconciling my emotions right now with continuing support of the project when they can't send work to my expensive crunchers.

Ditto
ID: 1898816 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898817 - Posted: 3 Nov 2017, 3:17:27 UTC - in response to Message 1898811.  

If I didn't have to go to town for groceries this afternoon, it would have been a good time to rip apart the Linux cruncher and start its rebuild to the Ryzen 1800X platform since I wasn't getting any work for it. Project for tomorrow when I can get an earlier start.
I know what you mean. I'd been intending to upgrade my daily driver for a couple weeks, but figured I needed an uninterrupted block of time so I could get it done same day. Finally managed to get that early start on Tuesday. Went from an old Compaq Presario with AMD dual-core and a GT630 to a somewhat newer HP Z620 with a Xeon hex core and GT750Ti. I figure it'll up my SETI contribution on this box by close to 10-fold with, hopefully, minimal increase in electric usage.
ID: 1898817 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898818 - Posted: 3 Nov 2017, 3:33:20 UTC - in response to Message 1898815.  


Something is definitely hosed when there's 855k rts but no work available, and yet eventually with no intervention the flow will resume..
Makes Eric's fund raiser news blast comment today about how there aren't enough folks crunching to keep up with the available work a bit much ...
When it takes a 10 hour outage each week to do a backup and file maintenance, there's something basic that needs to be addressed.
When it takes several days each week to stagger back into full operation after the outage, there's something basic that needs to be addressed.

Yes, something basic needs to be addressed. Doubt if we peons will ever hear what is troubling the project. I just cannot fathom the lack of communication.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898818 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898819 - Posted: 3 Nov 2017, 3:34:27 UTC - in response to Message 1898814.  

No I am seriously thinking about giving that $250 donation so I can get a face-to-face with the project scientists and ask them what would it take to get them properly funded for staff and equipment.
I remember seriously considering that last year, when the web site downgrade fiasco was in progress. Then I decided that it wouldn't be very nice to get my dander up at a holiday party, because no matter how polite and low-key I might start out, it seemed likely that my frustrations would get the better of me. ;^)


I think I share your doubts. I would probably blow a gasket or something after the initial pleasantries.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898819 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898820 - Posted: 3 Nov 2017, 3:37:23 UTC - in response to Message 1898817.  

I know what you mean. I'd been intending to upgrade my daily driver for a couple weeks, but figured I needed an uninterrupted block of time so I could get it done same day. Finally managed to get that early start on Tuesday. Went from an old Compaq Presario with AMD dual-core and a GT630 to a somewhat newer HP Z620 with a Xeon hex core and GT750Ti. I figure it'll up my SETI contribution on this box by close to 10-fold with, hopefully, minimal increase in electric usage.

I just couldn't not jump on the price drop for the motherboard and processor since March. Only the memory was a sore point still. I should net a 100W drop in the system even with double the amount of cores.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898820 · Report as offensive
Profile Jeff Buck Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Send message
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1898823 - Posted: 3 Nov 2017, 4:15:46 UTC - in response to Message 1898818.  

I just cannot fathom the lack of communication.
That's really the root of it all, isn't it? To me, any and all hardware, software and funding issues are far less significant than the near total absence of simple, basic periodic communication with the masses (that'd be us) who are diligently donating our time, hardware, and electricity to the project in the hopes that it may one day succeed.
ID: 1898823 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1898826 - Posted: 3 Nov 2017, 4:53:38 UTC - in response to Message 1898823.  

+1 Totally agree.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1898826 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1898828 - Posted: 3 Nov 2017, 5:18:50 UTC - in response to Message 1898826.  

+1 Totally agree.

+ 1000!!!
ID: 1898828 · Report as offensive
Ghia
Avatar

Send message
Joined: 7 Feb 17
Posts: 238
Credit: 28,911,438
RAC: 50
Norway
Message 1898839 - Posted: 3 Nov 2017, 8:05:20 UTC - in response to Message 1898826.  

+1 Totally agree.


What bothers me is that they're crying for more "customers", when they can't even properly serve (pun intended) the ones they already have... !
...Ghia...
Humans may rule the world...but bacteria run it...
ID: 1898839 · Report as offensive
Profile Dr.Diesel Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor

Send message
Joined: 14 May 99
Posts: 41
Credit: 123,695,755
RAC: 139
United States
Message 1898861 - Posted: 3 Nov 2017, 12:43:02 UTC - in response to Message 1898839.  

So I'm new to the forums here, returning to the project after a way too long absence.

My machines ran out of work last night too, and during the last two outages they have sat idle after running out of work (I think I'm prepared for bunkering now, but manual intervention shouldn't really be necessary given the outages are scheduled and known). If the project really needs our computing resources, it sounds like there are fairly easy opportunities to improve?

Do none of the official backend SETI guys read the forums? Are they aware of our legitimate needs and concerns?
ID: 1898861 · Report as offensive
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.