Panic Mode On (112) Server Problems?

Message boards : Number crunching : Panic Mode On (112) Server Problems?

Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1931781 - Posted: 25 Apr 2018, 7:12:50 UTC - in response to Message 1931766.  

That's the same comment I made in the unexplained slowness thread. Until Eric informs us just what that process does, all we can do is speculate. I haven't caught the splitters entirely shut off this week. Last week, yes. So maybe your speculation is right: the process really is a throttle and actually performs that function, which by standard definition slows or speeds up a function but usually does not stop or kill it entirely. Your guess is as good as mine.

It looks like the requests to the scheduler are getting filled regularly, and the uploads are basically back to working normally. So back to the new normal, it seems. I too wonder how the servers will fare once the Arecibo work disappears for longer than a week and the results returned per hour creep back up. Will we run into the ever climbing pending deletions again or not?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1931781 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1931782 - Posted: 25 Apr 2018, 7:20:13 UTC - in response to Message 1931781.  

Will we run into the ever climbing pending deletions again or not?

That's what was happening before all the new AP work & the release of Arecibo VLARs to Nvidia GPUs. So unless they've made some more database tweaks since then, if the Returned-last-hour gets back over 120k sustained, we can expect issues again.
Grant
Darwin NT
ID: 1931782 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1931787 - Posted: 25 Apr 2018, 8:35:24 UTC - in response to Message 1931750.  

I see the new splitter_throttle_sah process is not running now. Hasn't started back up since I first checked on the project 45 minutes ago and the RTS buffer was down in the mid 500K range. Up to 620K range now. Wonder if this is the reason we are not getting any work. It was enabled all the time since Sunday and only went disabled for the outage.
You must remember, Keith, that the SSP only gives a snapshot of what is going on at that moment, and between those moments that process could've been on/off a few times while work is getting out (you just have to be lucky enough to get in 1st before others do or you miss out, and there are a lot of us hitting them up atm). ;-)

Cheers.

If the SSP is updating, you get a snapshot every ten minutes. I basically sit on the website all day and look at the SSP literally dozens of times a day. I think I am getting a pretty valid view of the process during the day. Hard to think the law of averages doesn't come into play for me.
If SETI ran constant numbers then your theory might have a chance, but knowing what I know tells me that your theory is flawed (just like CreditScrew). Nothing here works on stable averages anymore, Keith (and it hasn't for years now), and that's the flaw in your thinking, as even 10-minute SSP snapshots turn out to be 20-30 minute snapshots more often than not.

Cheers.
ID: 1931787 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1931800 - Posted: 25 Apr 2018, 10:04:09 UTC - in response to Message 1931787.  

Or, with the snapshot taken every 10 minutes on the round number (e.g. 9:00, 9:10, 9:20, 9:30, 9:40, 9:50, 10:00), the process could run and stop at 9:05, 9:07, 9:11, 9:13, 9:25, 9:43, 9:47, 9:55 and 9:59 and still show as not running every time one checks in.
ID: 1931800 · Report as offensive
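
The sampling point in the post above can be sketched in a few lines of Python. This is only an illustration: the snapshot and run times are the ones quoted in the post, while the 30-second run length (`BURST_LENGTH`) and the calendar date are assumptions made so the example has something concrete to work with.

```python
from datetime import datetime, timedelta

DAY = datetime(2018, 4, 25)             # arbitrary date, assumed for illustration
BURST_LENGTH = timedelta(seconds=30)    # assumed length of each short run


def at(hour, minute):
    """Return a timestamp on DAY at the given hour and minute."""
    return DAY.replace(hour=hour, minute=minute)


# Times at which the process starts a short run (from the post).
burst_starts = [at(9, m) for m in (5, 7, 11, 13, 25, 43, 47, 55, 59)]

# Times at which the status page takes its snapshot (from the post).
snapshots = [at(9, m) for m in (0, 10, 20, 30, 40, 50)] + [at(10, 0)]


def is_running(moment):
    """True if the moment falls inside any of the short runs."""
    return any(start <= moment < start + BURST_LENGTH for start in burst_starts)


seen_running = [s for s in snapshots if is_running(s)]
total_runtime = BURST_LENGTH * len(burst_starts)

print("snapshots that caught the process running:", len(seen_running))
print("time the process actually spent running:", total_runtime)
```

Under these assumptions the process runs nine times in the hour, yet none of the seven snapshots lands inside a run, so the status page alone cannot tell "never ran" apart from "ran between snapshots".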
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1932096 - Posted: 26 Apr 2018, 19:27:34 UTC

Been getting mostly GBT tasks since the Arecibo work stopped a day ago. Thinking the RTS buffer now contains mostly GBT work, I have been watching and waiting for the results returned per hour to start moving upwards. Haven't seen any sign of that so far. Still hanging around 102K per hour. Grant and I expect the servers to start having issues again once the results returned per hour starts moving past 120K. Maybe the average turnaround time is much greater for the bulk of the hosts and so it will take a long sustained absence of Arecibo work before the returned results per hour starts climbing again. In the meantime, new Arecibo tasks are being generated again so enjoy the well working project for the while.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1932096 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1932147 - Posted: 26 Apr 2018, 22:38:53 UTC - in response to Message 1932096.  
Last modified: 26 Apr 2018, 22:39:13 UTC

. . Getting a LOT of Arecibo VLARs here again.

Stephen

:(
ID: 1932147 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1932203 - Posted: 27 Apr 2018, 1:38:21 UTC - in response to Message 1932147.  

Yea, the respite from Arecibo tasks was very short. I too am back to mostly a 60/30 mix of Arecibo/GBT work.
Don't know how that happens when there are more splitters on the GBT files compared to the Arecibo files.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1932203 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34744
Credit: 261,360,520
RAC: 489
Australia
Message 1932224 - Posted: 27 Apr 2018, 4:16:31 UTC
Last modified: 27 Apr 2018, 4:17:29 UTC

The only problem that I have ATM with VLARs to GPU is that they are playing havoc with the estimated CPU times on this old rig of mine, which I put back into action almost 5 days ago after 6 years away in the hands of someone else.

Cheers.
ID: 1932224 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1932257 - Posted: 27 Apr 2018, 7:09:30 UTC

Server status numbers are no longer updating. Graphs have flatlined.
Grant
Darwin NT
ID: 1932257 · Report as offensive
rob smith Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22158
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1932272 - Posted: 27 Apr 2018, 9:34:26 UTC

As I type ssp is timed 09:30 utc, or only a couple of minutes old...
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1932272 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1932274 - Posted: 27 Apr 2018, 9:41:11 UTC - in response to Message 1932272.  

As I type ssp is timed 09:30 utc, or only a couple of minutes old...

The page might be that old, unfortunately much of the data is much older.
Grant
Darwin NT
ID: 1932274 · Report as offensive
Cruncher-American Crowdfunding Project Donor, Special Project $75 donor, Special Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1932288 - Posted: 27 Apr 2018, 11:58:21 UTC

Make that 6 hours.
ID: 1932288 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1932325 - Posted: 27 Apr 2018, 14:43:51 UTC

looks like someone got s@h up again...bad news is no WUs at the moment.
ID: 1932325 · Report as offensive
Profile Bill G Special Project $75 donor
Joined: 1 Jun 01
Posts: 1282
Credit: 187,688,550
RAC: 182
United States
Message 1932332 - Posted: 27 Apr 2018, 15:22:26 UTC - in response to Message 1932325.  

looks like someone got s@h up again...bad news is no WUs at the moment.

? all my computers are 100% full.

SETI@home classic workunits 4,019
SETI@home classic CPU time 34,348 hours
ID: 1932332 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24875
Credit: 3,081,182
RAC: 7
Ireland
Message 1932337 - Posted: 27 Apr 2018, 15:36:09 UTC - in response to Message 1932325.  

27/04/2018 15:51:58 | SETI@home | Sending scheduler request: To fetch work.
27/04/2018 15:51:58 | SETI@home | Requesting new tasks for CPU
27/04/2018 15:52:01 | SETI@home | Scheduler request completed: got 28 new tasks
ID: 1932337 · Report as offensive
Profile Kissagogo27 Special Project $75 donor
Joined: 6 Nov 99
Posts: 715
Credit: 8,032,827
RAC: 62
France
Message 1932484 - Posted: 28 Apr 2018, 9:47:14 UTC

at this time
splitter_throttle_sah bruno Running


Database/file status

BOINC Database Engine State                    #                As of*
Master database queries/second                 1,181            0m
Replica seconds behind master                  73               0m

Data Distribution State                        SETI@home v7 #   Astropulse #   SETI@home v8 #   As of*
Results ready to send                          0                0              606,420          0m
Current result creation rate **                0/sec            1.1812/sec     45.7224/sec      5m
Results out in the field                       0                185,781        4,329,142        0m
Results received in last hour **               0                4,153          94,338           0m
Result turnaround time (last hour average) **  0.00 hours       29.97 hours    35.13 hours      0m
Results returned and awaiting validation       0                108,120        3,605,449        0m
Workunits waiting for validation               0                0              217              0m
Workunits waiting for assimilation             0                2              190              0m
Workunit files waiting for deletion            0                97             2,748            0m
Result files waiting for deletion              0                1              0                0m
Workunits waiting for db purging               0                34,394         1,097,668        0m
Results waiting for db purging                 71               73,331         2,276,984        0m
Transitioner backlog (hours)                   0                                                0m

ID: 1932484 · Report as offensive
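
As a rough cross-check, the snapshot above can be compared with the 120K results-returned-per-hour level that earlier posts in this thread associate with server trouble. The figures are copied from the table. Whether that level applies to the SETI@home v8 column alone or to all applications combined is not stated in the thread, so the sketch below simply computes both; the variable names are illustrative.

```python
# "Results received in last hour" figures copied from the snapshot above.
received_last_hour = {
    "SETI@home v7": 0,
    "Astropulse": 4_153,
    "SETI@home v8": 94_338,
}

# Sustained hourly return level at which earlier posts expect issues.
TROUBLE_LEVEL = 120_000

v8_returned = received_last_hour["SETI@home v8"]
all_returned = sum(received_last_hour.values())

# How far each figure sits below the trouble level.
v8_headroom = TROUBLE_LEVEL - v8_returned
all_headroom = TROUBLE_LEVEL - all_returned

# Relative increase in v8 returns needed to reach the trouble level.
v8_increase_pct = 100 * v8_headroom / v8_returned

print(f"v8 only:  {v8_returned:,} returned, {v8_headroom:,} below the level")
print(f"all apps: {all_returned:,} returned, {all_headroom:,} below the level")
print(f"v8 returns would need to rise by about {v8_increase_pct:.1f}%")
```

Either way, this snapshot sits well under the level at which problems were seen before, which fits the reports of the servers behaving for now.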
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1932487 - Posted: 28 Apr 2018, 10:07:19 UTC - in response to Message 1932484.  

at this time
splitter_throttle_sah bruno Running

It's been running most of the day. Doesn't seem to have had much effect on anything.
Grant
Darwin NT
ID: 1932487 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1932523 - Posted: 28 Apr 2018, 16:09:49 UTC - in response to Message 1932487.  

I too don't see much of an effect of it running on the SSP. Unless it has something to do with the frequency/amplitude change in the result creation rate Weekly graph at Haveland. Since the process came into existence that graph is showing much higher frequency cycles between off and on. The amplitude is slightly lower too.

Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1932523 · Report as offensive
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1639
Credit: 12,921,799
RAC: 89
New Zealand
Message 1932579 - Posted: 29 Apr 2018, 0:56:38 UTC

It will be interesting to see how much work comes from blc05_2bit_blc05_guppi_58152_83520_DIAG_PSR_J0645+5158_0006. Its size is 0.00 GB. Does anybody know what the +5158_0006 means? I am thinking perhaps they are now joining little bits of tape onto bigger tapes to bring them up to just over 104 GB.
ID: 1932579 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1932587 - Posted: 29 Apr 2018, 2:38:34 UTC - in response to Message 1932579.  

No, that isn't it. The tape names include the name of the target star, exoplanet, pulsar or galaxy. That file just uses the name from the astronomical PSR catalog of pulsars.

PSR J0645+5158 - Simbad
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1932587 · Report as offensive
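
To make the answer above concrete, here is a small Python sketch that pulls the target name out of the file name from the question. The field layout used in the pattern (two numeric fields after `guppi`, then the target, then a trailing number) is an assumption inferred from this one name, not a documented format. The one part that is standard is the pulsar J-name itself: it encodes right ascension as hours and minutes and declination as signed degrees and arcminutes, so the `+5158` belongs to the pulsar's name. The meaning of the trailing `_0006` is not explained in the thread, so it is treated as an opaque suffix.

```python
import re

name = "blc05_2bit_blc05_guppi_58152_83520_DIAG_PSR_J0645+5158_0006"

# Assumed layout: <prefix>_guppi_<number>_<number>_<target>_<trailing number>
layout = re.compile(
    r"^(?P<prefix>.+)_guppi_(?P<first>\d+)_(?P<second>\d+)"
    r"_(?P<target>.+)_(?P<suffix>\d+)$"
)

match = layout.match(name)
target = match.group("target")
suffix = match.group("suffix")

# Pulsar J-name: J + RA hours, RA minutes, sign, Dec degrees, Dec arcminutes.
jname = re.search(r"J(\d{2})(\d{2})([+-])(\d{2})(\d{2})", target)
ra_h, ra_m, sign, dec_d, dec_m = jname.groups()

print("target field:", target)
print(f"right ascension: {ra_h}h {ra_m}m")
print(f"declination: {sign}{dec_d} deg {dec_m} arcmin")
print("trailing suffix (meaning not confirmed):", suffix)
```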


 