Panic Mode On (114) Server Problems?

Profile Chris904395093209d Project Donor
Volunteer tester

Joined: 1 Jan 01
Posts: 112
Credit: 29,923,129
RAC: 6
United States
Message 1973632 - Posted: 5 Jan 2019, 16:12:57 UTC - in response to Message 1973626.  

it seems that the "no task available " horror show is back ...

forum is a little bit laggy too ;)


I'm at work so I can't check my system logs, but I noticed an hour or so ago, the statuses for multiple services on the server status page were an hour old. Right now, we have over 700k tasks ready to send, and all splitters are offline. I would suspect with the splitters offline, the backend system will catch up a bit. At least until the pendulum starts to swing the other way again.
~Chris

ID: 1973632
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1973635 - Posted: 5 Jan 2019, 16:18:14 UTC - in response to Message 1973626.  

You missed the memo?
They expanded "Search for Extraterrestrial Intelligence" to include tasks, and servers.
ID: 1973635
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1973692 - Posted: 5 Jan 2019, 22:43:08 UTC - in response to Message 1973464.  

some stuck in tape processing ... not FIFO ? but LIFO ?

not seen any BLC 15 14 12 6 5 4 Wu in my cache ...


. . That's because the Blc16 tapes being split are older than the tapes of the other varieties. When the splitters reach tapes of the same age, the others will begin to split again. First the Blc15s because they have the next oldest tapes, then the 14/12s etc.

Stephen

:)

blc05_2bit_guppi_58406_33654_DIAG_3C249_1_0120 has been in the queue since last year; it is just over 10 GB in size. I wonder when it will be split.
ID: 1973692
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1973705 - Posted: 5 Jan 2019, 23:43:00 UTC - in response to Message 1973692.  


blc05_2bit_guppi_58406_33654_DIAG_3C249_1_0120 has been in the queue since last year; it is just over 10 GB in size. I wonder when it will be split.


. . If they don't load any older tapes in the meantime it should split within the next 3 or 4 days. It is one of the "youngest" tapes presently loaded.

Stephen

..
ID: 1973705
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1973709 - Posted: 6 Jan 2019, 0:00:11 UTC - in response to Message 1973705.  


blc05_2bit_guppi_58406_33654_DIAG_3C249_1_0120 has been in the queue since last year; it is just over 10 GB in size. I wonder when it will be split.


. . If they don't load any older tapes in the meantime it should split within the next 3 or 4 days. It is one of the "youngest" tapes presently loaded.

Stephen

..

Thanks Stephen. Are you working this out by converting the dates to the date as we know it (as we would write it on paper)?
ID: 1973709
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1973719 - Posted: 6 Jan 2019, 1:11:09 UTC - in response to Message 1973709.  

. . If they don't load any older tapes in the meantime it should split within the next 3 or 4 days. It is one of the "youngest" tapes presently loaded.
Stephen

Thanks Stephen. Are you working this out by converting the dates to the date as we know it (as we would write it on paper)?


. . To check the actual age yes. But for relative age it is much simpler. They are all named using the modified Julian date, so the lower the number in the file name, the older the tape and the higher the number the more recent it is. The Blc number is the recorder channel that sourced the tape.

blc05_2bit_guppi_58406_29306

blc05_2bit_guppi_58406_30679

. . The first 5 digit number after guppi is the date and the second is the daily offset in seconds. Hence the first file I listed is about 20 minutes older than the second one. And there is an even simpler way to check which are the older tapes. Each file is given a sequential number.

HIP21036_0116

HIP20440_0117

. . The descriptor such as 'HIP21036' identifies the part of the sky the telescope is aimed at from reference star charts, this one is looking at the constellation Hippocampus (the Seahorse). Then comes the sequence number. 116 is older than 117 ... :) If you traced the target descriptors on the appropriate star chart the sequence numbers should give you the path the telescope was tracking across the sky. This may not be contiguous because unlike Arecibo, GBT is a fully steerable telescope, but since all recorders are fed from the same telescope they should all follow the same progression of targets.

. . So when we are looking at tapes all from the same night, such as we are now (going from 58405 to 58406) captured from half a dozen different recorder channels but all sharing the same daily sequence linkage, a blc16 tape with the sequence 100 is older than a blc04 tape with the sequence 118, etc. As the splitters have progressed through the blc16 and now blc15 tapes to a point nearing the oldest sequence numbers of the other channel tapes we will see those tapes, which though mounted earlier were recorded more recently in time, begin to be split. Unless older tapes are mounted which then take precedence over all current tapes. As you may have now deduced, for the last 6 weeks or so we have been processing the data from one night :).
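
. . For anyone who wants to check the dates themselves, here is a minimal Python sketch of the naming scheme described above. It assumes the layout blcNN_2bit_guppi_<MJD>_<seconds>, which is only inferred from the names quoted in this thread, and the helper names mjd_to_datetime and parse_tape are made up for illustration; they are not part of any SETI@home tool.

```python
from datetime import datetime, timedelta, timezone

# Modified Julian Date 0 is midnight UTC on 17 November 1858.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjd_to_datetime(mjd, seconds=0):
    """Convert an MJD day number plus a seconds-into-day offset to UTC."""
    return MJD_EPOCH + timedelta(days=mjd, seconds=seconds)


def parse_tape(name):
    """Split a tape name into fields (layout assumed from this thread)."""
    parts = name.split("_")
    guppi = parts.index("guppi")          # the date fields follow "guppi"
    info = {
        "recorder": parts[0],             # e.g. "blc05"
        "mjd": int(parts[guppi + 1]),     # day number
        "seconds": int(parts[guppi + 2])  # offset into that day
    }
    info["recorded"] = mjd_to_datetime(info["mjd"], info["seconds"])
    return info


older = parse_tape("blc05_2bit_guppi_58406_29306")
newer = parse_tape("blc05_2bit_guppi_58406_30679")
gap = newer["recorded"] - older["recorded"]

print(older["recorded"].date())    # calendar date for MJD 58406
print(gap.total_seconds() / 60)    # gap between the two tapes in minutes
```

For the two names quoted above this gives 15 October 2018 and a gap of 1373 seconds, just under 23 minutes. Sorting on the (mjd, seconds) pair gives the same relative order as comparing the numbers by eye.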

. . Unless I have gotten something wrong I hope I haven't confused you even more :)

Stephen

:)
ID: 1973719
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1973725 - Posted: 6 Jan 2019, 2:16:55 UTC

The descriptor such as 'HIP21036' identifies the part of the sky the telescope is aimed at from reference star charts, this one is looking at the constellation Hippocampus (the Seahorse)


This is incorrect Stephen. Stars are commonly identified by their catalog number, in this case the Hipparcos catalog, named after the ESA Hipparcos satellite (whose name in turn honors the ancient Greek astronomer Hipparchus). HIP21036 identifies the star 83 Tauri, which is in the constellation Taurus.
https://www.universeguide.com/star/83tauri
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1973725
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1973743 - Posted: 6 Jan 2019, 4:21:15 UTC - in response to Message 1973719.  
Last modified: 6 Jan 2019, 5:14:09 UTC

Thanks Stephen for your explanation. I have a little more of an understanding but I am still a little confused. I did not realise we have been processing information from the same night. I will be happy when those little files are processed. It's just me: I hate seeing small files sitting waiting when they would most likely be processed a lot quicker than the current work we are doing.
ID: 1973743
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1973749 - Posted: 6 Jan 2019, 5:12:38 UTC - in response to Message 1973725.  

The descriptor such as 'HIP21036' identifies the part of the sky the telescope is aimed at from reference star charts, this one is looking at the constellation Hippocampus (the Seahorse)


This is incorrect Stephen. Stars are commonly identified by their catalog number, in this case the Hipparcos catalog, named after the ESA Hipparcos satellite (whose name in turn honors the ancient Greek astronomer Hipparchus). HIP21036 identifies the star 83 Tauri, which is in the constellation Taurus.
https://www.universeguide.com/star/83tauri


. . Ah, my mistake, I should have checked before leaping to a mistaken conclusion ...

. . I thought they were referring to constellations but they are in fact stars. You learn something every day.

Stephen

oops
ID: 1973749
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13732
Credit: 208,696,464
RAC: 304
Australia
Message 1973767 - Posted: 6 Jan 2019, 7:13:38 UTC

It's that time again.
Most Server stats not updating, "Project has no tasks available" is the usual response to work requests.
Grant
Darwin NT
ID: 1973767
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1973862 - Posted: 6 Jan 2019, 19:58:07 UTC

Panic! RTS is falling. Can't see the files to split in the status... general weirdness.
ID: 1973862
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1973867 - Posted: 6 Jan 2019, 20:10:09 UTC - in response to Message 1973862.  
Last modified: 6 Jan 2019, 20:11:06 UTC

Panic! RTS is falling. Can't see the files to split in the status... general weirdness.


. . The RTS is down but not critical yet, it is just below 500K. The replica status is 2 Hrs stale but everything else is showing pretty much up to date and the file splitting status just isn't showing at all. The split rate is very low @ 1.6/sec but the splitters themselves are showing online and OK.

. . Something is wrong but not yet critically ...

Stephen

? ?
ID: 1973867
Profile Freewill Project Donor
Joined: 19 May 99
Posts: 766
Credit: 354,398,348
RAC: 11,693
United States
Message 1973869 - Posted: 6 Jan 2019, 20:13:10 UTC - in response to Message 1973862.  

Under Splitter status, the page says splitters are all currently offline. That cannot be good, although I did just get a few new jobs.
ID: 1973869
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1973870 - Posted: 6 Jan 2019, 20:17:18 UTC - in response to Message 1973869.  
Last modified: 6 Jan 2019, 20:21:30 UTC

Under Splitter status, the page says splitters are all currently offline. That cannot be good, although I did just get a few new jobs.



Yes, where it normally lists the files being split it does say the splitters are offline, but the process list on the left shows some splitters still running. Maybe they aren't really, since the split rate is so low. Just a general lack of info. There are enough WUs in the RTS queue to keep handing out work for a couple of hours.

edit: looks like the science db is disabled ??
ID: 1973870
J. Mileski
Volunteer tester
Joined: 9 Jun 02
Posts: 632
Credit: 172,116,532
RAC: 572
United States
Message 1973876 - Posted: 6 Jan 2019, 20:37:46 UTC
Last modified: 6 Jan 2019, 20:39:04 UTC

Just upgraded one of my computers to cuda 9.4 and I get

Sun 06 Jan 2019 08:30:36 PM UCT | SETI@home | Reporting 1 completed tasks
Sun 06 Jan 2019 08:30:36 PM UCT | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Sun 06 Jan 2019 08:30:36 PM UCT | SETI@home | [sched_op] CPU work request: 890024.20 seconds; 10.00 devices
Sun 06 Jan 2019 08:30:36 PM UCT | SETI@home | [sched_op] NVIDIA GPU work request: 86400.00 seconds; 1.00 devices
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | Scheduler request completed: got 0 new tasks
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | Project is temporarily shut down for maintenance
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | Project requested delay of 3600 seconds
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | [sched_op] Deferring communication for 01:00:00
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | [sched_op] Reason: project requested delay

The server status page is showing the SETI@home science database (paddym) as Disabled.
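
For anyone who wants to pull the back-off out of an event log like the one above, here is a rough Python sketch. It assumes the "timestamp | project | message" layout seen in the pasted lines, and the function name requested_delays is made up for illustration; it is not part of the BOINC client.

```python
import re
from datetime import timedelta

# Three of the log lines pasted above, copied verbatim.
LOG = """\
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | Scheduler request completed: got 0 new tasks
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | Project is temporarily shut down for maintenance
Sun 06 Jan 2019 08:30:38 PM UCT | SETI@home | Project requested delay of 3600 seconds
"""

DELAY_RE = re.compile(r"Project requested delay of (\d+) seconds")


def requested_delays(log_text):
    """Return (project, timedelta) for every 'requested delay' line."""
    found = []
    for line in log_text.splitlines():
        # Split into timestamp, project and message; skip anything else.
        fields = [f.strip() for f in line.split("|", 2)]
        if len(fields) != 3:
            continue
        match = DELAY_RE.search(fields[2])
        if match:
            delay = timedelta(seconds=int(match.group(1)))
            found.append((fields[1], delay))
    return found


delays = requested_delays(LOG)
for project, delay in delays:
    print(project, "asked the client to wait", delay)
```

The 3600 seconds found this way is the same one hour the client reports on its "Deferring communication for 01:00:00" line.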
ID: 1973876
Profile Chris904395093209d Project Donor
Volunteer tester

Joined: 1 Jan 01
Posts: 112
Credit: 29,923,129
RAC: 6
United States
Message 1973879 - Posted: 6 Jan 2019, 21:06:45 UTC

Most services are showing disabled. Somebody at Seti doing some work on the systems today?
~Chris

ID: 1973879
W3Perl Project Donor
Volunteer tester

Joined: 29 Apr 99
Posts: 251
Credit: 3,696,783,867
RAC: 12,606
France
Message 1973886 - Posted: 6 Jan 2019, 21:45:30 UTC

137.164.11.37 (a host at cenic.org, which provides internet connectivity for California) was down from 14 h to 20 h.
So many computers were not able to reach Seti to send their results.
Around 20 h the problem was solved. Maybe too many requests broke some Seti service?
ID: 1973886
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1973889 - Posted: 6 Jan 2019, 21:50:08 UTC
Last modified: 6 Jan 2019, 21:54:08 UTC

files that had previously been split are now back on the files list. I'm going to take a wild guess and say a restore to a previous state has happened for some reason.

Edit: looks like some Seti person is working on a Sunday. I'll send my thanks, and I hope it isn't too bad.
ID: 1973889
JohnDK Crowdfunding Project Donor*Special Project $250 donor
Volunteer tester
Joined: 28 May 00
Posts: 1222
Credit: 451,243,443
RAC: 1,127
Denmark
Message 1973894 - Posted: 6 Jan 2019, 22:15:26 UTC

Getting work again.
ID: 1973894
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1973899 - Posted: 6 Jan 2019, 22:45:22 UTC

Can report finished tasks on most connections but getting nothing in return. Out of gpu work on 3 hosts and rapidly depleting what's left on the 2 remaining with gpu work. Too many people fighting over what's available currently it seems.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1973899


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.