Panic Mode On (111) Server Problems?

Message boards : Number crunching : Panic Mode On (111) Server Problems?
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1929317 - Posted: 10 Apr 2018, 23:52:23 UTC - in response to Message 1929281.  

Heh, tons and tons of Arecibo .VLARs in the "Results ready to send" queue now.
Don't expect many WUs for Nvidia GPUs ....

Edit: Well, except for those 60+ that I just got for my 980 that is :-)


. . I'm hoping the logjam effect of the Arecibo VLAR WUs has been mitigated in some way. Call me an optimist :)

Stephen

:)
ID: 1929317 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1929320 - Posted: 11 Apr 2018, 0:16:46 UTC - in response to Message 1929317.  
Last modified: 11 Apr 2018, 0:17:52 UTC

Heh, tons and tons of Arecibo .VLARs in the "Results ready to send" queue now.

Just curious. Where do we see that?
AFAIK RTS shows the total SETI@home v8 # available, but does not discriminate how many of each flavor.
ID: 1929320 · Report as offensive
Cruncher-American Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor

Joined: 25 Mar 02
Posts: 1513
Credit: 370,893,186
RAC: 340
United States
Message 1929338 - Posted: 11 Apr 2018, 2:26:46 UTC

In some of the new Arecibo files, a few of the work units generated by the splitter code seem to screw up the SoG app: they take forever to run (I have had a few run for upwards of an hour, with the remaining time still growing; they normally run for 15 minutes or so on my rigs). I noticed it first with 03apr18??, and now with 12dc17aa.

Somehow the WUs generated are incompatible with the app code.

Anyone else notice this? What can be done about it, if I am correct?
ID: 1929338 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1929341 - Posted: 11 Apr 2018, 2:33:33 UTC - in response to Message 1929338.  
Last modified: 11 Apr 2018, 2:36:03 UTC

This needs to be brought to Raistmer's attention. All the aborted tasks took WAY too long to compute. The one thing they have in common is a VERY large angle range, the highest that I have seen or can recall: over 10, in fact. It's possible that an angle range that high confuses the app by falling outside its expected compute bounds, so it hangs.

[Edit] Just took a look at my two new aborted tasks that hung up. AR = 151. I didn't notice that before, and it is now the highest angle range I have ever seen. This is likely the reason for the stuck tasks.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1929341 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1929364 - Posted: 11 Apr 2018, 5:04:46 UTC

File deletion backlog continues to grow...
Grant
Darwin NT
ID: 1929364 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1929370 - Posted: 11 Apr 2018, 6:07:30 UTC - in response to Message 1929367.  

Thanks for the info Tut. Was not a part of the Beta SoG testing so never have seen these before. If I get another one I will ignore the High Priority running and see if it completes in normal time.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1929370 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1929374 - Posted: 11 Apr 2018, 7:54:35 UTC

Just noticed an AP WU that had been downloading for 30 min & was only halfway done. I suspended & re-enabled networking & it finished off at almost 1 MB/s (usual download rate is 200-250 kB/s).
Grant
Darwin NT
ID: 1929374 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22204
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1929382 - Posted: 11 Apr 2018, 9:40:28 UTC

Juan, in response to:
Just curious. Where do we see that?


People glean the information about the nature of the tasks being distributed (VLAR/LHAR/Normal) from studying the tasks they receive. It is possible for two people to get an entirely different mix even when their requests for tasks are made within the same time slot.

AFAIK RTS shows the total SETI@home v8 # available, but does not discriminate how many of each flavor.

You are right, the Server Stats Page does not discriminate between source telescopes for the "normal" SETI tasks.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1929382 · Report as offensive
Profile Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 1929383 - Posted: 11 Apr 2018, 9:47:26 UTC
Last modified: 11 Apr 2018, 10:17:17 UTC

I could fall into a bath tub full of boobs and still end up sucking my thumb. All I see are numbers and letters on each task; the only time I know I have something different is when it shows an AP job. Yeah, I'm thick :)

Sorry, that was my attempt at humour. I noticed the threads getting a little tense out there.
ID: 1929383 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1929428 - Posted: 11 Apr 2018, 16:30:48 UTC - in response to Message 1929383.  

LOL. Love your comment as I had never run across that euphemism before.

It's pretty simple to determine what kind of task you have and where it originated from the naming convention: look at the first letters or numbers in the task name.

BLC (Breakthrough Listen Project) tasks always start with "blc" in the name. They are only Multibeam tasks. They originate from the Green Bank Telescope in West Virginia.

AP tasks always start with "ap" in the name. They originate solely from the Arecibo Telescope in Puerto Rico.

Any task starting with just numbers is an Arecibo Multibeam task. They originate solely from the Arecibo Telescope.
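
The naming rules above can be sketched as a small classifier. This is only an illustration in Python, not anything the project ships: the function names `classify_task` and `is_vlar` and the returned labels are made up for the example. Matching is case-insensitive because the blc names quoted elsewhere in this thread are lowercase, and the `.vlar` check is an assumption based on the single task name quoted in this thread.

```python
def classify_task(name: str) -> str:
    """Guess where a task came from using only the start of its name."""
    lowered = name.lower()
    if lowered.startswith("blc"):
        # Breakthrough Listen data recorded at Green Bank.
        return "blc (Green Bank)"
    if lowered.startswith("ap"):
        # AP tasks, which come only from Arecibo.
        return "ap (Arecibo)"
    if lowered[:1].isdigit():
        # Names that start with digits are Arecibo tasks.
        return "numeric (Arecibo)"
    return "unknown"


def is_vlar(name: str) -> bool:
    """Assumption: VLAR tasks carry a '.vlar' marker in the task name."""
    return ".vlar" in name.lower()
```

For example, `classify_task("12dc17aa")` falls into the numeric Arecibo branch, while a hypothetical name such as `ap_example` would be reported as an AP task.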
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1929428 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1929458 - Posted: 11 Apr 2018, 20:26:27 UTC - in response to Message 1929428.  

. . And the Arecibo tasks are friendly in that the first 6 characters are the file date, as in 12dc17 is the 12th of December 2017. The date component in the blc file names comes later and is a Modified Julian Date. You will need a date converter to make sense of those, but they are readily available on the internet :)
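
Both date schemes can be decoded with a few lines of Python. This is a rough sketch with the assumptions flagged: the helper names are invented here; the month table only contains the one code seen in this thread (`dc`); two-digit years are assumed to mean 20xx; and the MJD is assumed to be the field right after `guppi` in a blc name, which fits the blc task name quoted in this thread but is not confirmed by it.

```python
from datetime import date, timedelta

# Day 0 of the Modified Julian Date scale (MJD = JD - 2400000.5).
MJD_EPOCH = date(1858, 11, 17)

# Two-letter month codes. Only "dc" appears in this thread; add others
# as you come across them rather than trusting a guessed table.
MONTH_CODES = {"dc": 12}


def mjd_to_date(mjd: int) -> date:
    """Convert a whole-day Modified Julian Date to a calendar date."""
    return MJD_EPOCH + timedelta(days=mjd)


def arecibo_file_date(name: str) -> date:
    """Decode the leading ddmmyy characters, e.g. '12dc17' is 12 Dec 2017."""
    day = int(name[0:2])
    month = MONTH_CODES[name[2:4]]  # KeyError for codes not in the table
    year = 2000 + int(name[4:6])    # assumption: all data is from 20xx
    return date(year, month, day)


def blc_file_date(name: str) -> date:
    """Decode the MJD from a blc name (assumed to follow the 'guppi' field)."""
    fields = name.split("_")
    mjd = int(fields[fields.index("guppi") + 1])
    return mjd_to_date(mjd)
```

With the names quoted in this thread, `arecibo_file_date("12dc17aa")` gives 2017-12-12, and a blc name carrying `guppi_58137` decodes to 2018-01-19.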

Stephen

:)
ID: 1929458 · Report as offensive
Sirius B Project Donor
Volunteer tester
Joined: 26 Dec 00
Posts: 24879
Credit: 3,081,182
RAC: 7
Ireland
Message 1929474 - Posted: 11 Apr 2018, 22:09:47 UTC

Some light relief :-) It's amazing what one comes across when following links.

Employ Cliff to sort out the database :-)

Nice one UCB :-)
ID: 1929474 · Report as offensive
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1929482 - Posted: 11 Apr 2018, 23:21:17 UTC - in response to Message 1929474.  

. . Maybe we need Cliff to find ET, though he would be further away than Germany ...

Stephen

:)
ID: 1929482 · Report as offensive
Profile Stargate (SA)
Volunteer tester
Joined: 4 Mar 10
Posts: 1854
Credit: 2,258,721
RAC: 0
Australia
Message 1929513 - Posted: 12 Apr 2018, 3:20:52 UTC

Thanks Keith and Stephen for explaining it for me :)
ID: 1929513 · Report as offensive
Profile Unixchick Project Donor
Joined: 5 Mar 12
Posts: 815
Credit: 2,361,516
RAC: 22
United States
Message 1929523 - Posted: 12 Apr 2018, 4:47:33 UTC

I'm guessing the mix of data streams has messed up the flow. The ready-to-send count is high, the number out in the field has dropped, and the results ready to purge from the db are at 5 million and still going up.

Is this a problem?
ID: 1929523 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1929531 - Posted: 12 Apr 2018, 5:49:44 UTC - in response to Message 1929523.  
Last modified: 12 Apr 2018, 5:52:32 UTC

The climbing Results-awaiting-purge count, along with the WU-awaiting-deletion count, isn't a good sign.

And to add to that, for the last 4 hours I've been getting quite a few "Temporarily failed download: transient HTTP error" showing up in my Event logs.
Grant
Darwin NT
ID: 1929531 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1929536 - Posted: 12 Apr 2018, 7:03:14 UTC

No problem getting work so far. And no comms issues with the servers. Crossing fingers that holds.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1929536 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13736
Credit: 208,696,464
RAC: 304
Australia
Message 1929541 - Posted: 12 Apr 2018, 7:54:08 UTC - in response to Message 1929536.  

Yeah, looks like the usual "post about a problem, and it stops" occurred again.
Grant
Darwin NT
ID: 1929541 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1929549 - Posted: 12 Apr 2018, 9:53:40 UTC

I think I found a corrupted task file ... blc11_2bit_guppi_58137_42360_HIP64241_0045.2736.818.22.45.145.vlar_6
It seems to be running into download errors now.
I'm #7 on the list.
http://setiathome.berkeley.edu/workunit.php?wuid=2865073343
ID: 1929549 · Report as offensive
Ghia
Joined: 7 Feb 17
Posts: 238
Credit: 28,911,438
RAC: 50
Norway
Message 1929558 - Posted: 12 Apr 2018, 11:28:19 UTC - in response to Message 1929549.  

blc11 ??
Humans may rule the world...but bacteria run it...
ID: 1929558 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.