Panic Mode On (112) Server Problems?
Author | Message |
---|---|
Speedy Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
blc13_2bit_blc13_guppi_58166_59941_DIAG_PSR_J1909-3744_0006 looks as if it's full of errors, judging by the SSP. However, it doesn't appear to be affecting splitter output. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . OK that is officially the shortest outage I have ever seen. 3 hours and change. . . Don't I feel silly ... :) Stephen :) |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Stephen "Heretic" wrote: ". . OK that is officially the shortest outage I have ever seen. 3 hours and change." Me too. I wish we could count on one or the other, long or short. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Looks like the splitters are getting tired. Going along OK, then take a break for a couple of hours. Get going again for a few more hours, then taking another 1-2hr break. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
What is it about the weekend? Splitter output back to being less than level of demand. Grant Darwin NT |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
there is definitely something wrong with the splitters. The output has dropped. |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
looking at the results to send numbers I'm guessing that the throttle range has been reset from in the 500s to in the 300s. I think that this is the new "normal" place for the results to send value. I'm concerned how this will affect recovery after Tuesday's planned outage. Might be ok if we have an outage and not an outrage. Guess we will find out Tuesday. |
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
Unixchick wrote: "looking at the results to send numbers I'm guessing that the throttle range has been reset from in the 500s to in the 300s. I think that this is the new 'normal' place for the results to send value. I'm concerned how this will affect recovery after Tuesday's planned outage. Might be ok if we have an outage and not an outrage. Guess we will find out Tuesday." I don't think the limiter on the splitters has been changed. They seem to get in a tangle and output doesn't keep up with demand, so RTS drops off. It was boosted by the arrival of a multibeam dataset, which increased splitter output while it was running. With that done, RTS will probably continue to drop off again, unless somebody steps in to realign the DB again, or some more multibeam work is added to the cache to split. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Hmmmm, I can see both comments having validity. If, as Unixchick says, we no longer have outrages and the new normal outage is 3-4 hours, I think the RTS buffer in the 300K range would suffice. I can see a benefit to the database size if you don't have to allocate space for those 300K of extra tasks that would not need to be split to pump up the RTS buffer to the past traditional size of 600K. Will have to wait and monitor to see where we're headed. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
rts is now back in the 500k range with the splitting of some AP/MP data sets. Here is my new theory. The throttle is only on the gbt data as the AP/MP data is so minimal at this point. As AP/MP data is added to the queue the rts builds to 500k to 600k range total, but when the gbt portion of the rts reaches the 400k the throttle kicks in and it is all MP data being added until some of the gbt data gets taken out of the queue. As long as there is some MP data in the rts then the numbers stay up in the 500k range. This effect lingers as errors and timeouts cause resends so it doesn't fall the moment all MP data is split. It looks like they are splitting 10jn18aa right now. Hopefully they have one or two AP/MP files held back to split on Tuesday after the outage (I'm an optimist, so no outrage). My theories are just playful thoughts as I'm just happy I am getting a good supply of data and that the newbies that have come to join the project (from seeing HBO special??) are seeing a nice stable system that has handled the added load. I admit I was worried when the rts numbers fell to the 300k range, but all is good. |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14650 Credit: 200,643,578 RAC: 874 |
Unixchick wrote: "It looks like they are splitting 10jn18aa right now. Hopefully they have one or two AP/MP files held back to split on Tuesday after the outage (I'm an optimist, so no outrage)." 10jn18aa is yesterday's recording at Arecibo, so I doubt they have anything 'held back'. But tomorrow, they may have today's recording - fingers crossed. |
Unixchick Joined: 5 Mar 12 Posts: 815 Credit: 2,361,516 RAC: 22 |
ok the splitter for MP went to 0 around 650k rts . so there is a throttle on MP... but I think there are different throttles for gbt and mp and maybe a secondary total throttle... hmm. |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Unixchick wrote: "ok the splitter for MP went to 0 around 650k rts . so there is a throttle on MP... but I think there are different throttles for gbt and mp and maybe a secondary total throttle... hmm." Where you're typing MP I think you mean MB. AP = AstroPulse, MB = Multi beam. I think it's just a case of the splitters have had a good rest & are now back to putting out enough work to keep the Ready-to-send buffer full. At least until the next time they revert to go slow mode. Grant Darwin NT |
Grant (SSSF) Joined: 19 Aug 99 Posts: 13736 Credit: 208,696,464 RAC: 304 |
Well that sucks. Looks like there's some dodgy WUs out there. 09jn18aa.15655.365782.9.36.243_1: Outcome: Computation error; Client state: Compute error; Exit status: -112 (0xFFFFFF90) ERR_XML_PARSE. As did 05jn18aa.24200.1472803.10.37.225_1: application: SETI@home v8; created: 10 Jun 2018, 10:53:59 UTC; minimum quorum: 2; initial replication: 2; max # of error/total/success tasks: 5, 10, 5; errors: Too many errors (may have bug). And 05jn18aa.24200.1472803.10.37.231_0 Has crashed & burnt 3 times so far. Grant Darwin NT |
Keith Myers Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Yes, there seem to be a few dodgy workunits out there. Task 3008834112. We all seem to have been hit with: </stderr_txt> <message> upload failure: <file_xfer_error> <file_name>05jn18aa.18403.1473928.7.34.147_2_r1038861265_0</file_name> <error_code>-161 (not found)</error_code> </file_xfer_error> errors. Task names don't match. Assume that is why the file isn't found. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Ghia Joined: 7 Feb 17 Posts: 238 Credit: 28,911,438 RAC: 50 |
Grant (SSSF) wrote: "Well that sucks." I've had a couple, too. 05jn18aa.18372.1473928.6.33.130_2: Stopped at 0.0 and 0.0 on the clock, Exit status -6 (0xFFFFFFFA) Unknown error code. Crashed 5 times so far. 05jn18aa.18403.1473928.7.34.113_2: Stopped at 0.0 and 0.0 on the clock, Exit status -6 (0xFFFFFFFA) Unknown error code. Crashed 4 times so far. Humans may rule the world...but bacteria run it... |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
Grant (SSSF) wrote: "Well that sucks." . . Yep I've been seeing them too. Had a few on a couple of the rigs. Stephen :( |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
SETI@home error -6 Bad workunit header. Along with those I'm also getting a few Download Errors: <error_code>-119 (md5 checksum failed for file)</error_code>. I'm not seeing any Download Errors anywhere else, and why do I never see any Upload Errors? Uploading to SETI is different than Downloading? I Dunno... |
rob smith Joined: 7 Mar 03 Posts: 22200 Credit: 416,307,556 RAC: 380 |
Upload is subtly different, and the files are smaller... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
TBar Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
That's nice. I can Download 5+ gigabyte files from Apple, 1+ gigabyte from nVidia and Ubuntu, but it fails on less than a megabyte from SETI. The machines Upload just as many files to SETI as they Download, never seen an Upload failure. |
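
For anyone puzzled by the paired numbers in the error reports quoted in this thread, such as -112 (0xFFFFFF90) and -6 (0xFFFFFFFA): the hex value is simply the same negative code viewed as an unsigned 32-bit integer. Below is a minimal Python sketch of that conversion. The function names and the small lookup table are hypothetical helpers made up for illustration, and the labels are copied from the posts above rather than from any official BOINC error list.

```python
# Labels copied from posts in this thread; NOT an official BOINC error table.
THREAD_ERROR_LABELS = {
    -6: "Bad workunit header",
    -112: "ERR_XML_PARSE",
    -119: "md5 checksum failed for file",
    -161: "not found",
}


def to_unsigned32(code: int) -> int:
    """Return the code as an unsigned 32-bit value (two's complement)."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Return a 32-bit value as a signed integer (inverse of to_unsigned32)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def describe(code: int) -> str:
    """Format a code the way the task pages show it: signed, hex, label."""
    label = THREAD_ERROR_LABELS.get(code, "not mentioned in this thread")
    return f"{code} (0x{to_unsigned32(code):08X}) {label}"


if __name__ == "__main__":
    # The two exit statuses reported for the crashing tasks.
    for exit_status in (-112, -6):
        print(describe(exit_status))
```

Going the other way, to_signed32(0xFFFFFFFA) gives back -6, which is handy when a log shows only the hex form.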