Panic Mode On (108) Server Problems?
Author | Message |
---|---|
arkayn Send message Joined: 14 May 99 Posts: 4438 Credit: 55,006,323 RAC: 0 |
|
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
My theory is there must be around 600k Arecibo VLARs in the RTS. At that count level the creation rate drops off meaning very few tasks are being sent. The extra 100000 were sent out quickly, so, that would imply they weren't VLARs. Solution, temporarily raise the RTS to 1 Mil and hope most of the new additions aren't VLARs. Once the VLARs drop off lower the RTS to normal. Problem solved...for now. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"My theory is there must be around 600k Arecibo VLARs in the RTS. At that count level the creation rate drops off meaning very few tasks are being sent. The extra 100000 were sent out quickly, so, that would imply they weren't VLARs. Solution, temporarily raise the RTS to 1 Mil and hope most of the new additions aren't VLARs. Once the VLARs drop off lower the RTS to normal." . . Well, that theory hasn't been borne out. I have re-configured (changed location) this rig to take CPU work, since there is no GPU work, but still nada, zip, zilch! None of the flood of Arecibo VLAR tasks that I was hoping for. I can't get work by begging for it ... Stephen :( |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
"My theory is there must be around 600k Arecibo VLARs in the RTS. At that count level the creation rate drops off meaning very few tasks are being sent. The extra 100000 were sent out quickly, so, that would imply they weren't VLARs. Solution, temporarily raise the RTS to 1 Mil and hope most of the new additions aren't VLARs. Once the VLARs drop off lower the RTS to normal." As usual, what works for most people doesn't seem to work for you: https://setiathome.berkeley.edu/results.php?hostid=8097309 All I did was increase the cache setting a little, and I instantly received 30 VLARs on the next contact. If I needed GPU tasks I would simply reassign them to the GPUs. Arecibo VLARs run about the same as a BLC5 on the Special App. Watch as the first two appear as finished in about 13 minutes... Hmmm, looks as though my clanged together Maxwell zi3xs3 is a little faster on the Arecibo VLARs than zi3v: 29ja07ad.16591.13571.10.37.63.vlar_1, run time = 10 min 31 sec; 29ja07ad.16591.13571.10.37.69.vlar_1, run time = 10 min 30 sec. Look at that. I downloaded 30 VLARs and the server rewarded me with some real GPU tasks for that machine. Nice. |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
I never get affected by this, but tonight for some reason I did. I changed my profile ("school" vs. "home") twice and updated, and am now refilling. :shrugs: |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1856 Credit: 268,616,081 RAC: 1,349 |
"If no work for selected applications is available, accept work from other applications?: Yes, does not work any longer. That's basically all I'm toggling when I swap from Home to School. I did have to do it a couple times... |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14679 Credit: 200,643,578 RAC: 874 |
That's interesting. The SSP says that the number 'ready to send' has dropped by 400K+ this morning (last five hours back from now), and the number of tasks in progress has gone up by about the same amount. I'd say somebody is getting work - I got over 100 of them. I don't, as yet, have any explanation for that. But an explanation is what is needed first, before you can ask somebody to break into their other busy work schedules and fix it. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Hmmm, looks as though my clanged together Maxwell zi3xs3 is a little faster on the Arecibo VLARs than zi3v," . . When I eventually got some work for the CPU I started getting work for the GPU as well. But after that the GPU filled right up, so I suspect that was when the problem was sorted out. But ironically the first batch of 50 tasks for CPU had not a single Arecibo VLAR. Mind you, after that there were a whole lot of them :( Stephen <shrug> |
dwhirl17 Send message Joined: 19 Feb 17 Posts: 3 Credit: 24,532,548 RAC: 1 |
Just to put in my two cents as well, my 1060s ran out of work last night around 9:00 pm EST. I tried a couple of manual updates and a restart, with no change. Tasks started trickling out at around 10:30 pm EST and were back to normal this morning. Some info on the situation/issue would be appreciated. Knowing whether there is anything we can do to resolve it from the host side would also be helpful. Regards, Doug. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
I would say a lot of people were affected. The dropoff in tasks in progress was real and plotted on the Haveland graphs. That parameter started climbing fast as soon as someone got to the lab this morning and fixed the issue, whatever it was. Yes, it would be nice to get an explanation of what happened. I see that the RTS buffer has fallen all the way down to the 200K level from the 600-700K level it was at all day yesterday, when nobody was getting any work when requested. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Back to not being able to get any work after the outage because of "internal server error" messages. I've been dry on GPU tasks on the Linux cruncher since the project came back online. I wonder if it has anything to do with the number of tasks trying to be reported. I played with that cc_config parameter last week, and dropping the maximum number reported to 40 didn't have any effect. Wonder if I should try again or just wait it out. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
All my machines have reloaded, looking good now. The big question is, WTH is a blc24, https://setiathome.berkeley.edu/show_server_status.php ? Hopefully it isn't 5 times worse than a blc05? Or is it just a blc2.4? I think we are about to find out. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Haha. Hopefully not the first you mentioned. I think it is just the catalog number in a prescribed search the BLC staff has developed. Don't think it has anything to do with the Hipparcos star catalog number. Maybe a map catalog number. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"All my machines have reloaded, looking good now." . . The only theory I can offer is that it is a progression. GBT tasks were originally blc followed by 1-7 (I didn't actually see any 0 tasks but they may have existed, nor did I see any 8s, so I'm not sure at which point the series starts, 0-7 or 1-8??). Then they revised them, adding an extra digit, and they became 01-07, though I have yet to see anything outside 02-05. The current batch of blc04s is remarkable in that, while still VLAR tasks, they run as quickly as Arecibo tasks on GPUs and yet are even faster than blc02s on CPUs. Perhaps then this new number series of blc24 is an identifier for this new variation? As a reference, I believe the numbers correspond to the recorder channels at Green Bank. Interesting though that there was no 1x series. There had been talk of doubling the number of recorders so perhaps they were to be 1x. . . There is always the possibility it is a change of designation to identify 4 bit tasks or something completely different again :) Stephen ?? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"There had been talk of doubling the number of recorders so perhaps they were to be 1x." Do you have a link to this information about doubling the recorders? I must have missed the news. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
How quickly things change. Suddenly the RTS is empty and the creation rate is in the toilet. Trying to grab the last few netted 33 out of 250. I hope you got them while you could. I did get a couple of 24s and even a 25; I'll see how they run. Well, the 24s & 25s appear to be about the same as the old blc3s...so, not that bad. blc25_2bit_guppi_57895_36299_KIC8462852_0002.32486.0.23.46.204.vlar_0: run time 4 min 40 sec, WU true angle range 0.008985; blc24_2bit_guppi_57895_36299_KIC8462852_0002.5027.0.24.47.174.vlar_1: run time 7 min 10 sec, WU true angle range 0.009079. Of course it depends on your machine; note the above difference between a GTX 1060 & 1050Ti. |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Doesn't help that the Haveland graphs have gone missing too. 100% packet loss. |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"There had been talk of doubling the number of recorders so perhaps they were to be 1x." . . It was ages ago, some time before they changed the designator to 0x. I can't help with the link :( There was chat in one thread about what the number in the blc designator meant, and someone (might have been Zalster) posted the link to an article which gave the information about the recorders and the number being associated. That article also mentioned plans to add another bank of eight recorders. So my first reaction when the designator became 0x was that it was preparation for that. . . But I have now run about a dozen of the blc24 tasks and the run times are closest to blc05. Oh well, scratch theory a). Stephen :( |
Zalster Send message Joined: 27 May 99 Posts: 5517 Credit: 528,817,460 RAC: 242 |
Eric explained what it meant in the News section https://setiathome.berkeley.edu/forum_thread.php?id=79411&postid=1778453 |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"How quickly things change. Suddenly the RTS is empty and the creation rate is in the toilet. Trying to grab the last few netted 33 out of 250." . . Again, variations between boxes. I have run a couple of dozen through Bertie with 2 x 970s and 1 x 1050. Run times are just over 5 mins on the 970s and 8 mins on the 1050, which puts them right in the middle of blc05 territory on that machine (running 3v). . . Oh well, please give me more of the blc04s. . . More? You want more? . . Yes please sir! Stephen :) |
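
For anyone wanting to line up the run times quoted in this thread, here is a minimal Python sketch. It only parses the task names and "M min S sec" strings exactly as they were posted above. The helper names (`runtime_seconds`, `task_family`, `is_vlar`) are made up for illustration, and treating a leading `blcNN_` as the Green Bank series label and `.vlar` as the VLAR marker is an assumption based on how the names look in the posts, not on any project documentation.

```python
import re

# Run times appear in the posts as e.g. "7 min 10 sec".
RUNTIME_RE = re.compile(r"(\d+)\s*min\s*(\d+)\s*sec")
# Assumption: Green Bank task names start with "blc" plus digits, then "_".
BLC_RE = re.compile(r"^(blc\d+)_")


def runtime_seconds(text):
    """Convert a 'M min S sec' string into whole seconds."""
    match = RUNTIME_RE.search(text)
    if match is None:
        raise ValueError(f"no run time found in {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


def task_family(task_name):
    """Return the 'blcNN' prefix, or 'other' when the name has none."""
    match = BLC_RE.match(task_name)
    return match.group(1) if match else "other"


def is_vlar(task_name):
    """True when the task name carries the '.vlar' marker (assumed convention)."""
    return ".vlar" in task_name


# Task names and run times copied from the posts in this thread.
SAMPLES = [
    ("29ja07ad.16591.13571.10.37.63.vlar_1", "10 min 31 sec"),
    ("29ja07ad.16591.13571.10.37.69.vlar_1", "10 min 30 sec"),
    ("blc25_2bit_guppi_57895_36299_KIC8462852_0002.32486.0.23.46.204.vlar_0",
     "4 min 40 sec"),
    ("blc24_2bit_guppi_57895_36299_KIC8462852_0002.5027.0.24.47.174.vlar_1",
     "7 min 10 sec"),
]

for name, runtime in SAMPLES:
    # One line per task: series label, VLAR flag, run time in seconds.
    print(f"{task_family(name):6} vlar={is_vlar(name)} {runtime_seconds(runtime):4d} s")
```

The numbers come from different hosts, GPUs and application builds, so the output is a convenience for comparing posts and not a benchmark.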
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.