Panic Mode On (109) Server Problems?

Message boards : Number crunching : Panic Mode On (109) Server Problems?
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6530
Credit: 190,593,029
RAC: 14,702
United States
Message 1913877 - Posted: 19 Jan 2018, 1:45:56 UTC - in response to Message 1913859.  
Last modified: 19 Jan 2018, 1:52:26 UTC

> MW is fine, but IIRC to do something productive there you need an AMD GPU; NV cards have trouble with the DP used by MW.
>
> Has that changed?
>
> GPUGrid, with its long WU crunch times, is not really a project to use as a backup, IMHO.

Milkyway uses double-precision calculations.
Most GeForce GPUs are limited to DP performance of 1/32 of SP.
Radeon GPUs are limited to DP performance of 1/16 of SP.

So if both GPUs were 6000 GFLOPS in single precision, the GeForce would be about 188 GFLOPS DP and the Radeon 375 GFLOPS.

If you move to the workstation GPUs, they can have DP performance of up to 1/2 of SP, which is likely why they have four-digit price tags.
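The arithmetic above can be sketched in a few lines of Python. The 6000 GFLOPS figure and the DP:SP ratios are just the illustrative numbers from the post, not the specs of any particular card:

```python
# Rough double-precision throughput from single-precision peak, using the
# DP:SP ratios mentioned above (consumer GeForce 1/32, consumer Radeon 1/16,
# workstation cards up to 1/2). Illustrative numbers only.
def dp_gflops(sp_gflops, dp_sp_ratio):
    """Estimate peak DP GFLOPS given SP GFLOPS and the DP:SP ratio."""
    return sp_gflops * dp_sp_ratio

sp = 6000  # both hypothetical cards at 6000 GFLOPS single precision
print(dp_gflops(sp, 1 / 32))  # GeForce: 187.5 (~188)
print(dp_gflops(sp, 1 / 16))  # Radeon: 375.0
print(dp_gflops(sp, 1 / 2))   # workstation-class: 3000.0
```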
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group today!
ID: 1913877
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 6157
Credit: 440,531,011
RAC: 1,006,910
United States
Message 1913883 - Posted: 19 Jan 2018, 1:59:59 UTC - in response to Message 1913859.  

> MW is fine, but IIRC to do something productive there you need an AMD GPU; NV cards have trouble with the DP used by MW.
>
> Has that changed?
>
> GPUGrid, with its long WU crunch times, is not really a project to use as a backup, IMHO.

No, MW has no issue with Nvidia as long as the card can do double precision: probably any card newer than Kepler, avoiding the lowest tier of any family. Nvidia doesn't have the same degree of double-precision performance as ATI/AMD, but they still work fine. I do a Gamma Ray Binary Pulsar task in 190 seconds and get awarded 227 credits for it. The credit is static for all task types.

The longest-running GPUGrid GPU task I've run so far took 8 hours and was awarded 387,150 credits. The shortest task took 3 hours. The longest CPU task took 1 hour, and the shortest 20 minutes.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1913883
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 10530
Credit: 143,212,704
RAC: 79,012
Australia
Message 1913914 - Posted: 19 Jan 2018, 4:03:02 UTC

In-progress back up to around 5 million, received-last-hour back over 120k.
WU-awaiting-deletion climbing; splitter output dropped down to a lower level again. Will clearing awaiting-deletion fire up the splitters again?
We'll just have to wait and see!
Grant
Darwin NT
ID: 1913914
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 6157
Credit: 440,531,011
RAC: 1,006,910
United States
Message 1913916 - Posted: 19 Jan 2018, 4:11:09 UTC - in response to Message 1913914.  

> In-progress back up to around 5 million, received-last-hour back over 120k.
> WU-awaiting-deletion climbing; splitter output dropped down to a lower level again. Will clearing awaiting-deletion fire up the splitters again?
> We'll just have to wait and see!

NEWS at 10!
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1913916
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 10530
Credit: 143,212,704
RAC: 79,012
Australia
Message 1913943 - Posted: 19 Jan 2018, 6:17:52 UTC

Around 4 min 55 sec on my GTX 1070s for the current BLC_02s, now that we have some BLC_02s that aren't VLARs. About 50 sec quicker to crunch, but they do cause some noticeable system/display lag.
Grant
Darwin NT
ID: 1913943
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 10530
Credit: 143,212,704
RAC: 79,012
Australia
Message 1913965 - Posted: 19 Jan 2018, 9:07:18 UTC

And there we go.
The awaiting-deletion backlog has cleared, and the splitters are cranking out the WUs.
Grant
Darwin NT
ID: 1913965
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 6157
Credit: 440,531,011
RAC: 1,006,910
United States
Message 1913967 - Posted: 19 Jan 2018, 9:12:11 UTC - in response to Message 1913965.  
Last modified: 19 Jan 2018, 9:12:26 UTC

I'd say that is pretty convincing evidence that the two are directly linked. If you overlay the graphs, they are coincident.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1913967
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 10530
Credit: 143,212,704
RAC: 79,012
Australia
Message 1913974 - Posted: 19 Jan 2018, 9:46:21 UTC - in response to Message 1913967.  
Last modified: 19 Jan 2018, 9:46:49 UTC

> I'd say that is pretty convincing evidence that the two are directly linked. If you overlay the graphs, they are coincident.

Correlation isn't causation, but yeah: when returned-per-hour hits its present highs and work-in-progress gets right up there, the deleters & splitters certainly aren't able to both run at 100% at the same time. The splitters crank out the work, the deleter backlog grows; it gets to a certain point, then the splitters slow down and stay there till the delete backlog clears. And it's continued to occur after the weekly outage.
It's choking on its own I/O.
Grant
Darwin NT
ID: 1913974
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 6157
Credit: 440,531,011
RAC: 1,006,910
United States
Message 1914035 - Posted: 19 Jan 2018, 17:50:14 UTC - in response to Message 1913974.  

I'd say the administrators need to shorten the cron-job interval on the deleters' purge task so that we could maintain a higher average RTS buffer quantity. Or, if the purge is threshold-based, lower the threshold.
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1914035
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 10530
Credit: 143,212,704
RAC: 79,012
Australia
Message 1914094 - Posted: 19 Jan 2018, 21:37:29 UTC - in response to Message 1914035.  

> I'd say the administrators need to shorten the cron-job interval on the deleters' purge task so that we could maintain a higher average RTS buffer quantity. Or, if the purge is threshold-based, lower the threshold.

I think it's just a question of I/O congestion.
The deleters run all the time; however, with the current rate of work return, and the rate of WU splitting required to keep that rate of return going, there's so much I/O contention that the deleters can't keep up. Eventually the I/O contention gets to the point where the output of the splitters falls away, but the deleters still can't keep up with the load, so the backlog continues to grow. Eventually the deleters are able to catch up & clear the backlog, and their reduced level of I/O then allows the splitters to crank back up again; till the delete backlog & load reach that trigger point & the splitters slow down again.
Rinse and repeat.
The combination of returned-per-hour, in-progress, awaiting-deletion & required splitter output results in a huge amount of I/O, more than the servers can actually sustain. So you end up with these moving trigger points where one function slows down and the other speeds up, then it slows down & the first one speeds up again. And back & forth they go.
That's my speculation based on minimal facts.
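Grant's speculated feedback loop can be sketched as a toy simulation: splitters and deleters share a fixed I/O budget, and once the deletion backlog passes a threshold the splitters throttle until the backlog drains. Every number here is invented purely for illustration; nothing below reflects the actual SETI@home server configuration:

```python
# Toy model (all numbers made up) of the I/O-contention feedback loop
# described above: splitters and deleters share a fixed I/O budget, and
# the splitters throttle while the deletion backlog is above a threshold.
IO_BUDGET = 100                  # arbitrary I/O units available per tick
SPLIT_FULL, SPLIT_SLOW = 70, 30  # splitter I/O use when full speed / throttled
DELETE_RATE_PER_IO = 1.5         # WUs deleted per I/O unit (made up)
ARRIVALS = 90                    # WUs entering awaiting-deletion per tick (made up)
THRESHOLD = 500                  # backlog level that throttles the splitters

backlog, throttled, history = 0.0, False, []
for tick in range(60):
    split_io = SPLIT_SLOW if throttled else SPLIT_FULL
    delete_io = IO_BUDGET - split_io  # deleters get whatever I/O is left over
    backlog = max(0.0, backlog + ARRIVALS - delete_io * DELETE_RATE_PER_IO)
    # Hysteresis: start throttling above THRESHOLD, stay throttled until clear.
    throttled = backlog > THRESHOLD if not throttled else backlog > 0
    history.append(round(backlog))

print(history)  # backlog climbs, the throttle kicks in, the backlog drains,
                # and the cycle starts again: back & forth they go.
```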
Grant
Darwin NT
ID: 1914094
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 6157
Credit: 440,531,011
RAC: 1,006,910
United States
Message 1914100 - Posted: 19 Jan 2018, 22:12:50 UTC - in response to Message 1914094.  
Last modified: 19 Jan 2018, 22:47:33 UTC


> I think it's just a question of I/O congestion.
> The deleters run all the time

OK, I'm sure I read in some other post in recent days that the deleters and purgers don't run continuously. Now I have to find that post.

[Edit] Found it. By Rob Smith Message 1913582
Seti@Home classic workunits:20,676 CPU time:74,226 hours
ID: 1914100
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 12347
Credit: 127,038,533
RAC: 35,822
United Kingdom
Message 1914103 - Posted: 19 Jan 2018, 22:20:52 UTC - in response to Message 1914100.  

> > I think it's just a question of I/O congestion.
> > The deleters run all the time
> OK, I'm sure I read in some other post in recent days that the deleters and purgers don't run continuously. Now I have to find that post.
That's probably a difference between Main and Beta. Beta certainly doesn't purge the database continuously - Eric likes to keep older tasks visible for comparison and retrospective bug-hunting. Main, on the other hand, needs to clear the decks within 24 hours or we're swamped.
ID: 1914103
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 10530
Credit: 143,212,704
RAC: 79,012
Australia
Message 1914105 - Posted: 19 Jan 2018, 22:30:37 UTC - in response to Message 1914100.  

> OK, I'm sure I read in some other post in recent days that the deleters and purgers don't run continuously. Now I have to find that post.

Got me curious too.
When things have been working well, the number of WUs awaiting validation, assimilation and deletion is generally around 0, occasionally 1-3 (emphasis on when everything is working OK). So even if they don't run all the time, they run whenever there is something to do, which is pretty much all the time (especially with 145k results being returned per hour).

Looking at AP, where the return rate is less than 1 per minute at the moment, the WUs awaiting validation, assimilation & deletion sit around 1, with periods of 0 & a few periods of 2 or 3. It could be that they run all the time, and those values of 1-3 are just what's there at the moment the data is read, before the WU is processed. Or it could be as you say: they don't run all the time, only when there is work to be done.
Either way, it means the MB WU validators/deleters/assimilators are running (effectively) all the time with 40/s there to be processed, as the values there were usually at (or very close to) 0.
Grant
Darwin NT
ID: 1914105
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9506
Credit: 59,528,127
RAC: 11,426
United Kingdom
Message 1914179 - Posted: 20 Jan 2018, 7:17:21 UTC

Panic Mode On (110) Server Problems? Now open for business
ID: 1914179



 
©2018 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.