The Server Issues / Outages Thread - Panic Mode On! (118)
Author | Message |
---|---|
Grant (SSSF) Joined: 19 Aug 99 Posts: 13946 Credit: 208,696,464 RAC: 304 |
I'm wondering if they ran some sort of script during this outage to jiggle any missed WUs? I'm seeing a lot more groups of re-sends than usual. Getting batches of 20+ at a time. Grant Darwin NT |
Ville Saari Joined: 30 Nov 00 Posts: 1158 Credit: 49,177,052 RAC: 82,530 |
". . Less than 9 hours is a pleasant change from recent outages." You apparently count the outage length differently than I do. My system logged that as an 11.93 hour outage. That's the time between the last reported or received task before the outage and the first received new task after it. So for me the outage ends when the servers have recovered to the point where they can actually hand out new work. That's what matters if I want to determine if my cache was sufficient to 'survive' the outage. Actually the end should be at the point when the server can feed my computers new work faster than they can process it, so that the caches start filling. The first received task could be a random single task long before the faucets open for real. The first received task is just much easier to measure. December 3rd was the last 'normal' Tuesday outage (4.21 hours). All outages after that have been way longer, except some unplanned ones. |
W-K 666 Joined: 18 May 99 Posts: 19691 Credit: 40,757,560 RAC: 67 |
True, better than the past couple of weeks, but nowhere near the Tuesday outage boilerplate of a 4-5 hour outage for database backup. During the outage did your CPUs run out of tasks? |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"I'm wondering if they ran some sort of script during this outage to jiggle any missed WUs? I'm seeing a lot more groups of re-sends than usual. Getting batches of 20+ at a time." . . A measure to try and clear the backlog perhaps ?? |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
". . Less than 9 hours is a pleasant change from recent outages." "You apparently count the outage length differently than I do. My system logged that as 11.93 hour outage. That's the time between the last reported or received task before the outage and the first received new task after it. So for me the outage ends when the servers have recovered to the point where they can actually hand out new work. That's what matters if I want to determine if my cache was sufficient to 'survive' the outage." . . I think the general consensus is that it starts when the servers go into maintenance mode and ends when they come back online and we no longer receive a "Shut down for maintenance" message. After that, the time till we receive "normal" feeds of WUs is the recovery period. I found the recovery pretty smooth and not long this time. This outage began at approx. 12:30 am local time and I was able to report work from about 9:05 am with only a small number (relatively) of "http internal error" messages, so the outage was around 8.5 hours. I agree that the assessment of work required to get through an outage must include the recovery period, but I was receiving 'significant' downloads (20 or more WUs) by about 10:15 am. So all up still less than 10 hours here. After 24 to 48 hour monsters I welcome the improvement ... Stephen < shrug > |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"True, better than the past couple of weeks, but nowhere near the Tuesday outage boilerplate of a 4-5 hour outage for database backup." The Ryzens, ALWAYS. The Intel Xeon usually has cpu work during these long outages. It is just way slower than the Ryzens. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
W-K 666 Joined: 18 May 99 Posts: 19691 Credit: 40,757,560 RAC: 67 |
"True, better than the past couple of weeks, but nowhere near the Tuesday outage boilerplate of a 4-5 hour outage for database backup." I just took a quick look at one of your Ryzens (computer 5741129) and guesstimate that about 60 tasks/day would keep each core busy, excluding noise bombs. So the next question has to be: why are there not enough tasks being cached for the CPUs? |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"just took a quick look at one of your Ryzens (computer 5741129) and guestimate that about 60 tasks/day would keep each core busy, excluding noise bombs." I don't know how you came up with 60 tasks a day on that host. BoincTasks says that host does 576 cpu tasks a day. Generally I do a cpu task in 25-45 minutes for most work. Have 16 threads running cpu work. I don't generally reschedule gpu work to the cpu. I just crunch through my allotted 150 tasks and then the cpu goes idle for the rest of the outage. I only do cpu work for Seti. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
Speedy Joined: 26 Jun 04 Posts: 1646 Credit: 12,921,799 RAC: 89 |
I am just guessing, Keith, but I am thinking possibly a CPU that is slower by 600 MHz and running Windows 10. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Well, that host has the new Ryzen 3950X cpu. So 32 threads available, and I have the cores locked to 4.2 GHz and I run the memory at 3600 MHz CL14. So it crunches through the cpu tasks the fastest of all my hosts. The second fastest is the 3900X with the same memory clock speed but only running 4.15 GHz and only 12 threads running cpu work. That one is currently in pieces on the kitchen table getting a custom loop cooling upgrade so I can punch the clock up on it. The Threadripper is next fastest and is mostly hamstrung by slower memory and a slower clock that is autoboosted and variable. I'm waiting on a new hot-shirt cpu block for it so I can move to locked clocks on it, which will improve the cpu processing time. Hoping to punch the memory clocks up too with a cooler cpu. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
W-K 666 Joined: 18 May 99 Posts: 19691 Credit: 40,757,560 RAC: 67 |
"just took a quick look at one of your Ryzens (computer 5741129) and guestimate that about 60 tasks/day would keep each core busy, excluding noise bombs." I did say each core. Surely the limits set by Seti are per processor, not per type of processor. So I must ask again: why is your cpu cache not storing enough to last a 12 hour outage when, by your own numbers, each core/thread is averaging 36 tasks/day? At a max cache/processor of 100, a number I think is now out of date, you should be able to cache nearly three days of cpu work. Assuming no -9's. |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
"At a max cache/processor of 100, a number I think is now out of date, you should be able to cache nearly three days of cpu work. Assuming no -9's." There we go using common sense again :) By your maths, I should be able to store about 1500 GPU tasks to keep me purring along, although I wish I could get a cache of about 150-200 opencl_ati_mac AP files a day as a decent replacement. Edit: Next you'll be saying we should no longer crunch AP on CPU because it is too inefficient, and has too many casual users who don't devote their machines to the extent full-time crunchers do. But hey, it is a good hobby. |
Stephen "Heretic" Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"I did say each core." . . The limit is per CPU not per core. So we only get 150 WUs whether you have 4 cores or 64 cores. Stephen < shrug > |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
"The limit is per CPU not per core. So we only get 150 WUs whether you have 4 cores or 64 cores." See... this is where we are messing up. We could create a SETI@home Pro, as a tiered level system. Free, just as it currently is. $5/mo for 2 days advanced work, $10/mo for 3, up to $50 for 10. Then an additional $5/mo for customizations: pick and choose which data types to work with for each machine. Edit: Ok, I'll stop with the comedy. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
As Stephen mentioned, each host no matter how many cpus or cores gets 150 cpu tasks. That's it. That is all. No more. So you can never carry more than the server allotment of 150 on any host. So I start the outage with 150 tasks and retire 24 tasks per hour. So I am out of cpu tasks in 6 1/4 hours. And that is if I get no overflows or shorties with high angle range which can be finished in 20 minutes or less. Since our outages lately have gone on for 12 hours or more before you actually start getting replacement work, the Ryzen cpus sit cold for several hours with no work. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
W-K 666 Joined: 18 May 99 Posts: 19691 Credit: 40,757,560 RAC: 67 |
"As Stephen mentioned, each host no matter how many cpus or cores gets 150 cpu tasks. That's it. That is all. No more. So you can never carry more than the server allotment of 150 on any host." I thought it was per core. I can only conclude the system is illogical. |
AllgoodGuy Joined: 29 May 01 Posts: 293 Credit: 16,348,499 RAC: 266 |
"I can only conclude the system is illogical." I get all tingly inside when I see deductive humor. |
W-K 666 Joined: 18 May 99 Posts: 19691 Credit: 40,757,560 RAC: 67 |
"I can only conclude the system is illogical." LMAO. Next step in the conclusion. The special sauce needs more ingredients, so that each core or thread is translated into a CPU when asking for work. |
Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"I can only conclude the system is illogical." We already tried. The scheduler wants nothing to do with spoofed cpus. A host can only have one cpu so only gets 150 tasks. The only way around the issue is to reschedule gpu tasks to the cpu. Seti@Home classic workunits: 20,676 CPU time: 74,226 hours A proud member of the OFA (Old Farts Association) |
W-K 666 Joined: 18 May 99 Posts: 19691 Credit: 40,757,560 RAC: 67 |
"I can only conclude the system is illogical." I had a dual-CPU computer on here at one time and I'm pretty sure the cache was per CPU; that's why I'm surprised the cache is not per core or thread. That computer isn't listed here these days, but it is still in the list at Beta: https://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=6318 |
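
The cache arithmetic argued over in this thread can be sketched in a few lines of Python. This is only an illustration, not anything BOINC or the SETI@home scheduler actually runs: the function names are made up for this sketch, the processing rate is assumed to be constant, and the inputs are simply the figures quoted in the posts above (a 150-task CPU limit per host, 576 CPU tasks a day across 16 threads, and roughly 12 hours before replacement work arrives).

```python
import math


def hours_until_empty(cached_tasks: int, tasks_per_hour: float) -> float:
    """Hours a cache lasts when no new work arrives (hypothetical helper)."""
    if tasks_per_hour <= 0:
        raise ValueError("tasks_per_hour must be positive")
    return cached_tasks / tasks_per_hour


def idle_hours(cached_tasks: int, tasks_per_hour: float, outage_hours: float) -> float:
    """Hours the CPU sits without work before the outage ends."""
    return max(0.0, outage_hours - hours_until_empty(cached_tasks, tasks_per_hour))


def tasks_needed(tasks_per_hour: float, outage_hours: float) -> int:
    """Smallest cache that keeps the CPU busy for the whole outage."""
    return math.ceil(tasks_per_hour * outage_hours)


if __name__ == "__main__":
    # Figures quoted in the thread, not measured here.
    host_limit = 150      # CPU tasks allowed per host
    tasks_per_day = 576   # whole-host rate reported by BoincTasks
    threads = 16          # threads running CPU work
    outage = 12.0         # hours until replacement work arrives

    rate = tasks_per_day / 24  # tasks retired per hour by the whole host
    print(f"per-thread rate: {tasks_per_day / threads:.0f} tasks/day")
    print(f"cache lasts:     {hours_until_empty(host_limit, rate):.2f} h")
    print(f"idle time:       {idle_hours(host_limit, rate, outage):.2f} h")
    print(f"cache needed:    {tasks_needed(rate, outage)} tasks")
```

With these inputs the cache lasts 6.25 hours, which matches the "6 1/4 hours" quoted in the thread, and riding out a 12 hour outage would take 288 tasks, nearly double the 150-task host limit. Overflows and high angle range shorties would drain the cache faster than this constant-rate estimate suggests.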