Message boards :
Number crunching :
The Server Issues / Outages Thread - Panic Mode On! (119)
AllgoodGuy (Joined: 29 May 01, Posts: 293, Credit: 16,348,499, RAC: 266)
Yep. I'm just going through the tasks I have and making sure the _2 and _1 are prioritized and running first. Maybe we can force a small kick on the system by trying to clear out the DB???

Edit: My room is so much colder now.
rob smith (Joined: 7 Mar 03, Posts: 22221, Credit: 416,307,556, RAC: 380)
No point in prioritising _1 tasks, they are one half of the initial pair sent out.

Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
Ville Saari (Joined: 30 Nov 00, Posts: 1158, Credit: 49,177,052, RAC: 82,530)
> Yep. I'm just going through the tasks I have and making sure the _2 and _1 are prioritized and running first. Maybe we can force a small kick on the system by trying to clear out the DB???
There's no point in prioritizing _1s, only _2s and up. _0 and _1 are the initial replication, not resends.
Richard Haselgrove (Joined: 4 Jul 99, Posts: 14653, Credit: 200,643,578, RAC: 874)
> To where the new splitted data is going?
I'm getting some, in short bursts every few hours. It's just the luck of asking at exactly the right time.

Edit - I didn't get everything I was wanting, but it'll do for now. These were all new work, not resends.

28/03/2020 16:26:07 | SETI@home | Reporting 1 completed tasks
28/03/2020 16:26:07 | SETI@home | [sched_op] NVIDIA GPU work request: 36067.85 seconds; 0.00 devices
28/03/2020 16:26:10 | SETI@home | Scheduler request completed: got 31 new tasks
28/03/2020 16:26:10 | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 15004 seconds
Unixchick (Joined: 5 Mar 12, Posts: 815, Credit: 2,361,516, RAC: 22)
I wish the system was making some progress on assimilation, since it isn't doing much of any splitting. I know the Status page data is about an hour out of date, but I would like to see the system recovering a bit since it isn't splitting much, if at all, at the moment.

I have been prioritizing _2s as well. I used to do it to help the db size, but recently I have been doing it because they are usually short-run WUs that I could return and get a "meatier" WU in their place. Since I'm no longer topped off, and thus have space should any request actually get something, I'm no longer prioritizing anything and I'm running in normal FIFO order.

I think I'm good for another 2 days, so I'm hanging out for a bit longer. Glad to have made it over 2 million on this and the previous mac-mini.
AllgoodGuy (Joined: 29 May 01, Posts: 293, Credit: 16,348,499, RAC: 266)
Well then...the _2s are running. Not much help on my end.
TBar (Joined: 22 May 99, Posts: 5204, Credit: 840,779,836, RAC: 2,768)
All my machines are Out of Work and not receiving much of anything. I think it's time to do something a little more constructive, I dunno, maybe finish burning the leaves or something...
[TA]Assimilator1 (Joined: 16 Oct 99, Posts: 52, Credit: 8,551,146, RAC: 50)
I seem to be having the opposite problem to most. So far my faster rig hasn't run out of WUs (got some at ~11.30, 14.30 & 15.30), but the cache is dwindling. My 2nd rig's CPU ran out of WUs yesterday, bar 1 AP WU which is going to take over a day!! Its GPU was getting WUs sporadically but is now out.

So is the consensus just general server problems which aren't being addressed, rather than a purposeful wind down?

Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP@H.
Main rig - Ryzen 5 3600, 32GB DDR4 3200, RX 580 8GB, 500GB Samsung 970 Evo+, Win 10
2nd rig - i7 4930k @4.1 GHz, 16GB DDR2 1866, HD 7870 XT 3GB (DS), Win7 64bit
Ville Saari (Joined: 30 Nov 00, Posts: 1158, Credit: 49,177,052, RAC: 82,530)
> So is the consensus just general server problems which aren't being addressed, rather than a purposeful wind down?
The database server is out of RAM and running in snail mode, and the splitters are being heavily throttled in an attempt to reduce the size of the database so it can recover.
Keith T. (Joined: 23 Aug 99, Posts: 962, Credit: 537,293, RAC: 9)
> Yep. I'm just going through the tasks I have and making sure the _2 and _1 are prioritized and running first. Maybe we can force a small kick on the system by trying to clear out the DB???
_1 and _0 are the original, normal tasks; _2 or greater are the resends.
juan BFP (Joined: 16 Mar 07, Posts: 9786, Credit: 572,710,851, RAC: 3,799)
> To where the new splitted data is going?
> I'm getting some, in short bursts every few hours. It's just the luck of asking at exactly the right time.
Just got:
28-Mar-2020 12:29:30 [SETI@home] Scheduler request completed: got 33 new tasks
Does that indicate we both won the S@H new WU lotto?
AllgoodGuy (Joined: 29 May 01, Posts: 293, Credit: 16,348,499, RAC: 266)
7 splitters splitting
6 machines doing nothing
5 extraterrestrial rings
4 servers burning
3 clients crashing
2 servers smashing
and an adult beverage to be enjoyed now.
[TA]Assimilator1 (Joined: 16 Oct 99, Posts: 52, Credit: 8,551,146, RAC: 50)
Lol! :D
> So is the consensus just general server problems which aren't being addressed, rather than a purposeful wind down?
> The database server is out of RAM and running in snail mode and the splitters are being heavily throttled in an attempt to reduce the size of the database so it could recover.
Ok, thanks :) (not that I can remember what the splitters do :o ........oh wait, do they split up the large data blocks from the observatory(ies?) into the smaller chunks we crunch?)

Team AnandTech - SETI@H, Muon1 DPAD, F@H, MW@H, A@H, LHC@H, POGS, R@H, DHEP@H.
Main rig - Ryzen 5 3600, 32GB DDR4 3200, RX 580 8GB, 500GB Samsung 970 Evo+, Win 10
2nd rig - i7 4930k @4.1 GHz, 16GB DDR2 1866, HD 7870 XT 3GB (DS), Win7 64bit
Ville Saari (Joined: 30 Nov 00, Posts: 1158, Credit: 49,177,052, RAC: 82,530)
> Ok, thanks :) (not that I can remember what the splitters do :o, ........oh wait, do they split up the large data blocks from the observatory(ies?) to the smaller chunks we crunch?
They produce the tasks for us to download and crunch. If they are not running and the RTS buffer is empty, all we get are some occasional resends. And to get even them, our computers have to win the lottery against thousands of other hungry computers.
juan BFP (Joined: 16 Mar 07, Posts: 9786, Credit: 572,710,851, RAC: 3,799)
I feel so lucky, I just got 6 new WUs!!! Enough for less than a minute... but it's progress. 6 is better than nothing.
Grant (SSSF) (Joined: 19 Aug 99, Posts: 13746, Credit: 208,696,464, RAC: 304)
Still barely any work, so it looks like the Servers have finally ground to a halt. "Results out in the field" is only 5 million now, dropping away as work comes in but little goes out. "Results returned and awaiting validation" is at 20.2 million, and "Workunits waiting for assimilation" at 6.85 million.

I think they might as well shut things down now. Leave just the BOINC master database, SETI@home science database, data-driven web pages, the Transitioners and the sah assimilators running, but shut down all other functions (inc AP, the Scheduler, upload & download servers, Replica etc). Just let the Assimilators clear that 6.85 million WU backlog. Once that is done, start up the Deleters to delete all the Tasks & WUs that have now been marked for deletion. Once that backlog clears, run the Purgers to actually remove everything marked for Deletion. And once that is done, time for one final official weekly outage to compact the database.

It would be nice to think this could all be done in a few days, but given the state of the database I can see it taking a week (or more). But once done, the database shouldn't have any further issues, with 6.8 million fewer WUs and at least 13.6 million fewer Results in it, along with much smaller indexes.

Then fire up everything except for the Scheduler and upload server (and of course the splitters). Let things run for a bit (after a week there should be plenty of resends to go out), then start up the Scheduler, then a bit later the upload server, and let the WU cleanup lottery commence.

And I would really, really, really hope they finally implement the revised deadlines for resends: set it for 2 days & get the worst of the outstanding work cleared up in 3.5 months, instead of the 10+ months it will take (if all work is to be cleared completely).

Grant
Darwin NT
Speedy (Joined: 26 Jun 04, Posts: 1643, Credit: 12,921,799, RAC: 89)
> i feel so lucky, i just get 6 new WU!!! Enough for less than a minute.... but is a progress. 6 is better than nothing.
I am amazed a Linux machine can process 6 work units in under a minute. Did I read that correctly?
Grant (SSSF) (Joined: 19 Aug 99, Posts: 13746, Credit: 208,696,464, RAC: 304)
> i feel so lucky, i just get 6 new WU!!! Enough for less than a minute.... but is a progress. 6 is better than nothing.
> I am amazed a Linux machine can process 6 work units in under a minute. Did I read that correctly?
Depends how many GPUs you have, and what type. High end cards can do a WU in 30sec or so.

Grant
Darwin NT
juan BFP (Joined: 16 Mar 07, Posts: 9786, Credit: 572,710,851, RAC: 3,799)
> Depends how many GPUs you have, and what type. High end cards can do a WU in 30sec or so.
Yes, you are right. This host now has 4 GPUs (could be more) and it crunches a WU in about 70 secs (some in the range of 30 secs, the slower ones in about 2:10), so yes, 6 WUs normally last 1-1:30 minutes. But the current WUs produced are mainly shorties or noise bombs, so they crunch in about 1 min only. I measure the crunching performance by how many WUs are reported on each scheduler call. It goes from 22 WUs to 60 every 5 min. Then just do the math. Be aware I'm running on 1070s & 2070s; on a 2080 Ti or Titan that number is cut by 2 or 3.
Speedy (Joined: 26 Jun 04, Posts: 1643, Credit: 12,921,799, RAC: 89)
> Once that is done, then start up the Deleters to delete all the Tasks & WUs that have now been marked for deletion. Once that backlog clears, then run the Purgers to actually remove everything marked for Deletion. And once that is done, time for one final official weekly outage to compact the database.
Unless the above can be done remotely, it will not be getting done until Berkeley is out of lockdown.
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.