Message boards :
Number crunching :
Panic Mode On (94) Server Problems?
JaundicedEye Send message Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1 |
"Sour Grapes make a bitter Whine." <(0)> |
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
betreger Send message Joined: 29 Jun 99 Posts: 11362 Credit: 29,581,041 RAC: 66 |
Panic is called for, Einstein is borked and I can't get enough Seti as a back up on that box, truly a cause for PANIC. |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Progress! (?) The machine that couldn't update finally did a few seconds ago when I clicked Update button (reported 171 tasks) but no tasks available.... Where have I heard that song before? EDIT And that forced my Pendings well over 2000 (2118, to be exact) for the first time in living memory. /EDIT |
Cruncher-American Send message Joined: 25 Mar 02 Posts: 1513 Credit: 370,893,186 RAC: 340 |
Got straightened out starting about 9pm EST (2am UTC) (d/l 86, then 68 WUs, then more). Now, back to 300, the correct amount (1 CPU, 2 GPUs). Yay!!! And thanks to whoever did the work! |
Speedy Send message Joined: 26 Jun 04 Posts: 1643 Credit: 12,921,799 RAC: 89 |
As I write this, there are over 2 million MB results waiting to be purged. When this is done, I am sure it will free up a little bit of space. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Now what. Everything was looking good, I was up to a whole 23 APs on my Mac (My Windows machines have Hundreds) and suddenly all the APs are gone from the SSP. Splitters disabled... Are they about to finally Fix the APs during this Outage? One can only Hope. |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Oh well. There doesn't appear to be any change in the APs. The creation rate is still in the basement even when there are files to split. On the other front, there is still the issue where the server will send CPU APs even if the GPUs are Out of work. A recent example, where there were 2 CPU spots open and the server decided to fill those 2 spots even though a GPU was Out of work and there were days of CPU work:

Wed Jan 14 09:37:34 2015 | SETI@home | [sched_op] Starting scheduler request
Wed Jan 14 09:37:34 2015 | SETI@home | Sending scheduler request: To fetch work.
Wed Jan 14 09:37:34 2015 | SETI@home | Requesting new tasks for CPU and ATI
Wed Jan 14 09:37:34 2015 | SETI@home | [sched_op] CPU work request: 79474.48 seconds; 0.00 devices
Wed Jan 14 09:37:34 2015 | SETI@home | [sched_op] ATI work request: 1177312.35 seconds; 1.00 devices
Wed Jan 14 09:37:36 2015 | SETI@home | Scheduler request completed: got 1 new tasks
Wed Jan 14 09:37:36 2015 | SETI@home | [sched_op] estimated total CPU task duration: 36280 seconds
Wed Jan 14 09:37:36 2015 | SETI@home | [sched_op] estimated total ATI task duration: 0 seconds
Wed Jan 14 09:42:42 2015 | SETI@home | [sched_op] Starting scheduler request
Wed Jan 14 09:42:42 2015 | SETI@home | Sending scheduler request: To fetch work.
Wed Jan 14 09:42:42 2015 | SETI@home | Requesting new tasks for CPU and ATI
Wed Jan 14 09:42:42 2015 | SETI@home | [sched_op] CPU work request: 43953.40 seconds; 0.00 devices
Wed Jan 14 09:42:42 2015 | SETI@home | [sched_op] ATI work request: 1177960.66 seconds; 1.00 devices
Wed Jan 14 09:42:44 2015 | SETI@home | Scheduler request completed: got 2 new tasks
Wed Jan 14 09:42:44 2015 | SETI@home | [sched_op] estimated total CPU task duration: 72558 seconds
Wed Jan 14 09:42:44 2015 | SETI@home | [sched_op] estimated total ATI task duration: 0 seconds

If I raise the cache another half a day, the server will quickly send another CPU task to fill that spot even while the GPUs are Out of work.

Wed Jan 14 09:54:59 2015 | SETI@home | [sched_op] Starting scheduler request
Wed Jan 14 09:54:59 2015 | SETI@home | Sending scheduler request: To fetch work.
Wed Jan 14 09:54:59 2015 | SETI@home | Reporting 2 completed tasks
Wed Jan 14 09:54:59 2015 | SETI@home | Requesting new tasks for CPU and ATI
Wed Jan 14 09:54:59 2015 | SETI@home | [sched_op] CPU work request: 4741.31 seconds; 0.00 devices
Wed Jan 14 09:54:59 2015 | SETI@home | [sched_op] ATI work request: 1179360.00 seconds; 3.00 devices
Wed Jan 14 09:55:00 2015 | SETI@home | Scheduler request completed: got 0 new tasks
Wed Jan 14 09:55:00 2015 | SETI@home | No tasks sent
Wed Jan 14 09:55:00 2015 | SETI@home | No tasks are available for AstroPulse v7
Wed Jan 14 10:05:36 2015 | SETI@home | update requested by user
Wed Jan 14 10:05:39 2015 | SETI@home | [sched_op] Starting scheduler request
Wed Jan 14 10:05:39 2015 | SETI@home | Sending scheduler request: Requested by user.
Wed Jan 14 10:05:39 2015 | SETI@home | Requesting new tasks for CPU and ATI
Wed Jan 14 10:05:39 2015 | SETI@home | [sched_op] CPU work request: 61260.20 seconds; 0.00 devices
Wed Jan 14 10:05:39 2015 | SETI@home | [sched_op] ATI work request: 1308960.00 seconds; 3.00 devices
Wed Jan 14 10:05:40 2015 | SETI@home | Scheduler request completed: got 2 new tasks
Wed Jan 14 10:05:40 2015 | SETI@home | [sched_op] estimated total CPU task duration: 72552 seconds
Wed Jan 14 10:05:40 2015 | SETI@home | [sched_op] estimated total ATI task duration: 0 seconds
Wed Jan 14 10:10:47 2015 | SETI@home | Sending scheduler request: To fetch work.
Wed Jan 14 10:10:47 2015 | SETI@home | Requesting new tasks for CPU and ATI
Wed Jan 14 10:10:47 2015 | SETI@home | [sched_op] CPU work request: 12837.28 seconds; 0.00 devices
Wed Jan 14 10:10:47 2015 | SETI@home | [sched_op] ATI work request: 1308960.00 seconds; 3.00 devices
Wed Jan 14 10:10:48 2015 | SETI@home | Scheduler request completed: got 0 new tasks
Wed Jan 14 10:32:17 2015 | SETI@home | [sched_op] Starting scheduler request
Wed Jan 14 10:32:17 2015 | SETI@home | Sending scheduler request: To fetch work.
Wed Jan 14 10:32:17 2015 | SETI@home | Requesting new tasks for CPU and ATI
Wed Jan 14 10:32:17 2015 | SETI@home | [sched_op] CPU work request: 78845.72 seconds; 0.00 devices
Wed Jan 14 10:32:17 2015 | SETI@home | [sched_op] ATI work request: 1438560.00 seconds; 3.00 devices
Wed Jan 14 10:32:18 2015 | SETI@home | Scheduler request completed: got 1 new tasks
Wed Jan 14 10:32:18 2015 | SETI@home | [sched_op] estimated total CPU task duration: 36273 seconds
Wed Jan 14 10:32:18 2015 | SETI@home | [sched_op] estimated total ATI task duration: 0 seconds

There are over 4 days of CPU work, the 3 GPUs are Out of work, so the server sends CPU work. You can't make this stuff up... |
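The pattern described above — the GPU asking for over a million seconds of work and getting none, while the CPU slot gets filled anyway — can be spotted mechanically from the client log. A minimal sketch, assuming only the standard `[sched_op]` debug line format shown in the post (the sample lines below are pasted from it):

```python
import re

# Sample lines copied from the post above (one request/response cycle).
LOG = """\
Wed Jan 14 09:37:34 2015 | SETI@home | [sched_op] CPU work request: 79474.48 seconds; 0.00 devices
Wed Jan 14 09:37:34 2015 | SETI@home | [sched_op] ATI work request: 1177312.35 seconds; 1.00 devices
Wed Jan 14 09:37:36 2015 | SETI@home | [sched_op] estimated total CPU task duration: 36280 seconds
Wed Jan 14 09:37:36 2015 | SETI@home | [sched_op] estimated total ATI task duration: 0 seconds
"""

# "work request" = what the client asked for; "estimated total ... duration" = what it got back.
REQ = re.compile(r"\[sched_op\] (\w+) work request: ([\d.]+) seconds; ([\d.]+) devices")
GOT = re.compile(r"\[sched_op\] estimated total (\w+) task duration: (\d+) seconds")

requested, granted = {}, {}
for line in LOG.splitlines():
    if m := REQ.search(line):
        requested[m.group(1)] = float(m.group(2))
    elif m := GOT.search(line):
        granted[m.group(1)] = int(m.group(2))

# Flag any device that asked for work and received none this cycle.
for dev in requested:
    if requested[dev] > 0 and granted.get(dev, 0) == 0:
        print(f"{dev}: requested {requested[dev]:.0f}s, granted 0s")
```

Run against the full log, this prints the ATI line for every cycle in the post while staying silent for the CPU, which is exactly the asymmetry being complained about.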
David S Send message Joined: 4 Oct 99 Posts: 18352 Credit: 27,761,924 RAC: 12 |
Oh well. There doesn't appear to be any change in the APs. The creation rate is still in the basement even when there are files to split. On the other front, there is still the issue where the server will send CPU APs even if the GPUs are Out of work. When I checked the SSP about an hour ago, there were 2 AP splitters munching away and about 300 ready to send. Now all the AP tapes are done (although the 2 splitters still show running) and RTS is 0. David Sitting on my butt while others boldly go, Waiting for a message from a small furry creature from Alpha Centauri. |
JaundicedEye Send message Joined: 14 Mar 12 Posts: 5375 Credit: 30,870,693 RAC: 1 |
Again................. "Sour Grapes make a bitter Whine." <(0)> |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Oh well. There doesn't appear to be any change in the APs. The creation rate is still in the basement even when there are files to split. On the other front, there is still the issue where the server will send CPU APs even if the GPUs are Out of work. Depending on how full the temp WU storage area is at the moment, we may or may not get more soon. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Depending on how full the temp WU storage area is at the moment, we may or may not get more soon. I'm going to take another stab at estimating (roughly) how much disk space is being consumed by WUs.

AP:
- WUs waiting for assimilation: ~750k = ~5.7 TiB
- Results returned and awaiting validation: ~1.7M / 3 (assuming an average of 3 results per WU) = ~4.3 TiB
- Results out in the field: ~120k / 3 (assuming an average of 3 per WU) = ~312 GiB
AP subtotal: ~10.3 TiB

MB:
- Results returned and awaiting validation: ~3M / 3 = ~340 GiB
- Results out in the field: ~3.6M / 3 = ~410 GiB
- WUs waiting for db purging: ~1.3M = ~450 GiB
MB subtotal: ~1 TiB

So in total, quite roughly, there are ~11.5 TiB of WUs on disk presently. There's probably a +/- of 500 GiB in my very rough numbers, but I think they are good enough for an estimate. Last week, the problem was that the storage area filled (I estimated it to be at 8 TiB then) and more space was added to that volume of the array, but it is unknown how much was added. If it was 4 TiB, then we're nearing that limit again soon. If it was 8 TiB, then we probably have ~1 more week until it becomes an issue again. I really hope the AP database finishes getting massaged and mended so this mess can start being cleaned up. MB doesn't take up much space on disk at all... but AP is a massive byte hog. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving up) |
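The back-of-the-envelope math above can be reproduced in a few lines. A minimal sketch, where the per-unit sizes (~8 MiB per AP workunit, ~0.37 MiB per MB workunit) and the 3-results-per-WU replication are assumptions inferred from the post's own subtotals, not server-confirmed figures:

```python
# Assumed average sizes, back-solved from the post's subtotals (hypothetical).
AP_WU_MIB = 8.0       # AstroPulse workunit, ~8 MiB
MB_WU_MIB = 0.366     # Multibeam workunit, ~0.37 MiB
RESULTS_PER_WU = 3    # assumed average replication per workunit

TIB = 1024.0 * 1024   # MiB per TiB

# Effective workunit counts: result queues are divided by the replication factor.
ap_wus = (750_000                         # WUs waiting for assimilation
          + 1_700_000 / RESULTS_PER_WU    # results awaiting validation
          + 120_000 / RESULTS_PER_WU)     # results out in the field
mb_wus = (3_000_000 / RESULTS_PER_WU      # results awaiting validation
          + 3_600_000 / RESULTS_PER_WU    # results out in the field
          + 1_300_000)                    # WUs waiting for db purging

ap_tib = ap_wus * AP_WU_MIB / TIB
mb_tib = mb_wus * MB_WU_MIB / TIB
print(f"AP ~{ap_tib:.1f} TiB, MB ~{mb_tib:.1f} TiB, total ~{ap_tib + mb_tib:.1f} TiB")
```

Under these assumptions the totals land within a few hundred GiB of the post's ~10.3 TiB AP / ~11.5 TiB overall figures, which also makes the point numerically: almost all of the disk pressure comes from the AP side.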
Juha Send message Joined: 7 Mar 04 Posts: 388 Credit: 1,857,738 RAC: 0 |
assuming an average of 3 results per WU
For Multibeam:
Workunits waiting for db purging: 1,011,068
Results waiting for db purging: 2,142,904
That gives about 2.12 results per workunit. The ratio is probably inflated by some amount by the bad batch from the rogue splitter. The Astropulse ratio is/has usually been higher than that, but still well under 3. (The current numbers in the SSP make no sense.) |
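The ratio above is just the quotient of the two purge-queue counters; as a quick check:

```python
# Purge-queue counters quoted from the SSP in the post above.
wus_waiting = 1_011_068      # Workunits waiting for db purging
results_waiting = 2_142_904  # Results waiting for db purging

ratio = results_waiting / wus_waiting
print(f"{ratio:.2f} results per workunit")  # about 2.12
```

So the "3 results per WU" figure used in the disk estimate is an upper bound; the purge queue suggests the real Multibeam average is closer to 2, which would shrink the validation-queue portion of the estimate by roughly a third.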
Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Well... considering the splitters are supposedly running, and there are tapes to split, but the creation rate is near-zero.. I'm going to assume the WU storage filled again. Based on my ~11.5 TiB math and splitting has screeched to a halt, I'm going to assume (again) that 4 TiB was added to the original 8 and the storage space is once again full. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1853 Credit: 268,616,081 RAC: 1,349 |
Pretty screwy numbers on the SSP at this point, so something must have gone sideways. Not buying that the splitters somehow generated 3.6M MBs ready to send overnight, and definitely not 100k APs. As if everything that was out in the field got lost track of and is all of a sudden back waiting to send? |
HAL9000 Send message Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
Pretty screwy numbers on the SSP at this point, so something must have gone sideways. Not buying that the splitters somehow generated 3.6M MBs ready to send overnight, and definitely not 100k APs. As if everything that was out in the field got lost track of and is all of a sudden back waiting to send? Of course the splitters didn't do that overnight. They did it in the past hour. Along with the 90k AP. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url] |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1853 Credit: 268,616,081 RAC: 1,349 |
Of course the splitters didn't do that overnight. They did it in the past hour. Along with the 90k AP. Fortunately, it seems it shook its little head, resolved the issue and now looks normal. But, it looks as though the AP splitters are stuck. At least, they appear to be at the same point on the same files that they were 12 hours ago. Oh well ... |
Jimbocous Send message Joined: 1 Apr 13 Posts: 1853 Credit: 268,616,081 RAC: 1,349 |
Of course the splitters didn't do that overnight. They did it in the past hour. Along with the 90k AP. And both then promptly died. Oh well ... |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13750 Credit: 208,696,464 RAC: 304 |
Of course the splitters didn't do that overnight. They did it in the past hour. Along with the 90k AP. Add to that the result of my last Scheduler request- 17/01/2015 13:32:22 | SETI@home | Scheduler request failed: HTTP service unavailable Grant Darwin NT |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.