Panic Mode On (38) Server problems

Jord Volunteer tester Send message Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3	Message 1034445 - Posted: 19 Sep 2010, 11:56:25 UTC - in response to Message 1034388. ..and had to be emailed to Dr. A for manual upload. When David is asking for such a file to be sent to him, he won't manually upload the contents of said file into the database or anything. He'll be using it to test with what happens to it - checking the backend, see what it does - when he tries to send such a file. At the end of the day, the user that has the problem will still need to report all those tasks himself; or at least, his BOINC needs to do so. ID: 1034445 ·

Josef W. Segur Volunteer developer Volunteer tester Send message Joined: 30 Oct 99 Posts: 4504 Credit: 1,414,761 RAC: 0	Message 1034482 - Posted: 19 Sep 2010, 14:31:42 UTC - in response to Message 1034352. Yes, limits of 40/320 are in place, but im wondering more about this: SETI@home 19.09.2010 08:38:50 Scheduler request completed: got 58 new tasks Thought, there was also a limit of about 20 per request? Seems, this doesn't apply any more. Yes, the project's config.xml used to have max_wus_to_send set to 20, and that used to be applied directly. But changeset [trac]changeset:18255[/trac] revised it so the number is now scaled by number of CPUs and GPUs (*gpu_multiplier). The 20 is probably still there, but is being treated as 20 per CPU and 160 per GPU. There are only 100 feeder slots so that's the absolute maximum which could be sent for one request, and for new work there are pairs of tasks for each WU. Getting more than 50 means some are unpaired reissues (_2 and up). Joe ID: 1034482 ·
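A rough sketch of the arithmetic Joe describes, for anyone trying to work out why a single request can now return more than 20 tasks. This is not the scheduler's actual code: the function name and parameters are hypothetical, the GPU multiplier of 8 is only inferred from the "20 per CPU and 160 per GPU" figures in the post, and the 100-slot cap is the feeder size Joe quotes.

```python
def max_tasks_per_request(max_wus_to_send, ncpus, ngpus,
                          gpu_multiplier=8, feeder_slots=100):
    """Estimate the most tasks one scheduler request could return.

    All names here are illustrative assumptions, not BOINC identifiers,
    apart from max_wus_to_send and gpu_multiplier, which the post mentions.
    """
    # The configured limit is scaled per CPU, and per GPU times the multiplier.
    scaled = max_wus_to_send * (ncpus + gpu_multiplier * ngpus)
    # A single request can never return more than the feeder holds.
    return min(scaled, feeder_slots)


# A single-CPU host with no GPU keeps the old limit of 20.
print(max_tasks_per_request(20, ncpus=1, ngpus=0))
# A quad-core host with one GPU is capped by the feeder, not by the config.
print(max_tasks_per_request(20, ncpus=4, ngpus=1))
```

Under these assumptions any multi-core host with a GPU is limited by the feeder rather than by max_wus_to_send, which fits the "got 58 new tasks" log line quoted above.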

Grant (SSSF) Volunteer tester Send message Joined: 19 Aug 99 Posts: 13947 Credit: 208,696,464 RAC: 304	Message 1034549 - Posted: 19 Sep 2010, 18:47:10 UTC Probably time to up the server side limits. Network traffic has finally eased off (apart from the AP bursts), but the inbound traffic has been steadily climbing- most likely from hosts that have reached the Server Side limit & still trying for work. Grant Darwin NT ID: 1034549 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51540 Credit: 1,018,363,574 RAC: 1,004	Message 1034550 - Posted: 19 Sep 2010, 18:48:59 UTC - in response to Message 1034549. Probably time to up the server side limits. Network traffic has finally eased off (apart from the AP bursts), but the inbound traffic has been steadily climbing- most likely from hosts that have reached the Server Side limit & still trying for work. I said that hours ago before I passed out, my friend. "Time is simply the mechanism that keeps everything from happening all at once." ID: 1034550 ·

Cosmic_Ocean Send message Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13	Message 1034644 - Posted: 20 Sep 2010, 0:20:44 UTC I remember the first time I noticed I could get more than 20 tasks in one request. It was nice when I needed to get 150-225 tasks to fill the cache back when I was doing MB work. Used to take a ton of requests, but then it would only take 4-6 requests instead of 10+. Presently, my record for most APs in one request is 23. Just about all of the requests are 1-3 though, but every now and then, I can get 5-15 at a time. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) ID: 1034644 ·

Blake Bonkofsky Volunteer tester Send message Joined: 29 Dec 99 Posts: 617 Credit: 46,383,149 RAC: 0	Message 1034647 - Posted: 20 Sep 2010, 0:31:13 UTC Looks like the limits have been raised up, as I have picked up 250ish WU's within the last few hours. I had been banging off of the limit all weekend. ID: 1034647 ·

Terror Australis Volunteer tester Send message Joined: 14 Feb 04 Posts: 1817 Credit: 262,693,308 RAC: 44	Message 1034713 - Posted: 20 Sep 2010, 3:09:30 UTC Last modified: 20 Sep 2010, 3:09:55 UTC New message from the server !! 20/09/2010 12:21:27 SETI@home Message from server: Resent lost task 24ap10ae.5264.1058.11.10.50_2 Looks like some thing has finally been done about the ghost problem - Yeehaa T.A. ID: 1034713 ·

Zeus Fab3r Send message Joined: 17 Jan 01 Posts: 649 Credit: 275,335,635 RAC: 597	Message 1034717 - Posted: 20 Sep 2010, 3:14:53 UTC - in response to Message 1034713. Last modified: 20 Sep 2010, 4:01:41 UTC YES indeed !!!!!!! 20-Sep-10 05:12:27 SETI@home Scheduler request completed: got 20 new tasks 20-Sep-10 05:12:27 SETI@home [sched_op_debug] Server version 611 20-Sep-10 05:12:27 SETI@home Message from server: Resent lost task D%my10adep23.24198.12.10.222_2 20-Sep-10 05:12:27 SETI@home Message from server: Resent lost task 31dc09af.17708.8411.12.10.90_3 20-Sep-10 05:12:27 SETI@home Message from server: Resent lost task 31dc09af.17708.8411.12.10.102_2 Edit: But something is very wrong... I was tracking 26 ghosts after scheduler request timeout when I saw that message, but instead of those 26 I just got 20 tasks that are not on my rig's list at all ?! So, will I crunch someone else's wu's in the near future? Edit 2: Things are going from bad to worse. I just got a bunch of fresh .vlars on my GPU ;-< Task 1712880409 Who the hell is General Failure and why is he reading my harddisk?¿ ID: 1034717 ·

Dirk Sadowski Volunteer tester Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1034829 - Posted: 20 Sep 2010, 13:45:44 UTC - in response to Message 1034717. Last modified: 20 Sep 2010, 13:47:08 UTC YES indeed !!!!!!! 20-Sep-10 05:12:27 SETI@home Scheduler request completed: got 20 new tasks 20-Sep-10 05:12:27 SETI@home [sched_op_debug] Server version 611 20-Sep-10 05:12:27 SETI@home Message from server: Resent lost task D%my10adep23.24198.12.10.222_2 20-Sep-10 05:12:27 SETI@home Message from server: Resent lost task 31dc09af.17708.8411.12.10.90_3 20-Sep-10 05:12:27 SETI@home Message from server: Resent lost task 31dc09af.17708.8411.12.10.102_2 Edit: But something is very wrong... I was tracking 26 ghosts after scheduler request timeout when I saw that message, but instead of those 26 I just got 20 tasks that are not on my rig's list at all ?! So, will I crunch someone else's wu's in the near future? Edit 2: Things are going from bad to worse. I just got a bunch of fresh .vlars on my GPU ;-< Task 1712880409 Ohh.. well.. - a new feature enabled and (unintentional) in the same time an other old feature disabled.. :-( ID: 1034829 ·

Helli_retiered Volunteer tester Send message Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0	Message 1034831 - Posted: 20 Sep 2010, 13:56:16 UTC - in response to Message 1034647. Looks like the limits have been raised up, as I have picked up 250ish WU's within the last few hours. I had been banging off of the limit all weekend. Yes, fine. But as always for me - as a Power Cruncher - very late. I'm not sure that i can download >4,000 workunits in the next 24 Hours. :-\| Helli A loooong time ago: First Credits after SETI@home Restart ID: 1034831 ·

SciManStev Volunteer tester Send message Joined: 20 Jun 99 Posts: 6662 Credit: 121,090,076 RAC: 0	Message 1034848 - Posted: 20 Sep 2010, 14:48:46 UTC - in response to Message 1034831. Looks like the limits have been raised up, as I have picked up 250ish WU's within the last few hours. I had been banging off of the limit all weekend. Yes, fine. But as always for me - as a Power Cruncher - very late. I'm not sure that i can download >4,000 workunits in the next 24 Hours. :-\| Helli I have that problem too. It is very difficult to download a lot of work at what turns out to be the last minute. You really have to stay with your rig and baby sit it. I hope the new server will allow things to run with higher limits earlier on. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website ID: 1034848 ·

Helli_retiered Volunteer tester Send message Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0	Message 1034850 - Posted: 20 Sep 2010, 14:54:11 UTC - in response to Message 1034848. Last modified: 20 Sep 2010, 14:54:33 UTC You said it right, Steve. I sit here (maybe the next 4 hours) and push the Retry-Button because i can't wait 11:32:45 Hours for the next automatic connect. ;-) Helli A loooong time ago: First Credits after SETI@home Restart ID: 1034850 ·

Dirk Sadowski Volunteer tester Send message Joined: 6 Apr 07 Posts: 7105 Credit: 147,663,825 RAC: 5	Message 1034865 - Posted: 20 Sep 2010, 15:41:37 UTC - in response to Message 1034850. My machines will DL to tomorrow to the outage.. and with bad luck during the outage they will have a lot of backlogged DLs.. I have only DSL light (384/64 kbit/s).. :-( For example, my GPU machine need at least 5 hours after every 3 day outage to UL all results of 3 days.. and during this time no work request.. After the 940 BE made the UL, then I enable the network for the E7600 machine.. Everyone have his own specially problems.. ;-) ID: 1034865 ·

Helli_retiered Volunteer tester Send message Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0	Message 1034872 - Posted: 20 Sep 2010, 16:09:15 UTC - in response to Message 1034865. Last modified: 20 Sep 2010, 16:10:03 UTC ...Everyone have his own specially problems.. ;-) Yupp. And i don't think that Oscar makes any Difference on this Situation. IMHO the 100MBit Line in Combination with this 3-Day-in-7-Days Outage is the Bottleneck. Maybe a 5-Day-in-14-Days Outage would be a better Choice for us. Helli A loooong time ago: First Credits after SETI@home Restart ID: 1034872 ·

SciManStev Volunteer tester Send message Joined: 20 Jun 99 Posts: 6662 Credit: 121,090,076 RAC: 0	Message 1034887 - Posted: 20 Sep 2010, 16:42:49 UTC - in response to Message 1034872. ...Everyone have his own specially problems.. ;-) Yupp. And i don't think that Oscar makes any Difference on this Situation. IMHO the 100MBit Line in Combination with this 3-Day-in-7-Days Outage is the Bottleneck. Maybe a 5-Day-in-14-Days Outage would be a better Choice for us. Helli I was thinking that after they shuffle the servers around, and add Oscar, the need for an extended down time might be some what aleviated. I am thinking that 96 Gig of RAM can process a whole lot of science. I am most likely wrong, but if the outage is so NTPC can work while not being bombarded with new files, then the ability to handle the files and do science may reduce the need for down time. Actually, I don't really know. Steve Warning, addicted to SETI crunching! Crunching as a member of GPU Users Group. GPUUG Website ID: 1034887 ·

X-Files 27 Send message Joined: 17 May 99 Posts: 104 Credit: 111,191,433 RAC: 0	Message 1034888 - Posted: 20 Sep 2010, 16:43:21 UTC 2 parts is broken: 1st) .VLAR are being sent to GPU 2nd) Prefs of "Use CPU" = No, are getting cpu tasks. ID: 1034888 ·

Helli_retiered Volunteer tester Send message Joined: 15 Dec 99 Posts: 707 Credit: 108,785,585 RAC: 0	Message 1034896 - Posted: 20 Sep 2010, 17:20:10 UTC - in response to Message 1034887. Last modified: 20 Sep 2010, 17:39:18 UTC I was thinking that after they shuffle the servers around, and add Oscar, the need for an extended down time might be some what aleviated. .... If this would be possible, our (Special-)Donations would be a good Investment - for us also. [Edit:] Woah.. Not bad man, not bad. I had luck and found a hole in the wall and downloaded >1,800 Workunits in the past 40 Minutes. Yiippieh.. Helli A loooong time ago: First Credits after SETI@home Restart ID: 1034896 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51540 Credit: 1,018,363,574 RAC: 1,004	Message 1034912 - Posted: 20 Sep 2010, 17:46:08 UTC - in response to Message 1034887. ...Everyone have his own specially problems.. ;-) Yupp. And i don't think that Oscar makes any Difference on this Situation. IMHO the 100MBit Line in Combination with this 3-Day-in-7-Days Outage is the Bottleneck. Maybe a 5-Day-in-14-Days Outage would be a better Choice for us. Helli I was thinking that after they shuffle the servers around, and add Oscar, the need for an extended down time might be some what aleviated. I am thinking that 96 Gig of RAM can process a whole lot of science. I am most likely wrong, but if the outage is so NTPC can work while not being bombarded with new files, then the ability to handle the files and do science may reduce the need for down time. Actually, I don't really know. Steve Oscar should be quite a beast after the generous response I got in my donation thread..... It should be interesting to see if those in charge of actually setting the specs agree with the upgrades I proposed....especially the CPU upgrade to add more cores and speed and allow the RAM to run at it's rated speed. That should all add up to a nice increase in processing power. More bandwidth would be nice at some point, but the weakest link has been the processing power of the server base. With mighty Oscar taking over some of the chores, the tasks running on the other servers might be spread around a bit, and Matt intends to retire some of the less reliable ones. Of course, if their load is lessened, that might not have to happen. He would be the judge of that. If the servers can hold up their end, we can saturate the bandwidth like we have since coming back up last Thursday. And if the downtime is 3 days or less in the future, I think we can get by nicely with what is on tap right now. "Time is simply the mechanism that keeps everything from happening all at once." ID: 1034912 ·

James Sotherden Send message Joined: 16 May 99 Posts: 10436 Credit: 110,373,059 RAC: 54	Message 1034923 - Posted: 20 Sep 2010, 18:00:24 UTC I've been jumping all over the threads, so I can't find which one said that they have the new line run up the hill. I take it that it hasn't been connected yet? If not, any idea when it will be? Oscar and a new line should be a WOW moment right there. Old James ID: 1034923 ·

kittyman Volunteer tester Send message Joined: 9 Jul 00 Posts: 51540 Credit: 1,018,363,574 RAC: 1,004	Message 1034928 - Posted: 20 Sep 2010, 18:08:33 UTC - in response to Message 1034923. I've been jumping all over the threads, so I can't find which one said that they have the new line run up the hill. I take it that it hasn't been connected yet? If not, any idea when it will be? Oscar and a new line should be a WOW moment right there. I could be wrong, but if I recall correctly, the discussion was that the new fiber link was NOT going to increase SETI's bandwidth. "Time is simply the mechanism that keeps everything from happening all at once." ID: 1034928 ·
