Panic Mode On (78) Server Problems?

Message boards : Number crunching : Panic Mode On (78) Server Problems?
Filipe

Joined: 12 Aug 00
Posts: 218
Credit: 21,281,677
RAC: 20
Portugal
Message 1304466 - Posted: 10 Nov 2012, 14:22:02 UTC
Last modified: 10 Nov 2012, 14:22:12 UTC

The Astropulse tapes are almost completely split. Only 25 channels remain. Maybe it will help.
ID: 1304466 · Report as offensive
WezH
Volunteer tester

Joined: 19 Aug 99
Posts: 576
Credit: 67,033,957
RAC: 95
Finland
Message 1304472 - Posted: 10 Nov 2012, 14:33:40 UTC - in response to Message 1304466.  

The Astropulse tapes are almost completely split. Only 25 channels remain. Maybe it will help.


Hope so. I do have a feeling that AP tasks with 8 MB downloads are the problem now. Let's wait and see what happens when all AP units are out and running... If that ever happens...
"Please keep Your signature under four lines so Internet traffic doesn't go up too much"

- In 1992 when I had my first e-mail address -
ID: 1304472 · Report as offensive
Fred E.
Volunteer tester
Joined: 22 Jul 99
Posts: 768
Credit: 24,140,697
RAC: 0
United States
Message 1304484 - Posted: 10 Nov 2012, 14:49:01 UTC
Last modified: 10 Nov 2012, 14:50:00 UTC

The Astropulse tapes are almost completely split. Only 25 channels remain. Maybe it will help.

Maybe so. Will take the splitting load off Synergy. Haven't had many downloads, but some have hung, so might get help there also.

Just had two interesting work requests. I was a little under the limit on CPU tasks, so I came off NNT. The first work request completed in 8 seconds with one task: I had my cache setting too low and that one task filled the request. I bumped the cache up and the next request completed in 7 seconds with the other tasks to take me to the limit. I'm not bragging or saying things are better across the board, just that simple requests that are not reporting completions seem to be easier for Synergy to handle, and it can still handle them quickly. I haven't seen much of that lately, and it seems to rule out some theories.
Another Fred
Support SETI@home when you search the Web with GoodSearch or shop online with GoodShop.
ID: 1304484 · Report as offensive
cdemers
Volunteer tester

Joined: 18 May 99
Posts: 30
Credit: 17,235,002
RAC: 0
Canada
Message 1304550 - Posted: 10 Nov 2012, 17:14:02 UTC

A suggestion for Windows users having download problems: you can try this. It helped me.

*DO THIS AT YOUR OWN RISK*

For me, Windows 7 was not setting the TCP window size/TCP optimizations correctly. I had not run this since my last reload and thought I would try it. It corrected most of my network connection problems. This is not perfect, but I am no longer getting TCP connection issues and downloads/uploads are proceeding more smoothly. The program must be run as administrator; just select the Optimal radio button at the bottom and apply, then reboot.

SG TCP Optimizer info about program:
http://www.speedguide.net/tcpoptimizer.php

Download here:
http://www.speedguide.net/downloads.php

So far I have downloaded 100+ tasks with only minor retries on some of the downloads. Uploads, which were piling up, are going through on the first shot just about every time. And this is with NNT off. I was tired of constantly allowing/disallowing tasks; it seems to be going reasonably well now.

So for me this looks like mostly a Windows networking issue. For others it may be a different problem. I would only recommend using the Optimal radio button in the program unless you really know what you're doing and fully understand how TCP works.
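
For readers wondering why the TCP window size could matter here, the sketch below works through the standard bandwidth-delay arithmetic: a single TCP connection cannot move more than one receive window per round trip. The link speed and round-trip time are assumed values for illustration, not measurements from this thread, and nothing here confirms that window size was the actual cause of anyone's download problems.

```python
# Illustrative arithmetic only. The 100 ms round-trip time and 20 Mbit/s link
# speed are assumed example values, not figures reported in the thread.

def bandwidth_delay_product(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep a link of this speed full."""
    return bandwidth_bits_per_s * rtt_s / 8


def max_throughput(window_bytes: int, rtt_s: float) -> float:
    """Upper bound, in bits per second, for one TCP connection with this window."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    rtt = 0.1  # assumed round-trip time in seconds

    # 65535 bytes is the largest window TCP can advertise without window scaling.
    capped = max_throughput(65535, rtt)
    print(f"64 KiB window, 100 ms RTT: at most {capped / 1e6:.2f} Mbit/s")

    needed = bandwidth_delay_product(20e6, rtt)
    print(f"Window needed to fill a 20 Mbit/s link: {needed / 1024:.0f} KiB")
```

The point of the comparison is that a window that is too small caps throughput no matter how fast the link is, which is one way a mis-tuned client could see slow or stalling transfers.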

If anyone else tries this, report back whether or not it works for you.

*DO THIS AT YOUR OWN RISK*

This is just my 0.02 cents.

ID: 1304550 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 1304573 - Posted: 10 Nov 2012, 18:09:51 UTC

It seems to me (I am not a code person) that the problem back at the project level is pretty clearly with the scheduler.

Typically, when there is no work to offer, or no work to offer for a given client, when the scheduler is working properly, it sends back a reply -- no work available, and the client then moves on to other tasks.

The current messed up SETI scheduler is responding to new work requests with 'looking', 'looking' -- and is hung up in a loop.

Only when the client (after 5 minutes or so) realizes that the scheduler is definitely out to lunch, does the client then move on to other tasks.

This problem AT THE SERVER creates a two-part problem. On the client side, it impedes communications with other projects (or reporting of work to SETI) as the 'phone is off the hook'. On the project side, it creates a major traffic problem, as client scheduler connections, which typically run a minute or two, are all running 5 minutes or longer.
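
The traffic effect described above can be put into rough numbers with Little's law (average connections open = arrival rate x time each connection stays open). This is only a sketch: the request rate is a made-up figure, and the hold times are the "minute or two" and "5 minutes" mentioned in this post, not measured server data.

```python
# Little's law sketch. The arrival rate is hypothetical; the hold times are
# the durations mentioned in the post, taken at face value.

def concurrent_connections(requests_per_s: float, hold_time_s: float) -> float:
    """Average number of connections open at once (Little's law)."""
    return requests_per_s * hold_time_s


if __name__ == "__main__":
    rate = 50.0  # hypothetical scheduler requests per second
    for label, hold in [("1 minute", 60), ("2 minutes", 120), ("5 minutes", 300)]:
        open_now = concurrent_connections(rate, hold)
        print(f"{label:>9} per request -> about {open_now:,.0f} connections open")
```

Whatever the real request rate is, it cancels out of the comparison: stretching each connection from 2 minutes to 5 minutes multiplies the number of simultaneously open connections by 2.5.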

OK -- that much seems pretty obvious at a layman level.

For users (we are all users here), I think the current frustration has several components.

First -- I've really not seen anything from the project that suggests they have looked at this problem yet -- I suspect they have, but the information vacuum is troublesome. I realize Matt isn't around at the moment, so our primary project communications 'good guy' isn't in the loop.

Second -- this problem has persisted now for 10 days or more -- in a project of this size that is quite severe.

Third -- folks have run out of work to process -- and that adds to the level of angst folks have.

At this point it seems the best approach is for everyone to go to 'no new work' mode and for somebody from the project to communicate.

ID: 1304573 · Report as offensive
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1304576 - Posted: 10 Nov 2012, 18:14:41 UTC

BarryAZ wrote:

For users (we are all users here),


No sir, I am a program ;-)
ID: 1304576 · Report as offensive
BarryAZ

Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 1304578 - Posted: 10 Nov 2012, 18:21:10 UTC - in response to Message 1304576.  

I was sort of wondering a bit -- since corporations are people, are CPUs people, and if CPUs are people, are programs people? <grin>
ID: 1304578 · Report as offensive
Tom*

Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1304581 - Posted: 10 Nov 2012, 18:22:08 UTC - in response to Message 1304550.  
Last modified: 10 Nov 2012, 18:27:34 UTC

cdemers wrote:
"For me Windows 7 was not setting TCP window size/TCP optimizations correctly. Since my last reload I had not run this and thought I would try it. It corrected most of my network connection problems"

Hmm, my Windows 7 fix for the "lost / ghost tasks clogging up the scheduler" problem was to use a proxy / concentrator. I thought at first this worked because of the proximity of the proxy server (San Jose) to SETI, but now that cdemers has seen good results with TCP changes, it is much more likely that it works because the proxy server does not use Windows.

Good work, cdemers. Now, does this help anyone else, or just me?

PS - I could not get any downloads to work until I started using a proxy. I have not had any problems scheduling work, downloading, or uploading since 06 Nov, when I switched to a proxy server.
ID: 1304581 · Report as offensive
Michael W.F. Miles
Joined: 24 Mar 07
Posts: 268
Credit: 34,410,870
RAC: 0
Canada
Message 1304614 - Posted: 10 Nov 2012, 19:47:25 UTC

Well, I have put this machine on an HTTP proxy server and it is going like a bat out of hell, except every task is a GHOST task.
All resends, every one.

Here is the proxy address I am using
201.147.20.245:80

Not sure where it is from but it works.

Now if I just get by the limits

Michael Miles
The Assimilators

ID: 1304614 · Report as offensive
juan BFP
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1304621 - Posted: 10 Nov 2012, 20:27:05 UTC - in response to Message 1304614.  

Mexico... Make a pit stop for a Tequila!
ID: 1304621 · Report as offensive
fscheel

Joined: 13 Apr 12
Posts: 73
Credit: 11,135,641
RAC: 0
United States
Message 1304624 - Posted: 10 Nov 2012, 20:36:14 UTC - in response to Message 1304614.  
Last modified: 10 Nov 2012, 20:37:34 UTC

The proxy is working. Downloads are very slow, but it sure beats getting nothing.

Can someone suggest one that might be faster?
ID: 1304624 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1304651 - Posted: 10 Nov 2012, 21:56:35 UTC

Three channels of AP to go . . . .
tick
tock
tick
tock
:¬)
ID: 1304651 · Report as offensive
Brother Frank
Joined: 10 Dec 11
Posts: 26
Credit: 15,142,410
RAC: 0
United States
Message 1304658 - Posted: 10 Nov 2012, 22:13:24 UTC - in response to Message 1304581.  

Tom, I do notice that under the Tools option in the BOINC Manager menu there are ways of setting up connections with two kinds of proxies: one is a SOCKS proxy and there is one other option. Is this where I would put my proxy's address and port numbers?

The next question is how to find these proxies. I am not at all sure how to do that and where to look to find it. I use Comcast as my ISP and have an enhanced help service available to me.

I am running pretty well with my notebook computers except for the very powerful one which is now not able to download much at all when I set to no new tasks. My two desktops with good Nvidia graphics cards are gradually running out of work.

Perhaps you or some others here that are informed and good explainers of setting up the proxies can give me some informed finding aids and tools here. I am willing to work at it once I know how to start. Thanks in advance for any help you can give me. It will probably help many others too. I am sorry to say this, but we are really getting very little help out of the project as this grinds many of us to a halt. Brother Frank
ID: 1304658 · Report as offensive
cdemers
Volunteer tester

Joined: 18 May 99
Posts: 30
Credit: 17,235,002
RAC: 0
Canada
Message 1304773 - Posted: 11 Nov 2012, 3:54:53 UTC

Just as a followup, everything still running much more smoothly. Only getting the odd download that needs a retry. Uploads have been going ok, just a little slow. Downloads have picked up speed.

ID: 1304773 · Report as offensive
.clair.

Joined: 4 Nov 04
Posts: 1300
Credit: 55,390,408
RAC: 69
United Kingdom
Message 1304776 - Posted: 11 Nov 2012, 3:59:45 UTC

AP is done,
so now we will see if that makes any difference :¬)
Though it will take some time to get through backlogs of downloads and ghosts.
ID: 1304776 · Report as offensive
BetelgeuseFive
Volunteer tester
Joined: 6 Jul 99
Posts: 158
Credit: 17,117,787
RAC: 19
Netherlands
Message 1304830 - Posted: 11 Nov 2012, 10:07:23 UTC - in response to Message 1304776.  


Don't know if it has to do with no more AP ready to split, but things are definitely better than they have been in days. I am now able to report completed tasks and request new ones in a single transaction and I am receiving lots of new work (and most of the downloads complete on the first try).

Tom


AP is done,
so now we will see if that makes any difference :¬)
Though it will take some time to get through backlogs of downloads and ghosts.


ID: 1304830 · Report as offensive
Vipin Palazhi
Joined: 29 Feb 08
Posts: 286
Credit: 167,386,578
RAC: 0
India
Message 1304863 - Posted: 11 Nov 2012, 11:33:51 UTC

My RAC has been steadily declining, and I have noticed that the task lists for two of my rigs show most of the assigned tasks under Error, with the status "abandoned". Could anyone tell me why this might occur? The rigs still have all the tasks and are crunching them but, obviously, are not gaining any credit for the work being done. Should I reset the rigs, or is this something that will get sorted out automatically?
ID: 1304863 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1304870 - Posted: 11 Nov 2012, 11:52:44 UTC - in response to Message 1304866.  

I cannot believe this!!

I accidentally unset NNT on one machine. Realised quite quickly and reset it. However, although it reported "Project communication failed", when I checked I had 56 ghosts.

Now to get them I had to unset NNT again. I got them and another 81 ghosts!! As this was a mistake, and I don't crunch SETI on this machine any more, I am trying to decide what to do, as I don't really want to abandon them, but SETI@Home no longer deserves my time and effort!

Very annoying!!!

Set a smaller cache size and don't try to get the full 20 resends at once. Just because AP isn't being split doesn't mean everything is magically fixed; scheduler contacts still sometimes take a long time:

11/11/2012 10:26:12 SETI@home [sched_op_debug] Starting scheduler request
11/11/2012 10:26:12 SETI@home Sending scheduler request: Requested by user.
11/11/2012 10:26:12 SETI@home Reporting 1 completed tasks, requesting new tasks for CPU
11/11/2012 10:26:12 SETI@home [sched_op_debug] CPU work request: 409918.70 seconds; 0.00 CPUs
11/11/2012 10:26:12 SETI@home [sched_op_debug] NVIDIA GPU work request: 0.00 seconds; 0.00 GPUs
11/11/2012 10:26:12 SETI@home [sched_op_debug] ATI GPU work request: 0.00 seconds; 0.00 GPUs
11/11/2012 10:30:20 SETI@home Scheduler request completed: got 7 new tasks
11/11/2012 10:30:20 SETI@home [sched_op_debug] Server version 701
11/11/2012 10:30:20 SETI@home Message from server: Resent lost task 04se12aa.20122.12750.140733193388047.10.230_0
11/11/2012 10:30:20 SETI@home Message from server: Resent lost task 04se12aa.20122.12750.140733193388047.10.233_0
11/11/2012 10:30:20 SETI@home Message from server: Resent lost task 04se12aa.20269.12750.140733193388048.10.255_0
11/11/2012 10:30:20 SETI@home Message from server: Resent lost task 04se12aa.20122.12750.140733193388047.10.241_0
11/11/2012 10:30:20 SETI@home Message from server: Resent lost task 04se12aa.20122.12750.140733193388047.10.247_0
11/11/2012 10:30:20 SETI@home Message from server: Resent lost task 04se12aa.20269.12750.140733193388048.10.205_1
11/11/2012 10:30:20 SETI@home Message from server: Resent lost task 04se12aa.20269.12750.140733193388048.10.200_0
11/11/2012 10:30:20 SETI@home Project requested delay of 303 seconds
11/11/2012 10:30:20 SETI@home [sched_op_debug] estimated total CPU job duration: 8399 seconds
11/11/2012 10:30:20 SETI@home [sched_op_debug] estimated total NVIDIA GPU job duration: 0 seconds
11/11/2012 10:30:20 SETI@home [sched_op_debug] estimated total ATI GPU job duration: 0 seconds
11/11/2012 10:30:20 SETI@home [sched_op_debug] handle_scheduler_reply(): got ack for result 29se12ab.30551.24198.140733193388036.10.127_1
11/11/2012 10:30:20 SETI@home [sched_op_debug] Deferring communication for 5 min 3 sec
11/11/2012 10:30:20 SETI@home [sched_op_debug] Reason: requested by project
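
To put a number on "a long time", the timestamps in the log above can be subtracted directly. The script below is a minimal sketch that parses a few of those lines. The helper names are made up for illustration, and the day/month order of the date format is an assumption (it makes no difference for 11/11).

```python
import re
from datetime import datetime

# Excerpt copied from the log above; the remaining lines are omitted.
LOG = """\
11/11/2012 10:26:12 SETI@home [sched_op_debug] Starting scheduler request
11/11/2012 10:26:12 SETI@home [sched_op_debug] CPU work request: 409918.70 seconds; 0.00 CPUs
11/11/2012 10:30:20 SETI@home Scheduler request completed: got 7 new tasks
11/11/2012 10:30:20 SETI@home Project requested delay of 303 seconds
11/11/2012 10:30:20 SETI@home [sched_op_debug] estimated total CPU job duration: 8399 seconds
"""

STAMP_FORMAT = "%d/%m/%Y %H:%M:%S"  # assumed day/month/year order


def stamp(line: str) -> datetime:
    """Parse the 19-character timestamp at the start of a log line."""
    return datetime.strptime(line[:19], STAMP_FORMAT)


def find_line(lines, marker: str) -> str:
    """Return the first line containing the marker text."""
    return next(line for line in lines if marker in line)


def summarize(log: str) -> dict:
    """Extract how long the request took and how much work was asked for and sent."""
    lines = log.splitlines()
    started = stamp(find_line(lines, "Starting scheduler request"))
    finished = stamp(find_line(lines, "Scheduler request completed"))
    requested = re.search(r"CPU work request: ([\d.]+) seconds", log).group(1)
    granted = re.search(r"estimated total CPU job duration: (\d+) seconds", log).group(1)
    tasks = re.search(r"got (\d+) new tasks", log).group(1)
    return {
        "elapsed_s": (finished - started).total_seconds(),
        "requested_s": float(requested),
        "granted_s": float(granted),
        "tasks": int(tasks),
    }


if __name__ == "__main__":
    info = summarize(LOG)
    print(f"Scheduler request took {info['elapsed_s']:.0f} s")
    print(f"Asked for {info['requested_s'] / 86400:.2f} days of CPU work")
    print(f"Received {info['tasks']} tasks, about {info['granted_s']:.0f} s of work")
```

For this request the gap between "Starting scheduler request" (10:26:12) and "Scheduler request completed" (10:30:20) is 248 seconds, just over four minutes for a single scheduler contact.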

Claggy
ID: 1304870 · Report as offensive
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1304872 - Posted: 11 Nov 2012, 11:56:09 UTC - in response to Message 1304870.  

Sorry realised I don't care. I will abandon the ones I have and the other will time out naturally.
ID: 1304872 · Report as offensive
Claggy
Volunteer tester

Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1304877 - Posted: 11 Nov 2012, 12:03:10 UTC - in response to Message 1304872.  

The servers are still in recovery after the AP splitting; no doubt it'll be some time before everyone's ghosts are resent.

There is a scheduler bug fix in the works. Hopefully it'll be deployed at SETI Beta on Monday. I'm not expecting it to be a total cure, just a step in the right direction.

Claggy
ID: 1304877 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.