Panic Mode On (29) Server problems

perryjay
Volunteer tester
Joined: 20 Aug 02
Posts: 3377
Credit: 20,676,751
RAC: 0
United States
Message 971709 - Posted: 19 Feb 2010, 17:02:43 UTC - in response to Message 971706.  

I do not think bandwidth has anything to do with this since I'm at UC Berkeley.


Have you stopped in to see if they need a hand? Or are you having any problems with a slowdown on the web?


PROUD MEMBER OF Team Starfire World BOINC
ID: 971709
BarryAZ
Joined: 1 Apr 01
Posts: 2580
Credit: 16,982,517
RAC: 0
United States
Message 971711 - Posted: 19 Feb 2010, 17:05:55 UTC - in response to Message 971706.  

I wonder if this is something of a self-solving condition. A (bad) analogy -- in the current ongoing recession, unemployment numbers seem to be getting better -- as an increasing number of people despair of getting work and no longer look for work. Here, in the current ongoing data flow recession, perhaps eventually the numbers will seem to get better as an increasing number of people despair of the system running properly and efficiently and detach from SETI and either uninstall BOINC or move on to projects perhaps a bit less problematic.


ID: 971711
52 Aces
Joined: 7 Jan 02
Posts: 497
Credit: 14,261,068
RAC: 67
United States
Message 971718 - Posted: 19 Feb 2010, 17:12:00 UTC - in response to Message 971678.  

Consensus, backed by source code, is that it represents the filling of a local buffer.

Yes, never trust the source code ;-) Bottom line, call it a visual layer of re-direction that they could instrument off of for the sake of some Warm-n-Fuzzy UI re-assurance.

However, we here at SETI are blessed by small upload file sizes:

A lot of us have said the same things (it's not the size of the upload file, it's that the back end is mercilessly being hammered and has other woes as well), and then drilled down on some of the subtle nuances. In this instance, it's why those In-Progress UL files that indicate "100%" are back at square zero on the next go round. There is defensive code in the progress bar which actually stops the bar at 100% --> because of a situation with a header <-- but ahhhh, it would set it to 100% under any circumstance where bytes sent EXCEED the size of the file to begin with. So on a hunch, I was looking for a situation where a RESEND of the buffer may have come back up the pipeline, thus the same buffer gets counted twice (ergo, without that defensive code, progress might go up to 600% or more before failing). Alas, I didn't find a quick smoking gun; sleuthing it got complex quickly, so I left it as it was ;-)
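
For anyone who wants to picture that clamp, here is a minimal sketch in Python. It is not the BOINC client source: the function names and the 30 kB file size are made up for illustration, and the "counted six times" case is just the hypothetical resend scenario described above.

```python
def unclamped_progress(bytes_sent: int, file_size: int) -> float:
    """Raw percentage, with no protection against over-counting."""
    return 100.0 * bytes_sent / file_size


def upload_progress(bytes_sent: int, file_size: int) -> float:
    """Percentage for a progress bar, clamped to the range 0..100.

    Any case where bytes_sent exceeds file_size (for example a buffer
    counted more than once after a resend) simply shows as 100%.
    """
    if file_size <= 0:
        return 0.0
    raw = 100.0 * bytes_sent / file_size
    return max(0.0, min(raw, 100.0))


# Hypothetical case: a 30 kB result whose buffer gets counted six times.
file_size = 30_000
bytes_counted = 6 * file_size

print(unclamped_progress(bytes_counted, file_size))  # 600.0
print(upload_progress(bytes_counted, file_size))     # 100.0
```

With a clamp like this the bar can read "100%" even though the server never accepted the file, which would fit with the transfer starting from zero on the next attempt.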
ID: 971718
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 971727 - Posted: 19 Feb 2010, 17:26:25 UTC - in response to Message 971709.  

I do not think bandwidth has anything to do with this since I'm at UC Berkeley.


Have you stopped in to see if they need a hand? Or are you having any problems with a slowdown on the web?

and the fact that data transfers don't go through the campus network.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the [url=http://tinyurl.com/8y46zvu]BP6/VP6 User Group[/url]
ID: 971727
PhonAcq
Joined: 14 Apr 01
Posts: 1656
Credit: 30,658,217
RAC: 1
United States
Message 971738 - Posted: 19 Feb 2010, 17:42:10 UTC

In situations such as the current one, why don't they just disable all the splitters and let the output queue drain (assuming it is able to)? I don't remember the interdependencies of the BOINC system, but surely that must serve to reduce the loading while they work on the problem.

I suppose if the servers were organized for this scenario, they could even shut down the power on one or more of them, or re-task them for debugging.

Or, although this may take more effort in the long run, just turn off the MB splitters. That should reduce the bandwidth loading at least.

Yes, people won't be happy if they can't sustain their RAC. But do I really need to mention how silly that objective is (in real life)? And they will return when the system is fully up again because we are all addicted to this project (why else would you be reading these boards?).

(And turn off the message boards for a while; hang a big sign out saying "Gone Fishing" or some such.)
ID: 971738
W5DMG - Dave
Joined: 19 May 99
Posts: 155
Credit: 33,162,251
RAC: 0
United States
Message 971746 - Posted: 19 Feb 2010, 17:51:21 UTC

2/19/2010 11:05:59 AM SETI@home Reporting 237 completed tasks, not requesting new tasks
2/19/2010 11:06:21 AM Project communication failed: attempting access to reference site
2/19/2010 11:06:23 AM Internet access OK - project servers may be temporarily down.

Still waiting, hopefully this will all be fixed by tonight.
ID: 971746
[B^S] madmac
Volunteer tester
Joined: 9 Feb 04
Posts: 1175
Credit: 4,754,897
RAC: 0
United Kingdom
Message 971747 - Posted: 19 Feb 2010, 17:52:12 UTC

I have had some uploaded, but now the scheduler is causing trouble, with either an internal server error or the old "project servers may be down" message. On this machine I have one left to do, which makes four to upload even though two are at 100% and I have got the "finished uploading" message in my message board. So I will do another BOINC project whilst this is sorted out.
ID: 971747
ccappel
Joined: 27 Jan 00
Posts: 362
Credit: 1,516,412
RAC: 0
United States
Message 971748 - Posted: 19 Feb 2010, 17:53:24 UTC - in response to Message 971682.  

I have another task about to complete in the next 10 minutes, I'll let you know how the upload/report goes. Finally I have nothing in my upload or report queue.

Nope, multiple retry attempts on the upload finally went thru, but I now have 4 tasks ready to report again.
"Life is a tragedy for those who feel, and a comedy for those who think."

"I never get into an argument that I cannot win."
ID: 971748
Gatekeeper
Joined: 14 Jul 04
Posts: 887
Credit: 176,479,616
RAC: 0
United States
Message 971763 - Posted: 19 Feb 2010, 18:15:08 UTC

Cricket graph is flatlining - hopefully it's a sign of some affirmative action on the part of staff, and not an indication of more problems.
ID: 971763
Bill Walker
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 971775 - Posted: 19 Feb 2010, 18:36:04 UTC - in response to Message 971763.  

Also just noticed that Thumper and the primary science database are off line. Possibly related?

Just confirms the horrible suspicion I've had all along: SETI is just here for the science, not to keep us credit-hounds happy.

ID: 971775
hiamps
Volunteer tester
Joined: 23 May 99
Posts: 4292
Credit: 72,971,319
RAC: 0
United States
Message 971776 - Posted: 19 Feb 2010, 18:37:52 UTC - in response to Message 971775.  

Also just noticed that Thumper and the primary science database are off line. Possibly related?

Just confirms the horrible suspicion I've had all along: SETI is just here for the science, not to keep us credit-hounds happy.

That's a + in my book, maybe someone realizes something is wrong, or we killed it.
Official Abuser of Boinc Buttons...
And no good credit hound!
ID: 971776
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 971779 - Posted: 19 Feb 2010, 18:39:28 UTC - in response to Message 971678.  

Here's another observation, with this outage confirming previous observations: a BOINC upload seems to have a greater chance of success if it's a newly-finished task, than if it's an old one which has tried several times before.

That's something i've noticed as well.

Grant
Darwin NT
ID: 971779
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 971780 - Posted: 19 Feb 2010, 18:39:31 UTC - in response to Message 971679.  

Oooooops... New strategy is needed. I just lost about 100 completed jobs because they wouldn't and still haven't uploaded.
Oh well. I guess it's back to letting BOINC handle it all.

Let them upload, and cross your fingers. The odds are good that you'll get them in before the work can be reissued, crunched and returned -- and you will get credit.
ID: 971780
Bill Walker
Joined: 4 Sep 99
Posts: 3868
Credit: 2,697,267
RAC: 0
Canada
Message 971789 - Posted: 19 Feb 2010, 18:49:37 UTC - in response to Message 971779.  

Here's another observation, with this outage confirming previous observations: a BOINC upload seems to have a greater chance of success if it's a newly-finished task, than if it's an old one which has tried several times before.

That's something i've noticed as well.


Is that just an artifact of the current random success of uploads, plus backing off of re-sends? For several days I had about 5 pending uploads hanging around, and completed about 10 a day. That means in any given time period I had twice as many new completions as old pendings. If they all go through with equal probability, I would expect to see twice as many successful uploads of new tasks per time interval than of old pending transfers.

I think the longer the pendings stay around, the longer Task Manager backs off the time between attempted re-sends, which would also tend to make the pendings look slower and slower.
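
The equal-probability argument above can be checked with a couple of lines of arithmetic. This is only a sketch: the per-upload success rate `P_SUCCESS` is a made-up number (the ratio does not depend on it), and the counts of 10 new and 5 old uploads are the rough figures quoted in this post.

```python
def expected_successes(uploads: int, p_success: float) -> float:
    """Expected number of uploads that get through if each one
    succeeds independently with probability p_success."""
    return uploads * p_success


P_SUCCESS = 0.25   # hypothetical chance that any one upload gets through
NEW_PER_DAY = 10   # newly finished tasks per day (rough figure from the post)
OLD_PENDING = 5    # old uploads still waiting (rough figure from the post)

new_ok = expected_successes(NEW_PER_DAY, P_SUCCESS)
old_ok = expected_successes(OLD_PENDING, P_SUCCESS)
ratio = new_ok / old_ok

print(f"new: {new_ok}, old: {old_ok}, ratio: {ratio}")
```

Under these assumptions the new tasks show twice as many successes purely because there are twice as many of them; any extra back-off applied to the old ones would widen that gap further.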

ID: 971789
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 971792 - Posted: 19 Feb 2010, 18:56:07 UTC

All my uploads went through within the last 18 hours, all without me pressing a button. BOINC did its job and I'm happy.
ID: 971792
ccappel
Joined: 27 Jan 00
Posts: 362
Credit: 1,516,412
RAC: 0
United States
Message 971795 - Posted: 19 Feb 2010, 18:58:08 UTC - in response to Message 971792.  
Last modified: 19 Feb 2010, 18:58:29 UTC

All my uploads went through within the last 18 hours, all without me pressing a button. BOINC did its job and I'm happy.

What about your reports? I have all my uploads out of the way (without intervention), but my "ready to report" list is growing.
"Life is a tragedy for those who feel, and a comedy for those who think."

"I never get into an argument that I cannot win."
ID: 971795
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 971797 - Posted: 19 Feb 2010, 19:00:25 UTC - in response to Message 971795.  
Last modified: 19 Feb 2010, 19:02:40 UTC

All my uploads went through within the last 18 hours, all without me pressing a button. BOINC did its job and I'm happy.

What about your reports? I have all my uploads out of the way (without intervention), but my "ready to report" list is growing.


The ones from last night are gone. I have new downloads that have finished and uploaded, but are still waiting to report.

[Edit] Hold on, upon double-checking my logs, these are the same ones from last night still trying to report.
ID: 971797
Grant (SSSF)
Volunteer tester
Joined: 19 Aug 99
Posts: 13727
Credit: 208,696,464
RAC: 304
Australia
Message 971798 - Posted: 19 Feb 2010, 19:00:29 UTC - in response to Message 971789.  

Here's another observation, with this outage confirming previous observations: a BOINC upload seems to have a greater chance of success if it's a newly-finished task, than if it's an old one which has tried several times before.

That's something i've noticed as well.


Is that just an artifact of the current random success of uploads, plus backing off of re-sends? For several days I had about 5 pending uploads hanging around, and completed about 10 a day. That means in any given time period I had twice as many new completions as old pendings. If they all go through with equal probability, I would expect to see twice as many successful uploads of new tasks per time interval than of old pending transfers.

I think the longer the pendings stay around, the longer Task Manager backs off the time between attempted re-sends, which would also tend to make the pendings look slower and slower.

Nope.

As Richard posted, older stuck uploads take more retries before they finally go through - i've noticed this whenever there's been a big backlog of uploads. After a lot of button abuse i was able to get all of my uploads to go through, and the oldest ones in the queue were the last ones to eventually upload; and they took a lot of manual re-tries on my part.

Now, uploads are going through without manual intervention, but it's taking as many as 12 retries before that happens, and i've noticed the upload speeds are often [i]less[/i] than 1kB/s. For me it's usually around 30kB/s.
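
To put "as many as 12 retries" in perspective, here is a rough sketch of how the waiting adds up if a client doubles its delay after each failed attempt. The doubling rule, the 60-second starting delay and the 4-hour cap are all assumptions for illustration, not values taken from the BOINC client.

```python
def backoff_delays(retries, base_s=60.0, cap_s=4 * 3600.0):
    """Delay in seconds before each retry: doubles every time, up to cap_s.

    base_s and cap_s are hypothetical values chosen for this sketch.
    """
    return [min(base_s * 2 ** i, cap_s) for i in range(retries)]


delays = backoff_delays(12)
total_hours = sum(delays) / 3600

for attempt, delay in enumerate(delays, start=1):
    print(f"retry {attempt:2d}: wait {delay / 60:6.1f} min")
print(f"total wait: {total_hours:.2f} h")
```

With these made-up numbers, 12 failed attempts add up to roughly 20 hours of waiting, which shows how quickly automatic retries can stretch a single upload out over most of a day.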
Grant
Darwin NT
ID: 971798
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 971799 - Posted: 19 Feb 2010, 19:00:42 UTC
Last modified: 19 Feb 2010, 19:01:49 UTC

Thought this would be over by now........


Watch this a couple of dozen times.....
and if that don't work,


Stuff that in your Gorilla.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 971799


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.