Panic Mode On (20) Server problems

S@NL - Eesger - www.knoop.nl
Joined: 7 Oct 01
Posts: 385
Credit: 50,200,038
RAC: 0
Netherlands
Message 917595 - Posted: 14 Jul 2009, 13:39:04 UTC - in response to Message 917540.  

I have noticed that BOINC often "forgets" to upload results and I have to do a manual Update to get them to send in. I'll see logs of "finished wu, uploading wu, downloading wu" and such with no errors. Just will have 6-10 results "Ready for upload" in my tasks. I have read around to see if this is a "normal" thnk to occur, but it happens on all of my hosts.


Uploading and reporting are two different things; you may be referring to that? A wu that has been processed should be uploaded fairly fast (assuming the i-net connection to Berkeley/SETI is good). Reporting is usually done in larger chunks, because the reporting part takes resources and is more efficient done in 'bursts'. A manual reporting action partially circumvents this.
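
To make the upload/report distinction concrete, here is a minimal toy model in Python. It is only a sketch: the class and method names (`ToyClient`, `finish_task`, `contact_scheduler`) are made up for illustration and are not BOINC's real client code or API. It assumes each finished result is uploaded immediately, while reporting waits until the next scheduler contact and then covers all pending results at once.

```python
from dataclasses import dataclass, field


@dataclass
class ToyClient:
    """Toy model only; not BOINC's real client logic or API."""
    uploaded: list = field(default_factory=list)         # result files already sent
    ready_to_report: list = field(default_factory=list)  # uploaded, not yet reported
    scheduler_contacts: int = 0

    def finish_task(self, name: str) -> None:
        # The result file goes out right away (assuming the upload server is reachable).
        self.uploaded.append(name)
        # Reporting is deferred so several results can share one scheduler request.
        self.ready_to_report.append(name)

    def contact_scheduler(self) -> list:
        # One request reports every pending result in a single burst.
        self.scheduler_contacts += 1
        reported, self.ready_to_report = self.ready_to_report, []
        return reported


client = ToyClient()
for wu in ["wu_a", "wu_b", "wu_c", "wu_d"]:
    client.finish_task(wu)

batch = client.contact_scheduler()  # plays the role of a manual Update
print(batch, client.scheduler_contacts)
```

In this model four results cost one scheduler request instead of four, which is the 'burst' efficiency described above.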

Hope to have (partly?) answered your question

Eesger
The SETI@Home Gauntlet 2012 april 16 - 30| info / chat | STATS
ID: 917595
Lint trap
Joined: 30 May 03
Posts: 871
Credit: 28,092,319
RAC: 0
United States
Message 917612 - Posted: 14 Jul 2009, 14:56:55 UTC - in response to Message 917584.  


Absolutely, there are bumps on any road/path you will choose.

and, sometimes, ways to smooth them somewhat.

If you are SETI-centric, as I am ("Give me MB wu's or give me 'death'", i.e. an idle processor), the rebranding tools come in very handy at times like these.

Reschedule.exe can be your friend to move some non-VLAR/VHAR work to a GPU.

Just don't get carried away with the percentage, to avoid problems later. I use 24-28%, which is more or less in line with my assignment mix from the servers.

Martin
ID: 917612
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 917625 - Posted: 14 Jul 2009, 16:07:35 UTC
Last modified: 14 Jul 2009, 16:12:51 UTC


I had a ~ 4 to ~ 5 day WU cache on my GPU cruncher and ran out of work two or three times in two months.

It crunches ~ 800 ['normal'] MB WUs/day.

And the reschedule prog [CPU/GPU WUs] isn't for my GPU cruncher. I crunch only on the GPUs for best crunching performance.
If I also crunched on the CPU, the GPU performance would be very bad.

For PCs with a GPU, I would recommend leaving one CPU core idle per one or two GPUs.
I mean, don't crunch WUs on these CPU cores. Use them only for GPU support.

O.K., it depends on the GPU card, but I would do it with the GTX2xx series.
On my PC, for example, the GPU (OCed GTX260 Core216) [~ 06:45] is ~ 9 x faster than one CPU core (AMD Phenom II X4 940 BE) [~ 60:00] [AR=0.44x]. [m:s]
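
The "~ 9 x" figure can be checked from the two run times quoted above. The short Python sketch below is only arithmetic on those numbers; the helper name `to_seconds` is made up for illustration.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' run time into seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


gpu = to_seconds("06:45")  # GPU run time per WU, as quoted in the post
cpu = to_seconds("60:00")  # run time per WU on one CPU core, as quoted in the post

speedup = cpu / gpu
print(f"GPU is about {speedup:.1f}x faster than one CPU core")
```

With 405 s against 3600 s the ratio comes out just under 9, in line with the rough estimate.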

ID: 917625
Dirk Sadowski
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 917629 - Posted: 14 Jul 2009, 16:17:54 UTC


Ahh.. BTW.. the weekly maintenance is good for the UL..

Today (if the UL server is online) all the results can go home very quickly, like last week.

ID: 917629
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 917630 - Posted: 14 Jul 2009, 16:19:14 UTC

They should turn off the scheduler for the weekly maintenance in 15 minutes or so (unless they're celebrating Bastille Day).

When the downloads stop, there should be no problem with uploads. So - provided they remember to turn Bruno back on - we've all got four hours or so of nice clean bandwidth to get the uploads through. Then when the scheduler comes back after maintenance, we can report them, and form a mad queue at the download point again.

That's my theory, at least.
ID: 917630
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 917638 - Posted: 14 Jul 2009, 16:43:00 UTC - in response to Message 917630.  

They should turn off the scheduler for the weekly maintenance in 15 minutes or so (unless they're celebrating Bastille Day).

When the downloads stop, there should be no problem with uploads. So - provided they remember to turn Bruno back on - we've all got four hours or so of nice clean bandwidth to get the uploads through. Then when the scheduler comes back after maintenance, we can report them, and form a mad queue at the download point again.

That's my theory, at least.

Sounds like a plan. I can agree with that.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 917638
samuel7
Volunteer tester
Joined: 2 Jan 00
Posts: 47
Credit: 2,194,240
RAC: 0
Finland
Message 917653 - Posted: 14 Jul 2009, 17:08:07 UTC - in response to Message 917581.  

Hi, a look at the server status page tells me the UPLOAD SERVER is disabled.
Don't know why it's DISABLED?!

Maybe after the regular maintenance outage, today, they will turn it on, if it functions correctly.



Hopefully, it is turned on during the outage so we can upload our completed tasks while there's no download traffic. Sounds like a plan Matt could have thought of...
ID: 917653
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 917656 - Posted: 14 Jul 2009, 17:10:36 UTC - in response to Message 917595.  
Last modified: 14 Jul 2009, 17:12:17 UTC

I have noticed that BOINC often "forgets" to upload results and I have to do a manual Update to get them to send in. I'll see logs of "finished wu, uploading wu, downloading wu" and such with no errors. Just will have 6-10 results "Ready for upload" in my tasks. I have read around to see if this is a "normal" thnk to occur, but it happens on all of my hosts.


Uploading and reporting are two different things; you may be referring to that? A wu that has been processed should be uploaded fairly fast (assuming the i-net connection to Berkeley/SETI is good). Reporting is usually done in larger chunks, because the reporting part takes resources and is more efficient done in 'bursts'. A manual reporting action partially circumvents this.

Hope to have (partly?) answered your question

Eesger


It's actually "Ready to report"; I mistyped the message.

I will see say 4 tasks "Ready to report" and will be processing 2. When one of the processing tasks finishes it reports and uploads and the other 4 are still "Ready to report". Manually selecting Update sends them in. Just seems odd to me. Once the upload server is back on I can do some screen shots if that would be more helpful.

and
"I have read around to see if this is a "normal" thnk to occur, but it happens on all of my hosts."
should have read
"I have not read around to see if this is a "normal" thing to occur, but it happens on all of my hosts."
Problems with posting at the wee hours of the morning lol.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 917656
TCP JESUS
Joined: 19 Jan 03
Posts: 205
Credit: 1,248,845
RAC: 0
Canada
Message 917665 - Posted: 14 Jul 2009, 22:53:42 UTC

So...is the UPLOAD server EVER going to come back online ? is it time to just shut down and call it a day ?
I am TCP JESUS...The Carpenter Phenom Jesus....and HAMMERING is what I do best!
formerly known as...MC Hammer.
ID: 917665
Steve Dodd
Joined: 29 May 99
Posts: 23
Credit: 8,695,373
RAC: 1
United States
Message 917667 - Posted: 14 Jul 2009, 22:56:00 UTC - in response to Message 917630.  

Alas, the best laid plans:)
ID: 917667
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 917669 - Posted: 14 Jul 2009, 22:58:00 UTC - in response to Message 917630.  

That's my theory, at least.

Too bad it didn't happen. And now after the outage, the upload server is still down. Oh well, at least I now have more tasks uploading than I have CPUs, so BOINC shouldn't ask for new work for Seti. Nice test, that. ;-)
ID: 917669
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 917670 - Posted: 14 Jul 2009, 23:01:31 UTC - in response to Message 917665.  

So...is the UPLOAD server EVER going to come back online ? is it time to just shut down and call it a day ?

If you are averse to the tiniest little bump, then you should shut down and call it a day.

Personally, I'm fascinated. It is interesting to sit back and observe, to figure out what BOINC is doing, to try to see how the loads interrelate, and how things recover under extreme conditions.

... and the vast majority of the time, it's pretty interesting stuff.
ID: 917670
TCP JESUS
Joined: 19 Jan 03
Posts: 205
Credit: 1,248,845
RAC: 0
Canada
Message 917673 - Posted: 14 Jul 2009, 23:04:03 UTC

Ned. Gimme a break. I have read enough of your preaching......every comment someone makes, there you are to offer your insight as to 'How well BOINC handles things' blah blah.......

We get it......you have inside knowledge......great.

Now, let's either get this project moving, or quit wasting people's time.
I am TCP JESUS...The Carpenter Phenom Jesus....and HAMMERING is what I do best!
formerly known as...MC Hammer.
ID: 917673
TCP JESUS
Joined: 19 Jan 03
Posts: 205
Credit: 1,248,845
RAC: 0
Canada
Message 917675 - Posted: 14 Jul 2009, 23:06:02 UTC

...And by the way - at this point it's NOT a speed bump. The Upload server is down and that is that.

It's been at least a week since I complained, but this is just getting old....and fast.
I am TCP JESUS...The Carpenter Phenom Jesus....and HAMMERING is what I do best!
formerly known as...MC Hammer.
ID: 917675
Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 8964
Credit: 12,678,685
RAC: 0
United States
Message 917679 - Posted: 14 Jul 2009, 23:18:58 UTC - in response to Message 917675.  

...And by the way - at this point it's NOT a speed bump. The Upload server is down and that is that.

It's been at least a week since I complained, but this is just getting old....and fast.


Xenu--please step back and take a deep breath.

We are all frustrated with the outages and your complaints have been heard. The fact is--until some major financing comes in or Berkeley steps up to the requests for new hardware made by Eric and the boys, the difficulties are going to continue. Everyone has a right (including Ned) to discuss the different aspects of Boinc and how it works.


ID: 917679
TCP JESUS
Joined: 19 Jan 03
Posts: 205
Credit: 1,248,845
RAC: 0
Canada
Message 917680 - Posted: 14 Jul 2009, 23:21:13 UTC

The worst part is just not knowing what is happening...and how long all of the volunteer Seti@Home 'data crunching' participants will be expected to also be used as 'Offsite storage' for the project without their actual consent.......
I am TCP JESUS...The Carpenter Phenom Jesus....and HAMMERING is what I do best!
formerly known as...MC Hammer.
ID: 917680
Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 8964
Credit: 12,678,685
RAC: 0
United States
Message 917681 - Posted: 14 Jul 2009, 23:22:38 UTC - in response to Message 917680.  
Last modified: 14 Jul 2009, 23:26:01 UTC

The worst part is just not knowing what is happening...and how long all of the volunteer Seti@Home 'data crunching' participants will be expected to also be used as 'Offsite storage' for the project without their actual consent.......


Xenu--I'm sorry but this is not true. You chose to install Boinc and therefore there is nothing being done "without their actual consent..."

Also the Tech News has had multiple posts from the Admins (specifically Matt L) discussing the problems that are occurring. I'd like to suggest you read some of these posts over the past week or so. There's a lot of good information in there, specifically regarding my discussion with Matt about the bandwidth issues.


ID: 917681
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 917688 - Posted: 14 Jul 2009, 23:29:09 UTC - in response to Message 917673.  

Ned. Gimme a break. I have read enough of your preaching......every comment someone makes, there you are to offer your insight as to 'How well BOINC handles things' blah blah.......

We get it......you have inside knowledge......great.

Now, let's either get this project moving, or quit wasting people's time.

You are mistaken.

What I have is four decades of watching and participating in various projects based around computers.

In those four decades, I've learned three simple things:

1) It always takes longer than I think it should.

2) When someone is doing something I think is incredibly stupid, I should find out why before I call them on it.

I don't know why the upload server is down. I suspect that there is a good reason -- or a really bad reason like a hardware failure.

I've watched the project, and I've seen that they work pretty hard, and at the end of the day Matt usually gives us some idea of what they're dealing with.

... and I promised you three things:

3) I can get incredibly angry when things go badly, or I can accept that life is not fair, and see what I can learn from it.

Anger solves nothing. I've learned a lot by sitting on my hands, trying to stay calm and watching quietly.

If you'd rather be mad, be my guest. SETI@Home doesn't have the money for 99.999% reliability.
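
For a sense of scale, the short Python sketch below works out what 99.999% availability would allow, and compares it with a weekly outage of about four hours (the "four hours or so" figure mentioned earlier in this thread). The function name `allowed_downtime_minutes` is made up for illustration, and the four-hour figure is treated as an assumption, not a measured value.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def allowed_downtime_minutes(availability: float) -> float:
    """Downtime per year, in minutes, permitted by a given availability."""
    return (1 - availability) * SECONDS_PER_YEAR / 60


# "Five nines": how many minutes of downtime per year that allows.
five_nines = allowed_downtime_minutes(0.99999)

# Availability if the only downtime were an assumed 4-hour outage every week.
weekly_outage_availability = 1 - 4 / (7 * 24)

print(f"99.999% allows about {five_nines:.2f} minutes of downtime per year")
print(f"A 4-hour weekly outage alone gives about {weekly_outage_availability:.1%} availability")
```

Five nines leaves only a few minutes of downtime per year, so a project with a planned multi-hour outage every week is working to a very different target.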
ID: 917688
TCP JESUS
Joined: 19 Jan 03
Posts: 205
Credit: 1,248,845
RAC: 0
Canada
Message 917689 - Posted: 14 Jul 2009, 23:31:47 UTC - in response to Message 917681.  

Also the Tech News has had multiple posts from the Admins (specifically Matt L) discussing the problems that are occurring. I'd like to suggest you read some of these posts over the past week or so. There's a lot of good information in there, specifically regarding my discussion with Matt about the bandwidth issues.

I stay on top of my reading in that department as well.

I am just wondering why the Upload server is still down - that is all.

There was NO mention of it prior to scheduled downtime, yet it was offline and there is no update yet.....which is odd.
I am TCP JESUS...The Carpenter Phenom Jesus....and HAMMERING is what I do best!
formerly known as...MC Hammer.
ID: 917689
Blurf
Volunteer tester
Joined: 2 Sep 06
Posts: 8964
Credit: 12,678,685
RAC: 0
United States
Message 917690 - Posted: 14 Jul 2009, 23:36:55 UTC - in response to Message 917689.  
Last modified: 14 Jul 2009, 23:39:14 UTC

Also the Tech News has had multiple posts from the Admins (specifically Matt L) discussing the problems that are occurring. I'd like to suggest you read some of these posts over the past week or so. There's a lot of good information in there, specifically regarding my discussion with Matt about the bandwidth issues.

I stay on top of my reading in that department as well.

I am just wondering why the Upload server is still down - that is all.

There was NO mention of it prior to scheduled downtime, yet it was offline and there is no update yet.....which is odd.


On Tuesdays when the Weekly Outage occurs, there is no specific timeframe that is guaranteed for when all the servers come back up.


ID: 917690


 
©2025 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.