Validators not Validating??

Profile Dorsai
Joined: 7 Sep 04
Posts: 474
Credit: 4,504,838
RAC: 0
United Kingdom
Message 105854 - Posted: 30 Apr 2005, 19:14:27 UTC

Is it only me who has noticed that no credit (or very little) seems to be getting granted??

In the past I have never seen the 'waiting for validation' count (on the server status page) get into double figures. As I type, it's 101,678 or thereabouts. This morning it was 60,000-odd.

Results are getting returned, but nothing seems to be happening beyond that.

How long before the storage space for unvalidated results fills up? It may be huge, but it must be finite. When it fills up, what will happen? I am not going to let BOINC return any results until the backlog clears, as I don't want my returned results to 'get lost in a major server crash' caused by a full HDD...


Anyone have any views/answers/comments?



Foamy is "Lord and Master".
(Oh, + some Classic WUs too.)
ID: 105854 · Report as offensive
Profile Saenger
Volunteer tester
Joined: 3 Apr 99
Posts: 2452
Credit: 33,281
RAC: 0
Germany
Message 105857 - Posted: 30 Apr 2005, 19:27:35 UTC

AFAIR it was up to some 200K after another outage.

Simply a bit of a backlog; patience is a virtue ;)
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki
ID: 105857 · Report as offensive
Profile Dorsai
Joined: 7 Sep 04
Posts: 474
Credit: 4,504,838
RAC: 0
United Kingdom
Message 105861 - Posted: 30 Apr 2005, 19:48:29 UTC

>Simply a bit of a backlog; patience is a virtue ;)

I beg to differ about it being "a bit of a backlog".

100,000 compared to the normal 2-6 is more like 'Gridlock'.

A backlog suggests something that has built up, but is in the process of clearing. The current situation is growing, not clearing.

But I am patient...

I have 5-6 days of WUs cached.

Not worried, just curious.

Foamy is "Lord and Master".
(Oh, + some Classic WUs too.)
ID: 105861 · Report as offensive
Profile Pooh Bear 27
Volunteer tester
Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 105868 - Posted: 30 Apr 2005, 20:06:00 UTC

It is a backlog. There are reasons for the backlog:

They just brought the replication server back online. It will take time to replicate everything over to that server, because it is slower than the main server.

The spike of traffic over the last few days has been huge, and the validator does take some time to validate each result. If you had 200,000+ computers pushing data at you, you'd get backlogged too.

Seeing this number at 100,000+ is nothing to worry about. Things are getting validated. My credits have gone up over the past couple of days, a bit slowly, but it is happening.

Just let everything get back to normal. This could take a week or more. Last time I think it took almost 10 days before the validator was back near 0.



My movie https://vimeo.com/manage/videos/502242
ID: 105868 · Report as offensive
Profile Prognatus

Joined: 6 Jul 99
Posts: 1600
Credit: 391,546
RAC: 0
Norway
Message 105874 - Posted: 30 Apr 2005, 20:23:51 UTC
Last modified: 30 Apr 2005, 20:26:33 UTC

It's going down now. While I was reading this thread, it went down from over 106,000 to about 104,000. I guess that's how much it cleared between two updates (ca. 10 minutes apart) of the Server Status page. So, if it continues at this pace, we'll reach zero in about.... 9 hours.
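
A rough sketch of that estimate in Python, for anyone who wants to redo it with newer numbers. It assumes the net drain rate stays constant, which it may well not; the function and parameter names are only illustrative.

```python
def hours_to_drain(backlog: int, drop: int, minutes_between_updates: float) -> float:
    """Estimate hours until the queue is empty, assuming a constant net drain rate."""
    if drop <= 0:
        raise ValueError("queue is not shrinking, so no finish time can be estimated")
    drained_per_hour = drop * 60 / minutes_between_updates
    return backlog / drained_per_hour


# Figures from the post: about 2,000 cleared between two status updates
# roughly 10 minutes apart, with about 104,000 still waiting.
estimate = hours_to_drain(backlog=104_000, drop=2_000, minutes_between_updates=10)
print(f"{estimate:.1f} hours")
```

With these inputs the rate works out to 12,000 per hour, which is where the "about 9 hours" comes from.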

ID: 105874 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 105886 - Posted: 30 Apr 2005, 21:14:17 UTC
Last modified: 30 Apr 2005, 21:18:07 UTC

When they switched over to the new server months ago (that was a multi-day outage), the "Waiting to validate" number was as high as 278,000. After they came back, the new server not only handled all the newly arriving WUs, but also reduced this number by 6,000/hour.

Be patient, keep watching.
ID: 105886 · Report as offensive
Ingleside
Volunteer developer

Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 105897 - Posted: 30 Apr 2005, 21:42:19 UTC - in response to Message 105861.  

>
> A backlog suggests something that has built up, but is in the process of
> clearing. The current situation is growing, not clearing.
>

A quick look at the server status page shows that 2 of the validators are running on Kosh, which is the replica database server.
The other 2 validators share Koloth with 2 transitioners and the file_deleter.

A Transitioner is responsible for generating new "results" for newly split WUs and, if needed, for errored-out and past-deadline results. It is also responsible for setting the "need_validate" flag when a WU has enough results to attempt validation, along with some other jobs.


After any longer outage, both the replica database and the Transitioners will work harder, using more CPU power than normal. This means the validators will get less CPU power than normal until things catch up.
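
To make the Transitioner's job a bit more concrete, here is a heavily simplified Python sketch of one pass over a single WU. It is not the real BOINC code: the class, the field names other than "need_validate", and the quorum rule are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    # Hypothetical fields for illustration; not the real BOINC schema.
    name: str
    quorum: int                  # results needed before validation is attempted
    returned_ok: int = 0         # results that came back successfully
    in_progress: int = 0         # results still out on hosts
    need_validate: bool = False  # the flag mentioned above


def transition(wu: WorkUnit) -> int:
    """Run one pass over a WU and return how many new results to generate."""
    if wu.returned_ok >= wu.quorum:
        # Enough results are in, so hand the WU over to a validator.
        wu.need_validate = True
        return 0
    # Otherwise replace whatever errored out or missed its deadline.
    missing = wu.quorum - wu.returned_ok - wu.in_progress
    return max(missing, 0)


ready = WorkUnit("wu_a", quorum=3, returned_ok=3)
short = WorkUnit("wu_b", quorum=3, returned_ok=1, in_progress=1)
new_for_ready = transition(ready)
new_for_short = transition(short)
print(ready.need_validate, new_for_ready, short.need_validate, new_for_short)
```

The point is only that the flag is set by one daemon and acted on by another, so a busy Transitioner and a starved validator can both slow down credit.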
ID: 105897 · Report as offensive
Pascal, K G
Volunteer tester
Joined: 3 Apr 99
Posts: 2343
Credit: 150,491
RAC: 0
United States
Message 105929 - Posted: 30 Apr 2005, 23:54:50 UTC

Hang with me, 'cause I am going into Never Never Land (and not to see Jocko either). OK, we have 50,000 hosts banging on the server to download 10 results each. That is ("damn, ran out of digits"... OK, used a calculator) 500,000 results. So the whole ball of wax is moving at a crawl all the way through the system, so be patient.
Semper Eadem
So long Paul, it has been a hell of a ride.

Park your ego's, fire up the computers, Science YES, Credits No.
ID: 105929 · Report as offensive
Profile Prognatus

Joined: 6 Jul 99
Posts: 1600
Credit: 391,546
RAC: 0
Norway
Message 106174 - Posted: 1 May 2005, 16:40:22 UTC
Last modified: 1 May 2005, 16:41:18 UTC

I just noticed something in my browser (I use Opera 8) yesterday, when I visited the Server Status page: in the status bar that pops up when a page is loading, it said "Connecting to bla-bla-bla... (#70 in queue)" or something like that. I thought that was funny, so I reloaded the page several times and it said #13 and so on...

So, one can actually see how many are in front of you in the queue! I didn't know that was possible (but then again I'm no TCP/IP guru). That'd be a neat feature in BOINC Manager too! Or at least in a 3rd party tool like BoincView.

ID: 106174 · Report as offensive
RANGEPUP

Joined: 14 May 99
Posts: 3
Credit: 94,479
RAC: 0
United States
Message 106188 - Posted: 1 May 2005, 17:45:43 UTC

Well, as of 1:30 EST, the queue was at 150K and going upwards. I have not seen my stats move in a few days, so I started researching this. Funny, SETI runs on only 6 computers (they must be real beasts), but I wonder if maybe we should start a donation fund for them to get another one. I have donated in the past (back in 2001), and I trust them to do what's right. Does anybody have a clue as to what type of server they really need?

Rangepup
ID: 106188 · Report as offensive
Profile Crunch3r
Volunteer tester
Joined: 15 Apr 99
Posts: 1546
Credit: 3,438,823
RAC: 0
Germany
Message 106199 - Posted: 1 May 2005, 18:16:08 UTC - in response to Message 106188.  

> Well, as of 1:30 EST, the queue was at 150K and going upwards. I have not seen my
> stats move in a few days, so I started researching this.

> Funny, SETI runs on only 6 computers (they must be real beasts)

Actually they're not really beasts, more like little sheep :-)

> but I wonder if maybe we should start a
> donation fund for them to get another one. I have donated in the past (back
> in 2001), and I trust them to do what's right. Does anybody have a clue as to
> what type of server they really need?
>
> Rangepup
>

I don't think we need to start a donation fund for new hardware, because when SETI classic is shut down these servers (I think) will switch over to seti2.


Join BOINC United now!
ID: 106199 · Report as offensive
W-K 666 Project Donor
Volunteer tester

Joined: 18 May 99
Posts: 19059
Credit: 40,757,560
RAC: 67
United Kingdom
Message 106200 - Posted: 1 May 2005, 18:20:33 UTC

I noticed this morning (UK time) that my credits etc. hadn't risen. So I checked the server status, and at approximately 07:00 UTC the waiting-to-validate count was 128K. I checked again at midday and it had risen to 135K, and just now it's up to 152K.

But one of the noisy units that I've just completed, which had already been through the granted-credit stage, updated almost immediately.
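
For what it's worth, those three readings can be turned into a rough growth rate. This is only a sketch: it assumes "midday" means 12:00 UTC and uses the time of this post (18:20 UTC) for the last reading, and the names are illustrative.

```python
from datetime import datetime

# (time of reading, waiting-to-validate count); the times are approximations.
readings = [
    (datetime(2005, 5, 1, 7, 0), 128_000),    # "07:00 UTC approx"
    (datetime(2005, 5, 1, 12, 0), 135_000),   # "midday", assumed to be 12:00 UTC
    (datetime(2005, 5, 1, 18, 20), 152_000),  # "just now", taken as the post time
]


def growth_per_hour(samples):
    """Return the net change per hour between each pair of consecutive readings."""
    rates = []
    for (t0, n0), (t1, n1) in zip(samples, samples[1:]):
        hours = (t1 - t0).total_seconds() / 3600
        rates.append((n1 - n0) / hours)
    return rates


rates = growth_per_hour(readings)
for rate in rates:
    print(f"{rate:,.0f} per hour")
```

Under those assumptions the queue grew by about 1,400 per hour in the morning and noticeably faster in the afternoon, so results were arriving faster than they were being validated.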
ID: 106200 · Report as offensive
Profile Scribe
Joined: 4 Nov 00
Posts: 137
Credit: 35,235
RAC: 0
United Kingdom
Message 106208 - Posted: 1 May 2005, 19:08:29 UTC

Now at 155K and therefore still rising. This is not a backlog, it is an ever-increasing jam!
ID: 106208 · Report as offensive
Profile Pooh Bear 27
Volunteer tester
Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 106222 - Posted: 1 May 2005, 20:21:27 UTC

I am getting credits. They are validating, just taking their time.

As has been stated, they brought the replicator back online. Since the replicator is slower than the main database, it will take time for the replication to finish. The replica allows them to do backups without taking the main database down.

When that replication finishes, the validator will catch up. I had replicating servers at a job once; one went down because of a hard drive failure. It took almost a week for it to fully replicate after we got it back online.

Things are working. Watch your credits closely, I bet you will get a trickle of credits every so often. I've gotten nearly 1000 since Friday.



My movie https://vimeo.com/manage/videos/502242
ID: 106222 · Report as offensive
Profile Angus
Volunteer tester

Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 106228 - Posted: 1 May 2005, 20:47:46 UTC - in response to Message 106208.  
Last modified: 1 May 2005, 20:48:40 UTC

> Now at 155K and therefore still rising, this is not a backlog, it is an ever
> increasing jam!
>

At 160K now, and two of the four validators are down. 'kosh' is off-line.

Want to take bets on how high the backlog will go?


ID: 106228 · Report as offensive
Profile Kajunfisher
Volunteer tester
Joined: 29 Mar 05
Posts: 1407
Credit: 126,476
RAC: 0
United States
Message 106229 - Posted: 1 May 2005, 20:51:16 UTC - in response to Message 106228.  
Last modified: 1 May 2005, 20:52:49 UTC

> > Now at 155K and therefore still rising, this is not a backlog, it is an
> > ever increasing jam!
> >
>
> At 160K now, and two of the four validators are down. 'kosh' is off-line.
>
> Want to take bets on how high the backlog will go?
>
>
>

I'm seeing 155K, maybe a minute after your post, everything still up

Server time: [As of 1 May 2005 19:00:08 UTC]

[edit] when i hit refresh, i got the same, sorry [endedit]
No matter where you go, there you are...
ID: 106229 · Report as offensive
Profile Saenger
Volunteer tester
Joined: 3 Apr 99
Posts: 2452
Credit: 33,281
RAC: 0
Germany
Message 106234 - Posted: 1 May 2005, 21:09:33 UTC - in response to Message 106228.  

> At 160K now, and two of the four validators are down. 'kosh' is off-line.
>
> Want to take bets on how high the backlog will go?

I think it will go above 200K again, but so what?
That's no problem whatsoever, and I hope they don't bother looking into it, as they have more important things to do.
Gruesse vom Saenger

For questions about Boinc look in the BOINC-Wiki
ID: 106234 · Report as offensive
Profile MikeSW17
Volunteer tester

Joined: 3 Apr 99
Posts: 1603
Credit: 2,700,523
RAC: 0
United Kingdom
Message 106237 - Posted: 1 May 2005, 21:25:31 UTC

I see two validators on kosh have gone 'Not running' recently. Perhaps someone at Berkeley is looking at this.

ID: 106237 · Report as offensive
Profile Prognatus

Joined: 6 Jul 99
Posts: 1600
Credit: 391,546
RAC: 0
Norway
Message 106275 - Posted: 1 May 2005, 23:28:31 UTC
Last modified: 1 May 2005, 23:43:06 UTC

Many of my uploads from right after the outage are still not credited, while more recent uploads seem to be credited at once. I suspect that the validate queue is a LIFO stack... or maybe just random? (Perhaps Matt or someone at Berkeley could comment on this.) If so, I guess users would be happier with a FIFO scheme.

LIFO = Last In, First Out
FIFO = First In, First Out
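
To show the difference, here is a small Python sketch of the two orderings. It is only a toy model; nothing here is taken from the actual validator, and the result names are made up.

```python
from collections import deque

# Uploads in arrival order, oldest first (hypothetical names).
uploads = ["old_result_1", "old_result_2", "new_result_3"]


def processing_order(items, scheme):
    """Return the order in which items would be handled under FIFO or LIFO."""
    pending = deque(items)
    order = []
    while pending:
        if scheme == "FIFO":
            order.append(pending.popleft())  # oldest upload goes first
        else:
            order.append(pending.pop())      # newest upload goes first
    return order


fifo = processing_order(uploads, "FIFO")
lifo = processing_order(uploads, "LIFO")
print("FIFO:", fifo)
print("LIFO:", lifo)
```

With LIFO the newest upload is handled first, which would match what I'm seeing: old uploads sit at the bottom while new ones keep arriving on top.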

ID: 106275 · Report as offensive
Profile MJKelleher
Volunteer tester
Joined: 1 Jul 99
Posts: 2048
Credit: 1,575,401
RAC: 0
United States
Message 106292 - Posted: 1 May 2005, 23:56:16 UTC - in response to Message 106237.  

> I see Two validators on kosh have gone 'Not running' recently. Perhaps someone
> at Berkeley is looking at this.

More likely, will be looking at this when they return to the lab on Monday morning. With the other two validators on koloth showing green, I wouldn't want to pull them in from their weekend!

ID: 106292 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.