Transitioner backlog?

Message boards : Number crunching : Transitioner backlog?

B-Roy
Joined: 4 May 03
Posts: 220
Credit: 260,955
RAC: 1
Austria
Message 122791 - Posted: 12 Jun 2005, 19:19:00 UTC

Ready to send 77,215
In progress 1,765,954
Waiting for validation 0
Transitioner backlog 22 hours

What is the last line all about, as I haven't seen it in the past?
ID: 122791
Angus
Volunteer tester
Joined: 26 May 99
Posts: 459
Credit: 91,013
RAC: 0
Pitcairn Islands
Message 122806 - Posted: 12 Jun 2005, 19:35:22 UTC - in response to Message 122791.  

Ready to send 77,215
In progress 1,765,954
Waiting for validation 0
Transitioner backlog 22 hours

What is the last line all about, as I haven't seen it in the past?


It's a new metric they put up when the servers came back on-line after the OS update. (What was really updated - a server OS or a storage system OS?)

It was backlogged 10 or 11 hours when the servers came up, and it's been steadily growing ever since.

Which is an interesting thing... If the connection to the servers is so iffy right now, why can't the transitioner recover? It would seem to have less load if hosts can't connect to upload and download.

ID: 122806
Timcom99
Joined: 30 Sep 04
Posts: 105
Credit: 8,927,290
RAC: 0
United States
Message 122810 - Posted: 12 Jun 2005, 19:39:42 UTC

Someone has to be Downloading Tons of Work Units too. The 3 Splitters have been working Full Tilt and the Work Units are going out as soon as they get Split.
ID: 122810
Divide Overflow
Volunteer tester
Joined: 3 Apr 99
Posts: 365
Credit: 131,684
RAC: 0
United States
Message 122831 - Posted: 12 Jun 2005, 20:11:40 UTC

That could be, but I doubt it. My guess is that these numbers are way, way off of what's going on in reality. (Heck, it's shown a solid green light for the scheduler when we all know that it's been up and down like crazy for the last few days.) When the scheduler situation settles down the status page might start returning some useful information. Don't hold your breath waiting though! ;)

ID: 122831
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 122887 - Posted: 13 Jun 2005, 3:38:49 UTC

Psssst, hey buddy.......wanna buy a WU?
ID: 122887
ksnash
Joined: 28 Nov 99
Posts: 402
Credit: 528,725
RAC: 0
United States
Message 122890 - Posted: 13 Jun 2005, 3:42:31 UTC - in response to Message 122831.  

That could be, but I doubt it. My guess is that these numbers are way, way off of what's going on in reality. (Heck, it's shown a solid green light for the scheduler when we all know that it's been up and down like crazy for the last few days.) When the scheduler situation settles down the status page might start returning some useful information. Don't hold your breath waiting though! ;)


They can't figure out how long it takes to complete a work unit. How are they supposed to figure out if a computer is working?
ID: 122890
Archon
Joined: 31 Aug 01
Posts: 90
Credit: 400,599
RAC: 0
New Zealand
Message 122911 - Posted: 13 Jun 2005, 4:29:04 UTC - in response to Message 122887.  

Psssst, hey buddy.......wanna buy a WU?


Yeah, let's start selling WU's on eBay, lol, that will raise some cash for those Berkeley guys :)
Cheers

Gav



Nothing is 'fool-proof', someone will always invent a better fool!
ID: 122911
Speedy67 & Friends
Volunteer tester
Joined: 14 Jul 99
Posts: 335
Credit: 1,178,138
RAC: 0
Netherlands
Message 122933 - Posted: 13 Jun 2005, 5:29:19 UTC - in response to Message 122810.  

Someone has to be Downloading Tons of Work Units too. The 3 Splitters have been working Full Tilt and the Work Units are going out as soon as they get Split.


The transitioner has to take the final step on the workunits before they get sent out. Because it's backlogged 26 hours at the moment, there are lots of wu's already split, but waiting to be transitioned.

Greetings,
Speedy67


ID: 122933
Divide Overflow
Volunteer tester
Joined: 3 Apr 99
Posts: 365
Credit: 131,684
RAC: 0
United States
Message 123011 - Posted: 13 Jun 2005, 16:08:06 UTC

If my understanding is correct, the transitioner is also checking on when results are returned and when a quorum is achieved. Does this mean that the transitioner backlog figure is the rough estimate of how long it will take a quorum of results to be "seen" by the validator?

It would appear that there are a lot of situations where the transitioner comes into play: incoming, outgoing and WU end-of-life. It's tough to figure out what this backlog figure applies to. All of them?

I guess this bugger is also the culprit for these six month to a year old results that are lingering in the system. It seems that the backlog is underestimated by quite a bit for some people! ;)

ID: 123011
Dorsai
Joined: 7 Sep 04
Posts: 474
Credit: 4,504,838
RAC: 0
United Kingdom
Message 123018 - Posted: 13 Jun 2005, 16:19:20 UTC
Last modified: 13 Jun 2005, 16:19:50 UTC

Someone has to be Downloading Tons of Work Units too. The 3 Splitters have been working Full Tilt and the Work Units are going out as soon as they get Split.

At a guess, this, from the description of what the transitioner does:
Handles state transitions of workunits and results. Basically, the transitioners keep track of the many, many results in progress and make sure they properly move down the pipeline. It is always asking the questions: Is this workunit ready to send out? (snip)....

may explain where the WU's are going: they are going into limbo, waiting for the transitioner to decide they are ready to be sent?



Foamy is "Lord and Master".
(Oh, + some Classic WUs too.)
ID: 123018
Silver Streak
Volunteer tester
Joined: 14 May 03
Posts: 4
Credit: 826,866
RAC: 0
United States
Message 123023 - Posted: 13 Jun 2005, 16:23:26 UTC - in response to Message 122810.  

Someone has to be Downloading Tons of Work Units too. The 3 Splitters have been working Full Tilt and the Work Units are going out as soon as they get Split.


They may not be making it all the way to the machines to crunch them though, I just got 40 W/U's but they are all "ghosts". None of them made it to my 'puter.
ID: 123023
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 123024 - Posted: 13 Jun 2005, 16:26:58 UTC - in response to Message 123023.  

Someone has to be Downloading Tons of Work Units too. The 3 Splitters have been working Full Tilt and the Work Units are going out as soon as they get Split.


They may not be making it all the way to the machines to crunch them though, I just got 40 W/U's but they are all "ghosts". None of them made it to my 'puter.

Are you running 4.45?


BOINC WIKI
ID: 123024
Silver Streak
Volunteer tester
Joined: 14 May 03
Posts: 4
Credit: 826,866
RAC: 0
United States
Message 123026 - Posted: 13 Jun 2005, 16:28:27 UTC - in response to Message 123024.  

Someone has to be Downloading Tons of Work Units too. The 3 Splitters have been working Full Tilt and the Work Units are going out as soon as they get Split.


They may not be making it all the way to the machines to crunch them though, I just got 40 W/U's but they are all "ghosts". None of them made it to my 'puter.

Are you running 4.45?


No, I went back to 4.27
ID: 123026
John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 123027 - Posted: 13 Jun 2005, 16:29:45 UTC - in response to Message 123026.  

Someone has to be Downloading Tons of Work Units too. The 3 Splitters have been working Full Tilt and the Work Units are going out as soon as they get Split.


They may not be making it all the way to the machines to crunch them though, I just got 40 W/U's but they are all "ghosts". None of them made it to my 'puter.

Are you running 4.45?


No, I went back to 4.27

That would be the cause of the ghost WUs. This is believed to have been fixed in 4.45.


BOINC WIKI
ID: 123027
Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 123028 - Posted: 13 Jun 2005, 16:31:06 UTC
Last modified: 13 Jun 2005, 16:37:24 UTC

The Transitioners have many jobs:

1; Generate 4 results for each newly-split wu.
2; Generate more results if one or more are reported as client-error, fail validation due to missing files or due to "no consensus yet", or pass their deadline while the wu is not already validated.
3; Set the "need_validate"-flag when at least 3 "success"-results are reported, and again when more "success"-results are reported for a wu that is already validated or already in the validator-queue.
4; Trigger the Assimilator if the wu errored-out.
5; Trigger the file_deleter when all results are "done" and the wu is Assimilated.
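
A rough way to picture the five jobs listed above is a single pass over one wu that returns whatever actions are due. This is only a sketch in Python, not the real BOINC transitioner code: the class names, the outcome strings and the action labels are made up for illustration, and the validation-failure cases from job 2 are left out.

```python
from dataclasses import dataclass, field

TARGET_RESULTS = 4  # results generated per newly split wu (from the post)
MIN_QUORUM = 3      # "success" results needed before validation is requested
DONE = ("success", "error", "timed_out")  # hypothetical "done" outcomes


@dataclass
class Result:
    outcome: str = "unsent"  # hypothetical: unsent, in_progress, or one of DONE


@dataclass
class Workunit:
    results: list = field(default_factory=list)
    need_validate: bool = False
    validated: bool = False
    errored_out: bool = False
    assimilated: bool = False


def transition(wu):
    """Return the actions one transitioner pass would take for this wu."""
    # Job 1: a newly split wu has no results yet, so create the initial set.
    if not wu.results:
        wu.results = [Result() for _ in range(TARGET_RESULTS)]
        return ["generate_initial_results"]

    actions = []
    failed = [r for r in wu.results if r.outcome in ("error", "timed_out")]
    successes = [r for r in wu.results if r.outcome == "success"]
    usable = len(wu.results) - len(failed)

    # Job 2: replace failed or late results while the wu is still unvalidated.
    if not wu.validated and not wu.errored_out and usable < TARGET_RESULTS:
        missing = TARGET_RESULTS - usable
        wu.results.extend(Result() for _ in range(missing))
        actions.append(f"generate_{missing}_replacement_results")

    # Job 3: enough successes reported, so flag the wu for the validator.
    if len(successes) >= MIN_QUORUM and not wu.validated and not wu.need_validate:
        wu.need_validate = True
        actions.append("set_need_validate")

    # Job 4: an errored-out wu is handed to the assimilator.
    if wu.errored_out and not wu.assimilated:
        actions.append("trigger_assimilator")

    # Job 5: everything finished and assimilated, so files can be deleted.
    if wu.assimilated and all(r.outcome in DONE for r in wu.results):
        actions.append("trigger_file_deleter")

    return actions
```

Calling transition() on a fresh Workunit() creates the 4 initial results; once three of them are marked "success", the next pass sets need_validate. With a backlog, each of those passes simply happens many hours after the event that made it necessary.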

With the Transitioners being 31 hours backlogged, wu split 31 hours ago are only now being made ready to be sent out to users. Also, it will take over 31 hours from when a wu has got 3 "success"-results until validation is attempted.

The Validator only looks at wu with the "need_validate"-flag set. Since the Transitioners are 31 hours backlogged, the validators currently have no problems keeping up, and the validator-queue is therefore very short even though many wu have enough "success"-results.
Oh, and "Waiting for validation" is a count of how many wu or results have the "need_validate"-flag set; it's not a count of how many "pending" results there are. ;)

For a successfully validated wu, or one that has got 6 "success"-results but still fails due to "no consensus yet", it's the Validator that triggers the Assimilator.


Not sure, but one possibility is that the status-page displays how many hours ago the last wu with zero "results" generated was split. This means that if all servers are shut off for 24 hours, the queue will also increase by 24 hours.

Since the Transitioner-queue is increasing, the Splitters are currently generating wu faster than the Transitioners can keep up, and until there are 500k results ready to send out the queue will continue to increase.
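
The guess above about how the status page arrives at its figure can be sketched in a few lines of Python. This is an assumption on top of an assumption: it reads the "last" wu as the oldest one still waiting for its results to be generated, and the function name and inputs are hypothetical, not taken from the project's status-page code.

```python
from datetime import datetime, timedelta


def backlog_hours(pending_split_times, now):
    """Hours since the oldest split wu that still has zero results generated.

    pending_split_times: split timestamps of wu the transitioner has not
    yet processed (hypothetical input for this sketch).
    """
    if not pending_split_times:
        return 0.0
    oldest = min(pending_split_times)
    return (now - oldest).total_seconds() / 3600.0


now = datetime(2005, 6, 13, 16, 31)
pending = [now - timedelta(hours=31), now - timedelta(hours=2)]

print(backlog_hours(pending, now))
# If nothing is processed for another 24 hours, the figure grows by 24 hours.
print(backlog_hours(pending, now + timedelta(hours=24)))
```

Under this reading the figure measures elapsed time, not the amount of work queued, which is why a full shutdown would push it up hour for hour.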
ID: 123028
Divide Overflow
Volunteer tester
Joined: 3 Apr 99
Posts: 365
Credit: 131,684
RAC: 0
United States
Message 123030 - Posted: 13 Jun 2005, 16:41:03 UTC

Thanks for the insight into this, Ingleside!

ID: 123030
Darth Dogbytes™
Volunteer tester
Joined: 30 Jul 03
Posts: 7512
Credit: 2,021,148
RAC: 0
United States
Message 123039 - Posted: 13 Jun 2005, 16:58:31 UTC

Now it would be great if we could just connect to the server!
Account frozen...
ID: 123039
Celtic Wolf
Volunteer tester
Joined: 3 Apr 99
Posts: 3278
Credit: 595,676
RAC: 0
United States
Message 123041 - Posted: 13 Jun 2005, 17:04:07 UTC - in response to Message 123039.  

Now it would be great if we could just connect to the server!


You sure ask a lot!!!


I'd rather speak my mind because it hurts too much to bite my tongue.

American Spirit BBQ Proudly Serving those that courageously defend freedom.
ID: 123041
MikeSW17
Volunteer tester
Joined: 3 Apr 99
Posts: 1603
Credit: 2,700,523
RAC: 0
United Kingdom
Message 123066 - Posted: 13 Jun 2005, 18:07:03 UTC

Anyone know how the transitioner delay affects deadlines?
In other words, does a WU get recorded as returned when it's transmitted and updated, or only when the transitioner gets round to it?

I hope it's the first, as with the backlog at 31 hours and growing around 1 hour per 2 real hours, it would not be long before work gets dumped?

A second issue comes to mind, is there enough storage for this backlog?
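
To put numbers on the growth described above (31 hours of backlog, gaining about 1 hour per 2 real hours), a straight-line projection is enough. The sketch below assumes the rate stays constant, which it may well not; the 48-hour target in the usage lines is an arbitrary example, not a project limit or deadline.

```python
def projected_backlog(current_hours, growth_per_hour, elapsed_hours):
    """Backlog after elapsed_hours, assuming a constant growth rate."""
    return current_hours + growth_per_hour * elapsed_hours


def hours_until(current_hours, growth_per_hour, target_hours):
    """Real hours until the backlog reaches target_hours, or None if never."""
    if target_hours <= current_hours:
        return 0.0
    if growth_per_hour <= 0:
        return None
    return (target_hours - current_hours) / growth_per_hour


# Figures from the post: 31 hours now, growing 1 hour per 2 real hours.
print(projected_backlog(31, 0.5, 24))  # backlog one day from now
print(hours_until(31, 0.5, 48))        # 48 is a hypothetical target
```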

ID: 123066
Ingleside
Volunteer developer
Joined: 4 Feb 03
Posts: 1546
Credit: 15,832,022
RAC: 13
Norway
Message 123087 - Posted: 13 Jun 2005, 18:46:19 UTC - in response to Message 123066.  

Anyone know how the transitioner delay affects deadlines?
In other words, does a WU get recorded as returned when it's transmitted and updated, or only when the transitioner gets round to it?

I hope it's the first, as with the backlog at 31 hours and growing around 1 hour per 2 real hours, it would not be long before work gets dumped?

A second issue comes to mind, is there enough storage for this backlog?


The report-time is recorded in the database.

But the only reason for not getting any credit for too-late results is that the "canonical result" has been deleted from the upload-directory by the file_deleter.

If the wu is not already deleted and there are any results reported when the Transitioner catches up, the "need_validate"-flag is set regardless of whether these came in before or after the deadline. Also, the file_deleter isn't triggered before validation has been attempted on all results.


As for size on the upload/download-disks, I don't know, but if they plan to handle 1M results/day without adding more disk-capacity it shouldn't be a problem...


Anyway, they've now moved 2 of the transitioners off klaatu, the web-server, and over to kosh. This seems to have sped things up a little, and they're now only 29 hours backlogged.
Then again, the speed-up could also be due to the scheduling-server being down at the time, so no work is reported and the Validators sharing servers with the transitioners don't use any cpu-power...
ID: 123087


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.