Panic Mode On (109) Server Problems?

Keith Myers · Special Project $250 donor · Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907254 - Posted: 15 Dec 2017, 17:16:39 UTC - in response to Message 1907195.  

The back-offs are administered by your machine.
You enter a back-off state when the communication during a file transfer takes too long. The first few are quite short, but progressively increase to several hours (as you have no doubt observed). The theory behind this is that by having a back-off the communications link has a chance to recover, and by having a long back-off that chance is increased and the number of "client retry" messages is reduced. However, there is a downside in that the large number of long back-offs results in the servers having a lot of "sending" tasks hanging around, which increases the load on the storage system, and the chances of some of the tasks becoming "ghosts". A balancing act that I think has swung too far in the direction of long delays.

There is another type of back-off, which has the server sending out a message to say "I'm down for maintenance" as the trigger; the starting delay is larger, but the final maximum is about the same.

As I posted, I do not have any issue communicating with the project. All requests for work and reports of work happen with normal speed and the project does not complain.

And TBar has posted EXACTLY the scenario I see, constant backoffs that are originated by the SCHEDULER. My Event Log with sched_ops and work_fetch have IDENTICAL type entries where the GPU is backed off for no good reason. Just one of the reasons why I constantly fight getting work when there is work available. I can't get work when the client doesn't ask for any because of the backoffs.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907254
Richard Haselgrove · Project Donor · Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1907258 - Posted: 15 Dec 2017, 17:52:41 UTC - in response to Message 1907254.  

And TBar has posted EXACTLY the scenario I see, constant backoffs that are originated by the SCHEDULER. My Event Log with sched_ops and work_fetch have IDENTICAL type entries where the GPU is backed off for no good reason. Just one of the reasons why I constantly fight getting work when there is work available. I can't get work when the client doesn't ask for any because of the backoffs.
Yes, he did - and his message log posts showed exactly where to see them in action.

But your interpretation is incomplete. As Rob said in the post you quoted, there are two types of backoff. One is administered by the scheduler component running on the server. This can easily be seen in the ordinary event log, as

[SETI@home] Project requested delay of 303 seconds
(or 3600 seconds during maintenance). You can always see the time remaining on the Projects tab in BOINC Manager. This delay is absolute: your machine will not contact the server during the delay. You can, however, override it by clicking on the Update button (same tab): that can be useful if, for instance, you see that the message boards have come back up after maintenance but your client is still waiting until the end of the hour.
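
For illustration, here is a minimal C++ sketch of how a client can honour that kind of server-requested delay while still letting a manual 'Update' override it. This is not the actual BOINC client code; the structure and names are assumed for the example.

#include <ctime>

struct Project {
    double min_rpc_time = 0;               // earliest time the scheduler may be contacted again
    bool   user_requested_update = false;  // set when the user clicks 'Update'
};

// Called when a scheduler reply says "Project requested delay of N seconds".
void handle_request_delay(Project& p, double delay_seconds) {
    p.min_rpc_time = (double)std::time(nullptr) + delay_seconds;  // e.g. 303, or 3600 during maintenance
}

// The delay is absolute: no contact until it expires, unless the user overrides it.
bool may_contact_scheduler(const Project& p) {
    if (p.user_requested_update) return true;
    return (double)std::time(nullptr) >= p.min_rpc_time;
}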

The second type of backoff - the ones TBar was drawing attention to - are known as 'resource backoffs'. They are managed by the scheduler component running on your client - a different beast entirely. These backoffs start if your client requests more work for, say, one particular GPU type, and is rejected - whatever the reason (no work available, reached a limit of tasks in progress, anything like that). But the crucial difference is that this type of backoff is cleared every time your computer completes a task for that type of GPU. And that happens pretty often with modern GPUs.
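
As a rough illustration of that mechanism, here is a minimal C++ sketch of a per-resource backoff that grows on each empty reply and is cleared when a task for that resource completes. It is not the real work-fetch code, and the 60-second and one-day bounds are assumptions, not the client's actual values.

#include <algorithm>
#include <ctime>

constexpr double MIN_BACKOFF = 60;      // assumed lower bound: 1 minute
constexpr double MAX_BACKOFF = 86400;   // assumed upper bound: 1 day

// One of these would exist per project, per resource (CPU, NVIDIA GPU, ...).
struct ResourceBackoff {
    double interval = 0;        // current backoff length; 0 = no backoff
    double backoff_until = 0;   // epoch time when work fetch may resume

    void request_failed() {     // the reply carried no tasks for this resource
        interval = (interval == 0) ? MIN_BACKOFF : std::min(interval * 2, MAX_BACKOFF);
        backoff_until = (double)std::time(nullptr) + interval;
    }
    void task_completed() {     // finishing a task for this resource clears the backoff
        interval = 0;
        backoff_until = 0;
    }
    bool can_fetch() const { return (double)std::time(nullptr) >= backoff_until; }
};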

The only time that the resource backoffs cause a problem is if you are running a GPU which is so fast that it runs completely dry during extended maintenance periods. Then, you're trying to prime the pump at the same time as everyone else, and likely to be getting no work because everyone got there first, and the server is still getting up to working speed (caching data to memory, etc.). If you are completely dry, with no tasks to run and complete, the backoffs can increase - but again, clicking the 'Update' button clears them.

So, if your machine is dry, click 'Update' once every 5:03 minutes (to respect the server backoff - wait until the timer ticks down to zero), but give up and relax as soon as you've got work to run. The resource backoffs will look after themselves once you're running and completing work.
ID: 1907258
TBar · Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1907259 - Posted: 15 Dec 2017, 18:01:40 UTC - in response to Message 1907254.  

The back-offs are administered by your machine.
You enter a back-off state when the communication during a file transfer takes too long. The first few are quite short, but progressively increase to several hours (as you have no doubt observed). The theory behind this is that by having a back-off the communications link has a chance to recover, and by having a long back-off that chance is increased and the number of "client retry" messages is reduced. However, there is a downside in that the large number of long back-offs results in the servers having a lot of "sending" tasks hanging around, which increases the load on the storage system, and the chances of some of the tasks becoming "ghosts". A balancing act that I think has swung too far in the direction of long delays.

There is another type of back-off, which has the server sending out a message to say "I'm down for maintenance" as the trigger; the starting delay is larger, but the final maximum is about the same.

As I posted, I do not have any issue communicating with the project. All requests for work and reports of work happen with normal speed and the project does not complain.

And TBar has posted EXACTLY the scenario I see, constant backoffs that are originated by the SCHEDULER. My Event Log with sched_ops and work_fetch have IDENTICAL type entries where the GPU is backed off for no good reason. Just one of the reasons why I constantly fight getting work when there is work available. I can't get work when the client doesn't ask for any because of the backoffs.

What you see in my Log is a perfectly normal 5 minute Project backoff between Work Requests. You do not see any Resource backoffs directed at just the GPU. The only problem seen in my log is the Server is Ignoring the Client's Requested task numbers and sending only a few instead of the Number of tasks the Client is requesting. This is Clearly a problem with the Server. Stopping the project won't help. What would help is for someone that understands the code to come up with a theory of why the Server is Repeatedly Ignoring the Work Request and make a suggestion on a fix. That code makes less sense to me than the Japanese subtitles in a few of those old movies. The only thing I'm sure of is the code is Too complicated and shouldn't be concerned with how far along the current tasks are, but rather how many dozens of tasks it should send. It appears the machine corrected itself for now, but is still sending the No Tasks Available response on occasion.
ID: 1907259
Keith Myers · Special Project $250 donor · Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907264 - Posted: 15 Dec 2017, 18:25:57 UTC - in response to Message 1907258.  


The second type of backoff - the ones TBar was drawing attention to - are known as 'resource backoffs'. They are managed by the scheduler component running on your client - a different beast entirely. These backoffs start if your client requests more work for, say, one particular GPU type, and is rejected - whatever the reason (no work available, reached a limit of tasks in progress, anything like that). But the crucial difference is that this type of backoff is cleared every time your computer completes a task for that type of GPU. And that happens pretty often with modern GPUs.

I see that I will have to post my Event Log with sched_ops and work_fetch when I see I am in a 10 or 20 minute GPU backoff again.

That will show that the backoff DOES NOT CLEAR after a gpu task is reported. I report at least a couple of tasks in a 10 minute period even with BLC tasks on the Linux cruncher. Typically I will be reporting upwards of 6 to 8 tasks.

So please explain your reasoning WHY the CLIENT backoffs don't clear on my machines every time I report tasks.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907264
Richard Haselgrove · Project Donor · Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1907265 - Posted: 15 Dec 2017, 18:42:03 UTC - in response to Message 1907259.  

The client request, and the server response, are measured in seconds, not number of tasks. The server response may be lower than the client request if it brings you up to the 'maximum number of tasks in progress' (say you have a single GPU, and you have 100 tasks waiting to use it: when you finish one GPU task and request a replacement, you'll get at most one task to bring you back up to 100, however much you ask for). That is by design and normal: let's leave that one out of the discussion.
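
A minimal C++ sketch of that cap (illustrative only, not the actual scheduler source; the 100-task limit is taken from the single-GPU example above):

#include <algorithm>

// requested_seconds: what the client asked for (requests are in seconds, not tasks)
// est_task_seconds:  the server's runtime estimate for one task on this host
// in_progress:       tasks already out on this host for this resource
// max_in_progress:   the per-resource limit (100 in the example above)
int tasks_to_send(double requested_seconds, double est_task_seconds,
                  int in_progress, int max_in_progress) {
    int by_time  = (int)(requested_seconds / est_task_seconds + 0.5);  // round to nearest
    int by_limit = std::max(0, max_in_progress - in_progress);
    return std::min(by_time, by_limit);    // whichever bound bites first
}

// For example, a 123,561-second request at roughly 1,754 seconds per task works
// out to about 70 tasks by time, but if 99 of the 100 in-progress slots are
// already full, only 1 task can be sent.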

I think you're talking more about the case where you're well below the task limit: you request a significant number of seconds of work: and the SSP says there is plenty of work 'ready to send'. But you don't get as many seconds of work as you asked for, or you even get no work at all.

That's when you have to think about how any one of those 600,000 'ready to send' tasks finds its way to your computer. Bear in mind that there are restrictions (you can't be your own wingmate: the same task can't be sent to two different computers at once: some types of task can't be sent to some types of GPU: and so on). The server has to check every one of those rules before it sends a task to you, and all at the same time as processing requests from many other computers. Even the fastest servers couldn't process the entire 600,000 'ready to send' queue at the same time.

So the current design uses an intermediate stage called the 'feeder': a limited set of (we think) 200 potential tasks held in memory. For every request that you make, the database has to find and load your user records for the 'own wingmate' test: they can be checked against the feeder set quickly enough, and the ones which pass can be reserved and sent out to your computer. But it can happen that none of the 200 are available for you (rule violation), or all of them have been reserved before your personal rule-checker can reach them. In that case, you get sent back to wait for the next attempt.
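
For illustration, here is a highly simplified C++ sketch of the feeder idea described above. This is not the actual code in the 'sched' directory; the slot layout and the rule checks are placeholders.

#include <vector>

// A feeder slot: one candidate task held in shared memory.
struct Slot { int task_id; int workunit_id; int app_id; bool taken; };
struct Host { int user_id; /* plus cached data for the eligibility tests */ };

// Placeholder rule checks standing in for the real tests listed above
// (own-wingmate, application/GPU compatibility, and so on).
bool violates_own_wingmate_rule(const Host&, const Slot&) { return false; }
bool app_unsuitable_for_host(const Host&, const Slot&)    { return false; }

// Scan the (roughly 200-entry) feeder array and reserve what this host may have.
std::vector<int> pick_tasks(Host& host, Slot* feeder, int n_slots, int n_wanted) {
    std::vector<int> sent;
    for (int i = 0; i < n_slots && (int)sent.size() < n_wanted; i++) {
        Slot& s = feeder[i];
        if (s.taken) continue;                              // already reserved by another request
        if (violates_own_wingmate_rule(host, s)) continue;
        if (app_unsuitable_for_host(host, s)) continue;
        s.taken = true;                                     // reserve it for this host
        sent.push_back(s.task_id);
    }
    return sent;  // can legitimately be empty even with 600,000 tasks 'ready to send'
}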

If you can think of a better design that handles all these multiple, complex tests quickly enough - given that data has to be retrieved from a database with hundreds of thousands of users, with hundreds of thousands of computers, and with tens of millions of tasks all at the same time - then please suggest it. But it isn't easy.
ID: 1907265
Richard Haselgrove · Project Donor · Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1907267 - Posted: 15 Dec 2017, 18:50:06 UTC - in response to Message 1907264.  

I see that I will have to post my Event Log with sched_ops and work_fetch when I see I am in a 10 or 20 minute GPU backoff again.
Yes, that would be the next step.

So please explain your reasoning WHY the CLIENT backoffs don't clear on my machines every time I report tasks.
Well, I've done the first check: one possible answer would be the use of an older BOINC client without all that resource management in place. But your Linux machine is reporting the use of BOINC v7.8.3, the same as your Windows machine. It's still possible that a bug in the code could take a different path under Linux from the path it takes under Windows, but to attack that we really do need the evidence.
ID: 1907267
Keith Myers · Special Project $250 donor · Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907268 - Posted: 15 Dec 2017, 18:52:43 UTC - in response to Message 1907265.  

A very good response, Richard. I understand the nature of the feeder and the requirements checking. What I observe is some sort of penalty enforced on my fastest machines. In my recollection I don't see this kind of backoff at all on my slowest machines. Only the Ryzen crunchers have this issue. Those crunchers have 3 GPUs each and process a lot of work every five minutes. With the ever growing need for more processing power by the project, I should hope that the developers are working toward a solution to better utilize the fastest hardware.

I too wish I had the knowledge to correct the server code. I just have to leave that up to the developers. I hope that I have explained that the current design is not working as well as it theoretically could.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907268
Richard Haselgrove · Project Donor · Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1907269 - Posted: 15 Dec 2017, 19:02:12 UTC - in response to Message 1907268.  

One possibility suggested itself to my mind as I was typing it. That 'self wingmate' test: the higher your performance - both capacity per machine, and number of machines - the longer it will take to find and retrieve all your data from the database. By the time that's happened (it might take, oooh, whole milliseconds), smaller, leaner requests with fewer old tasks to check against might have bagged all the available work.
ID: 1907269
TBar · Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1907272 - Posted: 15 Dec 2017, 19:25:08 UTC - in response to Message 1907265.  

So the current design uses an intermediate stage called the 'feeder': a limited set of (we think) 200 potential tasks held in memory. For every request that you make, the database has to find and load your user records for the 'own wingmate' test: they can be checked against the feeder set quickly enough, and the ones which pass can be reserved and sent out to your computer. But it can happen that none of the 200 are available for you (rule violation), or all of them have been reserved before your personal rule-checker can reach them. In that case, you get sent back to wait for the next attempt....
Sorry, the old fallback of the empty feeder doesn't stand up to the most basic logic. It takes Hours to run a Host out of work. There are 12 Work requests every Hour, so probably at least three dozen consecutive empty feeder events would have to occur on the same Host. Not very likely. More likely to win the Lottery in my estimation. Then there is the fact that I can repeatedly have the Server send Over 200 tasks All at Once by simply moving GPU tasks to the CPU cache and then launching BOINC. How likely is it to just happen to hit a full feeder every time I do that? Not very. Then there is the procedure of merely causing an event that has the Server check the Work request, such as hitting the update button three times, restarting the client, or forcing a resend event. Once the Server is forced to read the Work Request it usually sends the correct number of tasks, full feeder or not. Something is causing the Server to Not read the Work Request...correctly, until forced to.
ID: 1907272
Richard Haselgrove · Project Donor · Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1907277 - Posted: 15 Dec 2017, 19:54:28 UTC - in response to Message 1907272.  

Something is causing the Server to Not read the Work Request...correctly, until forced to.
OK, you've just set yourself the task of working out what that 'something' is, and how it operates.
ID: 1907277
Brent Norman · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor · Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1907283 - Posted: 15 Dec 2017, 20:16:28 UTC - in response to Message 1907267.  

... reporting the use of BOINC v7.8.3
With v7.2.42 the reaction I get on an empty computer is ... force update, nothing returned, backoff of 5m, automatic retry, then an extended wait (I think the 1h BOINC min contact time?) .... force update, starts again.

All seems well as long as you can get 1 task to return every 10m, but if not, then you have to babysit the update button.
ID: 1907283
Keith Myers · Special Project $250 donor · Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907288 - Posted: 15 Dec 2017, 20:43:59 UTC - in response to Message 1907277.  

Something is causing the Server to Not read the Work Request...correctly, until forced to.
OK, you've just set yourself the task of working out what that 'something' is, and how it operates.

But since the volunteer developers don't have access to the server code - they have access only to the client and manager source code, correct me if I'm wrong - that is an impossible mission.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907288
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor · Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1907291 - Posted: 15 Dec 2017, 21:03:52 UTC - in response to Message 1907234.  

These massive long posts are out of order.

WAG's are wives and girl friends i.e. women. Nuff said?


. . Except in US where it is (W)ild (A)*#*# (G)uess ... :)

Stephen

;)
ID: 1907291
Stephen "Heretic" · Crowdfunding Project Donor · Special Project $75 donor · Special Project $250 donor · Volunteer tester
Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 1907295 - Posted: 15 Dec 2017, 21:25:08 UTC - in response to Message 1907258.  
Last modified: 15 Dec 2017, 21:26:26 UTC

if your machine is dry, click 'Update' once every 5:03 minutes (to respect the server backoff - wait until the timer ticks down to zero), but give up and relax as soon as you've got work to run. The resource backoffs will look after themselves once you're running and completing work.


. . I think the problem that is bothering people is the frequently erratic behaviour of the server software: even though your device queue is less than full and/or there are WUs in the hopper (RTS), your requests are constantly met with "no tasks available" or "You have reached your limit". Manually requesting work at the end of the mandatory 303 secs will occasionally trigger a download, but often not. Yet manually initiating multiple premature work requests after a failed attempt will mostly (unless the hopper is genuinely empty) result in the following automatic request getting new work, and often in large amounts. It is like a flag somewhere in the code got stuck at the wrong (or maybe an invalid) value, causing the scheduler to ignore your request, but the "kick the server" trick somehow resets that erroneous flag, allowing the following request to be properly recognised and honoured.

. . That is my take on the issue as I have experienced it.

Stephen

<shrug>
ID: 1907295
Richard Haselgrove · Project Donor · Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1907319 - Posted: 15 Dec 2017, 23:18:43 UTC - in response to Message 1907288.  

Something is causing the Server to Not read the Work Request...correctly, until forced to.
OK, you've just set yourself the task of working out what that 'something' is, and how it operates.
But since the volunteer developers don't have access to the server code, they have access to only the client and manager source code, correct me if wrong, that is an impossible mission.
Since this project is used as the testbed for BOINC development, the server code will be very close to the master code at https://github.com/BOINC/boinc. The client and manager code is in subdirectories 'client' and 'clientgui', respectively: the bit of the server code we're interested in here is in subdirectory 'sched'.
ID: 1907319
Grant (SSSF) · Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1907332 - Posted: 16 Dec 2017, 0:00:31 UTC - in response to Message 1907234.  
Last modified: 16 Dec 2017, 0:00:51 UTC

These massive long posts are out of order.

Why?
They are relevant to the thread and the discussion in it.

WAG's are wives and girl friends i.e. women. Nuff said?

Depends on context.
When I read it, I read it as "Wild Arse Guess". Once again, appropriate for the post made.
Grant
Darwin NT
ID: 1907332
Keith Myers · Special Project $250 donor · Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907334 - Posted: 16 Dec 2017, 0:03:12 UTC - in response to Message 1907319.  

Thanks for pointing that out. It wasn't obvious to me from the descriptions at GitHub whether any of the server code was exposed.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907334
Grant (SSSF) · Volunteer tester
Joined: 19 Aug 99
Posts: 13720
Credit: 208,696,464
RAC: 304
Australia
Message 1907338 - Posted: 16 Dec 2017, 0:29:06 UTC - in response to Message 1907269.  

One possibility suggested itself to my mind as I was typing it. That 'self wingmate' test: the higher your performance - both capacity per machine, and number of machines - the longer it will take to find and retrieve all your data from the database. By the time that's happened (it might take, oooh, whole milliseconds), smaller, leaner requests with fewer old tasks to check against might have bagged all the available work.

But the question is: why does doing Tbar's triple Update result in work coming down on a system that struggles to get it? Another, less reliable, method to get work on these systems is to flip the Application work preference settings, save them, Update on the Manager, then wait for the next automatic request.
Then work will continue to be allocated on each automatic work request, for the rest of the day or for only the next 3 or so Scheduler requests, depending on how the Scheduler is behaving at the time. It only affects some systems, not others. Some systems are affected more than others (e.g. my i7 & GTX 1070s is most affected, my C2D & GTX 750Tis are occasionally affected). When I didn't have the AP application installed, the i7 was barely capable of getting any MB work; now that I have the AP application installed it's not as affected as it once was.
It just seems odd to have to do AP work in order to receive MB work.

This issue began in December of last year, at the time of the SoG stock roll out (and the problems with the plan class 8.22 v8.23 I think it was).
People started posting they were unable to get any MB work. There was plenty available, and everyone else was getting work, other than these affected posters.
Turns out it was the people that preferred AP and had their preferences set to
Run only the selected applications
AstroPulse v7: yes
SETI@home v8: no
If no work for selected applications is available, accept work from other applications? yes

For whatever reason, the changes Eric did resulted in those settings no longer working. The only way they could now get AP work was to set SETI@home v8: to Yes.
The secondary effect was that those of us that didn't have AP installed were now finding our caches running down as we were no longer able to reliably get MB work to replace what we had returned. I would have to change my settings to "AstroPulse v7: yes" and "If no work for selected applications is available, accept work from other applications? yes", even though I had no AP application installed. After saving & updating, the next work request would generally get work. After a few days, or a few hours, MB work would stop being allocated; I would change the settings back to No, save, Update, and work would flow again. Then it would stop again, I would change the settings back to Yes, and I would get work again.
This was a daily event, sometimes several times a day. So much for set and forget.

One person suggested I install the AP application and see what happened. I did, and guess what? I was now able to get MB work much more reliably. The issues still occurred, but it was more every week or so, not daily.
However over the last week or 2, with all the other server issues, it's been occurring more & more often. Luckily Tbar came up with his triple update, which helps get work flowing more easily & more reliably than flipping the application settings.

It would just be nice if Eric were able to fix whatever became broken back in Dec of last year so it's not necessary.
And people that don't want to do AP wouldn't have to do it in order to be able to get any MB work. And those that want to do AP could choose to have it as their preferred work type but still get MB when no AP is available.
It would be nice if the options in the Seti preferences were actually options again, and worked as they once did.
Grant
Darwin NT
ID: 1907338
Keith Myers · Special Project $250 donor · Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1907350 - Posted: 16 Dec 2017, 1:20:30 UTC - in response to Message 1907338.  

Do we know what module it was that Eric made those changes in? I've been reading through various modules in the /sched directory looking for suspicious code that might have something to do with scheduling, but I don't know what I'm really looking for. And the descriptors on the modules only say when the last change was made, so if whatever module Eric messed with has been changed since December you can't tell, since there is no apparent dated change log for each module.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1907350
TBar · Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1907360 - Posted: 16 Dec 2017, 1:51:01 UTC - in response to Message 1907319.  

...the server code will be very close to the master code at https://github.com/BOINC/boinc. The client and manager code is in subdirectories 'client' and 'clientgui', respectively: the bit of the server code we're interested in here is in subdirectory 'sched'.
That was an easy solution. There is no directory https://github.com/BOINC/boinc/client/sched, not that I can find anyway. Just by chance I did open home/tbar/Downloads/boinc-master/client/scheduler_op.cpp and searched for "Project has no tasks available", which has to be somewhere in the Server code since it is printed on the replies. There isn't a "Project has no tasks available" anywhere in that code, so, not the right code.

Just by another chance I decided to change my preferences to what Grant suggested: AP Yes, MB No, If no work for selected applications is available, accept work from other applications? Yes.
I suddenly stopped receiving MB tasks even though If no work for selected applications is available, accept work from other applications? is YES.
Perhaps it's just a matter of the Server losing track of just what Tasks it does have available and assuming it doesn't have MB work for a particular Host. By forcing it to look at the available tasks again, it realizes it does have MB tasks, until it again loses track of which tasks it has available and needs to be reminded. Anyway, I should be receiving MB tasks with my current settings, but, I'm Not. I am receiving APs though:

Fri 15 Dec 2017 08:38:27 PM EST | SETI@home | [sched_op] Starting scheduler request
Fri 15 Dec 2017 08:38:27 PM EST | SETI@home | [work_fetch] request: CPU (0.00 sec, 0.00 inst) NVIDIA GPU (123561.31 sec, 0.00 inst)
Fri 15 Dec 2017 08:38:27 PM EST | SETI@home | Sending scheduler request: To report completed tasks.
Fri 15 Dec 2017 08:38:27 PM EST | SETI@home | Reporting 2 completed tasks
Fri 15 Dec 2017 08:38:27 PM EST | SETI@home | Requesting new tasks for NVIDIA GPU
Fri 15 Dec 2017 08:38:27 PM EST | SETI@home | [sched_op] CPU work request: 0.00 seconds; 0.00 devices
Fri 15 Dec 2017 08:38:27 PM EST | SETI@home | [sched_op] NVIDIA GPU work request: 123561.31 seconds; 0.00 devices
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | Scheduler request completed: got 0 new tasks
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | [sched_op] Server version 707
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | No tasks sent
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | No tasks are available for AstroPulse v7
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | Project requested delay of 303 seconds
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 15fe07ac.17702.289635.10.37.145_1
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 21mr08aa.1770.52937.7.34.160_2
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | [sched_op] Deferring communication for 00:05:03
Fri 15 Dec 2017 08:38:29 PM EST | SETI@home | [sched_op] Reason: requested by project
Fri 15 Dec 2017 08:38:29 PM EST | | [work_fetch] Request work fetch: RPC complete
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | [work_fetch] set_request() for CPU: ninst 1 nused_total 35.90 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 172.16
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | [work_fetch] set_request() for NVIDIA GPU: ninst 2 nused_total 189.00 nidle_now 0.00 fetch share 1.00 req_inst 0.00 req_secs 124525.06
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | [sched_op] Starting scheduler request
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | [work_fetch] request: CPU (172.16 sec, 0.00 inst) NVIDIA GPU (124525.06 sec, 0.00 inst)
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | Sending scheduler request: To report completed tasks.
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | Reporting 3 completed tasks
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | Requesting new tasks for CPU and NVIDIA GPU
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | [sched_op] CPU work request: 172.16 seconds; 0.00 devices
Fri 15 Dec 2017 08:43:35 PM EST | SETI@home | [sched_op] NVIDIA GPU work request: 124525.06 seconds; 0.00 devices
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | Scheduler request completed: got 1 new tasks
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] Server version 707
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | Project requested delay of 303 seconds
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] estimated total CPU task duration: 0 seconds
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] estimated total NVIDIA GPU task duration: 1754 seconds
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 15fe07ac.17702.290862.10.37.182_0
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task 08mr07ag.23411.24203.15.42.211_1
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] handle_scheduler_reply(): got ack for task blc04_2bit_guppi_57903_51834_SO0253_0016.17268.0.24.47.234.vlar_1
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] Deferring communication for 00:05:03
Fri 15 Dec 2017 08:43:37 PM EST | SETI@home | [sched_op] Reason: requested by project
Fri 15 Dec 2017 08:43:37 PM EST | | [work_fetch] Request work fetch: RPC complete
Fri 15 Dec 2017 08:43:39 PM EST | SETI@home | Started download of ap_13ja07aa_B2_P0_00272_20171215_01438.wu
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] ------- start work fetch state -------
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] target work buffer: 86400.00 + 8640.00 sec
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] --- project states ---
Fri 15 Dec 2017 08:43:42 PM EST | SETI@home | [work_fetch] REC 298847.111 prio -0.012 can't request work: scheduler RPC backoff (297.93 sec)
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] --- state for CPU ---
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] shortfall 177.53 nidle 0.00 saturated 94862.47 busy 0.00
Fri 15 Dec 2017 08:43:42 PM EST | SETI@home | [work_fetch] share 0.000
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] --- state for NVIDIA GPU ---
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] shortfall 122783.91 nidle 0.00 saturated 33629.39 busy 0.00
Fri 15 Dec 2017 08:43:42 PM EST | SETI@home | [work_fetch] share 0.000
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] ------- end work fetch state -------
Fri 15 Dec 2017 08:43:42 PM EST | | [work_fetch] No project chosen for work fetch
Fri 15 Dec 2017 08:43:56 PM EST | SETI@home | Finished download of ap_13ja07aa_B2_P0_00272_20171215_01438.wu
ID: 1907360