about to run dry of work

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 115138 - Posted: 26 May 2005, 16:02:47 UTC

Hi,

I have upgraded to BOINC Manager v4.43. It was OK for 2-3 days, but today it stopped downloading new WUs. It is now on its last WU, and even if I update the project manually, no new work is downloaded. My cache size is set to 2 days.

This is all that is displayed in the message tab:

26/05/2005 17:01:10||request_reschedule_cpus: project op
26/05/2005 17:01:10||schedule_cpus: must schedule
26/05/2005 17:01:11|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
26/05/2005 17:01:12|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded


The server status seems to be OK. Any ideas on how to get some WUs downloaded?

Thanks
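
For anyone who wants to filter a longer message log, here is a minimal Python sketch that splits lines like the ones quoted in this post into timestamp, project, and message. It assumes every line has the `date time|project|message` shape seen here (day-first dates, an empty project field for client-wide messages). `LogLine` and `parse_log_line` are names made up for this example and are not part of BOINC.

```python
from datetime import datetime
from typing import NamedTuple


class LogLine(NamedTuple):
    timestamp: datetime
    project: str   # empty string for client-wide messages
    message: str


def parse_log_line(line: str) -> LogLine:
    """Split one 'DD/MM/YYYY HH:MM:SS|project|message' line into fields."""
    # Split at most twice so any '|' inside the message text is preserved.
    stamp, project, message = line.split("|", 2)
    return LogLine(datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message)


lines = [
    "26/05/2005 17:01:10||schedule_cpus: must schedule",
    "26/05/2005 17:01:12|SETI@home|Scheduler request to "
    "http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded",
]
parsed = [parse_log_line(line) for line in lines]

# Keep only the messages that belong to one project.
seti_only = [entry.message for entry in parsed if entry.project == "SETI@home"]
print(seti_only)
```

Filtering by project this way makes it easier to spot whether a "Message from server" line ever appears for the project that is not getting work.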

Pooh Bear 27
Volunteer tester
Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 115142 - Posted: 26 May 2005, 16:16:34 UTC

Everyone will be running dry of work. The splitters are offline and not making new work, as the team is going to be moving stuff to new machines.

This info is from the front page:

May 24, 2005
The splitters and assimilator are offline for a day or so while we relocate the backend science database.



My movie https://vimeo.com/manage/videos/502242

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 115144 - Posted: 26 May 2005, 16:23:09 UTC - in response to Message 115142.  
Last modified: 26 May 2005, 16:30:45 UTC

I saw that news item, but I did not think it was related, because:

1) my other PC is downloading WUs OK; it got one very recently.

2) the status page is still showing plenty of WUs ready for downloading

3) I do not see any messages to the effect that no work is available

The PC in question has not downloaded anything for a day and is just emptying the cache.

Is the status page out of date in terms of available WUs, or has the scheduler on this specific PC gone wrong?




Pooh Bear 27
Volunteer tester
Joined: 14 Jul 03
Posts: 3224
Credit: 4,603,826
RAC: 0
United States
Message 115147 - Posted: 26 May 2005, 16:30:31 UTC

My suspicion is that the scheduler in 4.43 was not fully ready for release and has some issues. I see there were over 100K WUs as of about 20 minutes ago. This goes down a few thousand an hour, but with the weekend coming, I know people will be trying to get extra WUs, and it will run dry quickly.

I hope they can make the move when the WU cache is gone (which I expect to be sometime around 00:00 - 03:00 UTC). I do other projects, so I am not so worried about being out of work, except that two of the other projects are also having some issues. So I might actually run out of work this weekend.

No biggie to me, because I am here to help the science and am not worried if there is no work. Life goes on.



David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 115149 - Posted: 26 May 2005, 16:37:31 UTC - in response to Message 115147.  
Last modified: 26 May 2005, 16:38:23 UTC

I think you're correct about the scheduler. I also have Einstein@Home, but that project is suspended and set not to download work.

I think the scheduler must still be giving the suspended project an allocation of CPU time. I just detached from Einstein@Home. I did not want to do that because it means losing my machine ID number; as an early user, I kinda liked the low number. As soon as I detached, it tried to download some SETI@home work:

26/05/2005 17:36:09|SETI@home|Message from server: Not sending work - last RPC too recent: 2 sec

Still no work but at least it tried. Looks like a bug in the scheduler when you have a suspended project. How do you report potential bugs to the developers?
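
The server message above suggests the scheduler refuses requests that arrive too soon after the previous one from the same host. A minimal Python sketch of that kind of check follows. The 10-second minimum and the function name `check_request` are assumptions for illustration only, not values or code taken from the SETI@home scheduler.

```python
def check_request(last_rpc_time: float, now: float, min_interval: float = 10.0) -> str:
    """Return a refusal message if the host asked again too soon, else 'ok'.

    min_interval is a hypothetical value; the real server setting is not
    stated anywhere in this thread.
    """
    elapsed = now - last_rpc_time
    if elapsed < min_interval:
        return f"Not sending work - last RPC too recent: {int(elapsed)} sec"
    return "ok"


# A request 2 seconds after the previous one is refused; a later one passes.
too_soon = check_request(last_rpc_time=100.0, now=102.0)
later = check_request(last_rpc_time=100.0, now=200.0)
print(too_soon)
print(later)
```

Under this model, waiting a little and updating again would be enough to get past the refusal, which is separate from the debt problem discussed in this thread.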


keputnam
Volunteer tester
Joined: 2 Jul 99
Posts: 242
Credit: 2,736,564
RAC: 3
United States
Message 115151 - Posted: 26 May 2005, 16:52:06 UTC - in response to Message 115149.  
Last modified: 26 May 2005, 16:52:47 UTC

In one of the other threads on the subject, JM7 suggested resetting any projects that have a very large positive Long Term Debt.

A project like Pirates or LHC that has not sent out work recently probably has a huge positive LTD. If you use BoincView, go to the projects tab and look all the way to the right to see LTD.

This "tweak" worked for me (I have both Pirates and LHC on all my machines). Before I had even finished multiple resets on both of these, the core client (CC) was already downloading another SETI WU.


rsisto
Volunteer tester
Joined: 30 Jul 03
Posts: 135
Credit: 729,936
RAC: 0
Uruguay
Message 115152 - Posted: 26 May 2005, 17:04:58 UTC
Last modified: 26 May 2005, 17:05:35 UTC

It is not necessary to reset all the projects; just suspend them. This will set the long-term debt to 0.

David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 115159 - Posted: 26 May 2005, 17:25:44 UTC - in response to Message 115152.  

In my case the second project (Einstein@Home) was suspended and was also set not to download work.

I have now detached from the project. I wish I had tried a reset, but it is too late now :-(

My best guess is that the scheduler was building up a debt for Einstein@Home even though it was suspended.

I will have a look at BoincView. Maybe showing these debt values should go on the wish list for BOINC Manager. If I had seen them, maybe I could have worked out what was going on.



David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 119906 - Posted: 6 Jun 2005, 20:07:04 UTC

Just a quick update.

Firstly, a big thanks to whoever came in over the weekend to sort out the server issues. Much appreciated, as I am sure you had better things planned.

As I ran out of SETI work, I re-signed up for Einstein@Home. I am now using version 4.44 of boinc.exe, so I was able to test the long-term debt stuff with this release.

When SETI@home came back, it got some WUs OK, so I suspended Einstein@Home and set it not to download more work. Alas, the issue is still present in 4.44: SETI@home worked through its cache and would not download more because it was building up a negative debt. In other words, SETI@home was the only active project, but it still would not download, and it almost ran out of work. I tried resetting the Einstein@Home project, but this had no effect. I had to detach from Einstein@Home; as soon as I did that, SETI@home downloaded more work.

I think if a project is suspended then the long-term debt should not be applied.
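
To make the suggestion concrete, here is a toy Python model of long-term debt accounting. It is a sketch under stated assumptions, not the BOINC client's actual code: it assumes each project is "owed" CPU time in proportion to its resource share and that debt changes by owed time minus time actually used. The function name `update_debts`, the equal shares of 100, and the 10-hour period are all made up for illustration.

```python
def update_debts(debts, shares, cpu_used, suspended, freeze_suspended):
    """Return new long-term debts after one accounting period.

    debts, shares, cpu_used: dicts keyed by project name.
    suspended: set of project names the user has suspended.
    freeze_suspended: if True, suspended projects are left out of the
    accounting, so their debt does not move (the behaviour proposed above).
    """
    if freeze_suspended:
        eligible = [p for p in debts if p not in suspended]
    else:
        eligible = list(debts)
    total_share = sum(shares[p] for p in eligible)
    total_cpu = sum(cpu_used.values())
    new_debts = dict(debts)
    for p in eligible:
        # CPU time this project was entitled to, minus what it actually got.
        owed = total_cpu * shares[p] / total_share
        new_debts[p] = debts[p] + owed - cpu_used[p]
    return new_debts


debts = {"SETI@home": 0.0, "Einstein@Home": 0.0}
shares = {"SETI@home": 100, "Einstein@Home": 100}
cpu_used = {"SETI@home": 10.0, "Einstein@Home": 0.0}   # hours in this period
suspended = {"Einstein@Home"}

buggy = update_debts(debts, shares, cpu_used, suspended, freeze_suspended=False)
fixed = update_debts(debts, shares, cpu_used, suspended, freeze_suspended=True)
print(buggy)
print(fixed)
```

In this model, counting the suspended project pushes SETI@home's debt negative even though it is the only project running, which matches the symptom described above. Leaving the suspended project out keeps both debts where they were.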









John McLeod VII
Volunteer developer
Volunteer tester
Joined: 15 Jul 99
Posts: 24806
Credit: 790,712
RAC: 0
United States
Message 119924 - Posted: 6 Jun 2005, 20:40:00 UTC - in response to Message 119906.  
Last modified: 6 Jun 2005, 20:41:21 UTC

Projects that are suspended, that are marked as no download and have no work (gracefully suspended), or that have communications deferred and have no work on the host (not supplying work) should not have their LT debt move. There is a bug that may have made this not work quite right in 4.44. I believe it to be fixed in 4.45.

In any case, if a CPU is idle, the client is supposed to get work from any project (even one with a negative LT debt). However, this does not work quite right in 4.44 either, and I believe that it is fixed in 4.45.
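
The idle-CPU rule can be sketched as follows in Python. This is an illustrative model only, not the client's real work-fetch code. It assumes that projects with negative debt are normally skipped, that suspended projects are never asked, and that debt is the only ranking criterion. The function name `projects_to_ask` is hypothetical.

```python
def projects_to_ask(debts, suspended, cpu_idle):
    """Return project names to request work from, highest debt first.

    debts: dict of project name -> long-term debt.
    suspended: set of project names that must not be asked for work.
    cpu_idle: True if at least one CPU has nothing to run.
    """
    candidates = {p: d for p, d in debts.items() if p not in suspended}
    if not cpu_idle:
        # Normal case in this toy model: only ask projects that are owed time.
        candidates = {p: d for p, d in candidates.items() if d >= 0}
    # With an idle CPU, negative-debt projects stay in the candidate list.
    return sorted(candidates, key=candidates.get, reverse=True)


debts = {"SETI@home": -5.0, "Einstein@Home": 5.0}
suspended = {"Einstein@Home"}

busy_choice = projects_to_ask(debts, suspended, cpu_idle=False)
idle_choice = projects_to_ask(debts, suspended, cpu_idle=True)
print(busy_choice)
print(idle_choice)
```

With the fallback in place, a host whose only active project has negative debt still asks that project for work once a CPU would otherwise sit idle.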


David@home
Volunteer tester
Joined: 16 Jan 03
Posts: 755
Credit: 5,040,916
RAC: 28
United Kingdom
Message 119937 - Posted: 6 Jun 2005, 21:29:10 UTC - in response to Message 119924.  
Last modified: 6 Jun 2005, 21:29:37 UTC

Many thanks for the update, John.
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.