Message boards : Number crunching : Why is there no work?
Joined: 31 Jul 01 · Posts: 2467 · Credit: 86,146,931 · RAC: 0
I don't get it. A lot of crunchers are complaining that they can't get any work. Yet (at this time) there are 67,570 SETI MB work units waiting to be sent out from the Berkeley servers. Server bandwidth is running at less than half of the 100 MB it is capable of. And it's not because of the normal Tuesday outage or the recovery period that follows; this has been going on for several weeks now. So, if the work is plentiful and the demand is high, why aren't the work units being sent out?
Boinc....Boinc....Boinc....Boinc....
JohnDK · Joined: 28 May 00 · Posts: 1222 · Credit: 451,243,443 · RAC: 1,127
I might very well have missed an official explanation, but my guess is that after the APs ran out, people now get MBs instead. With so many more MBs needed to cover the demand, the servers simply can't keep up with the requests; they're way under-dimensioned for this situation...
Cosmic_Ocean · Joined: 23 Dec 00 · Posts: 3027 · Credit: 13,516,867 · RAC: 13
Something Matt mentioned last week in one of the tech news posts is that the numbers on the status page are not exact like they used to be; they're a "good guess". This makes the load on the database much less strenuous. The accurate method locked the database while the whole thing was scanned to get a count of how many results are ready to send, and while the DB was locked, new work units could not be created (split), and so on. Some logic was shuffled around and some code was rewritten, so the count is now produced a different way, one that lets the database keep working while it is read for the status page numbers.
Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving up)
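To make that trade-off concrete, here is a minimal, self-contained sketch (plain C++, not BOINC code) of the two counting strategies described above: an exact count that has to hold the lock while it scans every row, versus a running estimate that never blocks the writers. The table layout, the lock, and the meaning of server_state == 2 (unsent) are simplifications assumed for illustration only.

```cpp
// Illustration only -- not the real BOINC/SETI database code.
#include <cstdio>
#include <mutex>
#include <vector>

struct ResultRow { int server_state; };   // 2 = unsent (assumed, for illustration)
std::vector<ResultRow> result_table;      // stand-in for the science DB table
std::mutex table_lock;                    // stand-in for the DB lock
long long approx_unsent = 0;              // maintained incrementally as rows change

// Old way: lock the table and scan it. Accurate, but nothing can be
// split (inserted) until the scan finishes.
long long exact_ready_to_send() {
    std::lock_guard<std::mutex> guard(table_lock);
    long long n = 0;
    for (const ResultRow& r : result_table)
        if (r.server_state == 2) n++;
    return n;
}

// New way: report the running estimate. It may drift slightly, but the
// status-page query no longer stalls the splitters.
long long approximate_ready_to_send() {
    return approx_unsent;
}

int main() {
    // Simulate the splitter creating a few unsent results.
    for (int i = 0; i < 5; i++) {
        std::lock_guard<std::mutex> guard(table_lock);
        result_table.push_back({2});
        approx_unsent++;
    }
    std::printf("exact: %lld  approximate: %lld\n",
                exact_ready_to_send(), approximate_ready_to_send());
}
```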
Joined: 31 Jul 01 · Posts: 2467 · Credit: 86,146,931 · RAC: 0
OK... that explains it. I missed the posting where it went from an exact number to a SWAG.
Boinc....Boinc....Boinc....Boinc....
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
> I don't get it. A lot of crunchers are complaining that they can't get any work. Yet (at this time) there are 67,570 SETI MB work units waiting to be sent out from the Berkeley servers. Server bandwidth is running at less than half of the 100 MB it is capable of.

Because there can be work that is "ready" but isn't available to the scheduler until the feeder puts it into the (100 work unit) feeder queue. I forget the exact speed of the feeder, but obviously BOINC clients are requesting work faster than the feeder is supplying it to the scheduler.
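For readers who haven't looked at the server side, here is a minimal sketch (an assumed structure, not the real BOINC feeder or scheduler code) of the bottleneck being described: a fixed array of 100 slots sits between the database and the scheduler, and when requests drain it faster than the feeder refills it, clients are told there is no work even though tens of thousands of results are ready in the database.

```cpp
// Illustration of the feeder-slot bottleneck -- not BOINC source.
#include <array>
#include <cstdio>
#include <optional>
#include <queue>

constexpr int kSlots = 100;                    // MAX_WU_RESULTS default
std::array<std::optional<int>, kSlots> slots;  // shared-memory work array (result IDs)
std::queue<int> ready_in_db;                   // results "ready to send" in the DB

// Feeder pass: copy unsent results from the DB into any empty slots.
void feeder_fill() {
    for (auto& s : slots) {
        if (!s && !ready_in_db.empty()) {
            s = ready_in_db.front();
            ready_in_db.pop();
        }
    }
}

// Scheduler: hand out one result from the slots, or report no work.
std::optional<int> scheduler_request() {
    for (auto& s : slots) {
        if (s) { int id = *s; s.reset(); return id; }
    }
    return std::nullopt;                       // -> "Project has no jobs available"
}

int main() {
    for (int id = 0; id < 67570; id++) ready_in_db.push(id);  // plenty of work in the DB
    feeder_fill();
    int served = 0, refused = 0;
    for (int req = 0; req < 500; req++) {      // 500 requests before the next refill
        if (scheduler_request()) served++; else refused++;
    }
    std::printf("served %d, refused %d (DB still holds %zu results)\n",
                served, refused, ready_in_db.size());
}
```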
Joined: 11 Mar 01 · Posts: 16 · Credit: 15,351,703 · RAC: 37
Not sure whether the fact that the replica database is still offline has anything to do with this; I'm sure I've seen a lack of work units being sent out before when that happens.
darengosse · Joined: 8 Mar 06 · Posts: 9 · Credit: 1,045,896 · RAC: 0
> I don't get it. A lot of crunchers are complaining that they can't get any work. Yet (at this time) there are 67,570 SETI MB work units waiting to be sent out from the Berkeley servers. Server bandwidth is running at less than half of the 100 MB it is capable of.

Hello. I am also having a lot of difficulty getting work. When I check the server status page: on June 16, 2009 at 10:30:11 UTC the data distribution state showed Results ready to send: SETI@home = 22,149; on June 16, 2009 at 13:30:17 UTC it showed Results ready to send: SETI@home = 106,474, and all the upload servers are shown as Running. My question: why do I constantly get, all day long in BOINC, the message "Message from server: (Project has no jobs available)" and receive no work, or at most 1 task at a time, and that very rarely? Yet in my preferences I have set a 3.50-day reserve of work. I'll add that this is also the project where my RAC is lowest. Thank you very much in advance for explaining why.
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0
... The Feeder tries to refill the list at 2-second intervals, but other database activity can slow that process a lot. Matt Lebofsky's post last December is still worth reading. It seems to me that something is overloading the database and effectively blocking the Feeder for long periods.
Joe
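As a rough illustration of the effect Joe describes, the toy simulation below assumes a feeder that tops up 100 slots every 2 seconds and an arbitrary, made-up client request rate; when a database stall blocks the refill passes for a stretch, every request during that window is refused even though the "ready to send" count stays high. The request rate and stall length are invented values for illustration, not measured SETI@home figures.

```cpp
// Toy timing model of feeder stalls -- assumed numbers, not measurements.
#include <cstdio>

int main() {
    const int slot_capacity = 100;      // MAX_WU_RESULTS
    const int refill_period = 2;        // seconds between feeder passes
    const int requests_per_sec = 40;    // assumed client demand
    const int stall_start = 30;         // feeder blocked from t=30s ...
    const int stall_end   = 90;         // ... to t=90s by DB contention

    int slots = slot_capacity, served = 0, refused = 0;
    for (int t = 0; t < 120; t++) {
        bool feeder_blocked = (t >= stall_start && t < stall_end);
        if (t % refill_period == 0 && !feeder_blocked)
            slots = slot_capacity;                  // normal refill
        for (int r = 0; r < requests_per_sec; r++) {
            if (slots > 0) { slots--; served++; }
            else refused++;                         // "no jobs available"
        }
    }
    std::printf("served %d requests, refused %d\n", served, refused);
}
```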
Joined: 31 Jul 01 · Posts: 2467 · Credit: 86,146,931 · RAC: 0
Thanks Joe... this was my feeling also, but I lack the scientific data and background to back it up. I was basing my opinion on the fact that all database tasks are running slow (validators, assimilators, etc.), and also on the fact that Berkeley used to easily handle traffic that now seems to be choking it. And what's new, what's been getting a lot of effort, and what requires massive database access? NTPCKR.
Boinc....Boinc....Boinc....Boinc....
ST · Joined: 28 Nov 06 · Posts: 1 · Credit: 203,721 · RAC: 0
I stopped SETI@home last year because of the constant "NO WORK", and last week I received a request to join back, which I did. But it is still the same old problem of "NO WORK". If it can't be resolved, then I will stop again.
1mp0£173 · Joined: 3 Apr 99 · Posts: 8423 · Credit: 356,897 · RAC: 0
> Thanks Joe... this was my feeling also, but I lack the scientific data and background to back it up. I was basing my opinion on the fact that all database tasks are running slow (validators, assimilators, etc.), and also on the fact that Berkeley used to easily handle traffic that now seems to be choking it.

I'm getting work, but it looks like we could have a string of shorties -- which is something else that's changed. Are we also seeing some of the mount issues that Matt has talked about, or is someone else "scraping" for stats, or what else could possibly be an issue? Don't know.
Joined: 4 May 08 · Posts: 417 · Credit: 6,440,287 · RAC: 0
> I stopped SETI@home last year because of the constant "NO WORK", and last week I received a request to join back, which I did. But it is still the same old problem of "NO WORK". If it can't be resolved, then I will stop again.

Unfortunately it looks like your timing was not very good. I have been crunching for SETI for 13 months, and this is the only time any of my hosts have run out of work due to project problems, and even then only 1 host out of 8 so far.
DJStarfox · Joined: 23 May 01 · Posts: 1066 · Credit: 1,226,053 · RAC: 2
> I don't get it. A lot of crunchers are complaining that they can't get any work. Yet (at this time) there are 67,570 SETI MB work units waiting to be sent out from the Berkeley servers. Server bandwidth is running at less than half of the 100 MB it is capable of.

Ned, I agree. The 100-result limit on the feeder is holding back the throughput for getting those work units out into the field. It really needs to be 1000 for SETI.
Joined: 26 Jul 99 · Posts: 338 · Credit: 20,544,999 · RAC: 0
Is there a specific reason for the number 100? Is it hardware-limited in some way, or was it just chosen as a good number because significantly less throughput was required in the past? Maybe it is a server setting that could be changed easily? Or how about running more than one instance? No doubt the recent dramatic increase in the throughput potential of an individual host has added a lot more work for the servers in a very short time frame.
"The most exciting phrase to hear in science, the one that heralds new discoveries, is not Eureka! (I found it!) but rather, 'hmm... that's funny...'" -- Isaac Asimov
Josef W. Segur · Joined: 30 Oct 99 · Posts: 4504 · Credit: 1,414,761 · RAC: 0
That 100 is the default setting in BOINC, and there is a warning about increasing it in sched_shmem.h:

// Default number of work items in shared mem.
// You can configure this in config.xml (<shmem_work_items>)
// If you increase this above 100,
// you may exceed the max shared-memory segment size
// on some operating systems.
//
#define MAX_WU_RESULTS 100

As noted there, it can be changed fairly easily. Whether it would help much here I don't know. My impression is that the Feeder has been effectively blocked for minutes at a time recently, so feeding 1000 at a time would be only a minor help. For some period last year they were running two Feeders and Schedulers, one pair handling odd-numbered tasks and the other even-numbered. That had issues too, and if other activity on the database is the cause of the feeding delays, IMO the extra instance would just add to the problem.
Joe
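A back-of-the-envelope sketch of why that warning exists: the slot array lives in a single shared-memory segment, so its size grows linearly with shmem_work_items and can bump into the kernel's SHMMAX. The per-slot size below is an assumed round figure for illustration, not the actual sizeof of BOINC's WU_RESULT structure.

```cpp
// Rough sizing illustration -- per-slot size is an assumption, not BOINC's real value.
#include <cstdio>

int main() {
    const long long assumed_slot_bytes = 128 * 1024;   // illustrative guess per slot
    const long long default_shmmax     = 32LL << 20;   // 32 MB, common 2.6.x default
    for (int slots : {100, 250, 500, 1000}) {
        long long segment = slots * assumed_slot_bytes;
        std::printf("%4d slots -> ~%3lld MB segment %s\n",
                    slots, segment >> 20,
                    segment > default_shmmax ? "(exceeds 32 MB SHMMAX)" : "(fits)");
    }
}
```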
Joined: 8 Dec 08 · Posts: 231 · Credit: 28,112,547 · RAC: 1
True. I finally got some work, but I haven't been able to upload it since early today.
DJStarfox · Joined: 23 May 01 · Posts: 1066 · Credit: 1,226,053 · RAC: 2
Hmm... if the feeder is blocked for such a long time, then it would make even more sense for it to have a larger buffer. And there is a way to increase the shared memory size on Linux. Even when the DB problem is solved, I don't see any long-term harm in having a big feeder buffer. I'd love to know: what is the average number of scheduler requests per minute? Ahem...

# Set shared memory size (bytes) by including these lines in /etc/sysctl.conf
# Default is 32M on most 2.6.2x kernels
kernel.shmmax=268435456
kernel.shmall=268435456

From what I know of computer architecture, increasing this value will cause more cache misses on the CPU(s).
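If anyone wants to check before touching sysctl.conf, the small Linux-only snippet below (an illustration, not part of BOINC) reads the current SHMMAX from /proc and compares it against a target segment size built from the same assumed per-slot figure as the earlier sketch.

```cpp
// Linux-only check of the current SHMMAX -- target size is an assumed figure.
#include <cstdio>
#include <fstream>

int main() {
    unsigned long long shmmax = 0;
    std::ifstream f("/proc/sys/kernel/shmmax");
    if (!(f >> shmmax)) {
        std::fprintf(stderr, "could not read /proc/sys/kernel/shmmax\n");
        return 1;
    }
    const unsigned long long wanted_slots = 1000;
    const unsigned long long assumed_slot_bytes = 128 * 1024;  // illustrative guess
    unsigned long long wanted = wanted_slots * assumed_slot_bytes;
    std::printf("SHMMAX = %llu bytes, wanted segment = %llu bytes -> %s\n",
                shmmax, wanted,
                wanted <= shmmax ? "fits" : "raise kernel.shmmax in /etc/sysctl.conf");
    return 0;
}
```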
Joined: 4 Oct 00 · Posts: 9541 · Credit: 50,759,529 · RAC: 60
Please refer to the "Server Outage" sticky or the Panic Mode thread for server problems.
In a rich man's house there is no place to spit but his face. -- Diogenes of Sinope
darengosse · Joined: 8 Mar 06 · Posts: 9 · Credit: 1,045,896 · RAC: 0
Hello. My previous message did some good: between them, my 2 computers received 54 WUs, but it has been impossible to send the results back. 18 results have been stuck uploading since June 17 at 20:41:23 UTC, and BOINC constantly shows the message: (Temporarily failed upload - Internet access OK - project servers may be temporarily down). Please excuse me, but I think SETI has, as a French expression puts it, "eyes bigger than its stomach"! Indeed, why have almost 1 million users, and keep accepting new ones, if the servers are unable to keep up with the pace...
darengosse · Joined: 8 Mar 06 · Posts: 9 · Credit: 1,045,896 · RAC: 0
> Hello. My previous message did some good: between them, my 2 computers received 54 WUs.

Sorry, but there is an error in my previous message. It should read: why have almost 1 million users, and keep accepting new ones, if they are unable to keep up with the pace...
Jean-Paul