upload issues and other news (Oct 5, 2007)



Jeff Cobb
Forum moderator
Project administrator
Project developer
Project scientist
Joined: Mar 1 99
Posts: 2
Credit: 33,340
RAC: 0
United States
Message 654763 - Posted 5 Oct 2007 21:54:54 UTC

    Matt is still away on his well-deserved vacation, so I will summarize the week.

    Last weekend we had 3 servers go down, as Eric described in the previous tech note. Two of these were attached to a UPS that malfunctioned. Not good, but at least we understand what happened. The third machine, bruno, crashes every week or two and hangs on reboot for reasons we have yet to understand. Our best guess at this time is that the fiber connection to the disk array that holds the upload directory is sometimes throwing garbage onto the bus that the machine cannot gracefully handle. This is an old fiber array that we would like to phase out anyway, so we thought about different storage devices that we currently have that could hold the uploads. We came up with the underutilized disk space on the master science database machine, thumper. This could have the added benefit of hosting the assimilators on the same machine that hosts the back end science database. Eric ran a script that gradually migrated the uploads over to thumper.

    This worked fine until the migration reached a critical point, at which time the loads on the two download machines shot up to the 80-100 range (they are usually at 5 or less). The loads were high because each instance of the file_upload_handler was taking a long time to write the uploaded results over to thumper. To make a long story short, it turns out that the volume on thumper that held the new upload directory was getting slammed by the uploads. It was running at nearly 100% utilization (local disk, not network, utilization). This was, and still is, a bit surprising. The volume on bruno is software RAID50 and on thumper the volume is software RAID5, the latter having 2 more spindles than each of the RAID50 mirrors on bruno. At any rate, we are migrating back to the fiber array on bruno and have already seen download performance normalize. We'll have to figure this one out...

    The other systems news of the week involves database replication on both of our production databases. The seti_boinc database (users, hosts, teams, recent results) replica was lost to a machine crash. We restored from the master and the replica is once again running normally. We are getting very close to having a replica of the back end science database. The initial data load is nearly complete. We will turn on replication either over the weekend or early next week.

    Over in science development we are getting the splitter ready to handle the radar blanking signal that will be embedded in all new data once Arecibo comes back on line later this month.

    -- Jeff

    Profile Blurf
    Volunteer tester
    Joined: Sep 2 06
    Posts: 3112
    Credit: 777,069
    RAC: 564
    United States
    Message 654782 - Posted 5 Oct 2007 22:35:07 UTC

      Thank you for the update Jeff.
      ____________


      Profile petrusbroder
      Joined: Dec 2 01
      Posts: 8
      Credit: 7,198,804
      RAC: 32,307
      Sweden
      Message 654791 - Posted 5 Oct 2007 22:59:50 UTC - in response to Message 654763.

        Thanks for the info, I'll post a link @ TeAm AnandTech - there have been a few ;) questions ...
        ____________

        Profile Dr. C.E.T.I.
        Joined: Feb 29 00
        Posts: 15245
        Credit: 525,847
        RAC: 93
        United States
        Message 654827 - Posted 6 Oct 2007 0:29:06 UTC


          Thanks Immensely for the Update Jeff . . . Nice Work Berkeley . . .

          DJStarfox
          Joined: May 23 01
          Posts: 654
          Credit: 126,070
          RAC: 366
          United States
          Message 654882 - Posted 6 Oct 2007 2:25:06 UTC - in response to Message 654763.

            I think that damn military radar has been the source of the -9 errors and noise in a lot of the WU in the last 6 months or so.

            Profile Jim-R.
            Forum moderator
            Volunteer tester
            Joined: Feb 7 06
            Posts: 1466
            Credit: 14,631
            RAC: 0
            United States
            Message 654941 - Posted 6 Oct 2007 4:12:26 UTC - in response to Message 654882.

              I think that damn military radar has been the source of the -9 errors and noise in a lot of the WU in the last 6 months or so.

              It is in *all* of the multibeam data. As I recall, the linefeed array was lower down on the structure so the radar signal was blocked by a nearby hill. The multibeam array is positioned higher so it receives the signal. The splitters have been configured to remove most of it, but some still gets through. Right now it's been a tradeoff between the amount of radar signal in the work units and the amount of good data that is not split due to the possibility of radar contamination. When they get the new setup working they can remove the radar signal and *only* the radar signal.
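The tradeoff described above can be illustrated with a small sketch. It is not the splitter's actual code: the function names, the 1-bit sample values, the per-sample flag list standing in for the radar blanking signal, and the choice of random noise as the replacement are all assumptions made for illustration.

```python
import random


def blank_samples(samples, radar_flags, rng=None):
    """Fine-grained approach: replace only the flagged samples with noise.

    `radar_flags` is a hypothetical per-sample blanking signal
    (True means the radar was transmitting during that sample).
    """
    if len(samples) != len(radar_flags):
        raise ValueError("samples and radar_flags must have the same length")
    if rng is None:
        rng = random.Random(0)
    return [rng.choice((-1, 1)) if flagged else sample
            for sample, flagged in zip(samples, radar_flags)]


def drop_contaminated_chunks(samples, radar_flags, chunk_size):
    """Coarse approach: discard every chunk that contains a flagged sample."""
    kept = []
    for start in range(0, len(samples), chunk_size):
        if not any(radar_flags[start:start + chunk_size]):
            kept.extend(samples[start:start + chunk_size])
    return kept


samples = [1, -1, 1, 1, -1, -1, 1, -1]
flags = [False, False, True, False, False, False, False, False]

blanked = blank_samples(samples, flags)                       # all 8 samples kept
kept = drop_contaminated_chunks(samples, flags, chunk_size=4)  # first chunk lost
```

With a per-sample flag, one contaminated sample costs one sample. Without it, the same contamination costs the whole chunk, which is the "good data that is not split" side of the tradeoff.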
              ____________
              Jim


              Profile JLDun
              Volunteer tester
              Joined: Apr 21 06
              Posts: 306
              Credit: 22,085
              RAC: 0
              United States
              Message 655042 - Posted 6 Oct 2007 10:04:19 UTC - in response to Message 654941.

                I think that damn military radar has been the source of the -9 errors and noise in a lot of the WU in the last 6 months or so.

                It is in *all* of the multibeam data. As I recall, the linefeed array was lower down on the structure so the radar signal was blocked by a nearby hill. The multibeam array is positioned higher so it receives the signal. The splitters have been configured to remove most of it, but some still gets through. Right now it's been a tradeoff between the amount of radar signal in the work units and the amount of good data that is not split due to the possibility of radar contamination. When they get the new setup working they can remove the radar signal and *only* the radar signal.

                Thanks for this explanation.


                When they get the new setup working they can remove the radar signal and *only* the radar signal.

                I know I've seen an explanation before, but how can SETI 'be sure' they've removed the "d*** military radar" from the WU's? Without going through every second of WU data before it's sent?
                ____________

                Richard Haselgrove
                Volunteer tester
                Joined: Jul 4 99
                Posts: 3759
                Credit: 10,956,622
                RAC: 20,101
                United Kingdom
                Message 655067 - Posted 6 Oct 2007 10:27:03 UTC - in response to Message 655042.

                  I know I've seen an explanation before, but how can SETI 'be sure' they've removed the "d*** military radar" from the WU's? Without going through every second of WU data before it's sent?

                  There's a Scientific Newsletter article about it.

                  Scrat
                  Joined: Jul 12 00
                  Posts: 2
                  Credit: 51,524
                  RAC: 0
                  Austria
                  Message 655268 - Posted 6 Oct 2007 17:45:42 UTC

                    If you have pending problems with garbage on the fibre cables you could check the mass points on the system.

                    One problem could be that the shielding of the fibre cables is directly connected on both ends of the cable. It's strongly recommended to connect ONLY one end directly to the mass point and the other through a special surge protector.

                    Another problem could be that the mass connection of the server array is too long. If you connect too many servers over a larger distance to the same mass point, the equalizing current in the mass wire could also cause serious problems in the sending and receiving modules. (You know these problems if you have ever connected your hi-fi to the television....)

                    These two problems can occur and cause such faults. Mostly they are not taken care of during construction of the servers, because of the increased building costs.

                    I already had such problems..... maybe i could help.....


                    Excuse me for the english.... its a long time since last using it ;)

                    Profile Clyde C. Phillips, III
                    Joined: Aug 2 00
                    Posts: 1845
                    Credit: 3,909,444
                    RAC: 4,871
                    United States
                    Message 655284 - Posted 6 Oct 2007 18:29:14 UTC

                      I'm suggesting making an occulting screen, probably something like that of a drive-in movie theatre, but that just may be too expensive and on someone else's property. There is also the probability of hurricanes and tornadoes. Maybe a mesh, rather than a light-opaque structure, would be just as effective.
                      ____________

                      Profile Odan
                      Joined: May 8 03
                      Posts: 42
                      Credit: 6,061,022
                      RAC: 12,914
                      United Kingdom
                      Message 655312 - Posted 6 Oct 2007 19:13:41 UTC - in response to Message 655268.

                        If you have pending problems with garbage on the fibre cables you could check the mass points on the system .....


                        Scrat, the fibres are _optical_ fibres so electrical problems such as grounding (mass?) do not have any effect.

                        Excuse me for the english.... its a long time since last using it ;)

                        Your English is fine :~}
                        ____________

                        Profile Jim-R.
                        Forum moderator
                        Volunteer tester
                        Joined: Feb 7 06
                        Posts: 1466
                        Credit: 14,631
                        RAC: 0
                        United States
                        Message 655315 - Posted 6 Oct 2007 19:15:32 UTC - in response to Message 655284.

                          I'm suggesting making an occulting screen probably something like that of a drive-in movie theatre but that just may be too expensive and on someone elses property. Also the probability of hurricanes and tornadoes. Maybe a mesh, rather than a light-opaque structure would be just as effective.

                          Hmm, good suggestion. It wouldn't have to be very big, just big enough to block the radar signal from the receiving antennas. Unless there's reflections from some other structure. It would have to be made of something opaque to the freq of the radar. Maybe the sort of radar absorbing material they use on the stealth and other planes. Absorbing the signal would be preferable to reflecting it for the radar op's sake! hehe Unless it were reflected at an angle so as not to bounce straight back at them!
                          ____________
                          Jim

                          Join GQFCrunchers
                          Problems?
                          Check the Boinc Wiki
                          Or NEW Enhanced FAQ

                          Profile Dr. C.E.T.I.
                          Joined: Feb 29 00
                          Posts: 15245
                          Credit: 525,847
                          RAC: 93
                          United States
                          Message 655352 - Posted 6 Oct 2007 20:15:17 UTC - in response to Message 655315.

                            I'm suggesting making an occulting screen probably something like that of a drive-in movie theatre but that just may be too expensive and on someone elses property. Also the probability of hurricanes and tornadoes. Maybe a mesh, rather than a light-opaque structure would be just as effective.

                            Hmm, good suggestion. It wouldn't have to be very big, just big enough to block the radar signal from the receiving antennas. Unless there's reflections from some other structure. It would have to be made of something opaque to the freq of the radar. Maybe the sort of radar absorbing material they use on the stealth and other planes. Absorbing the signal would be preferable to reflecting it for the radar op's sake! hehe Unless it were reflected at an angle so as not to bounce straight back at them!



                            The First Invisibility Shield . . . Technology mighten be 'incorporated into' said 'meshin' . . . hmmm?

                            msattler
                            Volunteer tester
                            Joined: Jul 9 00
                            Posts: 15616
                            Credit: 44,072,636
                            RAC: 115,238
                            United States
                            Message 655371 - Posted 6 Oct 2007 20:43:36 UTC - in response to Message 655268.

                              If you have pending problems with garbage on the fibre cables you could check the mass points on the system.

                              One problem could be that the shielding of the fibre cables is directly connected on both ends of the cable. Its strongly recommended to connect ONLY one point directly with the mass point and the other over a spezial surge protector.

                              Another problem could be that the mass points of the server array is to long. If you connect too many servers over a larger distance with the same mass point the equalizing current of the mass wire could also cause serious problems on the sending and receiving modules. (You know these problems if you ever connected your hifi with the television....)

                              These two problems can occour and cause such problems. Mostly this problems are not taken care of at the construction of the servers, cause of the increased costs of building.

                              I already had such problems..... maybe i could help.....


                              Excuse me for the english.... its a long time since last using it ;)


                              Sorry, but you are confusing electrical signal cables with fiber optic cables. The grounding advice you gave applies well to electrical cables, but fiber optics are immune to RFI and other electrical interference, and do not require shielding which is earthed (grounded).
                              ____________
                              4 kitties on a Seti mission...Meeeeeeooowwwrrrrr!!!


                              The Genuine Kittyman..........accept no substitutes.



                              Profile Andy Lee Robinson
                              Joined: Dec 8 05
                              Posts: 540
                              Credit: 15,104,606
                              RAC: 4,702
                              Hungary
                              Message 655391 - Posted 6 Oct 2007 21:18:14 UTC - in response to Message 655315.

                                Last modified: 6 Oct 2007 21:19:53 UTC

                                Absorbing the signal would be preferable to reflecting it for the radar op's sake! hehe Unless it were reflected at an angle so as not to bounce straight back at them!


                                Well I'd send it back to them amplified a few billion times... but that's me!

                                The problem with any amount of terrestrial RFI is that it massively outweighs the signals we are looking for. Blocking RFI is difficult because of diffraction and reradiation, and the pylons will also resonate and reradiate all kinds of frequencies and pick up much more than just the radar. A huge multiwalled Faraday cage might help.

                                Damned inconvenient to have one within line-of-sight-and-a-bit-more near Arecibo. The power of these things can be 10s of kW so even when not facing the antenna, leakage could still be considerable.

                                Those responsible for the radar should also be responsible for it not interfering with their neighbours, and a radar absorber/attenuator/blocker or switcher offer would be more effective closer to the source.

                                Saving grace is that the signal can be neutralized somewhat, or skipped over but there can still be annoying traces.

                                Ideally would need to build a really big mountain around the dish and turn it into a large tube, or move to the far side of the Moon and recruit a few convenient craters.

                                msattler
                                Volunteer tester
                                Joined: Jul 9 00
                                Posts: 15616
                                Credit: 44,072,636
                                RAC: 115,238
                                United States
                                Message 655396 - Posted 6 Oct 2007 21:24:36 UTC - in response to Message 655391.

                                  Absorbing the signal would be preferable to reflecting it for the radar op's sake! hehe Unless it were reflected at an angle so as not to bounce straight back at them!


                                  Well I'd send it back to them amplified a few billion times... but that's me!

                                  Problem with any amount of terrestrial rfi, it massively outweighs those we are looking for. Blocking rfi is difficult because of diffraction and reradiation, and the pylons will also resonate and reradiate all kinds of frequencies and pick up much more than just the radar. A huge multiwalled Faraday cage might help.

                                  Damned inconvenient to have one within line-of-sight-and-a-bit-more near Arecibo. The power of these things can be 10s of kW so even when not facing the antenna, leakage could still be considerable.

                                  Those responsible for the radar should also be responsible for it not interfering with their neighbours, and a radar absorber/attenuator/blocker or switcher offer would be more effective closer to the source.

                                  Saving grace is that the signal can be neutralized somewhat, or skipped over but there can still be annoying traces.

                                  Ideally would need to build a really big mountain around the dish and turn it
                                  into a large tube, or move to the far side of the Moon and recruit a few convenient craters.

                                  If it were a civilian installation, they might be held responsible for the problem, but being a military installation, no such restrictions apply.

                                  ____________
                                  4 kitties on a Seti mission...Meeeeeeooowwwrrrrr!!!


                                  The Genuine Kittyman..........accept no substitutes.



                                  Scrat
                                  Joined: Jul 12 00
                                  Posts: 2
                                  Credit: 51,524
                                  RAC: 0
                                  Austria
                                  Message 655481 - Posted 7 Oct 2007 0:03:39 UTC - in response to Message 655371.



                                    Sorry, but you are confusing electrical signal cables with fiber optic cables. The grounding advice you gave is well applied to electrical cables, but fiber optics are immune to RFI and other electical interference, and do not require shielding which is earthed (grounded).


                                    :) I know that fibre cables normally have no shielding, but sometimes the rodent protection of the cables is connected to the sending and the receiving module, or to the server rack. This can cause such problems, not because of electrical interference in the cable itself, but in the sending or receiving module.


                                    I just wanted to mention it, because I had such problems twice (with fibre optic) and it took me 2 months to locate the problem.

                                    Profile Gary Charpentier
                                    Volunteer tester
                                    Joined: Dec 25 00
                                    Posts: 1600
                                    Credit: 1,137,817
                                    RAC: 512
                                    United States
                                    Message 655484 - Posted 7 Oct 2007 0:07:23 UTC - in response to Message 655315.

                                      Last modified: 7 Oct 2007 0:16:03 UTC

                                      Don't worry about sending it back to them. Their software will just remove it as ground clutter.

                                      Gary

                                      I'm suggesting making an occulting screen probably something like that of a drive-in movie theatre but that just may be too expensive and on someone elses property. Also the probability of hurricanes and tornadoes. Maybe a mesh, rather than a light-opaque structure would be just as effective.

                                      Hmm, good suggestion. It wouldn't have to be very big, just big enough to block the radar signal from the receiving antennas. Unless there's reflections from some other structure. It would have to be made of something opaque to the freq of the radar. Maybe the sort of radar absorbing material they use on the stealth and other planes. Absorbing the signal would be preferable to reflecting it for the radar op's sake! hehe Unless it were reflected at an angle so as not to bounce straight back at them!


                                      ____________

                                      msattler
                                      Volunteer tester
                                      Joined: Jul 9 00
                                      Posts: 15616
                                      Credit: 44,072,636
                                      RAC: 115,238
                                      United States
                                      Message 655489 - Posted 7 Oct 2007 0:12:43 UTC - in response to Message 655481.



                                        Sorry, but you are confusing electrical signal cables with fiber optic cables. The grounding advice you gave is well applied to electrical cables, but fiber optics are immune to RFI and other electical interference, and do not require shielding which is earthed (grounded).


                                        :) I know that fibre cables normally have no shielding, but sometimes the rodent protection of the cables is connected to the sending and the receiving modul, or to the server rack. This can cause such problems, not because of electrical interference in the cable itself, but in the sending or receiving modul.


                                        I just wanted to mention it, cause i had such problems twice (with fibre optic) and it took me 2 months to locate the problem.

                                        I suppose if the 2 interconnected pieces of equipment were not on the same ground bus there could be some differential between them thus inducing some current flow from one to the other. Fair enough. When you don't have an obvious solution to a problem, you must look for other obscure causes.
                                        ____________
                                        4 kitties on a Seti mission...Meeeeeeooowwwrrrrr!!!


                                        The Genuine Kittyman..........accept no substitutes.



                                        Grant (SSSF)
                                        Joined: Aug 19 99
                                        Posts: 3161
                                        Credit: 1,936,041
                                        RAC: 2,277
                                        Australia
                                        Message 655492 - Posted 7 Oct 2007 0:15:19 UTC - in response to Message 655489.

                                          I suppose if the 2 interconnected pieces of equipment were not on the same ground bus there could be some differential between them thus inducing some current flow from one to the other.

                                          All too common.
                                          Earth loops can be a nightmare to track down & resolve.

                                          ____________
                                          Grant
                                          Darwin NT.

                                          Profile Gary Charpentier
                                          Volunteer tester
                                          Joined: Dec 25 00
                                          Posts: 1600
                                          Credit: 1,137,817
                                          RAC: 512
                                          United States
                                          Message 655499 - Posted 7 Oct 2007 0:37:35 UTC - in response to Message 655492.

                                            I used to live about a wavelength from a commercial 10kW AM band radio transmitter. Everything was hot! Of course powerline likes to get into everything too. Sometimes it is easier to shield the powerlines or move them away from the equipment or orient them at right angles to the data cables.

                                            I also remember something recent about the power being redone near the lab building. Check on the timeframe of when the problem started and when that work was done. Perhaps some powerline got moved too close to the data cable maybe even on another floor of the building. New construction might be ENT rather than EMT. Or they could have induced a ground loop in the construction or worse not tied something to ground that should be.

                                            Finally, despite the name Fibre Channel, not all Fibre Channel is carried on fibre optic cables. Copper cable alternatives are available on short runs by some vendors.

                                            I suppose if the 2 interconnected pieces of equipment were not on the same ground bus there could be some differential between them thus inducing some current flow from one to the other.

                                            All too common.
                                            Earth loops can be an nightmare to track down & resolve.


                                            ____________

                                            Robert Ribbeck
                                            Joined: Jun 7 02
                                            Posts: 196
                                            Credit: 1,745,876
                                            RAC: 2,628
                                            United States
                                            Message 655626 - Posted 7 Oct 2007 6:14:51 UTC

                                              Something is still messed up

                                              From the log of one of my machines

                                              10/7/2007 1:07:03 AM|SETI@home|Scheduler RPC succeeded [server version 511]
                                              10/7/2007 1:07:03 AM|SETI@home|Message from server: No work sent
                                              10/7/2007 1:07:03 AM|SETI@home|Message from server: (won't finish in time) Computer on 85.4% of time, BOINC on 100.0% of that, this project gets 100.0% of that

                                              Say What ?? the machine is on 24/7 and only runs seti@home

                                              The closest deadline is Oct 30, for 6 chunks of less than 5.5 hr each; everything else is not due till Nov. Unless the scheduler is messed up and is ignoring deadlines, the Oct deadlines should be completed early Monday.

                                              So WHY the won't finish in time and computer on errors ???
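For readers puzzling over the same message, the three percentages in that log line can be combined into a rough feasibility check. This is only a sketch under stated assumptions: it is not the BOINC scheduler's code, the function names are hypothetical, and it assumes the fractions simply multiply and that the only work considered is the 6 chunks of at most 5.5 hours each mentioned above.

```python
def effective_fraction(on_frac, active_frac, share_frac):
    """Fraction of wall-clock time assumed to be spent on this project."""
    return on_frac * active_frac * share_frac


def wall_hours_needed(cpu_hours, on_frac, active_frac, share_frac):
    """Wall-clock hours needed to deliver `cpu_hours` of processing."""
    fraction = effective_fraction(on_frac, active_frac, share_frac)
    if fraction <= 0:
        raise ValueError("effective fraction must be positive")
    return cpu_hours / fraction


def would_finish(cpu_hours, hours_until_deadline,
                 on_frac, active_frac, share_frac):
    """True if the work fits before the deadline under this simple model."""
    needed = wall_hours_needed(cpu_hours, on_frac, active_frac, share_frac)
    return needed <= hours_until_deadline


# Numbers from the post: 6 chunks of at most 5.5 h, computer on 85.4% of the
# time, BOINC running 100% of that, this project getting 100% of that.
cpu_hours = 6 * 5.5
needed = wall_hours_needed(cpu_hours, 0.854, 1.0, 1.0)

# Oct 7 to Oct 30 is about 23 days.
fits = would_finish(cpu_hours, 23 * 24, 0.854, 1.0, 1.0)
```

Under this simple model the queued October work needs roughly 38.6 wall-clock hours, far less than the 23 days available. The refusal therefore more likely concerns the new work being requested, or inputs the server uses that are not shown in the log.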


                                              ____________

                                              Grant (SSSF)
                                              Joined: Aug 19 99
                                              Posts: 3161
                                              Credit: 1,936,041
                                              RAC: 2,277
                                              Australia
                                              Message 655636 - Posted 7 Oct 2007 7:37:38 UTC - in response to Message 655626.

                                                So WHY the won't finish in time and computer on errors ???

                                                Try the Help forums, or the Number Crunching forum.

                                                ____________
                                                Grant
                                                Darwin NT.

                                                Profile arr25b
                                                Joined: Nov 19 05
                                                Posts: 15
                                                Credit: 4,219,671
                                                RAC: 8,328
                                                United Kingdom
                                                Message 656218 - Posted 8 Oct 2007 12:50:59 UTC - in response to Message 655636.

                                                  08/10/2007 13:49:01||Access to reference site succeeded - project servers may be temporarily down.
                                                  08/10/2007 13:49:04|SETI@home|Scheduler request failed: failed sending data to the peer
                                                  08/10/2007 13:49:04|SETI@home|Sending scheduler request: To fetch work
                                                  08/10/2007 13:49:04|SETI@home|Requesting 34560 seconds of new work
                                                  08/10/2007 13:49:05||Project communication failed: attempting access to reference site
                                                  08/10/2007 13:49:06||Access to reference site succeeded - project servers may be temporarily down.
                                                  08/10/2007 13:49:09|SETI@home|Scheduler request failed: failed sending data to the peer
                                                  08/10/2007 13:49:09|SETI@home|Deferring communication for 1 min 0 sec
                                                  08/10/2007 13:49:09|SETI@home|Reason: scheduler request failed


                                                  I'm having trouble getting any WUs, and when I do I only receive about 6. I have my settings set to receive up to 10 days' worth.
                                                  ____________

                                                  Idefix
                                                  Volunteer tester
                                                  Joined: Sep 7 99
                                                  Posts: 152
                                                  Credit: 334,718
                                                  RAC: 102
                                                  Germany
                                                  Message 656252 - Posted 8 Oct 2007 13:43:56 UTC - in response to Message 655626.

                                                    Hi,

                                                    Something is still messed up

                                                    From the log of one of my machines

                                                    10/7/2007 1:07:03 AM|SETI@home|Scheduler RPC succeeded [server version 511]
                                                    10/7/2007 1:07:03 AM|SETI@home|Message from server: No work sent
                                                    10/7/2007 1:07:03 AM|SETI@home|Message from server: (won't finish in time) Computer on 85.4% of time, BOINC on 100.0% of that, this project gets 100.0% of that

                                                    Say what?? The machine is on 24/7 and only runs SETI@home.

                                                    The closest deadline is Oct 30 for 6 chunks of less than 5.5 hr each; everything else is
                                                    not due till Nov. Unless the scheduler is messed up and is ignoring deadlines,
                                                    the Oct deadlines should be completed early Monday.

                                                    So WHY the "won't finish in time" and "Computer on" errors???

                                                    Your WUs will finish in time. But Berkeley thinks that the new WUs it wants to send you won't be reported in time. At the moment, VHAR workunits with a deadline of about 8 days are being sent out. Some of your machines have average turnaround times of 8 to 10 days, which is too long to report those WUs in time. That's the reason why BOINC blocks you from getting new work.
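The comparison described above can be sketched in a few lines. This is a simplified illustration of the idea only, not the actual BOINC scheduler code; the function name and the plain "average turnaround versus deadline" rule are assumptions made for this example.

```python
def likely_reported_in_time(avg_turnaround_days: float, deadline_days: float) -> bool:
    """Return True if a host's average turnaround fits within the deadline.

    Hypothetical helper: the real scheduler check is more involved.
    """
    return avg_turnaround_days <= deadline_days


# Figures from the post: VHAR workunits have a deadline of about 8 days,
# while some hosts average 8 to 10 days per returned result.
for turnaround in (6.0, 8.0, 10.0):
    ok = likely_reported_in_time(turnaround, deadline_days=8.0)
    print(f"turnaround {turnaround:.0f} days: {'send work' if ok else 'no work sent'}")
```

Under this simplified rule, a host that averages 10 days per result is refused an 8-day workunit even though its current queue is not at risk.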

                                                    Regards,
                                                    Carsten

                                                    KWSN THE Holy Hand Grenade!
                                                    Volunteer tester
                                                    Joined: Dec 20 05
                                                    Posts: 653
                                                    Credit: 1,479,956
                                                    RAC: 2,185
                                                    United States
                                                    Message 656264 - Posted 8 Oct 2007 14:12:39 UTC

                                                      I agree here, something is messed up...

                                                      10/8/2007 6:59:57 AM|SETI@home|Sending scheduler request: Requested by user
                                                      10/8/2007 6:59:57 AM|SETI@home|Requesting 700834 seconds of new work, and reporting 16 completed tasks
                                                      10/8/2007 7:00:13 AM||Project communication failed: attempting access to reference site
                                                      10/8/2007 7:00:16 AM||Access to reference site succeeded - project servers may be temporarily down.
                                                      10/8/2007 7:00:18 AM|SETI@home|Scheduler request failed: server returned nothing (no headers, no data)
                                                      10/8/2007 7:00:18 AM|SETI@home|Deferring communication for 1 min 0 sec
                                                      10/8/2007 7:00:18 AM|SETI@home|Reason: scheduler request failed
                                                      10/8/2007 7:01:18 AM|SETI@home|Sending scheduler request: Requested by user
                                                      10/8/2007 7:01:18 AM|SETI@home|Requesting 700832 seconds of new work, and reporting 16 completed tasks
                                                      10/8/2007 7:02:53 AM|SETI@home|Scheduler request failed: HTTP service unavailable
                                                      10/8/2007 7:02:53 AM|SETI@home|Deferring communication for 1 min 0 sec
                                                      10/8/2007 7:02:53 AM|SETI@home|Reason: scheduler request failed

                                                      at this point, I set "no new tasks"...

                                                      10/8/2007 7:03:13 AM|SETI@home|Sending scheduler request: Requested by user
                                                      10/8/2007 7:03:13 AM|SETI@home|Reporting 16 tasks
                                                      10/8/2007 7:04:33 AM|SETI@home|Scheduler request failed: HTTP service unavailable
                                                      10/8/2007 7:04:33 AM|SETI@home|Deferring communication for 1 min 0 sec
                                                      10/8/2007 7:04:33 AM|SETI@home|Reason: scheduler request failed
                                                      10/8/2007 7:05:33 AM|SETI@home|Sending scheduler request: Requested by user


                                                      This behavior is with BOINC 5.10.20 (x64). I'm NOT getting the same with 5.8.16 or 5.8.15 (x86), which run on different computers in the "farm". I finally got through on the next try...
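For anyone comparing logs between client versions, the messages above follow a `timestamp|project|message` layout, so failures can be tallied with a short script. This is a minimal sketch: the function names are made up for this example, and the sample text is simply copied from the log lines in this post.

```python
from collections import Counter

# Sample lines copied from the log above.
SAMPLE_LOG = """\
10/8/2007 6:59:57 AM|SETI@home|Sending scheduler request: Requested by user
10/8/2007 7:00:13 AM||Project communication failed: attempting access to reference site
10/8/2007 7:00:18 AM|SETI@home|Scheduler request failed: server returned nothing (no headers, no data)
10/8/2007 7:02:53 AM|SETI@home|Scheduler request failed: HTTP service unavailable
10/8/2007 7:04:33 AM|SETI@home|Scheduler request failed: HTTP service unavailable
"""


def parse_line(line):
    """Split one 'timestamp|project|message' line into its three fields."""
    timestamp, project, message = line.split("|", 2)
    return timestamp, project, message


def count_failures(log_text):
    """Count scheduler failures, keyed by the reason given after the colon."""
    prefix = "Scheduler request failed: "
    reasons = Counter()
    for line in log_text.splitlines():
        if line.count("|") < 2:
            continue  # skip blank or malformed lines
        _, _, message = parse_line(line)
        if message.startswith(prefix):
            reasons[message[len(prefix):]] += 1
    return reasons


if __name__ == "__main__":
    for reason, n in count_failures(SAMPLE_LOG).most_common():
        print(f"{n} x {reason}")
```

Running the same tally on logs from the 5.10.20 and 5.8.x machines would show whether the failure reasons really differ between versions.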
                                                      ____________
                                                      _______

                                                      KWSN THE Holy Hand Grenade!
                                                      Volunteer tester
                                                      Joined: Dec 20 05
                                                      Posts: 653
                                                      Credit: 1,479,956
                                                      RAC: 2,185
                                                      United States
                                                      Message 656267 - Posted 8 Oct 2007 14:18:31 UTC - in response to Message 655626.

                                                        Something is still messed up

                                                        From the log of one of my machines

                                                        10/7/2007 1:07:03 AM|SETI@home|Scheduler RPC succeeded [server version 511]
                                                        10/7/2007 1:07:03 AM|SETI@home|Message from server: No work sent
                                                        10/7/2007 1:07:03 AM|SETI@home|Message from server: (won't finish in time) Computer on 85.4% of time, BOINC on 100.0% of that, this project gets 100.0% of that

                                                        Say what?? The machine is on 24/7 and only runs SETI@home.

                                                        The closest deadline is Oct 30 for 6 chunks of less than 5.5 hr each; everything else is
                                                        not due till Nov. Unless the scheduler is messed up and is ignoring deadlines,
                                                        the Oct deadlines should be completed early Monday.

                                                        So WHY the "won't finish in time" and "Computer on" errors???



                                                        Have you recently (within the last 2 months) shut that machine down for any reason? (maintenance, a reboot, anything?) That would cause the "Computer on" stat to fall below 100%...
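As a rough illustration of what the figures in that server message imply, the arithmetic can be sketched as below. Both helpers are hypothetical: the first assumes the "Computer on" statistic is a plain ratio over a 60-day window (the client may well average it differently), and the second assumes the three fractions in the message simply multiply together.

```python
def days_off(on_fraction: float, window_days: float) -> float:
    """Days powered off implied by an 'on' fraction, assuming a plain ratio."""
    return (1.0 - on_fraction) * window_days


def wall_clock_hours(cpu_hours: float, on_fraction: float,
                     boinc_fraction: float = 1.0,
                     project_fraction: float = 1.0) -> float:
    """Stretch a CPU-time estimate by the three availability fractions."""
    return cpu_hours / (on_fraction * boinc_fraction * project_fraction)


# Figures from the quoted message: computer on 85.4% of the time,
# BOINC on 100% of that, this project gets 100% of that.
print(round(days_off(0.854, 60), 2), "days off in a 60-day window")
print(round(wall_clock_hours(5.5, 0.854), 2), "wall-clock hours for a 5.5 hr task")
```

Under these assumptions, a fraction of 85.4% corresponds to well over a week of downtime in two months, which is why a single reboot seems unlikely to explain it on its own.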

                                                        ____________
                                                        _______

                                                        Martin P.
                                                        Joined: May 19 99
                                                        Posts: 273
                                                        Credit: 3,126,253
                                                        RAC: 6,570
                                                        Austria
                                                        Message 656318 - Posted 8 Oct 2007 16:10:04 UTC - in response to Message 656218.

                                                          08/10/2007 13:49:01||Access to reference site succeeded - project servers may be temporarily down.
                                                          08/10/2007 13:49:04|SETI@home|Scheduler request failed: failed sending data to the peer
                                                          08/10/2007 13:49:04|SETI@home|Sending scheduler request: To fetch work
                                                          08/10/2007 13:49:04|SETI@home|Requesting 34560 seconds of new work
                                                          08/10/2007 13:49:05||Project communication failed: attempting access to reference site
                                                          08/10/2007 13:49:06||Access to reference site succeeded - project servers may be temporarily down.
                                                          08/10/2007 13:49:09|SETI@home|Scheduler request failed: failed sending data to the peer
                                                          08/10/2007 13:49:09|SETI@home|Deferring communication for 1 min 0 sec
                                                          08/10/2007 13:49:09|SETI@home|Reason: scheduler request failed


                                                          I'm having trouble getting any WUs, and when I do I only receive about 6. I have my settings set to receive up to 10 days' worth.


                                                          arr25b,

                                                          Does the estimated time to finish a WU match the real time? It can happen that the client_state.xml file gets corrupted. SETI@home calculates the need for new WUs on the basis of the estimated time to finish your backlog.

                                                          You can repair this by editing the client_state.xml file. Open it in any text editor and search for "duration_correction_factor". This will give results like this:
                                                          <duration_correction_factor>0.368199</duration_correction_factor>
                                                          Make sure it is the one for SETI!
                                                          If estimated times are longer than real crunch time, then reduce the correction factor. Make sure you use the exact same number of decimal places!!!
                                                          Save it and restart BOINC.
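Before editing by hand, it can help to list the correction factor of every project so the SETI entry is easy to pick out. The sketch below only reads the values and changes nothing. The XML fragment is an illustrative sample, not a real file: it assumes each `<project>` block carries a `<master_url>` next to its `<duration_correction_factor>`, and the second project URL is made up.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; a real client_state.xml holds many more fields.
SAMPLE = """\
<client_state>
  <project>
    <master_url>http://setiathome.berkeley.edu/</master_url>
    <duration_correction_factor>0.368199</duration_correction_factor>
  </project>
  <project>
    <master_url>http://example.org/other_project/</master_url>
    <duration_correction_factor>1.000000</duration_correction_factor>
  </project>
</client_state>
"""


def correction_factors(xml_text):
    """Map each project's master_url to its duration_correction_factor."""
    root = ET.fromstring(xml_text)
    factors = {}
    for project in root.iter("project"):
        url = project.findtext("master_url")
        dcf = project.findtext("duration_correction_factor")
        if url is not None and dcf is not None:
            factors[url.strip()] = float(dcf)
    return factors


if __name__ == "__main__":
    for url, dcf in correction_factors(SAMPLE).items():
        print(url, dcf)
```

To try it on a real installation, read the file's text with BOINC stopped and pass it to `correction_factors`; keeping a backup copy of client_state.xml before any manual edit is a sensible precaution.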

                                                          ____________

                                                          arr25b
                                                          Joined: Nov 19 05
                                                          Posts: 15
                                                          Credit: 4,219,671
                                                          RAC: 8,328
                                                          United Kingdom
                                                          Message 656322 - Posted 8 Oct 2007 16:14:11 UTC - in response to Message 656318.


                                                            I'll give that a try. I have also changed my work buffer from 10 days to 7, but I still can't download any WUs, even on a fresh install.

                                                            ____________

                                                            Ned Ludd
                                                            Volunteer tester
                                                            Joined: Apr 3 99
                                                            Posts: 7693
                                                            Credit: 264,450
                                                            RAC: 331
                                                            United States
                                                            Message 656327 - Posted 8 Oct 2007 16:37:04 UTC - in response to Message 655489.



                                                              Sorry, but you are confusing electrical signal cables with fiber optic cables. The grounding advice you gave is well applied to electrical cables, but fiber optics are immune to RFI and other electrical interference, and do not require earthed (grounded) shielding.


                                                              :) I know that fibre cables normally have no shielding, but sometimes the rodent protection of the cables is connected to the sending and the receiving module, or to the server rack. This can cause such problems, not because of electrical interference in the cable itself, but in the sending or receiving module.


                                                              I just wanted to mention it, because I had such problems twice (with fibre optic) and it took me 2 months to locate the problem.

                                                              I suppose if the 2 interconnected pieces of equipment were not on the same ground bus there could be some differential between them thus inducing some current flow from one to the other. Fair enough. When you don't have an obvious solution to a problem, you must look for other obscure causes.

                                                              No. We're talking about converting electrical signals to light, running them through the fiber optics, and then converting from light back to electrical signals.

                                                              The signal is referenced to local ground at the converter at each end. That's one of the best things about fiber optics, all of the yucky "what is ground" questions go away. The light is light at each end of the cable.
                                                              ____________

                                                              Ned Ludd
                                                              Volunteer tester
                                                              Joined: Apr 3 99
                                                              Posts: 7693
                                                              Credit: 264,450
                                                              RAC: 331
                                                              United States
                                                              Message 656331 - Posted 8 Oct 2007 16:40:10 UTC - in response to Message 655492.

                                                                I suppose if the 2 interconnected pieces of equipment were not on the same ground bus there could be some differential between them thus inducing some current flow from one to the other.

                                                                All too common.
                                                                Earth loops can be a nightmare to track down & resolve.

                                                                ... and this is the other reason you use optical fiber -- it's a great way to break up ground loops (earth loops) caused by the cable ground lines.
                                                                ____________

                                                                Robert Ribbeck
                                                                Joined: Jun 7 02
                                                                Posts: 196
                                                                Credit: 1,745,876
                                                                RAC: 2,628
                                                                United States
                                                                Message 656502 - Posted 8 Oct 2007 23:05:08 UTC - in response to Message 656267.

                                                                  Have you recently (within the last 2 months) shut that machine down for any reason? (maintenance, a reboot, anything?) That would cause the "Computer on" stat to fall below 100%...



                                                                  It's been on 24/7.
                                                                  ____________

                                                                  Robert Ribbeck
                                                                  Joined: Jun 7 02
                                                                  Posts: 196
                                                                  Credit: 1,745,876
                                                                  RAC: 2,628
                                                                  United States
                                                                  Message 656507 - Posted 8 Oct 2007 23:11:46 UTC - in response to Message 656252.

                                                                    Your WUs will finish in time. But Berkeley thinks that the new WUs Berkeley wants to send to you won't be reported in time. At the moment, VHAR workunits with a deadline of about 8 days are sent out. Some of your machines have average turnaround times of 8 to 10 days, which are too long to report the WUs in time. That's the reason why Boinc blocks you from getting new work.

                                                                    Regards,
                                                                    Carsten


                                                                    That was from a machine that returns a result in about 8 hours. The whole queue
                                                                    would be returned in 8-10 days.

                                                                    ____________

                                                                    The Grinch
                                                                    Volunteer tester
                                                                    Joined: Jul 21 99
                                                                    Posts: 6
                                                                    Credit: 4,702,911
                                                                    RAC: 11,356
                                                                    Germany
                                                                    Message 656634 - Posted 9 Oct 2007 6:55:27 UTC

                                                                      The SETI@home server is currently very unstable!
                                                                      What's the matter?

                                                                      Ned Ludd
                                                                      Volunteer tester
                                                                      Joined: Apr 3 99
                                                                      Posts: 7693
                                                                      Credit: 264,450
                                                                      RAC: 331
                                                                      United States
                                                                      Message 656831 - Posted 9 Oct 2007 20:04:42 UTC - in response to Message 656634.

                                                                        The SETI@home server is currently very unstable!
                                                                        What's the matter?

                                                                        SETI@Home runs on old, hand-me-down hardware. That's why BOINC is designed to tolerate outages.

                                                                        You can:

                                                                        • Set BOINC to cache more work
                                                                        • Connect to more than one BOINC project so you'll always have work from someone
                                                                        • Donate some cash so SETI can get better servers.



                                                                        It is amazing how well things work given the size of the project and the project funding.
                                                                        ____________

                                                                        Copyright © 2009 University of California