mork
Author | Message |
---|---|
Jeff Cobb Joined: 1 Mar 99 Posts: 122 Credit: 40,367 RAC: 0 |
I'm starting a thread to let people know what's going on with the mork (our BOINC DB server) issue. As most of you know, mork will sometimes hang, requiring a power cycle to reboot. There are no footprints as to what causes this, so we strongly suspect hardware. Mork has a sister machine (mindy, of course) that never really worked (both are donated, used hardware). So mindy is mork's parts machine. This is a little dicey because we don't know why mindy did not work. The RAM in these machines is arranged on 4 daughter boards. Last week we swapped all four of mindy's identically populated memory boards into mork. But at least one of the "new" sticks was bad, because mork then showed differing amounts of memory across subsequent boots. So we returned mork's original memory and ran the first three memtest tests. They showed no errors. The final several tests are very time consuming and we may or may not do them, as mork's OS is down for these tests. Today, we swapped mindy's two power supplies into mork. This is not because we strongly suspect the power supplies but because this is an easy exercise. If mork hangs again, we are likely to replace the entire machine. Further component testing is becoming too cumbersome and time consuming. And after all, we now have the funds to do this because of your very generous donations (thank you!!!). |
Jord Joined: 9 Jun 99 Posts: 15184 Credit: 4,362,181 RAC: 3 |
Fun, those gremlins. Threaten with heavy bodily (hardwarial) harm? ;-) |
Jack Zhang Joined: 2 Jul 06 Posts: 206 Credit: 6,142,449 RAC: 0 |
General tip: memtest passing is not the whole story. Memory timing settings that are too tight can also cause I/O errors that memtest does not detect. From overclocking experience, rated timings are not necessarily stable. What if Fiction was Fact and Fact was Fiction and vice versa? |
soft^spirit Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 |
There could be "reseat to clear" memory issues, and they MIGHT not come back. Also suspect are electronic "disks" (non-physical), as well as many other possibilities. But it sounds like you proved either memory or BIOS issues on mindy. Might be worth taking another look? Janice |
Gary Charpentier Joined: 25 Dec 00 Posts: 30932 Credit: 53,134,872 RAC: 32 |
Thanks for the update. Much appreciated. Sounds like an experience I had a long time ago with a memory test. Ran it and it said every chip was good. Re-ran it and every chip was a failure. Knew right then and there they were all 100% good. The problem was elsewhere but still in the memory circuits. It turned out to be a broken trace on the motherboard. Agree that with cash on hand it is not worth chasing this down further, but it might be worth it after mork is replaced, to have a box for something else. |
zoom3+1=4 Joined: 30 Nov 03 Posts: 66215 Credit: 55,293,173 RAC: 49 |
Daughter boards, yeah, I remember something like that on the Amiga 1000 and its graphics/chip RAM. As boards went they were OK; it was the pins coming up through the daughter board from the motherboard that were the problem. Needless to say, when the computer worked it worked, but those contacts between the two boards could cause problems. Good luck Jeff and keep up the good work. Savoir-Faire is everywhere! The T1 Trust, T1 Class 4-4-4-4 #5550, America's First HST |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
Jeff... Thanks so much for the update on mork. Best of luck on Friday when you crank back up and he will be heavily stressed again. Meow meow. "Time is simply the mechanism that keeps everything from happening all at once." |
KWSN THE Holy Hand Grenade! Joined: 20 Dec 05 Posts: 3187 Credit: 57,163,290 RAC: 0 |
Jeff... Agreed, but how about cranking up Beta Friday? NTM a stats export? (even a single would help!) Hello, from Albany, CA!... |
Byron Leigh Hatch @ team Carl Sagan Joined: 5 Jul 99 Posts: 4548 Credit: 35,667,570 RAC: 4 |
Thank you, Jeff, for the update. Best wishes, Byron |
S@NL - Eesger - www.knoop.nl Joined: 7 Oct 01 Posts: 385 Credit: 50,200,038 RAC: 0 |
As always thanx for the update! ".. NTM a stats export? (even a single would help!)" I'm hoping for a stats update also. I've made my system so that it can cope with a hiccup in stats exports, even across the change of a month, but not two, so I really hope you guys can give the export script a go this month. Could you tell me if you can make it this month? If not, I'd really like to know; then I will need to do some thinking & programming to make my stats cope with it. Thanx very much in advance for your reply. The SETI@Home Gauntlet 2012 april 16 - 30| info / chat | STATS |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
I love weird H/W issues. We were using an old Compaq ProLiant 5500 server at work that was handed down from our IS department. One day it started randomly crashing/rebooting. Sometimes it would stay up for a minute, sometimes for 6-7 days. We tried reinstalling the OS several times, swapping out both of its redundant PSUs, pulling out each of the 4 CPUs & running the system with only 1 at a time, swapping out memory riser boards & RAM DIMMs, & swapping out SCSI controllers & drives. Finally, after all of that & several months of troubleshooting, we gave up. I installed BOINC on the system & let it run to see what would happen. Turns out that it ran 24/7 without crashing/rebooting while BOINC was running the CPUs full tilt. If BOINC was closed the random reboots would start again. So we left BOINC on and ran it as part of our infrastructure for 15 more months w/o a single issue. Recently I retired it from use as we got some newer, more powerful, and MUCH quieter machines. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
kepan Joined: 17 Sep 99 Posts: 7 Credit: 27,442,770 RAC: 0 |
Does anyone know why BOINCstats does not update the score for SETI@home? |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
"Does anyone know why BOINCstats does not update the score for SETI@home?" It has updated my total credits, but not the graphs. Not sure if that will happen today yet, or sort itself out with tomorrow's update. "Time is simply the mechanism that keeps everything from happening all at once." |
Ruopp Joined: 18 May 99 Posts: 2 Credit: 3,793,074 RAC: 0 |
"Does anyone know why BOINCstats does not update the score for SETI@home?" Extracted from the BOINCstats FAQ: How often is BOINCstats updated? BOINCstats checks for XML updates every two hours and, when available, downloads them, reads the content into the database and updates the credits and ranks. The numbers from this update are used to display current credits and ranks for the stats only. The incremental updates take between 15 minutes and one hour to complete. At 15:00 GMT each day all new info from the XML files is imported into the BOINCstats database. New users/teams/countries are inserted at this point, and daily/weekly/monthly numbers are calculated. When there is no new XML file for more than a day, the stats will show zero credits for those days. The numbers from this update are used to display the numbers on the front page and the detailed stats pages. The daily update takes about 2.5 hours to complete. The same update, but then just for hosts, runs each day at 1:00 GMT, and takes about five hours to complete. Only users, teams, hosts and countries with at least one (1) total credit are listed! When an update is running, there is no check for new XML files until the update is finished. This is why the time since last update can be more than one hour. To date, BOINCstats has never failed to run its daily update, which means: when new credit is granted and the XML output by the project is OK, you'll get your credit on BOINCstats within 25 hours. |
S@NL - Eesger - www.knoop.nl Joined: 7 Oct 01 Posts: 385 Credit: 50,200,038 RAC: 0 |
"As always thanx for the update!" Yay! The stats import is running, thanx guys (and girls?). You made me a happy man ;) (and all/most stats lovers will get their stats update shortly!) The SETI@Home Gauntlet 2012 april 16 - 30| info / chat | STATS |
Ray_GTI-R Joined: 17 May 99 Posts: 56 Credit: 276,906 RAC: 0 |
"it ran 24/7 without crashing/rebooting while BOINC was running the CPUs full tilt. If BOINC was closed the random reboots would start again." This exact thing happened to me once, running SetiClassic. It turned out to be a CPU that was on the brink of failing. HTH, Ray The difference between 0 and 1 is greater than the difference between 1 and 1,000,000 |
J. Mileski Joined: 9 Jun 02 Posts: 632 Credit: 172,116,532 RAC: 572 |
With the new servers on the way, I was wondering about our 3 day outage. I am under the impression that 2 of the 3 days are for the Near-Time Persistency Checker, because it needs exclusive database access. I was wondering if a third database copy could be created and used as the Near-Time Persistency Checker database? Log the changes; then, on the database maintenance day, use the log to apply those changes to the master database, and resynchronize the 3rd DB with the new results from the week. I hope I explained my idea well enough. I am a truck driver and only dabble in computers. I like to assemble components and see if I can make them work. On edit, I was wondering if mork is stable enough to take on this role. |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
"With the new servers on the way, I was wondering about our 3 day outage. I am under the impression that 2 of the 3 days are for the Near-Time Persistency Checker, because it needs exclusive database access. I was wondering if a third database copy could be created and used as the Near-Time Persistency Checker database? Log the changes; then, on the database maintenance day, use the log to apply those changes to the master database, and resynchronize the 3rd DB with the new results from the week. I hope I explained my idea well enough. I am a truck driver and only dabble in computers. I like to assemble components and see if I can make them work." Dunno.....and I'm not gonna bother Eric with such questions until the new servers are in the closet and producing. There was conjecture long ago that a certain setup would allow continuous uptime of the project on our side, whilst allowing for proper backups on the fly. The new science database will certainly have enough horsepower to allow much of the heavy duty science work to be done without much interference to the daily routine of the project. We shall have to wait until everything is up and humming before those questions can be entertained. "Time is simply the mechanism that keeps everything from happening all at once." |
soft^spirit Joined: 18 May 99 Posts: 6497 Credit: 34,134,168 RAC: 0 |
Just curious to hear from the lab.. do you guys think the bubble gum will hold? Or do you need more black tape? Janice |
kittyman Joined: 9 Jul 00 Posts: 51477 Credit: 1,018,363,574 RAC: 1,004 |
"Just curious to hear from the lab.. do you guys think the bubble gum will hold? Or do you need more black tape?" Eric told me he was going to try to have a look at why the downloads seem so bottled up and the inbound bandwidth is so high. Not sure when. "Time is simply the mechanism that keeps everything from happening all at once." |
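
The BOINCstats FAQ excerpt quoted by Ruopp earlier in the thread describes a fixed daily import at 15:00 GMT. As a rough illustration of that timing only, here is a minimal Python sketch that estimates when a freshly exported XML file would first be picked up by that daily import. The function and constant names are hypothetical, the sketch ignores the two-hourly incremental checks and any delay on the project side, and it is not BOINCstats' actual code.

```python
from datetime import datetime, timedelta, timezone

# Taken from the quoted FAQ text: the daily import starts at 15:00 GMT.
DAILY_IMPORT_HOUR_UTC = 15


def _as_utc(ts: datetime) -> datetime:
    """Normalize a timestamp to UTC. Assumption: naive values are already UTC."""
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def next_daily_import(export_time: datetime) -> datetime:
    """Return the start of the first daily import at or after export_time."""
    t = _as_utc(export_time)
    candidate = t.replace(
        hour=DAILY_IMPORT_HOUR_UTC, minute=0, second=0, microsecond=0
    )
    if candidate < t:
        # Today's import has already started, so wait for tomorrow's.
        candidate += timedelta(days=1)
    return candidate


def wait_for_import(export_time: datetime) -> timedelta:
    """How long an export waits before the next daily import begins."""
    return next_daily_import(export_time) - _as_utc(export_time)


if __name__ == "__main__":
    export = datetime(2012, 4, 20, 15, 1, tzinfo=timezone.utc)
    print(next_daily_import(export), wait_for_import(export))
```

Under these assumptions the wait for the import to start is always less than 24 hours; an export that lands one minute after 15:00 GMT waits 23 hours 59 minutes. The sketch does not try to reproduce the FAQ's 25-hour figure exactly.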