1)
Message boards :
Number crunching :
Can not report
(Message 230593)
Posted 13 Jan 2006
Post: 1/13/2006 7:28:55 AM||Attempting to send data to [setiboinc.ssl.berkeley.edu] failed [failed sending data to the peer]

I found an old entry in the Wiki defining a -103 as a write error, but the important thing is that it looks like the error is on the server side, not your client. Looking through my logs, I've got several similar messages starting last night, with the last one a couple of hours ago.

Happy crunching, Brian
2)
Message boards :
Number crunching :
Fourth Result Wasted
(Message 229820)
Posted 11 Jan 2006
Post: The issue I had with Einstein is that the WUs associated with a data file are all done by the same four PCs. So if you are the slowest PC of the four and return the result last (fourth), you are stuck in a downward cycle, as you will not be sent the next WU until you return a result.

Einstein seems to have stopped sending out the 4th unit unless it's needed. My outstanding Einstein units only have a 4th unit if there was a problem--download error, client error on computation, etc. On the pending ones, it looks like only 3 were sent out and validation is still done on 3.

Happy crunching, Brian
3)
Message boards :
Number crunching :
LTD / STD
(Message 227555)
Posted 7 Jan 2006
Post: I'd like to thank everyone who replied. I guess it's the LTD that needs to get closer to zero. I'll let it continue to run the way that it is for a while longer. I don't really want to reset my debt values, or to reset the other two projects; this has just been going on a long time and I was getting concerned. So as long as my LTD keeps progressing toward zero, I'll let it keep doing its thing.

If you're running Windows, BoincDV is a handy little utility that lets you view the short-term and long-term debt for each project. And you're right, "let it run and let the scheduler do its thing" is probably the best course here. Trying to micromanage the scheduler usually just makes its flaky behavior worse.

Happy crunching!
4)
Message boards :
Number crunching :
LTD / STD
(Message 227194)
Posted 6 Jan 2006
Post: yes but what does that mean, that description is worse than stereo instructions.

When a project has the CPU, its long-term debt decreases (moves in the negative direction); when a project is waiting for another project to finish, its debt increases (moves in the positive direction). The debt is the amount of CPU time owed *to* that project. Each time the scheduler switches between projects, debt is recalculated for all projects, and the numbers are then adjusted so that the total of all debt across all projects is 0.

If a project gets more than its resource share (earliest-deadline-first to make a deadline, other projects not having work, etc.) its LTD can go below 0. If its LTD goes farther below 0 than 1 time slice (i.e. below -3600 for the default settings), then that project is prevented from downloading new work. Work it has already downloaded will continue to be processed as usual--round-robin if it's not under deadline pressure, earliest-deadline-first if it is. The project can download new work when its long-term debt returns to something close to 0.

Example: 3 projects, equal resource share. Each should get the CPU about 1/3 of the time. In normal round-robin mode each project gets the CPU for one hour, in rotation. You could think of it this way: when it gets the CPU, project A runs for 1 hour. Because it was only 'entitled' to 1/3 of that hour, it owes 2/3 of an hour--1/3 to each of the other 2 projects. The next hour, project B runs, and A sits idle--1/3 of an hour repaid. The next hour C runs, and again A sits idle, repaying another 1/3 hour of long-term debt. At the end of 3 hours, A has run for 1 hour, accumulating 2/3 hour of debt, then paid it back over the next 2 hours, and it's A's turn again.

Now suppose A had a lot of work on hand that was under deadline pressure and ran for 60 hours straight. A now owes 40 hours (2/3 of 60 hours) of long-term debt. A will not be allowed to download more work until it has paid that debt off. Every hour that A sits idle pays off 1/3 hour of long-term debt. So A's debt will start out much less than 0, because A has used more than its share, and will gradually return to 0 as the debt is paid off. B and C, meanwhile, will have large positive debts--they're both owed quite a bit, because they haven't had the CPU at all for 60 hours. Their debts will gradually come down as they divide the CPU between them 50/50 for a while.

If the projects have different resource shares the numbers get awkward, but the same principle applies. If the 2 projects still have LTD that's much less than 0, they're still paying off debt.

Hope this helps, Brian
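The debt bookkeeping described in the post above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the example, not the BOINC client's actual scheduler code: the function names `run_interval` and `may_fetch_work`, the dictionary layout, and the use of exact fractions are all assumptions made for the sketch. The 3600-second time slice is the default mentioned in the post.

```python
from fractions import Fraction

TIME_SLICE = 3600  # seconds; the default time slice mentioned in the post


def run_interval(debts, shares, running, seconds):
    """Update long-term debt (in seconds) after `running` had the CPU.

    Hypothetical helper: every project is credited with its entitled share
    of the interval, and the running project is charged for all of it.
    Debts are then shifted so that they total zero.
    """
    total_share = sum(shares.values())
    for project in debts:
        entitled = Fraction(seconds * shares[project], total_share)
        used = seconds if project == running else 0
        debts[project] += entitled - used
    mean = sum(debts.values()) / len(debts)
    for project in debts:
        debts[project] -= mean
    return debts


def may_fetch_work(debt, time_slice=TIME_SLICE):
    """A project more than one time slice below zero may not fetch work."""
    return debt >= -time_slice


if __name__ == "__main__":
    hour = 3600
    shares = {"A": 1, "B": 1, "C": 1}
    debts = {"A": 0, "B": 0, "C": 0}
    # Project A runs for 60 hours straight under deadline pressure.
    run_interval(debts, shares, "A", 60 * hour)
    for name, debt in debts.items():
        print(name, float(debt) / hour, "hours; may fetch:", may_fetch_work(debt))
```

With equal shares, the 60-hour run leaves A at -40 hours and B and C at +20 hours each, matching the example. The sketch blocks work fetch with a single threshold; the post says fetching resumes only once the debt is back near 0, so the real behavior likely differs in that detail.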
5)
Message boards :
Number crunching :
Wow!
(Message 226844)
Posted 6 Jan 2006
Post: I'm seeing "Wow!" now.

Hmmm...I'm still not getting through. Oh well, plenty of work on hand; if it's still not connecting by tomorrow I'll start being concerned.
6)
Message boards :
Number crunching :
Unless somebody can explain what is going on, I quit!
(Message 226180)
Posted 5 Jan 2006
Post: Waiting for setiathome.berkeley.edu

Check the home page and the technical news. The data-driven web pages (e.g. these fora) are still running, but several other services, including upload/download, aren't.
7)
Message boards :
Number crunching :
The Future of BOINC? (BOINC vs LCG)
(Message 226102)
Posted 5 Jan 2006
Post: I had heard that they were impressed with the power of BOINC, especially when compared to all the rental time on supercomputers they've been paying for.

I guess each project knows best what their needs are. And I wouldn't be at all surprised if they have BOINC do as much of it as possible...but 10 petabytes is a LOT of data. They may still need to rent time or come up with some other resource. Can they get the equivalent of 100K CPUs doing LHC full time from BOINC alone?
8)
Message boards :
Number crunching :
Not Requesting New Work Units
(Message 225527)
Posted 4 Jan 2006
Post: Btw.. one more thing... I don't have any work units to work on.. at least no work units on any besides Climate Predictor which is only set to a 22% share of resources..

So the other projects have 78% of the resources between them? If they've been getting more, for whatever reason, it's possible they all owe some time to Climate. I've got something similar on my machine...it downloads a bunch of Einstein, works round-robin for a while, goes into EDF to finish it off (5-day work queue), then Einstein has to wait while CPDN has sole use of the CPU for a day or so until long-term debt evens out. Give it a couple of days to straighten out.
9)
Message boards :
Number crunching :
Not Requesting New Work Units
(Message 225526)
Posted 4 Jan 2006
Post: I currently am actively participating in 4 projects... 3 of which are acting like SETI.. the 4th.. Climate Predictor is working fine..

If for some reason they've been getting more than their resource share lately, they'll hold off on downloading more until they've paid off the debt. Could that be it? If so, once one of them has waited long enough, it'll request more work and off you go. Each will come back 'online' one at a time as the long-term debt falls. Sometimes working out the debt takes a while....

Cheers, Brian
10)
Message boards :
Number crunching :
Downloads Fixed I think
(Message 224774)
Posted 2 Jan 2006
Post: Well just got this message maybe backed up for a while ???

After any sort of outage, or when any fix is applied to a problem, there's usually a several-hour period of the problem gradually clearing up before things are back up to full speed. So yes, even if today's extra outage fixed everything, there'd still be several hours of gradually improving performance rather than a sudden sharp rise.

Happy crunching, Brian
11)
Message boards :
Number crunching :
WU's Finishing way early
(Message 224769)
Posted 2 Jan 2006
Post: It seems there were a lot of them sent out overnight. It says for the top 3 WUs that:

It means there was some sort of electromagnetic noise picked up, and the number of signals is so ridiculously high it's obviously a terrestrial artifact, so there's no point in further analysis. You'll get credit for the time you spent crunching on it, but it'll probably be a fraction of a cobblestone.

Cheers, Brian
12)
Message boards :
Number crunching :
Loss of ADSL & back to dial up - how to run general preferences?
(Message 224239)
Posted 1 Jan 2006
Post: For the last 10 days I have had intermittent ADSL connections (router losing synchronisation regularly with DSLAM in local, distant, rural UK exchange). More importantly during the last 50+ hours ADSL connection has ceased completely.

I'd set the queue to a day or so, so you've got a cache of work on each machine. (If you set it much longer than that, do it in stages...set it to 1 day, then tomorrow up it to 2 days, etc. Sometimes the scheduler downloads too much if you jump it to a large value all at once.) Website: Your Account, Preferences, General Preferences, Connect about every X days.

Some people are reporting download issues at the moment, so don't panic if you don't fill the cache completely on the first try.

Set your preferences (BOINC client, Commands menu) to "Network activity based on preferences" so it doesn't try to connect when there's no connection available. When you do have a connection, hit "Update" on each machine. (You can use the BOINC manager to remotely update each of the machines from whichever one you're at.)

Cheers, Brian
13)
Message boards :
Number crunching :
Average Credit: 1,498 (12/31/2005) on 1 CPU!
(Message 224027)
Posted 1 Jan 2006
Post: But why is it claiming only about 5 credits per WU?

Because claimed credit is based on benchmarked CPU speed * crunch time. A machine that crunches a unit very quickly claims less credit. Also, there are factors such as memory speed and L2 cache that affect crunch times that the benchmark doesn't measure. Finally, Linux machines will routinely have lower benchmark scores than Windows machines, even on the same hardware.

The enhanced science application, due for release Real Soon Now (tm), will use an actual operations count to determine credit, rather than estimating based on a flawed benchmark.

Happy crunching, Brian
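To make the "benchmark * crunch time" point concrete, here is a minimal Python sketch. The function `claimed_credit`, its `scale` constant, and the sample numbers are hypothetical and are not taken from BOINC; the sketch only shows that, with the benchmark score held fixed, a shorter crunch time produces a smaller claim.

```python
def claimed_credit(benchmark_score, cpu_seconds, scale=1.0):
    """Hypothetical claim: benchmarked speed times crunch time.

    `scale` stands in for whatever constant converts the product into
    credits; its real value is not given in the post.
    """
    return benchmark_score * cpu_seconds * scale


if __name__ == "__main__":
    # Two made-up hosts with the same benchmark score. The second has faster
    # memory and a larger L2 cache, which the benchmark does not measure,
    # so it finishes the same work unit in half the time.
    slow_host = claimed_credit(benchmark_score=1000, cpu_seconds=7200)
    fast_host = claimed_credit(benchmark_score=1000, cpu_seconds=3600)
    print("fast host claims less:", fast_host < slow_host)
```

The same function also shows the Linux effect mentioned in the post: lowering `benchmark_score` while keeping `cpu_seconds` the same lowers the claim.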
14)
Message boards :
Number crunching :
Question about Avg-credit
(Message 223177)
Posted 30 Dec 2005
Post: I do not understand the avg-credit. Why are the others higher than me? What is the criterion?

Wiki: Recent Average Credit

Basically, it's a running average of how much you've turned in lately. As a rough analogy, RAC is your speedometer (how fast you're going), Total Credit is your odometer (total distance traveled). Because of the vagaries of granted credit and the formula used to calculate it, small fluctuations in RAC happen all the time. Unless there's a sudden sharp drop or sharp rise, minor variations are nothing to worry about.

Happy crunching, Brian
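The speedometer analogy can be illustrated with an exponentially decaying average. The Python sketch below is an illustration under stated assumptions, not the formula from the Wiki page: the function name `update_rac` and the one-week half-life are assumed for the example.

```python
HALF_LIFE_DAYS = 7.0  # assumed half-life; check the Wiki page for the real value


def update_rac(rac, days_elapsed, credit_granted):
    """Blend the old average with the credit rate over the elapsed period.

    Hypothetical helper: the old average decays by half every
    HALF_LIFE_DAYS, and the remainder is filled in with the recent
    credit-per-day rate.
    """
    if days_elapsed <= 0:
        return rac
    weight = 0.5 ** (days_elapsed / HALF_LIFE_DAYS)
    recent_rate = credit_granted / days_elapsed
    return rac * weight + (1.0 - weight) * recent_rate


if __name__ == "__main__":
    rac = 0.0
    # Turn in 100 credits a day for ten weeks: the average climbs toward 100.
    for _ in range(70):
        rac = update_rac(rac, days_elapsed=1, credit_granted=100)
    print("after steady crunching:", round(rac, 1))
    # Stop crunching for a week: the average falls by half.
    print("after an idle week:", round(update_rac(rac, 7, 0), 1))
```

Under these assumptions, a steady contributor's average settles at the daily rate, and small day-to-day differences in granted credit only nudge it, which is why minor fluctuations are nothing to worry about.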
15)
Message boards :
Number crunching :
Multiple CPU / Optimized Client / Granted credit
(Message 222800)
Posted 29 Dec 2005
Post: Has anyone noticed how badly most of the new multiple-CPU hosts, and the hosts running a third-party "optimized client", are scoring?

One possibility is to download an "optimized" core client as well; however, this will inflate your claims on ALL projects, not just SETI, and many people consider this at least borderline cheating.

Another workaround is to run a longer work queue, so the work you return is less likely to be part of the validation quorum and your granted credit is based on the other hosts' claims. (This is what I do on my Linux box that runs an optimized science app, and has a typical claim of 4.5 - 5.5 per WU.)

Finally, the _enhanced application uses an actual operations count rather than relying on the flawed benchmarks, so the problem may largely go away once that's in place. The enhanced application is supposed to be released Real Soon Now (tm). So it may be a temporary problem anyway.

Cheers, Brian
16)
Message boards :
Number crunching :
Running out of work
(Message 222560)
Posted 29 Dec 2005
Post: That may not be true in this case. The recovery from the outage went outstandingly well for the first two hours. My machines had no trouble at all uploading results and reporting results. A few even got validated quickly.

If it's cleared, then no problem; these sorts of transient incidents are fairly common. The regular outage will produce the "can't connect" messages, but with the other messages that were posted ("system I/O"), there's obviously something else going on. Nothing major, let's hope.

If it continues or recurs, posting the error messages (and a few lines of the log before the error message) can help troubleshoot what's happening.

Happy crunching!
17)
Message boards :
Number crunching :
Running out of work
(Message 222540)
Posted 29 Dec 2005
Post: Are others seeing "no work" messages in response to work requests? Or 12/28/2005 7:32:35 PM||Couldn't connect to hostname [setiboincdata.ssl.berkeley.edu] messages? I didn't see similar posts in my quick skim of thread titles. Wanted to see what others were experiencing before looking further on my end, as otherwise my network and servers seem to be normal.

There's an outage every Wednesday while maintenance is done on the database. And there's always a congested period for a few hours after it comes back up as the accumulated work tries to get done. So some "can't connect" messages during those periods aren't unusual. If it's still going on several hours from now or tomorrow, it's worth troubleshooting, but otherwise it's probably outage-related.

Cheers, Brian
18)
Message boards :
Number crunching :
team problem -stats all wrong
(Message 222409)
Posted 28 Dec 2005
Post: in team members email list It shows me about 21 members. In my account control panel - it shows 11 members

Main page: December 27, 2005 We have temporarily turned off the counting of workunits and results in various states (in progress, waiting for validation, etc) in order to give the database cleanup process more resources. Until we turn the counts back on, several of the numbers on the server status page will not be up to date.

============

I wouldn't panic until a day or two after everything gets turned back on.

[Edit: on re-reading, I'm not so sure this is related to the server status. The validators etc. are on. The much larger number of people signed up compared to active members is, alas, a fact of life. Are the new members showing up on the third-party stats sites? (I know, they pull their data from SETI, I'm just wondering if it's a data problem or a webpage problem.)]
19)
Message boards :
Number crunching :
100% CPU usage
(Message 221387)
Posted 26 Dec 2005
Post: If it's a temperature problem, I find that opening the case and pointing a fan inside is a cheap and easy way to fix it.

The chip is designed to run 100%, 24/7, for years. If it's overheating, then it's an overheating problem, not an overworked CPU problem. The most likely culprit is inadequate airflow. Something as simple as opening the case and using some canned air to knock the dust off may be all it takes.

There are programs that work with the OS to limit the amount of CPU time any one process gets (for some reason the name "Threadmaster" comes to mind), but I have absolutely no experience with them, so I can't comment any further other than to say I've heard they exist.
20)
Message boards :
Number crunching :
What's more important to faster SETI processing times? Floating point, or Integer ops? TIA!
(Message 220702)
Posted 24 Dec 2005
Post: I would recommend that a memory speed test be added to the suite of benchmarks used by BOINC.

Or better yet, ditch them altogether and replace them with something like, say, a count of how many operations are actually being done. (This is the plan for the _enhanced client, I understand.)

Benchmarking has a long and sordid history, and the main lesson to take away from it is that if one machine has a higher benchmark than another, all it really tells you is that it runs the benchmark suite faster. The relation between Whetstone/Dhrystone scores and actual in-the-field performance is whimsical, at best. The only way to reliably estimate how long a particular hardware/OS will take to crunch a work unit is to see how long it's taken to crunch similar workunits in the past.
©2023 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.