2 to avoid

Message boards : Number crunching : 2 to avoid
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 643828 - Posted: 18 Sep 2007, 23:00:22 UTC

I was just about to start thinking about celebrating my RAC going back into 4 figures when I got teamed up with the following pair of jokers:

The first is running core client 3.20, is crunching and returning WUs that validate but is claiming zero credits. A sample of his stderr out is

stderr out <core_client_version>3.20</core_client_version>
<stderr_txt>
no start tag in app init data
Can't parse init data file - running in standalone mode
no start tag in app init data
Can't parse init data file - running in standalone mode
setiathome_enhanced 5.27 DevC++/MinGW

Work Unit Info:
...............
WU true angle range is : 0.439020
Optimal function choices:
-----------------------------------------------------
name
-----------------------------------------------------
v_BaseLineSmooth (no other)
v_vGetPowerSpectrumUnrolled2 0.00072 0.00000
sse1_ChirpData_ak 0.02369 0.00000
v_vTranspose4x8ntw 0.01229 0.00000
AK SSE folding 0.00221 0.00000

Flopcounter: 15684420151540.854000

Spike count: 0
Pulse count: 0
Triplet count: 2
Gaussian count: 1

</stderr_txt>

So that was 90 minutes crunching without credit (but the science was served, I console myself).

The second is not so immediately painful. It has 182 results showing on its Results pages and all the ones I have sampled have been -9's or have failed to validate. Almost all that have been completed have been granted a big round 0 credit and even most -9's have been sent out to a 3rd cruncher for validation.

Ah well - tomorrow is another day!!

F.
ID: 643828 · Report as offensive
Astro
Volunteer tester
Joined: 16 Apr 02
Posts: 8026
Credit: 600,015
RAC: 0
Message 643853 - Posted: 18 Sep 2007, 23:59:23 UTC

That's certainly strange. The minimum BOINC version is 4.19; anything earlier than that should not be getting downloads. I wonder if running in "standalone mode" makes a difference.
ID: 643853 · Report as offensive
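Editor's sketch: the minimum-version rule mentioned above can be illustrated with a simple comparison of dotted version strings. This is only an illustration, not the scheduler's actual code; `parse_version`, `MIN_CLIENT_VERSION` and `may_receive_work` are hypothetical names, and only the 4.19 minimum and the 3.20 client come from the thread.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.19' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimum client version quoted in the post above.
MIN_CLIENT_VERSION = parse_version("4.19")


def may_receive_work(client_version):
    """Hypothetical check: True if the client meets the minimum version."""
    return parse_version(client_version) >= MIN_CLIENT_VERSION


print(may_receive_work("3.20"))  # the host from the first post
print(may_receive_work("4.19"))
```

Comparing tuples of integers rather than raw strings or floats keeps "4.2" and "4.19" in the right order. Under a rule like this the 3.20 host would be refused work, which is why its appearance in the thread is surprising.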
BOINC*Zappattazz*Synergy
Volunteer tester

Joined: 27 May 99
Posts: 2
Credit: 58,624
RAC: 0
United States
Message 643873 - Posted: 19 Sep 2007, 0:26:39 UTC

Fred W: A cursory search shows the first user can receive private messages. Have you tried to contact him with your concerns?
ID: 643873 · Report as offensive
Sarge
Volunteer tester

Joined: 25 Aug 99
Posts: 12273
Credit: 8,569,109
RAC: 79
United States
Message 643917 - Posted: 19 Sep 2007, 2:36:42 UTC - in response to Message 643873.  

If taking this route, do it nicely!
Just inform him/her about the newer versions. I think going the route of "you're holding back my RAC" would be bad. Instead, besides pointing out the newer versions, indicate how updating will do a better job for SETI.
So on and so forth.
My humble ideas, submitted.
Capitalize on this good fortune, one word can bring you round ... changes.
ID: 643917 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 643962 - Posted: 19 Sep 2007, 3:56:30 UTC - in response to Message 643853.  

There was another 3x'er which also indicated it was running in 'Standalone' mode in the stderr section, which someone posted about when we first went to 2/2. And I agree, I was under the impression 3x wasn't allowed anymore.

As far as the second host goes, this is more troubling (IMHO) since the host is running a current CC and the stock app but is Dash 9'ing when the wingmen don't.

I suppose it could be something as simple as dirt buildup inside the case, a bad/weak fan, etc. We can't tell if it's OC'ed or not, but this isn't the first instance where this scenario has played out lately, and some of those cases were reported to have been checked for overheating and weren't being OC'ed. In any event, it seems to need some help since something isn't right.

Alinator
ID: 643962 · Report as offensive
Christoph
Volunteer tester

Joined: 21 Apr 03
Posts: 76
Credit: 355,173
RAC: 0
Germany
Message 644470 - Posted: 19 Sep 2007, 20:31:30 UTC

Maybe the project should send out an email: "Please update to the latest BOINC version"

In my opinion that will certainly be due once 6.xx is out in the open, though I have no idea how long that will be.

Happy crunching, Christoph
Christoph
ID: 644470 · Report as offensive
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 644522 - Posted: 19 Sep 2007, 22:26:28 UTC
Last modified: 19 Sep 2007, 22:28:59 UTC

For example:
http://setiathome.berkeley.edu/workunit.php?wuid=158226644
One host spent some time on the calculation, the second spent zero time.
Both got zero credit. Does that mean the results were not validated?
If so, why was the work not reissued? One result was declared canonical, but it probably can't be validated against a zero-CPU-time result in any way.
Is something wrong? How is this possible?
P.S. Another sample (the first has probably already been purged from the database):
http://setiathome.berkeley.edu/workunit.php?wuid=156609326
ID: 644522 · Report as offensive
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 644534 - Posted: 19 Sep 2007, 22:58:55 UTC
Last modified: 19 Sep 2007, 22:59:38 UTC

No, they validated OK from a science POV. It means that the 3.20 host didn't return anything useful from a scoring POV, and since the lowest claim is what's granted, everybody gets a goose egg.

Alinator
ID: 644534 · Report as offensive
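Editor's sketch: the "lowest claim is granted" rule described above, reduced to a few lines. The function name `granted_credit` and the non-zero claim of 57.3 are made up for illustration; only the zero claim from the 3.20 host comes from the thread.

```python
def granted_credit(claims):
    """Grant every validated result the lowest credit claimed for the workunit."""
    return min(claims)


# One host claims zero (as the 3.20 client does), the other a hypothetical 57.3.
print(granted_credit([0.0, 57.3]))  # 0.0
```

With a two-result quorum, one zero claim is enough to zero out the credit for both hosts, even though both results validated.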
Josef W. Segur
Volunteer developer
Volunteer tester

Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 644538 - Posted: 19 Sep 2007, 23:00:58 UTC - in response to Message 644522.  

Validation has nothing to do with time, it's strictly done by comparing the result files and they do not even have the CPU time. Time is in the report to the Scheduler, assuming a version of BOINC which reports it.

All you're seeing is that the lower of two credit claims is granted.
Joe
ID: 644538 · Report as offensive
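Editor's sketch: one way to picture the point above, that validation looks only at the result contents and never at CPU time. The real validator is more involved than an equality test; the `Report` class, its field names and the exact-match comparison are all simplifying assumptions. The signal counts are the ones from the stderr sample in the first post, and 5400 seconds stands for the 90 minutes mentioned there.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical record of what one host returns for a workunit."""
    signals: dict          # contents of the result file (simplified to counts)
    cpu_time: float        # reported to the scheduler, not part of the result
    claimed_credit: float  # also reported separately from the result


def results_agree(a, b):
    """Compare only the result contents; CPU time and credit are ignored."""
    return a.signals == b.signals


counts = {"spike": 0, "pulse": 0, "triplet": 2, "gaussian": 1}
old_client = Report(signals=dict(counts), cpu_time=0.0, claimed_credit=0.0)
new_client = Report(signals=dict(counts), cpu_time=5400.0, claimed_credit=57.3)

print(results_agree(old_client, new_client))  # True
```

Under these assumptions a result reporting zero CPU time validates just as well as any other, and only the credit claim suffers.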
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 644580 - Posted: 20 Sep 2007, 0:04:56 UTC - in response to Message 644522.  
Last modified: 20 Sep 2007, 0:05:41 UTC

This was another of the 4 current hosts of the owner of the one that caused me to start this thread. He has run 524 hosts since Aug 2004 and amassed the grand total of 5345.91 credits; i.e. an average of about 10 credits / host. Of course, most have never gained any credit at all but one recently reached the dizzy heights of 650.53. All have reported an "Intel(R) Pentium(R) 4 CPU 2.80GHz Pentium" processor, so he doesn't like variety in his kit!
ID: 644580 · Report as offensive
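Editor's sketch: the average quoted above, worked through. It assumes the figure of 524 refers to the number of hosts, which the "credits / host" wording suggests; the variable names are illustrative.

```python
total_credit = 5345.91  # total credit quoted in the post
host_count = 524        # assumed to be the number of hosts run since Aug 2004

average = total_credit / host_count
print(f"{average:.1f} credits per host")  # 10.2 credits per host
```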
Alinator
Volunteer tester

Joined: 19 Apr 05
Posts: 4178
Credit: 4,647,982
RAC: 0
United States
Message 644600 - Posted: 20 Sep 2007, 0:35:44 UTC - in response to Message 644580.  
Last modified: 20 Sep 2007, 0:36:09 UTC


LOL...

Yep, I guess you can say this guy is truly in it only for the science. I can't think of any other reason to have a host claiming zero if I had any other choice about it! ;-)

Alinator
ID: 644600 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 644720 - Posted: 20 Sep 2007, 5:55:13 UTC - in response to Message 644600.  


At the rate he gets through hosts, it sounds more to me like demo screensavers.

F.
ID: 644720 · Report as offensive
JLDun
Volunteer tester
Joined: 21 Apr 06
Posts: 574
Credit: 196,101
RAC: 0
United States
Message 644735 - Posted: 20 Sep 2007, 6:59:58 UTC - in response to Message 644600.  


The true "set it and forget it" fan...

ID: 644735 · Report as offensive
Andy Lee Robinson
Joined: 8 Dec 05
Posts: 630
Credit: 59,973,836
RAC: 0
Hungary
Message 644763 - Posted: 20 Sep 2007, 9:13:57 UTC

http://setiathome.berkeley.edu/workunit.php?wuid=148308701

This is bizarre... still in my pending list - both machines completed in August and neither has received credit.
I wonder if this phenomenon has any bearing on everyone's RAC drop last month, beyond the credit adjustment for MB, and people seeming to leave in droves.

Timeouts of 6 to 8 weeks are just so OTT; 3 weeks should be more than enough.

I vote that if NO contact at all has been received from anyone within 2 weeks, then all their outstanding work should be returned to the pool so work can proceed. It's a simple heartbeat check to see if the machine is alive, and quite separate. Deadlines could be more generous if the user is proven to be working through the tasks.
ID: 644763 · Report as offensive
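Editor's sketch: the heartbeat idea proposed above, as a small function. This is not how the project's servers work; it only illustrates the suggestion. `SILENCE_LIMIT`, `tasks_to_reissue`, and the host and task names are hypothetical, and the two-week limit is the one proposed in the post.

```python
from datetime import datetime, timedelta

# Silence limit proposed in the post: no contact at all for two weeks.
SILENCE_LIMIT = timedelta(weeks=2)


def tasks_to_reissue(hosts, now):
    """Collect the outstanding tasks of every host that has gone silent.

    hosts maps a host id to (last_contact, list_of_task_ids).
    """
    reclaimed = []
    for host_id, (last_contact, tasks) in hosts.items():
        if now - last_contact > SILENCE_LIMIT:
            reclaimed.extend(tasks)
    return reclaimed


hosts = {
    "quiet_host": (datetime(2007, 8, 19), ["wu_a", "wu_b"]),
    "active_host": (datetime(2007, 9, 18), ["wu_c"]),
}
print(tasks_to_reissue(hosts, now=datetime(2007, 9, 20)))  # ['wu_a', 'wu_b']
```

The check depends only on the time of last contact, not on task deadlines, so a host that keeps reporting in could still be given the more generous deadlines the post suggests.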
Richard Haselgrove
Volunteer tester

Joined: 4 Jul 99
Posts: 14690
Credit: 200,643,578
RAC: 874
United Kingdom
Message 644772 - Posted: 20 Sep 2007, 9:41:18 UTC - in response to Message 644763.  

I vote that if NO contact at all has been received from anyone within 2 weeks, then all their outstanding work should be returned to the pool so work can proceed. It's a simple heartbeat check to see if the machine is alive, and quite separate. Deadlines could be more generous if the user is proven to be working through the tasks.

That's not a bad idea.

I'm paired up with this bundle of fun. Downloaded 240 WUs on 19 August, hasn't crunched a bean since then. Quite apart from the pending credit issue, that's an awful lot of WU storage space, and database recording space, clogged up for no benefit at all. I wonder what proportion of the 1.8 million results in "progress" come into the same category?

On the other hand, your example WU flags up a completely different problem, which perhaps Berkeley should be alerted to: something ought to have happened to that WU - validation, cancellation or re-issue. A server process clearly stumbled and missed it. Unfortunately, now that we no longer get an update on validation status, we can't tell from here which category it comes into. But if the servers can miss one result, the chances are that they missed a whole bundle of similar ones around the same time. I believe you are supposed to use one of the semi-official back channels when this happens.
ID: 644772 · Report as offensive
Andy Lee Robinson
Joined: 8 Dec 05
Posts: 630
Credit: 59,973,836
RAC: 0
Hungary
Message 644959 - Posted: 20 Sep 2007, 17:18:19 UTC - in response to Message 644772.  
Last modified: 20 Sep 2007, 17:18:35 UTC

I vote that if NO contact at all has been received from anyone within 2 weeks, then all their outstanding work should be returned to the pool so work can proceed. It's a simple heartbeat check to see if the machine is alive, and quite separate. Deadlines could be more generous if the user is proven to be working through the tasks.

That's not a bad idea.

Thanks - I think it makes sense too, however getting someone to do something about it is quite another thing!

I'm paired up with this bundle of fun.

Nasty! Sympathies... I think a lot of us are suffering too.


Downloaded 240 WUs on 19 August, hasn't crunched a bean since then. Quite apart from the pending credit issue, that's an awful lot of WU storage space, and database recording space, clogged up for no benefit at all. I wonder what proportion of the 1.8 million results in "progress" come into the same category?

On the other hand, your example WU flags up a completely different problem, which perhaps Berkeley should be alerted to: something ought to have happened to that WU - validation, cancellation or re-issue. A server process clearly stumbled and missed it. Unfortunately, now that we no longer get an update on validation status, we can't tell from here which category it comes into. But if the servers can miss one result, the chances are that they missed a whole bundle of similar ones around the same time.



Yes, I'm sure it isn't a unique case - the whole MB transition raised a lot of issues!

I believe you are supposed to use one of the semi-official back channels when this happens.


hehe... a very eloquent and polite note taped to a large housebrick... :-)
ID: 644959 · Report as offensive
Fred W
Volunteer tester

Joined: 13 Jun 99
Posts: 2524
Credit: 11,954,210
RAC: 0
United Kingdom
Message 644974 - Posted: 20 Sep 2007, 17:39:21 UTC - in response to Message 644959.  

Ahhhh!! Silly me. I totally misinterpreted "back channels" ;)
ID: 644974 · Report as offensive
Christoph
Volunteer tester

Joined: 21 Apr 03
Posts: 76
Credit: 355,173
RAC: 0
Germany
Message 645093 - Posted: 20 Sep 2007, 21:15:49 UTC

I'm sure there has already been a discussion somewhere about "auto-update" for BOINC? Maybe a delayed one, in case some really annoying bug was missed during beta testing (maybe not often, but it happens sometimes). OK, only future versions will do that, but once people stop getting work because of an outdated client, or receive an email asking them to update, my guess is it won't take too long. And then there would be no more problems with "set and forget". When important features are new, auto-update would prevent problems like the ones we have now. I'm sure somebody has already thought of something similar.

Happy crunching, Christoph
Christoph
ID: 645093 · Report as offensive
OzzFan
Volunteer tester
Joined: 9 Apr 02
Posts: 15691
Credit: 84,761,841
RAC: 28
United States
Message 645461 - Posted: 21 Sep 2007, 4:42:43 UTC - in response to Message 645093.  

The problem is people who are only allowed to use a specific, tested version of BOINC in an approved corporate environment. SysAdmins don't like it when software auto-updates itself, since that may introduce compatibility or stability issues (one BOINC version was recalled due to a bug, and this issue alone may scare some people off an auto-update feature).

Of course, I still say that the ability to turn off the "auto-update" feature for those environments would be a great workaround (but then that may become the cause for another issue if everyone is turning it off).
ID: 645461 · Report as offensive
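Editor's sketch: the two ideas discussed above, a delayed auto-update plus an opt-out for managed environments, combined into one decision function. BOINC has no such function as far as this thread shows; `should_auto_update`, the 14-day grace period and the dates are all hypothetical.

```python
from datetime import date, timedelta


def should_auto_update(released_on, today,
                       grace_period=timedelta(days=14), opted_out=False):
    """Hypothetical policy: update only after a grace period, unless opted out."""
    if opted_out:
        # Managed or corporate installs keep their approved version.
        return False
    # Waiting gives a buggy release time to be recalled before it spreads.
    return today - released_on >= grace_period


print(should_auto_update(date(2007, 9, 1), date(2007, 9, 21)))
print(should_auto_update(date(2007, 9, 1), date(2007, 9, 21), opted_out=True))
```

The grace period addresses the recalled-release worry and the opt-out covers the corporate case, but, as noted above, an opt-out that everyone uses brings the original problem back.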
Sirius B
Volunteer tester
Joined: 26 Dec 00
Posts: 24938
Credit: 3,081,182
RAC: 7
Ireland
Message 645475 - Posted: 21 Sep 2007, 4:53:39 UTC - in response to Message 644959.  
Last modified: 21 Sep 2007, 4:54:16 UTC

I'm paired up with this bundle of fun.


Wow - 35 day turnaround!

Is it possible to restrict the number of WUs on any machine?

Since my post/thread not long after rejoining seti, I have taken Richard's advice & have a small reasonable cache.

These large caches are not doing the science or the crunchers any good.

I'm just glad my machines are crunching merrily away & also keeping an eye on the number crunching thread to keep abreast of all the updates needed.
ID: 645475 · Report as offensive