You might want to check this one Again...

Message boards : Number crunching : You might want to check this one Again...

TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1353938 - Posted: 6 Apr 2013, 1:13:42 UTC

I have serious doubts about the results of this task...

Workunit 1199573476
Task	Computer	Sent	Time reported	Status	Run time	CPU time	Credit	Application
2899037611	4202271	30 Mar 2013, 10:58:09 UTC	30 Mar 2013, 13:12:55 UTC	Completed and validated	15.30	11.98	2.39	SETI@home Enhanced Anonymous platform (NVIDIA GPU)
2899037612	6864181	30 Mar 2013, 10:58:15 UTC	31 Mar 2013, 2:05:58 UTC	Completed, marked as invalid	1,415.40	100.99	0.00	SETI@home Enhanced Anonymous platform (NVIDIA GPU)
2900744232	6829067	31 Mar 2013, 5:49:03 UTC	4 Apr 2013, 14:03:02 UTC	Completed, marked as invalid	10,004.00	9,842.60	0.00	SETI@home Enhanced v6.03
2904766778	6680152	4 Apr 2013, 19:45:26 UTC	5 Apr 2013, 5:25:27 UTC	Completed and validated	49.10	12.54	2.39	SETI@home Enhanced Anonymous platform (NVIDIA GPU)


The two validating computers don't have a very good history.
ID: 1353938 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1353942 - Posted: 6 Apr 2013, 1:28:51 UTC - in response to Message 1353938.  

I wonder why people are so sloppy as to run machines like that.
That of course does not say that your specific result was not bad, but they are producing a lot of garbage.
ID: 1353942 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1353943 - Posted: 6 Apr 2013, 1:32:31 UTC - in response to Message 1353942.  
Last modified: 6 Apr 2013, 1:36:02 UTC

I wonder why people are so sloppy as to run machines like that.
That of course does not say that your specific result was not bad, but they are producing a lot of garbage.

My results match the CPU results. Even the host with the CPU results has problems with his GPU, but I kinda trust his CPU.
ID: 1353943 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34841
Credit: 261,360,520
RAC: 489
Australia
Message 1353944 - Posted: 6 Apr 2013, 1:33:22 UTC - in response to Message 1353938.  

Ah yes, that pair. This was bound to happen sooner or later (just like the old v12 days all over again).

I've sent these 2 guys so many PM's over the last year about their unstable rigs that it's far from a joke now. :-(

Cheers.
ID: 1353944 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 1353948 - Posted: 6 Apr 2013, 1:40:29 UTC - in response to Message 1353944.  

I've sent these 2 guys so many PM's over the last year about their unstable rigs that it's far from a joke now. :-(

The problem is most people don't come to the forums, and many of those that do probably aren't even aware of PMs.


The community preferences include an option to be notified of PMs at the user's email address (either one email for each PM or one email daily).

It might be worth the team considering changing the default to Notify by email, and even changing all the current settings to that. Send a PM to each person (over several days) advising them of the change, and how to change it back.
It would then allow people to send PMs to those with problem systems, and they would at least be notified they have a message even if they choose not to look at it.

The only other option is for a Mod or Admin that is able to view email addresses to send or forward messages to each problem user individually.
Grant
Darwin NT
ID: 1353948 · Report as offensive
Profile Dimly Lit Lightbulb 😀
Volunteer tester
Joined: 30 Aug 08
Posts: 15399
Credit: 7,423,413
RAC: 1
United Kingdom
Message 1353956 - Posted: 6 Apr 2013, 1:54:56 UTC - in response to Message 1353948.  
Last modified: 6 Apr 2013, 1:59:30 UTC

I've sent these 2 guys so many PM's over the last year about their unstable rigs that it's far from a joke now. :-(

The problem is most people don't come to the forums, and many of those that do probably aren't even aware of PMs.


The community preferences include an option to be notified of PMs at the user's email address (either one email for each PM or one email daily).

It might be worth the team considering changing the default to Notify by email, and even changing all the current settings to that. Send a PM to each person (over several days) advising them of the change, and how to change it back.
It would then allow people to send PMs to those with problem systems, and they would at least be notified they have a message even if they choose not to look at it.

The only other option is for a Mod or Admin that is able to view email addresses to send or forward messages to each problem user individually.

No one here can see any e-mail address; you'll have to send a PM and hope it's someone who watches such things, takes your advice and then corrects it. Although experience says at least nine times out of ten it'll be ignored, sadly.

[EDIT]Discussion of Invalid Host Messaging thread bumped[/EDIT]

Member of the People Encouraging Niceness In Society club.

ID: 1353956 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 1353969 - Posted: 6 Apr 2013, 3:06:21 UTC - in response to Message 1353956.  

The only other option is for a Mod or Admin that is able to view email addresses to send or forward messages to each problem user individually.

No one here can see any e-mail address; you'll have to send a PM and hope it's someone who watches such things, takes your advice and then corrects it.

So only the forum admin can access that information.
They could set it so all mods are able to, or they could set it so one particular mod can do so, or they'd be the one to pass on such messages.
Otherwise things stay as they are: systems pumping out rubbish continuously because their owners never check on them.
Grant
Darwin NT
ID: 1353969 · Report as offensive
Horacio

Joined: 14 Jan 00
Posts: 536
Credit: 75,967,266
RAC: 0
Argentina
Message 1353975 - Posted: 6 Apr 2013, 3:35:01 UTC - in response to Message 1353969.  

But even if someone were able to send them an email or any other kind of message, there is no guarantee that they will fix it...
It has been discussed many times that the way BOINC handles errors is too permissive, and that the real fix is to change that so it can effectively cut down the number of tasks sent to those hosts...
ID: 1353975 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13746
Credit: 208,696,464
RAC: 304
Australia
Message 1353979 - Posted: 6 Apr 2013, 3:49:48 UTC - in response to Message 1353975.  
Last modified: 6 Apr 2013, 3:50:05 UTC

But even if someone were able to send them an email or any other kind of message, there is no guarantee that they will fix it...

Nope.
But no one will fix something if they don't know it's broken. If they know, then there's a chance.
Grant
Darwin NT
ID: 1353979 · Report as offensive
Horacio

Joined: 14 Jan 00
Posts: 536
Credit: 75,967,266
RAC: 0
Argentina
Message 1353981 - Posted: 6 Apr 2013, 4:07:25 UTC - in response to Message 1353979.  

But no one will fix something if they don't know it's broken. If they know, then there's a chance.

That's true, but I think that they (BOINC/Projects) don't want people freaking out in paranoia about their personal data being made available...
ID: 1353981 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1353983 - Posted: 6 Apr 2013, 4:12:23 UTC
Last modified: 6 Apr 2013, 4:18:31 UTC

Something that would work in this instance would be a simple script that precludes sending a tie-breaker to Hosts with over a set number of invalids. That might prevent the broken clock from being correct twice a day...
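
A minimal sketch of that idea in Python. This is not BOINC scheduler code: the `Host` record, its `invalid_count` field and the `MAX_INVALIDS` threshold are hypothetical names picked for illustration, and the host IDs in the example are made up.

```python
from dataclasses import dataclass

# Hypothetical threshold: hosts with more invalids than this get no tie-breakers.
MAX_INVALIDS = 50


@dataclass
class Host:
    host_id: int
    invalid_count: int  # assumed to be the host's current count of invalid results


def eligible_for_tiebreaker(host: Host, max_invalids: int = MAX_INVALIDS) -> bool:
    """Return True if the host may be sent a tie-breaker task."""
    return host.invalid_count <= max_invalids


def pick_tiebreaker_hosts(hosts, max_invalids=MAX_INVALIDS):
    """Keep only the candidate hosts that are allowed a tie-breaker."""
    return [h for h in hosts if eligible_for_tiebreaker(h, max_invalids)]


# Made-up candidates: host 2 is over the threshold and is skipped.
candidates = [Host(1, 3), Host(2, 400), Host(3, 50)]
print([h.host_id for h in pick_tiebreaker_hosts(candidates)])  # [1, 3]
```

The check only decides who breaks the tie; the flaky hosts would still receive ordinary work under this sketch.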
ID: 1353983 · Report as offensive
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1353984 - Posted: 6 Apr 2013, 4:45:44 UTC

I haven't had many "error" WUs myself over the years (I think I've had maybe 10 in total?). I have noticed that the "maximum per day" does reset back to 100 if you were anywhere over 100, and it is supposed to cut in half for every consecutive error, down to 1.

By those rules.. if you were at say.. 1500/day and one became an error, you are down to 100. The next one is valid, so you are at 101. Next one is an error, and you're at 100 again, etc.

I'm thinking that should be reduced to something smaller to keep runaway machines from going rampant. Something like 10 or 25 should do. If your machine does good work and just had one bad WU, then you won't have a problem rebuilding back up to a decent number again. If your machine is a runaway, then it won't have a detrimental effect on the overall science.
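
The rules as described in this post can be replayed with a short Python sketch. It models only what is stated here (an error above the cap resets to the cap, further errors halve the quota down to 1, a valid result adds one); the function names and the `reset_cap` parameter are hypothetical, and the real server logic may differ.

```python
def next_quota(quota: int, outcome: str, reset_cap: int = 100) -> int:
    """Apply one result to the daily quota, per the rules described in the post."""
    if outcome == "valid":
        return quota + 1
    if outcome == "error":
        if quota > reset_cap:
            return reset_cap          # anything above the cap drops straight to it
        return max(1, quota // 2)     # otherwise halve, never below 1
    raise ValueError(f"unknown outcome: {outcome!r}")


def replay(quota: int, outcomes, reset_cap: int = 100):
    """Return the quota after each outcome in turn."""
    history = []
    for outcome in outcomes:
        quota = next_quota(quota, outcome, reset_cap)
        history.append(quota)
    return history


# The example from the post: 1500/day, then error, valid, error.
print(replay(1500, ["error", "valid", "error"]))                 # [100, 101, 100]
# The same sequence with the suggested smaller cap of 25.
print(replay(1500, ["error", "valid", "error"], reset_cap=25))   # [25, 26, 25]
```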
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1353984 · Report as offensive
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1353987 - Posted: 6 Apr 2013, 5:23:39 UTC

This problem runs all the way up to the top rigs. 6656656 which is currently number 8 on the RAC scoreboard has a crook GTX580 that has been producing bad results for months. I've PM'ed the owner a couple of times but nothing has been done about it.

T.A.
ID: 1353987 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34841
Credit: 261,360,520
RAC: 489
Australia
Message 1353991 - Posted: 6 Apr 2013, 5:48:03 UTC - in response to Message 1353987.  

This problem runs all the way up to the top rigs. 6656656 which is currently number 8 on the RAC scoreboard has a crook GTX580 that has been producing bad results for months. I've PM'ed the owner a couple of times but nothing has been done about it.

T.A.

Yes, that is another one, but at least it's not as bad as it used to be.

Cheers.
ID: 1353991 · Report as offensive
Horacio

Joined: 14 Jan 00
Posts: 536
Credit: 75,967,266
RAC: 0
Argentina
Message 1353992 - Posted: 6 Apr 2013, 5:48:24 UTC - in response to Message 1353984.  

The way it works is more complex than that. It takes into account the basic quota, the success or error outcome (how the task was reported: finished normally or as a computation error), and the validation outcome (after comparing the results from the different wingmen).

Given this project's setting of 100 for daily_result_quota:
- If an error is reported the "Max tasks per day" is reduced to less than the basic 100 quota, 99 if the host was previously OK or subtract one if it was already below.
- If a "success" is reported and the host was below the basic quota, "Max tasks per day" is doubled but capped at 100.
- A task judged valid increases "Max tasks per day" by one.
- A task judged invalid reduces "Max tasks per day" by one, but only if it was above the basic quota.
(quoted from a post by Josef Segur in this thread)
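
Those four rules can be written down as a small Python sketch. It is a reading of the quoted list, not the actual BOINC source: the function names are hypothetical, and the floor of 1 on the error path is an assumption the quote does not state.

```python
BASE_QUOTA = 100  # this project's daily_result_quota, per the quoted rules


def on_reported(quota: int, outcome: str) -> int:
    """Update 'Max tasks per day' when a task is reported ('success' or 'error')."""
    if outcome == "error":
        if quota >= BASE_QUOTA:
            return BASE_QUOTA - 1      # host was previously OK: drop to 99
        return max(1, quota - 1)       # already below: subtract one (floor of 1 assumed)
    if outcome == "success":
        if quota < BASE_QUOTA:
            return min(quota * 2, BASE_QUOTA)  # double, capped at the basic quota
        return quota
    raise ValueError(f"unknown outcome: {outcome!r}")


def on_validated(quota: int, valid: bool) -> int:
    """Update 'Max tasks per day' when the validator judges a task."""
    if valid:
        return quota + 1
    if quota > BASE_QUOTA:
        return quota - 1               # invalid only counts above the basic quota
    return quota


q = 1500
q = on_reported(q, "error")    # 99
q = on_reported(q, "success")  # 100
q = on_validated(q, True)      # 101
print(q)
```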

Some of my hosts currently have a daily quota of more than a thousand. If one of them starts to fail only on validation, it will take a long time for the quota to be reduced, especially because the host will have a lot of earlier tasks that are going to pass validation (raising the quota), while the invalids will need a third wingman and will take longer to get the invalid mark...
As it is, it works for catching serious hardware issues, but not for effectively throttling subtle errors...
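
A rough way to see this is to replay only the two validation rules (valid adds one, invalid subtracts one while above the basic quota) for a host whose tasks are all reported as successes. The sketch below is a simplified model in Python: `simulate` and its parameters are hypothetical, the failure pattern is a fixed "every Nth task", and the validation delay mentioned above is ignored.

```python
BASE_QUOTA = 100  # basic daily quota assumed from the rules quoted earlier


def simulate(start_quota: int, n_tasks: int, invalid_every: int) -> int:
    """Return the quota after n_tasks validations where every Nth task is invalid."""
    quota = start_quota
    for i in range(1, n_tasks + 1):
        if i % invalid_every == 0:
            if quota > BASE_QUOTA:
                quota -= 1   # invalid: minus one, only above the basic quota
        else:
            quota += 1       # valid: plus one
    return quota


# 20% of results invalid: the quota still grows.
print(simulate(1000, 1000, invalid_every=5))  # 1600
# 50% of results invalid: the quota merely stands still.
print(simulate(1000, 1000, invalid_every=2))  # 1000
```

Under this model each task changes the quota by plus or minus one, so the quota only starts to fall once more than half of a host's results are invalid.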
ID: 1353992 · Report as offensive
andybutt
Volunteer tester
Joined: 18 Mar 03
Posts: 262
Credit: 164,205,187
RAC: 516
United Kingdom
Message 1353997 - Posted: 6 Apr 2013, 5:59:23 UTC - in response to Message 1353987.  

TA
Sorry, but I only received the one PM this morning. I know the card is playing up a little and have been playing around with it to see if it was something I'd changed, but to no avail. Just off to the computer store to get a new card, so it should be replaced today.

Andy
ID: 1353997 · Report as offensive
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1354005 - Posted: 6 Apr 2013, 6:51:08 UTC - in response to Message 1353997.  
Last modified: 6 Apr 2013, 6:52:04 UTC

TA
Sorry, but I only received the one PM this morning. I know the card is playing up a little and have been playing around with it to see if it was something I'd changed, but to no avail. Just off to the computer store to get a new card, so it should be replaced today.

Andy

Thanks Andy. I've seen that rig with over 1000 invalid tasks and 1500 inconclusives which was a worry.

The only gripe I have is that now you're going to get further ahead of me :)

T.A.
ID: 1354005 · Report as offensive
andybutt
Volunteer tester
Avatar

Send message
Joined: 18 Mar 03
Posts: 262
Credit: 164,205,187
RAC: 516
United Kingdom
Message 1354039 - Posted: 6 Apr 2013, 9:01:34 UTC - in response to Message 1354005.  

TA
I don't remember being anywhere near that high! Just ordered two more 690's, should be here Monday

Andy
ID: 1354039 · Report as offensive
Terror Australis
Volunteer tester

Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1354049 - Posted: 6 Apr 2013, 9:24:06 UTC - in response to Message 1354039.  

It was a few months ago, before the last couple of extended outages.

T.A.
ID: 1354049 · Report as offensive
Profile Floyd
Joined: 19 May 11
Posts: 524
Credit: 1,870,625
RAC: 0
United States
Message 1354142 - Posted: 6 Apr 2013, 17:57:52 UTC - in response to Message 1353984.  

I haven't had many "error" WUs myself over the years (I think I've had maybe 10 in total?). I have noticed that the "maximum per day" does reset back to 100 if you were anywhere over 100, and it is supposed to cut in half for every consecutive error, down to 1.

By those rules.. if you were at say.. 1500/day and one became an error, you are down to 100. The next one is valid, so you are at 101. Next one is an error, and you're at 100 again, etc.

I'm thinking that should be reduced to something smaller to keep runaway machines from going rampant. Something like 10 or 25 should do. If your machine does good work and just had one bad WU, then you won't have a problem rebuilding back up to a decent number again. If your machine is a runaway, then it won't have a detrimental effect on the overall science.


Well, there has been an unexplained issue with tasks abandoned by BOINC or the SETI servers that show up as errors. That shouldn't affect our max-per-day quota, as it doesn't seem to be caused by anything our machines have done.
As shown in the abandoned tasks thread:

http://setiathome.berkeley.edu/forum_thread.php?id=70946
ID: 1354142 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.