Discussion of Invalid Host Messaging

Message boards : Number crunching : Discussion of Invalid Host Messaging

Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1299733 - Posted: 28 Oct 2012, 12:06:42 UTC - in response to Message 1299086.  

I got a rather unusual reply from 1 person via PM today, rob99999_2 ID 129322, the owner of Computer 6797265, which produces nothing but errors from its GT 650M but here's his reply.

i am not going to make a thread when i am not going to check it. I have downloaded ths lastest software and the lastest drivers so if there is any issues SETI needs to get their shit straight.


Not a very nice attitude IMO. :(

Cheers.


Well... rob99999_2 is right.

No, he is not. He is the owner and user of his computer, and only he is responsible for what it does. Just as he has to make sure that it is not sending out spam mails or participating in DDoS attacks, he has to watch what it does with the SETI WUs that it gets assigned.

Just as a driver is responsible for his car: listen to it, watch carefully how it behaves, and if you suspect that something might be wrong, stop and call for help if you can't fix it yourself.
ID: 1299733
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1299757 - Posted: 28 Oct 2012, 13:21:09 UTC - in response to Message 1299733.  

I got a rather unusual reply from 1 person via PM today, rob99999_2 ID 129322, the owner of Computer 6797265, which produces nothing but errors from its GT 650M but here's his reply.

i am not going to make a thread when i am not going to check it. I have downloaded ths lastest software and the lastest drivers so if there is any issues SETI needs to get their shit straight.


Not a very nice attitude IMO. :(

Cheers.


Well... rob99999_2 is right.

No, he is not. He is the owner and user of his computer, and only he is responsible for what it does. Just as he has to make sure that it is not sending out spam mails or participating in DDoS attacks, he has to watch what it does with the SETI WUs that it gets assigned.

Just as a driver is responsible for his car: listen to it, watch carefully how it behaves, and if you suspect that something might be wrong, stop and call for help if you can't fix it yourself.


Nothing is wrong with his car and nothing is wrong with his engine; he has it serviced correctly. However, the manufacturer has failed to tell him that there is a fault that means his engine is about to break down. Can the manufacturer fix it? No, you have to do it yourself. Or stop using the car!

ID: 1299757
Josef W. Segur
Volunteer developer
Volunteer tester
Joined: 30 Oct 99
Posts: 4504
Credit: 1,414,761
RAC: 0
United States
Message 1299800 - Posted: 28 Oct 2012, 15:40:35 UTC

A car is a deadly weapon so there is a definite moral obligation to keep it well maintained, backed most places by legal requirements.

A system crunching SaH data without producing correct results is turning electrical energy into heat energy without advancing the scientific search we're trying to do. Because the user presumably intended to help the cause, sending a heads up message when there's good evidence the user hasn't noticed the problem makes sense.

The additional load on the servers caused by systems gone bad can't be separated from cases where the user has decided to stop crunching, etc. But there's an easy way to see how much overall waste there is. If there were no waste, the ratio of "Results waiting for db purging" to "Workunits waiting for db purging" would be exactly 2. In practice the MB ratio is usually between 2.1 and 2.2 indicating waste of 5 to 10 percent.
                                                                    Joe
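
The arithmetic behind that estimate can be sketched as follows. This is only an illustration of the ratio described above: it assumes each workunit ideally needs exactly 2 results, and the function name `waste_fraction` is made up for this example rather than taken from any project code.

```python
def waste_fraction(results_per_workunit: float, ideal: float = 2.0) -> float:
    """Return the share of extra results relative to the ideal count per workunit.

    `ideal` is the assumed number of results a workunit needs when nothing
    is wasted (2 in the post above).
    """
    if results_per_workunit < ideal:
        raise ValueError("ratio is below the ideal; check the input figures")
    return (results_per_workunit - ideal) / ideal


# Ratios mentioned in the post: the no-waste case and the usual MB range.
for ratio in (2.0, 2.1, 2.2):
    print(f"ratio {ratio:.1f} -> waste {waste_fraction(ratio):.1%}")
```

A ratio of 2.1 gives about 5 percent and 2.2 gives about 10 percent, which matches the range quoted above.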
ID: 1299800
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1299832 - Posted: 28 Oct 2012, 16:47:41 UTC - in response to Message 1299757.  

I got a rather unusual reply from 1 person via PM today, rob99999_2 ID 129322, the owner of Computer 6797265, which produces nothing but errors from its GT 650M but here's his reply.

i am not going to make a thread when i am not going to check it. I have downloaded ths lastest software and the lastest drivers so if there is any issues SETI needs to get their shit straight.


Not a very nice attitude IMO. :(

Cheers.


Well... rob99999_2 is right.

No, he is not. He is the owner and user of his computer, and only he is responsible for what it does. Just as he has to make sure that it is not sending out spam mails or participating in DDoS attacks, he has to watch what it does with the SETI WUs that it gets assigned.

Just as a driver is responsible for his car: listen to it, watch carefully how it behaves, and if you suspect that something might be wrong, stop and call for help if you can't fix it yourself.


Nothing is wrong with his car and nothing is wrong with his engine; he has it serviced correctly. However, the manufacturer has failed to tell him that there is a fault that means his engine is about to break down. Can the manufacturer fix it? No, you have to do it yourself. Or stop using the car!

I know that car-computer comparisons are crap, but sometimes I don't have a better one. The point is: his computer fails, so he should be the first to notice it and either see if he can fix it or ask for help.

I have a similar situation with Milkyway right now: my old ATI HD3850 can only run the older (not really supported anymore) CAL application, so I have to watch whether new batches of WUs are still compatible with it, and if not I have to stop crunching. That has already happened once: I had to stop crunching for about a month, then it worked again. Whether the hardware is old or new, you have to watch it, because something might always not work as expected. Especially after any change to the system, for example if you buy a new card or install new drivers, you first have to see that it actually works before you let it do its work without much attention from your side. And I'm pretty sure that most of the owners of those 560Ti cards have skipped that part. Something like "set and forget" does not exist with computers anyway, even if many think so. At best it's "set, see that it works and hope it lasts for a while".
ID: 1299832
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1299879 - Posted: 28 Oct 2012, 18:42:31 UTC - in response to Message 1299800.  

A car is a deadly weapon so there is a definite moral obligation to keep it well maintained, backed most places by legal requirements.

Not all issues with a car make it more dangerous. If it leaks a drop of oil every now and then, it's still safe to drive but bad for the environment. Such hosts are the same for the SETI environment: they waste bandwidth and eventually (if two such hosts validate against each other, as Fermi cards did before) even compromise the science.
ID: 1299879
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1299920 - Posted: 28 Oct 2012, 20:16:27 UTC

Everyone is correct; however, it means SETI@Home is not set and forget. It should have instructions posted that explain the problems with GPU crunching and the need to check on a regular basis whether your results are valid. They should also warn people that if they are not prepared to do this, they could return invalid results and it is best that they don't crunch using a GPU.

They especially will need to check when updating the GPU drivers, as this has introduced several bugs in the past. Also, before buying a new modern graphics card, please check on the forums to see if it will work with SETI@Home and/or the latest drivers! If you are unsure of any of this, please do not attach your computer to SETI@Home.

In real terms that is what anyone crunching needs to be aware of.

Of course no one wants to post that on the front page, but something like that is needed. I am aware that updates are due, but who is to say that in 3 or 6 months' time a new GPU or driver won't start this whole thing off again?
ID: 1299920
Link
Joined: 18 Sep 03
Posts: 834
Credit: 1,807,369
RAC: 0
Germany
Message 1299945 - Posted: 28 Oct 2012, 21:12:53 UTC - in response to Message 1299920.  
Last modified: 28 Oct 2012, 21:40:33 UTC

GPU computing disabled by default and a red "READ THIS FIRST" link to a page with a short info like the one you posted would be IMO a good solution. CPU-only crunching might be "set and forget", GPU crunching is not.

An alternative would be a better quota system, one which counts invalids as errors and which expects something better than 1 valid result out of 50. A 98% failure rate can't be "OK"; even 50% would IMHO be way too much, but should be good enough to start with.
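
A rough sketch of such a check, in Python. This is not how the BOINC scheduler actually computes quotas; the names `HostStats`, `failure_rate` and `should_throttle` are hypothetical, and the 50% threshold is simply the starting value suggested above.

```python
from dataclasses import dataclass


@dataclass
class HostStats:
    """Recent result counts for one host (hypothetical structure)."""
    valid: int
    invalid: int
    errors: int


def failure_rate(stats: HostStats) -> float:
    """Treat invalid results exactly like errors when computing the rate."""
    total = stats.valid + stats.invalid + stats.errors
    if total == 0:
        return 0.0  # no history yet, so nothing to judge
    return (stats.invalid + stats.errors) / total


def should_throttle(stats: HostStats, max_failure_rate: float = 0.5) -> bool:
    """Flag a host whose combined invalid and error share exceeds the limit."""
    return failure_rate(stats) > max_failure_rate


# A host with 1 valid result out of 50, as in the example above.
bad_host = HostStats(valid=1, invalid=40, errors=9)
print(f"{failure_rate(bad_host):.0%}", should_throttle(bad_host))
```

The only design point here is that invalids and errors share one counter, so a host that returns plausible-looking but wrong results is treated the same as one that crashes.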

I mean it's not just SETI. I also crunch for Milkyway and Collatz, and issues like that, i.e. with new hardware or drivers, occur on those projects as well every time nVidia or ATI comes up with something "revolutionary". Hence I don't see it as a fault of the project staff if their apps don't run properly on new hardware.
ID: 1299945
Bernie Vine
Volunteer moderator
Volunteer tester
Joined: 26 May 99
Posts: 9954
Credit: 103,452,613
RAC: 328
United Kingdom
Message 1299961 - Posted: 28 Oct 2012, 22:01:07 UTC
Last modified: 28 Oct 2012, 22:02:37 UTC

Hence I don't see it as a fault of the project staff if their apps don't run properly on new hardware.


No, possibly not, but if GPU crunching is not "set and forget", both current and prospective users need to know; otherwise, as you say, we could end up with errors validating against each other, corrupting the science!

Users like rob99999_2 need to know what they are getting into.
ID: 1299961
Murat Adas
Joined: 9 Aug 10
Posts: 1
Credit: 1,585,782
RAC: 0
Australia
Message 1300176 - Posted: 29 Oct 2012, 11:07:29 UTC - in response to Message 1297944.  

After reading "NVidia driver problems which cause computation errors" by Richard Haselgrove, I've changed my advanced power settings. Below are the steps I used to accomplish this (Windows 7):
1) Right-clicked on the desktop and selected Personalize.
2) Selected Screen Saver and made sure none was selected.
3) Clicked Change power settings.
4) Clicked Change plan settings.
5) Made sure "Turn off the display" and "Put the computer to sleep" were both set to Never.
6) Clicked Change advanced power settings.
7) Under Sleep - Allow hybrid sleep, turned the setting to "Off" (the default was On).

I hope this helps. If not, please let me know whether I should roll back to a previous driver.

Thanks
ID: 1300176
Tron
Joined: 16 Aug 09
Posts: 180
Credit: 2,250,468
RAC: 0
United States
Message 1300272 - Posted: 29 Oct 2012, 17:13:47 UTC - in response to Message 1300176.  

After reading NVidia driver problems which cause computation errors by Richard Haselgrove
I've changed my advanced power settings, below are the steps I used to accomplish this (Windows 7)
right clicking on desktop selecting personalize
then selected screen saver
made sure I have none selected for screen saver
then clicked on Change power settings
next I clicked Change plan settings
made sure Turn off the display and Put the computer to sleep are Never
Finally I clicked on Change advanced power settings
Under the toolbar Sleep - Allow hybrid sleep
I turned the setting to "Off". (default was On)

I hope this helps, if not please let me know whether to roll back to a previous driver?

Thanks


Thank you for posting. It will take a few days for the dust to settle on the invalids before you can see for sure whether the changes you made helped.
You can keep an eye on your finished tasks in the meantime. Watch for short run times; those tend to be the -9 error you are experiencing.

ID: 1300272
BilBg
Volunteer tester
Joined: 27 May 07
Posts: 3720
Credit: 9,385,827
RAC: 0
Bulgaria
Message 1300629 - Posted: 31 Oct 2012, 2:32:48 UTC - in response to Message 1300176.  

After reading NVidia driver problems which cause computation errors by Richard Haselgrove
I've changed my advanced power settings ...

Read again:
1) Sleeping Monitor Bug
Drivers affected: 295.51 (BETA), 295.73 and 296.10

You use driver: 306.97 so what you did was not needed.

You also don't have a 'Kepler' GPU (GT 6xx and GTX 6xx), so the other (CUDA_GRID_SIZE_COMPAT) fix does not apply to you.

GTX 560 Ti problems are 'famous' and not related to 'Sleeping Monitor Bug' nor 'Kepler'
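
The driver check above can be written down as a tiny lookup. This is only a sketch: the set contains the three driver versions listed for the Sleeping Monitor Bug, and the function name is made up for illustration.

```python
# Driver versions listed above as affected by the Sleeping Monitor Bug.
SLEEPING_MONITOR_BUG_DRIVERS = {"295.51", "295.73", "296.10"}


def needs_sleeping_monitor_workaround(driver_version: str) -> bool:
    """Return True only if the installed driver is one of the listed versions."""
    return driver_version.strip() in SLEEPING_MONITOR_BUG_DRIVERS


# Driver 306.97 is the one reported for the host discussed here.
print(needs_sleeping_monitor_workaround("306.97"))
print(needs_sleeping_monitor_workaround("295.73"))
```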

Read 'a few' threads about GTX 560 Ti problems:
http://www.google.com/#hl=en&q=560+Ti+problems+site:setiathome.berkeley.edu


 


- ALF - "Find out what you don't do well ..... then don't do it!" :)
 
ID: 1300629
Thndr
Posts: 18
Credit: 3,477,289
RAC: 1
United States
Message 1302137 - Posted: 4 Nov 2012, 16:32:29 UTC

I have told it not to use the GPU and set the power settings to never turn off the monitor. Let me know.
ID: 1302137
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1302158 - Posted: 4 Nov 2012, 17:02:42 UTC - in response to Message 1302137.  
Last modified: 4 Nov 2012, 17:04:59 UTC

I have told it not to use gpu and set power settings to never turn off monitor. let me know.


Setting power settings to never turn off your monitor won't help, as you're not running 295.xx or 296.xx drivers.

Looking at your inconclusive/errored tasks, they are a mixture of CPU and GPU tasks.
All the ones I looked at say 'Restarted at 100.00 percent.', which is strange.
Then, looking at the stderr.txt results, multiple tasks have the same result, either:

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
Restarted at 100.00 percent.

Flopcounter: 48049228806222.281000

Spike count: 1
Pulse count: 1
Triplet count: 8
Gaussian count: 0
called boinc_finish

</stderr_txt>
]]>

http://setiathome.berkeley.edu/result.php?resultid=2686739603

http://setiathome.berkeley.edu/result.php?resultid=2686762450

http://setiathome.berkeley.edu/result.php?resultid=2686762474

http://setiathome.berkeley.edu/result.php?resultid=2688750025

Or:

Spike count: 10
Pulse count: 0
Triplet count: 0
Gaussian count: 3
called boinc_finish

http://setiathome.berkeley.edu/result.php?resultid=2687146759

http://setiathome.berkeley.edu/result.php?resultid=2687140765

Or:

Spike count: 14
Pulse count: 5
Triplet count: 12
Gaussian count: 0

http://setiathome.berkeley.edu/result.php?resultid=2687140761

http://setiathome.berkeley.edu/result.php?resultid=2686771628

Looks like your slot directories aren't getting cleared for some reason.

Please post your Boinc startup messages from the Event Log, the first 30 lines will do.
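
The comparison done by eye above (different tasks reporting identical signal counts) could be automated roughly like this. It is a sketch under assumptions: the stderr text is rebuilt from the lines quoted in this post rather than fetched from the result pages, and `sample_stderr`, `signal_counts` and `repeated_results` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Matches lines such as "Spike count: 1" in a task's stderr text.
COUNT_RE = re.compile(r"^(Spike|Pulse|Triplet|Gaussian) count:\s*(\d+)", re.MULTILINE)
SIGNALS = ("Spike", "Pulse", "Triplet", "Gaussian")


def sample_stderr(spike: int, pulse: int, triplet: int, gaussian: int) -> str:
    """Rebuild a stderr snippet in the shape quoted above (illustration only)."""
    return (
        "Restarted at 100.00 percent.\n\n"
        f"Spike count: {spike}\nPulse count: {pulse}\n"
        f"Triplet count: {triplet}\nGaussian count: {gaussian}\n"
        "called boinc_finish\n"
    )


def signal_counts(stderr_text: str) -> tuple:
    """Extract (spike, pulse, triplet, gaussian); None where a count is missing."""
    found = {name: int(value) for name, value in COUNT_RE.findall(stderr_text)}
    return tuple(found.get(name) for name in SIGNALS)


def repeated_results(tasks: dict) -> dict:
    """Group result IDs by their counts and keep only groups with more than one task."""
    groups = defaultdict(list)
    for result_id, text in tasks.items():
        groups[signal_counts(text)].append(result_id)
    return {counts: ids for counts, ids in groups.items() if len(ids) > 1}


# Result IDs and counts taken from the links and snippets in this post.
tasks = {
    2686739603: sample_stderr(1, 1, 8, 0),
    2686762450: sample_stderr(1, 1, 8, 0),
    2687146759: sample_stderr(10, 0, 0, 3),
    2687140765: sample_stderr(10, 0, 0, 3),
    2687140761: sample_stderr(14, 5, 12, 0),
}
for counts, ids in repeated_results(tasks).items():
    print(counts, ids)
```

Identical counts on unrelated tasks are not proof on their own, but together with 'Restarted at 100.00 percent.' they point at stale files being picked up from a slot directory.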

Claggy
ID: 1302158
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1303041 - Posted: 7 Nov 2012, 10:15:57 UTC - in response to Message 1302158.  
Last modified: 7 Nov 2012, 10:16:40 UTC

Looks as if Thndr has fixed his problems with his slot directories, he's now fully completing 6.03, 6.10 (cuda_fermi), AstroPulse v6 v6.01 and AstroPulse v6 v6.04 (cuda_opencl_100) tasks,
although it would have been good if he had responded and told us what he found and did.

All tasks for computer 6432659

Claggy
ID: 1303041
Thndr
Joined: 4 Mar 02
Posts: 18
Credit: 3,477,289
RAC: 1
United States
Message 1305991 - Posted: 14 Nov 2012, 4:36:27 UTC - in response to Message 1303041.  

Looks as if Thndr has fixed his problems with his slot directories, he's now fully completing 6.03, 6.10 (cuda_fermi), AstroPulse v6 v6.01 and AstroPulse v6 v6.04 (cuda_opencl_100) tasks,
although it would have been good if he had responded and told us what he found and did.

All tasks for computer 6432659

Claggy


Well.... to make a long story short, I scrapped the boinc software and started over... that and I changed power settings and reset the project and environment but!! I'm back to 6.10 errors again! I checked everything and gpu usage was turned back on?? how?? Clearly this is NOT just a driver problem.
https://www.facebook.com/LAKEVILLEUNITYGARDENS/
ID: 1305991
Thndr
Joined: 4 Mar 02
Posts: 18
Credit: 3,477,289
RAC: 1
United States
Message 1306124 - Posted: 14 Nov 2012, 16:45:07 UTC - in response to Message 1305991.  

I have removed the boinc manager from my machine again. I will watch this thread for further developments. I can not see wasting my efforts and messing up data packets until there is a fix.
https://www.facebook.com/LAKEVILLEUNITYGARDENS/
ID: 1306124
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1306191 - Posted: 14 Nov 2012, 20:06:32 UTC - in response to Message 1305991.  
Last modified: 14 Nov 2012, 20:09:14 UTC

Looks as if Thndr has fixed his problems with his slot directories, he's now fully completing 6.03, 6.10 (cuda_fermi), AstroPulse v6 v6.01 and AstroPulse v6 v6.04 (cuda_opencl_100) tasks,
although it would have been good if he had responded and told us what he found and did.

All tasks for computer 6432659

Claggy


Well.... to make a long story short, I scrapped the boinc software and started over... that and I changed power settings and reset the project and environment but!! I'm back to 6.10 errors again! I checked everything and gpu usage was turned back on?? how?? Clearly this is NOT just a driver problem.

Which environment setting? If it's the one in the 'NVidia driver problems which cause computation errors' thread, please note that it is for 6xx Kepler GPUs only and is not required on your GTS 450.
You also don't need to change power settings, as you're not running 295.xx or 296.xx drivers.

Uninstalling and reinstalling the Boinc software didn't help, as that only installs the program; the Boinc Data directory is left intact, and that is where your problem is.
Looking at your errored tasks still shows 'Restarted at 100.00 percent.'. Did you go and empty all the slot directories? Did you delete them? Or did you not touch them?

Please post your Boinc startup messages from the Event Log; the first 20 to 30 lines will do (I've already asked you for this once before).

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
- exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
setiathome_CUDA: Found 1 CUDA device(s):
Device 1 : GeForce GTS 450
totalGlobalMem = 1073414144
sharedMemPerBlock = 49152
regsPerBlock = 32768
warpSize = 32
memPitch = 2147483647
maxThreadsPerBlock = 1024
clockRate = 1566000
totalConstMem = 65536
major = 2
minor = 1
textureAlignment = 512
deviceOverlap = 1
multiProcessorCount = 4
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GTS 450 is okay
SETI@home using CUDA accelerated device GeForce GTS 450
Restarted at 100.00 percent.

Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Access Violation (0xc0000005) at address 0x0040173F read attempt to address 0x07EC7078

Engaging BOINC Windows Runtime Debugger...
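
For anyone wondering how the signed exit code in the message block relates to the hex value next to it: they are the same 32-bit number. A minimal sketch (the function name is made up for illustration):

```python
def exit_code_as_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex value."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# The exit code reported for the task above.
print(exit_code_as_hex(-1073741819))  # prints 0xc0000005
```

0xc0000005 is the code the stderr output itself labels as an Access Violation.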


Claggy
ID: 1306191
Thndr
Joined: 4 Mar 02
Posts: 18
Credit: 3,477,289
RAC: 1
United States
Message 1306228 - Posted: 14 Nov 2012, 21:49:53 UTC - in response to Message 1306191.  

didnt delete anything
https://www.facebook.com/LAKEVILLEUNITYGARDENS/
ID: 1306228
Thndr
Joined: 4 Mar 02
Posts: 18
Credit: 3,477,289
RAC: 1
United States
Message 1306230 - Posted: 14 Nov 2012, 21:53:33 UTC - in response to Message 1306228.  

didnt delete anything


program is completely uninstalled.
https://www.facebook.com/LAKEVILLEUNITYGARDENS/
ID: 1306230
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1306234 - Posted: 14 Nov 2012, 21:58:24 UTC - in response to Message 1306230.  
Last modified: 14 Nov 2012, 21:58:49 UTC

didnt delete anything


program is completely uninstalled.

Then reinstall. The Boinc Data directory is never removed when uninstalling Boinc. Post the startup messages, then I'll know which directory you've installed the Data directory to, and should know whether the permissions will be correct or not.
Then we can get the slot directories cleaned up; after that, things should just work.
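
Before cleaning anything up, it may help to see what is actually sitting in the slot directories. The sketch below only lists files and deletes nothing. It assumes the slots live in a `slots` subfolder of the Data directory; the path in the example is a placeholder guess and must be replaced with whatever the startup messages report, and `leftover_slot_files` is a hypothetical helper name.

```python
from pathlib import Path


def leftover_slot_files(data_dir: str) -> dict:
    """Map each non-empty slot directory name to the file names found in it.

    Read-only: nothing is removed or modified.
    """
    slots = Path(data_dir) / "slots"
    report = {}
    if not slots.is_dir():
        return report  # wrong path or no slots folder at all
    for slot in sorted(slots.iterdir()):
        if slot.is_dir():
            names = sorted(entry.name for entry in slot.iterdir())
            if names:
                report[slot.name] = names
    return report


# Placeholder path (an assumption): use the Data directory named in the Event Log.
print(leftover_slot_files("C:/ProgramData/BOINC"))
```

Run it with Boinc shut down; any slot that still lists files while no task is running is a candidate for the cleanup described above.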

Claggy
ID: 1306234
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.