Panic Mode On (82) Server Problems?

Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,575,259
RAC: 0
Greece
Message 1342873 - Posted: 4 Mar 2013, 5:08:09 UTC - in response to Message 1342800.  

Until we figure out the bandwidth problem, I am going to eliminate AP work from my machines. 366k downloads much better than 8MB.
My 660 kept getting locked out of SETI work because the AP work was getting stuck in backoff and I was already out of MB work.

Yup, I did that last week for the same reason :(


Don’t do that, people… :-)
Someone must be my wingman.
I have over 270 AP WUs pending.

Tim

ID: 1342873
Bruce
Volunteer tester
Joined: 15 Mar 02
Posts: 123
Credit: 124,955,234
RAC: 11
United States
Message 1342906 - Posted: 4 Mar 2013, 7:28:40 UTC - in response to Message 1342351.  

Hi Richard
Thought I would let you know that I got that download_retry.cmd to work, thanks to Horacio pointing out my rather embarrassing mistake. I redid the file and placed it into C:\tasks and it runs just fine.
Seems like an elegant solution that you came up with for SETI-only crunchers. I am not sure, but it looks like it might work with any version of Windows running any version of BOINC.
Many thanks to you and everyone who helped me sort this out.
Thanks.
Bruce
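
For readers who have not seen it: the contents of download_retry.cmd are not reproduced in this thread, so what follows is only a rough sketch of the general idea of nudging the BOINC client to retry stalled transfers at intervals. It is written in Python rather than as a .cmd file, assumes boinccmd is on the PATH, and relies on boinccmd's --network_available option (retry deferred network communication). The function names, round count, and interval are hypothetical.

```python
import subprocess
import time


def build_retry_command(boinccmd_path="boinccmd"):
    """Return the argument list asking the client to retry deferred transfers.

    Assumption: boinccmd is installed and can reach the running client.
    """
    return [boinccmd_path, "--network_available"]


def retry_loop(rounds, interval_seconds, run=subprocess.run, sleep=time.sleep):
    """Issue the retry command `rounds` times, pausing between rounds.

    `run` and `sleep` are injectable so the loop can be exercised without
    a real BOINC client. Returns the list of process return codes.
    """
    codes = []
    for i in range(rounds):
        result = run(build_retry_command(), check=False)
        codes.append(result.returncode)
        if i < rounds - 1:
            sleep(interval_seconds)
    return codes
```

Calling retry_loop(12, 300) would, under these assumptions, poke the client every five minutes for about an hour. A Windows scheduled task that runs a one-line script is another way to get the same effect.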
ID: 1342906
Bebe
Volunteer tester
Joined: 6 Jun 99
Posts: 3
Credit: 33,741,950
RAC: 0
Germany
Message 1342915 - Posted: 4 Mar 2013, 9:44:59 UTC - in response to Message 1342610.  

[quote].........
I'm using HTTP 165.24.10.8:8080, and "don't use for:"
http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler,http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi


.....


Thank you.

bebe
ID: 1342915
Tom*
Joined: 12 Aug 11
Posts: 127
Credit: 20,769,223
RAC: 9
United States
Message 1342970 - Posted: 4 Mar 2013, 16:33:22 UTC
Last modified: 4 Mar 2013, 17:05:17 UTC

Master database queries/second 1,305

Sure hope oscar doesn't blow a gasket!

Things have been running soooooo smoothly recently

Edit - back down to 544
ID: 1342970
Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1444
Credit: 957,058
RAC: 0
United States
Message 1343012 - Posted: 4 Mar 2013, 18:29:13 UTC - in response to Message 1342970.  

Master database queries/second 1,305

Sure hope oscar doesn't blow a gasket!

Things have been running soooooo smoothly recently

Edit - back down to 544



Under some circumstances oscar has handled over 30,000 queries/second continuously without breaking a sweat. It's actually more worrisome if the queries/second is low (say, under 200) during what should be normal operations, because that means something is clogging the pipes.

- Matt
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude
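
Matt's rule of thumb can be written down as a tiny check. This is only an illustration: the 200 queries/second figure is the one he mentions, while the function name and return strings are made up for the example and are not part of any project tooling.

```python
LOW_QPS_THRESHOLD = 200  # "say, under 200" from the post above


def assess_query_rate(queries_per_second, normal_operations=True):
    """Flag a low query rate during normal operations; high rates are fine."""
    if normal_operations and queries_per_second < LOW_QPS_THRESHOLD:
        return "suspicious: low rate, something may be clogging the pipes"
    return "ok"


# The two readings reported earlier in the thread are both unremarkable.
print(assess_query_rate(1305))
print(assess_query_rate(544))
```

The normal_operations flag exists because a low rate during an outage or maintenance window says nothing about clogged pipes.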
ID: 1343012
Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1817
Credit: 262,693,308
RAC: 44
Australia
Message 1343025 - Posted: 4 Mar 2013, 18:49:30 UTC - in response to Message 1343012.  

...... because that means something is clogging the pipes.

- Matt

Speaking of clogged pipes. After a couple of days of relatively normal operation, downloads are starting to choke up again.

T.A.
ID: 1343025
__W__
Joined: 28 Mar 09
Posts: 116
Credit: 5,943,642
RAC: 0
Germany
Message 1343029 - Posted: 4 Mar 2013, 19:02:29 UTC - in response to Message 1342873.  

Don’t do that, people… :-)
Someone must be my wingman.
I have over 270 AP WUs pending.

Doing my best ;-)
For the first time since I started crunching SETI, my cache is filled only with AP WUs.

__W__

_______________________________________________________________________________
ID: 1343029
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1343049 - Posted: 4 Mar 2013, 19:54:23 UTC - in response to Message 1342970.  

Master database queries/second 1,305

Sure hope oscar doesn't blow a gasket!

Things have been running soooooo smoothly recently

Edit - back down to 544

It could have been the daily stats being generated, but since Matt commented and didn't mention that, maybe not.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1343049
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1343051 - Posted: 4 Mar 2013, 19:56:48 UTC - in response to Message 1343025.  

...... because that means something is clogging the pipes.

- Matt

Speaking of clogged pipes. After a couple of days of relatively normal operation, downloads are starting to choke up again.

I think it's the start of another shorty storm.
ID: 1343051
Claggy
Volunteer tester
Joined: 5 Jul 99
Posts: 4654
Credit: 47,537,079
RAC: 4
United Kingdom
Message 1343055 - Posted: 4 Mar 2013, 20:06:41 UTC - in response to Message 1343051.  

...... because that means something is clogging the pipes.

- Matt

Speaking of clogged pipes. After a couple of days of relatively normal operation, downloads are starting to choke up again.

I think it's the start of another shorty storm.

looks like it.

Claggy
ID: 1343055
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1343105 - Posted: 4 Mar 2013, 22:15:51 UTC - in response to Message 1343029.  
Last modified: 4 Mar 2013, 22:36:47 UTC

Don’t do that, people… :-)
Someone must be my wingman.
I have over 270 AP WUs pending.

Doing my best ;-)
For the first time since I started crunching SETI, my cache is filled only with AP WUs.

__W__

It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?
ID: 1343105
William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1343232 - Posted: 5 Mar 2013, 13:22:29 UTC - in response to Message 1343105.  


It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?

I suspect there are exactly three persons that know how the AP validator works (i.e. what parameters are checked with what error levels): Josh, Eric and Joe.

I know roughly what is checked on MB - number and type of signals and some scores attached to the individual signals. With MB, any task that has more than a 50% match to the canonical one will be valid. I would assume it works similarly for AP.

The number and type of signals found by those three closely related apps were the same, so something in the scores must have been too different. That is beyond our ability to check, because the scores are in the uploaded result file, not in stderr.

Why you are repeatedly coming up inconclusive against r1316 might need investigation.
A person who won't read has no advantage over one who can't read. (Mark Twain)
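
To make the "more than 50% match" idea concrete, here is a minimal sketch. It is not the real MB or AP validator, whose parameters and error levels are not described in this thread; the (type, score) tuple layout, the relative tolerance, and the function names are all assumptions chosen for illustration.

```python
def match_fraction(candidate, canonical, tolerance=0.01):
    """Fraction of canonical signals matched by the candidate result.

    Each signal is a (type, score) tuple. A candidate signal matches when
    the type is equal and the score is within a relative tolerance. Each
    candidate signal can be used for at most one match.
    """
    if not canonical:
        return 1.0 if not candidate else 0.0
    unused = list(candidate)
    matched = 0
    for sig_type, score in canonical:
        for i, (c_type, c_score) in enumerate(unused):
            if c_type == sig_type and abs(c_score - score) <= tolerance * abs(score):
                matched += 1
                del unused[i]
                break
    return matched / len(canonical)


def is_valid(candidate, canonical, tolerance=0.01):
    """Hypothetical rule: valid when more than half the signals match."""
    return match_fraction(candidate, canonical, tolerance) > 0.5


canonical = [("spike", 24.0), ("pulse", 1.2), ("triplet", 9.5)]
same_counts_one_score_off = [("spike", 24.0), ("pulse", 1.2001), ("triplet", 12.0)]
same_counts_two_scores_off = [("spike", 30.0), ("pulse", 1.2), ("triplet", 12.0)]
print(is_valid(same_counts_one_score_off, canonical))
print(is_valid(same_counts_two_scores_off, canonical))
```

The second candidate has the same number and type of signals as the canonical result yet fails the check, which is the kind of situation described above: the difference lives in the scores, not in the signal counts visible in stderr.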
ID: 1343232
Tim
Volunteer tester
Joined: 19 May 99
Posts: 211
Credit: 278,575,259
RAC: 0
Greece
Message 1343233 - Posted: 5 Mar 2013, 13:33:20 UTC - in response to Message 1343105.  




It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?


The only difference I see is the BOINC version.
You have .52; they have .28.

Tim
ID: 1343233
William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1343243 - Posted: 5 Mar 2013, 14:08:47 UTC - in response to Message 1343233.  
Last modified: 5 Mar 2013, 14:10:17 UTC

It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?


The only difference I see is the BOINC version.
You have .52; they have .28.

Tim

That is of no consequence to the validity of results.

BOINC only provides the infrastructure - what task is run when, what project to contact for tasks, how much work to ask for, etc. The science is all done by the respective applications. If something goes wrong there, the app is at fault, not BOINC. [The only exception that springs to mind is the problem that AP may not get BOINC 'suspend' orders under some circumstances. That's something where the API is probably involved - that's the bit that handles communication between BOINC and the apps.]
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1343243
James Sotherden
Joined: 16 May 99
Posts: 10436
Credit: 110,373,059
RAC: 54
United States
Message 1343246 - Posted: 5 Mar 2013, 14:14:11 UTC

I'm starting to see the occasional -12 again. Is anyone else?

And is there a way to tell if it's just the work units or my machines?
[/quote]

Old James
ID: 1343246
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1343247 - Posted: 5 Mar 2013, 14:15:11 UTC - in response to Message 1343243.  

It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?


The only difference I see is the BOINC version.
You have .52; they have .28.

Tim

That is of no consequence to the validity of results.

BOINC only provides the infrastructure - what task is run when, what project to contact for tasks, how much work to ask for, etc. The science is all done by the respective applications. If something goes wrong there, the app is at fault, not BOINC. [The only exception that springs to mind is the problem that AP may not get BOINC 'suspend' orders under some circumstances. That's something where the API is probably involved - that's the bit that handles communication between BOINC and the apps.]

And even then, I'd expect all versions of BOINC to be sending the suspend orders - by that stage, the API code responsible for listening for them and acting on them has been compiled and linked into the science application. Again, fixing it is outside the end-user's control.
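
A toy model of that division of labour, for anyone who finds it easier to read as code. This is not the BOINC API; run_task, poll_suspend, and the step counts are hypothetical names used only to show that the order is delivered from outside, while the listening and acting happen inside the application.

```python
def run_task(total_steps, poll_suspend, max_polls=1000):
    """Crunch `total_steps` units of work, checking for suspend orders.

    `poll_suspend` stands in for the listening code compiled into the
    science application: it returns True while the client wants the task
    suspended. `max_polls` only guards against an endless suspend.
    """
    done = 0
    polls = 0
    log = []
    while done < total_steps and polls < max_polls:
        polls += 1
        if poll_suspend():
            log.append("suspended")  # honour the order, do no work
            continue
        done += 1
        log.append(f"step {done}")
    return log


# The "client" sends: run, suspend, suspend, run, run.
orders = iter([False, True, True, False, False])
print(run_task(3, lambda: next(orders)))
```

An application whose loop never calls poll_suspend would keep crunching regardless of what the client sends, whichever client version is installed. That is why a fix has to come with a rebuilt application.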
ID: 1343247
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1343248 - Posted: 5 Mar 2013, 14:16:28 UTC
Last modified: 5 Mar 2013, 14:17:18 UTC

-12 errors are NVIDIA-related.
Not your fault.


With each crime and every kindness we birth our future.
ID: 1343248
Mike Special Project $75 donor
Volunteer tester
Joined: 17 Feb 01
Posts: 34253
Credit: 79,922,639
RAC: 80
Germany
Message 1343249 - Posted: 5 Mar 2013, 14:18:47 UTC - in response to Message 1343248.  

-12 errors are NVIDIA-related.
Not your fault.


But you can try x41zc.
It will reduce them.



With each crime and every kindness we birth our future.
ID: 1343249
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1343250 - Posted: 5 Mar 2013, 14:19:15 UTC - in response to Message 1343248.  

-12 errors are NVIDIA-related.
Not your fault.

They do tend to be application specific, with the optimised applications having a much reduced, and still reducing, incidence compared to the stock applications - though I don't think we're down to zero yet.
ID: 1343250
William
Volunteer tester
Joined: 14 Feb 13
Posts: 2037
Credit: 17,689,662
RAC: 0
Message 1343251 - Posted: 5 Mar 2013, 14:23:25 UTC - in response to Message 1343246.  

I'm starting to see the occasional -12 again. Is anyone else?

And is there a way to tell if it's just the work units or my machines?

I wonder why in all those years we never came up with a FAQ entry for that ::)

It is the units in conjunction with the app.
-12 errors are connected with a certain triplet condition, or a number of conditions. The original NVIDIA design did not account properly for those conditions: rather than finding a way to calculate, it throws an error. Apparently that's a design flaw, not a bug [side glance @ Jason].

Over the years, intensive development by Jason has all but abolished -12.
The stock applications (6.08/6.09/6.10) will throw a lot of them.
The latest public installer-based release, x41g, only throws a few.
The x41zc currently in public beta has, to my knowledge, completely abolished them.

IOW, nothing to worry about and nothing you can do besides a manual upgrade to x41zc.
A person who won't read has no advantage over one who can't read. (Mark Twain)
ID: 1343251
 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.