Panic Mode On (82) Server Problems?



Message boards : Number crunching : Panic Mode On (82) Server Problems?

Tim
Volunteer tester
Joined: 19 May 99
Posts: 205
Credit: 250,529,842
RAC: 52,025
Greece
Message 1342873 - Posted: 4 Mar 2013, 5:08:09 UTC - in response to Message 1342800.

Until we figure out the bandwidth problem, I am going to eliminate AP work from my machines. 366k downloads much better than 8MB.
My 660 kept getting locked out of SETI work because the AP work was getting stuck in backoff and I was already out of MB work.

Yup, I did that last week for the same reason :(


Don’t do that, people… :-)
Someone must be my wingman.
I have over 270 AP WUs pending.

Tim

____________

Bruce
Joined: 15 Mar 02
Posts: 10
Credit: 33,516,498
RAC: 22,347
United States
Message 1342906 - Posted: 4 Mar 2013, 7:28:40 UTC - in response to Message 1342351.

Hi Richard
Thought I would let you know that I got that download_retry.cmd to work, thanks to Horacio pointing out my rather embarrassing mistake. I redid the file and placed it into C:\tasks and it runs just fine.
It seems like an elegant solution that you came up with for SETI-only crunchers. I am not sure, but it looks like it might work with any version of Windows running any version of BOINC.
Many thanks to you and everyone who helped me sort this out.
Thanks.
Bruce
____________

Bebe
Joined: 6 Jun 99
Posts: 3
Credit: 25,315,448
RAC: 5,940
Germany
Message 1342915 - Posted: 4 Mar 2013, 9:44:59 UTC - in response to Message 1342610.

[quote].........
I'm using HTTP 165.24.10.8:8080, and "don't use for:"
http://setiboincdata.ssl.berkeley.edu/sah_cgi/file_upload_handler,http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi


.....


Thank you.

bebe
____________

Tom*
Joined: 12 Aug 11
Posts: 114
Credit: 4,815,461
RAC: 82
United States
Message 1342970 - Posted: 4 Mar 2013, 16:33:22 UTC
Last modified: 4 Mar 2013, 17:05:17 UTC

Master database queries/second 1,305

Sure hope oscar doesn't blow a gasket!

Things have been running soooooo smoothly recently

Edit - back down to 544

Matt Lebofsky
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 1 Mar 99
Posts: 1390
Credit: 74,079
RAC: 0
United States
Message 1343012 - Posted: 4 Mar 2013, 18:29:13 UTC - in response to Message 1342970.

Master database queries/second 1,305

Sure hope oscar doesn't blow a gasket!

Things have been running soooooo smoothly recently

Edit - back down to 544



Under some circumstances oscar has handled over 30,000 queries/second continuously without breaking a sweat. It's actually more worrisome if the queries/second is low (say, under 200) during what should be normal operations, because that means something is clogging the pipes.
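
As a rough illustration of that rule of thumb, here is a minimal Python sketch. The 200 queries/second floor is only the ballpark figure mentioned above, and the function name and return strings are made up for the example; none of this is an actual project monitoring script.

```python
# Illustrative only: 200 q/s is the rough "say, under 200" figure from the post,
# not an official alert threshold. All names here are hypothetical.
LOW_QPS_FLOOR = 200


def assess_query_rate(qps: float, normal_operations: bool = True) -> str:
    """Classify a master database queries/second reading.

    A high rate is not a concern by itself; a low rate during what should be
    normal operations suggests something upstream is blocked.
    """
    if normal_operations and qps < LOW_QPS_FLOOR:
        return "worrisome: something may be clogging the pipes"
    return "fine"


if __name__ == "__main__":
    # The two readings reported in this thread, plus one low example value.
    for reading in (1305, 544, 150):
        print(reading, "->", assess_query_rate(reading))
```

Note that the check deliberately has no upper limit: by this reasoning, 1,305 and 544 both count as unremarkable.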

- Matt
____________
-- BOINC/SETI@home network/web/science/development person
-- "Any idiot can have a good idea. What is hard is to do it." - Jeanne-Claude

Terror Australis
Volunteer tester
Joined: 14 Feb 04
Posts: 1758
Credit: 206,460,507
RAC: 15,584
Australia
Message 1343025 - Posted: 4 Mar 2013, 18:49:30 UTC - in response to Message 1343012.

...... because that means something is clogging the pipes.

- Matt

Speaking of clogged pipes. After a couple of days of relatively normal operation, downloads are starting to choke up again.

T.A.

__W__
Joined: 28 Mar 09
Posts: 114
Credit: 3,270,411
RAC: 371
Germany
Message 1343029 - Posted: 4 Mar 2013, 19:02:29 UTC - in response to Message 1342873.

Don’t do that people… :-)
Someone must be my wingman.
I have over 270 AP wu’s pending.

Doing my best ;-)
For the first time since I started crunching SETI, my cache is filled only with AP WUs.

__W__

____________

HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 4592
Credit: 121,555,852
RAC: 50,731
United States
Message 1343049 - Posted: 4 Mar 2013, 19:54:23 UTC - in response to Message 1342970.

Master database queries/second 1,305

Sure hope oscar doesn't blow a gasket!

Things have been running soooooo smoothly recently

Edit - back down to 544

It could have been the daily stats being generated, but since Matt commented and didn't mention that, maybe not.
____________
SETI@home classic workunits: 93,865 CPU time: 863,447 hours

Join the BP6/VP6 User Group today!

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8760
Credit: 52,709,504
RAC: 25,984
United Kingdom
Message 1343051 - Posted: 4 Mar 2013, 19:56:48 UTC - in response to Message 1343025.

...... because that means something is clogging the pipes.

- Matt

Speaking of clogged pipes. After a couple of days of relatively normal operation, downloads are starting to choke up again.

I think it's the start of another shorty storm.

Claggy
Project donor
Volunteer tester
Joined: 5 Jul 99
Posts: 4209
Credit: 34,468,444
RAC: 19,151
United Kingdom
Message 1343055 - Posted: 4 Mar 2013, 20:06:41 UTC - in response to Message 1343051.

...... because that means something is clogging the pipes.

- Matt

Speaking of clogged pipes. After a couple of days of relatively normal operation, downloads are starting to choke up again.

I think it's the start of another shorty storm.

Looks like it.

Claggy

TBar
Volunteer tester
Joined: 22 May 99
Posts: 1496
Credit: 53,077,452
RAC: 46,857
United States
Message 1343105 - Posted: 4 Mar 2013, 22:15:51 UTC - in response to Message 1343029.
Last modified: 4 Mar 2013, 22:36:47 UTC

Don’t do that people… :-)
Someone must be my wingman.
I have over 270 AP wu’s pending.

Doing my best ;-)
For the first time ever i crunch seti, my cache is filled up only with AP-WUs.

__W__

It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?

William
Project donor
Volunteer tester
Joined: 14 Feb 13
Posts: 1610
Credit: 9,470,168
RAC: 16
Message 1343232 - Posted: 5 Mar 2013, 13:22:29 UTC - in response to Message 1343105.


It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?

I suspect there are exactly three people who know how the AP validator works (i.e. what parameters are checked with what error levels): Josh, Eric and Joe.

I know roughly what is checked on MB: the number and type of signals, and some scores attached to the individual signals. With MB, any task that has more than a 50% match to the canonical one will be valid. I would assume it works similarly for AP.

The number and type of signals found by those three closely related apps was the same, so something in the scores must have been too different. That is beyond our ability to check, because the scores are in the uploaded result file, not in stderr.
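
To make the idea concrete, here is a toy Python sketch of a "more than 50% match to the canonical result" check. It is not the real MB or AP validator, whose details are not given in this thread. The `Signal` fields, the relative tolerance, and the matching strategy are all assumptions chosen only to show how two results with the same number and type of signals can still fail on scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    kind: str     # signal type, e.g. "spike" or "pulse" (hypothetical labels)
    score: float  # hypothetical per-signal score


def match_fraction(candidate, canonical, rel_tol=0.01):
    """Return the fraction of canonical signals matched by the candidate.

    A candidate signal matches when it has the same kind and its score is
    within rel_tol (an assumed tolerance) of the canonical score. Each
    candidate signal can be used for at most one match.
    """
    if not canonical:
        return 1.0 if not candidate else 0.0
    unused = list(candidate)
    matched = 0
    for ref in canonical:
        allowed = rel_tol * max(abs(ref.score), 1e-12)
        for i, sig in enumerate(unused):
            if sig.kind == ref.kind and abs(sig.score - ref.score) <= allowed:
                matched += 1
                del unused[i]
                break
    return matched / len(canonical)


def is_valid(candidate, canonical, threshold=0.5):
    """Valid when strictly more than `threshold` of canonical signals match."""
    return match_fraction(candidate, canonical) > threshold


# Made-up example values: same number and type of signals, different scores.
canonical = [Signal("spike", 24.0), Signal("pulse", 1.05), Signal("triplet", 9.8)]
same_counts_different_scores = [
    Signal("spike", 30.0), Signal("pulse", 2.0), Signal("triplet", 9.8),
]

if __name__ == "__main__":
    print(is_valid(canonical, canonical))
    print(is_valid(same_counts_different_scores, canonical))
```

In the second case only one of three signals matches, so the result falls below the threshold even though a summary of signal counts would look identical.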

Why you are repeatedly coming up inconclusive against r1316 might need investigation.
____________
A person who won't read has no advantage over one who can't read. (Mark Twain)

Tim
Volunteer tester
Joined: 19 May 99
Posts: 205
Credit: 250,529,842
RAC: 52,025
Greece
Message 1343233 - Posted: 5 Mar 2013, 13:33:20 UTC - in response to Message 1343105.




It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?


The only difference I see is the BOINC version.
You have .52; they have .28.

Tim
____________

William
Project donor
Volunteer tester
Joined: 14 Feb 13
Posts: 1610
Credit: 9,470,168
RAC: 16
Message 1343243 - Posted: 5 Mar 2013, 14:08:47 UTC - in response to Message 1343233.
Last modified: 5 Mar 2013, 14:10:17 UTC

It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?


The only thing I see it's the Boinc version.
You have .52 they have .28

Tim

That is of no consequence to the validity of results.

BOINC only provides the infrastructure: which task is run when, which project to contact for tasks, how much work to ask for, etc. The science is all done by the respective applications. If something goes wrong there, the app is at fault, not BOINC. [The only exception that springs to mind is the problem that AP may not get BOINC 'suspend' orders under some circumstances. That is something where the API is probably involved, i.e. the bit that handles communication between BOINC and the apps.]
____________
A person who won't read has no advantage over one who can't read. (Mark Twain)

James Sotherden
Joined: 16 May 99
Posts: 9026
Credit: 36,974,752
RAC: 22,841
United States
Message 1343246 - Posted: 5 Mar 2013, 14:14:11 UTC

I'm starting to see the occasional -12 again. Is anyone else?

And is there a way to tell if it's just the work units or my machines?
____________

Old James

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8760
Credit: 52,709,504
RAC: 25,984
United Kingdom
Message 1343247 - Posted: 5 Mar 2013, 14:15:11 UTC - in response to Message 1343243.

It is beginning to be a pain. I have 170 waiting on one machine. Then you have little incidents where three different hosts, using three different versions of the software, all arrive at the same general results yet the one using the most recent software is marked as 'Invalid'. Not very encouraging...

Someone care to explain why that was marked as Invalid?


The only thing I see it's the Boinc version.
You have .52 they have .28

Tim

That is of no consequence to the validity of results.

Boinc only provides the infrastructure - what task is run when, what project to contact for tasks, for how much work to ask, etc. the science is all done by the respective applications. If something goes wrong there, the app is at fault not boinc. [the only exception that springs to mind being the problem that AP may not get boinc 'suspend' orders under some circumstances, that's something where the API is probably involved - that's the bit that handles communication between boinc and the apps]

And even then, I'd expect all versions of BOINC to be sending the suspend orders - by that stage, the API code responsible for listening for them and acting on them has been compiled and linked into the science application. Again, fixing it is outside the end-user's control.

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 24889
Credit: 34,404,700
RAC: 11,927
Germany
Message 1343248 - Posted: 5 Mar 2013, 14:16:28 UTC
Last modified: 5 Mar 2013, 14:17:18 UTC

-12 errors are NVIDIA-related.
Not your fault.
____________

Mike
Project donor
Volunteer tester
Joined: 17 Feb 01
Posts: 24889
Credit: 34,404,700
RAC: 11,927
Germany
Message 1343249 - Posted: 5 Mar 2013, 14:18:47 UTC - in response to Message 1343248.

-12 errors are nvidia related.
Not your fault.


But you can try x41zc.
It will reduce them.

____________

Richard Haselgrove
Project donor
Volunteer tester
Joined: 4 Jul 99
Posts: 8760
Credit: 52,709,504
RAC: 25,984
United Kingdom
Message 1343250 - Posted: 5 Mar 2013, 14:19:15 UTC - in response to Message 1343248.

-12 errors are nvidia related.
Not your fault.

They do tend to be application specific, with the optimised applications having a much reduced, and still reducing, incidence compared to the stock applications - though I don't think we're down to zero yet.

William
Project donor
Volunteer tester
Joined: 14 Feb 13
Posts: 1610
Credit: 9,470,168
RAC: 16
Message 1343251 - Posted: 5 Mar 2013, 14:23:25 UTC - in response to Message 1343246.

Im starting to see the occasional -12 again, Is anyone else?

And is there a way to tell if its just the work units or my machines?

I wonder why in all those years we never came up with a FAQ entry for that ::)

It is the units in conjunction with the app.
-12 errors are connected with a certain triplet condition, or a number of conditions. The original NVIDIA design did not account properly for those conditions: rather than finding a way to calculate, it throws an error. Apparently that's a design flaw, not a bug [side glance @ Jason].

Over the years, intensive development by Jason has all but abolished -12.
The stock applications (6.08/6.09/6.10) will throw a lot of them.
The latest public installer-based release, x41g, only throws a few.
The x41zc build currently in public beta has, to my knowledge, completely abolished them.

IOW nothing to worry about and nothing you can do, besides doing a manual upgrade to x41zc.
____________
A person who won't read has no advantage over one who can't read. (Mark Twain)


Copyright © 2014 University of California