Panic Mode On (98) Server Problems?

Message boards : Number crunching : Panic Mode On (98) Server Problems?
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1698561 - Posted: 4 Jul 2015, 23:29:28 UTC - in response to Message 1698552.  
Last modified: 4 Jul 2015, 23:40:45 UTC

I was wondering about the ati-nocal and ati5-nocal tasks. I have been getting storms of them. They are short to very short. Is SETI trying for a quick turnaround on these types of WUs?

There are no ati-nocal or ati5-nocal WUs, only tasks that get marked as ati-nocal or ati5-nocal after you ask the server for work. We often get groups of VHAR tasks, or "shortie storms". Pretty standard things. Unless your GPU is having errors and just trashing all of the tasks it gets, but looking over your tasks, they are just VHARs. So nothing to worry about.

Those tasks at true angle range 0.408901 aren't VHARs, http://setiathome.berkeley.edu/results.php?hostid=7368710&offset=80
He's just having a hard time believing his card is working Sooo much better than before, http://setiathome.berkeley.edu/result.php?resultid=4175896531

The New App r2929 is Much better than the old r1831, but, I'm having a hard time believing it's that much better. Perhaps his recent repairs helped a little.
Anyway, it's good to see his card working the way it's supposed to. Hopefully CreditFew will realize an AR of 0.408901 should be scoring around 100 credits before too much longer.


Well, since you know exactly which tasks I was looking at when they mentioned they are running "very short tasks", why not look at the same ones I did?
http://setiathome.berkeley.edu/result.php?resultid=4248289513
http://setiathome.berkeley.edu/result.php?resultid=4248212965
http://setiathome.berkeley.edu/result.php?resultid=4248212969

Okay. I am totally blown away by my GPU card's performance. My RAC has gone up 400% since I replaced the CPU cooler and PSU.

It does seem to be performing better than just the new app would warrant. It might go even faster; you should try each of those 3 mb_cmdlines I gave you for about half a day each. The one with all the numbers set at 256 was the last one being used on my Vista host. It might be the best for Windows, you'll just have to try it...for a few hours.
Just in time for the Cruncher challenge later this month...
ID: 1698561
Admiral Gloval
Joined: 31 Mar 13
Posts: 21184
Credit: 5,308,449
RAC: 0
United States
Message 1698571 - Posted: 4 Jul 2015, 23:54:02 UTC - in response to Message 1698561.  

It does seem to be performing better than just the new app would warrant. It might go even faster; you should try each of those 3 mb_cmdlines I gave you for about half a day each. The one with all the numbers set at 256 was the last one being used on my Vista host. It might be the best for Windows, you'll just have to try it...for a few hours.
Just in time for the Cruncher challenge later this month...

I am seeing credit numbers climbing in the 8000-9000's lately.

ID: 1698571
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1698898 - Posted: 6 Jul 2015, 5:03:46 UTC
Last modified: 6 Jul 2015, 5:04:14 UTC

How the hell can this validate?

Stderr output
<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>


http://setiathome.berkeley.edu/workunit.php?wuid=1819553789

I have more than a few tasks doing this

And I haven't figured out yet why I'm shooting blanks on MB GPU tasks :((
ID: 1698898
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1698909 - Posted: 6 Jul 2015, 5:40:42 UTC - in response to Message 1698898.  

How the hell can this validate?

Stderr output
<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>


http://setiathome.berkeley.edu/workunit.php?wuid=1819553789

I have more than a few tasks doing this

And I haven't figured out yet why I'm shooting blanks on MB GPU tasks :((

There's an old thread about that. There were even Apps developed to stop it. I'm not sure about the current status; however, the old Apps are here: http://www.jgopt.org/download.html
The one you would want is Lunatics_x41zc_win32_cuda50_commode_exeonly.7z. Since you are running setiathome enhanced x41zc, Cuda 5.00 under Anonymous platform, it wouldn't be very hard to give it a try.
If it were me, I'd probably downgrade to a different version of BOINC....if the Commode trick doesn't work.
ID: 1698909
Speedy
Volunteer tester
Joined: 26 Jun 04
Posts: 1643
Credit: 12,921,799
RAC: 89
New Zealand
Message 1698920 - Posted: 6 Jul 2015, 6:31:30 UTC - in response to Message 1698898.  

How the hell can this validate?

Stderr output
<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>


http://setiathome.berkeley.edu/workunit.php?wuid=1819553789


I'm not 100% sure, but it could be because the other result ran for over 13,000 seconds and gave output data.
ID: 1698920
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1698961 - Posted: 6 Jul 2015, 12:51:23 UTC - in response to Message 1698898.  

How the hell can this validate?

Stderr output
<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>


http://setiathome.berkeley.edu/workunit.php?wuid=1819553789

I have more than a few tasks doing this

And I haven't figured out yet why I'm shooting blanks on MB GPU tasks :((

To answer your question: SETI@home result data is separate from stderr_txt.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1698961
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1698969 - Posted: 6 Jul 2015, 13:22:28 UTC

Thanks Hal.
ID: 1698969
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1698982 - Posted: 6 Jul 2015, 14:13:08 UTC - in response to Message 1698969.  
Last modified: 6 Jul 2015, 14:21:58 UTC

BTW, Some of them are Not Validating, as here, Invalid tasks for computer 7454279.
Those are called Instant Invalids in the thread here;
Strange Invalid MB Overflow tasks with truncated Stderr outputs...
It's Exactly what you are experiencing, an Invalid Overflow with a truncated Stderr and a Spike count of less than 30.

Yes, that thread was a Long time ago, the problem was identified, yet it still exists.
ID: 1698982
Brent Norman
Volunteer tester
Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1698985 - Posted: 6 Jul 2015, 14:34:55 UTC - in response to Message 1698982.  
Last modified: 6 Jul 2015, 14:42:56 UTC

Hey TBar,

Yes I have a pile of invalids for June 29/30. I reinstalled Lunatics to fix some AP problems and inadvertently installed cuda32 for MB. So it errored out my entire GPU cache until I caught it and reinstalled.

I only have 1 error since then.

EDIT: Oops, I meant 1 invalid since then; my 3 aborted errors are from me getting rid of cuda32 tasks before reinstalling.

That's why I was asking about blank stderr files validating. I've been watching to make sure I'm running clean.
ID: 1698985
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1698988 - Posted: 6 Jul 2015, 14:44:10 UTC - in response to Message 1698985.  
Last modified: 6 Jul 2015, 15:31:14 UTC

Hey TBar,

Yes I have a pile of invalids for June 29/30. I reinstalled Lunatics to fix some AP problems and inadvertently installed cuda32 for MB. So it errored out my entire GPU cache until I caught it and reinstalled.

I only have 1 error since then.

That One 'Error', actually an Invalid, just happens to be an Instant Invalid on an Overflow with a truncated Stderr and a Spike count of less than 30. As are a few of the others.
Right...
If you follow the thread you will discover part of the Stderr is used by the Validator on certain Overflows.
Whenever your results are Invalidated as soon as the validator looks at them, it means something is missing/wrong.
What is missing is the part of the Stderr the validator is looking for.
But, never mind.
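
As a rough illustration of what an empty or truncated Stderr looks like in practice, here is a minimal Python sketch that pulls the stderr_txt body out of a Stderr output like the one quoted earlier in this thread and reports whether anything is in it. The function names and the "empty body means truncated" test are assumptions made for illustration; this is not the validator's actual logic, which the thread does not show.

```python
import re

def stderr_body(stderr_output: str) -> str:
    """Return the text between <stderr_txt> and </stderr_txt>, stripped.

    Returns an empty string if the pair of tags is not found at all.
    """
    match = re.search(r"<stderr_txt>(.*?)</stderr_txt>", stderr_output, re.DOTALL)
    return match.group(1).strip() if match else ""

def looks_truncated(stderr_output: str) -> bool:
    # Assumption: an empty stderr_txt body is treated as "truncated".
    return stderr_body(stderr_output) == ""

# The Stderr output quoted in the thread: the tags are there, the body is blank.
sample = """<core_client_version>7.4.42</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>"""

print(looks_truncated(sample))
```

Run against the quoted output this reports the body as empty, which matches the "shooting blanks" description. Whether such a task ends up Invalid is a separate question that this sketch does not try to answer.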
ID: 1698988
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1699005 - Posted: 6 Jul 2015, 16:43:06 UTC - in response to Message 1698982.  

Yes, that thread was a Long time ago, the problem was identified, yet it still exists.

Most irritating to me, not only was the problem identified, but a simple fix for the validation side of it was proposed by Joe Segur and passed along to Eric....never to be heard of again. While Jason's commode build seems to successfully address the Stderr truncation, it only covers NVIDIA GPUs. Unfortunately, the truncations can happen on CPUs and ATI GPUs, as well, so the validation-side fix would be more comprehensive, even though it wouldn't eliminate the truncations. I usually see the truncations happening every day, but only a small subset of those end up Invalid, currently averaging about 5 a month for me in 2015.
ID: 1699005
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1699009 - Posted: 6 Jul 2015, 17:14:44 UTC - in response to Message 1699005.  

Yes, that thread was a Long time ago, the problem was identified, yet it still exists.

Most irritating to me, not only was the problem identified, but a simple fix for the validation side of it was proposed by Joe Segur and passed along to Eric....never to be heard of again. While Jason's commode build seems to successfully address the Stderr truncation, it only covers NVIDIA GPUs. Unfortunately, the truncations can happen on CPUs and ATI GPUs, as well, so the validation-side fix would be more comprehensive, even though it wouldn't eliminate the truncations. I usually see the truncations happening every day, but only a small subset of those end up Invalid, currently averaging about 5 a month for me in 2015.

I ran the commode build for months and months on my Win 8.1 system and never had the truncated Stderr again. Then one day I tried it with the normal build and still haven't had the problem. I attribute it to BOINC 7.2.33 which seems to be the best version I've come across. That's why it's on all my hosts, plus, look at what Eric is running, http://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=72274. I wonder if Eric has had any of those truncated Stderrs, I really haven't checked...
;-)
ID: 1699009
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1699015 - Posted: 6 Jul 2015, 17:30:11 UTC - in response to Message 1699009.  

I attribute it to BOINC 7.2.33 which seems to be the best version I've come across.

Well, take a look at my active hosts and see which BOINC version I'm running....on all of them. :^) I still average a couple of truncations a day, I think, though as I mentioned, only(!) about 5 a month end up Invalid.
ID: 1699015
atlov

Joined: 11 Aug 12
Posts: 35
Credit: 32,718,664
RAC: 34
Germany
Message 1699020 - Posted: 6 Jul 2015, 17:39:02 UTC

Hi folks!

I just checked my pending MBs and noticed some recent results haven't validated yet, although my wingmen and I finished the WUs. For example http://setiathome.berkeley.edu/workunit.php?wuid=1830619685
Any idea what's wrong with the validators?
ID: 1699020
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1699021 - Posted: 6 Jul 2015, 17:39:32 UTC - in response to Message 1699015.  

I attribute it to BOINC 7.2.33 which seems to be the best version I've come across.

Well, take a look at my active hosts and see which BOINC version I'm running....on all of them. :^) I still average a couple of truncations a day, I think, though as I mentioned, only(!) about 5 a month end up Invalid.

In that case, I'd have to resort to the suspicion my GTS250 isn't fast enough to outrun the Windows file system, most of the time. Since it's strictly a Windows problem, and I'm only running 1 Windows host, I suppose my other systems are immune...lucky me.
ID: 1699021
Donald L. Johnson
Joined: 5 Aug 02
Posts: 8240
Credit: 14,654,533
RAC: 20
United States
Message 1699027 - Posted: 6 Jul 2015, 18:13:56 UTC - in response to Message 1699020.  

Hi folks!

I just checked my pending MBs and noticed some recent results haven't validated yet, although my wingmen and I finished the WUs. For example http://setiathome.berkeley.edu/workunit.php?wuid=1830619685
Any idea what's wrong with the validators?

27 June was the day that S@H shut down while the CoLo did some power systems work. My best guess is that when the project came back up, the validators missed some tasks. In the past, IIRC, those Tasks sit on the servers until the Deadline date, which for that Task is 18 August, then they get revisited by the Validators. If they don't get picked up then, Eric or Matt can manually validate them.
Donald
Infernal Optimist / Submariner, retired
ID: 1699027
Jeff Buck
Volunteer tester
Joined: 11 Feb 00
Posts: 1441
Credit: 148,764,870
RAC: 0
United States
Message 1699028 - Posted: 6 Jul 2015, 18:14:18 UTC - in response to Message 1699020.  

Hi folks!

I just checked my pending MBs and noticed some recent results haven't validated yet, although my wingmen and I finished the WUs. For example http://setiathome.berkeley.edu/workunit.php?wuid=1830619685
Any idea what's wrong with the validators?

It seems that, from time to time, the validators take a nap and some WUs like that one slip past them. Last September I had a whole bunch of tasks that fell into a validation black hole that lasted about 10 minutes or so. Richard Haselgrove responded to my report, as follows:

There is a failsafe in place, which weeds out most errors like that - the transitioners/validators take a second look at stuck WUs on the day the original deadlines would have passed (three weeks for shorties, six/seven/eight weeks away for the rest).

He was right. When the original deadline was reached for those tasks, the validators successfully picked them all up. In the case of your task, that should be about August 18. You'll just have to be patient until then.
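
To put a number on "be patient", here is a quick Python sketch of the wait, counting from the date of this post (6 Jul 2015) to the 18 August deadline mentioned above. The year of the deadline is assumed to be 2015, and the variable names are just for illustration.

```python
from datetime import date

posted = date(2015, 7, 6)     # date of this post
deadline = date(2015, 8, 18)  # deadline for the task; year assumed to be 2015

days_left = (deadline - posted).days
print(days_left, "days, about", round(days_left / 7, 1), "weeks")
```

That works out to 43 days, a little over six weeks, before the validators would take their second look.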
ID: 1699028
atlov

Joined: 11 Aug 12
Posts: 35
Credit: 32,718,664
RAC: 34
Germany
Message 1699031 - Posted: 6 Jul 2015, 18:26:42 UTC - in response to Message 1699028.  

Thanks Donald & Jeff. Panic mode off until the deadline :D
ID: 1699031
kittyman
Volunteer tester
Joined: 9 Jul 00
Posts: 51477
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1699051 - Posted: 6 Jul 2015, 19:15:22 UTC
Last modified: 6 Jul 2015, 19:15:53 UTC

Uh, oh...
I see the SSP has not updated in over an hour, and the loading of stats seems to be crawling again.
"Time is simply the mechanism that keeps everything from happening all at once."

ID: 1699051
HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1699056 - Posted: 6 Jul 2015, 19:27:12 UTC - in response to Message 1699051.  

Uh, oh...
I see the SSP has not updated in over an hour, and the loading of stats seems to be crawling again.

I am going to assume that boinc.berkeley.edu suddenly being offline as well is unrelated.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1699056


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.