| SETI@home | task postponed 600.000000 sec: Waiting to acquire lock

Questions and Answers : Windows : | SETI@home | task postponed 600.000000 sec: Waiting to acquire lock

JBird (Project Donor)
Joined: 3 Sep 02
Posts: 297
Credit: 325,260,309
RAC: 549
United States
Message 1687768 - Posted: 4 Jun 2015, 16:14:11 UTC
Last modified: 4 Jun 2015, 16:29:43 UTC

Is there an explanation for this? Does it self-correct, or is there something I can do?

[Edit] I guess the question is what happened, how, and why.

Jord (Volunteer tester)
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1687780 - Posted: 4 Jun 2015, 16:42:49 UTC - in response to Message 1687768.  
Last modified: 4 Jun 2015, 16:43:08 UTC

Check Jason's answer in this thread:

That's recovery from a failure during transfer of gaussfit results from the VRAM back to the host.

If it only happened the once, or extremely rarely, I wouldn't worry about it. The temporary exit did what it could to back out and try again, hopefully saving the result from some spurious problem.

If it happens more often, then you'd be looking at a range of possible issues, from disk integrity, through drivers, to some hardware fault.

What actually caused that particular glitch, assuming ordinary consumer-grade equipment like the rest of us use, could be just about anything: a one-off spurious glitch, something temperature-induced, [some other application messing with the GPU or system at the time], some immature system driver, even radiation from the silicon chip packaging material.

That's why enterprise gear with ECC RAM, Tesla compute cards, redundant PSUs etc. exists... but for me a little bit of fault tolerance is more practical ;)
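
To make the "back out and try again" idea concrete, here is a minimal Python sketch of the general pattern: try to take a lock, and if it is held, postpone and retry later. This is not the SETI@home application's or the BOINC client's actual code. The lock file path, the function names `try_acquire_lock` and `run_or_postpone`, and the `max_attempts` limit are all hypothetical; only the 600-second delay and the message text come from this thread.

```python
import os
import time

POSTPONE_SECONDS = 600  # matches the "task postponed 600.000000 sec" message


def try_acquire_lock(path):
    """Try to create a lock file atomically; return True on success."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # someone else already holds the lock
    os.close(fd)
    return True


def run_or_postpone(lock_path, work, sleep=time.sleep, max_attempts=3):
    """Run work() under the lock, postponing between attempts while it is held.

    Hypothetical helper for illustration only.
    """
    for _ in range(max_attempts):
        if try_acquire_lock(lock_path):
            try:
                return work()
            finally:
                os.remove(lock_path)  # always release the lock
        print(f"task postponed {POSTPONE_SECONDS:.6f} sec: Waiting to acquire lock")
        sleep(POSTPONE_SECONDS)
    raise TimeoutError(f"lock {lock_path!r} still held after {max_attempts} attempts")
```

A one-off failure costs a single postponement and then the work proceeds, which fits the "if it only happened the once, don't worry" advice. Repeated postponements are the case that would point at something persistent.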

Zalster (Special Project $250 donor, Volunteer tester)
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1687800 - Posted: 4 Jun 2015, 17:31:29 UTC - in response to Message 1687780.  

JBird...

How many tasks were you trying to run? 4? My guess is you ran out of memory and it's waiting for some memory to free up so it can restart.

Exit and restart.

Are you monitoring how much memory each task is using?

I'd also lower the number of tasks to 2, possibly 3, but that's pushing it.

Zalster
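
For anyone wanting to try the "lower the number of tasks" suggestion, the usual knob in BOINC is an `app_config.xml` file in the project's folder, where `gpu_usage` is the fraction of a GPU each task claims (0.5 means 2 tasks per GPU, 0.25 means 4). The Python sketch below just generates that XML. The app name `setiathome_v7` and the `cpu_usage` value of 0.04 are assumptions for illustration, and `build_app_config` is a made-up helper name; check your own host for the real app name before using anything like this.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, tasks_per_gpu, cpu_usage):
    """Return app_config.xml text that runs tasks_per_gpu tasks on each GPU."""
    if tasks_per_gpu < 1:
        raise ValueError("tasks_per_gpu must be at least 1")
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu = ET.SubElement(app, "gpu_versions")
    # Each task claims 1/N of a GPU, so N tasks fit on one card.
    ET.SubElement(gpu, "gpu_usage").text = f"{1 / tasks_per_gpu:g}"
    ET.SubElement(gpu, "cpu_usage").text = f"{cpu_usage:g}"
    return ET.tostring(root, encoding="unicode")


# Assumed app name and CPU fraction; substitute the values for your own host.
print(build_app_config("setiathome_v7", tasks_per_gpu=2, cpu_usage=0.04))
```

After saving the output as `app_config.xml`, the client needs to re-read its config files (or be restarted) before the new task count takes effect.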
JBird (Project Donor)
Joined: 3 Sep 02
Posts: 297
Credit: 325,260,309
RAC: 549
United States
Message 1687805 - Posted: 4 Jun 2015, 17:49:05 UTC - in response to Message 1687780.  

Heh! I'll take it, Jord! Thanks (to both of you). I'm drawn to the "one-off spurious glitch" idea, although this is new, not older, hardware. There's also the likelihood that it occurred while switching between, or concurrently running, Einstein and SETI.
=
This *is*, after all, only 5 days into the burn-in period of my new host here.
And it's my first experience with an 8-thread CPU (i7-4790K, 4 cores / 8 threads, 4-4.4GHz).
I still haven't figured out why it vacillates between 6 and 8 cores in use (but I *will* know why!). In all, I'm really tickled with it. Mainly I have to get used to its new scale of ops.
=
And I'm not any good at networking these 2 machines. Is there a Help Desk here about *that*?

JBird (Project Donor)
Joined: 3 Sep 02
Posts: 297
Credit: 325,260,309
RAC: 549
United States
Message 1687811 - Posted: 4 Jun 2015, 18:34:49 UTC - in response to Message 1687800.  
Last modified: 4 Jun 2015, 19:02:32 UTC

===
Thanks for checkin' in, Zalster.
=
Honestly, this thing is flyin', and I'm not through modifying/tweaking stuff yet.

To answer your questions tho:
This is a 16 GB RAM build (I may add some more), but Lasso shows 75% memory use (of physical RAM) when I'm running flat-out at 100% load.
And I've got a 12 GB RAMDISK on board, in addition to a 16 GB SSD/RAM swapfile, so memory is never a problem.
=
The 960 GPU at 99% load still only uses half of its 2048 MB of GDDR5 memory (headroom).
Running 4-up (0.25 each) CUDA tasks on the 960 SC still works great. (My clocks are 1404 MHz core and 1752 MHz mem.) The Maxwell GM206 chip has more headroom than most people realize.
>Just based on Direct Compute/Compute Capability = 5/5.2 as a *guide*, this GPU should do 5 WUs without a hitch. In fact, since it has 8 multiprocessors (CUs), maybe more.
The Direct Compute spec is a reference to DirectX and Shader Model 5 (SM5).<
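
As a rough cross-check on those headroom figures, the memory arithmetic can be sketched in a few lines of Python. This assumes every task uses about the same amount of VRAM, which is a simplification; `estimate_max_tasks` and its `reserve_mb` parameter are hypothetical names for illustration, and the inputs are the numbers quoted above (2048 MB total, about half in use with 4 tasks running).

```python
def estimate_max_tasks(total_vram_mb, used_vram_mb, running_tasks, reserve_mb=0):
    """Estimate per-task VRAM use and how many such tasks would fit on the card.

    Assumes all tasks use roughly equal memory; reserve_mb is VRAM set aside
    for the display and driver (hypothetical parameter, default 0).
    """
    if running_tasks < 1 or used_vram_mb <= 0:
        raise ValueError("need at least one running task with non-zero usage")
    per_task_mb = used_vram_mb / running_tasks
    usable_mb = total_vram_mb - reserve_mb
    return per_task_mb, int(usable_mb // per_task_mb)


# Figures from this post: 2048 MB card, about half used by 4 concurrent tasks.
per_task_mb, max_tasks = estimate_max_tasks(2048, 1024, 4)
print(f"about {per_task_mb:.0f} MB per task, room for up to {max_tasks} tasks")
```

Memory is only one limit. Compute capability 5.2 is a feature-level version number rather than a task count, so whether extra tasks actually improve throughput is something to measure rather than infer.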
=
Plus 6 or 8 AVX MB tasks.
Concurrent usage: the CUDA tasks only use 1-4% CPU, for seconds at a time. That's so low the AVX apps barely *blink*. With 4 GHz to work with, there's nothing to it!
=
CPU cores/threads, and internal stuff like CL and CUDA, run at speeds between 4 GHz and 4.4 GHz.
Hyper-Threading ON, Turbo Boost ON, SpeedStep OFF - iGD speed too
=
I think the error was just a one-off "switching" glitch, and fairly short-lived (the task went to "waiting to run" after the 600 seconds).

So, I'm still just 5 days into this burn-in.
The BOINC/SETI load problem I had has "healed", I think; very few glitches remain there.



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.