BOINC V6.6.36 BUGs

Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 909993 - Posted: 22 Jun 2009, 2:33:46 UTC
Last modified: 22 Jun 2009, 2:51:14 UTC


I have been running BOINC V6.6.36..

It ran well for two days.. after some time I reached a ~ 4 day WU cache.. ~ 3,500 WUs..

And then.. surprise..? ;-) ..no.. :-(


Normally every CUDA task [AR 0.44x WU] uses ~ 95 MB of system RAM.

Normally the whole PC uses ~ 700 MB of system RAM. [2 x 1 GB installed]



Then BOINC V6.6.36 panicked and went into EDF mode. [Earliest Deadline First]
Or something like it..
I say "something like it" because there were still WUs with earlier deadlines in BOINC.
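
For anyone not familiar with the term, here is a minimal sketch of what "earliest deadline first" ordering means in general. It is not BOINC's actual scheduler code; the `Task` class, the WU names and the dates are made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Task:
    name: str            # hypothetical WU name, for illustration only
    deadline: datetime   # report deadline of the WU


def edf_order(tasks):
    """Return the tasks sorted so that the earliest deadline comes first."""
    return sorted(tasks, key=lambda t: t.deadline)


# Made-up example queue; names and dates are not from a real cache.
queue = [
    Task("wu_c", datetime(2009, 6, 30)),
    Task("wu_a", datetime(2009, 6, 24)),
    Task("wu_b", datetime(2009, 6, 27)),
]

run_order = [t.name for t in edf_order(queue)]
print(run_order)
```

Under strict EDF the WU with the nearest deadline would always run first, which is why WUs with earlier deadlines sitting untouched in the cache looked odd.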


Little by little it started and finished a lot of WUs.
But then BOINC would start a WU and, a few sec. before the WU would normally finish, suspend it and start a new WU.


In the end I had ~ 20 suspended WUs.

Every suspended CUDA task kept its RAM, and finally the whole system RAM was full.
The CPU was also at 100 % load.
And this is a GPU-only cruncher! No tasks on the CPU!

Normally every CUDA task gets at most ~ 6 % CPU [~ 24 % of one core].
But now the active CUDA tasks only got ~ 1 % CPU [~ 4 % of one core] now and then [every ~ 5 sec.].
So no CUDA crunching was possible.
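
A rough back-of-the-envelope check of the numbers above, as a small sketch. It assumes that a suspended task keeps about as much RAM as a running one, which is only an assumption made for this estimate.

```python
# Figures quoted in this post.
RAM_PER_TASK_MB = 95        # ~ system RAM per CUDA task
INSTALLED_MB = 2 * 1024     # 2 x 1 GB installed
SUSPENDED_TASKS = 20        # ~ suspended WUs

# Assumption: a suspended task keeps as much RAM as a running one.
held_mb = SUSPENDED_TASKS * RAM_PER_TASK_MB
share = held_mb / INSTALLED_MB

# ~ 6 % of the whole CPU equals ~ 24 % of one core, which implies 4 cores.
implied_cores = round(24 / 6)

print(f"{held_mb} MB held by suspended tasks, {share:.0%} of installed RAM")
print(f"implied number of cores: {implied_cores}")
```

So ~ 20 suspended tasks alone would account for most of the installed RAM, before the OS and the active tasks are counted.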


And this happened although the preference is set to:
Leave applications in memory while suspended? NO



[EDIT:
BTW, a reboot didn't help.. the suspended tasks used the system RAM again.
I had to abort these WUs.]



So.. SETI@home community.. :-)


  • Have you also seen BOINC V6.6.36 have problems handling large WU caches?

  • Have you also seen suspended WUs stay in system RAM although the preference is set to NO?




Thanks! :-)


ID: 909993
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 910007 - Posted: 22 Jun 2009, 3:24:00 UTC


Ahh.. yes..

..I also had two 'Falling back to HOST CPU' errors..:



[...]
Work Unit Info:
...............
WU true angle range is : 0.202289
Cuda error 'cudaMalloc((void**) &dev_WorkData' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcceleration.cu' in line 293 : out of memory.
setiathome_CUDA: CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing...

Flopcounter: 26974829277152.035000
[...]


and


[...]
setiathome_CUDA: CUDA Device 1 specified, checking...
Device 1: GeForce GTX 260 is okay
SETI@home using CUDA accelerated device GeForce GTX 260
V10 modification by Raistmer
Priority of worker thread rised successfully
Priority of process adjusted successfully
Total GPU memory 0 free GPU memory 0
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_fft.cu' in line 49 : no CUDA-capable device is available.
Cuda error 'cufftPlan1d(&fft_analysis_plans[FftNum], FftLen, CUFFT_C2C, NumDataPoints / FftLen)' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcc_fft.cu' in line 49 : no CUDA-capable device is available.
setiathome_CUDA: CUDA runtime ERROR in plan FFT. Falling back to HOST CPU processing...
setiathome_enhanced 6.02 Visual Studio/Microsoft C++
[...]



..but this happened ~ one day before I reached the ~ 4 day cache.
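
If somebody wants to check their own results for the same thing, a small sketch like this could pull the fallbacks out of a saved stderr text. The function name `find_fallbacks` is made up, and the pattern is based only on the two excerpts above, so other app builds may word their messages differently.

```python
import re

FALLBACK = "Falling back to HOST CPU processing"
# Matches lines like: Cuda error '...' in file '...' in line 293 : out of memory.
CUDA_ERROR = re.compile(r"^Cuda error '.*' in file '.*' in line (\d+) : (.+?)\.?$")


def find_fallbacks(stderr_text):
    """Return one (reason, line_no) pair per fallback message.

    The pair comes from the last Cuda error line seen before the fallback;
    it is None if no Cuda error line preceded it.
    """
    results = []
    last_error = None
    for line in stderr_text.splitlines():
        match = CUDA_ERROR.match(line.strip())
        if match:
            last_error = (match.group(2), int(match.group(1)))
        elif FALLBACK in line:
            results.append(last_error)
            last_error = None
    return results


# Lines copied from the first excerpt above.
sample = """WU true angle range is : 0.202289
Cuda error 'cudaMalloc((void**) &dev_WorkData' in file 'd:/BTR/SETI6/SETI_MB_CUDA/client/cuda/cudaAcceleration.cu' in line 293 : out of memory.
setiathome_CUDA: CUDA runtime ERROR in device memory allocation (Step 1 of 3). Falling back to HOST CPU processing...
"""

fallbacks = find_fallbacks(sample)
print(fallbacks)
```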

ID: 910007
1mp0£173
Volunteer tester
Joined: 3 Apr 99
Posts: 8423
Credit: 356,897
RAC: 0
United States
Message 910013 - Posted: 22 Jun 2009, 3:32:00 UTC - in response to Message 909993.  


If you want to report problems with BOINC you really should report them to the BOINC developers and not to SETI@Home.

SETI@Home is a science project that uses BOINC.
ID: 910013
Sutaru Tsureku
Volunteer tester
Joined: 6 Apr 07
Posts: 7105
Credit: 147,663,825
RAC: 5
Germany
Message 910017 - Posted: 22 Jun 2009, 3:39:01 UTC
Last modified: 22 Jun 2009, 3:39:42 UTC


You mean in the BOINC forum?

http://boinc.berkeley.edu/dev

AFAIK, the devs don't look there.

And there are very, very few active users there.


Here there are more people who may also have found these BUGs.


Also.. hmm.. no thanks - I'll never open a ticket again..
My only one was deleted for a reason I didn't understand..

ID: 910017
Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 910024 - Posted: 22 Jun 2009, 3:50:39 UTC - in response to Message 910017.  

The devs do look there, be sure.
BOINC problems are better reported to this board:
http://boinc.berkeley.edu/dev/forum_forum.php?id=10
And the situation you described really looks like a nasty new bug in BOINC, so it's really worth reporting on the BOINC forum.
ID: 910024
[AF>france>pas-de-calais]symaski62
Volunteer tester
Joined: 12 Aug 05
Posts: 258
Credit: 100,548
RAC: 0
France
Message 910073 - Posted: 22 Jun 2009, 9:26:17 UTC

Yeah ^^

NVIDIA version 185.85 & CUDA version 2.2


SETI@Home Informational message -9 result_overflow
With a general disability of 80%, I make a great effort for the community and to express myself. Thank you for being understanding.
ID: 910073
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 910077 - Posted: 22 Jun 2009, 9:40:38 UTC - in response to Message 910024.  

Raistmer wrote: "The devs look there, be sure."

The devs hardly look there. They usually only come over for a quick snoop when one of us emails them.

You are better off emailing the boinc_dev email list, which is a more direct line to them. Explain there in a clear and concise way, with evidence, logs and all that, that you have found a bug. The list requires registration.
ID: 910077
Richard Haselgrove
Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 910089 - Posted: 22 Jun 2009, 10:19:17 UTC - in response to Message 910077.  

I find it works best if you restrict yourself to one bug per email, and help them to help you by putting all the relevant information in that one email.
ID: 910089
Hammeh
Volunteer tester
Joined: 21 May 01
Posts: 135
Credit: 1,143,316
RAC: 0
United Kingdom
Message 910092 - Posted: 22 Jun 2009, 10:28:08 UTC
Last modified: 22 Jun 2009, 10:29:38 UTC

Yes - agreed with the above. Bugs reported on the mailing list are normally investigated by the devs. To provide all of the needed information you need to run BOINC with <work_fetch_debug>, <sched_op_debug> and, for one run, <debt_debug> enabled in the cc_config.xml file. Capture that log and post it to the BOINC dev mailing list, pasting the log into the body of the email. Attachments to the list will be deleted by the email server.
This may help the devs identify and fix the bug you have found. If you don't include the logs above, they will probably reply and tell you to produce them anyway!
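
In case it helps, here is a minimal sketch that writes out such a cc_config.xml. It assumes the usual layout, with the flags placed inside a <log_flags> section and set to 1; please check that against the BOINC client configuration documentation for your version. The helper name `build_cc_config` is made up.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Build a cc_config.xml document that switches on the given log flags."""
    root = ET.Element("cc_config")
    # Assumption: log flags live in a <log_flags> section and take the value 1.
    log_flags = ET.SubElement(root, "log_flags")
    for flag in flags:
        ET.SubElement(log_flags, flag).text = "1"
    return ET.tostring(root, encoding="unicode")


xml_text = build_cc_config(["work_fetch_debug", "sched_op_debug", "debt_debug"])
print(xml_text)
```

Save the printed text as cc_config.xml in the BOINC data directory and restart the client so it picks up the flags.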
ID: 910092
-= Vyper =-
Volunteer tester
Joined: 5 Sep 99
Posts: 1652
Credit: 1,065,191,981
RAC: 2,537
Sweden
Message 910115 - Posted: 22 Jun 2009, 12:10:07 UTC - in response to Message 910024.  

Touché, at worst I had around 200 suspended ones whereas Sutaru had 20.

I run this version atm, but before this I needed 6.4.7 because it didn't go into EDF at all.

Regards Vyper

_________________________________________________________________________
Addicted to SETI crunching!
Founder of GPU Users Group
ID: 910115
