"Boinc Virtualbox Wrapper has stopped working"

Questions and Answers : GPU applications : "Boinc Virtualbox Wrapper has stopped working"
Matt Selter

Joined: 17 Jan 00
Posts: 4
Credit: 13,329,829
RAC: 5
United States
Message 1984640 - Posted: 12 Mar 2019, 1:39:37 UTC
Last modified: 12 Mar 2019, 1:59:53 UTC

For the last few days I have been getting these Windows error popups for the VirtualBox Wrapper program on anything that needs the GPU to run (Seti, LHC and Einstein)... I'm not positive, but I don't think the CPU WUs have been affected, JUST the GPU ones (I think?)... I have not been able to complete any WUs that use the GPU, as ALL of them error out with "Computation error". Sometimes a task runs for an hour, sometimes only a few seconds go by before it happens...

I'm getting REALLY annoyed by the windows popping up when I'm in the middle of watching something on YouTube or reading something online... The last straw: today when I came home from work I had FIFTEEN of these things on my screen (yes, 15 of them), and had to click through each one to close the program... The first few times I tried "check online for a solution and close the program", but it didn't help. Finally I just chose "close the program" for all of them... I tried to take a look at a few of the crash details, but it's gibberish to me as I'm not really computer savvy enough... Here is a cut-and-paste of the Windows crash details for one of the 15 I had today; I'm not certain whether they were identical for every window:

Problem signature:
Problem Event Name: BEX64
Application Name: vboxwrapper_26196_windows_x86_64.exe
Application Version: 7.7.26196.0
Application Timestamp: 5785dce0
Fault Module Name: vboxwrapper_26196_windows_x86_64.exe
Fault Module Version: 7.7.26196.0
Fault Module Timestamp: 5785dce0
Exception Offset: 000000000001fcf9
Exception Code: c0000409
Exception Data: 0000000000000000
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 1033
Additional Information 1: 457f
Additional Information 2: 457f7ca8bb82c95ba25567f6d06ed21d
Additional Information 3: 478b
Additional Information 4: 478b88ea8e7c7998e6dfd66c03cd3c3c


As you can see, it's the VirtualBox that comes with Boinc. I've tried to update Boinc, but it says that it's the latest and greatest version, so nothing changes there. The only Windows-related change was its newest antivirus definition update, and I'm pretty sure that wasn't causing it...

I'm on an Intel i7-4790K CPU with an Nvidia GeForce GTX 1070 video card, running Win 7 Ultimate. Can anyone help with what's causing it to error out, so I can get back to my prior "fire and forget" experience of running Boinc without having to babysit it?

Thanks in advance, Matt.

PS. if you need logs and whatnot, let me know the "who, what, when, where, why and how" to find them and i'll get them posted...

Edit: I forgot to mention that I'm running the most current versions: Boinc 7.14.2 with VirtualBox 5.2.8.
Keith Myers (Special Project $250 donor)
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1984877 - Posted: 13 Mar 2019, 1:58:52 UTC - in response to Message 1984640.  

If you are running an AV program of ANY kind you need to exclude the BOINC folders from scanning. If an AV program is accessing a BOINC file for scanning, it locks it from being accessed by BOINC and will create the problems you describe. The BOINC resource is unavailable to BOINC while it is being locked by the AV scanner.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
Matt Selter

Message 1984880 - Posted: 13 Mar 2019, 2:13:24 UTC - in response to Message 1984877.  
Last modified: 13 Mar 2019, 2:13:36 UTC

Thanks for the anti-virus conflict info... I'll check soonish, but I'm pretty sure that it isn't scanning those folders... However, when I got home today, I saw that the task list in Boinc is NOW saying "GPU is missing..." for a bunch of WUs... and from other threads I've seen here, it's now looking more and more like I got a Windows update that nuked my video card drivers, so Boinc can't "see" my video card to run anything on it... I'm going to see if the problem goes away by applying the remedy in those threads... I'll let you guys know here if it works...
Keith Myers
Message 1984887 - Posted: 13 Mar 2019, 3:08:38 UTC

Yes, the infamous loss of drivers by the "friendly" update by Microsoft. It used to be you could configure Windows 7 and Windows 10 to "not accept driver" updates in the configuration. MS is totally ignoring that configuration now and installs what IT wants. You have no control now and can only pick up the pieces of a wrecked installation after it has happened.
Matt Selter

Message 1984898 - Posted: 13 Mar 2019, 4:50:53 UTC - in response to Message 1984887.  

Well, I THINK that the drivers getting borked was the problem... I hope... I went to NVIDIA and updated to their newest driver for my video card (GeForce GTX 1070, driver 419.35), which should fix the "can't find a GPU" error. It seems as though it did, because Boinc recognizes my GPU again.

I'm currently running an LHC WU that needs vbox64, an Einstein WU that needs FGRPopencl1K-nvidia, and finally a Seti WU that needs opencl_nvidia_SoG, and they all SEEM to be working so far...

GRRR!!! LITERALLY AS I AM TYPING THIS the Vbox64 window popped up and the LHC WU errored out. Two of the three failed units were trying to use (vbox64_mt_mcore) and the third just (vbox64).

Good news, however! The Seti and Einstein units completed properly and are now in the "ready to report" state...
Keith Myers
Message 1984902 - Posted: 13 Mar 2019, 6:36:02 UTC - in response to Message 1984898.  

If the LHC cpu work unit is a <plan_class>mt</plan_class> type, is it possible you aren't giving it enough cores to run? If the work unit defaults to using 4 cores, can't you use an app_config and put in a <avg_ncpus>2</avg_ncpus> statement in the app version to limit the task to using just 2 cores?

Don't forget the Seti gpu app expects to use a full cpu core to support the task. Einstein does not require as much cpu support for its gpu tasks but it does use some. Try to reduce the total cpu usage so everyone is happy.
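
One way to make that CPU budget explicit is an app_config.xml file in the SETI@home project folder, telling the BOINC client to set aside a full CPU for each GPU task. This is only a rough sketch: it assumes the application is named setiathome_v8, so check the name your own client reports before relying on it.

```xml
<app_config>
  <app>
    <!-- Assumed app name; it must match the name the project uses. -->
    <name>setiathome_v8</name>
    <gpu_versions>
      <!-- Run one task per GPU. -->
      <gpu_usage>1.0</gpu_usage>
      <!-- Budget one full CPU for each running GPU task. -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

These values only change how the client schedules work. They do not change how much CPU the application itself actually uses.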
Matt Selter

Message 1985030 - Posted: 14 Mar 2019, 2:59:22 UTC - in response to Message 1984902.  
Last modified: 14 Mar 2019, 3:34:48 UTC

Keith Myers wrote: "If the LHC cpu work unit is a <plan_class>mt</plan_class> type, is it possible you aren't giving it enough cores to run? If the work unit defaults to using 4 cores, can't you use an app_config and put in a <avg_ncpus>2</avg_ncpus> statement in the app version to limit the task to using just 2 cores?"

Um... Er... I have no idea what you just said. I understand the individual words in those sentences, but I have no bloody clue what my non-programming brain is meant to DO with them! Sorry, but could you repeat that as if talking to someone who is not a programmer in any way, shape or form, please?

Keith Myers wrote: "Don't forget the Seti GPU app expects to use a full CPU core to support the task. Einstein does not require as much CPU support for its GPU tasks but it does use some. Try to reduce the total CPU usage so everyone is happy."

See above statement... It wasn't until this discussion came up that I realized that Boinc is not running the way I "thought" it should be running... (I'm sure I goobered it up somehow way back when I originally set up the 2nd and 3rd projects in Boinc... it's been this way so long I've forgotten anything different. Anyway, it's no big deal, I'm still number crunching, it's all good.) I 'thought' that I had it set to run just one of the three projects for about two hours, using as much CPU, GPU and RAM as needed, continuously doing as many WUs for that one project as it possibly can during that time, uploading and downloading new WUs and sending results as needed... Then, at the two-hour mark, it would shift to another project for another two hours straight, and repeat for the third project... But what it seems to ACTUALLY be doing is running all three projects AT THE SAME TIME with no shifting at all... And if what you say about the default number of CPU cores is right, could that be the issue?
Keith Myers
Message 1985040 - Posted: 14 Mar 2019, 5:39:34 UTC - in response to Message 1985030.  
Last modified: 14 Mar 2019, 5:42:44 UTC

OK, I hope that someone with LHC experience chimes in. I have no personal experience with LHC, so I do not know the project's task types at all. I THOUGHT I heard that the cpu tasks were multi-core. IOW, each cpu task splits its workload across multiple concurrent cores. The typical multi-core work unit defaults to using 4 cpu cores all at once for each task. The configuration file change I mentioned would reduce that to only 2 cpu cores for each task.
So that means that when you run any LHC cpu task, you are using four of the 8 logical cores on your cpu (the i7-4790K has 4 physical cores with Hyper-Threading) and leaving only 4 to run your other projects and the desktop.

Which brings me to the other point. BOINC does NOT have the capability of running only a single project for just two hours while not running the other projects you are attached to. BOINC only respects the "switch between tasks every X minutes" setting in Computing preferences. The default is 60 minutes. However, that DOES NOT mean you will only run a single project for 60 minutes and then switch to another project. The amount of time your host spends on any project is based on the resource share, the percentage out of 100% that you have allocated to each project. The next thing that determines project time allocation is credit. BOINC bases its allocation on REC, or Recent Estimated Credit, and tries to balance all attached projects to produce the same EXACT amount of credit each day for each attached project.

The problem arises, then, if all attached projects DO NOT award credit based on the same EXACT mechanism, which unfortunately is the case. Some projects obey the rules of the CreditNew credit allocation algorithm and some projects award credit based on their own static definition. So one project may award 100 credits for 100 minutes of calculated cpu FLOPS and another project may award 1,000,000 credits for the same 100 minutes of cpu FLOPS calculations. BOINC will then try to run the low-awarding project for a longer period of time to balance it against the high-awarding project.

So what you are trying to achieve is not possible with your current configuration. You will have to tune each project's resource allocation to try and help BOINC balance the REC calculation.

Also it is very likely you are cpu resource starved trying to run three projects at the same time when LHC is commandeering at least 50% of your total available cores. Thus the error messages you are seeing.
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1985071 - Posted: 14 Mar 2019, 12:05:32 UTC - in response to Message 1985040.  

LHC does use multithreaded applications on the CPU, though not all their CPU applications are multithreaded. Yes, make it more confusing, Jord. 😁

What I am not so sure about is if you can curtail the multithreaded application with the app_config.xml file to only run on two cores. But that is something to ask on the LHC forums. Just as the crashes in VBox are a question for LHC, as Seti doesn't use VirtualBox for anything.
Keith Myers
Message 1985161 - Posted: 14 Mar 2019, 20:11:56 UTC - in response to Message 1985071.  
Last modified: 14 Mar 2019, 20:12:56 UTC

The GPUGrid cpu tasks are multi-threaded. I know that many people, like Zalster, reduce the normal core count of 4 down to 2. That way he can run multiple MT tasks on his 10 core cpu. OK, I dug through some PMs from him on the QC (Quantum Chemistry) mt app. I see I got the method for restricting core usage wrong in my previous post. This is part of his app_config for QC.

<app_version>
    <app_name>QC</app_name>
    <plan_class>mt</plan_class>
    <avg_ncpus>2</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
</app_version>

So the cmdline statement restricting nthreads is how you control the number of concurrent cores working on an mt application.
Jord
Message 1985162 - Posted: 14 Mar 2019, 20:19:03 UTC - in response to Message 1985161.  

That's okay, but I would still ask at LHC if their application allows this, as it may not.
tullio
Volunteer tester
Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 1985322 - Posted: 15 Mar 2019, 16:24:43 UTC - in response to Message 1985162.  

At LHC@home I am using the 8 cores of my Ryzen 5 1400 to run Atlas@home tasks. I used to run them on the 2 cores of my Opteron 1210, but unfortunately that system has died after 11 years of service on a SUN M20 workstation. LHC@home does not use GPUs.
Tullio
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1985324 - Posted: 15 Mar 2019, 16:36:31 UTC - in response to Message 1985161.  

<avg_ncpus>2</avg_ncpus> is a direction to the BOINC client for scheduling purposes - this is what would define "Five tasks at once" on a 10-core CPU.

<cmdline>--nthreads 2</cmdline> is a direction to the mt science app itself - "this is the amount of silicon you're allowed to use - don't overstep the mark".

It's probably most efficient to keep the two values identical.
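
For anyone following along, the two settings might sit together in a complete app_config.xml along these lines. This is only a sketch under stated assumptions: the app name QC and the mt plan class are taken from the fragment quoted earlier in the thread, the file is assumed to live in that project's folder under the BOINC data directory, and whether a given project's mt application honours the nthreads option is something to confirm on that project's own forum.

```xml
<app_config>
  <app_version>
    <!-- App name and plan class must match what the project sends;
         QC and mt are copied from the example earlier in this thread. -->
    <app_name>QC</app_name>
    <plan_class>mt</plan_class>
    <!-- Scheduling hint for the BOINC client: budget 2 CPUs per task. -->
    <avg_ncpus>2</avg_ncpus>
    <!-- Passed to the science app itself: run with 2 threads.
         Assumes the app accepts this option. -->
    <cmdline>--nthreads 2</cmdline>
  </app_version>
</app_config>
```

With both values set to 2, the client's scheduling budget and the app's real thread count stay in step. After saving the file, using Options > Read config files in BOINC Manager, or restarting the client, should get it picked up.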

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.