ZOTAC GAMING GeForce GTX 1650

Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2012356 - Posted: 17 Sep 2019, 23:30:05 UTC - in response to Message 2012354.  

My errors:

Stderr output
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>

I added this to /etc/default/grub as suggested but it did not help:
GRUB_CMDLINE_LINUX="vsyscall=emulate"
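A minimal sketch of one way to confirm the parameter actually reached the running kernel. On Debian, edits to /etc/default/grub normally take effect only after running update-grub and rebooting, so this is worth checking before ruling the setting out. The helper name has_kernel_param is made up for this example; it reads /proc/cmdline unless another file is given.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given parameter appears as a whole
# token in the kernel command line (default source: /proc/cmdline).
has_kernel_param() {
    param=$1
    cmdline_file=${2:-/proc/cmdline}
    # Split the command line into one token per line, then match exactly.
    tr ' ' '\n' < "$cmdline_file" | grep -qxF -- "$param"
}

# Example use after 'update-grub' and a reboot:
if has_kernel_param "vsyscall=emulate"; then
    echo "vsyscall=emulate is active"
else
    echo "vsyscall=emulate is NOT on the running kernel's command line"
fi
```

If the second message appears, the edit to /etc/default/grub has probably not been applied to the boot configuration yet.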
Radjin~
ID: 2012356 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2012358 - Posted: 17 Sep 2019, 23:36:47 UTC

SIGSEGV errors are usually caused by unstable CPU clocks or unstable memory clocks. Something is corrupting memory addresses. This is an OS issue and not a BOINC or Seti issue.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2012358 · Report as offensive
Spartana

Joined: 24 Apr 16
Posts: 99
Credit: 41,712,387
RAC: 25
United States
Message 2012365 - Posted: 18 Sep 2019, 2:09:24 UTC - in response to Message 2012356.  

My errors:

Stderr output
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>

I added this to /etc/default/grub as suggested but it did not help:
GRUB_CMDLINE_LINUX="vsyscall=emulate"


Looks like you are running a stock BOINC install with the app "SETI@home v8 8.00 x86_64-pc-linux-gnu" and not Tbar's AIO. Was that your intention? I thought you were working towards getting the AIO up and running.
ID: 2012365 · Report as offensive
Gene Project Donor

Joined: 26 Apr 99
Posts: 150
Credit: 48,393,279
RAC: 118
United States
Message 2012392 - Posted: 18 Sep 2019, 6:18:11 UTC
Last modified: 18 Sep 2019, 6:29:41 UTC

@Radjin
I also am running a Debian (buster) 10 distribution but in a "normal" desktop, with GUI, context. I pass along the following thoughts as I follow this thread:
(1) I "think" (low confidence) that I had some signal 11 errors a long time ago. One easy thing to check is the permissions for the /BOINC/slots directory. That is where boinc keeps the running status of each task. The "user" who started boinc must have "w" write permission for that directory. I just set everything... drwxrwxrwx for ~/BOINC/slots. If the boinc "user" can't write to "slots" it will crash immediately as boinc can't set up the task controls.
(2) I "think" (again not 100% confidence) that boincmgr is a GUI application and so I would not expect it to respond over a ssh connection to a headless server, which I presume does not have an X server running. And I'm not even sure how one would start an X server on a host that didn't have a graphics card before you plugged in the 1650.
(3) Early in this thread you asked how to find any repository-installed files that didn't get removed by an apt-remove or apt-purge action. I use "dpkg" (Debian package manager) with the "aptitude" GUI frontend but the native "dpkg" is a command line application to do lots of package management activities. One useful option is: dpkg -L "package-name" , which will list ALL the files that were created when "package-name" was installed. A bit tedious, but one can check for their existence and manually remove any residual files as necessary. Maybe "apt" has an equivalent option but I couldn't find it on a quick look at the man.
(4) Regarding the "./" thing to prefix a command: The interpreter (bash?) will search for a given command in the directories given in $PATH, usually /bin, /sbin, /usr/bin, /usr/sbin, and others but NOT in the user's current working directory! There are two solutions: (1) give a full absolute path name, like /home/radjin/BOINC/boinccmd (where the leading "/" signals the interpreter that an absolute path follows); or (2) use the "./" prefix, which signals the interpreter that a "relative" path follows - and assuming you're in the /home/radjin/BOINC directory you get the right application in the current directory.
You may hear, or have already heard, that the Debian distribution is a difficult/complex one to deal with. I have grown with Debian Linux from its early days and I guess I've adapted gradually to its style so I find it more challenging to switch to Ubuntu, or Mint, etc., than to stay with what I (more or less) understand. It does the job for me.
ID: 2012392 · Report as offensive
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22231
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2012393 - Posted: 18 Sep 2019, 6:49:32 UTC - in response to Message 2012341.  

You have two things going on - first the very high error count is stopping you getting any new work:

17-Sep-2019 14:16:16 [SETI@home] No tasks sent
17-Sep-2019 14:16:16 [SETI@home] No tasks are available for AstroPulse v7
17-Sep-2019 14:16:16 [SETI@home] No tasks are available for SETI@home v8
17-Sep-2019 14:16:16 [SETI@home] This computer has finished a daily quota of 1 tasks


Second - your CPU work is erroring out (never mind the fact that you are worrying about not detecting any GPUs on this computer - having a CPU that is dumping every task is not going to help that).
Task 8056275741
Name blc32_2bit_guppi_58643_84072_HIP35821_0124.8489.818.24.47.110.vlar_1
Workunit 3656197359
Created 17 Sep 2019, 12:10:03 UTC
Sent 17 Sep 2019, 20:03:09 UTC
Report deadline 10 Nov 2019, 1:02:51 UTC
Received 17 Sep 2019, 20:10:24 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 11 (0x0000000B) Unknown error code
Computer ID 8816320
Run time
CPU time
Validate state Invalid
Credit 0.00
Device peak FLOPS 3.83 GFLOPS
Application version SETI@home v8 v8.00
x86_64-pc-linux-gnu
Stderr output

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>

Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2012393 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2012785 - Posted: 21 Sep 2019, 22:16:43 UTC - in response to Message 2012365.  

My errors:

Stderr output
<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>

I added this to /etc/default/grub as suggested but it did not help:
GRUB_CMDLINE_LINUX="vsyscall=emulate"


Looks like you are running a stock BOINC install with the app "SETI@home v8 8.00 x86_64-pc-linux-gnu" and not Tbar's AIO. Was that your intention? I thought you were working towards getting the AIO up and running.


AIO, another version of an app? If so I was using http://www.arkayn.us/lunatics/BOINC.7z but getting the same errors.
Radjin~
ID: 2012785 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2012787 - Posted: 21 Sep 2019, 22:31:26 UTC - in response to Message 2012392.  

@Radjin
I also am running a Debian (buster) 10 distribution but in a "normal" desktop, with GUI, context. I pass along the following thoughts as I follow this thread:
(1) I "think" (low confidence) that I had some signal 11 errors a long time ago. One easy thing to check is the permissions for the /BOINC/slots directory. That is where boinc keeps the running status of each task. The "user" who started boinc must have "w" write permission for that directory. I just set everything... drwxrwxrwx for ~/BOINC/slots. If the boinc "user" can't write to "slots" it will crash immediately as boinc can't set up the task controls.

I set the permissions as you suggested; owner and group were boinc.
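A minimal sketch of how the write permission could be verified directly rather than read off the mode bits. The helper name can_write_dir and the default location ~/BOINC/slots are assumptions for illustration; the BOINC_DIR variable is likewise made up so the path can be overridden.

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can create files
# in a directory, which is what the BOINC client needs for "slots".
can_write_dir() {
    dir=$1
    [ -d "$dir" ] || { echo "missing: $dir"; return 2; }
    # Try to create and remove a scratch file instead of trusting the mode bits.
    probe=$(mktemp "$dir/.write_probe.XXXXXX" 2>/dev/null) || {
        echo "not writable: $dir"
        return 1
    }
    rm -f "$probe"
    echo "writable: $dir"
}

# Assumed location; adjust to the actual BOINC data directory.
slots_dir="${BOINC_DIR:-$HOME/BOINC}/slots"
can_write_dir "$slots_dir" || echo "check ownership and mode of $slots_dir"
```

The result only describes the account that runs the script, so it would need to be run as the same user that starts the client (for example with sudo -u boinc, assuming the account is named boinc as stated above).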

(2) I "think" (again not 100% confidence) that boincmgr is a GUI application and so I would not expect it to respond over a ssh connection to a headless server, which I presume does not have an X server running. And I'm not even sure how one would start an X server on a host that didn't have a graphics card before you plugged in the 1650.

I don’t use the GUI at all.

(3) Early in this thread you asked how to find any repository-installed files that didn't get removed by an apt-remove or apt-purge action. I use "dpkg" (Debian package manager) with the "aptitude" GUI frontend but the native "dpkg" is a command line application to do lots of package management activities. One useful option is: dpkg -L "package-name" , which will list ALL the files that were created when "package-name" was installed. A bit tedious, but one can check for their existence and manually remove any residual files as necessary. Maybe "apt" has an equivalent option but I couldn't find it on a quick look at the man.

I use aptitude, but it may not report what changed in config files, hence the path reverting to another location.
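A rough sketch of the dpkg -L check described in the quote, assuming the package is still recorded in dpkg's database (as far as I know, dpkg -L reports the package as not installed once it has been purged, so the list is best captured beforehand). The helper name list_leftovers and the package name boinc-client are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read paths on stdin and print those that still exist
# as regular files or symlinks. Directories are skipped because dpkg -L
# also lists shared directories such as /usr/bin.
list_leftovers() {
    while IFS= read -r path; do
        if [ -f "$path" ] || [ -L "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

# Typical use (the package name "boinc-client" is an assumption):
#   dpkg -L boinc-client | list_leftovers
#
# Or save the list before removing the package and check it afterwards:
#   dpkg -L boinc-client > files.txt
#   list_leftovers < files.txt
```

Anything the function prints after the package has been removed is a residual file that can be inspected and deleted by hand.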


(4) Regarding the "./" thing to prefix a command: The interpreter (bash?) will search for a given command in the directories given in $PATH, usually /bin, /sbin, /usr/bin, /usr/sbin, and others but NOT in the user's current working directory! There are two solutions: (1) give a full absolute path name, like /home/radjin/BOINC/boinccmd (where the leading "/" signals the interpreter that an absolute path follows); or (2) use the "./" prefix, which signals the interpreter that a "relative" path follows - and assuming you're in the /home/radjin/BOINC directory you get the right application in the current directory.
You may hear, or have already heard, that the Debian distribution is a difficult/complex one to deal with. I have grown with Debian Linux from its early days and I guess I've adapted gradually to its style so I find it more challenging to switch to Ubuntu, or Mint, etc., than to stay with what I (more or less) understand. It does the job for me.

I spent an hour learning about absolute and relative paths in Linux.
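A small self-contained demonstration of the three lookups discussed in the quote. It creates a throwaway executable in a temporary directory; the name hello_tool is made up and simply stands in for something like boinccmd.

```shell
#!/bin/sh
# Build a throwaway directory holding a tiny executable named "hello_tool"
# (a made-up name standing in for something like boinccmd).
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "hello from $0"\n' > "$demo_dir/hello_tool"
chmod +x "$demo_dir/hello_tool"
cd "$demo_dir" || exit 1

# 1) Bare name: the shell searches only the directories listed in $PATH.
if command -v hello_tool >/dev/null 2>&1; then
    bare_lookup="found"
else
    bare_lookup="not found"
fi
echo "bare name: $bare_lookup"

# 2) Relative path: "./" tells the shell to look in the current directory.
relative_out=$(./hello_tool)
echo "relative:  $relative_out"

# 3) Absolute path: works from any working directory.
absolute_out=$("$demo_dir/hello_tool")
echo "absolute:  $absolute_out"
```

On a system where $PATH does not include the current directory (the usual default), the bare name is not found while the other two forms run the same file.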


Thanks for the info

Radjin~
Radjin~
ID: 2012787 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2012788 - Posted: 21 Sep 2019, 22:34:36 UTC - in response to Message 2012393.  

You have two things going on - first the very high error count is stopping you getting any new work:

17-Sep-2019 14:16:16 [SETI@home] No tasks sent
17-Sep-2019 14:16:16 [SETI@home] No tasks are available for AstroPulse v7
17-Sep-2019 14:16:16 [SETI@home] No tasks are available for SETI@home v8
17-Sep-2019 14:16:16 [SETI@home] This computer has finished a daily quota of 1 tasks
Yes I understand my errors are limiting my work. Hence all the posts trying to resolve it.

Second - your CPU work is erroring out (never mind the fact that you are worrying about not detecting any GPUs on this computer - having a CPU that is dumping every task is not going to help that).
Task 8056275741
Name blc32_2bit_guppi_58643_84072_HIP35821_0124.8489.818.24.47.110.vlar_1
Workunit 3656197359
Created 17 Sep 2019, 12:10:03 UTC
Sent 17 Sep 2019, 20:03:09 UTC
Report deadline 10 Nov 2019, 1:02:51 UTC
Received 17 Sep 2019, 20:10:24 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 11 (0x0000000B) Unknown error code
Computer ID 8816320
Run time
CPU time
Validate state Invalid
Credit 0.00
Device peak FLOPS 3.83 GFLOPS
Application version SETI@home v8 v8.00
x86_64-pc-linux-gnu
Stderr output

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process got signal 11</message>
<stderr_txt>
SIGSEGV: segmentation violation

</stderr_txt>
]]>
I have not concerned myself with the GPU since getting these errors. One thing at a time.
Radjin~
ID: 2012788 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 2012791 - Posted: 21 Sep 2019, 22:49:15 UTC

I am going to ask a few silly questions and make some observations. You have probably already answered them, but the thread is long to follow, so please forgive me.

1 - You have done so many installations and uninstallations in the last few days that it is highly probable something was left behind, which makes this hard to fix.

2 - Is it possible to reinstall the host from scratch? Including the OS?

3 - Instead of Debian, could you run Ubuntu? Or maybe Mint?

If you decide to start the host from zero, please post so you can be guided step by step to make your host work.

FYI, I had never run any Linux computer before this one. Configuring it and getting it running from zero on Ubuntu took very little time (a couple of hours at most) and almost no headache. OK, a little, but nothing hard to do.

Just a suggestion.
ID: 2012791 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2012801 - Posted: 22 Sep 2019, 0:16:11 UTC - in response to Message 2012791.  

I am going to ask a few silly questions and make some observations. You have probably already answered them, but the thread is long to follow, so please forgive me.

1 - You have done so many installations and uninstallations in the last few days that it is highly probable something was left behind, which makes this hard to fix.

2 - Is it possible to reinstall the host from scratch? Including the OS?

3 - Instead of Debian, could you run Ubuntu? Or maybe Mint?

If you decide to start the host from zero, please post so you can be guided step by step to make your host work.

FYI, I had never run any Linux computer before this one. Configuring it and getting it running from zero on Ubuntu took very little time (a couple of hours at most) and almost no headache. OK, a little, but nothing hard to do.

Just a suggestion.


The thought did occur to me, but this is a fairly new install. I have a web server running on it, and migrating that, including the MySQL database, the VPN server, and all the related files and settings, would definitely push my limits. I basically set up a system that was pretty failsafe, with RAID 1 and nightly backups. If I had the skills to back all that up, do a clean install, and then restore the website and VPN, I probably would. I only expected to ever have to redo the system if my RAID had a catastrophic failure.
Radjin~
ID: 2012801 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2012808 - Posted: 22 Sep 2019, 1:38:19 UTC - in response to Message 2012801.  

One test that will determine if it's boinc or the Science App is fairly easy. Create a new folder, call it test, and place inside the Science App and a WU renamed - work_unit.sah . Then cd to the folder and run the App from the terminal, ./setiathome_8.00_x86_64-pc-linux-gnu, and see if it still crashes.
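The steps above could be sketched roughly as follows. The helper name prepare_standalone_test and all example paths are assumptions; it copies rather than moves the files, so the BOINC installation is left untouched.

```shell
#!/bin/sh
# Hypothetical helper: copy the science app and one work unit into a fresh
# directory, naming the work unit work_unit.sah as described above.
prepare_standalone_test() {
    app=$1        # path to the science app binary
    wu=$2         # path to any downloaded work unit
    test_dir=$3   # new directory to create for the test
    mkdir -p "$test_dir" || return 1
    cp "$app" "$test_dir/" || return 1
    cp "$wu" "$test_dir/work_unit.sah" || return 1
    # Make sure the copied app is executable.
    chmod +x "$test_dir/$(basename "$app")"
}

# Example (paths are assumptions; the app and work units normally sit in the
# projects/setiathome.berkeley.edu folder of the BOINC data directory):
#   prepare_standalone_test \
#       ~/BOINC/projects/setiathome.berkeley.edu/setiathome_8.00_x86_64-pc-linux-gnu \
#       ~/BOINC/projects/setiathome.berkeley.edu/NAME_OF_A_WORK_UNIT \
#       ~/test
#   cd ~/test && ./setiathome_8.00_x86_64-pc-linux-gnu
```

If the app still crashes when started this way, BOINC itself is not involved in the failure.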
ID: 2012808 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2012822 - Posted: 22 Sep 2019, 5:13:50 UTC - in response to Message 2012808.  

One test that will determine if it's boinc or the Science App is fairly easy. Create a new folder, call it test, and place inside the Science App and a WU renamed - work_unit.sah . Then cd to the folder and run the App from the terminal, ./setiathome_8.00_x86_64-pc-linux-gnu, and see if it still crashes.


This is a fantastic idea, but I need a bit more information.

1. Create a directory called test inside the science app.
Not sure what you mean by science app? Inside the boinc-client directory?

2. Copy a work unit to the test directory and rename it “work_unit.sah”
Where are these work units stored?

3. cd to the test folder and run:  ./setiathome_8.00_x86_64-pc-linux-gnu.
I got this part.
Radjin~
ID: 2012822 · Report as offensive
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13751
Credit: 208,696,464
RAC: 304
Australia
Message 2012823 - Posted: 22 Sep 2019, 5:26:42 UTC - in response to Message 2012822.  

1. Create a directory called test inside the science app.
Not sure what you mean by science app? Inside the boinc-client directory?
Nope- Create a new folder, call it test, and place inside (the new folder) the Science App (setiathome_8.00_x86_64-pc-linux-gnu) and a WU (eg blc34_2bit_guppi_58643_86349_HIP33332_0131.20734.409.23.46.102.vlar) renamed - work_unit.sah


2. Copy a work unit to the test directory and rename it “work_unit.sah”
Where are these work units stored?
In the Seti project directory.
Grant
Darwin NT
ID: 2012823 · Report as offensive
Gene Project Donor

Joined: 26 Apr 99
Posts: 150
Credit: 48,393,279
RAC: 118
United States
Message 2012833 - Posted: 22 Sep 2019, 7:22:53 UTC

@Radjin
Just to set my mind at ease regarding read/write permissions... look at the contents of the /BOINC/slots/ directory (i.e. from BOINC working directory, cd slots, then ls -al) and there should be a few directory entries, like "0" "1" "2" etc. Those get used, and re-used, by boinc in sequential fashion to hold the work unit status. Their contents get deleted when the work unit finishes but the "numbered" directories seem to be persistent. If they don't exist then there is certainly a permissions mistake. A trick that Keith M passed along to me long ago is to "disable network activity" in the boinc options- the result being that the /slots/ information is not deleted (since they can't be uploaded) and one has time to inspect the contents at leisure. Here's what it looks like for me:
drwxrwx--x 2 gene gene 4096 Sep 22 00:07 0
drwxrwx--x 2 gene gene 4096 Sep 22 00:07 1
drwxrwx--x 2 gene gene 4096 Sep 22 00:06 2
drwxrwx--x 2 gene gene 4096 Sep 21 23:51 3
drwxrwx--x 2 gene gene 4096 Sep 21 22:33 4
drwxrwx--x 2 gene gene 4096 Sep 21 23:11 5
drwxrwx--x 2 gene gene 4096 Sep 21 23:41 6
drwxrwx--x 2 gene gene 4096 Sep 22 00:08 7
drwxrwx--x 2 gene gene 4096 Sep 21 09:25 8
drwxrwx--x 2 gene gene 4096 Sep 21 09:46 9

You can tell that I'm "owner" and "group" within that directory and that anybody else does NOT have r/w permissions for those directories.
I don't have any experience with headless systems so I am not sure whether "boinc" is the correct owner and group setting in that context. Are you logging in as user=boinc ? I manually start my boinc (client) and boincmgr as user=gene but in a headless machine maybe those get started some other way and the owner/group setup I use is not relevant.
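A minimal sketch for comparing owner, group and mode of the slot directories with the account that runs the client. The helper name show_slot_owners, the BOINC_DIR variable and the default path are assumptions for illustration; it relies on the -c format option of GNU stat.

```shell
#!/bin/sh
# Hypothetical helper: print owner, group and mode for each subdirectory
# of a BOINC slots directory (uses GNU stat's -c format option).
show_slot_owners() {
    slots_dir=$1
    for entry in "$slots_dir"/*; do
        # Skip plain files and the unexpanded pattern of an empty directory.
        [ -d "$entry" ] || continue
        stat -c '%U:%G %A %n' "$entry"
    done
}

# Assumed location; adjust to the actual BOINC data directory.
show_slot_owners "${BOINC_DIR:-$HOME/BOINC}/slots"

# Which account the running client belongs to (the process name "boinc"
# is an assumption; some installs may name the client differently):
#   ps -o user=,comm= -C boinc
```

If the user shown by ps differs from the owner printed for the slot directories and the mode does not grant that user write access, that would point to the permissions mistake described above.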

I saw your "strace" output (in the Q&A forum) in message #2012495 a couple of days ago. I suppose you've figured out by now that you were running an Nvidia graphics Seti (GPU) application which would fail miserably if there was no graphics card with drivers installed. Might be worthwhile to try that strace procedure again with a CPU application.

Keep at it... Eventually "we" will get it fixed and there will be a collective slap of the forehead that it was so obvious.
ID: 2012833 · Report as offensive
Profile Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 2012834 - Posted: 22 Sep 2019, 7:44:13 UTC - in response to Message 2012265.  

how are you accessing the system since it's headless and no monitor, etc?

ssh on the local network? VPN+ssh to a remote machine?


Just ssh locally from my Mac or via an app on the iPad.
This has been touched on a couple of times now, but you always answer with the Linux host in mind I think. You ssh into the Linux host from a Mac, right? You can install BOINC for the Mac and use its BOINC Manager (GUI) to control the remote client, if you set up the remote control options as have been given in this thread (by Keith I think).

If you also do everything by terminal command on your Mac only then the above is moot. But it isn't if only the headless Linux box is the one you work on via command line.
ID: 2012834 · Report as offensive
TBar
Volunteer tester

Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 2012884 - Posted: 22 Sep 2019, 15:20:27 UTC - in response to Message 2012823.  

1. Create a directory called test inside the science app.
Not sure what you mean by science app? Inside the boinc-client directory?
Nope- Create a new folder, call it test, and place inside (the new folder) the Science App (setiathome_8.00_x86_64-pc-linux-gnu) and a WU (eg blc34_2bit_guppi_58643_86349_HIP33332_0131.20734.409.23.46.102.vlar) renamed - work_unit.sah

2. Copy a work unit to the test directory and rename it “work_unit.sah”
Where are these work units stored?
In the Seti project directory.
Uh, yes. The Science App is the CPU, or GPU, or other App, and is in the same place as the Work Units, the setiathome.berkeley.edu folder. Just place the CPU App and any WU into an empty folder and rename the WU to: work_unit.sah
Then run the CPU App from the Terminal. This bypasses anything to do with boinc and will determine if the problem still exists. You could do the same thing with the GPU App, IF you had a GPU properly installed in that machine.
Does it still crash?
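A small sketch for reading the result of that terminal run. Shells such as bash and dash report a process killed by a signal as exit status 128 plus the signal number, so a segmentation fault (signal 11) shows up as 139. The helper name describe_exit is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: translate a shell exit status into a short verdict.
describe_exit() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "exited normally"
    elif [ "$status" -gt 128 ]; then
        # Shells report "killed by signal N" as 128 + N (SIGSEGV is 11).
        echo "killed by signal $((status - 128))"
    else
        echo "exited with error code $status"
    fi
}

# Example use inside the test folder:
#   ./setiathome_8.00_x86_64-pc-linux-gnu; describe_exit $?
describe_exit 139
```

A "killed by signal 11" verdict from the standalone run would match the "process got signal 11" message BOINC reported for the failed tasks.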
ID: 2012884 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2012977 - Posted: 23 Sep 2019, 12:06:18 UTC - in response to Message 2012884.  
Last modified: 23 Sep 2019, 12:08:05 UTC

Out of the blue it starts working. I did nothing other than fly out for a few days.

Thanks for all the suggestions and tips.

When I have time, I will drop in the GPU that started this thread and see what challenges it will offer.

Radjin~
Radjin~
ID: 2012977 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2013489 - Posted: 27 Sep 2019, 13:36:38 UTC

I attempted once again to install the card and the system failed to boot; this after installing a higher output PSU. I contacted the manufacturer and with a meter we ran through some tests looking for a known problem. Sure enough, there was a shorted diode which causes a ground. They shipped me another card and sent a pre-paid return label. So I should have the replacement soon.

Kudos to Zotac and their techs for such great support.
Radjin~
ID: 2013489 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2013505 - Posted: 27 Sep 2019, 15:26:03 UTC - in response to Message 2013489.  

Was the card defective as shipped? Or did you damage it during testing and installation?
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2013505 · Report as offensive
Profile Radjin Project Donor
Joined: 2 May 00
Posts: 105
Credit: 14,928,529
RAC: 102
United States
Message 2021271 - Posted: 1 Dec 2019, 1:36:37 UTC - in response to Message 2013505.  

It was a known defect.
Radjin~
ID: 2021271 · Report as offensive
 