Anything relating to AstroPulse (2) tasks

Message boards : Number crunching : Anything relating to AstroPulse (2) tasks
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1949090 - Posted: 11 Aug 2018, 22:51:37 UTC

The ancients talk of Astro Pulse but I think it is a myth.
ID: 1949090 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1949896 - Posted: 15 Aug 2018, 16:25:38 UTC

So I’ve had a few APs pop up.

2 of the systems are just immediately “Aborting” the AP tasks without even running them (runtime = 0).

What usually causes this?
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1949896 · Report as offensive
Profile betreger Project Donor
Joined: 29 Jun 99
Posts: 11361
Credit: 29,581,041
RAC: 66
United States
Message 1949902 - Posted: 15 Aug 2018, 16:48:05 UTC

I got one, woo hoo
ID: 1949902 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1949904 - Posted: 15 Aug 2018, 17:04:24 UTC - in response to Message 1949896.  

So I’ve had a few APs pop up.

2 of the systems are just immediately “Aborting” the AP tasks without even running them (runtime = 0).

What usually causes this?

Either you don't have the OpenCL drivers loaded or the tasks are 100% radar blanked.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1949904 · Report as offensive
Profile Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1949906 - Posted: 15 Aug 2018, 17:09:32 UTC - in response to Message 1949904.  

got 9, only 1 was 100% blanked
ID: 1949906 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1949911 - Posted: 15 Aug 2018, 17:23:23 UTC - in response to Message 1949904.  

So I’ve had a few APs pop up.

2 of the systems are just immediately “Aborting” the AP tasks without even running them (runtime = 0).

What usually causes this?

Either you don't have the OpenCL drivers loaded or the tasks are 100% radar blanked.


i have the nvidia 396.51 drivers on both systems in question. i assume that has openCL right?

what does "radar blanked" mean?

see this system here, you can check the tasks (i dont know what i'm looking at for task details) :
https://setiathome.berkeley.edu/results.php?hostid=8559920&offset=0&show_names=0&state=6&appid=20
and here:
https://setiathome.berkeley.edu/results.php?hostid=8561893&offset=0&show_names=0&state=0&appid=20

these are the 2 new systems that were just brought online in the past 2-3 days.
each on nvidia drivers 396.51 (installed from ppa)
each on Ubuntu 18.04.1 LTS
each with Petri's special app v0.96 (but i think this only applies to the v8 app, not v7; the v7 app is still there from the stock zi3v package)
each with a 1080ti and a handful of 1060s

this linux computer seems to be running them ok: https://setiathome.berkeley.edu/results.php?hostid=8432395&offset=0&show_names=0&state=0&appid=20
nvidia driver 396.45 installed from nvidia run file
it is on Ubuntu 17.10
with Petri's special app v0.96 (but i think this only applies to the v8 app, not v7; the v7 app is still there from the stock zi3v package)
has 2x 1060

the Windows systems running SoG/Lunatics also seem to be running AP tasks fine

this one: https://setiathome.berkeley.edu/results.php?hostid=8433872&offset=0&show_names=0&state=0&appid=20
nvidia driver 391.35
windows 7 x64
2x 1080ti

this one: https://setiathome.berkeley.edu/results.php?hostid=8555582&offset=0&show_names=0&state=0&appid=20
nvidia driver 397.64
windows 7 x64
2x 1060
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1949911 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34763
Credit: 261,360,520
RAC: 489
Australia
Message 1949942 - Posted: 15 Aug 2018, 20:20:02 UTC

So I’ve had a few APs pop up.

2 of the systems are just immediately “Aborting” the AP tasks without even running them (runtime = 0).

What usually causes this?
I can see from your computer list that 2 of your computers do not show any OpenCL component with your driver version and that is the reason that they are being aborted.

Install the missing OpenCL component and all should be well.

Cheers.
ID: 1949942 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1949944 - Posted: 15 Aug 2018, 20:42:25 UTC - in response to Message 1949942.  
Last modified: 15 Aug 2018, 20:47:25 UTC

Yes, that is what I suspected. The OpenCL component of the drivers didn't get installed. If you have the ppa installed for the 396 drivers, then run:
sudo apt install nvidia-compute-396

This package provides a set of libraries which enable the NVIDIA driver
to use GPUs for parallel general purpose computation through CUDA and
OpenCL.
[Edit] On radar blanking: the Arecibo telescope is also used as a high-power radar transmitter. While we are recording on the Alpha recorder, whenever the main dish sends out a radar pulse, we encode that time interval into our recordings so that the data is marked as corrupted by the radar pulse.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1949944 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1949955 - Posted: 15 Aug 2018, 21:21:25 UTC - in response to Message 1949942.  

So I’ve had a few APs pop up.

2 of the systems are just immediately “Aborting” the AP tasks without even running them (runtime = 0).

What usually causes this?
I can see from your computer list that 2 of your computers do not show any OpenCL component with your driver version and that is the reason that they are being aborted.

Install the missing OpenCL component and all should be well.

Cheers.


that's weird that the 396.45 drivers would include the openCL components and the 396.51 drivers dont.

i usually just install the drivers by adding the nvidia proprietary PPA repo, then simply "sudo apt-get install nvidia-driver-396" and the PPA autoselects whichever version.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1949955 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1949956 - Posted: 15 Aug 2018, 21:25:53 UTC - in response to Message 1949944.  
Last modified: 15 Aug 2018, 21:28:27 UTC

Yes, that is what I suspected. The OpenCL component of the drivers didn't get installed. If you have the ppa installed for the 396 drivers, then run:
sudo apt install nvidia-compute-396

This package provides a set of libraries which enable the NVIDIA driver
to use GPUs for parallel general purpose computation through CUDA and
OpenCL.
[Edit] On radar blanking: the Arecibo telescope is also used as a high-power radar transmitter. While we are recording on the Alpha recorder, whenever the main dish sends out a radar pulse, we encode that time interval into our recordings so that the data is marked as corrupted by the radar pulse.


nvidia drivers are obviously installed. nvidia-smi reports 396.51 as such. and the v8 app crunches cuda90 just fine.

i've never had to do a separate install for nvidia-compute-[version] vs just running nvidia-driver-[version].

about the "radar" jargon, what i'm asking is, how do i tell if a specific work unit is "radar blanked". i don't see that verbiage in any of the associated WU info.

this is a task that failed : https://setiathome.berkeley.edu/result.php?resultid=6893323875

was it "radar blanked"?
and how can you tell?
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1949956 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1949957 - Posted: 15 Aug 2018, 21:29:32 UTC - in response to Message 1949956.  
Last modified: 15 Aug 2018, 21:29:58 UTC

this is a task that failed : https://setiathome.berkeley.edu/result.php?resultid=6893323875

was it "radar blanked"?
and how can you tell?


https://setiathome.berkeley.edu/result.php?resultid=6893323875

Look at the exit status... 201 (0x000000C9) EXIT_MISSING_COPROC

So it is not radar blank related.
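
For reference, the number on the result page is one value shown twice, in decimal and in hex. A minimal sketch, assuming only the two statuses mentioned in this thread (0 and 201); the function name describe_exit is made up for illustration and is not part of BOINC:

```shell
#!/bin/sh
# Hypothetical helper: label the exit statuses mentioned in this thread.
# Only 0 and 201 come from the thread; anything else is reported as unknown.
describe_exit() {
    code=$1
    case "$code" in
        0)   name="normal exit" ;;
        201) name="EXIT_MISSING_COPROC" ;;
        *)   name="unknown to this sketch" ;;
    esac
    # Print decimal and zero-padded hex, in the layout the result page uses.
    printf '%d (0x%08X) %s\n' "$code" "$code" "$name"
}

describe_exit 201
```

A task that dies with 201 and a runtime of 0 never started crunching, which fits a missing coprocessor rather than anything about the data in the work unit.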
ID: 1949957 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34763
Credit: 261,360,520
RAC: 489
Australia
Message 1949958 - Posted: 15 Aug 2018, 21:31:31 UTC

Both your rigs with driver 396.51 have no OpenCL support listed, which means that you have CUDA support (but no OpenCL which is needed for Astropulse).

Cheers.
ID: 1949958 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1949960 - Posted: 15 Aug 2018, 21:37:02 UTC - in response to Message 1949957.  

this is a task that failed : https://setiathome.berkeley.edu/result.php?resultid=6893323875

was it "radar blanked"?
and how can you tell?


https://setiathome.berkeley.edu/result.php?resultid=6893323875

Look at the exit status... 201 (0x000000C9) EXIT_MISSING_COPROC

So it is not radar blank related.


what would it say if it was?
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1949960 · Report as offensive
juan BFP Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 16 Mar 07
Posts: 9786
Credit: 572,710,851
RAC: 3,799
Panama
Message 1949964 - Posted: 15 Aug 2018, 21:52:41 UTC - in response to Message 1949960.  

what would it say if it was?

AFAIK the exit status will be like any other WU: 0 (0x00000000)

The crunching will be very fast... a few seconds on my host.

I don't run AP anymore, so I can't post an example.
ID: 1949964 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34763
Credit: 261,360,520
RAC: 489
Australia
Message 1949968 - Posted: 15 Aug 2018, 22:04:27 UTC
Last modified: 15 Aug 2018, 22:07:10 UTC

this is a task that failed : https://setiathome.berkeley.edu/result.php?resultid=6893323875

was it "radar blanked"?
and how can you tell?


https://setiathome.berkeley.edu/result.php?resultid=6893323875

Look at the exit status... 201 (0x000000C9) EXIT_MISSING_COPROC

So it is not radar blank related.


what would it say if it was?
As I've already explained, those 2 setups have no OpenCL support in the drivers (if they did, it would be listed, just like it is on your other rigs). OpenCL is what the app is looking for, and when it can't find that support the app exits with "EXIT_MISSING_COPROC" in the Stderr output for that task, which is what you are getting.

Either install the missing OpenCL part of your drivers or stop accepting Astropulse work.

Keith Myers has already posted how to get that OpenCL component installed. ;-)

[edit] BTW I picked up 11 AP's this morning.

Cheers.
ID: 1949968 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1949969 - Posted: 15 Aug 2018, 22:09:50 UTC

i recognize that. but im still on the hunt to understand the exact behavior. nothing wrong with wanting to know exactly how things are working and how you can tell certain aspects of a WU by looking at it.

I know the driver is missing the OpenCL component now. I'm not disputing that. i can remedy that once i get home

but it IS strange that i've never had this issue before. using linux and older driver versions (390 and older), i've always just run "sudo apt-get install nvidia-driver-[version]". it installs, i reboot, and on my way.

why would previous driver version install the opencl component for me, while now it does not?

something is different, given an identical install. there has to be a reason for that.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1949969 · Report as offensive
Profile Wiggo
Joined: 24 Jan 00
Posts: 34763
Credit: 261,360,520
RAC: 489
Australia
Message 1949971 - Posted: 15 Aug 2018, 22:27:59 UTC

From what I've heard, it has something to do with the 18.04 version you're using on those 2 rigs. Your other Linux rigs are on older versions and are not affected by this.

Cheers.
ID: 1949971 · Report as offensive
Profile Keith Myers Special Project $250 donor
Volunteer tester
Avatar

Send message
Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1949994 - Posted: 16 Aug 2018, 0:09:17 UTC

I think there is something wrong with the 396.51 release from the ppa. Too many people are reporting that the OpenCL component went missing and did not get installed when the driver moved from 396.45 to 396.51.

The output in stderr.txt will specifically report 100% blanked for a radar blanked AP task. It will run very fast and give you maybe 2 credits. Don't have any examples to show since they would have been long cleared by now.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1949994 · Report as offensive
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1949997 - Posted: 16 Aug 2018, 0:31:00 UTC
Last modified: 16 Aug 2018, 0:40:09 UTC

sudo apt install nvidia-compute-396

Returns unable to locate package.

What now?

just tried this from the other thread:

sudo apt-get install ocl-icd-libopencl1


seems to have worked. we'll see when another AP task rolls around
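
Rather than waiting for the next AP task, the install can be sanity-checked from a shell. A rough sketch, assuming the usual Linux layout where vendor ICD files live in /etc/OpenCL/vendors; the function name has_opencl_icd is made up for illustration:

```shell
#!/bin/sh
# Sketch: report whether any OpenCL vendor (ICD) file is registered.
# The directory is a parameter so the check can be pointed elsewhere;
# /etc/OpenCL/vendors is the usual Linux location, assumed here.
has_opencl_icd() {
    dir=${1:-/etc/OpenCL/vendors}
    for f in "$dir"/*.icd; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

if has_opencl_icd; then
    echo "OpenCL ICD file found"
else
    echo "no OpenCL ICD file found"
fi

# Is the OpenCL loader library known to the dynamic linker?
if command -v ldconfig >/dev/null 2>&1; then
    ldconfig -p | grep libOpenCL || echo "libOpenCL not in linker cache"
fi
```

If the clinfo package happens to be installed, running clinfo should also list the NVIDIA platform once OpenCL is working.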
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1949997 · Report as offensive
Profile Brent Norman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 1 Dec 99
Posts: 2786
Credit: 685,657,289
RAC: 835
Canada
Message 1950001 - Posted: 16 Aug 2018, 0:45:53 UTC - in response to Message 1949997.  

You could try 'ldd' on the AstroPulse app and see if it reports anything missing.
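
A small sketch of that check, for anyone following along. The app path is a placeholder (the real AstroPulse binary name in the project directory is not given in this thread), and the AP_APP variable and missing_libs function are made up for illustration:

```shell
#!/bin/sh
# Sketch: list shared libraries that ldd cannot resolve for a binary.
# Reads ldd-style output on stdin and prints the name of each library
# that is marked "not found".
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Hypothetical path; set AP_APP to the real AstroPulse binary.
app=${AP_APP:-./astropulse_app}
if [ -x "$app" ]; then
    ldd "$app" | missing_libs
fi
```

No output means every library resolved. A line such as libOpenCL.so.1 would point at the OpenCL loader still being absent.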
ID: 1950001 · Report as offensive


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.