all 14au18ac GPU tasks on one system running 3 seconds?

Message boards : Number crunching : all 14au18ac GPU tasks on one system running 3 seconds?
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1950213 - Posted: 17 Aug 2018, 3:16:45 UTC

So I just put up a new system, and it's having all kinds of strange issues.

One of them is that all the 14au18ac tasks are finishing in 3-5 seconds, which doesn't seem right, and my other systems don't seem to be doing this. None of the tasks have been validated yet, but I have a feeling they will all go invalid.

ubuntu 18.04
nvidia 396.51 drivers
gtx 1060

They don't report an error, but it's suspicious that this kind of job always takes 3 seconds on this system.

task example: https://setiathome.berkeley.edu/result.php?resultid=6897037707

Stderr output
<core_client_version>7.9.3</core_client_version>
<![CDATA[
<stderr_txt>
setiathome_CUDA: Found 2 CUDA device(s):
  Device 1: GeForce GTX 1060 6GB, 6076 MiB, regsPerBlock 65536
     computeCap 6.1, multiProcs 10 
     pciBusID = 4, pciSlotID = 0
  Device 2: GeForce GTX 1060 6GB, 6078 MiB, regsPerBlock 65536
     computeCap 6.1, multiProcs 10 
     pciBusID = 5, pciSlotID = 0
In cudaAcc_initializeDevice(): Boinc passed DevPref 1
setiathome_CUDA: CUDA Device 1 specified, checking...
   Device 1: GeForce GTX 1060 6GB is okay
SETI@home using CUDA accelerated device GeForce GTX 1060 6GB
Unroll autotune 10. Overriding Pulse find periods per launch. Parameter -pfp set to 10
Using default pulse Fft limit (-pfl 64)

setiathome v8 enhanced x41p_V0.96, Cuda 9.20 special
Compiled with NVCC, using static libraries. Modifications done by petri33 and released to the public by TBar.



Detected setiathome_enhanced_v8 task. Autocorrelations enabled, size 128k elements.
Work Unit Info:
...............
WU true angle range is :  9.297712
Sigma 0
Thread call stack limit is: 1k
Pulse: peak=inf, time=23.36, period=0.172, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=23.65, period=0.1524, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=23.93, period=0.1921, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=24.22, period=0.1577, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=24.51, period=0.1593, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=24.8, period=0.1528, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=25.09, period=0.1843, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=25.38, period=0.1516, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=25.66, period=0.1581, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=25.95, period=0.1724, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=26.24, period=0.1864, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=26.53, period=0.1819, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=26.82, period=0.1896, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=27.11, period=0.1675, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=27.39, period=0.1905, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=27.68, period=0.181, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=27.97, period=0.1671, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=28.26, period=0.1577, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=28.55, period=0.1556, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=28.84, period=0.1581, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=29.12, period=0.1499, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=29.41, period=0.1831, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=29.7, period=0.1749, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=29.99, period=0.1827, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=30.28, period=0.1761, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=30.57, period=0.154, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=30.85, period=0.1487, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=31.14, period=0.1651, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=31.43, period=0.186, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
Pulse: peak=inf, time=31.72, period=0.1313, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8 
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated.

Best spike: peak=6.95862, time=55.06, d_freq=1420894775.39, chirp=0, fft_len=8 
Best autocorr: peak=0, time=-2.124e+11, delay=0, d_freq=0, chirp=0, fft_len=0 
Best gaussian: peak=0, mean=0, ChiSq=0, time=-2.124e+11, d_freq=0,
	score=-12, null_hyp=0, chirp=0, fft_len=0 
Best pulse: peak=0, time=-2.124e+11, period=0, d_freq=0, score=0, chirp=0, fft_len=0 
Best triplet: peak=0, time=-2.124e+11, period=0, d_freq=0, chirp=0, fft_len=0 
Spike count:    0
Autocorr count: 0
Pulse count:    30
Triplet count:  0
Gaussian count: 0
22:51:23 (1489): called boinc_finish(0)

</stderr_txt>
]]>

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 1950213
Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1950214 - Posted: 17 Aug 2018, 3:27:03 UTC - in response to Message 1950213.  

Those are overflows. You can see the -9 here

Pulse: peak=inf, time=31.43, period=0.186, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8
Pulse: peak=inf, time=31.72, period=0.1313, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated.
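For anyone wanting to check a batch of tasks for the same condition, the overflow marker can be picked out of the saved stderr text. A minimal Python sketch, illustrative only; the marker strings are copied from the stderr quoted above, and file handling is left out:

```python
# Minimal sketch: flag tasks that ended in the -9 result_overflow condition
# by scanning their saved stderr text.

def is_overflow(stderr_text: str) -> bool:
    """True if the task's stderr contains the -9 result_overflow marker."""
    return "result_overflow" in stderr_text

def inf_pulse_count(stderr_text: str) -> int:
    """Count Pulse reports with a non-finite peak (peak=inf, score=nan),
    the kind of result that filled the storage in the task above."""
    return sum(
        1
        for line in stderr_text.splitlines()
        if line.startswith("Pulse:") and "peak=inf" in line
    )

sample = """\
Pulse: peak=inf, time=31.43, period=0.186, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8
Pulse: peak=inf, time=31.72, period=0.1313, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated."""

print(is_overflow(sample), inf_pulse_count(sample))  # → True 2
```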
ID: 1950214
arkayn
Volunteer tester
Joined: 14 May 99
Posts: 4438
Credit: 55,006,323
RAC: 0
United States
Message 1950215 - Posted: 17 Aug 2018, 3:34:37 UTC - in response to Message 1950214.  

Those are overflows. You can see the -9 here

Pulse: peak=inf, time=31.43, period=0.186, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8
Pulse: peak=inf, time=31.72, period=0.1313, d_freq=1420899658.2, score=nan, chirp=0, fft_len=8
SETI@Home Informational message -9 result_overflow
NOTE: The number of results detected equals the storage space allocated.


Those are running between 7 and 10 minutes on my 1070s.

ID: 1950215
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1950217 - Posted: 17 Aug 2018, 3:40:39 UTC - in response to Message 1950213.  

So I just put up a new system, and it's having all kinds of strange issues.

Can't say you weren't warned;
I'm still trying to figure out why they hate Arecibo shorties, once that is fixed I will post a link to the fully working App.

Not only does V0.91-7 hate shorties, but on the Mac it also hates the normal-length Arecibo tasks.
Your only solution is to run zi3v on the Arecibo tasks, like I'm trying to do.
Or just go back to zi3v, period.
ID: 1950217
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1950218 - Posted: 17 Aug 2018, 3:41:42 UTC

The very first ones split off the tape were all early overflows. I have several pages where almost all of them overflowed.
https://setiathome.berkeley.edu/results.php?userid=14084&offset=840&show_names=1&state=0&appid=
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 1950218
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1950222 - Posted: 17 Aug 2018, 3:58:56 UTC

Does overflow mean a 3-second run time is normal?

ID: 1950222
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1950224 - Posted: 17 Aug 2018, 4:00:38 UTC - in response to Message 1950217.  

So I just put up a new system, and it's having all kinds of strange issues.

Can't say you weren't warned;
I'm still trying to figure out why they hate Arecibo shorties, once that is fixed I will post a link to the fully working App.

Not only does V0.91-7 hate shorties, but on the Mac it also hates the normal-length Arecibo tasks.
Your only solution is to run zi3v on the Arecibo tasks, like I'm trying to do.
Or just go back to zi3v, period.


I don't think that's it. All of my Linux systems are running v0.96.

None of them have this behavior. Only this one system.

ID: 1950224
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1950226 - Posted: 17 Aug 2018, 4:04:59 UTC - in response to Message 1950224.  

Are the other ones running shorties, or just the normal-length Arecibos?
The ones with an AR of more than 1.5 are the ones that give problems on the Linux systems.
ID: 1950226
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1950230 - Posted: 17 Aug 2018, 4:18:37 UTC

I don't know what AR means, or how to find out what the AR of a task is/was.

I just set it up and let it run.

ID: 1950230
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1950231 - Posted: 17 Aug 2018, 4:23:41 UTC - in response to Message 1950230.  

Work Unit Info:
...............
WU true angle range is : 9.297712
Sigma 0

That is from the one posted above; the AR is 9.297712.
Hence it is above 1.5, and it fails on every one of my machines.
The ones at 1.2 and below work on my Linux systems.
On the Mac they sort of work, but give too high a pulse count, i.e. the count should be 4 and it reports 8.
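To illustrate TBar's rule of thumb, the AR line can be parsed out of a task's stderr and compared against the 1.5 cutoff. A rough Python sketch; the line format is copied from the stderr quoted earlier, and the cutoff is just the value mentioned in this thread:

```python
# Rough sketch: pull the WU true angle range out of a task's stderr and
# flag ARs above 1.5, the cutoff discussed in this thread.
import re
from typing import Optional

AR_LINE = re.compile(r"WU true angle range is\s*:\s*([0-9.]+)")

def angle_range(stderr_text: str) -> Optional[float]:
    """Extract the WU true angle range, or None if the line is absent."""
    m = AR_LINE.search(stderr_text)
    return float(m.group(1)) if m else None

def is_shorty(stderr_text: str, cutoff: float = 1.5) -> bool:
    """Flag tasks whose AR exceeds the cutoff."""
    ar = angle_range(stderr_text)
    return ar is not None and ar > cutoff

sample = "WU true angle range is :  9.297712"
print(angle_range(sample), is_shorty(sample))  # → 9.297712 True
```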
ID: 1950231
Zalster Special Project $250 donor
Volunteer tester
Joined: 27 May 99
Posts: 5517
Credit: 528,817,460
RAC: 242
United States
Message 1950232 - Posted: 17 Aug 2018, 4:23:42 UTC - in response to Message 1950230.  

AR stands for angle range and is located in the stderr report. For the work unit you listed, this is it:

WU true angle range is : 9.297712

ID: 1950232
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 1950285 - Posted: 17 Aug 2018, 13:17:10 UTC - in response to Message 1950231.  

I guess I just got unlucky enough to get 20+ of these tasks in a row on the same system?

My other v0.96 systems don't see this nearly as often, maybe once every couple of days, and they have a lot more power and process many more WUs than this one.

ID: 1950285
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1950356 - Posted: 17 Aug 2018, 18:37:27 UTC - in response to Message 1950285.  

Look on the good side: now Petri is convinced it's a serious problem and is looking into it more seriously.
He got the same results, BTW: https://setiathome.berkeley.edu/results.php?hostid=7475713&state=5
ID: 1950356
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1950458 - Posted: 18 Aug 2018, 0:21:56 UTC

It is a known problem; I'm working on it.
I did not have a WU in my test cases that would have revealed it, but now I've grabbed one.
I'll work on it tomorrow.
To overcome Heisenbergs:
"You can't always get what you want / but if you try sometimes you just might find / you get what you need." -- Rolling Stones
ID: 1950458
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1950460 - Posted: 18 Aug 2018, 0:36:07 UTC - in response to Message 1950458.  

If it ain't broke . . . .ya can't fix it!
ID: 1950460
Jord
Volunteer tester
Joined: 9 Jun 99
Posts: 15184
Credit: 4,362,181
RAC: 3
Netherlands
Message 1950533 - Posted: 18 Aug 2018, 13:12:53 UTC

All 14au18a* tasks run for double to triple their normal time on my Android devices.
ID: 1950533
TBar
Volunteer tester
Joined: 22 May 99
Posts: 5204
Credit: 840,779,836
RAC: 2,768
United States
Message 1950535 - Posted: 18 Aug 2018, 13:16:17 UTC - in response to Message 1950533.  

All shorties, no matter where they came from, screw up on all my machines when using the same app as the OP.
ID: 1950535
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1950582 - Posted: 18 Aug 2018, 19:50:26 UTC - in response to Message 1950535.  
Last modified: 18 Aug 2018, 19:59:09 UTC

Try this on Linux with a 10x0:
executable https://drive.google.com/open?id=1pe6-p5zn27tXFvvszyGCzo0OfCkkCqDt
source https://drive.google.com/open?id=17Djj2E8Pxcd7k2WouYBskfPGrtxwjvFO
ID: 1950582
Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13161
Credit: 1,160,866,277
RAC: 1,873
United States
Message 1950604 - Posted: 18 Aug 2018, 22:06:18 UTC - in response to Message 1950582.  

Try this on Linux with a 10x0:
executable https://drive.google.com/open?id=1pe6-p5zn27tXFvvszyGCzo0OfCkkCqDt
source https://drive.google.com/open?id=17Djj2E8Pxcd7k2WouYBskfPGrtxwjvFO

Petri, I could use some simple instructions on how to make this one work. I installed the CUDA 9.2 Toolkit and verified nvcc and clinfo, changed the app name in app_info, and then threw away all my GPU work. I did not install the 396.24 driver from the Toolkit because I was already running 396.51.

I followed these instructions:
How-to-install-CUDA-9-2-on-Ubuntu-18-04

Can you point out where I went wrong? TIA.
ID: 1950604
petri33
Volunteer tester
Joined: 6 Jun 02
Posts: 1668
Credit: 623,086,772
RAC: 156
Finland
Message 1950610 - Posted: 18 Aug 2018, 22:20:05 UTC - in response to Message 1950604.  

Here is an example of an app_info.xml file.
The <file_info> part must state the name of the executable so that it does not get deleted when BOINC starts.
The same name goes in the <app_version> part, in <file_name> xxx </file_name>.
Sometimes it is easiest to suspend BOINC, copy the new executable over the old one (keeping the old file name), and resume computing.

<app_info>
  <app>
    <name>astropulse_v7</name>
  </app>
  
  <file_info>
    <name>ap_7.01r2793_sse3_clGPU_x86_64</name>
    <executable/>
  </file_info>
  
 
  <app_version>
    <app_name>astropulse_v7</app_name>
    <version_num>708</version_num>
    <platform>linux_x86_64</platform>
    <plan_class>opencl_nvidia_100</plan_class> 
    <cmdline> -verb -st -nog -unroll 80 -ffa_block 2304 -ffa_block_fetch 1152 -oclFFT_plan 256 16 256 </cmdline> 
    <coproc> 
      <type>NVIDIA</type> 
      <count>0.33</count> 
    </coproc>
    <file_ref>
      <file_name>ap_7.01r2793_sse3_clGPU_x86_64</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  
  <app>
    <name>setiathome_v8</name>
  </app>
  
  <file_info>
    <name>MBv8_8.22r3712_avx2_x86_64-pc-linux-gnu</name>
    <namex>MBv8_8.05r3345_avx_linux64</namex>
    <executable/>
  </file_info>
  
  <file_info>
    <name>setiathome_x41zc_x86_64-pc-linux-gnu_cuda65_v8</name>
    <executable/>
  </file_info>
  
  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>800</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <file_ref>
      <file_name>MBv8_8.22r3712_avx2_x86_64-pc-linux-gnu</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>801</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <file_ref>
      <file_name>MBv8_8.22r3712_avx2_x86_64-pc-linux-gnu</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  
  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>804</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <file_ref>
      <file_name>MBv8_8.22r3712_avx2_x86_64-pc-linux-gnu</file_name>
      <main_program/>
    </file_ref>
  </app_version>   

  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>808</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <avg_ncpus>0.1</avg_ncpus> 
    <max_ncpus>0.1</max_ncpus> 
    <plan_class>nvidia_gpu</plan_class> 
    <cmdline> -nobs -pfb 32 </cmdline> 
    <coproc> 
      <type>NVIDIA</type> 
      <count>1.00</count> 
    </coproc>
    <file_ref>
      <file_name>setiathome_x41zc_x86_64-pc-linux-gnu_cuda65_v8</file_name>
      <main_program/>
    </file_ref>
  </app_version>

  <app_version>
    <app_name>setiathome_v8</app_name>
    <version_num>809</version_num>
    <platform>x86_64-pc-linux-gnu</platform>
    <avg_ncpus>0.1</avg_ncpus> 
    <max_ncpus>0.1</max_ncpus> 
    <plan_class>opencl_nvidia_sah</plan_class> 
    <cmdline> -nobs -pfb 32  </cmdline> 
    <coproc> 
      <type>NVIDIA</type> 
      <count>1.00</count> 
    </coproc>
    <file_ref>
      <file_name>setiathome_x41zc_x86_64-pc-linux-gnu_cuda65_v8</file_name>
      <main_program/>
    </file_ref>
  </app_version>
  
</app_info>
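A quick way to double-check the name-matching rule described above (every <file_name> referenced by an <app_version> must also be declared in a <file_info> block, or BOINC deletes the executable at startup) is a short standard-library Python sketch; the file name used here is only an illustrative placeholder:

```python
# Sketch of a consistency check for app_info.xml: find <file_name> values
# referenced by <app_version> blocks that have no matching <file_info>.
import xml.etree.ElementTree as ET

def missing_file_infos(app_info_xml: str) -> list:
    """Return referenced <file_name> values with no matching <file_info>."""
    root = ET.fromstring(app_info_xml)
    declared = {el.text.strip() for el in root.findall("./file_info/name")}
    referenced = {el.text.strip()
                  for el in root.findall("./app_version/file_ref/file_name")}
    return sorted(referenced - declared)

sample = """<app_info>
  <file_info><name>setiathome_x41p_V0.96</name><executable/></file_info>
  <app_version>
    <app_name>setiathome_v8</app_name>
    <file_ref><file_name>setiathome_x41p_V0.96</file_name><main_program/></file_ref>
  </app_version>
</app_info>"""

print(missing_file_infos(sample))  # → []
```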



And here is an example of an app_config.xml:
<app_config>
  <xproject_max_concurrent>10</xproject_max_concurrent>
  
  <app>
    <name>astropulse_v7</name>
    <max_concurrent>8</max_concurrent>
    <gpu_versions>
      <gpu_usage>0.50</gpu_usage>
      <cpu_usage>0.125</cpu_usage>
    </gpu_versions>
  </app>

  <app>
    <name>setiathome_v8</name>
    <xmax_concurrent>10</xmax_concurrent>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>0.1</cpu_usage>
    </gpu_versions>
  </app>
  
  <app_version>
    <app_name>astropulse_v7</app_name>
    <plan_class>opencl_nvidia_100</plan_class>
    <cmdline> -verb -st -nog -unroll 80 -sbs 2048 -ffa_block 2304 -ffa_block_fetch 1152 -oclFFT_plan 256 16 256 </cmdline>
  </app_version>

  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>opencl_nvidia_sah</plan_class>
    <cmdline> -pfb 32 -nobs -pfl 64 </cmdline>
  </app_version>

  <app_version>
    <app_name>setiathome_v8</app_name>
    <plan_class>nvidia_gpu</plan_class>
    <cmdline> -pfb 32 -nobs -pfl 64 </cmdline>
  </app_version>
  
 </app_config>

ID: 1950610
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.