SETI orphans

Message boards : Number crunching : SETI orphans

Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073304 - Posted: 14 Apr 2021, 12:11:57 UTC - in response to Message 2073298.  

Bah. I thought I was getting somewhere, until...


So, Microsoft, it was you that b****y decided to set up OneDrive for me, you're managing it for me, and now you tell me it's full? Now you know why I stick with Windows 7.

500GB SSD is in the post...
ID: 2073304 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2073311 - Posted: 14 Apr 2021, 13:18:13 UTC

FYI, you can run these GPU tasks on OpenCL 1.1; 1.2 doesn't seem to be absolutely necessary (maybe just an artificial limit project-side).

Jobs ran fine on my GTX 550 Ti, which is OpenCL 1.1.

https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,43352_offset,30#655567
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2073311 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073317 - Posted: 14 Apr 2021, 14:28:27 UTC - in response to Message 2073304.  

Richard, even if you manage to capture a full run, it will be hardly usable. Even with a 2-3 minute log I wait 10-20 seconds for each presentation update...
Also, the longer the run, the coarser the resolution.
If you want to catch the gap between jobs, better to start paused, hit "play" just before one job finishes, and hit pause or stop once the second job has done a few gradient steps.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2073317 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073318 - Posted: 14 Apr 2021, 14:29:03 UTC - in response to Message 2073311.  
Last modified: 14 Apr 2021, 14:33:13 UTC

FYI, you can run these GPU tasks on openCL 1.1, 1.2 doesnt seem to be absolutely necessary (maybe just an artificial limit project-side)

jobs ran fine on my GTX 550ti which is OpenCL 1.1


I ran it (until the driver crashed) on a GTX 460 SE :)
Haven't managed to capture a profile log so far, though.

Only Nsight for Visual Studio is able to trace (in theory) OpenCL on NV, so I had to install VS 2019 as well.
Nsight Compute is only for CUDA....
[And of course the latest build dropped support for pre-Win10 OSes, so I had to use an older one]
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2073318 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2073323 - Posted: 14 Apr 2021, 15:04:29 UTC - in response to Message 2073318.  

I'm going to go even older, I'll see if a GTX 295 will run LOL
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2073323 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073324 - Posted: 14 Apr 2021, 15:14:57 UTC - in response to Message 2073323.  

I'm going to go even older, I'll see if a GTX 295 will run LOL
I've got a 9800 GT we could try...
ID: 2073324 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2073325 - Posted: 14 Apr 2021, 15:30:53 UTC - in response to Message 2073324.  

I'm going to go even older, I'll see if a GTX 295 will run LOL
I've got a 9800 GT we could try...

oh lord LOL
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2073325 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2073335 - Posted: 14 Apr 2021, 17:59:28 UTC - in response to Message 2073323.  

I'm going to go even older, I'll see if a GTX 295 will run LOL


it took some fiddling but it just picked up a couple OPNG tasks on it.

GPU is nearly pegged to 99-100% on both cores (two tasks, dual GPU), i haven't noticed many dips.
~145MB VRAM used
temps also fine at about 62C on an open air test bench
looks like it'll take an hour to run

we'll see if it completes and/or validates, an absolute waste of power though LOL.

not bad for an OpenCL 1.0 card (online specs list 1.1, driver reports 1.0)
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2073335 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073336 - Posted: 14 Apr 2021, 18:15:22 UTC - in response to Message 2073335.  

I'm going to go even older, I'll see if a GTX 295 will run LOL


it took some fiddling but it just picked up a couple OPNG tasks on it.

GPU is nearly pegged to 99-100% on both cores (two tasks, dual GPU), i haven't noticed many dips.
~145MB VRAM used
temps also fine at about 62C on an open air test bench
looks like it'll take an hour to run

we'll see if it completes and/or validates, an absolute waste of power though LOL.

not bad for an OpenCL 1.0 card (online specs list 1.1, driver reports 1.0)

And no driver restarts due to TDR? Or did you disable it already?
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2073336 · Report as offensive     Reply Quote
Ian&Steve C.
Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2073341 - Posted: 14 Apr 2021, 19:04:41 UTC - in response to Message 2073336.  
Last modified: 14 Apr 2021, 19:05:33 UTC

And no driver restarts due to TDR? Or did you disable it already?


I don't know what "TDR" refers to. I didn't really do anything special on the software side. I only edited the coproc_info.xml file to make the card appear to BOINC as OpenCL 1.2 capable (so that WCG would send me a task).

the "fiddling" was on the hardware side. the card seems to have broken display outputs, so I used the GTX 550 Ti to run the monitor and excluded the GTX 550 Ti from BOINC with the <ignore_nvidia_dev> option in cc_config.xml.
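
For anyone who wants to do the same exclusion, here is a minimal sketch that writes out such a cc_config.xml. The device number 0 is an assumption for illustration only (check the numbering BOINC reports in its startup messages on your own host), and the helper name `build_cc_config` is made up for this example.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ignored_nvidia_devices):
    """Return cc_config.xml text that asks BOINC to skip the given NVIDIA devices."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for device_index in ignored_nvidia_devices:
        # One <ignore_nvidia_dev> element per device number to exclude.
        ET.SubElement(options, "ignore_nvidia_dev").text = str(device_index)
    if hasattr(ET, "indent"):  # pretty-printing helper, available in Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumption: the card driving the monitor is device 0 on this host.
config_text = build_cc_config([0])
print(config_text)
```

The printed text would go into cc_config.xml in the BOINC data directory; restarting the client afterwards is the safe way to make sure the option is picked up.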

the card also didn't want to boot or play nice, probably CSM/UEFI issues and conflicts with such a new OS (Ubuntu 20.04). It could also be some idiosyncrasies of my specific motherboard (ASUS P9X79 E-WS), which has an unusual PCIe layout with embedded PLX switches that expand the lanes for several PCIe slots. So I played around with different slots until I found one that let things boot properly and run.

once there, I just installed the driver package as normal (340.108 drivers from Ubuntu graphics drivers PPA). once drivers were installed I noticed no other issues. I did not experience any driver crashes or anything like that.

pretty standard stuff.

it completed a task fine with no errors, but was marked invalid. The project is having an issue right now with a high number of Invalids though, so I can't say if this is a "real" invalid, or a victim of the current situation.

Result Name: OPNG_ 0001892_ 00298_ 1--
<core_client_version>7.17.0</core_client_version>
<![CDATA[
<stderr_txt>
../../projects/www.worldcommunitygrid.org/wcgrid_opng_autodockgpu_7.28_x86_64-pc-linux-gnu__opencl_nvidia_102 -jobs OPNG_0001892_00298.job -input OPNG_0001892_00298.zip -seed 483169743 -wcgruns 1000 -wcgdpf 20
INFO: Using gpu device from app init data 0
INFO:[13:15:07] Start AutoGrid...

autogrid4: Successful Completion.
INFO:[13:15:11] End AutoGrid...
INFO:[13:15:11] Start AutoDock for ZINC000818558340_2-ACR2.11_RX1--fr2266benz_002--CYS114.dpf(Job #0)...
OpenCL device: GeForce GTX 295
INFO:[13:15:33] End AutoDock...
INFO:[13:15:33] Start AutoDock for ZINC000818542949-ACR2.1_RX1--fr2266benz_002--CYS114.dpf(Job #1)...
OpenCL device: GeForce GTX 295
INFO:[13:19:07] End AutoDock...
INFO:[13:19:07] Start AutoDock for ZINC000424339466-ACR2.27_RX1--fr2266benz_002--CYS114.dpf(Job #2)...
OpenCL device: GeForce GTX 295
INFO:[13:23:02] End AutoDock...
INFO:[13:23:02] Start AutoDock for ZINC000884599826-ACR2.24_RX1--fr2266benz_002--CYS114.dpf(Job #3)...
OpenCL device: GeForce GTX 295
INFO:[13:28:50] End AutoDock...
INFO:[13:28:50] Start AutoDock for ZINC000871824283-ACR2.22_RX1--fr2266benz_002--CYS114.dpf(Job #4)...
OpenCL device: GeForce GTX 295
INFO:[13:35:54] End AutoDock...
INFO:[13:35:54] Start AutoDock for ZINC000582912506-ACR2.16_RX1--fr2266benz_002--CYS114.dpf(Job #5)...
OpenCL device: GeForce GTX 295
INFO:[13:38:27] End AutoDock...
INFO:[13:38:27] Start AutoDock for ZINC000912023328_2-ACR2.22_RX1--fr2266benz_002--CYS114.dpf(Job #6)...
OpenCL device: GeForce GTX 295
INFO:[13:43:47] End AutoDock...
INFO:[13:43:47] Start AutoDock for ZINC000871821582-ACR2.9_RX1--fr2266benz_002--CYS114.dpf(Job #7)...
OpenCL device: GeForce GTX 295
INFO:[13:50:54] End AutoDock...
INFO:[13:50:54] Start AutoDock for ZINC000557347706-ACR2.25_RX1--fr2266benz_002--CYS114.dpf(Job #8)...
OpenCL device: GeForce GTX 295
INFO:[13:52:39] End AutoDock...
INFO:[13:52:39] Start AutoDock for ZINC000874429584-ACR2.1_RX1--fr2266benz_002--CYS114.dpf(Job #9)...
OpenCL device: GeForce GTX 295
INFO:[13:55:01] End AutoDock...
INFO:[13:55:01] Start AutoDock for ZINC000418854936-ACR2.23_RX1--fr2266benz_002--CYS114.dpf(Job #10)...
OpenCL device: GeForce GTX 295
INFO:[13:58:34] End AutoDock...
INFO:[13:58:34] Start AutoDock for ZINC000420396030-ACR2.27_RX1--fr2266benz_002--CYS114.dpf(Job #11)...
OpenCL device: GeForce GTX 295
INFO:[14:04:24] End AutoDock...
INFO:[14:04:24] Start AutoDock for ZINC000332478091-ACR2.22_RX1--fr2266benz_002--CYS114.dpf(Job #12)...
OpenCL device: GeForce GTX 295
INFO:[14:04:40] End AutoDock...
INFO:[14:04:40] Start AutoDock for ZINC000419568129-ACR2.15_RX1--fr2266benz_002--CYS114.dpf(Job #13)...
OpenCL device: GeForce GTX 295
INFO:[14:08:01] End AutoDock...
INFO:[14:08:01] Start AutoDock for ZINC000424391671-ACR2.25_RX1--fr2266benz_002--CYS114.dpf(Job #14)...
OpenCL device: GeForce GTX 295
INFO:[14:10:26] End AutoDock...
INFO:[14:10:26] Start AutoDock for ZINC000818558531_2-ACR2.11_RX1--fr2266benz_002--CYS114.dpf(Job #15)...
OpenCL device: GeForce GTX 295
INFO:[14:10:43] End AutoDock...
INFO:[14:10:43] Start AutoDock for ZINC000423946843-ACR2.25_RX1--fr2266benz_002--CYS114.dpf(Job #16)...
OpenCL device: GeForce GTX 295
INFO:[14:16:10] End AutoDock...
INFO:[14:16:10] Start AutoDock for ZINC000440730361-ACR2.23_RX1--fr2266benz_002--CYS114.dpf(Job #17)...
OpenCL device: GeForce GTX 295
INFO:[14:23:58] End AutoDock...
INFO:[14:23:58] Start AutoDock for ZINC000565249302-ACR2.1_RX1--fr2266benz_002--CYS114.dpf(Job #18)...
OpenCL device: GeForce GTX 295
INFO:[14:27:25] End AutoDock...
INFO:[14:27:25] Start AutoDock for ZINC000415451377-ACR2.25_RX1--fr2266benz_002--CYS114.dpf(Job #19)...
OpenCL device: GeForce GTX 295
INFO:[14:30:47] End AutoDock...
INFO:Cpu time = 4539.054616
14:30:47 (2766): called boinc_finish(0)

</stderr_txt>
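
A side note on reading a log like this: the per-job run times can be pulled out of the timestamps with a few lines of Python. This is only a sketch that assumes the line format seen above ("INFO:[HH:MM:SS] Start AutoDock for ...(Job #N)" followed by "INFO:[HH:MM:SS] End AutoDock"); the function name `job_durations` is made up for this example, and the embedded excerpt is just the first two jobs from the log.

```python
import re
from datetime import datetime, timedelta

# First two jobs copied from the stderr output above.
LOG_EXCERPT = """\
INFO:[13:15:11] Start AutoDock for ZINC000818558340_2-ACR2.11_RX1--fr2266benz_002--CYS114.dpf(Job #0)...
OpenCL device: GeForce GTX 295
INFO:[13:15:33] End AutoDock...
INFO:[13:15:33] Start AutoDock for ZINC000818542949-ACR2.1_RX1--fr2266benz_002--CYS114.dpf(Job #1)...
OpenCL device: GeForce GTX 295
INFO:[13:19:07] End AutoDock...
"""

START = re.compile(r"INFO:\[(\d\d:\d\d:\d\d)\] Start AutoDock for .*\(Job #(\d+)\)")
END = re.compile(r"INFO:\[(\d\d:\d\d:\d\d)\] End AutoDock")


def job_durations(log_text):
    """Map job number -> elapsed time between its Start and End AutoDock lines."""
    durations = {}
    current = None  # (job number, start time) of the job in progress
    for line in log_text.splitlines():
        started = START.match(line)
        if started:
            current = (int(started.group(2)),
                       datetime.strptime(started.group(1), "%H:%M:%S"))
            continue
        ended = END.match(line)
        if ended and current is not None:
            job, start_time = current
            elapsed = datetime.strptime(ended.group(1), "%H:%M:%S") - start_time
            if elapsed < timedelta(0):
                # The log has no dates, so assume the job ran past midnight.
                elapsed += timedelta(days=1)
            durations[job] = elapsed
            current = None
    return durations


durations = job_durations(LOG_EXCERPT)
for job, elapsed in sorted(durations.items()):
    print(f"Job {job}: {int(elapsed.total_seconds())} s")
# Job 0: 22 s
# Job 1: 214 s
```

Going by the timestamps in the full log, the 20 jobs range from 16 seconds (Job #12) to 7 minutes 48 seconds (Job #17), so the work per ligand is very uneven.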

Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2073341 · Report as offensive     Reply Quote
Richard Haselgrove Project Donor
Volunteer tester
Joined: 4 Jul 99
Posts: 14650
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2073343 - Posted: 14 Apr 2021, 19:21:54 UTC - in response to Message 2073341.  

I don't know what "TDR" refers to.
TDR is a Windows specific concept, but I think it accounts for a large proportion of the iGPU 'time limit exceeded' errors under Windows.
ID: 2073343 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073349 - Posted: 14 Apr 2021, 20:43:26 UTC - in response to Message 2073343.  

I don't know what "TDR" refers to.
TDR is a Windows specific concept, but I think it accounts for a large proportion of the iGPU 'time limit exceeded' errors under Windows.

http://developer.download.nvidia.com/NsightVisualStudio/2.2/Documentation/UserGuide/HTML/Content/Timeout_Detection_Recovery.htm
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2073349 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073351 - Posted: 14 Apr 2021, 20:46:12 UTC - in response to Message 2073349.  

If you're on Linux, you don't need to know about TDR :))

Regarding invalids: as I said before, I don't know how they can do validation at all when results differ so much from run to run on the same dataset.
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2073351 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13163
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2073352 - Posted: 14 Apr 2021, 21:09:33 UTC - in response to Message 2073351.  

If you under Linux you don't need to know about TDR :))

Regarding invalids. As I said before I don't know how they could do validation at all with so different results from run to run on same dataset.

My concern also. From looking at my invalids, I wonder just what the criteria or threshold is for the validator.

Some tasks have nothing but invalids for all attempts and I surmise a bad task parameter set, but others look like mine are the odd-man out type of invalid where the validated tasks match closely in OS or device or whatever.

Sort of what we ran into with same device validations at Seti. Which is not good for producing valid results consistently.

Would have to look at the actual result file to see what it contains since the stderr.txt output does not show anything other than successful job completion.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2073352 · Report as offensive     Reply Quote
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 2073355 - Posted: 14 Apr 2021, 21:58:13 UTC - in response to Message 2073352.  
Last modified: 14 Apr 2021, 22:01:55 UTC

@SETI we at least got exactly the same result from run to run on the same task.
Results could differ a little between devices/compilers/OSes, but they were reproducible (at least once an app had moved from the alpha to the beta stage of development).
That's because SETI data processing is deterministic.
The genetic algorithms this app uses to minimize the interaction energy have a random component.
It's possible to get reproducible results even in this case (if the pseudo-random generator always gives the same sequence from the same initial seed). But... perhaps they don't use the same seed, or their random number generator is more "random" :)

EDIT: also, a simple error like using the value of an uninitialized memory cell will give unreproducible results too. Hard to tell which case this is without diving deep into the code.
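
To illustrate the seeding point, here is a toy sketch. It is a plain random search on a made-up one-dimensional "energy" function, not AutoDock's genetic algorithm, and the names `energy` and `random_search` are invented for the example. The seed value is simply the number from the -seed argument in the log posted earlier, reused as a sample; whether the real app stays reproducible for a given seed is exactly what is unknown here.

```python
import random


def energy(x):
    """Toy stand-in for an interaction energy, with its minimum at x = 3."""
    return (x - 3.0) ** 2


def random_search(seed, steps=200):
    """Stochastic minimization whose randomness comes only from a seeded generator."""
    rng = random.Random(seed)  # private generator, so nothing else disturbs the sequence
    best = rng.uniform(-10.0, 10.0)
    for _ in range(steps):
        candidate = best + rng.gauss(0.0, 1.0)  # random perturbation of the current best
        if energy(candidate) < energy(best):
            best = candidate
    return best


run_a = random_search(seed=483169743)
run_b = random_search(seed=483169743)
run_c = random_search(seed=1)

print(run_a == run_b)  # same seed, same sequence of random numbers, same answer
print(run_a, run_c)    # a different seed usually lands on a slightly different answer
```

On a GPU there is an extra catch that this sketch does not cover: even with a fixed seed, the order of floating-point operations across parallel threads can vary, which may be enough to change the result between devices.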
SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 2073355 · Report as offensive     Reply Quote
Profile Keith Myers Special Project $250 donor
Volunteer tester
Joined: 29 Apr 01
Posts: 13163
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2073370 - Posted: 14 Apr 2021, 23:35:01 UTC - in response to Message 2073355.  

There are currently a lot of invalid tasks being generated or bad results returned. Looks like the worst culprit has been Apple Mac hosts with Nvidia cards. The admin has stopped sending work to those hosts now. That should reduce the current influx of invalid replications being sent out.

But back at the start of the non-beta task releases, there were a lot of single replication tasks once the host was validated as a "trusted" host.

That seems to have mostly stopped and most of the work I have been getting requires a wingman to validate the task.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2073370 · Report as offensive     Reply Quote
Profile tullio
Volunteer tester

Joined: 9 Apr 04
Posts: 8797
Credit: 2,930,782
RAC: 1
Italy
Message 2073387 - Posted: 15 Apr 2021, 6:03:50 UTC
Last modified: 15 Apr 2021, 6:21:07 UTC

Got my first GPU tasks on WCG OpenPandemics-Covid-19. At the BOINC workshop BOINC@TACC also said it was working on Covid-19, but it does not send me any tasks.
Tullio
They take about 10 minutes on my GTX 1060 using opencl_nvidia_102.
ID: 2073387 · Report as offensive     Reply Quote
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2073413 - Posted: 15 Apr 2021, 15:22:47 UTC
Last modified: 15 Apr 2021, 15:23:49 UTC

Just realized I got in 300+ WCG GPU tasks

I was unable to do an exact count of the valid ones as there was no way to filter on "OPNG" and too many pages to review at their website.

300 Valid -vs- 17 Invalid on 3 different systems, all GTX 1070 or better

Unfortunately, my fastest system was not crunching when the boatload of tasks came in nor was it recording BoincTasks history which would have been useful for statistics.
ID: 2073413 · Report as offensive     Reply Quote
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2073419 - Posted: 15 Apr 2021, 16:02:12 UTC - in response to Message 2073349.  
Last modified: 15 Apr 2021, 16:08:09 UTC

I don't know what "TDR" refers to.
TDR is a Windows specific concept, but I think it accounts for a large proportion of the iGPU 'time limit exceeded' errors under Windows.

http://developer.download.nvidia.com/NsightVisualStudio/2.2/Documentation/UserGuide/HTML/Content/Timeout_Detection_Recovery.htm


Thanks, I was unaware of both TDR and that app.

Thinking about this, I do not recall seeing "Nvidia driver was reset" warnings recently (like since last year) and I suspect they are being hidden or dealt with another way.

I do get an occasional and unexpected "desktop refresh" but the event viewer does not list nvidia, which I think is suspicious.

nvidia-smi gives the error message "NVidia requested a reboot" on rare occasions, which I trap for in my Linux temperature reporting app.
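
A rough sketch of that kind of trap, in case it is useful to anyone. This is not my actual app: the function names are made up, and treating any non-numeric output line as a driver problem is a simplifying assumption (it does not match the exact error text, which I am only quoting from memory above).

```python
import subprocess

# Ask nvidia-smi for one bare temperature value per GPU, one per line.
QUERY = ["nvidia-smi", "--query-gpu=temperature.gpu", "--format=csv,noheader,nounits"]


def parse_temperatures(output):
    """Turn nvidia-smi output into a list of temperatures in degrees C.

    Any line that is not a plain number (an error message, or a value such as
    [N/A]) is treated as a problem and raised so the caller can report it.
    """
    temperatures = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.isdigit():
            raise RuntimeError("unexpected nvidia-smi output: " + line)
        temperatures.append(int(line))
    return temperatures


def read_temperatures():
    """Run nvidia-smi and parse its output; raises RuntimeError on any failure."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        raise RuntimeError("nvidia-smi failed: " + (result.stderr or result.stdout).strip())
    return parse_temperatures(result.stdout)


# Typical use in a polling loop (needs an NVIDIA driver installed):
#     try:
#         print(read_temperatures())
#     except RuntimeError as problem:
#         print("GPU needs attention:", problem)
```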
ID: 2073419 · Report as offensive     Reply Quote
Profile Joseph Stateson Project Donor
Volunteer tester
Joined: 27 May 99
Posts: 309
Credit: 70,759,933
RAC: 3
United States
Message 2073435 - Posted: 15 Apr 2021, 19:51:57 UTC

Too late to edit my first post, but I did manage to find some statistics on completion of the COVID apps.
GPU calculations are significantly faster, as shown in the attached pic:
about 3 minutes per calculation as opposed to 2.5 to 6 hours on the CPUs.
Not all results were in my BoincTasks history. The WCG site shows 300+ results, of which 17 were invalid.




====and best of all====

ID: 2073435 · Report as offensive     Reply Quote


 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.