Setting up Linux to crunch CUDA90 and above for Windows users
Author | Message |
---|---|
Tom M Send message Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
I will turn Mining mode back on and see whether it enables CPU virtualization. I accidentally turned Mining mode off when I toggled at least one of the PCIe settings in the BIOS to "Auto". I will poke around and report on a fully loaded set of GPUs with limited CPU threads. I should also see what happens with "-nobs" engaged, even though I can't use that when I am running more than 16 GPUs.... Oh, yes. I am limiting the Seti@Home CPU tasks being crunched because my dinky little LGA 1151 CPU cooler wasn't really sized for this CPU. I can't fit the larger LGA 1151 cooler in there without moving the MB to the back of the rack, and I am not sure (yet) whether I have the holes to do that. Here flashlight. Tom A proud member of the OFA (Old Farts Association). |
Buckeye4LF Send message Joined: 19 Jun 00 Posts: 173 Credit: 54,916,209 RAC: 833 |
So everything seemed to be working okay, but now the host is telling me nothing is computing and BOINC says I have 122 things in process. I tried resetting and updating via boinccmd. Is this a symptom of purging my install and starting over? I made sure nothing was computing before deleting the old directory, but now it seems there might have been some ghosts created. Do I just have to wait till the scheduler pulls new work? |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
If you didn't set NNT (No New Tasks) and finish all your work, and just purged the distro folder as you said you did, then you have "ghosted" all those work units. The scheduler still thinks you have them, which is why they show as "In Progress". Plus, you have abandoned all your existing work and you are in the "Penalty Box": until you return some work, the scheduler will only send you one task per day. Once you start returning valid work, the scheduler will start bumping up the amount of work it sends you. You can try the ghost recovery procedure and see if you can recover those 122 tasks: https://setiathome.berkeley.edu/forum_thread.php?id=84176 Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13861 Credit: 208,696,464 RAC: 304 |
"Do I just have to wait till scheduler pulls new work?" I would probably start from scratch. If this is a crunching-only system, just wipe everything & reinstall the OS. Once that is done, update the video driver from Nouveau to the Nvidia-supplied driver. If you plan to process AstroPulse WUs, you'll most likely need to manually install OpenCL support. Once all of that is done, copy the All-In-One folder to the new system and extract all the files & folders into a BOINC folder in your Home directory. Follow the instructions in the readme file to set up the applications you wish to use (if different from the defaults). Double click on boincmgr. Attach to Seti. Crunch away. Grant Darwin NT |
Buckeye4LF Send message Joined: 19 Jun 00 Posts: 173 Credit: 54,916,209 RAC: 833 |
That does make sense though: anything reported as "anonymous platform" is running out of the new install directory. My heat map shows 0% CPU at the moment though. I have no idea what those 122 tasks are. I will try recovery as mentioned above. It is irritating that my dedicated machine is idle while my Windows machine chugs along.... |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
For a newbie Linux user, the first startup can be a bit bumpy. Once you get some work turned in, the rig will start cranking out the work and should soon eclipse the Windows rig. Might take a day is all and you need to have patience. Frustrating for sure with the capital outlay for new hardware and nothing to be shown for it initially. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Buckeye4LF Send message Joined: 19 Jun 00 Posts: 173 Credit: 54,916,209 RAC: 833 |
No luck on ghost recovery.... Guess I will only be running Seti at 1 WU a day for the next month until everything expires. I just connected the dedicated machine to Collatz and got 100+ tasks, so at least I can crunch something.... |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
If I'm looking at the right machine, it shows you haven't contacted the Server since 8 Feb 2020, 8:00:16 UTC, is 8894243 the right machine? https://setiathome.berkeley.edu/hosts_user.php?userid=74809 If true, that's a good reason for not reporting/receiving work.... |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13861 Credit: 208,696,464 RAC: 304 |
"no luck on ghost recovery.... guess i will only be running seti 1 wu day for the next month until everything expires." For every WU that validates, you get 2 more to process. As TBar posted, that system hasn't even contacted the server lately, so you could do as I suggested: start from scratch with a clean slate and a new system ID with no outstanding errors. Get the system set up & ready to run the Special Application, with no previous installations to deal with. Copy over the All-In-One, select your chosen applications to use, attach to Seti and off you go. Grant Darwin NT |
TBar Send message Joined: 22 May 99 Posts: 5204 Credit: 840,779,836 RAC: 2,768 |
Well, a little good news for the upcoming Ubuntu 20.04. I installed today's daily build and, aside from a little video driver weirdness, everything CUDA seems to work. My recent version of BOINC, 7.16.2, works without any dependencies, and just one dependency is needed for the Manager. It seems they have, so far, left out a needed gtk2.0-x11 library, which can be solved by installing the libgtk2.0-dev package. Hopefully the release will include the library later on, but running one command line to install the package isn't bad. The main install includes nVidia driver 440.59 out of the box, and it worked fine until I ran the cool-bits command. After that it was stuck in a login loop until you went back to the Nouveau driver. Nothing seemed to fix it after that, but if you boot to recovery mode and then hit Resume, it will boot normally with a working driver, even though they say the driver needs a full boot. Hey, if it works...and saves doing a reinstall, I'll use it for now. Seems to work, Operating System: Linux Ubuntu Ubuntu Focal Fossa (development branch) [5.4.0-12-generic|libc 2.30 (Ubuntu GLIBC 2.30-0ubuntu3)] |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
Good to know TBar. April is approaching fast and I hope that Ubuntu 20.04 LTS is fully baked. I doubt I will try the in place distro upgrade from 18.04 to 20.04. My attempt to move from 19.04 to 19.10 via distro-upgrade did not go well. Probably will just zip up the BOINC folder and blow the 18.04 away and install from scratch. Too much deadwood in the existing build likely. If there is just one missing dependency for the Manager, that is not that bad and easily overcome. Thanks for being the early pioneer with arrows in your back. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"Great suggestion Richard, need to mention it to the GPUUG developers to hunt out the checkpointing code in the special app and disable it. Then there would be no need to set a checkpoint interval longer than the normal crunching time for gpu tasks." . . It has an appeal and would solve issues with special sauce on GPUs, but what if the host is also crunching on CPUs? Would it not mean that a task that was restarted after running for an hour or so would have to go back to the beginning? Stephen |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"no luck on ghost recovery.... guess i will only be running seti 1 wu day for the next month until everything expires." "For every WU that Validates, you get 2 more to process." . . Once that is done you can use the Juan method to restore it to the original Host identity if you wish. It would be the cleanest way to get past the current issues. Stephen |
Grant (SSSF) Send message Joined: 19 Aug 99 Posts: 13861 Credit: 208,696,464 RAC: 304 |
"Great suggestion Richard, need to mention it to the GPUUG developers to hunt out the checkpointing code in the special app and disable it. Then there would be no need to set a checkpoint interval longer than the normal crunching time for gpu tasks." ". . It has an appeal and would solve issues with special sauce on GPUs, but what if the host is also crunching on CPUs? Would it not mean that a task that was restarted after running for an hour or so would have to go back to the beginning?" Nope. The change would just be to the Special Application; if it doesn't actually write a checkpoint file to disk when it's asked to, then when the client restarts, the WU starts from scratch. Grant Darwin NT |
Keith Myers Send message Joined: 29 Apr 01 Posts: 13164 Credit: 1,160,866,277 RAC: 1,873 |
"Great suggestion Richard, need to mention it to the GPUUG developers to hunt out the checkpointing code in the special app and disable it. Then there would be no need to set a checkpoint interval longer than the normal crunching time for gpu tasks." No, Richard requested that the special app have the checkpointing removed, not the client. The CPU apps would continue to checkpoint normally, as set by the account settings. Same for the stock SoG GPU app or any other app. Just the special app would have the change. Seti@Home classic workunits:20,676 CPU time:74,226 hours A proud member of the OFA (Old Farts Association) |
Tom M Send message Joined: 28 Nov 02 Posts: 5126 Credit: 276,046,078 RAC: 462 |
And that would allow us to go back to checkpointing every 60 seconds, so instead of losing 4 minutes on a shutdown, we would lose a minute. Tom A proud member of the OFA (Old Farts Association). |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
"No Richard requested the special app have the checkpoint removed. Not the client. The cpu would continue to checkpoint normally set by the account settings. Same for the stock SoG gpu or any other app. Just the special app would have the change." . . OK, that would be good, I had the impression that checkpointing was under the control of the BOINC client. Stephen . . |
Stephen "Heretic" Send message Joined: 20 Sep 12 Posts: 5557 Credit: 192,787,363 RAC: 628 |
. . True. Stephen |
Richard Haselgrove Send message Joined: 4 Jul 99 Posts: 14680 Credit: 200,643,578 RAC: 874 |
". . OK, that would be good, I had the impression that checkpointing was under the control of the BOINC client." BOINC permits checkpointing at the stated interval, but doesn't control it. Only the science application knows when the task data is in a sufficiently coherent state to make a consistent checkpoint file, usable for a restart. |
rob smith Send message Joined: 7 Mar 03 Posts: 22556 Credit: 416,307,556 RAC: 380 |
While it is under the control of BOINC, it is up to the developers of each application whether to include it, and exactly what is written to disk during a checkpoint operation. Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
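
The division of labour Richard and rob describe (the client only permits a checkpoint at the stated interval, and the application decides whether and when to write one) can be illustrated with a toy simulation. This is a minimal sketch, not BOINC code: `ToyClient`, `run_task` and the "coherent every N seconds" model are hypothetical names and assumptions made up for illustration. The two method names only echo the real BOINC API calls `boinc_time_to_checkpoint()` and `boinc_checkpoint_completed()`.

```python
from dataclasses import dataclass


@dataclass
class ToyClient:
    """Stand-in for the BOINC client: it only grants permission to checkpoint."""
    interval_s: int             # "checkpoint at most every N seconds" preference
    last_checkpoint_s: int = 0  # time of the last completed checkpoint

    def time_to_checkpoint(self, now_s: int) -> bool:
        # Permission is granted once the preferred interval has elapsed.
        return now_s - self.last_checkpoint_s >= self.interval_s

    def checkpoint_completed(self, now_s: int) -> None:
        # The application reports that it actually wrote a checkpoint.
        self.last_checkpoint_s = now_s


def run_task(client: ToyClient, coherent_every_s: int,
             writes_checkpoints: bool, kill_at_s: int) -> int:
    """Simulate a task killed at kill_at_s; return the second a restart resumes from."""
    saved_s = 0  # nothing on disk yet, so a restart would begin at zero
    for now_s in range(1, kill_at_s + 1):
        # Assumption: the task data is coherent every coherent_every_s seconds.
        coherent = now_s % coherent_every_s == 0  # only the app knows this
        if coherent and writes_checkpoints and client.time_to_checkpoint(now_s):
            saved_s = now_s
            client.checkpoint_completed(now_s)
    return saved_s
```

Under these assumptions, a task killed at 239 seconds resumes from 180 with a 60-second interval (`run_task(ToyClient(60), 10, True, 239)`), but from 0 with a 240-second interval. That is the trade-off Tom mentions. An application that never writes a checkpoint (`writes_checkpoints=False`) always resumes from 0, whatever the client permits, which is the behaviour requested for the special app only.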
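
For the ghost-task problem discussed earlier in the thread, the sequence Keith implies (set NNT, let the cache finish and report, and only then remove the old folder) can be written down as `boinccmd` invocations. The sketch below only builds and prints the command lines and does not run anything. The project URL is an assumption taken from the forum links in this thread, so check the URL your own client reports (for example with `boinccmd --get_project_status`) before using it. `retirement_commands` and `as_shell` are hypothetical helper names.

```python
import shlex
from typing import List

# Assumption: the project URL as registered in your client; verify before use.
PROJECT_URL = "https://setiathome.berkeley.edu/"


def retirement_commands(project_url: str = PROJECT_URL) -> List[List[str]]:
    """Return boinccmd calls to run, in order, before deleting a BOINC folder."""
    return [
        # 1. Set "No New Tasks" so the scheduler stops sending work.
        ["boinccmd", "--project", project_url, "nomorework"],
        # 2. List the remaining tasks; repeat until nothing is left to crunch.
        ["boinccmd", "--get_tasks"],
        # 3. Contact the scheduler so finished results are reported.
        ["boinccmd", "--project", project_url, "update"],
    ]


def as_shell(commands: List[List[str]]) -> str:
    """Render the commands as copy-and-paste shell lines (nothing is executed)."""
    return "\n".join(shlex.join(cmd) for cmd in commands)


if __name__ == "__main__":
    print(as_shell(retirement_commands()))
```

Printing rather than executing keeps the sketch safe to run on a machine without BOINC installed. Once the task list is empty and the last update has gone through, the old folder can be removed without leaving tasks marked "In Progress" on the server.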
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.