Data Chat

Message boards : Number crunching : Data Chat


Profile betreger Project Donor

Joined: 29 Jun 99
Posts: 11365
Credit: 29,581,041
RAC: 66
United States
Message 2022643 - Posted: 10 Dec 2019, 16:02:04 UTC - in response to Message 2022633.  

We have a nice RTS queue just in time for our Tuesday maintenance! I'm so pleased with all the improvements and upgrades the system has had to make it so stable and able to handle so much more.

Thanks for the graph, Kiska. I think this is a great place to talk about how the system is doing when it isn't a panic situation.

Yep things look good.
ID: 2022643
Grant (SSSF)
Volunteer tester

Joined: 19 Aug 99
Posts: 13765
Credit: 208,696,464
RAC: 304
Australia
Message 2022733 - Posted: 11 Dec 2019, 7:26:01 UTC - in response to Message 2022627.  
Last modified: 11 Dec 2019, 7:27:39 UTC

What "feature" requests do you want me to implement?
If you could move the Validation values from the present graph (sahv8 Workunits) to the one above (sahv8 Results in Progress), as they are of similar magnitude- in the millions.
Whereas the Assimilation, Deletion & Result_deletion values should all be zero (or very close to it) under normal conditions. Even when things go bad they're usually in the thousands (occasionally the hundreds of thousands when things go really, really bad), but never in the millions, as the Validation numbers always are.
Thanks for your efforts.
Grant
Darwin NT
ID: 2022733
Profile Tom M
Volunteer tester

Joined: 28 Nov 02
Posts: 5124
Credit: 276,046,078
RAC: 462
Message 2022759 - Posted: 11 Dec 2019, 13:25:28 UTC

Does anyone have an automated method for providing an average processing time on cpus for the wildly varying named tasks?

I am seeing under 40 minutes and slightly over an hour on the CPU side of my biggest box. I would be interested in seeing a report on that, but I don't really want it to be a manual process.

Tom
A proud member of the OFA (Old Farts Association).
ID: 2022759
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2022760 - Posted: 11 Dec 2019, 13:49:20 UTC - in response to Message 2022759.  

Does anyone have an automated method for providing an average processing time on cpus for the wildly varying named tasks?
Remember that you have a file called 'job_log_setiathome.berkeley.edu.txt' in your BOINC data directory.

That contains a single line of information for every task you've ever run, since the machine was first attached to the project. Name, runtime, CPU time are all accessible.

The major fly in the ointment is that there's no data about which device a task was run on: you can only really perform the sort of analysis you're interested in on the log from a machine which runs SETI on one device only: CPU only, or a single GPU only. But it's a start.

Also remember that for Arecibo tasks, the name doesn't contain any information about the task complexity.
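As a sketch of the sort of analysis Tom is after, assuming the standard tagged BOINC job-log layout (Unix epoch time first, then `ue`/`ct`/`fe`/`nm`/`et` field pairs); the prefix-based grouping key is just a hypothetical way to bucket the varying task names:

```python
from collections import defaultdict

def parse_job_log(path):
    """Yield (epoch, name, cpu_time, elapsed) per line of a BOINC job log.

    Assumes the standard tagged layout:
      <epoch> ue <est_runtime> ct <cpu_time> fe <est_flops> nm <name> et <elapsed> [es <status>]
    """
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) < 11:
                continue  # skip short/malformed lines
            # pair up the tag/value columns: ('ue', v), ('ct', v), ('nm', name), ...
            tagged = dict(zip(fields[1::2], fields[2::2]))
            yield (float(fields[0]), tagged.get("nm", ""),
                   float(tagged.get("ct", 0)), float(tagged.get("et", 0)))

def average_times(path):
    """Average (cpu_time, elapsed) per crude task-name prefix (illustrative grouping)."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for _, name, cpu, elapsed in parse_job_log(path):
        key = name.split("_")[0] if name else "unknown"  # e.g. 'blc35' for GBT tasks
        t = totals[key]
        t[0] += cpu
        t[1] += elapsed
        t[2] += 1
    return {k: (cpu / n, el / n) for k, (cpu, el, n) in totals.items()}
```

As noted above, the log doesn't record which device ran a task, so this only tells you much on a host that crunched on the CPU only (or on a single GPU only).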
ID: 2022760
Ian&Steve C.

Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2022765 - Posted: 11 Dec 2019, 15:05:37 UTC
Last modified: 11 Dec 2019, 15:06:22 UTC

You can just scrape the info from the website. I do/have done it plenty. Then sort and analyze the info offline with Excel or whatever tools you prefer.

You’re limited by whatever tasks still remain on the website though.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2022765
Profile Siran d'Vel'nahr
Volunteer tester

Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 2022767 - Posted: 11 Dec 2019, 15:29:57 UTC - in response to Message 2022760.  

Hi Richard,

Remember that you have a file called 'job_log_setiathome.berkeley.edu.txt' in your BOINC data directory.

That contains a single line of information for every task you've ever run, since the machine was first attached to the project. Name, runtime, CPU time are all accessible.

WOW!!! Whose idea was it to do such a thing? I just loaded that file into a text editor and it took about half a minute or more to load. I thought it had crashed the PC at first. My file is currently almost 14MB HUGE! That's just over 3 months' worth. I'd hate to see that file after 3 years or even longer. You'd never get it loaded into a text editor! If I delete the file (I really have no purpose for it), will BOINC recreate it and continue on? Do we have any kind of control over that file?

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2022767
Profile Siran d'Vel'nahr
Volunteer tester

Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 2022768 - Posted: 11 Dec 2019, 15:34:55 UTC - in response to Message 2022765.  

You can just scrape the info from the website. I do/have done it plenty. Then sort and analyze the info offline with Excel or whatever tools you prefer.

You’re limited by whatever tasks still remain on the website though.

Hi Ian,

What does a website scraper look like? I have scrapers out in my Snap-On roll-away tool box in the garage, but I doubt any of them would work. ;)

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2022768
Ian&Steve C.

Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2022770 - Posted: 11 Dec 2019, 16:11:25 UTC - in response to Message 2022768.  

Scraping is a common term for using a script to retrieve data from the web.

It doesn’t “look” like anything. It’s code.
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2022770
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2022772 - Posted: 11 Dec 2019, 16:29:50 UTC - in response to Message 2022767.  

My file is currently almost 14MB HUGE!
And mine's 50 MB - 427,309 lines.

I recommend Notepad++ - it 'opens' (displays the first page of) the file instantly. It also copes with *nix-convention line endings, which are a large part of the reason the file opens so slowly in other Windows tools.

I strongly recommend using the file rather than web scraping - we need to protect our rather delicate SETI servers.
ID: 2022772
Ian&Steve C.

Joined: 28 Sep 99
Posts: 4267
Credit: 1,282,604,591
RAC: 6,640
United States
Message 2022774 - Posted: 11 Dec 2019, 17:03:22 UTC - in response to Message 2022772.  

I've done enough fairly large scrapes (1000+ tasks) to say that it has little impact on the site/server.

Either the server intentionally slows down page requests, or my code just isn't written in a way that accelerates the process. Each task is on its own page, which forces you to constantly load new pages, slowing the whole thing down to about 1 task per second.
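A minimal sketch of such a scraper with a self-imposed one-request-per-second limit; the `result.php?resultid=N` URL pattern is an assumption based on the usual BOINC web layout, and parsing the returned HTML is left out:

```python
import time
import urllib.request

def fetch_task_pages(base_url, task_ids, delay=1.0):
    """Fetch one task page per `delay` seconds and return {task_id: html}.

    `base_url` is assumed to look like
    'https://setiathome.berkeley.edu/result.php' (usual BOINC layout);
    adjust it for the server you are actually querying.
    """
    pages = {}
    for tid in task_ids:
        with urllib.request.urlopen(f"{base_url}?resultid={tid}") as resp:
            pages[tid] = resp.read().decode("utf-8", errors="replace")
        time.sleep(delay)  # self-imposed rate limit, to be gentle on the server
    return pages
```

The deliberate delay matches the ~1 task/second observed above, whether or not the server itself throttles requests.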
Seti@Home classic workunits: 29,492 CPU time: 134,419 hours

ID: 2022774
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2022775 - Posted: 11 Dec 2019, 17:20:12 UTC - in response to Message 2022774.  

That's still one database query per second. BOINC tries to aggregate multiple database accesses when possible - delaying the reporting of tasks until a batch can be processed together, for example.
ID: 2022775
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22258
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2022779 - Posted: 11 Dec 2019, 17:54:18 UTC - in response to Message 2022760.  

One can probably deduce CPU vs GPU for anything other than low-power GPUs by simply looking at the run time. Now to work out the format of the file (unless I can find the key somewhere).
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2022779
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2022780 - Posted: 11 Dec 2019, 18:02:37 UTC - in response to Message 2022779.  
Last modified: 11 Dec 2019, 18:17:03 UTC

My 'SETI data distribution history' is driven from those files via an Access (==SQL) database. I'll get you the table schema.

Bah - it's only queries where you can edit the SQL directly. Is this enough?

ID: 2022780
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22258
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2022781 - Posted: 11 Dec 2019, 18:19:46 UTC
Last modified: 11 Dec 2019, 18:23:39 UTC

Ta.

So far I've worked out that the 9th column is the task name (pretty obvious really!) and the 11th is the run-time (not so obvious).
For most folks that would be a good starter for ten (even when there are over a million entries in the list, as one of mine has!).
I'm going to guess (so that Richard can put me right) that the first column is some sort of modified Julian date, probably the run date & time - this is just based on the way it increments...

And Richard & I crossed paths mid flight - as usual
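For what it's worth, in a standard BOINC job log the first column is Unix epoch seconds (the task's completion time) rather than a Julian date, which would equally explain the steady increments; a quick conversion shows whether the values land on plausible dates:

```python
from datetime import datetime, timezone

def job_log_timestamp(first_field):
    """Interpret the job log's first column as Unix epoch seconds (UTC).

    Standard BOINC job logs record seconds since 1970-01-01 UTC here,
    written when the task completed.
    """
    return datetime.fromtimestamp(int(first_field), tz=timezone.utc)
```

For example, a first column of 1576080000 decodes to 11 Dec 2019 UTC, right in this thread's timeframe.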
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2022781
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22258
Credit: 416,307,556
RAC: 380
United Kingdom
Message 2022782 - Posted: 11 Dec 2019, 18:21:02 UTC

Looking at the data those make sense.

Now to do some playing and see what happens.
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 2022782
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2022787 - Posted: 11 Dec 2019, 18:28:41 UTC - in response to Message 2022782.  

This is how I spec'd it

ID: 2022787
Stephen "Heretic" Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester

Joined: 20 Sep 12
Posts: 5557
Credit: 192,787,363
RAC: 628
Australia
Message 2022798 - Posted: 11 Dec 2019, 20:44:54 UTC - in response to Message 2022760.  
Last modified: 11 Dec 2019, 20:55:15 UTC

Remember that you have a file called 'job_log_setiathome.berkeley.edu.txt' in your BOINC data directory.
That contains a single line of information for every task you've ever run, since the machine was first attached to the project. Name, runtime, CPU time are all accessible.


. . Thanks for the additional information ...

Stephen

:)
ID: 2022798
Profile Siran d'Vel'nahr
Volunteer tester

Joined: 23 May 99
Posts: 7379
Credit: 44,181,323
RAC: 238
United States
Message 2022815 - Posted: 11 Dec 2019, 22:50:19 UTC - in response to Message 2022772.  

My file is currently almost 14MB HUGE!
And mine's 50 MB - 427,309 lines.

I recommend Notepad++ - it 'opens' (displays the first page of) the file instantly. It also copes with *nix-convention line endings, which are a large part of the reason the file opens so slowly in other Windows tools.

I strongly recommend using the file rather than web scraping - we need to protect our rather delicate SETI servers.

Hi Richard,

I'm not running BOINC on Winders, I'm running on Linux. I use the Notepad that is installed with Wine. Perhaps I'll try something other than Notepad. ;)

I have no clue on how to scrape the web/database/servers.

Have a great day! :)

Siran
CAPT Siran d'Vel'nahr - L L & P _\\//
Winders 11 OS? "What a piece of junk!" - L. Skywalker
"Logic is the cement of our civilization with which we ascend from chaos using reason as our guide." - T'Plana-hath
ID: 2022815
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14655
Credit: 200,643,578
RAC: 874
United Kingdom
Message 2022823 - Posted: 11 Dec 2019, 22:59:31 UTC - in response to Message 2022815.  

After 6 months, the file on my Linux machine is 37 MB, 274,673 lines long. It opened instantly in the native Ubuntu text editor, but did take several seconds to scroll down to the end.
ID: 2022823
Profile Keith Myers Special Project $250 donor
Volunteer tester

Joined: 29 Apr 01
Posts: 13164
Credit: 1,160,866,277
RAC: 1,873
United States
Message 2022842 - Posted: 12 Dec 2019, 2:42:16 UTC - in response to Message 2022823.  

I hesitate to try opening my 280MB file with a text editor. What the heck . . . it opens instantly, but takes two minutes for the loading progress bar to crawl to the end.
Seti@Home classic workunits:20,676 CPU time:74,226 hours

A proud member of the OFA (Old Farts Association)
ID: 2022842



©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.