Log/Tracking Tool

Message boards : Number crunching : Log/Tracking Tool

Profile rebest Project Donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 45,357,093
RAC: 0
United States
Message 1364469 - Posted: 4 May 2013, 17:39:59 UTC

Greetings, all.

Quick question. Have any of our talented Setizens with way too much time on their hands created a logging or tracking tool that does real-time, local monitoring and reporting of work units by machine? BoincStats and the other online systems do a great job of reporting credits. But frankly, after doing this for 13 years, I really couldn't care less about credits. By next week, I'll have 25 million of them. If I could sell them on eBay, I would.

I am more interested in a report that neatly details how many work units my rig received and processed, by type (MB/AP). When was each WU received? How was it processed (CPU/GPU)? How long did it take to complete (CPU/GPU time and clock time)? When was it reported? What was the turnaround time? In short, I'm more interested in knowing how to get the most bang for my crunching buck than in how I compare to anyone else.

All of these variables are tracked by BOINC in one way or another. I could find them by checking each result manually. I'm hoping to find a nice utility that does all that. Some trend lines and graphics would be nice, too. :)

Thanks.

Join the PACK!
ID: 1364469
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1364689 - Posted: 5 May 2013, 7:07:30 UTC

I don't know of any automated method, but if you pull up the application details here on the website for each rig, it should give you "number of tasks completed" for each type of work on that machine. Note that average processing rate and average turnaround time are recent averages, not all-time averages.

Other than that, you can probably load 'job_log_setiathome.berkeley.edu.txt' (found in the data directory, alongside client_state and cc_config) into Excel or something similar and have it split each line into fields, using space as the delimiter. From there, you can pull out information with formulas or make charts.
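
As a rough sketch of the job-log idea in script form rather than Excel, the Python below splits each line into fields and totals tasks and times by type. It rests on assumptions not confirmed in this thread: that each line is a Unix timestamp followed by alternating key/value tokens, that `nm`, `ct` and `et` hold the task name, CPU seconds and elapsed seconds, and that AstroPulse task names start with `ap_`. The sample lines and the helper names are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def parse_job_log_line(line):
    """Split one job-log line into a dict, or return None for a short line.

    Assumed layout: a Unix timestamp followed by alternating key/value tokens.
    """
    tokens = line.split()
    if len(tokens) < 3:
        return None
    record = {"logged": datetime.fromtimestamp(int(tokens[0]), tz=timezone.utc)}
    # Pair up the remaining tokens: key, value, key, value, ...
    for key, value in zip(tokens[1::2], tokens[2::2]):
        record[key] = value
    return record


def summarise(lines):
    """Count tasks and total the CPU/elapsed seconds per task type."""
    totals = defaultdict(lambda: {"tasks": 0, "cpu_s": 0.0, "elapsed_s": 0.0})
    for line in lines:
        rec = parse_job_log_line(line)
        if rec is None or "nm" not in rec:
            continue
        # Assumption: names starting with "ap_" are AstroPulse; the rest count as MB.
        kind = "AP" if rec["nm"].startswith("ap_") else "MB"
        totals[kind]["tasks"] += 1
        totals[kind]["cpu_s"] += float(rec.get("ct", 0))
        totals[kind]["elapsed_s"] += float(rec.get("et", 0))
    return dict(totals)


# Invented sample lines in the assumed layout.
sample = [
    "1367688000 ue 5000.0 ct 4200.5 fe 30000000000000 nm ap_example_task_0 et 4300.0",
    "1367691600 ue 900.0 ct 100.0 fe 8000000000000 nm 04my13aa.example.wu_1 et 850.0",
    "1367695200 ue 900.0 ct 120.0 fe 8000000000000 nm 04my13aa.example.wu_2 et 800.0",
]
summary = summarise(sample)
for kind, stats in sorted(summary.items()):
    print(kind, stats["tasks"], round(stats["elapsed_s"] / stats["tasks"], 1))
```

To run it on a real log, pass the open file instead of `sample`, for example `summarise(open("job_log_setiathome.berkeley.edu.txt"))`. Check a few lines by eye first, since the field layout above is an assumption.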
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1364689
Profile rebest Project Donor
Volunteer tester
Joined: 16 Apr 00
Posts: 1296
Credit: 45,357,093
RAC: 0
United States
Message 1364867 - Posted: 5 May 2013, 17:55:20 UTC - in response to Message 1364689.  

Many thanks for the reply and for identifying more data sources. Unfortunately, I'm not particularly talented or creative with such things, and right now I have very little time to undertake such a project.

Tracking a machine's crunching performance by credits is OK, but measures such as RAC are (to borrow a term from economics) a "lagging indicator", meaning that other events (wingmen reporting and validation) must occur before the credit is actually counted. Also, we have no way of knowing when a behind-the-scenes tweak in Berkeley has altered the formula for calculating credit. It's happened before.

Thanks again.



Join the PACK!
ID: 1364867
rob smith Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer moderator
Volunteer tester

Joined: 7 Mar 03
Posts: 22160
Credit: 416,307,556
RAC: 380
United Kingdom
Message 1364885 - Posted: 5 May 2013, 18:27:59 UTC

All credits are awarded after the appropriate quorum for the work unit has been reached, so even "credits" is a lagging indicator.
RAC is a long-tailed rolling average, so it is influenced by what has happened in the past.
For any meaningful comparison you have to make your observation at the same time each day, or make allowances for the different observational day lengths. The former can be done with a fairly simple cron job that grabs the "your computers" page, parses it to get the data out and dumps it into a spreadsheet. Or do it the way I do: while chewing breakfast (which is normally "at the same time" each weekday) I quickly drop onto the page, scribble the numbers on a pad and do the sums. I suppose if I were feeling enthusiastic I could put the raw numbers into a spreadsheet and do all sorts of whizzy things with them, but I'm too busy chomping, reading emails, working out what's going to happen at work, where I'm meant to be going...
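
The allowance for different observational day lengths is simple arithmetic: divide the credit gained between two readings by the hours elapsed, then scale to 24 hours. A minimal Python sketch follows, with made-up readings and a hypothetical function name; it does not fetch or parse the "your computers" page.

```python
from datetime import datetime


def credit_per_day(snapshots):
    """Turn (timestamp, total_credit) readings into credit-per-24h rates.

    Each rate covers the interval between two consecutive readings and is
    scaled to a 24-hour day, so irregular observation times stay comparable.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(snapshots, snapshots[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        if hours <= 0:
            continue  # skip duplicate or out-of-order readings
        rates.append((t1, (c1 - c0) / hours * 24.0))
    return rates


# Made-up readings: the second interval is 30 hours long, not 24.
readings = [
    (datetime(2013, 5, 4, 7, 0), 1000.0),
    (datetime(2013, 5, 5, 7, 0), 1240.0),
    (datetime(2013, 5, 6, 13, 0), 1540.0),
]
for when, rate in credit_per_day(readings):
    print(when.isoformat(), round(rate, 1))
```

In the sample, the raw gains are 240 and 300 credits, which looks like an improvement until the 30-hour interval is scaled back to a 24-hour day.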
Bob Smith
Member of Seti PIPPS (Pluto is a Planet Protest Society)
Somewhere in the (un)known Universe?
ID: 1364885
Cosmic_Ocean
Joined: 23 Dec 00
Posts: 3027
Credit: 13,516,867
RAC: 13
United States
Message 1364993 - Posted: 5 May 2013, 23:31:25 UTC

Well, my machines are slow and I only have two of them, so I actually keep track of all of my APs with Excel. I pull up the tasks page probably 2-3 times a day and add new wuIDs and taskIDs, and the beam/channel for each task assigned, and then add the data for the ones that have been reported, and then print the taskID page to PDF. I have all of them since we switched over to _v6 for both machines.

It only takes a few minutes each day, except at times like when the splitters ran out of tapes for several days and I then got 30 new tasks at once; then I have a lot of numbers to type. Or when the servers go down for several days to be relocated or repaired, tasks can't be reported, and I accumulate 10-20 completed APs that all get reported at once; then I have a lot of data to enter and print to PDF.

But for MBs, and especially if there's a GPU involved, I wouldn't even TRY to use my manual method.
Linux laptop:
record uptime: 1511d 20h 19m (ended due to the power brick giving-up)
ID: 1364993
Profile HAL9000
Volunteer tester
Joined: 11 Sep 99
Posts: 6534
Credit: 196,805,888
RAC: 57
United States
Message 1365115 - Posted: 6 May 2013, 13:09:57 UTC

For some time I have been wanting to do something like this as well. Sadly I have not got around to it. It would be nice to have something to look at when I see one of my 5 i7-860 machines suddenly climbing in RAC when the other 4 are not, or when an obviously slower machine overtakes a faster one.
SETI@home classic workunits: 93,865 CPU time: 863,447 hours
Join the BP6/VP6 User Group: http://tinyurl.com/8y46zvu
ID: 1365115
Profile Raistmer
Volunteer developer
Volunteer tester
Joined: 16 Jun 01
Posts: 6325
Credit: 106,370,077
RAC: 121
Russia
Message 1365134 - Posted: 6 May 2013, 15:09:12 UTC
Last modified: 6 May 2013, 15:16:40 UTC

There is a Perl script I wrote some time ago that parses client_state.xml for MB and AP tasks and extracts some useful statistics. For OpenCL MB it even extracts some counters, so one can plot, for example, how execution time depends on the number of signals found. But the more usual use is to plot execution time vs. AR for MB and execution time vs. % blanking for AP.
Using this script requires a Perl interpreter (very small ones are not too hard to find) and network access switched off in BOINC for some time, so that results collect locally.
Usually I fill the cache, compute for a day or two without network, then copy client_state.xml to another location and re-enable network in BOINC to refill the cache.

EDIT:
current version I use: ExtractTimes_v3.pl

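# Input: BOINC's client_state.xml. Output: tab-separated table, one row per usable result.
$path="client_state.xml";
$results="Times.txt";


open (IN, "<", $path) or die "Cannot open $path: $!";
open (RES, ">", $results) or die "Cannot open $results: $!";
# Header row; the column order matches the per-result print further down.
print RES join("\t", "Task name", "Result Type", "Revision", "Parameter", "ElapsedTime", "CPUTime", "PoT_transfer_needed", "PoT_transfer_not_needed", "Gaussian_transfer_needed", "Gaussian_transfer_not_needed", "PC_pulse_find_early_miss")."\n";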

while (<IN>) {   
		if( /<result>/ ){
#R: we need only result sections in file 
			$Gaussian_transfer_not_needed=0;
			$Gaussian_transfer_needed=0;
			$PoT_transfer_not_needed=0;
			$PoT_transfer_needed=0;
			$PC_pulse_find_early_miss=0;
			$trueAR=-1;#R: error condition, will not store such blocks
			$WUname="";
			$ElapsedTime=0;
			$CPUTime=0;
			$GPU=0;#R: 2-NV
			$ResultType=0;#R: 1-AKv8,2-ATI MB,3-ATI AP,4-CPU AP, 5-CUDA MB, 6-NV AP
			$Revision=0;#R: to distinguish between different builds
			while(<IN>){
				if( /<\/result>/ ){
					#R: ready to analyse collected info
					if( ($trueAR==-1) || ($ElapsedTime==0) ||($CPUTime==0) || ($Revision==0) ||
						($ResultType==0) ){ last;#R: record unready
					}
					print RES $WUname."\t".$ResultType."\t".$Revision."\t".$trueAR."\t".$ElapsedTime."\t".$CPUTime."\t".$PoT_transfer_needed."\t".$PoT_transfer_not_needed."\t".$Gaussian_transfer_needed."\t".$Gaussian_transfer_not_needed."\t".$PC_pulse_find_early_miss."\n";	
					last;#R: finished with this result record
				}
				if(/<name>(.*)<\/name>/){
					$WUname=$1;
					next;
				}
				if( /<final_cpu_time>(.*)<\/final_cpu_time>/ ){
					$CPUTime=$1;
					next;
				}
				if( /<final_elapsed_time>(.*)<\/final_elapsed_time>/ ){
					$ElapsedTime=$1;
					next;
				}
				if( /<exit_status>(.*)<\/exit_status>/ ){
					if($1 !=0){ last;#R:invalid result, no need to look into it further
					}
					next;
				}
				if(/USE_OPENCL_NV/){$GPU=2;}
				if(/OpenCL/ && $GPU==0){$GPU=1;}
				if( /AstroPulse v./ ){#R: AP detection block
					$ResultType=4;
					while(<IN>){           
						if(/Windows x86 rev (.*), 6/){
							$Revision=$1;
							next;
						}
						if(/Windows x86 rev (.*), V/){
							$Revision=$1;
							next;
						}
						if($GPU==1){ $ResultType=3;} 
					        if($GPU==2){$ResultType=6;}
						if( /<\/stderr_txt>/){ last;}
					 	if( /<\/result>/ ){#R: broken stderr
   					   		$Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;
					 	}
						if( /percent blanked: (\S*)/){
							$trueAR=$1;#R: for AP this parameter means blanking instead of AR
							next;
						}
						if( /Found 30 single pulses and 30 repeating pulses, exiting./){
 							$Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
						}
						if( /repetitive pulses: 30/){#R: no need task with FFA disabled
 							$Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
						}
					}
					
				}
				if( /Windows x86 rev (.*), Don't Panic!/ || /Linux 64 bit, rel.  Rev (.*)/ || /Linux 32 bit, rel.  Rev (.*)/){#uje: CPU opt AP detection
					$ResultType=4;
					$Revision=$1;
					next;
				}
				if( /Multibeam x32f Preview/){
					$Revision="32";
					$ResultType=5;
					while (<IN>) {   
					 if( /Informational message -9 result_overflow/){
   					   $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
					 }
					 if( /<\/result>/ ){#R: broken stderr
   					   $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;
					 }
					 if( /WU true angle range is :  (\S*)/ ){
						$trueAR=$1;
						next;
					 }
					 if( /<\/stderr_txt>/){ last;}
					}
				}
				if( /Build (.*) , Ported by/ ){#R: ATI MB/AKv8 detection block
					$Revision=$1;
					$ResultType=1;#R: suppose AKv8 by default, change if it's ATI MB
					while (<IN>) {   
					 if( /Informational message -9 result_overflow/){
   					   $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
					 }
					 if( /<\/result>/ ){#R: broken stderr
   					   $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;
					 }
					 if( /WU true angle range is :  (\S*)/ ){
						$trueAR=$1;next;
					 }
					 if(/OpenCL version by Raistmer/){
						$ResultType=2;#R: do result type correction
						next;
					 }
					if(/class Gaussian_transfer_needed:	total=(\S*),/){
						$Gaussian_transfer_needed=$1;
						next;
					}
					if(/class Gaussian_transfer_not_needed:	total=(\S*),/){
						$Gaussian_transfer_not_needed=$1;next;
					}
					if(/class PoT_transfer_not_needed:	total=(\S*),/){
						$PoT_transfer_not_needed=$1;       next;
					}
					if(/class PoT_transfer_needed:	total=(\S*),/){
						$PoT_transfer_needed=$1; next;
					}
                                        if(/class PC_pulse_find_early_miss:	total=(\S*),/){
						$PC_pulse_find_early_miss=$1; next;
					}
                                                    
					 if( /<\/stderr_txt>/){ last;}
					}#R:finish with MB task
				}
			}

		}
}

close(IN);
close(RES);
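
Once the script has written Times.txt, the table can be summarised without a spreadsheet. The Python sketch below is an illustration, not part of ExtractTimes_v3.pl. It assumes the tab-separated header row the script prints and the result-type codes given in its comments; the sample rows are invented and omit the counter columns for brevity.

```python
import csv
import io
from collections import defaultdict

# Result-type codes as listed in the script's own comments.
RESULT_TYPES = {
    "1": "AKv8", "2": "ATI MB", "3": "ATI AP",
    "4": "CPU AP", "5": "CUDA MB", "6": "NV AP",
}


def mean_elapsed_by_type(handle):
    """Return {result type label: (task count, mean ElapsedTime in seconds)}."""
    sums = defaultdict(lambda: [0, 0.0])
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        label = RESULT_TYPES.get(row["Result Type"], "unknown")
        sums[label][0] += 1
        sums[label][1] += float(row["ElapsedTime"])
    return {label: (n, total / n) for label, (n, total) in sums.items()}


# Invented rows using the first six column names from the script's header.
header = ["Task name", "Result Type", "Revision", "Parameter", "ElapsedTime", "CPUTime"]
rows = [
    ["task_a", "5", "32", "0.41", "600.0", "90.0"],
    ["task_b", "5", "32", "2.70", "200.0", "40.0"],
    ["task_c", "4", "1316", "12.5", "30000.0", "29500.0"],
]
sample = "\n".join("\t".join(fields) for fields in [header] + rows) + "\n"

stats = mean_elapsed_by_type(io.StringIO(sample))
for label, (count, mean_s) in sorted(stats.items()):
    print(label, count, round(mean_s, 1))
```

For a real run, replace the `io.StringIO(sample)` argument with a file opened as `open("Times.txt", newline="")`. The same loop could bucket rows by the Parameter column (AR for MB, % blanking for AP) to get the time-vs-AR view described above.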

SETI apps news
We're not gonna fight them. We're gonna transcend them.
ID: 1365134
kittyman Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 9 Jul 00
Posts: 51468
Credit: 1,018,363,574
RAC: 1,004
United States
Message 1365151 - Posted: 6 May 2013, 15:56:44 UTC

There used to be a couple of guys that would log results and make graphs of some of my rigs when we were testing apps or the old HT on/off question.
I don't know how automated they were or if they took a lot of manual manipulation.

I think archae86 and maybe Richard were the folks, but don't quote me on that.
"Freedom is just Chaos, with better lighting." Alan Dean Foster

ID: 1365151
Richard Haselgrove Project Donor
Volunteer tester

Joined: 4 Jul 99
Posts: 14649
Credit: 200,643,578
RAC: 874
United Kingdom
Message 1365218 - Posted: 6 May 2013, 18:18:19 UTC - in response to Message 1365151.  

Archae86 has certainly worked on HT - in more ways than one - most recently at Einstein: http://einstein.phys.uwm.edu/forum_thread.php?id=10000.

I worked with Joe Segur and others (add Fred W's name to the list) on 'estimates and deadlines'.
ID: 1365218
MarkJ Crowdfunding Project Donor*Special Project $75 donorSpecial Project $250 donor
Volunteer tester
Joined: 17 Feb 08
Posts: 1139
Credit: 80,854,192
RAC: 5
Australia
Message 1365380 - Posted: 7 May 2013, 10:10:59 UTC

BOINC logx might help. I haven't used it myself, but it's supposed to be a logging tool for BOINC clients. Maybe those who have experience with it could chime in here.
BOINC blog
ID: 1365380

©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.