Log/Tracking Tool
Author | Message |
---|---|
rebest Joined: 16 Apr 00 Posts: 1296 Credit: 45,357,093 RAC: 0 |
Greetings, all. Quick question. Have any of our talented Setizens with way too much time on their hands created a logging or tracking tool that does real-time, local monitoring and reporting of work units by machine? BoincStats and the other online systems do a great job of reporting credits. But frankly, after doing this for 13 years, I really couldn't care less about credits. By next week, I'll have 25 million of them. If I could sell them on eBay, I would. I am more interested in a report that neatly details how many work units my rig received and processed by type (MB/AP): when each WU was received, how it was processed (CPU/GPU), how long it took to complete (CPU/GPU time and clock time), when it was reported, and what the turnaround time was. In short, I'm more interested in knowing how to get the most bang for my crunching buck than how I compare to anyone else. All of these variables are tracked by BOINC in one way or another. I could find them by checking each result manually. I'm hoping to find a nice utility that does all that. Some trend lines and graphics would be nice, too. :) Thanks. Join the PACK! |
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
I don't know of any automated method, but if you pull up the application details here on the website for each rig, it should give you "number of tasks completed" for each type of work on that machine. Average processing rate and average turnaround time are recent averages, not all-time averages. Other than that, you can probably load 'job_log_setiathome.berkeley.edu.txt' (found in the data directory where client_state and cc_config are) into Excel or something similar, have it split each line into fields using space as the delimiter, and from there gather some information with formulas or make charts. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
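The spreadsheet approach described above can also be scripted. The following is a minimal Python sketch, not a finished tool. It assumes each job-log line is a Unix timestamp followed by space-separated key/value pairs (`ue`, `ct`, `fe`, `nm`, `et`, `es`), and that AstroPulse task names start with `ap_`. Both are assumptions about the file layout rather than anything confirmed in this thread, and the sample lines are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone


def parse_job_log_line(line):
    """Split one space-delimited job-log line into a dict.

    Assumed layout (not confirmed in this thread):
    <unix time> ue <est. runtime> ct <CPU s> fe <est. FLOPs> nm <task name> et <elapsed s> es <exit status>
    """
    fields = line.split()
    if len(fields) < 3:
        return None
    record = {"logged": datetime.fromtimestamp(float(fields[0]), tz=timezone.utc)}
    # The remaining fields are assumed to alternate key, value, key, value, ...
    for key, value in zip(fields[1::2], fields[2::2]):
        record[key] = value
    return record


def task_type(name):
    """Guess the work type from the task name (assumption: AP names start with 'ap_')."""
    return "AP" if name.startswith("ap_") else "MB"


def summarize(lines):
    """Count tasks and total CPU/elapsed seconds per work type."""
    totals = defaultdict(lambda: {"tasks": 0, "cpu_s": 0.0, "elapsed_s": 0.0})
    for line in lines:
        rec = parse_job_log_line(line)
        if rec is None or "nm" not in rec:
            continue
        bucket = totals[task_type(rec["nm"])]
        bucket["tasks"] += 1
        bucket["cpu_s"] += float(rec.get("ct", 0))
        bucket["elapsed_s"] += float(rec.get("et", 0))
    return dict(totals)


# Made-up sample lines in the assumed layout.
sample = [
    "1370000000 ue 5000.0 ct 4200.5 fe 40000000000000 nm 01ja13aa.1234.5678.9.11.42_0 et 4300.0 es 0",
    "1370003600 ue 30000.0 ct 100.0 fe 900000000000000 nm ap_01ja13aa_B0_P1_00001_20130601_12345.wu_1 et 2500.0 es 0",
    "1370007200 ue 5000.0 ct 4000.0 fe 40000000000000 nm 01ja13aa.1234.5678.9.11.43_1 et 4100.0 es 0",
]

report = summarize(sample)
for kind, stats in sorted(report.items()):
    print(kind, stats)
```

To run it against a real log, replace `sample` with the lines of the job-log file opened from the BOINC data directory. If the real file turns out to use different keys, only `parse_job_log_line` should need changing.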
rebest Joined: 16 Apr 00 Posts: 1296 Credit: 45,357,093 RAC: 0 |
Cosmic_Ocean wrote: "I don't know of any automated method, but if you pull up your application details here on the website for each rig's details, it should give you 'number of tasks completed' for each type of work on that machine. Average processing rate and average turnaround time are recent averages, not all-time averages." Many thanks for the reply and for identifying more data sources. Unfortunately, I'm not particularly talented or creative with such things, and right now I have very little time to undertake such a project. Tracking a machine's crunching performance by credits is OK, but measures such as RAC are (to borrow a term from economics) a "lagging indicator", meaning that other things must happen first (wingmen and validation) before the credit is actually counted. Also, we have no way of knowing when a behind-the-scenes tweak in Berkeley has altered the formula for calculating credit. It's happened before. Thanks again. Join the PACK! |
rob smith Joined: 7 Mar 03 Posts: 22160 Credit: 416,307,556 RAC: 380 |
All credits are awarded after the appropriate quorum for the work unit has been reached, so even "credits" is a lagging indicator. RAC is a long-tailed rolling average, so it is influenced by what has happened in the past. For any meaningful comparison you have to make your observation at the same time each day, or make allowances for the different observational day lengths. The former can be done with a fairly simple cron job that grabs the "your computers" page, parses it to get the data out and dumps it into a spreadsheet. Or do it the way I do: while chewing breakfast (which is normally "at the same time" each weekday) I quickly drop onto the page, scribble the numbers on a pad and do the sums. I suppose if I were feeling enthusiastic I could put the raw numbers into a spreadsheet and do all sorts of whizzy things with them, but I'm too busy chomping, reading emails, working out what's going to happen at work, where I'm meant to be going... Bob Smith Member of Seti PIPPS (Pluto is a Planet Protest Society) Somewhere in the (un)known Universe? |
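"Making allowances for the different observational day lengths" amounts to dividing each credit increase by the time actually elapsed between two readings. The Python sketch below shows that arithmetic only. The function name `credit_per_day` and the sample readings are hypothetical, and fetching or parsing the "your computers" page is deliberately left out because its layout is not described in this thread.

```python
from datetime import datetime


def credit_per_day(observations):
    """Turn (timestamp, total_credit) readings into credit/day for each interval.

    Dividing by the real elapsed time makes readings taken at irregular
    times of day comparable with each other.
    """
    ordered = sorted(observations)
    rates = []
    for (t0, c0), (t1, c1) in zip(ordered, ordered[1:]):
        days = (t1 - t0).total_seconds() / 86400.0
        if days <= 0:
            continue  # skip duplicate timestamps
        rates.append((t1, (c1 - c0) / days))
    return rates


# Hypothetical readings: the third one was taken six hours later than usual.
obs = [
    (datetime(2013, 6, 3, 7, 30), 24_900_000),
    (datetime(2013, 6, 4, 7, 30), 24_930_000),   # 24 h after the first reading
    (datetime(2013, 6, 5, 13, 30), 24_967_500),  # 30 h after the second reading
]

rates = credit_per_day(obs)
for when, rate in rates:
    print(when.isoformat(), round(rate, 1))
```

In this made-up example the raw jump on the third day looks larger (37,500 against 30,000), but once divided by the 30-hour interval both intervals work out to the same daily rate.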
Cosmic_Ocean Joined: 23 Dec 00 Posts: 3027 Credit: 13,516,867 RAC: 13 |
Well, my machines are slow and I only have two of them, so I actually keep track of all of my APs with Excel. I pull up the tasks page probably 2-3 times a day and add new wuIDs and taskIDs, and the beam/channel for each task assigned, and then add the data for the ones that have been reported, and then print the taskID page to PDF. I have all of them since we switched over to _v6 for both machines. It only takes a few minutes each day, unless there are some days like when the splitters ran out of tapes to work on for several days and then I get 30 new tasks, then I have a lot of numbers to type. Or when the servers go down for several days to be relocated or to be repaired and reporting tasks can't happen and I accumulate 10-20 completed APs that get reported all at once, then I have a lot of data to enter in and print to PDF. But for MBs, and especially if there's a GPU involved, I wouldn't even TRY to use my manual method. Linux laptop: record uptime: 1511d 20h 19m (ended due to the power brick giving-up) |
HAL9000 Joined: 11 Sep 99 Posts: 6534 Credit: 196,805,888 RAC: 57 |
For some time I have been wanting to do something like this as well. Sadly I have not got around to it. It would be nice to have something to look at when I see one of my 5 i7-860 machines suddenly climbing in RAC when the other 4 are not, or when an obviously slower machine overtakes a faster one. SETI@home classic workunits: 93,865 CPU time: 863,447 hours Join the BP6/VP6 User Group (http://tinyurl.com/8y46zvu) |
Raistmer Joined: 16 Jun 01 Posts: 6325 Credit: 106,370,077 RAC: 121 |
There is a Perl script I wrote some time ago that parses client_state.xml for MB and AP tasks and extracts some useful statistics. For OpenCL MB it even records some counters, so one can plot, for example, how execution time depends on the number of signals found. But the more usual use is to plot execution time vs. AR for MB and execution time vs. % blanking for AP. Using this script requires a Perl interpreter (very small ones are not hard to find) and the network switched off in BOINC for some time so that results collect locally. Usually I fill the cache, compute for a day or two without network, then copy client_state.xml to another location and re-enable the network in BOINC to refill the cache. EDIT: the current version I use, ExtractTimes_v3.pl:

    $path="client_state.xml";
    $results="Times.txt";
    open (IN, $path);
    open (RES, ">".$results);
    print RES "Task name"."\t"."Result Type"."\t"."Revision"."\t"."Parameter"."\t"."ElapsedTime"."\t"."CPUTime"."\t"."PoT_transfer_needed"."\t"."PoT_transfer_not_needed"."\t"."Gaussian_transfer_needed"."\t"."Gaussian_transfer_not_needed"."\t"."PC_pulse_find_early_miss"."\n";
    while (<IN>) {
        if( /<result>/ ){ #R: we need only result sections in file
            $Gaussian_transfer_not_needed=0;
            $Gaussian_transfer_needed=0;
            $PoT_transfer_not_needed=0;
            $PoT_transfer_needed=0;
            $PC_pulse_find_early_miss=0;
            $trueAR=-1;#R: error condition, will not store such blocks
            $WUname="";
            $ElapsedTime=0;
            $CPUTime=0;
            $GPU=0;#R: 2-NV
            $ResultType=0;#R: 1-AKv8,2-ATI MB,3-ATI AP,4-CPU AP, 5-CUDA MB, 6-NV AP
            $Revision=0;#R: to distinguish between different builds
            while(<IN>){
                if( /<\/result>/ ){ #R: ready to analyse collected info
                    if( ($trueAR==-1) || ($ElapsedTime==0) ||($CPUTime==0) || ($Revision==0) || ($ResultType==0) ){
                        last;#R: record unready
                    }
                    print RES $WUname."\t".$ResultType."\t".$Revision."\t".$trueAR."\t".$ElapsedTime."\t".$CPUTime."\t".$PoT_transfer_needed."\t".$PoT_transfer_not_needed."\t".$Gaussian_transfer_needed."\t".$Gaussian_transfer_not_needed."\t".$PC_pulse_find_early_miss."\n";
                    last;#R: finished with this result record
                }
                if(/<name>(.*)<\/name>/){
                    $WUname=$1;
                    next;
                }
                if( /<final_cpu_time>(.*)<\/final_cpu_time>/ ){
                    $CPUTime=$1;
                    next;
                }
                if( /<final_elapsed_time>(.*)<\/final_elapsed_time>/ ){
                    $ElapsedTime=$1;
                    next;
                }
                if( /<exit_status>(.*)<\/exit_status>/ ){
                    if($1 !=0){
                        last;#R:invalid result, no need to look into it further
                    }
                    next;
                }
                if(/USE_OPENCL_NV/){$GPU=2;}
                if(/OpenCL/ && $GPU==0){$GPU=1;}
                if( /AstroPulse v./ ){#R: AP detection block
                    $ResultType=4;
                    while(<IN>){
                        if(/Windows x86 rev (.*), 6/){
                            $Revision=$1;
                            next;
                        }
                        if(/Windows x86 rev (.*), V/){
                            $Revision=$1;
                            next;
                        }
                        if($GPU==1){ $ResultType=3;}
                        if($GPU==2){$ResultType=6;}
                        if( /<\/stderr_txt>/){ last;}
                        if( /<\/result>/ ){#R: broken stderr
                            $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;
                        }
                        if( /percent blanked: (\S*)/){
                            $trueAR=$1;#R: for AP this parameter means blanking instead of AR
                            next;
                        }
                        if( /Found 30 single pulses and 30 repeating pulses, exiting./){
                            $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
                        }
                        if( /repetitive pulses: 30/){#R: no need task with FFA disabled
                            $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
                        }
                    }
                }
                if( /Windows x86 rev (.*), Don't Panic!/ || /Linux 64 bit, rel. Rev (.*)/ || /Linux 32 bit, rel. Rev (.*)/){#uje: CPU opt AP detection
                    $ResultType=4;
                    $Revision=$1;
                    next;
                }
                if( /Multibeam x32f Preview/){
                    $Revision="32";
                    $ResultType=5;
                    while (<IN>) {
                        if( /Informational message -9 result_overflow/){
                            $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
                        }
                        if( /<\/result>/ ){#R: broken stderr
                            $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;
                        }
                        if( /WU true angle range is : (\S*)/ ){
                            $trueAR=$1;
                            next;
                        }
                        if( /<\/stderr_txt>/){ last;}
                    }
                }
                if( /Build (.*) , Ported by/ ){#R: ATI MB/AKv8 detection block
                    $Revision=$1;
                    $ResultType=1;#R: suppose AKv8 by default, change if it's ATI MB
                    while (<IN>) {
                        if( /Informational message -9 result_overflow/){
                            $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;#R: don't count overflows
                        }
                        if( /<\/result>/ ){#R: broken stderr
                            $Revision=$CPUTime=$ElapsedTime=0;$trueAR=-1;last;
                        }
                        if( /WU true angle range is : (\S*)/ ){
                            $trueAR=$1;next;
                        }
                        if(/OpenCL version by Raistmer/){
                            $ResultType=2;#R: do result type correction
                            next;
                        }
                        if(/class Gaussian_transfer_needed: total=(\S*),/){
                            $Gaussian_transfer_needed=$1;
                            next;
                        }
                        if(/class Gaussian_transfer_not_needed: total=(\S*),/){
                            $Gaussian_transfer_not_needed=$1;next;
                        }
                        if(/class PoT_transfer_not_needed: total=(\S*),/){
                            $PoT_transfer_not_needed=$1;
                            next;
                        }
                        if(/class PoT_transfer_needed: total=(\S*),/){
                            $PoT_transfer_needed=$1;
                            next;
                        }
                        if(/class PC_pulse_find_early_miss: total=(\S*),/){
                            $PC_pulse_find_early_miss=$1;
                            next;
                        }
                        if( /<\/stderr_txt>/){ last;}
                    }#R:finish with MB task
                }
            }
        }
    }

SETI apps news We're not gonna fight them. We're gonna transcend them. |
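The script above writes a tab-separated Times.txt whose column names and result-type codes are given in its header line and comments. As a rough illustration of the "execution time vs. AR" use mentioned in the post, the Python sketch below reads such a file and averages ElapsedTime per result type and per bin of the Parameter column (AR for MB, % blanking for AP). The function name, the bin width, and the sample rows are made up for illustration; this is not part of Raistmer's tooling.

```python
import csv
import io
import math
from collections import defaultdict
from statistics import mean

# Result-type codes as listed in the Perl script's comment.
RESULT_TYPES = {"1": "AKv8", "2": "ATI MB", "3": "ATI AP",
                "4": "CPU AP", "5": "CUDA MB", "6": "NV AP"}

HEADER = ["Task name", "Result Type", "Revision", "Parameter", "ElapsedTime",
          "CPUTime", "PoT_transfer_needed", "PoT_transfer_not_needed",
          "Gaussian_transfer_needed", "Gaussian_transfer_not_needed",
          "PC_pulse_find_early_miss"]


def mean_elapsed_by_parameter(tsv_file, bin_width=0.1):
    """Average ElapsedTime for each (result type, parameter bin) pair."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        label = RESULT_TYPES.get(row["Result Type"], row["Result Type"])
        param = float(row["Parameter"])
        # Lower edge of the bin that contains this parameter value.
        bin_start = round(math.floor(param / bin_width) * bin_width, 6)
        groups[(label, bin_start)].append(float(row["ElapsedTime"]))
    return {key: mean(values) for key, values in sorted(groups.items())}


# Made-up rows in the Times.txt layout, for illustration only.
rows = [
    ["wu_a", "5", "32", "0.42", "600.0", "60.0", "0", "0", "0", "0", "0"],
    ["wu_b", "5", "32", "0.44", "640.0", "62.0", "0", "0", "0", "0", "0"],
    ["wu_c", "5", "32", "1.13", "300.0", "40.0", "0", "0", "0", "0", "0"],
    ["ap_d", "4", "1", "3.57", "40000.0", "39000.0", "0", "0", "0", "0", "0"],
]
sample = io.StringIO("\n".join("\t".join(r) for r in [HEADER] + rows) + "\n")

averages = mean_elapsed_by_parameter(sample)
for (label, bin_start), seconds in averages.items():
    print(f"{label:8s} parameter >= {bin_start:<5} mean elapsed {seconds:.1f} s")
```

For a real run, pass `open("Times.txt", newline="")` instead of `sample`. For AP rows the parameter is a blanking percentage, so a wider bin (for example `bin_width=5`) is probably more useful there.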
kittyman Joined: 9 Jul 00 Posts: 51468 Credit: 1,018,363,574 RAC: 1,004 |
There used to be a couple of guys that would log results and make graphs of some of my rigs when we were testing apps or the old HT on/off question. I don't know how automated they were or if they took a lot of manual manipulation. I think archae86 and maybe Richard were the folks, but don't quote me on that. "Freedom is just Chaos, with better lighting." Alan Dean Foster |
Richard Haselgrove Joined: 4 Jul 99 Posts: 14649 Credit: 200,643,578 RAC: 874 |
kittyman wrote: "There used to be a couple of guys that would log results and make graphs of some of my rigs when we were testing apps or the old HT on/off question." Archae86 has certainly worked on HT - in more ways than one - most recently at Einstein: http://einstein.phys.uwm.edu/forum_thread.php?id=10000. I worked with Joe Segur and others (add Fred W's name to the list) on 'estimates and deadlines'. |
MarkJ Joined: 17 Feb 08 Posts: 1139 Credit: 80,854,192 RAC: 5 |
BOINC logx might help. I haven't used it myself, but it's supposed to be a logging tool for BOINC clients. Maybe those who have experience with it could chime in here. BOINC blog |
©2024 University of California
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.