question to result.sah
Author | Message |
---|---|
bjacke Joined: 14 Apr 02 Posts: 346 Credit: 13,761 RAC: 0 |
Hi @all, I would like to read some data out of this XML, but I'm not sure whether it is real XML, because some problems occurred. Any hints? WARR - Wissenschaftliche Arbeitsgemeinschaft für Raketentechnik und Raumfahrt (WARR - scientific working group for rocket technology and space travel) |
ric Joined: 16 Jun 03 Posts: 482 Credit: 666,047 RAC: 0 |
> Hi @all, I like to read some data out of this xml, but I'm not sure whether this is a real XML, because some problems occurred. Any hints?

Are you looking at c:\\boincfolder\\slot01234\\result.sah? That is the *temporary*, slot-based name of the current work unit. What exactly do you want to read, and what kind of data? I guess you will have more luck taking a closer look at client_state.xml. What are you looking for? Maybe you can describe your request in more detail here. |
Roberto Virga Joined: 29 May 99 Posts: 32 Credit: 41,362 RAC: 0 |
This is going to be a long response, but I hope it contains all the information you seek.

In the BOINC main directory there are two subdirectories, one named "slots" and one named "projects". The "slots" directory in turn contains sub-subdirectories named with consecutive numbers starting from 0 ("0", "1", "2", etc.). These correspond to the entries of the active task set (search for the "active_task_set" tag in the file client_state.xml in the BOINC directory).

If you're running SETI@home, inside some of these sub-subdirectories you will find files named "work_unit.sah", "result.sah" and "state.sah". These are all XML files, and they are distant cousins of the files with the same names that SETI@home Classic uses.

The files "work_unit.sah" and "result.sah" are XML soft links to the real location of the work unit and result files, which are usually located somewhere under the "projects" subdirectory. For example, "result.sah" may contain something like: <soft_link>..\\..\\projects\\setiathome.berkeley.edu\\07mr04ab.6777.497.261086.117_0_0</soft_link> This gives you a recipe for locating the result file: go up one directory, go up again (".." means "go up one directory"), go to the subdirectory "projects", go to the sub-subdirectory "setiathome.berkeley.edu", and the result file will be the file named "07mr04ab.6777.497.261086.117_0_0" in there.

The file "state.sah", by contrast, is not a soft link but a big XML file that contains information about the state of the SETI@home client. If you're going to parse it with an XML parser, there are a few bugs you should be aware of. These are XML syntax errors, and if you don't correct them before passing the XML text to the parser, parsing will probably fail. They are:

1) Some tags have attributes of the form length=n, where n is a number. According to XML syntax, all attribute values must be quoted, and these aren't. Hence you should replace something like <pot length=225 encoding="x-csv"> with <pot length="225" encoding="x-csv">.
2) The tag bs_fft_ind is not terminated. If you encounter a line like <bs_fft_ind>6</bs_fft_ind you should add a > at the end: <bs_fft_ind>6</bs_fft_ind>. Again, this is an XML syntax error (a much more serious one than (1) above), and the file won't parse until you fix it.

Now, let's talk about the "projects" subdirectory. This is where most of the stuff is located. Inside "projects", there should be either a sub-subdirectory named "setiathome.berkeley.edu" or one named "setiweb.ssl.berkeley.edu". Inside this you will find all the workunit and result files. As a rule of thumb, result files are the ones whose names end in "_0".

Again, if you're using an XML parser, you'll have trouble parsing both workunit and result files. For workunit files, you should ignore anything between "<data>" and "</data>". The content in there violates XML syntax because it includes special characters like "<", ">" and "&". XML syntax dictates that when these characters appear in an XML file, they must be replaced by their corresponding XML escape sequences. Result files have the same problem as point (1) for "state.sah": there are attributes length=n where n is an unquoted number.

This is what I learned, by trial and error, using the XML parser provided by the Qt Toolkit to parse the SETI@home files. - Roberto |
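The repairs Roberto lists (quoting length=n attributes, closing the bs_fft_ind tag, and dropping the raw payload between <data> and </data>) could be sketched in Python roughly as follows. This is not Roberto's Qt-based program: the function name `repair_sah_xml`, the regular expressions, and the sample text are assumptions made for illustration, and the sketch assumes the repaired text has a single root element.

```python
import re
import xml.etree.ElementTree as ET

# Matches length=225 (unquoted) but leaves length="225" alone.
UNQUOTED_LENGTH = re.compile(r'\blength=(\d+)')
# Matches a closing </bs_fft_ind that is missing its final ">".
UNTERMINATED_BS_FFT_IND = re.compile(r'</bs_fft_ind(?!\s*>)')
# Matches the raw payload between <data> and </data>.
DATA_PAYLOAD = re.compile(r'<data>.*?</data>', re.DOTALL)


def repair_sah_xml(text: str) -> str:
    """Return a repaired copy of the text; the input is not modified."""
    text = UNQUOTED_LENGTH.sub(r'length="\1"', text)
    text = UNTERMINATED_BS_FFT_IND.sub('</bs_fft_ind>', text)
    # The payload is discarded, not escaped, since it is to be ignored anyway.
    text = DATA_PAYLOAD.sub('<data></data>', text)
    return text


# Made-up sample that combines the three problems in one document.
sample = (
    '<state>\n'
    '<bs_fft_ind>6</bs_fft_ind\n'
    '<pot length=225 encoding="x-csv">1,2,3</pot>\n'
    '<data>a<b&c></data>\n'
    '</state>\n'
)

root = ET.fromstring(repair_sah_xml(sample))
print(root.find('pot').get('length'))
print(root.find('bs_fft_ind').text)
```

Regular expressions are used here because the input is not yet well-formed, so no XML parser can be asked to do the fixing. A real file may contain variations that these three patterns do not cover.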
bjacke Joined: 14 Apr 02 Posts: 346 Credit: 13,761 RAC: 0 |
@Roberto: Thx for your detailed message, it fully answered my question. But what I want to know is whether I can change state.sah while the work unit is being processed without affecting the current crunching. WARR - Wissenschaftliche Arbeitsgemeinschaft für Raketentechnik und Raumfahrt (WARR - scientific working group for rocket technology and space travel) |
Roberto Virga Joined: 29 May 99 Posts: 32 Credit: 41,362 RAC: 0 |
Basti, don't change the files on disk. In order to feed them to the XML parser, you'll have to read them into memory anyway. As you read them, fix the errors in the in-memory copy only. That's what I do in my program. - Roberto |
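Reading a file into memory without touching the copy on disk might look like the following Python sketch. It also follows a soft link as described earlier in the thread, assuming the link path is relative to the slot directory and uses Windows separators. The helper names `resolve_soft_link` and `read_in_memory` and the demo directory tree are hypothetical.

```python
import re
import tempfile
from pathlib import Path

SOFT_LINK = re.compile(r'<soft_link>(.*?)</soft_link>', re.DOTALL)


def resolve_soft_link(link_file: Path) -> Path:
    """Return the file a soft link points at, or link_file if it is no link."""
    text = link_file.read_text(errors='replace')
    match = SOFT_LINK.search(text)
    if match is None:
        return link_file
    # Assumption: the path is relative to the slot directory.
    relative = match.group(1).strip().replace('\\', '/')
    return (link_file.parent / relative).resolve()


def read_in_memory(sah_file: Path) -> str:
    """Read the real file behind sah_file; nothing is written back to disk."""
    return resolve_soft_link(sah_file).read_text(errors='replace')


# Demo on a throwaway tree that mimics the slots/projects layout.
with tempfile.TemporaryDirectory() as tmp:
    boinc = Path(tmp)
    slot = boinc / 'slots' / '0'
    project = boinc / 'projects' / 'setiathome.berkeley.edu'
    slot.mkdir(parents=True)
    project.mkdir(parents=True)

    target = project / '07mr04ab.6777.497.261086.117_0_0'
    target.write_text('<result>made-up content</result>')

    link = slot / 'result.sah'
    link_text = (r'<soft_link>..\..\projects\setiathome.berkeley.edu'
                 r'\07mr04ab.6777.497.261086.117_0_0</soft_link>')
    link.write_text(link_text)

    resolved_ok = resolve_soft_link(link) == target.resolve()
    contents = read_in_memory(link)           # in-memory copy to repair and parse
    link_unchanged = link.read_text() == link_text
```

Any repairs would then be applied to the returned string before it is handed to the parser, so the files the client is working on stay exactly as they were.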
bjacke Joined: 14 Apr 02 Posts: 346 Credit: 13,761 RAC: 0 |
> Basti, don't change the files on disk. In order to feed them to the XML parser, you'll have to read them in memory. As you read them, fix the errors (in the copy you have in memory). That's what I do in my program. - Roberto

Thx, I'll try to stream the file! WARR - Wissenschaftliche Arbeitsgemeinschaft für Raketentechnik und Raumfahrt (WARR - scientific working group for rocket technology and space travel) |