question to result.sah

Message boards : Number crunching : question to result.sah

bjacke
Volunteer tester
Joined: 14 Apr 02
Posts: 346
Credit: 13,761
RAC: 0
Germany
Message 49518 - Posted: 27 Nov 2004, 7:56:58 UTC

Hi all, I'd like to read some data out of this XML file, but I'm not sure whether it is real XML, because some problems occurred. Any hints?



WARR - Wissenschaftliche Arbeitsgemeinschaft für Raketentechnik und Raumfahrt
(WARR - scientific working group for rocket technology and space travel)
ID: 49518
ric
Volunteer tester
Joined: 16 Jun 03
Posts: 482
Credit: 666,047
RAC: 0
Switzerland
Message 49731 - Posted: 28 Nov 2004, 21:10:32 UTC - in response to Message 49518.  

> Hi all, I'd like to read some data out of this XML file, but I'm not sure
> whether it is real XML, because some problems occurred. Any hints?
>
>
Are you looking at c:\boincfolder\slot01234\result.sah?
That is the *temporary*, slot-based name of the current work unit.

What exactly do you want to read, what kind of data?
I guess you will have more luck taking a closer look at client_state.xml.

What are you looking for? Perhaps you could describe your request in more detail.

ID: 49731
Roberto Virga
Volunteer tester

Joined: 29 May 99
Posts: 32
Credit: 41,362
RAC: 0
Italy
Message 49743 - Posted: 28 Nov 2004, 23:17:54 UTC - in response to Message 49518.  
Last modified: 29 Nov 2004, 0:30:26 UTC

This is going to be a long response, but I hope it will contain all the information you seek.

In the BOINC main directory there are two subdirectories, one named "slots" and one named "projects".

The one named "slots" will in turn contain sub-subdirectories named with consecutive numbers starting from 0 (e.g. "0", "1", "2", etc.). These correspond to the entries in the active task set (search for the "active_task_set" tag in the file client_state.xml in the BOINC directory). If you're running SETI@home, inside some of these sub-subdirectories you will find files named:
"work_unit.sah"
"result.sah"
"state.sah"
These are all XML files, and they are distant cousins of the files with the same name that SETI@home Classic uses.
The files "work_unit.sah" and "result.sah" are XML soft links to the real location of the work unit and result files, which are usually located somewhere under the "projects" subdirectory. For example, "result.sah" may contain something like:
<soft_link>..\..\projects\setiathome.berkeley.edu\07mr04ab.6777.497.261086.117_0_0</soft_link>
This gives you a recipe of how to locate the result file: go up one directory, go up again (".." means "go up one directory"), go to the subdirectory "projects", go to the sub-subdirectory "setiathome.berkeley.edu", and the result file will be a file named "07mr04ab.6777.497.261086.117_0_0" in there.
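The recipe above can be sketched in a few lines of Python. This is only an illustration, not part of BOINC: the function name resolve_soft_link and the C:\BOINC install directory are made up for the example, and the link is assumed to contain a single <soft_link> element with a relative path.

```python
import re
from pathlib import PureWindowsPath


def resolve_soft_link(slot_dir: str, link_text: str) -> str:
    """Follow a <soft_link> entry, starting from the slot directory."""
    match = re.search(r"<soft_link>(.*?)</soft_link>", link_text, re.S)
    if match is None:
        raise ValueError("no <soft_link> element found")
    base = PureWindowsPath(slot_dir)
    for piece in PureWindowsPath(match.group(1).strip()).parts:
        # ".." means "go up one directory"; anything else is a step down.
        base = base.parent if piece == ".." else base / piece
    return str(base)


# Hypothetical install directory; adjust to wherever BOINC lives.
link = (r"<soft_link>..\..\projects\setiathome.berkeley.edu"
        r"\07mr04ab.6777.497.261086.117_0_0</soft_link>")
print(resolve_soft_link(r"C:\BOINC\slots\0", link))
```

PureWindowsPath is used so the sketch treats backslashes as separators on any operating system; it only computes the path and never touches the disk.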
The file "state.sah" instead is not a soft link, but a big XML file that contains information regarding the state of the SETI@home client. If you're going to parse this with an XML parser, there are a few bugs you should be aware of. These are XML syntax errors, and if you don't correct them before passing the XML text to the parser, parsing will probably fail. They are:

1) Some tags have an attribute of the form length=n, where n is a number. According to XML syntax, all attribute values must be quoted, and these aren't. Hence you should replace something like this:
<pot length=225 encoding="x-csv">
with:
<pot length="225" encoding="x-csv">

2) The tag bs_fft_ind is not terminated. If you encounter a line like:
<bs_fft_ind>6</bs_fft_ind
you should add a > at the end:
<bs_fft_ind>6</bs_fft_ind>
Again, this is an XML syntax error (one much more serious than (1) above), and the file won't parse until you fix it.
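Both fixes can be applied to the text in memory before it reaches the parser. The following Python sketch shows one way to do that with regular expressions; the function name repair_state_xml, the <state> wrapper element and the sample values are assumptions made for the example, and the patterns cover only the two errors listed above.

```python
import re
import xml.etree.ElementTree as ET

_TAG = re.compile(r"<[^<>]+>")
_BARE_ATTR = re.compile(r'(\s[\w:.-]+)=(\d+)(?=[\s/>])')


def repair_state_xml(text: str) -> str:
    """Return a copy of the text with the two known syntax errors fixed."""
    # Fix (2): add the missing ">" to an unterminated </bs_fft_ind.
    text = re.sub(r"</bs_fft_ind\b(?!\s*>)", "</bs_fft_ind>", text)
    # Fix (1): quote bare numeric attribute values, but only inside tags.
    return _TAG.sub(lambda m: _BARE_ATTR.sub(r'\1="\2"', m.group(0)), text)


# Made-up sample that contains both errors.
sample = (
    "<state>\n"
    "<bs_fft_ind>6</bs_fft_ind\n"
    '<pot length=225 encoding="x-csv">1,2,3</pot>\n'
    "</state>"
)
root = ET.fromstring(repair_state_xml(sample))
print(root.find("pot").get("length"), root.find("bs_fft_ind").text)
```

The unterminated tag is repaired first so that the second pass sees well-delimited tags. Only the string is changed; nothing is written back to the file.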

Now, let's talk about the "projects" subdirectory. This is where most of the stuff is located. Inside "projects", there should be either a sub-subdirectory named "setiathome.berkeley.edu" or one named "setiweb.ssl.berkeley.edu". Inside this you will find all the workunit and result files. As a rule of thumb, result files are the ones whose names end in "_0". Again, if you're using an XML parser, you'll have trouble parsing both workunit and result files.
For workunit files, you should ignore anything between "<data>" and "</data>". The characters contained within violate XML syntax because they include special characters like "<", ">" and "&". XML syntax dictates that when these characters appear in an XML file, they must be replaced by their corresponding XML escape sequences.
Result files have the same problem as point (1) for "state.sah": there are attributes length=n where n is an unquoted number.
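For workunit files, one way to ignore the data section is to empty it out in memory before parsing. The Python sketch below assumes the section is delimited by a single <data>...</data> pair and that the literal text "</data>" does not occur inside it; the function name strip_data_section and the <workunit> and <name> elements in the sample are invented for the example.

```python
import re
import xml.etree.ElementTree as ET


def strip_data_section(text: str) -> str:
    """Drop everything between the opening <data> tag and </data>."""
    return re.sub(r"(<data\b[^>]*>).*?(</data>)", r"\1\2", text, flags=re.S)


# Made-up workunit whose data section would break an XML parser.
workunit = (
    "<workunit>\n"
    "<name>example</name>\n"
    "<data>ab<c>&d\n</data>\n"
    "</workunit>"
)
root = ET.fromstring(strip_data_section(workunit))
print(root.find("name").text)
```

The header fields remain readable after this step, while the raw samples are discarded, so this only suits programs that do not need the data itself.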

This is what I learned, by trial and error, using the XML parser provided by the Qt Toolkit to parse the SETI@home files.

- Roberto

ID: 49743
bjacke
Volunteer tester
Joined: 14 Apr 02
Posts: 346
Credit: 13,761
RAC: 0
Germany
Message 49909 - Posted: 29 Nov 2004, 11:45:10 UTC

@Roberto: Thanks for your detailed message, it fully answered my question. What I still want to know is whether I can change state.sah while the task is running without affecting the current crunching.




ID: 49909
Roberto Virga
Volunteer tester

Joined: 29 May 99
Posts: 32
Credit: 41,362
RAC: 0
Italy
Message 49918 - Posted: 29 Nov 2004, 13:33:17 UTC - in response to Message 49909.  
Last modified: 29 Nov 2004, 13:34:36 UTC

Basti,

Don't change the files on disk. In order to feed them to the XML parser, you'll have to read them into memory. As you read them, fix the errors (in the copy you have in memory). That's what I do in my program.

- Roberto
ID: 49918
bjacke
Volunteer tester
Joined: 14 Apr 02
Posts: 346
Credit: 13,761
RAC: 0
Germany
Message 49922 - Posted: 29 Nov 2004, 14:03:57 UTC - in response to Message 49918.  

> Basti,
>
> Don't change the files on disk. In order to feed them to the XML parser,
> you'll have to read them into memory. As you read them, fix the errors (in the
> copy you have in memory). That's what I do in my program.
>
> - Roberto
>
Thanks, I'll try to stream the file!



ID: 49922



 
©2024 University of California
 
SETI@home and Astropulse are funded by grants from the National Science Foundation, NASA, and donations from SETI@home volunteers. AstroPulse is funded in part by the NSF through grant AST-0307956.