xmlFlatListTree             package:XML             R Documentation

_C_o_n_s_t_r_u_c_t_o_r_s _f_o_r _t_r_e_e_s _s_t_o_r_e_d _a_s _f_l_a_t _l_i_s_t _o_f _n_o_d_e_s _w_i_t_h
_i_n_f_o_r_m_a_t_i_o_n _a_b_o_u_t _p_a_r_e_n_t_s _a_n_d _c_h_i_l_d_r_e_n.

_D_e_s_c_r_i_p_t_i_o_n:

     These (and related internal) functions allow us to represent a
     tree as a simple, non-hierarchical collection of nodes along with
     corresponding tables that identify the parent and child
     relationships. This is different from representing a tree as a
     list of lists of lists ... in which each node holds a list of its
     own children. In a functional language like R, where a list
     element carries no reference back to its container, the children
     in such a nested structure cannot identify their parents.
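
     As a minimal sketch of the flat layout in base R (the node
     identifiers and the plain list/vector tables are illustrative
     assumptions, not the package's internal storage), a tree can be
     held as a named collection of nodes plus a parent table and a
     children table:

```r
# Hypothetical three-node tree: doc -> (a, b). Base R only.
nodes    <- list(doc = "root element", a = "first child", b = "second child")
parents  <- c(a = "doc", b = "doc")   # child id  -> parent id
children <- list(doc = c("a", "b"))   # parent id -> child ids

# A child identifies its parent with a table lookup.
parent_of_a <- parents[["a"]]
# A parent finds its children the same way.
kids_of_doc <- children[["doc"]]
```

     Because the relationships live in tables keyed by identifier, no
     node needs to contain another node.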

     We use an environment to represent these flat trees.  Since
     environments are mutable and are modified in place, we can change
     a part of the tree locally without having to reassign the
     top-level object.
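
     The difference between a mutable environment and an ordinary R
     value can be sketched in base R as follows. The helper 'add_item'
     is hypothetical and exists only for this illustration:

```r
# add_item is a hypothetical helper, not part of the XML package.
add_item <- function(container, name, value) {
  container[[name]] <- value   # in-place for an environment, a local copy for a list
  invisible(NULL)
}

e <- new.env()
l <- list()
add_item(e, "x", 1)   # the caller's environment now contains x
add_item(l, "x", 1)   # the caller's list is unchanged

env_has_x  <- exists("x", envir = e, inherits = FALSE)
list_has_x <- "x" %in% names(l)
```

     Only the environment sees the change without the result being
     assigned back in the caller.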

     We can use either a list (with names) to store the nodes or a hash
     table/associative array that uses names. There is a non-trivial
     performance difference.
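
     One rough way to observe the difference is to time name-based
     lookups in both containers. This is a base R sketch with made-up
     node identifiers; it does not call the package, and the timings
     depend on the machine and the R version:

```r
n   <- 5000
ids <- paste0("node", seq_len(n))   # hypothetical node identifiers

# The same values stored in a named list and in a hashed environment.
as_list <- setNames(as.list(seq_len(n)), ids)
as_env  <- new.env(hash = TRUE)
for (i in seq_len(n)) assign(ids[i], i, envir = as_env)

# Time a lookup of every identifier in each container.
t_list <- system.time(for (id in ids) as_list[[id]])
t_env  <- system.time(for (id in ids) get(id, envir = as_env, inherits = FALSE))

# Both containers return the same value for a given identifier.
same_value <- identical(as_list[["node42"]],
                        get("node42", envir = as_env, inherits = FALSE))
```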

_U_s_a_g_e:

     xmlHashTree(nodes = list(), parents = character(), children = list(), 
                  env = new.env(TRUE))
     xmlFlatListTree(nodes = list(), parents = character(), children = list(),
                     env = new.env(), n = 200)

_A_r_g_u_m_e_n_t_s:

   nodes: a collection of existing nodes that are to be added to the
          tree. These are used to initialize the tree. If this is
          specified, you must also specify 'children' and 'parents'. 

 parents: the parent relationships for the nodes given by 'nodes'.

children: the children relationships for the nodes given by 'nodes'.

     env: an environment in which the information for the tree will be
          stored. This is essentially the tree object, as it allows us
          to modify parts of the tree without having to reassign the
          top-level object. Unlike most R data types, environments
          are mutable.

       n: for 'xmlFlatListTree', the default size to allocate for the
          list containing the nodes.

_D_e_t_a_i_l_s:

_V_a_l_u_e:

     An object of class 'XMLFlatTree', which is specialized to
     'XMLFlatListTree' by the 'xmlFlatListTree' function and to
     'XMLHashTree' by the 'xmlHashTree' function. Both objects are
     simply the environment that contains information about the tree
     elements and the functions to access this information.

     An 'xmlHashTree' object has an accessor method via '$' for
     accessing individual nodes within the tree. One can use the node
     name/identifier in an expression such as 'tt$myNode' to obtain the
     element. The name of a node is either its XML node name or, if
     that name is already present in the tree, a machine-generated
     name.

     One can find the names of all the nodes using the 'objects'
     function since these trees are regular environments in R. Using
     the 'all.names = TRUE' argument, one can also find the hidden
     elements that define the tree's structure. These are '.children'
     and '.parents'. The former is a (hashed) environment. Each element
     in it is named by the identifier of a node in the tree
     (corresponding to the name of the node in the tree's environment).
     The value of that element is simply a character vector giving the
     identifiers of all of the children of that node.

     The '.parents' element is also an environment. Each variable in
     it pairs a node identifier with a parent identifier: the name of
     the variable is the node and its value is the parent. In
     other words, we look up the parent of a node named 'kid' by
     retrieving the value of the variable 'kid' in the '.parents'
     environment of this hash tree.
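
     The lookup just described can be imitated in base R. In this
     sketch the environment 'parents' stands in for '.parents', and
     the identifiers and the helper 'ancestors' are hypothetical; it
     mirrors the described layout rather than calling the package:

```r
# Stand-in for the '.parents' environment; ids are made up.
parents <- new.env(hash = TRUE)
assign("kid",    "mother", envir = parents)
assign("mother", "root",   envir = parents)

# The parent of 'kid' is the value of the variable 'kid'.
kid_parent <- get("kid", envir = parents, inherits = FALSE)

# Walk up to the root by repeated lookup (hypothetical helper).
ancestors <- function(id, parents) {
  out <- character()
  while (exists(id, envir = parents, inherits = FALSE)) {
    id  <- get(id, envir = parents, inherits = FALSE)
    out <- c(out, id)
  }
  out
}
kid_ancestors <- ancestors("kid", parents)
```

     The loop stops at the first identifier that has no entry in the
     parent table, which is taken here to be the root.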

     The function '.addNode' is used to insert a new node into the
     tree.

     The structure of this tree allows one to easily traverse all nodes
     and to navigate up the tree from a node via its parent.  Certain
     tasks are more complex as the hierarchy is not implicit within a
     node.
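
     Collecting every descendant of a node is one such task: it needs
     repeated lookups in the children table rather than a walk through
     nested lists. A base R sketch, in which the environment
     'children' stands in for '.children' and the identifiers and the
     helper 'descendants' are hypothetical:

```r
# Stand-in for the '.children' environment; ids are made up.
children <- new.env(hash = TRUE)
assign("doc", c("a", "b"),   envir = children)
assign("a",   c("a1", "a2"), envir = children)

# Depth-first collection of all descendants (hypothetical helper).
descendants <- function(id, children) {
  kids <- if (exists(id, envir = children, inherits = FALSE))
    get(id, envir = children, inherits = FALSE) else character()
  out <- character()
  for (k in kids) out <- c(out, k, descendants(k, children))
  out
}
doc_descendants <- descendants("doc", children)
```

     Nodes without an entry in the children table are treated as
     leaves.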

_A_u_t_h_o_r(_s):

     Duncan Temple Lang

_R_e_f_e_r_e_n_c_e_s:

     <URL: http://www.w3.org/XML>

_S_e_e _A_l_s_o:

     'xmlTreeParse', 'xmlTree', 'xmlOutputBuffer', 'xmlOutputDOM'

_E_x_a_m_p_l_e_s:

      f = system.file("exampleData", "dataframe.xml", package = "XML")
      tr  = xmlHashTree()
      xmlTreeParse(f, handlers = tr[[".addNode"]])

      tr # print the tree on the screen

       # Get the two child nodes of the dataframe node.
      xmlChildren(tr$dataframe)

       # Find the names of all the nodes.
      objects(tr)
       # Which nodes have children
      objects(tr$.children)

       # Which nodes are leaves, i.e. do not have children
      setdiff(objects(tr), objects(tr$.children))

       # find the class of each of these leaf nodes.
      sapply(setdiff(objects(tr), objects(tr$.children)),
              function(id) class(tr[[id]]))

       # distribution of number of children
      sapply(tr$.children, length)

       # Get the first A node
      tr$A

       # Get its parent node.
      xmlParent(tr$A)

