2 Generalized suffix trees (GSTs).
4 Computes generalized suffix trees from list of strings. A terminal symbol is
5 implicitly added to them, but is not returned in the word labeling nodes and
6 leaves. This should allow a rather transparent handling of GSTs.
8 Node-based accesses are provided (sequences, root, children, suffix links,
9 node labels, index), as well as a functional for synthesizing attributes from
10 a GST. A readable representation of GSTs is derived from the later.
13 (* made by Sebastien Ferre *)
15 (* extension by Yoann Padioleau: the function add, allowing to extend a gst
16 (internally use now a DynArray instead of an Array)
20 (** Type of nodes in GSTs. *)
25 val make
: string list
-> t
26 (** [make l_str] computes a GST based on the set of strings given in [l_str]. *)
28 val add
: string -> t
-> unit
29 (** [add l_str gst] add a new string in the GST. Does it via a side effect. *)
31 val string_list
: t
-> string list
32 (** [string_list gst] returns the list of strings from which [gst] was computed. *)
34 val string : t
-> int -> string
35 (** [string gst k] returns the sequence number [k] (starting from 0). *)
38 (** [root gst] returns the root node of the gst. *)
40 val word : t
-> node
-> string
41 (** [word gst n] returns the word labeling node [n] in [gst]. *)
43 val children
: t
-> node
-> node list
44 (** [children gst n] returns a list of the children nodes of [n] in [gst]. *)
46 val linked_node
: t
-> node
-> node
47 (** [linked_node gst n] returns the node pointed by the suffix link from [n] in [gst]. *)
49 val index
: t
-> node
-> int * int
50 (** [index gst n] returns the index of a leaf [n] in [gst].
51 This index is a pair [(k,i)], where [k] is the number of the sequence (as used by [string]), and
52 [i] is the position of the related suffix (starting from [0] as usual in strings).
53 @raise Invalid_argument "Suffix_tree.index: not a leaf" if [n] is not a leaf (has some child). *)
55 val implicit_node
: t
-> string -> node
* string * node
56 (** [implicit_node gst word] returns an implicit_node [(node,word',child)], where [node] is the lowest
57 node in the suffix tre such that the concatenation of the word recognized by [node] and [word'] is
58 equal to [word], if [word'] is not the empty string, then [child] is the child node of [node], whose
59 label has [word'] as a prefix.
60 @raise Not_found when [word] is not a substring of [string_list gst]. *)
63 val fold
: t
-> ('h
-> node
-> bool) -> ('h
-> node
-> 'h
) -> ('s list
-> 'h
-> node
-> 's
) -> 'h
-> 's
64 (** [fold gst filter herit synth init] computes some attribute(s) over a GST by using the 3 functions
65 [filter], [herit], [synth], and the initial value [init] inherited by the root node. ['h] is the type
66 of inherited attributes, and ['s] is the type of synthesized attributes, and so the type of the result.
68 The meaning of 3 functions is as follows:
69 - [filter h child] returns [true] if the node [child] must be explored given the inherited value of the current
70 node (parent of [child]),
71 - [herit h child] returns the value inherited by [child] given the inherited value of the current node
73 - [synth l h node] returns the synthesized value of the current node, given its inherited value [h], and
74 the list [l] of synthesized values of explored children of [node] (according to [filter]).
78 val fold_node
: t
-> ('h
-> node
-> bool) -> ('h
-> node
-> 'h
) -> ('s list
-> 'h
-> node
-> 's
) -> 'h
-> node
-> 's
79 (** Same as [fold], except the computation starts and finishes at the last argument node. *)
81 val fold_s
: t
-> ('s list
-> node
-> 's
) -> 's
82 (** [fold_s gst synth] is equivalent to [fold gst filter herit synth init], where there is no filtering, and
83 no inherited values: purely synthetic. *)
85 val fold_s_node
: t
-> ('s list
-> node
-> 's
) -> node
-> 's
86 (** Same as [fold_s], except the computation starts and finishes at the last argument node. *)
88 val fold_fs
: t
-> (node
-> bool) -> ('s list
-> node
-> 's
) -> 's
89 (** [fold_fs gst filter synth] is equivalent to [fold gst filter herit synth init], where there is no inherited
92 type tree
= Node
of string * tree list
| Leaf
of string * (int * int)
93 val readable
: t
-> tree
94 (** [readable gst] returns a (more) readable representation of [gst].
95 Each node and leaf is decorated by its word label, and leaves are
96 also decorated by their index. *)
98 val exact_matches
: t
-> string -> (int * int) list