[bpt/emacs.git] / doc / misc / semantic.texi
1\input texinfo
2@setfilename ../../info/semantic
3@set TITLE Semantic Manual
4@set AUTHOR Eric M. Ludlam, David Ponce, and Richard Y. Kim
5@settitle @value{TITLE}
6
7@c *************************************************************************
8@c @ Header
9@c *************************************************************************
10
11@c Merge all indexes into a single index for now.
12@c We can always separate them later into two or more as needed.
13@syncodeindex vr cp
14@syncodeindex fn cp
15@syncodeindex ky cp
16@syncodeindex pg cp
17@syncodeindex tp cp
18
19@c @footnotestyle separate
20@c @paragraphindent 2
21@c @@smallbook
22@c %**end of header
23
24@copying
25This manual documents the Semantic library and utilities.
26
27Copyright @copyright{} 1999-2005, 2007, 2009-2011 Free Software Foundation, Inc.
28
29@quotation
30Permission is granted to copy, distribute and/or modify this document
31under the terms of the GNU Free Documentation License, Version 1.3 or
32any later version published by the Free Software Foundation; with no
33Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
34and with the Back-Cover Texts as in (a) below. A copy of the license
35is included in the section entitled ``GNU Free Documentation License.''
36
37(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
38modify this GNU manual. Buying copies from the FSF supports it in
39developing GNU and promoting software freedom.''
40@end quotation
41@end copying
42
43@dircategory Emacs misc features
44@direntry
45* Semantic: (semantic). Source code parser library and utilities.
46@end direntry
47
48@titlepage
49@center @titlefont{Semantic}
50@sp 4
51@center by @value{AUTHOR}
52@end titlepage
53@page
54
55@macro semantic{}
56@i{Semantic}
57@end macro
58
59@macro keyword{kw}
60@anchor{\kw\}
61@b{\kw\}
62@end macro
63
64@macro obsolete{old,new}
65@sp 1
66@strong{Compatibility}:
67@code{\new\} introduced in @semantic{} version 2.0 supersedes
68@code{\old\} which is now obsolete.
69@end macro
70
71@c *************************************************************************
72@c @ Document
73@c *************************************************************************
74@contents
75
76@node top
77@top @value{TITLE}
78
79@semantic{} is a suite of Emacs libraries and utilities for parsing
80source code. At its core is a lexical analyzer and two parser
81generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
82@semantic{} provides a variety of tools for making use of the parser
83output, including user commands for code navigation and completion, as
84well as enhancements for imenu, speedbar, whichfunc, eldoc,
85hippie-expand, and several other parts of Emacs.
86
87To send bug reports, or participate in discussions about semantic,
88use the mailing list cedet-semantic@@sourceforge.net via the URL:
89@url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}
90
91@ifnottex
92@insertcopying
93@end ifnottex
94
95@menu
96* Introduction::
97* Using Semantic::
98* Semantic Internals::
99* Glossary::
100* GNU Free Documentation License::
101* Index::
102@end menu
103
104@node Introduction
105@chapter Introduction
106
107This chapter gives an overview of @semantic{} and its goals.
108
109Ordinarily, Emacs uses regular expressions (and syntax tables) to
110analyze source code for purposes such as syntax highlighting. This
111approach, though simple and efficient, has its limitations: roughly
112speaking, it only ``guesses'' the meaning of each piece of source code
113in the context of the programming language, instead of rigorously
114``understanding'' it.
115
116@semantic{} provides a new infrastructure to analyze source code using
117@dfn{parsers} instead of regular expressions. It contains two
118built-in parser generators (an @acronym{LL} generator named
119@code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
120both written in Emacs Lisp), and parsers for several common
121programming languages. It can also make use of @dfn{external
122parsers}---programs such as GNU Global and GNU IDUtils.
123
124@semantic{} provides a uniform, language-independent @acronym{API} for
125accessing the parser output. This output can be used by other Emacs
126Lisp programs to implement ``syntax-aware'' behavior. @semantic{}
127itself includes several such utilities, including user-level Emacs
128commands for navigating, searching, and completing source code.
129
130The following diagram illustrates the structure of the @semantic{}
131package:
132
133@table @strong
134@item Please Note:
135Words in all capitals are components that @semantic{} itself provides.
136Others are current or future languages or applications that are not
137distributed along with @semantic{}.
138@end table
139
140@example
141                                                             Applications
142                                                                 and
143                                                              Utilities
144                                                                -------
145                                                               /       \
146               +---------------+    +--------+    +--------+
147         C --->| C      PARSER |--->|        |    |        |
148               +---------------+    |        |    |        |
149               +---------------+    | COMMON |    | COMMON |<--- SPEEDBAR
150      Java --->| JAVA   PARSER |--->| PARSE  |    |        |
151               +---------------+    | TREE   |    | PARSE  |<--- SEMANTICDB
152               +---------------+    | FORMAT |    | API    |
153    Scheme --->| SCHEME PARSER |--->|        |    |        |<--- ecb
154               +---------------+    |        |    |        |
155               +---------------+    |        |    |        |
156   Texinfo --->| TEXI.  PARSER |--->|        |    |        |
157               +---------------+    |        |    |        |
158
159                    ...                ...           ...         ...
160
161               +---------------+    |        |    |        |
162   Lang. Y --->| Y      Parser |--->|        |    |        |<--- app. ?
163               +---------------+    |        |    |        |
164               +---------------+    |        |    |        |<--- app. ?
165   Lang. Z --->| Z      Parser |--->|        |    |        |
166               +---------------+    +--------+    +--------+
167@end example
168
169@menu
170* Semantic Components::
171@end menu
172
173@node Semantic Components
174@section Semantic Components
175
176In this section, we provide a more detailed description of the major
177components of @semantic{}, and how they interact with one another.
178
179The first step in parsing a source code file is to break it up into
180its fundamental components. This step is called lexical analysis:
181
182@example
183    syntax table, keywords list, and options
184                   |
185                   |
186                   v
187    input file  ---->  Lexer   ----> token stream
188@end example
189
190@noindent
191The output of the lexical analyzer is a list of tokens that make up
192the file. The next step is the actual parsing, shown below:
193
194@example
195                    parser tables
196                         |
197                         v
198    token stream --->  Parser  ----> parse tree
199@end example
200
201@noindent
202The end result, the parse tree, is @semantic{}'s internal
203representation of the language grammar. @semantic{} provides an
204@acronym{API} for Emacs Lisp programs to access the parse tree.
205
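As a rough illustration of the two stages described above, the
following sketch asks @semantic{} for the token stream and for the
top-level tags of the current buffer.  It assumes that
@code{semantic-mode} is enabled, that the buffer's major mode has a
@semantic{} parser, and that the feature names are those of the
@semantic{} bundled with Emacs.  The @code{my-} functions are
hypothetical helpers written for this example; they are not part of
@semantic{}.

@example
(require 'semantic)
(require 'semantic/lex)

(defun my-semantic-token-classes ()
  "Return the class of each lexical token in the current buffer."
  ;; `semantic-lex' runs the buffer's lexical analyzer over a region
  ;; and returns the resulting token stream as a list.
  (mapcar #'semantic-lex-token-class
          (semantic-lex (point-min) (point-max))))

(defun my-semantic-tag-summary ()
  "Return (NAME . CLASS) pairs for the buffer's top-level tags."
  ;; `semantic-fetch-tags' reparses only if the cached tags are stale.
  (mapcar (lambda (tag)
            (cons (semantic-tag-name tag)
                  (semantic-tag-class tag)))
          (semantic-fetch-tags)))
@end example

@noindent
Evaluating @code{(my-semantic-tag-summary)} in a source buffer should
give one pair per top-level function, variable, or include.
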
206Parsing large files can take several seconds or more. By default,
207@semantic{} automatically caches parse trees by saving them in your
208@file{.emacs.d} directory. When you revisit a previously-parsed file,
209the parse tree is automatically reloaded from this cache, to save
210time. @xref{SemanticDB}.
211
212@node Using Semantic
213@chapter Using Semantic
214
215@include sem-user.texi
216
217@node Semantic Internals
218@chapter Semantic Internals
219
220This chapter provides an overview of the internals of @semantic{}.
221This information is usually not needed by application developers or
222grammar developers; it is useful mostly for the hackers who would like
223to learn more about how @semantic{} works.
224
225@menu
226* Parser code::             Code used for the parsers
227* Tag handling::            Code used for manipulating tags
228* Semanticdb Internals::    Code used in the semantic database
229* Analyzer Internals::      Code used in the code analyzer
230* Tools::                   Code used in user tools
231* Tests::                   Code used for testing
232@end menu
233
234@node Parser code
235@section Parser code
236
237@semantic{} parsing code is spread across a range of files.
238
239@table @file
240@item semantic.el
241The core infrastructure sets up buffers for parsing, and has all the
242core parsing routines. Most parsing routines are overloadable, so the
243actual implementation may be somewhere else.
244
245@item semantic-edit.el
246Incremental reparse based on user edits.
247
248@item semantic-grammar.el
249@itemx semantic-grammar.wy
250Parser for the different grammar languages, and a major mode for
251editing grammars in Emacs.
252
253@item semantic-lex.el
254Infrastructure for implementing lexical analyzers. Provides macros
255for creating individual analyzers for specific features, and a way to
256combine them together.
257
258@item semantic-lex-spp.el
259Infrastructure for a lexical symbolic preprocessor. This was written
260to implement the C preprocessor, but could be used for other lexical
261preprocessors.
262
263@item bovine/bovine-grammar.el
264@itemx bovine/bovine-grammar-macros.el
265@itemx bovine/semantic-bovine.el
266The ``bovine'' grammar. This is the first grammar mode written for
267@semantic{} and is useful for creating simple parsers.
268
269@item wisent/wisent.el
270@itemx wisent/bison-wisent.el
271@itemx wisent/semantic-wisent.el
272@itemx wisent/semantic-debug-grammar.el
273A port of Bison to Emacs. This infrastructure lets you create
274LALR-based parsers for @semantic{}.
275
276@item semantic-ast.el
277Manage Abstract Syntax Trees for parsers.
278
279@item semantic-debug.el
280Infrastructure for debugging grammars.
281
282@item semantic-util.el
283Various utilities for manipulating tags, such as describing the tag
284under point, adding labels, and the all-important
285@code{semantic-something-to-tag-table}.
286
287@end table
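
The overloading mechanism mentioned above can be sketched as follows.
This is a minimal example, assuming the @file{mode-local} library that
ships with Emacs and its @code{define-overloadable-function} and
@code{define-mode-local-override} macros; the function
@code{my-greeting} is hypothetical and exists only for illustration.

@example
(require 'mode-local)

;; Declare an overloadable function.  With an empty body, calls fall
;; back to `my-greeting-default' unless the current major mode
;; provides an override.
(define-overloadable-function my-greeting ()
  "Return a greeting; major modes may override this.")

(defun my-greeting-default ()
  "Default implementation of `my-greeting'."
  "hello")

;; Override used only in buffers whose major mode is (or derives
;; from) `emacs-lisp-mode'.
(define-mode-local-override my-greeting emacs-lisp-mode ()
  "Override of `my-greeting' for Emacs Lisp buffers."
  "hello from elisp")
@end example

@noindent
The same pattern explains why the implementation of a parsing routine
may live in a language-specific file rather than in the core.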
288
289@node Tag handling
290@section Tag handling
291
292A tag represents an individual item found in a buffer, such as a
293function or variable. Tag handling is implemented across several source
294files.
295
296@table @file
297@item semantic-tag.el
298Basic tag creation, queries, cloning, binding, and unbinding.
299
300@item semantic-tag-write.el
301Write a tag or tag list to a stream. These routines are used by
302@file{semanticdb-file.el} when saving a list of tags.
303
304@item semantic-tag-file.el
305Files associated with tags. Goto-tag, file for include, and file for
306a prototype.
307
308@item semantic-tag-ls.el
309Language-dependent features of a tag, such as parent calculation, slot
310protection, and other states like abstract, virtual, static, and leaf.
311
312@item semantic-dep.el
313Include file handling. Contains the include path concepts, and
314routines for looking up file names in the include path.
315
316@item semantic-format.el
317Convert a tag into a nicely formatted and colored string. Use
318@code{semantic-test-all-format-tag-functions} to test different output
319options.
320
321@item semantic-find.el
322Find tags matching different conditions in a tag table.
323These routines are used by @file{semanticdb-find.el} once the database
324has been converted into a simpler tag table.
325
326@item semantic-sort.el
327Sorting lists of tags in different ways. Includes sorting a plain
328list of tags forward or backward. Includes binning tags based on
329attributes (bucketize), and tag adoption for multiple references to
330the same thing.
331
332@item semantic-doc.el
333Capture documentation comments from near a tag.
334
335@end table
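
To show how the searching and formatting routines listed above fit
together, here is a small sketch.  It assumes a buffer that
@semantic{} has already parsed and the feature names used by the
@semantic{} bundled with Emacs; @code{my-semantic-function-prototypes}
is a hypothetical helper, not part of @semantic{}.

@example
(require 'semantic)
(require 'semantic/find)
(require 'semantic/format)

(defun my-semantic-function-prototypes ()
  "Return prototype strings for the buffer's top-level functions."
  ;; Select the tags whose class is `function', then turn each one
  ;; into a human-readable prototype string.
  (mapcar #'semantic-format-tag-prototype
          (semantic-find-tags-by-class 'function
                                       (semantic-fetch-tags))))
@end example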
336
337@node Semanticdb Internals
338@section Semanticdb Internals
339
340@acronym{Semanticdb} complexity is certainly an issue. It is a rather
341hairy problem to try and solve.
342
343@table @file
344@item semanticdb.el
345Defines a @dfn{database} and a @dfn{table} base class. You can
346instantiate these classes, and use them, but they are not persistent.
347
348This file also provides support for @code{semanticdb-minor-mode},
349which automatically associates files with tables in databases so that
350tags are @emph{saved} while a buffer is not in memory.
351
352Both the database and its tables also provide cache information and a
353cache-flushing system. The semanticdb search routines use caches
354to save data structures that are complex to calculate.
355
356Lastly, it provides the concept of @dfn{project root}. It is a system
357by which a file can be associated with the root of a project, so if
358you have a tree of directories and source files, it can find the root,
359and allow a tag-search to span all available databases in that
360directory hierarchy.
361
362@item semanticdb-file.el
363Provides a subclass of the basic table so that it can be saved to
364disk. Implements all the code needed to unbind/rebind tags to a
365buffer and to write them to a file.
366
367@item semanticdb-el.el
368Implements a special kind of @dfn{system} database that uses Emacs
369internals to perform queries.
370
371@item semanticdb-ebrowse.el
372Implements a system database that uses Ebrowse to parse files into a
373table that can be queried for tag names. Successful tag hits during a
374find cause @semantic{} to pick up and parse the reference files to
375get the full details.
376
377@item semanticdb-find.el
378Infrastructure for searching groups of @semantic{} databases, and dealing
379with the search results format.
380
381@item semanticdb-ref.el
382Tracks cross-references. Cross-references are needed when a buffer is
383reparsed, so that other tables can be alerted that any dependent caches may
384need to be flushed. References are in the form of include files.
385
386@end table
387
388@node Analyzer Internals
389@section Analyzer Internals
390
391The @semantic{} analyzer is a complex engine which has been broken
392down across several modules. When the @semantic{} analyzer fails,
393start with @code{semantic-analyze-debug-assist}, then dive into some
394of these files.
395
396@table @file
397@item semantic-analyze.el
398The core analyzer for defining the @dfn{current context}. The
399current context is an object that contains references to aspects of
400the local context including the current prefix, and a tag list
401defining what the prefix means.
402
403@item semantic-analyze-complete.el
404Provides @code{semantic-analyze-possible-completions}.
405
406@item semantic-analyze-debug.el
407The analyzer debugger. Useful when attempting to get everything
408configured.
409
410@item semantic-analyze-fcn.el
411Various support functions needed by the analyzer.
412
413@item semantic-ctxt.el
414Local context parser. Contains overloadable functions used to move
415around through different scopes, get local variables, and collect the
416current prefix used when doing completion.
417
418@item semantic-scope.el
419Calculate @dfn{scope} for a location in a buffer. The scope includes
420local variables, and tag lists in scope for various reasons, such as
421C++ using statements.
422
423@item semanticdb-typecache.el
424The typecache is part of @code{semanticdb}, but is used primarily by
425the analyzer to look up datatypes and complex names. The typecache is
426bound across source files and builds a master lookup table for data
427type names.
428
429@item semantic-ia.el
430Interactive Analyzer functions. Simple routines that do completion or
431lookups based on the results from the Analyzer. These routines are
432meant as examples for application writers, but are quite useful as
433they are.
434
435@item semantic-ia-sb.el
436Speedbar support for the analyzer, displaying context info, and
437completion lists.
438
439@end table
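
As a starting point for exploring these modules, the sketch below
computes the current context at point and lists the names of the
completions the analyzer proposes.  It assumes a buffer that
@semantic{} has parsed and the feature names used by the @semantic{}
bundled with Emacs; @code{my-semantic-completion-names} is a
hypothetical helper.  The analyzer may signal an error when there is
nothing to complete at point, which this sketch does not handle.

@example
(require 'semantic/analyze)
(require 'semantic/analyze/complete)

(defun my-semantic-completion-names ()
  "Return the names of the analyzer's completions at point."
  ;; The context object records the prefix being typed and the tags
  ;; that give that prefix its meaning.
  (let ((ctxt (semantic-analyze-current-context (point))))
    (when ctxt
      (mapcar #'semantic-tag-name
              (semantic-analyze-possible-completions ctxt)))))
@end example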
440
441@node Tools
442@section Tools
443
444These files contain various tools a user can use.
445
446@table @file
447@item semantic-idle.el
448Idle scheduler for @semantic{}. Manages reparsing buffers after
449edits, and large work tasks in idle time. Includes modes for showing
450summary help and pop-up completion.
451
452@item senator.el
453The @semantic{} navigator. Provides many ways to move through a
454buffer based on the active tag table.
455
456@item semantic-decorate.el
457A minor mode for decorating tags based on details from the parser.
458Includes overlines for functions, or coloring class fields based on
459protection.
460
461@item semantic-decorate-include.el
462A decoration mode for include files, which assists users in setting up
463parsing for their includes.
464
465@item semantic-complete.el
466Advanced completion prompts for reading tag names in the minibuffer, or
467inline in a buffer.
468
469@item semantic-imenu.el
470Imenu support for using @semantic{} tags in imenu.
471
472@item semantic-mru-bookmark.el
473Automatic bookmarking based on tags. Jump to locations you've been
474before based on tag name.
475
476@item semantic-sb.el
477Support for @semantic{} tag usage in Speedbar.
478
479@item semantic-util-modes.el
480A collection of small minor modes that expose aspects of the @semantic{}
481parser state. Includes @code{semantic-stickyfunc-mode}.
482
483@item document.el
484@itemx document-vars.el
485Create and update comments for tags.
486
487@item semantic-adebug.el
488Extensions of @file{data-debug.el} for @semantic{}.
489
490@item semantic-chart.el
491Draw some charts from stats generated from parsing.
492
493
494@item semantic-elp.el
495Profiler for helping to optimize the @semantic{} analyzer.
496
497
498@end table
499
500@node Tests
501@section Tests
502
503@table @file
504
505@item semantic-utest.el
506Basic testing of parsing and incremental parsing for most supported
507languages.
508
509@item semantic-ia-utest.el
510Test the semantic analyzer's ability to provide smart completions.
511
512@item semantic-utest-c.el
513Tests for the C parser's lexical pre-processor.
514
515@item semantic-regtest.el
516Regression tests from the older Semantic 1.x API.
517
518@end table
519
520@node Glossary
521@appendix Glossary
522
523@table @keyword
524@item BNF
525In semantic 1.4, a BNF file represented ``Bovine Normal Form'', the
526grammar file used for the 1.4 parser generator. This was a play on
527Backus-Naur Form which proved too confusing.
528
529@item bovinate
530A verb representing what happens when a bovine parser parses a file.
531
532@item bovine lambda
533In a bovine, or LL parser, the bovine lambda is a function to execute
534when a specific set of match rules has succeeded in matching text from
535the buffer.
536
537@item bovine parser
538A parser using the bovine parser generator. It is an LL parser
539suitable for small simple languages.
540
541@item context
542
543@item LALR
544
545@item lexer
546A program that converts text into a stream of tokens by analyzing
547it lexically. Lexers commonly produce strings, symbols,
548keywords, and punctuation, and strip whitespace and comments.
549
550@item LL
551
552@item nonterminal
553A nonterminal symbol or simply a nonterminal stands for a class of
554syntactically equivalent groupings. A nonterminal symbol name is used
555in writing grammar rules.
556
557@item overloadable
558Some functions are defined via @code{define-overload}.
559These can be overloaded via ....
560
561@item parser
562A program that converts @b{tokens} to @b{tags}.
563
564@item tag
565A tag is a representation of some entity in a language file, such as a
566function, variable, or include statement. In semantic, the word tag is
567used the same way it is used for the etags or ctags tools.
568
569A tag is usually bound to a buffer region via overlay, or it just
570specifies character locations in a file.
571
572@item token
573A single atomic item returned from a lexer. It represents some set
574of characters found in a buffer.
575
576@item token stream
577The output of the lexer as well as the input to the parser.
578
579@item wisent parser
580A parser using the wisent parser generator. It is a port of bison to
581Emacs Lisp. It is an LALR parser suitable for complex languages.
582@end table
583
584
585@node GNU Free Documentation License
586@appendix GNU Free Documentation License
587@include doclicense.texi
588
589@node Index
590@unnumbered Index
591@printindex cp
592
593@iftex
594@contents
595@summarycontents
596@end iftex
597
598@bye
599
600@c Following comments are for the benefit of ispell.
601
602@c LocalWords: alist API APIs arg argc args argv asis assoc autoload Wisent
603@c LocalWords: backquote bnf bovinate bovinates LALR
604@c LocalWords: bovinating bovination bovinator bucketize
605@c LocalWords: cb cdr charquote checkcache cindex CLOS
606@c LocalWords: concat concocting const constantness ctxt Decl defcustom
607@c LocalWords: deffn deffnx defun defvar destructor's dfn diff dir
608@c LocalWords: doc docstring EDE EIEIO elisp emacsman emph enum
609@c LocalWords: eq Exp EXPANDFULL expression fn foo func funcall
610@c LocalWords: ia ids iff ifinfo imenu imenus init int isearch itemx java kbd
611@c LocalWords: keymap keywordtable lang languagemode lexer lexing Ludlam
612@c LocalWords: menubar metaparent metaparents min minibuffer Misc mode's
613@c LocalWords: multitable NAvigaTOR noindent nomedian nonterm noselect
614@c LocalWords: nosnarf obarray OLE OO outputfile paren parsetable POINT's
615@c LocalWords: popup positionalonly positiononly positionormarker pre
616@c LocalWords: printf printindex Programmatically pt punctuations quotemode
617@c LocalWords: ref regex regexp Regexps reparse resetfile samp sb
618@c LocalWords: scopestart SEmantic semanticdb setfilename setq
619@c LocalWords: settitle setupfunction sexp sp SPC speedbar speedbar's
620@c LocalWords: streamorbuffer struct subalist submenu submenus
621@c LocalWords: subsubsection sw sym texi texinfo titlefont titlepage
622@c LocalWords: tok TOKEN's toplevel typemodifiers uml unset untar
623@c LocalWords: uref usedb var vskip xref yak