1\input texinfo
2@setfilename ../../info/semantic
3@set TITLE Semantic Manual
4@set AUTHOR Eric M. Ludlam and David Ponce
5@settitle @value{TITLE}
6
7@c *************************************************************************
8@c @ Header
9@c *************************************************************************
10
11@c Merge all indexes into a single index for now.
12@c We can always separate them later into two or more as needed.
13@syncodeindex vr cp
14@syncodeindex fn cp
15@syncodeindex ky cp
16@syncodeindex pg cp
17@syncodeindex tp cp
18
19@c @footnotestyle separate
20@c @paragraphindent 2
21@c @@smallbook
22@c %**end of header
23
24@copying
25This manual documents the Semantic library and utilities.
26
27Copyright @copyright{} 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007,
282009 Free Software Foundation, Inc.
29
30@quotation
31Permission is granted to copy, distribute and/or modify this document
32under the terms of the GNU Free Documentation License, Version 1.3 or
33any later version published by the Free Software Foundation; with no
34Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
35and with the Back-Cover Texts as in (a) below. A copy of the license
36is included in the section entitled ``GNU Free Documentation License.''
37
38(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
39modify this GNU manual. Buying copies from the FSF supports it in
40developing GNU and promoting software freedom.''
41@end quotation
42@end copying
43
44@ifinfo
45@format
46START-INFO-DIR-ENTRY
47* Semantic: (semantic). Source code parser library and utilities.
48END-INFO-DIR-ENTRY
49@end format
50@end ifinfo
51
52@titlepage
53@center @titlefont{Semantic}
54@sp 4
55@center by @value{AUTHOR}
56@end titlepage
57@page
58
59@macro semantic{}
60@i{Semantic}
61@end macro
62
63@macro keyword{kw}
64@anchor{\kw\}
65@b{\kw\}
66@end macro
67
68@macro obsolete{old,new}
69@sp 1
70@strong{Compatibility}:
71@code{\new\} introduced in @semantic{} version 2.0 supersedes
72@code{\old\} which is now obsolete.
73@end macro
74
75@c *************************************************************************
76@c @ Document
77@c *************************************************************************
78@contents
79
80@node top
81@top @value{TITLE}
82
83@semantic{} is a suite of Emacs libraries and utilities for parsing
84source code. At its core is a lexical analyzer and two parser
85generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
86@semantic{} provides a variety of tools for making use of the parser
87output, including user commands for code navigation and completion, as
88well as enhancements for imenu, speedbar, whichfunc, eldoc,
89hippie-expand, and several other parts of Emacs.
90
91To send bug reports, or participate in discussions about semantic,
92use the mailing list cedet-semantic@@sourceforge.net via the URL:
93@url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}
94
95@ifnottex
96@insertcopying
97@end ifnottex
98
99@menu
100* Introduction::
101* Using Semantic::
102* Semantic Internals::
103* Glossary::
104* GNU Free Documentation License::
105* Index::
106@end menu
107
108@node Introduction
109@chapter Introduction
110
111This chapter gives an overview of @semantic{} and its goals.
112
113Ordinarily, Emacs uses regular expressions (and syntax tables) to
114analyze source code for purposes such as syntax highlighting. This
115approach, though simple and efficient, has its limitations: roughly
116speaking, it only ``guesses'' the meaning of each piece of source code
117in the context of the programming language, instead of rigorously
118``understanding'' it.
119
120@semantic{} provides a new infrastructure to analyze source code using
121@dfn{parsers} instead of regular expressions. It contains two
122built-in parser generators (an @acronym{LL} generator named
123@code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
124both written in Emacs Lisp), and parsers for several common
125programming languages. It can also make use of @dfn{external
126parsers}---programs such as GNU Global and GNU IDUtils.
127
128@semantic{} provides a uniform, language-independent @acronym{API} for
129accessing the parser output. This output can be used by other Emacs
130Lisp programs to implement ``syntax-aware'' behavior. @semantic{}
131itself includes several such utilities, including user-level Emacs
132commands for navigating, searching, and completing source code.
133
134The following diagram illustrates the structure of the @semantic{}
135package:
136
137@table @strong
138@item Please Note:
139The words in all-capital are those that @semantic{} itself provides.
140Others are current or future languages or applications that are not
141distributed along with @semantic{}.
142@end table
143
144@example
145 Applications
146 and
147 Utilities
148 -------
149 / \
150 +---------------+ +--------+ +--------+
151 C --->| C PARSER |--->| | | |
152 +---------------+ | | | |
153 +---------------+ | COMMON | | COMMON |<--- SPEEDBAR
154 Java --->| JAVA PARSER |--->| PARSE | | |
155 +---------------+ | TREE | | PARSE |<--- SEMANTICDB
156 +---------------+ | FORMAT | | API |<--- ecb
157 Scheme --->| SCHEME PARSER |--->| | | |
158 +---------------+ | | | |
159 +---------------+ | | | |
160 Texinfo --->| TEXI. PARSER |--->| | | |
161 +---------------+ | | | |
162
163 ... ... ... ...
164
165 +---------------+ | | | |<--- app. 1
166 Lang. A --->| A Parser |--->| | | |
167 +---------------+ | | | |<--- app. 2
168 +---------------+ | | | |
169 Lang. B --->| B Parser |--->| | | |<--- app. 3
170 +---------------+ | | | |
171
172 ... ... ... ... ...
173
174 +---------------+ | | | |
175 Lang. Y --->| Y Parser |--->| | | |<--- app. ?
176 +---------------+ | | | |
177 +---------------+ | | | |<--- app. ?
178 Lang. Z --->| Z Parser |--->| | | |
179 +---------------+ +--------+ +--------+
180@end example
181
182@menu
183* Semantic Components::
184@end menu
185
186@node Semantic Components
187@section Semantic Components
188
189In this section, we provide a more detailed description of the major
190components of @semantic{}, and how they interact with one another.
191
192The first step in parsing a source code file is to break it up into
193its fundamental components. This step is called lexical analysis:
194
195@example
196 syntax table, keywords list, and options
197 |
198 |
199 v
200 input file ----> Lexer ----> token stream
201@end example
202
203@noindent
204The output of the lexical analyzer is a list of tokens that make up
205the file. The next step is the actual parsing, shown below:
206
207@example
208 parser tables
209 |
210 v
211 token stream ---> Parser ----> parse tree
212@end example
213
214@noindent
215The end result, the parse tree, is @semantic{}'s internal
216representation of the language grammar. @semantic{} provides an
217@acronym{API} for Emacs Lisp programs to access the parse tree.
218
219Parsing large files can take several seconds or more. By default,
220@semantic{} automatically caches parse trees by saving them in your
221@file{.emacs.d} directory. When you revisit a previously-parsed file,
222the parse tree is automatically reloaded from this cache, to save
223time. @xref{SemanticDB}.
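
As a rough sketch of the two steps described above, the following
Emacs Lisp fragment runs the lexer and then the parser on the current
buffer.  It assumes that the buffer is in a major mode for which
@semantic{} has a parser and that @code{semantic-mode} is enabled.
The functions whose names begin with @code{my-} are hypothetical and
exist only for illustration.

@lisp
(require 'semantic)

(defun my-semantic-token-classes ()
  "Return the lexical class of each token in the current buffer.
Each token returned by `semantic-lex' has the form
\(CLASS START . END)."
  (mapcar #'semantic-lex-token-class
          (semantic-lex (point-min) (point-max))))

(defun my-semantic-tag-names ()
  "Return the names of the top-level tags in the current buffer.
`semantic-fetch-tags' reparses the buffer only when needed."
  (mapcar #'semantic-tag-name (semantic-fetch-tags)))
@end lisp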
224
225@node Using Semantic
226@chapter Using Semantic
227
228@include sem-user.texi
229
230@node Semantic Internals
231@chapter Semantic Internals
232
233This chapter provides an overview of the internals of @semantic{}.
234This information is not needed by either application developers or
235grammar developers.
236
237It is useful mostly for hackers who would like to learn
238more about how @semantic{} works.
239
240@menu
241* Parser code :: Code used for the parsers
242* Tag handling :: Code used for manipulating tags
243* Semanticdb Internals :: Code used in the semantic database
244* Analyzer Internals :: Code used in the code analyzer
245* Tools :: Code used in user tools
246* Tests :: Code used for testing
247@end menu
248
249@node Parser code
250@section Parser code
251
252@semantic{} parsing code is spread across a range of files.
253
254@table @file
255@item semantic.el
256The core infrastructure sets up buffers for parsing, and has all the
257core parsing routines. Most parsing routines are overloadable, so the
258actual implementation may be somewhere else.
259
260@item semantic-edit.el
261Incremental reparse based on user edits.
262
263@item semantic-grammar.el
264@itemx semantic-grammar.wy
265Parser for the different grammar languages, and a major mode for
266editing grammars in Emacs.
267
268@item semantic-lex.el
269Infrastructure for implementing lexical analyzers. Provides macros
270for creating individual analyzers for specific features, and a way to
271combine them together.
272
273@item semantic-lex-spp.el
274Infrastructure for a lexical symbolic preprocessor. This was written
275to implement the C preprocessor, but could be used for other lexical
276preprocessors.
277
278@item bovine/bovine-grammar.el
279@itemx bovine/bovine-grammar-macros.el
280@itemx bovine/semantic-bovine.el
281The ``bovine'' grammar. This is the first grammar mode written for
282@semantic{} and is useful for creating simple parsers.
283
284@item wisent/wisent.el
285@itemx wisent/bison-wisent.el
286@itemx wisent/semantic-wisent.el
287@itemx wisent/semantic-debug-grammar.el
288A port of bison to Emacs. This infrastructure lets you create LALR
289based parsers for @semantic{}.
290
291@item semantic-ast.el
292Manage Abstract Syntax Trees for parsers.
293
294@item semantic-debug.el
295Infrastructure for debugging grammars.
296
297@item semantic-util.el
298Various utilities for manipulating tags, such as describing the tag
299under point, adding labels, and the all-important
300@code{semantic-something-to-tag-table}.
301
302@end table
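
To illustrate what ``overloadable'' means here, the following is a
minimal sketch that uses the @code{mode-local} library.  It assumes
that library is available under that feature name, as in the copy of
CEDET bundled with Emacs.  The function @code{my-greeting} is
hypothetical and is not part of @semantic{}.

@lisp
(require 'mode-local)

;; Declare an overloadable function.  With an empty body, a call
;; falls back to `my-greeting-default' unless the current major
;; mode (or one of its parents) provides an override.
(define-overloadable-function my-greeting ()
  "Return a greeting appropriate for the current major mode.")

(defun my-greeting-default ()
  "Default implementation of `my-greeting'."
  "hello")

;; Mode-specific implementation, used in `emacs-lisp-mode' buffers.
(define-mode-local-override my-greeting emacs-lisp-mode ()
  "Emacs Lisp implementation of `my-greeting'."
  "hello from elisp")
@end lisp

Calling @code{(my-greeting)} in an Emacs Lisp buffer runs the
override, while other buffers run the default.  This is why the
implementation of a parsing routine may live in a language-specific
file rather than in @file{semantic.el}.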
303
304@node Tag handling
305@section Tag handling
306
307A tag represents an individual item found in a buffer, such as a
308function or variable. Tag handling is spread across several source
309files.
310
311@table @file
312@item semantic-tag.el
313Basic tag creation, queries, cloning, binding, and unbinding.
314
315@item semantic-tag-write.el
316Write a tag or tag list to a stream. These routines are used by
317@file{semanticdb-file.el} when saving a list of tags.
318
319@item semantic-tag-file.el
320Files associated with tags. Goto-tag, file for include, and file for
321a prototype.
322
323@item semantic-tag-ls.el
324Language-dependent features of a tag, such as parent calculation, slot
325protection, and other states like abstract, virtual, static, and leaf.
326
327@item semantic-dep.el
328Include file handling. Contains the include path concepts, and
329routines for looking up file names in the include path.
330
331@item semantic-format.el
332Convert a tag into a nicely formatted and colored string. Use
333@code{semantic-test-all-format-tag-functions} to test different output
334options.
335
336@item semantic-find.el
337Find tags matching different conditions in a tag table.
338These routines are used by @file{semanticdb-find.el} once the database
339has been converted into a simpler tag table.
340
341@item semantic-sort.el
342Sorting lists of tags in different ways. Includes sorting a plain
343list of tags forward or backward. Includes binning tags based on
344attributes (bucketize), and tag adoption for multiple references to
345the same thing.
346
347@item semantic-doc.el
348Capture documentation comments from near a tag.
349
350@end table
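
The following sketch shows how a few of these routines fit together.
It builds a small tag table by hand, then filters and sorts it.  The
feature names follow the layout of the copy of @semantic{} bundled
with Emacs (for example @code{semantic/find}) and may differ from the
file names listed above.  The @code{my-example-} names are
hypothetical.

@lisp
(require 'semantic)
(require 'semantic/find)
(require 'semantic/sort)

(defvar my-example-tags
  (list (semantic-tag-new-function "main" "int" nil)
        (semantic-tag-new-variable "counter" "int" "0")
        (semantic-tag-new-function "helper" "void" nil))
  "A hand-built tag table, used for illustration only.")

(defun my-example-function-names (tags)
  "Return the names of the function tags in TAGS, sorted by name."
  (mapcar #'semantic-tag-name
          (semantic-sort-tags-by-name-increasing
           (semantic-find-tags-by-class 'function tags))))

;; (my-example-function-names my-example-tags)
;;   => ("helper" "main")
@end lisp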
351
352@node Semanticdb Internals
353@section Semanticdb Internals
354
355@acronym{Semanticdb} complexity is certainly an issue. It is a rather
356hairy problem to try and solve.
357
358@table @file
359@item semanticdb.el
360Defines a @dfn{database} and a @dfn{table} base class. You can
361instantiate these classes, and use them, but they are not persistent.
362
363This file also provides support for @code{semanticdb-minor-mode},
364which automatically associates files with tables in databases so that
365tags are @emph{saved} while a buffer is not in memory.
366
367The database and tables both also provide application cache information
368and a cache flushing system. The semanticdb search routines use caches
369to save data structures that are complex to calculate.
370
371Lastly, it provides the concept of @dfn{project root}. It is a system
372by which a file can be associated with the root of a project, so if
373you have a tree of directories and source files, it can find the root,
374and allow a tag-search to span all available databases in that
375directory hierarchy.
376
377@item semanticdb-file.el
378Provides a subclass of the basic table so that it can be saved to
379disk. Implements all the code needed to unbind/rebind tags to a
380buffer and writing them to a file.
381
382@item semanticdb-el.el
383Implements a special kind of @dfn{system} database that uses Emacs
384internals to perform queries.
385
386@item semanticdb-ebrowse.el
387Implements a system database that uses Ebrowse to parse files into a
388table that can be queried for tag names. Successful tag hits during a
389find cause @semantic{} to pick up and parse the reference files to
390get the full details.
391
392@item semanticdb-find.el
393Infrastructure for searching groups of @semantic{} databases, and dealing
394with the search results format.
395
396@item semanticdb-ref.el
397Tracks cross-references. Cross-references are needed when a buffer is
398reparsed and must alert other tables that any dependent caches may
399need to be flushed. References are in the form of include files.
400
401@end table
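
As a hedged example of the search routines and their results format,
the sketch below looks up tags by name through the database and then
flattens the results into a plain tag list.  It assumes that
@code{semanticdb-minor-mode} is active in the current buffer and that
the search code is available as the @code{semantic/db-find} feature,
as in the copy bundled with Emacs.  The function name
@code{my-semanticdb-find-by-name} is hypothetical.

@lisp
(require 'semantic)
(require 'semantic/db-find)

(defun my-semanticdb-find-by-name (name)
  "Return a plain list of tags called NAME found through semanticdb.
The search starts from the current buffer's table and follows its
include path."
  ;; The raw find results are grouped by database table;
  ;; stripping them yields an ordinary list of tags.
  (semanticdb-strip-find-results
   (semanticdb-find-tags-by-name name)))
@end lisp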
402
403@node Analyzer Internals
404@section Analyzer Internals
405
406The @semantic{} analyzer is a complex engine which has been broken
407down across several modules. When the @semantic{} analyzer fails,
408start with @code{semantic-analyze-debug-assist}, then dive into some
409of these files.
410
411@table @file
412@item semantic-analyze.el
413The core analyzer for defining the @dfn{current context}. The
414current context is an object that contains references to aspects of
415the local context including the current prefix, and a tag list
416defining what the prefix means.
417
418@item semantic-analyze-complete.el
419Provides @code{semantic-analyze-possible-completions}.
420
421@item semantic-analyze-debug.el
422The analyzer debugger. Useful when attempting to get everything
423configured.
424
425@item semantic-analyze-fcn.el
426Various support functions needed by the analyzer.
427
428@item semantic-ctxt.el
429Local context parser. Contains overloadable functions used to move
430around through different scopes, get local variables, and collect the
431current prefix used when doing completion.
432
433@item semantic-scope.el
434Calculate @dfn{scope} for a location in a buffer. The scope includes
435local variables, and tag lists in scope for various reasons, such as
436C++ using statements.
437
438@item semanticdb-typecache.el
439The typecache is part of @code{semanticdb}, but is used primarily by
440the analyzer to look up datatypes and complex names. The typecache is
441bound across source files and builds a master lookup table for data
442type names.
443
444@item semantic-ia.el
445Interactive Analyzer functions. Simple routines that do completion or
446lookups based on the results from the Analyzer. These routines are
447meant as examples for application writers, but are quite useful as
448they are.
449
450@item semantic-ia-sb.el
451Speedbar support for the analyzer, displaying context info, and
452completion lists.
453
454@end table
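
The sketch below shows one way an application might drive the
analyzer: compute the current context at point, then ask for possible
completions.  It assumes a buffer that @semantic{} can parse and the
feature names used by the copy bundled with Emacs
(@code{semantic/analyze} and @code{semantic/analyze/complete}).  The
function name @code{my-semantic-completion-names} is hypothetical.

@lisp
(require 'semantic)
(require 'semantic/analyze)
(require 'semantic/analyze/complete)

(defun my-semantic-completion-names ()
  "Return names of tags that could complete the symbol at point.
Return nil if the analyzer cannot work out a context or signals
an error, for example when there is nothing to complete."
  (condition-case nil
      (let ((context (semantic-analyze-current-context (point))))
        (when context
          (mapcar #'semantic-tag-name
                  (semantic-analyze-possible-completions context))))
    (error nil)))
@end lisp

When this returns nil unexpectedly,
@code{semantic-analyze-debug-assist} is the place to start looking.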
455
456@node Tools
457@section Tools
458
459These files contain various tools a user can use.
460
461@table @file
462@item semantic-idle.el
463Idle scheduler for @semantic{}. Manages reparsing buffers after
464edits, and large work tasks in idle time. Includes modes for showing
465summary help and pop-up completion.
466
467@item senator.el
468The @semantic{} navigator. Provides many ways to move through a
469buffer based on the active tag table.
470
471@item semantic-decorate.el
472A minor mode for decorating tags based on details from the parser.
473Includes overlines for functions, or coloring class fields based on
474protection.
475
476@item semantic-decorate-include.el
477A decoration mode for include files, which assists users in setting up
478parsing for their includes.
479
480@item semantic-complete.el
481Advanced completion prompts for reading tag names in the minibuffer, or
482inline in a buffer.
483
484@item semantic-imenu.el
485Imenu support for using @semantic{} tags in imenu.
486
487@item semantic-mru-bookmark.el
488Automatic bookmarking based on tags. Jump to locations you've been
489before based on tag name.
490
491@item semantic-sb.el
492Support for @semantic{} tag usage in Speedbar.
493
494@item semantic-util-modes.el
495A bunch of small minor modes that expose aspects of the semantic
496parser state. Includes @code{semantic-stickyfunc-mode}.
497
498@item document.el
499@itemx document-vars.el
500Create and update comments for tags.
501
502@item semantic-adebug.el
503Extensions of @file{data-debug.el} for @semantic{}.
504
505@item semantic-chart.el
506Draw some charts from stats generated from parsing.
507
508
509@item semantic-elp.el
510Profiler for helping to optimize the @semantic{} analyzer.
511
512
513@end table
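
Several of these tools are minor modes, so a user typically turns
them on from an init file.  The fragment below is one possible setup,
assuming the @code{semantic-mode} command and the
@code{semantic-default-submodes} variable provided by the copy of
@semantic{} bundled with Emacs; the selection of submodes is only an
example.

@lisp
;; Choose which auxiliary modes start along with Semantic.
(setq semantic-default-submodes
      '(global-semantic-idle-scheduler-mode   ; reparse when idle
        global-semanticdb-minor-mode          ; cache tags on disk
        global-semantic-idle-summary-mode     ; summary help
        global-semantic-stickyfunc-mode))     ; sticky function header

;; Enable Semantic in all buffers whose major mode has a parser.
(semantic-mode 1)
@end lisp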
514
515@node Tests
516@section Tests
517
518@table @file
519
520@item semantic-utest.el
521Basic testing of parsing and incremental parsing for most supported
522languages.
523
524@item semantic-ia-utest.el
525Test the semantic analyzer's ability to provide smart completions.
526
527@item semantic-utest-c.el
528Tests for the C parser's lexical pre-processor.
529
530@item semantic-regtest.el
531Regression tests from the older Semantic 1.x API.
532
533@end table
534
535@node Glossary
536@appendix Glossary
537
538@table @keyword
539@item BNF
540In semantic 1.4, a BNF file represented ``Bovine Normal Form'', the
541grammar file used for the 1.4 parser generator. This was a play on
542Backus-Naur Form which proved too confusing.
543
544@item bovinate
545A verb representing what happens when a bovine parser parses a file.
546
547@item bovine lambda
548In a bovine, or LL parser, the bovine lambda is a function to execute
549when a specific set of match rules has succeeded in matching text from
550the buffer.
551
552@item bovine parser
553A parser using the bovine parser generator. It is an LL parser
554suitible for small simple languages.
555
556@item context
557
558@item LALR
559
560@item lexer
561A program which converts text into a stream of tokens by analyzing
562it lexically. Lexers will commonly create strings, symbols,
563keywords and punctuation, and strip whitespace and comments.
564
565@item LL
566
567@item nonterminal
568A nonterminal symbol or simply a nonterminal stands for a class of
569syntactically equivalent groupings. A nonterminal symbol name is used
570in writing grammar rules.
571
572@item overloadable
573Some functions are defined via @code{define-overload}.
574These can be overloaded via ....
575
576@item parser
577A program that converts @b{tokens} to @b{tags}.
578
579@item tag
580A tag is a representation of some entity in a language file, such as a
581function, variable, or include statement. In semantic, the word tag is
582used the same way it is used for the etags or ctags tools.
583
584A tag is usually bound to a buffer region via overlay, or it just
585specifies character locations in a file.
586
587@item token
588A single atomic item returned from a lexer. It represents some set
589of characters found in a buffer.
590
591@item token stream
592The output of the lexer as well as the input to the parser.
593
594@item wisent parser
595A parser using the wisent parser generator. It is a port of bison to
596Emacs Lisp. It is an LALR parser suitable for complex languages.
597@end table
598
599
600@node GNU Free Documentation License
601@appendix GNU Free Documentation License
602@include doclicense.texi
603
604@node Index
605@unnumbered Index
606@printindex cp
607
608@iftex
609@contents
610@summarycontents
611@end iftex
612
613@bye
614
615@c Following comments are for the benefit of ispell.
616
617@c LocalWords: alist API APIs arg argc args argv asis assoc autoload Wisent
618@c LocalWords: backquote bnf bovinate bovinates LALR
619@c LocalWords: bovinating bovination bovinator bucketize
620@c LocalWords: cb cdr charquote checkcache cindex CLOS
621@c LocalWords: concat concocting const constantness ctxt Decl defcustom
622@c LocalWords: deffn deffnx defun defvar destructor's dfn diff dir
623@c LocalWords: doc docstring EDE EIEIO elisp emacsman emph enum
624@c LocalWords: eq Exp EXPANDFULL expresssion fn foo func funcall
625@c LocalWords: ia ids iff ifinfo imenu imenus init int isearch itemx java kbd
626@c LocalWords: keymap keywordtable lang languagemode lexer lexing Ludlam
627@c LocalWords: menubar metaparent metaparents min minibuffer Misc mode's
628@c LocalWords: multitable NAvigaTOR noindent nomedian nonterm noselect
629@c LocalWords: nosnarf obarray OLE OO outputfile paren parsetable POINT's
630@c LocalWords: popup positionalonly positiononly positionormarker pre
631@c LocalWords: printf printindex Programmatically pt punctuations quotemode
632@c LocalWords: ref regex regexp Regexps reparse resetfile samp sb
633@c LocalWords: scopestart SEmantic semanticdb setfilename setq
634@c LocalWords: settitle setupfunction sexp sp SPC speedbar speedbar's
635@c LocalWords: streamorbuffer struct subalist submenu submenus
636@c LocalWords: subsubsection sw sym texi texinfo titlefont titlepage
637@c LocalWords: tok TOKEN's toplevel typemodifiers uml unset untar
638@c LocalWords: uref usedb var vskip xref yak
639
640@ignore
641 arch-tag: cbc6e78c-4ff1-410e-9fc7-936487e39bbf
642@end ignore