\input texinfo
@setfilename ../../info/semantic
@set TITLE Semantic Manual
@set AUTHOR Eric M. Ludlam, David Ponce, and Richard Y. Kim
@settitle @value{TITLE}

@c *************************************************************************
@c @ Header
@c *************************************************************************

@c Merge all indexes into a single index for now.
@c We can always separate them later into two or more as needed.
@syncodeindex vr cp
@syncodeindex fn cp
@syncodeindex ky cp
@syncodeindex pg cp
@syncodeindex tp cp

@c @footnotestyle separate
@c @paragraphindent 2
@c @@smallbook
@c %**end of header

@copying
This manual documents the Semantic library and utilities.

Copyright @copyright{} 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007,
2009, 2010 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License.''

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''
@end quotation
@end copying

@dircategory Emacs
@direntry
* Semantic: (semantic).       Source code parser library and utilities.
@end direntry

@titlepage
@center @titlefont{Semantic}
@sp 4
@center by @value{AUTHOR}
@end titlepage
@page

@macro semantic{}
@i{Semantic}
@end macro

@macro keyword{kw}
@anchor{\kw\}
@b{\kw\}
@end macro

@macro obsolete{old,new}
@sp 1
@strong{Compatibility}:
@code{\new\} introduced in @semantic{} version 2.0 supersedes
@code{\old\} which is now obsolete.
@end macro

@c *************************************************************************
@c @ Document
@c *************************************************************************
@contents

@node top
@top @value{TITLE}

@semantic{} is a suite of Emacs libraries and utilities for parsing
source code. At its core is a lexical analyzer and two parser
generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
@semantic{} provides a variety of tools for making use of the parser
output, including user commands for code navigation and completion, as
well as enhancements for imenu, speedbar, whichfunc, eldoc,
hippie-expand, and several other parts of Emacs.

To send bug reports, or participate in discussions about semantic,
use the mailing list cedet-semantic@@sourceforge.net via the URL:
@url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}

@ifnottex
@insertcopying
@end ifnottex

@menu
* Introduction::
* Using Semantic::
* Semantic Internals::
* Glossary::
* GNU Free Documentation License::
* Index::
@end menu

@node Introduction
@chapter Introduction

This chapter gives an overview of @semantic{} and its goals.

Ordinarily, Emacs uses regular expressions (and syntax tables) to
analyze source code for purposes such as syntax highlighting. This
approach, though simple and efficient, has its limitations: roughly
speaking, it only ``guesses'' the meaning of each piece of source code
in the context of the programming language, instead of rigorously
``understanding'' it.

@semantic{} provides a new infrastructure to analyze source code using
@dfn{parsers} instead of regular expressions. It contains two
built-in parser generators (an @acronym{LL} generator named
@code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
both written in Emacs Lisp), and parsers for several common
programming languages. It can also make use of @dfn{external
parsers}---programs such as GNU Global and GNU IDUtils.

@semantic{} provides a uniform, language-independent @acronym{API} for
accessing the parser output. This output can be used by other Emacs
Lisp programs to implement ``syntax-aware'' behavior. @semantic{}
itself includes several such utilities, including user-level Emacs
commands for navigating, searching, and completing source code.

The following diagram illustrates the structure of the @semantic{}
package:

@table @strong
@item Please Note:
The words in all capitals are those that @semantic{} itself provides.
Others are current or future languages or applications that are not
distributed along with @semantic{}.
@end table

@example
                                                             Applications
                                                                 and
                                                              Utilities
                                                                -------
                                                               /       \
               +---------------+    +--------+    +--------+
         C --->| C      PARSER |--->|        |    |        |
               +---------------+    |        |    |        |
               +---------------+    | COMMON |    | COMMON |<--- SPEEDBAR
      Java --->| JAVA   PARSER |--->| PARSE  |    |        |
               +---------------+    | TREE   |    | PARSE  |<--- SEMANTICDB
               +---------------+    | FORMAT |    | API    |
    Scheme --->| SCHEME PARSER |--->|        |    |        |<--- ecb
               +---------------+    |        |    |        |
               +---------------+    |        |    |        |
   Texinfo --->| TEXI.  PARSER |--->|        |    |        |
               +---------------+    |        |    |        |

                    ...                ...           ...         ...

               +---------------+    |        |    |        |
   Lang. Y --->| Y      Parser |--->|        |    |        |<--- app. ?
               +---------------+    |        |    |        |
               +---------------+    |        |    |        |<--- app. ?
   Lang. Z --->| Z      Parser |--->|        |    |        |
               +---------------+    +--------+    +--------+
@end example

@menu
* Semantic Components::
@end menu

@node Semantic Components
@section Semantic Components

In this section, we provide a more detailed description of the major
components of @semantic{}, and how they interact with one another.

The first step in parsing a source code file is to break it up into
its fundamental components. This step is called lexical analysis:

@example
        syntax table, keywords list, and options
                         |
                         |
                         v
    input file  ---->  Lexer   ----> token stream
@end example

@noindent
The output of the lexical analyzer is a list of tokens that make up
the file. The next step is the actual parsing, shown below:

@example
                    parser tables
                         |
                         v
    token stream --->  Parser  ----> parse tree
@end example

@noindent
The end result, the parse tree, is @semantic{}'s internal
representation of the language grammar. @semantic{} provides an
@acronym{API} for Emacs Lisp programs to access the parse tree.
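
As a rough illustration of these two steps, the following sketch runs
the lexer and the parser on the current buffer and reports the first
few tokens together with the top-level tags. It assumes the library
names used by the copy of @semantic{} bundled with Emacs
(@file{semantic} and @file{semantic/lex}), that @code{semantic-mode}
is enabled, and that the buffer's major mode has a parser. The command
name @code{my-semantic-demo-pipeline} is hypothetical.

@example
(require 'cl-lib)
(require 'semantic)
(require 'semantic/lex)

(defun my-semantic-demo-pipeline ()
  "Report some lexical tokens and the top-level tags of this buffer.
Hypothetical helper; assumes `semantic-mode' is enabled here."
  (interactive)
  (let* (;; Step 1: lexical analysis yields a list of tokens.
         (tokens (semantic-lex (point-min) (point-max)))
         ;; Step 2: parsing yields the list of top-level tags.
         (tags (semantic-fetch-tags))
         (lines
          (append
           ;; Show at most the first ten tokens.
           (cl-loop for tok in tokens repeat 10
                    collect (format "token %-14s %S"
                                    (semantic-lex-token-class tok)
                                    (semantic-lex-token-text tok)))
           (mapcar (lambda (tag)
                     (format "tag   %-14s %s"
                             (semantic-tag-class tag)
                             (semantic-tag-name tag)))
                   tags))))
    (message "%s" (mapconcat #'identity lines "\n"))))
@end example

@noindent
Running @kbd{M-x my-semantic-demo-pipeline} in a source buffer should
list token classes such as @code{symbol} or @code{punctuation},
followed by tag classes such as @code{function} or @code{variable};
the exact classes depend on the language.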

Parsing large files can take several seconds or more. By default,
@semantic{} automatically caches parse trees by saving them in your
@file{.emacs.d} directory. When you revisit a previously-parsed file,
the parse tree is automatically reloaded from this cache, to save
time. @xref{SemanticDB}.

@node Using Semantic
@chapter Using Semantic

@include sem-user.texi

@node Semantic Internals
@chapter Semantic Internals

This chapter provides an overview of the internals of @semantic{}.
This information is usually not needed by application developers or
grammar developers; it is useful mostly for the hackers who would like
to learn more about how @semantic{} works.

@menu
* Parser code::                 Code used for the parsers
* Tag handling::                Code used for manipulating tags
* Semanticdb Internals::        Code used in the semantic database
* Analyzer Internals::          Code used in the code analyzer
* Tools::                       Code used in user tools
* Tests::                       Code used for testing
@end menu

@node Parser code
@section Parser code

@semantic{} parsing code is spread across a range of files.

@table @file
@item semantic.el
The core infrastructure sets up buffers for parsing, and has all the
core parsing routines. Most parsing routines are overloadable, so the
actual implementation may be somewhere else.

@item semantic-edit.el
Incremental reparse based on user edits.

@item semantic-grammar.el
@itemx semantic-grammar.wy
Parser for the different grammar languages, and a major mode for
editing grammars in Emacs.

@item semantic-lex.el
Infrastructure for implementing lexical analyzers. Provides macros
for creating individual analyzers for specific features, and a way to
combine them together.

@item semantic-lex-spp.el
Infrastructure for a lexical symbolic preprocessor. This was written
to implement the C preprocessor, but could be used for other lexical
preprocessors.

@item bovine/bovine-grammar.el
@itemx bovine/bovine-grammar-macros.el
@itemx bovine/semantic-bovine.el
The ``bovine'' grammar. This is the first grammar mode written for
@semantic{} and is useful for creating simple parsers.

@item wisent/wisent.el
@itemx wisent/bison-wisent.el
@itemx wisent/semantic-wisent.el
@itemx wisent/semantic-debug-grammar.el
A port of bison to Emacs. This infrastructure lets you create
LALR-based parsers for @semantic{}.

@item semantic-ast.el
Manage Abstract Syntax Trees for parsers.

@item semantic-debug.el
Infrastructure for debugging grammars.

@item semantic-util.el
Various utilities for manipulating tags, such as describing the tag
under point, adding labels, and the all-important
@code{semantic-something-to-tag-table}.

@end table
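
Since most parsing routines are overloadable, it may help to see the
general shape of an overloadable function. The following is a minimal
sketch, assuming the @file{mode-local} library shipped with the copy of
@semantic{} bundled with Emacs. The function
@code{my-semantic-demo-describe} is hypothetical and is not part of
@semantic{}.

@example
(require 'mode-local)

;; Declare an overloadable function.  With an empty body, a call
;; falls through to `my-semantic-demo-describe-default' unless the
;; current major mode provides an override.
(define-overload my-semantic-demo-describe (thing)
  "Return a short description of THING.
Hypothetical function, for illustration only.")

(defun my-semantic-demo-describe-default (thing)
  "Generic implementation of `my-semantic-demo-describe'."
  (format "generic: %s" thing))

;; Override used only in buffers whose major mode is
;; `emacs-lisp-mode' (or a mode derived from it).
(define-mode-local-override my-semantic-demo-describe
  emacs-lisp-mode (thing)
  "Emacs Lisp flavor of `my-semantic-demo-describe'."
  (format "elisp: %s" thing))
@end example

@noindent
Calling @code{my-semantic-demo-describe} in an Emacs Lisp buffer
selects the override, while other buffers get the default. This is why
the implementation of an overloadable routine may live in a
language-specific file instead of @file{semantic.el}.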

@node Tag handling
@section Tag handling

A tag represents an individual item found in a buffer, such as a
function or variable. Tag handling is implemented across several
source files.

@table @file
@item semantic-tag.el
Basic tag creation, queries, cloning, binding, and unbinding.

@item semantic-tag-write.el
Write a tag or tag list to a stream. These routines are used by
@file{semanticdb-file.el} when saving a list of tags.

@item semantic-tag-file.el
Files associated with tags. Goto-tag, file for include, and file for
a prototype.

@item semantic-tag-ls.el
Language-dependent features of a tag, such as parent calculation, slot
protection, and other states like abstract, virtual, static, and leaf.

@item semantic-dep.el
Include file handling. Contains the include path concepts, and
routines for looking up file names in the include path.

@item semantic-format.el
Convert a tag into a nicely formatted and colored string. Use
@code{semantic-test-all-format-tag-functions} to test different output
options.

@item semantic-find.el
Find tags matching different conditions in a tag table.
These routines are used by @file{semanticdb-find.el} once the database
has been converted into a simpler tag table.

@item semantic-sort.el
Sorting lists of tags in different ways. Includes sorting a plain
list of tags forward or backward. Includes binning tags based on
attributes (bucketize), and tag adoption for multiple references to
the same thing.

@item semantic-doc.el
Capture documentation comments from near a tag.

@end table
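
The following sketch shows how the find and format routines might be
combined: it collects the function tags of the current buffer and
prints a prototype for each. It assumes the library names of the copy
of @semantic{} bundled with Emacs (@file{semantic/find} and
@file{semantic/format}) and a buffer in which @code{semantic-mode} is
active. The command name @code{my-semantic-demo-list-functions} is
hypothetical.

@example
(require 'semantic)
(require 'semantic/find)
(require 'semantic/format)

(defun my-semantic-demo-list-functions ()
  "Show a prototype for every function tag in the current buffer.
Hypothetical helper; assumes `semantic-mode' is enabled here."
  (interactive)
  (let ((functions
         ;; Keep only the tags whose class is `function'.
         (semantic-find-tags-by-class 'function
                                      (semantic-fetch-tags))))
    (message "%s"
             (if functions
                 (mapconcat (lambda (tag)
                              (semantic-format-tag-prototype tag))
                            functions "\n")
               "No function tags found"))))
@end example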

@node Semanticdb Internals
@section Semanticdb Internals

@acronym{Semanticdb} complexity is certainly an issue. It is a rather
hairy problem to try and solve.

@table @file
@item semanticdb.el
Defines a @dfn{database} and a @dfn{table} base class. You can
instantiate these classes, and use them, but they are not persistent.

This file also provides support for @code{semanticdb-minor-mode},
which automatically associates files with tables in databases so that
tags are @emph{saved} while a buffer is not in memory.

The database and tables both also provide application cache
information and a cache flushing system. The semanticdb search
routines use caches to save data structures that are complex to
calculate.

Lastly, it provides the concept of @dfn{project root}. It is a system
by which a file can be associated with the root of a project, so if
you have a tree of directories and source files, it can find the root,
and allow a tag-search to span all available databases in that
directory hierarchy.

@item semanticdb-file.el
Provides a subclass of the basic table so that it can be saved to
disk. Implements all the code needed to unbind/rebind tags to a
buffer and to write them to a file.

@item semanticdb-el.el
Implements a special kind of @dfn{system} database that uses Emacs
internals to perform queries.

@item semanticdb-ebrowse.el
Implements a system database that uses Ebrowse to parse files into a
table that can be queried for tag names. Successful tag hits during a
find cause @semantic{} to pick up and parse the referenced files to
get the full details.

@item semanticdb-find.el
Infrastructure for searching groups of @semantic{} databases, and
dealing with the search results format.

@item semanticdb-ref.el
Tracks cross references. Cross references are needed when a buffer is
reparsed, and must alert other tables that any dependent caches may
need to be flushed. References are in the form of include files.

@end table
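
As an illustration of the search routines and of the search results
format, the sketch below looks up a tag name across the databases
reachable from the current buffer and then flattens the result into a
plain tag list. It assumes the library names of the copy of
@semantic{} bundled with Emacs (@file{semantic/db} and
@file{semantic/db-find}) and that @code{semantic-mode}, which normally
enables the database, is active. The command name
@code{my-semantic-demo-db-lookup} is hypothetical.

@example
(require 'semantic)
(require 'semantic/db)
(require 'semantic/db-find)

(defun my-semantic-demo-db-lookup (name)
  "Count the tags called NAME visible from the current buffer.
Hypothetical helper; assumes the semantic database is enabled."
  (interactive "sTag name: ")
  (let* (;; The raw result groups the matching tags by table.
         (results (semanticdb-find-tags-by-name name))
         ;; Strip the table information to get a plain tag list.
         (tags (semanticdb-strip-find-results results)))
    (message "%d tag(s) named %s" (length tags) name)))
@end example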

@node Analyzer Internals
@section Analyzer Internals

The @semantic{} analyzer is a complex engine which has been broken
down across several modules. When the @semantic{} analyzer fails,
start with @code{semantic-analyze-debug-assist}, then dive into some
of these files.

@table @file
@item semantic-analyze.el
The core analyzer for defining the @dfn{current context}. The
current context is an object that contains references to aspects of
the local context including the current prefix, and a tag list
defining what the prefix means.

@item semantic-analyze-complete.el
Provides @code{semantic-analyze-possible-completions}.

@item semantic-analyze-debug.el
The analyzer debugger. Useful when attempting to get everything
configured.

@item semantic-analyze-fcn.el
Various support functions needed by the analyzer.

@item semantic-ctxt.el
Local context parser. Contains overloadable functions used to move
around through different scopes, get local variables, and collect the
current prefix used when doing completion.

@item semantic-scope.el
Calculate @dfn{scope} for a location in a buffer. The scope includes
local variables, and tag lists in scope for various reasons, such as
C++ using statements.

@item semanticdb-typecache.el
The typecache is part of @code{semanticdb}, but is used primarily by
the analyzer to look up datatypes and complex names. The typecache is
bound across source files and builds a master lookup table for data
type names.

@item semantic-ia.el
Interactive Analyzer functions. Simple routines that do completion or
lookups based on the results from the Analyzer. These routines are
meant as examples for application writers, but are quite useful as
they are.

@item semantic-ia-sb.el
Speedbar support for the analyzer, displaying context info, and
completion lists.

@end table
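
In the spirit of the routines in @file{semantic-ia.el}, the following
sketch asks the analyzer for the current context at point and lists
the names of the possible completions. It assumes the library names of
the copy of @semantic{} bundled with Emacs (@file{semantic/analyze} and
@file{semantic/analyze/complete}) and a buffer in which
@code{semantic-mode} is active. The command name
@code{my-semantic-demo-completions} is hypothetical.

@example
(require 'semantic)
(require 'semantic/analyze)
(require 'semantic/analyze/complete)

(defun my-semantic-demo-completions ()
  "Show the names of the analyzer's completions at point.
Hypothetical helper; assumes `semantic-mode' is enabled here."
  (interactive)
  (let* (;; The context object describes the prefix at point.
         (context (semantic-analyze-current-context (point)))
         ;; The completions come back as a list of tags.
         (tags (and context
                    (semantic-analyze-possible-completions context))))
    (message "%s"
             (if tags
                 (mapconcat #'semantic-tag-name tags " ")
               "No completions found"))))
@end example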

@node Tools
@section Tools

These files contain various tools a user can use.

@table @file
@item semantic-idle.el
Idle scheduler for @semantic{}. Manages reparsing buffers after
edits, and large work tasks in idle time. Includes modes for showing
summary help and pop-up completion.

@item senator.el
The @semantic{} navigator. Provides many ways to move through a
buffer based on the active tag table.

@item semantic-decorate.el
A minor mode for decorating tags based on details from the parser.
Includes overlines for functions, or coloring class fields based on
protection.

@item semantic-decorate-include.el
A decoration mode for include files, which assists users in setting up
parsing for their includes.

@item semantic-complete.el
Advanced completion prompts for reading tag names in the minibuffer, or
inline in a buffer.

@item semantic-imenu.el
Imenu support for using @semantic{} tags in imenu.

@item semantic-mru-bookmark.el
Automatic bookmarking based on tags. Jump to locations you've been
before based on tag name.

@item semantic-sb.el
Support for @semantic{} tag usage in Speedbar.

@item semantic-util-modes.el
A bunch of small minor modes that expose aspects of the semantic
parser state. Includes @code{semantic-stickyfunc-mode}.

@item document.el
@itemx document-vars.el
Create and update comments for tags.

@item semantic-adebug.el
Extensions of @file{data-debug.el} for @semantic{}.

@item semantic-chart.el
Draw some charts from stats generated from parsing.

@item semantic-elp.el
Profiler for helping to optimize the @semantic{} analyzer.

@end table

@node Tests
@section Tests

@table @file

@item semantic-utest.el
Basic testing of parsing and incremental parsing for most supported
languages.

@item semantic-ia-utest.el
Test the semantic analyzer's ability to provide smart completions.

@item semantic-utest-c.el
Tests for the C parser's lexical pre-processor.

@item semantic-regtest.el
Regression tests from the older Semantic 1.x API.

@end table

@node Glossary
@appendix Glossary

@table @keyword
@item BNF
In semantic 1.4, a BNF file represented ``Bovine Normal Form'', the
grammar file used for the 1.4 parser generator. This was a play on
Backus-Naur Form which proved too confusing.

@item bovinate
A verb representing what happens when a bovine parser parses a file.

@item bovine lambda
In a bovine, or LL parser, the bovine lambda is a function to execute
when a specific set of match rules has succeeded in matching text from
the buffer.

@item bovine parser
A parser using the bovine parser generator. It is an LL parser
suitable for small simple languages.

@item context

@item LALR

@item lexer
A program which converts text into a stream of tokens by analyzing
it lexically. Lexers will commonly create strings, symbols,
keywords and punctuation, and strip whitespace and comments.

@item LL

@item nonterminal
A nonterminal symbol or simply a nonterminal stands for a class of
syntactically equivalent groupings. A nonterminal symbol name is used
in writing grammar rules.

@item overloadable
Some functions are defined via @code{define-overload}.
These can be overloaded via ....

@item parser
A program that converts @b{tokens} to @b{tags}.

@item tag
A tag is a representation of some entity in a language file, such as a
function, variable, or include statement. In semantic, the word tag is
used the same way it is used for the etags or ctags tools.

A tag is usually bound to a buffer region via overlay, or it just
specifies character locations in a file.

@item token
A single atomic item returned from a lexer. It represents some set
of characters found in a buffer.

@item token stream
The output of the lexer as well as the input to the parser.

@item wisent parser
A parser using the wisent parser generator. It is a port of bison to
Emacs Lisp. It is an LALR parser suitable for complex languages.
@end table


@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@node Index
@unnumbered Index
@printindex cp

@iftex
@contents
@summarycontents
@end iftex

@bye

@c Following comments are for the benefit of ispell.

@c LocalWords: alist API APIs arg argc args argv asis assoc autoload Wisent
@c LocalWords: backquote bnf bovinate bovinates LALR
@c LocalWords: bovinating bovination bovinator bucketize
@c LocalWords: cb cdr charquote checkcache cindex CLOS
@c LocalWords: concat concocting const constantness ctxt Decl defcustom
@c LocalWords: deffn deffnx defun defvar destructor's dfn diff dir
@c LocalWords: doc docstring EDE EIEIO elisp emacsman emph enum
@c LocalWords: eq Exp EXPANDFULL expresssion fn foo func funcall
@c LocalWords: ia ids iff ifinfo imenu imenus init int isearch itemx java kbd
@c LocalWords: keymap keywordtable lang languagemode lexer lexing Ludlam
@c LocalWords: menubar metaparent metaparents min minibuffer Misc mode's
@c LocalWords: multitable NAvigaTOR noindent nomedian nonterm noselect
@c LocalWords: nosnarf obarray OLE OO outputfile paren parsetable POINT's
@c LocalWords: popup positionalonly positiononly positionormarker pre
@c LocalWords: printf printindex Programmatically pt punctuations quotemode
@c LocalWords: ref regex regexp Regexps reparse resetfile samp sb
@c LocalWords: scopestart SEmantic semanticdb setfilename setq
@c LocalWords: settitle setupfunction sexp sp SPC speedbar speedbar's
@c LocalWords: streamorbuffer struct subalist submenu submenus
@c LocalWords: subsubsection sw sym texi texinfo titlefont titlepage
@c LocalWords: tok TOKEN's toplevel typemodifiers uml unset untar
@c LocalWords: uref usedb var vskip xref yak

@ignore
 arch-tag: cbc6e78c-4ff1-410e-9fc7-936487e39bbf
@end ignore