1\input texinfo
2@setfilename ../../info/semantic
3@set TITLE Semantic Manual
4@set AUTHOR Eric M. Ludlam and David Ponce
5@settitle @value{TITLE}
6
7@c *************************************************************************
8@c @ Header
9@c *************************************************************************
10
11@c Merge all indexes into a single index for now.
12@c We can always separate them later into two or more as needed.
13@syncodeindex vr cp
14@syncodeindex fn cp
15@syncodeindex ky cp
16@syncodeindex pg cp
17@syncodeindex tp cp
18
19@c @footnotestyle separate
20@c @paragraphindent 2
21@c @@smallbook
22@c %**end of header
23
24@copying
25This manual documents the Semantic library and utilities.
26
27Copyright @copyright{} 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2007,
282009 Free Software Foundation, Inc.
29
30@quotation
31Permission is granted to copy, distribute and/or modify this document
32under the terms of the GNU Free Documentation License, Version 1.3 or
33any later version published by the Free Software Foundation; with no
34Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
35and with the Back-Cover Texts as in (a) below. A copy of the license
36is included in the section entitled ``GNU Free Documentation License.''
37
38(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
39modify this GNU manual. Buying copies from the FSF supports it in
40developing GNU and promoting software freedom.''
41@end quotation
42@end copying
43
44@ifinfo
45@format
46START-INFO-DIR-ENTRY
47* Semantic: (semantic). Source code parser library and utilities.
48END-INFO-DIR-ENTRY
49@end format
50@end ifinfo
51
52@titlepage
53@center @titlefont{Semantic}
54@sp 4
55@center by @value{AUTHOR}
56@end titlepage
57@page
58
59@macro semantic{}
60@i{Semantic}
61@end macro
62
63@macro keyword{kw}
64@anchor{\kw\}
65@b{\kw\}
66@end macro
67
68@macro obsolete{old,new}
69@sp 1
70@strong{Compatibility}:
71@code{\new\}, introduced in @semantic{} version 2.0, supersedes
72@code{\old\}, which is now obsolete.
73@end macro
74
75@c *************************************************************************
76@c @ Document
77@c *************************************************************************
78@contents
79
80@node top
81@top @value{TITLE}
82
83@semantic{} is a suite of Emacs libraries and utilities for parsing
84source code. At its core is a lexical analyzer and two parser
85generators (@code{bovinator} and @code{wisent}) written in Emacs Lisp.
86@semantic{} provides a variety of tools for making use of the parser
87output, including user commands for code navigation and completion, as
88well as enhancements for imenu, speedbar, whichfunc, eldoc,
89hippie-expand, and several other parts of Emacs.
90
91To send bug reports, or participate in discussions about semantic,
92use the mailing list cedet-semantic@@sourceforge.net via the URL:
93@url{http://lists.sourceforge.net/lists/listinfo/cedet-semantic}
94
95@ifnottex
96@insertcopying
97@end ifnottex
98
99@menu
100* Introduction::
101* Using Semantic::
102* Semantic Internals::
103* Glossary::
104* GNU Free Documentation License::
105* Index::
106@end menu
107
108@node Introduction
109@chapter Introduction
110
111This chapter gives an overview of @semantic{} and its goals.
112
113Ordinarily, Emacs uses regular expressions (and syntax tables) to
114analyze source code for purposes such as syntax highlighting. This
115approach, though simple and efficient, has its limitations: roughly
116speaking, it only ``guesses'' the meaning of each piece of source code
117in the context of the programming language, instead of rigorously
118``understanding'' it.
119
120@semantic{} provides a new infrastructure to analyze source code using
121@dfn{parsers} instead of regular expressions. It contains two
122built-in parser generators (an @acronym{LL} generator named
123@code{Bovine} and an @acronym{LALR} generator named @code{Wisent},
124both written in Emacs Lisp), and parsers for several common
125programming languages. It can also make use of @dfn{external
126parsers}---programs such as GNU Global and GNU IDUtils.
127
128@semantic{} provides a uniform, language-independent @acronym{API} for
129accessing the parser output. This output can be used by other Emacs
130Lisp programs to implement ``syntax-aware'' behavior. @semantic{}
131itself includes several such utilities, including user-level Emacs
132commands for navigating, searching, and completing source code.
133
134The following diagram illustrates the structure of the @semantic{}
135package:
136
137@table @strong
138@item Please Note:
139The words in all capitals are those that @semantic{} itself provides.
140Others are current or future languages or applications that are not
141distributed along with @semantic{}.
142@end table
143
144@example
145                                                          Applications
146                                                              and
147                                                           Utilities
148                                                            -------
149                                                           /       \
150             +---------------+    +--------+    +--------+
151       C --->| C      PARSER |--->|        |    |        |
152             +---------------+    |        |    |        |
153             +---------------+    | COMMON |    | COMMON |<--- SPEEDBAR
154    Java --->| JAVA   PARSER |--->| PARSE  |    |        |
155             +---------------+    | TREE   |    | PARSE  |<--- SEMANTICDB
156             +---------------+    | FORMAT |    | API    |
157  Scheme --->| SCHEME PARSER |--->|        |    |        |<--- ecb
158             +---------------+    |        |    |        |
159             +---------------+    |        |    |        |
160 Texinfo --->| TEXI.  PARSER |--->|        |    |        |
161             +---------------+    |        |    |        |
162
163                    ...              ...           ...         ...
164
165             +---------------+    |        |    |        |
166 Lang. Y --->| Y      Parser |--->|        |    |        |<--- app. ?
167             +---------------+    |        |    |        |
168             +---------------+    |        |    |        |<--- app. ?
169 Lang. Z --->| Z      Parser |--->|        |    |        |
170             +---------------+    +--------+    +--------+
171@end example
172
173@menu
174* Semantic Components::
175@end menu
176
177@node Semantic Components
178@section Semantic Components
179
180In this section, we provide a more detailed description of the major
181components of @semantic{}, and how they interact with one another.
182
183The first step in parsing a source code file is to break it up into
184its fundamental components. This step is called lexical analysis:
185
186@example
187        syntax table, keywords list, and options
188                         |
189                         |
190                         v
191    input file  ---->  Lexer   ----> token stream
192@end example
193
194@noindent
195The output of the lexical analyzer is a list of tokens that make up
196the file. The next step is the actual parsing, shown below:
197
198@example
199                    parser tables
200                         |
201                         v
202    token stream --->  Parser  ----> parse tree
203@end example
204
205@noindent
206The end result, the parse tree, is @semantic{}'s internal
207representation of the language grammar. @semantic{} provides an
208@acronym{API} for Emacs Lisp programs to access the parse tree.
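
@noindent
As a hedged illustration of that @acronym{API}, the sketch below
collects the name and class of each top-level tag in the current
buffer. It assumes that @semantic{} is already loaded and active in a
buffer whose language has a parser; @code{my-list-top-level-tags} is a
hypothetical name used only for this example.

@example
;; Sketch only: assumes Semantic is enabled in the current buffer.
;; `my-list-top-level-tags' is a hypothetical helper, not part of Semantic.
(defun my-list-top-level-tags ()
  "Return a list of (NAME . CLASS) pairs for the current buffer's tags."
  (mapcar (lambda (tag)
            (cons (semantic-tag-name tag)
                  (semantic-tag-class tag)))
          (semantic-fetch-tags)))
@end example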
209
210Parsing large files can take several seconds or more. By default,
211@semantic{} automatically caches parse trees by saving them in your
212@file{.emacs.d} directory. When you revisit a previously-parsed file,
213the parse tree is automatically reloaded from this cache, to save
214time. @xref{SemanticDB}.
215
216@node Using Semantic
217@chapter Using Semantic
218
219@include sem-user.texi
220
221@node Semantic Internals
222@chapter Semantic Internals
223
224This chapter provides an overview of the internals of @semantic{}.
225This information is usually not needed by application developers or
226grammar developers; it is useful mostly for the hackers who would like
227to learn more about how @semantic{} works.
228
229@menu
230* Parser code :: Code used for the parsers
231* Tag handling :: Code used for manipulating tags
232* Semanticdb Internals :: Code used in the semantic database
233* Analyzer Internals :: Code used in the code analyzer
234* Tools :: Code used in user tools
235* Tests :: Code used for testing
236@end menu
237
238@node Parser code
239@section Parser code
240
241@semantic{} parsing code is spread across a range of files.
242
243@table @file
244@item semantic.el
245The core infrastructure sets up buffers for parsing, and has all the
246core parsing routines. Most parsing routines are overloadable, so the
247actual implementation may be somewhere else.
248
249@item semantic-edit.el
250Incremental reparse based on user edits.
251
252@item semantic-grammar.el
253@itemx semantic-grammar.wy
254Parser for the different grammar languages, and a major mode for
255editing grammars in Emacs.
256
257@item semantic-lex.el
258Infrastructure for implementing lexical analyzers. Provides macros
259for creating individual analyzers for specific features, and a way to
260combine them together.
261
262@item semantic-lex-spp.el
263Infrastructure for a lexical symbolic preprocessor. This was written
264to implement the C preprocessor, but could be used for other lexical
265preprocessors.
266
267@item bovine/bovine-grammar.el
268@itemx bovine/bovine-grammar-macros.el
269@itemx bovine/semantic-bovine.el
270The ``bovine'' grammar. This is the first grammar mode written for
271@semantic{} and is useful for creating simple parsers.
272
273@item wisent/wisent.el
274@itemx wisent/bison-wisent.el
275@itemx wisent/semantic-wisent.el
276@itemx wisent/semantic-debug-grammar.el
277A port of bison to Emacs. This infrastructure lets you create
278LALR-based parsers for @semantic{}.
279
280@item semantic-ast.el
281Manage Abstract Syntax Trees for parsers.
282
283@item semantic-debug.el
284Infrastructure for debugging grammars.
285
286@item semantic-util.el
287Various utilities for manipulating tags, such as describing the tag
288under point, adding labels, and the all-important
289@code{semantic-something-to-tag-table}.
290
291@end table
292
293@node Tag handling
294@section Tag handling
295
296A tag represents an individual item found in a buffer, such as a
297function or variable. Tag handling is implemented in several source
298files.
299
300@table @file
301@item semantic-tag.el
302Basic tag creation, queries, cloning, binding, and unbinding.
303
304@item semantic-tag-write.el
305Write a tag or tag list to a stream. These routines are used by
306@file{semanticdb-file.el} when saving a list of tags.
307
308@item semantic-tag-file.el
309Files associated with tags. Goto-tag, file for include, and file for
310a prototype.
311
312@item semantic-tag-ls.el
313Language-dependent features of a tag, such as parent calculation, slot
314protection, and other states like abstract, virtual, static, and leaf.
315
316@item semantic-dep.el
317Include file handling. Contains the include path concepts, and
318routines for looking up file names in the include path.
319
320@item semantic-format.el
321Convert a tag into a nicely formatted and colored string. Use
322@code{semantic-test-all-format-tag-functions} to test different output
323options.
324
325@item semantic-find.el
326Find tags matching different conditions in a tag table.
327These routines are used by @file{semanticdb-find.el} once the database
328has been converted into a simpler tag table.
329
330@item semantic-sort.el
331Sorting lists of tags in different ways. Includes sorting a plain
332list of tags forward or backward. Includes binning tags based on
333attributes (bucketize), and tag adoption for multiple references to
334the same thing.
335
336@item semantic-doc.el
337Capture documentation comments from near a tag.
338
339@end table
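
@noindent
As a minimal sketch of how some of these pieces can be combined, the
function below filters the current buffer's tags by class and formats
each match as a prototype string. It assumes that @semantic{} is
loaded and active in the current buffer;
@code{my-function-prototypes} is a hypothetical name used only for
this example.

@example
;; Sketch only: assumes Semantic is enabled in the current buffer.
;; `my-function-prototypes' is a hypothetical helper, not part of Semantic.
(defun my-function-prototypes ()
  "Return prototype strings for the functions in the current buffer."
  (mapcar #'semantic-format-tag-prototype
          (semantic-find-tags-by-class 'function
                                       (semantic-fetch-tags))))
@end example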
340
341@node Semanticdb Internals
342@section Semanticdb Internals
343
344@acronym{Semanticdb} complexity is certainly an issue. It is a rather
345hairy problem to try and solve.
346
347@table @file
348@item semanticdb.el
349Defines a @dfn{database} and a @dfn{table} base class. You can
350instantiate these classes, and use them, but they are not persistent.
351
352This file also provides support for @code{semanticdb-minor-mode},
353which automatically associates files with tables in databases so that
354tags are @emph{saved} while a buffer is not in memory.
355
356The database and tables both also provide application cache information
357and a cache flushing system. The semanticdb search routines use caches
358to save data structures that are complex to calculate.
359
360Lastly, it provides the concept of @dfn{project root}. It is a system
361by which a file can be associated with the root of a project, so if
362you have a tree of directories and source files, it can find the root,
363and allow a tag-search to span all available databases in that
364directory hierarchy.
365
366@item semanticdb-file.el
367Provides a subclass of the basic table so that it can be saved to
368disk. Implements all the code needed to unbind/rebind tags to a
369buffer and to write them to a file.
370
371@item semanticdb-el.el
372Implements a special kind of @dfn{system} database that uses Emacs
373internals to perform queries.
374
375@item semanticdb-ebrowse.el
376Implements a system database that uses Ebrowse to parse files into a
377table that can be queried for tag names. Successful tag hits during a
378find cause @semantic{} to pick up and parse the reference files to
379get the full details.
380
381@item semanticdb-find.el
382Infrastructure for searching groups of @semantic{} databases, and dealing
383with the search results format.
384
385@item semanticdb-ref.el
386Tracks cross-references. Cross-references are needed when a buffer is
387reparsed, and must alert other tables that any dependent caches may
388need to be flushed. References are in the form of include files.
389
390@end table
391
392@node Analyzer Internals
393@section Analyzer Internals
394
395The @semantic{} analyzer is a complex engine which has been broken
396down across several modules. When the @semantic{} analyzer fails,
397start with @code{semantic-analyze-debug-assist}, then dive into some
398of these files.
399
400@table @file
401@item semantic-analyze.el
402The core analyzer for defining the @dfn{current context}. The
403current context is an object that contains references to aspects of
404the local context including the current prefix, and a tag list
405defining what the prefix means.
406
407@item semantic-analyze-complete.el
408Provides @code{semantic-analyze-possible-completions}.
409
410@item semantic-analyze-debug.el
411The analyzer debugger. Useful when attempting to get everything
412configured.
413
414@item semantic-analyze-fcn.el
415Various support functions needed by the analyzer.
416
417@item semantic-ctxt.el
418Local context parser. Contains overloadable functions used to move
419around through different scopes, get local variables, and collect the
420current prefix used when doing completion.
421
422@item semantic-scope.el
423Calculate @dfn{scope} for a location in a buffer. The scope includes
424local variables, and tag lists in scope for various reasons, such as
425C++ using statements.
426
427@item semanticdb-typecache.el
428The typecache is part of @code{semanticdb}, but is used primarily by
429the analyzer to look up datatypes and complex names. The typecache is
430bound across source files and builds a master lookup table for data
431type names.
432
433@item semantic-ia.el
434Interactive Analyzer functions. Simple routines that do completion or
435lookups based on the results from the Analyzer. These routines are
436meant as examples for application writers, but are quite useful as
437they are.
438
439@item semantic-ia-sb.el
440Speedbar support for the analyzer, displaying context info, and
441completion lists.
442
443@end table
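
@noindent
The following sketch suggests how the analyzer entry points named
above can be chained: compute the current context at point, then ask
for possible completions. It assumes that @semantic{} and its analyzer
are loaded and active in the current buffer, and that the completions
are returned as tags; @code{my-completion-names-at-point} is a
hypothetical name used only for this example. The analyzer may signal
an error when there is nothing to complete, which this sketch does not
handle.

@example
;; Sketch only: assumes Semantic and its analyzer are loaded and enabled.
;; `my-completion-names-at-point' is a hypothetical helper.
(defun my-completion-names-at-point ()
  "Return the names of the analyzer's possible completions at point."
  (let ((ctxt (semantic-analyze-current-context (point))))
    (when ctxt
      (mapcar #'semantic-tag-name
              (semantic-analyze-possible-completions ctxt)))))
@end example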
444
445@node Tools
446@section Tools
447
448These files contain various tools a user can use.
449
450@table @file
451@item semantic-idle.el
452Idle scheduler for @semantic{}. Manages reparsing buffers after
453edits, and large work tasks in idle time. Includes modes for showing
454summary help and pop-up completion.
455
456@item senator.el
457The @semantic{} navigator. Provides many ways to move through a
458buffer based on the active tag table.
459
460@item semantic-decorate.el
461A minor mode for decorating tags based on details from the parser.
462Includes overlines for functions, or coloring class fields based on
463protection.
464
465@item semantic-decorate-include.el
466A decoration mode for include files, which assists users in setting up
467parsing for their includes.
468
469@item semantic-complete.el
470Advanced completion prompts for reading tag names in the minibuffer, or
471inline in a buffer.
472
473@item semantic-imenu.el
474Imenu support for using @semantic{} tags in imenu.
475
476@item semantic-mru-bookmark.el
477Automatic bookmarking based on tags. Jump to locations you've been
478before based on tag name.
479
480@item semantic-sb.el
481Support for @semantic{} tag usage in Speedbar.
482
483@item semantic-util-modes.el
484A collection of small minor modes that expose aspects of the @semantic{}
485parser state. Includes @code{semantic-stickyfunc-mode}.
486
487@item document.el
488@itemx document-vars.el
489Create and update comments for tags.
490
491@item semantic-adebug.el
492Extensions of @file{data-debug.el} for @semantic{}.
493
494@item semantic-chart.el
495Draw some charts from stats generated from parsing.
496
497
498@item semantic-elp.el
499Profiler for helping to optimize the @semantic{} analyzer.
500
501
502@end table
503
504@node Tests
505@section Tests
506
507@table @file
508
509@item semantic-utest.el
510Basic testing of parsing and incremental parsing for most supported
511languages.
512
513@item semantic-ia-utest.el
514Test the semantic analyzer's ability to provide smart completions.
515
516@item semantic-utest-c.el
517Tests for the C parser's lexical pre-processor.
518
519@item semantic-regtest.el
520Regression tests from the older Semantic 1.x API.
521
522@end table
523
524@node Glossary
525@appendix Glossary
526
527@table @keyword
528@item BNF
529In semantic 1.4, a BNF file represented ``Bovine Normal Form'', the
530grammar file used for the 1.4 parser generator. This was a play on
531Backus-Naur Form which proved too confusing.
532
533@item bovinate
534A verb representing what happens when a bovine parser parses a file.
535
536@item bovine lambda
537In a bovine, or LL parser, the bovine lambda is a function to execute
538when a specific set of match rules has succeeded in matching text from
539the buffer.
540
541@item bovine parser
542A parser using the bovine parser generator. It is an LL parser
543suitable for small, simple languages.
544
545@item context
546
547@item LALR
548
549@item lexer
550A program which converts text into a stream of tokens by analyzing
551it lexically. Lexers will commonly create strings, symbols,
552keywords and punctuation, and strip whitespace and comments.
553
554@item LL
555
556@item nonterminal
557A nonterminal symbol or simply a nonterminal stands for a class of
558syntactically equivalent groupings. A nonterminal symbol name is used
559in writing grammar rules.
560
561@item overloadable
562Some functions are defined via @code{define-overload}.
563These can be overloaded via ....
564
565@item parser
566A program that converts @b{tokens} to @b{tags}.
567
568@item tag
569A tag is a representation of some entity in a language file, such as a
570function, variable, or include statement. In semantic, the word tag is
571used the same way it is used for the etags or ctags tools.
572
573A tag is usually bound to a buffer region via overlay, or it just
574specifies character locations in a file.
575
576@item token
577A single atomic item returned from a lexer. It represents some set
578of characters found in a buffer.
579
580@item token stream
581The output of the lexer as well as the input to the parser.
582
583@item wisent parser
584A parser using the wisent parser generator. It is a port of bison to
585Emacs Lisp. It is an LALR parser suitable for complex languages.
586@end table
587
588
589@node GNU Free Documentation License
590@appendix GNU Free Documentation License
591@include doclicense.texi
592
593@node Index
594@unnumbered Index
595@printindex cp
596
597@iftex
598@contents
599@summarycontents
600@end iftex
601
602@bye
603
604@c Following comments are for the benefit of ispell.
605
606@c LocalWords: alist API APIs arg argc args argv asis assoc autoload Wisent
607@c LocalWords: backquote bnf bovinate bovinates LALR
608@c LocalWords: bovinating bovination bovinator bucketize
609@c LocalWords: cb cdr charquote checkcache cindex CLOS
610@c LocalWords: concat concocting const constantness ctxt Decl defcustom
611@c LocalWords: deffn deffnx defun defvar destructor's dfn diff dir
612@c LocalWords: doc docstring EDE EIEIO elisp emacsman emph enum
613@c LocalWords: eq Exp EXPANDFULL expresssion fn foo func funcall
614@c LocalWords: ia ids iff ifinfo imenu imenus init int isearch itemx java kbd
615@c LocalWords: keymap keywordtable lang languagemode lexer lexing Ludlam
616@c LocalWords: menubar metaparent metaparents min minibuffer Misc mode's
617@c LocalWords: multitable NAvigaTOR noindent nomedian nonterm noselect
618@c LocalWords: nosnarf obarray OLE OO outputfile paren parsetable POINT's
619@c LocalWords: popup positionalonly positiononly positionormarker pre
620@c LocalWords: printf printindex Programmatically pt punctuations quotemode
621@c LocalWords: ref regex regexp Regexps reparse resetfile samp sb
622@c LocalWords: scopestart SEmantic semanticdb setfilename setq
623@c LocalWords: settitle setupfunction sexp sp SPC speedbar speedbar's
624@c LocalWords: streamorbuffer struct subalist submenu submenus
625@c LocalWords: subsubsection sw sym texi texinfo titlefont titlepage
626@c LocalWords: tok TOKEN's toplevel typemodifiers uml unset untar
627@c LocalWords: uref usedb var vskip xref yak
628
629@ignore
630 arch-tag: cbc6e78c-4ff1-410e-9fc7-936487e39bbf
631@end ignore