| 1 | @c -*-texinfo-*- |
| 2 | @c This is part of the GNU Emacs Lisp Reference Manual. |
| 3 | @c Copyright (C) 1999, 2002, 2003, 2004, 2005, |
| 4 | @c 2006 Free Software Foundation, Inc. |
| 5 | @c See the file elisp.texi for copying conditions. |
| 6 | @setfilename ../info/hash |
| 7 | @node Hash Tables, Symbols, Sequences Arrays Vectors, Top |
| 8 | @chapter Hash Tables |
| 9 | @cindex hash tables |
| 10 | |
| 11 | A hash table is a very fast kind of lookup table, somewhat like |
| 12 | an alist in that it maps keys to corresponding values. It differs |
| 13 | from an alist in these ways: |
| 14 | |
| 15 | @itemize @bullet |
| 16 | @item |
| 17 | Lookup in a hash table is extremely fast for large tables---in fact, the |
| 18 | time required is essentially @emph{independent} of how many elements are |
| 19 | stored in the table. For smaller tables (a few tens of elements) |
| 20 | alists may still be faster because hash tables have a more-or-less |
| 21 | constant overhead. |
| 22 | |
| 23 | @item |
| 24 | The correspondences in a hash table are in no particular order. |
| 25 | |
| 26 | @item |
| 27 | There is no way to share structure between two hash tables, |
| 28 | the way two alists can share a common tail. |
| 29 | @end itemize |
| 30 | |
| 31 | Emacs Lisp provides a general-purpose hash table data type, along |
| 32 | with a series of functions for operating on them. Hash tables have no |
| 33 | read syntax, and print in hash notation, like this: |
| 34 | |
| 35 | @example |
| 36 | (make-hash-table) |
| 37 | @result{} #<hash-table 'eql nil 0/65 0x83af980> |
| 38 | @end example |
| 39 | |
| 40 | @noindent |
| 41 | (The term ``hash notation'' refers to the initial @samp{#} |
| 42 | character---@pxref{Printed Representation}---and has nothing to do with |
| 43 | the term ``hash table.'') |
| 44 | |
| 45 | Obarrays are also a kind of hash table, but they are a different type |
| 46 | of object and are used only for recording interned symbols |
| 47 | (@pxref{Creating Symbols}). |
| 48 | |
| 49 | @menu |
| 50 | * Creating Hash:: Functions to create hash tables. |
| 51 | * Hash Access:: Reading and writing the hash table contents. |
| 52 | * Defining Hash:: Defining new comparison methods |
| 53 | * Other Hash:: Miscellaneous. |
| 54 | @end menu |
| 55 | |
| 56 | @node Creating Hash |
| 57 | @section Creating Hash Tables |
| 58 | |
| 59 | The principal function for creating a hash table is |
| 60 | @code{make-hash-table}. |
| 61 | |
| 62 | @tindex make-hash-table |
| 63 | @defun make-hash-table &rest keyword-args |
| 64 | This function creates a new hash table according to the specified |
| 65 | arguments. The arguments should consist of alternating keywords |
| 66 | (particular symbols recognized specially) and values corresponding to |
| 67 | them. |
| 68 | |
| 69 | Several keywords make sense in @code{make-hash-table}, but the only two |
| 70 | that you really need to know about are @code{:test} and @code{:weakness}. |
| 71 | |
| 72 | @table @code |
| 73 | @item :test @var{test} |
| 74 | This specifies the method of key lookup for this hash table. The |
| 75 | default is @code{eql}; @code{eq} and @code{equal} are other |
| 76 | alternatives: |
| 77 | |
| 78 | @table @code |
| 79 | @item eql |
| 80 | Keys which are numbers are ``the same'' if they are @code{equal}, that |
| 81 | is, if they are equal in value and either both are integers or both |
| 82 | are floating point numbers; otherwise, two distinct objects are never |
| 83 | ``the same''. |
| 84 | |
| 85 | @item eq |
| 86 | Any two distinct Lisp objects are ``different'' as keys. |
| 87 | |
| 88 | @item equal |
| 89 | Two Lisp objects are ``the same'', as keys, if they are equal |
| 90 | according to @code{equal}. |
| 91 | @end table |
| 92 | |
| 93 | You can use @code{define-hash-table-test} (@pxref{Defining Hash}) to |
| 94 | define additional possibilities for @var{test}. |
| 95 | |
| 96 | @item :weakness @var{weak} |
| 97 | The weakness of a hash table specifies whether the presence of a key or |
| 98 | value in the hash table preserves it from garbage collection. |
| 99 | |
| 100 | The value, @var{weak}, must be one of @code{nil}, @code{key}, |
| 101 | @code{value}, @code{key-or-value}, @code{key-and-value}, or @code{t} |
| 102 | which is an alias for @code{key-and-value}. If @var{weak} is @code{key} |
| 103 | then the hash table does not prevent its keys from being collected as |
| 104 | garbage (if they are not referenced anywhere else); if a particular key |
| 105 | does get collected, the corresponding association is removed from the |
| 106 | hash table. |
| 107 | |
| 108 | If @var{weak} is @code{value}, then the hash table does not prevent |
| 109 | values from being collected as garbage (if they are not referenced |
| 110 | anywhere else); if a particular value does get collected, the |
| 111 | corresponding association is removed from the hash table. |
| 112 | |
| 113 | If @var{weak} is @code{key-and-value} or @code{t}, both the key and |
| 114 | the value must be live in order to preserve the association. Thus, |
| 115 | the hash table does not protect either keys or values from garbage |
| 116 | collection; if either one is collected as garbage, that removes the |
| 117 | association. |
| 118 | |
| 119 | If @var{weak} is @code{key-or-value}, either the key or |
| 120 | the value can preserve the association. Thus, associations are |
| 121 | removed from the hash table when both their key and value would be |
| 122 | collected as garbage (if not for references from weak hash tables). |
| 123 | |
| 124 | The default for @var{weak} is @code{nil}, so that all keys and values |
| 125 | referenced in the hash table are preserved from garbage collection. |
| 126 | |
| 127 | @item :size @var{size} |
| 128 | This specifies a hint for how many associations you plan to store in the |
| 129 | hash table. If you know the approximate number, you can make things a |
| 130 | little more efficient by specifying it this way. If you specify too |
| 131 | small a size, the hash table will grow automatically when necessary, but |
| 132 | doing that takes some extra time. |
| 133 | |
| 134 | The default size is 65. |
| 135 | |
| 136 | @item :rehash-size @var{rehash-size} |
| 137 | When you add an association to a hash table and the table is ``full,'' |
| 138 | it grows automatically. This value specifies how to make the hash table |
| 139 | larger, at that time. |
| 140 | |
| 141 | If @var{rehash-size} is an integer, it should be positive, and the hash |
| 142 | table grows by adding that much to the nominal size. If |
| 143 | @var{rehash-size} is a floating point number, it had better be greater |
| 144 | than 1, and the hash table grows by multiplying the old size by that |
| 145 | number. |
| 146 | |
| 147 | The default value is 1.5. |
| 148 | |
| 149 | @item :rehash-threshold @var{threshold} |
| 150 | This specifies the criterion for when the hash table is ``full'' (so |
| 151 | it should be made larger). The value, @var{threshold}, should be a |
| 152 | positive floating point number, no greater than 1. The hash table is |
| 153 | ``full'' whenever the actual number of entries exceeds this fraction |
| 154 | of the nominal size. The default for @var{threshold} is 0.8. |
| 155 | @end table |
| 156 | @end defun |
| 157 | |
| 158 | @tindex makehash |
| 159 | @defun makehash &optional test |
| 160 | This is equivalent to @code{make-hash-table}, but with a different style |
| 161 | argument list. The argument @var{test} specifies the method |
| 162 | of key lookup. |
| 163 | |
| 164 | This function is obsolete. Use @code{make-hash-table} instead. |
| 165 | @end defun |
| 166 | |
| 167 | @node Hash Access |
| 168 | @section Hash Table Access |
| 169 | |
| 170 | This section describes the functions for accessing and storing |
| 171 | associations in a hash table. In general, any Lisp object can be used |
| 172 | as a hash key, unless the comparison method imposes limits. Any Lisp |
| 173 | object can also be used as the value. |
| 174 | |
| 175 | @tindex gethash |
| 176 | @defun gethash key table &optional default |
| 177 | This function looks up @var{key} in @var{table}, and returns its |
| 178 | associated @var{value}---or @var{default}, if @var{key} has no |
| 179 | association in @var{table}. |
| 180 | @end defun |
| 181 | |
| 182 | @tindex puthash |
| 183 | @defun puthash key value table |
| 184 | This function enters an association for @var{key} in @var{table}, with |
| 185 | value @var{value}. If @var{key} already has an association in |
| 186 | @var{table}, @var{value} replaces the old associated value. |
| 187 | @end defun |
| 188 | |
| 189 | @tindex remhash |
| 190 | @defun remhash key table |
| 191 | This function removes the association for @var{key} from @var{table}, if |
| 192 | there is one. If @var{key} has no association, @code{remhash} does |
| 193 | nothing. |
| 194 | |
| 195 | @b{Common Lisp note:} In Common Lisp, @code{remhash} returns |
| 196 | non-@code{nil} if it actually removed an association and @code{nil} |
| 197 | otherwise. In Emacs Lisp, @code{remhash} always returns @code{nil}. |
| 198 | @end defun |
| 199 | |
| 200 | @tindex clrhash |
| 201 | @defun clrhash table |
| 202 | This function removes all the associations from hash table @var{table}, |
| 203 | so that it becomes empty. This is also called @dfn{clearing} the hash |
| 204 | table. |
| 205 | |
| 206 | @b{Common Lisp note:} In Common Lisp, @code{clrhash} returns the empty |
| 207 | @var{table}. In Emacs Lisp, it returns @code{nil}. |
| 208 | @end defun |
| 209 | |
| 210 | @tindex maphash |
| 211 | @defun maphash function table |
| 212 | @anchor{Definition of maphash} |
| 213 | This function calls @var{function} once for each of the associations in |
| 214 | @var{table}. The function @var{function} should accept two |
| 215 | arguments---a @var{key} listed in @var{table}, and its associated |
| 216 | @var{value}. @code{maphash} returns @code{nil}. |
| 217 | @end defun |
| 218 | |
| 219 | @node Defining Hash |
| 220 | @section Defining Hash Comparisons |
| 221 | @cindex hash code |
| 222 | |
| 223 | You can define new methods of key lookup by means of |
| 224 | @code{define-hash-table-test}. In order to use this feature, you need |
| 225 | to understand how hash tables work, and what a @dfn{hash code} means. |
| 226 | |
| 227 | You can think of a hash table conceptually as a large array of many |
| 228 | slots, each capable of holding one association. To look up a key, |
| 229 | @code{gethash} first computes an integer, the hash code, from the key. |
| 230 | It reduces this integer modulo the length of the array, to produce an |
| 231 | index in the array. Then it looks in that slot, and if necessary in |
| 232 | other nearby slots, to see if it has found the key being sought. |
| 233 | |
| 234 | Thus, to define a new method of key lookup, you need to specify both a |
| 235 | function to compute the hash code from a key, and a function to compare |
| 236 | two keys directly. |
| 237 | |
| 238 | @tindex define-hash-table-test |
| 239 | @defun define-hash-table-test name test-fn hash-fn |
| 240 | This function defines a new hash table test, named @var{name}. |
| 241 | |
| 242 | After defining @var{name} in this way, you can use it as the @var{test} |
| 243 | argument in @code{make-hash-table}. When you do that, the hash table |
| 244 | will use @var{test-fn} to compare key values, and @var{hash-fn} to compute |
| 245 | a ``hash code'' from a key value. |
| 246 | |
| 247 | The function @var{test-fn} should accept two arguments, two keys, and |
| 248 | return non-@code{nil} if they are considered ``the same.'' |
| 249 | |
| 250 | The function @var{hash-fn} should accept one argument, a key, and return |
| 251 | an integer that is the ``hash code'' of that key. For good results, the |
| 252 | function should use the whole range of integer values for hash codes, |
| 253 | including negative integers. |
| 254 | |
| 255 | The specified functions are stored in the property list of @var{name} |
| 256 | under the property @code{hash-table-test}; the property value's form is |
| 257 | @code{(@var{test-fn} @var{hash-fn})}. |
| 258 | @end defun |
| 259 | |
| 260 | @tindex sxhash |
| 261 | @defun sxhash obj |
| 262 | This function returns a hash code for Lisp object @var{obj}. |
| 263 | This is an integer which reflects the contents of @var{obj} |
| 264 | and the other Lisp objects it points to. |
| 265 | |
| 266 | If two objects @var{obj1} and @var{obj2} are equal, then @code{(sxhash |
| 267 | @var{obj1})} and @code{(sxhash @var{obj2})} are the same integer. |
| 268 | |
| 269 | If the two objects are not equal, the values returned by @code{sxhash} |
| 270 | are usually different, but not always; once in a rare while, by luck, |
| 271 | you will encounter two distinct-looking objects that give the same |
| 272 | result from @code{sxhash}. |
| 273 | @end defun |
| 274 | |
| 275 | This example creates a hash table whose keys are strings that are |
| 276 | compared case-insensitively. |
| 277 | |
| 278 | @example |
| 279 | (defun case-fold-string= (a b) |
| 280 | (compare-strings a nil nil b nil nil t)) |
| 281 | |
| 282 | (defun case-fold-string-hash (a) |
| 283 | (sxhash (upcase a))) |
| 284 | |
| 285 | (define-hash-table-test 'case-fold |
| 286 | 'case-fold-string= 'case-fold-string-hash) |
| 287 | |
| 288 | (make-hash-table :test 'case-fold) |
| 289 | @end example |
| 290 | |
| 291 | Here is how you could define a hash table test equivalent to the |
| 292 | predefined test value @code{equal}. The keys can be any Lisp object, |
| 293 | and equal-looking objects are considered the same key. |
| 294 | |
| 295 | @example |
| 296 | (define-hash-table-test 'contents-hash 'equal 'sxhash) |
| 297 | |
| 298 | (make-hash-table :test 'contents-hash) |
| 299 | @end example |
| 300 | |
| 301 | @node Other Hash |
| 302 | @section Other Hash Table Functions |
| 303 | |
| 304 | Here are some other functions for working with hash tables. |
| 305 | |
| 306 | @tindex hash-table-p |
| 307 | @defun hash-table-p table |
| 308 | This returns non-@code{nil} if @var{table} is a hash table object. |
| 309 | @end defun |
| 310 | |
| 311 | @tindex copy-hash-table |
| 312 | @defun copy-hash-table table |
| 313 | This function creates and returns a copy of @var{table}. Only the table |
| 314 | itself is copied---the keys and values are shared. |
| 315 | @end defun |
| 316 | |
| 317 | @tindex hash-table-count |
| 318 | @defun hash-table-count table |
| 319 | This function returns the actual number of entries in @var{table}. |
| 320 | @end defun |
| 321 | |
| 322 | @tindex hash-table-test |
| 323 | @defun hash-table-test table |
| 324 | This returns the @var{test} value that was given when @var{table} was |
| 325 | created, to specify how to hash and compare keys. See |
| 326 | @code{make-hash-table} (@pxref{Creating Hash}). |
| 327 | @end defun |
| 328 | |
| 329 | @tindex hash-table-weakness |
| 330 | @defun hash-table-weakness table |
| 331 | This function returns the @var{weak} value that was specified for hash |
| 332 | table @var{table}. |
| 333 | @end defun |
| 334 | |
| 335 | @tindex hash-table-rehash-size |
| 336 | @defun hash-table-rehash-size table |
| 337 | This returns the rehash size of @var{table}. |
| 338 | @end defun |
| 339 | |
| 340 | @tindex hash-table-rehash-threshold |
| 341 | @defun hash-table-rehash-threshold table |
| 342 | This returns the rehash threshold of @var{table}. |
| 343 | @end defun |
| 344 | |
| 345 | @tindex hash-table-size |
| 346 | @defun hash-table-size table |
| 347 | This returns the current nominal size of @var{table}. |
| 348 | @end defun |
| 349 | |
| 350 | @ignore |
| 351 | arch-tag: 3b5107f9-d2f0-47d5-ad61-3498496bea0e |
| 352 | @end ignore |