doc/guide/localhost/CommonArg

   1 <!DOCTYPE html>
   2 <html lang="en">
   3 <head>
   4 <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
   5 <meta name="generator" content="AsciiDoc 8.6.9">
   6 <title>CommonArg</title>
   7 <link rel="stylesheet" href="./asciidoc.css" type="text/css">
   8 <link rel="stylesheet" href="./pygments.css" type="text/css">
   9
  10
  11 <script type="text/javascript" src="./asciidoc.js"></script>
  12 <script type="text/javascript">
  13 /*<![CDATA[*/
  14 asciidoc.install();
  15 /*]]>*/
  16 </script>
  17 <link rel="stylesheet" href="./mlton.css" type="text/css">
  18 </head>
  19 <body class="article">
  20 <div id="banner">
  21 <div id="banner-home">
  22 <a href="./Home">MLton 20180207</a>
  23 </div>
  24 </div>
  25 <div id="header">
  26 <h1>CommonArg</h1>
  27 </div>
  28 <div id="content">
  29 <div id="preamble">
  30 <div class="sectionbody">
  31 <div class="paragraph"><p><a href="CommonArg">CommonArg</a> is an optimization pass for the <a href="SSA">SSA</a>
  32 <a href="IntermediateLanguage">IntermediateLanguage</a>, invoked from <a href="SSASimplify">SSASimplify</a>.</p></div>
  33 </div>
  34 </div>
  35 <div class="sect1">
  36 <h2 id="_description">Description</h2>
  37 <div class="sectionbody">
  38 <div class="paragraph"><p>It optimizes instances of <span class="monospaced">Goto</span> transfers that pass the same
  39 arguments to the same label; e.g.</p></div>
  40 <div class="listingblock">
  41 <div class="content monospaced">
  42 <pre>L_1 ()
  43   ...
  44   z1 = ?
  45   ...
  46   L_3 (x, y, z1)
  47 L_2 ()
  48   ...
  49   z2 = ?
  50   ...
  51   L_3 (x, y, z2)
  52 L_3 (a, b, c)
  53   ...</pre>
  54 </div></div>
  55 <div class="paragraph"><p>This code can be simplified to:</p></div>
  56 <div class="listingblock">
  57 <div class="content monospaced">
  58 <pre>L_1 ()
  59   ...
  60   z1 = ?
  61   ...
  62   L_3 (z1)
  63 L_2 ()
  64   ...
  65   z2 = ?
  66   ...
  67   L_3 (z2)
  68 L_3 (c)
  69   a = x
  70   b = y</pre>
  71 </div></div>
  72 <div class="paragraph"><p>which saves a number of resources: time of setting up the arguments
  73 for the jump to <span class="monospaced">L_3</span>, space (either stack or pseudo-registers) for
  74 the arguments of <span class="monospaced">L_3</span>, etc.  It may also expose some other
  75 optimizations, if more information is known about <span class="monospaced">x</span> or <span class="monospaced">y</span>.</p></div>
  76 </div>
  77 </div>
  78 <div class="sect1">
  79 <h2 id="_implementation">Implementation</h2>
  80 <div class="sectionbody">
  81 <div class="ulist"><ul>
  82 <li>
  83 <p>
  84 <a href="https://github.com/MLton/mlton/blob/master/mlton/ssa/common-arg.fun"><span class="monospaced">common-arg.fun</span></a>
  85 </p>
  86 </li>
  87 </ul></div>
  88 </div>
  89 </div>
  90 <div class="sect1">
  91 <h2 id="_details_and_notes">Details and Notes</h2>
  92 <div class="sectionbody">
  93 <div class="paragraph"><p>Three analyses were originally proposed to drive the optimization
  94 transformation.  Only the <em>Dominator Analysis</em> is currently
  95 implemented.  (Implementations of the other analyses are available in
  96 the <a href="Sources">repository history</a>.)</p></div>
  97 <div class="sect2">
  98 <h3 id="_syntactic_analysis">Syntactic Analysis</h3>
  99 <div class="paragraph"><p>The simplest analysis I could think of maintains</p></div>
 100 <div class="listingblock">
 101 <div class="content monospaced">
 102 <pre>varInfo: Var.t -&gt; Var.t option list ref</pre>
 103 </div></div>
 104 <div class="paragraph"><p>initialized to <span class="monospaced">[]</span>.</p></div>
 105 <div class="ulist"><ul>
 106 <li>
 107 <p>
 108 For each variable <span class="monospaced">v</span> bound in a <span class="monospaced">Statement.t</span> or in the
 109 <span class="monospaced">Function.t</span> args, then <span class="monospaced">List.push(varInfo v, NONE)</span>.
 110 </p>
 111 </li>
 112 <li>
 113 <p>
 114 For each <span class="monospaced">L (x1, ..., xn)</span> transfer where <span class="monospaced">(a1, ..., an)</span> are the
 115 formals of <span class="monospaced">L</span>, then <span class="monospaced">List.push(varInfo ai, SOME xi)</span>.
 116 </p>
 117 </li>
 118 <li>
 119 <p>
 120 For each block argument a used in an unknown context (e.g.,
 121 arguments of blocks used as continuations, handlers, arith success,
 122 runtime return, or case switch labels), then
 123 <span class="monospaced">List.push(varInfo a, NONE)</span>.
 124 </p>
 125 </li>
 126 </ul></div>
 127 <div class="paragraph"><p>Now, any block argument <span class="monospaced">a</span> such that <span class="monospaced">varInfo a = xs</span>, where all of
 128 the elements of <span class="monospaced">xs</span> are equal to <span class="monospaced">SOME x</span>, can be optimized by
 129 setting <span class="monospaced">a = x</span> at the beginning of the block and dropping the
 130 argument from <span class="monospaced">Goto</span> transfers.</p></div>
 131 <div class="paragraph"><p>That takes care of the example above.  We can clearly do slightly
 132 better, by changing the transformation criteria to the following: any
 133 block argument a such that <span class="monospaced">varInfo a = xs</span>, where all of the elements
 134 of <span class="monospaced">xs</span> are equal to <span class="monospaced">SOME x</span> <em>or</em> are equal to <span class="monospaced">SOME a</span>, can be
 135 optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
 136 dropping the argument from <span class="monospaced">Goto</span> transfers.  This optimizes a case
 137 like:</p></div>
 138 <div class="listingblock">
 139 <div class="content monospaced">
 140 <pre>L_1 ()
 141   ... z1 = ? ...
 142   L_3 (x, y, z1)
 143 L_2 ()
 144   ... z2 = ? ...
 145   L_3(x, y, z2)
 146 L_3 (a, b, c)
 147   ... w = ? ...
 148   case w of
 149     true =&gt; L_4 | false =&gt; L_5
 150 L_4 ()
 151    ...
 152    L_3 (a, b, w)
 153 L_5 ()
 154    ...</pre>
 155 </div></div>
 156 <div class="paragraph"><p>where a common argument is passed to a loop (and is invariant through
 157 the loop).  Of course, the <a href="LoopInvariant">LoopInvariant</a> optimization pass would
 158 normally introduce a local loop and essentially reduce this to the
 159 first example, but I have seen this in practice, which suggests that
 160 some optimizations after <a href="LoopInvariant">LoopInvariant</a> do enough simplifications
 161 to introduce (new) loop invariant arguments.</p></div>
 162 </div>
 163 <div class="sect2">
 164 <h3 id="_fixpoint_analysis">Fixpoint Analysis</h3>
 165 <div class="paragraph"><p>However, the above analysis and transformation doesn&#8217;t cover the cases
 166 where eliminating one common argument exposes the opportunity to
 167 eliminate other common arguments.  For example:</p></div>
 168 <div class="listingblock">
 169 <div class="content monospaced">
 170 <pre>L_1 ()
 171   ...
 172   L_3 (x)
 173 L_2 ()
 174   ...
 175   L_3 (x)
 176 L_3 (a)
 177   ...
 178   L_5 (a)
 179 L_4 ()
 180   ...
 181   L_5 (x)
 182 L_5 (b)
 183   ...</pre>
 184 </div></div>
 185 <div class="paragraph"><p>One pass of analysis and transformation would eliminate the argument
 186 to <span class="monospaced">L_3</span> and rewrite the <span class="monospaced">L_5(a)</span> transfer to <span class="monospaced">L_5 (x)</span>, thereby
 187 exposing the opportunity to eliminate the common argument to <span class="monospaced">L_5</span>.</p></div>
 188 <div class="paragraph"><p>The interdependency the arguments to <span class="monospaced">L_3</span> and <span class="monospaced">L_5</span> suggest
 189 performing some sort of fixed-point analysis.  This analysis is
 190 relatively simple; maintain</p></div>
 191 <div class="listingblock">
 192 <div class="content monospaced">
 193 <pre>varInfo: Var.t -&gt; VarLattice.t</pre>
 194 </div></div>
 195 <div class="paragraph"><p>where</p></div>
 196 <div class="listingblock">
 197 <div class="content monospaced">
 198 <pre>VarLattice.t ~=~ Bot | Point of Var.t | Top</pre>
 199 </div></div>
 200 <div class="paragraph"><p>(but is implemented by the <a href="FlatLattice">FlatLattice</a> functor with a <span class="monospaced">lessThan</span>
 201 list and <span class="monospaced">value ref</span> under the hood), initialized to <span class="monospaced">Bot</span>.</p></div>
 202 <div class="ulist"><ul>
 203 <li>
 204 <p>
 205 For each variable <span class="monospaced">v</span> bound in a <span class="monospaced">Statement.t</span> or in the
 206 <span class="monospaced">Function.t</span> args, then <span class="monospaced">VarLattice.&lt;= (Point v, varInfo v)</span>
 207 </p>
 208 </li>
 209 <li>
 210 <p>
 211 For each <span class="monospaced">L (x1, ..., xn)</span> transfer where <span class="monospaced">(a1, ..., an)</span> are the
 212 formals of <span class="monospaced">L</span>}, then <span class="monospaced">VarLattice.&lt;= (varInfo xi, varInfo ai)</span>.
 213 </p>
 214 </li>
 215 <li>
 216 <p>
 217 For each block argument a used in an unknown context, then
 218 <span class="monospaced">VarLattice.&lt;= (Point a, varInfo a)</span>.
 219 </p>
 220 </li>
 221 </ul></div>
 222 <div class="paragraph"><p>Now, any block argument a such that <span class="monospaced">varInfo a = Point x</span> can be
 223 optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
 224 dropping the argument from <span class="monospaced">Goto</span> transfers.</p></div>
 225 <div class="paragraph"><p>Now, with the last example, we introduce the ordering constraints:</p></div>
 226 <div class="listingblock">
 227 <div class="content monospaced">
 228 <pre>varInfo x &lt;= varInfo a
 229 varInfo a &lt;= varInfo b
 230 varInfo x &lt;= varInfo b</pre>
 231 </div></div>
 232 <div class="paragraph"><p>Assuming that <span class="monospaced">varInfo x = Point x</span>, then we get <span class="monospaced">varInfo a = Point x</span>
 233 and <span class="monospaced">varInfo b = Point x</span>, and we optimize the example as desired.</p></div>
 234 <div class="paragraph"><p>But, that is a rather weak assumption.  It&#8217;s quite possible for
 235 <span class="monospaced">varInfo x = Top</span>.  For example, consider:</p></div>
 236 <div class="listingblock">
 237 <div class="content monospaced">
 238 <pre>G_1 ()
 239   ... n = 1 ...
 240   L_0 (n)
 241 G_2 ()
 242   ... m = 2 ...
 243   L_0 (m)
 244 L_0 (x)
 245   ...
 246 L_1 ()
 247   ...
 248   L_3 (x)
 249 L_2 ()
 250   ...
 251   L_3 (x)
 252 L_3 (a)
 253   ...
 254   L_5(a)
 255 L_4 ()
 256   ...
 257   L_5(x)
 258 L_5 (b)
 259    ...</pre>
 260 </div></div>
 261 <div class="paragraph"><p>Now <span class="monospaced">varInfo x = varInfo a = varInfo b = Top</span>.  What went wrong here?
 262 When <span class="monospaced">varInfo x</span> went to <span class="monospaced">Top</span>, it got propagated all the way through
 263 to <span class="monospaced">a</span> and <span class="monospaced">b</span>, and prevented the elimination of any common arguments.
 264 What we&#8217;d like to do instead is when <span class="monospaced">varInfo x</span> goes to <span class="monospaced">Top</span>,
 265 propagate on <span class="monospaced">Point x</span>&#8201;&#8212;&#8201;we have no hope of eliminating <span class="monospaced">x</span>, but if
 266 we hold <span class="monospaced">x</span> constant, then we have a chance of eliminating arguments
 267 for which <span class="monospaced">x</span> is passed as an actual.</p></div>
 268 </div>
 269 <div class="sect2">
 270 <h3 id="_dominator_analysis">Dominator Analysis</h3>
 271 <div class="paragraph"><p>Does anyone see where this is going yet?  Pausing for a little
 272 thought, <a href="MatthewFluet">MatthewFluet</a> realized that he had once before tried
 273 proposing this kind of "fix" to a fixed-point analysis&#8201;&#8212;&#8201;when we were
 274 first investigating the <a href="Contify">Contify</a> optimization in light of John
 275 Reppy&#8217;s CWS paper.  Of course, that "fix" failed because it defined a
 276 non-monotonic function and one couldn&#8217;t take the fixed point.  But,
 277 <a href="StephenWeeks">StephenWeeks</a> suggested a dominator based approach, and we were
 278 able to show that, indeed, the dominator analysis subsumed both the
 279 previous call based analysis and the cont based analysis.  And, a
 280 moment&#8217;s reflection reveals further parallels: when
 281 <span class="monospaced">varInfo: Var.t -&gt; Var.t option list ref</span>, we have something analogous
 282 to the call analysis, and when <span class="monospaced">varInfo: Var.t -&gt; VarLattice.t</span>, we
 283 have something analogous to the cont analysis.  Maybe there is
 284 something analogous to the dominator approach (and therefore superior
 285 to the previous analyses).</p></div>
 286 <div class="paragraph"><p>And this turns out to be the case.  Construct the graph <span class="monospaced">G</span> as follows:</p></div>
 287 <div class="listingblock">
 288 <div class="content monospaced">
 289 <pre>nodes(G) = {Root} U Var.t
 290 edges(G) = {Root -&gt; v | v bound in a Statement.t or
 291                                 in the Function.t args} U
 292            {xi -&gt; ai | L(x1, ..., xn) transfer where (a1, ..., an)
 293                                       are the formals of L} U
 294            {Root -&gt; a | a is a block argument used in an unknown context}</pre>
 295 </div></div>
 296 <div class="paragraph"><p>Let <span class="monospaced">idom(x)</span> be the immediate dominator of <span class="monospaced">x</span> in <span class="monospaced">G</span> with root
 297 <span class="monospaced">Root</span>.  Now, any block argument a such that <span class="monospaced">idom(a) = x &lt;&gt; Root</span> can
 298 be optimized by setting <span class="monospaced">a = x</span> at the beginning of the block and
 299 dropping the argument from <span class="monospaced">Goto</span> transfers.</p></div>
 300 <div class="paragraph"><p>Furthermore, experimental evidence suggests (and we are confident that
 301 a formal presentation could prove) that the dominator analysis
 302 subsumes the "syntactic" and "fixpoint" based analyses in this context
 303 as well and that the dominator analysis gets "everything" in one go.</p></div>
 304 </div>
 305 <div class="sect2">
 306 <h3 id="_final_thoughts">Final Thoughts</h3>
 307 <div class="paragraph"><p>I must admit, I was rather surprised at this progression and final
 308 result.  At the outset, I never would have thought of a connection
 309 between <a href="Contify">Contify</a> and <a href="CommonArg">CommonArg</a> optimizations.  They would seem
 310 to be two completely different optimizations.  Although, this may not
 311 really be the case.  As one of the reviewers of the ICFP paper said:</p></div>
 312 <div class="quoteblock">
 313 <div class="content">
 314 <div class="paragraph"><p>I understand that such a form of CPS might be convenient in some
 315 cases, but when we&#8217;re talking about analyzing code to detect that some
 316 continuation is constant, I think it makes a lot more sense to make
 317 all the continuation arguments completely explicit.</p></div>
 318 <div class="paragraph"><p>I believe that making all the continuation arguments explicit will
 319 show that the optimization can be generalized to eliminating constant
 320 arguments, whether continuations or not.</p></div>
 321 </div>
 322 <div class="attribution">
 323 </div></div>
 324 <div class="paragraph"><p>What I think the common argument optimization shows is that the
 325 dominator analysis does slightly better than the reviewer puts it: we
 326 find more than just constant continuations, we find common
 327 continuations.  And I think this is further justified by the fact that
 328 I have observed common argument eliminate some <span class="monospaced">env_X</span> arguments which
 329 would appear to correspond to determining that while the closure being
 330 executed isn&#8217;t constant it is at least the same as the closure being
 331 passed elsewhere.</p></div>
 332 <div class="paragraph"><p>At first, I was curious whether or not we had missed a bigger picture
 333 with the dominator analysis.  When we wrote the contification paper, I
 334 assumed that the dominator analysis was a specialized solution to a
 335 specialized problem; we never suggested that it was a technique suited
 336 to a larger class of analyses.  After initially finding a connection
 337 between <a href="Contify">Contify</a> and <a href="CommonArg">CommonArg</a> (and thinking that the only
 338 connection was the technique), I wondered if the dominator technique
 339 really was applicable to a larger class of analyses.  That is still a
 340 question, but after writing up the above, I&#8217;m suspecting that the
 341 "real story" is that the dominator analysis is a solution to the
 342 common argument optimization, and that the <a href="Contify">Contify</a> optimization is
 343 specializing <a href="CommonArg">CommonArg</a> to the case of continuation arguments (with
 344 a different transformation at the end).  (Note, a whole-program,
 345 inter-procedural common argument analysis doesn&#8217;t really make sense
 346 (in our <a href="SSA">SSA</a> <a href="IntermediateLanguage">IntermediateLanguage</a>), because the only way of
 347 passing values between functions is as arguments.  (Unless of course
 348 in the case that the common argument is also a constant argument, in
 349 which case <a href="ConstantPropagation">ConstantPropagation</a> could lift it to a global.)  The
 350 inter-procedural <a href="Contify">Contify</a> optimization works out because there we
 351 move the function to the argument.)</p></div>
 352 <div class="paragraph"><p>Anyways, it&#8217;s still unclear to me whether or not the dominator based
 353 approach solves other kinds of problems.</p></div>
 354 </div>
 355 <div class="sect2">
 356 <h3 id="_phase_ordering">Phase Ordering</h3>
 357 <div class="paragraph"><p>On the downside, the optimization doesn&#8217;t have a huge impact on
 358 runtime, although it does predictably saved some code size.  I stuck
 359 it in the optimization sequence after <a href="Flatten">Flatten</a> and (the third round
 360 of) <a href="LocalFlatten">LocalFlatten</a>, since it seems to me that we could have cases
 361 where some components of a tuple used as an argument are common, but
 362 the whole tuple isn&#8217;t.  I think it makes sense to add it after
 363 <a href="IntroduceLoops">IntroduceLoops</a> and <a href="LoopInvariant">LoopInvariant</a> (even though <a href="CommonArg">CommonArg</a>
 364 get some things that <a href="LoopInvariant">LoopInvariant</a> gets, it doesn&#8217;t get all of
 365 them).  I also think that it makes sense to add it before
 366 <a href="CommonSubexp">CommonSubexp</a>, since identifying variables could expose more common
 367 subexpressions.  I would think a similar thought applies to
 368 <a href="RedundantTests">RedundantTests</a>.</p></div>
 369 </div>
 370 </div>
 371 </div>
 372 </div>
 373 <div id="footnotes"><hr></div>
 374 <div id="footer">
 375 <div id="footer-text">
 376 </div>
 377 <div id="footer-badges">
 378 </div>
 379 </div>
 380 </body>
 381 </html>