<!DOCTYPE html>
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="generator" content="AsciiDoc 8.6.9">
<title>HowProfilingWorks</title>
<link rel="stylesheet" href="./asciidoc.css" type="text/css">
<link rel="stylesheet" href="./pygments.css" type="text/css">


<script type="text/javascript" src="./asciidoc.js"></script>
<script type="text/javascript">
/*<![CDATA[*/
asciidoc.install();
/*]]>*/
</script>
<link rel="stylesheet" href="./mlton.css" type="text/css">
</head>
<body class="article">
<div id="banner">
<div id="banner-home">
<a href="./Home">MLton 20180207</a>
</div>
</div>
<div id="header">
<h1>HowProfilingWorks</h1>
</div>
<div id="content">
<div id="preamble">
<div class="sectionbody">
<div class="paragraph"><p>Here&#8217;s how <a href="Profiling">Profiling</a> works. If profiling is on, the front end
(elaborator) inserts <span class="monospaced">Enter</span> and <span class="monospaced">Leave</span> statements into the source
program for function entry and exit. For example,</p></div>
<div class="listingblock">
<div class="content"><div class="highlight"><pre><span class="k">fun</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">then</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="n">+</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="p">(</span><span class="n">n</span><span class="w"> </span><span class="n">-</span><span class="w"> </span><span class="mi">1</span><span class="p">)</span><span class="w"></span>
</pre></div></div></div>
<div class="paragraph"><p>becomes</p></div>
<div class="listingblock">
<div class="content"><div class="highlight"><pre><span class="k">fun</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"></span>
<span class="w">   </span><span class="k">let</span><span class="w"></span>
<span class="w">      </span><span class="k">val</span><span class="w"> </span><span class="p">()</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="n">Enter</span><span class="w"> </span><span class="s">&quot;f&quot;</span><span class="w"></span>
<span class="w">      </span><span class="k">val</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="p">(</span><span class="k">if</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">then</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="n">+</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="p">(</span><span class="n">n</span><span class="w"> </span><span class="n">-</span><span class="w"> </span><span class="mi">1</span><span class="p">))</span><span class="w"></span>
<span class="w">                </span><span class="k">handle</span><span class="w"> </span><span class="n">e</span><span class="w"> </span><span class="p">=&gt;</span><span class="w"> </span><span class="p">(</span><span class="n">Leave</span><span class="w"> </span><span class="s">&quot;f&quot;</span><span class="p">;</span><span class="w"> </span><span class="k">raise</span><span class="w"> </span><span class="n">e</span><span class="p">)</span><span class="w"></span>
<span class="w">      </span><span class="k">val</span><span class="w"> </span><span class="p">()</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="n">Leave</span><span class="w"> </span><span class="s">&quot;f&quot;</span><span class="w"></span>
<span class="w">   </span><span class="k">in</span><span class="w"></span>
<span class="w">      </span><span class="n">res</span><span class="w"></span>
<span class="w">   </span><span class="k">end</span><span class="w"></span>
</pre></div></div></div>
<div class="paragraph"><p>Each <span class="monospaced">Enter</span> and <span class="monospaced">Leave</span> actually carries more than the source function
name: it also records lexical nesting and file position.</p></div>
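<div class="paragraph"><p>To make the transformation concrete, here is a minimal runnable sketch. It assumes that <span class="monospaced">Enter</span> and <span class="monospaced">Leave</span> are ordinary functions that push and pop an explicit stack; in the compiler they are statements, not functions, and the names <span class="monospaced">stack</span> and <span class="monospaced">maxDepth</span> are hypothetical helpers used only for this illustration.</p></div>
<div class="listingblock">
<div class="content"><pre>
```sml
(* Hypothetical stand-ins for the compiler's Enter/Leave statements:
   ordinary functions that maintain an explicit stack of function names. *)
val stack : string list ref = ref []
val maxDepth = ref 0

fun Enter name =
   (stack := name :: !stack
    ; maxDepth := Int.max (!maxDepth, length (!stack)))

fun Leave name =
   case !stack of
      top :: rest =>
         if top = name then stack := rest
         else raise Fail ("mismatched Leave " ^ name)
    | [] => raise Fail ("Leave on empty stack: " ^ name)

(* The instrumented function from the text. *)
fun f n =
   let
      val () = Enter "f"
      val res = (if n = 0 then 0 else 1 + f (n - 1))
                handle e => (Leave "f"; raise e)
      val () = Leave "f"
   in
      res
   end

val result = f 3
```
</pre></div></div>
<div class="paragraph"><p>After <span class="monospaced">f 3</span> returns, the stack in this sketch is empty again, and <span class="monospaced">maxDepth</span> records the four nested activations of <span class="monospaced">f</span>.</p></div>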
<div class="paragraph"><p>Most of the middle of the compiler ignores, but preserves, <span class="monospaced">Enter</span> and
<span class="monospaced">Leave</span>. However, so that profiling preserves tail calls, the
<a href="Shrink">SSA shrinker</a> has an optimization that notices when the only
operations that cause a call to be a nontail call are profiling
operations, and if so, moves them before the call, turning it into a
tail call. If you observe a program in which a tail call appears
to be turned into a nontail call when compiled with profiling, please
<a href="Bug">report a bug</a>.</p></div>
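<div class="paragraph"><p>The effect of moving profiling operations before a call can be illustrated at the source level. This is only a sketch: the real optimization works on the SSA IL, the exception handler from the earlier example is omitted for brevity, and <span class="monospaced">loopNaive</span>, <span class="monospaced">loopMoved</span>, <span class="monospaced">depth</span>, and <span class="monospaced">maxDepth</span> are hypothetical names. <span class="monospaced">Enter</span> and <span class="monospaced">Leave</span> are again modeled as ordinary functions.</p></div>
<div class="listingblock">
<div class="content"><pre>
```sml
(* Hypothetical stand-ins for Enter/Leave that track nesting depth. *)
val depth = ref 0
val maxDepth = ref 0

fun Enter (_ : string) =
   (depth := !depth + 1; maxDepth := Int.max (!maxDepth, !depth))

fun Leave (_ : string) = depth := !depth - 1

(* Naive instrumentation: Leave runs after the recursive call returns,
   so the recursive call is no longer a tail call. *)
fun loopNaive (n, acc) =
   let
      val () = Enter "loop"
      val res = if n = 0 then acc else loopNaive (n - 1, acc + n)
      val () = Leave "loop"
   in
      res
   end

(* Leave moved before the call: the recursive call is in tail position. *)
fun loopMoved (n, acc) =
   let
      val () = Enter "loop"
   in
      if n = 0
         then (Leave "loop"; acc)
      else (Leave "loop"; loopMoved (n - 1, acc + n))
   end

val naiveResult = loopNaive (5, 0)
val naiveMax = !maxDepth
val () = maxDepth := 0
val movedResult = loopMoved (5, 0)
val movedMax = !maxDepth
```
</pre></div></div>
<div class="paragraph"><p>Both versions compute the same result, but in this sketch the naive version reaches a nesting depth of six while the version with <span class="monospaced">Leave</span> moved never exceeds one.</p></div>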
<div class="paragraph"><p>The <span class="monospaced">checkProf</span> function in
<a href="https://github.com/MLton/mlton/blob/master/mlton/ssa/type-check.fun"><span class="monospaced">type-check.fun</span></a> checks that
the <span class="monospaced">Enter</span>/<span class="monospaced">Leave</span> statements match up.</p></div>
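<div class="paragraph"><p>As a rough idea of what "matching up" means, the following sketch checks a linear sequence of events with a stack. It is not a transcription of <span class="monospaced">checkProf</span>, which operates on SSA programs; the <span class="monospaced">event</span> datatype and the <span class="monospaced">balanced</span> function are assumptions made for illustration.</p></div>
<div class="listingblock">
<div class="content"><pre>
```sml
(* Hypothetical event type; the compiler's representation differs. *)
datatype event = Enter of string | Leave of string

(* True when every Leave matches the most recently opened Enter and
   nothing is left open at the end of the sequence. *)
fun balanced events =
   let
      fun go ([], stack) = null stack
        | go (Enter name :: rest, stack) = go (rest, name :: stack)
        | go (Leave name :: rest, top :: stack) =
             top = name andalso go (rest, stack)
        | go (Leave _ :: _, []) = false
   in
      go (events, [])
   end

val ok = balanced [Enter "f", Enter "g", Leave "g", Leave "f"]
val mismatched = balanced [Enter "f", Leave "g"]
val unclosed = balanced [Enter "f"]
val unopened = balanced [Leave "f"]
```
</pre></div></div>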
<div class="paragraph"><p>In the backend, just before translating to the <a href="Machine">Machine IL</a>,
the profiler uses the <span class="monospaced">Enter</span>/<span class="monospaced">Leave</span> statements to infer the "local"
portion of the control stack at each program point. The profiler then
removes the <span class="monospaced">Enter</span>/<span class="monospaced">Leave</span> statements and inserts different information
depending on which kind of profiling is happening. For time profiling
(with the <a href="AMD64Codegen">AMD64Codegen</a> and <a href="X86Codegen">X86Codegen</a>), the profiler inserts labels that cover the
code (i.e., each statement has a unique label in its basic block that
prefixes it) and associates each label with the local control stack.
For time profiling (with the <a href="CCodegen">CCodegen</a> and <a href="LLVMCodegen">LLVMCodegen</a>), the profiler
inserts code that sets a global field that records the local control
stack. For allocation profiling, the profiler inserts calls to a C
function that maintains byte counts. With stack profiling, the
profiler also inserts a call to a C function at each nontail call in
order to maintain information at run time about which SML functions are
on the stack.</p></div>
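<div class="paragraph"><p>The inference of the local control stack can be sketched for a single straight-line block. This is a simplified model under stated assumptions: the <span class="monospaced">stmt</span> datatype and the <span class="monospaced">annotate</span> function are hypothetical, and the actual pass works on the compiler's intermediate representation across basic blocks.</p></div>
<div class="listingblock">
<div class="content"><pre>
```sml
(* Hypothetical statement type: profiling markers plus everything else. *)
datatype stmt = Enter of string | Leave of string | Other of string

(* Walk a straight-line block, drop the Enter/Leave markers, and pair
   each remaining statement with the local stack in effect at that
   point (innermost function first). *)
fun annotate stmts =
   let
      fun go ([], _, acc) = rev acc
        | go (Enter name :: rest, stack, acc) = go (rest, name :: stack, acc)
        | go (Leave _ :: rest, _ :: stack, acc) = go (rest, stack, acc)
        | go (Leave _ :: rest, [], acc) = go (rest, [], acc)
        | go (Other s :: rest, stack, acc) =
             go (rest, stack, (s, stack) :: acc)
   in
      go (stmts, [], [])
   end

val annotated =
   annotate [Enter "f", Other "x = 1", Enter "g", Other "y = 2",
             Leave "g", Other "z = 3", Leave "f"]
```
</pre></div></div>
<div class="paragraph"><p>In this sketch the statement <span class="monospaced">y = 2</span> is associated with the stack <span class="monospaced">["g", "f"]</span>, as might happen after <span class="monospaced">g</span> has been inlined into <span class="monospaced">f</span>, while the other two statements are associated with <span class="monospaced">["f"]</span>.</p></div>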
<div class="paragraph"><p>At run time, the profiler associates counters (either clock ticks or
byte counts) with source functions. When the program finishes, the
profiler writes the counts out to the <span class="monospaced">mlmon.out</span> file. Then,
<span class="monospaced">mlprof</span> uses source information stored in the executable to
associate the counts in the <span class="monospaced">mlmon.out</span> file with source
functions.</p></div>
<div class="paragraph"><p>For time profiling, the profiler catches the <span class="monospaced">SIGPROF</span> signal 100
times per second and increments the appropriate counter, determined by
looking at the label prefixing the current program counter and mapping
that to the current source function.</p></div>
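<div class="paragraph"><p>The lookup from program counter to counter can be modeled as follows. This is an illustrative simulation, not the runtime's code (which is written in C): the label table, its addresses, and the names <span class="monospaced">lookup</span>, <span class="monospaced">bump</span>, <span class="monospaced">tick</span>, and <span class="monospaced">countOf</span> are all assumptions, and <span class="monospaced">tick</span> merely stands in for what a <span class="monospaced">SIGPROF</span> handler would do.</p></div>
<div class="listingblock">
<div class="content"><pre>
```sml
(* Hypothetical label table: (start address, source function),
   sorted by ascending start address. *)
val labels = [(0, "main"), (40, "f"), (100, "g")]

(* Find the function whose label most closely precedes pc. *)
fun lookup pc =
   List.foldl (fn ((start, name), best) =>
                  if pc >= start then SOME name else best)
              NONE labels

(* Tick counters, kept as an association list for simplicity. *)
val counts : (string * int) list ref = ref []

fun bump name =
   let
      fun go [] = [(name, 1)]
        | go ((n, c) :: rest) =
             if n = name then (n, c + 1) :: rest else (n, c) :: go rest
   in
      counts := go (!counts)
   end

(* Stand-in for the signal handler: receives the interrupted pc. *)
fun tick pc =
   case lookup pc of
      SOME name => bump name
    | NONE => ()

fun countOf name =
   case List.find (fn (n, _) => n = name) (!counts) of
      SOME (_, c) => c
    | NONE => 0

(* Simulate five timer interrupts at various program counters. *)
val () = List.app tick [5, 45, 60, 120, 99]
```
</pre></div></div>
<div class="paragraph"><p>With the sample addresses above, three of the five simulated ticks fall under the label for <span class="monospaced">f</span>, and one each under <span class="monospaced">main</span> and <span class="monospaced">g</span>.</p></div>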
</div>
</div>
<div class="sect1">
<h2 id="_caveats">Caveats</h2>
<div class="sectionbody">
<div class="paragraph"><p>There may be a few missed clock ticks or bytes allocated at the very
end of the program after the data is written.</p></div>
<div class="paragraph"><p>Profiling has not been tested with signals or threads. In particular,
stack profiling may behave strangely.</p></div>
</div>
</div>
</div>
<div id="footnotes"><hr></div>
<div id="footer">
<div id="footer-text">
</div>
<div id="footer-badges">
</div>
</div>
</body>
</html>