Commit | Line | Data |
---|---|---|
7f918cf1 CE |
1 | <!DOCTYPE html>\r |
2 | <html lang="en">\r | |
3 | <head>\r | |
4 | <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\r | |
5 | <meta name="generator" content="AsciiDoc 8.6.9">\r | |
6 | <title>HowProfilingWorks</title>\r | |
7 | <link rel="stylesheet" href="./asciidoc.css" type="text/css">\r | |
8 | <link rel="stylesheet" href="./pygments.css" type="text/css">\r | |
9 | \r | |
10 | \r | |
11 | <script type="text/javascript" src="./asciidoc.js"></script>\r | |
12 | <script type="text/javascript">\r | |
13 | /*<![CDATA[*/\r | |
14 | asciidoc.install();\r | |
15 | /*]]>*/\r | |
16 | </script>\r | |
17 | <link rel="stylesheet" href="./mlton.css" type="text/css">\r | |
18 | </head>\r | |
19 | <body class="article">\r | |
20 | <div id="banner">\r | |
21 | <div id="banner-home">\r | |
22 | <a href="./Home">MLton 20180207</a>\r | |
23 | </div>\r | |
24 | </div>\r | |
25 | <div id="header">\r | |
26 | <h1>HowProfilingWorks</h1>\r | |
27 | </div>\r | |
28 | <div id="content">\r | |
29 | <div id="preamble">\r | |
30 | <div class="sectionbody">\r | |
31 | <div class="paragraph"><p>Here’s how <a href="Profiling">Profiling</a> works. If profiling is on, the front end\r | |
32 | (elaborator) inserts <span class="monospaced">Enter</span> and <span class="monospaced">Leave</span> statements into the source\r | |
33 | program for function entry and exit. For example,</p></div>\r | |
34 | <div class="listingblock">\r | |
35 | <div class="content"><div class="highlight"><pre><span class="k">fun</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="k">if</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">then</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="n">+</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="p">(</span><span class="n">n</span><span class="w"> </span><span class="n">-</span><span class="w"> </span><span class="mi">1</span><span class="p">)</span><span class="w"></span>\r | |
36 | </pre></div></div></div>\r | |
37 | <div class="paragraph"><p>becomes</p></div>\r | |
38 | <div class="listingblock">\r | |
39 | <div class="content"><div class="highlight"><pre><span class="k">fun</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"></span>\r | |
40 | <span class="w"> </span><span class="k">let</span><span class="w"></span>\r | |
41 | <span class="w"> </span><span class="k">val</span><span class="w"> </span><span class="p">()</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="n">Enter</span><span class="w"> </span><span class="s">"f"</span><span class="w"></span>\r | |
42 | <span class="w"> </span><span class="k">val</span><span class="w"> </span><span class="n">res</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="p">(</span><span class="k">if</span><span class="w"> </span><span class="n">n</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">then</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="k">else</span><span class="w"> </span><span class="mi">1</span><span class="w"> </span><span class="n">+</span><span class="w"> </span><span class="n">f</span><span class="w"> </span><span class="p">(</span><span class="n">n</span><span class="w"> </span><span class="n">-</span><span class="w"> </span><span class="mi">1</span><span class="p">))</span><span class="w"></span>\r | |
43 | <span class="w"> </span><span class="k">handle</span><span class="w"> </span><span class="n">e</span><span class="w"> </span><span class="p">=></span><span class="w"> </span><span class="p">(</span><span class="n">Leave</span><span class="w"> </span><span class="s">"f"</span><span class="p">;</span><span class="w"> </span><span class="k">raise</span><span class="w"> </span><span class="n">e</span><span class="p">)</span><span class="w"></span>\r | |
44 | <span class="w"> </span><span class="k">val</span><span class="w"> </span><span class="p">()</span><span class="w"> </span><span class="p">=</span><span class="w"> </span><span class="n">Leave</span><span class="w"> </span><span class="s">"f"</span><span class="w"></span>\r | |
45 | <span class="w"> </span><span class="k">in</span><span class="w"></span>\r | |
46 | <span class="w"> </span><span class="n">res</span><span class="w"></span>\r | |
47 | <span class="w"> </span><span class="k">end</span><span class="w"></span>\r | |
48 | </pre></div></div></div>\r | |
49 | <div class="paragraph"><p>In practice, each <span class="monospaced">Enter</span> and <span class="monospaced">Leave</span> carries more than the\r | |
50 | source function name: it also records lexical nesting and file position.</p></div>\r | |
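<div class="paragraph"><p>The role of the exception handler in the instrumented function above can be checked with a small, hand-written sketch. Everything here is an assumption for illustration: <span class="monospaced">enter</span>, <span class="monospaced">leave</span>, <span class="monospaced">log</span>, and the <span class="monospaced">Negative</span> exception are hypothetical stand-ins, whereas the real <span class="monospaced">Enter</span> and <span class="monospaced">Leave</span> are compiler-internal statements, not SML functions.</p></div>
<div class="listingblock">
<div class="content"><pre>(* Hypothetical stand-ins that record each profiling event in a list. *)
val log : string list ref = ref []
fun enter name = log := ("Enter " ^ name) :: !log
fun leave name = log := ("Leave " ^ name) :: !log

(* Hypothetical exception, used only to exercise the handler. *)
exception Negative

(* Instrumented by hand in the same shape as the example above. *)
fun f n =
   let
      val () = enter "f"
      val res = (if n = 0 then 0
                 else if n > 0 then 1 + f (n - 1)
                 else raise Negative)
                handle e => (leave "f"; raise e)
      val () = leave "f"
   in
      res
   end

(* Normal return: three entries followed by three exits. *)
val r = f 2
val normalEvents = rev (!log)
(* normalEvents = ["Enter f", "Enter f", "Enter f",
                   "Leave f", "Leave f", "Leave f"] *)

(* Exceptional exit: the handler still records the matching Leave. *)
val () = log := []
val () = (ignore (f ~1); ()) handle Negative => ()
val raisedEvents = rev (!log)
(* raisedEvents = ["Enter f", "Leave f"] *)
</pre></div></div>
<div class="paragraph"><p>In both runs every entry is paired with an exit, which is the property the handler exists to preserve.</p></div>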
51 | <div class="paragraph"><p>Most of the middle of the compiler ignores, but preserves, <span class="monospaced">Enter</span> and\r | |
52 | <span class="monospaced">Leave</span>. However, so that profiling preserves tail calls, the\r | |
53 | <a href="Shrink">SSA shrinker</a> has an optimization that notices when the only\r | |
54 | operations that cause a call to be a nontail call are profiling\r | |
55 | operations, and if so, moves them before the call, turning it into a\r | |
56 | tail call. If you observe a program that has a tail call that appears\r | |
57 | to be turned into a nontail call when compiled with profiling, please\r | |
58 | <a href="Bug">report a bug</a>.</p></div>\r | |
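<div class="paragraph"><p>The effect of moving the profiling operations before the call can be pictured at the source level. This is only an analogy under stated assumptions: the actual optimization works on the SSA intermediate language, the exception handler from the instrumented form is omitted for brevity, and <span class="monospaced">enter</span>, <span class="monospaced">leave</span>, <span class="monospaced">log</span>, <span class="monospaced">h</span>, <span class="monospaced">gNaive</span>, and <span class="monospaced">gMoved</span> are hypothetical names.</p></div>
<div class="listingblock">
<div class="content"><pre>(* Hypothetical stand-ins that record each profiling event in a list. *)
val log : string list ref = ref []
fun enter name = log := ("Enter " ^ name) :: !log
fun leave name = log := ("Leave " ^ name) :: !log

fun h x = (enter "h"; let val r = x + 1 in leave "h"; r end)

(* Naive placement: leave "g" runs after h returns, so the call to h
   is a nontail call even though g does nothing else with the result. *)
fun gNaive x =
   let
      val () = enter "g"
      val res = h x
      val () = leave "g"
   in
      res
   end

(* Profiling operation moved before the call: h x is now in tail position. *)
fun gMoved x =
   let
      val () = enter "g"
      val () = leave "g"
   in
      h x
   end

val naiveEvents = (log := []; ignore (gNaive 1); rev (!log))
(* naiveEvents = ["Enter g", "Enter h", "Leave h", "Leave g"] *)

val movedEvents = (log := []; ignore (gMoved 1); rev (!log))
(* movedEvents = ["Enter g", "Leave g", "Enter h", "Leave h"] *)
</pre></div></div>
<div class="paragraph"><p>Both versions return the same value; only the position of the exit event relative to the call differs, and the second form no longer needs a stack frame for <span class="monospaced">g</span> during the call.</p></div>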
59 | <div class="paragraph"><p>The <span class="monospaced">checkProf</span> function in\r | |
60 | <a href="https://github.com/MLton/mlton/blob/master/mlton/ssa/type-check.fun"><span class="monospaced">type-check.fun</span></a> checks that\r | |
61 | the <span class="monospaced">Enter</span>/<span class="monospaced">Leave</span> statements match up.</p></div>\r | |
62 | <div class="paragraph"><p>In the back end, just before translating to the <a href="Machine">Machine IL</a>,\r | |
63 | the profiler uses the <span class="monospaced">Enter</span>/<span class="monospaced">Leave</span> statements to infer the "local"\r | |
64 | portion of the control stack at each program point. The profiler then\r | |
65 | removes the <span class="monospaced">Enter</span>s/<span class="monospaced">Leave</span>s and inserts different information\r | |
66 | depending on which kind of profiling is happening. For time profiling\r | |
67 | (with the <a href="AMD64Codegen">AMD64Codegen</a> and <a href="X86Codegen">X86Codegen</a>), the profiler inserts labels that cover the\r | |
68 | code (i.e. each statement has a unique label in its basic block that\r | |
69 | prefixes it) and associates each label with the local control stack.\r | |
70 | For time profiling (with the <a href="CCodegen">CCodegen</a> and <a href="LLVMCodegen">LLVMCodegen</a>), the profiler\r | |
71 | inserts code that sets a global field that records the local control\r | |
72 | stack. For allocation profiling, the profiler inserts calls to a C\r | |
73 | function that will maintain byte counts. With stack profiling, the\r | |
74 | profiler also inserts a call to a C function at each nontail call in\r | |
75 | order to maintain information at runtime about what SML functions are\r | |
76 | on the stack.</p></div>\r | |
77 | <div class="paragraph"><p>At run time, the profiler associates counters (either clock ticks or\r | |
78 | byte counts) with source functions. When the program finishes, the\r | |
79 | profiler writes the counts out to the <span class="monospaced">mlmon.out</span> file. Then,\r | |
80 | <span class="monospaced">mlprof</span> uses source information stored in the executable to\r | |
81 | associate the counts in the <span class="monospaced">mlmon.out</span> file with source\r | |
82 | functions.</p></div>\r | |
83 | <div class="paragraph"><p>For time profiling, the profiler catches the <span class="monospaced">SIGPROF</span> signal 100\r | |
84 | times per second and increments the appropriate counter, determined by\r | |
85 | looking at the label prefixing the current program counter and mapping\r | |
86 | that to the current source function.</p></div>\r | |
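<div class="paragraph"><p>The bookkeeping behind label-based time profiling can be simulated in a few lines. This is a sketch under stated assumptions, not the runtime's implementation: the label names, the <span class="monospaced">labelToSource</span> table, and the list of samples are invented for illustration, and a plain list stands in for the sequence of labels observed at each <span class="monospaced">SIGPROF</span> tick.</p></div>
<div class="listingblock">
<div class="content"><pre>(* Hypothetical label table: each code label maps to the source
   function whose code it prefixes. *)
val labelToSource = [("L_0", "f"), ("L_1", "f"), ("L_2", "g")]

fun sourceOf label =
   case List.find (fn (l, _) => l = label) labelToSource of
      SOME (_, src) => src
    | NONE => "unknown"

(* Increment the counter for one source function, adding it if absent. *)
fun bump (src, []) = [(src, 1)]
  | bump (src, (s, n) :: rest) =
       if s = src then (s, n + 1) :: rest
       else (s, n) :: bump (src, rest)

(* Simulated samples: the label found at the program counter on each tick. *)
val samples = ["L_0", "L_2", "L_1", "L_2", "L_2"]

val counts =
   List.foldl (fn (label, acc) => bump (sourceOf label, acc)) [] samples
(* counts = [("f", 2), ("g", 3)] *)
</pre></div></div>
<div class="paragraph"><p>At 100 ticks per second, each count corresponds to roughly 10 milliseconds attributed to that source function.</p></div>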
87 | </div>\r | |
88 | </div>\r | |
89 | <div class="sect1">\r | |
90 | <h2 id="_caveats">Caveats</h2>\r | |
91 | <div class="sectionbody">\r | |
92 | <div class="paragraph"><p>A few clock ticks or allocated bytes at the very end of the program may\r | |
93 | go uncounted, because they occur after the data is written.</p></div>\r | |
94 | <div class="paragraph"><p>Profiling has not been tested with signals or threads. In particular,\r | |
95 | stack profiling may behave strangely.</p></div>\r | |
96 | </div>\r | |
97 | </div>\r | |
98 | </div>\r | |
99 | <div id="footnotes"><hr></div>\r | |
100 | <div id="footer">\r | |
101 | <div id="footer-text">\r | |
102 | </div>\r | |
103 | <div id="footer-badges">\r | |
104 | </div>\r | |
105 | </div>\r | |
106 | </body>\r | |
107 | </html>\r |