Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 1 | .. SPDX-License-Identifier: GPL-2.0+ |
| 2 | .. Copyright (c) 2013 The Chromium OS Authors. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 3 | |
| 4 | Tracing in U-Boot |
| 5 | ================= |
| 6 | |
| 7 | U-Boot supports a simple tracing feature which allows a record of excecution |
| 8 | to be collected and sent to a host machine for analysis. At present the |
| 9 | main use for this is to profile boot time. |
| 10 | |
| 11 | |
| 12 | Overview |
| 13 | -------- |
| 14 | |
| 15 | The trace feature uses GCC's instrument-functions feature to trace all |
| 16 | function entry/exit points. These are then recorded in a memory buffer. |
| 17 | The memory buffer can be saved to the host over a network link using |
| 18 | tftpput or by writing to an attached memory device such as MMC. |
| 19 | |
| 20 | On the host, the file is first converted with a tool called 'proftool', |
| 21 | which extracts useful information from it. The resulting trace output |
| 22 | resembles that emitted by Linux's ftrace feature, so can be visually |
| 23 | displayed by pytimechart. |
| 24 | |
| 25 | |
| 26 | Quick-start using Sandbox |
| 27 | ------------------------- |
| 28 | |
| 29 | Sandbox is a build of U-Boot that can run under Linux so it is a convenient |
| 30 | way of trying out tracing before you use it on your actual board. To do |
| 31 | this, follow these steps: |
| 32 | |
| 33 | Add the following to include/configs/sandbox.h (if not already there) |
| 34 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 35 | .. code-block:: c |
| 36 | |
| 37 | #define CONFIG_TRACE |
| 38 | #define CONFIG_CMD_TRACE |
| 39 | #define CONFIG_TRACE_BUFFER_SIZE (16 << 20) |
| 40 | #define CONFIG_TRACE_EARLY_SIZE (8 << 20) |
| 41 | #define CONFIG_TRACE_EARLY |
| 42 | #define CONFIG_TRACE_EARLY_ADDR 0x00100000 |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 43 | |
| 44 | Build sandbox U-Boot with tracing enabled: |
| 45 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 46 | .. code-block:: console |
| 47 | |
| 48 | $ make FTRACE=1 O=sandbox sandbox_config |
| 49 | $ make FTRACE=1 O=sandbox |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 50 | |
| 51 | Run sandbox, wait for a bit of trace information to appear, and then capture |
| 52 | a trace: |
| 53 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 54 | .. code-block:: console |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 55 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 56 | $ ./sandbox/u-boot |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 57 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 58 | U-Boot 2013.04-rc2-00100-ga72fcef (Apr 17 2013 - 19:25:24) |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 59 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 60 | DRAM: 128 MiB |
| 61 | trace: enabled |
| 62 | Using default environment |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 63 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 64 | In: serial |
| 65 | Out: serial |
| 66 | Err: serial |
| 67 | =>trace stats |
| 68 | 671,406 function sites |
| 69 | 69,712 function calls |
| 70 | 0 untracked function calls |
| 71 | 73,373 traced function calls |
| 72 | 16 maximum observed call depth |
| 73 | 15 call depth limit |
| 74 | 66,491 calls not traced due to depth |
| 75 | =>trace stats |
| 76 | 671,406 function sites |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 77 | 1,279,450 function calls |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 78 | 0 untracked function calls |
| 79 | 950,490 traced function calls (333217 dropped due to overflow) |
| 80 | 16 maximum observed call depth |
| 81 | 15 call depth limit |
| 82 | 1,275,767 calls not traced due to depth |
| 83 | =>trace calls 0 e00000 |
| 84 | Call list dumped to 00000000, size 0xae0a40 |
| 85 | =>print |
| 86 | baudrate=115200 |
| 87 | profbase=0 |
| 88 | profoffset=ae0a40 |
| 89 | profsize=e00000 |
| 90 | stderr=serial |
| 91 | stdin=serial |
| 92 | stdout=serial |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 93 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 94 | Environment size: 117/8188 bytes |
| 95 | =>host save host 0 trace 0 ${profoffset} |
| 96 | 11405888 bytes written in 10 ms (1.1 GiB/s) |
| 97 | =>reset |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 98 | |
| 99 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 100 | Then run proftool to convert the trace information to ftrace format |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 101 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 102 | .. code-block:: console |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 103 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 104 | $ ./sandbox/tools/proftool -m sandbox/System.map -p trace dump-ftrace >trace.txt |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 105 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 106 | Finally run pytimechart to display it |
| 107 | |
| 108 | .. code-block:: console |
| 109 | |
| 110 | $ pytimechart trace.txt |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 111 | |
| 112 | Using this tool you can zoom and pan across the trace, with the function |
| 113 | calls on the left and little marks representing the start and end of each |
| 114 | function. |
| 115 | |
| 116 | |
| 117 | CONFIG Options |
| 118 | -------------- |
| 119 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 120 | CONFIG_TRACE |
| 121 | Enables the trace feature in U-Boot. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 122 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 123 | CONFIG_CMD_TRACE |
| 124 | Enables the trace command. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 125 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 126 | CONFIG_TRACE_BUFFER_SIZE |
| 127 | Size of trace buffer to allocate for U-Boot. This buffer is |
| 128 | used after relocation, as a place to put function tracing |
| 129 | information. The address of the buffer is determined by |
| 130 | the relocation code. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 131 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 132 | CONFIG_TRACE_EARLY |
| 133 | Define this to start tracing early, before relocation. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 134 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 135 | CONFIG_TRACE_EARLY_SIZE |
| 136 | Size of 'early' trace buffer. Before U-Boot has relocated |
| 137 | it doesn't have a proper trace buffer. On many boards |
| 138 | you can define an area of memory to use for the trace |
| 139 | buffer until the 'real' trace buffer is available after |
| 140 | relocation. The contents of this buffer are then copied to |
| 141 | the real buffer. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 142 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 143 | CONFIG_TRACE_EARLY_ADDR |
| 144 | Address of early trace buffer |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 145 | |
| 146 | |
| 147 | Building U-Boot with Tracing Enabled |
| 148 | ------------------------------------ |
| 149 | |
| 150 | Pass 'FTRACE=1' to the U-Boot Makefile to actually instrument the code. |
| 151 | This is kept as a separate option so that it is easy to enable/disable |
| 152 | instrumenting from the command line instead of having to change board |
| 153 | config files. |
| 154 | |
| 155 | |
| 156 | Collecting Trace Data |
| 157 | --------------------- |
| 158 | |
| 159 | When you run U-Boot on your board it will collect trace data up to the |
| 160 | limit of the trace buffer size you have specified. Once that is exhausted |
| 161 | no more data will be collected. |
| 162 | |
| 163 | Collecting trace data has an affect on execution time/performance. You |
| 164 | will notice this particularly with trvial functions - the overhead of |
| 165 | recording their execution may even exceed their normal execution time. |
| 166 | In practice this doesn't matter much so long as you are aware of the |
| 167 | effect. Once you have done your optimisations, turn off tracing before |
| 168 | doing end-to-end timing. |
| 169 | |
| 170 | The best time to start tracing is right at the beginning of U-Boot. The |
| 171 | best time to stop tracing is right at the end. In practice it is hard |
| 172 | to achieve these ideals. |
| 173 | |
| 174 | This implementation enables tracing early in board_init_f(). This means |
| 175 | that it captures most of the board init process, missing only the |
| 176 | early architecture-specific init. However, it also misses the entire |
| 177 | SPL stage if there is one. |
| 178 | |
| 179 | U-Boot typically ends with a 'bootm' command which loads and runs an |
| 180 | OS. There is useful trace data in the execution of that bootm |
| 181 | command. Therefore this implementation provides a way to collect trace |
| 182 | data after bootm has finished processing, but just before it jumps to |
| 183 | the OS. In practical terms, U-Boot runs the 'fakegocmd' environment |
| 184 | variable at this point. This variable should have a short script which |
| 185 | collects the trace data and writes it somewhere. |
| 186 | |
| 187 | Trace data collection relies on a microsecond timer, accesed through |
| 188 | timer_get_us(). So the first think you should do is make sure that |
| 189 | this produces sensible results for your board. Suitable sources for |
| 190 | this timer include high resolution timers, PWMs or profile timers if |
| 191 | available. Most modern SOCs have a suitable timer for this. Make sure |
| 192 | that you mark this timer (and anything it calls) with |
| 193 | __attribute__((no_instrument_function)) so that the trace library can |
| 194 | use it without causing an infinite loop. |
| 195 | |
| 196 | |
| 197 | Commands |
| 198 | -------- |
| 199 | |
| 200 | The trace command has variable sub-commands: |
| 201 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 202 | stats |
| 203 | Display tracing statistics |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 204 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 205 | pause |
| 206 | Pause tracing |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 207 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 208 | resume |
| 209 | Resume tracing |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 210 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 211 | funclist [<addr> <size>] |
| 212 | Dump a list of functions into the buffer |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 213 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 214 | calls [<addr> <size>] |
| 215 | Dump function call trace into buffer |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 216 | |
| 217 | If the address and size are not given, these are obtained from environment |
| 218 | variables (see below). In any case the environment variables are updated |
| 219 | after the command runs. |
| 220 | |
| 221 | |
| 222 | Environment Variables |
| 223 | --------------------- |
| 224 | |
| 225 | The following are used: |
| 226 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 227 | profbase |
| 228 | Base address of trace output buffer |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 229 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 230 | profoffset |
| 231 | Offset of first unwritten byte in trace output buffer |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 232 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 233 | profsize |
| 234 | Size of trace output buffer |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 235 | |
| 236 | All of these are set by the 'trace calls' command. |
| 237 | |
| 238 | These variables keep track of the amount of data written to the trace |
| 239 | output buffer by the 'trace' command. The trace commands which write data |
| 240 | to the output buffer can use these to specify the buffer to write to, and |
| 241 | update profoffset each time. This allows successive commands to append data |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 242 | to the same buffer, for example:: |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 243 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 244 | => trace funclist 10000 e00000 |
| 245 | => trace calls |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 246 | |
| 247 | (the latter command appends more data to the buffer). |
| 248 | |
| 249 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 250 | fakegocmd |
| 251 | Specifies commands to run just before booting the OS. This |
| 252 | is a useful time to write the trace data to the host for |
| 253 | processing. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 254 | |
| 255 | |
| 256 | Writing Out Trace Data |
| 257 | ---------------------- |
| 258 | |
| 259 | Once the trace data is in an output buffer in memory there are various ways |
| 260 | to transmit it to the host. Notably you can use tftput to send the data |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 261 | over a network link:: |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 262 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 263 | fakegocmd=trace pause; usb start; set autoload n; bootp; |
| 264 | trace calls 10000000 1000000; |
| 265 | tftpput ${profbase} ${profoffset} 192.168.1.4:/tftpboot/calls |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 266 | |
| 267 | This starts up USB (to talk to an attached USB Ethernet dongle), writes |
| 268 | a trace log to address 10000000 and sends it to a host machine using |
| 269 | TFTP. After this, U-Boot will boot the OS normally, albeit a little |
| 270 | later. |
| 271 | |
| 272 | |
| 273 | Converting Trace Output Data |
| 274 | ---------------------------- |
| 275 | |
| 276 | The trace output data is kept in a binary format which is not documented |
| 277 | here. To convert it into something useful, you can use proftool. |
| 278 | |
| 279 | This tool must be given the U-Boot map file and the trace data received |
| 280 | from running that U-Boot. It produces a text output file. |
| 281 | |
| 282 | Options |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 283 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 284 | -m <map_file> |
| 285 | Specify U-Boot map file |
| 286 | |
| 287 | -p <trace_file> |
| 288 | Specifiy profile/trace file |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 289 | |
| 290 | Commands: |
| 291 | |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 292 | dump-ftrace |
| 293 | Write a text dump of the file in Linux ftrace format to stdout |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 294 | |
| 295 | |
| 296 | Viewing the Trace Data |
| 297 | ---------------------- |
| 298 | |
| 299 | You can use pytimechart for this (sudo apt-get pytimechart might work on |
| 300 | your Debian-style machine, and use your favourite search engine to obtain |
| 301 | documentation). It expects the file to have a .txt extension. The program |
| 302 | has terse user interface but is very convenient for viewing U-Boot |
| 303 | profile information. |
| 304 | |
| 305 | |
| 306 | Workflow Suggestions |
| 307 | -------------------- |
| 308 | |
| 309 | The following suggestions may be helpful if you are trying to reduce boot |
| 310 | time: |
| 311 | |
| 312 | 1. Enable CONFIG_BOOTSTAGE and CONFIG_BOOTSTAGE_REPORT. This should get |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 313 | you are helpful overall snapshot of the boot time. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 314 | |
| 315 | 2. Build U-Boot with tracing and run it. Note the difference in boot time |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 316 | (it is common for tracing to add 10% to the time) |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 317 | |
| 318 | 3. Collect the trace information as descibed above. Use this to find where |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 319 | all the time is being spent. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 320 | |
| 321 | 4. Take a look at that code and see if you can optimise it. Perhaps it is |
Heinrich Schuchardt | dce26c7 | 2020-12-12 10:14:22 +0100 | [diff] [blame] | 322 | possible to speed up the initialisation of a device, or remove an unused |
| 323 | feature. |
Simon Glass | b2e16a8 | 2013-06-11 11:14:39 -0700 | [diff] [blame] | 324 | |
| 325 | 5. Rebuild, run and collect again. Compare your results. |
| 326 | |
| 327 | 6. Keep going until you run out of steam, or your boot is fast enough. |
| 328 | |
| 329 | |
| 330 | Configuring Trace |
| 331 | ----------------- |
| 332 | |
| 333 | There are a few parameters in the code that you may want to consider. |
| 334 | There is a function call depth limit (set to 15 by default). When the |
| 335 | stack depth goes above this then no tracing information is recorded. |
| 336 | The maximum depth reached is recorded and displayed by the 'trace stats' |
| 337 | command. |
| 338 | |
| 339 | |
| 340 | Future Work |
| 341 | ----------- |
| 342 | |
| 343 | Tracing could be a little tidier in some areas, for example providing |
| 344 | run-time configuration options for trace. |
| 345 | |
| 346 | Some other features that might be useful: |
| 347 | |
| 348 | - Trace filter to select which functions are recorded |
| 349 | - Sample-based profiling using a timer interrupt |
| 350 | - Better control over trace depth |
| 351 | - Compression of trace information |
| 352 | |
| 353 | |
| 354 | Simon Glass <sjg@chromium.org> |
| 355 | April 2013 |