[PATCH 0/9] codman: Add a new source-code analysis tool
From: Simon Glass <simon.glass@canonical.com>

Add a new tool called 'codman' (code manager) for analysing source code
usage in U-Boot builds. This tool determines which files and lines of
code are actually compiled, based on the build configuration.

The tool provides three analysis methods:
- unifdef: Static preprocessor analysis (default)
- DWARF: Debug information from compiled code (-w)
- (experimental) LSP: Language-server analysis using clangd (-l)

Codman supports:
- File-level analysis: which files are compiled vs unused
- Line-level analysis: which lines are active vs removed by the
  preprocessor
- Kconfig-impact analysis with the -a/--adjust option
- Various output formats: stats, directories, detail, summary

Since there is quite a lot of processing involved, codman uses parallel
processing where possible.

This tool is admittedly not quite up to my normal code quality, but it
has been an interesting experiment in using Claude to create something
from scratch.

The unifdef part of the tool benefits from some patches I created for
that tool:
- O(1) algorithm for symbol lookup, instead of O(n) - faster!
- support for IS_ENABLED(), CONFIG_IS_ENABLED()

Please get in touch if you would like the patches.

This series also includes a minor improvement to buildman and a tidy-up
of the tout library to reduce code duplication.

Simon Glass (9):
  u_boot_pylib: Add stderr parameter to tprint()
  u_boot_pylib: Use terminal.tprint() for output in tout
  buildman: Support comma-separated values in -a flag
  codman: Add a new source-code analysis tool
  codman: Provide an unifdef analyser
  codman: Provide a dwarf analyser
  codman: Begin an experimental lsp analyser
  codman: Add some basic tests
  codman: Add documentation

 doc/develop/codman.rst         |   1 +
 doc/develop/index.rst          |   1 +
 tools/buildman/buildman.rst    |  13 +
 tools/buildman/cfgutil.py      |  19 +-
 tools/buildman/cmdline.py      |   3 +-
 tools/buildman/test.py         |  12 +
 tools/codman/analyser.py       |  76 ++++
 tools/codman/codman            |   1 +
 tools/codman/codman.py         | 664 +++++++++++++++++++++++++++++++++
 tools/codman/codman.rst        | 426 +++++++++++++++++++++
 tools/codman/dwarf.py          | 200 ++++++++++
 tools/codman/lsp.py            | 319 ++++++++++++++++
 tools/codman/lsp_client.py     | 225 +++++++++++
 tools/codman/output.py         | 536 ++++++++++++++++++++++++++
 tools/codman/test_codman.py    | 470 +++++++++++++++++++++++
 tools/codman/test_lsp.py       | 153 ++++++++
 tools/codman/unifdef.py        | 429 +++++++++++++++++++++
 tools/u_boot_pylib/terminal.py |  10 +-
 tools/u_boot_pylib/tout.py     |  24 +-
 19 files changed, 3554 insertions(+), 28 deletions(-)
 create mode 120000 doc/develop/codman.rst
 create mode 100644 tools/codman/analyser.py
 create mode 120000 tools/codman/codman
 create mode 100755 tools/codman/codman.py
 create mode 100644 tools/codman/codman.rst
 create mode 100644 tools/codman/dwarf.py
 create mode 100644 tools/codman/lsp.py
 create mode 100644 tools/codman/lsp_client.py
 create mode 100644 tools/codman/output.py
 create mode 100755 tools/codman/test_codman.py
 create mode 100755 tools/codman/test_lsp.py
 create mode 100644 tools/codman/unifdef.py

-- 
2.43.0

base-commit: ac7212f5f9792af31bbacc0f46423d0ca6a39e1d
branch: codman
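An illustrative set of invocations (hypothetical examples only: the flag
spellings are those listed above, and the full syntax is covered by the
documentation patch later in this series):

    codman                  # file/line analysis using unifdef (default)
    codman -w               # use DWARF debug info from a compiled build
    codman -l               # experimental clangd/LSP analysis
    codman -a BOOTSTD_FULL  # show the Kconfig impact of an option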
From: Simon Glass <simon.glass@canonical.com> Add optional stderr parameter to tprint() to allow printing to stderr instead of stdout. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- tools/u_boot_pylib/terminal.py | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/tools/u_boot_pylib/terminal.py b/tools/u_boot_pylib/terminal.py index 69c183e85e5..e62fa166dca 100644 --- a/tools/u_boot_pylib/terminal.py +++ b/tools/u_boot_pylib/terminal.py @@ -141,7 +141,7 @@ def trim_ascii_len(text, size): def tprint(text='', newline=True, colour=None, limit_to_line=False, - bright=True, back=None, col=None): + bright=True, back=None, col=None, stderr=False): """Handle a line of output to the terminal. In test mode this is recorded in a list. Otherwise it is output to the @@ -151,6 +151,7 @@ def tprint(text='', newline=True, colour=None, limit_to_line=False, text: Text to print newline: True to add a new line at the end of the text colour: Colour to use for the text + stderr: True to print to stderr instead of stdout """ global last_print_len @@ -161,14 +162,17 @@ def tprint(text='', newline=True, colour=None, limit_to_line=False, if not col: col = Color() text = col.build(colour, text, bright=bright, back=back) + + file = sys.stderr if stderr else sys.stdout + if newline: - print(text) + print(text, file=file) last_print_len = None else: if limit_to_line: cols = shutil.get_terminal_size().columns text = trim_ascii_len(text, cols) - print(text, end='', flush=True) + print(text, end='', flush=True, file=file) last_print_len = calc_ascii_len(text) def print_clear(): -- 2.43.0
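A minimal usage sketch for the new parameter (illustration only, not part
of the patch; it assumes the Color constants already defined in
terminal.py):

    from u_boot_pylib import terminal

    # Progress output stays on stdout, as before
    terminal.tprint('Building...', newline=False)

    # Diagnostics can now go to stderr, keeping stdout clean for piping
    terminal.tprint('warning: no toolchain found',
                    colour=terminal.Color.YELLOW, stderr=True)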
From: Simon Glass <simon.glass@canonical.com> Refactor tout.py to use terminal.tprint() instead of direct print() calls. This provides better control over output formatting and supports the new stderr parameter. It also reduces code duplication. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- tools/u_boot_pylib/tout.py | 24 +++++++----------------- 1 file changed, 7 insertions(+), 17 deletions(-) diff --git a/tools/u_boot_pylib/tout.py b/tools/u_boot_pylib/tout.py index ca72108d6bc..137b55edfd0 100644 --- a/tools/u_boot_pylib/tout.py +++ b/tools/u_boot_pylib/tout.py @@ -11,8 +11,6 @@ from u_boot_pylib import terminal # Output verbosity levels that we support FATAL, ERROR, WARNING, NOTICE, INFO, DETAIL, DEBUG = range(7) -in_progress = False - """ This class handles output of progress and other useful information to the user. It provides for simple verbosity level control and can @@ -46,11 +44,8 @@ def user_is_present(): def clear_progress(): """Clear any active progress message on the terminal.""" - global in_progress - if verbose > ERROR and stdout_is_tty and in_progress: - _stdout.write('\r%s\r' % (" " * len (_progress))) - _stdout.flush() - in_progress = False + if verbose > ERROR and stdout_is_tty: + terminal.print_clear() def progress(msg, warning=False, trailer='...'): """Display progress information. @@ -58,17 +53,14 @@ def progress(msg, warning=False, trailer='...'): Args: msg: Message to display. warning: True if this is a warning.""" - global in_progress clear_progress() if verbose > ERROR: _progress = msg + trailer if stdout_is_tty: col = _color.YELLOW if warning else _color.GREEN - _stdout.write('\r' + _color.build(col, _progress)) - _stdout.flush() - in_progress = True + terminal.tprint('\r' + _progress, newline=False, colour=col, col=_color) else: - _stdout.write(_progress + '\n') + terminal.tprint(_progress) def _output(level, msg, color=None): """Output a message to the terminal. @@ -81,12 +73,10 @@ def _output(level, msg, color=None): """ if verbose >= level: clear_progress() - if color: - msg = _color.build(color, msg) - if level < NOTICE: - print(msg, file=sys.stderr) + if level <= WARNING: + terminal.tprint(msg, colour=color, col=_color, stderr=True) else: - print(msg) + terminal.tprint(msg, colour=color, col=_color) if level == FATAL: sys.exit(1) -- 2.43.0
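A sketch of the resulting behaviour (not part of the patch; it assumes
the usual tout.init() entry point for setting the verbosity):

    from u_boot_pylib import tout

    tout.init(tout.INFO)
    tout.warning('deprecated option used')  # FATAL/ERROR/WARNING: stderr
    tout.info('build complete')             # less-severe levels: stdout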
From: Simon Glass <simon.glass@canonical.com> Allow users to specify multiple config adjustments in a single -a argument using commas. This is more convenient than repeating -a multiple times. Examples: buildman -a FOO,~BAR buildman -a FOO,~BAR -a BAZ=123 Add tests to verify comma-separated values work correctly. Co-developed-by: Claude <noreply@anthropic.com> Signed-off-by: Simon Glass <simon.glass@canonical.com> --- tools/buildman/buildman.rst | 13 +++++++++++++ tools/buildman/cfgutil.py | 19 ++++++++++++------- tools/buildman/cmdline.py | 3 ++- tools/buildman/test.py | 12 ++++++++++++ 4 files changed, 39 insertions(+), 8 deletions(-) diff --git a/tools/buildman/buildman.rst b/tools/buildman/buildman.rst index 487e9d67a4b..c0599757b0b 100644 --- a/tools/buildman/buildman.rst +++ b/tools/buildman/buildman.rst @@ -1307,6 +1307,19 @@ You can disable options by preceding them with tilde (~). You can specify the buildman -a CMD_SETEXPR_FMT -a ~CMDLINE +You can also use comma-separated values to specify multiple options in a single +argument: + +.. code-block:: bash + + buildman -a CMD_SETEXPR_FMT,~CMDLINE + +or mix both styles: + +.. code-block:: bash + + buildman -a CMD_SETEXPR_FMT,~CMDLINE -a BOOTSTD_FULL + Some options have values, in which case you can change them: .. code-block:: bash diff --git a/tools/buildman/cfgutil.py b/tools/buildman/cfgutil.py index a340e01cb6b..5bc97d33595 100644 --- a/tools/buildman/cfgutil.py +++ b/tools/buildman/cfgutil.py @@ -134,7 +134,7 @@ def convert_list_to_dict(adjust_cfg_list): Args: adjust_cfg_list (list of str): List of changes to make to .config file before building. Each is one of (where C is the config option with - or without the CONFIG_ prefix) + or without the CONFIG_ prefix). Items can be comma-separated. C to enable C ~C to disable C @@ -154,12 +154,17 @@ def convert_list_to_dict(adjust_cfg_list): ValueError: if an item in adjust_cfg_list has invalid syntax """ result = {} - for cfg in adjust_cfg_list or []: - m_cfg = RE_CFG.match(cfg) - if not m_cfg: - raise ValueError(f"Invalid CONFIG adjustment '{cfg}'") - negate, _, opt, val = m_cfg.groups() - result[opt] = f'%s{opt}%s' % (negate or '', val or '') + for item in adjust_cfg_list or []: + # Split by comma to support comma-separated values + for cfg in item.split(','): + cfg = cfg.strip() + if not cfg: + continue + m_cfg = RE_CFG.match(cfg) + if not m_cfg: + raise ValueError(f"Invalid CONFIG adjustment '{cfg}'") + negate, _, opt, val = m_cfg.groups() + result[opt] = f'%s{opt}%s' % (negate or '', val or '') return result diff --git a/tools/buildman/cmdline.py b/tools/buildman/cmdline.py index ad07e6cac39..b3c70daeca3 100644 --- a/tools/buildman/cmdline.py +++ b/tools/buildman/cmdline.py @@ -24,7 +24,8 @@ def add_upto_m(parser): """ # Available JqzZ parser.add_argument('-a', '--adjust-cfg', type=str, action='append', - help='Adjust the Kconfig settings in .config before building') + help='Adjust the Kconfig settings in .config before building. 
' + + 'Supports comma-separated values') parser.add_argument('-A', '--print-prefix', action='store_true', help='Print the tool-chain prefix for a board (CROSS_COMPILE=)') parser.add_argument('-b', '--branch', type=str, diff --git a/tools/buildman/test.py b/tools/buildman/test.py index a134ac4f917..81e708d9bd6 100644 --- a/tools/buildman/test.py +++ b/tools/buildman/test.py @@ -780,6 +780,18 @@ class TestBuild(unittest.TestCase): 'CONFIG_ANNA="anna"']) self.assertEqual(expect, actual) + # Test comma-separated values + actual = cfgutil.convert_list_to_dict( + ['FRED,~MARY,JOHN=0x123', 'ALICE="alice"', + 'CONFIG_AMY,~CONFIG_ABE', 'CONFIG_MARK=0x456,CONFIG_ANNA="anna"']) + self.assertEqual(expect, actual) + + # Test mixed comma-separated and individual values + actual = cfgutil.convert_list_to_dict( + ['FRED,~MARY', 'JOHN=0x123', 'ALICE="alice",CONFIG_AMY', + '~CONFIG_ABE,CONFIG_MARK=0x456', 'CONFIG_ANNA="anna"']) + self.assertEqual(expect, actual) + def test_check_cfg_file(self): """Test check_cfg_file detects conflicts as expected""" # Check failure to disable CONFIG -- 2.43.0
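For reference, a sketch of what the parser now produces (dict values
inferred from the tests above; the RE_CFG regex itself is unchanged):

    from buildman import cfgutil

    cfgutil.convert_list_to_dict(['FRED,~MARY', 'JOHN=0x123'])
    # -> {'FRED': 'FRED', 'MARY': '~MARY', 'JOHN': 'JOHN=0x123'}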
From: Simon Glass <simon.glass@canonical.com>

Add a way to do static preprocessor analysis using unifdef, to figure
out which code is actually used in the build.

I have modified the unifdef tool as follows:

- O(1) algorithm for symbol lookup, instead of O(n)
- support for IS_ENABLED(), CONFIG_IS_ENABLED()

The first patch was sent upstream. The others are U-Boot-specific, so I
have not submitted them. Please get in touch if you would like the
patches.

Co-developed-by: Claude <noreply@anthropic.com>
Signed-off-by: Simon Glass <simon.glass@canonical.com>
---
 tools/codman/unifdef.py | 429 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 429 insertions(+)
 create mode 100644 tools/codman/unifdef.py

diff --git a/tools/codman/unifdef.py b/tools/codman/unifdef.py
new file mode 100644
index 00000000000..560b323b460
--- /dev/null
+++ b/tools/codman/unifdef.py
@@ -0,0 +1,429 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright 2025 Canonical Ltd
+#
+"""Unifdef-based line-level analysis for source code.
+
+This module provides functionality to analyse which lines in source files
+are active vs inactive based on CONFIG_* settings, using the unifdef tool.
+"""
+
+import multiprocessing
+import os
+import re
+import shutil
+import subprocess
+import tempfile
+import time
+
+from buildman import kconfiglib
+from u_boot_pylib import tout
+from analyser import Analyser, FileResult
+
+
+def load_config(config_file, srcdir='.'):
+    """Load CONFIG_* symbols from a .config file and Kconfig.
+
+    Args:
+        config_file (str): Path to .config file
+        srcdir (str): Path to source directory (for Kconfig loading)
+
+    Returns:
+        tuple: (config_dict, error_message) where config_dict is a dictionary
+            mapping CONFIG_* symbol names to values, and error_message is
+            None on success or an error string on failure
+    """
+    config = {}
+
+    # First, load from .config file
+    with open(config_file, 'r', encoding='utf-8') as f:
+        for line in f:
+            line = line.strip()
+
+            # Skip comments and blank lines
+            if not line or line.startswith('#'):
+                # Check for "is not set" pattern
+                if ' is not set' in line:
+                    # Extract CONFIG name: '# CONFIG_FOO is not set'
+                    parts = line.split()
+                    if len(parts) >= 2 and parts[1].startswith('CONFIG_'):
+                        config_name = parts[1]
+                        config[config_name] = None
+                continue
+
+            # Parse CONFIG_* assignments
+            if '=' in line:
+                name, value = line.split('=', 1)
+                if name.startswith('CONFIG_'):
+                    config[name] = value
+
+    # Then, load all Kconfig symbols and set undefined ones to None
+    # Only do this if we have a Kconfig file (i.e., in a real U-Boot tree)
+    kconfig_path = os.path.join(srcdir, 'Kconfig')
+    if not os.path.exists(kconfig_path):
+        # No Kconfig - probably a test environment, just use .config values
+        return config, None
+
+    try:
+        # Set environment variables needed by kconfiglib
+        old_srctree = os.environ.get('srctree')
+        old_ubootversion = os.environ.get('UBOOTVERSION')
+        old_objdir = os.environ.get('KCONFIG_OBJDIR')
+
+        os.environ['srctree'] = srcdir
+        os.environ['UBOOTVERSION'] = 'dummy'
+        os.environ['KCONFIG_OBJDIR'] = ''
+
+        # Load Kconfig
+        kconf = kconfiglib.Kconfig(warn=False)
+
+        # Add all defined symbols that aren't already in config as None
+        # kconfiglib provides names without CONFIG_ prefix
+        for name in kconf.syms:
+            config_name = f'CONFIG_{name}'
+            if config_name not in config:
+                # Symbol is defined in Kconfig but not in .config
+                config[config_name] = None
+
+        # Restore environment
+        if old_srctree is not None:
+            os.environ['srctree'] = old_srctree
+        elif 'srctree' in os.environ:
+            del os.environ['srctree']
+        if old_ubootversion is not None:
+            os.environ['UBOOTVERSION'] = old_ubootversion
+        elif 'UBOOTVERSION' in os.environ:
+            del os.environ['UBOOTVERSION']
+        if old_objdir is not None:
+            os.environ['KCONFIG_OBJDIR'] = old_objdir
+        elif 'KCONFIG_OBJDIR' in os.environ:
+            del os.environ['KCONFIG_OBJDIR']
+
+        tout.progress(f'Loaded {len(kconf.syms)} Kconfig symbols')
+    except (OSError, IOError, ValueError, ImportError) as e:
+        # Return an error if kconfiglib fails - we need all symbols for
+        # accurate analysis
+        return None, f'Failed to load Kconfig symbols: {e}'
+
+    return config, None
+
+
+def match_lines(orig_lines, processed_output, source_file):
+    """Match original and processed lines to determine which are active.
+
+    Parses #line directives from unifdef -n output to determine exactly
+    which lines from the original source are active vs inactive.
+
+    Args:
+        orig_lines (list): List of original source lines
+        processed_output (str): Processed output from unifdef -n
+        source_file (str): Path to source file (for matching #line
+            directives)
+
+    Returns:
+        dict: Mapping of line numbers (1-indexed) to 'active'/'inactive'
+            status
+    """
+    total_lines = len(orig_lines)
+    line_status = {}
+
+    # Set up all lines as inactive
+    for i in range(1, total_lines + 1):
+        line_status[i] = 'inactive'
+
+    # Parse #line directives to find which lines are active
+    # Format: #line <number> "<file>"
+    # When we see a #line directive, all following non-directive lines
+    # come from that line number onward in the original file.
+    # If no #line directive appears at the start, output starts at line 1
+    current_line = 1  # Start at line 1 by default
+    line_pattern = re.compile(r'^#line (\d+) "(.+)"$')
+    source_basename = source_file.split('/')[-1]
+
+    for output_line in processed_output.splitlines():
+        # Check for #line directive
+        match = line_pattern.match(output_line)
+        if match:
+            line_num = int(match.group(1))
+            file_path = match.group(2)
+            # Only track lines from our source file (unifdef may include
+            # #line directives from headers)
+            if (file_path == source_file or
+                    file_path.endswith(source_basename)):
+                current_line = line_num
+            else:
+                # This is a #line for a different file (e.g. a header):
+                # stop tracking until we see our file again
+                current_line = None
+        elif current_line is not None:
+            # This is a real line from the source file
+            if current_line <= total_lines:
+                line_status[current_line] = 'active'
+            current_line += 1
+
+    return line_status
+
+
+def worker(args):
+    """Run unifdef on a source file to determine active/inactive lines.
+
+    Runs unifdef with -n to add #line directives, then parses those
+    directives to determine which original lines are active vs inactive.
+ + Args: + args (tuple): Tuple of (source_file, defs_file, unifdef_path, + track_lines) + + Returns: + Tuple of (source_file, total_lines, active_lines, inactive_lines, + line_status, error_msg) + line_status is a dict mapping line numbers to 'active'/'inactive', or + {} if not tracked + error_msg is None on success, or an error string on failure + """ + source_file, defs_file, unifdef_path, track_lines = args + + try: + with open(source_file, 'r', encoding='utf-8', errors='ignore') as f: + orig_lines = f.readlines() + + total_lines = len(orig_lines) + + # Run unifdef to process the file + # -n: add #line directives for tracking original line numbers + # -E: error on unterminated conditionals + # -f: use defs file + result = subprocess.run( + [unifdef_path, '-n', '-E', '-f', defs_file, source_file], + capture_output=True, + text=True, + encoding='utf-8', + errors='ignore', + check=False + ) + + if result.returncode > 1: + # Error running unifdef + # Check if it's an 'obfuscated' error - these are expected for + # complex macros + if 'Obfuscated' in result.stderr: + # Obfuscated error - unifdef still produces output, so + # continue processing (don't return early) + pass + else: + # Real error + error_msg = (f'unifdef failed on {source_file} with return ' + f'code {result.returncode}\nstderr: ' + f'{result.stderr}') + return (source_file, 0, 0, 0, {}, error_msg) + + # Parse unifdef output to determine which lines are active + if track_lines: + line_status = match_lines(orig_lines, result.stdout, source_file) + active_lines = len([s for s in line_status.values() + if s == 'active']) + else: + line_status = {} + # Count non-#line directive lines in output + active_lines = len([line for line in result.stdout.splitlines() + if not line.startswith('#line')]) + inactive_lines = total_lines - active_lines + + return (source_file, total_lines, active_lines, inactive_lines, + line_status, None) + except (OSError, IOError) as e: + # Failed to execute unifdef or read source file + error_msg = f'Failed to process {source_file}: {e}' + return (source_file, 0, 0, 0, {}, error_msg) + + +class UnifdefAnalyser(Analyser): + """Analyser that uses unifdef to determine active lines. + + This analyser handles the creation of a unifdef configuration file from + CONFIG_* symbols and provides methods to analyse source files. + + Attributes: + config (dict): Dictionary of CONFIG_* symbols and their values + unifdef_cfg (str): Path to temporary unifdef configuration file + """ + + def __init__(self, config_file, srcdir, used_sources, unifdef_path, + include_headers, keep_temps=False): + """Set up the analyser with config file path. + + Args: + config_file (str): Path to .config file + srcdir (str): Path to source root directory + used_sources (set): Set of source files that are compiled + unifdef_path (str): Path to unifdef executable + include_headers (bool): If True, include header files; otherwise + only .c and .S + keep_temps (bool): If True, keep temporary files for debugging + """ + super().__init__(srcdir, keep_temps) + self.config_file = config_file + self.used_sources = used_sources + self.unifdef_path = unifdef_path + self.include_headers = include_headers + self.unifdef_cfg = None + + def _create_unifdef_config(self, config): + """Create a temporary unifdef configuration file. + + Args: + config (dict): Dictionary mapping CONFIG_* names to values + + Creates a file with -D and -U directives for each CONFIG_* symbol + that can be passed to unifdef via -f flag. 
+ """ + # Create temporary file for unifdef directives + fd, self.unifdef_cfg = tempfile.mkstemp(prefix='unifdef_', + suffix='.cfg') + + with os.fdopen(fd, 'w') as f: + for name, value in sorted(config.items()): + if value is None or value == '' or value == 'n': + # Symbol is not set - undefine it + f.write(f'#undef {name}\n') + elif value is True or value == 'y': + # Boolean CONFIG - define it as 1 + f.write(f'#define {name} 1\n') + elif value == 'm': + # Module - treat as not set for U-Boot + f.write(f'#undef {name}\n') + elif (isinstance(value, str) and value.startswith('"') and + value.endswith('"')): + # String value with quotes - use as-is + f.write(f'#define {name} {value}\n') + else: + # Numeric or other value + try: + # Try to parse as integer + int_val = int(value, 0) + f.write(f'#define {name} {int_val}\n') + except (ValueError, TypeError): + # Not an integer - escape and quote it + escaped_value = (str(value).replace('\\', '\\\\') + .replace('"', '\\"')) + f.write(f'#define {name} "{escaped_value}"\n') + + def __del__(self): + """Clean up temporary unifdef config file""" + if self.unifdef_cfg and os.path.exists(self.unifdef_cfg): + # Keep the file if requested + if self.keep_temps: + tout.debug(f'Keeping unifdef config file: {self.unifdef_cfg}') + return + try: + os.unlink(self.unifdef_cfg) + except OSError: + pass + + def process(self, jobs=None): + """Perform line-level analysis on used source files. + + Args: + jobs (int): Number of parallel jobs (None = use all CPUs) + + Returns: + Dictionary mapping source files to analysis results, or None on + error + """ + # Validate config file exists + if not os.path.exists(self.config_file): + tout.error(f'Config file not found: {self.config_file}') + return None + + # Check if unifdef exists (check both absolute path and PATH) + if os.path.isabs(self.unifdef_path): + # Absolute path - check if it exists + if not os.path.exists(self.unifdef_path): + tout.fatal(f'unifdef not found at: {self.unifdef_path}') + else: + # Relative path or command name - check PATH + unifdef_full = shutil.which(self.unifdef_path) + if not unifdef_full: + tout.fatal(f'unifdef not found in PATH: {self.unifdef_path}') + self.unifdef_path = unifdef_full + + # Load configuration + tout.progress('Loading configuration...') + config, error = load_config(self.config_file, self.srcdir) + if error: + tout.fatal(error) + tout.progress(f'Loaded {len(config)} config symbols') + + # Create unifdef config file + self._create_unifdef_config(config) + + tout.progress('Analysing preprocessor conditionals...') + file_results = {} + + # Filter sources to only .c and .S files unless include_headers is set + used_sources = self.used_sources + if not self.include_headers: + filtered_sources = {s for s in used_sources + if s.endswith('.c') or s.endswith('.S')} + excluded_count = len(used_sources) - len(filtered_sources) + if excluded_count > 0: + tout.progress(f'Excluding {excluded_count} header files ' + + '(use -i to include them)') + used_sources = filtered_sources + + # Count lines in defs file + with open(self.unifdef_cfg, 'r', encoding='utf-8') as f: + defs_lines = len(f.readlines()) + + # Use multiprocessing for parallel unifdef execution + # Prepare arguments for parallel processing + source_list = sorted(used_sources) + worker_args = [(source_file, self.unifdef_cfg, self.unifdef_path, True) + for source_file in source_list] + + tout.progress(f'Running unifdef on {len(source_list)} files...') + start_time = time.time() + + # If jobs=1, run directly without multiprocessing 
for easier debugging + if jobs == 1: + results = [worker(args) for args in worker_args] + else: + with multiprocessing.Pool(processes=jobs) as pool: + results = list(pool.imap(worker, worker_args, chunksize=10)) + elapsed_time = time.time() - start_time + + # Convert results to file_results dict and calculate totals + # Check for errors first + total_source_lines = 0 + errors = [] + for (source_file, total_lines, active_lines, inactive_lines, + line_status, error_msg) in results: + if error_msg: + errors.append(error_msg) + else: + file_results[source_file] = FileResult( + total_lines=total_lines, + active_lines=active_lines, + inactive_lines=inactive_lines, + line_status=line_status + ) + total_source_lines += total_lines + + # Report any errors + if errors: + for error in errors: + tout.error(error) + tout.fatal(f'unifdef failed on {len(errors)} file(s)') + + kloc = total_source_lines // 1000 + tout.info(f'Analysed {len(file_results)} files ({kloc} kLOC, ' + + f'{defs_lines} defs) in {elapsed_time:.1f} seconds') + tout.info(f'Unifdef directives file: {self.unifdef_cfg}') + + # Clean up temporary unifdef config file (unless in debug mode) + if tout.verbose >= tout.DEBUG: + tout.debug(f'Keeping unifdef directives file: {self.unifdef_cfg}') + else: + try: + os.unlink(self.unifdef_cfg) + tout.debug(f'Cleaned up {self.unifdef_cfg}') + except OSError as e: + tout.debug(f'Failed to clean up {self.unifdef_cfg}: {e}') + + return file_results -- 2.43.0
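To make the #line matching concrete, here is a small worked example of
match_lines() (the unifdef -n output shown is hypothetical, but follows
the '#line <num> "<file>"' form the parser expects):

    from unifdef import match_lines

    orig = ['int x;\n', '#ifdef CONFIG_A\n', 'int a;\n', '#endif\n',
            'int y;\n']
    # Output when CONFIG_A is undefined: the #ifdef block is dropped and
    # a #line directive records where the output resumes
    processed = 'int x;\n#line 5 "foo.c"\nint y;\n'

    match_lines(orig, processed, 'foo.c')
    # -> {1: 'active', 2: 'inactive', 3: 'inactive', 4: 'inactive',
    #     5: 'active'}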
From: Simon Glass <simon.glass@canonical.com>

Add a way to analyse code usage using debug information from compiled
code. This reads the DWARF tables to determine which lines produced
code.

Co-developed-by: Claude <noreply@anthropic.com>
Signed-off-by: Simon Glass <simon.glass@canonical.com>
---
 tools/codman/dwarf.py | 200 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 200 insertions(+)
 create mode 100644 tools/codman/dwarf.py

diff --git a/tools/codman/dwarf.py b/tools/codman/dwarf.py
new file mode 100644
index 00000000000..adceac9d20a
--- /dev/null
+++ b/tools/codman/dwarf.py
@@ -0,0 +1,200 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright 2025 Canonical Ltd
+#
+"""DWARF debug info-based line-level analysis for source code.
+
+This module provides functionality to analyse which lines in source files
+were compiled by extracting line information from DWARF debug data in
+object files.
+"""
+
+import multiprocessing
+import os
+import subprocess
+from collections import defaultdict
+
+from u_boot_pylib import tout
+from analyser import Analyser, FileResult
+
+
+def worker(args):
+    """Extract line numbers from DWARF debug info in an object file.
+
+    Uses readelf --debug-dump=decodedline to get the line table, then
+    parses section headers and line entries to determine which source
+    lines were compiled into the object.
+
+    Args:
+        args (tuple): Tuple of (obj_path, build_dir, srcdir)
+
+    Returns:
+        tuple: (source_lines_dict, error_msg) where source_lines_dict is a
+            mapping of source file paths to sets of line numbers, and
+            error_msg is None on success or an error string on failure
+    """
+    obj_path, build_dir, srcdir = args
+    source_lines = defaultdict(set)
+
+    # Get the directory of the .o file relative to build_dir
+    rel_to_build = os.path.relpath(obj_path, build_dir)
+    obj_dir = os.path.dirname(rel_to_build)
+
+    # Use readelf to extract decoded line information
+    try:
+        result = subprocess.run(
+            ['readelf', '--debug-dump=decodedline', obj_path],
+            capture_output=True, text=True, check=False,
+            encoding='utf-8', errors='ignore')
+        if result.returncode != 0:
+            error_msg = (f'readelf failed on {obj_path} with return code '
+                         f'{result.returncode}\nstderr: {result.stderr}')
+            return (source_lines, error_msg)
+
+        # Parse the output
+        # Format is: Section header with full path, then data lines
+        current_file = None
+        for line in result.stdout.splitlines():
+            # Skip header lines and empty lines
+            if not line or line.startswith('Contents of') or \
+               line.startswith('File name') or line.strip() == '' or \
+               line.startswith(' '):
+                continue
+
+            # Look for section headers with full path
+            # (e.g. '/path/to/file.c:')
+            if line.endswith(':'):
+                header_path = line.rstrip(':')
+                # Try to resolve the path
+                if os.path.isabs(header_path):
+                    # Absolute path in DWARF
+                    abs_path = os.path.realpath(header_path)
+                else:
+                    # Relative path - try relative to srcdir and obj_dir
+                    abs_path = os.path.realpath(
+                        os.path.join(srcdir, obj_dir, header_path))
+                    if not os.path.exists(abs_path):
+                        abs_path = os.path.realpath(
+                            os.path.join(srcdir, header_path))
+
+                if os.path.exists(abs_path):
+                    current_file = abs_path
+                continue
+
+            # Parse data lines - use current_file from section header
+            if current_file:
+                parts = line.split()
+                if len(parts) >= 2:
+                    try:
+                        line_num = int(parts[1])
+                        # Skip special line numbers (like '-')
+                        if line_num > 0:
+                            source_lines[current_file].add(line_num)
+                    except (ValueError, IndexError):
+                        continue
+    except (OSError, subprocess.SubprocessError) as e:
+
error_msg = f'Failed to execute readelf on {obj_path}: {e}' + return (source_lines, error_msg) + + return (source_lines, None) + + +# pylint: disable=too-few-public-methods +class DwarfAnalyser(Analyser): + """Analyser that uses DWARF debug info to determine active lines. + + This analyser extracts line number information from DWARF debug data in + compiled object files to determine which source lines generated code. + """ + def __init__(self, build_dir, srcdir, used_sources, keep_temps=False): + """Initialise the DWARF analyser. + + Args: + build_dir (str): Build directory containing .o files + srcdir (str): Path to source root directory + used_sources (set): Set of source files that are compiled + keep_temps (bool): If True, keep temporary files for debugging + """ + super().__init__(srcdir, keep_temps) + self.build_dir = build_dir + self.used_sources = used_sources + + def extract_lines(self, jobs=None): + """Extract used line numbers from DWARF debug info in object files. + + Args: + jobs (int): Number of parallel jobs (None = use all CPUs) + + Returns: + dict: Mapping of source file paths to sets of line numbers that + generated code + """ + # Find all .o files + obj_files = self.find_object_files(self.build_dir) + + if not obj_files: + return defaultdict(set) + + # Prepare arguments for parallel processing + args_list = [(obj_path, self.build_dir, self.srcdir) + for obj_path in obj_files] + + # Process in parallel + num_jobs = jobs if jobs else multiprocessing.cpu_count() + with multiprocessing.Pool(num_jobs) as pool: + results = pool.map(worker, args_list) + + # Merge results from all workers and check for errors + source_lines = defaultdict(set) + errors = [] + for result_dict, error_msg in results: + if error_msg: + errors.append(error_msg) + else: + for source_file, lines in result_dict.items(): + source_lines[source_file].update(lines) + + # Report any errors + if errors: + for error in errors: + tout.error(error) + tout.fatal(f'readelf failed on {len(errors)} object file(s)') + + return source_lines + + def process(self, jobs=None): + """Perform line-level analysis using DWARF debug info. + + Args: + jobs (int): Number of parallel jobs (None = use all CPUs) + + Returns: + dict: Mapping of source file paths to FileResult named tuples + """ + tout.progress('Extracting DWARF line information...') + dwarf_line_map = self.extract_lines(jobs) + + file_results = {} + for source_file in self.used_sources: + abs_path = os.path.realpath(source_file) + used_lines = dwarf_line_map.get(abs_path, set()) + + # Count total lines in the file + total_lines = self.count_lines(abs_path) + + active_lines = len(used_lines) + inactive_lines = total_lines - active_lines + + # Create line status dict + line_status = {} + for i in range(1, total_lines + 1): + line_status[i] = 'active' if i in used_lines else 'inactive' + + file_results[abs_path] = FileResult( + total_lines=total_lines, + active_lines=active_lines, + inactive_lines=inactive_lines, + line_status=line_status + ) + + tout.info(f'Analysed {len(file_results)} files using DWARF debug info') + return file_results -- 2.43.0
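For reference, the worker() parser keys off section headers ending in ':'
and the second column of each data row. A decodedline dump has roughly
this shape (hypothetical paths; the exact columns vary with the binutils
version):

    Contents of the .debug_line section:

    /src/common/main.c:
    File name    Line number    Starting address    View    Stmt
    main.c                 3               0x0               x
    main.c                 5              0x14               x

Applying the parsing rules above records lines {3, 5} against
/src/common/main.c (provided that path exists on disk).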
From: Simon Glass <simon.glass@canonical.com>

It is possible to use an LSP to determine which code is used, at least
to some degree. Make a start on this, in the hope that future work may
prove out the concept.

So far I have not found this to be particularly useful, since it does
not seem to handle IS_ENABLED() and similar macros when working out
inactive regions.

Co-developed-by: Claude <noreply@anthropic.com>
Signed-off-by: Simon Glass <simon.glass@canonical.com>
---
 tools/codman/lsp.py        | 319 +++++++++++++++++++++++++++++++
 tools/codman/lsp_client.py | 225 ++++++++++++++++++++++
 tools/codman/test_lsp.py   | 153 ++++++++++++++++
 3 files changed, 697 insertions(+)
 create mode 100644 tools/codman/lsp.py
 create mode 100644 tools/codman/lsp_client.py
 create mode 100755 tools/codman/test_lsp.py

diff --git a/tools/codman/lsp.py b/tools/codman/lsp.py
new file mode 100644
index 00000000000..143fe22a7e1
--- /dev/null
+++ b/tools/codman/lsp.py
@@ -0,0 +1,319 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright 2025 Canonical Ltd
+#
+"""LSP-based line-level analysis for source code.
+
+This module provides functionality to analyse which lines in source files
+are active vs inactive based on preprocessor conditionals, using clangd's
+inactive regions feature via the Language Server Protocol (LSP).
+"""
+
+import concurrent.futures
+import json
+import multiprocessing
+import os
+import re
+import tempfile
+import time
+
+from u_boot_pylib import tools, tout
+from analyser import Analyser, FileResult
+from lsp_client import LspClient
+
+
+def create_compile_commands(build_dir, srcdir):
+    """Create compile_commands.json using gen_compile_commands.py.
+
+    Args:
+        build_dir (str): Build directory path
+        srcdir (str): Source directory path
+
+    Returns:
+        list: List of compile command entries
+    """
+    # Use the same pattern as gen_compile_commands.py
+    line_pattern = re.compile(
+        r'^(saved)?cmd_[^ ]*\.o := (?P<command_prefix>.* )'
+        r'(?P<file_path>[^ ]*\.[cS]) *(;|$)')
+
+    compile_commands = []
+
+    # Walk through build directory looking for .cmd files
+    filename_matcher = re.compile(r'^\..*\.cmd$')
+    exclude_dirs = ['.git', 'Documentation', 'include', 'tools']
+
+    for dirpath, dirnames, filenames in os.walk(build_dir, topdown=True):
+        # Prune unwanted directories (in place, so os.walk skips them)
+        dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
+
+        for filename in filenames:
+            if not filename_matcher.match(filename):
+                continue
+
+            cmd_file = os.path.join(dirpath, filename)
+            try:
+                with open(cmd_file, 'rt', encoding='utf-8') as f:
+                    result = line_pattern.match(f.readline())
+                    if result:
+                        command_prefix = result.group('command_prefix')
+                        file_path = result.group('file_path')
+
+                        # Clean up command prefix (handle escaped #)
+                        prefix = command_prefix.replace(r'\#', '#').replace(
+                            '$(pound)', '#')
+
+                        # Get absolute path to source file
+                        abs_path = os.path.realpath(
+                            os.path.join(srcdir, file_path))
+                        if os.path.exists(abs_path):
+                            compile_commands.append({
+                                'directory': srcdir,
+                                'file': abs_path,
+                                'command': prefix + file_path,
+                            })
+            except (OSError, IOError):
+                continue
+
+    return compile_commands
+
+
+def worker(args):
+    """Analyse a single source file using clangd LSP.
+ + Args: + args (tuple): Tuple of (source_file, client) + where client is a shared LspClient instance + + Returns: + tuple: (source_file, inactive_regions, error_msg) + """ + source_file, client = args + + try: + # Read file content + content = tools.read_file(source_file, binary=False) + + # Open the document + client.notify('textDocument/didOpen', { + 'textDocument': { + 'uri': f'file://{source_file}', + 'languageId': 'c', + 'version': 1, + 'text': content + } + }) + + # Wait for clangd to process and send notifications + # Poll for inactive regions notification for this specific file + max_wait = 10 # seconds + start_time = time.time() + inactive_regions = None + + while time.time() - start_time < max_wait: + time.sleep(0.1) + + with client.lock: + notifications = list(client.notifications) + # Clear processed notifications to avoid buildup + client.notifications = [] + + for notif in notifications: + method = notif.get('method', '') + if method == 'textDocument/clangd.inactiveRegions': + params = notif.get('params', {}) + uri = params.get('uri', '') + # Check if this notification is for our file + if uri == f'file://{source_file}': + inactive_regions = params.get('inactiveRegions', []) + break + + if inactive_regions is not None: + break + + # Close the document to free resources + client.notify('textDocument/didClose', { + 'textDocument': { + 'uri': f'file://{source_file}' + } + }) + + if inactive_regions is None: + # No inactive regions notification received + # This could mean the file has no inactive code + inactive_regions = [] + + return (source_file, inactive_regions, None) + + except Exception as e: + return (source_file, None, str(e)) + + +class LspAnalyser(Analyser): # pylint: disable=too-few-public-methods + """Analyser that uses clangd LSP to determine active lines. + + This analyser uses the Language Server Protocol (LSP) with clangd to + identify inactive preprocessor regions in source files. + """ + + def __init__(self, build_dir, srcdir, used_sources, keep_temps=False): + """Set up the LSP analyser. + + Args: + build_dir (str): Build directory containing .o and .cmd files + srcdir (str): Path to source root directory + used_sources (set): Set of source files that are compiled + keep_temps (bool): If True, keep temporary files for debugging + """ + super().__init__(srcdir, keep_temps) + self.build_dir = build_dir + self.used_sources = used_sources + + def extract_inactive_regions(self, jobs=None): + """Extract inactive regions from source files using clangd. 
+ + Args: + jobs (int): Number of parallel jobs (None = use all CPUs) + + Returns: + dict: Mapping of source file paths to lists of inactive regions + """ + # Create compile commands database + tout.progress('Building compile commands database...') + compile_commands = create_compile_commands(self.build_dir, self.srcdir) + + # Filter to only .c and .S files that we need to analyse + filtered_files = [] + for cmd in compile_commands: + source_file = cmd['file'] + if source_file in self.used_sources: + if source_file.endswith('.c') or source_file.endswith('.S'): + filtered_files.append(source_file) + + tout.progress(f'Found {len(filtered_files)} source files to analyse') + + if not filtered_files: + return {} + + inactive = {} + errors = [] + + # Create a single clangd instance and use it for all files + with tempfile.TemporaryDirectory() as tmpdir: + # Write compile commands database + compile_db = os.path.join(tmpdir, 'compile_commands.json') + with open(compile_db, 'w', encoding='utf-8') as f: + json.dump(compile_commands, f) + + # Start a single clangd server + tout.progress('Starting clangd server...') + with LspClient(['clangd', '--log=error', + f'--compile-commands-dir={tmpdir}']) as client: + result = client.init(f'file://{self.srcdir}') + if not result: + tout.error('Failed to start clangd') + return {} + + # Determine number of workers + if jobs is None: + jobs = min(multiprocessing.cpu_count(), len(filtered_files)) + elif jobs <= 0: + jobs = 1 + + tout.progress(f'Processing files with {jobs} workers...') + + # Use ThreadPoolExecutor to process files in parallel + # (threads share the same clangd client) + with concurrent.futures.ThreadPoolExecutor( + max_workers=jobs) as executor: + # Submit all tasks + future_to_file = { + executor.submit(worker, (source_file, client)): + source_file + for source_file in filtered_files + } + + # Collect results as they complete + completed = 0 + for future in concurrent.futures.as_completed(future_to_file): + source_file = future_to_file[future] + completed += 1 + tout.progress( + f'Processing {completed}/{len(filtered_files)}: ' + + f'{os.path.basename(source_file)}...') + + try: + source_file_result, inactive_regions, error_msg = ( + future.result()) + + if error_msg: + errors.append(f'{source_file}: {error_msg}') + elif inactive_regions is not None: + inactive[source_file_result] = ( + inactive_regions) + except Exception as exc: + errors.append(f'{source_file}: {exc}') + + # Report any errors + if errors: + for error in errors[:10]: # Show first 10 errors + tout.error(error) + if len(errors) > 10: + tout.error(f'... and {len(errors) - 10} more errors') + tout.warning(f'Failed to analyse {len(errors)} file(s) with LSP') + + return inactive + + def process(self, jobs=None): + """Perform line-level analysis using clangd LSP. 
+ + Args: + jobs (int): Number of parallel jobs (None = use all CPUs) + + Returns: + dict: Mapping of source file paths to FileResult named tuples + """ + tout.progress('Extracting inactive regions using clangd LSP...') + inactive_regions_map = self.extract_inactive_regions(jobs) + + file_results = {} + for source_file in self.used_sources: + # Only process .c and .S files + if not (source_file.endswith('.c') or source_file.endswith('.S')): + continue + + abs_path = os.path.realpath(source_file) + inactive_regions = inactive_regions_map.get(abs_path, []) + + # Count total lines in the file + total_lines = self.count_lines(abs_path) + + # Create line status dict + line_status = {} + # Set up all lines as active + for i in range(1, total_lines + 1): + line_status[i] = 'active' + + # Mark inactive lines based on regions + # LSP uses 0-indexed line numbers + for region in inactive_regions: + start_line = region['start']['line'] + 1 + end_line = region['end']['line'] + 1 + # Mark lines as inactive (inclusive range) + for line_num in range(start_line, end_line + 1): + if line_num <= total_lines: + line_status[line_num] = 'inactive' + + inactive_lines = len([s for s in line_status.values() + if s == 'inactive']) + active_lines = total_lines - inactive_lines + + file_results[abs_path] = FileResult( + total_lines=total_lines, + active_lines=active_lines, + inactive_lines=inactive_lines, + line_status=line_status + ) + + tout.info(f'Analysed {len(file_results)} files using clangd LSP') + return file_results diff --git a/tools/codman/lsp_client.py b/tools/codman/lsp_client.py new file mode 100644 index 00000000000..954879a651e --- /dev/null +++ b/tools/codman/lsp_client.py @@ -0,0 +1,225 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Copyright 2025 Canonical Ltd +# +"""Minimal LSP (Language Server Protocol) client for clangd. + +This module provides a simple JSON-RPC 2.0 client for communicating with +LSP servers like clangd. It focuses on the specific functionality needed +for analyzing inactive preprocessor regions. +""" + +import json +import subprocess +import threading +from typing import Any, Dict, Optional + + +class LspClient: + """Minimal LSP client for JSON-RPC 2.0 communication. + + This client handles the basic LSP protocol communication over + stdin/stdout with a language server process. + + Attributes: + process: The language server subprocess + next_id: Counter for JSON-RPC request IDs + responses: Dict mapping request IDs to response data + lock: Thread lock for response dictionary + reader_thread: Background thread reading server responses + """ + + def __init__(self, server_command): + """Init the LSP client and start the server. 
+ + Args: + server_command (list): Command to start the LSP server + (e.g., ['clangd', '--log=error']) + """ + self.process = subprocess.Popen( + server_command, + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + text=True, + bufsize=0 + ) + self.next_id = 1 + self.responses = {} + self.notifications = [] + self.lock = threading.Lock() + self.running = True + + # Start background thread to read responses + self.reader_thread = threading.Thread(target=self._read_responses) + self.reader_thread.daemon = True + self.reader_thread.start() + + def _read_responses(self): + """Background thread to read responses from the server""" + while self.running and self.process.poll() is None: + try: + # Read headers + headers = {} + while True: + line = self.process.stdout.readline() + if not line or line == '\r\n' or line == '\n': + break + if ':' in line: + key, value = line.split(':', 1) + headers[key.strip()] = value.strip() + + if 'Content-Length' not in headers: + continue + + # Read content + content_length = int(headers['Content-Length']) + content = self.process.stdout.read(content_length) + + if not content: + break + + # Parse JSON + message = json.loads(content) + + # Store response or notification + with self.lock: + if 'id' in message: + # Response to a request + self.responses[message['id']] = message + else: + # Notification from server + self.notifications.append(message) + + except (json.JSONDecodeError, ValueError): + continue + except Exception: + break + + def _send_message(self, message: Dict[str, Any]): + """Send a JSON-RPC message to the server. + + Args: + message: JSON-RPC message dictionary + """ + content = json.dumps(message) + headers = f'Content-Length: {len(content)}\r\n\r\n' + self.process.stdin.write(headers + content) + self.process.stdin.flush() + + def request(self, method: str, params: Optional[Dict] = None, + timeout: int = 30) -> Optional[Dict]: + """Send a JSON-RPC request and wait for response. + + Args: + method: LSP method name (e.g., 'initialize') + params: Method parameters dictionary + timeout: Timeout in seconds (default: 30) + + Returns: + Response dictionary, or None on timeout/error + """ + request_id = self.next_id + self.next_id += 1 + + message = { + 'jsonrpc': '2.0', + 'id': request_id, + 'method': method, + } + if params: + message['params'] = params + + self._send_message(message) + + # Wait for response + import time + start_time = time.time() + while time.time() - start_time < timeout: + with self.lock: + if request_id in self.responses: + response = self.responses.pop(request_id) + if 'result' in response: + return response['result'] + if 'error' in response: + raise RuntimeError( + f"LSP error: {response['error']}") + return response + time.sleep(0.01) + + return None + + def notify(self, method: str, params: Optional[Dict] = None): + """Send a JSON-RPC notification (no response expected). + + Args: + method: LSP method name + params: Method parameters dictionary + """ + message = { + 'jsonrpc': '2.0', + 'method': method, + } + if params: + message['params'] = params + + self._send_message(message) + + def init(self, root_uri: str, capabilities: Optional[Dict] = None) -> Dict: + """Send initialize request to the server. 
+ + Args: + root_uri: Workspace root URI (e.g., 'file:///path/to/workspace') + capabilities: Client capabilities dict + + Returns: + Server capabilities from initialize response + """ + if capabilities is None: + capabilities = { + 'textDocument': { + 'semanticTokens': { + 'requests': { + 'full': True + } + }, + 'publishDiagnostics': {}, + 'inactiveRegions': { + 'refreshSupport': False + } + } + } + + result = self.request('initialize', { + 'processId': None, + 'rootUri': root_uri, + 'capabilities': capabilities + }) + + # Send initialized notification + self.notify('initialized', {}) + + return result + + def shutdown(self): + """Shutdown the language server""" + self.request('shutdown') + self.notify('exit') + self.running = False + if self.process: + self.process.wait(timeout=5) + # Close file descriptors to avoid ResourceWarnings + if self.process.stdin: + self.process.stdin.close() + if self.process.stdout: + self.process.stdout.close() + if self.process.stderr: + self.process.stderr.close() + + def __enter__(self): + """Context manager entry""" + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + """Context manager exit - ensure cleanup""" + self.shutdown() diff --git a/tools/codman/test_lsp.py b/tools/codman/test_lsp.py new file mode 100755 index 00000000000..1070ce655fb --- /dev/null +++ b/tools/codman/test_lsp.py @@ -0,0 +1,153 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0+ +# +# Copyright 2025 Canonical Ltd +# +"""Test script for LSP client with clangd""" + +import json +import os +import sys +import tempfile +import time + +# Add parent directory to path +sys.path.insert(0, os.path.dirname(os.path.abspath(__file__))) + +from lsp_client import LspClient # pylint: disable=wrong-import-position + + +def test_clangd(): + """Test basic clangd functionality""" + # Create a temporary directory with a simple C file + with tempfile.TemporaryDirectory() as tmpdir: + # Create a C file with CONFIG-style inactive code + test_file = os.path.join(tmpdir, 'test.c') + with open(test_file, 'w', encoding='utf-8') as f: + f.write('''#include <stdio.h> + +// Simulate U-Boot style CONFIG options +#define CONFIG_FEATURE_A 1 + +void always_compiled(void) +{ + printf("Always here\\n"); +} + +#ifdef CONFIG_FEATURE_A +void feature_a_code(void) +{ + printf("Feature A enabled\\n"); +} +#endif + +#ifdef CONFIG_FEATURE_B +void feature_b_code(void) +{ + printf("Feature B enabled (THIS SHOULD BE INACTIVE)\\n"); +} +#endif + +#if 0 +void disabled_debug_code(void) +{ + printf("Debug code (INACTIVE)\\n"); +} +#endif +''') + + # Create compile_commands.json + compile_commands = [ + { + 'directory': tmpdir, + 'command': f'gcc -c {test_file}', + 'file': test_file + } + ] + compile_db = os.path.join(tmpdir, 'compile_commands.json') + with open(compile_db, 'w', encoding='utf-8') as f: + json.dump(compile_commands, f) + + # Create .clangd config to enable inactive regions + clangd_config = os.path.join(tmpdir, '.clangd') + with open(clangd_config, 'w', encoding='utf-8') as f: + f.write('''InactiveRegions: + Opacity: 0.55 +''') + + print(f'Created test file: {test_file}') + print(f'Created compile DB: {compile_db}') + print(f'Created clangd config: {clangd_config}') + + # Start clangd + print('\\nStarting clangd...') + with LspClient(['clangd', '--log=error', + f'--compile-commands-dir={tmpdir}']) as client: + print('Initialising...') + result = client.init(f'file://{tmpdir}') + print(f'Server capabilities: {result.get("capabilities", {}).keys()}') + + # Open the document + 
print(f'\\nOpening document: {test_file}') + with open(test_file, 'r', encoding='utf-8') as f: + content = f.read() + + client.notify('textDocument/didOpen', { + 'textDocument': { + 'uri': f'file://{test_file}', + 'languageId': 'c', + 'version': 1, + 'text': content + } + }) + + # Wait for clangd to index the file + print('\\nWaiting for clangd to index file...') + time.sleep(3) + + # Check for inactive regions notification + print('\\nChecking for inactive regions notification...') + with client.lock: + notifications = list(client.notifications) + + print(f'Received {len(notifications)} notifications:') + inactive_regions = None + for notif in notifications: + method = notif.get('method', 'unknown') + print(f' - {method}') + + # Look for the clangd inactive regions extension + if method == 'textDocument/clangd.inactiveRegions': + params = notif.get('params', {}) + inactive_regions = params.get('inactiveRegions', []) + print(f' Found {len(inactive_regions)} inactive regions!') + + if inactive_regions: + print('\\nInactive regions:') + for region in inactive_regions: + start = region['start'] + end = region['end'] + start_line = start['line'] + 1 # LSP is 0-indexed + end_line = end['line'] + 1 + print(f' Lines {start_line}-{end_line}') + else: + print('\\nNo inactive regions received (feature may not be enabled)') + + # Also show the file with line numbers for reference + print('\\nFile contents:') + for i, line in enumerate(content.split('\\n'), 1): + print(f'{i:3}: {line}') + + print('\\nTest completed!') + + # Check clangd stderr for any errors + print('\\n=== Clangd stderr output ===') + stderr_output = client.process.stderr.read() + if stderr_output: + print(stderr_output[:1000]) + else: + print('(no stderr output)') + + +if __name__ == '__main__': + test_clangd() -- 2.43.0
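For reference, the notification the worker polls for looks roughly like
this (a sketch: the field names are those used in the code above, and
line/character positions are zero-based per the LSP spec):

    {
        'method': 'textDocument/clangd.inactiveRegions',
        'params': {
            'uri': 'file:///src/drivers/video/display.c',
            'inactiveRegions': [
                # Marks source lines 18-21 (1-based) as inactive
                {'start': {'line': 17, 'character': 0},
                 'end': {'line': 20, 'character': 6}},
            ],
        },
    }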
From: Simon Glass <simon.glass@canonical.com> Add some rudimentary tests of the codman functionality. Signed-off-by: Simon Glass <simon.glass@canonical.com> --- tools/codman/test_codman.py | 470 ++++++++++++++++++++++++++++++++++++ 1 file changed, 470 insertions(+) create mode 100755 tools/codman/test_codman.py diff --git a/tools/codman/test_codman.py b/tools/codman/test_codman.py new file mode 100755 index 00000000000..ed387c82472 --- /dev/null +++ b/tools/codman/test_codman.py @@ -0,0 +1,470 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0+ +# +# Copyright 2025 Canonical Ltd +# +"""Very basic tests for codman.py script""" + +import os +import shutil +import subprocess +import sys +import tempfile +import unittest + +# Test configuration +SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) + +# Import the module to test +sys.path.insert(0, SCRIPT_DIR) +sys.path.insert(0, os.path.join(SCRIPT_DIR, '..')) +# pylint: disable=wrong-import-position +from u_boot_pylib import terminal, tools +import output # pylint: disable=wrong-import-position +import codman # pylint: disable=wrong-import-position + + +class TestSourceUsage(unittest.TestCase): + """Test cases for codman.py""" + + def setUp(self): + """Set up test environment with fake source tree and build""" + self.test_dir = tempfile.mkdtemp(prefix='test_source_usage_') + self.src_dir = os.path.join(self.test_dir, 'src') + self.build_dir = os.path.join(self.test_dir, 'build') + os.makedirs(self.src_dir) + os.makedirs(self.build_dir) + + # Create fake source files + self._create_fake_sources() + + # Create fake Makefile + self._create_makefile() + + # Create fake .config + self._create_config() + + def tearDown(self): + """Clean up test environment""" + if os.path.exists(self.test_dir): + shutil.rmtree(self.test_dir) + + def _create_fake_sources(self): + """Create a fake source tree with various files""" + # Create directory structure + dirs = [ + 'common', + 'drivers/video', + 'drivers/serial', + 'lib', + 'arch/sandbox', + ] + for dir_path in dirs: + os.makedirs(os.path.join(self.src_dir, dir_path), exist_ok=True) + + # Create source files + # common/main.c - will be compiled + self._write_file('common/main.c', '''#include <common.h> + +void board_init(void) +{ +#ifdef CONFIG_FEATURE_A + feature_a_init(); +#endif +#ifdef CONFIG_FEATURE_B + feature_b_init(); +#endif + common_init(); +} +''') + + # common/unused.c - will NOT be compiled + self._write_file('common/unused.c', '''#include <common.h> + +void unused_function(void) +{ + /* This file is never compiled */ +} +''') + + # drivers/video/display.c - will be compiled + self._write_file('drivers/video/display.c', '''#include <video.h> + +#ifdef CONFIG_VIDEO_LOGO +static void show_logo(void) +{ + /* Show boot logo */ +} +#endif + +void display_init(void) +{ +#ifdef CONFIG_VIDEO_LOGO + show_logo(); +#endif + /* Init display */ +} +''') + + # drivers/serial/serial.c - will be compiled + self._write_file('drivers/serial/serial.c', '''#include <serial.h> + +void serial_init(void) +{ + /* Init serial port */ +} +''') + + # lib/string.c - will be compiled + self._write_file('lib/string.c', '''#include <linux/string.h> + +int strlen(const char *s) +{ + int len = 0; + while (*s++) + len++; + return len; +} +''') + + # arch/sandbox/cpu.c - will be compiled + self._write_file('arch/sandbox/cpu.c', '''#include <common.h> + +void cpu_init(void) +{ + /* Sandbox CPU init */ +} +''') + + # Create header files + self._write_file('include/common.h', '''#ifndef __COMMON_H +#define __COMMON_H 
+void board_init(void); +#endif +''') + + self._write_file('include/video.h', '''#ifndef __VIDEO_H +#define __VIDEO_H +void display_init(void); +#endif +''') + + self._write_file('include/serial.h', '''#ifndef __SERIAL_H +#define __SERIAL_H +void serial_init(void); +#endif +''') + + self._write_file('include/linux/string.h', '''#ifndef __LINUX_STRING_H +#define __LINUX_STRING_H +int strlen(const char *s); +#endif +''') + + def _create_makefile(self): + """Create a simple Makefile that generates .cmd files""" + makefile = f'''# Simple test Makefile +SRCDIR := {self.src_dir} +O ?= . +BUILD_DIR = $(O) + +# Compiler flags +CFLAGS := -Iinclude +ifeq ($(DEBUG),1) +CFLAGS += -g +endif + +# Source files to compile +OBJS = $(BUILD_DIR)/common/main.o \\ + $(BUILD_DIR)/drivers/video/display.o \\ + $(BUILD_DIR)/drivers/serial/serial.o \\ + $(BUILD_DIR)/lib/string.o \\ + $(BUILD_DIR)/arch/sandbox/cpu.o + +all: $(OBJS) +\t@echo "Build complete" + +# Rule to compile .c files +$(BUILD_DIR)/%.o: %.c +\t@mkdir -p $(dir $@) +\t@echo " CC $<" +\t@gcc $(CFLAGS) -c -o $@ $(SRCDIR)/$< +\t@echo "cmd_$@ := gcc $(CFLAGS) -c -o $@ $<" > $(dir $@).$(notdir $@).cmd +\t@echo "source_$@ := $(SRCDIR)/$<" >> $(dir $@).$(notdir $@).cmd +\t@echo "deps_$@ := \\\\" >> $(dir $@).$(notdir $@).cmd +\t@echo " $(SRCDIR)/$< \\\\" >> $(dir $@).$(notdir $@).cmd +\t@echo "" >> $(dir $@).$(notdir $@).cmd + +clean: +\t@rm -rf $(BUILD_DIR) + +.PHONY: all clean +''' + self._write_file('Makefile', makefile) + + def _create_config(self): + """Create a fake .config file""" + config = '''CONFIG_FEATURE_A=y +# CONFIG_FEATURE_B is not set +CONFIG_VIDEO_LOGO=y +''' + self._write_file(os.path.join(self.build_dir, '.config'), config) + + def _write_file(self, rel_path, content): + """Write a file relative to src_dir""" + if rel_path.startswith('/'): + # Absolute path for build dir files + file_path = rel_path + else: + file_path = os.path.join(self.src_dir, rel_path) + os.makedirs(os.path.dirname(file_path), exist_ok=True) + tools.write_file(file_path, content.encode('utf-8')) + + def _build(self, debug=False): + """Run the test build. 
+
+        Args:
+            debug (bool): If True, build with debug symbols (DEBUG=1)
+        """
+        cmd = ['make', '-C', self.src_dir, f'O={self.build_dir}']
+        if debug:
+            cmd.append('DEBUG=1')
+        result = subprocess.run(cmd, capture_output=True, text=True,
+                                check=False)
+        if result.returncode != 0:
+            print(f'Build failed: {result.stderr}')
+            print(f'Build stdout: {result.stdout}')
+            self.fail('Test build failed')
+
+    def test_basic_file_stats(self):
+        """Test basic file-level statistics"""
+        self._build()
+
+        # Call select_sources() directly
+        _all_srcs, used, skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Verify counts - we have 5 compiled .c files
+        self.assertEqual(len(used), 5,
+                         f'Expected 5 used files, got {len(used)}')
+
+        # Should have 1 unused .c file (common/unused.c)
+        unused_c_files = [f for f in skipped if f.endswith('.c')]
+        self.assertEqual(
+            len(unused_c_files), 1,
+            f'Expected 1 unused .c file, got {len(unused_c_files)}')
+
+        # Check that specific files are in used set
+        used_basenames = {os.path.basename(f) for f in used}
+        self.assertIn('main.c', used_basenames)
+        self.assertIn('display.c', used_basenames)
+        self.assertIn('serial.c', used_basenames)
+        self.assertIn('string.c', used_basenames)
+        self.assertIn('cpu.c', used_basenames)
+
+        # Check that unused.c is not in used set
+        self.assertNotIn('unused.c', used_basenames)
+
+    def test_list_unused(self):
+        """Test listing unused files"""
+        self._build()
+
+        _all_srcs, _used, skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Check that unused.c is in skipped set
+        skipped_basenames = {os.path.basename(f) for f in skipped}
+        self.assertIn('unused.c', skipped_basenames)
+
+        # Check that used files are not in skipped set
+        self.assertNotIn('main.c', skipped_basenames)
+        self.assertNotIn('display.c', skipped_basenames)
+
+    def test_by_dir(self):
+        """Test directory breakdown by collecting stats"""
+        self._build()
+
+        all_srcs, used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Collect directory stats
+        dir_stats = output.collect_dir_stats(
+            all_srcs, used, None, self.src_dir, False, False)
+
+        # Should have stats for top-level directories
+        self.assertIn('common', dir_stats)
+        self.assertIn('drivers', dir_stats)
+        self.assertIn('lib', dir_stats)
+        self.assertIn('arch', dir_stats)
+
+        # Check common directory has 2 files (main.c and unused.c)
+        self.assertEqual(dir_stats['common'].total, 2)
+        # Only 1 is used (main.c)
+        self.assertEqual(dir_stats['common'].used, 1)
+
+    def test_subdirs(self):
+        """Test subdirectory breakdown"""
+        self._build()
+
+        all_srcs, used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Collect subdirectory stats (by_subdirs=True)
+        dir_stats = output.collect_dir_stats(
+            all_srcs, used, None, self.src_dir, True, False)
+
+        # Should have stats for subdirectories
+        self.assertIn('drivers/video', dir_stats)
+        self.assertIn('drivers/serial', dir_stats)
+        self.assertIn('arch/sandbox', dir_stats)
+
+    def test_filter(self):
+        """Test filtering by pattern"""
+        self._build()
+
+        # Apply video filter
+        all_srcs, _used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, '*video*')
+
+        # Should only have video-related files
+        all_basenames = {os.path.basename(f) for f in all_srcs}
+        self.assertIn('display.c', all_basenames)
+        self.assertIn('video.h', all_basenames)
+
+        # Should not have non-video files
+        self.assertNotIn('main.c', all_basenames)
+        self.assertNotIn('serial.c', all_basenames)
+
+    def test_no_build_required(self):
+        """Test that analysis works with existing build"""
+        self._build()
+
+        # Should work without building
+        all_srcs, used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Verify we got results
+        self.assertGreater(len(all_srcs), 0)
+        self.assertGreater(len(used), 0)
+
+    def test_do_analysis_unifdef(self):
+        """Test do_analysis() with unifdef"""
+        self._build()
+
+        _all_srcs, used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Run unifdef analysis
+        unifdef_path = shutil.which('unifdef') or '/usr/bin/unifdef'
+        results = codman.do_analysis(used, self.build_dir, self.src_dir,
+                                     unifdef_path, include_headers=False,
+                                     jobs=1, use_lsp=False)
+
+        # Should get results
+        self.assertIsNotNone(results)
+        self.assertGreater(len(results), 0)
+
+        # Check that results have the expected structure
+        for _file_path, result in results.items():
+            self.assertGreater(result.total_lines, 0)
+            self.assertGreaterEqual(result.active_lines, 0)
+            self.assertGreaterEqual(result.inactive_lines, 0)
+            self.assertEqual(result.total_lines,
+                             result.active_lines + result.inactive_lines)
+
+    def test_do_analysis_dwarf(self):
+        """Test do_analysis() with DWARF"""
+        # Build with debug symbols
+        self._build(debug=True)
+
+        _all_srcs, used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Run DWARF analysis (unifdef_path=None)
+        results = codman.do_analysis(used, self.build_dir, self.src_dir,
+                                     unifdef_path=None, include_headers=False,
+                                     jobs=1, use_lsp=False)
+
+        # Should get results
+        self.assertIsNotNone(results)
+        self.assertGreater(len(results), 0)
+
+        # Check that results have the expected structure
+        for _file_path, result in results.items():
+            self.assertGreater(result.total_lines, 0)
+            self.assertGreaterEqual(result.active_lines, 0)
+            self.assertGreaterEqual(result.inactive_lines, 0)
+            self.assertEqual(result.total_lines,
+                             result.active_lines + result.inactive_lines)
+
+    def test_do_analysis_unifdef_missing_config(self):
+        """Test do_analysis() with unifdef when config file is missing"""
+        self._build()
+
+        _all_srcs, used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Remove .config file
+        config_file = os.path.join(self.build_dir, '.config')
+        if os.path.exists(config_file):
+            os.remove(config_file)
+
+        # Capture terminal output
+        with terminal.capture() as (_stdout, stderr):
+            # Run unifdef analysis - should return None
+            unifdef_path = shutil.which('unifdef') or '/usr/bin/unifdef'
+            results = codman.do_analysis(used, self.build_dir, self.src_dir,
+                                         unifdef_path,
+                                         include_headers=False, jobs=1,
+                                         use_lsp=False)
+
+        # Should return None when config is missing
+        self.assertIsNone(results)
+
+        # Check that error message was printed to stderr
+        error_text = stderr.getvalue()
+        self.assertIn('Config file not found', error_text)
+        self.assertIn('.config', error_text)
+
+    def test_do_analysis_lsp(self):
+        """Test do_analysis() with LSP (clangd)"""
+        # Disabled for now
+        self.skipTest('LSP test disabled')
+        # Check if clangd is available
+        if not shutil.which('clangd'):
+            self.skipTest('clangd not found - skipping LSP test')
+
+        # Build with compile commands
+        self._build()
+
+        _all_srcs, used, _skipped = codman.select_sources(
+            self.src_dir, self.build_dir, None)
+
+        # Run LSP analysis (unifdef_path=None, use_lsp=True)
+        results = codman.do_analysis(used, self.build_dir, self.src_dir,
+                                     unifdef_path=None, include_headers=False,
+                                     jobs=1, use_lsp=True)
+
+        # Should get results
+        self.assertIsNotNone(results)
+        self.assertGreater(len(results), 0)
+
+        # Check that results have the expected structure
+        for _file_path, result in results.items():
+            self.assertGreater(result.total_lines, 0)
+            self.assertGreaterEqual(result.active_lines, 0)
+            self.assertGreaterEqual(result.inactive_lines, 0)
+            self.assertEqual(result.total_lines,
+                             result.active_lines + result.inactive_lines)
+
+        # Check specific file results
+        main_file = os.path.join(self.src_dir, 'common/main.c')
+        if main_file in results:
+            result = results[main_file]
+            # main.c has some conditional code, so should have some lines
+            self.assertGreater(result.total_lines, 0)
+            # Should have identified some active lines
+            self.assertGreater(result.active_lines, 0)
+
+
+if __name__ == '__main__':
+    unittest.main(argv=['test_codman.py'], verbosity=2)
--
2.43.0
From: Simon Glass <simon.glass@canonical.com>

Provide a description of the purpose of codman and some examples of how
to use it.

Signed-off-by: Simon Glass <simon.glass@canonical.com>
---
 doc/develop/codman.rst  |   1 +
 doc/develop/index.rst   |   1 +
 tools/codman/codman.rst | 426 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 428 insertions(+)
 create mode 120000 doc/develop/codman.rst
 create mode 100644 tools/codman/codman.rst

diff --git a/doc/develop/codman.rst b/doc/develop/codman.rst
new file mode 120000
index 00000000000..a4f5c03d72d
--- /dev/null
+++ b/doc/develop/codman.rst
@@ -0,0 +1 @@
+../../tools/codman/codman.rst
\ No newline at end of file
diff --git a/doc/develop/index.rst b/doc/develop/index.rst
index 1a8e0168c67..d325ad23897 100644
--- a/doc/develop/index.rst
+++ b/doc/develop/index.rst
@@ -101,6 +101,7 @@ Refactoring

    checkpatch
    coccinelle
+   codman
    qconfig

 Code quality
diff --git a/tools/codman/codman.rst b/tools/codman/codman.rst
new file mode 100644
index 00000000000..d58bceb2101
--- /dev/null
+++ b/tools/codman/codman.rst
@@ -0,0 +1,426 @@
+.. SPDX-License-Identifier: GPL-2.0+
+
+===================
+Codman code manager
+===================
+
+The codman tool analyses U-Boot builds to determine which source files and
+lines of code are actually compiled and used.
+
+U-Boot is a massive project with thousands of files and nearly endless
+configuration possibilities. A single board configuration might only compile a
+small fraction of the total source tree. Codman can help answer questions
+like:
+
+* "I just enabled ``CONFIG_CMD_NET``; how much code did that actually add?"
+* "How much code would I remove by disabling ``CONFIG_CMDLINE``?"
+
+Simply searching for ``CONFIG_`` macros or header inclusions is tricky because
+the build logic takes many forms: Makefile rules, #ifdefs, IS_ENABLED(),
+CONFIG_IS_ENABLED() and static inlines. The end result is board-specific in
+any case.
+
+Codman cuts through this complexity by analysing the actual build artifacts
+generated by the compiler. It:
+
+#. Builds the specified board
+#. Parses the ``.cmd`` files to find which source files were compiled
+#. Analyses the source code (with unifdef) or the object files (DWARF tables)
+   to figure out which files and lines were compiled
+
+Usage
+=====
+
+Basic usage, from within the U-Boot source tree::
+
+    ./tools/codman/codman.py -b <board> [flags] <command> [command-flags]
+
+Codman does out-of-tree builds, meaning that the object files end up in a
+separate directory for each board. Use ``--build-base`` to set the base
+directory for these builds. The default is ``/tmp/b``, meaning that a sandbox
+build would end up in ``/tmp/b/sandbox``, for example.
+
+Relationship to LSPs
+====================
+
+Language servers (LSPs) can show unused code in your IDE, which is very handy
+for interactive use. Codman is more about getting a broader picture, although
+it does allow individual files to be listed. Codman does include a ``--lsp``
+option but this doesn't work particularly well.
+
+Commands
+========
+
+The basic functionality is accessed via these commands:
+
+* ``stats`` - Show statistics (default if no command is given)
+* ``dirs`` - Show directory breakdown
+* ``unused`` - List unused files
+* ``used`` - List used files
+* ``summary`` - Show per-file summary
+* ``detail <file>...`` - Show line-by-line analysis of one or more files
+* ``copy-used <dir>`` - Copy used source files to a directory
+
+Running codman with no command is the same as running ``stats``: it builds the
+board and shows statistics about source-file usage, as in the example below.
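+
+For example, this minimal invocation builds sandbox (the default board) and
+shows the default statistics; any board name known to buildman could be used
+instead::
+
+    ./tools/codman/codman.py -b sandbox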
+
+Adjusting Configuration (-a)
+============================
+
+Sometimes you want to explore "what if" scenarios without manually editing
+``defconfig`` files or running menuconfig. The ``-a`` (or ``--adjust``) option
+allows you to modify the Kconfig configuration on the fly before the analysis
+build runs.
+
+This is particularly useful for **impact analysis**: seeing exactly how much
+code a specific feature adds to the build.
+
+Syntax
+------
+
+The ``CONFIG_`` prefix is optional.
+
+* ``-a CONFIG_OPTION``: Enable a boolean option (sets it to 'y').
+* ``-a ~CONFIG_OPTION``: Disable an option.
+* ``-a OPTION=val``: Set an option (``CONFIG_OPTION``) to a specific value.
+* ``-a CONFIG_A,CONFIG_B``: Set multiple options (comma-separated).
+
+Examples
+--------
+
+**Check the impact of USB:**
+
+Enable the USB commands on the sandbox board and see how the code stats
+change::
+
+    codman -b sandbox -a CMD_USB stats
+
+**Disable networking:**
+
+See what code remains active when networking is explicitly disabled::
+
+    codman -b sandbox -a ~NET,NO_NET stats
+
+**Multiple adjustments:**
+
+Enable USB and USB storage together::
+
+    codman -b sandbox -a CONFIG_CMD_USB -a CONFIG_USB_STORAGE stats
+
+Common Options
+==============
+
+Building:
+
+* ``-b, --board <board>`` - Board to build and analyse (default: sandbox;
+  uses buildman)
+* ``-B, --build-dir <dir>`` - Use existing build directory instead of building
+* ``--build-base <dir>`` - Base directory for builds (default: /tmp/b)
+* ``-n, --no-build`` - Skip building, use existing build directory
+* ``-a, --adjust <config>`` - Adjust CONFIG options (see section above)
+
+Line-level analysis:
+
+* ``-w, --dwarf`` - Use DWARF debug info (most accurate, requires rebuild)
+* ``-i, --include-headers`` - Include header files in unifdef analysis
+
+Filtering:
+
+* ``-f, --filter <pattern>`` - Filter files by wildcard pattern (e.g.
+  ``*acpi*``)
+
+Output control:
+
+* ``-v, --verbose`` - Show verbose output
+* ``-D, --debug`` - Enable debug mode
+* ``--top <N>`` - (for the ``stats`` command) Show top N files with most
+  inactive code (default: 20)
+
+The ``dirs`` command has a few extra options:
+
+* ``-s, --subdirs`` - Show a breakdown by subdirectory
+* ``-f, --show-files`` - Show individual files within directories (with ``-s``)
+* ``-e, --show-empty`` - Show directories with 0 lines used
+
+Other:
+
+* ``-j, --jobs <N>`` - Number of parallel jobs for line analysis
+
+How to use commands
+===================
+
+The following sections show the different ways to use codman. Commands are
+specified as positional arguments after the global options, as in the example
+below.
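+
+For example, in this illustrative invocation the global ``-b`` and ``-j``
+flags come before the command name, while the ``dirs``-specific ``--subdirs``
+flag comes after it::
+
+    codman -b sandbox -j 8 dirs --subdirs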
+
+Basic Statistics (``stats``)
+----------------------------
+
+Show overall statistics for a build::
+
+    $ codman -b qemu-x86 stats
+    ======================================================================
+    FILE-LEVEL STATISTICS
+    ======================================================================
+    Total source files:    14114
+    Used source files:     1046 (7.4%)
+    Unused source files:   13083 (92.7%)
+
+    Total lines of code:   3646331
+    Used lines of code:    192543 (5.3%)
+    Unused lines of code:  3453788 (94.7%)
+    ======================================================================
+
+    ======================================================================
+    LINE-LEVEL STATISTICS (within compiled files)
+    ======================================================================
+    Files analysed:            504
+    Total lines in used files: 209915
+    Active lines:              192543 (91.7%)
+    Inactive lines:            17372 (8.3%)
+    ======================================================================
+
+    TOP 20 FILES WITH MOST INACTIVE CODE:
+    ----------------------------------------------------------------------
+      2621 inactive lines (56.6%) - drivers/mtd/spi/spi-nor-core.c
+       669 inactive lines (46.7%) - cmd/mem.c
+       594 inactive lines (45.8%) - cmd/nvedit.c
+       579 inactive lines (89.5%) - drivers/mtd/spi/spi-nor-ids.c
+       488 inactive lines (27.4%) - net/net.c
+    ...
+
+Directory Breakdown (``dirs``)
+------------------------------
+
+See which top-level directories contribute code::
+
+    codman dirs
+
+Output shows a breakdown by directory::
+
+    BREAKDOWN BY TOP-LEVEL DIRECTORY
+    =================================================================================
+    Directory                     Files   Used  %Used  %Code    kLOC    Used
+    ---------------------------------------------------------------------------------
+    arch                            234    156     67     72    12.3     8.9
+    board                           123     45     37     25     5.6     1.4
+    cmd                              89     67     75     81     3.4     2.8
+    common                          156    134     86     88     8.9     7.8
+    ...
+
+For a detailed subdirectory breakdown::
+
+    codman dirs --subdirs
+
+With ``--show-files``, also show individual files within each directory::
+
+    codman dirs --subdirs --show-files
+
+You can also specify a file filter::
+
+    codman -b qemu-x86 -f "*acpi*" dirs -sf
+    =======================================================================================
+    BREAKDOWN BY TOP-LEVEL DIRECTORY
+    =======================================================================================
+    Directory                     Files   Used  %Used  %Code    kLOC    Used
+    ---------------------------------------------------------------------------------------
+    arch/x86/include/asm              5      2     40     36     0.6     0.2
+    arch/x86/lib                      5      1     20      6     1.2     0.1
+       acpi.c                        65     65  100.0      0
+    cmd                               1      1    100    100     0.2     0.2
+       acpi.c                       216    215   99.5      1
+    drivers/qfw                       1      1    100     93     0.3     0.3
+       qfw_acpi.c                   332    309   93.1     23
+    include/acpi                      5      4     80     91     3.3     3.0
+    include/dm                        1      1    100    100     0.4     0.4
+    include/power                     1      1    100    100     0.2     0.2
+    lib/acpi                         13      3     23     14     3.9     0.5
+       acpi_writer.c                131     63   48.1     68
+       acpi_extra.c                 181    177   97.8      4
+       acpi.c                       304    304  100.0      0
+    lib/efi_loader                    1      1    100    100     0.1     0.1
+       efi_acpi.c                    75     75  100.0      0
+    ---------------------------------------------------------------------------------------
+    TOTAL                            78     15     19      7    17.5     1.2
+    =======================================================================================
+
+Detail View (``detail``)
+------------------------
+
+See exactly which lines are active/inactive in a specific file::
+
+    $ codman -b qemu-x86 detail common/main.c
+    ======================================================================
+    DETAIL FOR: common/main.c
+    ======================================================================
+    Total lines: 115
+    Active lines: 93 (80.9%)
+    Inactive lines: 22 (19.1%)
+
+        1 | // SPDX-License-Identifier: GPL-2.0+
+        2 | /*
+        3 |  * (C) Copyright 2000
+        4 |  * Wolfgang Denk, DENX Software Engineering, wd@denx.de.
+        5 |  */
+    ...
+       23 |
+       24 | static void run_preboot_environment_command(void)
+       25 | {
+       26 |         char *p;
+       27 |
+       28 |         p = env_get("preboot");
+       29 |         if (p != NULL) {
+       30 |                 int prev = 0;
+       31 |
+    -  32 |                 if (IS_ENABLED(CONFIG_AUTOBOOT_KEYED))
+    -  33 |                         prev = disable_ctrlc(1); /* disable Ctrl-C checking */
+       34 |
+       35 |                 run_command_list(p, -1, 0);
+       36 |
+    -  37 |                 if (IS_ENABLED(CONFIG_AUTOBOOT_KEYED))
+    -  38 |                         disable_ctrlc(prev); /* restore Ctrl-C checking */
+       39 |         }
+       40 | }
+       41 |
+
+Lines with a ``-`` marker are not included in the build.
+
+Unused Files (``unused``)
+-------------------------
+
+Find all source files that weren't compiled::
+
+    $ codman -b qemu-x86 unused | head -15
+    Finding all source files......
+    Found 1043 used source files...
+    Loading configuration......
+    Loaded 8913 Kconfig symbols...
+    Loaded 8913 config symbols...
+    Analysing preprocessor conditionals......
+    Excluding 539 header files (use -i to include them)...
+    Running unifdef on 504 files......
+    Unused source files (13083):
+    arch/arc/cpu/arcv1/ivt.S
+    arch/arc/cpu/arcv2/ivt.S
+    arch/arc/include/asm/arc-bcr.h
+
+Used Files (``used``)
+---------------------
+
+List all source files that were included in a build::
+
+    $ codman -b qemu-x86 used | head -15
+    Finding all source files......
+    Found 1046 used source files...
+    Loading configuration......
+    Loaded 8913 Kconfig symbols...
+    Loaded 8913 config symbols...
+    Analysing preprocessor conditionals......
+    Excluding 542 header files (use -i to include them)...
+    Running unifdef on 504 files......
+    Used source files (1046):
+    arch/x86/cpu/call32.S
+    arch/x86/cpu/cpu.c
+    ...
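+
+The listing is plain text, one path per line, so it is easy to post-process
+with standard tools. For example, this illustrative pipeline counts the used
+files with ``drivers/`` in their path (the ``grep`` also happens to drop the
+progress lines, which never contain that string)::
+
+    codman -b qemu-x86 used | grep drivers/ | wc -l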
+
+Per-File Summary (``summary``)
+------------------------------
+
+Shows detailed per-file statistics (requires ``-w`` or ``-l``)::
+
+    $ codman -b qemu-x86 summary
+    ==========================================================================================
+    PER-FILE SUMMARY
+    ==========================================================================================
+    File                                    Total   Active  Inactive  %Active
+    ------------------------------------------------------------------------------------------
+    arch/x86/cpu/call32.S                      61       61         0   100.0%
+    arch/x86/cpu/cpu.c                        399      353        46    88.5%
+    arch/x86/cpu/cpu_x86.c                     99       99         0   100.0%
+    arch/x86/cpu/i386/call64.S                 92       92         0   100.0%
+    arch/x86/cpu/i386/cpu.c                   649      630        19    97.1%
+    arch/x86/cpu/i386/interrupt.c             630      622         8    98.7%
+    arch/x86/cpu/i386/setjmp.S                 65       65         0   100.0%
+    arch/x86/cpu/intel_common/cpu.c           325      325         0   100.0%
+    ...
+
+Copy Used Files (``copy-used``)
+-------------------------------
+
+Extract only the source files used in a build::
+
+    codman copy-used /tmp/sandbox-sources
+
+This creates a directory tree with only the compiled files, useful for
+creating minimal source distributions.
+
+Analysis Methods
+================
+
+The tool supports several analysis methods with different trade-offs.
+
+Firstly, files are detected by looking for .cmd files in the build. This
+requires a build to be present. Given the complexity of the Makefile rules,
+this seems like a reasonable trade-off. These directories are excluded:
+
+* tools/
+* test/
+* scripts/
+* doc/
+
+unifdef
+-------
+
+For discovering used/unused code, the unifdef mechanism produces reasonable
+results. This simulates the C preprocessor, using the ``unifdef`` tool to
+determine which lines are active based on CONFIG_* settings.
+
+**Note:** This requires a patched version of unifdef that supports U-Boot's
+``IS_ENABLED()`` and ``CONFIG_IS_ENABLED()`` macros, which are commonly used
+throughout the codebase. The patches also support faster operation, reducing
+run time by about 100x on the U-Boot code base.
+
+The tool:
+
+1. Reads .config to extract all CONFIG_* symbol definitions
+2. Generates a unifdef configuration file with -D/-U directives
+3. Runs ``unifdef -k -E`` on each source file to process conditionals, with
+   ``-E`` enabling the IS_ENABLED() support
+4. Compares original vs. processed output using line-number information
+5. Marks lines removed by unifdef as inactive
+
+This method uses multiprocessing for parallel analysis of source files, so it
+runs faster if you have plenty of CPU cores (e.g. 3s on a 22-thread
+Intel Ultra 7).
+
+The preprocessor-level view is quite helpful. It is also possible to include
+.h files in the analysis, using the ``-i`` flag.
+
+Since unifdef does fairly simplistic parsing, it can be fooled and show
+incorrect results.
+
+DWARF (``-w/--dwarf``)
+----------------------
+
+The DWARF analyser uses debug information embedded in compiled object files to
+determine exactly which source lines generated machine code. This is arguably
+more accurate than unifdef, but it won't count comments, declarations and
+various other features that don't actually generate code.
+
+The DWARF analyser:
+
+1. Rebuilds with ``CC_OPTIMIZE_FOR_DEBUG`` to prevent aggressive inlining
+2. For each .o file, runs ``readelf --debug-dump=decodedline`` to get line
+   information
+3. Parses the DWARF line-number table to map source lines to code addresses
+4. Aggregates results across all object files
+5. Marks any source line that doesn't appear in the line table as inactive
+
+As with unifdef, this uses multiprocessing for parallel analysis of object
+files and achieves similar performance.
+
+See Also
+========
+
+* :doc:`../build/buildman` - Tool for building multiple boards
+* :doc:`qconfig`
+* :doc:`checkpatch` - Code-style checking tool
--
2.43.0