@phil-blain
Last active May 11, 2024 11:17
Debugging Git for Git developers on Linux and macOS

Debugging Git

Some tips about debugging Git with GDB and LLDB.

Compiling Git for debugging

By default, Git's Makefile compiles Git with debug symbols (-g), but also with optimization level -O2, which can lead to some variables being optimized out and thus make the executable harder to debug.

To compile with -O0, you can tweak CFLAGS using config.mak:

$ cat config.mak
DEVELOPER=1
CFLAGS+= -O0
# DEVOPTS=no-error # don't turn warnings into errors (useful when compiling with older toolchains)

Including preprocessor macros

There are some preprocessor macros in Git's source code. To also generate debug symbols for these macros, you can use:

# GCC
CFLAGS += -g3
# Clang
CFLAGS += -fdebug-macro
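
If the same config.mak is used with different compilers, the choice between the two flags can be scripted. The following is only a sketch: the helper name macro_debug_flag is hypothetical, and it assumes that any compiler whose --version banner does not mention clang is GCC.

```shell
# Hypothetical helper: map a compiler's "--version" banner to the
# macro debug-info flag listed above.
# Assumption: anything that does not identify itself as clang is GCC.
macro_debug_flag () {
	case "$1" in
	*clang*) echo "-fdebug-macro" ;;
	*)       echo "-g3" ;;
	esac
}

# Print the line to add to config.mak for the compiler in $CC (default: cc).
printf 'CFLAGS += %s\n' "$(macro_debug_flag "$(${CC:-cc} --version 2>/dev/null)")"
```

The printed line can then be pasted into (or appended to) config.mak.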

Debugging functions defined via macros

Some functions are defined via macros, like those in the commit-slab API. To be able to step inside these functions in a debugger, we must tweak the build system so that the preprocessor and compiler run in separate steps, such that the debug info references the preprocessed source files. This can be done with the patch below, which applies to Git 2.44.0.

With this tweak applied, you can:

  1. Build the whole codebase with make -j
  2. Find the function(s) you want to debug in the appropriate preprocessed source file (*.i)
  3. Format the preprocessed source code to add back the line breaks removed by the preprocessor
  4. Rebuild with make -j, which will recompile the corresponding object file from the preprocessed source
  5. Run the code in a debugger, and single-step through the preprocessed source!

Launching Git in a debugger

The CodingGuidelines document has some guidance on using the system-installed GDB to debug your Git executable.

If you compiled GDB yourself you will need to invoke it like this:

GIT_DEBUGGER="/path/to/gdb --args" ./bin-wrappers/git <command> <args...>

If you are on macOS and want to use LLDB, that would be:

GIT_DEBUGGER="lldb --" ./bin-wrappers/git <command> <args...>
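
To avoid retyping these invocations, they can be wrapped in a small shell function. This is a sketch, not part of Git: the names debugger_for and gitdbg are hypothetical, and it assumes you want lldb on macOS and gdb everywhere else, both found on your PATH.

```shell
# Hypothetical helper: pick a GIT_DEBUGGER value from the kernel name
# reported by "uname -s". Assumption: lldb on macOS, gdb elsewhere.
debugger_for () {
	case "$1" in
	Darwin) echo "lldb --" ;;
	*)      echo "gdb --args" ;;
	esac
}

# Hypothetical wrapper: run the freshly built Git under that debugger.
# Must be called from the top of the Git source tree.
gitdbg () {
	GIT_DEBUGGER="$(debugger_for "$(uname -s)")" ./bin-wrappers/git "$@"
}
```

With this in your shell startup file, gitdbg status would start the debugger on ./bin-wrappers/git status.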

Debugging from a test script

If you're working on a test script and want to automatically launch your debugger at a certain git invocation in the test, you can use the debug function of the test library. Note however that since the Git test harness (t/test-lib.sh) sets TERM to "dumb", your debugger interface won't have colors. To use your original TERM, you can use debug -t, since 2b2af95908 (Merge branch 'pb/test-use-user-env', 2021-09-15). If you are debugging an earlier commit, it can be easily worked around by adjusting the value of TERM before the call to debug, i.e.

TERM=xterm-256color debug git <args>

If you want to manually inspect the state of the TRASH_DIRECTORY at some point in a test script, you can add a call to the test_pause function of the test library at the appropriate point. This will open a shell in $TRASH_DIRECTORY at that point in the test. The test will continue upon exiting that shell. Again, since TERM is set to "dumb", this shell will not output colors, and since the test harness also sets HOME to the trash directory, you won't have any shell customization specified in your shell startup files, and your Git aliases won't work. To use your original TERM and HOME, you can use test_pause -t -h, or test_pause -a, which additionally invokes your interactive shell (SHELL) instead of /bin/sh. These options also work since 2b2af95908 (Merge branch 'pb/test-use-user-env', 2021-09-15). When checking out an earlier commit, this can be worked around by exec'ing a new shell and adjusting TERM and HOME:

sh-3.2$ HOME=/Users/Philippe/ TERM=xterm-256color exec bash --login
/Users/Philippe/Code/git/t/trash directory.t5572-pull-submodule
Philippe@<host> 02.11.2020, 13:50:15  (master #%)
$ 

Debugging fork and exec

Sometimes Git commands spawn other Git commands under the hood, using fork+exec. You will recognize this by a call to either run_command or more rarely directly to start_command.

Using a single GDB instance (Linux)

If the code you are trying to debug is in an instance of git spawned (fork+exec) by the main git process, you can use the set detach-on-fork off option in GDB, which works only on Linux (not on macOS or Windows). I had success doing that on Kubuntu 17.10 using a self-compiled GDB 8.3.1 (but not with the system-installed GDB 7.1.2) as well as under Kubuntu 20.04 using the system-installed GDB 9.2.

Have a look at the "Debugging Multiple Inferiors Connections and Programs" and "Debugging Forks" sections of the GDB documentation for more information.
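
As a rough sketch of how this can be set up, the settings can be kept in a GDB command file and loaded with gdb -x. The file name fork-debug.gdb is arbitrary, and the inferior number used in the comments is an assumption (check the output of info inferiors in your own session).

```shell
# Write a GDB command file (the file name is arbitrary).
cat >fork-debug.gdb <<'EOF'
# Keep both parent and child under GDB's control after a fork.
set detach-on-fork off
# Keep debugging the parent after the fork (this is GDB's default);
# the child is held suspended until you switch to it.
set follow-fork-mode parent
EOF

# Then, from the directory containing fork-debug.gdb and the Git build:
#   GIT_DEBUGGER="gdb -x fork-debug.gdb --args" ./bin-wrappers/git <command> <args...>
# and, once the fork has happened:
#   (gdb) info inferiors   # list the processes GDB is tracking
#   (gdb) inferior 2       # switch to the child (number is an assumption)
```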

Using several debugger instances (Linux/macOS)

Another option is to put a call to sleep(3) just after the fork in start_command, and put a breakpoint at the call to run_command.

(gdb) b submodule.c:1958 # (example of a call to run_command)
(gdb) r # run up to the breakpoint
(gdb) n # step

At this point the debugger stops and does not return the prompt. You can then open another debugger instance and attach to the new process, e.g. with

gdb -p $(pgrep -n git)
# or 
lldb -p $(pgrep -n git) 

The debugger will stop the program deep in system libraries (in sleep). To get out of there, use the fin command repeatedly until you are in start_command, then add a breakpoint at main and continue:

(gdb) b main
(gdb) c

GDB will follow the exec and stop at main. Note that this does not seem to work on macOS with GDB, but the same process works in LLDB. You can then debug the child process normally.

Notes:

  • With both GDB and LLDB, it's important to step (n) and not continue (c) after the breakpoint at the call to run_command is reached, as if you continue, the parent process will die as soon as you step over execve in the child, and the child process will stop and become un-debuggable.
  • In GDB, you can use info proc cwd to show the current working directory of the active inferior (works on Linux, FreeBSD and NetBSD).

Enums

Work is under way to slowly convert preprocessor macros to enums in Git's source code. During debugging, it is useful to be able to convert integer variables to enums and vice-versa.

In GDB, to convert an enum constant to the corresponding integer, you can use any of the following syntaxes:

(gdb) p (int) RECURSE_SUBMODULES_OFF
0
(gdb) p /d RECURSE_SUBMODULES_OFF
0
(gdb) p RECURSE_SUBMODULES_OFF+0
0

In LLDB, at least as of version 10.0.0, the debugger only understands enum constants if the enum itself has an identifier:

/* A named enum */
enum diff_submodule_format {
	DIFF_SUBMODULE_SHORT = 0,
	DIFF_SUBMODULE_LOG,
	DIFF_SUBMODULE_INLINE_DIFF
};
/* An anonymous enum */
enum {
	RECURSE_SUBMODULES_ONLY = -5,
	RECURSE_SUBMODULES_CHECK = -4,
	RECURSE_SUBMODULES_ERROR = -3,
	RECURSE_SUBMODULES_NONE = -2,
	RECURSE_SUBMODULES_ON_DEMAND = -1,
	RECURSE_SUBMODULES_OFF = 0,
	RECURSE_SUBMODULES_DEFAULT = 1,
	RECURSE_SUBMODULES_ON = 2
};

For constants of named enum types, you can do:

(lldb) p (int) diff_submodule_format::DIFF_SUBMODULE_SHORT
(int) $0 = 0
(lldb) p diff_submodule_format::DIFF_SUBMODULE_SHORT+0
(int) $1 = 0

For both GDB and LLDB, you can also check in the other direction, i.e. cast the int to the enum type, if the enum is a named enum.

In GDB:

(gdb) p (enum diff_submodule_format) options->submodule_format
$1 = DIFF_SUBMODULE_LOG

In LLDB:

(lldb) p (diff_submodule_format) options->submodule_format
(diff_submodule_format) $1 = DIFF_SUBMODULE_LOG
(lldb) p (enum diff_submodule_format) options->submodule_format
(diff_submodule_format) $2 = DIFF_SUBMODULE_LOG

Toolchain bugs

The macOS debug linker, dsymutil, has a long-standing bug: debug symbols for anonymous enums are not linked into the .dSYM bundle. This should not matter when compiling and linking in separate steps (as Git's Makefile does), as in this case dsymutil is not run; the debug information stays in the object files and debuggers read it from there.

However, when compiling with GCC versions from 4.9.3 up to, but not including, 9.1, dsymutil is erroneously always run, which should not be the case if only object files appear on the command line (and link-time optimization is not used). This is due to gcc-mirror/gcc@79abf19fb6 (and its backport to the 4.9 series, gcc-mirror/gcc@3e18abd274), which aimed to fix GCC not running dsymutil when it should indeed run it. This was fixed years after it was noticed, in gcc-mirror/gcc@4098a6d436, which went into GCC 9.1. For more history on GCC's use of dsymutil:

LLDB gotchas

It seems that printing the name of a cache entry (p ce->name) does not work (at least with my 10.11 system's LLDB, lldb-360.1.70). You need to cast the name to the right type: p (const char*)(ce->name). If you want to make that easier, you can set up an LLDB alias:

echo "command regex pchar 's/(.+)/expression -- (char*)(%1)/'" >> ~/.lldbinit

and then use pchar ce->name.


Addendum: there are also some interesting guidelines on the old wiki that might still be useful for debugging tests: https://git.wiki.kernel.org/index.php/What_to_do_when_a_test_fails

diff --git a/Makefile b/Makefile
index 78e874099d..687e81dea9 100644
--- a/Makefile
+++ b/Makefile
@@ -2418,11 +2418,11 @@ git$X: git.o GIT-LDFLAGS $(BUILTIN_OBJS) $(GITLIBS)
$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) \
$(filter %.o,$^) $(LIBS)
-help.sp help.s help.o: command-list.h
-builtin/bugreport.sp builtin/bugreport.s builtin/bugreport.o: hook-list.h
+help.sp help.s help.i help.o: command-list.h
+builtin/bugreport.sp builtin/bugreport.s builtin/bugreport.i builtin/bugreport.o: hook-list.h
-builtin/help.sp builtin/help.s builtin/help.o: config-list.h GIT-PREFIX
-builtin/help.sp builtin/help.s builtin/help.o: EXTRA_CPPFLAGS = \
+builtin/help.sp builtin/help.s builtin/help.i builtin/help.o: config-list.h GIT-PREFIX
+builtin/help.sp builtin/help.s builtin/help.i builtin/help.o: EXTRA_CPPFLAGS = \
'-DGIT_HTML_PATH="$(htmldir_relative_SQ)"' \
'-DGIT_MAN_PATH="$(mandir_relative_SQ)"' \
'-DGIT_INFO_PATH="$(infodir_relative_SQ)"'
@@ -2430,11 +2430,11 @@ builtin/help.sp builtin/help.s builtin/help.o: EXTRA_CPPFLAGS = \
PAGER_ENV_SQ = $(subst ','\'',$(PAGER_ENV))
PAGER_ENV_CQ = "$(subst ",\",$(subst \,\\,$(PAGER_ENV)))"
PAGER_ENV_CQ_SQ = $(subst ','\'',$(PAGER_ENV_CQ))
-pager.sp pager.s pager.o: EXTRA_CPPFLAGS = \
+pager.sp pager.s pager.i pager.o: EXTRA_CPPFLAGS = \
-DPAGER_ENV='$(PAGER_ENV_CQ_SQ)'
-version.sp version.s version.o: GIT-VERSION-FILE GIT-USER-AGENT
-version.sp version.s version.o: EXTRA_CPPFLAGS = \
+version.sp version.s version.i version.o: GIT-VERSION-FILE GIT-USER-AGENT
+version.sp version.s version.i version.o: EXTRA_CPPFLAGS = \
'-DGIT_VERSION="$(GIT_VERSION)"' \
'-DGIT_USER_AGENT=$(GIT_USER_AGENT_CQ_SQ)' \
'-DGIT_BUILT_FROM_COMMIT="$(shell \
@@ -2700,6 +2700,8 @@ endif
.PHONY: objects
objects: $(OBJECTS)
+PREPROCESSED := $(patsubst %.o,%.i,$(OBJECTS))
+
dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS))))
@@ -2731,8 +2733,11 @@ missing_compdb_dir =
compdb_args =
endif
-$(OBJECTS): %.o: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
- $(QUIET_CC)$(CC) -o $*.o -c $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
+$(PREPROCESSED): %.i: %.c GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
+ $(QUIET_CPP)$(CC) -o $@ -E -P $(dep_args) $(compdb_args) $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
+
+$(OBJECTS): %.o: %.i GIT-CFLAGS $(missing_dep_dirs) $(missing_compdb_dir)
+ $(QUIET_CC)$(CC) -o $@ -c $(ALL_CFLAGS) $<
%.s: %.c GIT-CFLAGS FORCE
$(QUIET_CC)$(CC) -o $@ -S $(ALL_CFLAGS) $(EXTRA_CPPFLAGS) $<
@@ -3684,6 +3689,7 @@ clean: profile-clean coverage-clean cocciclean
$(RM) po/git.pot po/git-core.pot
$(RM) git.res
$(RM) $(OBJECTS)
+ $(RM) $(PREPROCESSED)
$(RM) headless-git.o
$(RM) $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(REFTABLE_TEST_LIB)
$(RM) $(ALL_PROGRAMS) $(SCRIPT_LIB) $(BUILT_INS) $(OTHER_PROGRAMS)
diff --git a/shared.mak b/shared.mak
index 29bebd30d8..f7673848ab 100644
--- a/shared.mak
+++ b/shared.mak
@@ -57,6 +57,7 @@ ifndef V
## Used in "Makefile"
QUIET_CC = @echo ' ' CC $@;
+ QUIET_CPP = @echo ' ' CPP $@;
QUIET_AR = @echo ' ' AR $@;
QUIET_LINK = @echo ' ' LINK $@;
QUIET_BUILT_IN = @echo ' ' BUILTIN $@;

HomyeeKing commented Jul 22, 2023

For VS Code users:

// tasks.json
{
  "tasks": [
    {
      "type": "cppbuild",
      "label": "build with gmake",
      "command": "gmake",
      "args": [ 
        "CFLAGS=-g" 
      ],
      "options": {
        "cwd": "${workspaceFolder}"
      },
      "problemMatcher": [
        "$gcc"
      ],
      "group": {
        "kind": "build",
        "isDefault": true
      },
      "detail": "gmake build task"
    }
  ],
  "version": "2.0.0"
}
// launch.json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "debug git command",
      "type": "cppdbg",
      "request": "launch",
      "program": "${workspaceRoot}/git",
      "args": [
        "status", // command you want to debug
        "-h"
      ],
      "stopAtEntry": false,
      "cwd": "${fileDirname}",
      "targetArchitecture": "arm64", // arch of yours
      "preLaunchTask": "build with gmake",
      "externalConsole": false,
      "MIMode": "lldb"
    }
  ]
}
