Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang modules] Setting DebugCompilationDir when it is safe to ignore current working directory #128446

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

qiongsiwu
Copy link
Contributor

@qiongsiwu qiongsiwu commented Feb 24, 2025

This PR explicitly sets DebugCompilationDir to the system's root directory if it is safe to ignore the current working directory.

This fixes a problem where a PCM file's embedded debug information can lead to compilation failure. The compiler may have decided it is indeed safe to ignore the current working directory. In this case, the PCM file's content is functionally correct regardless of the current working directory because no inputs use relative paths (see #124786). However, a PCM may contain debug info. If debug info is requested, the compiler uses the current working directory value to set DW_AT_comp_dir. This may lead to the following situation:

  1. Two different compilations need the same PCM file.
  2. The PCM file is compiled assuming a working directory, which is embedded in the debug info, but otherwise has no effect.
  3. The second compilation assumes a different working directory, and expects an identically-sized pcm file. However, it cannot find such a PCM, because the existing PCM file has been compiled assuming a different DW_AT_comp_dir , which is embedded in the debug info.

This PR resets the DebugCompilationDir if it is functionally safe to ignore the working directory so the above situation is avoided, since all debug information will share the same working directory.

rdar://145249881

@qiongsiwu qiongsiwu marked this pull request as draft February 24, 2025 00:55
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Feb 24, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 24, 2025

@llvm/pr-subscribers-clang

Author: Qiongsi Wu (qiongsiwu)

Changes

This PR explicitly sets DebugCompilationDir to the system's root directory if it is safe to ignore the current working directory.

This fixes a problem where a PCM file's embedded debug information can lead to compilation failure. The compiler may have decided it is indeed safe to ignore the current working directory. In this case, the PCM file's content is functionally correct regardless of the current working directory because no inputs use relative paths (see #124786). However, a PCM may contain debug info. If debug info is requested, the compiler uses the current working directory value to set DW_AT_comp_dir. This may lead to the following situation:

  1. Two different compilations need the same PCM file.
  2. The PCM file is compiled assuming a working directory, which is embedded in the debug info, but otherwise has no effect.
  3. The second compilation assumes a different working directory, hence expects an identically-sized pcm file. However, it cannot find such a PCM, because the existing PCM file has been compiled assuming a different DW_AT_comp_dir , which is embedded in the debug info.

This PR resets the DebugCompilationDir if it is functionally safe to ignore the working directory so the above situation is avoided, since all debug information will share the same working directory.


Full diff: https://github.com/llvm/llvm-project/pull/128446.diff

2 Files Affected:

  • (modified) clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp (+16-1)
  • (added) clang/test/ClangScanDeps/modules-debug-dir.c (+22)
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index 1c5f4c4b50ab6..8a94535d3806c 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -492,8 +492,23 @@ static std::string getModuleContextHash(const ModuleDeps &MD,
   auto &FSOpts = const_cast<FileSystemOptions &>(CI.getFileSystemOpts());
   if (CWD && !IgnoreCWD)
     HashBuilder.add(*CWD);
-  else
+  else {
     FSOpts.WorkingDir.clear();
+    auto &CGOpts = const_cast<CodeGenOptions &>(CI.getCodeGenOpts());
+    if (CGOpts.DwarfVersion && CWD) {
+      // It is necessary to explicitly set the DebugCompilationDir
+      // to a common directory (e.g. root) if IgnoreCWD is true.
+      // When IgnoreCWD is true, the module's content should not depend
+      // on the current working directory. However, if dwarf information
+      // is needed (when CGOpts.DwarfVersion is non-zero), and if
+      // CGOpts.DebugCompilationDir is not explicitly set,
+      // the current working directory will be automatically embedded
+      // in the dwarf information in the pcm, contradicting the assumption
+      // that it is safe to ignore the CWD. Thus in such cases,
+      // CGOpts.DebugCompilationDir is explicitly set to a common directory.
+      CGOpts.DebugCompilationDir = llvm::sys::path::root_path(*CWD);
+    }
+  }
 
   // Hash the BuildInvocation without any input files.
   SmallString<0> ArgVec;
diff --git a/clang/test/ClangScanDeps/modules-debug-dir.c b/clang/test/ClangScanDeps/modules-debug-dir.c
new file mode 100644
index 0000000000000..580f72205e68f
--- /dev/null
+++ b/clang/test/ClangScanDeps/modules-debug-dir.c
@@ -0,0 +1,22 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: sed -e "s|DIR|%/t|g" %t/cdb.json.in > %t/cdb.json
+// RUN: clang-scan-deps -compilation-database %t/cdb.json -format \
+// RUN:   experimental-full > %t/result.json
+
+//--- cdb.json.in
+[{
+  "directory": "DIR",
+  "command": "clang -g -fdebug-info-for-profiling DIR/tu.c -fmodules -fmodules-cache-path=DIR/cache -IDIR/include/ -o DIR/tu.o",
+  "file": "DIR/tu.c"
+}]
+
+//--- include/module.modulemap
+module mod {
+  header "mod.h"
+}
+
+//--- include/mod.h
+
+//--- tu.c
+#include "mod.h"

@qiongsiwu
Copy link
Contributor Author

qiongsiwu commented Feb 24, 2025

Note to reviewers:
This fixes rdar://145249881, but I'd like to get some input on three things.

  1. I am not sure if it is a good idea to set DebugCompilationDir to root. I chose it because I think it is safe and all systems should have the root directory. I am all ears for a different choice.
  2. I have not figured out a good way to test this. For some reason, I cannot get the module compilation command to produce debug information in the test (through -g, or -dwarf-version options). I don't think we always want to hardcode DebugCompilationDir to root, so I think we want to set it only if dwarf info is needed. May I get some pointers to get the compiler to generate debug info when compiling the pcm? The test is fixed. I need -gmodules.
  3. This fix feels very patchy to me. I worry that something else may somehow embed the CWD in the binary and we may have to play the cat and mouse game again. I think it may be fine for now, but I'd not want to leave it like this for too long and I'd love to get some suggestions.

Thanks!

@qiongsiwu qiongsiwu self-assigned this Feb 24, 2025
@qiongsiwu qiongsiwu marked this pull request as ready for review February 24, 2025 17:37
@cachemeifyoucan
Copy link
Collaborator

@adrian-prantl for the debug directory option since that might interact how debugger working with the option.

I think we need a solution approved by debugger forks for how to avoid stamping CWD into debug info and when it is safe to do so (for example, we guarantee all the paths in the TU/module are absolute path).

@qiongsiwu
Copy link
Contributor Author

qiongsiwu commented Feb 24, 2025

Or maybe it makes more sense to set the working directory to SYSROOT, or to the input header's path, or the directory that contains the output pcm?

Update: setting DebugCompilationDir to the pcms' output directory works. But as @cachemeifyoucan pointed out earlier this may not be the right thing to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants