feat(type-checking): Add type-checking pre-commit hooks #32261

Open
wants to merge 12 commits into
base: master

Conversation

@alveifbklsiu259 alveifbklsiu259 commented Feb 14, 2025

feat(type-checking): Add type-checking pre-commit hooks

SUMMARY

This PR introduces some pre-commit hooks to check TypeScript files using tscw-config. This enhancement aims to streamline the development process by running type checks only on changed files.

NOTES

  • Since different packages (e.g., superset-frontend, superset-websocket) use their own tsconfig.json and may require different declaration files, multiple type-checking hooks are used. (Edit: after discussion, only one hook, type-check-frontend, is used; if requirements change in the future, the earlier commits can serve as a reference.)
  • JS/JSX files will be processed by the type-checking hooks, but checkJs is not enabled by default. To enable type-checking for JS/JSX files, either include //@ts-check at the top of each JS/JSX file or enable it in the package's tsconfig.json.
  • Each package (superset-frontend, superset-websocket, ...) includes all files ending with .d.ts in its own directory, recursively. To include third-party declaration files, either import them directly in the files that need them (for example, import "@testing-library/jest-dom") or declare them in the types field under compilerOptions in the package's tsconfig.json.
  • Under the hood, this hook creates a tsconfig.json from the current package's config (superset-frontend/tsconfig.json for superset-frontend), removes the include field, and lists all the files passed to the hook (plus some declaration files) in the files field. This means that ANY file passed to the hook will be type-checked, even if it is not covered by the original tsconfig.json's include field. For example, at the time of writing, superset-frontend/tsconfig.json does not include files from ./*/storybook/**/*, but if one modifies superset-frontend/packages/superset-ui-demo/storybook/stories/superset-ui-color/RenderPalettes.tsx, this file will still be type-checked. IMHO type-checking these files even though they will not be compiled to JavaScript is not a bad idea, but if one really wants to disable this behavior, one can:
    • Use // @ts-nocheck.
    • Prevent those files from being passed to the hook, for example, use exclude: ^(superset-frontend/cypress-base/|superset-frontend/.*/storybook/) in pre-commit-config.yaml.
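
The last note can be sketched roughly as follows. This is only an illustration of the idea, not the actual tscw-config code: deriveTsConfig is a hypothetical helper, the base config is assumed to be already parsed, and types/external.d.ts is a made-up declaration file.

```javascript
// Hypothetical helper: derive a one-off config from a package's parsed tsconfig.
// Assumption: `baseConfig` is the already-parsed content of e.g.
// superset-frontend/tsconfig.json (real tsconfig files may contain comments,
// so plain JSON.parse is not always enough to obtain it).
function deriveTsConfig(baseConfig, changedFiles, declarationFiles) {
  // Drop `include` so that only the explicitly listed files are checked.
  const { include, ...rest } = baseConfig;
  return {
    ...rest,
    // `files` names exactly what should be checked: the files handed to the
    // hook plus the package's .d.ts files.
    files: [...changedFiles, ...declarationFiles],
  };
}

const derived = deriveTsConfig(
  { compilerOptions: { strict: true }, include: ['src/**/*'] },
  ['packages/superset-ui-demo/storybook/stories/superset-ui-color/RenderPalettes.tsx'],
  ['types/external.d.ts'], // hypothetical declaration file
);
console.log(JSON.stringify(derived, null, 2));
```

Because the storybook file is listed in files explicitly, it is checked even though the original include pattern would not have matched it.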

DISCLOSURE

I am the author of tscw-config. I genuinely believe that this library offers an effective solution to the issue discussed in Issue #27379.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Performance Blocking File System Operations ▹ view
Logging Incorrect Log Levels for stdout/stderr ▹ view
Error Handling Missing Error Context In Catch Block ▹ view
Files scanned
File Path Reviewed
scripts/check-type.js

Explore our documentation to understand the languages and file types we support and the files we ignore.

Need a new review? Comment /korbit-review on this PR and I'll review your latest changes.

Korbit Guide: Usage and Customization

Interacting with Korbit

  • You can manually ask Korbit to review your PR using the /korbit-review command in a comment at the root of your PR.
  • You can ask Korbit to generate a new PR description using the /korbit-generate-pr-description command in any comment on your PR.
  • Too many Korbit comments? I can resolve all my comment threads if you use the /korbit-resolve command in any comment on your PR.
  • Chat with Korbit on issues we post by tagging @korbit-ai in your reply.
  • Help train Korbit to improve your reviews by giving a 👍 or 👎 on the comments Korbit posts.

Customizing Korbit

  • Check out our docs on how you can make Korbit work best for you and your team.
  • Customize Korbit for your organization through the Korbit Console.

Feedback and Support

scripts/check-type.js Outdated Show resolved Hide resolved
scripts/check-type.js Outdated Show resolved Hide resolved
scripts/check-type.js Outdated Show resolved Hide resolved
Contributor

@github-actions github-actions bot left a comment

Congrats on making your first PR and thank you for contributing to Superset! 🎉 ❤️

We hope to see you in our Slack community too! Not signed up? Use our Slack App to self-register.

… add error context

- Updated log levels for stdout and stderr to reflect appropriate severity.
- Refactored fs operations to use asynchronous method.
- Enhanced error handling by adding context information.
@sadpandajoe
Member

@mistercrunch don't we already have some sort of type checking? Adding you since you did a lot of work to clean up our pre-commits

scripts/check-type.js Fixed Show resolved Hide resolved
codecov bot commented Feb 14, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.45%. Comparing base (76d897e) to head (d97b699).
Report is 1453 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #32261       +/-   ##
===========================================
+ Coverage   60.48%   83.45%   +22.96%     
===========================================
  Files        1931      546     -1385     
  Lines       76236    39097    -37139     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    32629    -13485     
+ Misses      28017     6468    -21549     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 48.49% <ø> (-0.67%) ⬇️
javascript ?
mysql 75.85% <ø> (?)
postgres 75.92% <ø> (?)
presto 53.02% <ø> (-0.78%) ⬇️
python 83.45% <ø> (+19.96%) ⬆️
sqlite 75.43% <ø> (?)
unit 61.06% <ø> (+3.44%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


@mistercrunch
Member

@sadpandajoe yes, we do in CI, but this adds the checks to pre-commit so issues can be caught at commit time, which is better. It does add a fair amount of complexity to deliver on that.

Wondering if it would make sense to only do it on superset-frontend/**/js-and-ts-files to keep pre-commit.yml a bit simpler.

@alveifbklsiu259 is it all super fast?

- Removed type-checking-websocket hook
- Removed type-checking-embedded-sdk hook
- Removed type-checking-cypress hook
…n CI pipeline

Disable it as it requires npm ci, plus type-check is handled by other job.
@github-actions github-actions bot added the github_actions Pull requests that update GitHub Actions code label Feb 15, 2025
@alveifbklsiu259
Author

/korbit-review

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Error Handling Missing TSConfig Existence Check ▹ view
Error Handling Missing Dependency Check ▹ view
Functionality Incomplete Process Output Handling ▹ view
Files scanned
File Path Reviewed
scripts/check-type.js

scripts/check-type.js Outdated Show resolved Hide resolved
scripts/check-type.js Show resolved Hide resolved
scripts/check-type.js Outdated Show resolved Hide resolved
- Add check for module installation and provide clear error message if missing
- Ensure both stdout and stderr are logged if they contain content
- Add check for existence of tsconfig.json and provide clear error message if missing
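
The checks listed in this commit message could look roughly like the sketch below. The helper names (isModuleInstalled, findTsConfig, logProcessOutput) and the error wording are assumptions made for illustration; the actual code in scripts/check-type.js is not shown in this thread, and the choice of console.log for stdout and console.error for stderr is likewise only a guess.

```javascript
const fs = require('fs');
const path = require('path');

// Returns true if `moduleName` can be resolved from this script's location.
function isModuleInstalled(moduleName) {
  try {
    require.resolve(moduleName);
    return true;
  } catch (error) {
    return false;
  }
}

// Returns the path to tsconfig.json inside `packageDir`,
// or throws an error that says where it was expected.
function findTsConfig(packageDir) {
  const tsConfigPath = path.join(packageDir, 'tsconfig.json');
  if (!fs.existsSync(tsConfigPath)) {
    throw new Error(`tsconfig.json not found at ${tsConfigPath}`);
  }
  return tsConfigPath;
}

// Logs whichever of the two streams actually has content.
function logProcessOutput({ stdout, stderr }) {
  if (stdout) console.log(stdout);
  if (stderr) console.error(stderr);
}
```

Failing early with a message that names the missing dependency or config path tends to be easier to act on than a stack trace from deeper inside the hook.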
@alveifbklsiu259
Author

/korbit-review

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Security Command Injection Vulnerability in TypeScript Compiler Command ▹ view
Performance Inefficient Array Concatenation ▹ view
Performance Sequential Directory Traversal ▹ view
Error Handling Missing Directory Read Error Handling ▹ view
Files scanned
File Path Reviewed
scripts/check-type.js


const packageRootDirAbsolute = join(SUPERSET_ROOT, packageRootDir);
const tsConfig = getTsConfig(packageRootDirAbsolute);
const command = `--noEmit --allowJs --composite false --project ${tsConfig} ${argsStr} ${declarationFilesStr}`;

Command Injection Vulnerability in TypeScript Compiler Command category Security

Tell me more
What is the issue?

Command string is constructed using direct string interpolation with user-provided input (argsStr and declarationFilesStr), which could lead to command injection if the input is malicious.

Why this matters

An attacker could potentially inject malicious commands by providing specially crafted arguments that escape the intended TypeScript compiler context and execute arbitrary system commands.

Suggested change ∙ Feature Preview

Validate and sanitize argsStr and declarationFilesStr to ensure they only contain safe TypeScript compiler arguments. Consider using a whitelist of allowed arguments or a safer command construction method:

const safeArgs = validateTypeScriptArgs(argsStr);
const safeDeclarationFiles = validateDeclarationFiles(declarationFilesStr);
const command = ['--noEmit', '--allowJs', '--composite', 'false', '--project', tsConfig, ...safeArgs, ...safeDeclarationFiles].join(' ');

Author

@korbit-ai

I understand the concerns about command injection, but I'd like to provide some context on why sanitization might not be strictly necessary in this case.

  • Controlled Environment: The script is executed as part of a pre-commit hook in our version control. This controlled environment means that the inputs to the script are coming from a trusted source (i.e., the pre-commit configuration). These inputs include only the changed files and predefined arguments.

  • Trusted Source: Since the arguments are provided by the pre-commit hook configuration, the risk of malicious input is significantly minimized. The inputs are controlled by the developers working within our repository, which reduces the likelihood of encountering harmful characters.

  • Practical Implications: Adding sanitization to the script could impose unnecessary restrictions on file naming conventions. This might lead to confusion and hinder developer productivity, as it would require developers to avoid certain characters in file names to ensure the type-checking script works correctly.

Additionally, we have similar validation for arguments. For instance, declaration files are ensured to be files that end with .d.ts, and only arguments predefined in pre-commit-config.yaml or files matching ^superset-frontend\/.*\.(js|jsx|ts|tsx)$ are passed to the script.
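
As a small illustration of the file filter mentioned above, the pattern can be exercised directly in JavaScript. The pattern itself is quoted from the comment; selectCheckableFiles and the sample paths are hypothetical.

```javascript
// Pattern quoted in the comment above.
const FRONTEND_SOURCE_REGEX = /^superset-frontend\/.*\.(js|jsx|ts|tsx)$/;

// Hypothetical helper: keep only the paths the hook would receive.
function selectCheckableFiles(paths) {
  return paths.filter((filePath) => FRONTEND_SOURCE_REGEX.test(filePath));
}

const sample = [
  'superset-frontend/src/components/Chart.tsx',
  'superset-websocket/src/index.ts', // different package: filtered out
  'superset-frontend/README.md', // wrong extension: filtered out
  'superset-frontend/src/odd name;v2.ts', // unusual characters still match
];
console.log(selectCheckableFiles(sample));
```

Note that the pattern constrains the directory prefix and the extension, not the characters in between, so it narrows which files reach the script without restricting how they are named.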

Thank you @alveifbklsiu259 for your detailed explanation; you have valid points and this context helps. Given the controlled environment and trusted source of the inputs, the risk is indeed minimized. However, I still recommend being cautious about potential command injections, even in controlled environments: better safe than sorry. But, as you clearly laid out, the practical implications might outweigh the theoretical security risk in this case. Thanks for your thoughtful work on this.
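
One way to get both properties discussed in this thread (no restrictions on file names, no shell interpretation) is to pass arguments as an array instead of a single interpolated string. The sketch below is an assumption-laden illustration, not the PR's code: buildCheckerArgs and runChecker are hypothetical names, the flags are copied from the command string quoted earlier, and checkerBin stands for whichever wrapper binary the hook invokes, assumed here to be directly executable.

```javascript
const { execFile } = require('child_process');

// Build the argument vector. Each file name stays a single array element,
// so spaces or shell metacharacters in a path are never parsed by a shell.
function buildCheckerArgs(tsConfigPath, extraArgs, files) {
  return [
    '--noEmit',
    '--allowJs',
    '--composite',
    'false',
    '--project',
    tsConfigPath,
    ...extraArgs,
    ...files,
  ];
}

// Hypothetical runner: execFile starts the binary directly, without a shell.
function runChecker(checkerBin, args) {
  return new Promise((resolve, reject) => {
    execFile(checkerBin, args, (error, stdout, stderr) => {
      if (error) {
        // Attach the captured output so the caller can still log it.
        reject(Object.assign(error, { stdout, stderr }));
      } else {
        resolve({ stdout, stderr });
      }
    });
  });
}
```

With this shape, a file called `a b; echo hi.ts` reaches the checker as one argument, which avoids the need to sanitize or forbid characters in file names.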

scripts/check-type.js Outdated Show resolved Hide resolved
scripts/check-type.js Outdated Show resolved Hide resolved
scripts/check-type.js Outdated Show resolved Hide resolved
- getFilesRecursively: use `promise.all`, handle error.
- getPackage: handle error.
@alveifbklsiu259
Author

/korbit-review

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Functionality Incorrect Directory Exclusion Logic ▹ view
Suppressed issues based on your team's Korbit activity
This issue Is similar to Because

line 64:

Command string is constructed using direct string interpolation with unsanitized user inputs (argsStr and declarationFilesStr), which could lead to command injection.

Command Injection Vulnerability in TypeScript Compiler Command

Ignored

When you react to issues (for example, an upvote or downvote) or you fix them, Korbit will tune future reviews based on these signals.

Files scanned
File Path Reviewed
scripts/check-type.js

scripts/check-type.js Outdated Show resolved Hide resolved
@alveifbklsiu259
Author

/korbit-review

@korbit-ai korbit-ai bot left a comment

Review by Korbit AI

Korbit automatically attempts to detect when you fix issues in new commits.
Category Issue Fix Detected
Performance Unbounded concurrent directory traversal ▹ view
Functionality Directory exclusion logic false positives ▹ view
Files scanned
File Path Reviewed
scripts/check-type.js


Comment on lines 100 to 118
const recursivePromises = [];
const result = [];

for (const file of files) {
  const fullPath = join(dir, file.name);
  const shouldExclude = excludedDirs.some((excludedDir) =>
    normalize(fullPath).includes(normalize(excludedDir))
  );

  if (file.isDirectory() && !shouldExclude) {
    recursivePromises.push(
      getFilesRecursively(fullPath, regex, excludedDirs)
    );
  } else if (regex.test(file.name)) {
    result.push(fullPath);
  }
}

const recursiveResults = await Promise.all(recursivePromises);
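
On the "Directory exclusion logic false positives" item flagged in this review: the shouldExclude check in the snippet above uses a substring match, so an excluded directory such as pkg/types would also match pkg/types-extra. A segment-aware comparison is one possible alternative. The sketch below is a hypothetical helper (isInsideDir is not a name from the PR) and assumes both paths can be resolved against the same working directory.

```javascript
const path = require('path');

// True if `candidate` is `excludedDir` itself or lies somewhere inside it.
function isInsideDir(candidate, excludedDir) {
  const relative = path.relative(
    path.resolve(excludedDir),
    path.resolve(candidate),
  );
  if (relative === '') return true; // same directory
  // A relative path that climbs upward (or is absolute, e.g. another drive
  // on Windows) means `candidate` is outside `excludedDir`.
  const escapes = relative === '..' || relative.startsWith(`..${path.sep}`);
  return !escapes && !path.isAbsolute(relative);
}

// Substring matching flags the sibling directory, segment matching does not.
console.log('pkg/types-extra/a.d.ts'.includes('pkg/types'));
console.log(isInsideDir('pkg/types-extra/a.d.ts', 'pkg/types'));
```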

Unbounded concurrent directory traversal category Performance

Tell me more
What is the issue?

The recursive directory traversal creates and awaits all promises simultaneously, which can lead to memory spikes when processing large directory trees.

Why this matters

In large codebases with deep directory structures, this approach will create many promises in memory before any can complete, potentially causing out-of-memory issues or degraded performance.

Suggested change ∙ Feature Preview

Implement a semaphore or concurrency limit to control the number of concurrent directory reads. Consider using a library like p-limit or implement a simple queue:

const pLimit = require('p-limit');
const limit = pLimit(10); // limit concurrent operations

async function getFilesRecursively(dir, regex, excludedDirs) {
  const files = await readdir(dir, { withFileTypes: true });
  const recursivePromises = [];
  const result = [];

  for (const file of files) {
    // ... existing checks ...
    if (file.isDirectory() && !shouldExclude) {
      recursivePromises.push(
        limit(() => getFilesRecursively(fullPath, regex, excludedDirs))
      );
    }
    // ... rest of the code ...
  }
}
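
For reference, the "simple queue" alternative mentioned above could look roughly like the sketch below, without any extra dependency. createLimiter and listFiles are hypothetical names, and the directory exclusion logic is left out for brevity. One design point worth noting: only the readdir call is wrapped in the limiter. Wrapping the recursive call itself may stall once the tree is deeper than the limit, because every parent would hold a slot while waiting for its children.

```javascript
const { readdir } = require('fs/promises');
const { join } = require('path');

// Minimal semaphore: at most `max` tasks run at once, the rest wait in a queue.
function createLimiter(max) {
  let active = 0;
  const waiting = [];
  const release = () => {
    const next = waiting.shift();
    if (next) next(); // hand the slot straight to the next waiter
    else active -= 1;
  };
  return async function limit(task) {
    if (active >= max) {
      // The slot is handed over by release(), so `active` is not incremented.
      await new Promise((resolve) => waiting.push(resolve));
    } else {
      active += 1;
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}

// Collects files whose names match `regex` (assumed to be a non-global RegExp).
async function listFiles(dir, regex, limit) {
  // Only the directory read holds a slot; the recursion is not wrapped.
  const entries = await limit(() => readdir(dir, { withFileTypes: true }));
  const nested = await Promise.all(
    entries.map((entry) => {
      const fullPath = join(dir, entry.name);
      if (entry.isDirectory()) return listFiles(fullPath, regex, limit);
      return regex.test(entry.name) ? [fullPath] : [];
    }),
  );
  return nested.flat();
}
```

A call such as listFiles(packageRootDir, /\.d\.ts$/, createLimiter(10)) would then read at most ten directories at a time.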

Author

@korbit-ai

After analyzing the memory impact, I found that the increase is not significant enough to warrant the added complexity of introducing a semaphore or concurrency limit. Implementing such a solution would complicate the codebase.

I believe the current implementation is sufficient for our use case and does not pose any immediate performance concerns. We can revisit this approach if we encounter memory-related issues in the future. The measurement I used:

  const initial = process.memoryUsage();

  let declarationFiles = await getFilesRecursively(
    packageRootDir,
    DECLARATION_FILE_REGEX,
    excludedDeclarationDirs
  );
  const final = process.memoryUsage();

  const memoryDiff = {
    rss: `${Math.round((final.rss - initial.rss) / (1024 * 1024))} MB`,
    heapTotal: `${Math.round(
      (final.heapTotal - initial.heapTotal) / (1024 * 1024)
    )} MB`,
    heapUsed: `${Math.round(
      (final.heapUsed - initial.heapUsed) / (1024 * 1024)
    )} MB`,
    external: `${Math.round(
      (final.external - initial.external) / (1024 * 1024)
    )} MB`,
  };

  console.log("Memory Difference:", memoryDiff); 
  // Memory Difference: { rss: '16 MB', heapTotal: '9 MB', heapUsed: '3 MB', external: '0 MB' }

scripts/check-type.js Outdated Show resolved Hide resolved
Labels
dependencies:npm embedded github_actions Pull requests that update GitHub Actions code size/L
Projects
None yet
3 participants