Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve git_status_list (git status) #11

Open
wants to merge 34 commits into
base: libgit-next
Choose a base branch
from

Conversation

julianmesa-gitkraken
Copy link

There are 4 commits from romkatv
And one from me
Tested repo: https://axogitlab.axonet.local/Vicente-axosoft/knime_workflows
Perf times:
git status: ~500ms
Before PR git_status_list: ~1300ms
After PR git_status_list: ~750ms

implausible and others added 30 commits January 26, 2022 14:36
GitHub is removing support for the unauthenticated git protocol; test
with the https protocol.
Provide individual file ownership checks for both the current user and
the system user, as well as a combined current user and system user
check.
Ensure that the repository directory is owned by the current user; this
prevents us from opening configuration files that may have been created
by an attacker.
Provide a mock for file ownership for testability.
Test that we prevent opening directories that are not owned by
ourselves.
Pull the global configuration loader out of the symlink check so that it
can be re-used.
Obey the `safe.directory` configuration variable if it is set in the
global or system configuration. (Do not try to load this from the
repository configuration - to avoid malicious repositories that then
mark themselves as safe.)
Introduce the `GIT_OPT_SET_OWNER_VALIDATION` option, so that users can
disable repository ownership validation.
Validate repository directory ownership (v1.3)
It is a string array of the filters you want to disable in checkout
…s-on-no-nanoseconds-repos

Fix degraded performance using GIT_USE_NSEC on repos cloned with GIT_USE_NSEC disabled
There is a typo, Instead comparing nanoseconds to zero the code is comparing seconds to zero
Original commit: libgit2@8a565a7
iterator: sort same-dir entries by basename
The current code sorts entries in a directory by their full paths. Full paths
share long prefixes, which makes comparator slow, which makes sorting slow.
TimSort, that libgit2 uses for sorting, performs insertion sort when there are
fewer than 65 elements. Most directories have fewer than 65 entries. Insertion
sort performs a lot of comparisons, so it's very important to make comparator
cheap.
Copy link

@AlexaXs AlexaXs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed just first commit 8254d2e2aedde8edf6641cea729626d750591205 (Do not add the .gitignore file if it not existing)

{
char *path_file = NULL;

path_file = git__malloc(path->size + 11);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better replace 11 with a definition, like #define GIT_IGNORE_FILE_NAMELENGTH 11

return NULL;

memcpy(path_file, path->ptr, path->size);
memcpy(path_file + path->size, GIT_IGNORE_FILE, 11);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here, replace 11 with GIT_IGNORE_FILE_NAMELENGTH

@@ -370,13 +370,40 @@ int git_ignore__for_path(
return error;
}

char * concat_path_ign_file(git_buf *path)
{
char *path_file = NULL;
Copy link

@AlexaXs AlexaXs Jul 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we are not using git_buf and git_buf_joinpath (or git_str and git_str_joinpath after libgit2 1.4) for efficiency?

if (git_buf_joinpath(&ign->dir, ign->dir.ptr, dir) < 0)
return -1;

ign->depth++;

// Do not add the .gitignore file if it not existing
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe extend a bit the explanation in this comment, like:
// Do not add the .gitignore file if it does not exist, since
// calling push_ignore_file for repos with many dirs is expensive
?

file_ign = concat_path_ign_file(&ign->dir);
if (file_ign == NULL)
return -1;
if (p_stat(file_ign, &st) < 0) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested checking errno after p_stat:

    if (p_stat(file_ign, &st) < 0) {
      free(file_ign);
      if (errno != ENOENT && errno != ENOTDIR)
        return -1;
      else
        return 0;
    }

But the times went up a bit. At the moment caller of git_ignore__push_dir is not checking the return value.
Could that be a problem if p_stat fails with (errno != ENOENT && errno != ENOTDIR)?

int git_ignore__push_dir(git_ignores *ign, const char *dir)
{
char *file_ign = NULL;
struct stat st;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing asserts:

    GIT_ASSERT_ARG(ign);
    GIT_ASSERT_ARG(dir);

Copy link

@AlexaXs AlexaXs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also comment for first commit

Comment on lines +402 to +405
free(file_ign);
return 0;
}
free(file_ign);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we reserve with git_malloc, free with git__free

Copy link

@AlexaXs AlexaXs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review for only second commit 45f0e262f532a1bde03c8d35d8cdc0b471974a51 (iterator: don't stat directories)

@@ -482,6 +482,7 @@ struct git_path_diriter
size_t parent_len;

unsigned int flags;
unsigned char d_type;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems d_type is being assigned only for non-Windows platforms, but used for all platforms. Should we add this member as well to the same structure git_path_diriter above for Windows? or maybe specify the platform when used?

if (S_ISDIR(statbuf.st_mode)) {
bool submodule = false;
/* Ignore wacky things in the filesystem */
if (!S_ISDIR(statbuf.st_mode) &&
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check again if it's a directory? Can a submodule not be a directory?

if (submodule)
statbuf.st_mode = GIT_FILEMODE_COMMIT;
/* convert submodules to GITLINK and remove trailing slashes */
if (S_ISDIR(statbuf.st_mode)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check again if it's a directory? Can a submodule not be a directory?

statbuf.st_mode = GIT_FILEMODE_COMMIT;
/* convert submodules to GITLINK and remove trailing slashes */
if (S_ISDIR(statbuf.st_mode)) {
if (!submodule) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

matching end of { is confusing, maybe because it's not correctly indented?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants