Search Commits via Commit Hash #7400

Merged
gary-kim merged 13 commits from enh/7395/search-via-commit-hash into master 2019-09-02 23:38:05 +00:00
gary-kim commented 2019-07-09 10:55:18 +00:00 (Migrated from github.com)

Closes #7395
Commit history can now be searched with commit hash.

EDIT: whoops, slight mistake in the test case. I'll fix it in a bit.
EDIT2: should be fixed now.

Closes #7395 Commit history can now be searched with commit hash. EDIT: whoops, slight mistake in the test case. I'll fix it in a bit. EDIT2: should be fixed now.
codecov-io commented 2019-07-09 15:20:14 +00:00 (Migrated from github.com)

Codecov Report

Merging #7400 into master will increase coverage by 0.01%.
The diff coverage is 92.85%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #7400      +/-   ##
=========================================
+ Coverage   41.59%   41.6%   +0.01%     
=========================================
  Files         480     480              
  Lines       64144   64160      +16     
=========================================
+ Hits        26681   26695      +14     
- Misses      34004   34007       +3     
+ Partials     3459    3458       -1
Impacted Files Coverage Δ
modules/git/repo_commit.go 54% <92.85%> (+2.71%) ⬆️
models/unit.go 62.16% <0%> (-5.41%) ⬇️
models/repo_list.go 73.17% <0%> (-0.98%) ⬇️
modules/log/event.go 65.64% <0%> (+1.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6097ff6...1ff3ae4. Read the comment docs.

# [Codecov](https://codecov.io/gh/go-gitea/gitea/pull/7400?src=pr&el=h1) Report > Merging [#7400](https://codecov.io/gh/go-gitea/gitea/pull/7400?src=pr&el=desc) into [master](https://codecov.io/gh/go-gitea/gitea/commit/6097ff68e7a50a3ae4933a4169c007a78651017e?src=pr&el=desc) will **increase** coverage by `0.01%`. > The diff coverage is `92.85%`. [![Impacted file tree graph](https://codecov.io/gh/go-gitea/gitea/pull/7400/graphs/tree.svg?width=650&token=t1G57YGbPy&height=150&src=pr)](https://codecov.io/gh/go-gitea/gitea/pull/7400?src=pr&el=tree) ```diff @@ Coverage Diff @@ ## master #7400 +/- ## ========================================= + Coverage 41.59% 41.6% +0.01% ========================================= Files 480 480 Lines 64144 64160 +16 ========================================= + Hits 26681 26695 +14 - Misses 34004 34007 +3 + Partials 3459 3458 -1 ``` | [Impacted Files](https://codecov.io/gh/go-gitea/gitea/pull/7400?src=pr&el=tree) | Coverage Δ | | |---|---|---| | [modules/git/repo\_commit.go](https://codecov.io/gh/go-gitea/gitea/pull/7400/diff?src=pr&el=tree#diff-bW9kdWxlcy9naXQvcmVwb19jb21taXQuZ28=) | `54% <92.85%> (+2.71%)` | :arrow_up: | | [models/unit.go](https://codecov.io/gh/go-gitea/gitea/pull/7400/diff?src=pr&el=tree#diff-bW9kZWxzL3VuaXQuZ28=) | `62.16% <0%> (-5.41%)` | :arrow_down: | | [models/repo\_list.go](https://codecov.io/gh/go-gitea/gitea/pull/7400/diff?src=pr&el=tree#diff-bW9kZWxzL3JlcG9fbGlzdC5nbw==) | `73.17% <0%> (-0.98%)` | :arrow_down: | | [modules/log/event.go](https://codecov.io/gh/go-gitea/gitea/pull/7400/diff?src=pr&el=tree#diff-bW9kdWxlcy9sb2cvZXZlbnQuZ28=) | `65.64% <0%> (+1.02%)` | :arrow_up: | ------ [Continue to review full report at Codecov](https://codecov.io/gh/go-gitea/gitea/pull/7400?src=pr&el=continue). > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta) > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data` > Powered by [Codecov](https://codecov.io/gh/go-gitea/gitea/pull/7400?src=pr&el=footer). Last update [6097ff6...1ff3ae4](https://codecov.io/gh/go-gitea/gitea/pull/7400?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
lunny requested changes 2019-07-10 02:46:20 +00:00
@ -223,3 +219,3 @@
for _, v := range opts.Committers {
cmd.AddArguments("--committer=" + v)
args = append(args, "--committer="+v)
}

I don't think this is right, because there is a prameter -100 before.

I don't think this is right, because there is a prameter `-100` before.

Maybe we could add an extra git log <keyword> -1 to get all output when keyword didn't containers : and then merge the output with original.

Maybe we could add an extra `git log <keyword> -1` to get all output when `keyword` didn't containers `:` and then merge the output with original.
gary-kim commented 2019-07-10 03:44:22 +00:00 (Migrated from github.com)

Maybe we could add an extra git log <keyword> -1 to get all output when keyword didn't containers : and then merge the output with original.

Wouldn't that only match a single commit hash? In Gitea's commit history, there are 4 hashes that start with 46d3 so doing git log 46d3 -1 only outputs the most recent one or errors, I don't remember which. Also, do we want this matching only from the start or from the middle of the commit hash as well?

EDIT: Okay, not 46d3 but I do remember finding a two commit hashes that had the same first 4 characters.

> Maybe we could add an extra `git log <keyword> -1` to get all output when `keyword` didn't containers `:` and then merge the output with original. Wouldn't that only match a single commit hash? In Gitea's commit history, there are 4 hashes that start with `46d3` so doing `git log 46d3 -1` only outputs the most recent one or errors, I don't remember which. Also, do we want this matching only from the start or from the middle of the commit hash as well? EDIT: Okay, not `46d3` but I do remember finding a two commit hashes that had the same first 4 characters.
gary-kim (Migrated from github.com) reviewed 2019-07-10 03:48:24 +00:00
@ -223,3 +219,3 @@
for _, v := range opts.Committers {
cmd.AddArguments("--committer=" + v)
args = append(args, "--committer="+v)
}
gary-kim (Migrated from github.com) commented 2019-07-10 03:48:24 +00:00

I didn't think about that. The commit hash search would not go back as far as the grep search.

I didn't think about that. The commit hash search would not go back as far as the grep search.

@gary-kim I think we should only match prefix what is the git log does.

@gary-kim I think we should only match prefix what is the `git log` does.
gary-kim commented 2019-07-10 13:19:34 +00:00 (Migrated from github.com)

@gary-kim I think we should only match prefix what is the git log does.

@lunny Okay, done.

> @gary-kim I think we should only match prefix what is the `git log` does. @lunny Okay, done.
lunny reviewed 2019-07-10 14:53:50 +00:00
@ -30,1 +30,4 @@
func TestRepoCommitsSearch(t *testing.T) {
testRepoCommitsSearch(t, "e8eabd", "")
testRepoCommitsSearch(t, "38a9cb", "")
testRepoCommitsSearch(t, "6e8e", "6e8eabd9a7")

Could we have a failed test.

Could we have a failed test.

@gary-kim Good work! Two small things we may need to consider:

  • If a commit has another commit's id on commit message, we may get duplicated commit output, how to fix that?
  • I find repo. parsePrettyFormatLogToList will get all commit information again via git command, so why we use prettyLogFormat parameter? Just parse the full commit information may give us better performance.
    Of course this issue is out of this PR. I just found that there.
@gary-kim Good work! Two small things we may need to consider: - If a commit has another commit's id on commit message, we may get duplicated commit output, how to fix that? - I find repo. parsePrettyFormatLogToList will get all commit information again via `git` command, so why we use `prettyLogFormat` parameter? Just parse the full commit information may give us better performance. Of course this issue is out of this PR. I just found that there.
gary-kim (Migrated from github.com) reviewed 2019-07-10 15:50:25 +00:00
@ -30,1 +30,4 @@
func TestRepoCommitsSearch(t *testing.T) {
testRepoCommitsSearch(t, "e8eabd", "")
testRepoCommitsSearch(t, "38a9cb", "")
testRepoCommitsSearch(t, "6e8e", "6e8eabd9a7")
gary-kim (Migrated from github.com) commented 2019-07-10 15:50:25 +00:00

Sorry, I'm not sure what you mean by that.

Sorry, I'm not sure what you mean by that.
gary-kim commented 2019-07-10 16:05:53 +00:00 (Migrated from github.com)
  • If a commit has another commit's id on commit message, we may get duplicated commit output, how to fix that?

I just pushed a commit that should solve this issue.

  • I find repo. parsePrettyFormatLogToList will get all commit information again via git command, so why we use prettyLogFormat parameter? Just parse the full commit information may give us better performance.
    Of course this issue is out of this PR. I just found that there.

I did find that strange as well. Could I open a PR after this to try and address that?

> * If a commit has another commit's id on commit message, we may get duplicated commit output, how to fix that? I just pushed a commit that should solve this issue. > * I find repo. parsePrettyFormatLogToList will get all commit information again via `git` command, so why we use `prettyLogFormat` parameter? Just parse the full commit information may give us better performance. > Of course this issue is out of this PR. I just found that there. I did find that strange as well. Could I open a PR after this to try and address that?
mrsdizzie (Migrated from github.com) reviewed 2019-07-10 16:17:08 +00:00
@ -30,1 +30,4 @@
func TestRepoCommitsSearch(t *testing.T) {
testRepoCommitsSearch(t, "e8eabd", "")
testRepoCommitsSearch(t, "38a9cb", "")
testRepoCommitsSearch(t, "6e8e", "6e8eabd9a7")
mrsdizzie (Migrated from github.com) commented 2019-07-10 16:17:07 +00:00

I think to have a search that should not match anything, to test that it works right when there is nothing to match also.

So if there is no commit 58f73d1a78 then do a search for that and make sure it returns empty and not something else.

I think to have a search that should *not* match anything, to test that it works right when there is nothing to match also. So if there is no commit ```58f73d1a78``` then do a search for that and make sure it returns empty and not something else.
gary-kim (Migrated from github.com) reviewed 2019-07-10 17:06:43 +00:00
@ -30,1 +30,4 @@
func TestRepoCommitsSearch(t *testing.T) {
testRepoCommitsSearch(t, "e8eabd", "")
testRepoCommitsSearch(t, "38a9cb", "")
testRepoCommitsSearch(t, "6e8e", "6e8eabd9a7")
gary-kim (Migrated from github.com) commented 2019-07-10 17:06:43 +00:00

Okay, sounds good.

Okay, sounds good.
gary-kim (Migrated from github.com) reviewed 2019-07-10 17:25:21 +00:00
@ -30,1 +30,4 @@
func TestRepoCommitsSearch(t *testing.T) {
testRepoCommitsSearch(t, "e8eabd", "")
testRepoCommitsSearch(t, "38a9cb", "")
testRepoCommitsSearch(t, "6e8e", "6e8eabd9a7")
gary-kim (Migrated from github.com) commented 2019-07-10 17:25:21 +00:00

Alright, added a few.

Alright, added a few.
  • If a commit has another commit's id on commit message, we may get duplicated commit output, how to fix that?

I just pushed a commit that should solve this issue.

  • I find repo. parsePrettyFormatLogToList will get all commit information again via git command, so why we use prettyLogFormat parameter? Just parse the full commit information may give us better performance.
    Of course this issue is out of this PR. I just found that there.

I did find that strange as well. Could I open a PR after this to try and address that?

Yes, please.

> > * If a commit has another commit's id on commit message, we may get duplicated commit output, how to fix that? > > I just pushed a commit that should solve this issue. > > > * I find repo. parsePrettyFormatLogToList will get all commit information again via `git` command, so why we use `prettyLogFormat` parameter? Just parse the full commit information may give us better performance. > > Of course this issue is out of this PR. I just found that there. > > I did find that strange as well. Could I open a PR after this to try and address that? Yes, please.
lunny approved these changes 2019-07-11 01:20:03 +00:00
gary-kim commented 2019-07-11 03:11:43 +00:00 (Migrated from github.com)

I did find that strange as well. Could I open a PR after this to try and address that?

I just took a better look at how parsePrettyFormatLogToList works and I think I'll need some time to understand this codebase a little more before working on this. If someone else would like to work on this, great, please go ahead! If not, I'll get back to this later.

> I did find that strange as well. Could I open a PR after this to try and address that? I just took a better look at how `parsePrettyFormatLogToList` works and I think I'll need some time to understand this codebase a little more before working on this. If someone else would like to work on this, great, please go ahead! If not, I'll get back to this later.
typeless commented 2019-07-11 05:15:02 +00:00 (Migrated from github.com)

@gary-kim There seems to be a more straightforward approach.

  1. Scan each keyword and determine if it is a hex number (https://golang.org/pkg/encoding/hex/#DecodeString) and its length is longer than 4 characters (less than 4 characters won't be accepted by git as valid commit id).
  2. If it is a hex number, collect it into a slice (named hashes, for example).
  3. Append the hashes slice to the end of the command.
@gary-kim There seems to be a more straightforward approach. 1. Scan each keyword and determine if it is a hex number (https://golang.org/pkg/encoding/hex/#DecodeString) and its length is longer than 4 characters (less than 4 characters won't be accepted by git as valid commit id). 2. If it is a hex number, collect it into a slice (named `hashes`, for example). 3. Append the `hashes` slice to the end of the command.
gary-kim commented 2019-07-11 06:10:56 +00:00 (Migrated from github.com)

@gary-kim There seems to be a more straightforward approach.

  1. Scan each keyword and determine if it is a hex number (https://golang.org/pkg/encoding/hex/#DecodeString) and its length is longer than 4 characters (less than 4 characters won't be accepted by git as valid commit id).

  2. If it is a hex number, collect it into a slice (named hashes, for example).

  3. Append the hashes slice to the end of the command.

git log only uses the first commit hash in the command so I don't think appending hashes to the end of the command is going to do anything.
Also, there would need to be a check to make sure it is actually a commit first anyways because parsePrettyFormatLogToList returns an error if any of the given commits are not actual commits.

> @gary-kim There seems to be a more straightforward approach. > > 1. Scan each keyword and determine if it is a hex number (https://golang.org/pkg/encoding/hex/#DecodeString) and its length is longer than 4 characters (less than 4 characters won't be accepted by git as valid commit id). > > 2. If it is a hex number, collect it into a slice (named `hashes`, for example). > > 3. Append the `hashes` slice to the end of the command. `git log` only uses the first commit hash in the command so I don't think appending hashes to the end of the command is going to do anything. Also, there would need to be a check to make sure it is actually a commit first anyways because `parsePrettyFormatLogToList` returns an error if any of the given commits are not actual commits.
typeless commented 2019-07-11 06:33:27 +00:00 (Migrated from github.com)

@gary-kim Yeah, that does not work as I expected :(
My rationale is that it would be nice if the seemingly duplicate code paths (cmd and args) can be unified into one.
It is probably not easy though.

By the way, what is the expected behavior of the hash keyword? Do they OR or AND with other conditions?

@gary-kim Yeah, that does not work as I expected :( My rationale is that it would be nice if the seemingly duplicate code paths (`cmd` and `args`) can be unified into one. It is probably not easy though. By the way, what is the expected behavior of the hash keyword? Do they OR or AND with other conditions?
gary-kim commented 2019-07-11 07:36:42 +00:00 (Migrated from github.com)

By the way, what is the expected behavior of the hash keyword? Do they OR or AND with other conditions?

They will act similarly to the keywords. The commit needs to match all the given conditions such as author: and at least one of the keywords/hashes. From a user's standpoint, it's as if the hash has become part of the commit message.

My rationale is that it would be nice if the seemingly duplicate code paths (cmd and args) can be unified into one.

That is possible. Although, personally, I'd rather keep it as it is. Essentially, the original command can also built using the args slice at the end. The slight issue is what the command needs to run starts to really diverge before the command can be run which will make the last little bit of the function kind of messy.
@lunny What do you think?

> By the way, what is the expected behavior of the hash keyword? Do they OR or AND with other conditions? They will act similarly to the keywords. The commit needs to match all the given conditions such as `author:` and at least one of the keywords/hashes. From a user's standpoint, it's as if the hash has become part of the commit message. > My rationale is that it would be nice if the seemingly duplicate code paths (cmd and args) can be unified into one. That is possible. Although, personally, I'd rather keep it as it is. Essentially, the original command can also built using the `args` slice at the end. The slight issue is what the command needs to run starts to really diverge before the command can be run which will make the last little bit of the function kind of messy. @lunny What do you think?

@gary-kim I think it's enough for this PR. @typeless is the original author of this feature. :)

@gary-kim I think it's enough for this PR. @typeless is the original author of this feature. :)
gary-kim commented 2019-07-11 12:34:46 +00:00 (Migrated from github.com)

@typeless is the original author of this feature. :)

Thank you @typeless for this feature and everyone else who has contributed to Gogs/Gitea.

> @typeless is the original author of this feature. :) Thank you @typeless for this feature and everyone else who has contributed to Gogs/Gitea.
typeless commented 2019-07-12 01:54:13 +00:00 (Migrated from github.com)

@gary-kim you're welcome. I am a user benefited from this project too.

@gary-kim you're welcome. I am a user benefited from this project too.
typeless commented 2019-07-15 02:20:37 +00:00 (Migrated from github.com)

I am not going to block the PR if there is no clear way to improve it. But it still needs another maintainer's review & approval.

I am not going to block the PR if there is no clear way to improve it. But it still needs another maintainer's review & approval.
gary-kim commented 2019-08-12 03:40:54 +00:00 (Migrated from github.com)

@typeless I got rid of the duplicate code paths. What do you think?

@typeless I got rid of the duplicate code paths. What do you think?
typeless (Migrated from github.com) reviewed 2019-08-12 04:02:58 +00:00
typeless (Migrated from github.com) commented 2019-08-12 04:02:23 +00:00

The constant 6 here is a little bit mysterious.

The constant 6 here is a little bit mysterious.
typeless (Migrated from github.com) commented 2019-08-12 03:59:41 +00:00

Why does it have to trim the ‘\n’?

Why does it have to trim the ‘\n’?
gary-kim (Migrated from github.com) reviewed 2019-08-12 04:07:03 +00:00
gary-kim (Migrated from github.com) commented 2019-08-12 04:07:03 +00:00

My bad, I kind of changed that without thinking about it. Changed it back to 4. The idea is it's probably not worth checking strings that are too short.

My bad, I kind of changed that without thinking about it. Changed it back to 4. The idea is it's probably not worth checking strings that are too short.
gary-kim (Migrated from github.com) reviewed 2019-08-12 04:12:28 +00:00
gary-kim (Migrated from github.com) commented 2019-08-12 04:12:28 +00:00

New commits are added on by appending new hashes and a new line but you make a good point. Could just move the new line to before the hash.

EDIT: Done

EDIT2: Turns out that was needed to prevent 500 errors. I put it back in.

New commits are added on by appending new hashes and a new line but you make a good point. Could just move the new line to before the hash. EDIT: Done EDIT2: Turns out that was needed to prevent 500 errors. I put it back in.
typeless (Migrated from github.com) reviewed 2019-08-12 05:03:58 +00:00
typeless (Migrated from github.com) commented 2019-08-12 05:03:58 +00:00

Okay, it looks like a problem of how parsePrettyFormatLogToList handles its input.

Okay, it looks like a problem of how parsePrettyFormatLogToList handles its input.
typeless commented 2019-08-12 05:06:31 +00:00 (Migrated from github.com)

@gary-kim LGTM

@gary-kim LGTM
typeless (Migrated from github.com) reviewed 2019-08-12 05:11:55 +00:00
typeless (Migrated from github.com) commented 2019-08-12 05:11:55 +00:00

I think the constant 4 or 6 deserves a comment. I can figure that it is related to how git recognize the shortest allowed input hash. But it is not apparent simply looking at the code.

I think the constant 4 or 6 deserves a comment. I can figure that it is related to how `git` recognize the shortest allowed input hash. But it is not apparent simply looking at the code.
typeless (Migrated from github.com) approved these changes 2019-08-12 05:21:21 +00:00

CI failed.

CI failed.
gary-kim commented 2019-08-13 13:09:56 +00:00 (Migrated from github.com)

CI failed.

Fixed.

> CI failed. Fixed.

@typeless you are not a maintainer yet but you have satisfied the condition, please send a PR to apply for that. Otherwise your approval will not be counted.

@typeless you are not a maintainer yet but you have satisfied the condition, please send a PR to apply for that. Otherwise your approval will not be counted.
guillep2k (Migrated from github.com) reviewed 2019-08-16 17:00:53 +00:00
@ -241,0 +246,4 @@
for _, v := range opts.Keywords {
if len(v) >= 4 {
hashCmd := NewCommand("log", "-1", prettyLogFormat)
hashCmd.AddArguments(args...)
guillep2k (Migrated from github.com) commented 2019-08-16 17:00:53 +00:00

If I'm getting this code right, you always add the first '\n' and then always remove the last one withTrimSuffix(). Perhaps it would be easier to prepend '\n' when required? e.g. change:
stdout = append(stdout, hashMatching...)
into
stdout = append(stdout, '\n', hashMatching...)
and remove the other appends & TrimSuffix?

If I'm getting this code right, you always add the first `'\n'` and then always remove the last one with`TrimSuffix()`. Perhaps it would be easier to prepend `'\n'` when required? e.g. change: `stdout = append(stdout, hashMatching...)` into `stdout = append(stdout, '\n', hashMatching...)` and remove the other appends & TrimSuffix?
guillep2k (Migrated from github.com) reviewed 2019-08-16 17:01:48 +00:00
guillep2k (Migrated from github.com) left a comment

Check for simplification on '\n' handling.

Check for simplification on `'\n'` handling.
gary-kim (Migrated from github.com) reviewed 2019-08-17 05:21:38 +00:00
@ -241,0 +246,4 @@
for _, v := range opts.Keywords {
if len(v) >= 4 {
hashCmd := NewCommand("log", "-1", prettyLogFormat)
hashCmd.AddArguments(args...)
gary-kim (Migrated from github.com) commented 2019-08-17 05:21:38 +00:00

The issue with prepending is that repo.parsePrettyFormatLogToList is a very picky function. Having any empty line will result in a 500 error. By prepending, if the first command did not return any results, you could end up with an empty line at the start. You could check for if the line is empty beforehand but then you have an extra if statement that runs very time during the loop.

The important part is that the first '\n' is only added if the original command did not return anything.

Also, the multiple appends were needed because Go does not let you use ... as a part, not the whole of a variadic function parameter.

The issue with prepending is that `repo.parsePrettyFormatLogToList` is a very picky function. Having any empty line will result in a 500 error. By prepending, if the first command did not return any results, you could end up with an empty line at the start. You could check for if the line is empty beforehand but then you have an extra if statement that runs very time during the loop. The important part is that the first `'\n'` is only added if the original command did not return anything. Also, the multiple appends were needed because Go does not let you use `...` as a part, not the whole of a variadic function parameter.
guillep2k (Migrated from github.com) reviewed 2019-08-17 20:02:30 +00:00
@ -241,0 +246,4 @@
for _, v := range opts.Keywords {
if len(v) >= 4 {
hashCmd := NewCommand("log", "-1", prettyLogFormat)
hashCmd.AddArguments(args...)
guillep2k (Migrated from github.com) commented 2019-08-17 20:02:30 +00:00

OK, I see it now.

OK, I see it now.
guillep2k (Migrated from github.com) approved these changes 2019-09-02 22:28:38 +00:00
This repo is archived. You cannot comment on pull requests.
No Milestone
No project
No Assignees
1 Participants
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: lunny/gitea#7400
No description provided.