Speed up git diff highlight generation #16180

Merged
typeless merged 4 commits from speedup-highlight into main 2021-06-17 14:55:17 +00:00
typeless commented 2021-06-17 03:10:47 +00:00 (Migrated from github.com)

For #14734

Note that the client-side bottleneck, which is the primary culprit of #14734, remains.
This PR addresses only the server-side path.

Before:

File: gitea
Type: cpu
Time: Jun 16, 2021 at 5:23pm (CST)
Duration: 30.13s, Total samples = 22.60s (75.00%)
Showing nodes accounting for 12.96s, 57.35% of 22.60s total
Dropped 571 nodes (cum <= 0.11s)
Showing top 15 nodes out of 165
      flat  flat%   sum%        cum   cum%
     2.81s 12.43% 12.43%      2.81s 12.43%  unicode/utf8.DecodeRuneInString
     2.65s 11.73% 24.16%      8.13s 35.97%  github.com/danwakefield/fnmatch.Match
     1.63s  7.21% 31.37%      4.43s 19.60%  github.com/danwakefield/fnmatch.unpackRune
     1.06s  4.69% 36.06%      1.06s  4.69%  code.gitea.io/gitea/services/gitdiff.(*DiffSection).GetLine
     1.05s  4.65% 40.71%      2.43s 10.75%  github.com/danwakefield/fnmatch.Match.func1
     0.68s  3.01% 43.72%      0.86s  3.81%  github.com/gogs/chardet.(*recognizerMultiByte).matchConfidence
     0.54s  2.39% 46.11%      1.91s  8.45%  runtime.mallocgc
     0.51s  2.26% 48.36%      1.10s  4.87%  runtime.scanobject
     0.41s  1.81% 50.18%      0.50s  2.21%  runtime.heapBitsSetType
     0.36s  1.59% 51.77%      1.27s  5.62%  github.com/dlclark/regexp2.(*runner).execute
     0.28s  1.24% 53.01%      8.54s 37.79%  github.com/alecthomas/chroma/lexers/internal.Match
     0.27s  1.19% 54.20%      0.27s  1.19%  runtime.memmove
     0.25s  1.11% 55.31%      0.25s  1.11%  runtime.memclrNoHeapPointers
     0.24s  1.06% 56.37%      0.24s  1.06%  runtime.nextFreeFast (inline)
     0.22s  0.97% 57.35%     16.80s 74.34%  reflect.Value.call

After:

File: gitea
Type: cpu
Time: Jun 17, 2021 at 10:23am (CST)
Duration: 30.10s, Total samples = 25390ms (84.34%)
Showing nodes accounting for 9480ms, 37.34% of 25390ms total
Dropped 602 nodes (cum <= 126.95ms)
Showing top 15 nodes out of 185
      flat  flat%   sum%        cum   cum%
    1790ms  7.05%  7.05%     1800ms  7.09%  code.gitea.io/gitea/services/gitdiff.(*DiffSection).GetLine
    1140ms  4.49% 11.54%     2400ms  9.45%  runtime.scanobject
    1100ms  4.33% 15.87%     3640ms 14.34%  runtime.mallocgc
     810ms  3.19% 19.06%      990ms  3.90%  runtime.heapBitsSetType
     700ms  2.76% 21.82%     2650ms 10.44%  github.com/dlclark/regexp2.(*runner).execute
     680ms  2.68% 24.50%      840ms  3.31%  github.com/gogs/chardet.(*recognizerMultiByte).matchConfidence
     480ms  1.89% 26.39%      480ms  1.89%  runtime.nextFreeFast
     430ms  1.69% 28.08%     1180ms  4.65%  fmt.(*pp).doPrintf
     400ms  1.58% 29.66%      400ms  1.58%  runtime.memmove
     360ms  1.42% 31.08%      460ms  1.81%  runtime.findObject
     340ms  1.34% 32.41%    16670ms 65.66%  reflect.Value.call
     330ms  1.30% 33.71%      330ms  1.30%  runtime.memhash64
     320ms  1.26% 34.97%      320ms  1.26%  github.com/dlclark/regexp2.(*runner).checkTimeout
     320ms  1.26% 36.23%      320ms  1.26%  runtime.memclrNoHeapPointers
     280ms  1.10% 37.34%      550ms  2.17%  runtime.mapaccess1_fast64
lunny reviewed 2021-06-17 05:45:56 +00:00

Is the fileName an absolute path?

lafriks (Migrated from github.com) approved these changes 2021-06-17 06:05:30 +00:00
lafriks (Migrated from github.com) reviewed 2021-06-17 06:06:06 +00:00
lafriks (Migrated from github.com) commented 2021-06-17 06:06:06 +00:00

I don't think that matters, as it is only used to get the lexer.

typeless (Migrated from github.com) reviewed 2021-06-17 06:08:36 +00:00
typeless (Migrated from github.com) commented 2021-06-17 06:08:36 +00:00

To be honest, I am not so sure.
As far as I can tell, the matcher of chroma cares about the file extension only. It uses the extension to determine the file type for syntax highlighting.
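
To illustrate why an absolute path would probably not change the result, here is a minimal sketch of a glob-based matcher that looks only at the last path element. It is not chroma's code: `registry`, `lexerEntry`, and `matchLexer` are hypothetical names, and the assumption that the real matcher behaves this way is exactly the point under discussion.

```go
package main

import (
	"fmt"
	"path"
)

// lexerEntry is a hypothetical stand-in for a lexer definition:
// a name plus the filename globs it claims.
type lexerEntry struct {
	name  string
	globs []string
}

// registry is a tiny illustrative table, not the real lexer list.
var registry = []lexerEntry{
	{"Go", []string{"*.go"}},
	{"Python", []string{"*.py"}},
	{"Docker", []string{"Dockerfile", "*.dockerfile"}},
}

// matchLexer strips the directory part and tests the base name
// against each glob, returning the first lexer that matches.
func matchLexer(fileName string) string {
	base := path.Base(fileName)
	for _, e := range registry {
		for _, g := range e.globs {
			if ok, _ := path.Match(g, base); ok {
				return e.name
			}
		}
	}
	return ""
}

func main() {
	// Absolute and relative forms of the same file give the same lexer.
	fmt.Println(matchLexer("/srv/repos/gitea/main.go"))
	fmt.Println(matchLexer("main.go"))
}
```

Under that assumption, the directory prefix never reaches the glob comparison, so only the base name (in practice, mostly the extension) decides the lexer.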

KN4CK3R (Migrated from github.com) approved these changes 2021-06-17 06:48:44 +00:00
lunny reviewed 2021-06-18 11:53:13 +00:00

The cache will keep growing until Gitea is restarted. Maybe we should use only the extension of the filename as the key here.
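
A rough sketch of that suggestion, assuming a simple mutex-guarded map: keying by extension bounds the cache by the number of distinct extensions rather than distinct file paths. All names here (`lexerCache`, `cacheKey`, `slowLookup`, `calls`) are hypothetical and do not come from the PR.

```go
package main

import (
	"fmt"
	"path"
	"sync"
)

// calls counts how often the expensive lookup runs (for illustration only).
var calls int

// slowLookup is a hypothetical stand-in for the costly lexer match.
func slowLookup(key string) string {
	calls++
	switch path.Ext(key) {
	case ".go":
		return "Go"
	case ".py":
		return "Python"
	}
	return "fallback"
}

// lexerCache memoizes lookups behind a mutex.
type lexerCache struct {
	mu sync.Mutex
	m  map[string]string
}

func newLexerCache() *lexerCache {
	return &lexerCache{m: make(map[string]string)}
}

// cacheKey uses the extension when there is one, and falls back to the
// base name for files such as "Dockerfile" that have none.
func cacheKey(fileName string) string {
	if ext := path.Ext(fileName); ext != "" {
		return ext
	}
	return path.Base(fileName)
}

// get returns the cached lexer name, computing it on the first miss.
func (c *lexerCache) get(fileName string) string {
	key := cacheKey(fileName)
	c.mu.Lock()
	defer c.mu.Unlock()
	if name, ok := c.m[key]; ok {
		return name
	}
	name := slowLookup(key)
	c.m[key] = name
	return name
}

func main() {
	c := newLexerCache()
	for _, f := range []string{"a/main.go", "b/util.go", "c/d/server.go", "x.py"} {
		c.get(f)
	}
	// Four files, but only two distinct keys (".go" and ".py").
	fmt.Println(len(c.m), calls)
}
```

The trade-off is that files matched by full name rather than by extension would still add one entry each, so this limits growth without strictly capping it.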

Reference: lunny/gitea#16180