Add Tabular Diff for CSV files #14661

Merged
KN4CK3R merged 30 commits from feature-csv-diff into master 2021-03-29 20:44:28 +00:00
KN4CK3R commented 2021-02-12 15:38:52 +00:00 (Migrated from github.com)

Implements request #14320 The rendering of csv files does match the diff style.

A simple change:
grafik

Changes in different areas (note the jumping line numbers):
grafik

A removed column is rendered without marking all rows as changed:
grafik

Toggle view:
toggle

Implements request #14320 The rendering of csv files does match the diff style. A simple change: ![grafik](https://user-images.githubusercontent.com/1666336/107120601-c41fdf80-688e-11eb-8522-78c7a3202da0.png) Changes in different areas (note the jumping line numbers): ![grafik](https://user-images.githubusercontent.com/1666336/107120621-e44f9e80-688e-11eb-94fe-53462502c288.png) A removed column is rendered without marking all rows as changed: ![grafik](https://user-images.githubusercontent.com/1666336/107120631-ef0a3380-688e-11eb-81ce-b1817e26139d.png) Toggle view: ![toggle](https://user-images.githubusercontent.com/1666336/107585285-3d586300-6bfe-11eb-8ff9-6931e06aa947.gif)
zeripath reviewed 2021-02-12 19:51:25 +00:00
@ -68,3 +150,4 @@
// ParseCompareInfo parse compare info between two commit for preparing comparing references
func ParseCompareInfo(ctx *context.Context) (*models.User, *models.Repository, *git.Repository, *git.CompareInfo, string, string) {
baseRepo := ctx.Repo.Repository
Contributor

This reader is unbound in size. There's no reason why that couldn't simply read in 100Mb straight in to memory.

This reader is unbound in size. There's no reason why that couldn't simply read in 100Mb straight in to memory.
zeripath reviewed 2021-02-12 19:54:02 +00:00
Contributor

This is also an unbound read.

This is also an unbound read.
zeripath reviewed 2021-02-12 19:54:07 +00:00
Contributor

And this.

And this.
KN4CK3R (Migrated from github.com) reviewed 2021-02-13 13:42:54 +00:00
@ -68,3 +150,4 @@
// ParseCompareInfo parse compare info between two commit for preparing comparing references
func ParseCompareInfo(ctx *context.Context) (*models.User, *models.Repository, *git.Repository, *git.CompareInfo, string, string) {
baseRepo := ctx.Repo.Repository
KN4CK3R (Migrated from github.com) commented 2021-02-13 13:42:54 +00:00

I have ignored this because this code doesn't run if the diff decides "file too large". Anyway the max file size is restricted to 512kb now like on GitHub.

I have ignored this because this code doesn't run if the diff decides "file too large". Anyway the max file size is restricted to 512kb now like on GitHub.
lunny reviewed 2021-02-13 16:02:15 +00:00

Why not keep these code still in package modules/markup/csv/?

Why not keep these code still in package `modules/markup/csv/`?
KN4CK3R (Migrated from github.com) reviewed 2021-02-14 07:43:39 +00:00
KN4CK3R (Migrated from github.com) commented 2021-02-14 07:43:39 +00:00

I thought the csv reader creation logic shouldn't be coupled with the render part.

I thought the csv reader creation logic shouldn't be coupled with the render part.
lunny reviewed 2021-02-15 03:44:34 +00:00

I'm OK with another package but put them in base package seems not suitable. Maybe we can create a new package named modules/csv or similar to do that.

I'm OK with another package but put them in `base` package seems not suitable. Maybe we can create a new package named `modules/csv` or similar to do that.
KN4CK3R (Migrated from github.com) reviewed 2021-02-15 15:28:16 +00:00
KN4CK3R (Migrated from github.com) commented 2021-02-15 15:28:16 +00:00

I moved the code to a new csv package.

I moved the code to a new csv package.
lunny reviewed 2021-02-15 16:24:10 +00:00
@ -0,0 +56,4 @@
}
// scoreDelimiter uses a count & regularity metric to evaluate a delimiter against lines of CSV.
func scoreDelimiter(lines []string, delim rune) float64 {

How did you handle "a,b"\t"c,b"?

How did you handle `"a,b"\t"c,b"`?
lunny reviewed 2021-02-15 16:24:30 +00:00
@ -0,0 +38,4 @@
maxBytes := util.Min(len(data), 1e4)
text := string(data[:maxBytes])
text = quoteRegexp.ReplaceAllLiteralString(text, "")
lines := strings.SplitN(text, "\n", maxLines+1)

"a\nb","c"?

`"a\nb","c"`?
lunny reviewed 2021-02-15 16:26:11 +00:00
@ -0,0 +56,4 @@
}
// scoreDelimiter uses a count & regularity metric to evaluate a delimiter against lines of CSV.
func scoreDelimiter(lines []string, delim rune) float64 {

This is not for this PR since it's only a refactor.

This is not for this PR since it's only a refactor.
lunny reviewed 2021-02-15 16:26:17 +00:00
@ -0,0 +38,4 @@
maxBytes := util.Min(len(data), 1e4)
text := string(data[:maxBytes])
text = quoteRegexp.ReplaceAllLiteralString(text, "")
lines := strings.SplitN(text, "\n", maxLines+1)

This is not for this PR since it's only a refactor.

This is not for this PR since it's only a refactor.
KN4CK3R (Migrated from github.com) reviewed 2021-02-15 18:13:03 +00:00
@ -0,0 +56,4 @@
}
// scoreDelimiter uses a count & regularity metric to evaluate a delimiter against lines of CSV.
func scoreDelimiter(lines []string, delim rune) float64 {
KN4CK3R (Migrated from github.com) commented 2021-02-15 18:13:03 +00:00

I didn't change this code but first the regex deletes all quoted text and then the delimiter \t is found.
Original code: 0a9a484e1e/modules/markup/csv/csv.go (L67-L113)

I didn't change this code but first the regex deletes all quoted text and then the delimiter `\t` is found. Original code: https://github.com/go-gitea/gitea/blob/0a9a484e1e2c082f7ebd747837e9c9557b3bacac/modules/markup/csv/csv.go#L67-L113
KN4CK3R (Migrated from github.com) reviewed 2021-02-15 18:15:48 +00:00
@ -0,0 +38,4 @@
maxBytes := util.Min(len(data), 1e4)
text := string(data[:maxBytes])
text = quoteRegexp.ReplaceAllLiteralString(text, "")
lines := strings.SplitN(text, "\n", maxLines+1)
KN4CK3R (Migrated from github.com) commented 2021-02-15 18:15:47 +00:00

Same as above.
grafik

Same as [above](https://github.com/go-gitea/gitea/pull/14661#discussion_r576360598). ![grafik](https://user-images.githubusercontent.com/1666336/107981331-25982a80-6fc2-11eb-9fda-0c444460fea5.png)
bagasme (Migrated from github.com) reviewed 2021-02-16 10:56:10 +00:00
@ -250,1 +250,4 @@
[ui.csv]
; Maximum allowed file size in bytes to render CSV files as table. (Set to 0 for no limit).
MAX_FILE_SIZE = 524288
bagasme (Migrated from github.com) commented 2021-02-16 10:56:10 +00:00

Why did you set 512 KB as default maximum?

Why did you set 512 KB as default maximum?
KN4CK3R (Migrated from github.com) reviewed 2021-02-16 11:03:53 +00:00
@ -250,1 +250,4 @@
[ui.csv]
; Maximum allowed file size in bytes to render CSV files as table. (Set to 0 for no limit).
MAX_FILE_SIZE = 524288
KN4CK3R (Migrated from github.com) commented 2021-02-16 11:03:53 +00:00

I just adopted the value from Github.

Exceeding the file size. Our rendering only works for files up to 512KB. Anything bigger than that slows down the browser. Source

I just adopted the value from Github. > Exceeding the file size. Our rendering only works for files up to 512KB. Anything bigger than that slows down the browser. [Source](https://docs.github.com/en/github/managing-files-in-a-repository/rendering-csv-and-tsv-data)
zeripath reviewed 2021-03-06 23:28:25 +00:00
Contributor

Is this really related?

Is this really related?
KN4CK3R (Migrated from github.com) reviewed 2021-03-10 22:11:31 +00:00
KN4CK3R (Migrated from github.com) commented 2021-03-10 22:11:31 +00:00

No, should this "fix" be another PR?

No, should this "fix" be another PR?
lafriks (Migrated from github.com) reviewed 2021-03-10 22:18:16 +00:00
lafriks (Migrated from github.com) commented 2021-03-10 22:18:16 +00:00

Yes, please

Yes, please
KN4CK3R (Migrated from github.com) reviewed 2021-03-10 22:27:00 +00:00
kdumontnu (Migrated from github.com) requested changes 2021-03-23 22:59:21 +00:00
kdumontnu (Migrated from github.com) left a comment

Can you add a test for line deletion and line addition? Seems to display incorrectly when I tested:

image

image

Can you add a test for line deletion and line addition? Seems to display incorrectly when I tested: ![image](https://user-images.githubusercontent.com/12700993/112229453-9bfafe80-8bf8-11eb-85cd-dd3570644f4e.png) ![image](https://user-images.githubusercontent.com/12700993/112229407-8980c500-8bf8-11eb-86ac-67c435411166.png)
kdumontnu (Migrated from github.com) commented 2021-03-23 22:51:33 +00:00

I would use something a little higher, like 1.2. Gets a little squished with some fonts.

image

I would use something a little higher, like 1.2. Gets a little squished with some fonts. ![image](https://user-images.githubusercontent.com/12700993/112229102-f5166280-8bf7-11eb-9f50-a31ade1a7431.png)
KN4CK3R (Migrated from github.com) reviewed 2021-03-24 15:24:15 +00:00
KN4CK3R (Migrated from github.com) commented 2021-03-24 15:24:15 +00:00

I have removed the line-height. Looks better without it.

I have removed the line-height. Looks better without it.
lunny approved these changes 2021-03-25 16:26:39 +00:00
kdumontnu (Migrated from github.com) approved these changes 2021-03-25 16:34:41 +00:00
kdumontnu (Migrated from github.com) left a comment

Awesome. Thanks for adding unit tests! ?

Awesome. Thanks for adding unit tests! ?
lafriks (Migrated from github.com) reviewed 2021-03-29 06:28:33 +00:00
lafriks (Migrated from github.com) commented 2021-03-29 06:28:33 +00:00

Could you add also @ as delimiter?

Could you add also `@` as delimiter?
lafriks (Migrated from github.com) reviewed 2021-03-29 06:29:50 +00:00
lafriks (Migrated from github.com) commented 2021-03-29 06:29:50 +00:00
// Copyright 2021 The Gitea Authors. All rights reserved.

please also update year in other newly added files

```suggestion // Copyright 2021 The Gitea Authors. All rights reserved. ``` please also update year in other newly added files
6543 (Migrated from github.com) approved these changes 2021-03-29 18:37:51 +00:00
lafriks (Migrated from github.com) approved these changes 2021-03-29 20:00:45 +00:00
kdumontnu (Migrated from github.com) approved these changes 2021-03-29 20:23:51 +00:00
This repo is archived. You cannot comment on pull requests.
No Milestone
No project
No Assignees
2 Participants
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: lunny/gitea#14661
No description provided.