Add Tabular Diff for CSV files #14661

zeripath · 2021-02-12T15:38:52Z

KN4CK3R commented

2021-02-12 15:38:52 +00:00

(Migrated from github.com)

Implements request #14320 The rendering of csv files does match the diff style.

A simple change:

Changes in different areas (note the jumping line numbers):

A removed column is rendered without marking all rows as changed:

Toggle view:

Implements request #14320 The rendering of csv files does match the diff style. A simple change: ![grafik](https://user-images.githubusercontent.com/1666336/107120601-c41fdf80-688e-11eb-8522-78c7a3202da0.png) Changes in different areas (note the jumping line numbers): ![grafik](https://user-images.githubusercontent.com/1666336/107120621-e44f9e80-688e-11eb-94fe-53462502c288.png) A removed column is rendered without marking all rows as changed: ![grafik](https://user-images.githubusercontent.com/1666336/107120631-ef0a3380-688e-11eb-81ce-b1817e26139d.png) Toggle view: ![toggle](https://user-images.githubusercontent.com/1666336/107585285-3d586300-6bfe-11eb-8ff9-6931e06aa947.gif)

👍 6 ❤️ 2 🎉 6

zeripath reviewed 2021-02-12 19:51:25 +00:00

routers/repo/compare.go

						
				@ -68,3 +150,4 @@

				// ParseCompareInfo parse compare info between two commit for preparing comparing references

				func ParseCompareInfo(ctx *context.Context) (*models.User, *models.Repository, *git.Repository, *git.CompareInfo, string, string) {

					baseRepo := ctx.Repo.Repository

zeripath commented

2021-02-12 19:51:21 +00:00

This reader is unbound in size. There's no reason why that couldn't simply read in 100Mb straight in to memory.

zeripath reviewed 2021-02-12 19:54:02 +00:00

services/gitdiff/csv.go

zeripath commented

2021-02-12 19:54:02 +00:00

This is also an unbound read.

zeripath reviewed 2021-02-12 19:54:07 +00:00

services/gitdiff/csv.go

zeripath commented

2021-02-12 19:54:07 +00:00

And this.

KN4CK3R (Migrated from github.com) reviewed 2021-02-13 13:42:54 +00:00

routers/repo/compare.go

						
				@ -68,3 +150,4 @@

				// ParseCompareInfo parse compare info between two commit for preparing comparing references

				func ParseCompareInfo(ctx *context.Context) (*models.User, *models.Repository, *git.Repository, *git.CompareInfo, string, string) {

					baseRepo := ctx.Repo.Repository

KN4CK3R (Migrated from github.com) commented

2021-02-13 13:42:54 +00:00

I have ignored this because this code doesn't run if the diff decides "file too large". Anyway the max file size is restricted to 512kb now like on GitHub.

lunny reviewed 2021-02-13 16:02:15 +00:00

modules/base/csv.go

lunny commented

2021-02-13 16:02:14 +00:00

Why not keep these code still in package modules/markup/csv/?

Why not keep these code still in package `modules/markup/csv/`?

KN4CK3R (Migrated from github.com) reviewed 2021-02-14 07:43:39 +00:00

modules/base/csv.go

KN4CK3R (Migrated from github.com) commented

2021-02-14 07:43:39 +00:00

I thought the csv reader creation logic shouldn't be coupled with the render part.

lunny reviewed 2021-02-15 03:44:34 +00:00

modules/base/csv.go

lunny commented

2021-02-15 03:44:34 +00:00

I'm OK with another package but put them in base package seems not suitable. Maybe we can create a new package named modules/csv or similar to do that.

I'm OK with another package but put them in `base` package seems not suitable. Maybe we can create a new package named `modules/csv` or similar to do that.

KN4CK3R (Migrated from github.com) reviewed 2021-02-15 15:28:16 +00:00

modules/base/csv.go

KN4CK3R (Migrated from github.com) commented

2021-02-15 15:28:16 +00:00

I moved the code to a new csv package.

lunny reviewed 2021-02-15 16:24:10 +00:00

modules/csv/csv.go

						
				@ -0,0 +56,4 @@

				}

				// scoreDelimiter uses a count & regularity metric to evaluate a delimiter against lines of CSV.

				func scoreDelimiter(lines []string, delim rune) float64 {

lunny commented

2021-02-15 16:24:10 +00:00

How did you handle "a,b"\t"c,b"?

How did you handle `"a,b"\t"c,b"`?

lunny reviewed 2021-02-15 16:24:30 +00:00

modules/csv/csv.go

						
				@ -0,0 +38,4 @@

					maxBytes := util.Min(len(data), 1e4)

					text := string(data[:maxBytes])

					text = quoteRegexp.ReplaceAllLiteralString(text, "")

					lines := strings.SplitN(text, "\n", maxLines+1)

lunny commented

2021-02-15 16:24:30 +00:00

"a\nb","c"?

`"a\nb","c"`?

lunny reviewed 2021-02-15 16:26:11 +00:00

modules/csv/csv.go

						
				@ -0,0 +56,4 @@

				}

				// scoreDelimiter uses a count & regularity metric to evaluate a delimiter against lines of CSV.

				func scoreDelimiter(lines []string, delim rune) float64 {

lunny commented

2021-02-15 16:26:11 +00:00

This is not for this PR since it's only a refactor.

lunny reviewed 2021-02-15 16:26:17 +00:00

modules/csv/csv.go

						
				@ -0,0 +38,4 @@

					maxBytes := util.Min(len(data), 1e4)

					text := string(data[:maxBytes])

					text = quoteRegexp.ReplaceAllLiteralString(text, "")

					lines := strings.SplitN(text, "\n", maxLines+1)

lunny commented

2021-02-15 16:26:17 +00:00

This is not for this PR since it's only a refactor.

KN4CK3R (Migrated from github.com) reviewed 2021-02-15 18:13:03 +00:00

modules/csv/csv.go

						
				@ -0,0 +56,4 @@

				}

				// scoreDelimiter uses a count & regularity metric to evaluate a delimiter against lines of CSV.

				func scoreDelimiter(lines []string, delim rune) float64 {

KN4CK3R (Migrated from github.com) commented

2021-02-15 18:13:03 +00:00

I didn't change this code but first the regex deletes all quoted text and then the delimiter \t is found.
Original code: 0a9a484e1e/modules/markup/csv/csv.go (L67-L113)

I didn't change this code but first the regex deletes all quoted text and then the delimiter `\t` is found. Original code: https://github.com/go-gitea/gitea/blob/0a9a484e1e2c082f7ebd747837e9c9557b3bacac/modules/markup/csv/csv.go#L67-L113

KN4CK3R (Migrated from github.com) reviewed 2021-02-15 18:15:48 +00:00

modules/csv/csv.go

						
				@ -0,0 +38,4 @@

					maxBytes := util.Min(len(data), 1e4)

					text := string(data[:maxBytes])

					text = quoteRegexp.ReplaceAllLiteralString(text, "")

					lines := strings.SplitN(text, "\n", maxLines+1)

KN4CK3R (Migrated from github.com) commented

2021-02-15 18:15:47 +00:00

Same as above.

Same as [above](https://github.com/go-gitea/gitea/pull/14661#discussion_r576360598). ![grafik](https://user-images.githubusercontent.com/1666336/107981331-25982a80-6fc2-11eb-9fda-0c444460fea5.png)

bagasme (Migrated from github.com) reviewed 2021-02-16 10:56:10 +00:00

custom/conf/app.example.ini

						
				@ -250,1 +250,4 @@

				[ui.csv]

				; Maximum allowed file size in bytes to render CSV files as table. (Set to 0 for no limit).

				MAX_FILE_SIZE = 524288

bagasme (Migrated from github.com) commented

2021-02-16 10:56:10 +00:00

Why did you set 512 KB as default maximum?

KN4CK3R (Migrated from github.com) reviewed 2021-02-16 11:03:53 +00:00

custom/conf/app.example.ini

						
				@ -250,1 +250,4 @@

				[ui.csv]

				; Maximum allowed file size in bytes to render CSV files as table. (Set to 0 for no limit).

				MAX_FILE_SIZE = 524288

KN4CK3R (Migrated from github.com) commented

2021-02-16 11:03:53 +00:00

I just adopted the value from Github.

Exceeding the file size. Our rendering only works for files up to 512KB. Anything bigger than that slows down the browser. Source

I just adopted the value from Github. > Exceeding the file size. Our rendering only works for files up to 512KB. Anything bigger than that slows down the browser. [Source](https://docs.github.com/en/github/managing-files-in-a-repository/rendering-csv-and-tsv-data)

zeripath reviewed 2021-03-06 23:28:25 +00:00

docs/content/doc/advanced/config-cheat-sheet.en-us.md

zeripath commented

2021-03-06 23:28:25 +00:00

Is this really related?

KN4CK3R (Migrated from github.com) reviewed 2021-03-10 22:11:31 +00:00

docs/content/doc/advanced/config-cheat-sheet.en-us.md

KN4CK3R (Migrated from github.com) commented

2021-03-10 22:11:31 +00:00

No, should this "fix" be another PR?

lafriks (Migrated from github.com) reviewed 2021-03-10 22:18:16 +00:00

docs/content/doc/advanced/config-cheat-sheet.en-us.md

lafriks (Migrated from github.com) commented

2021-03-10 22:18:16 +00:00

Yes, please

KN4CK3R (Migrated from github.com) reviewed 2021-03-10 22:27:00 +00:00

docs/content/doc/advanced/config-cheat-sheet.en-us.md

KN4CK3R (Migrated from github.com) commented

2021-03-10 22:27:00 +00:00

done #14949

kdumontnu (Migrated from github.com) requested changes 2021-03-23 22:59:21 +00:00

kdumontnu (Migrated from github.com) left a comment

Can you add a test for line deletion and line addition? Seems to display incorrectly when I tested:

Can you add a test for line deletion and line addition? Seems to display incorrectly when I tested: ![image](https://user-images.githubusercontent.com/12700993/112229453-9bfafe80-8bf8-11eb-85cd-dd3570644f4e.png) ![image](https://user-images.githubusercontent.com/12700993/112229407-8980c500-8bf8-11eb-86ac-67c435411166.png)

web_src/less/_repository.less

kdumontnu (Migrated from github.com) commented

2021-03-23 22:51:33 +00:00

I would use something a little higher, like 1.2. Gets a little squished with some fonts.

I would use something a little higher, like 1.2. Gets a little squished with some fonts. ![image](https://user-images.githubusercontent.com/12700993/112229102-f5166280-8bf7-11eb-9f50-a31ade1a7431.png)

KN4CK3R (Migrated from github.com) reviewed 2021-03-24 15:24:15 +00:00

web_src/less/_repository.less

KN4CK3R (Migrated from github.com) commented

2021-03-24 15:24:15 +00:00

I have removed the line-height. Looks better without it.

lunny approved these changes 2021-03-25 16:26:39 +00:00

kdumontnu (Migrated from github.com) approved these changes 2021-03-25 16:34:41 +00:00

kdumontnu (Migrated from github.com) left a comment

Awesome. Thanks for adding unit tests! ?

lafriks (Migrated from github.com) reviewed 2021-03-29 06:28:33 +00:00

modules/csv/csv.go

lafriks (Migrated from github.com) commented

2021-03-29 06:28:33 +00:00

Could you add also @ as delimiter?

Could you add also `@` as delimiter?

lafriks (Migrated from github.com) reviewed 2021-03-29 06:29:50 +00:00

modules/csv/csv.go

lafriks (Migrated from github.com) commented

2021-03-29 06:29:50 +00:00

// Copyright 2021 The Gitea Authors. All rights reserved.

please also update year in other newly added files

6543 (Migrated from github.com) approved these changes 2021-03-29 18:37:51 +00:00

lafriks (Migrated from github.com) approved these changes 2021-03-29 20:00:45 +00:00

kdumontnu (Migrated from github.com) approved these changes 2021-03-29 20:23:51 +00:00

This repo is archived. You cannot comment on pull requests.

No reviewers

6543

lafriks

kdumontnu

No Label

hacktoberfest-accepted

kind/upstream-related

status/needs-feedback

No Milestone

No project

No Assignees

2 Participants

Due Date

The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: lunny/gitea#14661