Convert EOL to UNIX-style to render MD properly #8925

Merged
guillep2k merged 19 commits from fix-8865 into master 2019-11-13 02:27:12 +00:00
guillep2k commented 2019-11-11 19:35:52 +00:00 (Migrated from github.com)

Closes #8865

  • Tested with Linux server
  • Tested with Windows server
  • Tested with MAC server

NOTE: Props to @zeripath for devising a super-fast parsing function. ?

Closes #8865 - [x] Tested with Linux server - [x] Tested with Windows server - [x] Tested with MAC server NOTE: Props to @zeripath for devising a super-fast parsing function. ?
zeripath reviewed 2019-11-11 19:38:28 +00:00
zeripath reviewed 2019-11-11 19:50:44 +00:00
Contributor
		right := bytes.IndexByte(input[left:], '\r')
```suggestion right := bytes.IndexByte(input[left:], '\r') ```
zeripath reviewed 2019-11-11 19:52:16 +00:00
Contributor
	for left < len(input) && input[left] == '\n' {
```suggestion for left < len(input) && input[left] == '\n' { ```
zeripath reviewed 2019-11-11 21:23:20 +00:00
Contributor
	if bytes.IndexByte(input, '\r') == -1 {
		return input
	}
	tmp := make([]byte, len(input))
```suggestion if bytes.IndexByte(input, '\r') == -1 { return input } tmp := make([]byte, len(input)) ```
zeripath reviewed 2019-11-11 21:31:59 +00:00
Contributor

This makes it much faster for small unix files - but does slightly slow things down on big windows/mac files

This makes it much faster for small unix files - but does slightly slow things down on big windows/mac files
gary-kim (Migrated from github.com) reviewed 2019-11-12 00:02:59 +00:00
gary-kim (Migrated from github.com) left a comment

Wow, that's fast

Wow, that's fast
gary-kim (Migrated from github.com) commented 2019-11-12 00:01:31 +00:00

Is this faster then using something like left = bytes.LastIndexByte(input, '\n') + 1? My current system has gccgo so I'll check when I have a change to install gc instead for a second.

Is this faster then using something like `left = bytes.LastIndexByte(input, '\n') + 1`? My current system has `gccgo` so I'll check when I have a change to install `gc` instead for a second.
@ -66,0 +79,4 @@
pos += right
tmp[pos] = '\n'
left += right + 1
pos++
gary-kim (Migrated from github.com) commented 2019-11-12 00:02:48 +00:00

Nitpickest of nitpicks :)

length := len(input)
tmp := make([]byte, length)
Nitpickest of nitpicks :) ``` length := len(input) tmp := make([]byte, length) ```
guillep2k (Migrated from github.com) reviewed 2019-11-12 00:42:20 +00:00
@ -66,0 +79,4 @@
pos += right
tmp[pos] = '\n'
left += right + 1
pos++
guillep2k (Migrated from github.com) commented 2019-11-12 00:42:19 +00:00

It's not really different, since len() is not really a function but a language construct AFAIK. These are the numbers for 150,000 iterations (method9 is len before make):

==== Testing DOS conversion
     Method8: 733.420742ms
     Method9: 745.064906ms
==== Testing MAC conversion
     Method8: 740.482412ms
     Method9: 734.314871ms
==== Testing LINUX conversion
     Method8: 33.425224ms
     Method9: 33.219408ms
PASS
ok      speed   3.198s

I've attached my tests four your reference.

It's not really different, since `len()` is not really a function but a language construct AFAIK. These are the numbers for 150,000 iterations (method9 is len before make): ``` ==== Testing DOS conversion Method8: 733.420742ms Method9: 745.064906ms ==== Testing MAC conversion Method8: 740.482412ms Method9: 734.314871ms ==== Testing LINUX conversion Method8: 33.425224ms Method9: 33.219408ms PASS ok speed 3.198s ``` I've attached my tests four your reference.
guillep2k (Migrated from github.com) reviewed 2019-11-12 00:44:00 +00:00
guillep2k (Migrated from github.com) commented 2019-11-12 00:44:00 +00:00

Is this faster then using something like left = bytes.LastIndexByte(input, '\n') + 1? My current system has gccgo so I'll check when I have a change to install gc instead for a second.

@gary-kim check the test file attached in another comment; somehow I can't attach files inside code review comments.

> > > Is this faster then using something like `left = bytes.LastIndexByte(input, '\n') + 1`? My current system has `gccgo` so I'll check when I have a change to install `gc` instead for a second. @gary-kim check the test file attached in another comment; somehow I can't attach files inside code review comments.
guillep2k (Migrated from github.com) reviewed 2019-11-12 01:30:23 +00:00
@ -66,0 +79,4 @@
pos += right
tmp[pos] = '\n'
left += right + 1
pos++
guillep2k (Migrated from github.com) commented 2019-11-12 01:30:23 +00:00

@gary-kim Well, I stand corrected. I ran more tests and it seems it's slightly faster, go figure. Updated.

@gary-kim Well, I stand corrected. I ran more tests and it seems it's slightly faster, go figure. Updated.
gary-kim (Migrated from github.com) approved these changes 2019-11-12 01:56:28 +00:00
guillep2k (Migrated from github.com) reviewed 2019-11-12 02:27:56 +00:00
guillep2k (Migrated from github.com) commented 2019-11-12 02:27:56 +00:00
		dos := []byte(strings.ReplaceAll(str, "\n", "\r\n"))
```suggestion dos := []byte(strings.ReplaceAll(str, "\n", "\r\n")) ```
zeripath reviewed 2019-11-12 09:09:01 +00:00
Contributor

We can drop this check. We know that input[left] is not a \n from line 76.

We can drop this check. We know that `input[left]` is not a `\n` from line 76.
zeripath reviewed 2019-11-12 09:11:36 +00:00
Contributor

In fact now that we have unfurled the initial step we can drop lines 76-84 and lines 86-89

In fact now that we have unfurled the initial step we can drop lines 76-84 and lines 86-89
zeripath reviewed 2019-11-12 09:13:15 +00:00
Contributor

And you can drop the if left < length { test at line 85 because the only way that could happen would cause right == -1.

And you can drop the `if left < length {` test at line 85 because the only way that could happen would cause `right == -1`.
zeripath reviewed 2019-11-12 09:18:42 +00:00
Contributor

So that code was munching initial \n to prevent them from being skipped. However, I've realised it's totally unnecessary, see below...

So that code was munching initial `\n` to prevent them from being skipped. However, I've realised it's totally unnecessary, see below...
zeripath reviewed 2019-11-12 09:19:03 +00:00
Contributor
See https://github.com/go-gitea/gitea/pull/8925#issuecomment-552804972
zeripath reviewed 2019-11-12 09:59:43 +00:00
@ -66,0 +94,4 @@
}
copy(tmp[pos:pos+right], input[left:left+right])
pos += right
tmp[pos] = '\n'
Contributor

In Java, the JVM JIT has a weird feature whereby it's actually quicker to assign names used inside loops rather than refer to things used outside the loops, e.g. right := ... would be quicker than right = .... On testing, this does not appear to be the case for go's compiler.

In Java, the JVM JIT has a weird feature whereby it's actually quicker to assign names used inside loops rather than refer to things used outside the loops, e.g. `right := ...` would be quicker than `right = ...`. On testing, this does not appear to be the case for go's compiler.
guillep2k (Migrated from github.com) reviewed 2019-11-12 12:08:33 +00:00
guillep2k (Migrated from github.com) commented 2019-11-12 12:08:33 +00:00

? @zeripath done.

? @zeripath done.
mrsdizzie (Migrated from github.com) approved these changes 2019-11-12 19:12:15 +00:00
mrsdizzie (Migrated from github.com) left a comment

This makes my eyes hurt but seems exponentially better than ones that don't sadly :)

Works fine on Mac.

This makes my eyes hurt but seems exponentially better than ones that don't sadly :) Works fine on Mac.
sapk (Migrated from github.com) approved these changes 2019-11-13 00:52:19 +00:00
This repo is archived. You cannot comment on pull requests.
No Milestone
No project
No Assignees
2 Participants
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: lunny/gitea#8925
No description provided.