'actions/checkout@v3' with LFS fails because of double auth header #164

Open
opened 2023-04-29 21:55:49 +00:00 by stavae · 8 comments

The issue:

Using 'actions/checkout@v3' with LFS fails to get the LFS files. It appears to be caused by a duplicate authentication header.

I have no idea why this happens or even who is responsible for it ('actions/checkout@v3' vs 'gitea/act_runner' vs 'gitea' vs 'something in my setup').

Reproduce steps:

  1. Have Gitea running behind an nginx (swag) proxy, with act_runner running
  2. Have a test project with LFS files
  3. Have the provided .gitea/workflows/release.yml (00_release.txt)
  4. Run the action

This will fail with (see 01_lfs fail.txt for entire log):

[command]/usr/bin/git lfs fetch origin refs/remotes/origin/master
fetch: Fetching reference refs/remotes/origin/master
LFS: Client error: https://gitea-test.my-domain.com/<USERNAME>/test.git/info/lfs/objects/<FILE ID>
error: failed to fetch some objects from 'https://gitea-test.my-domain.com/<USERNAME>/test.git/info/lfs'
The process '/usr/bin/git' failed with exit code 2

The nginx proxy logs (with error log verbosity set to info) show that the auth header is sent twice:

2023/04/29 23:05:49 [info] 240#240: *10461 client sent duplicate header line: "authorization: basic <AUTH TOKEN>", previous value: "authorization: basic <AUTH TOKEN>" while reading client request headers, client: 192.168.3.1, server: gitea-test.my-domain.com, host: "gitea-test.my-domain.com"
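
To check whether a proxy is rejecting requests for this reason, the log can be searched for that message. This is a minimal sketch: the function name is hypothetical, and the log path depends on the nginx setup (the message only appears at error log level info).

```shell
# Count how often nginx logged a duplicate Authorization header.
# $1: path to the nginx error log (location depends on your setup).
count_duplicate_auth() {
  grep -ci 'client sent duplicate header line: "authorization:' "$1"
}
```

Call it as `count_duplicate_auth <path to error.log>`; a non-zero count suggests the same failure mode as described here.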

This result (in the nginx proxy) can be reproduced more easily by manually curl-ing the file with the duplicated header, like so:

root@ba7335b1c49e:/<USERNAME>/test# curl  https://gitea-test.my-domain.com/<USERNAME>/test.git/info/lfs/objects/<FILE ID> -H "Authorization: basic <AUTH TOKEN>" -H "Authorization: basic <AUTH TOKEN>"
<html>
<head><title>400 Bad Request</title></head>
<body>
<center><h1>400 Bad Request</h1></center>
<hr><center>nginx</center>
</body>
</html>

If we only provide one header, everything works as expected:

root@ba7335b1c49e:/<USERNAME>/test# curl  https://gitea-test.my-domain.com/<USERNAME>/test.git/info/lfs/objects/<FILE ID> -H "Authorization: basic <AUTH TOKEN>"
<the file contents>

A more verbose output of git lfs fetch can be found at 02_verbose lfs fetch.txt

Versions used:

  • Host machine - Linux version 5.4.0-147-generic (buildd@lcy02-amd64-067) (gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)) #164-Ubuntu SMP Tue Mar 21 14:23:17 UTC 2023
  • linuxserver/nginx [docker] - 1.22.1-r0-ls213 Build-date:- 2023-03-03T14:34:20+01:00
  • gitea/gitea [docker] - "org.opencontainers.image.created": "2023-04-13T02:25:20Z", "org.opencontainers.image.revision": "447fa6715ca56ea0a7d2d411d82bced4a6ffec31",
  • act_runner [docker (ghcr.io/linuxserver/baseimage-ubuntu:jammy)] - version v0.1.2-10-g293926f
lunny added the kind/bug label 2023-04-30 03:06:44 +00:00
Owner

Did you use https://gitea.com/actions/checkout@v3 or https://github.com/actions/checkout@v3?

Author

From the Setup job log:

  🐳  docker network connect GITEA-ACTIONS-TASK-30_WORKFLOW-Release_JOB-Build-network GITEA-ACTIONS-TASK-30_WORKFLOW-Release_JOB-Build
Writing entry to tarball workflow/event.json len:4343
Writing entry to tarball workflow/envs.txt len:0
Extracting content to '/var/run/act/'
  ☁  git clone 'https://gitea.com/actions/checkout' # ref=v3
  cloning https://gitea.com/actions/checkout to /root/.cache/act/actions-checkout@v3
Cloned https://gitea.com/actions/checkout to /root/.cache/act/actions-checkout@v3
Checked out v3

-> gitea version

Author

Here is a pretty hacky workaround for anyone with the same issue:

      - name: Checkout
        uses: actions/checkout@v3
        with:
#          lfs: 'true'

      - name: Checkout LFS
        run: |
          function EscapeForwardSlash() { echo "$1" | sed 's/\//\\\//g'; }
          readonly ReplaceStr="EscapeForwardSlash ${{ gitea.repository }}.git/info/lfs/objects/batch"; sed -i "s/\(\[http\)\( \".*\)\"\]/\1\2`$ReplaceStr`\"]/" .git/config

          git config --local lfs.transfer.maxretries 1

          /usr/bin/git lfs fetch    origin refs/remotes/origin/${{ gitea.ref_name }}
          /usr/bin/git lfs checkout

This replaces (in .git/config)

[http "https://my-domain.com/"]
        extraheader = AUTHORIZATION: basic <AUTH TOKEN>

with

[http "https://my-domain.com/<USER NAME>/<PROJECT NAME>.git/info/lfs/objects/batch"]
        extraheader = AUTHORIZATION: basic <AUTH TOKEN>

-> This way the auth token is still used for the 'which LFS files are there' stage (the batch request), but not for the actual download of those files.
(The extraheader = AUTHORIZATION: basic <AUTH TOKEN> entry itself is written by checkout.)

Setting the following in .git/config also works:

[http "https://my-domain.com/"]
        extraheader = AUTHORIZATION: basic <AUTH TOKEN>
[http "https://my-domain.com/<USERNAME>/<PROJECT>.git/info/lfs/objects/batch"]
        extraheader = AUTHORIZATION: basic <AUTH TOKEN>
[http "https://my-domain.com/<USERNAME>/<PROJECT>.git/info/lfs/objects/"]
        extraheader = ""

This option is probably nicer (seeing as only the LFS files themselves don't get the extra header), but I am having issues getting the actual workflow to generate the config file in this way.
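
The effect of those three sections can be inspected locally. The sketch below uses git's own URL matcher (`git config --get-urlmatch`) as a stand-in; git-lfs resolves `http.<url>.*` keys itself, and the results in this thread suggest it also prefers the most specific match. The domain, USERNAME/PROJECT path, FILE_ID and TOKEN are placeholders, and the scratch repository exists only for the demonstration.

```shell
# Placeholder values; in a real job the token is written by actions/checkout.
base="https://my-domain.com"
lfs="$base/USERNAME/PROJECT.git/info/lfs/objects"

# Scratch repository so no real config is touched.
repo=$(mktemp -d)
git init -q "$repo"

git -C "$repo" config --local "http.$base/.extraheader" "AUTHORIZATION: basic TOKEN"
git -C "$repo" config --local "http.$lfs/batch.extraheader" "AUTHORIZATION: basic TOKEN"
git -C "$repo" config --local "http.$lfs/.extraheader" ""

# Ask git which value applies to each kind of request.
echo "batch API:       $(git -C "$repo" config --get-urlmatch http.extraheader "$lfs/batch")"
echo "object download: $(git -C "$repo" config --get-urlmatch http.extraheader "$lfs/FILE_ID")"
```

With this config the batch URL should resolve to the Authorization header while the object URL resolves to the empty value, which matches the intent described above.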

Author

> This option is probably nicer (seeing as only the LFS files themselves don't get the extra header), but I am having issues getting the actual workflow to generate the config file in this way.

Figured it out:

      - name: Checkout LFS
        run: |
          UrlBase=$GITHUB_SERVER_URL; \
          UrlLfsBase=$UrlBase/${{ gitea.repository }}.git/info/lfs/objects; \
          Auth=`/usr/bin/git config --get --local http.$UrlBase/.extraheader`; \
          /usr/bin/git config --local http.${UrlLfsBase}/batch.extraheader "$Auth"; \
          /usr/bin/git config --local http.${UrlLfsBase}/.extraheader ''

          git config --local lfs.transfer.maxretries 1

          /usr/bin/git lfs fetch    origin refs/remotes/origin/${{ gitea.ref_name }}
          /usr/bin/git lfs checkout

Note that the weird UrlBase=$GITHUB_SERVER_URL assignment is there because GITHUB_SERVER_URL does not seem to behave like a normal variable.

For example, this:

      - run: WAT=$GITHUB_SERVER_URL; echo "$A -- $B -- $C -- $D -- $WAT"
        env:
          A: $GITHUB_SERVER_URL
          B: ${GITHUB_SERVER_URL}
          C: ${{ vars.GITHUB_SERVER_URL }}
          D: $$GITHUB_SERVER_URL

Outputs $GITHUB_SERVER_URL -- ${GITHUB_SERVER_URL} -- -- $$GITHUB_SERVER_URL -- https://my-domain.com
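
A likely explanation is that values under `env:` are passed to the step as literal strings: only `${{ }}` expressions are evaluated by the runner, and shell-style `$NAME` text is not expanded again when the variable is later referenced. The plain-shell sketch below mimics that, assuming the runner exported the values exactly as written and that `${{ vars.GITHUB_SERVER_URL }}` evaluated to an empty string (the `vars` context holds user-defined variables, not the runner's default environment).

```shell
GITHUB_SERVER_URL="https://my-domain.com"   # set by the runner in a real job

# What the step receives for each env: entry (literal text, unexpanded).
A='$GITHUB_SERVER_URL'
B='${GITHUB_SERVER_URL}'
C=''                       # ${{ vars.GITHUB_SERVER_URL }} assumed to be empty
D='$$GITHUB_SERVER_URL'
WAT=$GITHUB_SERVER_URL     # expanded by the shell inside run:

line="$A -- $B -- $C -- $D -- $WAT"
echo "$line"
```

Only WAT holds the URL, because it is the only assignment the shell itself performs. Reading the variable inside `run:`, as the workaround does, is therefore the reliable route.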

Member

I reproduced this bug without a proxy. ~~Maybe it's a bug in actions/checkout; I'll do more tests to check.~~

The cause of this bug is complex.

  • actions/checkout always sets the http.extraheader configuration, like:

    [http "https://my-domain.com/"]
            extraheader = AUTHORIZATION: basic <AUTH TOKEN>

    Related code:
      • https://github.com/actions/checkout/blob/f095bcc56b7c2baf48f3ac70d6d6782f4f553222/src/git-auth-helper.ts#L275-L301
      • https://github.com/actions/checkout/blob/f095bcc56b7c2baf48f3ac70d6d6782f4f553222/src/git-auth-helper.ts#L54-L63

  • When git lfs fetch is called, the info/lfs/objects/batch API is called first, before the object is downloaded. The API response contains information about how to download the object, including the link and headers. Those headers contain Authorization.

    Code:
      • https://github.com/go-gitea/gitea/blob/f17a4358f45a4b870f3202379dcb05358114e662/services/lfs/server.go#L434-L463

  • Now there are two Authorization headers: one from the http.extraheader configuration and another from the info/lfs/objects/batch API. git-lfs adds both Authorization headers when creating the download request.
      • Adding the headers from the info/lfs/objects/batch API: https://github.com/git-lfs/git-lfs/blob/8e96b5d5d84095d1c0dd0c550d8fdf2c8c5c6456/tq/adapterbase.go#L202-L225
      • Adding the headers from http.extraheader (note that append is used here to add headers): https://github.com/git-lfs/git-lfs/blob/8e96b5d5d84095d1c0dd0c550d8fdf2c8c5c6456/lfshttp/client.go#L232-L249

I'm not sure how to fix the issue properly. Maybe we should improve Gitea's logic so that it doesn't return the Authorization header in this case?
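
The duplication described above can be pictured with a toy model. This is only an illustration of appending two header sources, not git-lfs code; the variable names are hypothetical and the token is a placeholder.

```shell
# The two header sources named in the analysis.
from_batch='Authorization: basic TOKEN'        # returned by info/lfs/objects/batch
from_extraheader='AUTHORIZATION: basic TOKEN'  # from http.extraheader in .git/config

# Appending rather than replacing leaves both lines in the request.
request_headers=$(printf '%s\n%s\n' "$from_batch" "$from_extraheader")

# Header names are case-insensitive, so both lines count as Authorization.
auth_count=$(printf '%s\n' "$request_headers" | grep -ci '^authorization:')
echo "Authorization headers in request: $auth_count"
```

A request built this way carries two Authorization lines, which is the condition nginx rejected with 400 Bad Request earlier in the thread.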

I've run into the same problem. What's worse: @stavae's workaround doesn't seem to work when setting ssh-key in the checkout step, which in turn is required when using git submodules from private repositories. So I can only use either submodules or LFS, not both.

I'd like to vote for this issue to be taken a look at 🙂.


This issue also occurs with https://gitea.com/actions/checkout@v4 and https://github.com/actions/checkout@v4.

I've included a log file of my runner below.


I have the same problem and have not been able to find a solution for my submodules.
Unfortunately, the solution described by @stavae only worked with the main repo.

I created a ticket for this problem in the right place: https://github.com/actions/checkout/issues/1830

Reference: gitea/act_runner#164