Federated collaborators and organizations #11

Open
opened 2022-07-22 16:32:11 +00:00 by xy · 1 comment
Owner

Both of these are a huge headache, which is why I've intentionally omitted them in my current PR to upstream Gitea (https://github.com/go-gitea/gitea/pull/203910).

The goal of federated collaborators is to add a remote user as a collaborator to a repository. This is mainly difficult because the collaborator will only be able to modify the repository on their remote instance (unless we do some SSO wizardry), so changes must be synchronized between the copies on the two instances.

Why do we need federated collaborators?

One "solution" to this problem would be to simply not implement federated collaborators. Then the collaborators for a repo can only be users on the same instance as the repo. This is very inconvenient because if you are a maintainer of two projects on different instances, you will have to create two accounts to be a collaborator on both projects, which is contrary to the federation goal of using one account across all instances. For a concrete example of this, I would like to eventually collaborate on the ForgeFed Codeberg repo using my account on git.exozy.me instead of my Codeberg account.

Challenges

The only way to sanely implement federated collaborators is to designate one copy as the main copy and single source of truth, and all other copies as secondary copies. Naturally, the copy on the instance that "owns" the repo should be the main copy. For instance, the repo alice@alice.com/hello-world could be replicated on many instances, but the copy on alice.com is the main copy. If Bobert on bobert.com wants to create a new issue on Alice's repo, the issue (as a ForgeFed Ticket) would first be sent to alice.com, and then alice.com sends out the Ticket to all the copies.
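The routing rule above can be sketched as a small in-memory model. This is only an illustration under assumptions: `Instance`, `main_host`, and `submit_ticket` are hypothetical names, and real delivery would go through ActivityPub inboxes rather than direct method calls.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A hypothetical forge instance holding copies of repositories."""
    host: str
    # Tickets stored per repo, keyed by a repo ID such as "alice@alice.com/hello-world".
    tickets: dict[str, list[str]] = field(default_factory=dict)

    def store(self, repo: str, ticket: str) -> None:
        self.tickets.setdefault(repo, []).append(ticket)


def main_host(repo: str) -> str:
    """Return the host that owns the repo, which holds the main copy."""
    owner = repo.split("/", 1)[0]   # e.g. "alice@alice.com"
    return owner.split("@", 1)[1]   # e.g. "alice.com"


def submit_ticket(network: dict[str, Instance], copies: dict[str, set[str]],
                  repo: str, ticket: str) -> list[str]:
    """Deliver a ticket to the main copy first, then fan it out to the other copies.

    Returns the hosts in the order they received the ticket.
    """
    main = main_host(repo)
    network[main].store(repo, ticket)
    order = [main]
    for host in sorted(copies[repo] - {main}):
        network[host].store(repo, ticket)
        order.append(host)
    return order
```

Whichever instance the ticket originates from, the main copy records it before any secondary copy does, so the secondary copies never have to reconcile competing versions.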

We really don't want sync conflicts or overwritten changes, so for synchronizing code there are several methods we could use:

  1. When pushing to a copy, the copy is not modified and the push is redirected to the main copy using Git. The main copy force pushes to all other copies.
  2. When pushing to a copy, the copy is not modified and the push is redirected to the main copy using ForgeFed (https://forgefed.org/behavior.html#reporting-pushed-commits). The main copy force pushes to all other copies.
  3. The copies display the main copy's clone URL, so all Git operations only go to the main copy. This requires some work to be able to authenticate a push by a user from another instance. The main copy force pushes to all other copies.
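All three options end with the same step: the main copy force pushes to every other copy. A minimal sketch of that step follows, assuming the main copy is a bare repository on disk and each secondary copy is reachable at a push URL. The function names and example paths are hypothetical, and `git push --mirror` is one possible way to do the force push (it force-updates changed refs and removes deleted ones), not something the proposal prescribes.

```python
import subprocess


def mirror_push_commands(repo_path: str, copy_urls: list[str]) -> list[list[str]]:
    """Build one `git push --mirror` command per secondary copy."""
    return [["git", "-C", repo_path, "push", "--mirror", url] for url in copy_urls]


def sync_copies(repo_path: str, copy_urls: list[str], run=subprocess.run) -> list[str]:
    """Push the main copy to every secondary copy and return the URLs that failed.

    `run` is injectable so the logic can be exercised without a real Git remote.
    """
    failed = []
    for cmd in mirror_push_commands(repo_path, copy_urls):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(cmd[-1])  # the URL is the last argument
    return failed
```

Because the push only ever flows from the main copy outward, a failed secondary copy can simply be retried later without risk of losing changes.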

Another idea could be to use the F3 format for doing the one-way synchronization.

Federated organizations

Federated organizations are similar to federated users, except with more headaches. Like with federated collaborators, there should be one main copy which is the single source of truth. For instance, if Alice creates the organization hello-world-org on alice.com and adds Bobert on bobert.com as a member, both alice.com and bobert.com will have a copy of the org, but alice.com's copy is the main copy.

In Gitea, organizations can have multiple teams with different permissions, so one way to model this with ActivityStreams is to represent the entire org as an Organization actor, with a Group actor for each team. We can then reuse some of the existing AS properties to represent the members of the Group (for instance with items) and its permissions.
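As a rough illustration of that model, the sketch below builds the two kinds of actor as plain JSON-style objects. The IDs and the `teams/` URL layout are made up, `attributedTo` is used here only as one possible way to link a team to its org, and the `permission` key is a hypothetical placeholder, since the proposal does not say which property would carry permissions.

```python
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def org_actor(org_id: str, name: str) -> dict:
    """Represent the whole organization as an Organization actor."""
    return {"@context": AS_CONTEXT, "type": "Organization", "id": org_id, "name": name}


def team_actor(org_id: str, team: str, members: list[str], permission: str) -> dict:
    """Represent one team as a Group actor whose members are listed in `items`."""
    return {
        "@context": AS_CONTEXT,
        "type": "Group",
        "id": f"{org_id}/teams/{team}",  # hypothetical URL layout
        "name": team,
        "attributedTo": org_id,          # assumption: links the team to its org
        "items": sorted(members),        # members, as suggested in the text
        "permission": permission,        # hypothetical key, not an AS property
    }


org = org_actor("https://alice.com/hello-world-org", "hello-world-org")
owners = team_actor(org["id"], "owners",
                    ["https://bobert.com/bobert", "https://alice.com/alice"], "admin")
```

Members from different instances sit side by side in `items`, which is the point of making the org federated.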

If Bobert modifies an organization, the modification activity should first be sent to alice.com, and then alice.com sends out the activity to all copies of the org.

Now if Bobert wants to create a repository owned by hello-world-org, he can only create repositories on bobert.com. bobert.com should send a Create activity to alice.com to create the repository on the main copy of the org, and then send over the contents of the repo using the federated collaborators mechanisms above.
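The Create activity that bobert.com sends might look roughly like the sketch below. The actor and org URLs are made up, and the exact shape of the object (addressing via `to`, ownership via `attributedTo`, the ForgeFed context URL) is an assumption for illustration rather than something ForgeFed is confirmed to specify for this case.

```python
from urllib.parse import urlparse

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"
FORGEFED_CONTEXT = "https://forgefed.org/ns"  # assumed ForgeFed vocabulary URL


def create_repo_activity(actor: str, org: str, repo_name: str) -> dict:
    """Build a Create activity asking the org's main copy to create a repository."""
    return {
        "@context": [AS_CONTEXT, FORGEFED_CONTEXT],
        "type": "Create",
        "actor": actor,
        "to": [org],
        "object": {"type": "Repository", "name": repo_name, "attributedTo": org},
    }


def delivery_host(activity: dict) -> str:
    """Return the host of the first recipient, i.e. the instance with the main copy."""
    return urlparse(activity["to"][0]).netloc


activity = create_repo_activity(
    "https://bobert.com/bobert", "https://alice.com/hello-world-org", "new-repo"
)
```

Here `delivery_host(activity)` is `alice.com`: the activity goes to the main copy of the org even though Bobert acted on bobert.com.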

For the grand finale, let's say there's a third user, Charles on charles.com, with a repo called example-repo. Bobert (on bobert.com) would like to fork this repository to the hello-world-org organization on alice.com. This sounds like the ultimate headache, but we now have everything we need to deal with nightmarish situations like this. First, Bobert goes to charles.com/Charles/example-repo and clicks fork, which redirects him to bobert.com (see #7 for more info about federated forking). Now Bobert chooses hello-world-org@alice.com as the recipient of the fork. This causes a Create activity to be sent to alice.com, where the fork is created, and the fork is then synced back to the copy of hello-world-org on bobert.com. And that's it! Whew!

The lesson here is that we need to design federated features to be composable, so that they can serve as building blocks for more complicated behaviors. And of course, all the ideas above are just a proposal, so there might be a better way to do federated collaborators and organizations. Feedback welcome!

Author
Owner

This also depends on https://codeberg.org/ForgeFed/ForgeFed/pulls/156 to notify an instance that a user on that instance was added as a collaborator to a remote repository.
