admin: installation: remote data folder (#193)

Documentation proposal for remote data mounting, following up from https://codeberg.org/forgejo/forgejo/issues/1590

Co-authored-by: Daniel Bischof <daniel.bischof@protonmail.com>
Reviewed-on: https://codeberg.org/forgejo/docs/pulls/193
Reviewed-by: Earl Warren <earl-warren@noreply.codeberg.org>
Co-authored-by: dbischof90 <dbischof90@noreply.codeberg.org>
Co-committed-by: dbischof90 <dbischof90@noreply.codeberg.org>
2024-11-24 18:09:26 -05:00 · 2023-10-14 17:42:40 +00:00 · 2023-10-14 17:42:40 +00:00 · cabb9fca11
commit cabb9fca11
parent 26ce6aa32b
1 changed files with 124 additions and 21 deletions
--- a/docs/admin/installation.md
+++ b/docs/admin/installation.md
@@ -146,6 +146,109 @@ services:
 +      - ./postgres:/var/lib/postgresql/data
 ```

+### Hosting repository data on remote storage systems
+
+You can also place the data and repository folders on a remote drive such as a
+network-attached storage system. Of the many possible solutions,
+we focus on a fairly minimal NFS setup here and explain which
+measures are needed in general, so that administrators can adapt them to
+their own environment.
+
+We first describe one possible setup and highlight the aspects that
+an administrator has to consider if the hosting environment differs.
+The Forgejo image assumes that it owns the folders it reads from
+and writes to. This is an issue for remote storage, since file-system permissions are a machine-local
+concept and do not translate easily over the network.
+
+We assume that a server with the hostname `server` is reachable and shares the folder `/repositories`
+via NFS. Append an entry like the following to `/etc/exports` on that server:
+
+```shell
+[...]
+/repositories	*(rw,sync,all_squash,sec=sys,anonuid=1024,anongid=100)
+```
+
+Four aspects to consider:
+
+- The folder is exported as `rw`, meaning clients can both read and write in the folder.
+- The folder is exported as `sync`. This is NFS-specific and means that the server only replies to a request once the changes have been written to stable storage. This is
+  not essential but increases robustness against file corruption.
+- The `all_squash` setting maps all file accesses to an anonymous user, meaning that the files of a user with the UID `1050`
+  and of a user with the UID `1051` are both mapped to a single UID on the server.
+- We set the anonymous UID and GID to explicit values on the server with `anonuid=1024,anongid=100`. Hence all files will be owned by
+  a user with the UID `1024`, belonging to the group with the GID `100`. Make sure that this UID is available and that a group with this GID exists.
+
+Effectively, we are now able to create and write files and folders on the remote share. With the `all_squash` setting, we map
+all users to one user, hence all data writable by one user is writable by all users, implying that all files effectively have `rwxrwxrwx`
+permissions (abbreviated as `0777`). We can also "fake-own" data, since all `chown` calls are now mapped to the anonymous user. This
+behaviour is important for the rest of the setup.
+We now mount this folder on the `client`, which will host Forgejo, at `/mnt/repositories`...
+
+```shell
+# mount -o hard,timeo=10,retry=10,vers=4.1 server:/repositories /mnt/repositories/
+```
+
+... and create two folders inside the mounted share:
+
+```shell
+$ mkdir /mnt/repositories/conf
+$ mkdir /mnt/repositories/data
+```
+
+Note the `hard` option in the client setup, which blocks all file operations while the share is unavailable.
+This NFS-specific setting prevents state changes that could otherwise corrupt the repository data.
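As a complement to the `hard` option, it can help to confirm that the share is actually mounted before Forgejo starts; if it is not, Docker would bind-mount the plain local directory and the data would end up on the client's own disk. The following is only a minimal sketch and not part of the Forgejo tooling: `require_mount` is a hypothetical helper name, and the `mountpoint` utility from util-linux is assumed to be installed on the client.

```shell
#!/bin/sh
# Succeed only if the given directory is currently a mount point.
# Assumption: `mountpoint` (util-linux) is available on this machine.
require_mount() {
  if mountpoint -q "$1"; then
    return 0
  fi
  echo "error: $1 is not a mount point" >&2
  return 1
}
```

On the client, one might then run `require_mount /mnt/repositories && docker compose up -d`, so that the container is only started when the share is in place.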
+
+We will use the `rootless` image, which embeds the `ssh` server for Forgejo. A possible entry for a `docker-compose` file
+would look like this (shown as a diff-like view against [our initial example](#installation-with-docker)):
+
+```yaml
+version: "3"
+
+networks:
+  forgejo:
+    external: false
+
+services:
+  server:
+-    image: codeberg.org/forgejo/forgejo:1.20
+    image: codeberg.org/forgejo/forgejo:1.20-rootless
+    container_name: forgejo
+    environment:
+-      - USER_UID=1000
+-      - USER_GID=1000
+      - USER_UID=1024
+      - USER_GID=100
+
+    restart: always
+    networks:
+      - forgejo
+    volumes:
+-      - ./forgejo:/var/lib/gitea
+      - /mnt/repositories/data:/var/lib/gitea
+      - /mnt/repositories/conf:/etc/gitea
+      - /etc/timezone:/etc/timezone:ro
+      - /etc/localtime:/etc/localtime:ro
+    ports:
+      - "3000:3000"
+      - "222:22"
+```
+
+This writes the configuration into the `conf` folder we created and all other data into the `data` folder.
+Make sure that `USER_UID` and `USER_GID` match the `anonuid` and `anongid` settings
+of the NFS export, so that the Forgejo user sees files and folders with its own UID and GID
+in the respective folders and thus identifies itself as the sole owner of the folder structure.
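One way to check this from the client is to compare the numeric owner of the mounted folders with the IDs configured in the export. The sketch below assumes GNU `stat` (the `-c` flag differs on BSD and macOS); `check_owner` is a hypothetical helper name, not something shipped with Forgejo.

```shell
#!/bin/sh
# Compare the numeric owner and group of a path with the expected IDs.
# Assumption: GNU (or busybox) `stat`, which supports `-c '%u:%g'`.
check_owner() {
  path=$1
  want_uid=$2
  want_gid=$3
  actual=$(stat -c '%u:%g' "$path") || return 2
  if [ "$actual" = "$want_uid:$want_gid" ]; then
    return 0
  fi
  echo "mismatch: $path is owned by $actual, expected $want_uid:$want_gid" >&2
  return 1
}
```

With the values used in this guide, `check_owner /mnt/repositories/data 1024 100` should succeed once the share is mounted; a mismatch suggests that the export options and the container environment have drifted apart.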
+
+Using the `rootless` image here solves another problem resulting from the file-system ownership issue.
+If we create SSH keys on the `client` and save them on the `server`, they too will have `0777` permissions, which `openssh` prohibits.
+All involved tools require that these files are not writable by just anybody with a login, so you would get an error if you tried to use them.
+Changing the permissions will not succeed either because of the chosen `all_squash` setup, which was necessary to get a working ownership
+mechanism on the server. To resolve this, we use the `rootless` image, which embeds the `ssh` server, circumventing the problem entirely.
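To see whether a key file on the share would be rejected for this reason, its permission bits can be inspected directly. This is a rough sketch assuming GNU `stat`; `key_mode_ok` is a hypothetical helper that only approximates the check OpenSSH performs, namely that a private key grants no access to group or others.

```shell
#!/bin/sh
# Succeed only if the file grants no permissions to group or others.
# Assumption: GNU (or busybox) `stat`, which prints octal modes via `-c '%a'`.
key_mode_ok() {
  mode=$(stat -c '%a' "$1") || return 2
  case "$mode" in
    *00|0)
      # e.g. 600 or 400: only the owner has access
      return 0
      ;;
    *)
      echo "warning: $1 has mode $mode, which is too open for a private key" >&2
      return 1
      ;;
  esac
}
```

Running it against a key stored on the squashed share, for example `key_mode_ok /mnt/repositories/data/some_key` (a placeholder path), would be expected to fail if the file really shows up with `0777` permissions.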
+
+Note that this is a comparatively simple setup which does not necessarily reflect the reality of your network.
+User mapping and ownership could in principle be handled more cleanly with Kerberos, which is, however, out of scope
+for this guide.
+
 ## Installation from binary

 ### Install Forgejo and git, create git user