From 5958b3a61889ce98e8976c04eb91b4600a75f6d8 Mon Sep 17 00:00:00 2001
From: Gergely Nagy
Date: Thu, 28 Dec 2023 11:46:34 +0100
Subject: [PATCH] admin: database: Suggest a better collation for MySQL

In `admin/database-preparation` suggest `utf8mb4_bin` as the collate
function, rather than `utf8mb4_unicode_ci`. The former is accent- and
case sensitive, while the latter isn't, and Forgejo assumes that columns
are case sensitive.

Also add a short paragraph explaining why `utf8mb4_bin` is suggested
(case sensitivity), and what problems may arise and why if case
insensitive collation is used.

This partially addresses forgejo/forgejo#2039.

Signed-off-by: Gergely Nagy
---
 docs/admin/database-preparation.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/admin/database-preparation.md b/docs/admin/database-preparation.md
index cb5e359e..88efd840 100644
--- a/docs/admin/database-preparation.md
+++ b/docs/admin/database-preparation.md
@@ -46,14 +46,16 @@ Note: All steps below requires that the database engine of your choice is instal

    Replace username and password above as appropriate.

-4. Create database with UTF-8 charset and collation. Make sure to use `utf8mb4` charset instead of `utf8` as the former supports all Unicode characters (including emojis) beyond _Basic Multilingual Plane_. Also, collation chosen depending on your expected content. When in doubt, use either `unicode_ci` or `general_ci`.
+4. Create database with UTF-8 charset and collation. Make sure to use `utf8mb4` charset instead of `utf8` as the former supports all Unicode characters (including emojis) beyond _Basic Multilingual Plane_. Also, collation chosen depending on your expected content, but make sure that the collation is accent- and case sensitive. When in doubt, use `utf8mb4_bin`.

    ```sql
-   CREATE DATABASE forgejodb CHARACTER SET 'utf8mb4' COLLATE 'utf8mb4_unicode_ci';
+   CREATE DATABASE forgejodb CHARACTER SET 'utf8mb4' COLLATE 'utf8mb4_bin';
    ```

    Replace database name as appropriate.

+   Using an accent- and case sensitive collation such as `utf8mb4_bin` is important, because Forgejo often relies on these sensitivities, and if those assumptions are broken, that may lead to internal server errors or other unexpected results.
+
 5. Grant all privileges on the database to database user created above.

    For local database: