When matching emoji, use a regex built from the data we have instead of something generic using unicode ranges. A generic regex can't tell the difference between two separate emoji next to each other or one emoji that is built out of two separate emoji next to each other.
This means that emoji that are next to each other without space in between will be now accurately spanned individually with proper title etc...
GH has different HardBreaks behaviour for markdown comments and documents.
Comments have hard breaks and documents have soft breaks - therefore Gitea's rendering will always be different from GH's if we only provide one setting.
Here we split the setting in to two - one for documents and one for comments and other things.
Signed-off-by: Andrew Thornton art27@cantab.net
Changes to index.js as per @silverwind
Co-authored-by: silverwind <me@silverwind.io>
Changes to docs as per @guillep2k
Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
* Add test
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Restore checkbox rendering and prevent poor sanitization of spans
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Also fix preview context
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Also fix preview context
Signed-off-by: Andrew Thornton <art27@cantab.net>
Now that emojify.js has been removed, get rid of all instances of has-emoji class that was only used for that. Support for rendering shortcodes should remain in all of these places so it should still work the same.
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lauris BH <lauris@nix.lv>
* Support unicode emojis and remove emojify.js
This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.
This works in a few ways:
First it adds emoji parsing support into gitea itself. This allows us to
* Render emojis from valid alias (😄)
* Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
* Easily allow for custom "emoji"
* Support all emoji rendering and features without javascript
* Uses plain unicode and lets the system render in appropriate emoji font
* Doesn't leave us relying on external sources for updates/fixes/features
That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)
For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.
The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.
I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.
I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.
Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.
Fixes: https://github.com/go-gitea/gitea/issues/9182
Fixes: https://github.com/go-gitea/gitea/issues/8974
Fixes: https://github.com/go-gitea/gitea/issues/8953
Fixes: https://github.com/go-gitea/gitea/issues/6628
Fixes: https://github.com/go-gitea/gitea/issues/5130
* add new shared function emojiHTML
* don't increase emoji size in issue title
* Update templates/repo/issue/view_content/add_reaction.tmpl
Co-Authored-By: 6543 <6543@obermui.de>
* Support for emoji rendering in various templates
* Render code and review comments as they should be
* Better way to handle mail subjects
* insert unicode from tribute selection
* Add template helper for plain text when needed
* Use existing replace function I forgot about
* Don't include emoji greater than Unicode Version 12
Only include emoji and aliases in JSON
* Update build/generate-emoji.go
* Tweak regex slightly to really match everything including random invisible characters. Run tests for every emoji we have
* final updates
* code review
* code review
* hard code gitea custom emoji to match previous behavior
* Update .eslintrc
Co-Authored-By: silverwind <me@silverwind.io>
* disable preempt
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
* Prevent panic during wrappedConn close at hammertime
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Update modules/graceful/server.go
* Fix extraneous debug in goldmark.go
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Fix checkbox rendering
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Normalize checkbox rendering
Signed-off-by: Andrew Thornton <art27@cantab.net>
* set the checkboxes to readonly instead of disabled
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
* Add control for the rendering of the frontmatter
* Add control to include a TOC
* Add control to set language - allows control of ToC header and CJK glyph choice.
Signed-off-by: Andrew Thornton art27@cantab.net
Annoyingly goldmarks SetAttributeString requires that
the value of the attribute is still a []byte but does
not make it clear in the documentation.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Fix task-list checkbox styling
The pandoc renderer will append the class "task-list" to the ul element
wrapping a li with one or more check-boxes. This allows us to select for
them, removing their list-style-type property. However, goldmark and the
gfm spec doesn't specify the "task-list" class name, so we can't use
that to fix the issue there.
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
* Update to goldmark v1.1.25
This version adds the missing space after a checkbox.
Resolves: #9656
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
Co-authored-by: zeripath <art27@cantab.net>
* Don't manually replace whitespace during render
For historical reasons Gitea manually alters the urlPrefix and replaces
a whitespace with a +. This Works for URLs, but we're also passing
urlPrefix to git calls and adding the + is breaking the tree path.
Goldmark will automatically convert a white space to the proper %20, so
we should leave the string as is which lets us pass it to git unmodified
and then let Goldmark fix it.
Also fixed separate bug in URLJoin I noticed while testing where it will
silently discard sections of a path that have # in them (possibly
others). We should just escape it first.
Fixes 10156
* Escape elems as well
* Revert "Escape elems as well"
This reverts commit 8bf49596fe.
* restart ci
* remove changes to URLJoin
* restart ci
Co-authored-by: techknowlogick <matti@mdranta.net>
* Move to goldmark
Markdown rendering moved from blackfriday to the goldmark.
Multiple subtle changes required to the goldmark extensions to keep
current rendering and defaults.
Can go further with goldmark linkify and have this work within markdown
rendering making the link processor unnecessary.
Need to think about how to go about allowing extensions - at present it
seems that these would be hard to do without recompilation.
* linter fixes
Co-authored-by: Lauris BH <lauris@nix.lv>
* Prefix all user-generated IDs in markup
* Add user-content- to IDs in unit-tests
* fixup markdown_test.go
* update the hrefs for the wiki test
* Add blackfriday extension regex
Signed-off-by: jolheiser <john.olheiser@gmail.com>
* Support custom sanitization policy
Allowing the gitea administrator to configure sanitization policy allows
them to couple external renders and custom templates to support more
markup. In particular, the `pandoc` renderer allows generating KaTeX
annotations, wrapping them in `<span>` elements with class `math` and
either `inline` or `display` (depending on whether or not inline or
block mode was requested).
This iteration gives the administrator whitelisting powers; carefully
crafted regexes will thus let through only the desired attributes
necessary to support their custom markup.
Resolves: #9054
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
* Document new sanitization configuration
- Adds basic documentation to app.ini.sample,
- Adds an example to the Configuration Cheat Sheet, and
- Adds extended information to External Renderers section.
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
* Drop extraneous length check in newMarkupSanitizer(...)
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
* Fix plural ELEMENT and ALLOW_ATTR in docs
These were left over from their initial names. Make them singular to
conform with the current expectations.
Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
* Add support for local vs. remote xrefs
* Add doc for references
* Docs: fix cases not currently supported
* One more doc fix
* Doc: mentions for teams and orgs
* Change !num ref concept, no change in functionality
* Fix test
* Improve table of issue reference types
* Fix paragraph mark
* Convert EOL to UNIX-style to render MD properly
* Update modules/markup/markdown/markdown.go
Co-Authored-By: zeripath <art27@cantab.net>
* Fix lint optimization
* Check for empty content before conversion
* Update modules/util/util.go
Co-Authored-By: zeripath <art27@cantab.net>
* Improved checks and tests
* Add paragraph render test
* Improve speed even more, improve tests
* Small improvement by @gary-kim
* Fix test for DOS
* More improvements
* Restart CI
* Make link last commit massages in repository home page and commit tables
* Use RenderCommitMessageLink instead surround with a
* deleted __debug_bin file
* Exclude email to link from latest commit title
* Exclude email processor from commit table
Co-Authored-By: mrsdizzie <info@mrsdizzie.com>
* Add class parameter to a html element creator functions.
Make links underline dashed that are not commit
* fix tests
* Show dashed underline when also not hovered
* feat: highlight issue references with :
e.g. #1287: my commit msg
e.g. ABC-1234: my commit msg
* ref: update model regex to consistent with issueNumericPattern
* test: check highlight issue with : in commits messages
* detect csv delimiter in csv rendering
fixes #7868
* make linter happy
* fix failing testcase & use ints where possible
* expose markup type to template
previously all markup had the .markdown class, which is incorrect,
as it applies markdown CSS & JS logic to CSV rendering
* fix build (missing `make css`)
* ignore quoted csv content for delimiter scoring
also fix html generation
* Check commit message hashes before making links
Previously, when formatting commit messages, anything
that looked like SHA1 hashes was turned into a link
using regex. This meant that certain phrases or numbers
such as `777777` or `deadbeef` could be recognized as a commit
even if the repository has no commit with those hashes.
This change will make it so that anything that looks
like a SHA1 hash using regex will then also be checked
to ensure that there is a commit in the repository
with that hash before making a link.
Signed-off-by: Gary Kim <gary@garykim.dev>
* Use gogit to check if commit exists
This commit modifies the commit hash check
in the render for commit messages to use
gogit for better performance.
Signed-off-by: Gary Kim <gary@garykim.dev>
* Make code cleaner
Signed-off-by: Gary Kim <gary@garykim.dev>
* Use rev-parse to check if commit exists
Signed-off-by: Gary Kim <gary@garykim.dev>
* Add and modify tests for checking hashes in html link rendering
Signed-off-by: Gary Kim <gary@garykim.dev>
* Return error in sha1CurrentPatternProcessor
Co-Authored-By: mrsdizzie <info@mrsdizzie.com>
* Import Gitea log module
Signed-off-by: Gary Kim <gary@garykim.dev>
* Revert "Return error in sha1CurrentPatternProcessor"
This reverts commit 28f561cac4.
Signed-off-by: Gary Kim <gary@garykim.dev>
* Add debug logging to sha1CurrentPatternProcessor
This will log errors by the git command run in
sha1CurrentPatternProcessor if the error is one
that was unexpected.
Signed-off-by: Gary Kim <gary@garykim.dev>
Since #6273 was merged, we now have access to proper context metas
always. Update SHA generated links to use these instead of urlPrefix.
Update tests as well.
Fixes #4536.
* Improve issue autolinks
Update autolinks to match what github does here:
Issue in same repo: #1
Issue in different repo: org/repo#1
Fixes #6264
* Use setting.AppURL when parsing URL
Using setting.AppURL here is a more reliable way of parsing the current
URL and what other functions in this file seem to use.
* Make ComposeMetas always return a valid context
* Add per repository markdown renderers for better context
* Update for use of context metas
Now that we include the user and repo name inside context metas, update
various code and tests for this new logic