- Fix emoji not being replaced in issue title change text
- Make the image attributes consistent, add alt, remove align
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Fix emoji detection certain cases
Previous tests weren't complicated enough so there were some situations where emojis were't detected properly. Find the earliest occurance in addition to checking for the longest combination.
Fixes #12312
* ok spell bot
Co-authored-by: Lauris BH <lauris@nix.lv>
* Disable all typographic replacements in markdown renderer
Previously we only disabled some of them. This disables all the default
replacements that goldmark's typographer extension offers, matching
GitHub's renderer.
Ref: https://github.com/yuin/goldmark#typographer-extension
Fixes: https://github.com/go-gitea/gitea/issues/11001
* remove typographer extension completely
* fix test
* really fix test
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
When matching emoji, use a regex built from the data we have instead of something generic using unicode ranges. A generic regex can't tell the difference between two separate emoji next to each other or one emoji that is built out of two separate emoji next to each other.
This means that emoji that are next to each other without space in between will be now accurately spanned individually with proper title etc...
* Support unicode emojis and remove emojify.js
This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.
This works in a few ways:
First it adds emoji parsing support into gitea itself. This allows us to
* Render emojis from valid alias (😄)
* Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
* Easily allow for custom "emoji"
* Support all emoji rendering and features without javascript
* Uses plain unicode and lets the system render in appropriate emoji font
* Doesn't leave us relying on external sources for updates/fixes/features
That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)
For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.
The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.
I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.
I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.
Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.
Fixes: https://github.com/go-gitea/gitea/issues/9182
Fixes: https://github.com/go-gitea/gitea/issues/8974
Fixes: https://github.com/go-gitea/gitea/issues/8953
Fixes: https://github.com/go-gitea/gitea/issues/6628
Fixes: https://github.com/go-gitea/gitea/issues/5130
* add new shared function emojiHTML
* don't increase emoji size in issue title
* Update templates/repo/issue/view_content/add_reaction.tmpl
Co-Authored-By: 6543 <6543@obermui.de>
* Support for emoji rendering in various templates
* Render code and review comments as they should be
* Better way to handle mail subjects
* insert unicode from tribute selection
* Add template helper for plain text when needed
* Use existing replace function I forgot about
* Don't include emoji greater than Unicode Version 12
Only include emoji and aliases in JSON
* Update build/generate-emoji.go
* Tweak regex slightly to really match everything including random invisible characters. Run tests for every emoji we have
* final updates
* code review
* code review
* hard code gitea custom emoji to match previous behavior
* Update .eslintrc
Co-Authored-By: silverwind <me@silverwind.io>
* disable preempt
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
* Move to goldmark
Markdown rendering moved from blackfriday to the goldmark.
Multiple subtle changes required to the goldmark extensions to keep
current rendering and defaults.
Can go further with goldmark linkify and have this work within markdown
rendering making the link processor unnecessary.
Need to think about how to go about allowing extensions - at present it
seems that these would be hard to do without recompilation.
* linter fixes
Co-authored-by: Lauris BH <lauris@nix.lv>
* Check commit message hashes before making links
Previously, when formatting commit messages, anything
that looked like SHA1 hashes was turned into a link
using regex. This meant that certain phrases or numbers
such as `777777` or `deadbeef` could be recognized as a commit
even if the repository has no commit with those hashes.
This change will make it so that anything that looks
like a SHA1 hash using regex will then also be checked
to ensure that there is a commit in the repository
with that hash before making a link.
Signed-off-by: Gary Kim <gary@garykim.dev>
* Use gogit to check if commit exists
This commit modifies the commit hash check
in the render for commit messages to use
gogit for better performance.
Signed-off-by: Gary Kim <gary@garykim.dev>
* Make code cleaner
Signed-off-by: Gary Kim <gary@garykim.dev>
* Use rev-parse to check if commit exists
Signed-off-by: Gary Kim <gary@garykim.dev>
* Add and modify tests for checking hashes in html link rendering
Signed-off-by: Gary Kim <gary@garykim.dev>
* Return error in sha1CurrentPatternProcessor
Co-Authored-By: mrsdizzie <info@mrsdizzie.com>
* Import Gitea log module
Signed-off-by: Gary Kim <gary@garykim.dev>
* Revert "Return error in sha1CurrentPatternProcessor"
This reverts commit 28f561cac4.
Signed-off-by: Gary Kim <gary@garykim.dev>
* Add debug logging to sha1CurrentPatternProcessor
This will log errors by the git command run in
sha1CurrentPatternProcessor if the error is one
that was unexpected.
Signed-off-by: Gary Kim <gary@garykim.dev>
Since #6273 was merged, we now have access to proper context metas
always. Update SHA generated links to use these instead of urlPrefix.
Update tests as well.
Fixes #4536.
* Use stricter boundaries for auto-link detection
Currently autolinks use \W for boundary detection which creates many
situations of inserting links into places they don't belong (paths,
URLs, UUIDs, etc...)
This fixes that by replacing \W and only allowing these matches to touch
an open paren or bracket (matching what seems to be Github behavior) in
addition to whitespace and start of line. Similar for ending boundary as
well.
Fixes #6149
(and probably others)
* Update test
Replace incorrect test with a value that is a valid username, based on:
"Username should contain only alphanumeric, dash ('-'), underscore ('_')
and dot ('.') characters."
* Also allow for period at the end
Matching Github behavior
* Fix email regex to work properly with specificed boundaries
Create a specific capture group for email address and then use
FindStringSubmatchIndex to allow for non-matching patterns as
boundaries.
* Add Tests
Add tests for new behavior -- including tests for email addresses which
were absent before.
* Replace linkRegex with xurls library
Rather than maintaining a complicated regex to match URLs for
autolinking, gitea can use this existing go library that takes care of
the matching with very little code change to gitea itself. After
spending a while trying to find the perfect regex for all cases this library
still works better as it is more flexible than a single regex ever will be.
This will also fix the following issues: #5844#3095#3381
This passes all our current tests and I've added new ones mentioned in
those issues as well.
* Use xurls.StrictMatchingScheme instead of xurls.Strict
This is much faster and we only care about https? links to preserve
existing behavior.
The visitLinksForShortLinks feature would look inside of an <a> tag and
run shortLinkProcessorFull on any text, which attempts to create links
out of potential 'short links' like [[test]] [[link|example]] etc...
This makes no sense because you can't have nested links within an <a>
tag. Specifically, the html5 standard says <a> tags can't include
interactive content if they contain the href attribute:
http://w3c.github.io/html/single-page.html#the-a-element
And also defines an <a> element with a href attribute as interactive:
http://w3c.github.io/html/single-page.html#interactive-content
Therefore you can't really put a link inside of another link. In
practice none of this works anyways since browsers won't render it, it
would probably be broken if they tried, and it is causing a bug
(#4946). No current tests rely on this behavior either.
This removes the feature and also explicitly excludes the
current visitNodeForShortLinks from looking in <a> tags.
Modify the current linkRegex to require http|https which appears to be
the intended behavior based on the comments. Right now, it also matches
anything starting with www as well. Also add testing for linkRegex
* Get rid of autolink
* autolink in markdown
* Replace email addresses with mailto links
* better handling of links
* Remove autolink.js from footer
* Refactor entire html.go
* fix some bugs
* Make tests green, move what we can to html_internal_test, various other changes to processor logic
* Make markdown tests work again
This is just a description to allow me to force push in order to restart
the drone build.
* Fix failing markdown tests in routers/api/v1/misc
* Add license headers, log errors, future-proof <body>
* fix formatting
* add init support of orgmode document type on file view and readme
* fix imports
* fix imports and readmeExist
* fix imports order
* fix format
* remove unnecessary convert
* restructure markup & markdown to prepare for multiple markup languages support
* adjust some functions between markdown and markup
* fix tests
* improve the comments