1
0
Fork 0
mirror of https://codeberg.org/forgejo/forgejo.git synced 2024-12-04 10:30:19 -05:00
Commit graph

59 commits

Author SHA1 Message Date
wxiaoguang
061b68e995
Refactor path & config system (#25330) (#25416)
Backport #25330

# The problem

There were many "path tricks":

* By default, Gitea uses its program directory as its work path
* Gitea tries to use the "work path" to guess its "custom path" and
"custom conf (app.ini)"
* Users might want to use other directories as work path
* The non-default work path should be passed to Gitea by GITEA_WORK_DIR
or "--work-path"
* But some Gitea processes are started without these values
    * The "serv" process started by OpenSSH server
    * The CLI sub-commands started by site admin
* The paths are guessed by SetCustomPathAndConf again and again
* The default values of "work path / custom path / custom conf" can be
changed when compiling

# The solution

* Use `InitWorkPathAndCommonConfig` to handle these path tricks, and use
test code to cover its behaviors.
* When Gitea's web server runs, write the WORK_PATH to "app.ini", this
value must be the most correct one, because if this value is not right,
users would find that the web UI doesn't work and then they should be
able to fix it.
* Then all other sub-commands can use the WORK_PATH in app.ini to
initialize their paths.
* By the way, when Gitea starts for git protocol, it shouldn't output
any log, otherwise the git protocol gets broken and client blocks
forever.

The "work path" priority is: WORK_PATH in app.ini > cmd arg --work-path
> env var GITEA_WORK_DIR > builtin default

The "app.ini" searching order is: cmd arg --config > cmd arg "work path
/ custom path" > env var "work path / custom path" > builtin default


## ⚠️ BREAKING

If your instance's "work path / custom path / custom conf" doesn't meet
the requirements (eg: work path must be absolute), Gitea will report a
fatal error and exit. You need to set these values according to the
error log.
2023-06-22 16:27:18 +00:00
Lunny Xiao
377a0a20f0
Merge setting.InitXXX into one function with options (#24389)
This PR will merge 3 Init functions on setting packages as 1 and
introduce an options struct.
2023-05-04 11:55:35 +08:00
KN4CK3R
f1173d6879
Use more specific test methods (#24265)
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: Giteabot <teabot@gitea.io>
2023-04-22 17:56:27 -04:00
Jonathan Tran
4de80392bc
Add context when rendering labels or emojis (#23281)
This branch continues the work of #23092 and attempts to rid the
codebase of any `nil` contexts when using a `RenderContext`.

Anything that renders markdown or does post processing may call
`markup.sha1CurrentPatternProcessor()`, and this runs
`git.OpenRepository()`, which needs a context. It will panic if the
context is `nil`. This branch attempts to _always_ include a context
when creating a `RenderContext` to prevent future crashes.

Co-authored-by: Kyle D <kdumontnu@gmail.com>
2023-03-05 22:59:05 +01:00
Lunny Xiao
c53ad052d8
Refactor the setting to make unit test easier (#22405)
Some bugs caused by less unit tests in fundamental packages. This PR
refactor `setting` package so that create a unit test will be easier
than before.

- All `LoadFromXXX` files has been splited as two functions, one is
`InitProviderFromXXX` and `LoadCommonSettings`. The first functions will
only include the code to create or new a ini file. The second function
will load common settings.
- It also renames all functions in setting from `newXXXService` to
`loadXXXSetting` or `loadXXXFrom` to make the function name less
confusing.
- Move `XORMLog` to `SQLLog` because it's a better name for that.

Maybe we should finally move these `loadXXXSetting` into the `XXXInit`
function? Any idea?

---------

Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: delvh <dev.lh@web.de>
2023-02-20 00:12:01 +08:00
flynnnnnnnnnn
e81ccc406b
Implement FSFE REUSE for golang files (#21840)
Change all license headers to comply with REUSE specification.

Fix #16132

Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
2022-11-27 18:20:29 +00:00
zeripath
93df41f506
Fix slight bug in katex (#21171)
There is a small bug in #20571 whereby `$a a$b b$` will not be correctly
detected as a math inline block of `a a$b b`. This PR fixes this.

Also reenable test cases as per #21340 

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-10-05 19:55:36 +01:00
Kyle D
c8ded77680
Kd/ci playwright go test (#20123)
* Add initial playwright config

* Simplify Makefile

* Simplify Makefile

* Use correct config files

* Update playwright settings

* Fix package-lock file

* Don't use test logger for e2e tests

* fix frontend lint

* Allow passing TEST_LOGGER variable

* Init postgres database

* use standard gitea env variables

* Update playwright

* update drone

* Move empty env var to commands

* Cleanup

* Move integrations to subfolder

* tests integrations to tests integraton

* Run e2e tests with go test

* Fix linting

* install CI deps

* Add files to ESlint

* Fix drone typo

* Don't log to console in CI

* Use go test http server

* Add build step before tests

* Move shared init function to common package

* fix drone

* Clean up tests

* Fix linting

* Better mocking for page + version string

* Cleanup test generation

* Remove dependency on gitea binary

* Fix linting

* add initial support for running specific tests

* Add ACCEPT_VISUAL variable

* don't require git-lfs

* Add initial documentation

* Review feedback

* Add logged in session test

* Attempt fixing drone race

* Cleanup and bump version

* Bump deps

* Review feedback

* simplify installation

* Fix ci

* Update install docs
2022-09-02 15:18:23 -04:00
wxiaoguang
157b405753
Remove legacy git code (ver < 2.0), fine tune markup tests (#19930)
* clean git support for ver < 2.0

* fine tune tests for markup (which requires git module)

* remove unnecessary comments

* try to fix tests

* try test again

* use const for GitVersionRequired instead of var

* try to fix integration test

* Refactor CheckAttributeReader to make a *git.Repository version

* update document for commit signing with Gitea's internal gitconfig

* update document for commit signing with Gitea's internal gitconfig

Co-authored-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2022-06-16 23:47:44 +08:00
Lunny Xiao
b01dce2a6e
Allow render HTML with css/js external links (#19017)
* Allow render HTML with css/js external links

* Fix bug because of filename escape chars

* Fix lint

* Update docs about new configuration item

* Fix bug of render HTML in sub directory

* Add CSP head for displaying iframe in rendering file

* Fix test

* Apply suggestions from code review

Co-authored-by: delvh <dev.lh@web.de>

* Some improvements

* some improvement

* revert change in SanitizerDisabled of external renderer

* Add sandbox for iframe and support allow-scripts and allow-same-origin

* refactor

* fix

* fix lint

* fine tune

* use single option RENDER_CONTENT_MODE, use sandbox=allow-scripts

* fine tune CSP

* Apply suggestions from code review

Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>

Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-06-16 11:33:23 +08:00
Alexander Neumann
fd273b05b9
Correctly link URLs to users/repos with dashes, dots or underscores (#18890)
* Add tests for references with dashes

This commit adds tests for full URLs referencing repos names and user
names containing a dash.

* Extend regex to match URLs to repos/users with dashes
2022-02-26 00:26:43 +01:00
Gusted
72256c16a8
Prevent NPE on partial match of compare URL and allow short SHA1 compare URLs (#18472)
* Don't panic & allow shorter sha1

- Don't panic when the full regex isn't matched and allow the usage of a
shorter sha1 being used.
- Resolves #18471

* Update modules/markup/html.go

Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2022-01-31 01:48:47 +02:00
6543
54e9ee37a7
format with gofumpt (#18184)
* gofumpt -w -l .

* gofumpt -w -l -extra .

* Add linter

* manual fix

* change make fmt
2022-01-20 18:46:10 +01:00
zeripath
5cb0c9aa0d
Propagate context and ensure git commands run in request context (#17868)
This PR continues the work in #17125 by progressively ensuring that git
commands run within the request context.

This now means that the if there is a git repo already open in the context it will be used instead of reopening it.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-01-19 23:26:57 +00:00
wxiaoguang
6d4172987e
Fix markdown URL parsing (#17924)
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: zeripath <art27@cantab.net>
2021-12-11 19:21:36 +02:00
zeripath
5fbccad906
Fix NPE in fuzzer (#16680)
The fuzzer found an issue with the issue pattern processor where there is a spurious
path.Clean which does not need to be there. This PR also sets the default AppURL for
the fuzzer too.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-08-13 00:22:05 +02:00
6543
91162bbaea
Update bluemonday to v1.0.15 (#16379)
* update github.com/microcosm-cc/bluemonday

* add exec flag to contrib/update_dependencies.sh

* Fix TESTS
2021-07-09 03:30:31 +02:00
zeripath
32fd11395b
Fix relative links in postprocessed images (#16334)
If a pre-post-processed file contains relative img tags these need to be updated
and joined correctly with the prefix. Finally, the node attributes need to be updated.

Fix #16308

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
2021-07-04 10:26:04 +01:00
6543
65548359cc
Add custom emoji support (#16004) 2021-06-29 16:28:38 +02:00
KN4CK3R
c9c7afda1a
Add sanitizer rules per renderer (#16110)
* Added sanitizer rules per renderer.

* Updated documentation.

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2021-06-23 17:09:51 -04:00
zeripath
0db1048c3a
Run processors on whole of text (#16155)
There is an inefficiency in the design of our processors which means that Emoji
and other processors run in order n^2 time.

This PR forces the processors to process the entirety of text node before passing
back up. The fundamental inefficiency remains but it should be significantly
ameliorated.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-06-17 11:35:05 +01:00
KN4CK3R
21cde5c439
Fix data URI scramble (#16098)
* Removed unused method.

* No prefix for data uris.

* Added test to prevent regressions.
2021-06-07 18:55:26 +02:00
silverwind
d4f28fd4ad
Fix URL of gitea emoji (#15770)
Fixes regression from #15219
2021-05-07 17:34:33 +02:00
Lunny Xiao
9d99f6ab19
Refactor renders (#15175)
* Refactor renders

* Some performance optimization

* Fix comment

* Transform reader

* Fix csv test

* Fix test

* Fix tests

* Improve optimaziation

* Fix test

* Fix test

* Detect file encoding with reader

* Improve optimaziation

* reduce memory usage

* improve code

* fix build

* Fix test

* Fix for go1.15

* Fix render

* Fix comment

* Fix lint

* Fix test

* Don't use NormalEOF when unnecessary

* revert change on util.go

* Apply suggestions from code review

Co-authored-by: zeripath <art27@cantab.net>

* rename function

* Take NormalEOF back

Co-authored-by: zeripath <art27@cantab.net>
2021-04-19 18:25:08 -04:00
zeripath
b9ed3cbc26
Upgrade to bluemonday 1.0.7 (#15379)
* Upgrade to bluemonday 1.0.7

Fix #15349

Signed-off-by: Andrew Thornton <art27@cantab.net>

* resolve unit test

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2021-04-09 19:13:06 -04:00
zeripath
172229966c
Prevent panic on fuzzer provided string (#14405)
* Prevent panic on fuzzer provided string

The fuzzer has found that providing a <body> tag with an attribute to
PostProcess causes a panic. This PR removes any rendered html or body
tags from the output.

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Placate lint

* placate lint again

Signed-off-by: Andrew Thornton <art27@cantab.net>

* minor cleanup

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-01-20 23:10:50 +08:00
Nuno Silva
44ff1d6a1e
Render links for commit hashes followed by comma (#14224)
Regex test cases: https://regex101.com/r/mVbPxM/2/

fixes #14223
2021-01-03 23:11:10 +08:00
Lunny Xiao
11555d850b
Fix bug of link query order on markdown render (#14156)
* Fix bug of link query order on markdown render

* Fix bluemonday bug and fix one wrong test

Co-authored-by: 6543 <6543@obermui.de>
2020-12-29 00:28:27 +08:00
kolaente
64133126cd
Update golangci-lint to version 1.31.0 (#13102)
This PR updates golangci-lint to the latest version 1.31.0.

The upgrade introduced a new check for which I've fixed or disabled most cases.

Signed-off-by: kolaente <k@knt.li>
2020-10-11 21:27:20 +01:00
silverwind
ee047312a1
Fix emoji replacements, make emoji images consistent (#12567)
- Fix emoji not being replaced in issue title change text
- Make the image attributes consistent, add alt, remove align

Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2020-08-23 21:44:53 -04:00
mrsdizzie
ea1ed802a3
Fix emoji detection in certain cases (#12320)
* Fix emoji detection certain cases

Previous tests weren't complicated enough so there were some situations where emojis were't detected properly. Find the earliest occurance in addition to checking for the longest combination.

Fixes #12312

* ok spell bot

Co-authored-by: Lauris BH <lauris@nix.lv>
2020-07-25 16:40:04 +03:00
silverwind
2447ffc74a
Disable all typographic replacements in markdown renderer (#11871)
* Disable all typographic replacements in markdown renderer

Previously we only disabled some of them. This disables all the default
replacements that goldmark's typographer extension offers, matching
GitHub's renderer.

Ref: https://github.com/yuin/goldmark#typographer-extension
Fixes: https://github.com/go-gitea/gitea/issues/11001

* remove typographer extension completely

* fix test

* really fix test

Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2020-06-13 02:10:17 -04:00
mrsdizzie
4c1ff57f1a
Update emoji regex (#11584)
When matching emoji, use a regex built from the data we have instead of something generic using unicode ranges. A generic regex can't tell the difference between two separate emoji next to each other or one emoji that is built out of two separate emoji next to each other.

This means that emoji that are next to each other without space in between will be now accurately spanned individually with proper title etc...
2020-05-29 17:08:36 +01:00
mrsdizzie
4563eb873d
Support unicode emojis and remove emojify.js (#11032)
* Support unicode emojis and remove emojify.js

This PR replaces all use of emojify.js and adds unicode emoji support to various areas of gitea.

This works in a few ways:

First it adds emoji parsing support into gitea itself. This allows us to

 * Render emojis from valid alias (😄)
 * Detect unicode emojis and let us put them in their own class with proper aria-labels and styling
 * Easily allow for custom "emoji"
 * Support all emoji rendering and features without javascript
 * Uses plain unicode and lets the system render in appropriate emoji font
 * Doesn't leave us relying on external sources for updates/fixes/features

That same list of emoji is also used to create a json file which replaces the part of emojify.js that populates the emoji search tribute. This file is about 35KB with GZIP turned on and I've set it to load after the page renders to not hinder page load time (and this removes loading emojify.js also)

For custom "emoji" it uses a pretty simple scheme of just looking for /emojis/img/name.png where name is something a user has put in the "allowed reactions" setting we already have. The gitea reaction that was previously hard coded into a forked copy of emojify.js is included and works as a custom reaction under this method.

The emoji data sourced here is from https://github.com/github/gemoji which is the gem library Github uses for their emoji rendering (and a data source for other sites). So we should be able to easily render any emoji and :alias: that Github can, removing any errors from migrated content. They also update it as well, so we can sync when there are new unicode emoji lists released.

I've included a slimmed down and slightly modified forked copy of https://github.com/knq/emoji to make up our own emoji module. The code is pretty straight forward and again allows us to have a lot of flexibility in what happens.

I had seen a few comments about performance in some of the other threads if we render this ourselves, but there doesn't seem to be any issue here. In a test it can parse, convert, and render 1,000 emojis inside of a large markdown table in about 100ms on my laptop (which is many more emojis than will ever be in any normal issue). This also prevents any flickering and other weirdness from using javascript to render some things while using go for others.

Not included here are image fall back URLS. I don't really think they are necessary for anything new being written in 2020. However, managing the emoji ourselves would allow us to add these as a feature later on if it seems necessary.

Fixes: https://github.com/go-gitea/gitea/issues/9182
Fixes: https://github.com/go-gitea/gitea/issues/8974
Fixes: https://github.com/go-gitea/gitea/issues/8953
Fixes: https://github.com/go-gitea/gitea/issues/6628
Fixes: https://github.com/go-gitea/gitea/issues/5130

* add new shared function emojiHTML

* don't increase emoji size in issue title

* Update templates/repo/issue/view_content/add_reaction.tmpl

Co-Authored-By: 6543 <6543@obermui.de>

* Support for emoji rendering in various templates

* Render code and review comments as they should be

* Better way to handle mail subjects

* insert unicode from tribute selection

* Add template helper for plain text when needed

* Use existing replace function I forgot about

* Don't include emoji greater than Unicode Version 12

Only include emoji and aliases in JSON

* Update build/generate-emoji.go

* Tweak regex slightly to really match everything including random invisible characters. Run tests for every emoji we have

* final updates

* code review

* code review

* hard code gitea custom emoji to match previous behavior

* Update .eslintrc

Co-Authored-By: silverwind <me@silverwind.io>

* disable preempt

Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: guillep2k <18600385+guillep2k@users.noreply.github.com>
2020-04-28 15:05:39 -03:00
techknowlogick
d00ebf445b
upgrade to most recent bluemonday (#11007)
* upgrade to most recent bluemonday

* make vendor

* update tests for bluemonday

* update tests for bluemonday

* update tests for bluemonday
2020-04-07 23:08:47 +03:00
zeripath
154b137b6d
Relax sanitization as per https://github.com/jch/html-pipeline (#10527)
Looking at github/markup#245 it is clear that GH uses https://github.com/jch/html-pipeline to sanitize. This PR relaxes our sanitization to more closely match this.

Fixes #10471
and likely others...
2020-02-28 20:05:12 +00:00
John Olheiser
7d7ab1eeae Issue/PR Context Popups (#9822)
* Add data-index attribute to issue anchors

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Init JS

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Add required data to anchor

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Finish popup

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Revert changes to html.go

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Better octicon contexts

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Split out popup function for re-use

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Style changes, test fixes, and cross-reference support

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Prefer em to px

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Move label margin to base CSS

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Move JS to separate file.

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Move JS to features and fix module

Signed-off-by: jolheiser <john.olheiser@gmail.com>

* Remove query-string and hash

Co-Authored-By: silverwind <me@silverwind.io>

Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: Antoine GIRARD <sapk@users.noreply.github.com>
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: zeripath <art27@cantab.net>
2020-01-19 23:39:21 -05:00
zeripath
27757714d0 Change markdown rendering from blackfriday to goldmark (#9533)
* Move to goldmark

Markdown rendering moved from blackfriday to the goldmark.

Multiple subtle changes required to the goldmark extensions to keep
current rendering and defaults.

Can go further with goldmark linkify and have this work within markdown
rendering making the link processor unnecessary.

Need to think about how to go about allowing extensions - at present it
seems that these would be hard to do without recompilation.

* linter fixes

Co-authored-by: Lauris BH <lauris@nix.lv>
2019-12-31 03:53:28 +02:00
Lauris BH
086a46994a Rewrite markdown rendering to blackfriday v2 and rewrite orgmode rendering to go-org (#8560)
* Rewrite markdown rendering to blackfriday v2.0

* Fix style

* Fix go mod with golang 1.13

* Fix blackfriday v2 import

* Inital orgmode renderer migration to go-org

* Vendor go-org dependency

* Ignore errors :/

* Update go-org to latest version

* Update test

* Fix go-org test

* Remove unneeded code

* Fix comments

* Fix markdown test

* Fix blackfriday regression rendering HTML block
2019-10-31 01:06:25 +00:00
guillep2k
cea8ea5ae6 Support inline rendering of CUSTOM_URL_SCHEMES (#8496)
* Support inline rendering of CUSTOM_URL_SCHEMES

* Fix lint

* Add tests

* Fix lint
2019-10-15 02:31:09 +01:00
Gary Kim
7eed11e5e9 Check commit message hashes before making links (#7713)
* Check commit message hashes before making links

Previously, when formatting commit messages, anything
that looked like SHA1 hashes was turned into a link
using regex. This meant that certain phrases or numbers
such as `777777` or `deadbeef` could be recognized as a commit
even if the repository has no commit with those hashes.

This change will make it so that anything that looks
like a SHA1 hash using regex will then also be checked
to ensure that there is a commit in the repository
with that hash before making a link.

Signed-off-by: Gary Kim <gary@garykim.dev>

* Use gogit to check if commit exists

This commit modifies the commit hash check
in the render for commit messages to use
gogit for better performance.

Signed-off-by: Gary Kim <gary@garykim.dev>

* Make code cleaner

Signed-off-by: Gary Kim <gary@garykim.dev>

* Use rev-parse to check if commit exists

Signed-off-by: Gary Kim <gary@garykim.dev>

* Add and modify tests for checking hashes in html link rendering

Signed-off-by: Gary Kim <gary@garykim.dev>

* Return error in sha1CurrentPatternProcessor

Co-Authored-By: mrsdizzie <info@mrsdizzie.com>

* Import Gitea log module

Signed-off-by: Gary Kim <gary@garykim.dev>

* Revert "Return error in sha1CurrentPatternProcessor"

This reverts commit 28f561cac4.

Signed-off-by: Gary Kim <gary@garykim.dev>

* Add debug logging to sha1CurrentPatternProcessor

This will log errors by the git command run in
sha1CurrentPatternProcessor if the error is one
that was unexpected.

Signed-off-by: Gary Kim <gary@garykim.dev>
2019-08-14 16:04:55 +08:00
Christian Muehlhaeuser
54d96c79b5 Removed unnecessary conversions (#7557)
No need to convert to the same type.
2019-07-23 19:50:39 +01:00
mrsdizzie
0064535ad2 Fix domain name pattern in email regex (#6739)
Fixes #6735
2019-04-24 21:53:41 -04:00
mrsdizzie
1bce1894f5 Use ctx.metas for SHA hash links (#6645)
Since #6273 was merged, we now have access to proper context metas
always. Update SHA generated links to use these instead of urlPrefix.

Update tests as well.

Fixes #4536.
2019-04-16 08:53:57 +01:00
silverwind
8e949db3b5 Render SHA1 links as code blocks (#6546) 2019-04-09 06:18:48 +03:00
mrsdizzie
6293736d02 Use stricter boundaries for auto-link detection (#6522)
* Use stricter boundaries for auto-link detection

Currently autolinks use \W for boundary detection which creates many
situations of inserting links into places they don't belong (paths,
URLs, UUIDs, etc...)

This fixes that by replacing \W and only allowing these matches to touch
an open paren or bracket (matching what seems to be Github behavior) in
addition to whitespace and start of line. Similar for ending boundary as
well.

Fixes #6149
(and probably others)

* Update test

Replace incorrect test with a value that is a valid username, based on:

"Username should contain only alphanumeric, dash ('-'), underscore ('_')
and dot ('.') characters."

* Also allow for period at the end

Matching Github behavior

* Fix email regex to work properly with specificed boundaries

Create a specific capture group for email address and then use
FindStringSubmatchIndex to allow for non-matching patterns as
boundaries.

* Add Tests

Add tests for new behavior -- including tests for email addresses which
were absent before.
2019-04-07 12:18:16 +01:00
mrsdizzie
c8650aef0a Change order that PostProcess Processors are run (#6445)
Make sure Processors that work on full links are run first so that
something matching another pattern doesn't alter a link before we get to
it, for example:

 https://stackoverflow.com/questions/2896191/what-is-go-used-fore

Fixes #4813
2019-03-27 11:37:54 -04:00
mrsdizzie
f2de5dc8c8 Replace linkRegex with xurls library (#6261)
* Replace linkRegex with xurls library

Rather than maintaining a complicated regex to match URLs for
autolinking, gitea can use this existing go library that takes care of
the matching with very little code change to gitea itself. After
spending a while trying to find the perfect regex for all cases this library
still works better as it is more flexible than a single regex ever will be.

This will also fix the following issues: #5844 #3095 #3381

This passes all our current tests and I've added new ones mentioned in
those issues as well.

* Use xurls.StrictMatchingScheme instead of xurls.Strict

This is much faster and we only care about https? links to preserve
existing behavior.
2019-03-07 15:12:01 -05:00
mrsdizzie
020075e12f Remove visitLinksForShortLinks features (#6257)
The visitLinksForShortLinks feature would look inside of an <a> tag and
run shortLinkProcessorFull on any text, which attempts to create links
out of potential 'short links' like [[test]] [[link|example]] etc...
This makes no sense because you can't have nested links within an <a>
tag. Specifically, the html5 standard says <a> tags can't include
interactive content if they contain the href attribute:

 http://w3c.github.io/html/single-page.html#the-a-element

And also defines an <a> element with a href attribute as interactive:

 http://w3c.github.io/html/single-page.html#interactive-content

Therefore you can't really put a link inside of another link. In
practice none of this works anyways since browsers won't render it, it
would probably be broken if they tried, and it is causing a bug
(#4946). No current tests rely on this behavior either.

This removes the feature and also explicitly excludes the
current visitNodeForShortLinks from looking in <a> tags.
2019-03-07 14:13:44 -05:00
mrsdizzie
4a2e92bcd1 Modify linkRegex to require http|https (#6171)
Modify the current linkRegex to require http|https which appears to be
the intended behavior based on the comments. Right now, it also matches
anything starting with www as well. Also add testing for linkRegex
2019-02-28 20:31:53 +08:00