1
0
Fork 0
mirror of https://codeberg.org/forgejo/forgejo.git synced 2024-12-30 14:09:42 -05:00
Commit graph

37 commits

Author SHA1 Message Date
0ko
c7ba51518c s/Gitea/Forgejo in various log messages and comments
(cherry picked from commit 469c214ec8)
2024-04-22 14:41:17 +00:00
Gusted
5b3a82d621
[FEAT] Enable ambiguous character detection in configured contexts
- The ambiguous character detection is an important security feature to
combat against sourcebase attacks (https://trojansource.codes/).
- However there are a few problems with the feature as it stands
today (i) it's apparantly an big performance hitter, it's twice as slow
as syntax highlighting (ii) it contains false positives, because it's
reporting valid problems but not valid within the context of a
programming language (ambiguous charachters in code comments being a
prime example) that can lead to security issues (iii) charachters from
certain languages always being marked as ambiguous. It's a lot of effort
to fix the aforementioned issues.
- Therefore, make it configurable in which context the ambiguous
character detection should be run, this avoids running detection in all
contexts such as file views, but still enable it in commits and pull
requests diffs where it matters the most. Ideally this also becomes an
per-repository setting, but the code architecture doesn't allow for a
clean implementation of that.
- Adds unit test.
- Adds integration tests to ensure that the contexts and instance-wide
is respected (and that ambigious charachter detection actually work in
different places).
- Ref: https://codeberg.org/forgejo/forgejo/pulls/2395#issuecomment-1575547
- Ref: https://codeberg.org/forgejo/forgejo/issues/564
2024-02-23 13:12:17 +01:00
wxiaoguang
65248945c9
Refactor locale&string&template related code (#29165)
Clarify when "string" should be used (and be escaped), and when
"template.HTML" should be used (no need to escape)

And help PRs like  #29059 , to render the error messages correctly.

(cherry picked from commit f3eb835886031df7a562abc123c3f6011c81eca8)

Conflicts:
	modules/web/middleware/binding.go
	routers/web/feed/convert.go
	tests/integration/branches_test.go
	tests/integration/repo_branch_test.go
	trivial context conflicts
2024-02-16 15:20:52 +01:00
silverwind
60e4a98ab0
Preserve BOM in web editor (#28935)
The `ToUTF8*` functions were stripping BOM, while BOM is actually valid
in UTF8, so the stripping must be optional depending on use case. This
does:

- Add a options struct to all `ToUTF8*` functions, that by default will
strip BOM to preserve existing behaviour
- Remove `ToUTF8` function, it was dead code
- Rename `ToUTF8WithErr` to `ToUTF8`
- Preserve BOM in Monaco Editor
- Remove a unnecessary newline in the textarea value. Browsers did
ignore it, it seems but it's better not to rely on this behaviour.

Fixes: https://github.com/go-gitea/gitea/issues/28743
Related: https://github.com/go-gitea/gitea/issues/6716 which seems to
have once introduced a mechanism that strips and re-adds the BOM, but
from what I can tell, this mechanism was removed at some point after
that PR.
2024-01-27 18:02:51 +00:00
wxiaoguang
20929edc99
Add option to disable ambiguous unicode characters detection (#28454)
* Close #24483
* Close #28123
* Close #23682
* Close #23149

(maybe more)
2023-12-17 14:38:54 +00:00
silverwind
88f835192d
Replace interface{} with any (#25686)
Result of running `perl -p -i -e 's#interface\{\}#any#g' **/*` and `make fmt`.

Basically the same [as golang did](2580d0e08d).
2023-07-04 18:36:08 +00:00
silverwind
8dc6eabbc0
Update go tool dependencies, restructure lint targets (#24239)
- Update all tool dependencies to latest tag
- Remove unused errcheck, it is part of golangci-lint
- Include main.go in air
- Enable wastedassign again now that it's
[generics-compatible](https://github.com/golangci/golangci-lint/pull/3689)
- Restructured lint targets to new `lint-*` namespace
2023-04-22 14:53:00 -04:00
wxiaoguang
7681d582cd
Refactor locale number (#24134)
Before, the `GiteaLocaleNumber.js` was just written as a a drop-in
replacement for old `js-pretty-number`.

Actually, we can use Golang's `text` package to format.

This PR partially completes the TODOs in `GiteaLocaleNumber.js`:

> if we have complete backend locale support (eg: Golang "x/text"
package), we can drop this component.
> tooltip: only 2 usages of this, we can replace it with Golang's
"x/text/number" package in the future.

This PR also helps #24131

Screenshots:

<details>

![image](https://user-images.githubusercontent.com/2114189/232179420-b1b9974b-9d96-4408-b209-b80182c8b359.png)


![image](https://user-images.githubusercontent.com/2114189/232179416-14f36aa0-3f3e-4ac9-b366-7bd3a4464a11.png)

</details>
2023-04-17 11:37:23 +08:00
wxiaoguang
8d5fbeb7a2
Use data-tooltip-content for tippy tooltip (#23649)
Follow:
* #23574
* Remove all ".tooltip[data-content=...]"

Major changes:

* Remove "tooltip" class, use "[data-tooltip-content=...]" instead of
".tooltip[data-content=...]"
* Remove legacy `data-position`, it's dead code since last Fomantic
Tooltip -> Tippy Tooltip refactoring
* Rename reaction attribute from `data-content` to
`data-reaction-content`
* Add comments for some `data-content`: `{{/* used by the form */}}`
* Remove empty "ui" class
* Use "text color" for SVG icons (a few)
2023-03-24 18:35:38 +08:00
Jason Song
e253888a0e
Fix isAllowed of escapeStreamer (#22814)
The use of `sort.Search` is wrong: The slice should be sorted, and
`return >= 0` doen't mean it exists, see the
[manual](https://pkg.go.dev/sort#Search).

Could be fixed like this if we really need it:

```diff
diff --git a/modules/charset/escape_stream.go b/modules/charset/escape_stream.go
index 823b63513..fcf1ffbc1 100644
--- a/modules/charset/escape_stream.go
+++ b/modules/charset/escape_stream.go
@@ -20,6 +20,9 @@ import (
 var defaultWordRegexp = regexp.MustCompile(`(-?\d*\.\d\w*)|([^\` + "`" + `\~\!\@\#\$\%\^\&\*\(\)\-\=\+\[\{\]\}\\\|\;\:\'\"\,\.\<\>\/\?\s\x00-\x1f]+)`)

 func NewEscapeStreamer(locale translation.Locale, next HTMLStreamer, allowed ...rune) HTMLStreamer {
+       sort.Slice(allowed, func(i, j int) bool {
+               return allowed[i] < allowed[j]
+       })
        return &escapeStreamer{
                escaped:                 &EscapeStatus{},
                PassthroughHTMLStreamer: *NewPassthroughStreamer(next),
@@ -284,14 +287,8 @@ func (e *escapeStreamer) runeTypes(runes ...rune) (types []runeType, confusables
 }

 func (e *escapeStreamer) isAllowed(r rune) bool {
-       if len(e.allowed) == 0 {
-               return false
-       }
-       if len(e.allowed) == 1 {
-               return e.allowed[0] == r
-       }
-
-       return sort.Search(len(e.allowed), func(i int) bool {
+       i := sort.Search(len(e.allowed), func(i int) bool {
                return e.allowed[i] >= r
-       }) >= 0
+       })
+       return i < len(e.allowed) && e.allowed[i] == r
 }
```

But I don't think so, a map is better to do it.
2023-02-09 20:51:36 +08:00
crystal
5d9c64b3fe
Fix line spacing for plaintext previews (#22699)
Adding `<br>` between each line is not necessary since the entire file
is rendered inside a `<pre>`

fixes https://codeberg.org/Codeberg/Community/issues/915
2023-02-01 22:51:02 -06:00
zeripath
6e22605793
Ensure that plain files are rendered correctly even when containing ambiguous characters (#22017)
As recognised in #21841 the rendering of plain text files is somewhat
incorrect when there are ambiguous characters as the html code is double
escaped. In fact there are several more problems here.

We have a residual isRenderedHTML which is actually simply escaping the
file - not rendering it. This is badly named and gives the wrong
impression.

There is also unusual behaviour whether the file is called a Readme or
not and there is no way to get to the source code if the file is called
README.

In reality what should happen is different depending on whether the file
is being rendered a README at the bottom of the directory view or not.

1. If it is rendered as a README on a directory - it should simply be
escaped and rendered as `<pre>` text.
2. If it is rendered as a file then it should be rendered as source
code.

This PR therefore does:
1. Rename IsRenderedHTML to IsPlainText
2. Readme files rendered at the bottom of the directory are rendered
without line numbers
3. Otherwise plain text files are rendered as source code.

Replace #21841

Signed-off-by: Andrew Thornton <art27@cantab.net>

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2022-12-17 22:22:25 +02:00
silverwind
0585ac3ac6
Update go dev dependencies (#22064)
`golangci-lint`
[deprecated](https://github.com/golangci/golangci-lint/issues/1841) a
bunch of linters, removed them.
2022-12-08 16:21:37 +08:00
zeripath
a08584ee36
Ensure that Chinese punctuation is not ambiguous when locale is Chinese (#22019)
Although there are per-locale fallbacks for ambiguity the locale names
for Chinese do not quite match our locales. This PR simply maps zh-CN on
to zh-hans and other zh variants on to zh-hant.

Ref #20999

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-12-04 17:57:30 +00:00
flynnnnnnnnnn
e81ccc406b
Implement FSFE REUSE for golang files (#21840)
Change all license headers to comply with REUSE specification.

Fix #16132

Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
2022-11-27 18:20:29 +00:00
zeripath
8080e23c9b
Move go-licenses to generate and separate generate into a frontend and backend component (#21061)
The `go-licenses` make task introduced in #21034 is being run on make vendor
and occasionally causes an empty go-licenses file if the vendors need to
change. This should be moved to the generate task as it is a generated file.

Now because of this change we also need to split generation into two separate 
steps:

1. `generate-backend`
2. `generate-frontend`

In the future it would probably be useful to make `generate-swagger` part of `generate-frontend` but it's not tolerated with our .drone.yml

Ref #21034

Signed-off-by: Andrew Thornton <art27@cantab.net>

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: delvh <dev.lh@web.de>
2022-09-05 14:04:18 +08:00
zeripath
bb0ff77e46
Share HTML template renderers and create a watcher framework (#20218)
The recovery, API, Web and package frameworks all create their own HTML
Renderers. This increases the memory requirements of Gitea
unnecessarily with duplicate templates being kept in memory.

Further the reloading framework in dev mode for these involves locking
and recompiling all of the templates on each load. This will potentially
hide concurrency issues and it is inefficient.

This PR stores the templates renderer in the context and stores this
context in the NormalRoutes, it then creates a fsnotify.Watcher
framework to watch files.

The watching framework is then extended to the mailer templates which
were previously not being reloaded in dev.

Then the locales are simplified to a similar structure.

Fix #20210 
Fix #20211
Fix #20217

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-08-28 10:43:25 +01:00
Jason Song
15b189b570
Avoid frequent string2bytes conversions (#20940)
Fix #20939
2022-08-24 12:50:13 +01:00
zeripath
99efa02edf
Switch Unicode Escaping to a VSCode-like system (#19990)
This PR rewrites the invisible unicode detection algorithm to more
closely match that of the Monaco editor on the system. It provides a
technique for detecting ambiguous characters and relaxes the detection
of combining marks.

Control characters are in addition detected as invisible in this
implementation whereas they are not on monaco but this is related to
font issues.

Close #19913

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-08-13 19:32:34 +01:00
luzpaz
d29d6d1991
Fix various typos (#20338)
* Fix various typos

Found via `codespell -q 3 -S ./options/locale,./options/license,./public/vendor -L actived,allways,attachements,ba,befores,commiter,pullrequest,pullrequests,readby,splitted,te,unknwon`

Co-authored-by: zeripath <art27@cantab.net>
2022-07-12 23:32:37 +02:00
Wim
cb50375e2b
Add more linters to improve code readability (#19989)
Add nakedret, unconvert, wastedassign, stylecheck and nolintlint linters to improve code readability

- nakedret - https://github.com/alexkohler/nakedret - nakedret is a Go static analysis tool to find naked returns in functions greater than a specified function length.
- unconvert - https://github.com/mdempsky/unconvert - Remove unnecessary type conversions
- wastedassign - https://github.com/sanposhiho/wastedassign -  wastedassign finds wasted assignment statements.
- notlintlint -  Reports ill-formed or insufficient nolint directives
- stylecheck - https://staticcheck.io/docs/checks/#ST - keep style consistent
  - excluded: [ST1003 - Poorly chosen identifier](https://staticcheck.io/docs/checks/#ST1003) and [ST1005 - Incorrectly formatted error string](https://staticcheck.io/docs/checks/#ST1005)
2022-06-20 12:02:49 +02:00
zeripath
bc4764ffc6
Detect truncated utf-8 characters at the end of content as still representing utf-8 (#19773)
Our character detection algorithm can potentially incorrectly detect utf-8 as iso-8859-x
if there is a truncated character at the end of the partially read file.

This PR changes the detection algorithm to truncated utf8 characters at the end of the
buffer.

Fix #19743

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-05-21 14:06:24 +01:00
Gusted
bf2867dec2
Don't treat BOM escape sequence as hidden character. (#18909)
* Don't treat BOM escape sequence as hidden character.

- BOM sequence is a common non-harmfull escape sequence, it shouldn't be
shown as hidden character.
- Follows GitHub's behavior.
- Resolves #18837

Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-02-26 16:48:23 +00:00
zeripath
4b3ebda0e7
Fix panic in EscapeReader (#18820)
There is a potential panic due to a mistaken resetting of the length parameter when
multibyte characters go over a read boundary.

Signed-off-by: Andrew Thornton <art27@cantab.net>
2022-02-19 15:25:31 +00:00
6543
54e9ee37a7
format with gofumpt (#18184)
* gofumpt -w -l .

* gofumpt -w -l -extra .

* Add linter

* manual fix

* change make fmt
2022-01-20 18:46:10 +01:00
zeripath
21ed4fd8da
Add warning for BIDI characters in page renders and in diffs (#17562)
Fix #17514

Given the comments I've adjusted this somewhat. The numbers of characters detected are increased and include things like the use of U+300 to make à instead of à and non-breaking spaces.

There is a button which can be used to escape the content to show it.

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Gwyneth Morgan <gwymor@tilde.club>
Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2022-01-07 02:18:52 +01:00
Gusted
ff2fd08228
Simplify parameter types (#18006)
Remove repeated type declarations in function definitions.
2021-12-20 04:41:31 +00:00
KN4CK3R
f99d50fc9f
Read expected buffer size (#17409)
* Read expected buffer size.

* Changed name.
2021-10-24 22:12:43 +01:00
Eng Zer Jun
f2e7d5477f
refactor: move from io/ioutil to io and os package (#17109)
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2021-09-22 13:38:34 +08:00
Lunny Xiao
9d99f6ab19
Refactor renders (#15175)
* Refactor renders

* Some performance optimization

* Fix comment

* Transform reader

* Fix csv test

* Fix test

* Fix tests

* Improve optimaziation

* Fix test

* Fix test

* Detect file encoding with reader

* Improve optimaziation

* reduce memory usage

* improve code

* fix build

* Fix test

* Fix for go1.15

* Fix render

* Fix comment

* Fix lint

* Fix test

* Don't use NormalEOF when unnecessary

* revert change on util.go

* Apply suggestions from code review

Co-authored-by: zeripath <art27@cantab.net>

* rename function

* Take NormalEOF back

Co-authored-by: zeripath <art27@cantab.net>
2021-04-19 18:25:08 -04:00
zeripath
e429c1164e
Ensure that the detected charset order is set in chardet test (#12574)
TestToUTF8WithFallback is the cause of recurrent spurious test failures
even despite code to set the detected charset order.

The reason why this happens is because the preferred detected charset order
is not being initialised for these tests.

This PR simply ensures that this is set at the start of each test and would
allow different tests to be written to allow differing orders.

Replaces #12571
Close #12571

Signed-off-by: Andrew Thornton <art27@cantab.net>
2020-08-23 14:15:29 +01:00
zeripath
a1ad188326
Fix chardet test and add ordering option (#11621)
* Fix chardet test and add ordering option

Signed-off-by: Andrew Thornton <art27@cantab.net>

* minor fixes

Signed-off-by: Andrew Thornton <art27@cantab.net>

* remove log

Signed-off-by: Andrew Thornton <art27@cantab.net>

* remove log2

Signed-off-by: Andrew Thornton <art27@cantab.net>

* only iterate through top results

Signed-off-by: Andrew Thornton <art27@cantab.net>

* Update docs/content/doc/advanced/config-cheat-sheet.en-us.md

* slight restructure of for loop

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2020-06-02 19:20:19 -03:00
Antoine GIRARD
81a52442a1 deps: update and fix chardet import (#9351) 2019-12-14 02:15:48 +02:00
guillep2k
356e1a70ea Reduce test sensibility (#8393) 2019-10-07 01:49:14 -04:00
guillep2k
2628b15ee3 Fix utf8 tests (#8192)
* Prevent compiler environment from making the tests fail

* Remove unused function

* Pass lint
2019-09-21 13:01:34 -04:00
guillep2k
6097ff68e7 Make encoding tests independent of LOCALE settings (#8018)
* Make encoding tests independent of LOCALE settings

* Fix fmt

* Force CI to restart
2019-09-02 19:08:07 -04:00
guillep2k
5a44be627c Convert files to utf-8 for indexing (#7814)
* Convert files to utf-8 for indexing

* Move utf8 functions to modules/base

* Bump repoIndexerLatestVersion to 3

* Add tests for base/encoding.go

* Changes to pass gosimple

* Move UTF8 funcs into new modules/charset package
2019-08-15 20:07:28 +08:00