Backport #27265 by @JakobDev
Part of #27065
This PR touches functions used in templates. As templates are not static
typed, errors are harder to find, but I hope I catch it all. I think
some tests from other persons do not hurt.
Co-authored-by: JakobDev <jakobdev@gmx.de>
This PR replaces all string refName as a type `git.RefName` to make the
code more maintainable.
Fix #15367
Replaces #23070
It also fixed a bug that tags are not sync because `git remote --prune
origin` will not remove local tags if remote removed.
We in fact should use `git fetch --prune --tags origin` but not `git
remote update origin` to do the sync.
Some answer from ChatGPT as ref.
> If the git fetch --prune --tags command is not working as expected,
there could be a few reasons why. Here are a few things to check:
>
>Make sure that you have the latest version of Git installed on your
system. You can check the version by running git --version in your
terminal. If you have an outdated version, try updating Git and see if
that resolves the issue.
>
>Check that your Git repository is properly configured to track the
remote repository's tags. You can check this by running git config
--get-all remote.origin.fetch and verifying that it includes
+refs/tags/*:refs/tags/*. If it does not, you can add it by running git
config --add remote.origin.fetch "+refs/tags/*:refs/tags/*".
>
>Verify that the tags you are trying to prune actually exist on the
remote repository. You can do this by running git ls-remote --tags
origin to list all the tags on the remote repository.
>
>Check if any local tags have been created that match the names of tags
on the remote repository. If so, these local tags may be preventing the
git fetch --prune --tags command from working properly. You can delete
local tags using the git tag -d command.
---------
Co-authored-by: delvh <dev.lh@web.de>
# ⚠️ Breaking
Many deprecated queue config options are removed (actually, they should
have been removed in 1.18/1.19).
If you see the fatal message when starting Gitea: "Please update your
app.ini to remove deprecated config options", please follow the error
messages to remove these options from your app.ini.
Example:
```
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
```
Many options in `[queue]` are are dropped, including:
`WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`,
`BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed
from app.ini.
# The problem
The old queue package has some legacy problems:
* complexity: I doubt few people could tell how it works.
* maintainability: Too many channels and mutex/cond are mixed together,
too many different structs/interfaces depends each other.
* stability: due to the complexity & maintainability, sometimes there
are strange bugs and difficult to debug, and some code doesn't have test
(indeed some code is difficult to test because a lot of things are mixed
together).
* general applicability: although it is called "queue", its behavior is
not a well-known queue.
* scalability: it doesn't seem easy to make it work with a cluster
without breaking its behaviors.
It came from some very old code to "avoid breaking", however, its
technical debt is too heavy now. It's a good time to introduce a better
"queue" package.
# The new queue package
It keeps using old config and concept as much as possible.
* It only contains two major kinds of concepts:
* The "base queue": channel, levelqueue, redis
* They have the same abstraction, the same interface, and they are
tested by the same testing code.
* The "WokerPoolQueue", it uses the "base queue" to provide "worker
pool" function, calls the "handler" to process the data in the base
queue.
* The new code doesn't do "PushBack"
* Think about a queue with many workers, the "PushBack" can't guarantee
the order for re-queued unhandled items, so in new code it just does
"normal push"
* The new code doesn't do "pause/resume"
* The "pause/resume" was designed to handle some handler's failure: eg:
document indexer (elasticsearch) is down
* If a queue is paused for long time, either the producers blocks or the
new items are dropped.
* The new code doesn't do such "pause/resume" trick, it's not a common
queue's behavior and it doesn't help much.
* If there are unhandled items, the "push" function just blocks for a
few seconds and then re-queue them and retry.
* The new code doesn't do "worker booster"
* Gitea's queue's handlers are light functions, the cost is only the
go-routine, so it doesn't make sense to "boost" them.
* The new code only use "max worker number" to limit the concurrent
workers.
* The new "Push" never blocks forever
* Instead of creating more and more blocking goroutines, return an error
is more friendly to the server and to the end user.
There are more details in code comments: eg: the "Flush" problem, the
strange "code.index" hanging problem, the "immediate" queue problem.
Almost ready for review.
TODO:
* [x] add some necessary comments during review
* [x] add some more tests if necessary
* [x] update documents and config options
* [x] test max worker / active worker
* [x] re-run the CI tasks to see whether any test is flaky
* [x] improve the `handleOldLengthConfiguration` to provide more
friendly messages
* [x] fine tune default config values (eg: length?)
## Code coverage:
![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
This PR follows #21535 (and replace #22592)
## Review without space diff
https://github.com/go-gitea/gitea/pull/22678/files?diff=split&w=1
## Purpose of this PR
1. Make git module command completely safe (risky user inputs won't be
passed as argument option anymore)
2. Avoid low-level mistakes like
https://github.com/go-gitea/gitea/pull/22098#discussion_r1045234918
3. Remove deprecated and dirty `CmdArgCheck` function, hide the `CmdArg`
type
4. Simplify code when using git command
## The main idea of this PR
* Move the `git.CmdArg` to the `internal` package, then no other package
except `git` could use it. Then developers could never do
`AddArguments(git.CmdArg(userInput))` any more.
* Introduce `git.ToTrustedCmdArgs`, it's for user-provided and already
trusted arguments. It's only used in a few cases, for example: use git
arguments from config file, help unit test with some arguments.
* Introduce `AddOptionValues` and `AddOptionFormat`, they make code more
clear and simple:
* Before: `AddArguments("-m").AddDynamicArguments(message)`
* After: `AddOptionValues("-m", message)`
* -
* Before: `AddArguments(git.CmdArg(fmt.Sprintf("--author='%s <%s>'",
sig.Name, sig.Email)))`
* After: `AddOptionFormat("--author='%s <%s>'", sig.Name, sig.Email)`
## FAQ
### Why these changes were not done in #21535 ?
#21535 is mainly a search&replace, it did its best to not change too
much logic.
Making the framework better needs a lot of changes, so this separate PR
is needed as the second step.
### The naming of `AddOptionXxx`
According to git's manual, the `--xxx` part is called `option`.
### How can it guarantee that `internal.CmdArg` won't be not misused?
Go's specification guarantees that. Trying to access other package's
internal package causes compilation error.
And, `golangci-lint` also denies the git/internal package. Only the
`git/command.go` can use it carefully.
### There is still a `ToTrustedCmdArgs`, will it still allow developers
to make mistakes and pass untrusted arguments?
Generally speaking, no. Because when using `ToTrustedCmdArgs`, the code
will be very complex (see the changes for examples). Then developers and
reviewers can know that something might be unreasonable.
### Why there was a `CmdArgCheck` and why it's removed?
At the moment of #21535, to reduce unnecessary changes, `CmdArgCheck`
was introduced as a hacky patch. Now, almost all code could be written
as `cmd := NewCommand(); cmd.AddXxx(...)`, then there is no need for
`CmdArgCheck` anymore.
### Why many codes for `signArg == ""` is deleted?
Because in the old code, `signArg` could never be empty string, it's
either `-S[key-id]` or `--no-gpg-sign`. So the `signArg == ""` is just
dead code.
---------
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Change all license headers to comply with REUSE specification.
Fix #16132
Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
This PR adds a context parameter to a bunch of methods. Some helper
`xxxCtx()` methods got replaced with the normal name now.
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
- Add a new push mirror to specific repository
- Sync now ( send all the changes to the configured push mirrors )
- Get list of all push mirrors of a repository
- Get a push mirror by ID
- Delete push mirror by ID
Signed-off-by: Mohamed Sekour <mohamed.sekour@exfo.com>
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: zeripath <art27@cantab.net>
Support synchronizing with the push mirrors whenever new commits are pushed or synced from pull mirror.
Related Issues: #18220
Co-authored-by: delvh <dev.lh@web.de>
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
- Always give a best-effort to fetching the repositories, if even that
fails indeed give a disconnected mirror found error.
- *Partially* resolves #19928
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
* Move some repository related code into sub package
* Move more repository functions out of models
* Fix lint
* Some performance optimization for webhooks and others
* some refactors
* Fix lint
* Fix
* Update modules/repository/delete.go
Co-authored-by: delvh <dev.lh@web.de>
* Fix test
* Merge
* Fix test
* Fix test
* Fix test
* Fix test
Co-authored-by: delvh <dev.lh@web.de>
A `repo_model.Mirror` repository field (`.Repo`) will not automatically
be set, but is used without checking in mirror_pull.go:UpdateAddress.
This will cause an NPE.
This PR changes UpdateAddress to use the helper function GetRepository()
helping prevent future NPEs but also changes modules/context/repo.go to
ensure that the Mirror.Repo is set.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Follows #19266, #8553, Close #18553, now there are only three `Run..(&RunOpts{})` functions.
* before: `stdout, err := RunInDir(path)`
* now: `stdout, _, err := RunStdString(&git.RunOpts{Dir:path})`
Strangely #19038 appears to relate to an issue whereby a tag appears to
be listed in `git show-ref --tags` but then does not appear when `git
show-ref --tags -- short_name` is called.
As a solution though I propose to stop the second call as it is
unnecessary and only likely to cause problems.
I've also noticed that the tags calls are wildly inefficient and aren't using the common cat-files - so these have been added.
I've also noticed that the git commit-graph is not being written on mirroring - so I've also added writing this to the migration which should improve mirror rendering somewhat.
Fix #19038
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
* Touch mirrors on even on fail to update
If a mirror fails to be synchronised it should be pushed to the bottom of the queue
of the awaiting mirrors to be synchronised. At present if there LIMIT number of
broken mirrors they can effectively prevent all other mirrors from being synchronized
as their last_updated time will remain earlier than other mirrors.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Remove `db.DefaultContext` usage in routers, use `ctx` directly
* Use `ctx` directly if there is one, remove some `db.DefaultContext` in `services`
* Use ctx instead of db.DefaultContext for `cmd` and some `modules` packages
* fix incorrect context usage
Yet another issue has come up where the logging from SyncMirrors does not provide
enough context. This PR adds more context to these logging events.
Related #19038
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Use MirrorID instead of RepoID
- Use the MirrorID as index(SQL uses `id` column not the `repo_id`).
Passes the Mirror ID's into the Sync functions.
* Check for MirrorID == 0
* Fix `StartToMirror` + refactor
* Update services/mirror/mirror.go
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
* Restart zero worker if there is still work to do
It is possible for the zero worker to timeout before all the work is finished.
This may mean that work may take a long time to complete because a worker will only
be induced on repushing.
Also ensure that requested count is reset after pulls and push mirror sync requests and add some more trace logging to the queue push.
Fix #18607
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Start adding mechanism to return unhandled data
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Create pushback interface
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add Pausable interface to WorkerPool and Manager
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Implement Pausable and PushBack for the bytefifos
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Implement Pausable and Pushback for ChannelQueues and ChannelUniqueQueues
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Wire in UI for pausing
Signed-off-by: Andrew Thornton <art27@cantab.net>
* add testcases and fix a few issues
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix build
Signed-off-by: Andrew Thornton <art27@cantab.net>
* prevent "race" in the test
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix jsoniter mismerge
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix conflicts
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix format
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Add warnings for no worker configurations and prevent data-loss with redis/levelqueue
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Use StopTimer
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lauris BH <lauris@nix.lv>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
This PR continues the work in #17125 by progressively ensuring that git
commands run within the request context.
This now means that the if there is a git repo already open in the context it will be used instead of reopening it.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Some refactors related repository model
* Move more methods out of repository
* Move repository into models/repo
* Fix test
* Fix test
* some improvements
* Remove unnecessary function
This PR registers requests with the process manager and manages hierarchy within the processes.
Git repos are then associated with a context, (usually the request's context) - with sub commands using this context as their base context.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Use hostmacher to replace matchlist.
And we introduce a better DialContext to do a full host/IP check, otherwise the attackers can still bypass the allow/block list by a 302 redirection.
* Handle broken references in mirror sync
If there are broken references during a mirror attempt to fix using `git remote prune`.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Convert the old mirror syncing queue to the more modern queue format.
Fix a bug in the from the repo-archive queue PR - the assumption was made that uniqueness could be enforced with by checking equality in a map in channel unique queues - however this only works for primitive types - which was the initial intention but is an imperfect. This is fixed by marshalling the data and placing the martialled data in the unique map instead.
The documentation is also updated to add information about the deprecated configuration values.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* DBContext is just a Context
This PR removes some of the specialness from the DBContext and makes it context
This allows us to simplify the GetEngine code to wrap around any context in future
and means that we can change our loadRepo(e Engine) functions to simply take contexts.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix unit tests
Signed-off-by: Andrew Thornton <art27@cantab.net>
* another place that needs to set the initial context
Signed-off-by: Andrew Thornton <art27@cantab.net>
* avoid race
Signed-off-by: Andrew Thornton <art27@cantab.net>
* change attachment error
Signed-off-by: Andrew Thornton <art27@cantab.net>