1
0
Fork 0
mirror of https://codeberg.org/forgejo/forgejo.git synced 2024-12-04 10:30:19 -05:00
Commit graph

19 commits

Author SHA1 Message Date
Chl
544cbc6f01 Optimization of labels handling in issue_search (#4228)
This PR optimizes the SQL query and de-duplicate the labels' ids when generating the query string, on the issue page.

<hr/>

### Background

Some time ago, BingBot and some other crawlers have been putting my instance on its knees with requests containing a lot of label ids, like this one :

```
[07/Aug/2023:11:28:37 +0200] "GET /Dolibarr/sendrecurringinvoicebymail/issues?q=&type=all&sort=&state=closed&labels=1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c2%2c10%2c2%2c1%2c1%2c10%2c10%2c7%2c6%2c10%2c10%2c3%2c2%2c1%2c5%2c10%2c1%2c6%2c2%2c7%2c3%2c7%2c6%2c10%2c1%2c10%2c1%2c1%2c7%2c7%2c1%2c1%2c1%2c1%2c10%2c10%2c1%2c2%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c1%2c2%2c1%2c12%2c6%2c6%2c10&milestone=0&project=-1&poster=0 HTTP/1.1" 499 0 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/103.0.5060.134 Safari/537.36"
```

Since each of the label ids implies a join, it grows exponentially expensive for the database engine (at least on PostgreSQL but SQLite suffers a little too).

Thus, this PR proposes two enhancements:

* rewrite the database query to use only one squashed condition,
* deduplicate the label ids when generating the URL.

### Performance comparison

Here are some timings on Postgresql-backed, Forgejo 7.0.4 instances :
```sh
$ time curl -s -o /dev/null "http://localhost:3000/toto/tata/issues?q=&type=all&sort=&labels=19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25&state=open&milestone=0&project=0&assignee=0&poster=0"

real    0m10,491s
user    0m0,017s
sys     0m0,008s
```
...and with the patch:
```sh
$ time curl -s -o /dev/null "http://localhost:3000/toto/tata/issues?q=&type=all&sort=&labels=19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25%2c19%2c25&state=open&milestone=0&project=0&assignee=0&poster=0"

real    0m0,094s
user    0m0,012s
sys     0m0,013s
```

### Annex

This issue was originally proposed to [Gitea](https://github.com/go-gitea/gitea/pull/26460) but didn't get much attention, and I switched to Forgejo in the meantime :)

Reviewed-on: https://codeberg.org/forgejo/forgejo/pulls/4228
Reviewed-by: Earl Warren <earl-warren@noreply.codeberg.org>
Co-authored-by: Chl <chl@xlii.si>
Co-committed-by: Chl <chl@xlii.si>
2024-06-28 05:11:57 +00:00
silverwind
d8bc0495de
Enable unparam linter (#31277)
Enable [unparam](https://github.com/mvdan/unparam) linter.

Often I could not tell the intention why param is unused, so I put
`//nolint` for those cases like webhook request creation functions never
using `ctx`.

---------

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: delvh <dev.lh@web.de>
(cherry picked from commit fc2d75f86d77b022ece848acf2581c14ef21d43b)

Conflicts:
	modules/setting/config_env.go
	modules/storage/azureblob.go
	services/webhook/dingtalk.go
	services/webhook/discord.go
	services/webhook/feishu.go
	services/webhook/matrix.go
	services/webhook/msteams.go
	services/webhook/packagist.go
	services/webhook/slack.go
	services/webhook/telegram.go
	services/webhook/wechatwork.go

	run make lint-go and fix Forgejo specific warnings
2024-06-16 13:42:58 +02:00
Lunny Xiao
a7591f9738
Rename project board -> column to make the UI less confusing (#30170)
This PR split the `Board` into two parts. One is the struct has been
renamed to `Column` and the second we have a `Template Type`.

But to make it easier to review, this PR will not change the database
schemas, they are just renames. The database schema changes could be in
future PRs.

---------

Co-authored-by: silverwind <me@silverwind.io>
Co-authored-by: yp05327 <576951401@qq.com>
(cherry picked from commit 98751108b11dc748cc99230ca0fc1acfdf2c8929)

Conflicts:
	docs/content/administration/config-cheat-sheet.en-us.md
	docs/content/index.en-us.md
	docs/content/installation/comparison.en-us.md
	docs/content/usage/permissions.en-us.md
	non existent files

	options/locale/locale_en-US.ini
	routers/web/web.go
	templates/repo/header.tmpl
	templates/repo/settings/options.tmpl
	trivial context conflicts
2024-06-02 09:42:39 +02:00
Lunny Xiao
9ca245ad96
Use db.ListOptions directly instead of Paginator interface to make it easier to use and fix performance of /pulls and /issues (#29990)
This PR uses `db.ListOptions` instead of `Paginor` to make the code
simpler.
And it also fixed the performance problem when viewing /pulls or
/issues. Before the counting in fact will also do the search.

---------

Co-authored-by: Jason Song <i@wolfogre.com>
Co-authored-by: silverwind <me@silverwind.io>
(cherry picked from commit 3f26fe2fa2c7141c9e622297e50a70f3e0003e4d)
2024-03-30 07:17:29 +01:00
pengqiseven
e825d007b1
remove repetitive words (#29695)
Signed-off-by: pengqiseven <912170095@qq.com>
(cherry picked from commit 7f856d5d742dcb6febdb8a3f22cd9a8fecc69a4d)
2024-03-20 08:46:28 +01:00
6543
e2371743d5
remove util.OptionalBool and related functions (#29513)
and migrate affected code

_last refactoring bits to replace **util.OptionalBool** with
**optional.Option[bool]**_

(cherry picked from commit a3f05d0d98408bb47333b19f505b21afcefa9e7c)

Conflicts:
	services/repository/branch.go
	trivial context conflict
2024-03-06 12:10:46 +08:00
Nanguan Lin
6a725b6f9c
Remove deadcode under models/issues (#28536)
Using the Go Official tool `golang.org/x/tools/cmd/deadcode@latest`
mentioned by [go blog](https://go.dev/blog/deadcode).
Just use `deadcode .` in the project root folder and it gives a list of
unused functions. Though it has some false alarms.
This PR removes dead code detected in `models/issues`.
2023-12-19 20:12:02 +01:00
Jason Song
beb71f5ef6
Include public repos in doer's dashboard for issue search (#28304)
It will fix #28268 .

<img width="1313" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/cb1e07d5-7a12-4691-a054-8278ba255bfc">

<img width="1318" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/4fd60820-97f1-4c2c-a233-d3671a5039e9">

## ⚠️ BREAKING ⚠️

But need to give up some features:

<img width="1312" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/281c0d51-0e7d-473f-bbed-216e2f645610">

However, such abandonment may fix #28055 .

## Backgroud

When the user switches the dashboard context to an org, it means they
want to search issues in the repos that belong to the org. However, when
they switch to themselves, it means all repos they can access because
they may have created an issue in a public repo that they don't own.

<img width="286" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/182dcd5b-1c20-4725-93af-96e8dfae5b97">

It's a confusing design. Think about this: What does "In your
repositories" mean when the user switches to an org? Repos belong to the
user or the org?

Whatever, it has been broken by #26012 and its following PRs. After the
PR, it searches for issues in repos that the dashboard context user owns
or has been explicitly granted access to, so it causes #28268.

## How to fix it

It's not really difficult to fix it. Just extend the repo scope to
search issues when the dashboard context user is the doer. Since the
user may create issues or be mentioned in any public repo, we can just
set `AllPublic` to true, which is already supported by indexers. The DB
condition will also support it in this PR.

But the real difficulty is how to count the search results grouped by
repos. It's something like "search issues with this keyword and those
filters, and return the total number and the top results. **Then, group
all of them by repo and return the counts of each group.**"

<img width="314" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/5206eb20-f8f5-49b9-b45a-1be2fcf679f4">

Before #26012, it was being done in the DB, but it caused the results to
be incomplete (see the description of #26012).

And to keep this, #26012 implement it in an inefficient way, just count
the issues by repo one by one, so it cannot work when `AllPublic` is
true because it's almost impossible to do this for all public repos.


1bfcdeef4c/modules/indexer/issues/indexer.go (L318-L338)

## Give up unnecessary features

We may can resovle `TODO: use "group by" of the indexer engines to
implement it`, I'm sure it can be done with Elasticsearch, but IIRC,
Bleve and Meilisearch don't support "group by".

And the real question is, does it worth it? Why should we need to know
the counts grouped by repos?

Let me show you my search dashboard on gitea.com.

<img width="1304" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/2bca2d46-6c71-4de1-94cb-0c9af27c62ff">

I never think the long repo list helps anything.

And if we agree to abandon it, things will be much easier. That is this
PR.

## TODO

I know it's important to filter by repos when searching issues. However,
it shouldn't be the way we have it now. It could be implemented like
this.

<img width="1316" alt="image"
src="https://github.com/go-gitea/gitea/assets/9418365/99ee5f21-cbb5-4dfe-914d-cb796cb79fbe">

The indexers support it well now, but it requires some frontend work,
which I'm not good at. So, I think someone could help do that in another
PR and merge this one to fix the bug first.

Or please block this PR and help to complete it.

Finally, "Switch dashboard context" is also a design that needs
improvement. In my opinion, it can be accomplished by adding filtering
conditions instead of "switching".
2023-12-07 13:26:18 +08:00
Nanguan Lin
1eae2aadae
Fix issue not showing on default board and add test (#27720)
See https://github.com/go-gitea/gitea/pull/27718#issuecomment-1773743014
. Add a test to ensure its behavior.
Why this test uses `ProjectBoardID=0`? Because in `SearchOptions`,
`ProjectBoardID=0` means what it is. But in `IssueOptions`,
`ProjectBoardID=0` means there is no condition, and
`ProjectBoardID=db.NoConditionID` means the board ID = 0.
It's really confusing. Probably it's better to separate the db search
engine and the other issue search code. It's really two different
systems. As far as I can see, `IssueOptions` is not necessary for most
of the code, which has very simple issue search conditions.
2023-10-25 11:51:49 +00:00
Nanguan Lin
eb1478791f
Clean some functions about project issue (#27705)
1. remove unused function `MoveIssueAcrossProjectBoards`
2. extract the project board condition into a function
3. use db.NoCondition instead of -1. (BTW, the usage of db.NoCondition
is too confusing. Is there any way to avoid that?)
4. remove the unnecessary comment since the ctx refactor is completed.
5. Change `b.ID != 0` to `b.ID > 0`. It's more intuitive but I think
they're the same since board ID can't be negative.
2023-10-20 14:01:25 +02:00
JakobDev
ebe803e514
Penultimate round of db.DefaultContext refactor (#27414)
Part of #27065

---------

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2023-10-11 04:24:07 +00:00
Nanguan Lin
2f8e1604f8
Fix review request number and add more tests (#27104)
fix #27019 
## testfixture yml
1. add issue20(a pr issue) in repo 23, org 17
2. add user15 to team 9
3. add four reviews about issue20
## test case
add two tests that are described with code comments
the code before pr #26784 failed the first test
<img width="479" alt="image"
src="https://github.com/go-gitea/gitea/assets/70063547/1d9b5787-11b4-4c4d-931f-6a9869547f35">
current code failed the second test(as mentioned in #27019)
<img width="484" alt="image"
src="https://github.com/go-gitea/gitea/assets/70063547/05608055-7587-43d1-bae1-92c688270819">
Any advice is appreciated.

---------

Co-authored-by: CaiCandong <50507092+CaiCandong@users.noreply.github.com>
Co-authored-by: Giteabot <teabot@gitea.io>
2023-09-21 13:59:50 +02:00
Nanguan Lin
f1fe102c8c
Fix wrong review requested number (#26784)
Fix the wrong review requested number mentioned by #18808 .
Fix #18808 
Before:

![ksnip_20230829-140750](https://github.com/go-gitea/gitea/assets/70063547/0af2055b-6f16-4699-a944-c7186831d7f9)
After:

![ksnip_20230829-141817](https://github.com/go-gitea/gitea/assets/70063547/16633264-20ba-45e3-bfbb-a495ed76a45b)
2023-09-03 02:12:38 +00:00
CaiCandong
0e74fc4a84
Fix project filter bugs (#26490)
related: #26012

### Bugs
1. missing project filter on the issue page.

1e76a824bc/modules/indexer/issues/dboptions.go (L11-L15)
3. incorrect SQL condition: some issue does not belong to a project but
exists on the project_issue table.

f5dbac9d36/models/issues/issue_search.go (L233)

### Before:

![before](https://github.com/go-gitea/gitea/assets/50507092/1dcde39e-3e2f-4151-b2c6-4d67bf493c2f)

### After:

![after](https://github.com/go-gitea/gitea/assets/50507092/badfb81f-056d-4a2f-9838-1cba9c15768d)

---------

Co-authored-by: Giteabot <teabot@gitea.io>
2023-08-15 14:50:12 +00:00
Lunny Xiao
f5dbac9d36
Use more IssueList instead of []*Issue (#26369) 2023-08-07 19:26:40 +00:00
Jason Song
1e76a824bc
Refactor and enhance issue indexer to support both searching, filtering and paging (#26012)
Fix #24662.

Replace #24822 and #25708 (although it has been merged)


## Background

In the past, Gitea supported issue searching with a keyword and
conditions in a less efficient way. It worked by searching for issues
with the keyword and obtaining limited IDs (as it is heavy to get all)
on the indexer (bleve/elasticsearch/meilisearch), and then querying with
conditions on the database to find a subset of the found IDs. This is
why the results could be incomplete.

To solve this issue, we need to store all fields that could be used as
conditions in the indexer and support both keyword and additional
conditions when searching with the indexer.

## Major changes

- Redefine `IndexerData` to include all fields that could be used as
filter conditions.
- Refactor `Search(ctx context.Context, kw string, repoIDs []int64,
limit, start int, state string)` to `Search(ctx context.Context, options
*SearchOptions)`, so it supports more conditions now.
- Change the data type stored in `issueIndexerQueue`. Use
`IndexerMetadata` instead of `IndexerData` in case the data has been
updated while it is in the queue. This also reduces the storage size of
the queue.
- Enhance searching with Bleve/Elasticsearch/Meilisearch, make them
fully support `SearchOptions`. Also, update the data versions.
- Keep most logic of database indexer, but remove
`issues.SearchIssueIDsByKeyword` in `models` to avoid confusion where is
the entry point to search issues.
- Start a Meilisearch instance to test it in unit tests.
- Add unit tests with almost full coverage to test
Bleve/Elasticsearch/Meilisearch indexer.

---------

Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
2023-07-31 06:28:53 +00:00
Lunny Xiao
b167f35113
Add context parameter to some database functions (#26055)
To avoid deadlock problem, almost database related functions should be
have ctx as the first parameter.
This PR do a refactor for some of these functions.
2023-07-22 22:14:27 +08:00
Lunny Xiao
38cf43d060
Some refactors for issues stats (#24793)
This PR

- [x] Move some functions from `issues.go` to `issue_stats.go` and
`issue_label.go`
- [x] Remove duplicated issue options `UserIssueStatsOption` to keep
only one `IssuesOptions`
2023-05-19 22:17:48 +08:00
Lunny Xiao
09ab64dfad
Remove duplicated issues options and some more refactors (#24787)
This PR 

- [x] Move some code from `issue.go` to `issue_search.go` and
`issue_update.go`
- [x] Use `IssuesOptions` instead of `IssueStatsOptions` becuase they
are too similiar.
- [x] Rename some functions
2023-05-18 10:45:25 +00:00