Commit graph

1290 commits

Author SHA1 Message Date
Mark Felder
8f285a787f Refactor backups to be fully controlled by Oban 2024-07-23 10:30:40 -04:00
Mark Felder
80e16de3bd Increase slow job queue parallelization 2024-07-15 12:00:58 -04:00
Mark Felder
52b6dd8bff Increase background job concurrency to 20 2024-07-15 11:45:13 -04:00
Mark Felder
d790df73f6 Remove the unused ingestion queue 2024-07-12 10:12:18 -04:00
Mark Felder
b135fa35a1 RichMedia: test that activity is streamed out 2024-06-24 09:47:16 -04:00
Mark Felder
655ac98478 Merge remote-tracking branch 'origin/develop' into fix/debug-logs 2024-06-20 09:00:39 -04:00
Mark Felder
e1e099d3bf Set console logs to :info for Elixir 1.15+ 2024-06-19 23:26:12 -04:00
Mark Felder
17d04ccc8b RichMedia backfill processing through Oban 2024-06-19 23:20:22 -04:00
Mark Felder
85b81cc933 Remove Logger from ConfigDB descriptions 2024-06-19 10:32:15 -04:00
Mark Felder
e628d00a81 Disable Ecto logging in tests
The debug logs are very noisy and can be enabled during analysis of a specific error believed to be SQL-related
2024-06-18 15:25:18 +00:00
Mark Felder
a734efeff8 Formatting 2024-06-12 15:21:43 -04:00
Lain Soykaf
41434ffcec Tests: Don't spawn processes in tests. 2024-06-12 15:20:04 -04:00
Haelwenn (lanodan) Monnier
c389ea0f42 Fix compatibility with Loggers in Elixir 1.15+ 2024-06-12 15:18:47 -04:00
Mark Felder
cfc8d7aade IPFS uploader: dialyzer fixes
lib/pleroma/uploaders/ipfs.ex:43:no_return
Function put_file/1 has no local return.
________________________________________________________________________________
lib/pleroma/uploaders/ipfs.ex:49:call
The function call will not succeed.

Pleroma.HTTP.post(
  binary(),
  _mp :: %Tesla.Multipart{
    :boundary => binary(),
    :content_type_params => [binary()],
    :parts => [
      %Tesla.Multipart.Part{
        :body => binary(),
        :dispositions => [any()],
        :headers => [any()]
      },
      ...
    ]
  },
  [],
  [{:params, [{:"cid-version", <<49>>}]}]
)

will never return since the success typing is:
(binary(), binary(), [{binary(), binary()}], Keyword.t()) ::
  {:error, _}
  | {:ok,
     %Tesla.Env{
       :__client__ => %Tesla.Client{
         :adapter => nil | {_, _} | {_, _, _},
         :fun => _,
         :post => [any()],
         :pre => [any()]
       },
       :__module__ => atom(),
       :body => _,
       :headers => [{_, _}],
       :method => :delete | :get | :head | :options | :patch | :post | :put | :trace,
       :opts => [{_, _}],
       :query => [{_, _}],
       :status => nil | integer(),
       :url => binary()
     }}

and the contract is
(Pleroma.HTTP.Request.url(), String.t(), Pleroma.HTTP.Request.headers(), :elixir.keyword()) ::
  {:ok, Tesla.Env.t()} | {:error, any()}
2024-05-30 15:14:27 -04:00
Lain Soykaf
f5978da676 HTTPSignaturePlugTest: Rewrite to use mox. 2024-05-28 14:00:25 +04:00
Lain Soykaf
3b4be5daa2 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into pleroma-secure-mode 2024-05-28 12:31:12 +04:00
lain
25903a4996 Merge branch 'auth-fetch-exception' into 'develop'
HTTPSignaturePlug: Add :authorized_fetch_mode_exceptions

See merge request pleroma/pleroma!4007
2024-05-28 04:42:35 +00:00
lain
8ff0c32903 Merge branch 'httpfixes' into 'develop'
Some HTTP and connection pool improvements

See merge request pleroma/pleroma!4124
2024-05-28 04:38:01 +00:00
Lain Soykaf
687ac4a850 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into auth-fetch-exception 2024-05-27 23:09:17 +04:00
feld
38db406ce4 Merge branch 'simpler-oban-queues' into 'develop'
Oban queue simplification

See merge request pleroma/pleroma!4123
2024-05-27 19:02:53 +00:00
lain
121791882f Merge branch 'explicitly-allow-unsafe-2' into 'develop'
Explicitly allow unsafe 2

See merge request pleroma/pleroma!4125
2024-05-27 18:43:05 +00:00
lain
3316a7ab70 Merge branch 'qdrant-search-2' into 'develop'
Search: Basic Qdrant/Ollama search

See merge request pleroma/pleroma!4109
2024-05-27 18:41:20 +00:00
Lain Soykaf
81e44ced0c HTTPSecurityPlug: Fix tests 2024-05-27 22:13:20 +04:00
Mark Felder
ba511a30b9 RichMedia use of ConcurrentLimiter was removed in the refactor 2024-05-27 14:12:38 -04:00
Mark Felder
6b8c15a4a1 Remove MediaProxyWarmingPolicy config for ConcurrentLimiter as we are not using it 2024-05-27 14:11:42 -04:00
feld
42150d5581 Merge branch 'logger-metadata' into 'develop'
Logger metadata

See merge request pleroma/pleroma!3990
2024-05-27 17:53:33 +00:00
Mark Felder
0847d9ebaf Oban queue simplification 2024-05-27 13:48:17 -04:00
Lain Soykaf
1c699144d2 HttpSecurityPlug: Don't allow unsafe-eval by default 2024-05-27 21:26:40 +04:00
feld
10b7efa98c Merge branch 'anti-mention-spam-mrf' into 'develop'
Anti-mention Spam MRF

See merge request pleroma/pleroma!4072
2024-05-27 16:46:31 +00:00
Mark Felder
cab6372d7a Make user age limit configurable
Switch to milliseconds for consistency with other configuration options in codebase
2024-05-27 12:31:29 -04:00
Mark Felder
0bddca361d DNSRBL in an MRF 2024-05-27 12:23:36 -04:00
Lain Soykaf
d3e85da0fd Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into auth-fetch-exception 2024-05-27 19:27:02 +04:00
lain
e93ae96e13 Merge branch 'nsfw-api-mrf' into 'develop'
NSFW API Policy

See merge request pleroma/pleroma!3471
2024-05-27 15:20:43 +00:00
Mark Felder
6708f154a4 Rework Gun connection pool sizes to make better use of the default 250 connections 2024-05-27 11:18:58 -04:00
Mark Felder
a50c657427 Add a dedicated connection pool for Rich Media
Sharing this pool with regular Media is problematic as Rich Media will connect to many different
domains and thrash the pool, but regular Media will have predictable connections to the webservers
hosting media for the fediverse servers you peer with.
2024-05-27 11:17:02 -04:00
Lain Soykaf
4325b1aec3 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into nsfw-api-mrf 2024-05-27 17:49:31 +04:00
Lain Soykaf
3055c1598b IPFSTest: Fix configuration mocking 2024-05-27 17:22:18 +04:00
Lain Soykaf
825b4122a5 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into pleroma-ipfs_uploader 2024-05-27 16:23:40 +04:00
Lain Soykaf
f4c04e6b2d QdrantSearch: Add health checks. 2024-05-27 14:21:55 +04:00
Lain Soykaf
08e9d995f8 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into qdrant-search-2 2024-05-27 13:50:22 +04:00
Mark Felder
61a3b79316 Search backend healthcheck process 2024-05-25 16:07:47 -04:00
Lain Soykaf
c67506ba68 Merge branch 'develop' of git.pleroma.social:pleroma/pleroma into auth-fetch-exception 2024-05-20 18:21:46 +04:00
Lain Soykaf
c139a9f38c B Config: Set default Qdrant embedder to our fastembed-api server 2024-05-19 12:39:54 +04:00
Lain Soykaf
72ec261a69 B QdrantSearch: Switch to OpenAI api 2024-05-19 12:17:46 +04:00
Lain Soykaf
cd7e2138d1 Search: Basic Qdrant/Ollama search 2024-05-14 14:13:37 +04:00
Mark Felder
d21aa1a77c Respect the TTL returned in OpenGraph tags 2024-05-07 19:54:56 -04:00
Mark Felder
df0734fcbf Increase the :max_body for Rich Media to 5MB
Websites are increasingly getting more bloated with tricks like inlining content (e.g., CNN.com) which puts pages at or above 5MB. This value may still be too low.
2024-05-07 19:54:56 -04:00
Mark Felder
ede414094f RichMedia refactor
Rich Media parsing was previously handled on-demand with a 2 second HTTP request timeout and retained only in Cachex. Every time a Pleroma instance is restarted it will have to request and parse the data for each status with a URL detected. When fetching a batch of statuses they were processed in parallel to attempt to keep the maximum latency at 2 seconds, but often resulted in a timeline appearing to hang during loading due to a URL that could not be successfully reached. URLs which had images links that expire (Amazon AWS) were parsed and inserted with a TTL to ensure the image link would not break.

Rich Media data is now cached in the database and fetched asynchronously. Cachex is used as a read-through cache. When the data becomes available we stream an update to the clients. If the result is returned quickly the experience is almost seamless. Activities were already processed for their Rich Media data during ingestion to warm the cache, so users should not normally encounter the asynchronous loading of the Rich Media data.

Implementation notes:

- The async worker is a Task with a globally unique process name to prevent duplicate processing of the same URL
- The Task will attempt to fetch the data 3 times with increasing sleep time between attempts
- The HTTP request obeys the default HTTP request timeout value instead of 2 seconds
- URLs that cannot be successfully parsed due to an unexpected error receives a negative cache entry for 15 minutes
- URLs that fail with an expected error will receive a negative cache with no TTL
- Activities that have no detected URLs insert a nil value in the Cachex :scrubber_cache so we do not repeat parsing the object content with Floki every time the activity is rendered
- Expiring image URLs are handled with an Oban job
- There is no automatic cleanup of the Rich Media data in the database, but it is safe to delete at any time
- The post draft/preview feature makes the URL processing synchronous so the rendered post preview will have an accurate rendering

Overall performance of timelines and creating new posts which contain URLs is greatly improved.
2024-05-07 19:54:56 -04:00
637f5bc431 Fix type in description
Signed-off-by: marcin mikołajczak <git@mkljczk.pl>
2024-04-27 20:29:23 +02:00
Mark Felder
462d5aa5cb logger: remove request_id metadata which is not useful 2024-03-19 20:53:40 -04:00