fix(pgvector): add chunking to prevent long list of args in queries by kyteinsky · Pull Request #290 · nextcloud/context_chat_backend

kyteinsky · 2026-03-20T13:29:29Z

The error log for the non-chunked deletion query:

{"timestamp": "2026-03-20T11:45:29.417326+00:00", "level": "ERROR", "logger": "ccb.vectordb", "message": "Error deleting chunks, rolling back documents store deletion for source ids", "filename": "pgvector.py", "function": "delete_source_ids", "line": 438, "thread_name": "AnyIO worker thread", "pid": 2148, "version": "5.3.0", "exc_info": "Traceback (most recent call last):\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/base.py\", line 1967, in _exec_single_context\n    self.dialect.do_execute(\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/default.py\", line 952, in do_execute\n    cursor.execute(statement, parameters)\n  File \"/usr/local/lib/python3.11/dist-packages/psycopg/cursor.py\", line 117, in execute\n    raise ex.with_traceback(None)\npsycopg.OperationalError: sending query and params failed: number of parameters must be between 0 and 65535\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File \"/app/context_chat_backend/vectordb/pgvector.py\", line 435, in delete_source_ids\n    session.execute(stmt_chunks)\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/orm/session.py\", line 2351, in execute\n    return self._execute_internal(\n           ^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/orm/session.py\", line 2249, in _execute_internal\n    result: Result[Any] = compile_state_cls.orm_execute_statement(\n                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/orm/bulk_persistence.py\", line 2033, in orm_execute_statement\n    return super().orm_execute_statement(\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/orm/context.py\", line 306, in orm_execute_statement\n    result = conn.execute(\n             ^^^^^^^^^^^^^\n  File 
\"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/base.py\", line 1419, in execute\n    return meth(\n           ^^^^^\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/sql/elements.py\", line 527, in _execute_on_connection\n    return connection._execute_clauseelement(\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/base.py\", line 1641, in _execute_clauseelement\n    ret = self._execute_context(\n          ^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/base.py\", line 1846, in _execute_context\n    return self._exec_single_context(\n           ^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/base.py\", line 1986, in _exec_single_context\n    self._handle_dbapi_exception(\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/base.py\", line 2363, in _handle_dbapi_exception\n    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/base.py\", line 1967, in _exec_single_context\n    self.dialect.do_execute(\n  File \"/usr/local/lib/python3.11/dist-packages/sqlalchemy/engine/default.py\", line 952, in do_execute\n    cursor.execute(statement, parameters)\n  File \"/usr/local/lib/python3.11/dist-packages/psycopg/cursor.py\", line 117, in execute\n    raise ex.with_traceback(None)\nsqlalchemy.exc.OperationalError: (psycopg.OperationalError) sending query and params failed: number of parameters must be between 0 and 65535\n[SQL: DELETE FROM langchain_pg_embedding WHERE langchain_pg_embedding.collection_id = %(collection_id_1)s::UUID AND langchain_pg_embedding.id IN (%(id_1_1)s::VARCHAR, %(id_1_2)s::VARCHAR, %(id_1_3)s::VARCHAR, %(id_1_4)s::VARCHAR, %(id_1_5)s::VARCHAR, %(id_1_6)s::VARCHAR, %(id_1_7)s::VARCHAR, %(id_1_8)s::VARCHAR, %(id_1_9)s::VARCHAR, %(id_1_10)s::VARCHAR, %(id_1_11)s::VARCHAR, 
%(id_1_12)s::VARCHAR, %(id_1_13)s::VARCHAR, %(id_1_14)s::VARCHAR, ...

sqlalchemy.exc.OperationalError: (psycopg.OperationalError) sending query and params failed: number of parameters must be between 0 and 65535
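The log above shows a single `DELETE ... WHERE id IN (...)` statement exceeding the 65535 bind-parameter limit. The general idea of chunking such a query can be sketched as follows. This is only an illustration and not the code from this PR: it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server, and the table name `embedding`, the helpers `chunked` and `delete_chunks`, and the `BATCH_SIZE` value are all assumptions made for the example.

```python
import sqlite3
from collections.abc import Iterator, Sequence

# Hypothetical batch size: far below the 65535-parameter limit from the log,
# and small enough for the SQLite stand-in used here. Not the PR's value.
BATCH_SIZE = 500


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_chunks(
    conn: sqlite3.Connection,
    collection_id: str,
    ids: Sequence[str],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Delete rows by id in batches and return the number of rows deleted."""
    deleted = 0
    # One transaction for all batches: commit on success, roll back on error.
    with conn:
        for batch in chunked(ids, batch_size):
            # Each statement binds len(batch) + 1 parameters, never more.
            placeholders = ", ".join("?" * len(batch))
            cur = conn.execute(
                "DELETE FROM embedding "
                f"WHERE collection_id = ? AND id IN ({placeholders})",
                (collection_id, *batch),
            )
            deleted += cur.rowcount
    return deleted
```

Running every batch inside one transaction keeps the all-or-nothing behaviour that the rollback message in the log suggests; whether the actual fix does the same is not stated on this page.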

Signed-off-by: Anupam Kumar <kyteinsky@gmail.com>


## 5.4.0-beta0 - 2026-05-26

### Added
- add network embedding batching (#276) @fcharlaix-opendsi
- add kubernetes support and reverse content/indexing flow (#284) @kyteinsky @marcelklehr
- add gh workflows for docker builds and do separate cpu, cuda and rocm (vulkan) images (#295) @kyteinsky

### Changed
- update readme according to the latest changes (#300) @kyteinsky
- bump llama_cpp_python to 0.3.23 (#301) @kyteinsky

### Fixed
- improve loadSources error handling (#288) @kyteinsky
- fix(pgvector): add chunking to prevent long list of args in queries (#290) @kyteinsky
- fix(pgvector): make doc deletion query faster (#289) @kyteinsky


## 5.4.0 - 2026-06-24

### Highlights
- The indexing direction has been reversed now. Instead of the context_chat PHP app sending documents to the context_chat_backend ExApp, the ExApp downloads the documents from the server according to a list obtained from the PHP app. This also means that the `occ context_chat:scan` command serves no purpose and has been removed. Indexing should be smoother and run continuously now.
- Kubernetes support to scale the CPU computation
- Separate docker images for CPU, CUDA and ROCM (uses Vulkan) instead of one heavy CUDA/CPU image
- CUDA 12.8 is shipped in the CUDA image so the host drivers should be updated to this at the minimum.

### Added
- add network embedding batching (#276) @fcharlaix-opendsi
- add kubernetes support and reverse content/indexing flow (#284) @kyteinsky @marcelklehr
- add gh workflows for docker builds and do separate cpu, cuda and rocm (vulkan) images (#295) @kyteinsky

### Changed
- update readme according to the latest changes (#300) @kyteinsky
- bump llama_cpp_python to 0.3.23 (#301) @kyteinsky
- move task types to the backend (#321) @kyteinsky
- adjust comment in Dockerfile regarding RTX5090 support (#316) @kyteinsky

### Fixed
- improve loadSources error handling (#288) @kyteinsky
- fix(pgvector): add chunking to prevent long list of args in queries (#290) @kyteinsky
- fix(pgvector): make doc deletion query faster (#289) @kyteinsky
- drop latin-1 decode in source title and userIds (#306) @kyteinsky
- handle validation errors of files and content providers individually (#308) @kyteinsky
- prevent race condition in vectordb tables creation (#308) @kyteinsky
- pass actual error in the error object (#308) @kyteinsky
- add container hostname to /etc/hosts to silence sudo warning (#311) @sanzakicesarr

## 🤖 AI (if applicable)
- [ ] The content of this PR was partly or fully generated using AI

Signed-off-by: kyteinsky <kyteinsky@gmail.com>

fix(pgvector): add chunking to prevent long list of args in queries

05b52d9

Signed-off-by: Anupam Kumar <kyteinsky@gmail.com>

kyteinsky requested a review from marcelklehr as a code owner March 20, 2026 13:29

marcelklehr approved these changes Mar 26, 2026


kyteinsky merged commit 2debcb2 into master Mar 26, 2026
11 checks passed

kyteinsky deleted the fix/pgvector-chunked-queries branch March 26, 2026 17:24

kyteinsky mentioned this pull request May 26, 2026

5.4.0-beta0 #302

Merged

kyteinsky mentioned this pull request Jun 24, 2026

5.4.0 #322

Merged

