236 Commits

Author SHA1 Message Date
yangdx 2f0aa7ed12 Optimize graph query by simplifying MATCH pattern
- Simplify MATCH clause to ()-[r]-()
- Remove node type constraints
- Improve query performance
2025-08-02 12:54:22 +08:00
yangdx 9a8f58826d fix: Add safe handling for missing file_path and metadata in PostgreSQL doc status functions
- Add null-safe file_path handling with "no-file-path" fallback in get_docs_by_status and get_docs_by_track_id
- Enhance metadata validation to ensure dict type after JSON parsing
- Align PostgreSQL implementation with JSON implementation safety patterns
- Prevent KeyError exceptions when database records have missing fields
2025-07-31 18:07:53 +08:00
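
The null-safe handling described in the commit above might look roughly like the following. This is a minimal sketch, not the code from the repository: the helper name `normalize_doc_row` and the row layout are assumptions; only the "no-file-path" fallback and the "metadata must be a dict after JSON parsing" rule come from the commit message.

```python
import json
from typing import Any


def normalize_doc_row(row: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a doc-status row with a safe file_path and metadata.

    Hypothetical helper; the real functions are get_docs_by_status and
    get_docs_by_track_id, whose internals are not shown in the log.
    """
    # Fall back when file_path is absent, NULL, or empty.
    file_path = row.get("file_path") or "no-file-path"

    metadata = row.get("metadata")
    if isinstance(metadata, str):
        try:
            metadata = json.loads(metadata)
        except json.JSONDecodeError:
            metadata = {}
    # After parsing, anything that is not a dict (list, number, None) is replaced.
    if not isinstance(metadata, dict):
        metadata = {}

    return {**row, "file_path": file_path, "metadata": metadata}
```

Using `row.get(...)` instead of `row[...]` is what avoids the KeyError when a record lacks the field.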
yangdx 0eac1a883a Feat: add file path sorting for document manager
- Add file_path sorting support to all database backends (JSON, Redis, PostgreSQL, MongoDB)
- Implement smart column header switching between "ID" and "File Name" based on display mode
- Add automatic sort field switching when toggling between ID and file name display
- Create composite indexes for workspace+file_path in PostgreSQL and MongoDB for better query performance
- Update frontend to maintain sort state when switching display modes
- Add internationalization support for "fileName" in English and Chinese locales

This enhancement improves user experience by providing intuitive file-based sorting
while maintaining performance through optimized database indexes.
2025-07-30 18:46:55 +08:00
yangdx 74eecc46e5 feat(pagination): Implement document list pagination backends and frontend UI
- Add pagination support to BaseDocStatusStorage interface and all implementations (PostgreSQL, MongoDB, Redis, JSON)
- Implement RESTful API endpoints for paginated document queries and status counts
- Create reusable pagination UI components with internationalization support
- Optimize performance with database-level pagination and efficient in-memory processing
- Maintain backward compatibility while adding configurable page sizes (10-200 items)
2025-07-30 17:58:32 +08:00
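
As an illustration of the page-size limits mentioned in the commit above, a page request can be translated into LIMIT/OFFSET values as sketched below. The function names are hypothetical and the clamping behaviour is an assumption; the commit only states that page sizes are configurable between 10 and 200 items.

```python
from typing import Any

MIN_PAGE_SIZE = 10   # lower bound taken from the commit message
MAX_PAGE_SIZE = 200  # upper bound taken from the commit message


def limit_offset(page: int, page_size: int) -> tuple[int, int]:
    """Translate a 1-based page request into (LIMIT, OFFSET) values."""
    size = max(MIN_PAGE_SIZE, min(MAX_PAGE_SIZE, page_size))
    page = max(1, page)  # treat page 0 or negative pages as the first page
    return size, (page - 1) * size


def paginate(items: list[Any], page: int, page_size: int) -> list[Any]:
    """In-memory equivalent, e.g. for a JSON-file backend."""
    limit, offset = limit_offset(page, page_size)
    return items[offset:offset + limit]
```

A SQL backend would pass the two values to `LIMIT ... OFFSET ...`, while an in-memory backend can slice the list directly.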
yangdx cfb7117dd6 Fix track_id missing for query in PostgreSQL 2025-07-30 03:44:20 +08:00
yangdx 93afa7d8a7 feat: add processing time tracking to document status with metadata field
- Add metadata field to DocProcessingStatus with start_time and end_time tracking
- Record processing timestamps using Unix time format (seconds precision)
- Update all storage backends (JSON, MongoDB, Redis, PostgreSQL) for new field support
- Maintain backward compatibility with default values for existing data
- Add error_msg field for better error tracking during document processing
2025-07-29 23:42:33 +08:00
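
The timing fields from the commit above could be recorded along these lines. This is a sketch under assumptions: `DocStatusSketch`, `mark_started`, `mark_finished`, and the status strings are illustrative stand-ins, not the actual `DocProcessingStatus` definition; only `metadata`, `start_time`, `end_time`, `error_msg`, and the seconds-precision Unix time come from the commit message.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class DocStatusSketch:
    """Illustrative stand-in for a document status record."""
    status: str = "pending"
    error_msg: Optional[str] = None
    # Defaults to an empty dict so records written before the change still load.
    metadata: dict[str, Any] = field(default_factory=dict)


def mark_started(doc: DocStatusSketch) -> None:
    doc.status = "processing"
    doc.metadata["start_time"] = int(time.time())  # Unix time, whole seconds


def mark_finished(doc: DocStatusSketch, error: Optional[str] = None) -> None:
    doc.metadata["end_time"] = int(time.time())
    doc.error_msg = error
    doc.status = "failed" if error else "processed"
```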
yangdx 7206c07468 Remove deprecated content field from doc status
- Drop content column from LIGHTRAG_DOC_STATUS
- Clean up doc status handling code
- Maintain backward compatibility
2025-07-29 23:19:36 +08:00
yangdx 1e1adcb64a Add index on track_id column in doc status table of PostgreSQL 2025-07-29 23:03:09 +08:00
yangdx 6014b9bf73 feat: add track_id support for document processing progress monitoring
- Add get_docs_by_track_id() method to all storage backends (MongoDB, PostgreSQL, Redis, JSON)
- Implement automatic track_id generation with upload_/insert_ prefixes
- Add /track_status/{track_id} API endpoint for frontend progress queries
- Create database indexes for efficient track_id lookups
- Enable real-time document processing status tracking across all storage types
2025-07-29 22:24:21 +08:00
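
One way to generate prefixed tracking IDs like those mentioned in the commit above is sketched here. The exact ID format used by the project is not stated in the log; the timestamp-plus-random-suffix layout and the function name `generate_track_id` are assumptions for illustration.

```python
import uuid
from datetime import datetime, timezone

ALLOWED_PREFIXES = ("upload", "insert")  # prefixes named in the commit message


def generate_track_id(prefix: str = "upload") -> str:
    """Build an ID such as 'upload_20250729_142421_ab12cd34' (assumed format)."""
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unsupported prefix: {prefix!r}")
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    suffix = uuid.uuid4().hex[:8]  # random part keeps IDs distinct within a second
    return f"{prefix}_{stamp}_{suffix}"
```

A client would then poll `/track_status/{track_id}` with the returned value to follow processing progress.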
yangdx 24c36d876c Remove content field from DocProcessingStatus, update MongoDB and PostgreSQL implementation 2025-07-29 14:52:45 +08:00
yangdx 5574a30856 fix(postgres): handle ssl_mode="allow" in _create_ssl_context
Add "allow" to the list of recognized SSL modes in PostgreSQL connection helper. Previously, ssl_mode="allow" would fall through to "Unknown SSL mode" warning. Now it's properly handled alongside "require" and "prefer" modes.
2025-07-24 12:45:13 +08:00
yangdx df8b4202f3 feat: Add SSL support for PostgreSQL database connections
- Add SSL configuration options (ssl_mode, ssl_cert, ssl_key, ssl_root_cert, ssl_crl)
- Support all PostgreSQL SSL modes (disable, allow, prefer, require, verify-ca, verify-full)
- Add SSL context creation with certificate validation
- Update initdb() method to handle SSL connection parameters
- Add SSL environment variables to env.example
- Maintain backward compatibility with existing non-SSL configurations
2025-07-21 02:03:06 +08:00
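
The SSL modes listed in the commit above can be mapped onto Python's standard `ssl` module roughly as follows. This is a sketch, not the project's `_create_ssl_context`: the function name `build_ssl_context`, its parameters, and the choice to return `None` for "disable" are assumptions, and a real driver may treat "allow" and "prefer" differently (they imply fallback between encrypted and plain connections, which a context object alone cannot express).

```python
import ssl
from typing import Optional


def build_ssl_context(
    ssl_mode: str,
    ssl_root_cert: Optional[str] = None,
    ssl_cert: Optional[str] = None,
    ssl_key: Optional[str] = None,
) -> Optional[ssl.SSLContext]:
    """Return an SSLContext for the given mode, or None when SSL is disabled."""
    mode = ssl_mode.lower()
    if mode == "disable":
        return None
    if mode not in ("allow", "prefer", "require", "verify-ca", "verify-full"):
        raise ValueError(f"Unknown SSL mode: {ssl_mode}")

    ctx = ssl.create_default_context(cafile=ssl_root_cert)
    if mode in ("allow", "prefer", "require"):
        # Encrypt the connection but do not verify the server certificate.
        ctx.check_hostname = False  # must be cleared before setting CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
    elif mode == "verify-ca":
        # Verify the certificate chain, but not the host name.
        ctx.check_hostname = False
    # "verify-full" keeps the defaults: CERT_REQUIRED plus host name checking.

    if ssl_cert:
        # Optional client certificate for mutual TLS.
        ctx.load_cert_chain(certfile=ssl_cert, keyfile=ssl_key)
    return ctx
```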
yangdx 19a38d9310 Feat: add PostgreSQL extensions for vector and AGE
- Ensure the VECTOR extension is available when PostgreSQL initializes
- Ensure the AGE extension is available when PGGraphStorage initializes
2025-07-21 01:46:41 +08:00
yangdx f033fd6f87 fix(postgres): improve AGE agtype parsing and simplify error logging
- Fix JSON parsing errors caused by :: characters in data content
- Implement precise agtype string parsing using rfind() to separate JSON content from type identifiers
- Add robust error handling for malformed JSON in graph data
2025-07-18 08:50:47 +08:00
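
The rfind()-based parsing described in the commit above can be illustrated as below. This is a minimal sketch assuming agtype values arrive as strings of the form `<json>::<type>`; the function name `parse_agtype` and the identifier check on the suffix are assumptions, not the repository's code.

```python
import json
from typing import Any, Optional


def parse_agtype(raw: str) -> tuple[Any, Optional[str]]:
    """Split an agtype string into (parsed JSON, type suffix).

    rfind() locates the LAST '::', so '::' sequences inside the JSON
    payload do not confuse the split.
    """
    type_name: Optional[str] = None
    idx = raw.rfind("::")
    if idx != -1:
        suffix = raw[idx + 2:].strip()
        # Only treat the tail as a type annotation if it looks like a bare name.
        if suffix.isidentifier():
            type_name = suffix
            raw = raw[:idx]
    try:
        return json.loads(raw), type_name
    except json.JSONDecodeError:
        # Malformed graph data: report nothing rather than raising.
        return None, type_name
```

Splitting on the first `::` (for example with `str.split("::")[0]`) would truncate any payload whose text itself contains `::`, which is the failure mode the commit describes.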
yangdx 57c8c19628 Add datetime format migration for doc status table 2025-07-16 22:21:51 +08:00
yangdx c7b566f6d5 Fix cache migration MD5 error for PostgreSQL 2025-07-16 19:24:57 +08:00
yangdx 80f7e37168 Fix default workspace name for PostgreSQL AGE graph storage 2025-07-16 19:16:22 +08:00
yangdx bab2803953 Optimize PostgreSQL database migrations for LLM cache
- Combine column migration into single operation
- Optimize LLM cache key migration query
- Improve migration error handling
- Add conflict detection for cache migration
2025-07-16 17:32:53 +08:00
yangdx bd340fece6 Fix timestamp column migration comment typos
- Correct timezone-related comments
- Fix typo in debug log message
- Update migration success message
- Maintain same migration logic
2025-07-16 14:27:52 +08:00
yangdx 7e988158a9 Fix: Resolve timezone handling problem in PostgreSQL storage
- Changed timestamp columns to naive UTC
- Added datetime formatting utilities
- Updated SQL templates for timestamp extraction
- Simplified timestamp migration logic
2025-07-14 04:12:52 +08:00
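
Storing timestamps as naive UTC, as the commit above describes, usually means normalizing datetimes before they reach the database. A sketch of such a utility follows; the name `to_naive_utc` is hypothetical, and treating already-naive values as UTC is an assumption.

```python
from datetime import datetime, timezone


def to_naive_utc(dt: datetime) -> datetime:
    """Convert a datetime to UTC and drop its tzinfo."""
    if dt.tzinfo is None:
        # Assumption: naive inputs are already expressed in UTC.
        return dt
    return dt.astimezone(timezone.utc).replace(tzinfo=None)
```

For example, 04:12:52 at UTC+08:00 becomes 20:12:52 on the previous day with no timezone attached.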
yangdx 157fb4c871 Increase field lengths for entity and file paths for PostgreSQL
- Expand entity_name length to 512 chars
- Increase source/target ID lengths
- Convert file_path to TEXT type
- Add migration logic
2025-07-14 00:24:54 +08:00
yangdx ef79088f60 Move max_graph_nodes to global config 2025-07-07 21:53:57 +08:00
yangdx da8655002a Add composite indexes for workspace+id columns for PostgreSQL 2025-07-07 03:36:49 +08:00
yangdx 033098c1bc Feat: Add WORKSPACE support to all storage types 2025-07-07 00:57:21 +08:00
yangdx 531502677e fix: Use create_time when update_time is 0 in PGKVStorage queries 2025-07-03 23:38:53 +08:00
yangdx 6c2ae40d7d Refac: Enhance KG rebuild stability by incorporating `create_time` into the LLM cache 2025-07-03 17:08:29 +08:00
yangdx 70e154b0aa Fix linting 2025-07-03 12:26:05 +08:00
yangdx ff1b1c61c7 Implemented storage types: PostgreSQL and MongoDB 2025-07-03 11:46:24 +08:00
yangdx 86c9a0cda2 Fix linting 2025-07-02 16:29:43 +08:00
yangdx 271722405f feat: Flatten LLM cache structure for improved recall efficiency
Refactored the LLM cache to a flat Key-Value (KV) structure, replacing the previous nested format. The old structure used the 'mode' as a key and stored specific cache content as JSON nested under it. This change significantly enhances cache recall efficiency.
2025-07-02 16:11:53 +08:00
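
The change from a nested to a flat cache layout in the commit above can be pictured with the sketch below. The `mode:hash` key format and the function name `flatten_cache` are assumptions for illustration; the commit only says that the old structure keyed on 'mode' with cache content nested under it.

```python
from typing import Any


def flatten_cache(nested: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Turn {mode: {hash: entry}} into {"mode:hash": entry}."""
    flat: dict[str, Any] = {}
    for mode, entries in nested.items():
        for cache_hash, entry in entries.items():
            flat[f"{mode}:{cache_hash}"] = entry
    return flat


nested_cache = {
    "local": {"abc123": {"return": "answer 1"}},
    "global": {"def456": {"return": "answer 2"}},
}
flat_cache = flatten_cache(nested_cache)
```

With the flat layout, one entry is fetched by a single key lookup instead of loading every entry stored under a mode and searching inside it.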
yangdx 37bf341a69 Fix LLM cache handling for PGKVStorage to address document deletion scenarios.
- Add dynamic cache_type field
- Support mode parameter for LLM cache
- Maintain backward compatibility
2025-06-29 14:39:50 +08:00
yangdx b7f8c20e61 fix(postgres): use correct table for vector queries
Change SQL templates from LIGHTRAG_DOC_CHUNKS to LIGHTRAG_VDB_CHUNKS
to fix "content_vector does not exist" error in vector operations.
2025-06-28 15:36:54 +08:00
yangdx 2c47367975 Fix linting 2025-06-28 14:37:55 +08:00
yangdx 95c7a7d038 feat(db): Add data migration from LIGHTRAG_DOC_CHUNKS to LIGHTRAG_VDB_CHUNKS 2025-06-28 14:37:47 +08:00
yangdx 3a8a99b73d feat(postgres): Implement text_chunks upsert for PGKVStorage 2025-06-28 14:37:35 +08:00
yangdx 72384f87c4 Remove deprecated code from Postgres_impl.py
- Stop filtering out 'base' node labels
- Match any edge type in query to improve performance
2025-06-25 12:53:07 +08:00
yangdx 109c2b48be Fix linting 2025-06-25 12:39:43 +08:00
yangdx da46b341dc feat: Optimize document deletion performance
- To enhance performance during document deletion, new batch-get methods, `get_nodes_by_chunk_ids` and `get_edges_by_chunk_ids`, have been added to the graph storage layer (`BaseGraphStorage` and its implementations). The [`adelete_by_doc_id`](lightrag/lightrag.py:1681) function now leverages these methods to avoid unnecessary iteration over the entire knowledge graph, significantly improving efficiency.
- Graph storage updated: Networkx, Neo4j, Postgres AGE
2025-06-25 12:37:57 +08:00
yangdx cc12460b38 Fix: Silence PostgreSQL logs during idempotent graph initialization 2025-06-23 23:08:56 +08:00
zrguo 4937de8809 Update 2025-06-22 15:12:09 +08:00
yangdx ada2443653 Optimize default setting of PostgreSQL 2025-05-22 17:09:26 +08:00
yangdx 2ee809cf58 Increase PG connection pool to 20 2025-05-22 16:37:18 +08:00
yangdx ebdc7cea49 Merge branch 'allow_max_connection_config' into pg-max-connection 2025-05-09 14:16:53 +08:00
Arjun Rao 6ebd76d5da bugfix: convert config val to int 2025-05-09 04:22:46 +10:00
Arjun Rao f2c522ce7a Allow max_connections to be configured in postgres 2025-05-08 11:00:56 +10:00
widgit e070c19414 Update postgres_impl.py
A comma was missing from the SQL CREATE TABLE command
2025-05-05 23:55:19 +10:00
yangdx e46a4b2079 Optimize log message 2025-05-04 22:31:57 +08:00
yangdx 9a41de51fb Optimize log message 2025-05-04 22:20:44 +08:00
yangdx dcb2a72462 Fix JSON handling error for PostgreSQL graph storage 2025-05-04 22:18:56 +08:00
yangdx 1213f53fc9 Fix mistakenly interpreting a string as JSON for PostgreSQL AGE graph storage 2025-05-04 02:20:43 +08:00