Commit Graph

  • b57792eb41 feat: add pptx_extractor and html_extractor main m.dabbagh 2026-01-31 18:23:04 +03:30
  • b53f8c47d3 add title to Document model and remove display_name from DocumentMetadata m.dabbagh 2026-01-28 22:13:55 +03:30
  • 6259220629 disable swagger auth m.dabbagh 2026-01-28 22:10:24 +03:30
  • 2753b913fb enable swagger auth m.dabbagh 2026-01-28 10:46:37 +03:30
  • a1fbd12874 fix: paragraph_chunker was adding "None" when section.title was None m.dabbagh 2026-01-27 21:08:32 +03:30
  • 80dd901e42 fix: remove file extension from DocumentMetadata.display_name m.dabbagh 2026-01-25 11:33:50 +03:30
  • 9e1e49bc59 add document title and section title to the beginning of each chunk in paragraph chunker m.dabbagh 2026-01-25 11:32:35 +03:30
  • cda128e438 one paragraph per chunk in paragraph chunking method m.dabbagh 2026-01-25 11:03:54 +03:30
  • 8ecbd88498 make DocumentSection.title optional m.dabbagh 2026-01-24 20:25:34 +03:30
  • 3aad734140 comment out swagger authentication m.dabbagh 2026-01-24 17:06:25 +03:30
  • c6302bc792 add api-key header and swagger authentication m.dabbagh 2026-01-24 17:05:29 +03:30
  • 2ccb38179d use docling in extractors m.dabbagh 2026-01-24 13:43:07 +03:30
  • ad163eb665 change api defaults m.dabbagh 2026-01-20 23:36:02 +03:30
  • 91f8035043 add s3 storage m.dabbagh 2026-01-20 12:46:47 +03:30
  • 0c09c79a2e refactor api routes m.dabbagh 2026-01-19 22:03:36 +03:30
  • 6086ddf818 add /chunk route m.dabbagh 2026-01-19 21:54:23 +03:30
  • 2c4a59f84b add extract endpoint m.dabbagh 2026-01-19 16:05:55 +03:30
  • 0084ae6bc0 fix m.dabbagh 2026-01-19 15:42:46 +03:30
  • e783d92eca make chunking method enum and remove some redundant code in core and api m.dabbagh 2026-01-19 15:19:11 +03:30
  • e2e1c86dd4 fix sorting and merging in zip extractor m.dabbagh 2026-01-19 14:00:17 +03:30
  • 6072bb188c fix a bug in zip extractor m.dabbagh 2026-01-18 20:57:01 +03:30
  • 32ca394d91 some fixes on the output text m.dabbagh 2026-01-18 20:05:41 +03:30
  • 90c10c79fa add text api m.dabbagh 2026-01-18 19:38:53 +03:30
  • 13b887260f add zip extractor adapter m.dabbagh 2026-01-18 15:44:49 +03:30
  • f06370e0b9 some fixes in concrete implementations of chunkers m.dabbagh 2026-01-08 16:47:50 +03:30
  • 2c375ce6bd make the domain general and open to add crawling system m.dabbagh 2026-01-08 04:57:35 +03:30
  • 359026fa98 add SourceFile, DocumentSection models and markdown parser m.dabbagh 2026-01-08 03:46:35 +03:30
  • 10a619494b fix potential race condition in DocumentProcessorService._chunk_document by making the context stateless m.dabbagh 2026-01-07 21:57:22 +03:30
  • fd39184c0c some fixes on architecture. make bootstrap wrap only the hexagonal plus the outgoing adapters m.dabbagh 2026-01-07 21:02:38 +03:30
  • 70f5b1478c init m.dabbagh 2026-01-07 19:15:46 +03:30