Document better understood by AI
Kinkazma
Add support for generating multiple embeddings from a long document, using Ollama-compatible embedding models like granite-embedding, nomic-embed-text, snowflake-arctic-embed2:568m, etc.
Expected behavior:
• A document (e.g. 65 pages) is automatically split into segments (e.g. per paragraph, page, or fixed-size chunks with overlap)
• Each chunk is processed independently to produce a separate embedding vector
• Output is a list of vectors, one per chunk
• Optionally:
  • Export to JSON or CSV
  • Show token count and chunk preview
  • Use for similarity search or RAG
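The requested pipeline (fixed-size chunks with overlap, one vector per chunk) could be sketched roughly as below. The `/api/embed` endpoint and the `nomic-embed-text` model name come from the public Ollama REST API; the chunk sizes and the `chunk_text` helper are illustrative assumptions, not part of this request.

```python
# Sketch only: character-based chunking with overlap, then one embedding
# per chunk from a local Ollama server (endpoint assumed from Ollama's docs).
import json
import urllib.request


def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into fixed-size character chunks that overlap by `overlap`."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]


def embed_chunks(chunks, model="nomic-embed-text",
                 host="http://localhost:11434"):
    """Return one embedding vector per chunk via Ollama's /api/embed endpoint."""
    req = urllib.request.Request(
        f"{host}/api/embed",
        data=json.dumps({"model": model, "input": chunks}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The response carries a list of vectors, one per input chunk.
        return json.load(resp)["embeddings"]
```

A paragraph- or page-based splitter could replace `chunk_text` without changing the embedding call, since `/api/embed` accepts a list of inputs in one request.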
Do not compute one embedding for the whole document — the goal is to allow semantic lookup from fine-grained vectors, not a single blurred one.
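To illustrate why fine-grained vectors matter, a minimal cosine-similarity lookup over the per-chunk vectors might look like this (pure-Python sketch; `top_k` and its signature are hypothetical names, not an existing API):

```python
# Sketch: semantic lookup over per-chunk embedding vectors.
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def top_k(query_vec, chunk_vecs, k=3):
    """Indices of the k chunks most similar to the query vector."""
    order = sorted(range(len(chunk_vecs)),
                   key=lambda i: cosine(query_vec, chunk_vecs[i]),
                   reverse=True)
    return order[:k]
```

With one vector per chunk, `top_k` returns the specific passages relevant to a query; a single whole-document vector could only say whether the entire 65-page document is vaguely related.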
Kinkazma
I don't know how many people would be interested, but I think the application has everything to gain from this feature, whatever the demand turns out to be.