hey folks
i have a coding assessment coming up and they mentioned there will be a RAG question. the exact line given is:
document ingestion + preprocessing, embedding creation (fastembed), in-memory vector indexing, module separation.
Also, the OA will be on HackerRank.
there's no mention of LangChain or any other framework, so i’m confused about what exactly they expect and how low-level i should prepare.
- what kind of question is this usually? like full rag pipeline or just parts of it?
- do we have to write everything in pure python? like simulating a vector db ourselves (lists/numpy) + cosine similarity?
- for the ingestion part: do we need to do chunking + preprocessing manually?
- embedding creation (fastembed): will they expect us to actually use fastembed or just mock embeddings?
- in-memory vector indexing: is this just storing embeddings in an array and doing top-k similarity search?
- the prompt engineering part was also mentioned separately: do they expect us to just pass the context into the prompt, or something more structured?
- if anyone has taken similar HackerRank-style tests, what was actually asked?
i’ve only done RAG using LangChain so i'm not sure if i should prepare low-level python implementations.
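for context, this is roughly what i picture the "low level" version looking like. it's only my own guess at the shape of it, not anything from the assessment: chunking by word windows, a toy hashed bag-of-words embedding standing in for fastembed (it is NOT the fastembed API, just a placeholder so the snippet runs with the standard library), and a plain-list index with cosine top-k. all the names here (`chunk_text`, `mock_embed`, `InMemoryIndex`) and the sizes are made up by me.

```python
import math
import zlib
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into overlapping word windows (size/overlap are arbitrary guesses)."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def mock_embed(text, dim=64):
    """Placeholder embedding: hashed bag of words, L2-normalized.

    Stand-in for a real model such as fastembed; it captures word overlap only.
    """
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        # crc32 is deterministic across runs, unlike the built-in hash() for str
        vec[zlib.crc32(word.encode("utf-8")) % dim] += count
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class InMemoryIndex:
    """Keeps chunks and their vectors in parallel lists and brute-forces top-k."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.chunks = []
        self.vectors = []

    def add(self, chunks):
        for chunk in chunks:
            self.chunks.append(chunk)
            self.vectors.append(self.embed_fn(chunk))

    def search(self, query, k=3):
        q = self.embed_fn(query)
        # vectors are unit length, so the dot product equals cosine similarity
        scored = [
            (sum(a * b for a, b in zip(q, v)), chunk)
            for v, chunk in zip(self.vectors, self.chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


if __name__ == "__main__":
    index = InMemoryIndex(mock_embed)
    index.add([
        "the cat sat on the mat",
        "stock prices fell sharply today",
        "python lists and numpy arrays",
    ])
    for score, chunk in index.search("cat on the mat", k=2):
        print(round(score, 3), chunk)
```

the way i see it, the "module separation" part would just mean keeping chunking, embedding and indexing as separate pieces like this, so a real embedder could be swapped in for `mock_embed` without touching the index. no idea if that's what they actually want though.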
If you have any insights please do share...