u/Important_Pin_5896

hey folks

I have a coding assessment coming up and they mentioned there will be a RAG question. The exact line given is:

document ingestion + preprocessing, embedding creation (fastembed), in-memory vector indexing, module separation.

Also, the OA will be on HackerRank.

There's no mention of LangChain or anything, so I'm confused about what exactly they expect and how low-level I should prepare.

  1. what kind of question is this usually? a full RAG pipeline or just parts of it?
  2. do we have to write everything in pure Python? like simulating the vector DB ourselves (lists/NumPy) + cosine similarity?
  3. for the ingestion part: do we need to do chunking + preprocessing manually?
  4. embedding creation (fastembed): will they expect us to actually use fastembed, or just mock the embeddings?
  5. in-memory vector indexing: is this just storing embeddings in an array and doing a top-k similarity search?
  6. prompt engineering is also mentioned separately: do they expect just passing the context into the prompt, or something more structured?
  7. if anyone has taken a similar HackerRank-style test, what was actually asked?
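For reference, here is a rough sketch of what I assume the low-level version of points 2, 3 and 5 could look like: word-based chunking, an in-memory index backed by a NumPy array, and top-k cosine similarity. Everything named here (`chunk_text`, `mock_embed`, `InMemoryIndex`, the 256-dim size) is made up for illustration, and `mock_embed` is only a hashed bag-of-words stand-in, not what fastembed produces. A real embedding function could presumably be passed in as `embed_fn` instead.

```python
import hashlib

import numpy as np

DIM = 256  # hypothetical vector size for the mock embedder


def chunk_text(text, size=40, overlap=10):
    """Split text into word-based chunks of `size` words with `overlap` shared words."""
    words = text.split()
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
        start += size - overlap
    return chunks


def mock_embed(text):
    """Stand-in for a real model: hashed bag-of-words vector (assumption, not fastembed)."""
    vec = np.zeros(DIM)
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec


class InMemoryIndex:
    """Keeps chunks and their vectors in memory and answers top-k queries."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.chunks = []
        self.vectors = None  # 2-D array, one row per chunk

    def add(self, chunks):
        if not chunks:
            return
        new = np.array([self.embed_fn(c) for c in chunks])
        self.chunks.extend(chunks)
        self.vectors = new if self.vectors is None else np.vstack([self.vectors, new])

    def search(self, query, k=3):
        if self.vectors is None:
            return []
        q = self.embed_fn(query)
        # Cosine similarity = dot product divided by the product of the norms.
        norms = np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(q)
        scores = (self.vectors @ q) / (norms + 1e-12)
        top = np.argsort(scores)[::-1][:k]
        return [(self.chunks[i], float(scores[i])) for i in top]
```

Usage would be something like `index = InMemoryIndex(mock_embed)`, then `index.add(chunk_text(document))` and `index.search("some question", k=3)`, which returns `(chunk, score)` pairs sorted by similarity. Keeping the chunker, the embedder and the index as separate pieces is my guess at what "module separation" means.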

I've only done RAG using LangChain, so I'm not sure if I should prepare low-level Python implementations.

If you have any insights please do share...

u/Important_Pin_5896 — 12 days ago

