New data eng team modernising messy legacy workarounds onto ADF + Databricks + ADLS + Fabric — how do we build this properly from the start?
I'm a data engineer with zero prior experience, on a new data engineering team at our org.
Stack: ADF + Databricks + ADLS Gen2 (medallion architecture), serving into Microsoft Fabric.
Our work primarily focuses on migrating badly built legacy ETL systems to the cloud, as well as onboarding new sources (emailed xlsx/csv files, SQL Server instances, SAP, third-party ads and sales APIs) into our data platform.
The environment I'm working in:
- No proper requirements gathering - most work arrives with half the information, requiring a lot of back-and-forth communication.
- Everything has to be built from scratch - there are no standards or best practices in place.
- No project planning - each project is a single Jira ticket.
- Architecture guidance amounts to "here you go - Databricks and ADF - use them."
- There is one senior DE, but the manager doesn't want him to be visible and impactful, because they want to protect their own management layer.
I want to grow and learn as a data engineer, both technically and on the process side.
Would love advice on:
- How do you deal with unclear requirements and no direct stakeholder access — what do you do before writing a single line of code?
- What standards or practices are worth pushing for early in a new DE team?
- Best practices for this stack and multi-source ingestion (APIs, SAP, SQL, flat files)
- How do you make good architecture decisions when there's no proper design stage?
- Resources that taught you to think like a proper data engineer, not just use the tools
Happy to hear from anyone who's been in a "building the plane while flying it" situation. What helped you most?
u/sathvikchava — 5 hours ago