Efficient Ways to Organize Data Models
Poom Wettayakorn
data-modeling
I've researched and curated a list of best practices for organizing data models.
Thought it'd be nice to share with y'all!
We'll focus on modular modeling, and explore the essential model layers: base, staging, intermediate, and marts (dim & facts).
Some key ideas we'll cover:
Aligning on best practices for creating data models
Pros & cons of different project structures
Structuring models for larger scale
Official docs
Project design: https://www.getdbt.com/dbt-learn/lessons/dbt-project-design
Modular modeling: https://www.getdbt.com/analytics-engineering/modular-data-modeling-technique
How to structure a project: https://docs.getdbt.com/best-practices/how-we-structure
Mesh patterns: https://docs.getdbt.com/best-practices/how-we-mesh
Data model layers
Folder structure
Base = When joins are necessary to stage concepts
Staging = The shape you wish your data came in
Intermediate = Purpose-built transformation steps
Marts = Business-defined entities
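One way to keep these layers honest is to check that each model file sits in the folder its name implies. The sketch below is only an illustration: the prefixes (`base_`, `stg_`, `int_`, `dim_`, `fct_`), the folder names, and the example paths are assumptions based on common dbt naming habits, not something this post prescribes, so swap in your own conventions.

```python
from pathlib import PurePosixPath

# Assumed naming convention: file-name prefix -> layer folder.
# Adjust this mapping to match your own project.
PREFIX_TO_LAYER = {
    "base_": "base",
    "stg_": "staging",
    "int_": "intermediate",
    "dim_": "marts",
    "fct_": "marts",
}


def layer_for(model_name: str) -> str:
    """Return the layer implied by a model's name prefix."""
    for prefix, layer in PREFIX_TO_LAYER.items():
        if model_name.startswith(prefix):
            return layer
    raise ValueError(f"No known layer prefix in model name: {model_name}")


def misplaced_models(paths: list[str]) -> list[str]:
    """Return the paths whose folders do not include the expected layer."""
    problems = []
    for raw in paths:
        path = PurePosixPath(raw)
        expected = layer_for(path.stem)  # stem drops the .sql extension
        if expected not in path.parts:
            problems.append(raw)
    return problems


# Hypothetical project files used only for illustration.
example_paths = [
    "models/staging/stg_orders.sql",
    "models/marts/int_orders_joined.sql",  # intermediate model in the wrong folder
    "models/marts/dim_customers.sql",
]
print(misplaced_models(example_paths))
```

With the example paths above, only `models/marts/int_orders_joined.sql` is reported, because its `int_` prefix points to the intermediate layer while the file lives under `marts`.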
Recommended blogs:
Efficient way to organize dbt models by Madison Schott
Tips on starting new dbt project by Oleg Agapov
Project Structure
Recommended blogs:
How to structure your dbt project by Paul Fry
Practical tips to get the best out of dbt by Stefano Solimito
Pros & Cons
Real World Examples
Automated database diagrams from existing queries: https://getdatascale.com
Multi-project Architecture
Mesh Patterns
"Mesh Patterns - empowering data teams to work independently and collaboratively; sharing data, code, and best practices without sacrificing autonomy."
"When adopting a multi-project architecture, where do you draw the lines between projects?"
Vertical splits
Horizontal splits
Hybrid
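To make the three options concrete: in the dbt Mesh guide, a vertical split separates layers of transformation in DAG order, a horizontal split separates the DAG by source or domain, and the two can be combined. The sketch below groups a handful of models into projects under each strategy. The model names, domains, and the `platform` project name are hypothetical, and the hybrid rule shown (shared staging, per-domain marts) is just one possible combination.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical models, each tagged with a transformation layer and a domain.
MODELS = [
    {"name": "stg_orders", "layer": "staging", "domain": "sales"},
    {"name": "stg_campaigns", "layer": "staging", "domain": "marketing"},
    {"name": "fct_orders", "layer": "marts", "domain": "sales"},
    {"name": "dim_campaigns", "layer": "marts", "domain": "marketing"},
]


def split_projects(models: list[dict], project_of: Callable[[dict], str]) -> dict:
    """Group model names by the project that project_of assigns to each model."""
    projects = defaultdict(list)
    for model in models:
        projects[project_of(model)].append(model["name"])
    return dict(projects)


# Vertical split: one project per transformation layer.
vertical = split_projects(MODELS, lambda m: m["layer"])

# Horizontal split: one project per domain.
horizontal = split_projects(MODELS, lambda m: m["domain"])

# Hybrid (one possible rule): a shared "platform" project owns staging,
# and each domain owns its own marts.
hybrid = split_projects(
    MODELS,
    lambda m: "platform" if m["layer"] == "staging" else m["domain"],
)

print(vertical)
print(horizontal)
print(hybrid)
```

Comparing the three groupings side by side shows which team would own which models under each strategy, which is usually the real question behind where to draw the lines.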