About Dr. Tosin Ojajuni
Comprehensive overview of my professional background, experience, and expertise in Data Engineering and AI/ML systems.
About Me
Distinguished Senior Data & AI Engineer with PhD credentials, specializing in architecting enterprise-scale data platforms and AI-powered solutions across AWS, GCP, and Azure ecosystems. With 6+ years focused on Data and AI Engineering, I deliver mission-critical data pipelines processing petabytes of data for 50M+ users, achieving 40% performance improvements and 20% productivity gains.
I excel at building production-grade ML/AI systems, including GenAI agents and agentic AI solutions using Vertex AI and LLMs, with deep expertise in modern data stack technologies like Databricks, DBT, and Dataform. My diverse professional foundation spanning 15+ years provides exceptional analytical rigor and systematic problem-solving approaches that drive innovative data solutions.
Proven track record of driving $M+ cost optimizations through intelligent automation and architectural excellence. Certified across major cloud platforms, including the latest 2025 GCP certifications, with expertise in technical leadership and team mentoring to deliver scalable data engineering solutions that create measurable business impact.
Professional Projects & Solutions
Specialized Data & AI Engineering expertise delivering enterprise-scale solutions with measurable business impact. Enhanced by 15+ years of analytical rigor and systematic problem-solving across diverse professional domains.
Enterprise-Scale Data Platform Architecture
Key Results & Impact
- Architected enterprise data platform reducing data processing costs by 35% while improving pipeline reliability to 99.9% SLA
- Pioneered Delta Live Tables implementation for real-time event streaming analytics, processing 10M+ events daily with sub-minute latency
- Built production ML pipelines achieving 28% improvement in user engagement, and fraud detection preventing $2M+ in annual losses
- Established MLOps best practices increasing ML deployment velocity by 3x and reducing model debugging time by 45%
- Led cross-functional collaboration driving 15% increase in user retention
- Mentored 4 junior engineers on advanced data engineering and ML principles
Technologies & Tools
Petabyte-Scale Cloud Migration & Data Lake Architecture
Key Results & Impact
- Orchestrated petabyte-scale data migration to GCP, completing migration 20% ahead of schedule with zero data loss
- Built scalable data lake architecture processing 5+ petabytes of network telemetry data for 30M+ customers
- Implemented automated workflows reducing manual intervention by 80% and improving data freshness from hours to minutes
- Optimized data warehouse achieving 40% reduction in query response times and $100K+ annual cost savings
- Developed comprehensive data catalog improving data discoverability by 60% across 200+ datasets
- Engineered real-time streaming pipelines processing 1M+ messages/hour with 99.95% reliability
Technologies & Tools
Enterprise Data Solutions & BI Platform Development
Key Results & Impact
- Designed dimensional data models for Fortune 500 clients supporting $10M+ analytics initiatives
- Automated data ingestion pipelines improving data quality from 60% to 95% accuracy
- Built serverless ETL orchestration reducing infrastructure costs by 40%
- Implemented Apache Airflow as central workflow orchestration platform, reducing pipeline development time by 30%
- Architected BI platform reducing report generation time from days to hours
- Delivered data governance framework ensuring GDPR compliance across 15+ enterprise applications
Technologies & Tools
Retail Analytics Platform & Customer 360 Solution
Key Results & Impact
- Architected cloud data warehouse consolidating customer data from 15+ global markets
- Improved data accuracy from 55% to 92% enabling unified customer 360 view
- Tripled BI capability reducing time-to-insight from weeks to hours for marketing teams
- Built retail analytics platform processing 5M+ customer transactions monthly
- Optimized transformations achieving 60% reduction in processing times
- Established data governance standards ensuring data quality and compliance
Technologies & Tools
End-to-End ML Applications & GenAI Solutions
Key Results & Impact
- Developed 10+ production ML applications serving 50K+ monthly users across finance, healthcare, and e-commerce
- Built GenAI prototypes leveraging LLMs achieving 85% user satisfaction in pilot programs
- Implemented MLOps pipelines enabling automated model training, versioning, and deployment
- Engineered scalable data pipelines supporting ML model training and inference workloads
- Conducted comprehensive EDA and feature engineering improving model performance by 20-30%
- Designed scalable inference APIs with <100ms latency and 99.9% uptime using containerized deployments
Technologies & Tools
Technical Skills
Expertise across the full data engineering and AI/ML stack
Cloud Platforms
- AWS (Glue, Redshift, S3, Lambda, EMR, Step Functions)
- GCP (BigQuery, Pub/Sub, Dataflow, Vertex AI)
- Azure (Databricks, Data Factory, Synapse Analytics)
Data Engineering
- ETL/ELT: DBT, Dataform, Apache Airflow, Databricks Workflows
- Big Data: Apache Spark, PySpark, Kafka, Delta Lake
- Data Warehousing: BigQuery, Redshift, Snowflake
Programming
- Python, SQL, PL/pgSQL
- Scala, Bash
- Processing: Pandas, NumPy, Polars, DuckDB
Machine Learning & AI
- Platforms: Databricks MLflow, Vertex AI, SageMaker
- Frameworks: TensorFlow, PyTorch, Scikit-learn, XGBoost
- GenAI: LangChain, RAG pipelines, LLMs, Prompt Engineering
Data Visualization
- Tableau, Power BI, Looker
- Streamlit
- Chart.js, Plotly
DevOps & MLOps
- Docker, Kubernetes
- Git, GitHub Actions, GitLab CI/CD
- Model versioning, A/B testing, CI/CD for ML
Streaming & Real-time
- Apache Kafka
- Google Pub/Sub
- AWS Kinesis
Specializations
- Data Lake Architecture
- Dimensional Modeling
- Cost Optimization
- Performance Tuning
Technical Leadership & Innovation
Bridging academic research with industry innovation through computational methods, technical leadership, and open source contributions to the data engineering community.
Research Specializations
Data-Driven Engineering
Advanced computational methods for optimizing complex engineering systems through data analytics and machine learning.
Computational Methods
Development of novel algorithms for processing large-scale engineering datasets and predictive modeling.
AI in Production Systems
Bridging the gap between academic research and enterprise-scale AI implementation.
Technical Leadership
Cross-functional Team Leadership
Team development
Enterprise Organizations
Led engineering teams across multiple projects, driving technical strategy and mentoring junior engineers.
Architecture & System Design
Enterprise scale
Mission-Critical Systems
Architected data platforms serving 50M+ users with focus on scalability and reliability.
Knowledge Transfer & Mentoring
Technical Mentorship & University Teaching
10+ engineers mentored
Academic & Enterprise Teams
Established mentorship programs that draw on university lecturing experience to develop advanced data engineering and ML engineering capabilities across teams.
Technical Workshop Leadership
Knowledge transfer
Academic & Industry Events
Led workshops on Modern Data Stack, MLOps, and Cloud Architecture, drawing from 15+ years of Civil Engineering and Academic experience.
Academic & Industry Impact
Education & Certifications
Education
PhD in Engineering
University of Surrey
Guildford, UK
Focus: Data-driven Engineering and Computational Methods
MSc Civil Engineering (Merit)
University of Surrey
Guildford, UK
Focus: Engineering Analytics and Optimization
Professional Certifications
Google Cloud Professional Cloud Architect
Google Cloud
2025
Google Cloud Professional Machine Learning Engineer
Google Cloud
2025
Google Generative AI Leader Certification
Google Cloud
2024
Databricks Certified Data Engineer Associate
Databricks
2023
GCP Professional Data Engineer
Google Cloud
2023
TensorFlow Developer Certificate
2023
BCS Certificate in Business Analysis
BCS
2022
Get In Touch
Interested in collaborating on data engineering or AI projects? Let's connect!
Contact Information
Location
Portsmouth, England, UK
Available via contact form
Open to Opportunities
I'm always interested in hearing about new opportunities in data engineering, ML engineering, AI engineering, agentic AI, and cloud architecture. Feel free to reach out!