Speaking Engagements
I speak about data engineering, production debugging, and building scalable data platforms. My talks focus on real-world experiences and practical lessons from production systems.
✅ Completed Engagements
Oxford Microsoft Data Platform Group
Date: January 21, 2026
Format: Online Presentation (60 minutes + 30 min Q&A)
Location: Virtual
Status: ✅ COMPLETED
Slides: Download Presentation (5MB PPTX)
Event Link: Meetup Event
Title: From Raw to Refined: Building Production Data Pipelines That Scale
Abstract:
Presented battle-tested architecture patterns from building scalable data pipelines at NatWest Bank. Covered the three-zone architecture (Raw → Curated → Refined) with real production code examples and incident case studies from financial services systems.
Topics Covered:
- Three-zone data architecture pattern (Raw → Curated → Refined / Bronze → Silver → Gold)
- Real-time streaming with Kafka and Spark (sub-10ms latency for quality scoring)
- Batch processing with PySpark and Azure Databricks (2TB daily reconciliation)
- ML-powered data quality monitoring with Isolation Forest
- Apache Airflow orchestration for 40+ interdependent pipelines
- Production incidents and recovery strategies (6-hour outage case study)
- Honest assessment of Azure data tools (ADLS Gen2, Databricks, Synapse, Snowflake)
- Customer 360 implementation (12 source systems, 200+ business users)
Technologies: Azure Data Lake Gen2, Azure Databricks, Azure Event Hubs, Apache Kafka, Apache Airflow, PySpark, Snowflake, Great Expectations, Delta Lake
Audience: 14+ data engineers, cloud architects, and data platform professionals
Organizer: Felicity Nyan (Senior Cloud Solution Architect - Data & Analytics, Microsoft)
Sponsors: Humand Talent Solutions, Packt Publishing
Feedback:
“Many thanks for your very interesting and informative presentation. We all enjoyed it tremendously and hope that you will be back to present on Airflow one day.”
— Felicity Nyan, Senior Cloud Solution Architect, Microsoft Customer Success Unit
Impact: Invited back for a dedicated Apache Airflow session based on strong audience engagement and presentation quality.
🎤 Upcoming Engagements
Oxford Microsoft Data Platform Group - Apache Airflow Deep Dive
Date: TBD (2026)
Format: Online Presentation
Status: 🟡 INVITED (Return engagement)
Title: Apache Airflow in Production: Orchestration Patterns That Scale
Proposed Topics:
- DAG design patterns for complex dependencies (40+ pipelines)
- Production best practices from running Airflow on Azure Kubernetes Service (AKS)
- Monitoring, alerting, and operational strategies for enterprise deployments
- Retry logic, idempotency, and error handling patterns
- Real-world orchestration challenges from financial services systems
- Airflow vs Azure Data Factory: When to use which
- Contributing to Apache Airflow open source (based on my merged PRs)
Technologies: Apache Airflow, Azure Kubernetes Service, PostgreSQL, PySpark, Azure Data Factory
Background: This return invitation follows my successful January 2026 presentation, in which Apache Airflow orchestration patterns generated significant audience interest and discussion.
📋 Pending Proposals
I’ve submitted speaking proposals to 13 conferences and meetups across Europe, including:
Major Conferences
SQLBits 2026 - Newport, United Kingdom (April 2026)
- Europe’s largest data platform conference
- Topic: “5 Production Data Pipeline Mistakes That Cost Me Weeks”
- Format: 60-minute session
PyData Global 2025 - Virtual
- International Python data science conference
- Topic: “5 Production Data Pipeline Mistakes That Cost Me Weeks”
- Format: 30-minute talk
Data Engineering Meetups & User Groups
Submitted to 11 additional meetups focusing on:
- Data engineering best practices
- Production debugging stories
- Building reliable data platforms
- Cloud data architecture
🎯 Speaking Topics
Production Data Pipeline Mistakes
Duration: 30-60 minutes
Real production incidents from financial services data engineering. Each story includes the problem, debugging process, root cause, and lessons learned.
Key Stories:
- Silent data loss (10% of transactions)
- Weekend-only failures
- Currency format bugs ($100 → 10,000)
- Schema changes creating duplicates
- Retry logic that processed the same day 47 times
- 6-hour analytics platform outage caused by a single transformation bug
Audience: Data engineers, software engineers working with data
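To give a flavour of the retry story above, here is a minimal sketch of one common guard against reprocessing the same day: record each successfully loaded partition and cap the number of attempts. This is an illustration only, not the production code from the talk; `PartitionLedger`, `run_with_retries`, and the in-memory set are all hypothetical names and assumptions (a real pipeline would keep the ledger in durable storage).

```python
from datetime import date
from typing import Callable, Optional, Set


class PartitionLedger:
    """Hypothetical in-memory record of daily partitions already loaded."""

    def __init__(self) -> None:
        self._done: Set[date] = set()

    def is_done(self, day: date) -> bool:
        return day in self._done

    def mark_done(self, day: date) -> None:
        self._done.add(day)


def run_with_retries(
    day: date,
    load: Callable[[date], None],
    ledger: PartitionLedger,
    max_attempts: int = 3,
) -> int:
    """Run load(day) at most max_attempts times and return the attempts made.

    Returns 0 when the day is already in the ledger, so a re-triggered
    run becomes a no-op instead of processing the same day again.
    """
    if ledger.is_done(day):
        return 0
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        try:
            load(day)
        except Exception as exc:  # broad on purpose in this sketch
            last_error = exc
            continue
        ledger.mark_done(day)
        return attempt
    raise RuntimeError(f"{day} failed after {max_attempts} attempts") from last_error
```

The two guards cover different failure modes: the attempt cap bounds a single run, while the ledger makes repeated runs for the same day harmless.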
Building Scalable Data Pipelines
Duration: 45-60 minutes
Architecture patterns for production data platforms, from ingestion to analytics. Includes real-world examples from systems processing millions of transactions daily.
Topics:
- Three-zone architecture (Raw → Curated → Refined)
- Streaming vs batch processing trade-offs
- Data quality frameworks and ML-powered monitoring
- Orchestration patterns with Apache Airflow
- Monitoring and observability strategies
- Azure vs AWS data stack comparison
Audience: Data engineers, data architects, platform engineers
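As a rough illustration of the three-zone idea, the sketch below moves a handful of records from Raw to Curated to Refined in plain Python. It is a toy under stated assumptions, not the PySpark code from the talk: the field names (`txn_id`, `account`, `amount`) and the functions `to_curated` and `to_refined` are hypothetical, and `float` is used only for brevity where real monetary data would call for a decimal type.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def to_curated(raw_rows: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    """Raw -> Curated: parse types, skip malformed rows, de-duplicate by txn_id."""
    seen = set()
    curated: List[Dict[str, object]] = []
    for row in raw_rows:
        try:
            txn_id = row["txn_id"]
            account = row["account"]
            amount = float(row["amount"])
        except (KeyError, ValueError):
            # A real pipeline would quarantine and count these rows
            # rather than skip them silently.
            continue
        if txn_id in seen:
            continue
        seen.add(txn_id)
        curated.append({"txn_id": txn_id, "account": account, "amount": amount})
    return curated


def to_refined(curated_rows: Iterable[Dict[str, object]]) -> Dict[str, float]:
    """Curated -> Refined: aggregate into per-account totals for analytics."""
    totals: Dict[str, float] = defaultdict(float)
    for row in curated_rows:
        totals[str(row["account"])] += float(row["amount"])
    return dict(totals)


# Raw zone: records kept exactly as ingested, including bad and duplicate rows.
raw = [
    {"txn_id": "1", "account": "A", "amount": "10.5"},
    {"txn_id": "1", "account": "A", "amount": "10.5"},  # duplicate delivery
    {"txn_id": "2", "account": "B", "amount": "oops"},  # malformed amount
    {"txn_id": "3", "account": "A", "amount": "4.5"},
]
refined = to_refined(to_curated(raw))
```

Keeping the raw records untouched is what makes the later zones replayable: if a Curated rule turns out to be wrong, it can be fixed and re-run from Raw.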
Apache Airflow in Production
Duration: 45-60 minutes
Production-tested patterns for running Apache Airflow at scale in enterprise environments. Based on experience managing 40+ interdependent pipelines on Azure Kubernetes Service.
Topics:
- DAG design patterns for complex workflows
- Deployment on Kubernetes (AKS)
- Monitoring, alerting, and SLA management
- Idempotency and retry strategies
- Integration with Azure services (Databricks, ADLS, Synapse)
- Contributing to Airflow open source
Audience: Data engineers, DevOps engineers, platform engineers
💡 Speaking Philosophy
- Real experiences over theory: Every talk is based on actual production work at scale
- Practical and actionable: Attendees leave with patterns they can apply immediately
- Honest about failures: Sharing what went wrong, how we debugged it, and how we fixed it
- Code and architecture: Real production code examples, not just slide diagrams
- Interactive and engaging: Encouraging questions and discussion throughout
📊 Speaking Experience
Conference & Community Talks:
- Oxford Microsoft Data Platform Group (January 2026) - 60+ attendees
- Invited return speaker for Apache Airflow session
Technical Presentations:
- Internal technical talks at NatWest Bank and Accenture
- Team knowledge sharing sessions on data engineering best practices
- Client presentations on data platform architecture
- Architecture review presentations for enterprise systems
Community Engagement:
- Active participant in data engineering communities (Reddit r/dataengineering, LinkedIn, Slack)
- Technical writing with 71,000+ views across Medium and Dev.to
- Open source contributions with code reviews and technical discussions
- Apache Airflow contributor (merged PRs)
🎤 Want Me to Speak?
I’m available for:
- Conference talks (virtual or in-person UK/Europe)
- Meetup presentations (online or London-based)
- Corporate tech talks and workshops
- Panel discussions on data engineering
- Internal company training sessions
Topics I cover:
- Production data pipeline architecture and design patterns
- Debugging production data issues and incident response
- Data quality, reliability, and monitoring strategies
- Cloud data platforms (AWS, Azure, Microsoft Fabric)
- Apache Airflow and workflow orchestration
- Open source data tools and contributions
- Real-time streaming and batch processing at scale
Recent talk highlights:
- Three-zone architecture patterns
- ML-powered data quality monitoring
- Orchestrating 40+ interdependent pipelines
- Recovery from production incidents
Contact: kalluripradeep99@gmail.com
Note: Based in London, UK. Available for in-person speaking in UK/Europe and virtual events globally.
🔗 Additional Resources
Technical Writing: View my articles with 71,000+ views
Open Source: See my contributions to Apache Airflow and dbt-core
Projects: View my work including real-time data quality systems
Portfolio: GitHub • LinkedIn • Medium