"15 Databricks Interview Questions | Medium To Hard Level" by B V Sarath Chandra, in Data Engineer Things (Feb 13). Must read for DE Databricks Interview.
"Understand Apache Airflow Like Never Before" by Darshil Parmar, in DataVidhya (Jan 29). In the world of data engineering, one of the most critical tasks you’ll encounter is building data pipelines.
"Apache Spark Interview Scenarios: Key Configurations Every Data Engineer Should Know" by Sahil Sharma, in Data Engineer Things (Dec 15, 2024). Optimizing Apache Spark: Essential Configurations for Performance, Resource Management, and Scalability.
"Understanding Parquet Files | Efficient Data Storage" by Muttineni Sai Rohith, in Dev Genius (Jan 5). In the modern data-driven world, efficiency in data storage and retrieval is paramount. As datasets grow in size, traditional file formats…
"Why Kafka ditched Zookeeper" by Sutanu Dutta, in Dev Genius (Nov 3, 2024). For many years, Apache Kafka relied on Apache ZooKeeper to manage metadata, cluster configurations and maintain a distributed state across…
"Ace Your Data Engineering Interview: 20 Questions and Answers to Land Your Dream Job" by Mayurkumar Surani (Sep 29, 2024). So, you’re gearing up for a data engineering interview? Congratulations! It’s an exciting field with tons of opportunity. But let’s be…
"Why Data Professionals Need a Portfolio (and How to Create One)" by KudosWall, in The Resume Whisperer (Oct 21, 2024). For a data professional, having a well-crafted resume is essential, but it might not be enough. Whether you’re a Data Engineer, Data…
"I spent 8 hours learning the details of the Apache Spark scheduling process." by Vu Trinh, in Data Engineer Things (Oct 29, 2024). Anatomy of a Spark job and the typical scheduling process.
"Stream Processing with Python: Part 2: Kafka Producer-Consumer with Avro Schema and Schema Registry" by Irem Ertürk (Jul 26, 2024). In Part 2 of Stream Processing with Python series, we will deal with a more structured way of managing the messages with the help of…
"Maestro: Netflix’s Workflow Orchestrator" by Netflix Technology Blog, in Netflix TechBlog (Jul 22, 2024). By Jun He, Natallia Dzenisenka, Praneeth Yenugutala, Yingyi Zhang, and Anjali Norwood.
"YouTube Trend Analysis Pipeline: ETL with Airflow, Spark, S3 and Docker" by Swathi Thokala (Jun 18, 2024). In this article, we will walk through creating an automated ETL (Extract, Transform, Load) pipeline using Apache Airflow and PySpark. This…
"Apache Kafka — Overview" by Vu Trinh, in Data Engineer Things (Jul 6, 2024). The terminology and the architecture.
"How does LinkedIn process 4 Trillion Events every day?" by Vu Trinh, in The Deep Hub (Jun 10, 2024). Key insights on how LinkedIn leverages Apache Beam for real-time processing.