
Posts

Showing posts from December, 2024

Meesho PySpark Interview Questions for Data Engineers in 2025

Meesho PySpark Interview Questions for Data Engineers in 2025 Preparing for a PySpark interview? Let’s tackle some commonly asked questions, along with practical answers and insights to ace your next Data Engineering interview at Meesho or any top-tier tech company. 1. Explain how caching and persistence work in PySpark. When would you use cache() versus persist() and what are their performance implications? Answer: Caching: Stores data at the default storage level (MEMORY_ONLY for RDDs, MEMORY_AND_DISK for DataFrames) for faster retrieval. Use cache() when you need to reuse a DataFrame or RDD multiple times in a session without specifying storage levels. Example: python df.cache() df.count() # Triggers caching Persistence: Allows you to specify storage levels (e.g., memory, disk, or a combination). Use persist() when you need explicit control over the storage level, for example a disk fallback for an RDD when memory is limited. Example: python from pyspark import StorageLevel df.persist(StorageLevel.MEMORY_AND_DISK) df.count() # Triggers persistence Performance Implications: cache() is ...

Interview Framework at Paytm for a Business Analyst Role

Interview Framework at Paytm for a Business Analyst Role (For Freshers & Experienced Candidates) In this blog, I’ll share a detailed breakdown of the Paytm interview process for the Business Analyst role, including insights into the technical rounds, expectations, and how to answer key questions. If you’re preparing for this role, this blog is for you! Round 1: Technical Interview (With Analysts) Duration: 1 Hour Structure: This round is conducted by analysts from the team you'll work with. It focuses on SQL, Python, Excel, and Visualization tools like Power BI or Tableau. Here’s a breakdown of the typical questions and how I would approach them: 1. SQL Questions Question: Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK(). Answer: ROW_NUMBER() assigns a unique number to each row, starting from 1, without caring about duplicates. RANK() assigns ranks, but if there are ties, the next rank skips numbers. For example: 1, 2, 2, 4. DENSE_RANK() is sim...
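To see the three ranking functions side by side, here is a minimal sketch using Python's built-in sqlite3 module. The scores table and its rows are made-up sample data, not part of the interview question, and the sketch assumes the bundled SQLite is version 3.25 or newer (window function support).

```python
import sqlite3

# Hypothetical sample data: four scores, with B and C tied.
rows = [("A", 90), ("B", 85), ("C", 85), ("D", 70)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)

query = """
SELECT name,
       score,
       ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
       RANK()       OVER (ORDER BY score DESC)       AS rnk,
       DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
FROM scores
ORDER BY score DESC, name
"""

# Each result row is (name, score, row_num, rnk, dense_rnk).
ranking = conn.execute(query).fetchall()
conn.close()

for row in ranking:
    print(row)
```

With the tie on 85, ROW_NUMBER() still gives every row its own number, RANK() leaves a gap after the tie, and DENSE_RANK() does not.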

Power BI Developer Interview at Novartis: My Approach to the Questions

Power BI Developer Interview at Novartis: My Approach to the Questions Excited to share how I would answer the questions asked in a recent interview for a Power BI Developer role at Novartis. These questions cover both technical concepts and practical applications, so let’s dive in! 1️⃣ Introduce Yourself Answer: I’m a passionate data professional with [X years] of experience in data visualization, analytics, and reporting. I specialize in Power BI, SQL, and Python, having worked on projects involving dashboard creation, data modeling, and KPI analysis to drive business insights. My experience includes collaborating with cross-functional teams and delivering actionable insights for data-driven decision-making. 2️⃣ Explain Merge and Append Queries Answer: Merge Queries: Used to join two tables based on a common column (like SQL joins). It’s useful for combining data from different sources. Append Queries: Used to stack or union tables vertically, adding rows from one table to another....
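The Merge versus Append distinction can be illustrated outside Power BI with a rough Python analogy. This is only a sketch of the idea, not Power Query itself: the tables, column names, and the merge and append helpers are hypothetical, and the merge shown is inner-join style.

```python
# Hypothetical tables represented as lists of dicts.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 250},
    {"order_id": 2, "customer_id": 11, "amount": 100},
]
customers = [
    {"customer_id": 10, "name": "Asha"},
    {"customer_id": 11, "name": "Ravi"},
]
more_orders = [
    {"order_id": 3, "customer_id": 10, "amount": 75},
]


def merge(left, right, key):
    """Join two tables on a shared column (inner-join style)."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Combine each left row with every matching right row.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def append(*tables):
    """Stack tables vertically, keeping every row from each one."""
    return [dict(row) for table in tables for row in table]


merged = merge(orders, customers, "customer_id")  # adds columns
appended = append(orders, more_orders)            # adds rows

print(merged)
print(appended)
```

Merge widens the table by bringing in columns from the matching rows, while append lengthens it by adding rows with the same shape.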

BI/Tableau Developer Role Deloitte Interview Question (with proof)

Recent Deloitte Interview Question for a Power BI/Tableau Developer Role How I Would Answer:  Interview Question: How would you analyze data, gather requirements, and use different tools to deliver insights? In this blog, I’ll walk you through my thought process using a practical example—a scenario where a clothing company wants to analyze why their shirt sales dropped last month. 1️⃣ Understand the Business Goal The first step is to clearly define the objective. Example: The goal is to understand the reasons behind a decline in shirt sales and recommend strategies to improve sales performance. 2️⃣ Identify Key Metrics Defining the right metrics is crucial for actionable insights. Example: Total shirt sales Customer purchases by region and day Return rates or refunds Sales trends by product category 3️⃣ Collect Relevant Data Gathering comprehensive data ensures the analysis is thorough. Example: Use SQL to query the company’s sales database for shirt sales by region, date, and pro...

UST Data Analyst Interview First-Round Questions

Cracking the UST Data Analyst Interview: First-Round Questions If you’re gearing up for a data analyst interview at UST, you’ll need more than just technical know-how; clarity in explaining concepts is equally critical. Here’s how I tackled the first-round questions and prepared to ace the challenge! 📋 Top Questions and My Approach 1️⃣ Self-Introduction Your introduction sets the tone for the interview. Here’s my strategy: Start with a brief overview of your educational background. Highlight your relevant experience, focusing on roles where you leveraged data analysis, SQL, or Power BI. End with a mention of key projects, tools you’re proficient in, and what excites you about the role. Example: "Hi, I’m [Your Name], a data analyst with 3+ years of experience in leveraging SQL, Power BI, and Python to drive data-driven decisions. In my recent role, I designed dashboards to monitor KPIs, optimized queries for better performance, and collaborated with cross-functional teams to deliv...

Siemens Data Analyst question asked in recent Interview

Siemens Data Analyst Interview Experience (1–3 Years): A Comprehensive Breakdown Landing a data analyst role at a reputed company like Siemens demands a solid understanding of SQL, Python, and Power BI. Here's how I tackled the questions asked during the interview, along with detailed explanations and solutions. SQL Questions 1. Find Devices Exceeding Daily Average Energy Usage by 20% in the Last Month The table EnergyConsumption has columns: DeviceID, Timestamp, and EnergyUsed. Solution: sql WITH DailyUsage AS ( SELECT DeviceID, CAST(Timestamp AS DATE) AS UsageDate, AVG(EnergyUsed) AS AvgDailyUsage FROM EnergyConsumption WHERE Timestamp >= DATEADD(MONTH, -1, GETDATE()) GROUP BY DeviceID, CAST(Timestamp AS DATE) ), ExceedingDevices AS ( SELECT e.DeviceID, e.Timestamp, e.EnergyUsed, d.AvgDailyUsage FROM EnergyConsumption e JOIN DailyU...

Power BI Developer Interview Question at Indegene

Recently Asked Power BI Developer Interview Question at Indegene As a Power BI enthusiast or developer, interview questions often delve into the technical intricacies of DAX (Data Analysis Expressions). Here’s a deep dive into a commonly asked question, recently posed to a 2+ year candidate for the Power BI Developer role at Indegene. 1. What is the difference between ALL, ALLSELECTED, and ALLEXCEPT functions? Understanding these functions is key to managing filters effectively in your calculations. ALL ➡️ Removes all filters applied to a table or column, including slicers, visuals, and external filters. Example: DAX TotalSalesWithoutFilters = SUMX(ALL(Sales), Sales[Amount]) If filters are applied to Region and Product, using ALL(Sales) ignores both. One-liner: Removes all filters from the data. ALLSELECTED ➡️ Removes filters inside a visual but respects filters from slicers or external visuals. Example: DAX SalesInSlicerContext = SUMX(ALLSELECTED(Sales), Sales[Amount]) If a slic...

Power BI Gateways – Real-Time Insights & Interview Tips

Power BI Gateways – Real-Time Insights & Interview Tips As a Power BI developer with 3 years of hands-on experience, I’ve encountered several scenarios requiring efficient use of Power BI Gateways. These tools are vital for enabling secure data transfer, especially when connecting on-premises data sources to Power BI for real-time or scheduled insights. What are Power BI Gateways? A Power BI Gateway acts as a bridge that facilitates secure data movement between on-premises sources (e.g., SQL Server, Oracle, Excel) and the Power BI service. Gateways ensure seamless connectivity and enable real-time or scheduled data refreshes, helping organizations make data-driven decisions. Real-World Use Case In one of my recent projects, I configured an On-Premises Data Gateway to provide real-time updates for a client’s sales dashboard. The dashboard sourced data from a SQL Server database hosted on the client’s internal network. This implementation enabled: Live Data Access: Sales teams cou...

PwC Data Analyst Interview question and its answer

PwC Data Analyst Interview question and its answer PwC Data Analyst Interview Experience (1–3 Years) Are you preparing for a data analyst role at PwC or a similar organization? Here’s my recent experience tackling some challenging SQL and Python interview questions during the selection process for a PwC Data Analyst role. These questions test both foundational knowledge and problem-solving skills. Here's how I approached them. SQL Questions 1. How Indexing Works in SQL Indexing improves query performance by allowing faster retrieval of rows. A clustered index organizes data physically, while a non-clustered index uses pointers to rows. Choose columns frequently used in WHERE or JOIN clauses for indexing, like CustomerID in a Transactions table. 2. Identify Customers with Purchases in Consecutive Months Using window functions: sql WITH ConsecutivePurchases AS ( SELECT CustomerID, MONTH(TransactionDate) AS TransactionMonth, YEAR(TransactionD...
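Since the query above is cut off, here is a separate, minimal sketch of one way to approach the consecutive-months question. It is not the post's original query. It uses Python's built-in sqlite3 module, so strftime stands in for MONTH() and YEAR(), and it assumes SQLite 3.25 or newer for LAG(). The sample rows and the MonthIndex idea (year * 12 + month, so December to January still counts as consecutive) are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data: (CustomerID, TransactionDate).
transactions = [
    (1, "2024-01-15"), (1, "2024-02-03"),  # consecutive months
    (2, "2024-01-20"), (2, "2024-03-05"),  # one-month gap
    (3, "2023-12-28"), (3, "2024-01-02"),  # consecutive across a year boundary
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Transactions (CustomerID INTEGER, TransactionDate TEXT)"
)
conn.executemany("INSERT INTO Transactions VALUES (?, ?)", transactions)

query = """
WITH MonthlyPurchases AS (
    -- One row per customer per calendar month, as a running month number.
    SELECT DISTINCT
           CustomerID,
           CAST(strftime('%Y', TransactionDate) AS INTEGER) * 12
             + CAST(strftime('%m', TransactionDate) AS INTEGER) AS MonthIndex
    FROM Transactions
),
WithPrevious AS (
    -- Attach the previous purchase month for the same customer.
    SELECT CustomerID,
           MonthIndex,
           LAG(MonthIndex) OVER (
               PARTITION BY CustomerID ORDER BY MonthIndex
           ) AS PrevMonthIndex
    FROM MonthlyPurchases
)
SELECT DISTINCT CustomerID
FROM WithPrevious
WHERE MonthIndex - PrevMonthIndex = 1
ORDER BY CustomerID
"""

consecutive_customers = [row[0] for row in conn.execute(query)]
conn.close()

print(consecutive_customers)
```

Collapsing year and month into a single number keeps the comparison to one subtraction and avoids a special case for the December to January rollover.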
