Power BI Developer Interview Question at Indegene

Recently Asked Power BI Developer Interview Question at Indegene

If you are a Power BI enthusiast or developer, expect interview questions to delve into the technical intricacies of DAX (Data Analysis Expressions). Here’s a deep dive into three commonly asked questions, recently posed to a candidate with 2+ years of experience for the Power BI Developer role at Indegene.



1. What is the difference between ALL, ALLSELECTED, and ALLEXCEPT functions?

Understanding these functions is key to managing filters effectively in your calculations.

ALL

➡️ Returns all rows of a table (or all values of a column) and ignores every filter applied to it, whether the filter comes from a slicer, a visual, or a page or report filter.

Example:

DAX

TotalSalesWithoutFilters = SUMX(ALL(Sales), Sales[Amount])

If filters are applied to Region and Product, using ALL(Sales) ignores both.

One-liner: Removes all filters from the data.
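
A common use of ALL is a share-of-total measure. The sketch below assumes the same Sales table and Amount column as the example above; the measure name PctOfAllSales is only illustrative.

```dax
PctOfAllSales =
// Numerator: sales in the current filter context (for example, one region)
VAR CurrentSales = SUM ( Sales[Amount] )
// Denominator: sales with every filter on Sales removed
VAR AllSales = CALCULATE ( SUM ( Sales[Amount] ), ALL ( Sales ) )
RETURN
    DIVIDE ( CurrentSales, AllSales )
```

In a table visual grouped by Region, each row would show that region's share of total sales, and the denominator stays fixed at the grand total even when slicer selections change.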


ALLSELECTED

➡️ Removes the filters that come from the rows and columns of the current visual, but keeps filters that come from outside it, such as slicers, other visuals, and page or report filters.

Example:

DAX

SalesInSlicerContext = SUMX(ALLSELECTED(Sales), Sales[Amount])

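If a slicer sets Region = East and a chart is broken down by Product, ALLSELECTED(Sales) keeps the slicer filter but ignores the Product filter that each bar of the chart (for example, Bikes) applies.

One-liner: Keeps filters from outside the visual, ignores the visual's own row and column filters.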
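
ALLSELECTED is typically used for a "percent of what the user selected" measure. This is a minimal sketch, assuming the same Sales table and Amount column as above; PctOfSelectedSales is a hypothetical measure name.

```dax
PctOfSelectedSales =
// Numerator: sales for the current row or bar of the visual
VAR CurrentSales = SUM ( Sales[Amount] )
// Denominator: total over everything selected outside the visual
VAR SelectedSales = CALCULATE ( SUM ( Sales[Amount] ), ALLSELECTED ( Sales ) )
RETURN
    DIVIDE ( CurrentSales, SelectedSales )
```

With a slicer set to Region = East and a visual grouped by Product, the product rows would add up to 100% of East sales rather than 100% of all sales.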


ALLEXCEPT

➡️ Removes all filters from a table except the filters on the columns you specify.

Example:

DAX

TotalSalesByRegion = SUMX(ALLEXCEPT(Sales, Sales[Region]), Sales[Amount])

If filters are applied to Region and Product, using ALLEXCEPT(Sales, Sales[Region]) retains only the filter on Region.

One-liner: Keeps only specific filters, removes the rest.
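
ALLEXCEPT fits "share within a group" calculations. The sketch below assumes Region and Product are both columns of the Sales table itself; PctOfRegionSales is an illustrative name.

```dax
PctOfRegionSales =
// Numerator: sales for the current Region and Product combination
VAR CurrentSales = SUM ( Sales[Amount] )
// Denominator: only the Sales[Region] filter survives; other filters on Sales are removed
VAR RegionSales = CALCULATE ( SUM ( Sales[Amount] ), ALLEXCEPT ( Sales, Sales[Region] ) )
RETURN
    DIVIDE ( CurrentSales, RegionSales )
```

In a matrix with Sales[Region] and Sales[Product] on rows, each product row would show its share of that region's total.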


2. Which function is used to make an inactive relationship active for a specific calculation?

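The USERELATIONSHIP function allows you to activate an inactive relationship between tables for a single calculation. It is used as a filter argument of CALCULATE, and the relationship must already exist in the data model.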

Real-World Scenario:

Imagine your sales data has two dates: Order Date and Ship Date, both connected to a Calendar table. The default active relationship is with Order Date, but you want to use Ship Date for specific calculations.

Example:

DAX

TotalSalesByShipDate =
CALCULATE(SUM(Sales[Amount]), USERELATIONSHIP(Calendar[Date], Sales[ShipDate]))

This formula activates the Ship Date relationship only while the measure is evaluated. For that calculation it takes the place of the active Order Date relationship, and the data model itself is left unchanged.

Pro Tip: Use USERELATIONSHIP to handle multiple date dimensions in your data model.
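
Because the effect is limited to the CALCULATE that contains it, a single measure can compare both date perspectives. This is a sketch under the same assumptions as the scenario above (a Sales table with Amount and ShipDate columns and a Calendar table); ShippedMinusOrdered is a hypothetical measure name, and the table name is written in single quotes, which DAX always accepts.

```dax
ShippedMinusOrdered =
// Uses the active relationship (Order Date)
VAR OrderedAmount = SUM ( Sales[Amount] )
// Same column, but dates are matched through the inactive Ship Date relationship
VAR ShippedAmount =
    CALCULATE (
        SUM ( Sales[Amount] ),
        USERELATIONSHIP ( 'Calendar'[Date], Sales[ShipDate] )
    )
RETURN
    ShippedAmount - OrderedAmount
```

For a given month on the Calendar axis, this would return the amount shipped in that month minus the amount ordered in it.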


3. How do you optimize DAX calculations?

Optimizing DAX ensures that your Power BI reports perform efficiently, especially when handling large datasets.

Best Practices for DAX Optimization:

  1. Keep calculations simple: Break down large formulas into smaller, reusable measures.
  2. Use variables: Define intermediate results using VAR to avoid recalculating values.
    DAX
    OptimizedMeasure =
    VAR TotalRevenue = SUM(Sales[Revenue])
    RETURN TotalRevenue * 0.1

  3. Avoid unnecessary complexity: Prefer a plain aggregation such as SUM when you only need a column total, and keep iterators like SUMX for row-by-row expressions.
  4. Filter early: Apply filters at the data source or as early as possible in your model.
  5. Minimize CALCULATE: Use CALCULATE sparingly and with simple filters.
  6. Optimize your data model: Remove unnecessary columns and focus on a star schema for simpler calculations.
  7. Use the right table function: Avoid FILTER over an entire table when a condition on a single column will do, and use SUMMARIZE or GROUPBY when you need grouped results.
  8. Clean your data: Less data in the model means faster computations.
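
Two of the practices above, variables and simple CALCULATE filters, can be sketched as follows. The Sales[Cost] column, the "East" value, and both measure names are assumptions made for illustration, and each measure would be created separately in Power BI.

```dax
ProfitMargin =
// Revenue is computed once and used twice below
VAR Revenue = SUM ( Sales[Revenue] )
VAR Cost = SUM ( Sales[Cost] )   // Sales[Cost] is an assumed column
RETURN
    DIVIDE ( Revenue - Cost, Revenue )

EastSales =
// A condition on one column instead of FILTER ( Sales, ... ) over the whole table
CALCULATE (
    SUM ( Sales[Amount] ),
    KEEPFILTERS ( Sales[Region] = "East" )
)
```

KEEPFILTERS makes the column condition intersect with any Region filter already in place, which is how a FILTER over the Sales table would behave; without it, the condition would replace the existing Region filter.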


Final Thoughts

Mastering these concepts not only helps you ace interviews but also makes you a better Power BI professional. Interviewers often look for not just theoretical knowledge but also practical understanding through real-world applications. Practice these DAX functions with datasets to ensure you’re confident in their usage.

Have more Power BI questions? Drop them in the comments and let’s solve them together!

#PowerBI #DAX #InterviewQuestions #DataVisualization
