


Interview Framework at Paytm for a Business Analyst Role


(For Freshers & Experienced Candidates)

In this blog, I’ll share a detailed breakdown of the Paytm interview process for the Business Analyst role, including insights into the technical rounds, expectations, and how to answer key questions. If you’re preparing for this role, this blog is for you!



Round 1: Technical Interview (With Analysts)

Duration: 1 Hour
Structure: This round is conducted by analysts from the team you'll work with. It focuses on SQL, Python, Excel, and Visualization tools like Power BI or Tableau.

Here’s a breakdown of the typical questions and how I would approach them:

1. SQL Questions

  • Question: Explain the difference between ROW_NUMBER(), RANK(), and DENSE_RANK().
    Answer:

    • ROW_NUMBER() assigns a unique, sequential number to each row, starting from 1, even when rows tie on the ordering value.
    • RANK() assigns ranks, but if there are ties, the next rank skips numbers. For example: 1, 2, 2, 4.
    • DENSE_RANK() is similar to RANK(), but it doesn’t skip numbers: 1, 2, 2, 3.
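
To see the three functions side by side, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The scores table and its rows are made up for illustration.

```python
import sqlite3

# Hypothetical in-memory table with one tie (two rows scoring 80).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("A", 90), ("B", 80), ("C", 80), ("D", 70)],
)

rows = conn.execute(
    """
    SELECT name,
           score,
           -- name is added as a tie-breaker so the numbering is deterministic
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
    """
).fetchall()

row_nums = [r[2] for r in rows]
ranks = [r[3] for r in rows]
dense_ranks = [r[4] for r in rows]

for row in rows:
    print(row)
```

The two tied rows share rank 2. RANK() then jumps to 4, while DENSE_RANK() continues with 3, and ROW_NUMBER() never repeats a value.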
  • Question: How would you use a self-join in a scenario?
    Answer: Self-joins are useful for comparing rows in the same table.
    Example: To find pairs of employees who report to the same manager in an Employee table, I would write:

    sql

    SELECT e1.name AS Employee1, e2.name AS Employee2
    FROM Employee e1
    JOIN Employee e2 ON e1.manager_id = e2.manager_id
    WHERE e1.id < e2.id;  -- "<" lists each pair once; "!=" would return both (A, B) and (B, A)
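
The self-join can be tried out with Python's built-in sqlite3 module. This is a sketch on made-up rows; it runs the join twice, once with `!=` and once with `<` as the row filter, to show that `!=` returns every pair in both orders while `<` returns each pair once.

```python
import sqlite3

# Hypothetical Employee rows: two employees under manager 10, two under 20, one under 30.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [(1, "Asha", 10), (2, "Ravi", 10), (3, "Meera", 20), (4, "John", 20), (5, "Sara", 30)],
)

# {op} is filled in with the comparison operator used to exclude self-matches.
query = """
    SELECT e1.name AS Employee1, e2.name AS Employee2
    FROM Employee e1
    JOIN Employee e2 ON e1.manager_id = e2.manager_id
    WHERE e1.id {op} e2.id
    ORDER BY e1.id, e2.id
"""

pairs_both_orders = conn.execute(query.format(op="!=")).fetchall()
pairs_once = conn.execute(query.format(op="<")).fetchall()

print(len(pairs_both_orders), "rows with !=")
print(pairs_once)
```

Sara has no colleague under the same manager, so she does not appear in either result.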

2. Python Questions

  • Question: Write a Python program to print a star pattern.
    Answer:

    python
    n = 5
    for i in range(1, n + 1):
        print("*" * i)
  • Question: How would you use Pandas to handle missing data?
    Answer: I use the following methods:

    • fillna() to replace null values.
    • dropna() to remove rows or columns with nulls.
    • interpolate() to estimate missing values.
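
A short sketch of the three methods on a made-up DataFrame (the column names and values are illustrative only):

```python
import pandas as pd

# Hypothetical daily sales with two missing values.
df = pd.DataFrame(
    {
        "day": [1, 2, 3, 4, 5],
        "sales": [100.0, None, 300.0, None, 500.0],
    }
)

# fillna(): replace nulls with a fixed value (0 here; a mean or median is also common).
filled = df["sales"].fillna(0)

# dropna(): drop every row that contains at least one null.
dropped = df.dropna()

# interpolate(): estimate nulls from neighbouring values (linear by default).
interpolated = df["sales"].interpolate()

print(filled.tolist())
print(dropped)
print(interpolated.tolist())
```

Which method fits depends on the data: dropping rows loses information, a constant fill can distort averages, and interpolation only makes sense when the rows have a meaningful order.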

3. Excel Questions

  • Question: How do you use a Pivot Table to summarize data?
    Answer: I would explain:
    • Step 1: Select the dataset.
    • Step 2: Go to Insert > Pivot Table.
    • Step 3: Drag relevant fields to Rows, Columns, and Values.
    • Step 4: Apply filters if needed.

4. Visualization Tools

  • Question: Which visualization tool have you worked with, and how did you create dashboards?
    Answer: I have worked with Power BI and Tableau to create interactive dashboards.
    • Example: I built a sales dashboard that included KPIs like revenue growth, total sales, and region-based analysis.

Round 2: Senior Manager Interview

📌 Duration: 1 Hour
Structure: This round focuses on past impactful projects, technical questions, and your problem-solving ability with puzzles and guesstimates.

1. SQL Scenario-Based Questions

  • Question: Write a query to get the second highest salary from the Employee table.
    Answer:
    sql
    SELECT MAX(salary) AS Second_Highest_Salary
    FROM Employee WHERE salary NOT IN (SELECT MAX(salary) FROM Employee);
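
To check the query, it can be run against a small made-up table with Python's built-in sqlite3 module. The rows below are hypothetical and include a tie for the top salary.

```python
import sqlite3

# Hypothetical Employee rows; two employees share the highest salary.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [("Asha", 300), ("Ravi", 300), ("Meera", 200), ("John", 100)],
)

second_highest = conn.execute(
    """
    SELECT MAX(salary) AS Second_Highest_Salary
    FROM Employee
    WHERE salary NOT IN (SELECT MAX(salary) FROM Employee)
    """
).fetchone()[0]

print(second_highest)
```

Because the filter removes every row with the top salary, a tie at the top does not affect the result. If all salaries are equal, no rows remain and the query returns NULL (None in Python).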

2. Python Pandas Task

  • Question: How would you create a pivot table in Pandas?
    Answer:
    python
    import pandas as pd
    data = {'Region': ['North', 'South', 'North'], 'Sales': [100, 200, 150]}
    df = pd.DataFrame(data)
    pivot_table = df.pivot_table(values='Sales', index='Region', aggfunc='sum')
    print(pivot_table)

3. Puzzle or Guesstimate

  • Question: How many UPI users are there in Delhi?
    Answer:

    1. Start with Delhi’s population (~30 million).
    2. Assume ~50% have smartphones → 15 million.
    3. Assume ~80% of those smartphone owners use UPI → 12 million users.

    Tip: Use logical assumptions, break the problem into steps, and justify each assumption.
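
The arithmetic behind the estimate can be written as a tiny Python function, which makes it easy to swap in different assumptions during the interview. The function name and parameters are illustrative; the three input numbers are the rough assumptions from the answer, not measured data.

```python
def estimate_upi_users(population, smartphone_share, upi_share):
    """Chain the assumptions: population -> smartphone owners -> UPI users."""
    smartphone_users = population * smartphone_share
    return round(smartphone_users * upi_share)


# Assumptions from the answer: ~30 million people, ~50% smartphones, ~80% of those on UPI.
estimate = estimate_upi_users(30_000_000, 0.50, 0.80)
print(f"{estimate:,}")
```

Changing any one share and re-running shows how sensitive the final number is to that assumption.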


Round 3: Cultural Fit Interview (Optional)

This round focuses on your alignment with Paytm’s mission and team culture.

  • Common Question: Why do you want to join Paytm?
    Answer: Paytm’s innovative fintech solutions inspire me. I want to contribute to building user-friendly products and solving real-world problems using my data analysis skills.

Round 4: HR and Offer Discussion

If you clear the above rounds, HR will discuss the offer, salary structure, and benefits.

  • Question: What are your salary expectations?
    Answer: Based on my experience and market standards, I am looking for a competitive offer that reflects my skills and contributions.

Pro Tips for Cracking the Paytm Interview

  1. SQL: Focus on window functions, self-joins, and scenario-based queries.
  2. Python: Brush up on Pandas, basic scripts, and problem-solving.
  3. Guesstimates: Structure your approach logically and clearly.
  4. Excel: Know formulas, pivot tables, and data cleaning.
