
Posts

Meesho PySpark Interview Questions for Data Engineers in 2025

Preparing for a PySpark interview? Let's tackle some commonly asked questions, along with practical answers and insights, to ace your next Data Engineering interview at Meesho or any top-tier tech company.

1. Explain how caching and persistence work in PySpark. When would you use cache() versus persist(), and what are their performance implications?

Answer:

Caching: stores data in memory (the default) for faster retrieval. Use cache() when you need to reuse a DataFrame or RDD multiple times in a session without specifying a storage level. Example:

```python
df.cache()
df.count()  # Triggers caching
```

Persistence: lets you specify a storage level (e.g., memory, disk, or a combination). Use persist() when memory is limited and you want a fallback to disk storage. Example:

```python
from pyspark import StorageLevel

df.persist(StorageLevel.MEMORY_AND_DISK)
df.count()  # Triggers persistence
```

Performance implications: cache() is ...
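The recompute-versus-reuse behavior behind cache() can be mimicked outside Spark. The sketch below is a toy plain-Python model (the LazyDataset class is invented for illustration, not a Spark API): each action recomputes the lineage unless cache() has marked the result for reuse, which is exactly why caching a DataFrame used by several actions saves work.

```python
class LazyDataset:
    """Toy stand-in for a Spark DataFrame/RDD (illustration only, not Spark)."""

    def __init__(self, data, transform=lambda x: x):
        self.data = data
        self.transform = transform
        self.compute_count = 0      # how many times the pipeline actually ran
        self._should_cache = False
        self._cached = None

    def cache(self):
        # Lazy, like Spark: nothing is computed or stored yet.
        self._should_cache = True
        return self

    def _materialize(self):
        if self._cached is not None:
            return self._cached     # reuse cached result, no recomputation
        self.compute_count += 1
        result = [self.transform(x) for x in self.data]
        if self._should_cache:
            self._cached = result   # first action after cache() stores it
        return result

    def count(self):                # an "action" that triggers computation
        return len(self._materialize())


uncached = LazyDataset(range(5), lambda x: x * 2)
uncached.count()
uncached.count()
print(uncached.compute_count)       # 2: every action recomputes

cached = LazyDataset(range(5), lambda x: x * 2).cache()
cached.count()
cached.count()
print(cached.compute_count)         # 1: computed once, reused afterwards
```

As in Spark, the "caching" here is lazy: nothing is stored until the first action runs.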


Power BI Gateways – Real-Time Insights & Interview Tips

As a Power BI developer with three years of hands-on experience, I've encountered several scenarios requiring efficient use of Power BI Gateways. These tools are vital for enabling secure data transfer, especially when connecting on-premises data sources to Power BI for real-time or scheduled insights.

What are Power BI Gateways?

A Power BI Gateway acts as a bridge that facilitates secure data movement between on-premises sources (e.g., SQL Server, Oracle, Excel) and the Power BI service. Gateways ensure seamless connectivity and enable real-time or scheduled data refreshes, helping organizations make data-driven decisions.

Real-World Use Case

In one of my recent projects, I configured an On-Premises Data Gateway to provide real-time updates for a client's sales dashboard. The dashboard sourced data from a SQL Server database hosted on the client's internal network. This implementation enabled: Live Data Access: Sales teams cou...

PwC Data Analyst Interview question and its answer

PwC Data Analyst Interview Experience (1–3 Years)

Are you preparing for a data analyst role at PwC or a similar organization? Here's my recent experience tackling some challenging SQL and Python interview questions during the selection process for a PwC Data Analyst role. These questions test both foundational knowledge and problem-solving skills. Here's how I approached them.

SQL Questions

1. How Indexing Works in SQL

Indexing improves query performance by allowing faster retrieval of rows. A clustered index organizes the data physically, while a non-clustered index uses pointers to rows. Choose columns frequently used in WHERE or JOIN clauses for indexing, such as CustomerID in a Transactions table.

2. Identify Customers with Purchases in Consecutive Months

Using window functions:

```sql
WITH ConsecutivePurchases AS (
    SELECT CustomerID,
           MONTH(TransactionDate) AS TransactionMonth,
           YEAR(TransactionD...
```
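The query above is cut off, but the consecutive-months logic can be exercised end to end with SQLite's window functions. In this sketch the table and column names follow the post, while the sample rows and the year * 12 + month numbering trick are my own (the trick makes year boundaries and month boundaries behave uniformly):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (CustomerID INT, TransactionDate TEXT);
INSERT INTO Transactions VALUES
  (1, '2024-01-15'), (1, '2024-02-03'),   -- consecutive months
  (2, '2024-01-10'), (2, '2024-03-09'),   -- one-month gap: excluded
  (3, '2024-12-20'), (3, '2025-01-05');   -- consecutive across a year boundary
""")

rows = [r[0] for r in conn.execute("""
    WITH Monthly AS (
        SELECT DISTINCT CustomerID,
               CAST(strftime('%Y', TransactionDate) AS INTEGER) * 12 +
               CAST(strftime('%m', TransactionDate) AS INTEGER) AS MonthNum
        FROM Transactions
    )
    SELECT DISTINCT CustomerID
    FROM (
        SELECT CustomerID,
               MonthNum - LAG(MonthNum) OVER (
                   PARTITION BY CustomerID ORDER BY MonthNum) AS Gap
        FROM Monthly
    ) AS Gaps
    WHERE Gap = 1
    ORDER BY CustomerID
""")]
print(rows)  # [1, 3]
```

A gap of exactly 1 between a customer's consecutive purchase months is what qualifies them, which is the same idea the post's LAG/window-function approach relies on.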

Wells Fargo Data Analyst Interview and Answers

My Wells Fargo Data Analyst Interview Experience (1–3 Years), CTC: 16 LPA

As a data enthusiast and SQL aficionado, I recently tackled some challenging SQL and Python questions in a Wells Fargo interview for a Data Analyst position. The experience was both rewarding and insightful. Here's how I approached these questions.

SQL Questions

1. Identify Inactive Accounts

To identify accounts inactive for more than 12 months:

```sql
SELECT AccountID, CustomerID, Balance
FROM Accounts
WHERE LastTransactionDate < DATEADD(YEAR, -1, GETDATE());
```

This query filters accounts where the LastTransactionDate is older than one year.

2. Top 3 Accounts by Transaction Volume Per Month

Using ROW_NUMBER() to rank accounts by total transaction volume for each month:

```sql
WITH MonthlyVolume AS (
    SELECT AccountID,
           SUM(Amount) AS TotalVolume,
           MONTH(TransactionDate) AS TransactionMonth,
           YEAR(TransactionDate) AS TransactionYear
    FROM Transactions
    GROUP...
```
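DATEADD and GETDATE are SQL Server functions, so the first query can be sanity-checked locally with SQLite, whose equivalent cutoff is date('now', '-12 months'). The sample accounts below are invented for the sketch:

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts "
    "(AccountID INT, CustomerID INT, Balance REAL, LastTransactionDate TEXT)"
)

today = date.today()
recent = (today - timedelta(days=30)).isoformat()   # active within the year
stale = (today - timedelta(days=400)).isoformat()   # inactive > 12 months
conn.executemany("INSERT INTO Accounts VALUES (?, ?, ?, ?)", [
    (101, 1, 2500.0, recent),
    (102, 2, 900.0, stale),
])

# SQLite stand-in for: WHERE LastTransactionDate < DATEADD(YEAR, -1, GETDATE())
inactive = conn.execute("""
    SELECT AccountID, CustomerID, Balance
    FROM Accounts
    WHERE LastTransactionDate < date('now', '-12 months')
""").fetchall()
print(inactive)  # [(102, 2, 900.0)]
```

Only the account whose last transaction predates the 12-month cutoff is returned, matching the behavior of the SQL Server version.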

SQL Questions Asked in an American Express Interview

How I Would Solve These Tricky SQL Questions Asked in an American Express Interview

SQL is a fundamental skill for any data analyst, and mastering complex queries is key to standing out in interviews. Below, I break down how I would approach solving the tricky SQL questions mentioned. Each of these challenges is designed to test both your technical proficiency and your problem-solving ability. Let's dive into the solutions.

1. Find the Second-Highest Salary in a Table Without Using LIMIT or TOP

This is a classic problem that requires creativity. My solution:

```sql
SELECT MAX(salary)
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
```

Here, the subquery finds the maximum salary, and the outer query selects the highest salary below that.

2. Find All Employees Who Earn More Than Their Managers

Joining the table to itself is the key here:

```sql
SELECT e1.employee_name
FROM employees e1
JOIN employees e2 ON e1.manager_id = e2.employee_id
WHERE e1.salary > ...
```
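The second-highest-salary query is easy to verify against a throwaway SQLite database (the sample salaries are invented). One property worth mentioning in an interview: because the filter compares against MAX(salary), ties for the top salary are skipped and the second distinct salary is returned:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (employee_name TEXT, salary INT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [
    ("Ana", 90000),
    ("Ben", 120000),
    ("Cara", 120000),   # tie for the top salary
    ("Dev", 75000),
])

# Same query as in the post: highest salary strictly below the maximum
second_highest = conn.execute("""
    SELECT MAX(salary)
    FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
""").fetchone()[0]
print(second_highest)  # 90000
```

Even with two employees at 120000, the query returns 90000, not the duplicated top value.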

How I Cracked the Data Analyst Role at Flipkart

How I Cracked the Data Analyst Role at Flipkart 🚀

The journey to securing a Data Analyst role at Flipkart was both challenging and rewarding. Here's a detailed walkthrough of my experience, preparation strategy, and key takeaways.

Application Process

Applied through: LinkedIn
Total number of rounds: 5

- HR Discussion: Focused on my past roles, experiences, and suitability for the position.
- 1st Technical Round: Covered foundational concepts in Excel, Power BI, and SQL.
- 2nd Technical Round: Delved into complex SQL queries and advanced Excel-based problem-solving.
- Managerial Round: Scenario-based questions to assess analytical thinking and problem-solving in real-world situations.
- Final HR Discussion: Discussed roles, responsibilities, and expectations for the role.

My 3-Month Preparation Strategy 📆

Month 1: Advanced Excel, Power BI, and Data Visualization (source: Pavan Lalwani 🇮🇳)

Excel for Data Analysis: Excel was the backbone of my initial preparation. I focused on the followi...

BlackRock Data Analyst Interview and Answer Bengaluru

BlackRock's Data Analyst interview process is known for its intensity and focus on technical expertise, especially in SQL and Python. The questions were a mix of practical problems, theoretical knowledge, and real-world financial scenarios, reflecting BlackRock's emphasis on analytical rigor and financial acumen. Here's a breakdown of the questions I encountered and my approach to solving them.

SQL Questions

1️⃣ Identify customers who have invested in at least two funds with opposite performance trends over the last 6 months.

Answer:

```sql
WITH FundPerformance AS (
    SELECT FundID,
           CASE WHEN AVG(Return) > 0 THEN 'Increasing'
                ELSE 'Decreasing' END AS Trend
    FROM FundReturns
    WHERE Date >= DATE_SUB(CURDATE(), INTERVAL 6 MONTH)
    GROUP BY FundID
),
CustomerInvestments AS (
    SELECT CustomerID, FundID
    FROM Investments
)
SELECT ci.CustomerID
FR...
```
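The full query is truncated above, but its two-step logic (classify each fund's six-month trend, then keep customers whose holdings span both trend labels) can be sketched in plain Python. All sample data and names below are invented for illustration:

```python
from statistics import mean

# Hypothetical monthly returns per fund and customer holdings
fund_returns = {
    "F1": [0.02, 0.01, 0.03],      # average > 0  -> "Increasing"
    "F2": [-0.01, -0.02, -0.01],   # average < 0  -> "Decreasing"
    "F3": [0.01, 0.02, 0.01],      # "Increasing"
}
investments = [
    ("C1", "F1"), ("C1", "F2"),    # opposite trends -> qualifies
    ("C2", "F1"), ("C2", "F3"),    # both increasing -> excluded
    ("C3", "F2"),                  # only one fund   -> excluded
]

# Step 1: mirror CASE WHEN AVG(Return) > 0 THEN 'Increasing' ELSE 'Decreasing'
trend = {fund: "Increasing" if mean(r) > 0 else "Decreasing"
         for fund, r in fund_returns.items()}

# Step 2: a customer qualifies if their funds cover both trend labels
holdings = {}
for customer, fund in investments:
    holdings.setdefault(customer, set()).add(trend[fund])

qualifying = sorted(c for c, trends in holdings.items()
                    if trends == {"Increasing", "Decreasing"})
print(qualifying)  # ['C1']
```

The SQL version does the same thing with a join between CustomerInvestments and FundPerformance followed by a check that both trend values appear per customer.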

Shell Data Analyst Interview question and answer December 2024

Shell Data Analyst Interview Experience: CTC - 18 LPA

Shell's Data Analyst role demands strong SQL, Python, and Power BI skills alongside the ability to align technical insights with business strategy. Below, I've shared the questions asked during my interview process and how I would have answered them.

SQL Questions

1️⃣ Write a query to calculate the cumulative revenue per customer for each month in the last year.

Answer:

```sql
SELECT CustomerID,
       DATE_FORMAT(Date, '%Y-%m') AS Month,
       SUM(Amount) OVER (PARTITION BY CustomerID
                         ORDER BY DATE_FORMAT(Date, '%Y-%m')) AS CumulativeRevenue
FROM Transactions
WHERE Date >= DATE_SUB(CURDATE(), INTERVAL 1 YEAR);
```

2️⃣ Identify plants that consistently exceeded their daily average output for at least 20 days in a given month.

Answer:

```sql
WITH DailyAvg AS (
    SELECT PlantID, AVG(Output) AS AvgOutput
    FROM Production
    GROUP BY PlantID
),
ExceedDay...
```
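One subtlety with the first query: without aggregating first, a customer with several transactions in a month produces one output row per transaction rather than one per month. The SQLite sketch below (DATE_FORMAT is MySQL, so strftime stands in; the sample data is invented) pre-aggregates to one row per customer-month before taking the running sum:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Transactions (CustomerID INT, Date TEXT, Amount REAL)")
conn.executemany("INSERT INTO Transactions VALUES (?, ?, ?)", [
    (1, "2024-01-05", 100.0),
    (1, "2024-01-20", 50.0),   # second January transaction for customer 1
    (1, "2024-02-10", 25.0),
    (2, "2024-01-08", 200.0),
])

# Aggregate to one row per customer-month, then take a running SUM
rows = conn.execute("""
    WITH Monthly AS (
        SELECT CustomerID,
               strftime('%Y-%m', Date) AS Month,
               SUM(Amount) AS Revenue
        FROM Transactions
        GROUP BY CustomerID, Month
    )
    SELECT CustomerID, Month,
           SUM(Revenue) OVER (PARTITION BY CustomerID
                              ORDER BY Month) AS CumulativeRevenue
    FROM Monthly
    ORDER BY CustomerID, Month
""").fetchall()
print(rows)
# [(1, '2024-01', 150.0), (1, '2024-02', 175.0), (2, '2024-01', 200.0)]
```

Customer 1's two January transactions collapse into a single 150.0 row, and February's cumulative total builds on it.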
