Posts

Showing posts with the label PwC Data Analyst Interview question and its answer

Meesho PySpark Interview Questions for Data Engineers in 2025

Preparing for a PySpark interview? Let's tackle some commonly asked questions, along with practical answers and insights to help you ace your next Data Engineering interview at Meesho or any top-tier tech company.

1. Explain how caching and persistence work in PySpark. When would you use cache() versus persist(), and what are their performance implications?

Answer:

Caching: Stores data at the default storage level for faster retrieval (memory only for RDDs; memory with spill to disk for DataFrames). Use cache() when you need to reuse a DataFrame or RDD multiple times in a session and do not need to choose a storage level. Example:

    df.cache()
    df.count()  # Triggers caching

Persistence: Lets you specify the storage level (e.g., memory, disk, or a combination). Use persist() when memory is limited and you want a fallback to disk storage. Example:

    from pyspark import StorageLevel
    df.persist(StorageLevel.MEMORY_AND_DISK)
    df.count()  # Triggers persistence

Performance Implications: cache() is ...

PwC Data Analyst Interview question and its answer

PwC Data Analyst Interview Experience (1–3 Years)

Are you preparing for a data analyst role at PwC or a similar organization? Here's my recent experience tackling some challenging SQL and Python interview questions during the selection process for a PwC Data Analyst role. These questions test both foundational knowledge and problem-solving skills. Here's how I approached them.

SQL Questions

1. How Indexing Works in SQL

Indexing improves query performance by letting the database locate rows without scanning the whole table. A clustered index determines the physical order of the rows in the table, while a non-clustered index is a separate structure that stores pointers to the rows. Index the columns that appear most often in WHERE or JOIN clauses, such as CustomerID in a Transactions table.

2. Identify Customers with Purchases in Consecutive Months

Using window functions:

    WITH ConsecutivePurchases AS (
        SELECT CustomerID,
               MONTH(TransactionDate) AS TransactionMonth,
               YEAR(TransactionD...
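To make the indexing answer (SQL question 1) concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under assumptions: the Amount column, the index name idx_transactions_customer, the sample rows, and the query_plan helper are all hypothetical, and SQLite stands in for whatever database the interview had in mind. The idea is to compare the query plan for a CustomerID lookup before and after the index exists.

```python
import sqlite3

# In-memory database with a hypothetical Transactions table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Transactions ("
    "TransactionID INTEGER PRIMARY KEY, CustomerID INTEGER, Amount REAL)"
)
# Made-up sample data: 1000 rows spread across 100 customers.
conn.executemany(
    "INSERT INTO Transactions (CustomerID, Amount) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def query_plan(connection, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT Amount FROM Transactions WHERE CustomerID = 42"

# Without an index, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup)

# Index the column used in the WHERE clause, then check the plan again.
conn.execute(
    "CREATE INDEX idx_transactions_customer ON Transactions (CustomerID)"
)
plan_after = query_plan(conn, lookup)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan should report a table scan and the second a search that uses the new index. The exact wording of the plan text differs between SQLite versions, and other databases expose the same information through their own EXPLAIN variants.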