
Posts

Showing posts with the label Power BI Gateways Interview question

Meesho PySpark Interview Questions for Data Engineers in 2025

Preparing for a PySpark interview? Let’s tackle some commonly asked questions, along with practical answers and insights to help you ace your next Data Engineering interview at Meesho or any top-tier tech company.

1. Explain how caching and persistence work in PySpark. When would you use cache() versus persist(), and what are their performance implications?

Answer:

Caching: Stores data at the default storage level for faster retrieval (MEMORY_ONLY for RDDs, MEMORY_AND_DISK for DataFrames). Use cache() when you need to reuse a DataFrame or RDD multiple times in a session without specifying a storage level. Example:

    df.cache()
    df.count()  # Triggers caching

Persistence: Allows you to specify a storage level (e.g., memory, disk, or a combination). Use persist() when memory is limited and you want a fallback to disk storage. Example:

    from pyspark import StorageLevel
    df.persist(StorageLevel.MEMORY_AND_DISK)
    df.count()  # Triggers persistence

Performance Implications: cache() is ...


Power BI Gateways – Real-Time Insights & Interview Tips

As a Power BI developer with 3 years of hands-on experience, I’ve encountered several scenarios requiring efficient use of Power BI Gateways. These tools are vital for enabling secure data transfer, especially when connecting on-premises data sources to Power BI for real-time or scheduled insights.

What are Power BI Gateways?

A Power BI Gateway acts as a bridge that facilitates secure data movement between on-premises sources (e.g., SQL Server, Oracle, Excel) and the Power BI service. Gateways ensure seamless connectivity and enable real-time or scheduled data refreshes, helping organizations make data-driven decisions.

Real-World Use Case

In one of my recent projects, I configured an On-Premises Data Gateway to provide real-time updates for a client’s sales dashboard. The dashboard sourced data from a SQL Server database hosted on the client’s internal network. This implementation enabled:

Live Data Access: Sales teams cou...
