What is Lakehouse in Microsoft Fabric?

Microsoft Fabric is a data and analytics Software-as-a-Service (SaaS) cloud offering from Microsoft. One of its core components is the Lakehouse.

What is Microsoft Fabric?

Microsoft Fabric is an end-to-end data and analytics SaaS suite from Microsoft. It brings data storage, data engineering, and reporting workloads together in a single platform. For more detail, see my previous article on the topic.

Introduction to Lakehouse


The term "Lakehouse" combines "Data Lake" and "Data Warehouse". A Lakehouse is a single storage location for both structured data (as in a Data Warehouse) and unstructured data (as in a Data Lake). It scales to large data volumes and works with the other tools and services in Fabric for processing and querying that data.

In simpler terms, a Lakehouse functions like a database where you can store both tables and files.

Creating a Sample Lakehouse

A Lakehouse is an item within the Data Engineering workload of Microsoft Fabric. To create a Lakehouse:

  1. Navigate to the Microsoft Fabric portal.
  2. Go to the Data Engineering home page.
  3. Select the Lakehouse option.
  4. Assign a name to your Lakehouse and create it.


A newly created Lakehouse is empty, with no tables or files. The next section covers how to add data to it.
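
The portal steps above can also be scripted. The following is a minimal sketch, assuming the Fabric REST API route for creating a Lakehouse in a workspace and a valid Microsoft Entra access token. The workspace ID, Lakehouse name, and token are hypothetical placeholders, and the function only builds the request so that you can inspect it before sending anything.

```python
import json
import urllib.request

# Assumed base URL of the Fabric REST API; verify it against the official docs.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_create_lakehouse_request(workspace_id, display_name, token):
    """Build (but do not send) the HTTP request that creates a Lakehouse."""
    body = json.dumps({"displayName": display_name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{FABRIC_API}/workspaces/{workspace_id}/lakehouses",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage; sending requires a real workspace ID and access token:
# request = build_create_lakehouse_request("<workspace-id>", "SalesLakehouse", "<token>")
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes the sketch easy to check without touching a real workspace.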

Loading Data into Lakehouse

There are several methods to load data into a Lakehouse:

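  • Dataflows Gen2: Power Query-based transformations that write their output to the Lakehouse
  • Data pipelines: orchestrated copy activities that move data in from other sources
  • Shortcuts (the New Shortcut option): links to data stored elsewhere, without copying it
  • Eventstreams (the New Eventstream option): ingestion of streaming, real-time data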


The Lakehouse Explorer is the default view that opens after you create a Lakehouse. From there, you can upload files directly or start any of the loading methods listed above.
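
Once a file has been uploaded, it still needs to be turned into a table before it can be queried. The sketch below shows one way this might look, assuming you are working in a Fabric notebook attached to the Lakehouse (a route this article does not otherwise cover), where a `spark` session is already available. The file path `Files/sales.csv` and the table name `sales` are hypothetical.

```python
def load_csv_to_table(spark, csv_path, table_name):
    """Read a CSV file from the Lakehouse and save it as a Delta table.

    `spark` is the SparkSession that the notebook provides (an assumption
    of this sketch); it is passed in rather than created here.
    """
    df = (
        spark.read
        .option("header", "true")       # first row holds the column names
        .option("inferSchema", "true")  # let Spark guess the column types
        .csv(csv_path)
    )
    # Overwrite any existing table of the same name with the new data.
    df.write.mode("overwrite").format("delta").saveAsTable(table_name)
    return df


# Hypothetical usage inside a notebook cell:
# load_csv_to_table(spark, "Files/sales.csv", "sales")
```

Passing the session in as a parameter keeps the function free of notebook-specific globals, so it can be reused or tested outside Fabric.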

SQL Endpoint for Lakehouse

A key feature of the Lakehouse is the SQL Endpoint, which lets you query the data stored in its tables using SQL. The endpoint is read-only: it supports queries but not data manipulation (inserts, updates, or deletes).

To access the SQL Endpoint:

  1. Navigate to your workspace and select the SQL Endpoint for your Lakehouse.
  2. This opens a SQL editor where you can write and execute SQL queries.
  3. You can also build queries in a visual query editor, which works much like the Power Query editor.
  4. The connection string for the SQL Endpoint is available under settings, so you can connect to it from other tools such as SQL Server Management Studio (SSMS).
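
The same connection string can be used from code as well as from SSMS. Here is a minimal sketch, assuming the third-party `pyodbc` package and Microsoft's ODBC Driver 18 for SQL Server are installed, and that interactive Microsoft Entra sign-in is available on your machine. The server value is whatever you copy from the settings page. The database name and the `sales` table in the usage comment are hypothetical.

```python
def build_connection_string(server, database):
    """Assemble an ODBC connection string for the Lakehouse SQL Endpoint."""
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",
        "Server": server,      # value copied from the SQL Endpoint settings
        "Database": database,  # assumed here to be the Lakehouse name
        "Authentication": "ActiveDirectoryInteractive",
        "Encrypt": "yes",
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def fetch_rows(server, database, query):
    """Run a read-only query against the endpoint and return all rows."""
    import pyodbc  # third-party; imported here so the helper above works without it

    connection = pyodbc.connect(build_connection_string(server, database))
    try:
        cursor = connection.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        connection.close()


# Hypothetical usage; only SELECT statements will succeed on this endpoint:
# rows = fetch_rows("<server-from-settings>", "<lakehouse-name>", "SELECT TOP 10 * FROM sales")
```

Because the endpoint is read-only, an INSERT, UPDATE, or DELETE sent this way should be rejected by the server rather than by the client code.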

Summary

The Microsoft Fabric Lakehouse is a storage solution that combines the strengths of Data Warehouses and Data Lakes. You can load data using methods such as Dataflows Gen2, then work with it through SQL or visualization tools like Power BI. This article is an introduction and a starting point for exploring what a Lakehouse in Microsoft Fabric can do.


