
Pluralsight – Big Data Foundations: Storage and Data Formats 2026
English | Tutorial | Size: 161.83 MB
Big data performance and efficiency depend on proper storage and data organization. This course will teach you how to use distributed storage, select file formats like Parquet and ORC, and apply data layout strategies effectively.
What you’ll learn
Big data workloads often face challenges such as inefficient storage, slow queries, and poorly organized data.
In this course, Big Data Foundations: Storage and Data Formats, you’ll gain the ability to design and manage storage and data formats that improve performance, efficiency, and scalability.
First, you’ll explore distributed storage systems and their core concepts, including replication, partitioning, and durability.
Next, you’ll discover big data file and table formats, how to choose between row-based and columnar formats, and the roles of schema enforcement, metadata, and compression.
Finally, you’ll learn how to organize data using partitioning, bucketing, sorting, and modern table formats like Iceberg, Delta Lake, and Hudi.
When you’re finished with this course, you’ll have the storage design, file format, and data layout skills needed to manage big data workloads effectively.