resources

This repo is a one-stop destination for resources on learning various domains. You can find a roadmap for each domain here.


PySpark


PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment. PySpark supports most of Spark’s features such as Spark SQL, DataFrame, Streaming, MLlib (Machine Learning) and Spark Core.

Complete Courses

Video lectures

Documentation

Diving Deep