ds-review-hub.github.io

PySpark Resources

PySpark Review Notebooks

These notebooks and links should prove helpful for learning and using PySpark. I use PySpark in Databricks, so although I wrote and ran the code in a Jupyter Notebook, I’ve tried to include the Databricks equivalents as markdown snippets. This resource page is just getting started and will grow over time, but it should be enough to get you going.

PySpark Basics to Intermediate

This notebook starts at the very beginning and covers most of the topics you need to get started. It touches on window functions, and the next notebook will cover them in more depth using a dataset that includes dates.

PySpark Useful Links

Codeup Data Science Curriculum

For those of you who bought the ticket and took the ride, this is a super valuable resource to come back to.

PySpark By {Examples}

I use this site all the time, and it’s free, so check it out.