Summary

Microsoft Excel has a long and proud history of bringing data analysis and processing to the masses with its intuitive interface and powerful functionality. However, it has important limitations that are only becoming more apparent in our current age of Big Data and that can only be surpassed by up-skilling to a more programming oriented context. 

This lecture will introduce you to the ways in which Python and Pandas can be used to build up on your Excel analyses to bring the power of sophisticated Data Science and Machine Learning tools into your pipeline. We’ll also cover how to both read data from and write results to Excel spreadsheets.


Program

  • Excel Pitfalls

    • Default settings

    • Worksheet sizes and cross references

    • Formatting and styling

    • Functions and cell evaluation

  • Pandas Data Frames

    • Importing csv files and Excel spreadsheets

    • Data cleaning

    • Subsetting

    • DataFrame Manipulations

    • Merge and Join

    • Generating simple Excel spreadsheets

  • Simple Data Modeling

    • Data smoothing

    • Pivot tables

    • Basic plotting

    • Linear regression

    • Curve fitting

       

  • Manipulating Excel Spreadsheets   

    • Adding sheets to a workbook

    • Reading and formatting Excel files

    • Inspecting arbitrary cells

    • Modifying specific rows and columns

    • Appending a dataframe to an excel sheet


Resources


Python Automation Cookbook
J, Buelta (2020)

Automate the Boring Stuff with Python
A. Sweigart (2019)

Python for Finance
Y. Hilpisch (2019)

Pandas for Everyone
D. Y. Chen (2017)

Python for Data Analysis
W. McKinney (2017)

References: