Summary
Microsoft Excel has a long and proud history of bringing data analysis and processing to the masses with its intuitive interface and powerful functionality. However, it has important limitations that are only becoming more apparent in our current age of Big Data and that can only be surpassed by up-skilling to a more programming oriented context.
This lecture will introduce you to the ways in which Python and Pandas can be used to build up on your Excel analyses to bring the power of sophisticated Data Science and Machine Learning tools into your pipeline. We’ll also cover how to both read data from and write results to Excel spreadsheets.
Program
Excel Pitfalls
Default settings
Worksheet sizes and cross references
Formatting and styling
Functions and cell evaluation
Pandas Data Frames
Importing csv files and Excel spreadsheets
Data cleaning
Subsetting
DataFrame Manipulations
Merge and Join
Generating simple Excel spreadsheets
Simple Data Modeling
Data smoothing
Pivot tables
Basic plotting
Linear regression
Curve fitting
Manipulating Excel Spreadsheets
Adding sheets to a workbook
Reading and formatting Excel files
Inspecting arbitrary cells
Modifying specific rows and columns
Appending a dataframe to an excel sheet
Resources
Python Automation Cookbook
J, Buelta (2020)
Automate the Boring Stuff with Python
A. Sweigart (2019)
Python for Finance
Y. Hilpisch (2019)
Pandas for Everyone
D. Y. Chen (2017)
Python for Data Analysis
W. McKinney (2017)