Basic Charting with Matplotlib (Scatterplot, Barchart, Lineplot)
Original Source: https://www.coursera.org/specializations/data-science-python
In Python, we can plot data with matplotlib
.
Basic Plotting with matplotlib
You can show matplotlib figures directly in the notebook by using the %matplotlib notebook
and %matplotlib inline
magic commands.
%matplotlib notebook
provides an interactive environment.
%matplotlib inline
import matplotlib.pyplot as plt
# set default figure size to (14, 8)
plt.rcParams['figure.figsize'] = (14.0, 8.0)
# create a new figure
plt.figure()
# plot the point (3,2) using the circle marker
plt.plot(3, 2, 'o')
# set axis properties [xmin, xmax, ymin, ymax]
plt.axis([0,6,0,10])
# plot the point (4, 7) using the circle marker
plt.plot(4, 7, 'o')
# show figure
plt.show()
Scatterplots
import numpy as np
x = np.array([1,2,3,4,5,6,7,8])
y = x.copy()
colors = ['r','r','r','r','r','g','b', 'y']
plt.figure()
# plot the point with size 100 and chosen colors
plt.scatter(x, y, s=100, c=colors)
plt.show()
x = [1,2,3,4,5]
y = [6,7,8,9,10]
plt.figure()
# plot a data series 'Tall students' in red using the first two elements of x and y
plt.scatter(x[:2], y[:2], s=100, c='red', label='Tall students')
# plot a second data series 'Short students' in blue using the last three elements of x and y
plt.scatter(x[2:], y[2:], s=100, c='blue', label='Short students')
# add a label to the x axis
plt.xlabel('The number of times the child kicked a ball')
# add a label to the y axis
plt.ylabel('The grade of the student')
# add a title
plt.title('Relationship between ball kicking and grades')
# add a legend (uses the labels from plt.scatter)
plt.legend()
plt.show()
Line Plots
linear_data = np.array([1,2,3,4,5,6,7,8,9,10,12])
exponential_data = linear_data**2
plt.figure()
# plot the linear data and the exponential data
plt.plot(linear_data, '-o', exponential_data, '-o')
# plot another series with a dashed red line
plt.plot([22,44,55], '--r')
plt.xlabel('Some data')
plt.ylabel('Some other data')
plt.title('A title')
# add a legend with legend entries (because we didn't have labels when we plotted the data series)
plt.legend(['Baseline', 'Competition', 'Us'])
# fill the area between the linear data and exponential data
plt.fill_between(range(len(linear_data)), linear_data, exponential_data, facecolor='blue', alpha=0.25)
plt.show()
Datetime as index
plt.figure()
observation_dates = np.arange('2017-01-01', '2017-01-12', dtype='datetime64[D]')
plt.plot(observation_dates, linear_data, '-o', observation_dates, exponential_data, '-o')
plt.show()
Using Pandas
import pandas as pd
plt.figure()
observation_dates = np.arange('2017-01-01', '2017-01-12', dtype='datetime64[D]')
observation_dates = list(map(pd.to_datetime, observation_dates))
plt.plot(observation_dates, linear_data, '-o', observation_dates, exponential_data, '-o')
plt.xlabel('Date')
plt.ylabel('Units')
plt.title("Exponential ($x^2$) vs. Linear ($x$) performance")
plt.show()
Bar Charts
plt.figure()
xvals = range(len(linear_data))
# plot another set of bars, adjusting the new xvals to make up for the first set of bars plotted
new_xvals = [i+0.3 for i in xvals]
# This will plot a new set of bars with errorbars of length 4
plt.bar(xvals, linear_data, width = 0.3, yerr=4, color='blue')
plt.bar(new_xvals, exponential_data, width = 0.3 ,color='red')
plt.show()
# stacked bar charts are also possible
plt.figure()
xvals = range(len(linear_data))
plt.bar(xvals, linear_data, width = 0.3, color='b')
plt.bar(xvals, exponential_data, width = 0.3, bottom=linear_data, color='r')
plt.show()
# or use barh for horizontal bar charts
plt.figure()
xvals = range(len(linear_data))
plt.barh(xvals, linear_data, height = 0.3, color='b')
plt.barh(xvals, exponential_data, height = 0.3, left=linear_data, color='r')
plt.show()
Leave a Comment