I have a pandas data frame and would like to plot values from one column versus the values from another column. Fortunately, there is plot method associated with the data-frames that seems to do what I need:
df.plot(x='col_name_1', y='col_name_2')
Unfortunately, it looks like among the plot styles (listed here after the kind parameter) there are not points. I can use lines or bars or even density but not points. Is there a work around that can help to solve this problem.
import matplotlib.pyplot as plt
plt.scatter(df['col_name_1'], df['col_name_2'])
plt.show()# Depending on whether you use IPython or interactive mode, etc.
For this (and most plotting) I would not rely on the Pandas wrappers to matplotlib. Instead, just use matplotlib directly:
import matplotlib.pyplot as plt
plt.scatter(df['col_name_1'], df['col_name_2'])
plt.show() # Depending on whether you use IPython or interactive mode, etc.
and remember that you can access a NumPy array of the column’s values with df.col_name_1.values for example.
Pandas uses matplotlib as a library for basic plots. The easiest way in your case will using the following:
import pandas as pd
import numpy as np
#creating sample data
sample_data={'col_name_1':np.random.rand(20),
'col_name_2': np.random.rand(20)}
df= pd.DataFrame(sample_data)
df.plot(x='col_name_1', y='col_name_2', style='o')
However, I would recommend to use seaborn as an alternative solution if you want have more customized plots while not going into the basic level of matplotlib. In this case you the solution will be following:
import pandas as pd
import seaborn as sns
import numpy as np
#creating sample data
sample_data={'col_name_1':np.random.rand(20),
'col_name_2': np.random.rand(20)}
df= pd.DataFrame(sample_data)
sns.scatterplot(x="col_name_1", y="col_name_2", data=df)