Python 实用宝典

Question 1

Why does pandas make a distinction between a Series and a single-column DataFrame?
In other words: what is the reason of existence of the Series class?

I’m mainly using time series with datetime index, maybe that helps to set the context.

Question 2

Quoting the Pandas docs

pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

So, the Series is the data structure for a single column of a DataFrame, not only conceptually, but literally, i.e. the data in a DataFrame is actually stored in memory as a collection of Series.

Analogously: We need both lists and matrices, because matrices are built with lists. Single row matricies, while equivalent to lists in functionality still cannot exists without the list(s) they’re composed of.

They both have extremely similar APIs, but you’ll find that DataFrame methods always cater to the possibility that you have more than one column. And, of course, you can always add another Series (or equivalent object) to a DataFrame, while adding a Series to another Series involves creating a DataFrame.

Question 3

from the pandas doc http://pandas.pydata.org/pandas-docs/stable/dsintro.html Series is a one-dimensional labeled array capable of holding any data type. To read data in form of panda Series:

import pandas as pd
ds = pd.Series(data, index=index)

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types.

import pandas as pd
df = pd.DataFrame(data, index=index)

In both of the above index is list

for example: I have a csv file with following data:

,country,popuplation,area,capital
BR,Brazil,10210,12015,Brasile
RU,Russia,1025,457,Moscow
IN,India,10458,457787,New Delhi

To read above data as series and data frame:

import pandas as pd
file_data = pd.read_csv("file_path", index_col=0)
d = pd.Series(file_data.country, index=['BR','RU','IN'] or index =  file_data.index)

output:

>>> d
BR           Brazil
RU           Russia
IN            India

df = pd.DataFrame(file_data.area, index=['BR','RU','IN'] or index = file_data.index )

output:

>>> df
      area
BR   12015
RU     457
IN  457787

Question 4

Series is a one-dimensional object that can hold any data type such as integers, floats and strings e.g

   import pandas as pd
   x = pd.Series([A,B,C]) 

0 A
1 B
2 C

The first column of Series is known as index i.e 0,1,2 the second column is your actual data i.e A,B,C

DataFrames is two-dimensional object that can hold series, list, dictionary

df=pd.DataFrame(rd(5,4),['A','B','C','D','E'],['W','X','Y','Z'])

Question 5

Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.). The axis labels are collectively referred to as the index. The basic method to create a Series is to call:

s = pd.Series(data, index=index)

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects.

 d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
 two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
 df = pd.DataFrame(d)

Question 6

Import cars data

import pandas as pd

cars = pd.read_csv('cars.csv', index_col = 0)

Here is how the cars.csv file looks.

Print out drives_right column as Series:

print(cars.loc[:,"drives_right"])

    US      True
    AUS    False
    JAP    False
    IN     False
    RU      True
    MOR     True
    EG      True
    Name: drives_right, dtype: bool

The single bracket version gives a Pandas Series, the double bracket version gives a Pandas DataFrame.

Print out drives_right column as DataFrame

print(cars.loc[:,["drives_right"]])

         drives_right
    US           True
    AUS         False
    JAP         False
    IN          False
    RU           True
    MOR          True
    EG           True

Adding a Series to another Series creates a DataFrame.

Python 实用宝典

pandas系列和单列DataFrame有什么区别？

问题：pandas系列和单列DataFrame有什么区别？

回答 0

回答 1

回答 2

回答 3

回答 4

有趣好用的Python教程