使用熊猫将字符串前缀添加到字符串列中的每个值-Python 实用宝典

问题：使用熊猫将字符串前缀添加到字符串列中的每个值

我想在熊猫数据帧的所述列中的每个值的开头附加一个字符串（优雅）。我已经弄清楚该如何做，目前正在使用：

df.ix[(df['col'] != False), 'col'] = 'str'+df[(df['col'] != False), 'col']

这似乎是一件微不足道的事情-您是否知道其他任何方式（可能还会将该字符添加到该列为0或NaN的行中）？

如果还不清楚，我想转一下：

    col 
1     a
2     0

变成：

       col 
1     stra
2     str0

I would like to append a string to the start of each value in a said column of a pandas dataframe (elegantly). I already figured out how to kind-of do this and I am currently using:

df.ix[(df['col'] != False), 'col'] = 'str'+df[(df['col'] != False), 'col']

This seems one hell of an inelegant thing to do – do you know any other way (which maybe also adds the character to rows where that column is 0 or NaN)?

In case this is yet unclear, I would like to turn:

    col 
1     a
2     0

into:

       col 
1     stra
2     str0

回答 0

df['col'] = 'str' + df['col'].astype(str)

例：

>>> df = pd.DataFrame({'col':['a',0]})
>>> df
  col
0   a
1   0
>>> df['col'] = 'str' + df['col'].astype(str)
>>> df
    col
0  stra
1  str0

df['col'] = 'str' + df['col'].astype(str)

Example:

>>> df = pd.DataFrame({'col':['a',0]})
>>> df
  col
0   a
1   0
>>> df['col'] = 'str' + df['col'].astype(str)
>>> df
    col
0  stra
1  str0

回答 1

另外，您也可以使用apply组合format（或f字符串更好），如果例如还想添加后缀或操纵元素本身，我会觉得可读性更高：

df = pd.DataFrame({'col':['a', 0]})

df['col'] = df['col'].apply(lambda x: "{}{}".format('str', x))

这也会产生所需的输出：

    col
0  stra
1  str0

如果您使用的是Python 3.6+，则还可以使用f字符串：

df['col'] = df['col'].apply(lambda x: f"str{x}")

产生相同的输出。

f字符串版本几乎与@RomanPekar的解决方案（python 3.6.4）一样快：

df = pd.DataFrame({'col':['a', 0]*200000})

%timeit df['col'].apply(lambda x: f"str{x}")
117 ms ± 451 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit 'str' + df['col'].astype(str)
112 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

format但是，使用的确确实要慢得多：

%timeit df['col'].apply(lambda x: "{}{}".format('str', x))
185 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

As an alternative, you can also use an apply combined with format (or better with f-strings) which I find slightly more readable if one e.g. also wants to add a suffix or manipulate the element itself:

df = pd.DataFrame({'col':['a', 0]})

df['col'] = df['col'].apply(lambda x: "{}{}".format('str', x))

which also yields the desired output:

    col
0  stra
1  str0

If you are using Python 3.6+, you can also use f-strings:

df['col'] = df['col'].apply(lambda x: f"str{x}")

yielding the same output.

The f-string version is almost as fast as @RomanPekar’s solution (python 3.6.4):

df = pd.DataFrame({'col':['a', 0]*200000})

%timeit df['col'].apply(lambda x: f"str{x}")
117 ms ± 451 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit 'str' + df['col'].astype(str)
112 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using format, however, is indeed far slower:

%timeit df['col'].apply(lambda x: "{}{}".format('str', x))
185 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

回答 2

您可以使用pandas.Series.map：

df['col'].map('str{}'.format)

它将在所有值之前加上“ str”一词。

You can use pandas.Series.map :

df['col'].map('str{}'.format)

It will apply the word “str” before all your values.

回答 3

如果使用加载表文件dtype=str
或将列类型转换为字符串，df['a'] = df['a'].astype(str)
则可以使用以下方法：

df['a']= 'col' + df['a'].str[:]

这种方法允许使用的前缀，追加和子集字符串df。
适用于Pandas v0.23.4，v0.24.1。不了解较早的版本。

If you load you table file with dtype=str
or convert column type to string df['a'] = df['a'].astype(str)
then you can use such approach:

df['a']= 'col' + df['a'].str[:]

This approach allows prepend, append, and subset string of df.
Works on Pandas v0.23.4, v0.24.1. Don’t know about earlier versions.

回答 4

.loc的另一种解决方案：

df = pd.DataFrame({'col': ['a', 0]})
df.loc[df.index, 'col'] = 'string' + df['col'].astype(str)

这没有上述解决方案快（每个循环慢1ms以上），但在需要条件更改时可能有用，例如：

mask = (df['col'] == 0)
df.loc[mask, 'col'] = 'string' + df['col'].astype(str)

Another solution with .loc:

df = pd.DataFrame({'col': ['a', 0]})
df.loc[df.index, 'col'] = 'string' + df['col'].astype(str)

This is not as quick as solutions above (>1ms per loop slower) but may be useful in case you need conditional change, like:

mask = (df['col'] == 0)
df.loc[mask, 'col'] = 'string' + df['col'].astype(str)

声明：本站所有文章，如无特殊说明或标注，均为本站原创发布。任何个人或组织，在未征得本站同意时，禁止复制、盗用、采集、发布本站内容到任何网站、书籍等各类媒体平台。如若本站内容侵犯了原著者的合法权益，可联系我们进行处理。

使用熊猫将字符串前缀添加到字符串列中的每个值

问题：使用熊猫将字符串前缀添加到字符串列中的每个值

回答 0

回答 1

回答 2

回答 3

回答 4

排行榜展示

Python 情人节超强技能导出微信聊天记录生成词云

你不得不知道的python超级文献批量搜索下载工具

Python 流程图 — 一键转化代码为流程图

7行代码 Python热力图可视化分析缺失数据处理

Python 优化—算出每条语句执行时间

你的10W块放哪里能赚最多钱？

文章展示

检查网络连接

如何将sys.stdout复制到日志文件？

Python super（）引发TypeError

从ND到一维阵列

Python 超强文献下载工具 Scihub-cn 又更新啦！

python pandas：删除列A的重复项，将行的最高值保留在列B中

使用熊猫将字符串前缀添加到字符串列中的每个值

问题：使用熊猫将字符串前缀添加到字符串列中的每个值

回答 0

回答 1

回答 2

回答 3

回答 4

相关文章

排行榜展示

文章展示