How do I print Unicode characters in Python?

Question: How do I print Unicode characters in Python?

I want to make a dictionary where English words point to Russian and French translations.

How do I print out unicode characters in Python? Also, how do you store unicode chars in a variable?


Answer 0

To include Unicode characters in your Python source code, you can use Unicode escape characters in the form \u0123 in your string. In Python 2.x, you also need to prefix the string literal with ‘u’.

Here’s an example running in the Python 2.x interactive console:

>>> print u'\u0420\u043e\u0441\u0441\u0438\u044f'
Россия

In Python 2, prefixing a string literal with 'u' makes it a unicode object, as described in the Python Unicode documentation.

In Python 3, every string literal is already Unicode, so the 'u' prefix is optional (it is accepted for compatibility since Python 3.3):

>>> print('\u0420\u043e\u0441\u0441\u0438\u044f')
Россия

If running the above commands doesn’t display the text correctly for you, perhaps your terminal isn’t capable of displaying Unicode characters.

These examples use Unicode escapes (\u...), which allow you to print Unicode characters while keeping your source code as plain ASCII. This can help when working with the same source code on different systems. You can also use Unicode characters directly in your Python source code (e.g. print u'Россия' in Python 2, or print('Россия') in Python 3), if you are confident all your systems handle Unicode files properly.
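For the dictionary described in the question, a minimal Python 3 sketch might look like the following. The `translations` table, its language keys, and the `translate` helper are hypothetical names chosen for illustration; the point is only that escaped and directly typed literals produce the same string and can be stored in ordinary variables.

```python
# Hypothetical translation table: English word -> {language code: translation}.
translations = {
    "Russia": {"ru": "Россия", "fr": "Russie"},
    "cat": {"ru": "кошка", "fr": "chat"},
}

# An escaped literal and a directly typed literal are the same str object value.
escaped = "\u0420\u043e\u0441\u0441\u0438\u044f"
same_string = (escaped == translations["Russia"]["ru"])


def translate(word, lang):
    """Return the translation, or None if the word or language is missing."""
    return translations.get(word, {}).get(lang)
```

On a Unicode-capable terminal, `print(translate("Russia", "ru"))` should display Россия.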

For information about reading Unicode data from a file, see this answer:

Character reading from file in Python


Answer 1

Print a unicode character in Python:

Print a unicode character directly from python interpreter:

el@apollo:~$ python
Python 2.7.3
>>> print u'\u2713'
✓

Unicode character u'\u2713' is a checkmark. The interpreter prints the checkmark on the screen.

Print a unicode character from a python script:

Put this in test.py:

#!/usr/bin/python
print("here is your checkmark: " + u'\u2713')

Run it like this:

el@apollo:~$ python test.py
here is your checkmark: ✓

If it doesn’t show a checkmark for you, then the problem could be elsewhere, like the terminal settings or something you are doing with stream redirection.

Store unicode characters in a file:

Save this to file: foo.py:

#!/usr/bin/python -tt
# -*- coding: utf-8 -*-
import codecs
import sys
# Python 2: wrap stdout so unicode objects are encoded as UTF-8, even when piped
UTF8Writer = codecs.getwriter('utf8')
sys.stdout = UTF8Writer(sys.stdout)
print(u'e with obfuscation: é')

Run it and pipe output to file:

python foo.py > tmp.txt

Open tmp.txt and look inside, you see this:

el@apollo:~$ cat tmp.txt 
e with obfuscation: é

Thus you have saved the Unicode character é (an e with an acute accent) to a file.
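The script above is written for Python 2. As a rough Python 3 counterpart, one option is to skip the stdout redirection and write the file directly with an explicit encoding. This is only a sketch: `save_text`, `load_text`, and `tmp_path` are hypothetical names, and the temporary directory is used purely for demonstration.

```python
import os
import tempfile


def save_text(path, text):
    """Write text to path as UTF-8, regardless of the locale's default encoding."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def load_text(path):
    """Read the file back, decoding it as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Hypothetical demonstration path inside a fresh temporary directory.
tmp_path = os.path.join(tempfile.mkdtemp(), "tmp.txt")
save_text(tmp_path, "here is your checkmark: \u2713")
```

Passing `encoding="utf-8"` on both the write and the read keeps the result independent of the platform default.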


Answer 2

If you're trying to print() Unicode and getting ascii codec errors, check out this page, the TL;DR of which is: run export PYTHONIOENCODING=UTF-8 before starting Python (this variable sets the encoding Python uses for stdin, stdout, and stderr). In Python 3, strings are Unicode and source files default to UTF-8 (see the Unicode HOWTO), so that's not the problem; you can just put Unicode in strings, as seen in the other answers and comments. The problem happens when you try to get this data out to your console: Python thinks your console can only handle ASCII. Some of the other answers say, "Write it to a file first", but note that they specify the encoding (UTF-8) when doing so (so Python does not have to guess when writing), and then read the file with a tool that just emits the bytes without any regard for encoding, which is why that works.
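To see which encoding Python has picked for the console, and what happens when a character does not fit it, a small Python 3 sketch can help. The helper names `stdout_encoding` and `printable` are hypothetical; `errors="backslashreplace"` is a standard error handler that substitutes escape sequences instead of raising.

```python
import sys


def stdout_encoding():
    """Return the encoding Python will use when printing to stdout (may be None)."""
    return getattr(sys.stdout, "encoding", None)


def printable(text, encoding):
    """Return text with characters the encoding cannot represent
    replaced by backslash escapes, so printing it cannot fail."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


def fits(text, encoding):
    """Report whether text can be encoded without errors."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False
```

If `stdout_encoding()` reports an ASCII-like encoding, setting PYTHONIOENCODING as described above is one way to change it before the interpreter starts.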


Answer 3

In Python 2, you declare unicode strings with a u prefix, as in u"猫", and use decode() and encode() to translate to and from unicode, respectively.

It’s quite a bit easier in Python 3. A very good overview can be found here. That presentation clarified a lot of things for me.
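As an illustration of why Python 3 is simpler, here is a short sketch of the round trip between text and bytes. In Python 3 a plain literal is already a Unicode `str`, `encode()` produces `bytes`, and `decode()` turns `bytes` back into `str`. The variable names are arbitrary.

```python
text = "猫"                       # str: one Unicode code point (U+732B)
data = text.encode("utf-8")       # bytes: the UTF-8 encoding of that code point
restored = data.decode("utf-8")   # back to str

# One character, but three bytes once encoded as UTF-8.
char_count = len(text)
byte_count = len(data)
```

Keeping text as `str` inside the program and converting only at the boundaries (files, sockets, the console) is the usual pattern.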


Answer 4

Considering that this is the first Stack Overflow result when searching Google for this topic, it bears mentioning that the u prefix on Unicode strings is optional in Python 3. (The Python 2 example was copied from the top answer.)

Python 3 (both work):

print('\u0420\u043e\u0441\u0441\u0438\u044f')
print(u'\u0420\u043e\u0441\u0441\u0438\u044f')

Python 2:

print u'\u0420\u043e\u0441\u0441\u0438\u044f'

Answer 5

I use Portable WinPython on Windows. It includes the IPython Qt console, in which the following works:

>>> print("結婚")
結婚

>>> print("おはよう")
おはよう

>>> word = "結婚"
>>> print(word)
結婚

Your console or interpreter must support Unicode in order to display Unicode characters.


Answer 6

Just one more thing that hasn’t been added yet

In Python 2, if you want to print a variable that holds Unicode text and use .format(), make the base string being formatted a unicode string with u'' (and keep the variable itself a unicode string):

>>> text = u"Université de Montréal"
>>> print(u"This is unicode: {}".format(text))
This is unicode: Université de Montréal

Answer 7

This fixes UTF-8 printing in Python 2:

import codecs
import sys
UTF8Writer = codecs.getwriter('utf8')
sys.stdout = UTF8Writer(sys.stdout)
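The codecs wrapper writes bytes, which suits Python 2 but not Python 3's text-mode stdout. A possible Python 3 analogue, sketched here under that assumption, wraps a binary stream in `io.TextIOWrapper`. The `utf8_writer` helper is a hypothetical name, and a `BytesIO` buffer stands in for `sys.stdout.buffer` so the result can be inspected.

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream so that str written to it is encoded as UTF-8."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


buffer = io.BytesIO()      # stand-in for sys.stdout.buffer in this sketch
out = utf8_writer(buffer)
print("\u2713", file=out)  # print accepts any text stream via file=
out.flush()                # push the encoded bytes through to the buffer
```

On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8")` achieves a similar effect on the real stdout without building a wrapper by hand.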

Answer 8

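Replace '+' with '000'. For example, 'U+1F600' becomes 'U0001F600'; then prepend a backslash ("\") to the code and print it. A \U escape takes exactly eight hex digits, which is why a five-digit code point needs three leading zeros. Example: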

>>> print("Learning : ", "\U0001F40D")
Learning :  🐍
>>> 

See also, in case it helps: python unicode emoji
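The same conversion can be done at run time instead of by editing the literal. The following Python 3 sketch uses the built-in `chr()` and `int(..., 16)`; the function names `from_code_point` and `to_escape` are hypothetical.

```python
def from_code_point(notation):
    """Convert 'U+1F40D' style notation to the character it names."""
    if not notation.upper().startswith("U+"):
        raise ValueError("expected notation like 'U+1F40D'")
    return chr(int(notation[2:], 16))


def to_escape(notation):
    """Return the Python escape sequence as text, e.g. '\\U0001F40D'.

    Code points above U+FFFF need the eight-digit \\U form;
    smaller ones fit the four-digit \\u form.
    """
    value = int(notation[2:], 16)
    if value > 0xFFFF:
        return "\\U{:08X}".format(value)
    return "\\u{:04X}".format(value)
```

For example, `print("Learning :", from_code_point("U+1F40D"))` should show the snake emoji on a terminal that can render it.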