Open an S3 object as a string with Boto3

Question: Open an S3 object as a string with Boto3

I'm aware that with Boto 2 it's possible to open an S3 object as a string with get_contents_as_string().

Is there an equivalent function in boto3?


Answer 0

read() returns bytes. On Python 3 at least, if you want a string you have to decode the bytes with the right encoding:

import boto3

s3 = boto3.resource('s3')

# bucket and key are the bucket name and the object key (both str)
obj = s3.Object(bucket, key)
body = obj.get()['Body'].read().decode('utf-8')
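
To reuse the snippet above, it can be wrapped in a small helper. This is only a sketch: read_object_text and its encoding parameter are names I made up, and it assumes the object's content really is text in the given encoding (UTF-8 by default).

```python
def read_object_text(s3_object, encoding="utf-8"):
    """Return the body of a boto3 S3 Object resource as str.

    s3_object is expected to behave like boto3.resource('s3').Object(...):
    get() returns a dict whose 'Body' entry has a read() method.
    """
    body_bytes = s3_object.get()["Body"].read()  # bytes on Python 3
    return body_bytes.decode(encoding)


# Hypothetical usage (bucket and key names are placeholders):
#   import boto3
#   obj = boto3.resource("s3").Object("my-bucket", "path/to/file.txt")
#   text = read_object_text(obj)
```

Passing the encoding explicitly makes it easy to handle objects that are not UTF-8, for example read_object_text(obj, encoding="latin-1").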

Answer 1

I had trouble reading/parsing the object from S3 because of .get() when using Python 2.7 inside an AWS Lambda.

I added json to the example to show that it became parsable :)

import boto3
import json

s3 = boto3.client('s3')

# bucket and key are the bucket name and the object key (both str)
obj = s3.get_object(Bucket=bucket, Key=key)
j = json.loads(obj['Body'].read())

NOTE (for Python 2.7): My objects are all ASCII, so I don't need .decode('utf-8').

NOTE (for Python 3.6+): We moved to Python 3.6 and discovered that read() now returns bytes, so if you want to get a string out of it, you must use:

j = json.loads(obj['Body'].read().decode('utf-8'))
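
Here is the same idea as a function, in case it helps. It is a sketch, not an official API: load_json_from_s3 is a hypothetical name, and client is assumed to be something like boto3.client('s3'). As far as I know, json.loads also accepts bytes directly from Python 3.6 on, but decoding explicitly keeps the encoding visible and gives you the string if you need it.

```python
import json


def load_json_from_s3(client, bucket, key, encoding="utf-8"):
    """Fetch an object with get_object and parse its body as JSON."""
    response = client.get_object(Bucket=bucket, Key=key)
    raw = response["Body"].read()   # bytes on Python 3
    text = raw.decode(encoding)     # explicit decode to str
    return json.loads(text)


# Hypothetical usage (bucket and key names are placeholders):
#   import boto3
#   data = load_json_from_s3(boto3.client("s3"), "my-bucket", "config.json")
```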


Answer 2

This isn't in the boto3 documentation. This worked for me:

object.get()["Body"].read()

where object is an S3 Object: http://boto3.readthedocs.org/en/latest/reference/services/s3.html#object
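
One thing to watch with read(): the body is a stream, so it can normally be consumed only once. The sketch below uses io.BytesIO as a stand-in and assumes the real body returned by boto3 behaves the same way in this respect, so keep the result of the first read() in a variable if you need it more than once.

```python
import io

# io.BytesIO is only a stand-in for the stream found under "Body"
body = io.BytesIO(b"hello")

first = body.read()    # consumes the whole stream
second = body.read()   # nothing left to read

print(first)
print(second)
```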


Answer 3

Python 3, using the boto3 API.

With the S3.Client.download_fileobj API and a Python file-like object, the content of an S3 object can be retrieved into memory.

Since the retrieved content is bytes, it needs to be decoded to convert it to str.

import io
import boto3

client = boto3.client('s3')
bytes_buffer = io.BytesIO()
# bucket_name and object_key are the bucket name and the object key (both str)
client.download_fileobj(Bucket=bucket_name, Key=object_key, Fileobj=bytes_buffer)
byte_value = bytes_buffer.getvalue()
str_value = byte_value.decode()  # Python 3: decode() defaults to UTF-8
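
The download_fileobj approach can also be packaged as a function. Again just a sketch under assumptions: download_as_text is a name I invented, and client is assumed to expose download_fileobj(Bucket=..., Key=..., Fileobj=...) the way boto3.client('s3') does.

```python
import io


def download_as_text(client, bucket, key, encoding="utf-8"):
    """Download an S3 object into memory and return it as str."""
    buffer = io.BytesIO()
    client.download_fileobj(Bucket=bucket, Key=key, Fileobj=buffer)
    # getvalue() returns everything written to the buffer as bytes
    return buffer.getvalue().decode(encoding)


# Hypothetical usage (bucket and key names are placeholders):
#   import boto3
#   text = download_as_text(boto3.client("s3"), "my-bucket", "notes.txt")
```

The whole object is held in memory here, so this suits small to medium objects rather than very large ones.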

Answer 4

If the body is an io.StringIO, you have to do it like this:

object.get()['Body'].getvalue()
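
If you are not sure which kind of body you will get (for instance when the S3 call is stubbed out in tests, which is my guess at where an io.StringIO would come from), a defensive helper can cover both cases. This is a sketch for Python 3 and body_to_text is a hypothetical name, not part of boto3.

```python
def body_to_text(body, encoding="utf-8"):
    """Return the content of a body object as str.

    Uses getvalue() when the object has it (io.StringIO, io.BytesIO),
    otherwise falls back to read(). Bytes are decoded, str is returned as is.
    """
    data = body.getvalue() if hasattr(body, "getvalue") else body.read()
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# Hypothetical usage:
#   text = body_to_text(obj.get()['Body'])
```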