标签归档:amazon-s3

将 DataFrame 以 CSV 格式直接保存到 S3(Python)

问题:将 DataFrame 以 CSV 格式直接保存到 S3(Python)

我有一个 pandas DataFrame,想把它作为一个新的 CSV 文件上传。问题是我不想在传到 s3 之前先把文件保存在本地。有没有类似 to_csv 的方法可以把 DataFrame 直接写入 s3?我用的是 boto3。
这是我到目前为止的内容:

import boto3
import pandas as pd

s3 = boto3.client('s3', aws_access_key_id='key', aws_secret_access_key='secret_key')
# get_object 需要使用关键字参数;这里假设 Bucket 和 Key 变量已在别处定义
read_file = s3.get_object(Bucket=Bucket, Key=Key)
df = pd.read_csv(read_file['Body'])

# Make alterations to DataFrame

# Then export DataFrame to CSV through direct transfer to s3

I have a pandas DataFrame that I want to upload to a new CSV file. The problem is that I don’t want to save the file locally before transferring it to s3. Is there any method like to_csv for writing the dataframe to s3 directly? I am using boto3.
Here is what I have so far:

import boto3
import pandas as pd

s3 = boto3.client('s3', aws_access_key_id='key', aws_secret_access_key='secret_key')
# get_object requires keyword arguments; Bucket and Key are assumed to be defined elsewhere
read_file = s3.get_object(Bucket=Bucket, Key=Key)
df = pd.read_csv(read_file['Body'])

# Make alterations to DataFrame

# Then export DataFrame to CSV through direct transfer to s3

回答 0

您可以使用:

from io import StringIO # python3; python2: BytesIO 
import boto3

bucket = 'my_bucket_name' # already created on S3
csv_buffer = StringIO()
df.to_csv(csv_buffer)
s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, 'df.csv').put(Body=csv_buffer.getvalue())

You can use:

from io import StringIO # python3; python2: BytesIO 
import boto3

bucket = 'my_bucket_name' # already created on S3
csv_buffer = StringIO()
df.to_csv(csv_buffer)
s3_resource = boto3.resource('s3')
s3_resource.Object(bucket, 'df.csv').put(Body=csv_buffer.getvalue())

回答 1

您可以直接使用S3路径。我正在使用Pandas 0.24.1

In [1]: import pandas as pd

In [2]: df = pd.DataFrame( [ [1, 1, 1], [2, 2, 2] ], columns=['a', 'b', 'c'])

In [3]: df
Out[3]:
   a  b  c
0  1  1  1
1  2  2  2

In [4]: df.to_csv('s3://experimental/playground/temp_csv/dummy.csv', index=False)

In [5]: pd.__version__
Out[5]: '0.24.1'

In [6]: new_df = pd.read_csv('s3://experimental/playground/temp_csv/dummy.csv')

In [7]: new_df
Out[7]:
   a  b  c
0  1  1  1
1  2  2  2

发行公告:

S3文件处理

pandas 现在使用 s3fs 处理 S3 连接,这不会破坏任何现有代码。但由于 s3fs 不是必需依赖,您需要单独安装它,就像早期版本的 pandas 中需要单独安装 boto 一样。GH11915

You can directly use the S3 path. I am using Pandas 0.24.1

In [1]: import pandas as pd

In [2]: df = pd.DataFrame( [ [1, 1, 1], [2, 2, 2] ], columns=['a', 'b', 'c'])

In [3]: df
Out[3]:
   a  b  c
0  1  1  1
1  2  2  2

In [4]: df.to_csv('s3://experimental/playground/temp_csv/dummy.csv', index=False)

In [5]: pd.__version__
Out[5]: '0.24.1'

In [6]: new_df = pd.read_csv('s3://experimental/playground/temp_csv/dummy.csv')

In [7]: new_df
Out[7]:
   a  b  c
0  1  1  1
1  2  2  2

Release Note:

S3 File Handling

pandas now uses s3fs for handling S3 connections. This shouldn’t break any code. However, since s3fs is not a required dependency, you will need to install it separately, like boto in prior versions of pandas. GH11915.
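As an aside (not from the original answer): in more recent pandas versions (1.2.0 and later) you can also pass s3fs credentials through the storage_options argument instead of configuring them globally. A minimal sketch with placeholder bucket and credential values:

import pandas as pd

df = pd.DataFrame({'a': [1, 2]})
# storage_options is forwarded to s3fs; the key/secret below are placeholders.
df.to_csv(
    's3://my-bucket/dummy.csv',
    index=False,
    storage_options={'key': 'ACCESS_KEY', 'secret': 'SECRET_KEY'},
)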


回答 2

我喜欢s3fs,它使您可以像本地文件系统一样(几乎)使用s3。

你可以这样做:

import s3fs

bytes_to_write = df.to_csv(None).encode()
fs = s3fs.S3FileSystem(key=key, secret=secret)
with fs.open('s3://bucket/path/to/file.csv', 'wb') as f:
    f.write(bytes_to_write)

s3fs 只支持以 rb 和 wb 模式打开文件,这就是我先把内容编码成 bytes_to_write 这个字节串的原因。

I like s3fs which lets you use s3 (almost) like a local filesystem.

You can do this:

import s3fs

bytes_to_write = df.to_csv(None).encode()
fs = s3fs.S3FileSystem(key=key, secret=secret)
with fs.open('s3://bucket/path/to/file.csv', 'wb') as f:
    f.write(bytes_to_write)

s3fs supports only rb and wb modes of opening the file, that’s why I did this bytes_to_write stuff.


回答 3

这是一个更新一些的答案:

import s3fs

s3 = s3fs.S3FileSystem(anon=False)

# Use 'w' for py3, 'wb' for py2
with s3.open('<bucket-name>/<filename>.csv','w') as f:
    df.to_csv(f)

StringIO 的问题在于它会占用大量内存。使用此方法,您是把文件流式写入 s3,而不是先把它转换成字符串再写入 s3;把 pandas 数据框连同它的字符串副本一起保存在内存中显得非常低效。

如果您是在 EC2 实例上运行,可以为实例赋予 IAM 角色,使其有权写入 s3,这样就不需要直接传递凭据。不过,您也可以通过把凭据传给 S3FileSystem() 函数来连接存储桶。参见文档:https://s3fs.readthedocs.io/en/latest/

This is a more up to date answer:

import s3fs

s3 = s3fs.S3FileSystem(anon=False)

# Use 'w' for py3, 'wb' for py2
with s3.open('<bucket-name>/<filename>.csv','w') as f:
    df.to_csv(f)

The problem with StringIO is that it will eat away at your memory. With this method, you are streaming the file to s3, rather than converting it to string, then writing it into s3. Holding the pandas dataframe and its string copy in memory seems very inefficient.

If you are working on an EC2 instance, you can give it an IAM role to enable writing to s3, so you don’t need to pass in credentials directly. However, you can also connect to a bucket by passing credentials to the S3FileSystem() function. See the documentation: https://s3fs.readthedocs.io/en/latest/
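For completeness, a small sketch of the credential-passing variant mentioned above (the key and secret values are placeholders):

import s3fs

s3 = s3fs.S3FileSystem(key='ACCESS_KEY', secret='SECRET_KEY')

with s3.open('<bucket-name>/<filename>.csv', 'w') as f:
    df.to_csv(f)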


回答 4

如果把 None 作为第一个参数传给 to_csv(),数据将以字符串形式返回;之后只需一步就能把它上传到 S3。

也可以将一个StringIO对象传递给to_csv(),但是使用字符串会更容易。

If you pass None as the first argument to to_csv() the data will be returned as a string. From there it’s an easy step to upload that to S3 in one go.

It should also be possible to pass a StringIO object to to_csv(), but using a string will be easier.
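A minimal sketch of that idea, assuming a boto3 client and placeholder bucket/key names:

import boto3

csv_string = df.to_csv(None, index=False)  # to_csv(None) returns the CSV as a str
s3 = boto3.client('s3')
s3.put_object(Bucket='my-bucket', Key='df.csv', Body=csv_string)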


回答 5

您还可以使用AWS Data Wrangler

import awswrangler as wr

wr.s3.to_csv(
    df=df,
    path="s3://...",
)

请注意,它会为您处理分段(multipart)上传,从而加快上传速度。

You can also use the AWS Data Wrangler:

import awswrangler as wr
    
wr.s3.to_csv(
    df=df,
    path="s3://...",
)

Note that it will handle multipart upload for you to make the upload faster.
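Reading the file back is symmetric; a short sketch (the s3 path is a placeholder, and awswrangler also exposes wr.s3.read_csv):

import awswrangler as wr

df2 = wr.s3.read_csv(path="s3://...")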


回答 6

我发现除了 resource,也可以用 client 来实现。

from io import StringIO
import boto3
s3 = boto3.client("s3",\
                  region_name=region_name,\
                  aws_access_key_id=aws_access_key_id,\
                  aws_secret_access_key=aws_secret_access_key)
csv_buf = StringIO()
df.to_csv(csv_buf, header=True, index=False)
csv_buf.seek(0)
s3.put_object(Bucket=bucket, Body=csv_buf.getvalue(), Key='path/test.csv')

I found this can be done using client also and not just resource.

from io import StringIO
import boto3
s3 = boto3.client("s3",\
                  region_name=region_name,\
                  aws_access_key_id=aws_access_key_id,\
                  aws_secret_access_key=aws_secret_access_key)
csv_buf = StringIO()
df.to_csv(csv_buf, header=True, index=False)
csv_buf.seek(0)
s3.put_object(Bucket=bucket, Body=csv_buf.getvalue(), Key='path/test.csv')

回答 7

由于您正在使用boto3.client(),请尝试:

import boto3
from io import StringIO #python3 
s3 = boto3.client('s3', aws_access_key_id='key', aws_secret_access_key='secret_key')
def copy_to_s3(client, df, bucket, filepath):
    csv_buf = StringIO()
    df.to_csv(csv_buf, header=True, index=False)
    csv_buf.seek(0)
    client.put_object(Bucket=bucket, Body=csv_buf.getvalue(), Key=filepath)
    print(f'Copy {df.shape[0]} rows to S3 Bucket {bucket} at {filepath}, Done!')

copy_to_s3(client=s3, df=df_to_upload, bucket='abc', filepath='def/test.csv')

since you are using boto3.client(), try:

import boto3
from io import StringIO #python3 
s3 = boto3.client('s3', aws_access_key_id='key', aws_secret_access_key='secret_key')
def copy_to_s3(client, df, bucket, filepath):
    csv_buf = StringIO()
    df.to_csv(csv_buf, header=True, index=False)
    csv_buf.seek(0)
    client.put_object(Bucket=bucket, Body=csv_buf.getvalue(), Key=filepath)
    print(f'Copy {df.shape[0]} rows to S3 Bucket {bucket} at {filepath}, Done!')

copy_to_s3(client=s3, df=df_to_upload, bucket='abc', filepath='def/test.csv')

回答 8

我找到了一个似乎很有效的简单解决方案:

s3 = boto3.client("s3")

s3.put_object(
    Body=open("filename.csv").read(),
    Bucket="your-bucket",
    Key="your-key"
)

希望能有所帮助!

I found a very simple solution that seems to be working :

s3 = boto3.client("s3")

s3.put_object(
    Body=open("filename.csv").read(),
    Bucket="your-bucket",
    Key="your-key"
)

Hope that helps !


回答 9

我从 s3 存储桶中读取了一个两列的 csv 文件,并把 csv 文件的内容放进了 pandas 数据框。

例:

config.json

{
  "credential": {
    "access_key": "xxxxxx",
    "secret_key": "xxxxxx"
  },
  "s3": {
    "bucket": "mybucket",
    "key": "csv/user.csv"
  }
}

cls_config.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import json

class cls_config(object):

    def __init__(self,filename):

        self.filename = filename


    def getConfig(self):

        fileName = os.path.join(os.path.dirname(__file__), self.filename)
        with open(fileName) as f:
            config = json.load(f)
        return config

cls_pandas.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pandas as pd
import io

class cls_pandas(object):

    def __init__(self):
        pass

    def read(self,stream):

        df = pd.read_csv(io.StringIO(stream), sep = ",")
        return df

cls_s3.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import boto3
import json

class cls_s3(object):

    def  __init__(self,access_key,secret_key):

        self.s3 = boto3.client('s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key)

    def getObject(self,bucket,key):

        read_file = self.s3.get_object(Bucket=bucket, Key=key)
        body = read_file['Body'].read().decode('utf-8')
        return body

test.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from cls_config import *
from cls_s3 import *
from cls_pandas import *

class test(object):

    def __init__(self):
        self.conf = cls_config('config.json')

    def process(self):

        conf = self.conf.getConfig()

        bucket = conf['s3']['bucket']
        key = conf['s3']['key']

        access_key = conf['credential']['access_key']
        secret_key = conf['credential']['secret_key']

        s3 = cls_s3(access_key,secret_key)
        ob = s3.getObject(bucket,key)

        pa = cls_pandas()
        df = pa.read(ob)

        print(df)

if __name__ == '__main__':
    test = test()
    test.process()

I read a csv with two columns from an s3 bucket, and put the contents of the csv file into a pandas dataframe.

Example:

config.json

{
  "credential": {
    "access_key": "xxxxxx",
    "secret_key": "xxxxxx"
  },
  "s3": {
    "bucket": "mybucket",
    "key": "csv/user.csv"
  }
}

cls_config.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import json

class cls_config(object):

    def __init__(self,filename):

        self.filename = filename


    def getConfig(self):

        fileName = os.path.join(os.path.dirname(__file__), self.filename)
        with open(fileName) as f:
            config = json.load(f)
        return config

cls_pandas.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import pandas as pd
import io

class cls_pandas(object):

    def __init__(self):
        pass

    def read(self,stream):

        df = pd.read_csv(io.StringIO(stream), sep = ",")
        return df

cls_s3.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import boto3
import json

class cls_s3(object):

    def  __init__(self,access_key,secret_key):

        self.s3 = boto3.client('s3', aws_access_key_id=access_key, aws_secret_access_key=secret_key)

    def getObject(self,bucket,key):

        read_file = self.s3.get_object(Bucket=bucket, Key=key)
        body = read_file['Body'].read().decode('utf-8')
        return body

test.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from cls_config import *
from cls_s3 import *
from cls_pandas import *

class test(object):

    def __init__(self):
        self.conf = cls_config('config.json')

    def process(self):

        conf = self.conf.getConfig()

        bucket = conf['s3']['bucket']
        key = conf['s3']['key']

        access_key = conf['credential']['access_key']
        secret_key = conf['credential']['secret_key']

        s3 = cls_s3(access_key,secret_key)
        ob = s3.getObject(bucket,key)

        pa = cls_pandas()
        df = pa.read(ob)

        print(df)

if __name__ == '__main__':
    test = test()
    test.process()

如何使用Boto将文件上传到S3存储桶中的目录

问题:如何使用Boto将文件上传到S3存储桶中的目录

我想使用python在s3存储桶中复制文件。

例如:我的存储桶名称为 test。存储桶中有两个文件夹,名为 “dump” 和 “input”。现在,我想使用 python 把文件从本地目录复制到 S3 的 “dump” 文件夹……有人可以帮我吗?

I want to copy a file in s3 bucket using python.

Ex : I have bucket name = test. And in the bucket, I have 2 folders name “dump” & “input”. Now I want to copy a file from local directory to S3 “dump” folder using python… Can anyone help me?


回答 0

试试这个…

import boto
import boto.s3
import sys
from boto.s3.key import Key

AWS_ACCESS_KEY_ID = ''
AWS_SECRET_ACCESS_KEY = ''

bucket_name = AWS_ACCESS_KEY_ID.lower() + '-dump'
conn = boto.connect_s3(AWS_ACCESS_KEY_ID,
        AWS_SECRET_ACCESS_KEY)


bucket = conn.create_bucket(bucket_name,
    location=boto.s3.connection.Location.DEFAULT)

testfile = "replace this with an actual filename"
print 'Uploading %s to Amazon S3 bucket %s' % \
   (testfile, bucket_name)

def percent_cb(complete, total):
    sys.stdout.write('.')
    sys.stdout.flush()


k = Key(bucket)
k.key = 'my test file'
k.set_contents_from_filename(testfile,
    cb=percent_cb, num_cb=10)

[更新] 我不是 Python 专家,感谢您提醒 import 语句的问题。另外,我不建议把凭据写进源代码。如果您在 AWS 内部运行这段代码,请使用带实例配置文件(Instance Profile)的 IAM 凭证(http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html);要在开发/测试环境中保持相同的行为,可以使用 AdRoll 的 Hologram 之类的工具(https://github.com/AdRoll/hologram)。

Try this…

import boto
import boto.s3
import sys
from boto.s3.key import Key

AWS_ACCESS_KEY_ID = ''
AWS_SECRET_ACCESS_KEY = ''

bucket_name = AWS_ACCESS_KEY_ID.lower() + '-dump'
conn = boto.connect_s3(AWS_ACCESS_KEY_ID,
        AWS_SECRET_ACCESS_KEY)


bucket = conn.create_bucket(bucket_name,
    location=boto.s3.connection.Location.DEFAULT)

testfile = "replace this with an actual filename"
print 'Uploading %s to Amazon S3 bucket %s' % \
   (testfile, bucket_name)

def percent_cb(complete, total):
    sys.stdout.write('.')
    sys.stdout.flush()


k = Key(bucket)
k.key = 'my test file'
k.set_contents_from_filename(testfile,
    cb=percent_cb, num_cb=10)

[UPDATE] I am not a pythonist, so thanks for the heads up about the import statements. Also, I’d not recommend placing credentials inside your own source code. If you are running this inside AWS use IAM Credentials with Instance Profiles (http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html), and to keep the same behaviour in your Dev/Test environment, use something like Hologram from AdRoll (https://github.com/AdRoll/hologram)
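As a rough sketch of that advice (assuming the credentials come from an instance profile, environment variables, or a boto config file instead of the source code, and using a placeholder bucket name):

import boto
import boto.s3

# With no keys passed in, boto falls back to its credential chain
# (environment variables, boto config file, or the EC2 instance profile),
# so nothing sensitive has to live in the source.
conn = boto.connect_s3()
bucket = conn.get_bucket('my-existing-bucket')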


回答 1

无需使其变得如此复杂:

s3_connection = boto.connect_s3()
bucket = s3_connection.get_bucket('your bucket name')
key = boto.s3.key.Key(bucket, 'some_file.zip')
with open('some_file.zip', 'rb') as f:
    key.send_file(f)

No need to make it that complicated:

s3_connection = boto.connect_s3()
bucket = s3_connection.get_bucket('your bucket name')
key = boto.s3.key.Key(bucket, 'some_file.zip')
with open('some_file.zip', 'rb') as f:
    key.send_file(f)

回答 2

import boto3

s3 = boto3.resource('s3')
BUCKET = "test"

s3.Bucket(BUCKET).upload_file("your/local/file", "dump/file")

回答 3

我用了它,实现起来很简单

import tinys3

conn = tinys3.Connection('S3_ACCESS_KEY','S3_SECRET_KEY',tls=True)

f = open('some_file.zip','rb')
conn.upload('some_file.zip',f,'my_bucket')

https://www.smore.com/labs/tinys3/

I used this and it is very simple to implement

import tinys3

conn = tinys3.Connection('S3_ACCESS_KEY','S3_SECRET_KEY',tls=True)

f = open('some_file.zip','rb')
conn.upload('some_file.zip',f,'my_bucket')

https://www.smore.com/labs/tinys3/


回答 4

from boto3.s3.transfer import S3Transfer
import boto3
#have all the variables populated which are required below
client = boto3.client('s3', aws_access_key_id=access_key,aws_secret_access_key=secret_key)
transfer = S3Transfer(client)
transfer.upload_file(filepath, bucket_name, folder_name+"/"+filename)

回答 5

在具有凭据的会话中将文件上传到s3。

import boto3

session = boto3.Session(
    aws_access_key_id='AWS_ACCESS_KEY_ID',
    aws_secret_access_key='AWS_SECRET_ACCESS_KEY',
)
s3 = session.resource('s3')
# Filename - File to upload
# Bucket - Bucket to upload to (the top level directory under AWS S3)
# Key - S3 object name (can contain subdirectories). If not specified then file_name is used
s3.meta.client.upload_file(Filename='input_file_path', Bucket='bucket_name', Key='s3_output_key')

Upload file to s3 within a session with credentials.

import boto3

session = boto3.Session(
    aws_access_key_id='AWS_ACCESS_KEY_ID',
    aws_secret_access_key='AWS_SECRET_ACCESS_KEY',
)
s3 = session.resource('s3')
# Filename - File to upload
# Bucket - Bucket to upload to (the top level directory under AWS S3)
# Key - S3 object name (can contain subdirectories). If not specified then file_name is used
s3.meta.client.upload_file(Filename='input_file_path', Bucket='bucket_name', Key='s3_output_key')

回答 6

这也将起作用:

import os 
import boto
import boto.s3.connection
from boto.s3.key import Key

try:

    conn = boto.s3.connect_to_region('us-east-1',
    aws_access_key_id = 'AWS-Access-Key',
    aws_secret_access_key = 'AWS-Secrete-Key',
    # host = 's3-website-us-east-1.amazonaws.com',
    # is_secure=True,               # uncomment if you are not using ssl
    calling_format = boto.s3.connection.OrdinaryCallingFormat(),
    )

    bucket = conn.get_bucket('YourBucketName')
    key_name = 'FileToUpload'
    path = 'images/holiday' #Directory Under which file should get upload
    full_key_name = os.path.join(path, key_name)
    k = bucket.new_key(full_key_name)
    k.set_contents_from_filename(key_name)

except Exception,e:
    print str(e)
    print "error"   

This will also work:

import os 
import boto
import boto.s3.connection
from boto.s3.key import Key

try:

    conn = boto.s3.connect_to_region('us-east-1',
    aws_access_key_id = 'AWS-Access-Key',
    aws_secret_access_key = 'AWS-Secrete-Key',
    # host = 's3-website-us-east-1.amazonaws.com',
    # is_secure=True,               # uncomment if you are not using ssl
    calling_format = boto.s3.connection.OrdinaryCallingFormat(),
    )

    bucket = conn.get_bucket('YourBucketName')
    key_name = 'FileToUpload'
    path = 'images/holiday' #Directory Under which file should get upload
    full_key_name = os.path.join(path, key_name)
    k = bucket.new_key(full_key_name)
    k.set_contents_from_filename(key_name)

except Exception,e:
    print str(e)
    print "error"   

回答 7

这只需要三行代码。只需按照 boto3 文档中的说明进行操作。

import boto3
s3 = boto3.resource(service_name = 's3')
s3.meta.client.upload_file(Filename = 'C:/foo/bar/baz.filetype', Bucket = 'yourbucketname', Key = 'baz.filetype')

一些重要的参数是:

参数:

  • Filename(str)- 要上传的文件的路径。
  • Bucket(str)- 要上传到的存储桶的名称。
  • Key(str)- 您要在 s3 存储桶中为该文件指定的键(Key)名称。它可以与文件名相同,也可以是您选择的其他名称,但文件类型应保持不变。

    注意:我假设您已按照 boto3 文档中的最佳配置实践,把凭据保存在 ~/.aws 文件夹中。

    This is a three liner. Just follow the instructions on the boto3 documentation.

    import boto3
    s3 = boto3.resource(service_name = 's3')
    s3.meta.client.upload_file(Filename = 'C:/foo/bar/baz.filetype', Bucket = 'yourbucketname', Key = 'baz.filetype')
    

    Some important arguments are:

    Parameters:

  • Filename (str) — The path to the file to upload.
  • Bucket (str) — The name of the bucket to upload to.
  • Key (str) — The name of the key that you want to assign to your file in your s3 bucket. This could be the same as the name of the file or a different name of your choice but the filetype should remain the same.

    Note: I assume that you have saved your credentials in the ~/.aws folder as suggested in the best configuration practices in the boto3 documentation.
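    For reference, the shared credentials file mentioned above normally lives at ~/.aws/credentials and looks roughly like this (the values are placeholders, not real keys):

    [default]
    aws_access_key_id = YOUR_ACCESS_KEY_ID
    aws_secret_access_key = YOUR_SECRET_ACCESS_KEY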


    回答 8

    import boto
    from boto.s3.key import Key
    
    AWS_ACCESS_KEY_ID = ''
    AWS_SECRET_ACCESS_KEY = ''
    END_POINT = ''                          # eg. us-east-1
    S3_HOST = ''                            # eg. s3.us-east-1.amazonaws.com
    BUCKET_NAME = 'test'        
    FILENAME = 'upload.txt'                
    UPLOADED_FILENAME = 'dumps/upload.txt'
    # include folders in file path. If it doesn't exist, it will be created
    
    s3 = boto.s3.connect_to_region(END_POINT,
                               aws_access_key_id=AWS_ACCESS_KEY_ID,
                               aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                               host=S3_HOST)
    
    bucket = s3.get_bucket(BUCKET_NAME)
    k = Key(bucket)
    k.key = UPLOADED_FILENAME
    k.set_contents_from_filename(FILENAME)

    回答 9

    使用boto3

    import logging
    import boto3
    from botocore.exceptions import ClientError
    
    
    def upload_file(file_name, bucket, object_name=None):
        """Upload a file to an S3 bucket
    
        :param file_name: File to upload
        :param bucket: Bucket to upload to
        :param object_name: S3 object name. If not specified then file_name is used
        :return: True if file was uploaded, else False
        """
    
        # If S3 object_name was not specified, use file_name
        if object_name is None:
            object_name = file_name
    
        # Upload the file
        s3_client = boto3.client('s3')
        try:
            response = s3_client.upload_file(file_name, bucket, object_name)
        except ClientError as e:
            logging.error(e)
            return False
        return True

    有关更多信息,请参阅:https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html

    Using boto3

    import logging
    import boto3
    from botocore.exceptions import ClientError
    
    
    def upload_file(file_name, bucket, object_name=None):
        """Upload a file to an S3 bucket
    
        :param file_name: File to upload
        :param bucket: Bucket to upload to
        :param object_name: S3 object name. If not specified then file_name is used
        :return: True if file was uploaded, else False
        """
    
        # If S3 object_name was not specified, use file_name
        if object_name is None:
            object_name = file_name
    
        # Upload the file
        s3_client = boto3.client('s3')
        try:
            response = s3_client.upload_file(file_name, bucket, object_name)
        except ClientError as e:
            logging.error(e)
            return False
        return True
    

    For more: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-uploading-files.html
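    A possible call site for the helper above, with placeholder file and bucket names (these are illustrative only):

    if upload_file('local_report.csv', 'my-bucket', 'reports/report.csv'):
        print('upload succeeded')
    else:
        print('upload failed')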


    回答 10

    上传整个文件夹的示例代码如下(原回答还附有 S3 文件夹的截图):

    import boto
    import boto.s3
    import boto.s3.connection
    import os.path
    import sys    
    
    # Fill in info on data to upload
    # destination bucket name
    bucket_name = 'willie20181121'
    # source directory
    sourceDir = '/home/willie/Desktop/x/'  #Linux Path
    # destination directory name (on s3)
    destDir = '/test1/'   #S3 Path
    
    #max size in bytes before uploading in parts. between 1 and 5 GB recommended
    MAX_SIZE = 20 * 1000 * 1000
    #size of parts when uploading in parts
    PART_SIZE = 6 * 1000 * 1000
    
    access_key = 'MPBVAQ*******IT****'
    secret_key = '11t63yDV***********HgUcgMOSN*****'
    
    conn = boto.connect_s3(
            aws_access_key_id = access_key,
            aws_secret_access_key = secret_key,
            host = '******.org.tw',
            is_secure=False,               # uncomment if you are not using ssl
            calling_format = boto.s3.connection.OrdinaryCallingFormat(),
            )
    bucket = conn.create_bucket(bucket_name,
            location=boto.s3.connection.Location.DEFAULT)
    
    
    uploadFileNames = []
    for (sourceDir, dirname, filename) in os.walk(sourceDir):
        uploadFileNames.extend(filename)
        break
    
    def percent_cb(complete, total):
        sys.stdout.write('.')
        sys.stdout.flush()
    
    for filename in uploadFileNames:
        sourcepath = os.path.join(sourceDir + filename)
        destpath = os.path.join(destDir, filename)
        print ('Uploading %s to Amazon S3 bucket %s' % \
               (sourcepath, bucket_name))
    
        filesize = os.path.getsize(sourcepath)
        if filesize > MAX_SIZE:
            print ("multipart upload")
            mp = bucket.initiate_multipart_upload(destpath)
            fp = open(sourcepath,'rb')
            fp_num = 0
            while (fp.tell() < filesize):
                fp_num += 1
                print ("uploading part %i" %fp_num)
                mp.upload_part_from_file(fp, fp_num, cb=percent_cb, num_cb=10, size=PART_SIZE)
    
            mp.complete_upload()
    
        else:
            print ("singlepart upload")
            k = boto.s3.key.Key(bucket)
            k.key = destpath
            k.set_contents_from_filename(sourcepath,
                    cb=percent_cb, num_cb=10)

    PS:有关更多参考URL

    For uploading a whole folder, see the following code (the original answer also included a screenshot of the S3 folder):

    import boto
    import boto.s3
    import boto.s3.connection
    import os.path
    import sys    
    
    # Fill in info on data to upload
    # destination bucket name
    bucket_name = 'willie20181121'
    # source directory
    sourceDir = '/home/willie/Desktop/x/'  #Linux Path
    # destination directory name (on s3)
    destDir = '/test1/'   #S3 Path
    
    #max size in bytes before uploading in parts. between 1 and 5 GB recommended
    MAX_SIZE = 20 * 1000 * 1000
    #size of parts when uploading in parts
    PART_SIZE = 6 * 1000 * 1000
    
    access_key = 'MPBVAQ*******IT****'
    secret_key = '11t63yDV***********HgUcgMOSN*****'
    
    conn = boto.connect_s3(
            aws_access_key_id = access_key,
            aws_secret_access_key = secret_key,
            host = '******.org.tw',
            is_secure=False,               # uncomment if you are not using ssl
            calling_format = boto.s3.connection.OrdinaryCallingFormat(),
            )
    bucket = conn.create_bucket(bucket_name,
            location=boto.s3.connection.Location.DEFAULT)
    
    
    uploadFileNames = []
    for (sourceDir, dirname, filename) in os.walk(sourceDir):
        uploadFileNames.extend(filename)
        break
    
    def percent_cb(complete, total):
        sys.stdout.write('.')
        sys.stdout.flush()
    
    for filename in uploadFileNames:
        sourcepath = os.path.join(sourceDir + filename)
        destpath = os.path.join(destDir, filename)
        print ('Uploading %s to Amazon S3 bucket %s' % \
               (sourcepath, bucket_name))
    
        filesize = os.path.getsize(sourcepath)
        if filesize > MAX_SIZE:
            print ("multipart upload")
            mp = bucket.initiate_multipart_upload(destpath)
            fp = open(sourcepath,'rb')
            fp_num = 0
            while (fp.tell() < filesize):
                fp_num += 1
                print ("uploading part %i" %fp_num)
                mp.upload_part_from_file(fp, fp_num, cb=percent_cb, num_cb=10, size=PART_SIZE)
    
            mp.complete_upload()
    
        else:
            print ("singlepart upload")
            k = boto.s3.key.Key(bucket)
            k.key = destpath
            k.set_contents_from_filename(sourcepath,
                    cb=percent_cb, num_cb=10)
    

    PS: For more reference URL


    回答 11

    xmlstr = etree.tostring(listings,  encoding='utf8', method='xml')
    conn = boto.connect_s3(
            aws_access_key_id = access_key,
            aws_secret_access_key = secret_key,
            # host = '<bucketName>.s3.amazonaws.com',
            host = 'bycket.s3.amazonaws.com',
            #is_secure=False,               # uncomment if you are not using ssl
            calling_format = boto.s3.connection.OrdinaryCallingFormat(),
            )
    conn.auth_region_name = 'us-west-1'
    
    bucket = conn.get_bucket('resources', validate=False)
    key= bucket.get_key('filename.txt')
    key.set_contents_from_string("SAMPLE TEXT")
    key.set_canned_acl('public-read')

    回答 12

    我这里有一个在我看来结构更清晰一些的版本:

    import boto3
    from pprint import pprint
    from botocore.exceptions import NoCredentialsError
    
    
    class S3(object):
        BUCKET = "test"
        connection = None
    
        def __init__(self):
            try:
                vars = get_s3_credentials("aws")
                # 占位字符串:请替换为真实凭据,并以关键字参数传入
                self.connection = boto3.resource('s3',
                                                 aws_access_key_id='aws_access_key_id',
                                                 aws_secret_access_key='aws_secret_access_key')
            except(Exception) as error:
                print(error)
                self.connection = None
    
    
        def upload_file(self, file_to_upload_path, file_name):
            if file_to_upload_path is None or file_name is None: return False
            try:
                pprint(file_to_upload_path)
                file_name = "your-folder-inside-s3/{0}".format(file_name)
                self.connection.Bucket(self.BUCKET).upload_file(file_to_upload_path, 
                                                                          file_name)
                print("Upload Successful")
                return True
    
            except FileNotFoundError:
                print("The file was not found")
                return False
    
            except NoCredentialsError:
                print("Credentials not available")
                return False
    
    

    这里有三个重要的变量:常量 BUCKET、file_to_upload_path 和 file_name。

    BUCKET:是您的S3存储桶的名称

    file_to_upload_path:必须是您要上传的文件的路径

    file_name:是存储桶中生成的文件和路径(这是您添加文件夹或其他内容的位置)

    有很多方法,但是您可以在这样的另一个脚本中重用此代码

    import S3
    
    def some_function():
        S3.S3().upload_file(path_to_file, final_file_name)

    I have something that seems to me to be a bit more orderly:

    import boto3
    from pprint import pprint
    from botocore.exceptions import NoCredentialsError
    
    
    class S3(object):
        BUCKET = "test"
        connection = None
    
        def __init__(self):
            try:
                vars = get_s3_credentials("aws")
                # Placeholder strings: pass real credentials as keyword arguments.
                self.connection = boto3.resource('s3',
                                                 aws_access_key_id='aws_access_key_id',
                                                 aws_secret_access_key='aws_secret_access_key')
            except(Exception) as error:
                print(error)
                self.connection = None
    
    
        def upload_file(self, file_to_upload_path, file_name):
            if file_to_upload_path is None or file_name is None: return False
            try:
                pprint(file_to_upload_path)
                file_name = "your-folder-inside-s3/{0}".format(file_name)
                self.connection.Bucket(self.BUCKET).upload_file(file_to_upload_path, 
                                                                          file_name)
                print("Upload Successful")
                return True
    
            except FileNotFoundError:
                print("The file was not found")
                return False
    
            except NoCredentialsError:
                print("Credentials not available")
                return False
    
    
    

    There are three important variables here: the BUCKET const, the file_to_upload_path and the file_name

    BUCKET: is the name of your S3 bucket

    file_to_upload_path: must be the path from file you want to upload

    file_name: is the resulting file and path in your bucket (this is where you add folders or what ever)

    There are many ways, but you can reuse this code in another script like this

    import S3
    
    def some_function():
        S3.S3().upload_file(path_to_file, final_file_name)
    

    连接到boto3 S3时如何指定凭据?

    问题:连接到boto3 S3时如何指定凭据?

    在boto上,当以这种方式连接到S3时,我通常指定我的凭据:

    import boto
    from boto.s3.connection import Key, S3Connection
    S3 = S3Connection( settings.AWS_SERVER_PUBLIC_KEY, settings.AWS_SERVER_SECRET_KEY )
    

    然后,我可以使用S3执行操作(在我的情况下,从存储桶中删除对象)。

    使用boto3,我发现的所有示例都是这样的:

    import boto3
    S3 = boto3.resource( 's3' )
    S3.Object( bucket_name, key_name ).delete()
    

    我无法指定我的凭据,因此所有尝试均因InvalidAccessKeyId错误而失败。

    如何使用boto3指定凭据?

    On boto I used to specify my credentials when connecting to S3 in such a way:

    import boto
    from boto.s3.connection import Key, S3Connection
    S3 = S3Connection( settings.AWS_SERVER_PUBLIC_KEY, settings.AWS_SERVER_SECRET_KEY )
    

    I could then use S3 to perform my operations (in my case deleting an object from a bucket).

    With boto3 all the examples I found are such:

    import boto3
    S3 = boto3.resource( 's3' )
    S3.Object( bucket_name, key_name ).delete()
    

    I couldn’t specify my credentials and thus all attempts fail with InvalidAccessKeyId error.

    How can I specify credentials with boto3?


    回答 0

    您可以创建一个会话

    import boto3
    session = boto3.Session(
        aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
        aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
    )
    

    然后使用该会话获取S3资源:

    s3 = session.resource('s3')
    

    You can create a session:

    import boto3
    session = boto3.Session(
        aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY,
        aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY,
    )
    

    Then use that session to get an S3 resource:

    s3 = session.resource('s3')
    

    回答 1

    您可以像下面这样直接获取一个带有新会话的 client。

     s3_client = boto3.client('s3', 
                          aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, 
                          aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY, 
                          region_name=REGION_NAME
                          )
    

    You can get a client with new session directly like below.

     s3_client = boto3.client('s3', 
                          aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, 
                          aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY, 
                          region_name=REGION_NAME
                          )
    

    回答 2

    这个问题比较老了,但我也把它放在这里以备参考。boto3.resource 只是使用默认的 Session,您可以把会话相关的参数直接传给 boto3.resource。

    Help on function resource in module boto3:
    
    resource(*args, **kwargs)
        Create a resource service client by name using the default session.
    
        See :py:meth:`boto3.session.Session.resource`.
    

    https://github.com/boto/boto3/blob/86392b5ca26da57ce6a776365a52d3cab8487d60/boto3/session.py#L265

    您会看到它只接受与Boto3.Session相同的参数

    import boto3
    S3 = boto3.resource('s3', region_name='us-west-2', aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY)
    S3.Object( bucket_name, key_name ).delete()
    

    This is older but placing this here for my reference too. boto3.resource just uses the default Session; you can pass session details through boto3.resource.

    Help on function resource in module boto3:
    
    resource(*args, **kwargs)
        Create a resource service client by name using the default session.
    
        See :py:meth:`boto3.session.Session.resource`.
    

    https://github.com/boto/boto3/blob/86392b5ca26da57ce6a776365a52d3cab8487d60/boto3/session.py#L265

    you can see that it just takes the same arguments as Boto3.Session

    import boto3
    S3 = boto3.resource('s3', region_name='us-west-2', aws_access_key_id=settings.AWS_SERVER_PUBLIC_KEY, aws_secret_access_key=settings.AWS_SERVER_SECRET_KEY)
    S3.Object( bucket_name, key_name ).delete()
    

    回答 3

    我想对 @JustAGuy 的答案做一点扩展。我更喜欢的方法是用 AWS CLI 创建配置文件。原因是有了配置文件后,CLI 或 SDK 会自动在 ~/.aws 文件夹中查找凭据。而且 AWS CLI 本身就是用 Python 编写的,这也是个好处。

    如果还没有安装,可以从 PyPI 安装 CLI。以下是在终端中配置 CLI 的步骤:

    $> pip install awscli  #can add user flag 
    $> aws configure
    AWS Access Key ID [****************ABCD]:[enter your key here]
    AWS Secret Access Key [****************xyz]:[enter your secret key here]
    Default region name [us-west-2]:[enter your region here]
    Default output format [None]:
    

    之后,您无需指定密钥即可访问 boto 和任何 API(除非您想使用其他凭据)。

    I’d like expand on @JustAGuy’s answer. The method I prefer is to use AWS CLI to create a config file. The reason is, with the config file, the CLI or the SDK will automatically look for credentials in the ~/.aws folder. And the good thing is that AWS CLI is written in python.

    You can get cli from pypi if you don’t have it already. Here are the steps to get cli set up from terminal

    $> pip install awscli  #can add user flag 
    $> aws configure
    AWS Access Key ID [****************ABCD]:[enter your key here]
    AWS Secret Access Key [****************xyz]:[enter your secret key here]
    Default region name [us-west-2]:[enter your region here]
    Default output format [None]:
    

    After this you can access boto and any of the api without having to specify keys (unless you want to use a different credentials).
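    Once a profile has been configured this way, boto3 picks it up automatically; you can also select a named profile explicitly. A minimal sketch (the profile name 'default' is whatever you configured with aws configure):

    import boto3

    session = boto3.Session(profile_name='default')
    s3 = session.resource('s3')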


    回答 4

    在继续使用 boto3.resource() 的同时,有多种存储凭证的方法。我自己使用的是 AWS CLI 方法,运行得非常好。

    https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html?fbclid=IwAR2LlrS4O2gYH6xAF4QDVIH2Q2tzfF_VZ6loM3XfXsPAOR4qA-pX_qAILys

    There are numerous ways to store credentials while still using boto3.resource(). I’m using the AWS CLI method myself. It works perfectly.

    https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html?fbclid=IwAR2LlrS4O2gYH6xAF4QDVIH2Q2tzfF_VZ6loM3XfXsPAOR4qA-pX_qAILys
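    One of the options described in that guide is environment variables, which boto3 reads automatically. A minimal sketch with placeholder values:

    import os
    import boto3

    # Placeholders only; in practice export these in your shell or deployment environment.
    os.environ['AWS_ACCESS_KEY_ID'] = 'YOUR_ACCESS_KEY_ID'
    os.environ['AWS_SECRET_ACCESS_KEY'] = 'YOUR_SECRET_ACCESS_KEY'
    os.environ['AWS_DEFAULT_REGION'] = 'us-east-1'

    s3 = boto3.resource('s3')  # no explicit credentials needed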


    如何使用boto3将文件或数据写入S3对象

    问题:如何使用boto3将文件或数据写入S3对象

    在 boto 2 中,可以使用 Key.set_contents_from_* 系列方法(例如 set_contents_from_file、set_contents_from_filename)写入 S3 对象。

    是否有boto 3等效项?将数据保存到S3上存储的对象的boto3方法是什么?

    In boto 2, you can write to an S3 object using the Key.set_contents_from_* methods (for example set_contents_from_file and set_contents_from_filename).

    Is there a boto 3 equivalent? What is the boto3 method for saving data to an object stored on S3?


    回答 0

    在 Boto 3 中,“Key.set_contents_from_*” 系列方法被替换为 Object.put() 和 Client.put_object()。

    例如:

    import boto3
    
    some_binary_data = b'Here we have some data'
    more_binary_data = b'Here we have some more data'
    
    # Method 1: Object.put()
    s3 = boto3.resource('s3')
    object = s3.Object('my_bucket_name', 'my/key/including/filename.txt')
    object.put(Body=some_binary_data)
    
    # Method 2: Client.put_object()
    client = boto3.client('s3')
    client.put_object(Body=more_binary_data, Bucket='my_bucket_name', Key='my/key/including/anotherfilename.txt')

    另外,二进制数据也可以来自读取文件,正如官方文档中对比 boto 2 和 boto 3 的部分所述:

    存储数据

    从文件,流或字符串存储数据很容易:

    # Boto 2.x
    from boto.s3.key import Key
    key = Key('hello.txt')
    key.set_contents_from_file('/tmp/hello.txt')
    
    # Boto 3
    s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))

    In boto 3, the ‘Key.set_contents_from_*’ methods were replaced by Object.put() and Client.put_object().

    For example:

    import boto3
    
    some_binary_data = b'Here we have some data'
    more_binary_data = b'Here we have some more data'
    
    # Method 1: Object.put()
    s3 = boto3.resource('s3')
    object = s3.Object('my_bucket_name', 'my/key/including/filename.txt')
    object.put(Body=some_binary_data)
    
    # Method 2: Client.put_object()
    client = boto3.client('s3')
    client.put_object(Body=more_binary_data, Bucket='my_bucket_name', Key='my/key/including/anotherfilename.txt')
    

    Alternatively, the binary data can come from reading a file, as described in the official docs comparing boto 2 and boto 3:

    Storing Data

    Storing data from a file, stream, or string is easy:

    # Boto 2.x
    from boto.s3.key import Key
    key = Key('hello.txt')
    key.set_contents_from_file('/tmp/hello.txt')
    
    # Boto 3
    s3.Object('mybucket', 'hello.txt').put(Body=open('/tmp/hello.txt', 'rb'))
    

    回答 1

    boto3还有一种直接上传文件的方法:

    s3.Bucket('bucketname').upload_file('/local/file/here.txt','folder/sub/path/to/s3key')

    http://boto3.readthedocs.io/zh_CN/latest/reference/services/s3.html#S3.Bucket.upload_file

    boto3 also has a method for uploading a file directly:

    s3.Bucket('bucketname').upload_file('/local/file/here.txt','folder/sub/path/to/s3key')
    

    http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Bucket.upload_file


    回答 2

    在写入 S3 之前,您不再需要把内容转换为二进制。以下示例在 S3 存储桶中创建一个内容为字符串的新文本文件(名为 newfile.txt):

    import boto3
    
    s3 = boto3.resource(
        's3',
        region_name='us-east-1',
        aws_access_key_id=KEY_ID,
        aws_secret_access_key=ACCESS_KEY
    )
    content="String content to write to a new S3 file"
    s3.Object('my-bucket-name', 'newfile.txt').put(Body=content)

    You no longer have to convert the contents to binary before writing to the file in S3. The following example creates a new text file (called newfile.txt) in an S3 bucket with string contents:

    import boto3
    
    s3 = boto3.resource(
        's3',
        region_name='us-east-1',
        aws_access_key_id=KEY_ID,
        aws_secret_access_key=ACCESS_KEY
    )
    content="String content to write to a new S3 file"
    s3.Object('my-bucket-name', 'newfile.txt').put(Body=content)
    

    回答 3

    这是一个从s3读取JSON的好技巧:

    import json, boto3
    s3 = boto3.resource("s3").Bucket("bucket")
    json.load_s3 = lambda f: json.load(s3.Object(key=f).get()["Body"])
    json.dump_s3 = lambda obj, f: s3.Object(key=f).put(Body=json.dumps(obj))

    现在您可以像使用 load 和 dump 一样,用相同的 API 来使用 json.load_s3 和 json.dump_s3:

    data = {"test":0}
    json.dump_s3(data, "key") # saves json to s3://bucket/key
    data = json.load_s3("key") # read json from s3://bucket/key

    Here’s a nice trick to read JSON from s3:

    import json, boto3
    s3 = boto3.resource("s3").Bucket("bucket")
    json.load_s3 = lambda f: json.load(s3.Object(key=f).get()["Body"])
    json.dump_s3 = lambda obj, f: s3.Object(key=f).put(Body=json.dumps(obj))
    

    Now you can use json.load_s3 and json.dump_s3 with the same API as load and dump

    data = {"test":0}
    json.dump_s3(data, "key") # saves json to s3://bucket/key
    data = json.load_s3("key") # read json from s3://bucket/key
    

    回答 4

    下面是一个我用来把文件即时上传到指定 S3 存储桶及子文件夹的简洁版本:

    import boto3
    
    BUCKET_NAME = 'sample_bucket_name'
    PREFIX = 'sub-folder/'
    
    s3 = boto3.resource('s3')
    
    # Creating an empty file called "_DONE" and putting it in the S3 bucket
    s3.Object(BUCKET_NAME, PREFIX + '_DONE').put(Body="")

    注意:您应始终将 AWS 凭证(aws_access_key_id 和 aws_secret_access_key)放在单独的文件中,例如 ~/.aws/credentials。

    A cleaner and concise version which I use to upload files on the fly to a given S3 bucket and sub-folder-

    import boto3
    
    BUCKET_NAME = 'sample_bucket_name'
    PREFIX = 'sub-folder/'
    
    s3 = boto3.resource('s3')
    
    # Creating an empty file called "_DONE" and putting it in the S3 bucket
    s3.Object(BUCKET_NAME, PREFIX + '_DONE').put(Body="")
    

    Note: You should ALWAYS put your AWS credentials (aws_access_key_id and aws_secret_access_key) in a separate file, for example- ~/.aws/credentials


    回答 5

    值得一提的是 smart-open,它使用 boto3 作为后端。

    smart-open 可以直接替代 Python 内置的 open,能够打开 s3 上的文件,也支持 ftp、http 以及许多其他协议。

    例如

    from smart_open import open
    import json
    with open("s3://your_bucket/your_key.json", 'r') as f:
        data = json.load(f)

    aws 凭证通过 boto3 的凭证机制加载,通常来自 ~/.aws/ 目录中的文件或环境变量。

    it is worth mentioning smart-open that uses boto3 as a back-end.

    smart-open is a drop-in replacement for python’s open that can open files from s3, as well as ftp, http and many other protocols.

    for example

    from smart_open import open
    import json
    with open("s3://your_bucket/your_key.json", 'r') as f:
        data = json.load(f)
    

    The aws credentials are loaded via boto3 credentials, usually a file in the ~/.aws/ dir or an environment variable.


    回答 6

    您可以使用下面的代码进行写入,例如(在 2019 年)把图像写入 S3。要连接到 S3,需要先用命令 pip install awscli 安装 AWS CLI,然后用命令 aws configure 输入凭据:

    import boto3
    import urllib3
    import uuid
    from pathlib import Path
    from io import BytesIO
    from errors import custom_exceptions as cex
    
    BUCKET_NAME = "xxx.yyy.zzz"
    POSTERS_BASE_PATH = "assets/wallcontent"
    CLOUDFRONT_BASE_URL = "https://xxx.cloudfront.net/"
    
    
    class S3(object):
        def __init__(self):
            self.client = boto3.client('s3')
            self.bucket_name = BUCKET_NAME
            self.posters_base_path = POSTERS_BASE_PATH
    
        def __download_image(self, url):
            manager = urllib3.PoolManager()
            try:
                res = manager.request('GET', url)
            except Exception:
                print("Could not download the image from URL: ", url)
                raise cex.ImageDownloadFailed
            return BytesIO(res.data)  # any file-like object that implements read()
    
        def upload_image(self, url):
            try:
                image_file = self.__download_image(url)
            except cex.ImageDownloadFailed:
                raise cex.ImageUploadFailed
    
            extension = Path(url).suffix
            id = uuid.uuid1().hex + extension
            final_path = self.posters_base_path + "/" + id
            try:
                self.client.upload_fileobj(image_file,
                                           self.bucket_name,
                                           final_path
                                           )
            except Exception:
                print("Image Upload Error for URL: ", url)
                raise cex.ImageUploadFailed
    
            return CLOUDFRONT_BASE_URL + id

    You may use the below code to write, for example an image to S3 in 2019. To be able to connect to S3 you will have to install AWS CLI using command pip install awscli, then enter few credentials using command aws configure:

    import boto3
    import urllib3
    import uuid
    from pathlib import Path
    from io import BytesIO
    from errors import custom_exceptions as cex
    
    BUCKET_NAME = "xxx.yyy.zzz"
    POSTERS_BASE_PATH = "assets/wallcontent"
    CLOUDFRONT_BASE_URL = "https://xxx.cloudfront.net/"
    
    
    class S3(object):
        def __init__(self):
            self.client = boto3.client('s3')
            self.bucket_name = BUCKET_NAME
            self.posters_base_path = POSTERS_BASE_PATH
    
        def __download_image(self, url):
            manager = urllib3.PoolManager()
            try:
                res = manager.request('GET', url)
            except Exception:
                print("Could not download the image from URL: ", url)
                raise cex.ImageDownloadFailed
            return BytesIO(res.data)  # any file-like object that implements read()
    
        def upload_image(self, url):
            try:
                image_file = self.__download_image(url)
            except cex.ImageDownloadFailed:
                raise cex.ImageUploadFailed
    
            extension = Path(url).suffix
            id = uuid.uuid1().hex + extension
            final_path = self.posters_base_path + "/" + id
            try:
                self.client.upload_fileobj(image_file,
                                           self.bucket_name,
                                           final_path
                                           )
            except Exception:
                print("Image Upload Error for URL: ", url)
                raise cex.ImageUploadFailed
    
            return CLOUDFRONT_BASE_URL + id
    

    使用boto3检查s3中存储桶中是否存在密钥

    问题:使用boto3检查s3中存储桶中是否存在密钥

    我想用 boto3 判断某个键(key)是否存在。我可以遍历存储桶中的内容,逐个检查键是否匹配。

    但这样做显得冗长,也有点小题大做。Boto3 官方文档明确说明了如何执行此操作。

    可能是我忽略了显而易见的东西。谁能指点我如何实现这一点。

    I would like to know if a key exists in boto3. I can loop the bucket contents and check the key if it matches.

    But that seems longer and an overkill. Boto3 official docs explicitly state how to do this.

    May be I am missing the obvious. Can anybody point me how I can achieve this.


    回答 0

    Boto 2的boto.s3.key.Key对象曾经有一种exists方法,该方法通过执行HEAD请求并查看结果来检查密钥是否在S3上存在,但似乎不再存在。您必须自己做:

    import boto3
    import botocore
    
    s3 = boto3.resource('s3')
    
    try:
        s3.Object('my-bucket', 'dootdoot.jpg').load()
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            # The object does not exist.
            ...
        else:
            # Something else has gone wrong.
            raise
    else:
        # The object does exist.
        ...

    load() 对单个键执行HEAD请求,这是快速的,即使有问题的对象很大或存储桶中有很多对象也是如此。

    当然,您可能正在检查对象是否存在,是因为您打算使用它。如果是这样,您可以干脆跳过 load(),直接执行 get() 或 download_file(),然后在那里处理错误情况。

    Boto 2’s boto.s3.key.Key object used to have an exists method that checked if the key existed on S3 by doing a HEAD request and looking at the result, but it seems that that no longer exists. You have to do it yourself:

    import boto3
    import botocore
    
    s3 = boto3.resource('s3')
    
    try:
        s3.Object('my-bucket', 'dootdoot.jpg').load()
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            # The object does not exist.
            ...
        else:
            # Something else has gone wrong.
            raise
    else:
        # The object does exist.
        ...
    

    load() does a HEAD request for a single key, which is fast, even if the object in question is large or you have many objects in your bucket.

    Of course, you might be checking if the object exists because you’re planning on using it. If that is the case, you can just forget about the load() and do a get() or download_file() directly, then handle the error case there.
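    A sketch of that pattern, reusing the same placeholder bucket and key as above:

    import boto3
    import botocore

    s3 = boto3.resource('s3')

    try:
        body = s3.Object('my-bucket', 'dootdoot.jpg').get()['Body'].read()
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] in ('404', 'NoSuchKey'):
            # The object does not exist; handle that case here.
            body = None
        else:
            # Something else has gone wrong.
            raise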


    回答 1

    我不是非常喜欢将异常用于控制流。这是在boto3中起作用的替代方法:

    import boto3
    
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('my-bucket')
    key = 'dootdoot.jpg'
    objs = list(bucket.objects.filter(Prefix=key))
    if any([w.key == key for w in objs]):
        print("Exists!")
    else:
        print("Doesn't exist")

    I’m not a big fan of using exceptions for control flow. This is an alternative approach that works in boto3:

    import boto3
    
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('my-bucket')
    key = 'dootdoot.jpg'
    objs = list(bucket.objects.filter(Prefix=key))
    if any([w.key == key for w in objs]):
        print("Exists!")
    else:
        print("Doesn't exist")
    

    回答 2

    我发现的最简单(也可能是最高效)的方法是:

    import boto3
    from botocore.errorfactory import ClientError
    
    s3 = boto3.client('s3')
    try:
        s3.head_object(Bucket='bucket_name', Key='file_path')
    except ClientError:
        # Not found
        pass

    The easiest way I found (and probably the most efficient) is this:

    import boto3
    from botocore.errorfactory import ClientError
    
    s3 = boto3.client('s3')
    try:
        s3.head_object(Bucket='bucket_name', Key='file_path')
    except ClientError:
        # Not found
        pass
    

    回答 3

    在 Boto3 中,如果要用 list_objects 检查文件夹(前缀)或文件,可以通过响应字典中是否存在 'Contents' 键来判断对象是否存在。正如 @EvilPuppetMaster 所建议的,这是避免 try/except 捕获的另一种方法:

    import boto3
    client = boto3.client('s3')
    results = client.list_objects(Bucket='my-bucket', Prefix='dootdoot.jpg')
    return 'Contents' in results

    In Boto3, if you’re checking for either a folder (prefix) or a file using list_objects. You can use the existence of ‘Contents’ in the response dict as a check for whether the object exists. It’s another way to avoid the try/except catches as @EvilPuppetMaster suggests

    import boto3
    client = boto3.client('s3')
    results = client.list_objects(Bucket='my-bucket', Prefix='dootdoot.jpg')
    return 'Contents' in results
    

    回答 4

    不仅 client 可以这样用,bucket 也可以:

    import boto3
    import botocore
    bucket = boto3.resource('s3', region_name='eu-west-1').Bucket('my-bucket')
    
    try:
      bucket.Object('my-file').get()
    except botocore.exceptions.ClientError as ex:
      if ex.response['Error']['Code'] == 'NoSuchKey':
        print('NoSuchKey')

    Not only client but bucket too:

    import boto3
    import botocore
    bucket = boto3.resource('s3', region_name='eu-west-1').Bucket('my-bucket')
    
    try:
      bucket.Object('my-file').get()
    except botocore.exceptions.ClientError as ex:
      if ex.response['Error']['Code'] == 'NoSuchKey':
        print('NoSuchKey')
    

    回答 5

    您可以使用S3Fs,它实际上是boto3的包装,它公开了典型的文件系统样式操作:

    import s3fs
    s3 = s3fs.S3FileSystem()
    s3.exists('myfile.txt')

    You can use S3Fs, which is essentially a wrapper around boto3 that exposes typical file-system style operations:

    import s3fs
    s3 = s3fs.S3FileSystem()
    s3.exists('myfile.txt')
    

    回答 6

    import boto3
    client = boto3.client('s3')
    s3_key = 'Your file without bucket name e.g. abc/bcd.txt'
    bucket = 'your bucket name'
    # 注意:如果对象不存在,head_object 会抛出 ClientError,实际使用时应配合 try/except。
    # Note: head_object raises a ClientError when the key is missing, so wrap it in try/except in practice.
    content = client.head_object(Bucket=bucket, Key=s3_key)
    if content.get('ResponseMetadata', None) is not None:
        print("File exists - s3://%s/%s " % (bucket, s3_key))
    else:
        print("File does not exist - s3://%s/%s " % (bucket, s3_key))

    回答 7

    FWIW,这是我正在使用的几个非常简单的函数:

    import os
    import boto3
    
    def get_resource(config: dict={}):
        """Loads the s3 resource.
    
        Expects AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be in the environment
        or in a config dictionary.
        Looks in the environment first."""
    
        s3 = boto3.resource('s3',
                            aws_access_key_id=os.environ.get(
                                "AWS_ACCESS_KEY_ID", config.get("AWS_ACCESS_KEY_ID")),
                            aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY", config.get("AWS_SECRET_ACCESS_KEY")))
        return s3
    
    
    def get_bucket(s3, s3_uri: str):
        """Get the bucket from the resource.
        A thin wrapper, use with caution.
    
        Example usage:
    
        >> bucket = get_bucket(get_resource(), s3_uri_prod)"""
        return s3.Bucket(s3_uri)
    
    
    def isfile_s3(bucket, key: str) -> bool:
        """Returns T/F whether the file exists."""
        objs = list(bucket.objects.filter(Prefix=key))
        return len(objs) == 1 and objs[0].key == key
    
    
    def isdir_s3(bucket, key: str) -> bool:
        """Returns T/F whether the directory exists."""
        objs = list(bucket.objects.filter(Prefix=key))
        return len(objs) > 1

    FWIW, here are the very simple functions that I am using

    import os
    import boto3
    
    def get_resource(config: dict={}):
        """Loads the s3 resource.
    
        Expects AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be in the environment
        or in a config dictionary.
        Looks in the environment first."""
    
        s3 = boto3.resource('s3',
                            aws_access_key_id=os.environ.get(
                                "AWS_ACCESS_KEY_ID", config.get("AWS_ACCESS_KEY_ID")),
                            aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY", config.get("AWS_SECRET_ACCESS_KEY")))
        return s3
    
    
    def get_bucket(s3, s3_uri: str):
        """Get the bucket from the resource.
        A thin wrapper, use with caution.
    
        Example usage:
    
        >> bucket = get_bucket(get_resource(), s3_uri_prod)"""
        return s3.Bucket(s3_uri)
    
    
    def isfile_s3(bucket, key: str) -> bool:
        """Returns T/F whether the file exists."""
        objs = list(bucket.objects.filter(Prefix=key))
        return len(objs) == 1 and objs[0].key == key
    
    
    def isdir_s3(bucket, key: str) -> bool:
        """Returns T/F whether the directory exists."""
        objs = list(bucket.objects.filter(Prefix=key))
        return len(objs) > 1
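
    For reference, a minimal usage sketch of the helpers above; the bucket name and keys are placeholders, and despite its parameter name, get_bucket is passed a plain bucket name:

    # Hypothetical usage sketch (credentials come from the environment).
    s3 = get_resource()
    bucket = get_bucket(s3, "my-bucket")        # despite the parameter name, pass the bucket name
    print(isfile_s3(bucket, "data/file.csv"))   # True only if this exact key exists
    print(isdir_s3(bucket, "data/"))            # True if more than one key shares this prefix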
    

    回答 8

    假设您只想检查密钥是否存在(而不是悄悄地覆盖它),请首先执行以下检查:

    import boto3
    
    def key_exists(mykey, mybucket):
      s3_client = boto3.client('s3')
      response = s3_client.list_objects_v2(Bucket=mybucket, Prefix=mykey)
      if response:
          for obj in response['Contents']:
              if mykey == obj['Key']:
                  return True
      return False
    
    if key_exists('someprefix/myfile-abc123', 'my-bucket-name'):
        print("key exists")
    else:
        print("safe to put new bucket object")
        # try:
        #     resp = s3_client.put_object(Body="Your string or file-like object",
        #                                 Bucket=mybucket,Key=mykey)
        # ...check resp success and ClientError exception for errors...

    Assuming you just want to check if a key exists (instead of quietly over-writing it), do this check first:

    import boto3
    
    def key_exists(mykey, mybucket):
      s3_client = boto3.client('s3')
      response = s3_client.list_objects_v2(Bucket=mybucket, Prefix=mykey)
      if response:
          for obj in response['Contents']:
              if mykey == obj['Key']:
                  return True
      return False
    
    if key_exists('someprefix/myfile-abc123', 'my-bucket-name'):
        print("key exists")
    else:
        print("safe to put new bucket object")
        # try:
        #     resp = s3_client.put_object(Body="Your string or file-like object",
        #                                 Bucket=mybucket,Key=mykey)
        # ...check resp success and ClientError exception for errors...
    

    回答 9

    试试下面这个简单的方法:

    import boto3
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('mybucket_name') # just Bucket name
    file_name = 'A/B/filename.txt'      # full file path
    obj = list(bucket.objects.filter(Prefix=file_name))
    if len(obj) > 0:
        print("Exists")
    else:
        print("Not Exists")

    Try this simple approach:

    import boto3
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('mybucket_name') # just Bucket name
    file_name = 'A/B/filename.txt'      # full file path
    obj = list(bucket.objects.filter(Prefix=file_name))
    if len(obj) > 0:
        print("Exists")
    else:
        print("Not Exists")
    

    回答 10

    这可以同时检查前缀和密钥,并最多获取1个密钥。

    def prefix_exists(bucket, prefix):
        s3_client = boto3.client('s3')
        res = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
        return 'Contents' in res

    This could check both prefix and key, and fetches at most 1 key.

    def prefix_exists(bucket, prefix):
        s3_client = boto3.client('s3')
        res = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
        return 'Contents' in res
    

    回答 11

    如果目录或存储桶中的对象少于1000个,则可以获取它们的键集合,然后检查该集合中是否包含指定的键:

    files_in_dir = {d['Key'].split('/')[-1] for d in s3_client.list_objects_v2(
    Bucket='mybucket',
    Prefix='my/dir').get('Contents') or []}

    即使my/dir不存在,这样的代码也可以工作。

    http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.list_objects_v2

    If you have fewer than 1000 objects in a directory or bucket, you can get a set of their names and then check whether such a key is in this set:

    files_in_dir = {d['Key'].split('/')[-1] for d in s3_client.list_objects_v2(
    Bucket='mybucket',
    Prefix='my/dir').get('Contents') or []}
    

    Such code works even if my/dir does not exist.

    http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.list_objects_v2
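
    As a small follow-up (the file name below is a placeholder): once files_in_dir is built, the existence check is just a set-membership test:

    # files_in_dir holds bare file names (the part after the last '/'), so test membership directly
    if 'myfile.csv' in files_in_dir:
        print('exists')
    else:
        print('does not exist')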


    回答 12

    S3_REGION="eu-central-1"
    bucket="mybucket1"
    name="objectname"
    
    import boto3
    from botocore.client import Config
    client = boto3.client('s3',region_name=S3_REGION,config=Config(signature_version='s3v4'))
    list = client.list_objects_v2(Bucket=bucket,Prefix=name)
    for obj in list.get('Contents', []):
        if obj['Key'] == name: return True
    return False
    S3_REGION="eu-central-1"
    bucket="mybucket1"
    name="objectname"
    
    import boto3
    from botocore.client import Config
    client = boto3.client('s3',region_name=S3_REGION,config=Config(signature_version='s3v4'))
    list = client.list_objects_v2(Bucket=bucket,Prefix=name)
    for obj in list.get('Contents', []):
        if obj['Key'] == name: return True
    return False
    

    回答 13

    对于boto3,可以使用ObjectSummary检查对象是否存在。

    包含存储在Amazon S3存储桶中的对象的摘要。该对象不包含该对象的完整元数据或其任何内容

    import boto3
    from botocore.errorfactory import ClientError
    def path_exists(path, bucket_name):
        """Check to see if an object exists on S3"""
        s3 = boto3.resource('s3')
        try:
            s3.ObjectSummary(bucket_name=bucket_name, key=path).load()
        except ClientError as e:
            if e.response['Error']['Code'] == "404":
                return False
            else:
                raise e
        return True
    
    path_exists('path/to/file.html', 'my-bucket-name')  # bucket name is a placeholder

    ObjectSummary.load中

    调用s3.Client.head_object以更新ObjectSummary资源的属性。

    这表明,如果你不打算使用get(),可以用ObjectSummary代替Object。load()函数不会检索对象本身,只获取其摘要。

    For boto3, ObjectSummary can be used to check if an object exists.

    Contains the summary of an object stored in an Amazon S3 bucket. This object doesn’t contain the object’s full metadata or any of its contents.

    import boto3
    from botocore.errorfactory import ClientError
    def path_exists(path, bucket_name):
        """Check to see if an object exists on S3"""
        s3 = boto3.resource('s3')
        try:
            s3.ObjectSummary(bucket_name=bucket_name, key=path).load()
        except ClientError as e:
            if e.response['Error']['Code'] == "404":
                return False
            else:
                raise e
        return True
    
    path_exists('path/to/file.html', 'my-bucket-name')  # bucket name is a placeholder
    

    In ObjectSummary.load

    Calls s3.Client.head_object to update the attributes of the ObjectSummary resource.

    This shows that you can use ObjectSummary instead of Object if you are planning on not using get(). The load() function does not retrieve the object; it only obtains the summary.


    回答 14

    这是一个对我有用的解决方案。一个警告是我提前知道密钥的确切格式,所以我只列出单个文件

    import boto3
    
    # The s3 base class to interact with S3
    class S3(object):
      def __init__(self):
        self.s3_client = boto3.client('s3')
    
      def check_if_object_exists(self, s3_bucket, s3_key):
        response = self.s3_client.list_objects(
          Bucket = s3_bucket,
          Prefix = s3_key
          )
        if 'ETag' in str(response):
          return True
        else:
          return False
    
    if __name__ == '__main__':
      s3 = S3()
      bucket = 'your-bucket-name'   # placeholder
      key = 'path/to/your/key'      # placeholder
      if s3.check_if_object_exists(bucket, key):
        print("Found S3 object.")
      else:
        print("No object found.")

    Here is a solution that works for me. One caveat is that I know the exact format of the key ahead of time, so I am only listing the single file

    import boto3
    
    # The s3 base class to interact with S3
    class S3(object):
      def __init__(self):
        self.s3_client = boto3.client('s3')
    
      def check_if_object_exists(self, s3_bucket, s3_key):
        response = self.s3_client.list_objects(
          Bucket = s3_bucket,
          Prefix = s3_key
          )
        if 'ETag' in str(response):
          return True
        else:
          return False
    
    if __name__ == '__main__':
      s3 = S3()
      bucket = 'your-bucket-name'   # placeholder
      key = 'path/to/your/key'      # placeholder
      if s3.check_if_object_exists(bucket, key):
        print("Found S3 object.")
      else:
        print("No object found.")
    

    回答 15

    您可以为此使用Boto3。

    import boto3
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('my-bucket')
    objs = list(bucket.objects.filter(Prefix=key))
    if(len(objs)>0):
        print("key exists!!")
    else:
        print("key doesn't exist!")

    这里的key就是你要检查是否存在的那个路径。

    you can use Boto3 for this.

    import boto3
    s3 = boto3.resource('s3')
    bucket = s3.Bucket('my-bucket')
    objs = list(bucket.objects.filter(Prefix=key))
    if(len(objs)>0):
        print("key exists!!")
    else:
        print("key doesn't exist!")
    

    Here key is the path you want to check exists or not


    回答 16

    get()方法 真的很简单

    import botocore
    from boto3.session import Session
    session = Session(aws_access_key_id='AWS_ACCESS_KEY',
                    aws_secret_access_key='AWS_SECRET_ACCESS_KEY')
    s3 = session.resource('s3')
    bucket_s3 = s3.Bucket('bucket_name')
    
    def not_exist(file_key):
        try:
            file_details = bucket_s3.Object(file_key).get()
            # print(file_details) # This line prints the file details
            return False
        except botocore.exceptions.ClientError as e:
            if e.response['Error']['Code'] == "NoSuchKey": # or you can check with e.reponse['HTTPStatusCode'] == '404'
                return True
            return False # For any other error it's hard to determine whether it exists or not. so based on the requirement feel free to change it to True/ False / raise Exception
    
    print(not_exist('hello_world.txt')) 

    It’s really simple with get() method

    import botocore
    from boto3.session import Session
    session = Session(aws_access_key_id='AWS_ACCESS_KEY',
                    aws_secret_access_key='AWS_SECRET_ACCESS_KEY')
    s3 = session.resource('s3')
    bucket_s3 = s3.Bucket('bucket_name')
    
    def not_exist(file_key):
        try:
            file_details = bucket_s3.Object(file_key).get()
            # print(file_details) # This line prints the file details
            return False
        except botocore.exceptions.ClientError as e:
            if e.response['Error']['Code'] == "NoSuchKey": # or you can check with e.reponse['HTTPStatusCode'] == '404'
                return True
            return False # For any other error it's hard to determine whether it exists or not. so based on the requirement feel free to change it to True/ False / raise Exception
    
    print(not_exist('hello_world.txt')) 
    

    回答 17

    有一种简单的方法可以检查S3存储桶中的文件是否存在。我们不需要为此使用异常

    import boto3
    
    session = boto3.Session(aws_access_key_id, aws_secret_access_key)
    s3 = session.client('s3')
    
    object_name = 'filename'
    bucket = 'bucketname'
    obj_status = s3.list_objects(Bucket=bucket, Prefix=object_name)
    if obj_status.get('Contents'):
        print("File exists")
    else:
        print("File does not exist")

    There is one simple way by which we can check if a file exists or not in an S3 bucket. We do not need to use an exception for this:

    import boto3
    
    session = boto3.Session(aws_access_key_id, aws_secret_access_key)
    s3 = session.client('s3')
    
    object_name = 'filename'
    bucket = 'bucketname'
    obj_status = s3.list_objects(Bucket=bucket, Prefix=object_name)
    if obj_status.get('Contents'):
        print("File exists")
    else:
        print("File does not exist")
    

    回答 18

    如果您寻找与目录等效的键,则可能需要这种方法

    session = boto3.session.Session()
    resource = session.resource("s3")
    bucket = resource.Bucket('mybucket')
    
    key = 'dir-like-or-file-like-key'
    objects = [o for o in bucket.objects.filter(Prefix=key).limit(1)]    
    has_key = len(objects) > 0

    这适用于父级键、等同于文件的键,以及不存在的键。我试过上面获得较多赞同的方法,但它在父级键上会失败。

    If you seek a key that is equivalent to a directory then you might want this approach

    session = boto3.session.Session()
    resource = session.resource("s3")
    bucket = resource.Bucket('mybucket')
    
    key = 'dir-like-or-file-like-key'
    objects = [o for o in bucket.objects.filter(Prefix=key).limit(1)]    
    has_key = len(objects) > 0
    

    This works for a parent key or a key that equates to file or a key that does not exist. I tried the favored approach above and failed on parent keys.


    回答 19

    我注意到,仅仅为了捕获异常,botocore.exceptions.ClientError我们需要安装botocore。botocore占用36M的磁盘空间。如果我们使用aws lambda函数,这尤其会产生影响。代替的是,如果我们只使用异常,那么我们可以跳过使用额外的库!

    • 我正在验证文件扩展名为“ .csv”
    • 如果存储桶不存在,则不会抛出异常!
    • 如果存储桶存在但对象不存在,则不会引发异常!
    • 如果存储桶为空,则会抛出异常!
    • 如果存储桶没有权限,则会抛出异常!

    代码看起来像这样。请分享您的想法:

    import boto3
    import traceback
    
    def download4mS3(s3bucket, s3Path, localPath):
        s3 = boto3.resource('s3')
    
        print('Looking for the csv data file ending with .csv in bucket: ' + s3bucket + ' path: ' + s3Path)
        if s3Path.endswith('.csv') and s3Path != '':
            try:
                s3.Bucket(s3bucket).download_file(s3Path, localPath)
            except Exception as e:
                print(e)
                print(traceback.format_exc())
                if e.response['Error']['Code'] == "404":
                    print("Downloading the file from: [", s3Path, "] failed")
                    exit(12)
                else:
                    raise
            print("Downloading the file from: [", s3Path, "] succeeded")
        else:
            print("csv file not found in in : [", s3Path, "]")
            exit(12)

    I noticed that just for catching the exception using botocore.exceptions.ClientError we need to install botocore. botocore takes up 36M of disk space. This is particularly impacting if we use aws lambda functions. In place of that if we just use exception then we can skip using the extra library!

    • I am validating for the file extension to be ‘.csv’
    • This will not throw an exception if the bucket does not exist!
    • This will not throw an exception if the bucket exists but object does not exist!
    • This throws out an exception if the bucket is empty!
    • This throws out an exception if the bucket has no permissions!

    The code looks like this. Please share your thoughts:

    import boto3
    import traceback
    
    def download4mS3(s3bucket, s3Path, localPath):
        s3 = boto3.resource('s3')
    
        print('Looking for the csv data file ending with .csv in bucket: ' + s3bucket + ' path: ' + s3Path)
        if s3Path.endswith('.csv') and s3Path != '':
            try:
                s3.Bucket(s3bucket).download_file(s3Path, localPath)
            except Exception as e:
                print(e)
                print(traceback.format_exc())
                if e.response['Error']['Code'] == "404":
                    print("Downloading the file from: [", s3Path, "] failed")
                    exit(12)
                else:
                    raise
            print("Downloading the file from: [", s3Path, "] succeeded")
        else:
            print("csv file not found in in : [", s3Path, "]")
            exit(12)
    

    回答 20

    接着这个讨论串,有人能总结一下哪种方式是检查S3中对象是否存在的最有效方法吗?

    我认为head_object可能会胜出,因为它只检查元数据,比获取实际对象本身要轻量得多。

    Just following the thread, can someone conclude which one is the most efficient way to check if an object exists in S3?

    I think head_object might win as it just checks the metadata which is lighter than the actual object itself
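
    For reference, a minimal head_object sketch along those lines (bucket and key are placeholders); a missing key surfaces as a ClientError with a "404" code, since HEAD responses carry no error body:

    import boto3
    from botocore.exceptions import ClientError
    
    s3_client = boto3.client('s3')
    
    def object_exists(bucket, key):
        """Return True if the key exists, using only a HEAD request (metadata, no body)."""
        try:
            s3_client.head_object(Bucket=bucket, Key=key)
            return True
        except ClientError as e:
            if e.response['Error']['Code'] == '404':
                return False
            raise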


    回答 21

    https://www.peterbe.com/plog/fastest-way-to-find-out-if-a-file-exists-in-s3中指出,这是最快的方法:

    import boto3
    
    boto3_session = boto3.session.Session()
    s3_session_client = boto3_session.client("s3")
    
    def key_exists(bucket, s3_key):
        # wrapped in a function (illustrative name) so the returns are valid; bucket and s3_key are placeholders
        response = s3_session_client.list_objects_v2(
            Bucket=bucket, Prefix=s3_key
        )
        for obj in response.get("Contents", []):
            if obj["Key"] == s3_key:
                return True
        return False

    From https://www.peterbe.com/plog/fastest-way-to-find-out-if-a-file-exists-in-s3 this is pointed out to be the fastest method:

    import boto3
    
    boto3_session = boto3.session.Session()
    s3_session_client = boto3_session.client("s3")
    
    def key_exists(bucket, s3_key):
        # wrapped in a function (illustrative name) so the returns are valid; bucket and s3_key are placeholders
        response = s3_session_client.list_objects_v2(
            Bucket=bucket, Prefix=s3_key
        )
        for obj in response.get("Contents", []):
            if obj["Key"] == s3_key:
                return True
        return False
    

    回答 22

    请查看:

    bucket.get_key(
        key_name, 
        headers=None, 
        version_id=None, 
        response_headers=None, 
        validate=True
    )

    检查存储桶中是否存在特定密钥。此方法使用HEAD请求检查密钥是否存在。返回:Key对象的实例或None

    来自Boto S3 Docs

    您可以只调用bucket.get_key(keyname)并检查返回的对象是否为None。

    Check out

    bucket.get_key(
        key_name, 
        headers=None, 
        version_id=None, 
        response_headers=None, 
        validate=True
    )
    

    Check to see if a particular key exists within the bucket. This method uses a HEAD request to check for the existence of the key. Returns: An instance of a Key object or None

    from Boto S3 Docs

    You can just call bucket.get_key(keyname) and check if the returned object is None.
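
    A minimal usage sketch of that boto 2 pattern (this is the legacy boto library, not boto3; bucket name and key are placeholders):

    import boto
    
    conn = boto.connect_s3()                      # credentials from environment / boto config
    bucket = conn.get_bucket('my-bucket')
    key = bucket.get_key('path/to/file.txt')      # issues a HEAD request
    print('exists' if key is not None else 'does not exist')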


    使用Boto3将S3对象作为字符串打开

    问题:使用Boto3将S3对象作为字符串打开

    我知道,使用Boto 2,可以使用以下命令将S3对象作为字符串打开: get_contents_as_string()

    boto3中有等效功能吗?

    I’m aware that with Boto 2 it’s possible to open an S3 object as a string with: get_contents_as_string()

    Is there an equivalent function in boto3 ?


    回答 0

    read将返回字节。至少对于Python 3,如果要返回字符串,则必须使用正确的编码进行解码:

    import boto3
    
    s3 = boto3.resource('s3')
    
    obj = s3.Object(bucket, key)
    obj.get()['Body'].read().decode('utf-8') 

    read will return bytes. At least for Python 3, if you want to return a string, you have to decode using the right encoding:

    import boto3
    
    s3 = boto3.resource('s3')
    
    obj = s3.Object(bucket, key)
    obj.get()['Body'].read().decode('utf-8') 
    

    回答 1

    在AWS Lambda中使用Python 2.7时,我在通过.get()从S3读取/解析对象时遇到了问题。

    我在示例中添加了json以表明它可解析:)

    import boto3
    import json
    
    s3 = boto3.client('s3')
    
    obj = s3.get_object(Bucket=bucket, Key=key)
    j = json.loads(obj['Body'].read())

    注意(对于python 2.7):我的对象都是ascii,所以我不需要 .decode('utf-8')

    注意(对于python 3.6及更高版本):我们移至python 3.6并发现read()现在返回了,bytes因此,如果要从中获取字符串,则必须使用:

    j = json.loads(obj['Body'].read().decode('utf-8'))

    I had a problem to read/parse the object from S3 because of .get() using Python 2.7 inside an AWS Lambda.

    I added json to the example to show it became parsable :)

    import boto3
    import json
    
    s3 = boto3.client('s3')
    
    obj = s3.get_object(Bucket=bucket, Key=key)
    j = json.loads(obj['Body'].read())
    

    NOTE (for python 2.7): My object is all ascii, so I don’t need .decode('utf-8')

    NOTE (for python 3.6+): We moved to python 3.6 and discovered that read() now returns bytes so if you want to get a string out of it, you must use:

    j = json.loads(obj['Body'].read().decode('utf-8'))


    回答 2

    boto3文档中没有此内容。这对我有效:

    object.get()["Body"].read()

    object是一个s3对象:http://boto3.readthedocs.org/en/latest/reference/services/s3.html#object

    This isn’t in the boto3 documentation. This worked for me:

    object.get()["Body"].read()
    

    object being an s3 object: http://boto3.readthedocs.org/en/latest/reference/services/s3.html#object


    回答 3

    Python3 +使用boto3 API方法。

    通过使用S3.Client.download_fileobj API和Python文件类对象,可以将S3对象的内容检索到内存中。

    由于检索到的内容是字节,因此为了转换为str,需要对其进行解码。

    import io
    import boto3
    
    client = boto3.client('s3')
    bytes_buffer = io.BytesIO()
    client.download_fileobj(Bucket=bucket_name, Key=object_key, Fileobj=bytes_buffer)
    byte_value = bytes_buffer.getvalue()
    str_value = byte_value.decode() #python3, default decoding is utf-8

    Python3 + Using boto3 API approach.

    By using S3.Client.download_fileobj API and Python file-like object, S3 Object content can be retrieved to memory.

    Since the retrieved content is bytes, in order to convert to str, it needs to be decoded.

    import io
    import boto3
    
    client = boto3.client('s3')
    bytes_buffer = io.BytesIO()
    client.download_fileobj(Bucket=bucket_name, Key=object_key, Fileobj=bytes_buffer)
    byte_value = bytes_buffer.getvalue()
    str_value = byte_value.decode() #python3, default decoding is utf-8
    

    回答 4

    如果body包含io.StringIO,则必须执行以下操作:

    object.get()['Body'].getvalue()

    If body contains a io.StringIO, you have to do like below:

    object.get()['Body'].getvalue()
    

    列出带有boto3的存储桶的内容

    问题:列出带有boto3的存储桶的内容

    如何用boto3查看S3存储桶里面有什么内容?(即执行"ls")?

    执行以下操作:

    import boto3
    s3 = boto3.resource('s3')
    my_bucket = s3.Bucket('some/path/')

    返回:

    s3.Bucket(name='some/path/')

    我如何看其内容?

    How can I see what’s inside a bucket in S3 with boto3? (i.e. do an "ls")?

    Doing the following:

    import boto3
    s3 = boto3.resource('s3')
    my_bucket = s3.Bucket('some/path/')
    

    returns:

    s3.Bucket(name='some/path/')
    

    How do I see its contents?


    回答 0

    查看内容的一种方法是:

    for my_bucket_object in my_bucket.objects.all():
        print(my_bucket_object)

    One way to see the contents would be:

    for my_bucket_object in my_bucket.objects.all():
        print(my_bucket_object)
    

    回答 1

    这类似于“ ls”,但是它没有考虑前缀文件夹约定,并且会列出存储桶中的对象。它留给阅读器以过滤掉作为键名称一部分的前缀。

    在Python 2中:

    from boto.s3.connection import S3Connection
    
    conn = S3Connection() # assumes boto.cfg setup
    bucket = conn.get_bucket('bucket_name')
    for obj in bucket.get_all_keys():
        print(obj.key)

    在Python 3中:

    from boto3 import client
    
    conn = client('s3')  # again assumes boto.cfg setup, assume AWS S3
    for key in conn.list_objects(Bucket='bucket_name')['Contents']:
        print(key['Key'])

    This is similar to an ‘ls’ but it does not take into account the prefix folder convention and will list the objects in the bucket. It’s left up to the reader to filter out prefixes which are part of the Key name.

    In Python 2:

    from boto.s3.connection import S3Connection
    
    conn = S3Connection() # assumes boto.cfg setup
    bucket = conn.get_bucket('bucket_name')
    for obj in bucket.get_all_keys():
        print(obj.key)
    

    In Python 3:

    from boto3 import client
    
    conn = client('s3')  # again assumes boto.cfg setup, assume AWS S3
    for key in conn.list_objects(Bucket='bucket_name')['Contents']:
        print(key['Key'])
    

    回答 2

    我假设您已经分别配置了身份验证。

    import boto3
    s3 = boto3.resource('s3')
    
    my_bucket = s3.Bucket('bucket_name')
    
    for file in my_bucket.objects.all():
        print(file.key)

    I’m assuming you have configured authentication separately.

    import boto3
    s3 = boto3.resource('s3')
    
    my_bucket = s3.Bucket('bucket_name')
    
    for file in my_bucket.objects.all():
        print(file.key)
    

    回答 3

    如果要传递ACCESS和SECRET密钥(由于不安全,则不应该这样做):

    from boto3.session import Session
    
    ACCESS_KEY='your_access_key'
    SECRET_KEY='your_secret_key'
    
    session = Session(aws_access_key_id=ACCESS_KEY,
                      aws_secret_access_key=SECRET_KEY)
    s3 = session.resource('s3')
    your_bucket = s3.Bucket('your_bucket')
    
    for s3_file in your_bucket.objects.all():
        print(s3_file.key)

    If you want to pass the ACCESS and SECRET keys (which you should not do, because it is not secure):

    from boto3.session import Session
    
    ACCESS_KEY='your_access_key'
    SECRET_KEY='your_secret_key'
    
    session = Session(aws_access_key_id=ACCESS_KEY,
                      aws_secret_access_key=SECRET_KEY)
    s3 = session.resource('s3')
    your_bucket = s3.Bucket('your_bucket')
    
    for s3_file in your_bucket.objects.all():
        print(s3_file.key)
    

    回答 4

    为了处理大型键列表(即,当目录列表大于1000个项目时),我使用以下代码来累加具有多个列表的键值(即文件名)(这要归功于上述第一行的Amelio)。代码适用于python3:

    from boto3 import client
    
    def list_keys_under_prefix(bucket_name="my_bucket", prefix="my_key/sub_key/lots_o_files"):
        # wrapped in a function (illustrative name) so the early returns below are valid Python
        s3_conn = client('s3')  # type: BaseClient  ## again assumes boto.cfg setup, assume AWS S3
        s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
    
        if 'Contents' not in s3_result:
            # print(s3_result)
            return []
    
        file_list = []
        for key in s3_result['Contents']:
            file_list.append(key['Key'])
        print(f"List count = {len(file_list)}")
    
        while s3_result['IsTruncated']:
            continuation_key = s3_result['NextContinuationToken']
            s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter="/", ContinuationToken=continuation_key)
            for key in s3_result['Contents']:
                file_list.append(key['Key'])
            print(f"List count = {len(file_list)}")
        return file_list

    In order to handle large key listings (i.e. when the directory list is greater than 1000 items), I used the following code to accumulate key values (i.e. filenames) with multiple listings (thanks to Amelio above for the first lines). Code is for python3:

    from boto3 import client
    
    def list_keys_under_prefix(bucket_name="my_bucket", prefix="my_key/sub_key/lots_o_files"):
        # wrapped in a function (illustrative name) so the early returns below are valid Python
        s3_conn = client('s3')  # type: BaseClient  ## again assumes boto.cfg setup, assume AWS S3
        s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
    
        if 'Contents' not in s3_result:
            # print(s3_result)
            return []
    
        file_list = []
        for key in s3_result['Contents']:
            file_list.append(key['Key'])
        print(f"List count = {len(file_list)}")
    
        while s3_result['IsTruncated']:
            continuation_key = s3_result['NextContinuationToken']
            s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter="/", ContinuationToken=continuation_key)
            for key in s3_result['Contents']:
                file_list.append(key['Key'])
            print(f"List count = {len(file_list)}")
        return file_list
    

    回答 5

    我的s3 keys实用程序函数本质上是@Hephaestus答案的优化版本:

    import boto3
    
    
    s3_paginator = boto3.client('s3').get_paginator('list_objects_v2')
    
    
    def keys(bucket_name, prefix='/', delimiter='/', start_after=''):
        prefix = prefix[1:] if prefix.startswith(delimiter) else prefix
        start_after = (start_after or prefix) if prefix.endswith(delimiter) else start_after
        for page in s3_paginator.paginate(Bucket=bucket_name, Prefix=prefix, StartAfter=start_after):
            for content in page.get('Contents', ()):
                yield content['Key']

    在我的测试(boto3 1.9.84)中,它比等效(但更简单)的代码快得多:

    import boto3
    
    
    def keys(bucket_name, prefix='/', delimiter='/'):
        prefix = prefix[1:] if prefix.startswith(delimiter) else prefix
        bucket = boto3.resource('s3').Bucket(bucket_name)
        return (_.key for _ in bucket.objects.filter(Prefix=prefix))

    由于S3保证返回结果按UTF-8二进制顺序排序,因此在第一个函数中加入了start_after优化。

    My s3 keys utility function is essentially an optimized version of @Hephaestus’s answer:

    import boto3
    
    
    s3_paginator = boto3.client('s3').get_paginator('list_objects_v2')
    
    
    def keys(bucket_name, prefix='/', delimiter='/', start_after=''):
        prefix = prefix[1:] if prefix.startswith(delimiter) else prefix
        start_after = (start_after or prefix) if prefix.endswith(delimiter) else start_after
        for page in s3_paginator.paginate(Bucket=bucket_name, Prefix=prefix, StartAfter=start_after):
            for content in page.get('Contents', ()):
                yield content['Key']
    

    In my tests (boto3 1.9.84), it’s significantly faster than the equivalent (but simpler) code:

    import boto3
    
    
    def keys(bucket_name, prefix='/', delimiter='/'):
        prefix = prefix[1:] if prefix.startswith(delimiter) else prefix
        bucket = boto3.resource('s3').Bucket(bucket_name)
        return (_.key for _ in bucket.objects.filter(Prefix=prefix))
    

    As S3 guarantees UTF-8 binary sorted results, a start_after optimization has been added to the first function.
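
    A short usage sketch of the keys generator above (bucket name and prefix are placeholders):

    # Stream keys lazily; nothing is listed until iteration starts.
    for key in keys('my-bucket', prefix='logs/2020/'):
        print(key)
    
    # Or materialize the result when the listing is known to be small.
    all_keys = list(keys('my-bucket', prefix='logs/2020/'))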


    回答 6

    一种更简化的方法,而不是通过for循环进行遍历,您还可以仅打印包含S3存储桶中所有文件的原始对象:

    session = Session(aws_access_key_id=aws_access_key_id,aws_secret_access_key=aws_secret_access_key)
    s3 = session.resource('s3')
    bucket = s3.Bucket('bucket_name')
    
    files_in_s3 = bucket.objects.all() 
    #you can print this iterable with print(list(files_in_s3))

    A more parsimonious way, rather than iterating through via a for loop you could also just print the original object containing all files inside your S3 bucket:

    session = Session(aws_access_key_id=aws_access_key_id,aws_secret_access_key=aws_secret_access_key)
    s3 = session.resource('s3')
    bucket = s3.Bucket('bucket_name')
    
    files_in_s3 = bucket.objects.all() 
    #you can print this iterable with print(list(files_in_s3))
    

    回答 7

    对象摘要:

    ObjectSummary附带有两个标识符:

    • bucket_name
    • key

    boto3 S3:ObjectSummary

    AWS S3文档中有关对象密钥的更多信息:

    对象键:

    创建对象时,请指定键名,该键名唯一标识存储桶中的对象。例如,在Amazon S3控制台(请参阅AWS管理控制台)中,突出显示存储桶时,将显示存储桶中的对象列表。这些名称是对象键。密钥的名称是一系列Unicode字符,其UTF-8编码最长为1024个字节。

    Amazon S3数据模型是一个平面结构:创建一个存储桶,该存储桶存储对象。没有子桶或子文件夹的层次结构;但是,您可以像Amazon S3控制台一样使用键名前缀和定界符来推断逻辑层次结构。Amazon S3控制台支持文件夹的概念。假设您的存储桶(由管理员创建)具有四个带有以下对象键的对象:

    开发/项目1.xls

    财务/声明1.pdf

    私人/taxdocument.pdf

    s3-dg.pdf

    参考:

    AWS S3:对象密钥

    这是一些示例代码,演示了如何获取存储桶名称和对象密钥。

    例:

    import boto3
    from pprint import pprint
    
    def main():
    
        def enumerate_s3():
            s3 = boto3.resource('s3')
            for bucket in s3.buckets.all():
                 print("Name: {}".format(bucket.name))
                 print("Creation Date: {}".format(bucket.creation_date))
                 for object in bucket.objects.all():
                     print("Object: {}".format(object))
                     print("Object bucket_name: {}".format(object.bucket_name))
                     print("Object key: {}".format(object.key))
    
        enumerate_s3()
    
    
    if __name__ == '__main__':
        main()

    ObjectSummary:

    There are two identifiers that are attached to the ObjectSummary:

    • bucket_name
    • key

    boto3 S3: ObjectSummary

    More on Object Keys from AWS S3 Documentation:

    Object Keys:

    When you create an object, you specify the key name, which uniquely identifies the object in the bucket. For example, in the Amazon S3 console (see AWS Management Console), when you highlight a bucket, a list of objects in your bucket appears. These names are the object keys. The name for a key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long.

    The Amazon S3 data model is a flat structure: you create a bucket, and the bucket stores objects. There is no hierarchy of subbuckets or subfolders; however, you can infer logical hierarchy using key name prefixes and delimiters as the Amazon S3 console does. The Amazon S3 console supports a concept of folders. Suppose that your bucket (admin-created) has four objects with the following object keys:

    Development/Projects1.xls

    Finance/statement1.pdf

    Private/taxdocument.pdf

    s3-dg.pdf

    Reference:

    AWS S3: Object Keys

    Here is some example code that demonstrates how to get the bucket name and the object key.

    Example:

    import boto3
    from pprint import pprint
    
    def main():
    
        def enumerate_s3():
            s3 = boto3.resource('s3')
            for bucket in s3.buckets.all():
                 print("Name: {}".format(bucket.name))
                 print("Creation Date: {}".format(bucket.creation_date))
                 for object in bucket.objects.all():
                     print("Object: {}".format(object))
                     print("Object bucket_name: {}".format(object.bucket_name))
                     print("Object key: {}".format(object.key))
    
        enumerate_s3()
    
    
    if __name__ == '__main__':
        main()
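
    To make the folder analogy above concrete, here is a hedged sketch (bucket name is a placeholder) that passes Delimiter so list_objects_v2 groups keys into CommonPrefixes, which is what the console presents as folders:

    import boto3
    
    s3_client = boto3.client('s3')
    resp = s3_client.list_objects_v2(Bucket='my-bucket', Delimiter='/')
    
    # Top-level "folders" inferred from key prefixes, e.g. Development/, Finance/, Private/
    for cp in resp.get('CommonPrefixes', []):
        print(cp['Prefix'])
    
    # Keys that sit at the root of the bucket, e.g. s3-dg.pdf
    for obj in resp.get('Contents', []):
        print(obj['Key'])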
    

    回答 8

    我只是这样做,包括身份验证方法:

    import boto3
    
    s3_client = boto3.client(
        's3',
        aws_access_key_id='access_key',
        aws_secret_access_key='access_key_secret',
        config=boto3.session.Config(signature_version='s3v4'),
        region_name='region'
    )
    
    def key_exists(key):
        # wrapped in a function (illustrative name) so the returns are valid; 'bucket_name' and key are placeholders
        response = s3_client.list_objects(Bucket='bucket_name', Prefix=key)
        if 'Contents' in response:
            # Object / key exists!
            return True
        else:
            # Object / key DOES NOT exist!
            return False

    I just did it like this, including the authentication method:

    import boto3
    
    s3_client = boto3.client(
        's3',
        aws_access_key_id='access_key',
        aws_secret_access_key='access_key_secret',
        config=boto3.session.Config(signature_version='s3v4'),
        region_name='region'
    )
    
    def key_exists(key):
        # wrapped in a function (illustrative name) so the returns are valid; 'bucket_name' and key are placeholders
        response = s3_client.list_objects(Bucket='bucket_name', Prefix=key)
        if 'Contents' in response:
            # Object / key exists!
            return True
        else:
            # Object / key DOES NOT exist!
            return False
    

    回答 9

    #To print all filenames in a bucket
    import boto3
    
    s3 = boto3.client('s3')
    
    def get_s3_keys(bucket):
        """Get a list of keys in an S3 bucket."""
        keys = []
        resp = s3.list_objects_v2(Bucket=bucket)
        for obj in resp['Contents']:
            keys.append(obj['Key'])
        return keys
    
    filenames = get_s3_keys('your_bucket_name')
    print(filenames)
    
    #To print all filenames in a certain directory in a bucket
    import boto3
    
    s3 = boto3.client('s3')
    
    def get_s3_keys(bucket, prefix):
        """Get a list of keys in an S3 bucket under a prefix."""
        keys = []
        resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
        for obj in resp['Contents']:
            keys.append(obj['Key'])
            print(obj['Key'])
        return keys
    
    filenames = get_s3_keys('your_bucket_name', 'folder_name/sub_folder_name/')
    print(filenames)
    #To print all filenames in a bucket
    import boto3
    
    s3 = boto3.client('s3')
    
    def get_s3_keys(bucket):
        """Get a list of keys in an S3 bucket."""
        keys = []
        resp = s3.list_objects_v2(Bucket=bucket)
        for obj in resp['Contents']:
            keys.append(obj['Key'])
        return keys
    
    filenames = get_s3_keys('your_bucket_name')
    print(filenames)
    
    #To print all filenames in a certain directory in a bucket
    import boto3
    
    s3 = boto3.client('s3')
    
    def get_s3_keys(bucket, prefix):
        """Get a list of keys in an S3 bucket under a prefix."""
        keys = []
        resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
        for obj in resp['Contents']:
            keys.append(obj['Key'])
            print(obj['Key'])
        return keys
    
    filenames = get_s3_keys('your_bucket_name', 'folder_name/sub_folder_name/')
    print(filenames)
    

    回答 10

    对上面某条回答中@Hephaestus的代码稍作修改,编写了以下方法,用于列出给定路径下的文件夹和对象(文件),类似于s3 ls命令。

    import boto3
    
    def s3_ls(profile=None, bucket_name=None, folder_path=None):
        folders=[]
        files=[]
        result=dict()
        bucket_name = bucket_name
        prefix= folder_path
        session = boto3.Session(profile_name=profile)
        s3_conn   = session.client('s3')
        s3_result =  s3_conn.list_objects_v2(Bucket=bucket_name, Delimiter = "/", Prefix=prefix)
        if 'Contents' not in s3_result and 'CommonPrefixes' not in s3_result:
            return []
    
        if s3_result.get('CommonPrefixes'):
            for folder in s3_result['CommonPrefixes']:
                folders.append(folder.get('Prefix'))
    
        if s3_result.get('Contents'):
            for key in s3_result['Contents']:
                files.append(key['Key'])
    
        while s3_result['IsTruncated']:
            continuation_key = s3_result['NextContinuationToken']
            s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Delimiter="/", ContinuationToken=continuation_key, Prefix=prefix)
            if s3_result.get('CommonPrefixes'):
                for folder in s3_result['CommonPrefixes']:
                    folders.append(folder.get('Prefix'))
            if s3_result.get('Contents'):
                for key in s3_result['Contents']:
                    files.append(key['Key'])
    
        if folders:
            result['folders']=sorted(folders)
        if files:
            result['files']=sorted(files)
        return result

    这将列出给定路径下的所有对象/文件夹。Folder_path默认可以保留为None,此时该方法会列出存储桶根目录下的直接内容。

    With little modification to @Hephaestus's code in one of the above comments, wrote the below method to list down folders and objects (files) in a given path. Works similar to s3 ls command.

    import boto3
    
    def s3_ls(profile=None, bucket_name=None, folder_path=None):
        folders=[]
        files=[]
        result=dict()
        bucket_name = bucket_name
        prefix= folder_path
        session = boto3.Session(profile_name=profile)
        s3_conn   = session.client('s3')
        s3_result =  s3_conn.list_objects_v2(Bucket=bucket_name, Delimiter = "/", Prefix=prefix)
        if 'Contents' not in s3_result and 'CommonPrefixes' not in s3_result:
            return []
    
        if s3_result.get('CommonPrefixes'):
            for folder in s3_result['CommonPrefixes']:
                folders.append(folder.get('Prefix'))
    
        if s3_result.get('Contents'):
            for key in s3_result['Contents']:
                files.append(key['Key'])
    
        while s3_result['IsTruncated']:
            continuation_key = s3_result['NextContinuationToken']
            s3_result = s3_conn.list_objects_v2(Bucket=bucket_name, Delimiter="/", ContinuationToken=continuation_key, Prefix=prefix)
            if s3_result.get('CommonPrefixes'):
                for folder in s3_result['CommonPrefixes']:
                    folders.append(folder.get('Prefix'))
            if s3_result.get('Contents'):
                for key in s3_result['Contents']:
                    files.append(key['Key'])
    
        if folders:
            result['folders']=sorted(folders)
        if files:
            result['files']=sorted(files)
        return result
    

    This lists down all objects / folders in a given path. Folder_path can be left as None by default and method will list the immediate contents of the root of the bucket.
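
    A hedged usage sketch for s3_ls (profile, bucket, and folder path are placeholders):

    # Returns a dict with sorted 'folders' and 'files' lists for the immediate level under the prefix,
    # or an empty list when nothing matches.
    listing = s3_ls(profile='default', bucket_name='my-bucket', folder_path='data/2020/')
    if listing:
        print(listing.get('folders', []))
        print(listing.get('files', []))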


    回答 11

    这是解决方案

    import boto3
    
    s3 = boto3.resource('s3')
    BUCKET_NAME = 'deletemetesting11'   # 您的S3存储桶名称
    allFiles = s3.Bucket(BUCKET_NAME).objects.all()
    for file in allFiles:
        print(file.key)

    Here is the solution

    import boto3
    
    s3=boto3.resource('s3')
    BUCKET_NAME = 'Your S3 Bucket Name'
    allFiles = s3.Bucket(BUCKET_NAME).objects.all()
    for file in allFiles:
        print(file.key)