
Does Python SciPy need BLAS?

Question: Does Python SciPy need BLAS?

numpy.distutils.system_info.BlasNotFoundError: 
    Blas (http://www.netlib.org/blas/) libraries not found.
    Directories to search for the libraries can be specified in the
    numpy/distutils/site.cfg file (section [blas]) or by setting
    the BLAS environment variable.

Which tarball do I need to download from this site?

I’ve tried the Fortran ones, but I keep getting this error (after setting the environment variable, obviously).


Answer 0

The SciPy webpage used to provide build and installation instructions, but the instructions there now rely on OS binary distributions. To build SciPy (and NumPy) on operating systems without precompiled packages of the required libraries, you must build and then statically link to the Fortran libraries BLAS and LAPACK:

mkdir -p ~/src/
cd ~/src/
wget http://www.netlib.org/blas/blas.tgz
tar xzf blas.tgz
cd BLAS-*

## NOTE: The selected Fortran compiler must be consistent for BLAS, LAPACK, NumPy, and SciPy.
## For GNU compiler on 32-bit systems:
#g77 -O2 -fno-second-underscore -c *.f                     # with g77
#gfortran -O2 -std=legacy -fno-second-underscore -c *.f    # with gfortran
## OR for GNU compiler on 64-bit systems:
#g77 -O3 -m64 -fno-second-underscore -fPIC -c *.f                     # with g77
gfortran -O3 -std=legacy -m64 -fno-second-underscore -fPIC -c *.f    # with gfortran
## OR for Intel compiler:
#ifort -FI -w90 -w95 -cm -O3 -unroll -c *.f

# Continue below irrespective of compiler:
ar r libfblas.a *.o
ranlib libfblas.a
rm -rf *.o
export BLAS=~/src/BLAS-*/libfblas.a

Execute only one of the five g77/gfortran/ifort commands. I have commented out all but the gfortran one, which I use. The subsequent LAPACK installation requires a Fortran 90 compiler, and since both installs should use the same Fortran compiler, g77 should not be used for BLAS.

Next, you’ll need to install the LAPACK stuff. The SciPy webpage’s instructions helped me here as well, but I had to modify them to suit my environment:

mkdir -p ~/src
cd ~/src/
wget http://www.netlib.org/lapack/lapack.tgz
tar xzf lapack.tgz
cd lapack-*/
cp INSTALL/make.inc.gfortran make.inc          # On Linux with lapack-3.2.1 or newer
make lapacklib
make clean
export LAPACK=~/src/lapack-*/liblapack.a

Update on 3-Sep-2015: Verified some comments today (thanks to all): Before running make lapacklib, edit the make.inc file and add the -fPIC option to the OPTS and NOOPT settings. If you are on a 64-bit architecture or want to compile for one, also add -m64. It is important that BLAS and LAPACK are compiled with these options set to the same values. If you forget -fPIC, SciPy will actually give you an error about missing symbols and will recommend this switch. The specific section of make.inc looks like this in my setup:

FORTRAN  = gfortran 
OPTS     = -O2 -frecursive -fPIC -m64
DRVOPTS  = $(OPTS)
NOOPT    = -O0 -frecursive -fPIC -m64
LOADER   = gfortran

On old machines (e.g. RedHat 5), the installed gfortran may be an older version (e.g. 4.1.2) that does not understand the -frecursive option. Simply remove it from the make.inc file in such cases.

The lapack test target of the Makefile fails in my setup because it cannot find the blas libraries. If you are thorough you can temporarily move the blas library to the specified location to test the lapack. I’m a lazy person, so I trust the devs to have it working and verify only in SciPy.
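
Before starting the NumPy/SciPy build, it can help to confirm that the BLAS and LAPACK environment variables really point at the archives built above. The following is only a minimal sketch and not part of the original instructions; the helper name resolve_static_lib is made up for illustration. It expands a leading ~ and any * still present in the value, since the shell may store the pattern unexpanded.

```python
import glob
import os


def resolve_static_lib(env_var):
    """Return the existing files that the path stored in env_var refers to.

    Hypothetical helper: expands a leading ~ and shell-style wildcards,
    because the exported value may still contain an unexpanded pattern.
    """
    value = os.environ.get(env_var, "")
    if not value:
        return []
    pattern = os.path.expanduser(value)
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


if __name__ == "__main__":
    for name in ("BLAS", "LAPACK"):
        matches = resolve_static_lib(name)
        status = ", ".join(matches) if matches else "NOT FOUND"
        print("%s -> %s" % (name, status))
```

If a variable is reported as NOT FOUND, exporting the full path with the version number spelled out (instead of a wildcard) is the simplest thing to try.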


Answer 1

If you need to use the latest versions of SciPy rather than the packaged version, without going through the hassle of building BLAS and LAPACK, you can follow the below procedure.

Install the linear algebra libraries from the repository (on Ubuntu):

sudo apt-get install gfortran libopenblas-dev liblapack-dev

Then install SciPy, either with python setup.py install (after downloading the SciPy source) or with:

pip install scipy


Answer 2

On Fedora, this works:

 yum install lapack lapack-devel blas blas-devel
 pip install numpy
 pip install scipy

Remember to install 'lapack-devel' and 'blas-devel' in addition to 'blas' and 'lapack'; otherwise you’ll get the error you mentioned or the "numpy.distutils.system_info.LapackNotFoundError" error.


Answer 3

I guess you are talking about installation in Ubuntu. Just use:

apt-get install python-numpy python-scipy

That should take care of the BLAS libraries as well. Otherwise, compiling the BLAS libraries yourself is very difficult.


Answer 4

For Windows users there is a nice binary package by Chris (warning: it’s a pretty large download, 191 MB):


Answer 5

Following the instructions given by ‘cfi’ works for me, although there are a few pieces they left out that you might need:

1) Your lapack directory, after unzipping, may be called lapack-X-Y (some version number), so you can just rename that to LAPACK.

cd ~/src
mv lapack-[tab] LAPACK

2) In that directory, you may need to do:

cd ~/src/LAPACK 
cp lapack_LINUX.a libflapack.a

Answer 6

Try using

sudo apt-get install python3-scipy

How to find a thread ID in Python

Question: How to find a thread ID in Python

I have a multithreaded Python program and a utility function, writeLog(message), that writes out a timestamp followed by the message. Unfortunately, the resultant log file gives no indication of which thread is generating which message.

I would like writeLog() to be able to add something to the message to identify which thread is calling it. Obviously I could just make the threads pass this information in, but that would be a lot more work. Is there some thread equivalent of os.getpid() that I could use?


Answer 0

threading.get_ident() works (Python 3.3+), or threading.current_thread().ident (or threading.currentThread().ident for Python < 2.6).
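
As a rough illustration of how this could be wired into the logging helper from the question, here is a minimal sketch. It assumes Python 3.3 or later; write_log mirrors the question's writeLog, but its body and the timestamp format are made up for the example.

```python
import threading
import time


def write_log(message):
    """Format a log line that includes the calling thread's identifier."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    line = "%s [thread %d] %s" % (stamp, threading.get_ident(), message)
    print(line)
    return line


if __name__ == "__main__":
    # Each worker logs one message; the ident shows which thread wrote it.
    workers = [threading.Thread(target=write_log, args=("hello",)) for _ in range(3)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
```

Because the identifier is looked up inside write_log, the calling threads do not need to pass anything extra.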


Answer 1

Using the logging module you can automatically add the current thread identifier in each log entry. Just use one of these LogRecord mapping keys in your logger format string:

%(thread)d : Thread ID (if available).

%(threadName)s : Thread name (if available).

and set up your default handler with it:

logging.basicConfig(format="%(threadName)s:%(message)s")
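
To see both mapping keys in action, the following sketch logs from a named thread into an in-memory stream. The logger name thread_demo, the thread name worker-1, and the exact format string are arbitrary choices for the example, not something prescribed by the logging module.

```python
import io
import logging
import threading

# Capture log output in memory so it can be inspected after the thread ends.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(thread)d %(threadName)s: %(message)s"))

logger = logging.getLogger("thread_demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger


def worker():
    logger.info("hello")


t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()

output = stream.getvalue()
print(output, end="")
```

Each line starts with the numeric thread ID followed by the thread name, so no change to the calling code is needed.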

Answer 2

The thread.get_ident() function returns a long integer on Linux. It’s not really a thread id.

I use this method to really get the thread id on Linux:

import ctypes
libc = ctypes.cdll.LoadLibrary('libc.so.6')

# System dependent, see e.g. /usr/include/x86_64-linux-gnu/asm/unistd_64.h
SYS_gettid = 186

def getThreadId():
   """Returns OS thread id - Specific to Linux"""
   return libc.syscall(SYS_gettid)
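
On Python 3.8 and later, the standard library exposes the kernel-assigned ID directly as threading.get_native_id(), which avoids hard-coding a syscall number. The sketch below is an alternative to the ctypes approach above, not a reproduction of it; the names get_os_thread_id, worker, and results are made up for the example.

```python
import threading


def get_os_thread_id():
    """Return the OS-level thread ID of the calling thread (Python 3.8+)."""
    return threading.get_native_id()


results = {}


def worker(label):
    # Record both the Python-level ident and the OS-level ID for comparison.
    results[label] = (threading.get_ident(), get_os_thread_id())


threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```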


Answer 4

I saw examples of thread IDs like this:

class myThread(threading.Thread):
    def __init__(self, threadID, name, counter):
        self.threadID = threadID
        ...

The threading module docs list the name attribute as well:

...

A thread has a name. 
The name can be passed to the constructor, 
and read or changed through the name attribute.

...

Thread.name

A string used for identification purposes only. 
It has no semantics. Multiple threads may
be given the same name. The initial name is set by the constructor.

Answer 5

You can get the ident of the currently running thread. The ident may be reused for other threads once the current thread ends.

When you create an instance of Thread, a name is implicitly given to the thread, following the pattern Thread-number.

The name has no meaning and does not have to be unique. The idents of all running threads are unique.

import threading


def worker():
    print(threading.current_thread().name)
    print(threading.get_ident())


threading.Thread(target=worker).start()
threading.Thread(target=worker, name='foo').start()

The function threading.current_thread() returns the currently running thread. This object holds all the information about the thread.


Answer 6

I created multiple threads in Python, printed the thread objects, and printed the ID using the ident attribute. All the IDs are the same:

<Thread(Thread-1, stopped 140500807628544)>
<Thread(Thread-2, started 140500807628544)>
<Thread(Thread-3, started 140500807628544)>

Answer 7

Similarly to @brucexin, I needed the OS-level thread identifier (which is not the same as thread.get_ident()). I use something like the code below so as not to depend on a particular syscall number or be amd64-only:

---- 8< ---- (xos.pyx)
"""module xos complements standard module os""" 

cdef extern from "<sys/syscall.h>":                                                             
    long syscall(long number, ...)                                                              
    const int SYS_gettid                                                                        

# gettid returns current OS thread identifier.                                                  
def gettid():                                                                                   
    return syscall(SYS_gettid)                                                                  

and

---- 8< ---- (test.py)
import pyximport; pyximport.install()
import xos

...

print('my tid: %d' % xos.gettid())

This depends on Cython, though.


Read specific columns from a csv file with the csv module?

Question: Read specific columns from a csv file with the csv module?

I’m trying to parse through a csv file and extract the data from only specific columns.

Example csv:

ID | Name | Address | City | State | Zip | Phone | OPEID | IPEDS |
10 | C... | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |

I’m trying to capture only specific columns, say ID, Name, Zip and Phone.

Code I’ve looked at has led me to believe I can call the specific column by its corresponding number, so ie: Name would correspond to 2 and iterating through each row using row[2] would produce all the items in column 2. Only it doesn’t.

Here’s what I’ve done so far:

import sys, argparse, csv
from settings import *

# command arguments
parser = argparse.ArgumentParser(description='csv to postgres',\
 fromfile_prefix_chars="@" )
parser.add_argument('file', help='csv file to import', action='store')
args = parser.parse_args()
csv_file = args.file

# open csv file
with open(csv_file, 'rb') as csvfile:

    # get number of columns
    for line in csvfile.readlines():
        array = line.split(',')
        first_item = array[0]

    num_columns = len(array)
    csvfile.seek(0)

    reader = csv.reader(csvfile, delimiter=' ')
        included_cols = [1, 2, 6, 7]

    for row in reader:
            content = list(row[i] for i in included_cols)
            print content

and I’m expecting that this will print out only the specific columns I want for each row except it doesn’t, I get the last column only.


Answer 0

The only way you would be getting the last column from this code is if you don’t include your print statement in your for loop.

This is most likely the end of your code:

for row in reader:
    content = list(row[i] for i in included_cols)
print content

You want it to be this:

for row in reader:
        content = list(row[i] for i in included_cols)
        print content

Now that we have covered your mistake, I would like to take this time to introduce you to the pandas module.

Pandas is spectacular for dealing with csv files, and the following code would be all you need to read a csv and save an entire column into a variable:

import pandas as pd
df = pd.read_csv(csv_file)
saved_column = df.column_name #you can also use df['column_name']

so if you wanted to save all of the info in your column Names into a variable, this is all you need to do:

names = df.Names

It’s a great module and I suggest you look into it. If your print statement was in fact inside the for loop and it still printed only the last column, which shouldn’t happen, let me know and I’ll revisit my assumption. Your posted code has a lot of indentation errors, so it was hard to tell what was supposed to be where. Hope this was helpful!
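
For readers who want to stay with the csv module, here is a minimal Python 3 sketch of column selection on the sample from the question. It makes two assumptions that differ from the posted code: the sample is separated by | rather than a space, and column indices are zero-based, so ID, Name, Zip and Phone are 0, 1, 5 and 6. The function name select_columns is made up for the example.

```python
import csv
import io

# Sample rows copied from the question; a real script would open a file instead.
sample = (
    "ID | Name | Address | City | State | Zip | Phone | OPEID | IPEDS |\n"
    "10 | C... | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |\n"
)

included_cols = [0, 1, 5, 6]  # ID, Name, Zip, Phone (zero-based positions)


def select_columns(lines, cols, delimiter="|"):
    """Return the chosen columns of every non-empty row, whitespace stripped."""
    reader = csv.reader(lines, delimiter=delimiter)
    return [[row[i].strip() for i in cols] for row in reader if row]


rows = select_columns(io.StringIO(sample), included_cols)
for row in rows:
    print(row)
```

With a real file, passing the object returned by open(path, newline='') in place of the StringIO should behave the same way.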


Answer 1

import csv
from collections import defaultdict

columns = defaultdict(list) # each value in each column is appended to a list

with open('file.txt') as f:
    reader = csv.DictReader(f) # read rows into a dictionary format
    for row in reader: # read a row as {column1: value1, column2: value2,...}
        for (k,v) in row.items(): # go over each column name and value 
            columns[k].append(v) # append the value into the appropriate list
                                 # based on column name k

print(columns['name'])
print(columns['phone'])
print(columns['street'])

With a file like

name,phone,street
Bob,0893,32 Silly
James,000,400 McHilly
Smithers,4442,23 Looped St.

Will output

>>> 
['Bob', 'James', 'Smithers']
['0893', '000', '4442']
['32 Silly', '400 McHilly', '23 Looped St.']

Or alternatively if you want numerical indexing for the columns:

with open('file.txt') as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row; works on Python 2.6+ and 3
    for row in reader:
        for (i,v) in enumerate(row):
            columns[i].append(v)
print(columns[0])

>>> 
['Bob', 'James', 'Smithers']

To change the delimiter, add delimiter=" " to the appropriate instantiation, i.e. reader = csv.reader(f, delimiter=" ")


Answer 2

Use pandas:

import pandas as pd
my_csv = pd.read_csv(filename)
column = my_csv.column_name
# you can also use my_csv['column_name']

Discard unneeded columns at parse time:

my_filtered_csv = pd.read_csv(filename, usecols=['col1', 'col3', 'col7'])

P.S. I’m just aggregating what others have said in a simple manner. Actual answers are taken from here and here.


Answer 3

With pandas you can use read_csv with usecols parameter:

df = pd.read_csv(filename, usecols=['col1', 'col3', 'col7'])

Example:

import pandas as pd
import io

s = '''
total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.5,Male,No,Sun,Dinner,3
'''

df = pd.read_csv(io.StringIO(s), usecols=['total_bill', 'day', 'size'])
print(df)

   total_bill  day  size
0       16.99  Sun     2
1       10.34  Sun     3
2       21.01  Sun     3

Answer 4

You can use numpy.loadtxt(filename). For example, if this is your database .csv:

ID | Name | Address | City | State | Zip | Phone | OPEID | IPEDS |
10 | Adam | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |
10 | Carl | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |
10 | Adolf | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |
10 | Den | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |

And you want the Name column:

import numpy as np 
b=np.loadtxt(r'filepath\name.csv',dtype=str,delimiter='|',skiprows=1,usecols=(1,))

>>> b
array([' Adam ', ' Carl ', ' Adolf ', ' Den '], 
      dtype='|S7')

More easily, you can use genfromtxt:

b = np.genfromtxt(r'filepath\name.csv', delimiter='|', names=True,dtype=None)
>>> b['Name']
array([' Adam ', ' Carl ', ' Adolf ', ' Den '], 
      dtype='|S7')

Answer 5

Context: For this type of work you should use the amazing Python petl library. That will save you a lot of work and potential frustration from doing things ‘manually’ with the standard csv module. AFAIK, the only people who still use the csv module are those who have not yet discovered better tools for working with tabular data (pandas, petl, etc.), which is fine, but if you plan to work with a lot of data in your career from various strange sources, learning something like petl is one of the best investments you can make. Getting started should only take 30 minutes once you’ve run pip install petl. The documentation is excellent.

Answer: Let’s say you have the first table in a csv file (you can also load directly from the database using petl). Then you would simply load it and do the following.

from petl import fromcsv, look, cut, tocsv 

# Load the table
table1 = fromcsv('table1.csv')
# Keep only the columns of interest
table2 = cut(table1, 'Song_Name', 'Artist_ID')
# Have a quick look to make sure things are OK; prints a nicely formatted table to your console
print(look(table2))
# Save to a new file
tocsv(table2, 'new.csv')

Answer 6

I think there is an easier way

import pandas as pd

dataset = pd.read_csv('table1.csv')
ftCol = dataset.iloc[:, 0].values

Here, in iloc[:, 0], the : means all rows and 0 is the position of the column. In the example below, the ID column will be selected:

ID | Name | Address | City | State | Zip | Phone | OPEID | IPEDS |
10 | C... | 130 W.. | Mo.. | AL... | 3.. | 334.. | 01023 | 10063 |

Answer 7

import pandas as pd
csv_file = pd.read_csv("file.csv")
# .values is the public accessor; the original used the private _ndarray_values attribute
column_val_list = csv_file.column_name.values

Answer 8

Thanks to the way you can index and subset a pandas dataframe, a very easy way to extract a single column from a csv file into a variable is:

myVar = pd.read_csv('YourPath', sep = ",")['ColumnName']

A few things to consider:

The snippet above will produce a pandas Series, not a DataFrame. The suggestion from ayhan with usecols will also be faster if speed is an issue. Testing the two approaches using %timeit on a 2122 KB csv file yields 22.8 ms for the usecols approach and 53 ms for my suggested approach.

And don’t forget import pandas as pd


Answer 9

If you need to process the columns separately, I like to destructure the columns with the zip(*iterable) pattern (effectively “unzip”). So for your example:

ids, names, zips, phones = zip(*(
  (row[1], row[2], row[6], row[7])
  for row in reader
))
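
A self-contained version of the same pattern, runnable as-is, might look like the sketch below. The three-column sample data is made up for the example, and the header row is skipped first so that it does not end up in the tuples.

```python
import csv
import io

# Made-up sample data standing in for the real csv file.
sample = "id,name,zip\n1,Ann,111\n2,Bob,222\n"

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row

# zip(*rows) transposes the selected fields: one tuple per column.
ids, names, zips = zip(*((row[0], row[1], row[2]) for row in reader))

print(ids)
print(names)
print(zips)
```

One caveat: if the file has no data rows, zip(*...) yields nothing and the unpacking raises ValueError, so an empty-input check may be needed.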

Answer 10

To fetch the column names, it is better to use readline() instead of readlines(); this avoids looping over the whole file, reading it completely, and storing it in an array.

with open(csv_file, 'rb') as csvfile:

    # get number of columns

    line = csvfile.readline()

    first_item = line.split(',')
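
The same idea can be expressed with the csv module itself, which also handles quoted fields that a plain split(',') would break. This is a minimal Python 3 sketch; read_header is a made-up helper name, and the sample reuses the name,phone,street file from an earlier answer.

```python
import csv
import io


def read_header(fileobj, delimiter=","):
    """Return the first row (the column names) without reading the rest."""
    return next(csv.reader(fileobj, delimiter=delimiter), [])


sample = io.StringIO("name,phone,street\nBob,0893,32 Silly\n")
header = read_header(sample)
print(header)       # the column names
print(len(header))  # the number of columns
```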

Python 3.x rounding behavior

Question: Python 3.x rounding behavior

I was just re-reading What’s New In Python 3.0 and it states:

The round() function rounding strategy and return type have changed. Exact halfway cases are now rounded to the nearest even result instead of away from zero. (For example, round(2.5) now returns 2 rather than 3.)

and the documentation for round:

For the built-in types supporting round(), values are rounded to the closest multiple of 10 to the power minus n; if two multiples are equally close, rounding is done toward the even choice

So, under v2.7.3:

In [85]: round(2.5)
Out[85]: 3.0

In [86]: round(3.5)
Out[86]: 4.0

as I’d have expected. However, now under v3.2.3:

In [32]: round(2.5)
Out[32]: 2

In [33]: round(3.5)
Out[33]: 4

This seems counter-intuitive and contrary to what I understand about rounding (and bound to trip up people). English isn’t my native language but until I read this I thought I knew what rounding meant :-/ I am sure at the time v3 was introduced there must have been some discussion of this, but I was unable to find a good reason in my search.

  1. Does anyone have insight into why this was changed to this?
  2. Are there any other mainstream programming languages (e.g., C, C++, Java, Perl, ..) that do this sort of (to me inconsistent) rounding?

What am I missing here?

UPDATE: @Li-aungYip’s comment re “Banker’s rounding” gave me the right search term/keywords to search for and I found this SO question: Why does .NET use banker’s rounding as default?, so I will be reading that carefully.


Answer 0

Python 3’s way (called “round half to even” or “banker’s rounding”) is considered the standard rounding method these days, though some language implementations aren’t on the bus yet.

The simple “always round 0.5 up” technique results in a slight bias toward the higher number. With large numbers of calculations, this can be significant. The Python 3.0 approach eliminates this issue.

There is more than one method of rounding in common use. IEEE 754, the international standard for floating-point math, defines five different rounding methods (the one used by Python 3.0 is the default). And there are others.

This behavior is not as widely known as it ought to be. AppleScript was, if I remember correctly, an early adopter of this rounding method. The round command in AppleScript offers several options, but round-toward-even is the default as it is in IEEE 754. Apparently the engineer who implemented the round command got so fed up with all the requests to “make it work like I learned in school” that he implemented just that: round 2.5 rounding as taught in school is a valid AppleScript command. 🙂
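To see the two rules side by side, here is a minimal sketch using Python's decimal module, which lets you choose the rounding mode explicitly. The helper name round_with and the sample values are illustrative assumptions, not something from the answer above.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_with(value, mode):
    """Round a decimal string to an integer using the given rounding mode."""
    return int(Decimal(value).quantize(Decimal("1"), rounding=mode))


# Values that sit exactly halfway between two integers.
ties = ["0.5", "1.5", "2.5", "3.5"]

half_up = [round_with(v, ROUND_HALF_UP) for v in ties]
half_even = [round_with(v, ROUND_HALF_EVEN) for v in ties]
builtin = [round(float(v)) for v in ties]  # Python 3 built-in round()

print(half_up)    # [1, 2, 3, 4]
print(half_even)  # [0, 2, 2, 4]
print(builtin)    # [0, 2, 2, 4]

# The exact sum of the inputs is 8.0; half-up drifts upward, half-even does not.
print(sum(half_up), sum(half_even))  # 10 8
```

If you really need the "school" behaviour in Python 3, passing ROUND_HALF_UP to Decimal.quantize as above is one way to get it.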


Python Flask,如何设置内容类型

问题:Python Flask,如何设置内容类型

我正在使用Flask,并且从get请求返回一个XML文件。如何将内容类型设置为xml?

例如

@app.route('/ajax_ddl')
def ajax_ddl():
    xml = 'foo'
    header("Content-type: text/xml")
    return xml

I am using Flask and I return an XML file from a get request. How do I set the content type to xml ?

e.g.

@app.route('/ajax_ddl')
def ajax_ddl():
    xml = 'foo'
    header("Content-type: text/xml")
    return xml

回答 0

尝试这样:

from flask import Response
@app.route('/ajax_ddl')
def ajax_ddl():
    xml = 'foo'
    return Response(xml, mimetype='text/xml')

实际的Content-Type基于mimetype参数和字符集(默认为UTF-8)。

响应(和请求)对象记录在这里:http://werkzeug.pocoo.org/docs/wrappers/

Try like this:

from flask import Response
@app.route('/ajax_ddl')
def ajax_ddl():
    xml = 'foo'
    return Response(xml, mimetype='text/xml')

The actual Content-Type is based on the mimetype parameter and the charset (defaults to UTF-8).

Response (and request) objects are documented here: http://werkzeug.pocoo.org/docs/wrappers/


回答 1

就这么简单

x = "some data you want to return"
return x, 200, {'Content-Type': 'text/css; charset=utf-8'}

希望能帮助到你

更新:使用此方法,因为它可以与python 2.x和python 3.x一起使用

其次,它还消除了多头问题。

from flask import Response
r = Response(response="TEST OK", status=200, mimetype="application/xml")
r.headers["Content-Type"] = "text/xml; charset=utf-8"
return r

As simple as this

x = "some data you want to return"
return x, 200, {'Content-Type': 'text/css; charset=utf-8'}

Hope it helps

Update: Prefer the method below, because it works with both Python 2.x and Python 3.x, and it also avoids the multiple-header problem.

from flask import Response
r = Response(response="TEST OK", status=200, mimetype="application/xml")
r.headers["Content-Type"] = "text/xml; charset=utf-8"
return r

回答 2

我喜欢并赞成@Simon Sapin的答案。但是,我最终采取了稍有不同的策略,并创建了自己的装饰器:

from flask import Response
from functools import wraps

def returns_xml(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        r = f(*args, **kwargs)
        return Response(r, content_type='text/xml; charset=utf-8')
    return decorated_function

并以此方式使用它:

@app.route('/ajax_ddl')
@returns_xml
def ajax_ddl():
    xml = 'foo'
    return xml

我认为这稍微舒适一些。

I like and upvoted @Simon Sapin’s answer. I ended up taking a slightly different tack, however, and created my own decorator:

from flask import Response
from functools import wraps

def returns_xml(f):
    @wraps(f)
    def decorated_function(*args, **kwargs):
        r = f(*args, **kwargs)
        return Response(r, content_type='text/xml; charset=utf-8')
    return decorated_function

and use it thus:

@app.route('/ajax_ddl')
@returns_xml
def ajax_ddl():
    xml = 'foo'
    return xml

I think this is slightly more comfortable.


回答 3

使用make_response方法获取数据响应。然后设置mimetype属性。最后返回此响应:

@app.route('/ajax_ddl')
def ajax_ddl():
    xml = 'foo'
    resp = app.make_response(xml)
    resp.mimetype = "text/xml"
    return resp

如果Response直接使用,您将失去通过设置来定制响应的机会app.response_class。该make_response方法使用app.responses_class制作响应对象。在此您可以创建自己的类,添加以使您的应用程序全局使用它:

import time

class MyResponse(app.response_class):
    def __init__(self, *args, **kwargs):
        super(MyResponse, self).__init__(*args, **kwargs)
        # Stamp every response with the time it was created.
        self.set_cookie("last-visit", time.ctime())

app.response_class = MyResponse

Use the make_response method to get a response with your data. Then set the mimetype attribute. Finally return this response:

@app.route('/ajax_ddl')
def ajax_ddl():
    xml = 'foo'
    resp = app.make_response(xml)
    resp.mimetype = "text/xml"
    return resp

If you use Response directly, you lose the chance to customize responses by setting app.response_class. The make_response method uses app.response_class to build the response object, so you can create your own class and make your application use it globally:

import time

class MyResponse(app.response_class):
    def __init__(self, *args, **kwargs):
        super(MyResponse, self).__init__(*args, **kwargs)
        # Stamp every response with the time it was created.
        self.set_cookie("last-visit", time.ctime())

app.response_class = MyResponse

回答 4

from flask import Flask, render_template, make_response
app = Flask(__name__)

@app.route('/user/xml')
def user_xml():
    resp = make_response(render_template('xml/user.html', username='Ryan'))
    resp.headers['Content-type'] = 'text/xml; charset=utf-8'
    return resp

回答 5

通常,您不必Response自己创建对象,因为make_response()它将为您处理。

from flask import Flask, make_response                                      
app = Flask(__name__)                                                       

@app.route('/')                                                             
def index():                                                                
    bar = '<body>foo</body>'                                                
    response = make_response(bar)                                           
    response.headers['Content-Type'] = 'text/xml; charset=utf-8'            
    return response

还有一件事,似乎没有人提到after_this_request,我想说点什么:

after_this_request

在此请求后执行功能。这对于修改响应对象很有用。该函数传递给响应对象,并且必须返回相同或新的对象。

因此我们可以使用来实现after_this_request,代码应如下所示:

from flask import Flask, after_this_request
app = Flask(__name__)

@app.route('/')
def index():
    @after_this_request
    def add_header(response):
        response.headers['Content-Type'] = 'text/xml; charset=utf-8'
        return response
    return '<body>foobar</body>'

Usually you don’t have to create the Response object yourself because make_response() will take care of that for you.

from flask import Flask, make_response                                      
app = Flask(__name__)                                                       

@app.route('/')                                                             
def index():                                                                
    bar = '<body>foo</body>'                                                
    response = make_response(bar)                                           
    response.headers['Content-Type'] = 'text/xml; charset=utf-8'            
    return response

One more thing: no one seems to have mentioned after_this_request, so here is a note on it:

after_this_request

Executes a function after this request. This is useful to modify response objects. The function is passed the response object and has to return the same or a new one.

So we can also set the header with after_this_request; the code looks like this:

from flask import Flask, after_this_request
app = Flask(__name__)

@app.route('/')
def index():
    @after_this_request
    def add_header(response):
        response.headers['Content-Type'] = 'text/xml; charset=utf-8'
        return response
    return '<body>foobar</body>'

回答 6

您可以尝试以下方法(python3.6.2):

情况一:

@app.route('/hello')
def hello():

    headers={ 'content-type':'text/plain' ,'location':'http://www.stackoverflow.com'}
    response = make_response('<h1>hello world</h1>',301)
    response.headers = headers
    return response

案例二:

@app.route('/hello')
def hello():

    headers={ 'content-type':'text/plain' ,'location':'http://www.stackoverflow.com'}
    return '<h1>hello world</h1>',301,headers

我正在使用Flask。如果您想返回json,您可以编写以下代码:

import json # 
@app.route('/search/<keyword>')
def search(keyword):

    result = Book.search_by_keyword(keyword)
    return json.dumps(result),200,{'content-type':'application/json'}


from flask import jsonify
@app.route('/search/<keyword>')
def search(keyword):

    result = Book.search_by_keyword(keyword)
    return jsonify(result)

You can try the following methods (Python 3.6.2):

case one:

@app.route('/hello')
def hello():

    headers={ 'content-type':'text/plain' ,'location':'http://www.stackoverflow.com'}
    response = make_response('<h1>hello world</h1>',301)
    response.headers = headers
    return response

case two:

@app.route('/hello')
def hello():

    headers={ 'content-type':'text/plain' ,'location':'http://www.stackoverflow.com'}
    return '<h1>hello world</h1>',301,headers

I am using Flask. If you want to return JSON, you can write this:

import json
@app.route('/search/<keyword>')
def search(keyword):

    result = Book.search_by_keyword(keyword)
    return json.dumps(result),200,{'content-type':'application/json'}

or, using Flask's jsonify helper, which sets the JSON content type for you:

from flask import jsonify
@app.route('/search/<keyword>')
def search(keyword):

    result = Book.search_by_keyword(keyword)
    return jsonify(result)

仅在Django启动一次时执行代码?

问题:仅在Django启动一次时执行代码?

我正在编写一个Django中间件类,该类只想在启动时执行一次,以初始化一些其他人工代码。我遵循了sdolan 在此处发布的非常好的解决方案,但是“ Hello”消息两次输出到终端。例如

from django.core.exceptions import MiddlewareNotUsed
from django.conf import settings

class StartupMiddleware(object):
    def __init__(self):
        print "Hello world"
        raise MiddlewareNotUsed('Startup complete')

在我的Django设置文件中,该类已包含在MIDDLEWARE_CLASSES列表中。

但是当我使用runserver运行Django并请求页面时,我进入了终端

Django version 1.3, using settings 'config.server'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Hello world
[22/Jul/2011 15:54:36] "GET / HTTP/1.1" 200 698
Hello world
[22/Jul/2011 15:54:36] "GET /static/css/base.css HTTP/1.1" 200 0

有什么想法为什么要打印两次“ Hello world”?谢谢。

I’m writing a Django Middleware class that I want to execute only once at startup, to initialise some other arbitrary code. I’ve followed the very nice solution posted by sdolan here, but the “Hello” message is output to the terminal twice. E.g.

from django.core.exceptions import MiddlewareNotUsed
from django.conf import settings

class StartupMiddleware(object):
    def __init__(self):
        print "Hello world"
        raise MiddlewareNotUsed('Startup complete')

and in my Django settings file, I’ve got the class included in the MIDDLEWARE_CLASSES list.

But when I run Django using runserver and request a page, I get in the terminal

Django version 1.3, using settings 'config.server'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
Hello world
[22/Jul/2011 15:54:36] "GET / HTTP/1.1" 200 698
Hello world
[22/Jul/2011 15:54:36] "GET /static/css/base.css HTTP/1.1" 200 0

Any ideas why “Hello world” is printed twice? Thanks.


回答 0

从以下Pykler的答案进行更新:Django 1.7现在对此具有钩子


不要这样

您不希望一次性使用“中间件”。

您想在顶层执行代码urls.py。该模块将被导入并执行一次。

urls.py

from django.conf.urls.defaults import *
from my_app import one_time_startup

urlpatterns = ...

one_time_startup()

Update from Pykler’s answer below: Django 1.7 now has a hook for this


Don’t do it this way.

You don’t want “middleware” for a one-time startup thing.

You want to execute code in the top-level urls.py. That module is imported and executed once.

urls.py

from django.conf.urls.defaults import *
from my_app import one_time_startup

urlpatterns = ...

one_time_startup()

如何在Python的同一行上打印变量和字符串?

问题:如何在Python的同一行上打印变量和字符串?

我正在使用python算出如果一个孩子每7秒出生一次,那么5年内将有多少个孩子出生。问题出在我的最后一行。当我在文本的任何一侧打印文本时,如何使它工作?

这是我的代码:

currentPop = 312032486
oneYear = 365
hours = 24
minutes = 60
seconds = 60

# seconds in a single day
secondsInDay = hours * minutes * seconds

# seconds in a year
secondsInYear = secondsInDay * oneYear

fiveYears = secondsInYear * 5

#Seconds in 5 years
print fiveYears

# fiveYears in seconds, divided by 7 seconds
births = fiveYears // 7

print "If there was a birth every 7 seconds, there would be: " births "births"

I am using python to work out how many children would be born in 5 years if a child was born every 7 seconds. The problem is on my last line. How do I get a variable to work when I’m printing text either side of it?

Here is my code:

currentPop = 312032486
oneYear = 365
hours = 24
minutes = 60
seconds = 60

# seconds in a single day
secondsInDay = hours * minutes * seconds

# seconds in a year
secondsInYear = secondsInDay * oneYear

fiveYears = secondsInYear * 5

#Seconds in 5 years
print fiveYears

# fiveYears in seconds, divided by 7 seconds
births = fiveYears // 7

print "If there was a birth every 7 seconds, there would be: " births "births"

回答 0

使用,分隔字符串和变量,同时打印:

print "If there was a birth every 7 seconds, there would be: ",births,"births"

, 在print语句中将项目分隔一个空格:

>>> print "foo","bar","spam"
foo bar spam

或更好地使用字符串格式

print "If there was a birth every 7 seconds, there would be: {} births".format(births)

字符串格式化功能强大得多,它还允许您执行其他一些操作,例如:填充,填充,对齐,宽度,设置精度等

>>> print "{:d} {:03d} {:>20f}".format(1,2,1.1)
1 002             1.100000
  ^^^
  zero-padded to width 3

演示:

>>> births = 4
>>> print "If there was a birth every 7 seconds, there would be: ",births,"births"
If there was a birth every 7 seconds, there would be:  4 births

#formatting
>>> print "If there was a birth every 7 seconds, there would be: {} births".format(births)
If there was a birth every 7 seconds, there would be: 4 births

Use , to separate strings and variables while printing:

print("If there was a birth every 7 seconds, there would be: ", births, "births")

The , in a print call separates the items with a single space:

>>> print("foo", "bar", "spam")
foo bar spam

or better use string formatting:

print("If there was a birth every 7 seconds, there would be: {} births".format(births))

String formatting is much more powerful and allows you to do some other things as well, like padding, fill, alignment, width, set precision, etc.

>>> print("{:d} {:03d} {:>20f}".format(1, 2, 1.1))
1 002             1.100000
  ^^^
  zero-padded to width 3

Demo:

>>> births = 4
>>> print("If there was a birth every 7 seconds, there would be: ", births, "births")
If there was a birth every 7 seconds, there would be:  4 births

# formatting
>>> print("If there was a birth every 7 seconds, there would be: {} births".format(births))
If there was a birth every 7 seconds, there would be: 4 births
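As a further illustration of the format options mentioned above, here is a small sketch that reuses the arithmetic from the question (five 365-day years, one birth every 7 seconds). The snake_case variable names and the particular format specs are my own choices for the example.

```python
# Same arithmetic as in the question: seconds in five 365-day years, one birth per 7 s.
five_years = 365 * 24 * 60 * 60 * 5
births = five_years // 7

plain = "There would be {} births".format(births)
grouped = "There would be {:,} births".format(births)  # thousands separators
padded = "[{:>12}]".format(births)                     # right-aligned, width 12
precise = "{:.2f} births per second".format(1 / 7)     # two decimal places

print(plain)    # There would be 22525714 births
print(grouped)  # There would be 22,525,714 births
print(padded)   # [    22525714]
print(precise)  # 0.14 births per second
```

The same format specs work inside f-strings on Python 3.6 and newer, for example f"{births:,}".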

回答 1

还有两个

第一个

>>> births = str(5)
>>> print "there are " + births + " births."
there are 5 births.

添加字符串时,它们会串联在一起。

第二个

同样format,字符串的(Python 2.6和更高版本)方法可能是标准方法:

>>> births = str(5)
>>>
>>> print "there are {} births.".format(births)
there are 5 births.

format方法也可以与列表一起使用

>>> format_list = ['five','three']
>>> print "there are {} births and {} deaths".format(*format_list) #unpack the list
there are five births and three deaths

或字典

>>> format_dictionary = {'births': 'five', 'deaths': 'three'}
>>> print "there are {births} births, and {deaths} deaths".format(**format_dictionary) #yup, unpack the dictionary
there are five births, and three deaths

Two more

The First one

>>> births = str(5)
>>> print("there are " + births + " births.")
there are 5 births.

When adding strings, they concatenate.

The Second One

Also the format (Python 2.6 and newer) method of strings is probably the standard way:

>>> births = str(5)
>>>
>>> print("there are {} births.".format(births))
there are 5 births.

This format method can be used with lists as well

>>> format_list = ['five', 'three']
>>> # * unpacks the list:
>>> print("there are {} births and {} deaths".format(*format_list))  
there are five births and three deaths

or dictionaries

>>> format_dictionary = {'births': 'five', 'deaths': 'three'}
>>> # ** unpacks the dictionary
>>> print("there are {births} births, and {deaths} deaths".format(**format_dictionary))
there are five births, and three deaths

回答 2

Python是一种非常通用的语言。您可以通过不同的方法打印变量。我列出了以下4种方法。您可以根据需要使用它们。

例:

a=1
b='ball'

方法1:

print('I have %d %s' %(a,b))

方法2:

print('I have',a,b)

方法3:

print('I have {} {}'.format(a,b))

方法4:

print('I have ' + str(a) +' ' +b)

方法5:

print(f'I have {a} {b}')

输出为:

I have 1 ball

Python is a very versatile language. You may print variables by different methods. I have listed below five methods. You may use them according to your convenience.

Example:

a = 1
b = 'ball'

Method 1:

print('I have %d %s' % (a, b))

Method 2:

print('I have', a, b)

Method 3:

print('I have {} {}'.format(a, b))

Method 4:

print('I have ' + str(a) + ' ' + b)

Method 5:

print(f'I have {a} {b}')

The output would be:

I have 1 ball

回答 3

如果要使用python 3,它非常简单:

print("If there was a birth every 7 seconds, there would be %d births." % births)

If you are working with Python 3, it’s very simple:

print("If there was a birth every 7 seconds, there would be %d births." % births)

回答 4

从python 3.6开始,您可以使用文字字符串插值。

>>> births = 5.25487
>>> print(f'If there was a birth every 7 seconds, there would be: {births:.2f} births')
If there was a birth every 7 seconds, there would be: 5.25 births

As of Python 3.6 you can use literal string interpolation (f-strings).

>>> births = 5.25487
>>> print(f'If there was a birth every 7 seconds, there would be: {births:.2f} births')
If there was a birth every 7 seconds, there would be: 5.25 births

回答 5

您可以使用f-string.format()方法

使用f弦

print(f'If there was a birth every 7 seconds, there would be: {births} births')

使用.format()

print("If there was a birth every 7 seconds, there would be: {births} births".format(births=births))

You can either use the f-string or .format() methods

Using f-string

print(f'If there was a birth every 7 seconds, there would be: {births} births')

Using .format()

print("If there was a birth every 7 seconds, there would be: {births} births".format(births=births))

回答 6

您可以使用格式字符串:

print "There are %d births" % (births,)

或在这种简单情况下:

print "There are ", births, "births"

You can either use a format string:

print "There are %d births" % (births,)

or in this simple case:

print "There are ", births, "births"

回答 7

如果您使用的是python 3.6或最新版本,则f-string是最佳和简便的选择

print(f"{your_variable_name}")

If you are using Python 3.6 or later, an f-string is the best and easiest option:

print(f"{your_variable_name}")

回答 8

您将首先创建一个变量:例如:D =1。然后执行此操作,但是将字符串替换为所需的任何内容:

D = 1
print("Here is a number!:",D)

You would first make a variable, for example D = 1. Then do the following, replacing the string with whatever you want:

D = 1
print("Here is a number!:",D)

回答 9

在当前的python版本上,您必须使用括号,如下所示:

print ("If there was a birth every 7 seconds", X)

On a current Python version you have to use parentheses, like so:

print ("If there was a birth every 7 seconds", X)

回答 10

使用字符串格式

print("If there was a birth every 7 seconds, there would be: {} births".format(births))
 # Will replace "{}" with births

如果您在进行玩具项目,请使用:

print('If there was a birth every 7 seconds, there would be:', births, 'births')

要么

print('If there was a birth every 7 seconds, there would be: %d births' %(births))
# Will replace %d with births

Use string formatting:

print("If there was a birth every 7 seconds, there would be: {} births".format(births))
 # Will replace "{}" with births

If you are doing a toy project, use:

print('If there was a birth every 7 seconds, there would be:', births, 'births')

or

print('If there was a birth every 7 seconds, there would be: %d births' %(births))
# Will replace %d with births

回答 11

您可以使用字符串格式来执行此操作:

print "If there was a birth every 7 seconds, there would be: %d births" % births

或者您可以提供print多个参数,它将自动用空格分隔它们:

print "If there was a birth every 7 seconds, there would be:", births, "births"

You can use string formatting to do this:

print "If there was a birth every 7 seconds, there would be: %d births" % births

or you can give print multiple arguments, and it will automatically separate them by a space:

print "If there was a birth every 7 seconds, there would be:", births, "births"

回答 12

我将您的脚本复制并粘贴到.py文件中。我使用Python 2.7.10原样运行它,并收到了相同的语法错误。我还在Python 3.5中尝试了该脚本,并收到以下输出:

File "print_strings_on_same_line.py", line 16
print fiveYears
              ^
SyntaxError: Missing parentheses in call to 'print'

然后,我修改了最后一行,其中打印了出生人数,如下所示:

currentPop = 312032486
oneYear = 365
hours = 24
minutes = 60
seconds = 60

# seconds in a single day
secondsInDay = hours * minutes * seconds

# seconds in a year
secondsInYear = secondsInDay * oneYear

fiveYears = secondsInYear * 5

#Seconds in 5 years
print fiveYears

# fiveYears in seconds, divided by 7 seconds
births = fiveYears // 7

print "If there was a birth every 7 seconds, there would be: " + str(births) + " births"

输出为(Python 2.7.10):

157680000
If there was a birth every 7 seconds, there would be: 22525714 births

我希望这有帮助。

I copied and pasted your script into a .py file. I ran it as-is with Python 2.7.10 and received the same syntax error. I also tried the script in Python 3.5 and received the following output:

File "print_strings_on_same_line.py", line 16
print fiveYears
              ^
SyntaxError: Missing parentheses in call to 'print'

Then, I modified the last line where it prints the number of births as follows:

currentPop = 312032486
oneYear = 365
hours = 24
minutes = 60
seconds = 60

# seconds in a single day
secondsInDay = hours * minutes * seconds

# seconds in a year
secondsInYear = secondsInDay * oneYear

fiveYears = secondsInYear * 5

#Seconds in 5 years
print fiveYears

# fiveYears in seconds, divided by 7 seconds
births = fiveYears // 7

print "If there was a birth every 7 seconds, there would be: " + str(births) + " births"

The output was (Python 2.7.10):

157680000
If there was a birth every 7 seconds, there would be: 22525714 births

I hope this helps.


回答 13

只需在之间使用,(逗号)。

请参阅以下代码以获得更好的理解:

# Weight converter pounds to kg

weight_lbs = input("Enter your weight in pounds: ")

weight_kg = 0.45 * int(weight_lbs)

print("You are ", weight_kg, " kg")

Just use , (comma) in between.

See this code for better understanding:

# Weight converter pounds to kg

weight_lbs = input("Enter your weight in pounds: ")

weight_kg = 0.45 * int(weight_lbs)

print("You are ", weight_kg, " kg")

回答 14

稍有不同:使用Python 3并在同一行中打印几个变量:

from sys import argv
print("~~Create new DB:",argv[5],"; with user:",argv[3],"; and Password:",argv[4]," ~~")

Slightly different: Using Python 3 and print several variables in the same line:

from sys import argv
print("~~Create new DB:",argv[5],"; with user:",argv[3],"; and Password:",argv[4]," ~~")

回答 15

PYTHON 3

最好使用格式选项

user_name = input("Enter your name : ")

points = 10

print("Hello, {} your point is {} : ".format(user_name, points))

或将输入声明为字符串并使用

user_name = str(input("Enter your name : "))

points = 10

print("Hello, "+user_name+" your point is " +str(points))

PYTHON 3

Better to use the format option

user_name = input("Enter your name : ")

points = 10

print("Hello, {} your point is {} : ".format(user_name, points))

or declare the input as string and use

user_name = str(input("Enter your name : "))

points = 10

print("Hello, "+user_name+" your point is " +str(points))

回答 16

如果在字符串和变量之间使用逗号,如下所示:

print "If there was a birth every 7 seconds, there would be: ", births, "births"

Use a comma between the strings and the variable, like this:

print "If there was a birth every 7 seconds, there would be: ", births, "births"

使用python的eval()与ast.literal_eval()?

问题:使用python的eval()与ast.literal_eval()?

我遇到了一些代码,eval()将其作为一种可能的解决方案。现在,我以前从未使用eval()过,但是,我遇到了很多有关它可能引起的潜在危险的信息。也就是说,我对使用它非常谨慎。

我的情况是我有一个用户输入:

datamap = raw_input('Provide some data here: ')

哪里datamap需要一本字典。我四处搜寻,发现eval()可以解决这个问题。我认为我可以在尝试使用数据之前检查输入的类型,这将是可行的安全预防措施。

datamap = eval(raw_input('Provide some data here: '))
if not isinstance(datamap, dict):
    return

我通读了文档,但仍不清楚这是否安全。eval是否在输入数据后或datamap调用变量后立即评估数据?

ast模块是.literal_eval()唯一安全的选择吗?

I have a situation with some code where eval() came up as a possible solution. Now I have never had to use eval() before but, I have come across plenty of information about the potential danger it can cause. That said, I’m very wary about using it.

My situation is that I have input being given by a user:

datamap = input('Provide some data here: ')

Where datamap needs to be a dictionary. I searched around and found that eval() could work this out. I thought that I might be able to check the type of the input before trying to use the data and that would be a viable security precaution.

datamap = eval(input('Provide some data here: '))
if not isinstance(datamap, dict):
    return

I read through the docs and I am still unclear if this would be safe or not. Does eval evaluate the data as soon as its entered or after the datamap variable is called?

Is the ast module’s .literal_eval() the only safe option?


回答 0

datamap = eval(raw_input('Provide some data here: '))意味着您实际上认为代码不安全之前对其进行了评估。调用该函数后,它将立即评估代码。另请参阅的危险eval

ast.literal_eval 如果输入不是有效的Python数据类型,则会引发异常,因此如果输入无效,则不会执行代码。

使用ast.literal_eval时,你需要eval。通常,您不应该评估文字Python语句。

datamap = eval(input('Provide some data here: ')) means that you actually evaluate the code before you deem it to be unsafe or not. It evaluates the code as soon as the function is called. See also the dangers of eval.

ast.literal_eval raises an exception if the input isn’t a valid Python datatype, so the code won’t be executed if it’s not.

Use ast.literal_eval whenever you need eval. You shouldn’t usually evaluate literal Python statements.


回答 1

ast.literal_eval() 仅认为Python语法的一小部分有效:

提供的字符串或节点只能由以下Python文字结构组成:字符串,数字,元组,列表,字典,布尔值和无。

传递__import__('os').system('rm -rf /a-path-you-really-care-about')ast.literal_eval()将引发一个错误,但eval()会愉快地擦拭您的驱动器。

由于看起来您只是让用户输入普通字典,所以请使用ast.literal_eval()。它可以安全地执行您想要的操作,仅此而已。

ast.literal_eval() only considers a small subset of Python’s syntax to be valid:

The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

Passing __import__('os').system('rm -rf /a-path-you-really-care-about') into ast.literal_eval() will raise an error, but eval() will happily delete your files.

Since it looks like you’re only letting the user input a plain dictionary, use ast.literal_eval(). It safely does what you want and nothing more.
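For the dictionary use case in the question, a minimal sketch might look like the following. The function name parse_datamap and the choice to return None on bad input are assumptions made for illustration; the point is that the type check happens on a value that was parsed, never executed.

```python
import ast


def parse_datamap(text):
    """Parse user text as a Python literal and accept it only if it is a dict."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a plain literal (for example a function call) or not valid syntax.
        return None
    return value if isinstance(value, dict) else None


print(parse_datamap("{'a': 1, 'b': [2, 3]}"))      # {'a': 1, 'b': [2, 3]}
print(parse_datamap("[1, 2, 3]"))                  # None (a list, not a dict)
print(parse_datamap("__import__('os').getcwd()"))  # None (rejected, never run)
```

In a real program you would pass the result of input() (or raw_input() on Python 2) to parse_datamap and ask again when it returns None.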


回答 2

eval: 此功能非常强大,但是如果您接受字符串以从不受信任的输入中求值,则也非常危险。假设要评估的字符串是“ os.system(’rm -rf /’)”?它将真正开始删除计算机上的所有文件。

ast.literal_eval: 安全地评估表达式节点或包含Python文字或容器显示的字符串。提供的字符串或节点只能由以下Python文字结构组成:字符串,字节,数字,元组,列表,字典,集合,布尔值,无,字节和集合。

句法:

eval(expression, globals=None, locals=None)
import ast
ast.literal_eval(node_or_string)

例:

# python 2.x - doesn't accept operators in string format
import ast
ast.literal_eval('[1, 2, 3]')  # output: [1, 2, 3]
ast.literal_eval('1+1') # output: ValueError: malformed string


# python 3.0 -3.6
import ast
ast.literal_eval("1+1") # output : 2
ast.literal_eval("{'a': 2, 'b': 3, 3:'xyz'}") # output : {'a': 2, 'b': 3, 3:'xyz'}
# type dictionary
ast.literal_eval("",{}) # output : TypeError, literal_eval accepts only one argument
ast.literal_eval("__import__('os').system('rm -rf /')") # output : error

eval("__import__('os').system('rm -rf /')") 
# output : start deleting all the files on your computer.
# restricting using global and local variables
eval("__import__('os').system('rm -rf /')",{'__builtins__':{}},{})
# output : Error due to blocked imports by passing  '__builtins__':{} in global

# But still eval is not safe. we can access and break the code as given below
s = """
(lambda fc=(
lambda n: [
    c for c in 
        ().__class__.__bases__[0].__subclasses__() 
        if c.__name__ == n
    ][0]
):
fc("function")(
    fc("code")(
        0,0,0,0,"KABOOM",(),(),(),"","",0,""
    ),{}
)()
)()
"""
eval(s, {'__builtins__':{}})

在上面的代码中().__class__.__bases__[0],对象本身就是什么。现在我们实例化了所有子类,这里我们的主要enter code here目标是从中找到一个名为n的类。

我们需要code对象和function实例化的子类的对象。这是CPython访问对象子类并附加系统的另一种方法。

从python 3.7开始,ast.literal_eval()更加严格了。不再允许对任意数字进行加减。链接

eval: This is very powerful, but is also very dangerous if you accept strings to evaluate from untrusted input. Suppose the string being evaluated is "os.system('rm -rf /')"? It will really start deleting all the files on your computer.

ast.literal_eval: Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.

Syntax:

eval(expression, globals=None, locals=None)
import ast
ast.literal_eval(node_or_string)

Example:

# python 2.x - doesn't accept operators in string format
import ast
ast.literal_eval('[1, 2, 3]')  # output: [1, 2, 3]
ast.literal_eval('1+1') # output: ValueError: malformed string


# python 3.0 -3.6
import ast
ast.literal_eval("1+1") # output : 2
ast.literal_eval("{'a': 2, 'b': 3, 3:'xyz'}") # output : {'a': 2, 'b': 3, 3:'xyz'}
# type dictionary
ast.literal_eval("",{}) # output : TypeError, literal_eval accepts only one argument
ast.literal_eval("__import__('os').system('rm -rf /')") # output : error

eval("__import__('os').system('rm -rf /')") 
# output : start deleting all the files on your computer.
# restricting using global and local variables
eval("__import__('os').system('rm -rf /')",{'__builtins__':{}},{})
# output : Error due to blocked imports by passing  '__builtins__':{} in global

# But still eval is not safe. we can access and break the code as given below
s = """
(lambda fc=(
lambda n: [
    c for c in 
        ().__class__.__bases__[0].__subclasses__() 
        if c.__name__ == n
    ][0]
):
fc("function")(
    fc("code")(
        0,0,0,0,"KABOOM",(),(),(),"","",0,""
    ),{}
)()
)()
"""
eval(s, {'__builtins__':{}})

In the above code, ().__class__.__bases__[0] is nothing but object itself. We then list all of its subclasses; our main objective is to find the class named n among them.

We need the code object and the function object from those subclasses. This is an alternative way, in CPython, to access the subclasses of object and attack the system.

From Python 3.7, ast.literal_eval() is stricter: addition and subtraction of arbitrary numbers are no longer allowed.
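The stricter behaviour is easy to check on Python 3.7 or newer. This is only a sketch: the helper try_literal_eval is a hypothetical name, and it reports the exception class name instead of raising so that the results can be compared side by side.

```python
import ast


def try_literal_eval(text):
    """Return the parsed literal, or the name of the exception that was raised."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        return type(exc).__name__


print(try_literal_eval("[1, 2, 3]"))  # [1, 2, 3]
print(try_literal_eval("-5"))         # -5
print(try_literal_eval("1+2j"))       # (1+2j), complex literals are still accepted
print(try_literal_eval("1+1"))        # ValueError on Python 3.7 and newer
```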


回答 3

Python 渴望进行评估,因此无论eval(raw_input(...))用户eval随后对数据进行什么操作,只要它点击,就将评估用户的输入。因此,这是不安全的,尤其是在eval用户输入时。

使用ast.literal_eval


例如,在提示符下输入此命令对您非常不利:

__import__('os').system('rm -rf /a-path-you-really-care-about')

Python’s eager in its evaluation, so eval(input(...)) (Python 3) will evaluate the user’s input as soon as it hits the eval, regardless of what you do with the data afterwards. Therefore, this is not safe, especially when you eval user input.

Use ast.literal_eval.


As an example, entering this at the prompt could be very bad for you:

__import__('os').system('rm -rf /a-path-you-really-care-about')

回答 4

如果您需要的只是用户提供的词典,则可能是更好的解决方案json.loads。主要限制是json dict需要字符串键。另外,您只能提供文字数据,但情况也是如此literal_eval

If all you need is a user-provided dictionary, a possibly better solution is json.loads. The main limitation is that JSON dicts require string keys. Also, you can only provide literal data, but that is also the case for literal_eval.
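A minimal sketch of the json.loads approach, assuming the user types JSON rather than Python syntax (double-quoted strings, string keys). The function name parse_json_dict is made up for this example.

```python
import json


def parse_json_dict(text):
    """Parse user text as JSON and accept it only if the top level is an object."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None


print(parse_json_dict('{"a": 1, "b": [2, 3]}'))  # {'a': 1, 'b': [2, 3]}
print(parse_json_dict("{'a': 1}"))               # None (single quotes are not JSON)
print(parse_json_dict('{1: "x"}'))               # None (keys must be strings)
print(parse_json_dict("[1, 2, 3]"))              # None (valid JSON, but not a dict)
```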


回答 5

我被困住了ast.literal_eval()。我在IntelliJ IDEA调试器中尝试过它,并一直None在调试器输出中返回。

但是稍后,当我将其输出分配给变量并以代码打印时。工作正常。共享代码示例:

import ast
sample_string = '[{"id":"XYZ_GTTC_TYR", "name":"Suction"}]'
output_value = ast.literal_eval(sample_string)
print(output_value)

其python版本3.6。

I was stuck with ast.literal_eval(). I was trying it in IntelliJ IDEA debugger, and it kept returning None on debugger output.

But later, when I assigned its output to a variable and printed it in code, it worked fine. Sharing a code example:

import ast
sample_string = '[{"id":"XYZ_GTTC_TYR", "name":"Suction"}]'
output_value = ast.literal_eval(sample_string)
print(output_value)

The Python version is 3.6.


如何使用Python将文本文件读取到列表或数组中

问题:如何使用Python将文本文件读取到列表或数组中

I am trying to read the lines of a text file into a list or array in python. I just need to be able to individually access any item in the list or array after it is created.

The text file is formatted as follows:

0,0,200,0,53,1,0,255,...,0.

Where the ... appears above, the actual text file has hundreds or thousands more items.

I’m using the following code to try to read the file into a list:

text_file = open("filename.dat", "r")
lines = text_file.readlines()
print lines
print len(lines)
text_file.close()

The output I get is:

['0,0,200,0,53,1,0,255,...,0.']
1

Apparently it is reading the entire file into a list of just one item, rather than a list of individual items. What am I doing wrong?


Answer 0

You will have to split your string into a list of values using split()

So,

lines = text_file.read().split(',')

EDIT: I didn’t realise there would be so much traction to this. Here’s a more idiomatic approach.

import csv
with open('filename.csv', 'r', newline='') as fd:
    reader = csv.reader(fd)
    for row in reader:
        # do something with each row (a list of strings)
        print(row)
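If the goal is a list of numbers rather than strings, the split() approach can be extended with a conversion step. This is only a sketch: read_values is a made-up name, and it assumes every comma-separated field parses as a float (which holds for the sample line in the question, including the trailing "0."):

```python
def read_values(path):
    """Read a file of comma-separated numbers into a flat list of floats.

    read_values is a hypothetical helper; it assumes every non-empty field
    is a valid float literal.
    """
    with open(path) as text_file:
        fields = text_file.read().split(',')
    # float() tolerates surrounding whitespace and newlines; empty fields
    # (for example after a trailing comma) are skipped.
    return [float(field) for field in fields if field.strip()]
```

Individual items are then available by index, e.g. values = read_values("filename.dat") followed by values[2].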

Answer 1

You can also use NumPy's loadtxt:

from numpy import loadtxt
lines = loadtxt("filename.dat", comments="#", delimiter=",", unpack=False)

Answer 2

So you want to create a list of lists… We need to start with an empty list:

list_of_lists = []

Next, we read the file content, line by line:

with open('data') as f:
    for line in f:
        inner_list = [elt.strip() for elt in line.split(',')]
        # in alternative, if you need to use the file content as numbers
        # inner_list = [int(elt.strip()) for elt in line.split(',')]
        list_of_lists.append(inner_list)

A common use case is columnar data. Our units of storage are the rows of the file, which we have read one by one, so you may want to transpose your list of lists. This can be done with the following idiom:

by_cols = zip(*list_of_lists)

Another common use is to give a name to each column

col_names = ('apples sold', 'pears sold', 'apples revenue', 'pears revenue')
by_names = {}
for i, col_name in enumerate(col_names):
    by_names[col_name] = by_cols[i]

so that you can operate on homogeneous data items

mean_apple_prices = [money/fruits for money, fruits in
                     zip(by_names['apples revenue'], by_names['apples sold'])]

Most of what I've written can be sped up using the csv module from the standard library. Another option is the third-party module pandas, which lets you automate most aspects of a typical data analysis (but has a number of dependencies).


Update: While in Python 2 zip(*list_of_lists) returns a new (transposed) list of tuples, in Python 3 the situation has changed and zip(*list_of_lists) returns a zip object that is not subscriptable.

If you need indexed access you can use

by_cols = list(zip(*list_of_lists))

which gives you a list of tuples in both versions of Python.

On the other hand, if you don’t need indexed access and what you want is just to build a dictionary indexed by column names, a zip object is just fine…

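with open('some_data.csv') as file:
    # get_names() is the author's own helper (not shown) that parses the header line
    names = get_names(next(file))
    # Transpose the remaining rows into columns
    columns = zip(*((x.strip() for x in line.split(',')) for line in file))
    d = {}
    for name, column in zip(names, columns):
        d[name] = column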

Answer 3

This question is asking how to read the comma-separated value contents from a file into an iterable list:

0,0,200,0,53,1,0,255,...,0.

The easiest way to do this is with the csv module as follows:

import csv
with open('filename.dat', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
    # Iterate while the file is still open
    for row in spamreader:
        print(', '.join(row))

Iterate over spamreader inside the with block: once the block exits, the file is closed and the reader can no longer be used.

See documentation for more examples.


How to log server errors on Django sites

Question: How to log server errors on Django sites

So, during development I can just set settings.DEBUG to True, and if an error occurs I can see it nicely formatted, with a good stack trace and request information.

But on a production site I'd rather use DEBUG=False and show visitors a standard error 500 page saying that I'm working on fixing the bug 😉
At the same time, I'd like some way of logging all that information (stack trace and request info) to a file on my server, so I can output it to my console and watch errors scroll by, email the log to myself every hour, or something like that.

What logging solutions would you recommend for a Django site that would meet those simple requirements? I have the application running as an fcgi server and I'm using the Apache web server as a frontend (although I'm thinking of moving to lighttpd).


Answer 0

Well, when DEBUG = False, Django will automatically mail a full traceback of any error to each person listed in the ADMINS setting, which gets you notifications pretty much for free. If you’d like more fine-grained control, you can write and add to your settings a middleware class which defines a method named process_exception(), which will have access to the exception that was raised:

http://docs.djangoproject.com/en/dev/topics/http/middleware/#process-exception

Your process_exception() method can then perform whatever type of logging you’d like: writing to console, writing to a file, etc., etc.

Edit: though it’s a bit less useful, you can also listen for the got_request_exception signal, which will be sent whenever an exception is encountered during request processing:

http://docs.djangoproject.com/en/dev/ref/signals/#got-request-exception

This does not give you access to the exception object, however, so the middleware method is much easier to work with.
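As a rough illustration of the middleware approach, here is a sketch written in the newer callable-class middleware style (an assumption on my part; the answer predates it). The class name ExceptionLoggingMiddleware and the logger name "myapp.errors" are invented for the example, and the class would still need to be listed in your middleware setting:

```python
import logging

# "myapp.errors" is a hypothetical logger name; use whatever your project defines.
logger = logging.getLogger("myapp.errors")


class ExceptionLoggingMiddleware:
    """Logs unhandled view exceptions, then lets Django's normal handling continue."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        return self.get_response(request)

    def process_exception(self, request, exception):
        # Record the request path and the full traceback.
        logger.error(
            "Unhandled exception for %s",
            getattr(request, "path", "<unknown>"),
            exc_info=(type(exception), exception, exception.__traceback__),
        )
        # Returning None tells Django to carry on with its default
        # exception handling (for example, the 500 page).
        return None
```

Where the log records end up (console, file, email) is then decided by the logging configuration rather than by the middleware itself.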


Answer 1

Django Sentry is a good way to go, as already mentioned, but there is a bit of work involved in setting it up properly (as a separate website). If you just want to log everything to a simple text file, here's the logging configuration to put in your settings.py:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        # Include the default Django email handler for errors
        # This is what you'd get without configuring logging at all.
        'mail_admins': {
            'class': 'django.utils.log.AdminEmailHandler',
            'level': 'ERROR',
             # But the emails are plain text by default - HTML is nicer
            'include_html': True,
        },
        # Log to a text file that can be rotated by logrotate
        'logfile': {
            'class': 'logging.handlers.WatchedFileHandler',
            'filename': '/var/log/django/myapp.log'
        },
    },
    'loggers': {
        # Again, default Django configuration to email unhandled exceptions
        'django.request': {
            'handlers': ['mail_admins'],
            'level': 'ERROR',
            'propagate': True,
        },
        # Might as well log any errors anywhere else in Django
        'django': {
            'handlers': ['logfile'],
            'level': 'ERROR',
            'propagate': False,
        },
        # Your own app - this assumes all your logger names start with "myapp."
        'myapp': {
            'handlers': ['logfile'],
            'level': 'WARNING', # Or maybe INFO or DEBUG
            'propagate': False
        },
    },
}
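To show how application code would feed the "myapp" logger from a configuration like this, here is a standalone sketch using only the standard library. It is a trimmed-down version of the config, the temporary log path stands in for /var/log/django/myapp.log, and the logger name "myapp.views" is just an example of a name starting with "myapp.":

```python
import logging
import logging.config
import os
import tempfile

# Temporary stand-in for /var/log/django/myapp.log so the sketch runs anywhere.
log_path = os.path.join(tempfile.mkdtemp(), "myapp.log")

logging.config.dictConfig({
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "logfile": {
            "class": "logging.handlers.WatchedFileHandler",
            "filename": log_path,
        },
    },
    "loggers": {
        "myapp": {"handlers": ["logfile"], "level": "WARNING", "propagate": False},
    },
})

# "myapp.views" is a child of "myapp", so it inherits that level and handler.
logger = logging.getLogger("myapp.views")
logger.info("not written: below the WARNING threshold")
logger.warning("written to the log file")

with open(log_path) as log_file:
    contents = log_file.read()
```

In a Django project, dictConfig is called for you from the LOGGING setting, so a view module only needs the logging.getLogger(...) line and the logging calls.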

Answer 2

django-db-log, mentioned in another answer, has been replaced with:

https://github.com/dcramer/django-sentry


Answer 3

Obviously James is correct, but if you wanted to log exceptions in a datastore, there are a few open source solutions already available:

1) CrashLog is a good choice: http://code.google.com/p/django-crashlog/

2) Db-Log is a good choice as well: http://code.google.com/p/django-db-log/

What is the difference between the two? Almost nothing that I can see, so either one will suffice.

I’ve used both and they work well.


Answer 4

Some time has passed since EMP’s most helpful code submission. I just now implemented it, and while thrashing around with some manage.py option, to try to chase down a bug, I got a deprecation warning to the effect that with my current version of Django (1.5.?) a require_debug_false filter is now needed for the mail_admins handler.

Here is the revised code:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'filters': {
         'require_debug_false': {
             '()': 'django.utils.log.RequireDebugFalse'
         }
     },
    'handlers': {
        # Include the default Django email handler for errors
        # This is what you'd get without configuring logging at all.
        'mail_admins': {
            'class': 'django.utils.log.AdminEmailHandler',
            'level': 'ERROR',
            'filters': ['require_debug_false'],
             # But the emails are plain text by default - HTML is nicer
            'include_html': True,
        },
        # Log to a text file that can be rotated by logrotate
        'logfile': {
            'class': 'logging.handlers.WatchedFileHandler',
            'filename': '/home/username/public_html/djangoprojectname/logfilename.log'
        },
    },
    'loggers': {
        # Again, default Django configuration to email unhandled exceptions
        'django.request': {
            'handlers': ['mail_admins'],
            'level': 'ERROR',
            'propagate': True,
        },
        # Might as well log any errors anywhere else in Django
        'django': {
            'handlers': ['logfile'],
            'level': 'ERROR',
            'propagate': False,
        },
        # Your own app - this assumes all your logger names start with "myapp."
        'myapp': {
            'handlers': ['logfile'],
            'level': 'DEBUG', # Or maybe INFO or WARNING
            'propagate': False
        },
    },
}

Answer 5

I just had an annoying problem with my fcgi script. It occurred before django even started. The lack of logging is sooo painful. Anyway, redirecting stderr to a file as the very first thing helped a lot:

#!/home/user/env/bin/python
import sys

# Send anything written to stderr (including startup tracebacks) to a file
sys.stderr = open('/home/user/fcgi_errors', 'a')