Bulk indentation for Python in Emacs

Question: Bulk indentation for Python in Emacs

When working with Python in Emacs, if I want to add a try/except around a block of code, I often find myself indenting the whole block line by line. In Emacs, how do you indent the whole block at once?

I am not an experienced Emacs user, but I find it the best tool for working over ssh. I am using Emacs on the command line (Ubuntu), not as a GUI, if that makes any difference.


Answer 0

If you are programming Python using Emacs, then you should probably be using python-mode. With python-mode, after marking the block of code,

C-c > or C-c C-l shifts the region 4 spaces to the right

C-c < or C-c C-r shifts the region 4 spaces to the left

If you need to shift code by two levels of indentation, or by some arbitrary amount, you can prefix the command with an argument:

C-u 8 C-c > shifts the region 8 spaces to the right

C-u 8 C-c < shifts the region 8 spaces to the left

Another alternative is to use M-x indent-rigidly which is bound to C-x TAB:

C-u 8 C-x TAB shifts the region 8 spaces to the right

C-u -8 C-x TAB shifts the region 8 spaces to the left
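To make the effect of a rigid shift concrete outside the editor, here is a minimal Python sketch. It is not how Emacs implements indent-rigidly; it only mimics the result with the standard textwrap module, and the sample block and the names block, shifted, wrapped and restored are made up for illustration.

```python
import textwrap

# A made-up block of code that we want to wrap in try/except.
block = "x = compute()\nprint(x)\n"

# Shift every non-blank line right by 4 spaces, comparable to
# marking the region and pressing C-u 4 C-x TAB.
shifted = textwrap.indent(block, " " * 4)

# The shifted block now fits under a try: header.
wrapped = "try:\n" + shifted + "except ValueError:\n    pass\n"

# Remove the common leading whitespace again, comparable to
# shifting the region left by the same amount.
restored = textwrap.dedent(shifted)

print(wrapped)
```

Relative indentation inside the block is preserved in both directions, which is the property that makes rigid shifting safe for Python code.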

Also useful are the rectangle commands that operate on rectangles of text instead of lines of text.

For example, after marking a rectangular region,

C-x r o inserts blank space to fill the rectangular region (effectively shifting code to the right)

C-x r k kills the rectangular region (effectively shifting code to the left)

C-x r t prompts for a string to replace the rectangle with. Entering C-u 8 <space> will then enter 8 spaces.

PS. With Ubuntu, to make python-mode the default mode for all .py files, simply install the python-mode package.


Answer 1

In addition to indent-region, which is mapped to C-M-\ by default, the rectangle edit commands are very useful for Python. Mark a region as normal, then:

  • C-x r t (string-rectangle): will prompt you for characters you’d like to insert into each line; great for inserting a certain number of spaces
  • C-x r k (kill-rectangle): remove a rectangle region; great for removing indentation

You can also C-x r y (yank-rectangle), but that’s only rarely useful.


Answer 2

indent-region mapped to C-M-\ should do the trick.


Answer 3

I’ve been using this function to handle my indenting and unindenting:

(defun unindent-dwim (&optional count-arg)
  "Keeps relative spacing in the region.  Unindents to the next multiple of the current tab-width"
  (interactive)
  (let ((deactivate-mark nil)
        (beg (or (and mark-active (region-beginning)) (line-beginning-position)))
        (end (or (and mark-active (region-end)) (line-end-position)))
        (min-indentation)
        (count (or count-arg 1)))
    (save-excursion
      (goto-char beg)
      (while (< (point) end)
        ;; `push' also works under lexical-binding, where `add-to-list'
        ;; on a let-bound variable does not.
        (push (current-indentation) min-indentation)
        (forward-line)))
    (if (< 0 count)
        (if (not (< 0 (apply 'min min-indentation)))
            (error "Can't unindent any more.  Try `indent-rigidly` with a negative arg.")))
    (if (> 0 count)
        (indent-rigidly beg end (* (- 0 tab-width) count))
      (let (
            (indent-amount
             (apply 'min (mapcar (lambda (x) (- 0 (mod x tab-width))) min-indentation))))
        (indent-rigidly beg end (or
                                 (and (< indent-amount 0) indent-amount)
                                 (* (or count 1) (- 0 tab-width))))))))

And then I assign it to a keyboard shortcut:

(global-set-key (kbd "s-[") 'unindent-dwim)
(global-set-key (kbd "s-]") (lambda () (interactive) (unindent-dwim -1)))

Answer 4

I’m an Emacs newb, so this answer is probably bordering on useless.

None of the answers mentioned so far cover re-indentation of literals like dict or list. E.g. M-x indent-region or M-x python-indent-shift-right and company aren’t going to help if you’ve cut-and-pasted the following literal and need it to be re-indented sensibly:

    foo = {
  'bar' : [
     1,
    2,
        3 ],
      'baz' : {
     'asdf' : {
        'banana' : 1,
        'apple' : 2 } } }

It feels like M-x indent-region should do something sensible in python-mode, but that’s not (yet) the case.

For the specific case where your literals are bracketed, using TAB on the lines in question gets what you want (because whitespace doesn’t play a role).

So in such cases I quickly record a keyboard macro such as <f3> C-n TAB <f4> (that is: F3, Ctrl-n or down arrow, TAB, F4) and then press F4 repeatedly to apply it, which saves a few keystrokes. Alternatively, C-u 10 C-x e applies it 10 times.

(I know it doesn’t sound like much, but try re-indenting 100 lines of garbage literal without missing down-arrow, and then having to go up 5 lines and repeat things ;) ).


Answer 5

I use the following snippet. On TAB, when the selection is inactive, it indents the current line (as it normally does); when the selection is active, it shifts the whole region to the right.

(defun my-python-tab-command (&optional _)
  "If the region is active, shift to the right; otherwise, indent current line."
  (interactive)
  (if (not (region-active-p))
      (indent-for-tab-command)
    (let ((lo (min (region-beginning) (region-end)))
          (hi (max (region-beginning) (region-end))))
      (goto-char lo)
      (beginning-of-line)
      (set-mark (point))
      (goto-char hi)
      (end-of-line)
      (python-indent-shift-right (mark) (point)))))
(define-key python-mode-map [remap indent-for-tab-command] 'my-python-tab-command)

Answer 6

Do indentation interactively.

  1. Select the region to be indented.
  2. C-x TAB.
  3. Use arrows (<- and ->) to indent interactively.
  4. Press Esc three times when you are done with the required indentation.

Copied from my post in: Indent several lines in Emacs


Answer 7

I do something like this universally

;; indent whole buffer
(defun iwb ()
  "indent whole buffer"
  (interactive)
  ;;(delete-trailing-whitespace)
  (indent-region (point-min) (point-max) nil)
  (untabify (point-min) (point-max)))

Confusion between numpy, scipy, matplotlib and pylab

Question: Confusion between numpy, scipy, matplotlib and pylab

Numpy, scipy, matplotlib, and pylab are common terms among those who use Python for scientific computation.

I just learned a bit about pylab, and I got confused. Whenever I want to import numpy, I can always do:

import numpy as np

I assumed that once I do

from pylab import *

numpy is imported as well (with the np alias). So basically the second one does more than the first.

There are a few things I want to ask:

  1. Is it right that pylab is just a wrapper for numpy, scipy and matplotlib?
  2. As np is the numpy alias in pylab, what is the scipy and matplotlib alias in pylab? (as far as I know, plt is alias of matplotlib.pyplot, but I don’t know the alias for the matplotlib itself)

Answer 0

  1. No, pylab is part of matplotlib (in matplotlib.pylab) and tries to give you a MATLAB-like environment. matplotlib has a number of dependencies, among them numpy, which it imports under the common alias np. scipy is not a dependency of matplotlib.

  2. If you run ipython --pylab, an automatic import will put all symbols from matplotlib.pylab into global scope. As you wrote, numpy gets imported under the np alias. Symbols from matplotlib are available under the mpl alias.


Answer 1

Scipy and numpy are scientific projects whose aim is to bring efficient and fast numeric computing to python.

Matplotlib is the name of the python plotting library.

Pyplot is an interactive api for matplotlib, mostly for use in notebooks like jupyter. You generally use it like this: import matplotlib.pyplot as plt.

Pylab is the same thing as pyplot, but with extra features (its use is currently discouraged).

  • pylab = pyplot + numpy

See more information here: Matplotlib, Pylab, Pyplot, etc: What’s the difference between these and when to use each?


Answer 2

Since some people (like me) may still be confused about pylab because examples using it are still out there on the internet, here is a quote from the official matplotlib FAQ:

pylab is a convenience module that bulk imports matplotlib.pyplot (for plotting) and numpy (for mathematics and working with arrays) in a single name space. Although many examples use pylab, it is no longer recommended.

So the TL;DR is: do not use pylab, period. Use pyplot and import numpy separately as needed.
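As a rough illustration of that advice, the sketch below uses explicit imports in place of `from pylab import *`. It assumes numpy and matplotlib are installed; the Agg backend is selected only so the example runs without a display, and the plotted data is made up.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, so use a non-interactive backend

import matplotlib.pyplot as plt  # plotting lives under the plt alias
import numpy as np               # arrays and math live under the np alias

# Made-up data: one period of a sine wave.
x = np.linspace(0.0, 2.0 * np.pi, 50)
y = np.sin(x)

# Every name below is traceable to np or plt; nothing comes from a star import.
fig, ax = plt.subplots()
ax.plot(x, y, label="sin(x)")
ax.set_xlabel("x")
ax.legend()

n_lines = len(ax.lines)
plt.close(fig)
```

With explicit aliases it stays obvious which library each function comes from, which is the main readability argument against pylab.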

Here is the link for further reading and other useful examples.


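What is the fastest way to check if a class has a function defined?

Question: What is the fastest way to check if a class has a function defined?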

I’m writing an AI state space search algorithm, and I have a generic class which can be used to quickly implement a search algorithm. A subclass would define the necessary operations, and the algorithm does the rest.

Here is where I get stuck: I want to avoid regenerating the parent state over and over again, so I have the following function, which returns the operations that can be legally applied to any state:

def get_operations(self, include_parent=True):
    ops = self._get_operations()
    if not include_parent and self.path.parent_op:
        try:
            parent_inverse = self.invert_op(self.path.parent_op)
            ops.remove(parent_inverse)
        except NotImplementedError:
            pass
    return ops

And the invert_op function throws by default.

Is there a faster way to check to see if the function is not defined than catching an exception?

I was thinking of something along the lines of checking for its presence in dir, but that doesn’t seem right. hasattr is implemented by calling getattr and checking if it raises, which is not what I want.


Answer 0

Yes, use getattr() to get the attribute, and callable() to verify it is a method:

invert_op = getattr(self, "invert_op", None)
if callable(invert_op):
    invert_op(self.path.parent_op)

Note that getattr() normally raises an exception (AttributeError) when the attribute doesn’t exist. However, if you specify a default value (None, in this case), it will return that instead.
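A self-contained sketch of this pattern might look as follows. The classes State and ReversibleState and the helper inverse_or_none are hypothetical names invented for the example; only getattr() and callable() come from the answer.

```python
class State:
    """Hypothetical base class that defines no invert_op at all."""

    def __init__(self, ops):
        self.ops = list(ops)


class ReversibleState(State):
    """Hypothetical subclass that knows how to invert an operation."""

    def invert_op(self, op):
        return -op


def inverse_or_none(state, op):
    # Look the method up once; None is returned if it is missing,
    # so no exception is raised.
    invert_op = getattr(state, "invert_op", None)
    if callable(invert_op):
        return invert_op(op)
    return None


plain_result = inverse_or_none(State([1, 2]), 2)
reversible_result = inverse_or_none(ReversibleState([1, 2]), 2)
print(plain_result, reversible_result)
```

Here the base class simply omits the method. If the base class instead defines invert_op and raises NotImplementedError, as in the question, this check alone would still report the method as present.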


Answer 1

It works in both Python 2 and Python 3

hasattr(connection, 'invert_opt')

hasattr returns True if the connection object has an attribute named invert_opt (it does not check that the attribute is callable). Here is the documentation:

https://docs.python.org/2/library/functions.html#hasattr https://docs.python.org/3/library/functions.html#hasattr


Answer 2

Is there a faster way to check to see if the function is not defined than catching an exception?

Why are you against that? In most Pythonic cases, it’s better to ask forgiveness than permission. ;-)

hasattr is implemented by calling getattr and checking if it raises, which is not what I want.

Again, why is that? The following is quite Pythonic:

    try:
        invert_op = self.invert_op
    except AttributeError:
        pass
    else:
        parent_inverse = invert_op(self.path.parent_op)
        ops.remove(parent_inverse)

Or,

    # if you supply the optional `default` parameter, no exception is thrown
    invert_op = getattr(self, 'invert_op', None)  
    if invert_op is not None:
        parent_inverse = invert_op(self.path.parent_op)
        ops.remove(parent_inverse)

Note, however, that getattr(obj, attr, default) is basically implemented by catching an exception, too. There is nothing wrong with that in Python land!


Answer 3

The responses herein check if a string is the name of an attribute of the object. An extra step (using callable) is needed to check if the attribute is a method.

So it boils down to: what is the fastest way to check if an object obj has an attribute attrib. The answer is

'attrib' in obj.__dict__

This is so because a dict hashes its keys so checking for the key’s existence is fast.

See timing comparisons below.

>>> class SomeClass():
...         pass
...
>>> obj = SomeClass()
>>>
>>> getattr(obj, "invert_op", None)
>>>
>>> %timeit getattr(obj, "invert_op", None)
1000000 loops, best of 3: 723 ns per loop
>>> %timeit hasattr(obj, "invert_op")
The slowest run took 4.60 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 674 ns per loop
>>> %timeit "invert_op" in obj.__dict__
The slowest run took 12.19 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 176 ns per loop
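One caveat is worth seeing in code: an instance's `__dict__` holds only attributes stored on the instance itself, while methods normally live in the class's `__dict__`. The sketch below uses a made-up SomeClass with one method and one instance attribute, and assumes a class without `__slots__`.

```python
class SomeClass:
    def invert_op(self, op):  # a method: stored on the class
        return op


obj = SomeClass()
obj.cached = 42               # an instance attribute: stored on the instance

# The instance dict sees only what was assigned on the instance.
in_instance = "invert_op" in obj.__dict__
cached_in_instance = "cached" in obj.__dict__

# The method is found in the class dict instead.
in_class = "invert_op" in type(obj).__dict__

# To cover inherited methods too, walk the method resolution order.
in_any_class = any("invert_op" in klass.__dict__ for klass in type(obj).__mro__)

print(in_instance, cached_in_instance, in_class, in_any_class)
```

So the fast dict lookup answers "is this stored on the instance?", which is a narrower question than the one getattr() or hasattr() answers.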

Answer 4

I like Nathan Ostgard’s answer and I up-voted it. But another way you could solve your problem would be to use a memoizing decorator, which would cache the result of the function call. So you can go ahead and have an expensive function that figures something out, but then when you call it over and over the subsequent calls are fast; the memoized version of the function looks up the arguments in a dict, finds the result in the dict from when the actual function computed the result, and returns the result right away.

Here is a recipe for a memoizing decorator called “lru_cache” by Raymond Hettinger. A version of this is now standard in the functools module in Python 3.2.

http://code.activestate.com/recipes/498245-lru-and-lfu-cache-decorators/

http://docs.python.org/release/3.2/library/functools.html
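A minimal sketch of memoization with functools.lru_cache, assuming Python 3.2 or later. The function expensive_inverse and the calls list are hypothetical stand-ins for whatever costly computation the search actually performs; arguments must be hashable for the cache to work.

```python
from functools import lru_cache

calls = []  # records every time the real function body runs


@lru_cache(maxsize=None)
def expensive_inverse(op):
    # Stand-in for an expensive computation.
    calls.append(op)
    return -op


first = expensive_inverse(3)   # computed
second = expensive_inverse(3)  # served from the cache
third = expensive_inverse(4)   # computed

info = expensive_inverse.cache_info()
print(calls, info.hits, info.misses)
```

The body runs once per distinct argument; repeated calls with the same argument return the stored result.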


Answer 5

Like anything in Python, if you try hard enough, you can get at the guts and do something really nasty. Now, here’s the nasty part:

def invert_op(self, op):
    raise NotImplementedError

def is_invert_op_implemented(self):
    # Only works in CPython 2.x of course
    return self.invert_op.__code__.co_code == 't\x00\x00\x82\x01\x00d\x00\x00S'

Please do us a favor, just keep doing what you have in your question and DON’T ever use this unless you are on the PyPy team hacking into the Python interpreter. What you have up there is Pythonic, what I have here is pure EVIL.
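For comparison, a less fragile way to ask "has a subclass overridden invert_op?" is to compare the function found on the instance's class with the one on the base class. This is a different technique from the bytecode comparison above, not a variant of it. The sketch assumes Python 3, and the class names SearchState and Reversible are made up.

```python
class SearchState:
    def invert_op(self, op):
        raise NotImplementedError

    def is_invert_op_implemented(self):
        # In Python 3, accessing a method on a class yields the plain
        # function, so identity tells us whether it was overridden.
        return type(self).invert_op is not SearchState.invert_op


class Reversible(SearchState):
    def invert_op(self, op):
        return -op


base_has_it = SearchState().is_invert_op_implemented()
subclass_has_it = Reversible().is_invert_op_implemented()
print(base_has_it, subclass_has_it)
```

The check costs two attribute lookups and raises no exception, and it does not depend on interpreter internals.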


Answer 6

You can also go over the class:

import inspect


def get_methods(cls_):
    methods = inspect.getmembers(cls_, inspect.isfunction)
    return dict(methods)

# Example
class A(object):
    pass

class B(object):
    def foo(self):
        print('B')


# If you only have an object, you can use `cls_ = obj.__class__`
if 'foo' in get_methods(A):
    print('A has foo')

if 'foo' in get_methods(B):
    print('B has foo')

Answer 7

While checking for attributes in an instance’s __dict__ is really fast, you cannot use this for methods, since they are stored in the class’s __dict__ and do not appear in the instance’s __dict__. You could, however, resort to a hackish workaround in your class if performance is that critical:

class Test():
    def __init__(self):
        # redefine your method as attribute
        self.custom_method = self.custom_method

    def custom_method(self):
        pass

Then check for method as:

t = Test()
'custom_method' in t.__dict__

Time comparison with getattr:

>>%timeit 'custom_method' in t.__dict__
55.9 ns ± 0.626 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

>>%timeit getattr(t, 'custom_method', None)
116 ns ± 0.765 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Not that I’m encouraging this approach, but it seems to work.

[EDIT] Performance boost is even higher when method name is not in given class:

>>%timeit 'rubbish' in t.__dict__
65.5 ns ± 11 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

>>%timeit getattr(t, 'rubbish', None)
385 ns ± 12.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Python, creating objects

Question: Python, creating objects

I’m trying to learn Python, and now I am trying to get the hang of classes and how to manipulate them with instances.

I can’t seem to understand this practice problem:

Create and return a student object whose name, age, and major are the same as those given as input

def make_student(name, age, major)

I just don’t get what it means by object. Do they mean I should create an array inside the function that holds these values? Or create a class, put this function inside it, and assign instances? (Before this question I was asked to set up a Student class with name, age, and major inside.)

class Student:
    name = "Unknown name"
    age = 0
    major = "Unknown major"

Answer 0

class Student(object):
    name = ""
    age = 0
    major = ""

    # The class "constructor" - It's actually an initializer 
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

def make_student(name, age, major):
    student = Student(name, age, major)
    return student

Note that even though one of the principles in Python’s philosophy is “there should be one-- and preferably only one --obvious way to do it”, there are still multiple ways to do this. You can also use the following snippet of code to take advantage of Python’s dynamic capabilities:

class Student(object):
    name = ""
    age = 0
    major = ""

def make_student(name, age, major):
    student = Student()
    student.name = name
    student.age = age
    student.major = major
    # Note: I didn't need to create a variable in the class definition before doing this.
    student.gpa = float(4.0)
    return student

I prefer the former, but there are instances where the latter can be useful – one being when working with document databases like MongoDB.
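To see what the dynamic variant does under the hood, the sketch below distinguishes class attributes (shared defaults) from instance attributes (created on assignment). It reuses the Student and make_student names from above in a trimmed form; the sample values are made up.

```python
class Student(object):
    name = ""  # class attribute: a default shared by all instances


def make_student(name):
    student = Student()
    student.name = name  # creates an instance attribute that shadows the default
    student.gpa = 4.0    # creates an attribute the class never declared
    return student


ann = make_student("Ann")

instance_attrs = vars(ann)                # only what was set on this instance
class_default = Student.name              # the class default is untouched
class_has_gpa = hasattr(Student, "gpa")   # gpa exists only on the instance
print(instance_attrs, repr(class_default), class_has_gpa)
```

Assigning through the instance never modifies the class, which is why each student keeps its own values.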


Answer 1

Create a class and give it an __init__ method:

class Student:
    def __init__(self, name, age, major):
        self.name = name
        self.age = age
        self.major = major

    def is_old(self):
        return self.age > 100

Now, you can initialize an instance of the Student class:

>>> s = Student('John', 88, None)
>>> s.name
    'John'
>>> s.age
    88

Although I’m not sure why you need a make_student function if it does the same thing as Student.__init__.


Answer 2

Objects are instances of classes. Classes are just the blueprints for objects. So given your class definition –

# Note the added (object) - this is the preferred way of creating new classes
class Student(object):
    name = "Unknown name"
    age = 0
    major = "Unknown major"

You can create a make_student function by explicitly assigning the attributes to a new instance of Student –

def make_student(name, age, major):
    student = Student()
    student.name = name
    student.age = age
    student.major = major
    return student

But it probably makes more sense to do this in a constructor (__init__) –

class Student(object):
    def __init__(self, name="Unknown name", age=0, major="Unknown major"):
        self.name = name
        self.age = age
        self.major = major

The constructor is called when you use Student(). It will take the arguments defined in the __init__ method. The constructor signature would now essentially be Student(name, age, major).

If you use that, then a make_student function is trivial (and superfluous) –

def make_student(name, age, major):
    return Student(name, age, major)
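Put together, a short usage sketch of the constructor with default arguments could look like this. The class and function are the ones defined above; the sample values are invented.

```python
class Student(object):
    def __init__(self, name="Unknown name", age=0, major="Unknown major"):
        self.name = name
        self.age = age
        self.major = major


def make_student(name, age, major):
    return Student(name, age, major)


# With no arguments, every attribute falls back to its default.
default_student = Student()

# With arguments, the values are stored on the new instance.
ann = make_student("Ann", 20, "Physics")

print(default_student.name, default_student.age)
print(ann.name, ann.age, ann.major)
```

Each call produces an independent object, so changing ann.age later has no effect on default_student.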

For fun, here is an example of how to create a make_student function without defining a class. Please do not try this at home.

def make_student(name, age, major):
    return type('Student', (object,),
                {'name': name, 'age': age, 'major': major})()

Answer 3

When you create an object from a predefined class, you first create a variable to hold that object, then create the object and assign it to that variable.

class Student:
    def __init__(self):
        pass  # nothing to initialize yet

# creating an object
student1 = Student()

The __init__ method is effectively the constructor of the class. You can give it parameters for the attributes you want to initialize; then, when you create an object, you have to pass values for those attributes.

class Student:
    def __init__(self, name, age):
        self.name = name
        self.age = age

# creating an object
student2 = Student("smith", 25)

Numpy where function multiple conditions

Question: Numpy where function multiple conditions

I have an array of distances called dists. I want to select dists which are between two values. I wrote the following line of code to do that:

 dists[(np.where(dists >= r)) and (np.where(dists <= r + dr))]

However this selects only for the condition

 (np.where(dists <= r + dr))

If I do the commands sequentially by using a temporary variable it works fine. Why does the above code not work, and how do I get it to work?

Cheers


Answer 0

The best way in your particular case would just be to change your two criteria to one criterion:

dists[abs(dists - r - dr/2.) <= dr/2.]

It only creates one boolean array, and in my opinion is easier to read because it asks: is dist within dr/2 of the center of the interval? (Though I’d redefine r to be the center of your region of interest instead of the beginning, so r = r + dr/2.) But that doesn’t answer your question.


The answer to your question:
You don’t actually need where if you’re just trying to filter out the elements of dists that don’t fit your criteria:

dists[(dists >= r) & (dists <= r+dr)]

Because the & will give you an elementwise and (the parentheses are necessary).

Or, if you do want to use where for some reason, you can do:

 dists[(np.where((dists >= r) & (dists <= r + dr)))]

Why:
The reason it doesn’t work is that np.where returns a tuple of index arrays, not a boolean array. You’re applying and to two tuples of indices, which don’t carry the element-wise True/False values that you expect. If a and b are both truthy, then a and b returns b. So saying something like [0,1,2] and [2,3,4] will just give you [2,3,4]. Here it is in action:

In [230]: dists = np.arange(0,10,.5)
In [231]: r = 5
In [232]: dr = 1

In [233]: np.where(dists >= r)
Out[233]: (array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19]),)

In [234]: np.where(dists <= r+dr)
Out[234]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)

In [235]: np.where(dists >= r) and np.where(dists <= r+dr)
Out[235]: (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]),)
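
The behaviour in the session above does not depend on NumPy at all. As a rough sketch, with plain lists standing in for the index arrays that np.where returns (an assumption made only to keep the example dependency-free), Python's and simply hands back one of its operands:

```python
# np.where returns a tuple of index arrays; plain lists stand in for them here.
first = ([10, 11, 12],)
second = ([0, 1, 2],)

# A non-empty tuple is truthy, so `and` evaluates to its second operand.
result = first and second
print(result is second)

# An empty tuple is falsy, so `and` short-circuits and returns it unchanged.
print(() and second)
```

No element-wise comparison happens in either case, which is why only the second condition appears to be applied.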

What you were expecting to compare was simply the boolean array, for example

In [236]: dists >= r
Out[236]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True], dtype=bool)

In [237]: dists <= r + dr
Out[237]: 
array([ True,  True,  True,  True,  True,  True,  True,  True,  True,
        True,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

In [238]: (dists >= r) & (dists <= r + dr)
Out[238]: 
array([False, False, False, False, False, False, False, False, False,
       False,  True,  True,  True, False, False, False, False, False,
       False, False], dtype=bool)

Now you can call np.where on the combined boolean array:

In [239]: np.where((dists >= r) & (dists <= r + dr))
Out[239]: (array([10, 11, 12]),)

In [240]: dists[np.where((dists >= r) & (dists <= r + dr))]
Out[240]: array([ 5. ,  5.5,  6. ])

Or simply index the original array with the boolean array using fancy indexing

In [241]: dists[(dists >= r) & (dists <= r + dr)]
Out[241]: array([ 5. ,  5.5,  6. ])

Answer 1

The accepted answer explained the problem well enough. However, the more Numpythonic approach for applying multiple conditions is to use numpy logical functions. In this case you can use np.logical_and:

np.where(np.logical_and(np.greater_equal(dists,r),np.less_equal(dists,r + dr)))
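
A small sketch of the function-based form, assuming the same sample values used elsewhere in this thread (dists from 0 to 10 in steps of 0.5, r = 5, dr = 1). It also checks that the operator form builds the same mask.

```python
import numpy as np

dists = np.arange(0, 10, 0.5)
r, dr = 5, 1

# Element-wise AND of the two comparisons, written with NumPy's functions.
mask = np.logical_and(np.greater_equal(dists, r), np.less_equal(dists, r + dr))

# The operator form builds the same boolean mask.
same = np.array_equal(mask, (dists >= r) & (dists <= r + dr))

selected = dists[mask]
print(selected)
```

Which spelling to use is largely a matter of taste; the function form avoids the parentheses that the & operator requires.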

Answer 2

One interesting thing to point out here: the usual way of combining conditions with OR and AND also works in this case, with a small change. Instead of “and” and “or”, use the ampersand (&) and the pipe operator (|).

When we use ‘and’:

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) and (ar<6), 'yo', ar)

Output:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

When we use Ampersand(&):

ar = np.array([3,4,5,14,2,4,3,7])
np.where((ar>3) & (ar<6), 'yo', ar)

Output:
array(['3', 'yo', 'yo', '14', '2', 'yo', '3', '7'], dtype='<U11')

The same holds when applying multiple filters to a pandas DataFrame. The reason has to do with the difference between logical operators and bitwise operators; for more background, I’d suggest going through this answer or a similar Q/A on Stack Overflow.

UPDATE

A user asked why (ar>3) and (ar<6) need to be inside parentheses. Before discussing what’s happening here, one needs to know about operator precedence in Python.

Similar to BODMAS in arithmetic, Python defines which operations are performed first. Items inside parentheses are evaluated first, and then the bitwise operator is applied. Below I show what happens in both cases, with and without the parentheses.

Case1:

np.where( ar>3 & ar<6, 'yo', ar)
np.where( np.array([3,4,5,14,2,4,3,7])>3 & np.array([3,4,5,14,2,4,3,7])<6, 'yo', ar)

Since there are no parentheses here, the bitwise operator (&) is applied first, because in the operator precedence table & binds more tightly than the < and > operators. Python therefore reads the condition as ar > (3 & ar) < 6, a chained comparison equivalent to (ar > (3 & ar)) and ((3 & ar) < 6).

So the intended < and > comparisons are never performed on their own; instead, and is applied to whole arrays, and that’s why it gives that error.

One can check out the following link to learn more about: operator precedence
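
The precedence rule can be checked without NumPy. The sketch below is only an illustration: it uses the standard ast module to inspect how Python parses the unparenthesized condition, and a plain integer (a = 7, chosen arbitrarily) to show that the two spellings can give different results.

```python
import ast

# Parse the unparenthesized expression and inspect its structure.
tree = ast.parse("a > 3 & a < 6", mode="eval").body

# The top level is one chained comparison: a > (3 & a) < 6.
is_chained = isinstance(tree, ast.Compare) and len(tree.ops) == 2
middle = tree.comparators[0]
middle_is_bitand = isinstance(middle, ast.BinOp) and isinstance(middle.op, ast.BitAnd)

# With a plain integer the two spellings can disagree.
a = 7
without_parens = a > 3 & a < 6      # 7 > (3 & 7) < 6  ->  7 > 3 < 6
with_parens = (a > 3) & (a < 6)     # True & False
print(is_chained, middle_is_bitand, without_parens, with_parens)
```

With arrays instead of integers, the chained comparison calls for the truth value of a whole array, which is where the ValueError comes from.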

Now to Case 2:

If you do use the bracket, you clearly see what happens.

np.where( (ar>3) & (ar<6), 'yo', ar)
np.where( (array([False,  True,  True,  True, False,  True, False,  True])) & (array([ True,  True,  True, False,  True,  True,  True, False])), 'yo', ar)

Two arrays of True and False. And you can easily perform logical AND operation on them. Which gives you:

np.where( array([False,  True,  True, False, False,  True, False, False]),  'yo', ar)

The rest you know: np.where assigns the first value (here 'yo') wherever the condition is True and the other value (here, the original element) wherever it is False.

That’s all. I hope I explained the query well.


Answer 3

I like to use np.vectorize for such tasks. Consider the following:

>>> # function which returns True when constraints are satisfied.
>>> func = lambda d: d >= r and d<= (r+dr) 
>>>
>>> # Apply constraints element-wise to the dists array.
>>> result = np.vectorize(func)(dists) 
>>>
>>> result = np.where(result) # Get output.

You can also use np.argwhere instead of np.where for clear output. But that is your call :)

Hope it helps.


Answer 4

Try:

np.intersect1d(np.where(dists >= r)[0],np.where(dists <= r + dr)[0])

Answer 5

This should work:

dists[((dists >= r) & (dists <= r+dr))]

The most elegant way~~


Answer 6

Try:

import numpy as np
dist = np.array([1,2,3,4,5])
r = 2
dr = 3
np.where(np.logical_and(dist> r, dist<=r+dr))

Output: (array([2, 3, 4]),)

You can see Logic functions for more details.


Answer 7

I have worked out this simple example

import numpy as np

ar = np.array([3,4,5,14,2,4,3,7])

print([x for x in ar.tolist() if 3 <= x <= 6])

>>> 
[3, 4, 5, 4, 3]

Calling a class function inside of __init__

Question: Calling a class function inside of __init__

I’m writing some code that takes a filename, opens the file, and parses out some data. I’d like to do this in a class. The following code works:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None

        def parse_file():
            #do some parsing
            self.stat1 = result_from_parse1
            self.stat2 = result_from_parse2
            self.stat3 = result_from_parse3
            self.stat4 = result_from_parse4
            self.stat5 = result_from_parse5

        parse_file()

But it involves me putting all of the parsing machinery in the scope of the __init__ function for my class. That looks fine now for this simplified code, but the function parse_file has quite a few levels of indentation as well. I’d prefer to define the function parse_file() as a class function like below:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None
        parse_file()

    def parse_file():
        #do some parsing
        self.stat1 = result_from_parse1
        self.stat2 = result_from_parse2
        self.stat3 = result_from_parse3
        self.stat4 = result_from_parse4
        self.stat5 = result_from_parse5

Of course this code doesn’t work because the function parse_file() is not within the scope of the __init__ function. Is there a way to call a class function from within __init__ of that class? Or am I thinking about this the wrong way?


Answer 0

Call the function in this way:

self.parse_file()

You also need to define your parse_file() function like this:

def parse_file(self):

The parse_file method has to be bound to an object upon calling it (because it’s not a static method). This is done by calling the function on an instance of the object, in your case the instance is self.
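
To make this concrete, here is a minimal runnable sketch. The two statistics (line_count and word_count) and the throwaway temporary file are hypothetical stand-ins for the real parsing results, which the question does not spell out.

```python
import os
import tempfile


class MyClass:
    def __init__(self, filename):
        self.filename = filename
        self.line_count = None   # hypothetical stat
        self.word_count = None   # hypothetical stat
        # Call the method on the instance so that it receives `self`.
        self.parse_file()

    def parse_file(self):
        # Stand-in parsing: count lines and whitespace-separated words.
        with open(self.filename) as handle:
            lines = handle.read().splitlines()
        self.line_count = len(lines)
        self.word_count = sum(len(line.split()) for line in lines)


# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha beta\ngamma\n")

obj = MyClass(tmp.name)
os.remove(tmp.name)
print(obj.line_count, obj.word_count)
```

Because parse_file is an ordinary method, it can also be called again later (for example after the file changes) to refresh the attributes.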


Answer 1

If I’m not wrong, both functions are part of your class, so you should use it like this:

class MyClass():
    def __init__(self, filename):
        self.filename = filename 

        self.stat1 = None
        self.stat2 = None
        self.stat3 = None
        self.stat4 = None
        self.stat5 = None
        self.parse_file()

    def parse_file(self):
        #do some parsing
        self.stat1 = result_from_parse1
        self.stat2 = result_from_parse2
        self.stat3 = result_from_parse3
        self.stat4 = result_from_parse4
        self.stat5 = result_from_parse5

replace your line:

parse_file() 

with:

self.parse_file()

Answer 2

How about:

class MyClass(object):
    def __init__(self, filename):
        self.filename = filename 
        self.stats = parse_file(filename)

def parse_file(filename):
    #do some parsing
    return results_from_parse

By the way, if you have variables named stat1, stat2, etc., the situation is begging for a tuple: stats = (...).

So let parse_file return a tuple, and store the tuple in self.stats.

Then, for example, you can access what used to be called stat3 with self.stats[2].
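
A possible variation on this idea, sketched under a few assumptions: the parser takes text rather than a filename (to keep the example self-contained), the helper is called parse_text, and the field names lines, words and chars are invented for illustration. A namedtuple keeps index access such as self.stats[2] while also giving each value a name.

```python
from collections import namedtuple

# Hypothetical field names; a plain tuple would work the same way.
Stats = namedtuple("Stats", ["lines", "words", "chars"])


def parse_text(text):
    # Stand-in for real parsing: derive a few numbers from the text.
    lines = text.splitlines()
    return Stats(len(lines), len(text.split()), len(text))


class MyClass(object):
    def __init__(self, text):
        self.text = text
        self.stats = parse_text(text)


obj = MyClass("alpha beta\ngamma\n")
print(obj.stats[2], obj.stats.chars)
```

Keeping the parser as a module-level function also makes it easy to test without constructing the class.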


Answer 3

In parse_file, take the self argument (just like in __init__). If there’s any other context you need then just pass it as additional arguments as usual.


Answer 4

You must declare parse_file like this: def parse_file(self). The “self” parameter is a hidden parameter in most languages, but not in Python. You must add it to the definition of all methods that belong to a class. Then you can call the function from any method inside the class using self.parse_file().

Your final program is going to look like this:

class MyClass():
  def __init__(self, filename):
      self.filename = filename 

      self.stat1 = None
      self.stat2 = None
      self.stat3 = None
      self.stat4 = None
      self.stat5 = None
      self.parse_file()

  def parse_file(self):
      #do some parsing
      self.stat1 = result_from_parse1
      self.stat2 = result_from_parse2
      self.stat3 = result_from_parse3
      self.stat4 = result_from_parse4
      self.stat5 = result_from_parse5

Answer 5

I think that your problem is actually with not correctly indenting the __init__ function. It should be like this (note that parse_file also needs the self parameter to be callable as a method):

class MyClass():
     def __init__(self, filename):
          pass

     def parse_file(self):
          pass

Generate random numbers with a given (numerical) distribution

Question: Generate random numbers with a given (numerical) distribution

I have a file with some probabilities for different values e.g.:

1 0.1
2 0.05
3 0.05
4 0.2
5 0.4
6 0.2

I would like to generate random numbers using this distribution. Is there an existing module that handles this? It’s fairly simple to code on your own (build the cumulative distribution function, generate a random value in [0,1] and pick the corresponding value) but it seems like this should be a common problem and probably someone has created a function/module for it.

I need this because I want to generate a list of birthdays (which do not follow any distribution in the standard random module).


Answer 0

scipy.stats.rv_discrete might be what you want. You can supply your probabilities via the values parameter. You can then use the rvs() method of the distribution object to generate random numbers.

As pointed out by Eugene Pakhomov in the comments, you can also pass a p keyword parameter to numpy.random.choice(), e.g.

numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])

If you are using Python 3.6 or above, you can use random.choices() from the standard library – see the answer by Mark Dickinson.


Answer 1

Since Python 3.6, there’s a solution for this in Python’s standard library, namely random.choices.

Example usage: let’s set up a population and weights matching those in the OP’s question:

>>> from random import choices
>>> population = [1, 2, 3, 4, 5, 6]
>>> weights = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]

Now choices(population, weights) generates a single sample, returned as a list of length 1:

>>> choices(population, weights)
[4]

The optional keyword-only argument k allows one to request more than one sample at once. This is valuable because there’s some preparatory work that random.choices has to do every time it’s called, prior to generating any samples; by generating many samples at once, we only have to do that preparatory work once. Here we generate a million samples, and use collections.Counter to check that the distribution we get roughly matches the weights we gave.

>>> million_samples = choices(population, weights, k=10**6)
>>> from collections import Counter
>>> Counter(million_samples)
Counter({5: 399616, 6: 200387, 4: 200117, 1: 99636, 3: 50219, 2: 50025})
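
For a reproducible variant of the check above, one option is to draw from a dedicated seeded generator. This is only a sketch; the seed value and the sample size are arbitrary choices, not part of the original answer.

```python
import random
from collections import Counter

population = [1, 2, 3, 4, 5, 6]
weights = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]

# A dedicated, seeded generator keeps the run reproducible.
rng = random.Random(12345)

single = rng.choices(population, weights)          # a list of length 1
samples = rng.choices(population, weights, k=100_000)

# Empirical frequencies should land close to the weights.
counts = Counter(samples)
freqs = {value: counts[value] / len(samples) for value in population}
print(single, freqs)
```

If the same weights are reused across many calls, random.choices also accepts precomputed cumulative weights through its cum_weights argument.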

Answer 2

An advantage of building a list of cumulative probabilities (the CDF) is that you can use binary search. While you need O(n) time and space for preprocessing, you can then get k numbers in O(k log n). Since normal Python lists are inefficient, you can use the array module.

If you insist on constant space, you can do the following; O(n) time, O(1) space.

import random

def random_distr(l):
    r = random.uniform(0, 1)
    s = 0
    for item, prob in l:
        s += prob
        if s >= r:
            return item
    return item  # Might occur because of floating point inaccuracies
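
The binary-search variant mentioned at the start of this answer could look roughly like the sketch below. It uses itertools.accumulate for the O(n) preprocessing and bisect for the O(log n) lookup; the function name sample and the seeded generator are assumptions made for the example.

```python
import bisect
import itertools
import random

items = [1, 2, 3, 4, 5, 6]
probs = [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]

# O(n) preprocessing: running totals of the probabilities.
cdf = list(itertools.accumulate(probs))


def sample(rng):
    # O(log n) lookup: first position whose running total exceeds r.
    r = rng.random() * cdf[-1]
    index = bisect.bisect_right(cdf, r)
    # Clamp in case floating-point round-off pushes r to the very end.
    return items[min(index, len(items) - 1)]


rng = random.Random(0)
draws = [sample(rng) for _ in range(50_000)]
share_of_five = draws.count(5) / len(draws)
print(share_of_five)
```

Scaling r by cdf[-1] means the probabilities do not have to sum to exactly 1.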

Answer 3

Maybe it is kind of late. But you can use numpy.random.choice(), passing the p parameter:

val = numpy.random.choice(numpy.arange(1, 7), p=[0.1, 0.05, 0.05, 0.2, 0.4, 0.2])

Answer 4

(OK, I know you are asking for shrink-wrap, but maybe those home-grown solutions just weren’t succinct enough for your liking. :-)

import random

pdf = [(1, 0.1), (2, 0.05), (3, 0.05), (4, 0.2), (5, 0.4), (6, 0.2)]
cdf = [(i, sum(p for j,p in pdf if j < i)) for i,_ in pdf]
R = max(i for r in [random.random()] for i,c in cdf if c <= r)

I pseudo-confirmed that this works by eyeballing the output of this expression:

sorted(max(i for r in [random.random()] for i,c in cdf if c <= r)
       for _ in range(1000))

Answer 5

I wrote a solution for drawing random samples from a custom continuous distribution.

I needed this for a similar use-case to yours (i.e. generating random dates with a given probability distribution).

You just need the function random_custDist and the line samples=random_custDist(x0,x1,custDist=custDist,size=1000). The rest is decoration ^^.

import numpy as np

#function
def random_custDist(x0,x1,custDist,size=None, nControl=10**6):
    #generate a list of `size` random samples, obeying the distribution custDist
    #suggests random samples between x0 and x1 and accepts the suggestion with probability custDist(x)
    #custDist does not need to be normalized, but its values must lie in [0,1] (see the assert below).
    #Best performance for max_{x in [x0,x1]} custDist(x) = 1
    samples=[]
    nLoop=0
    while len(samples)<size and nLoop<nControl:
        x=np.random.uniform(low=x0,high=x1)
        prop=custDist(x)
        assert prop>=0 and prop<=1
        if np.random.uniform(low=0,high=1) <=prop:
            samples += [x]
        nLoop+=1
    return samples

#call
x0=2007
x1=2019
def custDist(x):
    if x<2010:
        return .3
    else:
        return (np.exp(x-2008)-1)/(np.exp(2019-2007)-1)
samples=random_custDist(x0,x1,custDist=custDist,size=1000)
print(samples)

#plot
import matplotlib.pyplot as plt
#hist
bins=np.linspace(x0,x1,int(x1-x0+1))
hist=np.histogram(samples, bins )[0]
hist=hist/np.sum(hist)
plt.bar( (bins[:-1]+bins[1:])/2, hist, width=.96, label='sample distribution')
#dist
grid=np.linspace(x0,x1,100)
discCustDist=np.array([custDist(x) for x in grid]) #discrete version
discCustDist*=1/(grid[1]-grid[0])/np.sum(discCustDist)
plt.plot(grid,discCustDist,label='custom distribution (custDist)', color='C1', linewidth=4)
#decoration
plt.legend(loc=3,bbox_to_anchor=(1,0))
plt.show()

The performance of this solution can surely be improved, but I prefer readability.
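
The technique used above is rejection sampling. As a dependency-free sketch of the same idea, the version below uses only the standard random module; the function name rejection_sample, the linear ramp density and the seed are assumptions chosen for illustration, not the author's code.

```python
import random


def rejection_sample(density, x0, x1, size, rng, max_tries=10**6):
    # `density` must return values in [0, 1] on [x0, x1]; it need not integrate to 1.
    samples = []
    tries = 0
    while len(samples) < size and tries < max_tries:
        x = rng.uniform(x0, x1)          # propose a candidate
        if rng.random() <= density(x):   # accept with probability density(x)
            samples.append(x)
        tries += 1
    return samples


def ramp(x):
    # Hypothetical density that rises linearly from 0 at x=0 to 1 at x=1.
    return x


rng = random.Random(1)
samples = rejection_sample(ramp, 0.0, 1.0, size=5000, rng=rng)
mean = sum(samples) / len(samples)
print(len(samples), mean)
```

For a density proportional to x on [0, 1] the theoretical mean is 2/3, which gives a quick sanity check on the samples. Roughly half of the proposals are rejected here; the closer the density's maximum is to 1, the fewer proposals are wasted.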


Answer 6

Make a list of items, based on their weights:

items = [1, 2, 3, 4, 5, 6]
probabilities= [0.1, 0.05, 0.05, 0.2, 0.4, 0.2]
# if the list of probs is normalized (sum(probs) == 1), omit this part
prob = sum(probabilities) # find sum of probs, to normalize them
c = (1.0)/prob # a multiplier to make a list of normalized probs
probabilities = [c * x for x in probabilities]
print(probabilities)

ml = max(probabilities, key=lambda x: len(str(x)) - str(x).find('.'))
ml = len(str(ml)) - str(ml).find('.') -1
amounts = [ int(x*(10**ml)) for x in probabilities]
itemsList = list()
for i in range(0, len(items)): # iterate through original items
  itemsList += items[i:i+1]*amounts[i]

# choose from itemsList randomly
print(itemsList)

An optimization may be to normalize amounts by the greatest common divisor, to make the target list smaller.

Also, this might be interesting.
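
The greatest-common-divisor optimization mentioned above can be sketched with exact fractions. This is an illustration under one assumption: the probabilities are available as decimal strings (as they would be when read from the file in the question), so that Fraction can represent them exactly.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

items = [1, 2, 3, 4, 5, 6]
probabilities = ["0.1", "0.05", "0.05", "0.2", "0.4", "0.2"]

# Exact fractions avoid the float-to-string digit counting.
fractions = [Fraction(p) for p in probabilities]

# Least common multiple of the denominators gives the smallest list length.
common = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in fractions))
amounts = [int(f * common) for f in fractions]

items_list = []
for item, amount in zip(items, amounts):
    items_list += [item] * amount

print(amounts, len(items_list))
```

Picking uniformly from items_list, for example with random.choice, then reproduces the distribution with a list of only 20 entries.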


Answer 7

Another answer, probably faster :)

import bisect
import random

distribution = [(1, 0.2), (2, 0.3), (3, 0.5)]
# init distribution: build the cumulative sums
dlist = []
sumchance = 0
for value, chance in distribution:
    sumchance += chance
    dlist.append((value, sumchance))
assert abs(sumchance - 1.0) < 1e-9  # compare floats with a tolerance

def get_random_value():
    r = random.random()
    # for small distributions use linear search
    if len(dlist) < 64:  # don't know exact speed limit
        for value, cumulative in dlist:
            if r < cumulative:
                return value
        return dlist[-1][0]  # guard against floating-point round-off
    # otherwise use binary search on the cumulative sums
    cumulative = [c for _, c in dlist]
    return dlist[min(bisect.bisect_right(cumulative, r), len(dlist) - 1)][0]

Answer 8

from __future__ import division
import random
from collections import Counter


def num_gen(num_probs):
    # calculate minimum probability to normalize
    min_prob = min(prob for num, prob in num_probs)
    lst = []
    for num, prob in num_probs:
        # keep appending num to lst, proportional to its probability in the distribution
        for _ in range(int(round(prob/min_prob))):
            lst.append(num)
    # all elems in lst occur proportional to their distribution probablities
    while True:
        # pick a random index from lst
        ind = random.randint(0, len(lst)-1)
        yield lst[ind]

Verification:

gen = num_gen([(1, 0.1),
               (2, 0.05),
               (3, 0.05),
               (4, 0.2),
               (5, 0.4),
               (6, 0.2)])
lst = []
times = 10000
for _ in range(times):
    lst.append(next(gen))
# Verify the created distribution:
for item, count in Counter(lst).iteritems():
    print '%d has %f probability' % (item, count/times)

1 has 0.099737 probability
2 has 0.050022 probability
3 has 0.049996 probability 
4 has 0.200154 probability
5 has 0.399791 probability
6 has 0.200300 probability

回答 9

根据其他解决方案,您可以生成累积分布(任意形式为整数或浮点数),然后可以使用二等分来使其快速

这是一个简单的示例(我在这里使用了整数)

l=[(20, 'foo'), (60, 'banana'), (10, 'monkey'), (10, 'monkey2')]
def get_cdf(l):
    ret=[]
    c=0
    for i in l: c+=i[0]; ret.append((c, i[1]))
    return ret

def get_random_item(cdf):
    return cdf[bisect.bisect_left(cdf, (random.randint(1, cdf[-1][0]),))][1]

cdf=get_cdf(l)
for i in range(100): print get_random_item(cdf),

get_cdf函数会将其从20、60、10、10转换为20、20 + 60、20 + 60 + 10、20 + 60 + 10 + 10

现在我们使用选取最大为20 + 60 + 10 + 10的随机数,random.randint然后使用bisect快速获取实际值

Building on the other solutions: generate the cumulative distribution (as integers or floats, whichever you like), then use bisect to make the lookup fast.

This is a simple example (I used integers here):

import bisect
import random

l=[(20, 'foo'), (60, 'banana'), (10, 'monkey'), (10, 'monkey2')]
def get_cdf(l):
    ret=[]
    c=0
    for i in l: c+=i[0]; ret.append((c, i[1]))
    return ret

def get_random_item(cdf):
    # randint is inclusive at both ends, so start at 1 to keep the weights exact
    return cdf[bisect.bisect_left(cdf, (random.randint(1, cdf[-1][0]),))][1]

cdf=get_cdf(l)
for i in range(100): print get_random_item(cdf),

The get_cdf function converts the weights 20, 60, 10, 10 into the running totals 20, 20+60, 20+60+10, 20+60+10+10.

We then pick a random integer from 1 up to 20+60+10+10 using random.randint and use bisect to find the matching item quickly.
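
The cumulative-totals-plus-bisect idea can also be sketched for Python 3 as follows. This is only an illustration, not the answer's own code: the helper names build_cdf and pick are made up here, and the rng parameter is an assumption added so the draw can be replaced in a test.

```python
import bisect
import itertools
import random


def build_cdf(weighted_items):
    """Split (weight, item) pairs into running totals and the matching items."""
    weights = [weight for weight, _ in weighted_items]
    items = [item for _, item in weighted_items]
    return list(itertools.accumulate(weights)), items


def pick(cumulative, items, rng=random):
    """Pick one item with probability proportional to its weight."""
    # Draw r in [0, total); bisect_right finds the first running total above r.
    r = rng.random() * cumulative[-1]
    index = bisect.bisect_right(cumulative, r)
    # Clamp in case floating-point rounding pushes r up to the total itself.
    return items[min(index, len(items) - 1)]


weighted = [(20, 'foo'), (60, 'banana'), (10, 'monkey'), (10, 'monkey2')]
cumulative, items = build_cdf(weighted)
print(cumulative)
print([pick(cumulative, items) for _ in range(5)])
```

Drawing a float in [0, total) and using bisect_right sidesteps the question of whether the integer range should start at 0 or 1.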


回答 10

您可能想看看NumPy 随机抽样分布

You might want to have a look at NumPy's random sampling distributions (the numpy.random module).


回答 11

这些答案都不是特别清楚或简单的。

这是保证可以正常工作的一种清晰,简单的方法。

accumulate_normalize_probabilities采用字典p将符号映射到概率频率。它输出要选择的元组的可用列表。

def accumulate_normalize_values(p):
        pi = p.items() if isinstance(p,dict) else p
        accum_pi = []
        accum = 0
        for i in pi:
                accum_pi.append((i[0],i[1]+accum))
                accum += i[1]
        if accum == 0:
                raise Exception( "You are about to explode the universe. Continue ? Y/N " )
        normed_a = []
        for a in accum_pi:
                normed_a.append((a[0],a[1]*1.0/accum))
        return normed_a

Yield:

>>> accumulate_normalize_values( { 'a': 100, 'b' : 300, 'c' : 400, 'd' : 200  } )
[('a', 0.1), ('c', 0.5), ('b', 0.8), ('d', 1.0)]

为什么运作

所述积累步骤变成每个符号到(在第一符号的情况下或0)本身和先前符号概率或频率之间的间隔。通过简单地逐步遍历列表,直到间隔0.0-> 1.0中的随机数(之前已准备好)小于或等于当前符号的间隔端点,可以使用这些间隔进行选择(从而对提供的分布进行采样)。

规范化释放我们从需求,以确保一切资金以一定的价值。归一化后,概率的“向量”总计为1.0。

下面用于从分布中选择并生成任意长样本的其余代码

def select(symbol_intervals,random):
        print symbol_intervals,random
        i = 0
        while random > symbol_intervals[i][1]:
                i += 1
                if i >= len(symbol_intervals):
                        raise Exception( "What did you DO to that poor list?" )
        return symbol_intervals[i][0]


def gen_random(alphabet,length,probabilities=None):
        from random import random
        from itertools import repeat
        if probabilities is None:
                probabilities = dict(zip(alphabet,repeat(1.0)))
        elif len(probabilities) > 0 and isinstance(probabilities[0],(int,long,float)):
                probabilities = dict(zip(alphabet,probabilities)) #ordered
        usable_probabilities = accumulate_normalize_values(probabilities)
        gen = []
        while len(gen) < length:
                gen.append(select(usable_probabilities,random()))
        return gen

用法:

>>> gen_random (['a','b','c','d'],10,[100,300,400,200])
['d', 'b', 'b', 'a', 'c', 'c', 'b', 'c', 'c', 'c']   #<--- some of the time

None of these answers is particularly clear or simple.

Here is a clear, simple method that is guaranteed to work.

accumulate_normalize_values takes a dictionary p that maps symbols to probabilities OR frequencies. It outputs a usable list of tuples from which to do the selection.

def accumulate_normalize_values(p):
        pi = p.items() if isinstance(p,dict) else p
        accum_pi = []
        accum = 0
        for i in pi:
                accum_pi.append((i[0],i[1]+accum))
                accum += i[1]
        if accum == 0:
                raise Exception( "You are about to explode the universe. Continue ? Y/N " )
        normed_a = []
        for a in accum_pi:
                normed_a.append((a[0],a[1]*1.0/accum))
        return normed_a

Yields:

>>> accumulate_normalize_values( { 'a': 100, 'b' : 300, 'c' : 400, 'd' : 200  } )
[('a', 0.1), ('c', 0.5), ('b', 0.8), ('d', 1.0)]

Why it works

The accumulation step turns each symbol into an interval that runs from the previous symbol’s cumulative probability or frequency (or 0 in the case of the first symbol) up to its own. These intervals can be used for selection (and thus for sampling the provided distribution) by simply stepping through the list until the random number in the interval 0.0 -> 1.0 (prepared earlier) is less than or equal to the current symbol’s interval end-point.

The normalization releases us from the need to make sure everything sums to some particular value. After normalization the “vector” of probabilities sums to 1.0.

The rest of the code, for selecting from the distribution and generating an arbitrarily long sample, is below:

def select(symbol_intervals,random):
        print symbol_intervals,random
        i = 0
        while random > symbol_intervals[i][1]:
                i += 1
                if i >= len(symbol_intervals):
                        raise Exception( "What did you DO to that poor list?" )
        return symbol_intervals[i][0]


def gen_random(alphabet,length,probabilities=None):
        from random import random
        from itertools import repeat
        if probabilities is None:
                probabilities = dict(zip(alphabet,repeat(1.0)))
        elif len(probabilities) > 0 and isinstance(probabilities[0],(int,long,float)):
                probabilities = dict(zip(alphabet,probabilities)) #ordered
        usable_probabilities = accumulate_normalize_values(probabilities)
        gen = []
        while len(gen) < length:
                gen.append(select(usable_probabilities,random()))
        return gen

Usage :

>>> gen_random (['a','b','c','d'],10,[100,300,400,200])
['d', 'b', 'b', 'a', 'c', 'c', 'b', 'c', 'c', 'c']   #<--- some of the time
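
For comparison, on Python 3.6 and later the standard library's random.choices accepts relative weights directly, so the accumulate-and-normalize step does not have to be written by hand. The sketch below is an assumption-laden illustration rather than a port of the code above: gen_random_choices and its seed parameter are hypothetical names introduced here.

```python
import random
from collections import Counter


def gen_random_choices(alphabet, length, weights=None, seed=None):
    """Return `length` symbols drawn with replacement, proportional to `weights`."""
    rng = random.Random(seed)  # a private generator, so seeding does not affect other code
    # weights=None means every symbol is equally likely
    return rng.choices(alphabet, weights=weights, k=length)


sample = gen_random_choices(['a', 'b', 'c', 'd'], 10000, [100, 300, 400, 200], seed=0)
counts = Counter(sample)
for symbol in 'abcd':
    print(symbol, counts[symbol] / len(sample))
```

The weights are relative, so frequencies such as 100, 300, 400, 200 work as well as probabilities that sum to 1.0.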

回答 12

这是一种更有效的方法

只需使用您的“权重”数组(假定索引为对应项)和否调用以下函数。需要的样本数。可以轻松修改此功能以处理有序对。

使用各自的概率返回采样/挑选(替换)的索引(或项目):

def resample(weights, n):
    beta = 0

    # Caveat: Assign max weight to max*2 for best results
    max_w = max(weights)*2

    # Pick an item uniformly at random, to start with
    current_item = random.randint(0,n-1)
    result = []

    for i in range(n):
        beta += random.uniform(0,max_w)

        while weights[current_item] < beta:
            beta -= weights[current_item]
            current_item = (current_item + 1) % n   # cyclic
        else:
            result.append(current_item)
    return result

关于while循环中使用的概念的简短说明。我们从累积beta减少当前项目的权重,该累积值是随机统一构造的累积值,并增加当前索引以找到其权重与beta值匹配的项目。

Here is a more effective way of doing this:

Just call the following function with your ‘weights’ array (the indices stand for the corresponding items) and the number of samples needed. The function can easily be modified to handle ordered pairs.

Returns the indexes (or items) sampled/picked (with replacement) according to their respective probabilities:

import random

def resample(weights, n):
    num_items = len(weights)   # items to choose from; may differ from n
    beta = 0

    # Caveat: Assign max weight to max*2 for best results
    max_w = max(weights)*2

    # Pick an item uniformly at random, to start with
    current_item = random.randint(0,num_items-1)
    result = []

    for i in range(n):
        beta += random.uniform(0,max_w)

        # step around the wheel until beta falls inside the current item's weight
        while weights[current_item] < beta:
            beta -= weights[current_item]
            current_item = (current_item + 1) % num_items   # cyclic
        result.append(current_item)
    return result

A short note on the concept used in the while loop: beta is a running value built up from uniform random increments. We keep subtracting the current item’s weight from beta and advancing to the next index (cyclically) until we reach an item whose weight is at least beta; that item is the one sampled.


如何仅列出Python中的顶级目录?

问题:如何仅列出Python中的顶级目录?

我希望仅列出某个文件夹内的目录。这意味着我既不需要列出文件名,也不需要其他子文件夹。

让我们看看一个例子是否有帮助。在当前目录中,我们有:

>>> os.listdir(os.getcwd())
['cx_Oracle-doc', 'DLLs', 'Doc', 'include', 'Lib', 'libs', 'LICENSE.txt', 'mod_p
ython-wininst.log', 'NEWS.txt', 'pymssql-wininst.log', 'python.exe', 'pythonw.ex
e', 'README.txt', 'Removemod_python.exe', 'Removepymssql.exe', 'Scripts', 'tcl',
 'Tools', 'w9xpopen.exe']

但是,我不想列出文件名。我也不需要子文件夹,例如\ Lib \ curses。本质上,我想要的东西适用于以下情况:

>>> for root, dirnames, filenames in os.walk('.'):
...     print dirnames
...     break
...
['cx_Oracle-doc', 'DLLs', 'Doc', 'include', 'Lib', 'libs', 'Scripts', 'tcl', 'Tools']

但是,我想知道是否有一种更简单的方法来获得相同的结果。我得到的印象是仅使用os.walk返回顶级是无效的/太多了。

I want to be able to list only the directories inside some folder. This means I don’t want filenames listed, nor do I want additional sub-folders.

Let’s see if an example helps. In the current directory we have:

>>> os.listdir(os.getcwd())
['cx_Oracle-doc', 'DLLs', 'Doc', 'include', 'Lib', 'libs', 'LICENSE.txt', 'mod_p
ython-wininst.log', 'NEWS.txt', 'pymssql-wininst.log', 'python.exe', 'pythonw.ex
e', 'README.txt', 'Removemod_python.exe', 'Removepymssql.exe', 'Scripts', 'tcl',
 'Tools', 'w9xpopen.exe']

However, I don’t want filenames listed. Nor do I want sub-folders such as \Lib\curses. Essentially what I want works with the following:

>>> for root, dirnames, filenames in os.walk('.'):
...     print dirnames
...     break
...
['cx_Oracle-doc', 'DLLs', 'Doc', 'include', 'Lib', 'libs', 'Scripts', 'tcl', 'Tools']

However, I’m wondering if there’s a simpler way of achieving the same results. I get the impression that using os.walk only to return the top level is inefficient/too much.


回答 0

使用os.path.isdir()过滤结果(并使用os.path.join()获得真实路径):

>>> [ name for name in os.listdir(thedir) if os.path.isdir(os.path.join(thedir, name)) ]
['ctypes', 'distutils', 'encodings', 'lib-tk', 'config', 'idlelib', 'xml', 'bsddb', 'hotshot', 'logging', 'doc', 'test', 'compiler', 'curses', 'site-packages', 'email', 'sqlite3', 'lib-dynload', 'wsgiref', 'plat-linux2', 'plat-mac']

Filter the result using os.path.isdir() (and use os.path.join() to get the real path):

>>> [ name for name in os.listdir(thedir) if os.path.isdir(os.path.join(thedir, name)) ]
['ctypes', 'distutils', 'encodings', 'lib-tk', 'config', 'idlelib', 'xml', 'bsddb', 'hotshot', 'logging', 'doc', 'test', 'compiler', 'curses', 'site-packages', 'email', 'sqlite3', 'lib-dynload', 'wsgiref', 'plat-linux2', 'plat-mac']
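
On Python 3.6 or later the same filter can be sketched with os.scandir, which yields directory entries that already know whether they are directories, so no os.path.join call is needed. The function name list_subdirs and the throwaway directory layout are made up for this illustration.

```python
import os
import tempfile


def list_subdirs(folder):
    """Names of the immediate subdirectories of `folder` (not recursive)."""
    with os.scandir(folder) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())


# Demonstration on a temporary tree: two directories, one nested directory, one file.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'Lib', 'curses'))
    os.mkdir(os.path.join(top, 'Doc'))
    open(os.path.join(top, 'README.txt'), 'w').close()
    demo_result = list_subdirs(top)
    print(demo_result)
```

Only 'Doc' and 'Lib' are reported: the file is filtered out and the nested 'curses' folder is never visited.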

回答 1

步行

os.walknext项目功能一起使用:

next(os.walk('.'))[1]

对于Python <= 2.5,请使用:

os.walk('.').next()[1]

如何运作

os.walk是一个生成器,调用next将以3元组(目录路径,目录名,文件名)的形式获取第一个结果。因此,[1]索引仅返回dirnames该元组的。

os.walk

Use os.walk with next item function:

next(os.walk('.'))[1]

For Python <=2.5 use:

os.walk('.').next()[1]

How this works

os.walk is a generator and calling next will get the first result in the form of a 3-tuple (dirpath, dirnames, filenames). Thus the [1] index returns only the dirnames from that tuple.
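
A small self-contained sketch of this approach, wrapped in a function. The name top_level_dirs is hypothetical, and passing a default to next() is an addition of this sketch: os.walk yields nothing for a path it cannot read, so without a default next() would raise StopIteration.

```python
import os
import tempfile


def top_level_dirs(folder):
    """Return the names of the directories directly inside `folder`."""
    # os.walk yields (dirpath, dirnames, filenames); the first tuple describes `folder` itself.
    _, dirnames, _ = next(os.walk(folder), (folder, [], []))
    return sorted(dirnames)


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, 'Lib', 'curses'))
    os.mkdir(os.path.join(top, 'Scripts'))
    open(os.path.join(top, 'NEWS.txt'), 'w').close()
    walk_result = top_level_dirs(top)
    print(walk_result)
```

Because only the first tuple is consumed, the generator never descends into 'Lib', so the nested 'curses' directory is not listed.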


回答 2

使用os.path.isdir筛选列表以检测目录。

filter(os.path.isdir, os.listdir(os.getcwd()))

Filter the list using os.path.isdir to detect directories.

filter(os.path.isdir, os.listdir(os.getcwd()))

回答 3

directories=[d for d in os.listdir(os.getcwd()) if os.path.isdir(d)]

回答 4

请注意,os.listdir(os.getcwd())最好不要这样做,而要这样做os.listdir(os.path.curdir)。少调用一个函数,它具有可移植性。

因此,要完成答案,请获取文件夹中的目录列表:

def listdirs(folder):
    return [d for d in os.listdir(folder) if os.path.isdir(os.path.join(folder, d))]

如果您希望使用完整路径名,请使用以下功能:

def listdirs(folder):
    return [
        d for d in (os.path.join(folder, d1) for d1 in os.listdir(folder))
        if os.path.isdir(d)
    ]

Note that, instead of doing os.listdir(os.getcwd()), it’s preferable to do os.listdir(os.path.curdir). One less function call, and it’s as portable.

So, to complete the answer, to get a list of directories in a folder:

def listdirs(folder):
    return [d for d in os.listdir(folder) if os.path.isdir(os.path.join(folder, d))]

If you prefer full pathnames, then use this function:

def listdirs(folder):
    return [
        d for d in (os.path.join(folder, d1) for d1 in os.listdir(folder))
        if os.path.isdir(d)
    ]

回答 5

这似乎也起作用(至少在Linux上):

import glob, os
glob.glob('*' + os.path.sep)

This seems to work too (at least on linux):

import glob, os
glob.glob('*' + os.path.sep)

回答 6

只是要补充一点,使用os.listdir()不会“比非常简单的os.walk()。next()[1]花费更多的处理时间”)。这是因为os.walk()在内部使用os.listdir()。实际上,如果您一起测试它们:

>>>> import timeit
>>>> timeit.timeit("os.walk('.').next()[1]", "import os", number=10000)
1.1215229034423828
>>>> timeit.timeit("[ name for name in os.listdir('.') if os.path.isdir(os.path.join('.', name)) ]", "import os", number=10000)
1.0592019557952881

os.listdir()的过滤非常快。

Just to add that using os.listdir() does not “take a lot of processing vs very simple os.walk().next()[1]”. This is because os.walk() uses os.listdir() internally. In fact, if you time them side by side:

>>> import timeit
>>> timeit.timeit("os.walk('.').next()[1]", "import os", number=10000)
1.1215229034423828
>>> timeit.timeit("[ name for name in os.listdir('.') if os.path.isdir(os.path.join('.', name)) ]", "import os", number=10000)
1.0592019557952881

Filtering the output of os.listdir() is very slightly faster.


回答 7

一种非常简单而优雅的方法是使用此方法:

 import os
 dir_list = os.walk('.').next()[1]
 print dir_list

在需要文件夹名称的同一文件夹中运行此脚本,它将仅为您提供直接的文件夹名称(也没有文件夹的完整路径)。

A much simpler and more elegant way is to use this:

 import os
 dir_list = os.walk('.').next()[1]
 print dir_list

Run this script in the folder whose subfolder names you want. It gives you only the names of the immediate subfolders (without their full paths).


回答 8

使用列表理解

[a for a in os.listdir() if os.path.isdir(a)]

我认为这是最简单的方法

Using list comprehension,

[a for a in os.listdir() if os.path.isdir(a)]

I think It is the simplest way


回答 9

作为一个新手,我还不能直接发表评论,但这是我想补充到ΤζΩΤζΙΟΥ的以下部分的一个小更正:

如果您希望使用完整路径名,请使用以下功能:

def listdirs(folder):  
  return [
    d for d in (os.path.join(folder, d1) for d1 in os.listdir(folder))
    if os.path.isdir(d)
]

对于仍然使用python <2.4的用户:内部构造需要是列表而不是元组,因此应如下所示:

def listdirs(folder):  
  return [
    d for d in [os.path.join(folder, d1) for d1 in os.listdir(folder)]
    if os.path.isdir(d)
  ]

否则会出现语法错误。

Being a newbie here I can’t yet comment directly, but here is a small correction I’d like to add to the following part of ΤΖΩΤΖΙΟΥ’s answer:

If you prefer full pathnames, then use this function:

def listdirs(folder):
  return [
    d for d in (os.path.join(folder, d1) for d1 in os.listdir(folder))
    if os.path.isdir(d)
  ]

For those still on Python < 2.4: the inner construct needs to be a list comprehension instead of a generator expression (generator expressions were added in Python 2.4), and therefore should read like this:

def listdirs(folder):
  return [
    d for d in [os.path.join(folder, d1) for d1 in os.listdir(folder)]
    if os.path.isdir(d)
  ]

Otherwise one gets a syntax error.


回答 10

[x for x in os.listdir(somedir) if os.path.isdir(os.path.join(somedir, x))]

回答 11

有关完整路径名的列表,相对于其他解决方案,我更喜欢此版本:

def listdirs(dir):
    return [os.path.join(dir, x) for x in os.listdir(dir)
        if os.path.isdir(os.path.join(dir, x))]

For a list of full path names I prefer this version to the other solutions here:

def listdirs(dir):
    return [os.path.join(dir, x) for x in os.listdir(dir)
        if os.path.isdir(os.path.join(dir, x))]

回答 12

scanDir = "abc"
directories = [d for d in os.listdir(scanDir) if os.path.isdir(os.path.join(os.path.abspath(scanDir), d))]

回答 13

没有目录时不会失败的更安全的选项。

def listdirs(folder):
    if os.path.exists(folder):
         return [d for d in os.listdir(folder) if os.path.isdir(os.path.join(folder, d))]
    else:
         return []

A safer option that does not fail when there is no directory.

def listdirs(folder):
    if os.path.exists(folder):
         return [d for d in os.listdir(folder) if os.path.isdir(os.path.join(folder, d))]
    else:
         return []

回答 14

这样吗

>>>> [path for path in os.listdir(os.getcwd()) if os.path.isdir(path)]

Like so?

>>> [path for path in os.listdir(os.getcwd()) if os.path.isdir(path)]

回答 15

蟒3.4引入pathlib模块到标准库,它提供了一个面向对象的方法来处理的文件系统的路径:

from pathlib import Path

p = Path('./')
[f for f in p.iterdir() if f.is_dir()]

Python 3.4 introduced the pathlib module into the standard library, which provides an object oriented approach to handle filesystem paths:

from pathlib import Path

p = Path('./')
[f for f in p.iterdir() if f.is_dir()]
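
Note that iterdir() yields Path objects rather than strings. If plain folder names are wanted, as in the question, the .name attribute can be used. A minimal sketch, with the hypothetical helper name subdir_names and a throwaway directory made up for the demonstration:

```python
from pathlib import Path
import tempfile


def subdir_names(folder):
    """Bare names of the directories directly inside `folder`."""
    return sorted(entry.name for entry in Path(folder).iterdir() if entry.is_dir())


with tempfile.TemporaryDirectory() as top:
    root = Path(top)
    (root / 'tcl').mkdir()
    (root / 'Tools' / 'scripts').mkdir(parents=True)
    (root / 'LICENSE.txt').write_text('placeholder', encoding='utf-8')
    pathlib_result = subdir_names(root)
    print(pathlib_result)
```

The helper accepts either a string or a Path, since Path(folder) handles both.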

回答 16

-- This will exclude files and traverse through 1 level of sub folders in the root

def list_files(dir):
    List = []
    filterstr = ' '
    for root, dirs, files in os.walk(dir, topdown = True):
        #r.append(root)
        if (root == dir):
            pass
        elif filterstr in root:
            #filterstr = ' '
            pass
        else:
            filterstr = root
            #print(root)
            for name in files:
                print(root)
                print(dirs)
                List.append(os.path.join(root,name))
            #print(os.path.join(root,name),"\n")
                print(List,"\n")

    return List

在Python的相对位置打开文件

问题:在Python的相对位置打开文件

假设python代码在以前的Windows目录“ main”中未知的位置执行,并且在运行时将代码安装在任何地方,都需要访问目录“ main / 2091 / data.txt”。

我应该如何使用open(location)函数?应该在什么位置?

编辑:

我发现下面的简单代码可以工作..它有什么缺点吗?

    file="\2091\sample.txt"
    path=os.getcwd()+file
    fp=open(path,'r+');

Suppose my Python code runs from a Windows directory, say ‘main’, whose location is not known in advance. Wherever the code is installed, it needs to access the file ‘main/2091/data.txt’ when it runs.

How should I use the open(location) function? What should location be?

Edit:

I found that the simple code below works. Does it have any disadvantages?

    file="\2091\sample.txt"
    path=os.getcwd()+file
    fp=open(path,'r+');

回答 0

使用这种类型的东西时,您需要注意实际的工作目录是什么。例如,您可能无法从文件所在的目录中运行脚本。在这种情况下,您不能仅使用相对路径本身。

如果确定所需文件位于脚本实际所在的子目录中,则可以__file__在此处使用帮助。 __file__是您正在运行的脚本所在的完整路径。

因此,您可以摆弄这样的东西:

import os
script_dir = os.path.dirname(__file__) #<-- absolute dir the script is in
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)

With this type of thing you need to be careful what your actual working directory is. For example, you may not run the script from the directory the file is in. In this case, you can’t just use a relative path by itself.

If you are sure the file you want is in a subdirectory beneath where the script is actually located, you can use __file__ to help you out here. __file__ is the full path to where the script you are running is located.

So you can fiddle with something like this:

import os
script_dir = os.path.dirname(__file__) #<-- absolute dir the script is in
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)
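
The same idea can be sketched as a small reusable helper. This is an illustration under assumptions, not part of the answer: the name path_relative_to and the example location /srv/main/app.py are invented, and the anchor file is passed in as a parameter so the helper can be tried without a real script. Wrapping the anchor in os.path.abspath covers the case where __file__ is given as a relative path.

```python
import os


def path_relative_to(anchor_file, rel_path):
    """Join `rel_path` (written with forward slashes) onto the directory of `anchor_file`."""
    anchor_dir = os.path.dirname(os.path.abspath(anchor_file))
    # Splitting on "/" lets os.path.join insert the separator of the current platform.
    return os.path.join(anchor_dir, *rel_path.split('/'))


# In a real script the anchor would normally be __file__:
#     data_path = path_relative_to(__file__, "2091/data.txt")
example = path_relative_to(os.path.join(os.sep, 'srv', 'main', 'app.py'), '2091/data.txt')
print(example)
```

The result does not depend on the current working directory, which is the point of anchoring on the script's own location.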

回答 1

这段代码可以正常工作:

import os


def readFile(filename):
    filehandle = open(filename)
    print filehandle.read()
    filehandle.close()



fileDir = os.path.dirname(os.path.realpath(__file__))
print fileDir

#For accessing the file in the same folder
filename = "same.txt"
readFile(filename)

#For accessing the file in a folder contained in the current folder
filename = os.path.join(fileDir, 'Folder1.1/same.txt')
readFile(filename)

#For accessing the file in the parent folder of the current folder
filename = os.path.join(fileDir, '../same.txt')
readFile(filename)

#For accessing the file inside a sibling folder.
filename = os.path.join(fileDir, '../Folder2/same.txt')
filename = os.path.abspath(os.path.realpath(filename))
print filename
readFile(filename)

This code works fine:

import os


def readFile(filename):
    filehandle = open(filename)
    print filehandle.read()
    filehandle.close()



fileDir = os.path.dirname(os.path.realpath(__file__))
print fileDir

#For accessing the file in the same folder
filename = "same.txt"
readFile(filename)

#For accessing the file in a folder contained in the current folder
filename = os.path.join(fileDir, 'Folder1.1/same.txt')
readFile(filename)

#For accessing the file in the parent folder of the current folder
filename = os.path.join(fileDir, '../same.txt')
readFile(filename)

#For accessing the file inside a sibling folder.
filename = os.path.join(fileDir, '../Folder2/same.txt')
filename = os.path.abspath(os.path.realpath(filename))
print filename
readFile(filename)

回答 2

我创建一个帐户只是为了澄清我认为在Russ原始回复中发现的差异。

作为参考,他的原始答案是:

import os
script_dir = os.path.dirname(__file__)
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)

这是一个很好的答案,因为它试图动态创建所需文件的绝对系统路径。

Cory Mawhorter注意到这__file__是相对路径(在我的系统上也是这样),建议使用os.path.abspath(__file__)os.path.abspath,但是,返回当前脚本的绝对路径(即/path/to/dir/foobar.py

要使用此方法(以及我最终如何使用它),必须从路径末尾删除脚本名称:

import os
script_path = os.path.abspath(__file__) # i.e. /path/to/dir/foobar.py
script_dir = os.path.split(script_path)[0] #i.e. /path/to/dir/
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)

所得的abs_file_path(在此示例中)变为: /path/to/dir/2091/data.txt

I created an account just so I could clarify a discrepancy I think I found in Russ’s original response.

For reference, his original answer was:

import os
script_dir = os.path.dirname(__file__)
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)

This is a great answer because it tries to dynamically create an absolute system path to the desired file.

Cory Mawhorter noticed that __file__ is a relative path (it is as well on my system) and suggested using os.path.abspath(__file__). os.path.abspath, however, returns the absolute path of your current script (i.e. /path/to/dir/foobar.py)

To use this method (and how I eventually got it working) you have to remove the script name from the end of the path:

import os
script_path = os.path.abspath(__file__) # i.e. /path/to/dir/foobar.py
script_dir = os.path.split(script_path)[0] #i.e. /path/to/dir/
rel_path = "2091/data.txt"
abs_file_path = os.path.join(script_dir, rel_path)

The resulting abs_file_path (in this example) becomes: /path/to/dir/2091/data.txt


回答 3

这取决于您使用的操作系统。如果您想要一个与Windows和* nix兼容的解决方案,例如:

from os import path

file_path = path.relpath("2091/data.txt")
with open(file_path) as f:
    <do stuff>

应该工作正常。

path模块能够格式化其正在运行的任何操作系统的路径。另外,只要您具有正确的权限,python就能很好地处理相对路径。

编辑

正如kindall在评论中提到的那样,python仍然可以在unix样式和Windows样式路径之间进行转换,因此,即使是更简单的代码也可以使用:

with open("2091/data.txt") as f:
    <do stuff>

话虽如此,该path模块仍然具有一些有用的功能。

It depends on what operating system you’re using. If you want a solution that is compatible with both Windows and *nix something like:

from os import path

file_path = path.relpath("2091/data.txt")
with open(file_path) as f:
    <do stuff>

should work fine.

The path module is able to format a path for whatever operating system it’s running on. Also, python handles relative paths just fine, so long as you have correct permissions.

Edit:

As mentioned by kindall in the comments, python can convert between unix-style and windows-style paths anyway, so even simpler code will work:

with open("2091/data.txt") as f:
    <do stuff>

That being said, the path module still has some useful functions.


回答 4

我花了很多时间发现为什么我的代码找不到在Windows系统上运行Python 3的文件。所以我加了。之前/一切正常:

import os

script_dir = os.path.dirname(__file__)
file_path = os.path.join(script_dir, './output03.txt')
print(file_path)
fptr = open(file_path, 'w')

I spent a lot of time working out why my code could not find my file when running Python 3 on Windows. So I added . before / and everything worked fine:

import os

script_dir = os.path.dirname(__file__)
file_path = os.path.join(script_dir, './output03.txt')
print(file_path)
fptr = open(file_path, 'w')

回答 5

码:

import os
script_path = os.path.abspath(__file__) 
path_list = script_path.split(os.sep)
script_directory = path_list[0:len(path_list)-1]
rel_path = "main/2091/data.txt"
path = "/".join(script_directory) + "/" + rel_path

说明:

导入库:

import os

用于__file__获取当前脚本的路径:

script_path = os.path.abspath(__file__)

将脚本路径分为多个项目:

path_list = script_path.split(os.sep)

删除列表中的最后一项(实际的脚本文件):

script_directory = path_list[0:len(path_list)-1]

添加相对文件的路径:

rel_path = "main/2091/data.txt"

加入列表项,并添加相对路径的文件:

path = "/".join(script_directory) + "/" + rel_path

现在,您可以设置要对文件执行的任何操作,例如:

file = open(path)

Code:

import os
script_path = os.path.abspath(__file__) 
path_list = script_path.split(os.sep)
script_directory = path_list[0:len(path_list)-1]
rel_path = "main/2091/data.txt"
path = "/".join(script_directory) + "/" + rel_path

Explanation:

Import library:

import os

Use __file__ to attain the current script’s path:

script_path = os.path.abspath(__file__)

Separates the script path into multiple items:

path_list = script_path.split(os.sep)

Remove the last item in the list (the actual script file):

script_directory = path_list[0:len(path_list)-1]

Add the relative file’s path:

rel_path = "main/2091/data.txt"

Join the list items, and add the relative path of the file:

path = "/".join(script_directory) + "/" + rel_path

Now you are set to do whatever you want with the file, such as, for example:

file = open(path)

回答 6

如果文件在您的父文件夹中,例如。follower.txt,您可以简单地使用open('../follower.txt', 'r').read()

If the file is in your parent folder, eg. follower.txt, you can simply use open('../follower.txt', 'r').read()


回答 7

试试这个:

from pathlib import Path

data_folder = Path("/relative/path")
file_to_open = data_folder / "file.pdf"

f = open(file_to_open)

print(f.read())

Python 3.4引入了一个新的用于处理文件和路径的标准库,称为pathlib。这个对我有用!

Try this:

from pathlib import Path

data_folder = Path("relative/path")
file_to_open = data_folder / "file.pdf"

f = open(file_to_open)

print(f.read())

Python 3.4 introduced a new standard library for dealing with files and paths called pathlib. It works for me!
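
A slightly fuller sketch of the pathlib approach, using a with block so the file is closed automatically. The helper name read_text_relative and the temporary 2091/data.txt layout are assumptions made for the demonstration; in a real script the base directory would more likely be Path(__file__).resolve().parent.

```python
from pathlib import Path
import tempfile


def read_text_relative(base_dir, *parts):
    """Read the text file found at base_dir/parts[0]/parts[1]/..."""
    target = Path(base_dir).joinpath(*parts)
    with target.open(encoding='utf-8') as handle:
        return handle.read()


# Build a throwaway 2091/data.txt so the example can run anywhere.
with tempfile.TemporaryDirectory() as base:
    data_dir = Path(base) / '2091'
    data_dir.mkdir()
    (data_dir / 'data.txt').write_text('hello', encoding='utf-8')
    content = read_text_relative(base, '2091', 'data.txt')
    print(content)
```

The / operator and joinpath build the path with the right separator for the platform, so the same code works on Windows and on Unix-like systems.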


回答 8

不确定是否到处都能使用。

我在ubuntu中使用ipython。

如果要读取当前文件夹的子目录中的文件:

/current-folder/sub-directory/data.csv

您的脚本在当前文件夹中,只需尝试以下操作:

import pandas as pd
path = './sub-directory/data.csv'
pd.read_csv(path)

Not sure if this works everywhere.

I’m using IPython on Ubuntu.

If you want to read a file in a sub-directory of the current folder:

/current-folder/sub-directory/data.csv

and your script is in current-folder, simply try this:

import pandas as pd
path = './sub-directory/data.csv'
pd.read_csv(path)

回答 9

Python只是将您提供的文件名传递给操作系统,然后将其打开。如果您的操作系统支持相对路径main/2091/data.txt(如:提示),则可以正常工作。

您可能会发现,回答此类问题的最简单方法是尝试一下,看看会发生什么。

Python just passes the filename you give it to the operating system, which opens it. If your operating system supports relative paths like main/2091/data.txt (hint: it does), then that will work fine.

You may find that the easiest way to answer a question like this is to try it and see what happens.


回答 10

import os
def file_path(relative_path):
    dir = os.path.dirname(os.path.abspath(__file__))
    split_path = relative_path.split("/")
    new_path = os.path.join(dir, *split_path)
    return new_path

with open(file_path("2091/data.txt"), "w") as f:
    f.write("Powerful you have become.")

回答 11

当我还是初学者时,我发现这些描述有些令人生畏。一开始我会尝试 For Windows

f= open('C:\Users\chidu\Desktop\Skipper New\Special_Note.txt','w+')
print(f) 

这将引发一个syntax error。我曾经很困惑。然后在Google上进行一些冲浪之后。找到了发生错误的原因。写给初学者

这是因为要以Unicode读取路径,您只需\在启动文件路径时添加一个

f= open('C:\\Users\chidu\Desktop\Skipper New\Special_Note.txt','w+')
print(f)

现在,它可以\在启动目录之前添加。

When I was a beginner I found these descriptions a bit intimidating. At first, on Windows, I would try:

f= open('C:\Users\chidu\Desktop\Skipper New\Special_Note.txt','w+')
print(f)

and this would raise a syntax error. I used to get confused a lot. Then, after some searching on Google, I found why the error occurred. I am writing this for beginners.

In Python 3, \U inside an ordinary string literal starts a Unicode escape, so 'C:\Users' cannot be parsed. Adding one more \ at the start of the path escapes that backslash:

f= open('C:\\Users\chidu\Desktop\Skipper New\Special_Note.txt','w+')
print(f)

And now it works: just add a \ before the first directory. (The remaining backslashes only survive because \c, \D and \S are not recognized escape sequences; a folder name starting with n or t would still break.)
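
A more robust way to write such paths, offered as a sketch: use a raw string, double every backslash, or use forward slashes. The variable names below are illustrative, and nothing is opened, so the snippet runs on any platform. PureWindowsPath is used only to show that the spellings describe the same Windows path.

```python
from pathlib import PureWindowsPath

# Raw string: backslashes are taken literally, so \U is not treated as an escape.
raw_path = r'C:\Users\chidu\Desktop\Skipper New\Special_Note.txt'

# Ordinary string with every backslash doubled.
escaped_path = 'C:\\Users\\chidu\\Desktop\\Skipper New\\Special_Note.txt'

# Forward slashes, which Windows file APIs generally accept as well.
forward_path = 'C:/Users/chidu/Desktop/Skipper New/Special_Note.txt'

same_string = raw_path == escaped_path
same_path = PureWindowsPath(forward_path) == PureWindowsPath(raw_path)
print(same_string, same_path)
```

Any of the three spellings can then be passed to open() on Windows.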


numpy max vs amax vs maximum

问题:numpy max vs amax vs maximum

numpy的具有看起来他们可被用于同样的东西三个不同的函数—不同之处在于numpy.maximum被用于逐元素,而numpy.maxnumpy.amax可以在特定轴,或所有元件一起使用。为什么不仅仅存在numpy.max?在性能上有一些微妙之处吗?

(类似minvs. aminvs. minimum

numpy has three different functions which seem like they can be used for the same things — except that numpy.maximum can only be used element-wise, while numpy.max and numpy.amax can be used on particular axes, or all elements. Why is there more than just numpy.max? Is there some subtlety to this in performance?

(Similarly for min vs. amin vs. minimum)


回答 0

np.max只是的别名np.amax。此函数仅在单个输入数组上起作用,并在整个数组中找到最大元素的值(返回标量)。或者,它接受一个axis参数,并沿输入数组的轴找到最大值(返回一个新数组)。

>>> a = np.array([[0, 1, 6],
                  [2, 4, 1]])
>>> np.max(a)
6
>>> np.max(a, axis=0) # max of each column
array([2, 4, 6])

的默认行为np.maximum是采用两个数组并计算其按元素的最大值。在这里,“兼容”意味着可以将一个阵列广播到另一个阵列。例如:

>>> b = np.array([3, 6, 1])
>>> c = np.array([4, 2, 9])
>>> np.maximum(b, c)
array([4, 6, 9])

但是np.maximum它也是一个通用函数,这意味着它具有使用多维数组时有用的其他功能和方法。例如,您可以计算数组(或数组的特定轴)上的累积最大值:

>>> d = np.array([2, 0, 3, -4, -2, 7, 9])
>>> np.maximum.accumulate(d)
array([2, 2, 3, 3, 3, 7, 9])

无法使用np.max

您可以在使用时在一定程度上进行np.maximum模仿:np.maxnp.maximum.reduce

>>> np.maximum.reduce(d)
9
>>> np.max(d)
9

基本测试表明这两种方法在性能上是可比的。它们应该是np.max()实际需要np.maximum.reduce执行的计算。

np.max is just an alias for np.amax. This function only works on a single input array and finds the value of maximum element in that entire array (returning a scalar). Alternatively, it takes an axis argument and will find the maximum value along an axis of the input array (returning a new array).

>>> a = np.array([[0, 1, 6],
                  [2, 4, 1]])
>>> np.max(a)
6
>>> np.max(a, axis=0) # max of each column
array([2, 4, 6])

The default behaviour of np.maximum is to take two compatible arrays and compute their element-wise maximum. Here, ‘compatible’ means that one array can be broadcast to the other. For example:

>>> b = np.array([3, 6, 1])
>>> c = np.array([4, 2, 9])
>>> np.maximum(b, c)
array([4, 6, 9])

But np.maximum is also a universal function which means that it has other features and methods which come in useful when working with multidimensional arrays. For example you can compute the cumulative maximum over an array (or a particular axis of the array):

>>> d = np.array([2, 0, 3, -4, -2, 7, 9])
>>> np.maximum.accumulate(d)
array([2, 2, 3, 3, 3, 7, 9])
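
For the "particular axis" case, a minimal sketch on a 2-D array might look like this. The array m is invented for the example; accumulate works along axis 0 unless told otherwise.

```python
import numpy as np

m = np.array([[2, 0, 3],
              [1, 5, 2],
              [4, 1, 1]])

# Running maximum down each column (axis=0 is the default).
running_cols = np.maximum.accumulate(m, axis=0)
# Running maximum along each row.
running_rows = np.maximum.accumulate(m, axis=1)

print(running_cols)
# [[2 0 3]
#  [2 5 3]
#  [4 5 3]]
print(running_rows)
# [[2 2 3]
#  [1 5 5]
#  [4 4 4]]
```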

This is not possible with np.max.

You can make np.maximum imitate np.max to a certain extent by using np.maximum.reduce:

>>> np.maximum.reduce(d)
9
>>> np.max(d)
9

Basic testing suggests the two approaches are comparable in performance; and they should be, as np.max() actually calls np.maximum.reduce to do the computation.
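
One place where the imitation is only "to a certain extent" is the default axis. The sketch below, reusing the 2-D array from earlier in this answer, shows that np.maximum.reduce reduces along axis 0 by default, while np.max reduces over all elements unless an axis is given.

```python
import numpy as np

a = np.array([[0, 1, 6],
              [2, 4, 1]])

# ufunc.reduce defaults to axis=0, so a 2-D input yields one value per column.
reduce_default = np.maximum.reduce(a)
# Passing axis=None reduces over every axis, matching np.max(a).
reduce_all = np.maximum.reduce(a, axis=None)

print(reduce_default)  # [2 4 6]
print(reduce_all)      # 6
print(np.max(a))       # 6
```

For 1-D input such as d above, the two defaults happen to coincide, which is why both calls print 9 there.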


Answer 1

You’ve already stated why np.maximum is different – it returns an array that is the element-wise maximum between two arrays.

As for np.amax and np.max: they both call the same function – np.max is just an alias for np.amax, and they compute the maximum of all elements in an array, or along an axis of an array.

In [1]: import numpy as np

In [2]: np.amax
Out[2]: <function numpy.core.fromnumeric.amax>

In [3]: np.max
Out[3]: <function numpy.core.fromnumeric.amax>
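
The exact repr printed above depends on the installed NumPy version, so a version-independent way to convince yourself is to compare results instead. This is only a sketch with an arbitrary test array; it also assumes np.amax is still available in your NumPy release.

```python
import numpy as np

a = np.array([[0, 1, 6],
              [2, 4, 1]])

# Compare results rather than object identity or repr.
same_overall = np.max(a) == np.amax(a)
same_per_axis = np.array_equal(np.max(a, axis=1), np.amax(a, axis=1))
# The ndarray method gives the same answer as the module-level functions.
same_method = np.array_equal(a.max(axis=1), np.max(a, axis=1))

print(same_overall, same_per_axis, same_method)  # True True True
```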

Answer 2

For completeness, NumPy has four maximum-related functions. They fall into two categories:

  • np.amax/np.max, np.nanmax: for single array order statistics
  • and np.maximum, np.fmax: for element-wise comparison of two arrays

I. For single array order statistics

np.amax/np.max propagates NaNs; its NaN-ignoring counterpart is np.nanmax.

  • np.max is just an alias of np.amax, so they are considered as one function.

    >>> np.max.__name__
    'amax'
    >>> np.max is np.amax
    True
    
  • np.max propagates NaNs while np.nanmax ignores NaNs.

    >>> np.max([np.nan, 3.14, -1])
    nan
    >>> np.nanmax([np.nan, 3.14, -1])
    3.14
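
As a sketch of how the two behave along an axis, the snippet below uses a made-up readings array with one partly missing row and one entirely missing row. When a slice is all NaN, np.nanmax has nothing to return, so it emits a RuntimeWarning and yields NaN; the warning is silenced here only to keep the output clean.

```python
import warnings

import numpy as np

readings = np.array([[1.0, np.nan, 3.0],
                     [np.nan, np.nan, np.nan]])

# np.max propagates NaN: any NaN in a row makes that row's result NaN.
propagated = np.max(readings, axis=1)

# np.nanmax skips NaNs where it can; the all-NaN row still gives NaN.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", RuntimeWarning)
    ignored = np.nanmax(readings, axis=1)

print(propagated)  # both entries are nan
print(ignored)     # first entry is 3.0, second is nan
```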
    

II. For element-wise comparison of two arrays

np.maximum propagates NaNs; its NaN-ignoring counterpart is np.fmax.

  • Both functions take the two arrays to compare as their first two positional arguments.

    # x1 and x2 must have the same shape or be broadcastable to a common shape
    np.maximum(x1, x2, /, ...)
    np.fmax(x1, x2, /, ...)
    
  • np.maximum propagates NaNs while np.fmax ignores NaNs.

    >>> np.maximum([np.nan, 3.14, 0], [-np.inf, np.nan, 2.72])
    array([ nan,  nan, 2.72])
    >>> np.fmax([np.nan, 3.14, 0], [-np.inf, np.nan, 2.72])
    array([-inf, 3.14, 2.72])
    
  • The element-wise functions are instances of np.ufunc (universal functions), which means they have some special properties that ordinary NumPy functions don’t have.

    >>> type(np.maximum)
    <class 'numpy.ufunc'>
    >>> type(np.fmax)
    <class 'numpy.ufunc'>
    >>> #---------------#
    >>> type(np.max)
    <class 'function'>
    >>> type(np.nanmax)
    <class 'function'>
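
To illustrate a few of those special properties, here is a short sketch using the reduce and outer methods and the out= argument that ufuncs provide. The arrays x and buf are invented for the example.

```python
import numpy as np

x = np.array([np.nan, 3.14, 0.0])

# .reduce collapses one array using the ufunc's own NaN rule.
max_reduced = np.maximum.reduce(x)  # nan, because np.maximum propagates NaN
fmax_reduced = np.fmax.reduce(x)    # 3.14, because np.fmax ignores NaN

# .outer compares every element of the first input with every element of the second.
table = np.maximum.outer([1, 5], [2, 3, 4])
print(table)
# [[2 3 4]
#  [5 5 5]]

# out= writes the result into a preallocated array instead of creating a new one.
buf = np.empty(3)
np.fmax(x, 1.0, out=buf)
print(buf)  # values 1.0, 3.14, 1.0
```

None of these are available on np.max or np.nanmax, which are plain Python functions.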
    

And finally, the same rules apply to the four minimum-related functions:

  • np.amin/np.min, np.nanmin;
  • and np.minimum, np.fmin.
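
A quick sketch of the minimum family, mirroring the maximum examples above with an illustrative values array:

```python
import numpy as np

values = np.array([np.nan, 3.14, -1.0])

# Single-array reductions.
min_propagated = np.min(values)  # nan
min_ignored = np.nanmin(values)  # -1.0

# Element-wise comparison against 0.0 (the scalar is broadcast).
elem_propagated = np.minimum(values, 0.0)  # nan, 0.0, -1.0
elem_ignored = np.fmin(values, 0.0)        # 0.0, 0.0, -1.0

print(min_propagated, min_ignored)
print(elem_propagated)
print(elem_ignored)
```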

Answer 3

np.maximum not only compares two arrays element-wise; it can also compare each element of an array with a single value:

>>> np.maximum([23, 14, 16, 20, 25], 18)
array([23, 18, 18, 20, 25])
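
This scalar form is a common way to impose a lower bound, and np.minimum gives the matching upper bound. The sketch below reuses the numbers from the example; the name scores and the upper bound of 20 are assumptions for illustration. np.clip does both bounds in one call.

```python
import numpy as np

scores = np.array([23, 14, 16, 20, 25])

floored = np.maximum(scores, 18)  # raise anything below 18 up to 18
capped = np.minimum(scores, 20)   # lower anything above 20 down to 20
clamped = np.minimum(np.maximum(scores, 18), 20)

print(floored)  # [23 18 18 20 25]
print(capped)   # [20 14 16 20 20]
print(clamped)  # [20 18 18 20 20]

# np.clip applies both bounds at once.
print(np.array_equal(clamped, np.clip(scores, 18, 20)))  # True
```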
