Python 实用宝典

Question 1

I’m building a Django project that needs search functionality, and until there’s a django.contrib.search, I have to choose a search app. So, which is the best? By “best” I mean…

easy to install / set up
has a Django- or at least Python-friendly API
can perform reasonably complex searches

Here are some apps I’ve heard of, please suggest others if you know of any:

I’d also like to avoid using a third-party search engine (like Google SiteSearch), because some of the data I’d like to index is for site members only and should not be public.

Question 2

Check out Haystack Search – a new model based search abstraction layer that currently supports Xapian, Solr and Whoosh. Looks like it’s well supported and documented.

Question 3

Justin, I’d try djangosearch first: Jacob Kaplan-Moss (Django’s lead developer) is working on it.

Potential hazards:

The home page warns the API might not be entirely stable

Potential benefits:

“The long term goal is for this to become django.contrib.search.”

Question 4

I am searching for the same thing, as are a lot of other people. Let’s hope that django.contrib.search will be added soon.

In the meantime, this is what I found:

To me, most look quite complicated and, frankly, a little daunting to implement. I’d be interested to learn what you think of these.

Question 5

The google code page for djangosearch indicates that it is no longer under active development, and suggests haystack or solango.

Question 6

I’d recommend Sphinx for full-text search and aggregation, and django-sphinx is good enough for production use. We found that Sphinx was the least resource-intensive and fastest way to index and search our documents and that django-sphinx was a nice wrapper on top of the sphinx client.

The group by aggregation is particularly nice, if for example you want to display how many documents with a certain tag or by a certain author (or both) matched a search. In memory attribute updates were convenient too, especially for removing deleted articles immediately.

Question 7

Thanks Garth. I had seen that djangosearch wanted to become the official Django search, but I was hesitant to use it because I couldn’t find any documentation! Luckily, there’s a README in subversion that I hadn’t seen before, and it makes the API look very cool:

# set up the model
class Event(models.Model):
    title = models.CharField(max_length=255)
    date = models.DateField()
    is_outdoors = models.BooleanField()

    index = djangosearch.ModelIndex(text=['title'], 
                                    additional=['date', 'is_outdoors'])

# run a search
results = Event.index.search("django conference")

Question 8

I just needed a very quick solution that was no-fuss for an internal app.

I found the article Adding search to Django in a snap, and that worked splendid for me!

Obviously it lacks the speed, scalability and features of the real projects like Haystack, but this one is easier to set up, and I don’t really need anything else than keyword AND-search.

Question 9

You might want to consider letting Yahoo do all the hard work with their Build your own Search Service (BOSS). Here is a great blog post that walks you through the process: http://www.peterkrantz.com/2008/yahoo-search-in-django/

Question 10

It looks like everyone here missed django-xappy

After quick evaluation of all existing search addons for Django, I found this one as most flexible and easiest to use. It’s rough on the edges in few places, but it’s still the best way to use power of Xapian search engine inside Django projects.

Question 11

You might want to look at Django Solr search (aka “Solango”) which comes with some nice documentation to get you started…

Question 12

If you have large amount of data to be indexed or you expect high traffic, I’d suggest using some external search engine, like Solr. This way, you’ll keep shared-nothing approach and be able to scale your site components independently.

Question 13

I think I am going to have to give a shout out to Djapian.

It is rock-solid…just pull down a source distribution and peek inside. Top notch code, not very many comments tho..

It’s still a young software project, but I think the django community should throw it’s weight behind this one.

Question 14

Thanks Joe,

We decided to go with Tsearch2 and a custom postgres adaptor. Tsearch2 does not need an extra process to run, which was convenient since we are on a WebFaction hosting with limited memory… It’s not completely done yet, but seems to be a good solution…

Question 15

I found Djoosh which relies on the pure-python external search engine Whoosh to work well with my ‘Python’ brain.

Question 16

If you are willing to use a 3rd party search engine I can recommend Yahoo BOSS and django-bosssearch.

Yahoo BOSS is a paid service, but it saves you setting up and maintaining other search software on your server.

Question 17

I have just joined a project with a rather large existing code base. We develop in linux and do not use and IDE. We run through the command line. I’m trying to figure out how to get python to search for the right path when I run project modules. For instance, when I run something like:

python someprojectfile.py

I get

ImportError: no module named core.'somemodule'

I get this for all of my imports to I assume it’s an issue with the path.

TLDR:

How do I get Python to search ~/codez/project/ and all the files and folders for *.py files during import statements.

Question 18

There are a few possible ways to do this:

Set the environment variable PYTHONPATH to a colon-separated list of directories to search for imported modules.
In your program, use sys.path.append('/path/to/search') to add the names of directories you want Python to search for imported modules. sys.path is just the list of directories Python searches every time it gets asked to import a module, and you can alter it as needed (although I wouldn’t recommend removing any of the standard directories!). Any directories you put in the environment variable PYTHONPATH will be inserted into sys.path when Python starts up.
Use site.addsitedir to add a directory to sys.path. The difference between this and just plain appending is that when you use addsitedir, it also looks for .pth files within that directory and uses them to possibly add additional directories to sys.path based on the contents of the files. See the documentation for more detail.

Which one of these you want to use depends on your situation. Remember that when you distribute your project to other users, they typically install it in such a manner that the Python code files will be automatically detected by Python’s importer (i.e. packages are usually installed in the site-packages directory), so if you mess with sys.path in your code, that may be unnecessary and might even have adverse effects when that code runs on another computer. For development, I would venture a guess that setting PYTHONPATH is usually the best way to go.

However, when you’re using something that just runs on your own computer (or when you have nonstandard setups, e.g. sometimes in web app frameworks), it’s not entirely uncommon to do something like

import sys
from os.path import dirname
sys.path.append(dirname(__file__))

Question 19

You should also read about python packages here: http://docs.python.org/tutorial/modules.html.

From your example, I would guess that you really have a package at ~/codez/project. The file __init__.py in a python directory maps a directory into a namespace. If your subdirectories all have an __init__.py file, then you only need to add the base directory to your PYTHONPATH. For example:

PYTHONPATH=$PYTHONPATH:$HOME/adaifotis/project

In addition to testing your PYTHONPATH environment variable, as David explains, you can test it in python like this:

$ python
>>> import project                      # should work if PYTHONPATH set
>>> import sys
>>> for line in sys.path: print line    # print current python path

…

Question 20

I know this thread is a bit old, but it took me some time to get to the heart of this, so I wanted to share.

In my project, I had the main script in a parent directory, and, to differentiate the modules, I put all the supporting modules in a sub-folder called “modules”. In my main script, I import these modules like this (for a module called report.py):

from modules.report import report, reportError

If I call my main script, this works. HOWEVER, I wanted to test each module by including a main() in each, and calling each directly, as:

python modules/report.py

Now Python complains that it can’t find “a module called modules”. The key here is that, by default, Python includes the folder of the script in its search path, BUT NOT THE CWD. So what this error says, really, is “I can’t find a modules subfolder”. The is because there is no “modules” subdirectory from the directory where the report.py module resides.

I find that the neatest solution to this is to append the CWD in Python search path by including this at the top:

import sys

sys.path.append(".")

Now Python searches the CWD (current directory), finds the “modules” sub-folder, and all is well.

Question 21

I read this question looking for an answer, and didn’t like any of them.

So I wrote a quick and dirty solution. Just put this somewhere on your sys.path, and it’ll add any directory under folder (from the current working directory), or under abspath:

#using.py

import sys, os.path

def all_from(folder='', abspath=None):
    """add all dirs under `folder` to sys.path if any .py files are found.
    Use an abspath if you'd rather do it that way.

    Uses the current working directory as the location of using.py. 
    Keep in mind that os.walk goes *all the way* down the directory tree.
    With that, try not to use this on something too close to '/'

    """
    add = set(sys.path)
    if abspath is None:
        cwd = os.path.abspath(os.path.curdir)
        abspath = os.path.join(cwd, folder)
    for root, dirs, files in os.walk(abspath):
        for f in files:
            if f[-3:] in '.py':
                add.add(root)
                break
    for i in add: sys.path.append(i)

>>> import using, sys, pprint
>>> using.all_from('py') #if in ~, /home/user/py/
>>> pprint.pprint(sys.path)
[
#that was easy
]

And I like it because I can have a folder for some random tools and not have them be a part of packages or anything, and still get access to some (or all) of them in a couple lines of code.

Question 22

The easiest way I find is to create a file “any_name.pth” and put it in your folder “\Lib\site-packages”. You should find that folder wherever python is installed.

In that file, put a list of directories where you want to keep modules for importing. For instance, make a line in that file like this:

C:\Users\example…\example

You will be able to tell it works by running this in python:

import sys
for line in sys: print line

You will see your directory printed out, amongst others from where you can also import. Now you can import a “mymodule.py” file that sits in that directory as easily as:

import mymodule

This will not import subfolders. For that you could imagine creating a python script to create a .pth file containing all sub folders of a folder you define. Have it run at startup perhaps.

Question 23

New option for old question.
Installing fail2ban package on Debian, looks like it’s hardcoded to install on /usr/lib/python3/dist-packages/fail2ban a path not on python3 sys.path.


> python3
Python 3.7.3 (v3.7.3:ef4ec6ed12, Jun 25 2019, 18:51:50)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']
>>>

so, instead of just copying, I (bash) linked the library to newer versions.
Future updates to the original app, will also be automatically applied to the linked versions.

 if [ -d /usr/lib/python3/dist-packages/fail2ban ]
   then
      for d in /usr/lib/python3.*
      do
         [ -d ${d}/fail2ban ] || \
            ln -vs /usr/lib/python3/dist-packages/fail2ban ${d}/
      done
   fi

Question 24

Which is the best data structure that can be used to implement a binary tree in Python?

Question 25

Here is my simple recursive implementation of binary search tree.

#!/usr/bin/python

class Node:
    def __init__(self, val):
        self.l = None
        self.r = None
        self.v = val

class Tree:
    def __init__(self):
        self.root = None

    def getRoot(self):
        return self.root

    def add(self, val):
        if self.root is None:
            self.root = Node(val)
        else:
            self._add(val, self.root)

    def _add(self, val, node):
        if val < node.v:
            if node.l is not None:
                self._add(val, node.l)
            else:
                node.l = Node(val)
        else:
            if node.r is not None:
                self._add(val, node.r)
            else:
                node.r = Node(val)

    def find(self, val):
        if self.root is not None:
            return self._find(val, self.root)
        else:
            return None

    def _find(self, val, node):
        if val == node.v:
            return node
        elif (val < node.v and node.l is not None):
            self._find(val, node.l)
        elif (val > node.v and node.r is not None):
            self._find(val, node.r)

    def deleteTree(self):
        # garbage collector will do this for us. 
        self.root = None

    def printTree(self):
        if self.root is not None:
            self._printTree(self.root)

    def _printTree(self, node):
        if node is not None:
            self._printTree(node.l)
            print(str(node.v) + ' ')
            self._printTree(node.r)

#     3
# 0     4
#   2      8
tree = Tree()
tree.add(3)
tree.add(4)
tree.add(0)
tree.add(8)
tree.add(2)
tree.printTree()
print(tree.find(3).v)
print(tree.find(10))
tree.deleteTree()
tree.printTree()

Question 26

# simple binary tree
# in this implementation, a node is inserted between an existing node and the root


class BinaryTree():

    def __init__(self,rootid):
      self.left = None
      self.right = None
      self.rootid = rootid

    def getLeftChild(self):
        return self.left
    def getRightChild(self):
        return self.right
    def setNodeValue(self,value):
        self.rootid = value
    def getNodeValue(self):
        return self.rootid

    def insertRight(self,newNode):
        if self.right == None:
            self.right = BinaryTree(newNode)
        else:
            tree = BinaryTree(newNode)
            tree.right = self.right
            self.right = tree

    def insertLeft(self,newNode):
        if self.left == None:
            self.left = BinaryTree(newNode)
        else:
            tree = BinaryTree(newNode)
            tree.left = self.left
            self.left = tree


def printTree(tree):
        if tree != None:
            printTree(tree.getLeftChild())
            print(tree.getNodeValue())
            printTree(tree.getRightChild())



# test tree

def testTree():
    myTree = BinaryTree("Maud")
    myTree.insertLeft("Bob")
    myTree.insertRight("Tony")
    myTree.insertRight("Steven")
    printTree(myTree)

Read more about it Here:-This is a very simple implementation of a binary tree.

This is a nice tutorial with questions in between

Question 27

[What you need for interviews] A Node class is the sufficient data structure to represent a binary tree.

(While other answers are mostly correct, they are not required for a binary tree: no need to extend object class, no need to be a BST, no need to import deque).

class Node:

    def __init__(self, value = None):
        self.left  = None
        self.right = None
        self.value = value

Here is an example of a tree:

n1 = Node(1)
n2 = Node(2)
n3 = Node(3)
n1.left  = n2
n1.right = n3

In this example n1 is the root of the tree having n2, n3 as its children.

Question 28

Simple implementation of BST in Python

class TreeNode:
    def __init__(self, value):
        self.left = None
        self.right = None
        self.data = value

class Tree:
    def __init__(self):
        self.root = None

    def addNode(self, node, value):
        if(node==None):
            self.root = TreeNode(value)
        else:
            if(value<node.data):
                if(node.left==None):
                    node.left = TreeNode(value)
                else:
                    self.addNode(node.left, value)
            else:
                if(node.right==None):
                    node.right = TreeNode(value)
                else:
                    self.addNode(node.right, value)

    def printInorder(self, node):
        if(node!=None):
            self.printInorder(node.left)
            print(node.data)
            self.printInorder(node.right)

def main():
    testTree = Tree()
    testTree.addNode(testTree.root, 200)
    testTree.addNode(testTree.root, 300)
    testTree.addNode(testTree.root, 100)
    testTree.addNode(testTree.root, 30)
    testTree.printInorder(testTree.root)

Question 29

A very quick ‘n dirty way of implementing a binary tree using lists. Not the most efficient, nor does it handle nil values all too well. But it’s very transparent (at least to me):

def _add(node, v):
    new = [v, [], []]
    if node:
        left, right = node[1:]
        if not left:
            left.extend(new)
        elif not right:
            right.extend(new)
        else:
            _add(left, v)
    else:
        node.extend(new)

def binary_tree(s):
    root = []
    for e in s:
        _add(root, e)
    return root

def traverse(n, order):
    if n:
        v = n[0]
        if order == 'pre':
            yield v
        for left in traverse(n[1], order):
            yield left
        if order == 'in':
            yield v
        for right in traverse(n[2], order):
            yield right
        if order == 'post':
            yield v

Constructing a tree from an iterable:

 >>> tree = binary_tree('A B C D E'.split())
 >>> print tree
 ['A', ['B', ['D', [], []], ['E', [], []]], ['C', [], []]]

Traversing a tree:

 >>> list(traverse(tree, 'pre')), list(traverse(tree, 'in')), list(traverse(tree, 'post'))
 (['A', 'B', 'D', 'E', 'C'],
  ['D', 'B', 'E', 'A', 'C'],
  ['D', 'E', 'B', 'C', 'A'])

Question 30

I can’t help but notice that most answers here are implementing a Binary Search Tree. Binary Search Tree != Binary Tree.

A Binary Search Tree has a very specific property: for any node X, X’s key is larger than the key of any descendent of its left child, and smaller than the key of any descendant of its right child.
A Binary Tree imposes no such restriction. A Binary Tree is simply a data structure with a ‘key’ element, and two children, say ‘left’ and ‘right’.
A Tree is an even more general case of a Binary Tree where each node can have an arbitrary number of children. Typically, each node has a ‘children’ element which is of type list/array.

Now, to answer the OP’s question, I am including a full implementation of a Binary Tree in Python. The underlying data structure storing each BinaryTreeNode is a dictionary, given it offers optimal O(1) lookups. I’ve also implemented depth-first and breadth-first traversals. These are very common operations performed on trees.

from collections import deque

class BinaryTreeNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

    def __repr__(self):
        return "%s l: (%s) r: (%s)" % (self.key, self.left, self.right)

    def __eq__(self, other):
        if self.key == other.key and \
            self.right == other.right and \
                self.left == other.left:
            return True
        else:
            return False

class BinaryTree:
    def __init__(self, root_key=None):
        # maps from BinaryTreeNode key to BinaryTreeNode instance.
        # Thus, BinaryTreeNode keys must be unique.
        self.nodes = {}
        if root_key is not None:
            # create a root BinaryTreeNode
            self.root = BinaryTreeNode(root_key)
            self.nodes[root_key] = self.root

    def add(self, key, left_key=None, right_key=None):
        if key not in self.nodes:
            # BinaryTreeNode with given key does not exist, create it
            self.nodes[key] = BinaryTreeNode(key)
        # invariant: self.nodes[key] exists

        # handle left child
        if left_key is None:
            self.nodes[key].left = None
        else:
            if left_key not in self.nodes:
                self.nodes[left_key] = BinaryTreeNode(left_key)
            # invariant: self.nodes[left_key] exists
            self.nodes[key].left = self.nodes[left_key]

        # handle right child
        if right_key == None:
            self.nodes[key].right = None
        else:
            if right_key not in self.nodes:
                self.nodes[right_key] = BinaryTreeNode(right_key)
            # invariant: self.nodes[right_key] exists
            self.nodes[key].right = self.nodes[right_key]

    def remove(self, key):
        if key not in self.nodes:
            raise ValueError('%s not in tree' % key)
        # remove key from the list of nodes
        del self.nodes[key]
        # if node removed is left/right child, update parent node
        for k in self.nodes:
            if self.nodes[k].left and self.nodes[k].left.key == key:
                self.nodes[k].left = None
            if self.nodes[k].right and self.nodes[k].right.key == key:
                self.nodes[k].right = None
        return True

    def _height(self, node):
        if node is None:
            return 0
        else:
            return 1 + max(self._height(node.left), self._height(node.right))

    def height(self):
        return self._height(self.root)

    def size(self):
        return len(self.nodes)

    def __repr__(self):
        return str(self.traverse_inorder(self.root))

    def bfs(self, node):
        if not node or node not in self.nodes:
            return
        reachable = []    
        q = deque()
        # add starting node to queue
        q.append(node)
        while len(q):
            visit = q.popleft()
            # add currently visited BinaryTreeNode to list
            reachable.append(visit)
            # add left/right children as needed
            if visit.left:
                q.append(visit.left)
            if visit.right:
                q.append(visit.right)
        return reachable

    # visit left child, root, then right child
    def traverse_inorder(self, node, reachable=None):
        if not node or node.key not in self.nodes:
            return
        if reachable is None:
            reachable = []
        self.traverse_inorder(node.left, reachable)
        reachable.append(node.key)
        self.traverse_inorder(node.right, reachable)
        return reachable

    # visit left and right children, then root
    # root of tree is always last to be visited
    def traverse_postorder(self, node, reachable=None):
        if not node or node.key not in self.nodes:
            return
        if reachable is None:
            reachable = []
        self.traverse_postorder(node.left, reachable)
        self.traverse_postorder(node.right, reachable)
        reachable.append(node.key)
        return reachable

    # visit root, left, then right children
    # root is always visited first
    def traverse_preorder(self, node, reachable=None):
        if not node or node.key not in self.nodes:
            return
        if reachable is None:
            reachable = []
        reachable.append(node.key)
        self.traverse_preorder(node.left, reachable)
        self.traverse_preorder(node.right, reachable)
        return reachable

Question 31

you don’t need to have two classes

class Tree:
    val = None
    left = None
    right = None

    def __init__(self, val):
        self.val = val


    def insert(self, val):
        if self.val is not None:
            if val < self.val:
                if self.left is not None:
                    self.left.insert(val)
                else:
                    self.left = Tree(val)
            elif val > self.val:
                if self.right is not None:
                    self.right.insert(val)
                else:
                    self.right = Tree(val)
            else:
                return
        else:
            self.val = val
            print("new node added")

    def showTree(self):
        if self.left is not None:
            self.left.showTree()
        print(self.val, end = ' ')
        if self.right is not None:
            self.right.showTree()

Question 32

A little more “Pythonic” ?

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

    def __repr__(self):
        return str(self.value)



class BST:
    def __init__(self):
        self.root = None

    def __repr__(self):
        self.sorted = []
        self.get_inorder(self.root)
        return str(self.sorted)

    def get_inorder(self, node):
        if node:
            self.get_inorder(node.left)
            self.sorted.append(str(node.value))
            self.get_inorder(node.right)

    def add(self, value):
        if not self.root:
            self.root = Node(value)
        else:
            self._add(self.root, value)

    def _add(self, node, value):
        if value <= node.value:
            if node.left:
                self._add(node.left, value)
            else:
                node.left = Node(value)
        else:
            if node.right:
                self._add(node.right, value)
            else:
                node.right = Node(value)



from random import randint

bst = BST()

for i in range(100):
    bst.add(randint(1, 1000))
print (bst)

Question 33

#!/usr/bin/python

class BinaryTree:
    def __init__(self, left, right, data):
        self.left = left
        self.right = right
        self.data = data


    def pre_order_traversal(root):
        print(root.data, end=' ')

        if root.left != None:
            pre_order_traversal(root.left)

        if root.right != None:
            pre_order_traversal(root.right)

    def in_order_traversal(root):
        if root.left != None:
            in_order_traversal(root.left)
        print(root.data, end=' ')
        if root.right != None:
            in_order_traversal(root.right)

    def post_order_traversal(root):
        if root.left != None:
            post_order_traversal(root.left)
        if root.right != None:
            post_order_traversal(root.right)
        print(root.data, end=' ')

Question 34

A Node-based class of connected nodes is a standard approach. These can be hard to visualize.

Motivated from an essay on Python Patterns – Implementing Graphs, consider a simple dictionary:

Given

A binary tree

Code

Make a dictionary of unique nodes:

tree = {
   "a": ["b", "c"],
   "b": ["d", "e"],
   "c": [None, "f"],
   "d": [None, None],
   "e": [None, None],
   "f": [None, None],
}

Details

Each key-value pair is a unique node pointing to its children.
A list (or tuple) holds an ordered pair of left/right children.
With a dict having ordered insertion, assume the first entry is the root.
Common methods can be functions that mutate or traverse the dict (see find_all_paths()).

Tree-based functions often include the following common operations:

traverse: yield each node in a given order (usually left-to-right)
- breadth-first search (BFS): traverse levels
- depth-first search (DFS): traverse branches first (pre-/in-/post-order)
insert: add a node to the tree depending on the number of children
remove: remove a node depending on the number of children
update: merge missing nodes from one tree to the other
visit: yield the value of a traversed node

Try implementing all of these operations. Here we demonstrate one of these functions – a BFS traversal:

Example

import collections as ct


def traverse(tree):
    """Yield nodes from a tree via BFS."""
    q = ct.deque()                                         # 1
    root = next(iter(tree))                                # 2
    q.append(root)

    while q:
        node = q.popleft()
        children = filter(None, tree.get(node))
        for n in children:                                 # 3 
            q.append(n)
        yield node

list(traverse(tree))
# ['a', 'b', 'c', 'd', 'e', 'f']

This is a breadth-first search (level-order) algorithm applied to a dict of nodes and children.

Initialize a FIFO queue. We use a deque, but a queue or a list works (the latter is inefficient).
Get and enqueue the root node (assumes the root is the first entry in the dict, Python 3.6+).
Iteratively dequeue a node, enqueue its children and yield the node value.

See also this in-depth tutorial on trees.

Insight

Something great about traversals in general, we can easily alter the latter iterative approach to depth-first search (DFS) by simply replacing the queue with a stack (a.k.a LIFO Queue). This simply means we dequeue from the same side that we enqueue. DFS allows us to search each branch.

How? Since we are using a deque, we can emulate a stack by changing node = q.popleft() to node = q.pop() (right). The result is a right-favored, pre-ordered DFS: ['a', 'c', 'f', 'b', 'e', 'd'].

Question 35

import random

class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.p = None

class BinaryTree:
    def __init__(self):
        self.root = None

    def length(self):
        return self.size

    def inorder(self, node):
        if node == None:
            return None
        else:
            self.inorder(node.left)
            print node.key,
            self.inorder(node.right)

    def search(self, k):
        node = self.root
        while node != None:
            if node.key == k:
                return node
            if node.key > k:
                node = node.left
            else:
                node = node.right
        return None

    def minimum(self, node):
        x = None
        while node.left != None:
            x = node.left
            node = node.left
        return x

    def maximum(self, node):
        x = None
        while node.right != None:
            x = node.right
            node = node.right
        return x

    def successor(self, node):
        parent = None
        if node.right != None:
            return self.minimum(node.right)
        parent = node.p
        while parent != None and node == parent.right:
            node = parent
            parent = parent.p
        return parent

    def predecessor(self, node):
        parent = None
        if node.left != None:
            return self.maximum(node.left)
        parent = node.p
        while parent != None and node == parent.left:
            node = parent
            parent = parent.p
        return parent

    def insert(self, k):
        t = TreeNode(k)
        parent = None
        node = self.root
        while node != None:
            parent = node
            if node.key > t.key:
                node = node.left
            else:
                node = node.right
        t.p = parent
        if parent == None:
            self.root = t
        elif t.key < parent.key:
            parent.left = t
        else:
            parent.right = t
        return t


    def delete(self, node):
        if node.left == None:
            self.transplant(node, node.right)
        elif node.right == None:
            self.transplant(node, node.left)
        else:
            succ = self.minimum(node.right)
            if succ.p != node:
                self.transplant(succ, succ.right)
                succ.right = node.right
                succ.right.p = succ
            self.transplant(node, succ)
            succ.left = node.left
            succ.left.p = succ

    def transplant(self, node, newnode):
        if node.p == None:
            self.root = newnode
        elif node == node.p.left:
            node.p.left = newnode
        else:
            node.p.right = newnode
        if newnode != None:
            newnode.p = node.p

Question 36

This implementation supports insert, find and delete operations without destroy the structure of the tree. This is not a banlanced tree.

# Class for construct the nodes of the tree. (Subtrees)
class Node:
def __init__(self, key, parent_node = None):
    self.left = None
    self.right = None
    self.key = key
    if parent_node == None:
        self.parent = self
    else:
        self.parent = parent_node

# Class with the  structure of the tree. 
# This Tree is not balanced.
class Tree:
def __init__(self):
    self.root = None

# Insert a single element
def insert(self, x):
    if(self.root == None):
        self.root = Node(x)
    else:
        self._insert(x, self.root)

def _insert(self, x, node):
    if(x < node.key):
        if(node.left == None):
            node.left = Node(x, node)
        else:
            self._insert(x, node.left)
    else:
        if(node.right == None):
            node.right = Node(x, node)
        else:
            self._insert(x, node.right)

# Given a element, return a node in the tree with key x. 
def find(self, x):
    if(self.root == None):
        return None
    else:
        return self._find(x, self.root)
def _find(self, x, node):
    if(x == node.key):
        return node
    elif(x < node.key):
        if(node.left == None):
            return None
        else:
            return self._find(x, node.left)
    elif(x > node.key):
        if(node.right == None):
            return None
        else:
            return self._find(x, node.right)

# Given a node, return the node in the tree with the next largest element.
def next(self, node):
    if node.right != None:
        return self._left_descendant(node.right)
    else:
        return self._right_ancestor(node)

def _left_descendant(self, node):
    if node.left == None:
        return node
    else:
        return self._left_descendant(node.left)

def _right_ancestor(self, node):
    if node.key <= node.parent.key:
        return node.parent
    else:
        return self._right_ancestor(node.parent)

# Delete an element of the tree
def delete(self, x):
    node = self.find(x)
    if node == None:
        print(x, "isn't in the tree")
    else:
        if node.right == None:
            if node.left == None:
                if node.key < node.parent.key:
                    node.parent.left = None
                    del node # Clean garbage
                else:
                    node.parent.right = None
                    del Node # Clean garbage
            else:
                node.key = node.left.key
                node.left = None
        else:
            x = self.next(node)
            node.key = x.key
            x = None


# tests
t = Tree()
t.insert(5)
t.insert(8)
t.insert(3)
t.insert(4)
t.insert(6)
t.insert(2)

t.delete(8)
t.delete(5)

t.insert(9)
t.insert(1)

t.delete(2)
t.delete(100)

# Remember: Find method return the node object. 
# To return a number use t.find(nº).key
# But it will cause an error if the number is not in the tree.
print(t.find(5)) 
print(t.find(8))
print(t.find(4))
print(t.find(6))
print(t.find(9))

Question 37

I know many good solutions have already been posted but I usually have a different approach for binary trees: going with some Node class and implementing it directly is more readable but when you have a lot of nodes it can become very greedy regarding memory, so I suggest adding one layer of complexity and storing the nodes in a python list, and then simulating a tree behavior using only the list.

You can still define a Node class to finally represent the nodes in the tree when needed, but keeping them in a simple form [value, left, right] in a list will use half the memory or less!

Here is a quick example of a Binary Search Tree class storing the nodes in an array. It provides basic fonctions such as add, remove, find…

"""
Basic Binary Search Tree class without recursion...
"""

__author__ = "@fbparis"

class Node(object):
    __slots__ = "value", "parent", "left", "right"
    def __init__(self, value, parent=None, left=None, right=None):
        self.value = value
        self.parent = parent
        self.left = left
        self.right = right

    def __repr__(self):
        return "<%s object at %s: parent=%s, left=%s, right=%s, value=%s>" % (self.__class__.__name__, hex(id(self)), self.parent, self.left, self.right, self.value)

class BinarySearchTree(object):
    __slots__ = "_tree"
    def __init__(self, *args):
        self._tree = []
        if args:
            for x in args[0]:
                self.add(x)

    def __len__(self):
        return len(self._tree)

    def __repr__(self):
        return "<%s object at %s with %d nodes>" % (self.__class__.__name__, hex(id(self)), len(self))

    def __str__(self, nodes=None, level=0):
        ret = ""
        if nodes is None:
            if len(self):
                nodes = [0]
            else:
                nodes = []
        for node in nodes:
            if node is None:
                continue
            ret += "-" * level + " %s\n" % self._tree[node][0]
            ret += self.__str__(self._tree[node][2:4], level + 1)
        if level == 0:
            ret = ret.strip()
        return ret

    def __contains__(self, value):
        if len(self):
            node_index = 0
            while self._tree[node_index][0] != value:
                if value < self._tree[node_index][0]:
                    node_index = self._tree[node_index][2]
                else:
                    node_index = self._tree[node_index][3]
                if node_index is None:
                    return False
            return True
        return False

    def __eq__(self, other):
        return self._tree == other._tree

    def add(self, value):
        if len(self):
            node_index = 0
            while self._tree[node_index][0] != value:
                if value < self._tree[node_index][0]:
                    b = self._tree[node_index][2]
                    k = 2
                else:
                    b = self._tree[node_index][3]
                    k = 3
                if b is None:
                    self._tree[node_index][k] = len(self)
                    self._tree.append([value, node_index, None, None])
                    break
                node_index = b
        else:
            self._tree.append([value, None, None, None])

    def remove(self, value):
        if len(self):
            node_index = 0
            while self._tree[node_index][0] != value:
                if value < self._tree[node_index][0]:
                    node_index = self._tree[node_index][2]
                else:
                    node_index = self._tree[node_index][3]
                if node_index is None:
                    raise KeyError
            if self._tree[node_index][2] is not None:
                b, d = 2, 3
            elif self._tree[node_index][3] is not None:
                b, d = 3, 2
            else:
                i = node_index
                b = None
            if b is not None:
                i = self._tree[node_index][b]
                while self._tree[i][d] is not None:
                    i = self._tree[i][d]
                p = self._tree[i][1]
                b = self._tree[i][b]
                if p == node_index:
                    self._tree[p][5-d] = b
                else:
                    self._tree[p][d] = b
                if b is not None:
                    self._tree[b][1] = p
                self._tree[node_index][0] = self._tree[i][0]
            else:
                p = self._tree[i][1]
                if p is not None:
                    if self._tree[p][2] == i:
                        self._tree[p][2] = None
                    else:
                        self._tree[p][3] = None
            last = self._tree.pop()
            n = len(self)
            if i < n:
                self._tree[i] = last[:]
                if last[2] is not None:
                    self._tree[last[2]][1] = i
                if last[3] is not None:
                    self._tree[last[3]][1] = i
                if self._tree[last[1]][2] == n:
                    self._tree[last[1]][2] = i
                else:
                    self._tree[last[1]][3] = i
        else:
            raise KeyError

    def find(self, value):
        if len(self):
            node_index = 0
            while self._tree[node_index][0] != value:
                if value < self._tree[node_index][0]:
                    node_index = self._tree[node_index][2]
                else:
                    node_index = self._tree[node_index][3]
                if node_index is None:
                    return None
            return Node(*self._tree[node_index])
        return None

I’ve added a parent attribute so that you can remove any node and maintain the BST structure.

Sorry for the readability, especially for the “remove” function. Basically, when a node is removed, we pop the tree array and replace it with the last element (except if we wanted to remove the last node). To maintain the BST structure, the removed node is replaced with the max of its left children or the min of its right children and some operations have to be done in order to keep the indexes valid but it’s fast enough.

I used this technique for more advanced stuff to build some big words dictionaries with an internal radix trie and I was able to divide memory consumption by 7-8 (you can see an example here: https://gist.github.com/fbparis/b3ddd5673b603b42c880974b23db7cda)

Question 38

A good implementation of binary search tree, taken from here:

'''
A binary search Tree
'''
from __future__ import print_function
class Node:

    def __init__(self, label, parent):
        self.label = label
        self.left = None
        self.right = None
        #Added in order to delete a node easier
        self.parent = parent

    def getLabel(self):
        return self.label

    def setLabel(self, label):
        self.label = label

    def getLeft(self):
        return self.left

    def setLeft(self, left):
        self.left = left

    def getRight(self):
        return self.right

    def setRight(self, right):
        self.right = right

    def getParent(self):
        return self.parent

    def setParent(self, parent):
        self.parent = parent

class BinarySearchTree:

    def __init__(self):
        self.root = None

    def insert(self, label):
        # Create a new Node
        new_node = Node(label, None)
        # If Tree is empty
        if self.empty():
            self.root = new_node
        else:
            #If Tree is not empty
            curr_node = self.root
            #While we don't get to a leaf
            while curr_node is not None:
                #We keep reference of the parent node
                parent_node = curr_node
                #If node label is less than current node
                if new_node.getLabel() < curr_node.getLabel():
                #We go left
                    curr_node = curr_node.getLeft()
                else:
                    #Else we go right
                    curr_node = curr_node.getRight()
            #We insert the new node in a leaf
            if new_node.getLabel() < parent_node.getLabel():
                parent_node.setLeft(new_node)
            else:
                parent_node.setRight(new_node)
            #Set parent to the new node
            new_node.setParent(parent_node)      

    def delete(self, label):
        if (not self.empty()):
            #Look for the node with that label
            node = self.getNode(label)
            #If the node exists
            if(node is not None):
                #If it has no children
                if(node.getLeft() is None and node.getRight() is None):
                    self.__reassignNodes(node, None)
                    node = None
                #Has only right children
                elif(node.getLeft() is None and node.getRight() is not None):
                    self.__reassignNodes(node, node.getRight())
                #Has only left children
                elif(node.getLeft() is not None and node.getRight() is None):
                    self.__reassignNodes(node, node.getLeft())
                #Has two children
                else:
                    #Gets the max value of the left branch
                    tmpNode = self.getMax(node.getLeft())
                    #Deletes the tmpNode
                    self.delete(tmpNode.getLabel())
                    #Assigns the value to the node to delete and keesp tree structure
                    node.setLabel(tmpNode.getLabel())

    def getNode(self, label):
        curr_node = None
        #If the tree is not empty
        if(not self.empty()):
            #Get tree root
            curr_node = self.getRoot()
            #While we don't find the node we look for
            #I am using lazy evaluation here to avoid NoneType Attribute error
            while curr_node is not None and curr_node.getLabel() is not label:
                #If node label is less than current node
                if label < curr_node.getLabel():
                    #We go left
                    curr_node = curr_node.getLeft()
                else:
                    #Else we go right
                    curr_node = curr_node.getRight()
        return curr_node

    def getMax(self, root = None):
        if(root is not None):
            curr_node = root
        else:
            #We go deep on the right branch
            curr_node = self.getRoot()
        if(not self.empty()):
            while(curr_node.getRight() is not None):
                curr_node = curr_node.getRight()
        return curr_node

    def getMin(self, root = None):
        if(root is not None):
            curr_node = root
        else:
            #We go deep on the left branch
            curr_node = self.getRoot()
        if(not self.empty()):
            curr_node = self.getRoot()
            while(curr_node.getLeft() is not None):
                curr_node = curr_node.getLeft()
        return curr_node

    def empty(self):
        if self.root is None:
            return True
        return False

    def __InOrderTraversal(self, curr_node):
        nodeList = []
        if curr_node is not None:
            nodeList.insert(0, curr_node)
            nodeList = nodeList + self.__InOrderTraversal(curr_node.getLeft())
            nodeList = nodeList + self.__InOrderTraversal(curr_node.getRight())
        return nodeList

    def getRoot(self):
        return self.root

    def __isRightChildren(self, node):
        if(node == node.getParent().getRight()):
            return True
        return False

    def __reassignNodes(self, node, newChildren):
        if(newChildren is not None):
            newChildren.setParent(node.getParent())
        if(node.getParent() is not None):
            #If it is the Right Children
            if(self.__isRightChildren(node)):
                node.getParent().setRight(newChildren)
            else:
                #Else it is the left children
                node.getParent().setLeft(newChildren)

    #This function traversal the tree. By default it returns an
    #In order traversal list. You can pass a function to traversal
    #The tree as needed by client code
    def traversalTree(self, traversalFunction = None, root = None):
        if(traversalFunction is None):
            #Returns a list of nodes in preOrder by default
            return self.__InOrderTraversal(self.root)
        else:
            #Returns a list of nodes in the order that the users wants to
            return traversalFunction(self.root)

    #Returns an string of all the nodes labels in the list 
    #In Order Traversal
    def __str__(self):
        list = self.__InOrderTraversal(self.root)
        str = ""
        for x in list:
            str = str + " " + x.getLabel().__str__()
        return str

def InPreOrder(curr_node):
    nodeList = []
    if curr_node is not None:
        nodeList = nodeList + InPreOrder(curr_node.getLeft())
        nodeList.insert(0, curr_node.getLabel())
        nodeList = nodeList + InPreOrder(curr_node.getRight())
    return nodeList

def testBinarySearchTree():
    r'''
    Example
                  8
                 / \
                3   10
               / \    \
              1   6    14
                 / \   /
                4   7 13 
    '''

    r'''
    Example After Deletion
                  7
                 / \
                1   4

    '''
    t = BinarySearchTree()
    t.insert(8)
    t.insert(3)
    t.insert(6)
    t.insert(1)
    t.insert(10)
    t.insert(14)
    t.insert(13)
    t.insert(4)
    t.insert(7)

    #Prints all the elements of the list in order traversal
    print(t.__str__())

    if(t.getNode(6) is not None):
        print("The label 6 exists")
    else:
        print("The label 6 doesn't exist")

    if(t.getNode(-1) is not None):
        print("The label -1 exists")
    else:
        print("The label -1 doesn't exist")

    if(not t.empty()):
        print(("Max Value: ", t.getMax().getLabel()))
        print(("Min Value: ", t.getMin().getLabel()))

    t.delete(13)
    t.delete(10)
    t.delete(8)
    t.delete(3)
    t.delete(6)
    t.delete(14)

    #Gets all the elements of the tree In pre order
    #And it prints them
    list = t.traversalTree(InPreOrder, t.root)
    for x in list:
        print(x)

if __name__ == "__main__":
    testBinarySearchTree()

Question 39

I want to show a variation of @apadana’s method, which is more useful when there is a considerable number of nodes:

'''
Suppose we have the following tree
      10
    /    \
  11      9
 /  \     / \
7   12  15   8
'''
# Step 1 - Create nodes - Use a list instead of defining each node separately
nlist = [10,11,7,9,15,8,12]; n = []
for i in range(len(nlist)): n.append(Node(nlist[i]))

# Step 2 - Set each node position
n[0].left  = n[1]
n[1].left = n[2]
n[0].right = n[3]
n[3].left = n[4]
n[3].right = n[5]
n[1].right = n[6]

Question 40

class Node:
    """
    single Node for tree
    """

    def __init__(self, data):
        self.data = data
        self.right = None
        self.left = None


class binaryTree:
    """
    binary tree implementation
    """

    def __init__(self):
        self.root = None

    def push(self, element, node=None):
        if node is None:
            node = self.root

        if self.root is None:
            self.root = Node(element)

        else:
            if element < node.data:
                if node.left is not None:
                    self.push(element, node.left)
                else:
                    node.left = Node(element)
            else:
                if node.right is not None:
                    self.push(element, node.right)
                else:
                    node.right = Node(element)

    def __str__(self):
        self.printInorder(self.root)
        return "\n"

    def printInorder(self, node):
        """
        print tree in inorder
        """
        if node is not None:
            self.printInorder(node.left)
            print(node.data)
            self.printInorder(node.right)


def main():
    """
    Main code and logic comes here
    """
    tree = binaryTree()
    tree.push(5)
    tree.push(3)
    tree.push(1)
    tree.push(3)
    tree.push(0)
    tree.push(2)
    tree.push(9)
    tree.push(10)
    print(tree)


if __name__ == "__main__":
    main()

Question 41

Binary Tree in Python

 class Tree(object):
    def __init__(self):
        self.data=None
        self.left=None
        self.right=None
    def insert(self, x, root):
        if root==None:
            t=node(x)
            t.data=x
            t.right=None
            t.left=None
            root=t
            return root
        elif x<root.data:
            root.left=self.insert(x, root.left)
        else:
            root.right=self.insert(x, root.right)
        return root

    def printTree(self, t):
        if t==None:
            return

        self.printTree(t.left)
        print t.data
        self.printTree(t.right)

class node(object):
    def __init__(self, x):
        self.x=x

bt=Tree()
root=None
n=int(raw_input())
a=[]
for i in range(n):
    a.append(int(raw_input()))
for i in range(n):
    root=bt.insert(a[i], root)
bt.printTree(root)

Question 42

Here is a simple solution which can be used to build a binary tree using a recursive approach to display the tree in order traversal has been used in the below code.

class Node(object):

    def __init__(self):
        self.left = None
        self.right = None
        self.value = None
    @property
    def get_value(self):
        return self.value

    @property
    def get_left(self):
        return self.left

    @property
    def get_right(self):
        return self.right

    @get_left.setter
    def set_left(self, left_node):
        self.left = left_node
    @get_value.setter
    def set_value(self, value):
        self.value = value
    @get_right.setter
    def set_right(self, right_node):
        self.right = right_node



    def create_tree(self):
        _node = Node() #creating new node.
        _x = input("Enter the node data(-1 for null)")
        if(_x == str(-1)): #for defining no child.
            return None
        _node.set_value = _x #setting the value of the node.
        print("Enter the left child of {}".format(_x))
        _node.set_left = self.create_tree() #setting the left subtree
        print("Enter the right child of {}".format(_x))
        _node.set_right = self.create_tree() #setting the right subtree.

        return _node

    def pre_order(self, root):
        if root is not None:
            print(root.get_value)
            self.pre_order(root.get_left)
            self.pre_order(root.get_right)

if __name__ == '__main__':
    node = Node()
    root_node = node.create_tree()
    node.pre_order(root_node)

Code taken from : Binary Tree in Python

Question 43

I have a list ‘a’

a= [(1,2),(1,4),(3,5),(5,7)]

I need to find all the tuples for a particular number. say for 1 it will be

result = [(1,2),(1,4)]

How do I do that?

Question 44

If you just want the first number to match you can do it like this:

[item for item in a if item[0] == 1]

If you are just searching for tuples with 1 in them:

[item for item in a if 1 in item]

Question 45

There is actually a clever way to do this that is useful for any list of tuples where the size of each tuple is 2: you can convert your list into a single dictionary.

For example,

test = [("hi", 1), ("there", 2)]
test = dict(test)
print test["hi"] # prints 1

Question 46

Read up on List Comprehensions

[ (x,y) for x, y in a if x  == 1 ]

Also read up up generator functions and the yield statement.

def filter_value( someList, value ):
    for x, y in someList:
        if x == value :
            yield x,y

result= list( filter_value( a, 1 ) )

Question 47

[tup for tup in a if tup[0] == 1]

Question 48

for item in a:
   if 1 in item:
       print item

Question 49

>>> [i for i in a if 1 in i]

[(1, 2), (1, 4)]

Question 50

The filter function can also provide an interesting solution:

result = list(filter(lambda x: x.count(1) > 0, a))

which searches the tuples in the list a for any occurrences of 1. If the search is limited to the first element, the solution can be modified into:

result = list(filter(lambda x: x[0] == 1, a))

Question 51

Using filter function:

>>> def get_values(iterables, key_to_find):

        return list(filter(lambda x:key_to_find in x, iterables)) 

>>> a = [(1,2),(1,4),(3,5),(5,7)]

>>> get_values(a, 1)

>>> [(1, 2), (1, 4)]

Question 52

Or takewhile, ( addition to this, example of more values is shown ):

>>> a= [(1,2),(1,4),(3,5),(5,7),(0,2)]
>>> import itertools
>>> list(itertools.takewhile(lambda x: x[0]==1,a))
[(1, 2), (1, 4)]
>>>

if unsorted, like:

>>> a= [(1,2),(3,5),(1,4),(5,7)]
>>> import itertools
>>> list(itertools.takewhile(lambda x: x[0]==1,sorted(a,key=lambda x: x[0]==1)))
[(1, 2), (1, 4)]
>>>

Question 53

if you want to search tuple for any number which is present in tuple then you can use

a= [(1,2),(1,4),(3,5),(5,7)]
i=1
result=[]
for j in a:
    if i in j:
        result.append(j)

print(result)

You can also use if i==j[0] or i==j[index] if you want to search a number in particular index

Question 54

In a program I’m writing I have Python use the re.search() function to find matches in a block of text and print the results. However, the program exits once it finds the first match in the block of text.

How do I do this repeatedly where the program doesn’t stop until ALL matches have been found? Is there a separate function to do this?

Question 55

Use re.findall or re.finditer instead.

re.findall(pattern, string) returns a list of matching strings.

re.finditer(pattern, string) returns an iterator over MatchObject objects.

Example:

re.findall( r'all (.*?) are', 'all cats are smarter than dogs, all dogs are dumber than cats')
# Output: ['cats', 'dogs']

[x.group() for x in re.finditer( r'all (.*?) are', 'all cats are smarter than dogs, all dogs are dumber than cats')]
# Output: ['all cats are', 'all dogs are']

x86/64、arm64、v6、v7	Linux/MacOS和Python 3.7/3.8/3.9	Docker用户
最低要求 _{(不支持HTTP、WebSocket、Docker)}	`pip install jina`	`docker run jinaai/jina:latest`
_Daemon	_{pip install "jina[daemon]"}	_{docker run --network=host jinaai/jina:latest-daemon}
_{使用附加服务}	_{pip install "jina[devel]"}	_{docker run jinaai/jina:latest-devel}

问题：最好的Django搜索应用是什么？[关闭]

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

问题：将Python搜索路径扩展到其他来源

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

问题：如何实现二叉树？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

回答 15

回答 16

回答 17

问题：在元组列表中查找元素

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

问题：如何在Python中找到与正则表达式的所有匹配项？

回答 0

问题：在numpy数组中查找最接近的值

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6

回答 7

回答 8

回答 9

回答 10

回答 11

回答 12

回答 13

回答 14

回答 15

问题：re.search和re.match有什么区别？

回答 0

回答 1

回答 2

回答 3

回答 4

回答 5

回答 6