标签归档:machine-learning

将索引数组转换为1-hot编码的numpy数组

问题:将索引数组转换为1-hot编码的numpy数组

假设我有一个一维numpy数组

a = array([1,0,3])

我想将此编码为2d 1-hot数组

b = array([[0,1,0,0], [1,0,0,0], [0,0,0,1]])

有快速的方法可以做到这一点吗?比循环遍历a设置元素更快b

Let’s say I have a 1d numpy array

a = array([1,0,3])

I would like to encode this as a 2D one-hot array

b = array([[0,1,0,0], [1,0,0,0], [0,0,0,1]])

Is there a quick way to do this? Quicker than just looping over a to set elements of b, that is.


回答 0

您的数组a定义了输出数组中非零元素的列。您还需要定义行,然后使用花式索引:

>>> a = np.array([1, 0, 3])
>>> b = np.zeros((a.size, a.max()+1))
>>> b[np.arange(a.size),a] = 1
>>> b
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])

Your array a defines the columns of the nonzero elements in the output array. You need to also define the rows and then use fancy indexing:

>>> a = np.array([1, 0, 3])
>>> b = np.zeros((a.size, a.max()+1))
>>> b[np.arange(a.size),a] = 1
>>> b
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])

回答 1

>>> values = [1, 0, 3]
>>> n_values = np.max(values) + 1
>>> np.eye(n_values)[values]
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])
>>> values = [1, 0, 3]
>>> n_values = np.max(values) + 1
>>> np.eye(n_values)[values]
array([[ 0.,  1.,  0.,  0.],
       [ 1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.]])

回答 2

如果您使用的是keras,则有一个内置实用程序:

from keras.utils.np_utils import to_categorical   

categorical_labels = to_categorical(int_labels, num_classes=3)

它与@YXD的答案几乎相同(请参阅源代码)。

In case you are using keras, there is a built in utility for that:

from keras.utils.np_utils import to_categorical   

categorical_labels = to_categorical(int_labels, num_classes=3)

And it does pretty much the same as @YXD’s answer (see source-code).


回答 3

这是我发现有用的:

def one_hot(a, num_classes):
  return np.squeeze(np.eye(num_classes)[a.reshape(-1)])

num_classes代表您所拥有的类数量。因此,如果您拥有a形状为(10000,)的向量,则此函数会将其转换为(10000,C)。请注意,a是零索引,即one_hot(np.array([0, 1]), 2)会给[[1, 0], [0, 1]]

正是您想要的,我相信。

PS:来源是序列模型-deeplearning.ai

Here is what I find useful:

def one_hot(a, num_classes):
  return np.squeeze(np.eye(num_classes)[a.reshape(-1)])

Here num_classes stands for number of classes you have. So if you have a vector with shape of (10000,) this function transforms it to (10000,C). Note that a is zero-indexed, i.e. one_hot(np.array([0, 1]), 2) will give [[1, 0], [0, 1]].

Exactly what you wanted to have I believe.

PS: the source is Sequence models – deeplearning.ai


回答 4

您可以使用 sklearn.preprocessing.LabelBinarizer

例:

import sklearn.preprocessing
a = [1,0,3]
label_binarizer = sklearn.preprocessing.LabelBinarizer()
label_binarizer.fit(range(max(a)+1))
b = label_binarizer.transform(a)
print('{0}'.format(b))

输出:

[[0 1 0 0]
 [1 0 0 0]
 [0 0 0 1]]

除其他事项外,您可以初始化sklearn.preprocessing.LabelBinarizer()以便的输出transform稀疏。

You can use sklearn.preprocessing.LabelBinarizer:

Example:

import sklearn.preprocessing
a = [1,0,3]
label_binarizer = sklearn.preprocessing.LabelBinarizer()
label_binarizer.fit(range(max(a)+1))
b = label_binarizer.transform(a)
print('{0}'.format(b))

output:

[[0 1 0 0]
 [1 0 0 0]
 [0 0 0 1]]

Amongst other things, you may initialize sklearn.preprocessing.LabelBinarizer() so that the output of transform is sparse.


回答 5

您还可以使用numpy的eye函数:

numpy.eye(number of classes)[vector containing the labels]

You can also use eye function of numpy:

numpy.eye(number of classes)[vector containing the labels]


回答 6

这是将一维矢量转换为一维二维热阵列的函数。

#!/usr/bin/env python
import numpy as np

def convertToOneHot(vector, num_classes=None):
    """
    Converts an input 1-D vector of integers into an output
    2-D array of one-hot vectors, where an i'th input value
    of j will set a '1' in the i'th row, j'th column of the
    output array.

    Example:
        v = np.array((1, 0, 4))
        one_hot_v = convertToOneHot(v)
        print one_hot_v

        [[0 1 0 0 0]
         [1 0 0 0 0]
         [0 0 0 0 1]]
    """

    assert isinstance(vector, np.ndarray)
    assert len(vector) > 0

    if num_classes is None:
        num_classes = np.max(vector)+1
    else:
        assert num_classes > 0
        assert num_classes >= np.max(vector)

    result = np.zeros(shape=(len(vector), num_classes))
    result[np.arange(len(vector)), vector] = 1
    return result.astype(int)

以下是一些用法示例:

>>> a = np.array([1, 0, 3])

>>> convertToOneHot(a)
array([[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1]])

>>> convertToOneHot(a, num_classes=10)
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]])

Here is a function that converts a 1-D vector to a 2-D one-hot array.

#!/usr/bin/env python
import numpy as np

def convertToOneHot(vector, num_classes=None):
    """
    Converts an input 1-D vector of integers into an output
    2-D array of one-hot vectors, where an i'th input value
    of j will set a '1' in the i'th row, j'th column of the
    output array.

    Example:
        v = np.array((1, 0, 4))
        one_hot_v = convertToOneHot(v)
        print one_hot_v

        [[0 1 0 0 0]
         [1 0 0 0 0]
         [0 0 0 0 1]]
    """

    assert isinstance(vector, np.ndarray)
    assert len(vector) > 0

    if num_classes is None:
        num_classes = np.max(vector)+1
    else:
        assert num_classes > 0
        assert num_classes >= np.max(vector)

    result = np.zeros(shape=(len(vector), num_classes))
    result[np.arange(len(vector)), vector] = 1
    return result.astype(int)

Below is some example usage:

>>> a = np.array([1, 0, 3])

>>> convertToOneHot(a)
array([[0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1]])

>>> convertToOneHot(a, num_classes=10)
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]])

回答 7

对于1热编码

   one_hot_encode=pandas.get_dummies(array)

例如

享受编码

For 1-hot-encoding

   one_hot_encode=pandas.get_dummies(array)

For Example

ENJOY CODING


回答 8

我认为简短的答案是“否”。对于更通用的n尺寸,我想到了:

# For 2-dimensional data, 4 values
a = np.array([[0, 1, 2], [3, 2, 1]])
z = np.zeros(list(a.shape) + [4])
z[list(np.indices(z.shape[:-1])) + [a]] = 1

我想知道是否有更好的解决方案-我不喜欢我必须在最后两行中创建这些列表。无论如何,我使用进行了一些测量,timeit看来numpy基于-(indices/ arange)和迭代版本的性能大致相同。

I think the short answer is no. For a more generic case in n dimensions, I came up with this:

# For 2-dimensional data, 4 values
a = np.array([[0, 1, 2], [3, 2, 1]])
z = np.zeros(list(a.shape) + [4])
z[list(np.indices(z.shape[:-1])) + [a]] = 1

I am wondering if there is a better solution — I don’t like that I have to create those lists in the last two lines. Anyway, I did some measurements with timeit and it seems that the numpy-based (indices/arange) and the iterative versions perform about the same.


回答 9

只是在阐述出色答卷K3 — RNC,这里是一个更宽泛的版本:

def onehottify(x, n=None, dtype=float):
    """1-hot encode x with the max value n (computed from data if n is None)."""
    x = np.asarray(x)
    n = np.max(x) + 1 if n is None else n
    return np.eye(n, dtype=dtype)[x]

此外,这里是这种方法的快速和肮脏的基准,并从一个方法目前公认的答案YXD(微变,让他们提供相同的API但后者只能与1D ndarrays):

def onehottify_only_1d(x, n=None, dtype=float):
    x = np.asarray(x)
    n = np.max(x) + 1 if n is None else n
    b = np.zeros((len(x), n), dtype=dtype)
    b[np.arange(len(x)), x] = 1
    return b

后一种方法的速度提高了约35%(MacBook Pro 13 2015),但前一种方法更通用:

>>> import numpy as np
>>> np.random.seed(42)
>>> a = np.random.randint(0, 9, size=(10_000,))
>>> a
array([6, 3, 7, ..., 5, 8, 6])
>>> %timeit onehottify(a, 10)
188 µs ± 5.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit onehottify_only_1d(a, 10)
139 µs ± 2.78 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Just to elaborate on the excellent answer from K3—rnc, here is a more generic version:

def onehottify(x, n=None, dtype=float):
    """1-hot encode x with the max value n (computed from data if n is None)."""
    x = np.asarray(x)
    n = np.max(x) + 1 if n is None else n
    return np.eye(n, dtype=dtype)[x]

Also, here is a quick-and-dirty benchmark of this method and a method from the currently accepted answer by YXD (slightly changed, so that they offer the same API except that the latter works only with 1D ndarrays):

def onehottify_only_1d(x, n=None, dtype=float):
    x = np.asarray(x)
    n = np.max(x) + 1 if n is None else n
    b = np.zeros((len(x), n), dtype=dtype)
    b[np.arange(len(x)), x] = 1
    return b

The latter method is ~35% faster (MacBook Pro 13 2015), but the former is more general:

>>> import numpy as np
>>> np.random.seed(42)
>>> a = np.random.randint(0, 9, size=(10_000,))
>>> a
array([6, 3, 7, ..., 5, 8, 6])
>>> %timeit onehottify(a, 10)
188 µs ± 5.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit onehottify_only_1d(a, 10)
139 µs ± 2.78 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

回答 10

您可以使用以下代码将其转换为单热向量:

令x为具有单个列的普通类向量,该类具有从0到某个数字的类:

import numpy as np
np.eye(x.max()+1)[x]

如果0不是一个类;然后删除+1。

You can use the following code for converting into a one-hot vector:

let x is the normal class vector having a single column with classes 0 to some number:

import numpy as np
np.eye(x.max()+1)[x]

if 0 is not a class; then remove +1.


回答 11

我最近遇到了一个同类问题,发现上述解决方案只有在您的数字在一定范围内时才令人满意。例如,如果您要对以下列表进行一次热编码:

all_good_list = [0,1,2,3,4]

继续,上面已经提到过发布的解决方案。但是如果考虑这些数据怎么办:

problematic_list = [0,23,12,89,10]

如果使用上述方法进行操作,则可能会得到90个单柱色谱柱。这是因为所有答案都包含n = np.max(a)+1。我找到了一个更通用的解决方案,可以为我解决这个问题,并希望与您分享:

import numpy as np
import sklearn
sklb = sklearn.preprocessing.LabelBinarizer()
a = np.asarray([1,2,44,3,2])
n = np.unique(a)
sklb.fit(n)
b = sklb.transform(a)

我希望有人在上述解决方案上遇到同样的限制,并且这可能派上用场

I recently ran into a problem of same kind and found said solution which turned out to be only satisfying if you have numbers that go within a certain formation. For example if you want to one-hot encode following list:

all_good_list = [0,1,2,3,4]

go ahead, the posted solutions are already mentioned above. But what if considering this data:

problematic_list = [0,23,12,89,10]

If you do it with methods mentioned above, you will likely end up with 90 one-hot columns. This is because all answers include something like n = np.max(a)+1. I found a more generic solution that worked out for me and wanted to share with you:

import numpy as np
import sklearn
sklb = sklearn.preprocessing.LabelBinarizer()
a = np.asarray([1,2,44,3,2])
n = np.unique(a)
sklb.fit(n)
b = sklb.transform(a)

I hope someone encountered same restrictions on above solutions and this might come in handy


回答 12

这种编码类型通常是numpy数组的一部分。如果您使用这样的numpy数组:

a = np.array([1,0,3])

那么有一种非常简单的方法可以将其转换为1-hot编码

out = (np.arange(4) == a[:,None]).astype(np.float32)

而已。

Such type of encoding are usually part of numpy array. If you are using a numpy array like this :

a = np.array([1,0,3])

then there is very simple way to convert that to 1-hot encoding

out = (np.arange(4) == a[:,None]).astype(np.float32)

That’s it.


回答 13

  • p将是一个二维数组。
  • 我们想知道哪个值是连续最高的,在那放置1,在其他任何地方放置0。

清洁简便的解决方案:

max_elements_i = np.expand_dims(np.argmax(p, axis=1), axis=1)
one_hot = np.zeros(p.shape)
np.put_along_axis(one_hot, max_elements_i, 1, axis=1)
  • p will be a 2d ndarray.
  • We want to know which value is the highest in a row, to put there 1 and everywhere else 0.

clean and easy solution:

max_elements_i = np.expand_dims(np.argmax(p, axis=1), axis=1)
one_hot = np.zeros(p.shape)
np.put_along_axis(one_hot, max_elements_i, 1, axis=1)

回答 14

使用Neuraxle管道步骤:

  1. 建立你的例子
import numpy as np
a = np.array([1,0,3])
b = np.array([[0,1,0,0], [1,0,0,0], [0,0,0,1]])
  1. 做实际的转换
from neuraxle.steps.numpy import OneHotEncoder
encoder = OneHotEncoder(nb_columns=4)
b_pred = encoder.transform(a)
  1. 断言有效
assert b_pred == b

链接到文档:neuraxle.steps.numpy.OneHotEncoder

Using a Neuraxle pipeline step:

  1. Set up your example
import numpy as np
a = np.array([1,0,3])
b = np.array([[0,1,0,0], [1,0,0,0], [0,0,0,1]])
  1. Do the actual conversion
from neuraxle.steps.numpy import OneHotEncoder
encoder = OneHotEncoder(nb_columns=4)
b_pred = encoder.transform(a)
  1. Assert it works
assert b_pred == b

Link to documentation: neuraxle.steps.numpy.OneHotEncoder


回答 15

这是我根据上述答案和自己的用例编写的一个示例函数:

def label_vector_to_one_hot_vector(vector, one_hot_size=10):
    """
    Use to convert a column vector to a 'one-hot' matrix

    Example:
        vector: [[2], [0], [1]]
        one_hot_size: 3
        returns:
            [[ 0.,  0.,  1.],
             [ 1.,  0.,  0.],
             [ 0.,  1.,  0.]]

    Parameters:
        vector (np.array): of size (n, 1) to be converted
        one_hot_size (int) optional: size of 'one-hot' row vector

    Returns:
        np.array size (vector.size, one_hot_size): converted to a 'one-hot' matrix
    """
    squeezed_vector = np.squeeze(vector, axis=-1)

    one_hot = np.zeros((squeezed_vector.size, one_hot_size))

    one_hot[np.arange(squeezed_vector.size), squeezed_vector] = 1

    return one_hot

label_vector_to_one_hot_vector(vector=[[2], [0], [1]], one_hot_size=3)

Here is an example function that I wrote to do this based upon the answers above and my own use case:

def label_vector_to_one_hot_vector(vector, one_hot_size=10):
    """
    Use to convert a column vector to a 'one-hot' matrix

    Example:
        vector: [[2], [0], [1]]
        one_hot_size: 3
        returns:
            [[ 0.,  0.,  1.],
             [ 1.,  0.,  0.],
             [ 0.,  1.,  0.]]

    Parameters:
        vector (np.array): of size (n, 1) to be converted
        one_hot_size (int) optional: size of 'one-hot' row vector

    Returns:
        np.array size (vector.size, one_hot_size): converted to a 'one-hot' matrix
    """
    squeezed_vector = np.squeeze(vector, axis=-1)

    one_hot = np.zeros((squeezed_vector.size, one_hot_size))

    one_hot[np.arange(squeezed_vector.size), squeezed_vector] = 1

    return one_hot

label_vector_to_one_hot_vector(vector=[[2], [0], [1]], one_hot_size=3)

回答 16

为了添加完整的功能,我仅使用numpy运算符:

   def probs_to_onehot(output_probabilities):
        argmax_indices_array = np.argmax(output_probabilities, axis=1)
        onehot_output_array = np.eye(np.unique(argmax_indices_array).shape[0])[argmax_indices_array.reshape(-1)]
        return onehot_output_array

它以概率矩阵作为输入:例如:

[[0.03038822 0.65810204 0.16549407 0.3797123] … [0.02771272 0.2760752 0.3280924 0.33458805]

它将返回

[[0 1 0 0] … [0 0 0 1]]

I am adding for completion a simple function, using only numpy operators:

   def probs_to_onehot(output_probabilities):
        argmax_indices_array = np.argmax(output_probabilities, axis=1)
        onehot_output_array = np.eye(np.unique(argmax_indices_array).shape[0])[argmax_indices_array.reshape(-1)]
        return onehot_output_array

It takes as input a probability matrix: e.g.:

[[0.03038822 0.65810204 0.16549407 0.3797123 ] … [0.02771272 0.2760752 0.3280924 0.33458805]]

And it will return

[[0 1 0 0] … [0 0 0 1]]


回答 17

这是一个与维数无关的独立解决方案。

这会将arr非负整数的任何N维数组转换为一维N + 1维数组one_hot,其中one_hot[i_1,...,i_N,c] = 1means arr[i_1,...,i_N] = c。您可以通过以下方式恢复输入np.argmax(one_hot, -1)

def expand_integer_grid(arr, n_classes):
    """

    :param arr: N dim array of size i_1, ..., i_N
    :param n_classes: C
    :returns: one-hot N+1 dim array of size i_1, ..., i_N, C
    :rtype: ndarray

    """
    one_hot = np.zeros(arr.shape + (n_classes,))
    axes_ranges = [range(arr.shape[i]) for i in range(arr.ndim)]
    flat_grids = [_.ravel() for _ in np.meshgrid(*axes_ranges, indexing='ij')]
    one_hot[flat_grids + [arr.ravel()]] = 1
    assert((one_hot.sum(-1) == 1).all())
    assert(np.allclose(np.argmax(one_hot, -1), arr))
    return one_hot

Here’s a dimensionality-independent standalone solution.

This will convert any N-dimensional array arr of nonnegative integers to a one-hot N+1-dimensional array one_hot, where one_hot[i_1,...,i_N,c] = 1 means arr[i_1,...,i_N] = c. You can recover the input via np.argmax(one_hot, -1)

def expand_integer_grid(arr, n_classes):
    """

    :param arr: N dim array of size i_1, ..., i_N
    :param n_classes: C
    :returns: one-hot N+1 dim array of size i_1, ..., i_N, C
    :rtype: ndarray

    """
    one_hot = np.zeros(arr.shape + (n_classes,))
    axes_ranges = [range(arr.shape[i]) for i in range(arr.ndim)]
    flat_grids = [_.ravel() for _ in np.meshgrid(*axes_ranges, indexing='ij')]
    one_hot[flat_grids + [arr.ravel()]] = 1
    assert((one_hot.sum(-1) == 1).all())
    assert(np.allclose(np.argmax(one_hot, -1), arr))
    return one_hot

回答 18

使用以下代码。效果最好。

def one_hot_encode(x):
"""
    argument
        - x: a list of labels
    return
        - one hot encoding matrix (number of labels, number of class)
"""
encoded = np.zeros((len(x), 10))

for idx, val in enumerate(x):
    encoded[idx][val] = 1

return encoded

在这里找到它 PS无需进入链接。

Use the following code. It works best.

def one_hot_encode(x):
"""
    argument
        - x: a list of labels
    return
        - one hot encoding matrix (number of labels, number of class)
"""
encoded = np.zeros((len(x), 10))

for idx, val in enumerate(x):
    encoded[idx][val] = 1

return encoded

Found it here P.S You don’t need to go into the link.


什么是logits,softmax和softmax_cross_entropy_with_logits?

问题:什么是logits,softmax和softmax_cross_entropy_with_logits?

我正在这里浏览tensorflow API文档。在tensorflow文档中,他们使用了名为的关键字logits。它是什么?API文档中的许多方法都将其编写为

tf.nn.softmax(logits, name=None)

如果写的是什么是那些logitsTensors,为什么保持一个不同的名称,如logits

另一件事是,我无法区分两种方法。他们是

tf.nn.softmax(logits, name=None)
tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

它们之间有什么区别?这些文档对我来说还不清楚。我知道是什么tf.nn.softmax呢。但是没有其他。一个例子将非常有帮助。

I was going through the tensorflow API docs here. In the tensorflow documentation, they used a keyword called logits. What is it? In a lot of methods in the API docs it is written like

tf.nn.softmax(logits, name=None)

If what is written is those logits are only Tensors, why keeping a different name like logits?

Another thing is that there are two methods I could not differentiate. They were

tf.nn.softmax(logits, name=None)
tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)

What are the differences between them? The docs are not clear to me. I know what tf.nn.softmax does. But not the other. An example will be really helpful.


回答 0

Logits只是意味着函数在较早的图层的未缩放输出上运行,并且理解单位的相对缩放是线性的。特别是,这意味着输入的总和可能不等于1,这意味着值不是概率(输入可能为5)。

tf.nn.softmax仅产生将softmax函数应用于输入张量的结果。softmax“压缩”输入,以便sum(input) = 1:这是一种规范化方法。softmax的输出形状与输入相同:它只是将值标准化。softmax的输出可以解释为概率。

a = tf.constant(np.array([[.1, .3, .5, .9]]))
print s.run(tf.nn.softmax(a))
[[ 0.16838508  0.205666    0.25120102  0.37474789]]

相比之下,tf.nn.softmax_cross_entropy_with_logits在应用softmax函数之后计算结果的交叉熵(但以数学上更仔细的方式将其全部合并在一起)。它类似于以下结果:

sm = tf.nn.softmax(x)
ce = cross_entropy(sm)

交叉熵是一个汇总指标:跨元素求和。tf.nn.softmax_cross_entropy_with_logits形状[2,5]张量的输出是一定形状的[2,1](将第一维视为批处理)。

如果要进行优化以最小化交叉熵,并且要在最后一层之后进行软最大化,则应使用tf.nn.softmax_cross_entropy_with_logits而不是自己进行处理,因为它以数学上正确的方式涵盖了数值不稳定的拐角情况。否则,您最终会在这里和那里添加少量epsilon,从而对其进行破解。

编辑于2016-02-07: 如果您具有单类标签,而一个对象只能属于一个类,则现在可以考虑使用tf.nn.sparse_softmax_cross_entropy_with_logits,这样就不必将标签转换为密集的一键热阵列。在0.6.0版本之后添加了此功能。

Logits simply means that the function operates on the unscaled output of earlier layers and that the relative scale to understand the units is linear. It means, in particular, the sum of the inputs may not equal 1, that the values are not probabilities (you might have an input of 5).

tf.nn.softmax produces just the result of applying the softmax function to an input tensor. The softmax “squishes” the inputs so that sum(input) = 1: it’s a way of normalizing. The shape of output of a softmax is the same as the input: it just normalizes the values. The outputs of softmax can be interpreted as probabilities.

a = tf.constant(np.array([[.1, .3, .5, .9]]))
print s.run(tf.nn.softmax(a))
[[ 0.16838508  0.205666    0.25120102  0.37474789]]

In contrast, tf.nn.softmax_cross_entropy_with_logits computes the cross entropy of the result after applying the softmax function (but it does it all together in a more mathematically careful way). It’s similar to the result of:

sm = tf.nn.softmax(x)
ce = cross_entropy(sm)

The cross entropy is a summary metric: it sums across the elements. The output of tf.nn.softmax_cross_entropy_with_logits on a shape [2,5] tensor is of shape [2,1] (the first dimension is treated as the batch).

If you want to do optimization to minimize the cross entropy AND you’re softmaxing after your last layer, you should use tf.nn.softmax_cross_entropy_with_logits instead of doing it yourself, because it covers numerically unstable corner cases in the mathematically right way. Otherwise, you’ll end up hacking it by adding little epsilons here and there.

Edited 2016-02-07: If you have single-class labels, where an object can only belong to one class, you might now consider using tf.nn.sparse_softmax_cross_entropy_with_logits so that you don’t have to convert your labels to a dense one-hot array. This function was added after release 0.6.0.


回答 1

简洁版本:

假设您有两个张量,其中y_hat包含每个类的计算得分(例如,来自y = W * x + b),并且y_true包含一个热编码的真实标签。

y_hat  = ... # Predicted label, e.g. y = tf.matmul(X, W) + b
y_true = ... # True label, one-hot encoded

如果将分数解释y_hat为未归一化的对数概率,则它们为logits

此外,以这种方式计算的总交叉熵损失为:

y_hat_softmax = tf.nn.softmax(y_hat)
total_loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), [1]))

基本上等于用函数计算的总交叉熵损失softmax_cross_entropy_with_logits()

total_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true))

长版:

在神经网络的输出层中,您可能会计算一个数组,其中包含每个训练实例的类分数,例如来自计算y_hat = W*x + b。作为示例,我在下面创建了y_hat一个2 x 3的数组,其中行对应于训练实例,列对应于类。因此,这里有2个训练实例和3个类。

import tensorflow as tf
import numpy as np

sess = tf.Session()

# Create example y_hat.
y_hat = tf.convert_to_tensor(np.array([[0.5, 1.5, 0.1],[2.2, 1.3, 1.7]]))
sess.run(y_hat)
# array([[ 0.5,  1.5,  0.1],
#        [ 2.2,  1.3,  1.7]])

请注意,这些值未规范化(即,各行的总和不等于1)。为了对其进行归一化,我们可以应用softmax函数,该函数将输入解释为归一化的对数概率(aka logits),并输出归一化的线性概率。

y_hat_softmax = tf.nn.softmax(y_hat)
sess.run(y_hat_softmax)
# array([[ 0.227863  ,  0.61939586,  0.15274114],
#        [ 0.49674623,  0.20196195,  0.30129182]])

完全了解softmax输出在说什么很重要。下面我显示了一个表格,可以更清楚地表示上面的输出。可以看出,例如,训练实例1为“等级2”的概率为0.619。每个训练实例的类概率均已标准化,因此每行的总和为1.0。

                      Pr(Class 1)  Pr(Class 2)  Pr(Class 3)
                    ,--------------------------------------
Training instance 1 | 0.227863   | 0.61939586 | 0.15274114
Training instance 2 | 0.49674623 | 0.20196195 | 0.30129182

因此,现在我们有了每个训练实例的类概率,在这里我们可以采用每一行的argmax()来生成最终分类。从上面可以生成训练实例1属于“类别2”,训练实例2属于“类别1”。

这些分类正确吗?我们需要根据训练集中的真实标签进行衡量。您将需要一个一次性编码的y_true数组,其中行又是训练实例,列又是类。下面,我创建了y_true一个单热点数组示例,其中训练实例1的真实标签为“ Class 2”,训练实例2的真实标签为“ Class 3”。

y_true = tf.convert_to_tensor(np.array([[0.0, 1.0, 0.0],[0.0, 0.0, 1.0]]))
sess.run(y_true)
# array([[ 0.,  1.,  0.],
#        [ 0.,  0.,  1.]])

概率分布是否y_hat_softmax接近的概率分布y_true?我们可以使用交叉熵损失来测量误差。

我们可以逐行计算交叉熵损失并查看结果。在下面我们可以看到训练实例1的损失为0.479,而训练实例2的损失为1.200。该结果之所以有意义,是因为在上面的示例中y_hat_softmax,训练实例1的最高机率是“类别2”,它与中的训练实例1相匹配y_true;但是,针对训练实例2的预测显示出“类别1”的最高概率,这与真实的类别“类别3”不匹配。

loss_per_instance_1 = -tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1])
sess.run(loss_per_instance_1)
# array([ 0.4790107 ,  1.19967598])

我们真正想要的是所有训练实例的总损失。因此我们可以计算:

total_loss_1 = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1]))
sess.run(total_loss_1)
# 0.83934333897877944

使用softmax_cross_entropy_with_logits()

相反,我们可以使用tf.nn.softmax_cross_entropy_with_logits()函数来计算总交叉熵损失,如下所示。

loss_per_instance_2 = tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true)
sess.run(loss_per_instance_2)
# array([ 0.4790107 ,  1.19967598])

total_loss_2 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true))
sess.run(total_loss_2)
# 0.83934333897877922

请注意,total_loss_1total_loss_2在非常最后的数字有些小的差异产生几乎相同的结果。但是,您也可以使用第二种方法:它减少了一行代码,并减少了数值误差,因为softmax是在中完成的softmax_cross_entropy_with_logits()

Short version:

Suppose you have two tensors, where y_hat contains computed scores for each class (for example, from y = W*x +b) and y_true contains one-hot encoded true labels.

y_hat  = ... # Predicted label, e.g. y = tf.matmul(X, W) + b
y_true = ... # True label, one-hot encoded

If you interpret the scores in y_hat as unnormalized log probabilities, then they are logits.

Additionally, the total cross-entropy loss computed in this manner:

y_hat_softmax = tf.nn.softmax(y_hat)
total_loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), [1]))

is essentially equivalent to the total cross-entropy loss computed with the function softmax_cross_entropy_with_logits():

total_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true))

Long version:

In the output layer of your neural network, you will probably compute an array that contains the class scores for each of your training instances, such as from a computation y_hat = W*x + b. To serve as an example, below I’ve created a y_hat as a 2 x 3 array, where the rows correspond to the training instances and the columns correspond to classes. So here there are 2 training instances and 3 classes.

import tensorflow as tf
import numpy as np

sess = tf.Session()

# Create example y_hat.
y_hat = tf.convert_to_tensor(np.array([[0.5, 1.5, 0.1],[2.2, 1.3, 1.7]]))
sess.run(y_hat)
# array([[ 0.5,  1.5,  0.1],
#        [ 2.2,  1.3,  1.7]])

Note that the values are not normalized (i.e. the rows don’t add up to 1). In order to normalize them, we can apply the softmax function, which interprets the input as unnormalized log probabilities (aka logits) and outputs normalized linear probabilities.

y_hat_softmax = tf.nn.softmax(y_hat)
sess.run(y_hat_softmax)
# array([[ 0.227863  ,  0.61939586,  0.15274114],
#        [ 0.49674623,  0.20196195,  0.30129182]])

It’s important to fully understand what the softmax output is saying. Below I’ve shown a table that more clearly represents the output above. It can be seen that, for example, the probability of training instance 1 being “Class 2” is 0.619. The class probabilities for each training instance are normalized, so the sum of each row is 1.0.

                      Pr(Class 1)  Pr(Class 2)  Pr(Class 3)
                    ,--------------------------------------
Training instance 1 | 0.227863   | 0.61939586 | 0.15274114
Training instance 2 | 0.49674623 | 0.20196195 | 0.30129182

So now we have class probabilities for each training instance, where we can take the argmax() of each row to generate a final classification. From above, we may generate that training instance 1 belongs to “Class 2” and training instance 2 belongs to “Class 1”.

Are these classifications correct? We need to measure against the true labels from the training set. You will need a one-hot encoded y_true array, where again the rows are training instances and columns are classes. Below I’ve created an example y_true one-hot array where the true label for training instance 1 is “Class 2” and the true label for training instance 2 is “Class 3”.

y_true = tf.convert_to_tensor(np.array([[0.0, 1.0, 0.0],[0.0, 0.0, 1.0]]))
sess.run(y_true)
# array([[ 0.,  1.,  0.],
#        [ 0.,  0.,  1.]])

Is the probability distribution in y_hat_softmax close to the probability distribution in y_true? We can use cross-entropy loss to measure the error.

We can compute the cross-entropy loss on a row-wise basis and see the results. Below we can see that training instance 1 has a loss of 0.479, while training instance 2 has a higher loss of 1.200. This result makes sense because in our example above, y_hat_softmax showed that training instance 1’s highest probability was for “Class 2”, which matches training instance 1 in y_true; however, the prediction for training instance 2 showed a highest probability for “Class 1”, which does not match the true class “Class 3”.

loss_per_instance_1 = -tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1])
sess.run(loss_per_instance_1)
# array([ 0.4790107 ,  1.19967598])

What we really want is the total loss over all the training instances. So we can compute:

total_loss_1 = tf.reduce_mean(-tf.reduce_sum(y_true * tf.log(y_hat_softmax), reduction_indices=[1]))
sess.run(total_loss_1)
# 0.83934333897877944

Using softmax_cross_entropy_with_logits()

We can instead compute the total cross entropy loss using the tf.nn.softmax_cross_entropy_with_logits() function, as shown below.

loss_per_instance_2 = tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true)
sess.run(loss_per_instance_2)
# array([ 0.4790107 ,  1.19967598])

total_loss_2 = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y_hat, y_true))
sess.run(total_loss_2)
# 0.83934333897877922

Note that total_loss_1 and total_loss_2 produce essentially equivalent results with some small differences in the very final digits. However, you might as well use the second approach: it takes one less line of code and accumulates less numerical error because the softmax is done for you inside of softmax_cross_entropy_with_logits().


回答 2

tf.nn.softmax计算通过softmax层的前向传播。计算模型输出的概率时,可以在评估模型时使用它。

tf.nn.softmax_cross_entropy_with_logits计算softmax层的成本。仅在训练期间使用。

logits是模型输出的未归一化对数概率(将softmax归一化之前对它们输出的值)。

tf.nn.softmax computes the forward propagation through a softmax layer. You use it during evaluation of the model when you compute the probabilities that the model outputs.

tf.nn.softmax_cross_entropy_with_logits computes the cost for a softmax layer. It is only used during training.

The logits are the unnormalized log probabilities output the model (the values output before the softmax normalization is applied to them).


回答 3

以上答案对所提问题有足够的描述。

除此之外,Tensorflow还优化了应用激活函数的操作,然后使用其自身的激活以及成本函数来计算成本。因此,它是一个很好的做法,使用:tf.nn.softmax_cross_entropy()tf.nn.softmax(); tf.nn.cross_entropy()

您可以在资源密集型模型中找到它们之间的显着差异。

Above answers have enough description for the asked question.

Adding to that, Tensorflow has optimised the operation of applying the activation function then calculating cost using its own activation followed by cost functions. Hence it is a good practice to use: tf.nn.softmax_cross_entropy() over tf.nn.softmax(); tf.nn.cross_entropy()

You can find prominent difference between them in a resource intensive model.


回答 4

曾经发生过的softmax就是logit,这就是J. Hinton一直在Coursera视频中重复的内容。

What ever goes to softmax is logit, this is what J. Hinton repeats in coursera videos all the time.


回答 5

Tensorflow 2.0兼容答案:的解释dga,并stackoverflowuser2010有很详细的关于Logits和相关的功能。

所有这些功能Tensorflow 1.x都可以正常使用,但是如果您从1.x (1.14, 1.15, etc)2.x (2.0, 2.1, etc..),则使用这些功能会导致错误。

因此,如果我们从迁移,请为上面讨论的所有功能指定2.0兼容的调用。 1.x to 2.x为社区的利益。

1.x中的功能

  1. tf.nn.softmax
  2. tf.nn.softmax_cross_entropy_with_logits
  3. tf.nn.sparse_softmax_cross_entropy_with_logits

从1.x迁移到2.x的相应功能

  1. tf.compat.v2.nn.softmax
  2. tf.compat.v2.nn.softmax_cross_entropy_with_logits
  3. tf.compat.v2.nn.sparse_softmax_cross_entropy_with_logits

有关从1.x迁移到2.x的更多信息,请参阅此迁移指南

Tensorflow 2.0 Compatible Answer: The explanations of dga and stackoverflowuser2010 are very detailed about Logits and the related Functions.

All those functions, when used in Tensorflow 1.x will work fine, but if you migrate your code from 1.x (1.14, 1.15, etc) to 2.x (2.0, 2.1, etc..), using those functions result in error.

Hence, specifying the 2.0 Compatible Calls for all the functions, we discussed above, if we migrate from 1.x to 2.x, for the benefit of the community.

Functions in 1.x:

  1. tf.nn.softmax
  2. tf.nn.softmax_cross_entropy_with_logits
  3. tf.nn.sparse_softmax_cross_entropy_with_logits

Respective Functions when Migrated from 1.x to 2.x:

  1. tf.compat.v2.nn.softmax
  2. tf.compat.v2.nn.softmax_cross_entropy_with_logits
  3. tf.compat.v2.nn.sparse_softmax_cross_entropy_with_logits

For more information about migration from 1.x to 2.x, please refer this Migration Guide.


回答 6

我肯定要强调的一件事是logit仅仅是原始输出,通常是最后一层的输出。这也可以是负值。如果我们将其用于“交叉熵”评估,如下所述:

-tf.reduce_sum(y_true * tf.log(logits))

那就行不通了。由于-ve的日志未定义。因此,使用o softmax激活将克服此问题。

这是我的理解,如果我错了,请纠正我。

One more thing that I would definitely like to highlight as logit is just a raw output, generally the output of last layer. This can be a negative value as well. If we use it as it’s for “cross entropy” evaluation as mentioned below:

-tf.reduce_sum(y_true * tf.log(logits))

then it wont work. As log of -ve is not defined. So using o softmax activation, will overcome this problem.

This is my understanding, please correct me if Im wrong.


Tensorflow:如何保存/恢复模型?

问题:Tensorflow:如何保存/恢复模型?

在Tensorflow中训练模型后:

  1. 您如何保存经过训练的模型?
  2. 您以后如何还原此保存的模型?

After you train a model in Tensorflow:

  1. How do you save the trained model?
  2. How do you later restore this saved model?

回答 0

文件

从文档:

# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer)

inc_v1 = v1.assign(v1+1)
dec_v2 = v2.assign(v2-1)

# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  inc_v1.op.run()
  dec_v2.op.run()
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in path: %s" % save_path)

恢复

tf.reset_default_graph()

# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Check the values of the variables
  print("v1 : %s" % v1.eval())
  print("v2 : %s" % v2.eval())

Tensorflow 2

这仍然是beta版,因此我建议不要使用。如果您仍然想走那条路,这里就是tf.saved_model使用指南

Tensorflow <2

simple_save

许多好答案,为完整性起见,我将加2美分:simple_save。也是使用tf.data.DatasetAPI 的独立代码示例。

Python 3; Tensorflow 1.14

import tensorflow as tf
from tensorflow.saved_model import tag_constants

with tf.Graph().as_default():
    with tf.Session() as sess:
        ...

        # Saving
        inputs = {
            "batch_size_placeholder": batch_size_placeholder,
            "features_placeholder": features_placeholder,
            "labels_placeholder": labels_placeholder,
        }
        outputs = {"prediction": model_output}
        tf.saved_model.simple_save(
            sess, 'path/to/your/location/', inputs, outputs
        )

恢复:

graph = tf.Graph()
with restored_graph.as_default():
    with tf.Session() as sess:
        tf.saved_model.loader.load(
            sess,
            [tag_constants.SERVING],
            'path/to/your/location/',
        )
        batch_size_placeholder = graph.get_tensor_by_name('batch_size_placeholder:0')
        features_placeholder = graph.get_tensor_by_name('features_placeholder:0')
        labels_placeholder = graph.get_tensor_by_name('labels_placeholder:0')
        prediction = restored_graph.get_tensor_by_name('dense/BiasAdd:0')

        sess.run(prediction, feed_dict={
            batch_size_placeholder: some_value,
            features_placeholder: some_other_value,
            labels_placeholder: another_value
        })

独立示例

原始博客文章

为了演示,以下代码生成随机数据。

  1. 我们首先创建占位符。它们将在运行时保存数据。根据它们,我们创建Dataset然后Iterator。我们得到迭代器的生成张量,称为input_tensor,它将用作模型的输入。
  2. 模型本身是从 input_tensor基于:基于GRU的双向RNN,然后是密集分类器。因为为什么不。
  3. 损耗为softmax_cross_entropy_with_logits,优化为Adam。在2个时期(每个2批次)之后,我们使用保存了“训练”模型tf.saved_model.simple_save。如果按原样运行代码,则模型将保存在名为simple/当前工作目录下中。
  4. 在新的图形中,然后使用还原保存的模型tf.saved_model.loader.load。我们使用来抢占占位符并登录,graph.get_tensor_by_name并使用进行Iterator初始化操作graph.get_operation_by_name
  5. 最后,我们对数据集中的两个批次进行推断,并检查保存和恢复的模型是否产生相同的值。他们是这样!

码:

import os
import shutil
import numpy as np
import tensorflow as tf
from tensorflow.python.saved_model import tag_constants


def model(graph, input_tensor):
    """Create the model which consists of
    a bidirectional rnn (GRU(10)) followed by a dense classifier

    Args:
        graph (tf.Graph): Tensors' graph
        input_tensor (tf.Tensor): Tensor fed as input to the model

    Returns:
        tf.Tensor: the model's output layer Tensor
    """
    cell = tf.nn.rnn_cell.GRUCell(10)
    with graph.as_default():
        ((fw_outputs, bw_outputs), (fw_state, bw_state)) = tf.nn.bidirectional_dynamic_rnn(
            cell_fw=cell,
            cell_bw=cell,
            inputs=input_tensor,
            sequence_length=[10] * 32,
            dtype=tf.float32,
            swap_memory=True,
            scope=None)
        outputs = tf.concat((fw_outputs, bw_outputs), 2)
        mean = tf.reduce_mean(outputs, axis=1)
        dense = tf.layers.dense(mean, 5, activation=None)

        return dense


def get_opt_op(graph, logits, labels_tensor):
    """Create optimization operation from model's logits and labels

    Args:
        graph (tf.Graph): Tensors' graph
        logits (tf.Tensor): The model's output without activation
        labels_tensor (tf.Tensor): Target labels

    Returns:
        tf.Operation: the operation performing a stem of Adam optimizer
    """
    with graph.as_default():
        with tf.variable_scope('loss'):
            loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
                    logits=logits, labels=labels_tensor, name='xent'),
                    name="mean-xent"
                    )
        with tf.variable_scope('optimizer'):
            opt_op = tf.train.AdamOptimizer(1e-2).minimize(loss)
        return opt_op


if __name__ == '__main__':
    # Set random seed for reproducibility
    # and create synthetic data
    np.random.seed(0)
    features = np.random.randn(64, 10, 30)
    labels = np.eye(5)[np.random.randint(0, 5, (64,))]

    graph1 = tf.Graph()
    with graph1.as_default():
        # Random seed for reproducibility
        tf.set_random_seed(0)
        # Placeholders
        batch_size_ph = tf.placeholder(tf.int64, name='batch_size_ph')
        features_data_ph = tf.placeholder(tf.float32, [None, None, 30], 'features_data_ph')
        labels_data_ph = tf.placeholder(tf.int32, [None, 5], 'labels_data_ph')
        # Dataset
        dataset = tf.data.Dataset.from_tensor_slices((features_data_ph, labels_data_ph))
        dataset = dataset.batch(batch_size_ph)
        iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
        dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')
        input_tensor, labels_tensor = iterator.get_next()

        # Model
        logits = model(graph1, input_tensor)
        # Optimization
        opt_op = get_opt_op(graph1, logits, labels_tensor)

        with tf.Session(graph=graph1) as sess:
            # Initialize variables
            tf.global_variables_initializer().run(session=sess)
            for epoch in range(3):
                batch = 0
                # Initialize dataset (could feed epochs in Dataset.repeat(epochs))
                sess.run(
                    dataset_init_op,
                    feed_dict={
                        features_data_ph: features,
                        labels_data_ph: labels,
                        batch_size_ph: 32
                    })
                values = []
                while True:
                    try:
                        if epoch < 2:
                            # Training
                            _, value = sess.run([opt_op, logits])
                            print('Epoch {}, batch {} | Sample value: {}'.format(epoch, batch, value[0]))
                            batch += 1
                        else:
                            # Final inference
                            values.append(sess.run(logits))
                            print('Epoch {}, batch {} | Final inference | Sample value: {}'.format(epoch, batch, values[-1][0]))
                            batch += 1
                    except tf.errors.OutOfRangeError:
                        break
            # Save model state
            print('\nSaving...')
            cwd = os.getcwd()
            path = os.path.join(cwd, 'simple')
            shutil.rmtree(path, ignore_errors=True)
            inputs_dict = {
                "batch_size_ph": batch_size_ph,
                "features_data_ph": features_data_ph,
                "labels_data_ph": labels_data_ph
            }
            outputs_dict = {
                "logits": logits
            }
            tf.saved_model.simple_save(
                sess, path, inputs_dict, outputs_dict
            )
            print('Ok')
    # Restoring
    graph2 = tf.Graph()
    with graph2.as_default():
        with tf.Session(graph=graph2) as sess:
            # Restore saved values
            print('\nRestoring...')
            tf.saved_model.loader.load(
                sess,
                [tag_constants.SERVING],
                path
            )
            print('Ok')
            # Get restored placeholders
            labels_data_ph = graph2.get_tensor_by_name('labels_data_ph:0')
            features_data_ph = graph2.get_tensor_by_name('features_data_ph:0')
            batch_size_ph = graph2.get_tensor_by_name('batch_size_ph:0')
            # Get restored model output
            restored_logits = graph2.get_tensor_by_name('dense/BiasAdd:0')
            # Get dataset initializing operation
            dataset_init_op = graph2.get_operation_by_name('dataset_init')

            # Initialize restored dataset
            sess.run(
                dataset_init_op,
                feed_dict={
                    features_data_ph: features,
                    labels_data_ph: labels,
                    batch_size_ph: 32
                }

            )
            # Compute inference for both batches in dataset
            restored_values = []
            for i in range(2):
                restored_values.append(sess.run(restored_logits))
                print('Restored values: ', restored_values[i][0])

    # Check if original inference and restored inference are equal
    valid = all((v == rv).all() for v, rv in zip(values, restored_values))
    print('\nInferences match: ', valid)

这将打印:

$ python3 save_and_restore.py

Epoch 0, batch 0 | Sample value: [-0.13851789 -0.3087595   0.12804556  0.20013677 -0.08229901]
Epoch 0, batch 1 | Sample value: [-0.00555491 -0.04339041 -0.05111827 -0.2480045  -0.00107776]
Epoch 1, batch 0 | Sample value: [-0.19321944 -0.2104792  -0.00602257  0.07465433  0.11674127]
Epoch 1, batch 1 | Sample value: [-0.05275984  0.05981954 -0.15913513 -0.3244143   0.10673307]
Epoch 2, batch 0 | Final inference | Sample value: [-0.26331693 -0.13013336 -0.12553    -0.04276478  0.2933622 ]
Epoch 2, batch 1 | Final inference | Sample value: [-0.07730117  0.11119192 -0.20817074 -0.35660955  0.16990358]

Saving...
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: b'/some/path/simple/saved_model.pb'
Ok

Restoring...
INFO:tensorflow:Restoring parameters from b'/some/path/simple/variables/variables'
Ok
Restored values:  [-0.26331693 -0.13013336 -0.12553    -0.04276478  0.2933622 ]
Restored values:  [-0.07730117  0.11119192 -0.20817074 -0.35660955  0.16990358]

Inferences match:  True

Docs

From the docs:

Save

# Create some variables.
v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer)
v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer)

inc_v1 = v1.assign(v1+1)
dec_v2 = v2.assign(v2-1)

# Add an op to initialize the variables.
init_op = tf.global_variables_initializer()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, and save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  inc_v1.op.run()
  dec_v2.op.run()
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in path: %s" % save_path)

Restore

tf.reset_default_graph()

# Create some variables.
v1 = tf.get_variable("v1", shape=[3])
v2 = tf.get_variable("v2", shape=[5])

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Check the values of the variables
  print("v1 : %s" % v1.eval())
  print("v2 : %s" % v2.eval())

Tensorflow 2

This is still beta so I’d advise against for now. If you still want to go down that road here is the tf.saved_model usage guide

Tensorflow < 2

simple_save

Many good answer, for completeness I’ll add my 2 cents: simple_save. Also a standalone code example using the tf.data.Dataset API.

Python 3 ; Tensorflow 1.14

import tensorflow as tf
from tensorflow.saved_model import tag_constants

with tf.Graph().as_default():
    with tf.Session() as sess:
        ...

        # Saving
        inputs = {
            "batch_size_placeholder": batch_size_placeholder,
            "features_placeholder": features_placeholder,
            "labels_placeholder": labels_placeholder,
        }
        outputs = {"prediction": model_output}
        tf.saved_model.simple_save(
            sess, 'path/to/your/location/', inputs, outputs
        )

Restoring:

graph = tf.Graph()
with restored_graph.as_default():
    with tf.Session() as sess:
        tf.saved_model.loader.load(
            sess,
            [tag_constants.SERVING],
            'path/to/your/location/',
        )
        batch_size_placeholder = graph.get_tensor_by_name('batch_size_placeholder:0')
        features_placeholder = graph.get_tensor_by_name('features_placeholder:0')
        labels_placeholder = graph.get_tensor_by_name('labels_placeholder:0')
        prediction = restored_graph.get_tensor_by_name('dense/BiasAdd:0')

        sess.run(prediction, feed_dict={
            batch_size_placeholder: some_value,
            features_placeholder: some_other_value,
            labels_placeholder: another_value
        })

Standalone example

Original blog post

The following code generates random data for the sake of the demonstration.

  1. We start by creating the placeholders. They will hold the data at runtime. From them, we create the Dataset and then its Iterator. We get the iterator’s generated tensor, called input_tensor which will serve as input to our model.
  2. The model itself is built from input_tensor: a GRU-based bidirectional RNN followed by a dense classifier. Because why not.
  3. The loss is a softmax_cross_entropy_with_logits, optimized with Adam. After 2 epochs (of 2 batches each), we save the “trained” model with tf.saved_model.simple_save. If you run the code as is, then the model will be saved in a folder called simple/ in your current working directory.
  4. In a new graph, we then restore the saved model with tf.saved_model.loader.load. We grab the placeholders and logits with graph.get_tensor_by_name and the Iterator initializing operation with graph.get_operation_by_name.
  5. Lastly we run an inference for both batches in the dataset, and check that the saved and restored model both yield the same values. They do!

Code:

import os
import shutil
import numpy as np
import tensorflow as tf
from tensorflow.python.saved_model import tag_constants


def model(graph, input_tensor):
    """Create the model which consists of
    a bidirectional rnn (GRU(10)) followed by a dense classifier

    Args:
        graph (tf.Graph): Tensors' graph
        input_tensor (tf.Tensor): Tensor fed as input to the model

    Returns:
        tf.Tensor: the model's output layer Tensor
    """
    cell = tf.nn.rnn_cell.GRUCell(10)
    with graph.as_default():
        ((fw_outputs, bw_outputs), (fw_state, bw_state)) = tf.nn.bidirectional_dynamic_rnn(
            cell_fw=cell,
            cell_bw=cell,
            inputs=input_tensor,
            sequence_length=[10] * 32,
            dtype=tf.float32,
            swap_memory=True,
            scope=None)
        outputs = tf.concat((fw_outputs, bw_outputs), 2)
        mean = tf.reduce_mean(outputs, axis=1)
        dense = tf.layers.dense(mean, 5, activation=None)

        return dense


def get_opt_op(graph, logits, labels_tensor):
    """Create optimization operation from model's logits and labels

    Args:
        graph (tf.Graph): Tensors' graph
        logits (tf.Tensor): The model's output without activation
        labels_tensor (tf.Tensor): Target labels

    Returns:
        tf.Operation: the operation performing a stem of Adam optimizer
    """
    with graph.as_default():
        with tf.variable_scope('loss'):
            loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
                    logits=logits, labels=labels_tensor, name='xent'),
                    name="mean-xent"
                    )
        with tf.variable_scope('optimizer'):
            opt_op = tf.train.AdamOptimizer(1e-2).minimize(loss)
        return opt_op


if __name__ == '__main__':
    # Set random seed for reproducibility
    # and create synthetic data
    np.random.seed(0)
    features = np.random.randn(64, 10, 30)
    labels = np.eye(5)[np.random.randint(0, 5, (64,))]

    graph1 = tf.Graph()
    with graph1.as_default():
        # Random seed for reproducibility
        tf.set_random_seed(0)
        # Placeholders
        batch_size_ph = tf.placeholder(tf.int64, name='batch_size_ph')
        features_data_ph = tf.placeholder(tf.float32, [None, None, 30], 'features_data_ph')
        labels_data_ph = tf.placeholder(tf.int32, [None, 5], 'labels_data_ph')
        # Dataset
        dataset = tf.data.Dataset.from_tensor_slices((features_data_ph, labels_data_ph))
        dataset = dataset.batch(batch_size_ph)
        iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
        dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')
        input_tensor, labels_tensor = iterator.get_next()

        # Model
        logits = model(graph1, input_tensor)
        # Optimization
        opt_op = get_opt_op(graph1, logits, labels_tensor)

        with tf.Session(graph=graph1) as sess:
            # Initialize variables
            tf.global_variables_initializer().run(session=sess)
            for epoch in range(3):
                batch = 0
                # Initialize dataset (could feed epochs in Dataset.repeat(epochs))
                sess.run(
                    dataset_init_op,
                    feed_dict={
                        features_data_ph: features,
                        labels_data_ph: labels,
                        batch_size_ph: 32
                    })
                values = []
                while True:
                    try:
                        if epoch < 2:
                            # Training
                            _, value = sess.run([opt_op, logits])
                            print('Epoch {}, batch {} | Sample value: {}'.format(epoch, batch, value[0]))
                            batch += 1
                        else:
                            # Final inference
                            values.append(sess.run(logits))
                            print('Epoch {}, batch {} | Final inference | Sample value: {}'.format(epoch, batch, values[-1][0]))
                            batch += 1
                    except tf.errors.OutOfRangeError:
                        break
            # Save model state
            print('\nSaving...')
            cwd = os.getcwd()
            path = os.path.join(cwd, 'simple')
            shutil.rmtree(path, ignore_errors=True)
            inputs_dict = {
                "batch_size_ph": batch_size_ph,
                "features_data_ph": features_data_ph,
                "labels_data_ph": labels_data_ph
            }
            outputs_dict = {
                "logits": logits
            }
            tf.saved_model.simple_save(
                sess, path, inputs_dict, outputs_dict
            )
            print('Ok')
    # Restoring
    graph2 = tf.Graph()
    with graph2.as_default():
        with tf.Session(graph=graph2) as sess:
            # Restore saved values
            print('\nRestoring...')
            tf.saved_model.loader.load(
                sess,
                [tag_constants.SERVING],
                path
            )
            print('Ok')
            # Get restored placeholders
            labels_data_ph = graph2.get_tensor_by_name('labels_data_ph:0')
            features_data_ph = graph2.get_tensor_by_name('features_data_ph:0')
            batch_size_ph = graph2.get_tensor_by_name('batch_size_ph:0')
            # Get restored model output
            restored_logits = graph2.get_tensor_by_name('dense/BiasAdd:0')
            # Get dataset initializing operation
            dataset_init_op = graph2.get_operation_by_name('dataset_init')

            # Initialize restored dataset
            sess.run(
                dataset_init_op,
                feed_dict={
                    features_data_ph: features,
                    labels_data_ph: labels,
                    batch_size_ph: 32
                }

            )
            # Compute inference for both batches in dataset
            restored_values = []
            for i in range(2):
                restored_values.append(sess.run(restored_logits))
                print('Restored values: ', restored_values[i][0])

    # Check if original inference and restored inference are equal
    valid = all((v == rv).all() for v, rv in zip(values, restored_values))
    print('\nInferences match: ', valid)

This will print:

$ python3 save_and_restore.py

Epoch 0, batch 0 | Sample value: [-0.13851789 -0.3087595   0.12804556  0.20013677 -0.08229901]
Epoch 0, batch 1 | Sample value: [-0.00555491 -0.04339041 -0.05111827 -0.2480045  -0.00107776]
Epoch 1, batch 0 | Sample value: [-0.19321944 -0.2104792  -0.00602257  0.07465433  0.11674127]
Epoch 1, batch 1 | Sample value: [-0.05275984  0.05981954 -0.15913513 -0.3244143   0.10673307]
Epoch 2, batch 0 | Final inference | Sample value: [-0.26331693 -0.13013336 -0.12553    -0.04276478  0.2933622 ]
Epoch 2, batch 1 | Final inference | Sample value: [-0.07730117  0.11119192 -0.20817074 -0.35660955  0.16990358]

Saving...
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: b'/some/path/simple/saved_model.pb'
Ok

Restoring...
INFO:tensorflow:Restoring parameters from b'/some/path/simple/variables/variables'
Ok
Restored values:  [-0.26331693 -0.13013336 -0.12553    -0.04276478  0.2933622 ]
Restored values:  [-0.07730117  0.11119192 -0.20817074 -0.35660955  0.16990358]

Inferences match:  True

回答 1

我正在改善我的答案,以添加更多有关保存和还原模型的详细信息。

Tensorflow版本0.11中(及之后):

保存模型:

import tensorflow as tf

#Prepare to feed input, i.e. feed_dict and placeholders
w1 = tf.placeholder("float", name="w1")
w2 = tf.placeholder("float", name="w2")
b1= tf.Variable(2.0,name="bias")
feed_dict ={w1:4,w2:8}

#Define a test operation that we will restore
w3 = tf.add(w1,w2)
w4 = tf.multiply(w3,b1,name="op_to_restore")
sess = tf.Session()
sess.run(tf.global_variables_initializer())

#Create a saver object which will save all the variables
saver = tf.train.Saver()

#Run the operation by feeding input
print sess.run(w4,feed_dict)
#Prints 24 which is sum of (w1+w2)*b1 

#Now, save the graph
saver.save(sess, 'my_test_model',global_step=1000)

还原模型:

import tensorflow as tf

sess=tf.Session()    
#First let's load meta graph and restore weights
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
saver.restore(sess,tf.train.latest_checkpoint('./'))


# Access saved Variables directly
print(sess.run('bias:0'))
# This will print 2, which is the value of bias that we saved


# Now, let's access and create placeholders variables and
# create feed-dict to feed new data

graph = tf.get_default_graph()
w1 = graph.get_tensor_by_name("w1:0")
w2 = graph.get_tensor_by_name("w2:0")
feed_dict ={w1:13.0,w2:17.0}

#Now, access the op that you want to run. 
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")

print sess.run(op_to_restore,feed_dict)
#This will print 60 which is calculated 

这里和一些更高级的用例已经很好地解释了。

快速完整的教程,用于保存和恢复Tensorflow模型

I am improving my answer to add more details for saving and restoring models.

In(and after) Tensorflow version 0.11:

Save the model:

import tensorflow as tf

#Prepare to feed input, i.e. feed_dict and placeholders
w1 = tf.placeholder("float", name="w1")
w2 = tf.placeholder("float", name="w2")
b1= tf.Variable(2.0,name="bias")
feed_dict ={w1:4,w2:8}

#Define a test operation that we will restore
w3 = tf.add(w1,w2)
w4 = tf.multiply(w3,b1,name="op_to_restore")
sess = tf.Session()
sess.run(tf.global_variables_initializer())

#Create a saver object which will save all the variables
saver = tf.train.Saver()

#Run the operation by feeding input
print sess.run(w4,feed_dict)
#Prints 24 which is sum of (w1+w2)*b1 

#Now, save the graph
saver.save(sess, 'my_test_model',global_step=1000)

Restore the model:

import tensorflow as tf

sess=tf.Session()    
#First let's load meta graph and restore weights
saver = tf.train.import_meta_graph('my_test_model-1000.meta')
saver.restore(sess,tf.train.latest_checkpoint('./'))


# Access saved Variables directly
print(sess.run('bias:0'))
# This will print 2, which is the value of bias that we saved


# Now, let's access and create placeholders variables and
# create feed-dict to feed new data

graph = tf.get_default_graph()
w1 = graph.get_tensor_by_name("w1:0")
w2 = graph.get_tensor_by_name("w2:0")
feed_dict ={w1:13.0,w2:17.0}

#Now, access the op that you want to run. 
op_to_restore = graph.get_tensor_by_name("op_to_restore:0")

print sess.run(op_to_restore,feed_dict)
#This will print 60 which is calculated 

This and some more advanced use-cases have been explained very well here.

A quick complete tutorial to save and restore Tensorflow models


回答 2

在TensorFlow版本0.11.0RC1中(及之后),您可以直接调用tf.train.export_meta_graphtf.train.import_meta_graph根据https://www.tensorflow.org/programmers_guide/meta_graph来保存和恢复模型。

保存模型

w1 = tf.Variable(tf.truncated_normal(shape=[10]), name='w1')
w2 = tf.Variable(tf.truncated_normal(shape=[20]), name='w2')
tf.add_to_collection('vars', w1)
tf.add_to_collection('vars', w2)
saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.save(sess, 'my-model')
# `save` method will call `export_meta_graph` implicitly.
# you will get saved graph files:my-model.meta

恢复模型

sess = tf.Session()
new_saver = tf.train.import_meta_graph('my-model.meta')
new_saver.restore(sess, tf.train.latest_checkpoint('./'))
all_vars = tf.get_collection('vars')
for v in all_vars:
    v_ = sess.run(v)
    print(v_)

In (and after) TensorFlow version 0.11.0RC1, you can save and restore your model directly by calling tf.train.export_meta_graph and tf.train.import_meta_graph according to https://www.tensorflow.org/programmers_guide/meta_graph.

Save the model

w1 = tf.Variable(tf.truncated_normal(shape=[10]), name='w1')
w2 = tf.Variable(tf.truncated_normal(shape=[20]), name='w2')
tf.add_to_collection('vars', w1)
tf.add_to_collection('vars', w2)
saver = tf.train.Saver()
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.save(sess, 'my-model')
# `save` method will call `export_meta_graph` implicitly.
# you will get saved graph files:my-model.meta

Restore the model

sess = tf.Session()
new_saver = tf.train.import_meta_graph('my-model.meta')
new_saver.restore(sess, tf.train.latest_checkpoint('./'))
all_vars = tf.get_collection('vars')
for v in all_vars:
    v_ = sess.run(v)
    print(v_)

回答 3

对于TensorFlow版本<0.11.0RC1:

保存的检查点包含以下值: Variable模型中 s,而不是模型/图形本身,这意味着在还原检查点时,图形应相同。

这是线性回归的示例,其中存在一个训练循环,该循环保存变量检查点,而评估部分将恢复先前运行中保存的变量并计算预测。当然,您也可以根据需要恢复变量并继续训练。

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

w = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))
b = tf.Variable(tf.ones([1, 1], dtype=tf.float32))
y_hat = tf.add(b, tf.matmul(x, w))

...more setup for optimization and what not...

saver = tf.train.Saver()  # defaults to saving all variables - in this case w and b

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    if FLAGS.train:
        for i in xrange(FLAGS.training_steps):
            ...training loop...
            if (i + 1) % FLAGS.checkpoint_steps == 0:
                saver.save(sess, FLAGS.checkpoint_dir + 'model.ckpt',
                           global_step=i+1)
    else:
        # Here's where you're restoring the variables w and b.
        # Note that the graph is exactly as it was when the variables were
        # saved in a prior training run.
        ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)
        else:
            ...no checkpoint found...

        # Now you can run the model to get predictions
        batch_x = ...load some data...
        predictions = sess.run(y_hat, feed_dict={x: batch_x})

下面是文档Variables,这包括保存和恢复。这里是文档Saver

For TensorFlow version < 0.11.0RC1:

The checkpoints that are saved contain values for the Variables in your model, not the model/graph itself, which means that the graph should be the same when you restore the checkpoint.

Here’s an example for a linear regression where there’s a training loop that saves variable checkpoints and an evaluation section that will restore variables saved in a prior run and compute predictions. Of course, you can also restore variables and continue training if you’d like.

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

w = tf.Variable(tf.zeros([1, 1], dtype=tf.float32))
b = tf.Variable(tf.ones([1, 1], dtype=tf.float32))
y_hat = tf.add(b, tf.matmul(x, w))

...more setup for optimization and what not...

saver = tf.train.Saver()  # defaults to saving all variables - in this case w and b

with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    if FLAGS.train:
        for i in xrange(FLAGS.training_steps):
            ...training loop...
            if (i + 1) % FLAGS.checkpoint_steps == 0:
                saver.save(sess, FLAGS.checkpoint_dir + 'model.ckpt',
                           global_step=i+1)
    else:
        # Here's where you're restoring the variables w and b.
        # Note that the graph is exactly as it was when the variables were
        # saved in a prior training run.
        ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
        if ckpt and ckpt.model_checkpoint_path:
            saver.restore(sess, ckpt.model_checkpoint_path)
        else:
            ...no checkpoint found...

        # Now you can run the model to get predictions
        batch_x = ...load some data...
        predictions = sess.run(y_hat, feed_dict={x: batch_x})

Here are the docs for Variables, which cover saving and restoring. And here are the docs for the Saver.


回答 4

我的环境:Python 3.6,Tensorflow 1.3.0

尽管有许多解决方案,但是大多数解决方案都基于tf.train.Saver。当我们加载.ckpt保存的Saver,我们必须要么重新定义tensorflow网络,或者使用一些奇怪的和难以记住的名称,例如'placehold_0:0''dense/Adam/Weight:0'。我建议在这里使用tf.saved_model下面给出的一个最简单的示例,您可以从服务TensorFlow模型中了解更多信息:

保存模型:

import tensorflow as tf

# define the tensorflow network and do some trains
x = tf.placeholder("float", name="x")
w = tf.Variable(2.0, name="w")
b = tf.Variable(0.0, name="bias")

h = tf.multiply(x, w)
y = tf.add(h, b, name="y")
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# save the model
export_path =  './savedmodel'
builder = tf.saved_model.builder.SavedModelBuilder(export_path)

tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
tensor_info_y = tf.saved_model.utils.build_tensor_info(y)

prediction_signature = (
  tf.saved_model.signature_def_utils.build_signature_def(
      inputs={'x_input': tensor_info_x},
      outputs={'y_output': tensor_info_y},
      method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

builder.add_meta_graph_and_variables(
  sess, [tf.saved_model.tag_constants.SERVING],
  signature_def_map={
      tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
          prediction_signature 
  },
  )
builder.save()

加载模型:

import tensorflow as tf
sess=tf.Session() 
signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
input_key = 'x_input'
output_key = 'y_output'

export_path =  './savedmodel'
meta_graph_def = tf.saved_model.loader.load(
           sess,
          [tf.saved_model.tag_constants.SERVING],
          export_path)
signature = meta_graph_def.signature_def

x_tensor_name = signature[signature_key].inputs[input_key].name
y_tensor_name = signature[signature_key].outputs[output_key].name

x = sess.graph.get_tensor_by_name(x_tensor_name)
y = sess.graph.get_tensor_by_name(y_tensor_name)

y_out = sess.run(y, {x: 3.0})

My environment: Python 3.6, Tensorflow 1.3.0

Though there have been many solutions, most of them is based on tf.train.Saver. When we load a .ckpt saved by Saver, we have to either redefine the tensorflow network or use some weird and hard-remembered name, e.g. 'placehold_0:0','dense/Adam/Weight:0'. Here I recommend to use tf.saved_model, one simplest example given below, your can learn more from Serving a TensorFlow Model:

Save the model:

import tensorflow as tf

# define the tensorflow network and do some trains
x = tf.placeholder("float", name="x")
w = tf.Variable(2.0, name="w")
b = tf.Variable(0.0, name="bias")

h = tf.multiply(x, w)
y = tf.add(h, b, name="y")
sess = tf.Session()
sess.run(tf.global_variables_initializer())

# save the model
export_path =  './savedmodel'
builder = tf.saved_model.builder.SavedModelBuilder(export_path)

tensor_info_x = tf.saved_model.utils.build_tensor_info(x)
tensor_info_y = tf.saved_model.utils.build_tensor_info(y)

prediction_signature = (
  tf.saved_model.signature_def_utils.build_signature_def(
      inputs={'x_input': tensor_info_x},
      outputs={'y_output': tensor_info_y},
      method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME))

builder.add_meta_graph_and_variables(
  sess, [tf.saved_model.tag_constants.SERVING],
  signature_def_map={
      tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY:
          prediction_signature 
  },
  )
builder.save()

Load the model:

import tensorflow as tf
sess=tf.Session() 
signature_key = tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY
input_key = 'x_input'
output_key = 'y_output'

export_path =  './savedmodel'
meta_graph_def = tf.saved_model.loader.load(
           sess,
          [tf.saved_model.tag_constants.SERVING],
          export_path)
signature = meta_graph_def.signature_def

x_tensor_name = signature[signature_key].inputs[input_key].name
y_tensor_name = signature[signature_key].outputs[output_key].name

x = sess.graph.get_tensor_by_name(x_tensor_name)
y = sess.graph.get_tensor_by_name(y_tensor_name)

y_out = sess.run(y, {x: 3.0})

回答 5

有两个部分的模型,该模型定义,通过保存Supervisorgraph.pbtxt模型中的目录和张量的数值,保存到像检查点文件model.ckpt-1003418

可以使用还原模型定义tf.import_graph_def,并使用还原权重Saver

但是,Saver使用包含在模型Graph上的变量的特殊集合保存列表,并且此集合未使用import_graph_def进行初始化,因此您目前无法将两者一起使用(正在修复中)。现在,您必须使用Ryan Sepassi的方法-手动构造具有相同节点名称的图,然后Saver将权重加载到其中。

(或者,您可以通过使用import_graph_def,手动创建变量,并tf.add_to_collection(tf.GraphKeys.VARIABLES, variable)针对每个变量使用,然后使用来对其进行破解Saver

There are two parts to the model, the model definition, saved by Supervisor as graph.pbtxt in the model directory and the numerical values of tensors, saved into checkpoint files like model.ckpt-1003418.

The model definition can be restored using tf.import_graph_def, and the weights are restored using Saver.

However, Saver uses special collection holding list of variables that’s attached to the model Graph, and this collection is not initialized using import_graph_def, so you can’t use the two together at the moment (it’s on our roadmap to fix). For now, you have to use approach of Ryan Sepassi — manually construct a graph with identical node names, and use Saver to load the weights into it.

(Alternatively you could hack it by using by using import_graph_def, creating variables manually, and using tf.add_to_collection(tf.GraphKeys.VARIABLES, variable) for each variable, then using Saver)


回答 6

您也可以采用这种更简单的方法。

步骤1:初始化所有变量

W1 = tf.Variable(tf.truncated_normal([6, 6, 1, K], stddev=0.1), name="W1")
B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]), name="B1")

Similarly, W2, B2, W3, .....

步骤2:Saver将会话保存在模型中并保存

model_saver = tf.train.Saver()

# Train the model and save it in the end
model_saver.save(session, "saved_models/CNN_New.ckpt")

步骤3:还原模型

with tf.Session(graph=graph_cnn) as session:
    model_saver.restore(session, "saved_models/CNN_New.ckpt")
    print("Model restored.") 
    print('Initialized')

第4步:检查您的变量

W1 = session.run(W1)
print(W1)

在其他python实例中运行时,请使用

with tf.Session() as sess:
    # Restore latest checkpoint
    saver.restore(sess, tf.train.latest_checkpoint('saved_model/.'))

    # Initalize the variables
    sess.run(tf.global_variables_initializer())

    # Get default graph (supply your custom graph if you have one)
    graph = tf.get_default_graph()

    # It will give tensor object
    W1 = graph.get_tensor_by_name('W1:0')

    # To get the value (numpy array)
    W1_value = session.run(W1)

You can also take this easier way.

Step 1: initialize all your variables

W1 = tf.Variable(tf.truncated_normal([6, 6, 1, K], stddev=0.1), name="W1")
B1 = tf.Variable(tf.constant(0.1, tf.float32, [K]), name="B1")

Similarly, W2, B2, W3, .....

Step 2: save the session inside model Saver and save it

model_saver = tf.train.Saver()

# Train the model and save it in the end
model_saver.save(session, "saved_models/CNN_New.ckpt")

Step 3: restore the model

with tf.Session(graph=graph_cnn) as session:
    model_saver.restore(session, "saved_models/CNN_New.ckpt")
    print("Model restored.") 
    print('Initialized')

Step 4: check your variable

W1 = session.run(W1)
print(W1)

While running in different python instance, use

with tf.Session() as sess:
    # Restore latest checkpoint
    saver.restore(sess, tf.train.latest_checkpoint('saved_model/.'))

    # Initalize the variables
    sess.run(tf.global_variables_initializer())

    # Get default graph (supply your custom graph if you have one)
    graph = tf.get_default_graph()

    # It will give tensor object
    W1 = graph.get_tensor_by_name('W1:0')

    # To get the value (numpy array)
    W1_value = session.run(W1)

回答 7

在大多数情况下,使用a从磁盘保存和还原tf.train.Saver是最佳选择:

... # build your model
saver = tf.train.Saver()

with tf.Session() as sess:
    ... # train the model
    saver.save(sess, "/tmp/my_great_model")

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_great_model")
    ... # use the model

您也可以保存/恢复图结构本身(有关详细信息,请参见MetaGraph文档)。默认情况下,Saver将图形结构保存到.meta文件中。您可以调用import_meta_graph()进行恢复。它还原图形结构并返回一个Saver可用于还原模型状态的:

saver = tf.train.import_meta_graph("/tmp/my_great_model.meta")

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_great_model")
    ... # use the model

但是,在某些情况下,您需要更快的速度。例如,如果实施提前停止,则希望在训练过程中每次模型改进时都保存检查点(以验证集为准),然后如果一段时间没有进展,则希望回滚到最佳模型。如果您在每次改进时都将模型保存到磁盘,则会极大地减慢训练速度。诀窍是将变量状态保存到内存中,然后稍后再恢复它们:

... # build your model

# get a handle on the graph nodes we need to save/restore the model
graph = tf.get_default_graph()
gvars = graph.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
assign_ops = [graph.get_operation_by_name(v.op.name + "/Assign") for v in gvars]
init_values = [assign_op.inputs[1] for assign_op in assign_ops]

with tf.Session() as sess:
    ... # train the model

    # when needed, save the model state to memory
    gvars_state = sess.run(gvars)

    # when needed, restore the model state
    feed_dict = {init_value: val
                 for init_value, val in zip(init_values, gvars_state)}
    sess.run(assign_ops, feed_dict=feed_dict)

快速说明:创建变量时X,TensorFlow自动创建一个赋值操作X/Assign以设置变量的初始值。与其创建占位符和额外的分配操作(这只会使图形混乱),我们仅使用这些现有的分配操作。每个赋值op的第一个输入是对应该初始化的变量的引用,第二个输入(assign_op.inputs[1])是初始值。因此,为了设置所需的任何值(而不是初始值),我们需要使用a feed_dict并替换初始值。是的,TensorFlow允许您为任何操作提供值,而不仅仅是占位符,因此可以正常工作。

In most cases, saving and restoring from disk using a tf.train.Saver is your best option:

... # build your model
saver = tf.train.Saver()

with tf.Session() as sess:
    ... # train the model
    saver.save(sess, "/tmp/my_great_model")

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_great_model")
    ... # use the model

You can also save/restore the graph structure itself (see the MetaGraph documentation for details). By default, the Saver saves the graph structure into a .meta file. You can call import_meta_graph() to restore it. It restores the graph structure and returns a Saver that you can use to restore the model’s state:

saver = tf.train.import_meta_graph("/tmp/my_great_model.meta")

with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_great_model")
    ... # use the model

However, there are cases where you need something much faster. For example, if you implement early stopping, you want to save checkpoints every time the model improves during training (as measured on the validation set), then if there is no progress for some time, you want to roll back to the best model. If you save the model to disk every time it improves, it will tremendously slow down training. The trick is to save the variable states to memory, then just restore them later:

... # build your model

# get a handle on the graph nodes we need to save/restore the model
graph = tf.get_default_graph()
gvars = graph.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
assign_ops = [graph.get_operation_by_name(v.op.name + "/Assign") for v in gvars]
init_values = [assign_op.inputs[1] for assign_op in assign_ops]

with tf.Session() as sess:
    ... # train the model

    # when needed, save the model state to memory
    gvars_state = sess.run(gvars)

    # when needed, restore the model state
    feed_dict = {init_value: val
                 for init_value, val in zip(init_values, gvars_state)}
    sess.run(assign_ops, feed_dict=feed_dict)

A quick explanation: when you create a variable X, TensorFlow automatically creates an assignment operation X/Assign to set the variable’s initial value. Instead of creating placeholders and extra assignment ops (which would just make the graph messy), we just use these existing assignment ops. The first input of each assignment op is a reference to the variable it is supposed to initialize, and the second input (assign_op.inputs[1]) is the initial value. So in order to set any value we want (instead of the initial value), we need to use a feed_dict and replace the initial value. Yes, TensorFlow lets you feed a value for any op, not just for placeholders, so this works fine.


回答 8

正如Yaroslav所说,您可以通过导入图形,手动创建变量然后使用Saver来从graph_def和检查点恢复。

我将其实现为个人使用,因此尽管我在这里共享了代码。

链接: https //gist.github.com/nikitakit/6ef3b72be67b86cb7868

(当然,这是黑客,不能保证以此方式保存的模型在TensorFlow的未来版本中仍可读取。)

As Yaroslav said, you can hack restoring from a graph_def and checkpoint by importing the graph, manually creating variables, and then using a Saver.

I implemented this for my personal use, so I though I’d share the code here.

Link: https://gist.github.com/nikitakit/6ef3b72be67b86cb7868

(This is, of course, a hack, and there is no guarantee that models saved this way will remain readable in future versions of TensorFlow.)


回答 9

如果是内部保存的模型,则只需为所有变量指定一个还原器即可

restorer = tf.train.Saver(tf.all_variables())

并使用它来还原当前会话中的变量:

restorer.restore(self._sess, model_file)

对于外部模型,您需要指定从其变量名到变量名的映射。您可以使用以下命令查看模型变量名称

python /path/to/tensorflow/tensorflow/python/tools/inspect_checkpoint.py --file_name=/path/to/pretrained_model/model.ckpt

可以在Tensorflow源的’./tensorflow/python/tools’文件夹中找到inspect_checkpoint.py脚本。

要指定映射,您可以使用我的Tensorflow-Worklab,其中包含一组用于训练和重新训练不同模型的类和脚本。它包含一个重新训练ResNet模型的示例,位于此处

If it is an internally saved model, you just specify a restorer for all variables as

restorer = tf.train.Saver(tf.all_variables())

and use it to restore variables in a current session:

restorer.restore(self._sess, model_file)

For the external model you need to specify the mapping from the its variable names to your variable names. You can view the model variable names using the command

python /path/to/tensorflow/tensorflow/python/tools/inspect_checkpoint.py --file_name=/path/to/pretrained_model/model.ckpt

The inspect_checkpoint.py script can be found in ‘./tensorflow/python/tools’ folder of the Tensorflow source.

To specify the mapping, you can use my Tensorflow-Worklab, which contains a set of classes and scripts to train and retrain different models. It includes an example of retraining ResNet models, located here


回答 10

这是我针对两种基本情况的简单解决方案,不同之处在于您是要从文件中加载图形还是在运行时构建图形。

该答案适用于Tensorflow 0.12+(包括1.0)。

在代码中重建图形

保存

graph = ... # build the graph
saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.save(sess, 'my-model')

载入中

graph = ... # build the graph
saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    # now you can use the graph, continue training or whatever

还从文件加载图形

使用此技术时,请确保所有图层/变量均已明确设置唯一名称。否则,Tensorflow将使名称本身具有唯一性,因此它们将与文件中存储的名称不同。在以前的技术中这不是问题,因为在加载和保存时都以相同的方式“混合”了名称。

保存

graph = ... # build the graph

for op in [ ... ]:  # operators you want to use after restoring the model
    tf.add_to_collection('ops_to_restore', op)

saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.save(sess, 'my-model')

载入中

with ... as sess:  # your session object
    saver = tf.train.import_meta_graph('my-model.meta')
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    ops = tf.get_collection('ops_to_restore')  # here are your operators in the same order in which you saved them to the collection

Here’s my simple solution for the two basic cases differing on whether you want to load the graph from file or build it during runtime.

This answer holds for Tensorflow 0.12+ (including 1.0).

Rebuilding the graph in code

Saving

graph = ... # build the graph
saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.save(sess, 'my-model')

Loading

graph = ... # build the graph
saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    # now you can use the graph, continue training or whatever

Loading also the graph from a file

When using this technique, make sure all your layers/variables have explicitly set unique names. Otherwise Tensorflow will make the names unique itself and they’ll be thus different from the names stored in the file. It’s not a problem in the previous technique, because the names are “mangled” the same way in both loading and saving.

Saving

graph = ... # build the graph

for op in [ ... ]:  # operators you want to use after restoring the model
    tf.add_to_collection('ops_to_restore', op)

saver = tf.train.Saver()  # create the saver after the graph
with ... as sess:  # your session object
    saver.save(sess, 'my-model')

Loading

with ... as sess:  # your session object
    saver = tf.train.import_meta_graph('my-model.meta')
    saver.restore(sess, tf.train.latest_checkpoint('./'))
    ops = tf.get_collection('ops_to_restore')  # here are your operators in the same order in which you saved them to the collection

回答 11

您还可以在TensorFlow / skflow中查看示例,这些示例和方法可以帮助您轻松管理模型。它具有的参数还可以控制您备份模型的频率。saverestore

You can also check out examples in TensorFlow/skflow, which offers save and restore methods that can help you easily manage your models. It has parameters that you can also control how frequently you want to back up your model.


回答 12

如果您将tf.train.MonitoredTrainingSession用作默认会话,则无需添加额外的代码即可保存/还原内容。只需将检查点目录名传递给MonitoredTrainingSession的构造函数,它将使用会话挂钩来处理这些。

If you use tf.train.MonitoredTrainingSession as the default session, you don’t need to add extra code to do save/restore things. Just pass a checkpoint dir name to MonitoredTrainingSession’s constructor, it will use session hooks to handle these.


回答 13

这里的所有答案都很好,但我想补充两点。

首先,要详细说明@ user7505159的答案,将“ ./”添加到要还原的文件名的开头很重要。

例如,您可以保存一个图形,文件名中不包含“ ./”,如下所示:

# Some graph defined up here with specific names

saver = tf.train.Saver()
save_file = 'model.ckpt'

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

但是为了还原图形,您可能需要在file_name前面加上“ ./”:

# Same graph defined up here

saver = tf.train.Saver()
save_file = './' + 'model.ckpt' # String addition used for emphasis

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, save_file)

您不一定总是需要“ ./”,但是根据您的环境和TensorFlow的版本,它可能会引起问题。

它还要提到,sess.run(tf.global_variables_initializer())在恢复会话之前,这可能很重要。

如果在尝试还原已保存的会话时收到关于未初始化变量的错误,请确保在行sess.run(tf.global_variables_initializer())之前包括saver.restore(sess, save_file)。它可以节省您的头痛。

All the answers here are great, but I want to add two things.

First, to elaborate on @user7505159’s answer, the “./” can be important to add to the beginning of the file name that you are restoring.

For example, you can save a graph with no “./” in the file name like so:

# Some graph defined up here with specific names

saver = tf.train.Saver()
save_file = 'model.ckpt'

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

But in order to restore the graph, you may need to prepend a “./” to the file_name:

# Same graph defined up here

saver = tf.train.Saver()
save_file = './' + 'model.ckpt' # String addition used for emphasis

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, save_file)

You will not always need the “./”, but it can cause problems depending on your environment and version of TensorFlow.

It also want to mention that the sess.run(tf.global_variables_initializer()) can be important before restoring the session.

If you are receiving an error regarding uninitialized variables when trying to restore a saved session, make sure you include sess.run(tf.global_variables_initializer()) before the saver.restore(sess, save_file) line. It can save you a headache.


回答 14

如问题6255中所述

use '**./**model_name.ckpt'
saver.restore(sess,'./my_model_final.ckpt')

代替

saver.restore('my_model_final.ckpt')

As described in issue 6255:

use '**./**model_name.ckpt'
saver.restore(sess,'./my_model_final.ckpt')

instead of

saver.restore('my_model_final.ckpt')

回答 15

根据新的Tensorflow版本,tf.train.Checkpoint保存和还原模型的首选方法是:

Checkpoint.saveCheckpoint.restore写入和读取基于对象的检查点,而tf.train.Saver则写入和读取基于variable.name的检查点。基于对象的检查点保存带有命名边的Python对象(层,优化程序,变量等)之间的依存关系图,该图用于在恢复检查点时匹配变量。它对Python程序中的更改可能更健壮,并有助于在急切执行时支持变量的创建时恢复。身高tf.train.Checkpoint超过 tf.train.Saver对新代码

这是一个例子:

import tensorflow as tf
import os

tf.enable_eager_execution()

checkpoint_directory = "/tmp/training_checkpoints"
checkpoint_prefix = os.path.join(checkpoint_directory, "ckpt")

checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)
status = checkpoint.restore(tf.train.latest_checkpoint(checkpoint_directory))
for _ in range(num_training_steps):
  optimizer.minimize( ... )  # Variables will be restored on creation.
status.assert_consumed()  # Optional sanity checks.
checkpoint.save(file_prefix=checkpoint_prefix)

更多信息和示例在这里。

According to the new Tensorflow version, tf.train.Checkpoint is the preferable way of saving and restoring a model:

Checkpoint.save and Checkpoint.restore write and read object-based checkpoints, in contrast to tf.train.Saver which writes and reads variable.name based checkpoints. Object-based checkpointing saves a graph of dependencies between Python objects (Layers, Optimizers, Variables, etc.) with named edges, and this graph is used to match variables when restoring a checkpoint. It can be more robust to changes in the Python program, and helps to support restore-on-create for variables when executing eagerly. Prefer tf.train.Checkpoint over tf.train.Saver for new code.

Here is an example:

import tensorflow as tf
import os

tf.enable_eager_execution()

checkpoint_directory = "/tmp/training_checkpoints"
checkpoint_prefix = os.path.join(checkpoint_directory, "ckpt")

checkpoint = tf.train.Checkpoint(optimizer=optimizer, model=model)
status = checkpoint.restore(tf.train.latest_checkpoint(checkpoint_directory))
for _ in range(num_training_steps):
  optimizer.minimize( ... )  # Variables will be restored on creation.
status.assert_consumed()  # Optional sanity checks.
checkpoint.save(file_prefix=checkpoint_prefix)

More information and example here.


回答 16

对于tensorflow 2.0,它很简单

# Save the model
model.save('path_to_my_model.h5')

恢复:

new_model = tensorflow.keras.models.load_model('path_to_my_model.h5')

For tensorflow 2.0, it is as simple as

# Save the model
model.save('path_to_my_model.h5')

To restore:

new_model = tensorflow.keras.models.load_model('path_to_my_model.h5')

回答 17

tf.keras模型保存 TF2.0

对于使用TF1.x保存模型,我看到了很好的答案。我想在保存中提供更多的指示tensorflow.keras模型时这有点复杂,因为有很多方法可以保存模型。

在这里,我提供了一个将tensorflow.keras模型保存到model_path当前目录下的文件夹的示例。这与最新的tensorflow(TF2.0)一起很好地工作。如果近期有任何更改,我将更新此描述。

保存和加载整个模型

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

#import data
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# create a model
def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
# compile the model
  model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
  return model

# Create a basic model instance
model=create_model()

model.fit(x_train, y_train, epochs=1)
loss, acc = model.evaluate(x_test, y_test,verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

# Save entire model to a HDF5 file
model.save('./model_path/my_model.h5')

# Recreate the exact same model, including weights and optimizer.
new_model = keras.models.load_model('./model_path/my_model.h5')
loss, acc = new_model.evaluate(x_test, y_test)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

仅保存和加载模型权重

如果只对保存模型权重感兴趣,然后对加载权重以恢复模型感兴趣,那么

model.fit(x_train, y_train, epochs=5)
loss, acc = model.evaluate(x_test, y_test,verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

# Save the weights
model.save_weights('./checkpoints/my_checkpoint')

# Restore the weights
model = create_model()
model.load_weights('./checkpoints/my_checkpoint')

loss,acc = model.evaluate(x_test, y_test)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

使用keras检查点回调进行保存和还原

# include the epoch in the file name. (uses `str.format`)
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path, verbose=1, save_weights_only=True,
    # Save weights, every 5-epochs.
    period=5)

model = create_model()
model.save_weights(checkpoint_path.format(epoch=0))
model.fit(train_images, train_labels,
          epochs = 50, callbacks = [cp_callback],
          validation_data = (test_images,test_labels),
          verbose=0)

latest = tf.train.latest_checkpoint(checkpoint_dir)

new_model = create_model()
new_model.load_weights(latest)
loss, acc = new_model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

使用自定义指标保存模型

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Custom Loss1 (for example) 
@tf.function() 
def customLoss1(yTrue,yPred):
  return tf.reduce_mean(yTrue-yPred) 

# Custom Loss2 (for example) 
@tf.function() 
def customLoss2(yTrue, yPred):
  return tf.reduce_mean(tf.square(tf.subtract(yTrue,yPred))) 

def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),  
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
  model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy', customLoss1, customLoss2])
  return model

# Create a basic model instance
model=create_model()

# Fit and evaluate model 
model.fit(x_train, y_train, epochs=1)
loss, acc,loss1, loss2 = model.evaluate(x_test, y_test,verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

model.save("./model.h5")

new_model=tf.keras.models.load_model("./model.h5",custom_objects={'customLoss1':customLoss1,'customLoss2':customLoss2})

使用自定义操作保存Keras模型

如以下情况(tf.tile)所示,当我们具有自定义操作时,我们需要创建一个函数并包装一个Lambda层。否则,无法保存模型。

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras import Model

def my_fun(a):
  out = tf.tile(a, (1, tf.shape(a)[0]))
  return out

a = Input(shape=(10,))
#out = tf.tile(a, (1, tf.shape(a)[0]))
out = Lambda(lambda x : my_fun(x))(a)
model = Model(a, out)

x = np.zeros((50,10), dtype=np.float32)
print(model(x).numpy())

model.save('my_model.h5')

#load the model
new_model=tf.keras.models.load_model("my_model.h5")

我想我已经介绍了许多保存tf.keras模型的方法。但是,还有许多其他方法。如果您发现上面没有涉及用例,请在下面发表评论。谢谢!

tf.keras Model saving with TF2.0

I see great answers for saving models using TF1.x. I want to provide couple of more pointers in saving tensorflow.keras models which is a little complicated as there are many ways to save a model.

Here I am providing an example of saving a tensorflow.keras model to model_path folder under current directory. This works well with most recent tensorflow (TF2.0). I will update this description if there is any change in near future.

Saving and loading entire model

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

#import data
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# create a model
def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
# compile the model
  model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
  return model

# Create a basic model instance
model=create_model()

model.fit(x_train, y_train, epochs=1)
loss, acc = model.evaluate(x_test, y_test,verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

# Save entire model to a HDF5 file
model.save('./model_path/my_model.h5')

# Recreate the exact same model, including weights and optimizer.
new_model = keras.models.load_model('./model_path/my_model.h5')
loss, acc = new_model.evaluate(x_test, y_test)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Saving and loading model Weights only

If you are interested in saving model weights only and then load weights to restore the model, then

model.fit(x_train, y_train, epochs=5)
loss, acc = model.evaluate(x_test, y_test,verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

# Save the weights
model.save_weights('./checkpoints/my_checkpoint')

# Restore the weights
model = create_model()
model.load_weights('./checkpoints/my_checkpoint')

loss,acc = model.evaluate(x_test, y_test)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

Saving and restoring using keras checkpoint callback

# include the epoch in the file name. (uses `str.format`)
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path, verbose=1, save_weights_only=True,
    # Save weights, every 5-epochs.
    period=5)

model = create_model()
model.save_weights(checkpoint_path.format(epoch=0))
model.fit(train_images, train_labels,
          epochs = 50, callbacks = [cp_callback],
          validation_data = (test_images,test_labels),
          verbose=0)

latest = tf.train.latest_checkpoint(checkpoint_dir)

new_model = create_model()
new_model.load_weights(latest)
loss, acc = new_model.evaluate(test_images, test_labels)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))

saving model with custom metrics

import tensorflow as tf
from tensorflow import keras
mnist = tf.keras.datasets.mnist

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Custom Loss1 (for example) 
@tf.function() 
def customLoss1(yTrue,yPred):
  return tf.reduce_mean(yTrue-yPred) 

# Custom Loss2 (for example) 
@tf.function() 
def customLoss2(yTrue, yPred):
  return tf.reduce_mean(tf.square(tf.subtract(yTrue,yPred))) 

def create_model():
  model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),  
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
    ])
  model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy', customLoss1, customLoss2])
  return model

# Create a basic model instance
model=create_model()

# Fit and evaluate model 
model.fit(x_train, y_train, epochs=1)
loss, acc,loss1, loss2 = model.evaluate(x_test, y_test,verbose=1)
print("Original model, accuracy: {:5.2f}%".format(100*acc))

model.save("./model.h5")

new_model=tf.keras.models.load_model("./model.h5",custom_objects={'customLoss1':customLoss1,'customLoss2':customLoss2})

Saving keras model with custom ops

When we have custom ops as in the following case (tf.tile), we need to create a function and wrap with a Lambda layer. Otherwise, model cannot be saved.

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras import Model

def my_fun(a):
  out = tf.tile(a, (1, tf.shape(a)[0]))
  return out

a = Input(shape=(10,))
#out = tf.tile(a, (1, tf.shape(a)[0]))
out = Lambda(lambda x : my_fun(x))(a)
model = Model(a, out)

x = np.zeros((50,10), dtype=np.float32)
print(model(x).numpy())

model.save('my_model.h5')

#load the model
new_model=tf.keras.models.load_model("my_model.h5")

I think I have covered a few of the many ways of saving tf.keras model. However, there are many other ways. Please comment below if you see your use case is not covered above. Thanks!


回答 18

使用tf.train.Saver保存模型,重命名,如果要减小模型大小,则需要指定var_list。val_list可以是tf.trainable_variables或tf.global_variables。

Use tf.train.Saver to save a model, remerber, you need to specify the var_list, if you want to reduce the model size. The val_list can be tf.trainable_variables or tf.global_variables.


回答 19

您可以使用以下方法将变量保存到网络

saver = tf.train.Saver() 
saver.save(sess, 'path of save/fileName.ckpt')

还原网络以供以后重复使用或在另一个脚本中使用,请使用:

saver = tf.train.Saver()
saver.restore(sess, tf.train.latest_checkpoint('path of save/')
sess.run(....) 

重要事项:

  1. sess 首次运行和后续运行之间必须相同(一致的结构)。
  2. saver.restore 需要已保存文件的文件夹路径,而不是单个文件路径。

You can save the variables in the network using

saver = tf.train.Saver() 
saver.save(sess, 'path of save/fileName.ckpt')

To restore the network for reuse later or in another script, use:

saver = tf.train.Saver()
saver.restore(sess, tf.train.latest_checkpoint('path of save/')
sess.run(....) 

Important points:

  1. sess must be same between first and later runs (coherent structure).
  2. saver.restore needs the path of the folder of the saved files, not an individual file path.

回答 20

无论您要将模型保存到哪里,

self.saver = tf.train.Saver()
with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            ...
            self.saver.save(sess, filename)

确保您所有的人tf.Variable都有名字,因为您以后可能要使用他们的名字来还原它们。在您想要预测的地方

saver = tf.train.import_meta_graph(filename)
name = 'name given when you saved the file' 
with tf.Session() as sess:
      saver.restore(sess, name)
      print(sess.run('W1:0')) #example to retrieve by variable name

确保保护程序在相应的会话中运行。请记住,如果使用tf.train.latest_checkpoint('./'),则将仅使用最新的检查点。

Wherever you want to save the model,

self.saver = tf.train.Saver()
with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            ...
            self.saver.save(sess, filename)

Make sure, all your tf.Variable have names, because you may want to restore them later using their names. And where you want to predict,

saver = tf.train.import_meta_graph(filename)
name = 'name given when you saved the file' 
with tf.Session() as sess:
      saver.restore(sess, name)
      print(sess.run('W1:0')) #example to retrieve by variable name

Make sure that saver runs inside the corresponding session. Remember that, if you use the tf.train.latest_checkpoint('./'), then only the latest check point will be used.


回答 21

我正在使用版本:

tensorflow (1.13.1)
tensorflow-gpu (1.13.1)

简单的方法是

救:

model.save("model.h5")

恢复:

model = tf.keras.models.load_model("model.h5")

I’m on Version:

tensorflow (1.13.1)
tensorflow-gpu (1.13.1)

Simple way is

Save:

model.save("model.h5")

Restore:

model = tf.keras.models.load_model("model.h5")

回答 22

对于tensorflow-2.0

这很简单。

import tensorflow as tf

model.save("model_name")

恢复

model = tf.keras.models.load_model('model_name')

For tensorflow-2.0

it’s very simple.

import tensorflow as tf

SAVE

model.save("model_name")

RESTORE

model = tf.keras.models.load_model('model_name')

回答 23

遵循@Vishnuvardhan Janapati的回答,这是在TensorFlow 2.0.0下使用自定义图层/度量/损耗来保存和重新加载模型的另一种方法

import tensorflow as tf
from tensorflow.keras.layers import Layer
from tensorflow.keras.utils.generic_utils import get_custom_objects

# custom loss (for example)  
def custom_loss(y_true,y_pred):
  return tf.reduce_mean(y_true - y_pred)
get_custom_objects().update({'custom_loss': custom_loss}) 

# custom loss (for example) 
class CustomLayer(Layer):
  def __init__(self, ...):
      ...
  # define custom layer and all necessary custom operations inside custom layer

get_custom_objects().update({'CustomLayer': CustomLayer})  

这样,一旦执行了此类代码,并使用tf.keras.models.save_modelmodel.saveModelCheckpoint回调保存了模型,就可以重新加载模型,而无需精确的自定义对象,就像

new_model = tf.keras.models.load_model("./model.h5"})

Following @Vishnuvardhan Janapati ‘s answer, here is another way to save and reload model with custom layer/metric/loss under TensorFlow 2.0.0

import tensorflow as tf
from tensorflow.keras.layers import Layer
from tensorflow.keras.utils.generic_utils import get_custom_objects

# custom loss (for example)  
def custom_loss(y_true,y_pred):
  return tf.reduce_mean(y_true - y_pred)
get_custom_objects().update({'custom_loss': custom_loss}) 

# custom loss (for example) 
class CustomLayer(Layer):
  def __init__(self, ...):
      ...
  # define custom layer and all necessary custom operations inside custom layer

get_custom_objects().update({'CustomLayer': CustomLayer})  

In this way, once you have executed such codes, and saved your model with tf.keras.models.save_model or model.save or ModelCheckpoint callback, you can re-load your model without the need of precise custom objects, as simple as

new_model = tf.keras.models.load_model("./model.h5"})

回答 24

在新版本的tensorflow 2.0中,保存/加载模型的过程要容易得多。由于实施了Keras API,因此是TensorFlow的高级API。

保存模型:检查文档以供参考:https : //www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/models/save_model

tf.keras.models.save_model(model_name, filepath, save_format)

加载模型:

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/models/load_model

model = tf.keras.models.load_model(filepath)

In the new version of tensorflow 2.0, the process of saving/loading a model is a lot easier. Because of the Implementation of the Keras API, a high-level API for TensorFlow.

To save a model: Check the documentation for reference: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/models/save_model

tf.keras.models.save_model(model_name, filepath, save_format)

To load a model:

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/models/load_model

model = tf.keras.models.load_model(filepath)

Pyro-基于Python和PyTorch的深度通用概率编程

Pyro是一个构建在PyTorch之上的灵活的、可伸缩的深度概率编程库。值得注意的是,它的设计考虑到了以下原则:
  • 通用的:pyro是一种通用的PPL-它可以表示任何可计算的概率分布
  • 可扩展:pyro可扩展到大型数据集,与手写代码相比开销很小
  • 极小值:pyro是灵活的,可维护的。它是由强大的、可组合的抽象组成的一个小核心来实现的
  • 灵活性:PYRO的目标是在需要的时候实现自动化,在需要的时候进行控制。这是通过高级抽象来表示生成和推理模型来实现的,同时允许专家方便地访问自定义推理

Pyro最初是由Uber AI开发的,现在由社区贡献者积极维护,包括在Broad Institute2019年,PyrobecameLinux基金会的一个项目,这是一个在开源软件、开放标准、开放数据和开放硬件方面进行协作的中立空间

有关Pyro的高级动机的更多信息,请查看我们的launch blog post有关其他博客帖子,请查看experimental designtime-to-event modeling在Pyro中

正在安装

安装稳定的Pyro版本

使用pip安装:

Pyro支持Python 3.6+

pip install pyro-ppl

从源安装:

git clone git@github.com:pyro-ppl/pyro.git
cd pyro
git checkout master  # master is pinned to the latest release
pip install .

使用额外的软件包进行安装:

要安装运行中包含的概率模型所需的依赖项,请执行以下操作examples/tutorials目录,请使用以下命令:

pip install pyro-ppl[extras]

确保这些模型来自同一版本的Pyro source code因为您已经安装了

安装Pyro dev分支

对于最新的功能,您可以从源代码安装Pyro

使用pip安装Pyro:

pip install git+https://github.com/pyro-ppl/pyro.git

或者,使用extras依赖项来运行包含在examples/tutorials目录:

pip install git+https://github.com/pyro-ppl/pyro.git#egg=project[extras]

从源安装Pyro:

git clone https://github.com/pyro-ppl/pyro
cd pyro
pip install .  # pip install .[extras] for running models in examples/tutorials

从Docker容器运行Pyro

请参阅说明here

引文

如果您使用Pyro,请考虑引用:

@article{bingham2019pyro,
  author    = {Eli Bingham and
               Jonathan P. Chen and
               Martin Jankowiak and
               Fritz Obermeyer and
               Neeraj Pradhan and
               Theofanis Karaletsos and
               Rohit Singh and
               Paul A. Szerlip and
               Paul Horsfall and
               Noah D. Goodman},
  title     = {Pyro: Deep Universal Probabilistic Programming},
  journal   = {J. Mach. Learn. Res.},
  volume    = {20},
  pages     = {28:1--28:6},
  year      = {2019},
  url       = {http://jmlr.org/papers/v20/18-403.html}
}

Interview-面试=简历指南+LeetCode+Kaggle


网站地址

组织构建

第三方站长

地址A:XXX(欢迎留言,我们完善补充)

组织介绍

程序员的双手是魔术师的双手,他们把枯燥无味的代码变成了丰富多彩的软件.–“疯狂的程序员”

参与须知

面试求职

目前关于面试注意一下几个点:

  • 简历:没有最好的,只有最适合的
  • 刷题:这个就是熟能生巧的活,花时间就行
  • 岗位:你是否知道这个岗位需要什么样的人?
  • 公司:如何去选公司呢?细细来品
  • 职场:如何职业规划呢?有趣的话题哈!
  • 理财:如何增加自己的被动收入
  • 年龄:年龄对职场的影响有多
  • 学历:学习对求职影响有多大

头条-面试求职系列

B站-面试求职系列

简历直播

简历直播系列

面试经验

非组织链接推荐:

算法刷题

  1. Leetcode
  2. 剑指 Offer
  3. 数据结构

非组织链接推荐:

卡格尔

卡格尔直播系列

比赛直播系列

卡格尔系列

卡格尔入门须知

卡格尔官方系列

资料来源:

项目负责人

第一期(2018年-01-01)

第二期(2018年-06-01)

第三期(2019-07-17)

–负责人要求:(欢迎一起为求职面试中文版本做贡献)

  • 热爱开源,喜欢装逼
  • 擅长:刷题、面试、或职场套路
  • 能够有时间及时优化页面错误和用户问题
  • 试用期:2个个月
  • 欢迎联系:片刻529815144

免责声明-[只供学习参考]

  • ApacheCN纯粹出于学习目的与个人兴趣整理
  • ApacheCN保留对此版本译文的署名权及其它相关权利

协议

  • 以各项目协议为准.
  • ApacheCN账号下没有协议的项目,一律视为CC BY-NC-SA 4.0那就是。

赞助我们


Computervision-recipes-计算机视觉的最佳实践、代码示例和文档

计算机视觉

近年来,我们看到了计算机视觉的非同寻常的增长,应用于人脸识别、图像理解、搜索、无人机、地图绘制、半自动和自动驾驶车辆。其中许多应用的关键部分是视觉识别任务,例如图像分类、目标检测和图像相似度

此存储库提供构建计算机视觉系统的示例和最佳实践指南。该存储库的目标是构建一套全面的工具和示例,以利用计算机视觉算法、神经体系结构和实现此类系统的最新进展。我们不是从头开始创建实现,而是利用现有的最先进的库,围绕加载图像数据、优化和评估模型以及向上扩展到云来构建额外的实用程序。此外,在此领域工作多年后,我们的目标是回答常见问题,指出经常观察到的陷阱,并展示如何使用云进行培训和部署

我们希望这些示例和实用程序可以通过按数量级简化从定义业务问题到开发解决方案的过程来显著缩短“上市时间”。此外,示例笔记本将作为指南,并以多种语言展示工具的最佳实践和用法

这些示例提供为Jupyter notebooks也很常见utility functions所有示例都使用PyTorch作为底层深度学习库

目标受众

我们这个存储库的目标受众包括具有不同计算机视觉知识水平的数据科学家和机器学习工程师,因为我们的内容是纯来源的,目标是自定义的机器学习建模。所提供的实用程序和示例旨在作为解决实际视觉问题的加速器

快速入门

要开始,请导航到Setup Guide,其中列出了有关如何设置计算环境和运行此Repo中的笔记本所需的依赖项的说明。设置环境后,请导航到Scenarios文件夹,开始浏览笔记本。我们建议从图像分类笔记本,因为这引入了其他场景也使用的概念(例如关于ImageNet的预培训)

或者,我们支持活页夹只需点击此链接,即可在网络浏览器中轻松试用我们的笔记本电脑。然而,Binder是免费的,因此只提供有限的CPU计算能力,并且没有GPU支持。预计笔记本的运行速度会非常慢(通过将图像分辨率降低到例如60像素,这在一定程度上有所改善,但代价是精确度较低)

场景

以下是此存储库中涵盖的常用计算机视觉场景的摘要。对于每个主要场景(“基础”),我们都会提供工具来有效地构建您自己的模型。这包括在您自己的数据上微调您自己的模型等简单任务,以及硬性否定挖掘甚至模型部署等更复杂的任务

场景 支持 描述
Classification 基地 图像分类是一种有监督的机器学习技术,用于学习和预测给定图像的类别
Similarity 基地 图像相似度是一种计算给定一对图像的相似度分数的方法。在给定图像的情况下,它允许您识别给定数据集中最相似的图像
Detection 基地 对象检测是一种允许您检测图像中对象的边界框的技术
Keypoints 基地 关键点检测可用于检测对象上的特定点。提供了一种预先训练的模型来检测人体关节,以进行人体姿态估计。
Segmentation 基地 图像分割为图像中的每个像素分配类别
Action recognition 基地 动作识别,用于在视频/网络摄像机镜头中识别执行的动作(例如,“运行”、“打开瓶子”)以及各自的开始/结束时间。我们还实现了可以在(Contrib)[contrib]下找到的动作识别的i3D实现
Tracking 基地 跟踪允许随时间检测和跟踪视频序列中的多个对象
Crowd counting Contrrib 统计低人群密度(如10人以下)和高人群密度(如数千人)场景下的人数

我们将支持的CV方案分为两个位置:(I)基地:“utils_cv”和“Scenario”文件夹中的代码和笔记本遵循严格的编码准则,经过良好的测试和维护;(Ii)Contrrib:“contrib”文件夹中的代码和其他资源,主要介绍使用尖端技术的不太常见的CV场景。“contrib”中的代码没有定期测试或维护

计算机视觉在蔚蓝上的应用

请注意,对于某些计算机视觉问题,您可能不需要构建自己的模型。取而代之的是,Azure上存在预先构建的或可轻松定制的解决方案,不需要任何自定义编码或机器学习专业知识。我们强烈建议您评估这些方法是否足以解决您的问题。如果这些解决方案不适用,或者这些解决方案的准确性不够,则可能需要求助于更复杂、更耗时的自定义方法

以下Microsoft服务提供了解决常见计算机视觉任务的简单解决方案:

  • Vision Services是一组经过预先训练的睡觉API,可以调用它们来进行图像标记、人脸识别、光学字符识别、视频分析等。这些API开箱即用,只需要极少的机器学习专业知识,但定制功能有限。查看各种可用的演示以体验该功能(例如Computer Vision)。该服务可通过API调用或通过SDK(以.NET、Python、Java、Node和Go语言提供)使用
  • Custom Vision是一项SaaS服务,用于在给定用户提供的培训集的情况下将模型训练和部署为睡觉应用编程接口。所有步骤,包括图像上传、注释和模型部署,都可以使用直观的UI或通过SDK(.Net、Python、Java、Node和Go语言)执行。训练图像分类或目标检测模型可以用最少的机器学习专业知识来实现。与使用预先培训的认知服务API相比,Custom Vision提供了更大的灵活性,但需要用户自带数据并对其进行注释

如果您需要培训您自己的模型,以下服务和链接提供了可能有用的附加信息

  • Azure Machine Learning service (AzureML)是一项帮助用户加速训练和部署机器学习模型的服务。虽然AzureML Python SDK不特定于计算机视觉工作负载,但它可以用于可伸缩且可靠的培训,并将机器学习解决方案部署到云中。我们在此存储库中的几个笔记本中利用Azure机器学习(例如deployment to Azure Kubernetes Service)
  • Azure AI Reference architectures提供一组示例(由代码支持),说明如何构建利用多个云组件的常见面向AI的工作负载。虽然不是特定于计算机视觉的,但这些参考体系结构涵盖了几个机器学习工作负载,例如模型部署或批处理评分

生成状态

AzureML测试

构建类型 分支机构 状态 分支机构 状态
Linux GPU 师傅 试运行
Linux CPU 师傅 试运行
笔记本电脑单元GPU 师傅 试运行

贡献

这个项目欢迎大家提供意见和建议。请参阅我们的contribution guidelines

Jina-面向任何类别数据的云原生神经搜索框架

云-本地神经搜索[?]适用于以下方面的框架任何数据类型

Jina 允许您在短短几分钟内构建以深度学习为动力的搜索即服务

🌌所有数据类型-大规模索引和查询任何类型的非结构化数据:视频、图像、长/短文本、音乐、源代码、PDF等

🌩️FAST和本机云-从第一天开始的分布式架构,可扩展且设计为本地云:享受集装箱化、流式处理、并行、分片、异步调度、HTTP/GRPC/WebSocket协议

⏱️节省时间这个神经搜索系统的设计模式,从零到生产准备就绪的系统只需几分钟

🍱拥有您的堆栈-保持解决方案的端到端堆栈所有权,避免使用零散的、多供应商的通用旧式工具带来的集成陷阱

运行快速演示

安装

  • 通过PyPI:pip install -U "jina[standard]"
  • 通过Docker:docker run jinaai/jina:latest
更多安装选项
x86/64、arm64、v6、v7 Linux/MacOS和Python 3.7/3.8/3.9 Docker用户
最低要求
(不支持HTTP、WebSocket、Docker)
pip install jina docker run jinaai/jina:latest
Daemon pip install "jina[daemon]" docker run --network=host jinaai/jina:latest-daemon
使用附加服务 pip install "jina[devel]" docker run jinaai/jina:latest-devel

版本标识符are explained here吉娜可以继续奔跑Windows Subsystem for Linux我们欢迎社会各界帮助我们native Windows support

开始使用

文档、执行者和流是JINA中的三个基本概念

1个️⃣复制-粘贴下面的最小示例并运行它:

💡预赛:character embeddingpoolingEuclidean distance

import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests

class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # letter `a`
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim) * 1  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling

class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        q = np.stack(docs.get_attributes('embedding'))  # get all embeddings from query docs
        d = np.stack(self._docs.get_attributes('embedding'))  # get all embeddings from stored docs
        euclidean_dist = np.linalg.norm(q[:, None, :] - d[None, :, :], axis=-1)  # pairwise euclidean distance
        for dist, query in zip(euclidean_dist, docs):  # add & sort match
            query.matches = [Document(self._docs[int(idx)], copy=True, scores={'euclid': d}) for idx, d in enumerate(dist)]
            query.matches.sort(key=lambda m: m.scores['euclid'].value)  # sort matches by their values

f = Flow(port_expose=12345, protocol='http', cors=True).add(uses=CharEmbed, parallel=2).add(uses=Indexer)  # build a Flow, with 2 parallel CharEmbed, tho unnecessary
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all lines of _this_ file
    f.block()  # block for listening request

2个️⃣打开http://localhost:12345/docs(扩展的Swagger UI)在浏览器中,单击/搜索制表符和输入:

{"data": [{"text": "@requests(on=something)"}]}

也就是说,我们希望从上面的代码片段中找到与以下内容最相似的行@request(on=something)现在单击执行巴顿!

3个️⃣不是图形用户界面的人?那就让我们用Python来做吧!保持上述服务器运行,并启动一个简单的客户端:

from jina import Client, Document
from jina.types.request import Response


def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclid"].value:2f}: "{d.text}"')


c = Client(protocol='http', port_expose=12345)  # connect to localhost:12345
c.post('/search', Document(text='request(on=something)'), on_done=print_matches)

,它打印以下结果:

         Client@1608[S]:connected to the gateway at localhost:12345!
[0]0.168526: "@requests(on='/index')"
[1]0.181676: "@requests(on='/search')"
[2]0.192049: "query.matches = [Document(self._docs[int(idx)], copy=True, score=d) for idx, d in enumerate(dist)]"

😔不管用吗?我们的错!Please report it here.

阅读教程

支持

加入我们吧

吉娜的后盾是Jina AIWe are actively hiring全栈开发人员、解决方案工程师在开源领域构建下一个神经搜索生态系统

贡献

我们欢迎来自开源社区、个人和合作伙伴的各种贡献。我们的成功归功于你的积极参与



















Mlcourse.ai-开放机器学习课程

mlcourse.ai是一门开放的机器学习课程,由OpenDataScience (ods.ai),由Yury Kashnitsky (yorko)尤里拥有应用数学博士学位和卡格尔竞赛大师学位,他的目标是设计一门理论与实践完美平衡的ML课程。因此,你可以在课堂上复习数学公式,并与Kaggle Inclass竞赛一起练习。目前,该课程正处于自定步模式检查一下详细的Roadmap引导您完成自定进度的课程。ai

奖金:此外,您还可以购买带有最佳非演示版本的奖励作业包mlcourse.ai任务。选择“Bonus Assignments” tier请参阅主页上的交易详情mlcourse.ai

镜子(🇬🇧-仅限):mlcourse.ai(主站点)、Kaggle Dataset(与Kaggle笔记本相同的笔记本)

自定

这个Roadmap将指导您度过11周的mlCourse.ai课程。每周,从熊猫到梯度助推,都会给出阅读什么文章、看什么讲座、完成什么作业的指示。

内容

这是medium.com上发表的文章列表🇬🇧,habr.com🇷🇺还提到了中文笔记本。🇨🇳并给出了指向Kaggle笔记本(英文)的链接。图标是可点击的

  1. 用PANDA软件进行探索性数据分析🇬🇧🇷🇺🇨🇳Kaggle Notebook
  2. 用Python进行可视化数据分析🇬🇧🇷🇺🇨🇳,Kaggle笔记本电脑:part1part2
  3. 分类、决策树和k近邻🇬🇧🇷🇺🇨🇳Kaggle Notebook
  4. 线性分类与回归🇬🇧🇷🇺🇨🇳,Kaggle笔记本电脑:part1part2part3part4part5
  5. 套袋与随机林🇬🇧🇷🇺🇨🇳,Kaggle笔记本电脑:part1part2part3
  6. 特征工程与特征选择🇬🇧🇷🇺🇨🇳Kaggle Notebook
  7. 无监督学习:主成分分析与聚类🇬🇧🇷🇺🇨🇳Kaggle Notebook
  8. Vowpal Wabbit:用千兆字节的数据学习🇬🇧🇷🇺🇨🇳Kaggle Notebook
  9. 用Python进行时间序列分析,第一部分🇬🇧🇷🇺🇨🇳使用Facebook Prophet预测未来,第2部分🇬🇧🇨🇳卡格尔笔记本:part1part2
  10. 梯度增压🇬🇧🇷🇺🇨🇳Kaggle Notebook

讲座

视频上传到thisYouTube播放列表。引言,videoslides

  1. 用熊猫进行探索性数据分析,video
  2. 可视化,EDA的主要情节,video
  3. 诊断树:theorypractical part
  4. Logistic回归:theoretical foundationspractical part(《爱丽丝》比赛中的基线)
  5. 合奏和随机森林-part 1分类指标-part 2预测客户付款的业务任务示例-part 3
  6. 线性回归和正则化-theory,Lasso&Ridge,LTV预测-practice
  7. 无监督学习-Principal Component AnalysisClustering
  8. 用于分类和回归的随机梯度下降-part 1,第2部分TBA
  9. 用Python(ARIMA,PERPHET)进行时间序列分析-video
  10. 梯度增压:基本思路-part 1、XgBoost、LightGBM和CatBoost+Practice背后的关键理念-part 2

作业

以下是演示作业。此外,在“Bonus Assignments” tier您可以访问非演示作业

  1. 用熊猫进行探索性数据分析,nbviewerKaggle Notebooksolution
  2. 分析心血管疾病数据,nbviewerKaggle Notebooksolution
  3. 带有玩具任务和UCI成人数据集的决策树,nbviewerKaggle Notebooksolution
  4. 讽刺检测,Kaggle Notebooksolution线性回归作为一个最优化问题,nbviewerKaggle Notebook
  5. Logistic回归和随机森林在信用评分问题中的应用nbviewerKaggle Notebooksolution
  6. 在回归任务中探索OLS、LASSO和随机森林nbviewerKaggle Notebooksolution
  7. 无监督学习,nbviewerKaggle Notebooksolution
  8. 实现在线回归,nbviewerKaggle Notebooksolution
  9. 时间序列分析,nbviewerKaggle Notebooksolution
  10. 在比赛中超越底线,Kaggle Notebook

卡格尔竞赛

  1. 如果可以,请抓住我:通过网页会话跟踪检测入侵者。Kaggle Inclass
  2. Dota 2获胜者预测。Kaggle Inclass

引用mlCourse.ai

如果你碰巧引用了mlcourse.ai在您的工作中,您可以使用此BibTeX记录:

@misc{mlcourse_ai,
    author = {Kashnitsky, Yury},
    title = {mlcourse.ai – Open Machine Learning Course},
    year = {2020},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/Yorko/mlcourse.ai}},
}

社区

讨论在#mlCourse_ai世界上最重要的一条航道OpenDataScience (ods.ai)松懈团队

课程是免费的,但你可以通过承诺以下内容来支持组织者Patreon(每月支持)或一次性付款Ko-fi


Pandas-profiling 从Pandas DataFrame对象创建HTML分析报告

Documentation|Slack|Stack Overflow

从熊猫生成配置文件报告DataFrame

熊猫们df.describe()函数很棒,但对于严肃的探索性数据分析来说有点基础pandas_profiling将熊猫DataFrame扩展为df.profile_report()用于快速数据分析

对于每个列,以下统计信息(如果与列类型相关)显示在交互式HTML报告中:

  • 类型推理:检测types数据帧中的列数
  • 要领:类型、唯一值、缺少值
  • 分位数统计如最小值、Q1、中位数、Q3、最大值、范围、四分位数间范围
  • 描述性统计如均值、模态、标准差、和、中位数绝对偏差、变异系数、峰度、偏度
  • 最频繁值
  • 直方图
  • 相关性突出高度相关的变量、Spearman、Pearson和Kendall矩阵
  • 缺少值缺失值的矩阵、计数、热图和树状图
  • 文本分析了解文本数据的类别(大写、空格)、脚本(拉丁文、西里尔文)和块(ASCII
  • 文件和图像分析提取文件大小、创建日期和维度,并扫描截断的图像或包含EXIF信息的图像

公告

发布版本v3.0.0其中对报告配置进行了全面检查,提供了更直观的API并修复了以前全局配置固有的问题

这是第一个坚持SemverConventional Commits规格说明

电光后端正在进行中:我们可以很高兴地宣布,用于生成个人资料报告的电光后端已经接近v1。招聘测试者!电光后端将作为此软件包的预发行版发布

支持pandas-profiling

关于……的发展pandas-profiling完全依赖于捐款。如果您在该包中发现了价值,我们欢迎您通过以下方式直接支持该项目GitHub Sponsors好了!请帮助我继续支持这个方案。特别令人兴奋的是GitHub与您的贡献相匹配第一年

请在此处查找更多信息:

2021年5月9日💘


内容:Examples|Installation|Documentation|Large datasets|Command line usage|Advanced usage|integrations|Support|Types|How to contribute|Editor Integration|Dependencies


示例

下面的示例可以让您对软件包的功能有一个印象:

具体功能:

教程:

安装

使用管道



通过运行以下命令,可以使用pip包管理器进行安装

pip install pandas-profiling[notebook]

或者,您也可以直接从Github安装最新版本:

pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip

使用CONDA


通过运行以下命令,可以使用Conda包管理器进行安装

conda install -c conda-forge pandas-profiling

从源开始

通过克隆存储库或按键下载源代码‘Download ZIP’在这一页上

通过导航到正确的目录并运行以下命令来安装:

python setup.py install

文档

的文档pandas_profiling可以找到here以前的文档仍然可用here

快速入门

首先加载您的熊猫DataFrame,例如使用:

import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame(np.random.rand(100, 5), columns=["a", "b", "c", "d", "e"])

要生成报告,请运行以下命令:

profile = ProfileReport(df, title="Pandas Profiling Report")

更深入地探索

您可以按您喜欢的任何方式配置配置文件报告。下面的示例代码将explorative configuration file,它包括文本(长度分布、Unicode信息)、文件(文件大小、创建时间)和图像(尺寸、EXIF信息)的许多功能。如果您对使用的确切设置感兴趣,可以与default configuration file

profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)

了解有关配置的详细信息pandas-profilingAdvanced usage页面

木星笔记本

我们建议使用Jupyter笔记本以交互方式生成报告。有两个界面(参见下面的动画):通过小部件和通过HTML报告

这是通过简单地显示报告来实现的。在Jupyter笔记本中,运行:

profile.to_widgets()

HTML报告可以包含在Jupyter笔记本中:

运行以下代码:

profile.to_notebook_iframe()

保存报告

如果要生成HTML报告文件,请保存ProfileReport添加到对象,并使用to_file()功能:

profile.to_file("your_report.html")

或者,您也可以以JSON的形式获取数据:

# As a string
json_data = profile.to_json()

# As a file
profile.to_file("your_report.json")

大型数据集

版本2.4引入了最小模式

这是禁用代价高昂的计算(如关联和重复行检测)的默认配置

使用以下语法:

profile = ProfileReport(large_dataset, minimal=True)
profile.to_file("output.html")

有基准可用here

命令行用法

对于熊猫可以立即读取的标准格式的CSV文件,您可以使用pandas_profiling可执行文件

有关选项和参数的信息,请运行以下命令

pandas_profiling -h

高级用法

可以使用一组选项来调整生成的报告

  • title(str):报告标题(默认为‘Pandas Profiling Report’)
  • pool_size(int):线程池中的工作进程数。设置为零时,它将设置为可用CPU数(默认情况下为0)
  • progress_bar(bool):如果为True,pandas-profiling将显示进度条
  • infer_dtypes(bool):何时True(默认)dtype的变量是使用visions使用排版逻辑(例如,将整数存储为字符串的列将被视为数字进行分析)

有关更多设置,请参阅default configuration fileminimal configuration file

您可以在高级用法页面上找到配置文档here

示例

profile = df.profile_report(
    title="Pandas Profiling Report", plot={"histogram": {"bins": 8}}
)
profile.to_file("output.html")

集成

寄予厚望

分析数据与数据验证密切相关:通常,验证规则是根据众所周知的统计数据定义的。为此,pandas-profilingGreat Expectations这是一个世界级的开源库,可以帮助您维护数据质量并改善团队之间关于数据的沟通。远大期望允许您创建期望(基本上是数据的单元测试)和数据文档(便于共享的HTML数据报告)pandas-profiling提供了一种基于ProfileReport的结果创建一套预期的方法,您可以存储这些预期,并使用它来验证另一个(或将来的)数据集

您可以找到有关《远大前程》集成的更多详细信息here

支持开源

如果没有我们慷慨的赞助商的支持,维护和开发熊猫侧写的开源代码是不可能的,它有数百万的下载量和数千的用户

Lambda workstations、服务器、笔记本电脑和云服务为财富500强公司和94%的前50所大学的工程师和研究人员提供动力。Lambda Cloud提供4个和8个GPU实例,起步价为1.5美元/小时。预装TensorFlow、PyTorch、Ubuntu、CUDA和cuDNN

我们要感谢我们慷慨的Github赞助商和支持者,是他们让熊猫侧写成为可能:

Martin Sotir, Brian Lee, Stephanie Rivera, abdulAziz, gramster

如果您想出现在此处,请查看更多信息:Github Sponsor page

类型

类型是有效数据分析的强大抽象,它超越了逻辑数据类型(整型、浮点型等)。pandas-profiling目前,可识别以下类型:布尔值、数值、日期、分类、URL、路径、文件图像

我们为Python开发了一个类型系统,为数据分析量身定做:visions选择合适的排版既可以提高整体表现力,又可以降低分析/代码的复杂性。要了解更多信息,请执行以下操作pandas-profiling的类型系统,请签出默认实现here同时,现在完全支持用户自定义摘要和类型定义-如果您有特定的用例,请提出想法或公关!

贡献

请阅读有关参与Contribution Guide

提出问题或开始贡献的一个低门槛的地方是通过接触熊猫-侧写松弛。Join the Slack community

编辑器集成

PyCharm集成

  1. 安装pandas-profiling通过上述说明
  2. 找到您的pandas-profiling可执行文件
    • 在MacOS/Linux/BSD上:
      $ which pandas_profiling
      (example) /usr/local/bin/pandas_profiling
    • 在Windows上:
      $ where pandas_profiling
      (example) C:\ProgramData\Anaconda3\Scripts\pandas_profiling.exe
  3. 在PyCharm中,转到设置(或首选项在MacOS上)>工具>外部工具
  4. 单击+图标以添加新的外部工具
  5. 插入以下值
    • 名称:熊猫侧写
    • 计划:在步骤2中获得的位置
    • 参数:"$FilePath$" "$FileDir$/$FileNameWithoutAllExtensions$_report.html"
    • 工作目录:$ProjectFileDir$

要使用PyCharm集成,请右键单击任意数据集文件:

外部工具>熊猫侧写

其他集成

其他编辑器集成可以通过拉请求进行贡献

依赖项

配置文件报告是用HTML和CSS编写的,这意味着pandas-profiling需要现代浏览器

你需要Python 3来运行此程序包。其他依赖关系可以在需求文件中找到:

文件名 要求
requirements.txt 套餐要求
requirements-dev.txt 发展的要求
requirements-test.txt 测试的规定
setup.py 对微件等的要求

LightGBM-基于决策树算法的快速、分布式、高性能梯度提升框架,用于排序、分类和许多其他机器学习任务

LightGBM是一个使用基于树的学习算法的梯度提升框架。它设计为分布式且高效,具有以下优势:

  • 更快的培训速度和更高的效率
  • 降低内存使用率
  • 更高的精确度
  • 支持并行、分布式和GPU学习
  • 能够处理大规模数据

有关更多详情,请参阅Features

得益于这些优势,LightGBM在许多领域得到了广泛的应用winning solutions机器学习竞赛的

Comparison experiments公开数据集上的数据显示,LightGBM在效率和准确性上都优于现有的Boosting框架,并且内存消耗明显较低。更重要的是,distributed learning experiments展示了LightGBM可以通过在特定设置中使用多台机器进行训练来实现线性加速

入门和文档

我们的主要文档在https://lightgbm.readthedocs.io/并且是从该存储库生成的。如果您是LightGBM的新手,请关注the installation instructions在那个网站上

接下来,您可能想要阅读:

投稿人文档:

新闻

请参阅更改日志,地址为GitHub releases页面

一些旧的更新日志位于Key Events页面

外部(非官方)存储库

FLAML(用于超参数优化的AutoML库):https://github.com/microsoft/FLAML

Optuna(超参数优化框架):https://github.com/optuna/optuna

朱莉娅-套餐:https://github.com/IQVIA-ML/LightGBM.jl

JPMML(Java PMML转换器):https://github.com/jpmml/jpmml-lightgbm

Treite(用于高效部署的模型编译器):https://github.com/dmlc/treelite

lLeaf(基于LLVM的模型编译器,用于高效推理):https://github.com/siboehm/lleaves

Hummingbird(将模型编译器转换为张量计算):https://github.com/microsoft/hummingbird

CuML林推理库(GPU加速推理):https://github.com/rapidsai/cuml

daal4py(英特尔CPU加速推理):https://github.com/IntelPython/daal4py

m2cgen(适用于各种语言的模型应用程序):https://github.com/BayesWitnesses/m2cgen

树叶(GO模型施加器):https://github.com/dmitryikh/leaves

ONNXMLTools(ONNX转换器):https://github.com/onnx/onnxmltools

Shap(模型输出解释器):https://github.com/slundberg/shap

Shapash(模型可视化和解释):https://github.com/MAIF/shapash

dtreeviz(决策树可视化和模型解释):https://github.com/parrt/dtreeviz

MMLSpark(电光上的LightGBM):https://github.com/Azure/mmlspark

Kubeflow光顺(Kubernetes上的LightGBM):https://github.com/kubeflow/fairing

Kubeflow运算符(Kubernetes上的LightGBM):https://github.com/kubeflow/xgboost-operator

ML.NET(.NET/C#-Package):https://github.com/dotnet/machinelearning

LightGBM.NET(.NET/C#-Package):https://github.com/rca22/LightGBM.Net

红宝石:https://github.com/ankane/lightgbm

LightGBM4j(Java高级绑定):https://github.com/metarank/lightgbm4j

lightgbm-rs(铁锈装订):https://github.com/vaaaaanquish/lightgbm-rs

MLflow(实验跟踪、模型监控框架):https://github.com/mlflow/mlflow

{treesnip}(r{parsnip}-兼容接口):https://github.com/curso-r/treesnip

{mlr3learners.lightgbm}(r{mlr3}-兼容接口):https://github.com/mlr3learners/mlr3learners.lightgbm

支持

如何做出贡献

检查CONTRIBUTING页面

Microsoft开放源代码行为准则

本项目采用了Microsoft Open Source Code of Conduct有关更多信息,请参阅Code of Conduct FAQ或联系方式opencode@microsoft.com如有任何其他问题或评论

参考文献

柯国林,齐蒙,托马斯·芬利,王泰峰,魏晨,马卫东,叶启伟,刘铁岩。“LightGBM: A Highly Efficient Gradient Boosting Decision Tree“神经信息处理系统的进展”(NIPS 2017),第3149-3157页。

齐蒙,柯国林,王泰峰,魏晨,叶启伟,马志明,刘铁岩。“A Communication-Efficient Parallel Algorithm for Decision Tree“神经信息处理系统的进展”29(NIPS 2016),第1279-1287页

张欢,四思,谢楚瑞。“GPU Acceleration for Large-scale Tree Boosting“.SysML大会,2018年

注意事项:如果您在GitHub项目中使用LightGBM,请添加lightgbmrequirements.txt

许可证

这个项目是根据麻省理工学院的许可证条款授权的。看见LICENSE有关更多详细信息,请参阅