An Easy 3,000-Word Introduction to TensorFlow 2

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Get to know the latest version of TensorFlow through a hands-on walkthrough that solves a classification problem with deep learning, plots the results, and then improves them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"But wait... what is TensorFlow?"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TensorFlow is Google's deep learning framework, and its second major version was released in 2019. It is one of the best-known deep learning frameworks in the world and is widely used by industry practitioners and researchers."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c0/c04aa9f12a3463cac53c22a62dbfb051.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"TensorFlow 
v1 was hard to use and understand because it was not very Pythonic, but v2, released with Keras fully integrated as tensorflow.keras, is easy to use, easy to learn, and easy to understand."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Keep in mind that this is not an article about deep learning itself, so I assume you already know the terminology of deep learning and the basic ideas behind it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will explore the world of deep learning using a very famous dataset, the Iris dataset."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's jump straight into the code to see what is going on."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Importing and understanding the dataset"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/04/040199f3f1858c2d7d8ab804df7fa99d.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now, this "},{"type":"text","marks":[{"type":"italic"}],"text":"iris"},{"type":"text","text":" is a dictionary. We can inspect its keys with keys()."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/db/db8c231b8b156f981a3a1126c5b6ec8b.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"So our data is under the "},{"type":"text","marks":[{"type":"italic"}],"text":"data"},{"type":"text","text":" key, the 
"},{"type":"text","marks":[{"type":"italic"}],"text":"labels"},{"type":"text","text":" are under the "},{"type":"text","marks":[{"type":"italic"}],"text":"target"},{"type":"text","text":" key, and so on. To see a detailed description of this dataset, use "},{"type":"text","marks":[{"type":"italic"}],"text":"iris['DESCR']"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Next, we have to import the other important libraries that will help us create the neural network."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/54/54478a837794cd6eff60c90f97e620d0.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we imported 2 main things from "},{"type":"text","marks":[{"type":"italic"}],"text":"tensorflow"},{"type":"text","text":": "},{"type":"text","marks":[{"type":"italic"}],"text":"Dense"},{"type":"text","text":" and "},{"type":"text","marks":[{"type":"italic"}],"text":"Sequential"},{"type":"text","text":". Dense, which we import from "},{"type":"text","marks":[{"type":"italic"}],"text":"tensorflow.keras.layers"},{"type":"text","text":", is a densely connected layer. Densely connected means that every node in the previous layer is connected to every node in the current layer."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"Sequential"},{"type":"text","text":" is a Keras API, commonly called the Sequential 
API, which we will use to build the neural network."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To understand the data better, we can convert it into a DataFrame. Let's do that."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c1/c1a7a5e21ef833e8cb1c3f636e771582.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f1/f101db5de159d74619ac83d6a224436e.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that here we set "},{"type":"text","marks":[{"type":"italic"}],"text":"columns = iris.feature_names"},{"type":"text","text":", where "},{"type":"text","marks":[{"type":"italic"}],"text":"feature_names"},{"type":"text","text":" 
is the key that holds the names of all 4 features."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"For the labels,"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/37/37be26b211b2fc6b939fc7fb6ba649ad.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e2/e2f96e2c7eb0efa31ae2c1afaae0450c.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4c/4c4f30d6c4ab783b77aa6a39cca46df1.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/31/31742373c1bcd792cfdd90157020d5a4.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can see that we have 3 classes, labelled 0, 1, and 2. To see the label names, we can use"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"ht
tps://static001.geekbang.org/infoq/e1/e142b502c18b44624a9956699a4f179a.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/26/26e44c94ed39390e6447e58890f66711.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"These are the names of the classes we have to predict."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Data preprocessing for machine learning"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The first step in machine learning is data preprocessing. The main preprocessing steps are:"}]},{"type":"bulletedlist","content":[{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Filling in missing values"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Splitting the data into training and validation sets"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Standardizing the data"}]}]},{"type":"listitem","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Converting categorical data to one-hot vectors"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" 
"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Missing values"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To check whether any values are missing, we can use the "},{"type":"text","marks":[{"type":"italic"}],"text":"pandas.DataFrame.info()"},{"type":"text","text":" method."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c2/c22d8d1d7af5fe3bbaa19358add75c2c.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d2/d2d7b59588d670b8c5c71cedaeb53cbd.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can see that (fortunately) there are no missing values and that every feature is stored as "},{"type":"text","marks":[{"type":"italic"}],"text":"float64"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Splitting into training and test sets"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To split the data into training and test sets, we can turn to the "},{"type":"text","marks":[{"type":"italic"}],"text":"sklearn.model_selection"},{"type":"text","text":" module imported earlier and call its 
"},{"type":"text","marks":[{"type":"italic"}],"text":"train_test_split"},{"type":"text","text":" function."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/bc/bccf63cb4ebffa7792b6c8d2bec2b1b5.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here "},{"type":"text","marks":[{"type":"italic"}],"text":"test_size"},{"type":"text","text":" is the parameter that says we want the test data to be 10% of the whole dataset."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Data standardization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Typically, we standardize data when it shows a lot of variance. To check the variance, we can use the "},{"type":"text","marks":[{"type":"italic"}],"text":"pandas.DataFrame.var()"},{"type":"text","text":" function 
to check the variance of every column."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/f0/f03c7cd5dd71a97677b508761c2b2bae.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/83/838e137bdcf2c0331f82c09b38d8138b.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can see that the variances of "},{"type":"text","marks":[{"type":"italic"}],"text":"X_train"},{"type":"text","text":" and "},{"type":"text","marks":[{"type":"italic"}],"text":"X_test"},{"type":"text","text":" are both very low, so there is no need to standardize the data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Converting categorical data to one-hot vectors"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We know that each output is one of the 3 classes we already inspected with "},{"type":"text","marks":[{"type":"italic"}],"text":"iris.target_names"},{"type":"text","text":". The good news is that the targets are already encoded as 0, 1, 2 when we load them, where 0 = "},{"type":"text","marks":[{"type":"italic"}],"text":"1st class"},{"type":"text","text":", 1 = 2nd class, 
and so on."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The problem with this representation is that our model may give higher priority to the larger numbers, which can bias the results. To address this, we will use a one-hot representation. You can learn more about one-hot vectors "},{"type":"link","attrs":{"href":"https://towardsdatascience.com/tagged/one-hot-encoder","title":null},"content":[{"type":"text","text":"here"}]},{"type":"text","text":". We can use either "},{"type":"text","marks":[{"type":"italic"}],"text":"Keras"},{"type":"text","text":"'s built-in "},{"type":"text","marks":[{"type":"italic"}],"text":"to_categorical"},{"type":"text","text":" or "},{"type":"text","marks":[{"type":"italic"}],"text":"sklearn"},{"type":"text","text":"'s "},{"type":"text","marks":[{"type":"italic"}],"text":"OneHotEncoder"},{"type":"text","text":". We will use "},{"type":"text","marks":[{"type":"italic"}],"text":"to_categorical"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/fb/fb13eb9a52eb31ed9c024bde500b2fa0.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will look at just the first 5 rows to check that the conversion worked."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/c8/c845731cb4d80a3f6af4f6ce99750d99.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://stat
ic001.geekbang.org/infoq/6c/6c7e4fb9c0ab6190ad3c42fbfd908cb8.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Yes, the labels have been converted to a one-hot representation."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"One last thing"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"One last thing we can do is convert the data back to "},{"type":"text","marks":[{"type":"italic"}],"text":"numpy"},{"type":"text","text":" arrays so that we can use some extra functionality that will help us later with the model. To do this, we can use"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/cb/cbd6f904e83ed9194343e456c9742561.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's look at the result for the first training example."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/64/6443c5557dfb8bf11bfe51b98ef1d3b8.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9f/9f7bc0bb81dc665903d24754a76
588fa.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"In the first training example we can see the values of the 4 features, and its shape is (4,)."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The target labels are already in array format because we applied "},{"type":"text","marks":[{"type":"italic"}],"text":"to_categorical"},{"type":"text","text":" to them."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"The deep learning model"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now we can finally start creating our model and training it. We will start with a simple model and then move on to more complex architectures, covering different tricks and techniques in Keras along the way."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's write the basic model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/b8/b806faf78f17965bcabfee758011c7f8.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"First, we have to create a Sequential object. Then, to build the model, all we need to do is add layers of whatever types we choose. We will build a model with 10 dense layers so that we can observe overfitting and later reduce it with different regularization techniques."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org
/infoq/ab/ab7558c4aa0c4b5abd5f1a2971a33b1d.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that in the first layer we used an extra argument, "},{"type":"text","marks":[{"type":"italic"}],"text":"input_shape"},{"type":"text","text":". This argument specifies the dimensions of the input to the first layer. Here we do not care about the number of training examples; we only care about the number of features. So we pass the shape of a single training example, which in our case is "},{"type":"text","marks":[{"type":"italic"}],"text":"(4,)"},{"type":"text","text":", as "},{"type":"text","marks":[{"type":"italic"}],"text":"input_shape"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Note that we used the "},{"type":"text","marks":[{"type":"italic"}],"text":"softmax"},{"type":"text","text":" activation function in the output layer because this is a multi-class classification problem. For a binary classification problem, we would use the "},{"type":"text","marks":[{"type":"italic"}],"text":"sigmoid"},{"type":"text","text":" activation function instead."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We can pass in any activation function we like, such as "},{"type":"text","marks":[{"type":"italic"}],"text":"sigmoid"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"}],"text":"linear"},{"type":"text","text":", or "},{"type":"text","marks":[{"type":"italic"}],"text":"tanh"},{"type":"text","text":", but experiments have shown that "},{"type":"text","marks":[{"type":"italic"}],"text":"relu"},{"type":"text","text":" performs best in this kind of model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now that we have defined the shape of the model, the next step is to specify its "},{"type":"text","marks":[{"type":"italic"}],"text":"loss"},{"type":"text","text":", 
"},{"type":"text","marks":[{"type":"italic"}],"text":"optimizer"},{"type":"text","text":", and "},{"type":"text","marks":[{"type":"italic"}],"text":"metrics"},{"type":"text","text":". In Keras we specify these with the "},{"type":"text","marks":[{"type":"italic"}],"text":"compile"},{"type":"text","text":" method."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/2f/2f3c82d72f481e4482fb1fddcba91ea0.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we could use any "},{"type":"text","marks":[{"type":"italic"}],"text":"optimizer"},{"type":"text","text":", such as stochastic gradient descent or RMSProp, but we will use Adam."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We use "},{"type":"text","marks":[{"type":"italic"}],"text":"categorical_crossentropy"},{"type":"text","text":" here because this is a multi-class classification problem. For a binary classification problem, we would use "},{"type":"text","marks":[{"type":"italic"}],"text":"binary_crossentropy"},{"type":"text","text":" instead
."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Metrics are important for evaluating a model. We can evaluate our model against different metrics. For classification problems, the most important metric is accuracy, which tells us how often our predictions are correct."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The last step is to fit the model to the training data and training labels. Let's write the code."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/9e/9ee23d134c09c7fba2233682661bbcc8.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"}],"text":"fit"},{"type":"text","text":" returns a History callback that holds the full record of our training, which we can use for various useful tasks such as plotting."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"The History callback has an attribute named "},{"type":"text","marks":[{"type":"italic"}],"text":"history"},{"type":"text","text":", which we can access as "},{"type":"text","marks":[{"type":"italic"}],"text":"history.history"},{"type":"text","text":". It is a dictionary containing the history of every loss and metric. In our case it holds the history of "},{"type":"text","marks":[{"type":"italic"}],"text":"loss"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"}],"text":"acc"},{"type":"text","text":", "},{"type":"text","marks":[{"type":"italic"}],"text":"val_loss"},{"type":"text","text":", and "},{"type":"text","marks":[{"type":"italic"}],"text":"val_acc"},{"type":"text","text":", and we can access each one as "},{"type":"text","marks":[{"type":"italic"}],"text":"history.history['loss']"},{"type":"text","text":" or "},{"type":"text","marks":[{"type":"italic"}],"text":"history.history['val_acc']"},{"type":"text","text":", and so on
."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We specified 800 epochs, a batch size of 40, and a validation split of 0.1, which means 10% of the training data is now held out as validation data for analysing the training. Training for 800 epochs will overfit the data, meaning the model will perform very well on the training data but poorly on the test data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"While the model trains, we can see the loss and accuracy on both the training and validation sets."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/06/060ca80d5c02a4f9618d38863dc1ee34.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can see that our training accuracy is 100% and our validation accuracy is 67%, which is pretty good for such a model. Let's plot it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/d3/d3f8d31318d4f2e8766bffb6cc3616ed.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/85/85f871967cc2c2ba4b0009d302c0695d.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We can clearly see that the accuracy on the training set is much higher than on the validation set."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Similarly, we can plot the loss as"}]},{"type":"paragraph","
attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/71/7199dd61ffcb2f629e6fc603bee0acae.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/e4/e46c54e215cbf51d35d7630caad5d735.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can clearly see that our validation loss is much higher than our training loss, which is because we have overfitted the data."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"To check the model's performance, we can use "},{"type":"text","marks":[{"type":"italic"}],"text":"model.evaluate"},{"type":"text","text":" 
to score it. The method takes the data and the labels as arguments."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/91/91f6f33943a391ca88ef6c0a440ceabb.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/6b/6b794c6a435a4c96f1f29396bffefa35.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can see that our model reaches 88% accuracy, which is quite good for an overfitted model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Regularization"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's make the model better by adding regularization. Regularization will reduce overfitting and improve our model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will add L2 regularization to our model. You can learn more about L2 regularization "},{"type":"link","attrs":{"href":"https://towardsdatascience.com/concept-of-regularization-28f593cf9f8c#:~:text=The%20idea%20behind%20L1%20regularization,absolute%20value%20of%20the%20coefficients.","title":null},"content":[{"type":"text","text":"here"}]},{"type":"text","text":". To add L2 regularization, we have to pick the layers we want to regularize and give each of them an extra argument, "},{"type":"text","marks":[{"type":"italic"}],"text":"kernel_regularizer"},{"type":"text","text":", passing it 
"},{"type":"text","marks":[{"type":"italic"}],"text":"tf.keras.regularizers.l2()"},{"type":"text","text":"."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will also add some Dropout to our model, which will help us reduce overfitting further and so get a better-performing model. To learn more about the theory and motivation behind dropout, see "},{"type":"link","attrs":{"href":"https://medium.com/towards-artificial-intelligence/an-introduction-to-dropout-for-regularizing-deep-neural-networks-4e0826c10395","title":null},"content":[{"type":"text","text":"this"}]},{"type":"text","text":" article."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's rebuild the model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/10/104b80bb91a3a1a98fc072a7f7647384.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"If you look closely, all of our layers and parameters are the same, except that we added 2 Dropout layers and regularization in each dense layer."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"We will keep everything else (loss, optimizer, epochs, etc.) unchanged."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/4e/4eb2eca33824d9cd3a401615510e9162.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Now let's evaluate the model."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/53/539e19911bfe897c52b1260f
e31299c1.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/db/dba4605443a83a787a9b2cf8729bae2c.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Guess what? By adding regularization and Dropout, we improved accuracy from 88% to 94%. Adding batch normalization may improve it further."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Let's plot it."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/af/afe2274f507b854ec954bce5b24b52b1.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/50/50dd440bb313f8102dd64ddb04ed188c.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https://static001.geekbang.org/infoq/81/81b19aa0358aebaf873c9e42b833a735.png","alt":null,"title":null,"style":null,"href":null,"fromPaste":true,"pastePass":true}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Insights"}]},{"t
ype":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"Here we can see that we have successfully removed the overfitting from the model and improved its accuracy by about 6 percentage points (from 88% to 94%), which is a good improvement for such a small dataset."}]}]}
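The regularized model described above appears in the article only as screenshots, so here is a minimal sketch of what such a model might look like in `tf.keras`. It is not the author's exact code: the layer widths (64 and 32), the L2 factor (0.001), the dropout rate (0.3), the `adam` optimizer, the sparse cross-entropy loss, and the helper name `build_regularized_model` are all assumptions chosen for illustration. Only the overall shape (dense layers carrying `tf.keras.regularizers.l2()` plus two Dropout layers, on 4 Iris features and 3 classes) comes from the text.

```python
import numpy as np
import tensorflow as tf


def build_regularized_model(n_features=4, n_classes=3, l2_rate=0.001, drop_rate=0.3):
    """Small classifier with an L2 penalty on every dense layer and two Dropout layers.

    Layer widths, l2_rate and drop_rate are illustrative assumptions,
    not values taken from the article.
    """
    reg = tf.keras.regularizers.l2(l2_rate)
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu", kernel_regularizer=reg),
        tf.keras.layers.Dropout(drop_rate),   # first Dropout layer
        tf.keras.layers.Dense(32, activation="relu", kernel_regularizer=reg),
        tf.keras.layers.Dropout(drop_rate),   # second Dropout layer
        tf.keras.layers.Dense(n_classes, activation="softmax", kernel_regularizer=reg),
    ])
    # Assumed training setup; integer class labels are expected by this loss.
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


model = build_regularized_model()

# Forward pass on a dummy batch of two samples, with Dropout disabled (inference mode).
probs = model(np.zeros((2, 4), dtype="float32"), training=False).numpy()
print(probs.shape)
```

Dropout is only active while training (for example inside `model.fit`), so evaluation and prediction use the full network. The L2 terms are added to the training loss automatically, which is why nothing else in the training call needs to change.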