Google 机器学习从零到一

20 Jun 2020 Skill, Note, and Machine-Learning

1 - The “Hello World” of machine learning

机器学习 · 输入答案和数据，输出规则。

The process of training the neural network, where it ‘learns’ the relationship between the Xs and Ys is in the model.fit call. This is where it will go through the loop we spoke about before:aking a guess, measuring how good or bad it is (aka the loss), using the optimizer to make another guess etc. It will do it for the number of epochs you specify. When you run this code, you’ll see the loss will be printed out for each epoch.

Use the model.predict method to have it figure out the Y for a previously unknown X.

Neural networks deal with probabilities, so given the data that we fed the NN with, it calculated that there is a very high probability that the relationship between X and Y is Y=3X+1, but with only 6 data points we can’t know for sure. As a result, the result for 10 is very close to 31, but not necessarily 31.

· 完成的项目代码

import tensorflow as tf
import numpy as np
from tensorflow import keras

## Define and compile the neural network
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])

# Using `mean squared error` for the loss; `stochastic gradient descent` (sgd) for the optimizer.
model.compile(optimizer='sgd', loss='mean_squared_error')

## Providing the data
xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-2.0, 1.0, 4.0, 7.0, 10.0, 13.0], dtype=float)

## Training the neural network
model.fit(xs, ys, epochs=500)

print(model.predict([10.0])) # Print [[31.002396]]

In this lab you saw how to train a neural network to spot the relationship between two sets of numbers by defining the network. You defined a set of layers (in this case just 1) that contained neurons (also in this case, just 1), which you then compiled with a loss function and an optimizer.

2 - Beyond Hello World, A Computer Vision Example

使用数字而非中文作为标签，好处在于：

易于计算机计算。
避免偏差。使用“短靴”二字时就已经注入的基于英文的偏差，使用数字就可以人为映射成任意语言。

关于结构 model = tf.keras.models.Sequential([tf.keras.layers.Flatten(input_shape=(28, 28)), # 输入28*28像素图片 tf.keras.layers.Dense(128, activation=tf.nn.relu), # 128 相当于128个函数，都带不同参数，每个输入都经由其中算得唯一结果 tf.keras.layers.Dense(10, activation=tf.nn.softmax)]) # 输出10个分类中的一种
激励函数 relu 线性整流函数 - 返回一个大于零的数值，过滤到小于等于0的结果。
softmax - 选出集合中最大的数，最终被设为1。

这里的图像大小都一致（28*28），且物件都在图像正中。需要识别更泛用的普通图像，物件混杂的图片，需要识别特征的方法，使用卷积神经网络。

插曲

keras下载实在太慢，遂直接跟踪下载链接下数据集，放到目录 ~/.keras/datasets/ 下。均到.gz后缀。预训练模型放到 ~/.keras/models/ 下。 Windows 应该是 C:\Users\Username.keras\datasets\fashion-mnist 这样。

plt.imshow(training_images[0])不显示图片.. 记得补一句plt.show()。

You’ll notice that all of the values in the number are between 0 and 255. If we are training a neural network, for various reasons it’s easier if we treat all values as between 0 and 1, a process called ‘normalizing

Both the list and the labels are 0 based, so the ankle boot having label 9 means that it is the 10th of the 10 classes. The list having the 10th element being the highest value means that the Neural Network has predicted that the item it is classifying is most likely an ankle boot

rule of thumb - 经验做法

必须Flatten。 Right now our data is 28x28 images, and 28 layers of 28 neurons would be infeasible, so it makes more sense to ‘flatten’ that 28,28 into a 784x1.
最后一层神经元数量必须得是分类数量。 The number of neurons in the last layer should match the number of classes you are classifying for. In this case it’s the digits 0-9, so there are 10 of them, hence you should have 10 neurons in your final layer.

在512神经元层和输出层之间加一层，对相对简单是数据没有影响。 model = tf.keras.models.Sequential([tf.keras.layers.Flatten(), tf.keras.layers.Dense(512, activation=tf.nn.relu), tf.keras.layers.Dense(256, activation=tf.nn.relu), tf.keras.layers.Dense(10, activation=tf.nn.softmax)]) Consider the effects of additional layers in the network. What will happen if you add another layer between the one with 512 and the final layer with 10.

Ans: There isn’t a significant impact – because this is relatively simple data. For far more complex data (including color images to be classified as flowers that you’ll see in the next lesson), extra layers are often necessary.

训练更多epoch，loss可能会下降，但再多epoch可能反而会让loss上升，发生过拟合。 Try 30 epochs – you might see the loss value stops decreasing, and sometimes increases. This is a side effect of something called ‘overfitting’ which you can learn about [somewhere] and it’s something you need to keep an eye out for when training neural networks. There’s no point in wasting your time training if you aren’t improving your loss, right! :)

归一化的用处

归一化后加快了梯度下降求最优解的速度，防止奇异数据导致无法收敛
归一化有可能提高精度（如KNN）注：没有一种数据标准化的方法，放在每一个问题，放在每一个模型，都能提高算法精度和加速算法的收敛速度。

也使用callback函数在达到目标准确率后停止训练。

· 项目代码

import tensorflow as tf
from tensorflow import keras

import matplotlib.pyplot as plt
print(tf.__version__)

# 写一个callback函数 在达到预期准确率之后停止训练
'''
class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs={}):
    if(logs.get('acc')>0.9):
      print("\nReached 90% accuracy so cancelling training!")
      self.model.stop_training = True

'''
#callbacks = myCallback()
mnist = tf.keras.datasets.fashion_mnist

# Calling load_data on this object will give you two sets of two lists, these will be the training and testing values for the graphics that contain the clothing items and their labels.
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

# 用plt看看测试数据和标签长什么样
plt.imshow(training_images[0])
plt.show()
print(training_images[0])
print(training_labels[0])

# Normalizing
training_images = training_images / 255.0
test_images = test_images / 255.0

#* 可以不带input——shape(只写tf.keras.layers.Flatten() )
model = keras.Sequential([tf.keras.layers.Flatten(input_shape=(28, 28)),
                                   tf.keras.layers.Dense(128, activation=tf.nn.relu),
                                   tf.keras.layers.Dense(10, activation=tf.nn.softmax)]) #input_shape=(28, 28)

'''
Sequential: That defines a SEQUENCE of layers in the neural network

Flatten: Remember earlier where our images were a square, when you printed them out? Flatten just takes that square and turns it into a 1 dimensional set.

Dense: Adds a layer of neurons

Each layer of neurons need an activation function to tell them what to do. There's lots of options, but just use these for now.

Relu effectively means "If X>0 return X, else return 0" -- so what it does it it only passes values 0 or greater to the next layer in the network.

Softmax takes a set of values, and effectively picks the biggest one, 
so, for example, if the output of the last layer looks like [0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05], it saves you from fishing through it looking for the biggest value, and turns it into [0,0,0,0,1,0,0,0,0] -- The goal is to save a lot of coding!
'''

#* optimizer的参数也可以是 optimizer = 'adam'
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=5)
#model.fit(training_images, training_labels, epochs=5, callbacks=[callbacks])
# 训练完得到的准确度为训练集图像与标签的匹配准确度，要判断测试集的准确度需要调用model.evaluate

test_loss, test_acc = model.evaluate(test_images, test_labels)
print(test_loss)
print(test_acc)

# classification
pedictions = model.predict(test_images)
# the probability that this item is each of the 10 classes
print(pedictions[0]) # test_labels[0] == 9, 即标签为9可能性最大, 代表ankle boot

3 - Introduction to Convolutions

A convolution is a filter that passes over an image, processes it, and extracts features that show a commonality in the image.

卷积神经网络需要先过滤，然后才能凸显出图像特征，在此基础上识别不同物品。过滤器就是乘法器（+pic1）

过滤器可以和池化同时使用。池化把图像中的像素分组，集成，向下过滤到它的子集里。如最大化的池化(MAX pooling)（+pic2）：分割为2*2图片，取最大值，图像就会被缩小为1/4，但是保留了原始图像特征。 The goal is to reduce the overall amount of information in an image while maintaining the features that are detected as present.

过滤器是被习得的。

4 - RockPaperScissors

一点小问题

关于 ImageDataGenerator

 # ImageDataGenerator - 生成一个batch的图像数据，支持实时数据提升。训练时该函数会无限生成数据，直到达到规定的epoch次数为止。
 # 用来从下载的图片中生成训练用的图片。
 training_datagen = ImageDataGenerator(
       rescale = 1./255,
       rotation_range=40,
       width_shift_range=0.2,
       height_shift_range=0.2,
       shear_range=0.2,
       zoom_range=0.2,
       horizontal_flip=True,
       fill_mode='nearest')

model.fit() 和 model.fit_generator()

 '''
 The Keras fit function takes arrays of data, numpy arrays, not generators. 
 The function you need is fit_generator. 
 Note that fit_generator takes slightly different parameters, such as steps_per_epoch instead of batch_size.
 '''
 history = model.fit_generator(train_generator, epochs=25, steps_per_epoch=20, 
                               validation_data = validation_generator, 
                               verbose = 1, 
                               validation_steps = 3)

enumerate()

对于一个可迭代的（iterable/可遍历的对象（如列表、字符串），enumerate将其组成一个索引序列，利用它可以同时获得索引和值。多用于在for循环中得到计数。

 # 如果对一个列表，既要遍历索引又要遍历元素时，首先可以这样写：
 list1 = ["这", "是", "一个", "测试"]
 for i in range (len(list1)):
     print i ,list1[i]
	
 # 使用enumerate()则会更加直接和优美：
 list1 = ["这", "是", "一个", "测试"]
 for index, item in enumerate(list1):
     print index, item
	
 '''
 >>>
 0 这
 1 是
 2 一个
 3 测试
 '''
	
 # enumerate还可以接收第二个参数，用于指定索引起始值，如：
 list1 = ["这", "是", "一个", "测试"]
 for index, item in enumerate(list1, 1):
     print index, item
	
 '''
 >>>
 1 这
 2 是
 3 一个
 4 测试
 '''
	
 # 也可以使用enumerate统计文件行数
 count = 0
 for index, line in enumerate(open(filepath,'r'))： 
     count += 1

val_loss, val_acc 是验证集的 acc 和 loss

为什么需要验证集？使用验证集是因为在训练模型的时候，如果只有训练集，拟合过程中loss一定是逐渐下降的，即使过拟合了也不会回头。那么如果你要调参，想让 loss 更低，就会一直加层，一直加神经元，然后发现 loss 越来越低，最后100%识别所有的样本。
为什么在拥有验证集的基础上还需要测试集？根据验证集选择模型，可能使结果带有偶然性，也相当于人为泄露了验证集给模型。因此需要一个测试集，不影响模型的整个训练过程以及调参过程。那么这个集就叫做测试集。使用测试集来评估模型，得到的分数就是合理的，可信的。