Assignment 4
Previously in 2_fullyconnected.ipynb and 3_regularization.ipynb, we trained fully connected networks to classify notMNIST
characters. The goal of this assignment is to make the neural network convolutional.
Problem 1
The convolutional model above uses convolutions with stride 2 to reduce the dimensionality. Replace the strides by a max pooling
operation (nn.max_pool()) of stride 2 and kernel size 2.
If I understand correctly, this means turning the stride-2 convolution into a stride-1 convolution, and then following it with a max pooling operation to achieve the dimensionality reduction.
As usual, let's first look at the results of the provided code:
Minibatch loss at step 1000: 0.600151
Minibatch accuracy: 87.5%
Validation accuracy: 82.5%
Test accuracy: 89.3%
The CNN performs quite well: no worse than what we wrote ourselves in Assignment 3, while using far fewer parameters and training much faster.
Next, let's analyze the code and see how to modify it.
# Input placeholder: note the explicit 4-D shape [batch, height, width, channels]
tf_train_dataset = tf.placeholder(
    tf.float32, shape=(batch_size, image_size, image_size, num_channels))
# Convolution kernel: [patch_size, patch_size, in_channels, out_depth]
layer1_weights = tf.Variable(tf.truncated_normal(
    [patch_size, patch_size, num_channels, depth], stddev=0.1))
# Stride-2 convolution, the part Problem 1 asks us to replace
conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
There are two points that differ from before.
First, the shape of the placeholder. I had always wanted to feed the data in as [None, 28, 28]; it turns out the last dimension (the channel dimension) was missing.
In TensorFlow this shape is not something you can pick arbitrarily: if you want to use its built-in ops, the data has to be arranged into the [batch, height, width, channels] layout it expects.
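For context, this is roughly what the data-reformat step looks like: a minimal sketch, assuming the notebook's usual names (image_size = 28, num_channels = 1, ten label classes), which reshapes the flat notMNIST arrays into that 4-D layout.

import numpy as np

image_size = 28
num_channels = 1  # notMNIST images are grayscale, so a single channel

def reformat(dataset, labels, num_labels=10):
    # [num_samples, 28, 28] -> [num_samples, 28, 28, 1], as conv2d expects
    dataset = dataset.reshape(
        (-1, image_size, image_size, num_channels)).astype(np.float32)
    # integer class labels -> one-hot float vectors
    labels = (np.arange(num_labels) == labels[:, None]).astype(np.float32)
    return dataset, labels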
The tf.nn.conv2d operation:
data: the input data
layer1_weights: the convolution kernel parameters, i.e., the weights w of the CNN
[1, 2, 2, 1]: this 4-element argument is the stride, with one entry per input dimension: the stride over the batch, over the image height, over the image width, and over num_channels. The first and last entries are therefore usually 1, and the middle two are usually equal.
padding: there are two padding modes. 'SAME' pads the border with zeros, so with stride 1 the convolution output is the same size as the input. 'VALID' does no padding, so the output size is the image width minus the kernel width plus 1. A concrete comparison of the two modes follows below.
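To make the two padding modes concrete, here is a small sketch (TF 1.x-style API; the batch size of 16 and the 5x5 kernel are just illustrative) comparing the output shapes of 'SAME' and 'VALID' with stride 1 on a 28x28 input:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(16, 28, 28, 1))
w = tf.Variable(tf.truncated_normal([5, 5, 1, 16], stddev=0.1))

same = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
valid = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='VALID')

print(same.get_shape())   # (16, 28, 28, 16): zero-padded, spatial size unchanged
print(valid.get_shape())  # (16, 24, 24, 16): 28 - 5 + 1 = 24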
Next, let's look at the documentation for max_pool:
tf.nn.max_pool(value, ksize, strides, padding, name=None)
Performs the max pooling on the input.
Args:
value: A 4-D Tensor with shape [batch, height, width, channels] and type float32, float64, qint8, quint8, qint32.
ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.
strides: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor.
padding: A string, either 'VALID' or 'SAME'. The padding algorithm.
name: Optional name for the operation.
Returns:
A Tensor with the same type as value. The max pooled output tensor.
This confirms that if you want to use TensorFlow's ready-made ops, you cannot define tensor shapes however you like; you have to follow its conventions.
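A minimal sketch of the Problem 1 change, reusing the notebook's names (data, layer1_weights, layer1_biases come from the provided model code; the rest of the graph is assumed): the convolution stride drops to 1, and a 2x2 max pool with stride 2 takes over the downsampling.

# Before: the stride-2 convolution did the downsampling
# conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')

# After: stride-1 convolution, then 2x2 max pooling with stride 2
conv = tf.nn.conv2d(data, layer1_weights, [1, 1, 1, 1], padding='SAME')
hidden = tf.nn.relu(conv + layer1_biases)
pooled = tf.nn.max_pool(hidden, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')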
With this simple swap of the conv stride for a max-pool downsampling step, the results improve slightly:
Minibatch loss at step 1000: 0.311003
Minibatch accuracy: 93.8%
Validation accuracy: 84.4%
Test accuracy: 90.9%
Problem 2
Try to get the best performance you can using a convolutional net. Look for example at the classic LeNet5 architecture, adding Dropout, and/or adding learning rate decay.
Here we'll implement the Inception module discussed in the lectures, i.e., the building block of GoogLeNet, which won ImageNet in 2014.
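For reference, a simplified sketch of a single Inception module in the same TF 1.x style (this is the general GoogLeNet building block, not my exact network; all branch depths and variable names here are illustrative assumptions): four parallel branches whose outputs are concatenated along the channel dimension.

import tensorflow as tf

def inception_module(inputs, in_depth):
    # Branch 1: plain 1x1 convolution
    w1 = tf.Variable(tf.truncated_normal([1, 1, in_depth, 16], stddev=0.1))
    b1 = tf.nn.relu(tf.nn.conv2d(inputs, w1, [1, 1, 1, 1], padding='SAME'))

    # Branch 2: 1x1 reduction, then 3x3 convolution
    w2a = tf.Variable(tf.truncated_normal([1, 1, in_depth, 16], stddev=0.1))
    w2b = tf.Variable(tf.truncated_normal([3, 3, 16, 32], stddev=0.1))
    b2 = tf.nn.relu(tf.nn.conv2d(inputs, w2a, [1, 1, 1, 1], padding='SAME'))
    b2 = tf.nn.relu(tf.nn.conv2d(b2, w2b, [1, 1, 1, 1], padding='SAME'))

    # Branch 3: 1x1 reduction, then 5x5 convolution
    w3a = tf.Variable(tf.truncated_normal([1, 1, in_depth, 16], stddev=0.1))
    w3b = tf.Variable(tf.truncated_normal([5, 5, 16, 32], stddev=0.1))
    b3 = tf.nn.relu(tf.nn.conv2d(inputs, w3a, [1, 1, 1, 1], padding='SAME'))
    b3 = tf.nn.relu(tf.nn.conv2d(b3, w3b, [1, 1, 1, 1], padding='SAME'))

    # Branch 4: 3x3 max pooling, then 1x1 convolution
    w4 = tf.Variable(tf.truncated_normal([1, 1, in_depth, 16], stddev=0.1))
    b4 = tf.nn.max_pool(inputs, [1, 3, 3, 1], [1, 1, 1, 1], padding='SAME')
    b4 = tf.nn.relu(tf.nn.conv2d(b4, w4, [1, 1, 1, 1], padding='SAME'))

    # Concatenate the four branches along the channel axis
    return tf.concat([b1, b2, b3, b4], axis=3)

Stacking a few of these between the input and the classifier, with the usual pooling in between, is the basic idea; the 1x1 convolutions keep the parameter count of the 3x3 and 5x5 branches under control.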
Here is my network structure visualized with TensorBoard. The design is completely ad hoc, thrown together without much thought, so while the training time and parameter count blew up many times over, the result is only about one point better than the provided CNN code. The goal was simply to get a feel for the Inception module concept. Of course, plenty of people on GitHub have implemented good Inception modules; there is one in the model zoo on the TensorFlow GitHub page. My results:
Minibatch loss at step 10000: 0.035036
Minibatch accuracy: 0.906250
test accuracy: 0.922000