Preface

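**Note: Coursera asks students not to publish their assignments on the internet. If you are taking this course, I recommend completing the assignment yourself in the course system. Logistic regression is used as the simplest neural-network-like model for image classification. I think the code is worth keeping for reference.
This assignment uses a 2×4×1 network to perform binary classification on the data.**
The tricky part is knowing when to use element-wise multiplication and when to use matrix multiplication; see the code in section [4.3]. Although all of the code was submitted and passed grading, I still need to sort this out for myself.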
I. Computing the cost
$\begin{matrix} (13) & J = - \frac{1}{m} \sum_{i = 1}^{m} \left( y^{(i)} \log (a^{[2] (i)}) + (1 - y^{(i)}) \log (1 - a^{[2] (i)}) \right) \end{matrix}$

Use element-wise multiplication:

    logprobs = np.multiply(np.log(A2),Y)+np.multiply(np.log(1-A2),1-Y)
    cost = - np.sum(logprobs)/m
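As a sanity check on the two lines above, here is a minimal, self-contained sketch of the same cost. The function names `compute_cost` and `compute_cost_dot` and the sample values of `A2` and `Y` are assumptions made for illustration, not part of the graded notebook. It shows that the element-wise form and an equivalent `np.dot` form give the same number when `A2` and `Y` are both shaped `(1, m)`.

```python
import numpy as np

def compute_cost(A2, Y):
    """Cross-entropy cost using element-wise products; A2 and Y are shaped (1, m)."""
    m = Y.shape[1]
    logprobs = np.multiply(np.log(A2), Y) + np.multiply(np.log(1 - A2), 1 - Y)
    return float(-np.sum(logprobs) / m)

def compute_cost_dot(A2, Y):
    """Same cost written with matrix products: (1, m) times (m, 1) gives (1, 1)."""
    m = Y.shape[1]
    total = np.dot(Y, np.log(A2).T) + np.dot(1 - Y, np.log(1 - A2).T)
    return float(-total[0, 0] / m)

# Hypothetical activations and labels for three examples
A2 = np.array([[0.9, 0.2, 0.6]])
Y = np.array([[1, 0, 1]])

cost_elementwise = compute_cost(A2, Y)
cost_dot = compute_cost_dot(A2, Y)
print(cost_elementwise, cost_dot)
```

The element-wise version keeps one log-probability per example before summing, which makes the shapes easy to inspect; the `np.dot` version folds the sum into the product.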

II. Computing the gradients

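    # Backward propagation: calculate dW1, db1, dW2, db2.
    ### START CODE HERE ### (≈ 6 lines of code, corresponding to 6 equations on slide above)

    # Reason about a single example to simplify: each gradient has the same shape as the
    # quantity it belongs to, e.g. dZ2.shape == Z2.shape == (1, 1), dW1.shape == W1.shape == (4, 2).
    dZ2 = A2 - Y  # A2(1,1) - Y(1,1) = dZ2(1,1)
    dW2 = np.dot(dZ2, A1.T) / m  # dZ2(1,1) x A1.T(1,4) = dW2(1,4) ==> matrix multiplication
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m  # sum of dZ2 over the examples / m = db2(1,1)
    # dZ1 is (4,1): W2.T(4,1) x dZ2(1,1), then times [1 - np.power(A1, 2)](4,1).
    # Both products are written element-wise. The inner one relies on broadcasting and equals
    # np.dot(W2.T, dZ2) only because the output layer has a single unit.
    dZ1 = np.multiply(np.multiply(W2.T, dZ2), 1 - np.power(A1, 2))
    dW1 = np.dot(dZ1, X.T) / m  # dZ1(4,1) x X.T(1,2) = dW1(4,2) ==> matrix multiplication
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    ### END CODE HERE ###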

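To decide between element-wise and matrix multiplication, annotate the shapes of the inputs and the shape the output should have beforehand; the choice then becomes much easier to see. If the inputs have the same shape as the output, it is an element-wise product; otherwise it is a matrix product. If the annotated shapes still do not settle it, go one step further and sketch whether a row or a column holds the n features of a single example. Alternatively, reason about the shapes for a single example only, as above.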
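The shape-annotation rule can be tried out on a small batch. The sketch below is not the graded solution: the batch size `m = 5`, the random data and the small random weights are assumptions chosen only so the shapes can be printed and checked. It runs one forward and backward pass for the 2×4×1 network and asserts that every gradient has the shape of its parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, n_y, m = 2, 4, 1, 5  # m = 5 is an arbitrary batch size for illustration

# Hypothetical data and parameters
X = rng.standard_normal((n_x, m))
Y = rng.integers(0, 2, size=(n_y, m))
W1 = rng.standard_normal((n_h, n_x)) * 0.01
b1 = np.zeros((n_h, 1))
W2 = rng.standard_normal((n_y, n_h)) * 0.01
b2 = np.zeros((n_y, 1))

# Forward pass
A1 = np.tanh(np.dot(W1, X) + b1)                    # (4, 2) x (2, m) -> (4, m)
A2 = 1.0 / (1.0 + np.exp(-(np.dot(W2, A1) + b2)))   # (1, 4) x (4, m) -> (1, m)

# Backward pass
dZ2 = A2 - Y                                        # same shapes in and out: element-wise
dW2 = np.dot(dZ2, A1.T) / m                         # (1, m) x (m, 4) -> (1, 4): matrix product
db2 = np.sum(dZ2, axis=1, keepdims=True) / m        # (1, 1)
back = np.dot(W2.T, dZ2)                            # (4, 1) x (1, m) -> (4, m): matrix product
dZ1 = back * (1 - np.power(A1, 2))                  # (4, m) * (4, m): element-wise
dW1 = np.dot(dZ1, X.T) / m                          # (4, m) x (m, 2) -> (4, 2): matrix product
db1 = np.sum(dZ1, axis=1, keepdims=True) / m        # (4, 1)

# Every gradient must have the shape of the parameter it updates
for grad, param in [(dW1, W1), (db1, b1), (dW2, W2), (db2, b2)]:
    assert grad.shape == param.shape

# With a single output unit, the broadcasted element-wise product gives the same result
same_as_broadcast = np.allclose(back, np.multiply(W2.T, dZ2))
print(dZ1.shape, dW1.shape, same_as_broadcast)
```

Note that `np.multiply(W2.T, dZ2)` matches `np.dot(W2.T, dZ2)` here only because `n_y = 1`; with more than one output unit the two shapes would generally not broadcast, so the matrix product is the safer general form.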

Planar data classification with one hidden layer

Welcome to your week 3 programming assignment. It’s time to build your first neural network, which will have a hidden layer. You will see a big difference between this model and the one you implemented using logistic regression.

You will learn how to:
- Implement a 2-class classification neural network with a single hidden layer
- Use units with a non-linear activation function, such as tanh
- Compute the cross entropy loss
- Implement forward and backward propagation

1 - Packages

Let’s first import all the packages that you will need during this assignment.
- numpy is the fundamental package for scientific computing with Python.
- sklearn provides simple and efficient tools for data mining and data analysis.
- matplotlib is a library for plotting graphs in Python.
- testCases provides some test examples to assess the correctness of your functions.
- planar_utils provides various useful functions used in this assignment.

# Package imports
import numpy as np
import matplotlib.pyplot as plt
from testCases_v2 import *
import sklearn
import sklearn.datasets
import sklearn.linear_model
from planar_utils import plot_decision_boundary, sigmoid, load_planar_dataset, load_extra_datasets

%matplotlib inline

np.random.seed(1) # set a seed so that the results are consistent

/opt/conda/lib/python3.5/site-packages/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
  warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')

2 - Dataset

First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into variables X and Y.

X, Y = load_planar_dataset()

Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model to fit this data.

# Visualize the data:
plt.scatter(X[0, :], X[1, :], c=Y, s=40, cmap=plt.cm.Spectral);

[Figure: scatter plot of the flower dataset]

You have:
- a numpy-array (matrix) X that contains your features (x1, x2)
- a numpy-array (vector) Y that contains your labels (red:0, blue:1).

Let’s first get a better sense of what our data is like.

Exercise: How many training examples do you have? In addition, what is the shape of the variables X and Y?

Hint: How do you get the shape of a numpy array? (help)

### START CODE HERE ### (≈ 3 lines of code)
shape_X = X.shape
shape_Y = Y.shape
m = shape_X[1]  # training set size
### END CODE HERE ###

print ('The shape of X is: ' + str(shape_X))
print ('The shape of Y is: ' + str(shape_Y))
print ('I have m = %d training examples!' % (m))

The shape of X is: (2, 400)
The shape of Y is: (1, 400)
I have m = 400 training examples!

Expected Output:

shape of X	(2, 400)
shape of Y	(1, 400)
m	400

3 - Simple Logistic Regression

Before building a full neural network, let’s first see how logistic regression performs on this problem. You can use sklearn’s built-in functions to do that. Run the code below to train a logistic regression classifier on the dataset.

# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV();
clf.fit(X.T, Y.T);

/opt/conda/lib/python3.5/site-packages/sklearn/utils/validation.py:515: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  y = column_or_1d(y, warn=True)

You can now plot the decision boundary of these models. Run the code below.

# Plot the decision boundary for logistic regression
plot_decision_boundary(lambda x: clf.predict(x), X, Y)
plt.title("Logistic Regression")

# Print accuracy
LR_predictions = clf.predict(X.T)
print ('Accuracy of logistic regression: %d ' % float((np.dot(Y,LR_predictions) + np.dot(1-Y,1-LR_predictions))/float(Y.size)*100) +
       '% ' + "(percentage of correctly labelled datapoints)")

Accuracy of logistic regression: 47 % (percentage of correctly labelled datapoints)
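The accuracy expression above counts matches with two dot products: `np.dot(Y, predictions)` counts the correctly predicted 1s and `np.dot(1-Y, 1-predictions)` counts the correctly predicted 0s. The sketch below spells this out on made-up labels and compares it with a direct comparison via `np.mean`. The helper names and the sample arrays are assumptions for illustration only.

```python
import numpy as np

def accuracy_percent_dot(Y, predictions):
    """Accuracy via dot products, mirroring the expression used in the notebook."""
    Y = np.asarray(Y).ravel()
    predictions = np.asarray(predictions).ravel()
    # Correct 1s plus correct 0s
    matches = np.dot(Y, predictions) + np.dot(1 - Y, 1 - predictions)
    return float(matches) / Y.size * 100

def accuracy_percent_mean(Y, predictions):
    """Accuracy as the mean of element-wise label matches."""
    Y = np.asarray(Y).ravel()
    predictions = np.asarray(predictions).ravel()
    return float(np.mean(Y == predictions) * 100)

# Hypothetical labels shaped (1, m) and predictions shaped (m,), as in the notebook
Y_demo = np.array([[1, 0, 1, 1, 0]])
pred_demo = np.array([1, 0, 0, 1, 1])

acc_dot = accuracy_percent_dot(Y_demo, pred_demo)
acc_mean = accuracy_percent_mean(Y_demo, pred_demo)
print(acc_dot, acc_mean)
```

Both helpers flatten their inputs first, so the `(1, m)` label array and the `(m,)` prediction array can be compared without relying on broadcasting.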

[Figure: decision boundary of logistic regression on the flower dataset]

Expected Output:

**Accuracy**

47%

Interpretation: The dataset is not linearly separable, so logistic regression doesn’t perform well. Hopefully a neural network will do better. Let’s try this now!

4 - Neural Network model

Logistic regression did not work well on the “flower dataset”. You are going to train a Neural Network with a single hidden layer.

Here is our model:

[Figure: neural network with a single hidden layer]
Mathematically:

For one example $x^{(i)}$ :

$\begin{matrix} (1) & z^{[1] (i)} = W^{[1]} x^{(i)} + b^{[1]} \end{matrix}$

$\begin{matrix} (2) & a^{[1] (i)} = \tanh (z^{[1] (i)}) \end{matrix}$

$\begin{matrix} (3) & z^{[2] (i)} = W^{[2]} a^{[1] (i)} + b^{[2]} \end{matrix}$

$\begin{matrix} (4) & \hat{y}^{(i)} = a^{[2] (i)} = \sigma (z^{[2] (i)}) \end{matrix}$

$\begin{matrix} (5) & y_{prediction}^{(i)} = \begin{cases} 1 & \text{if } a^{[2] (i)} > 0.5 \\ 0 & \text{otherwise} \end{cases} \end{matrix}$
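Equations (1) to (5) translate almost line by line into NumPy. The following is a minimal sketch for a single example, not the assignment's reference solution: the function name `forward_one_example`, the locally defined `sigmoid` (the notebook imports its own from planar_utils) and the parameter values are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, defined locally so the sketch is self-contained."""
    return 1.0 / (1.0 + np.exp(-z))

def forward_one_example(x, W1, b1, W2, b2):
    """Apply equations (1)-(5) to one column vector x of shape (2, 1)."""
    z1 = np.dot(W1, x) + b1              # (1): (4, 2) x (2, 1) + (4, 1) -> (4, 1)
    a1 = np.tanh(z1)                     # (2): element-wise tanh
    z2 = np.dot(W2, a1) + b2             # (3): (1, 4) x (4, 1) + (1, 1) -> (1, 1)
    a2 = sigmoid(z2)                     # (4): predicted probability of class 1
    y_prediction = int(a2[0, 0] > 0.5)   # (5): threshold at 0.5
    return a2, y_prediction

# Hypothetical parameters for a 2-4-1 network
W1 = np.full((4, 2), 0.5)
b1 = np.zeros((4, 1))
W2 = np.ones((1, 4))
b2 = np.zeros((1, 1))

x_pos = np.array([[1.0], [1.0]])
x_neg = np.array([[-1.0], [-1.0]])

a2_pos, pred_pos = forward_one_example(x_pos, W1, b1, W2, b2)
a2_neg, pred_neg = forward_one_example(x_neg, W1, b1, W2, b2)
print(a2_pos.shape, pred_pos, pred_neg)
```

The same function also works on a whole batch if `x` is replaced by a `(2, m)` matrix, except for the final thresholding line, which reads only the first entry and would need to become `a2 > 0.5`.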

Andrew Ng. Deep Learning Series - C1 Neural Networks and Deep Learning - W3 (Assignment: Classifying 2-D Data with One Hidden Layer)
