1.
For the training set given above, what is the value of m? In the box below, please enter your answer (which should be a number between 0 and 10).
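For reference, m denotes the number of training examples, i.e., the row count of the training set. A minimal sketch, assuming the set is stored as a list of (x, y) pairs (the data here are illustrative, not the quiz's table):

```python
# m is the number of training examples (rows) in the training set.
training_set = [(1.0, 0.5), (2.0, 1.0), (4.0, 2.0)]   # hypothetical (x, y) pairs
m = len(training_set)
print(m)   # 3
```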
2.
θ0 = 0.5, θ1 = 0.5
θ0 = 0, θ1 = 0.5
θ0 = 1, θ1 = 1
θ0 = 1, θ1 = 0.5
θ0 = 0.5, θ1 = 0
3.
Suppose we set θ0 = −1, θ1 = 2. What is hθ(6)?
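A quick sketch of the evaluation, assuming the single-variable hypothesis hθ(x) = θ0 + θ1·x from lecture:

```python
# Evaluate the linear hypothesis h_theta(x) = theta0 + theta1 * x
# at the parameter values given in the question.
theta0, theta1 = -1.0, 2.0

def h(x):
    return theta0 + theta1 * x

print(h(6))   # -1 + 2 * 6 = 11.0
```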
4.
Suppose we use gradient descent to minimize some function f(θ0, θ1). Which of the following statements are true? (Check all that apply.)
No matter how θ0 and θ1 are initialized, so long as α is sufficiently small, we can safely expect gradient descent to converge to the same solution.
Setting the learning rate α to be very small is not harmful, and can only speed up the convergence of gradient descent.
If θ0 and θ1 are initialized at the global minimum, then one iteration will not change their values.
If the first few iterations of gradient descent cause f(θ0, θ1) to increase rather than decrease, then the most likely cause is that we have set the learning rate α to too large a value.
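The last two statements above can be checked numerically. Below is a minimal sketch, assuming the squared-error cost J(θ0, θ1) = (1/2m) Σ (hθ(x(i)) − y(i))² from the linear-regression lectures; the data and learning rates are illustrative, not from the quiz.

```python
# Gradient descent on the squared-error cost J(theta0, theta1) for
# one-variable linear regression, h_theta(x) = theta0 + theta1 * x.
xs = [1.0, 2.0, 3.0]   # hypothetical inputs
ys = [2.0, 4.0, 6.0]   # targets on the line y = 2x, so the minimum is (theta0, theta1) = (0, 2)
m = len(xs)

def cost(t0, t1):
    return sum((t0 + t1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def step(t0, t1, alpha):
    # One simultaneous update using the partial derivatives of J.
    g0 = sum((t0 + t1 * x - y) for x, y in zip(xs, ys)) / m
    g1 = sum((t0 + t1 * x - y) * x for x, y in zip(xs, ys)) / m
    return t0 - alpha * g0, t1 - alpha * g1

# Initialized at the global minimum: the gradient is zero, so one
# iteration does not change the parameters.
print(step(0.0, 2.0, alpha=0.1))   # (0.0, 2.0)

# Learning rate set too large: the cost increases instead of decreasing.
t0, t1 = 0.0, 0.0
for _ in range(3):
    print(cost(t0, t1))            # 9.33..., 498.0, ... (growing)
    t0, t1 = step(t0, t1, alpha=1.5)
```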
5.
Suppose that for some linear regression problem (say, predicting housing prices as in the lecture), we have some training set, and for our training set we managed to find some θ0, θ1 such that J(θ0, θ1) = 0. Which of the statements below must then be true? (Check all that apply.)
This is not possible: by the definition of J(θ0, θ1), it is not possible for there to exist θ0 and θ1 so that J(θ0, θ1) = 0.
For this to be true, we must have θ0 = 0 and θ1 = 0 so that hθ(x) = 0.
We can perfectly predict the value of y even for new examples that we have not yet seen (e.g., we can perfectly predict prices of even new houses that we have not yet seen).
For these values of θ0 and θ1 that satisfy J(θ0, θ1) = 0, we have that hθ(x(i)) = y(i) for every training example (x(i), y(i)).
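A minimal numerical sketch of the last statement, again assuming the squared-error cost (the data here are illustrative and constructed to lie exactly on a line): J(θ0, θ1) = 0 forces every residual to zero, so hθ(x(i)) = y(i) on the training set, though this guarantees nothing about unseen examples.

```python
# Hypothetical training set lying exactly on the line y = 1 + 0.5 * x.
data = [(2.0, 2.0), (4.0, 3.0), (6.0, 4.0)]
theta0, theta1 = 1.0, 0.5

def h(x):
    return theta0 + theta1 * x

# Squared-error cost: zero exactly when every prediction matches its target.
J = sum((h(x) - y) ** 2 for x, y in data) / (2 * len(data))
print(J)                                   # 0.0
print(all(h(x) == y for x, y in data))     # True: h fits every training example
```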