Statistical Machine Learning GR5241
Spring 2021
Homework 4
Homework submission: Please submit your homework electronically through Canvas by 11:59pm on the due date. You need to submit both the PDF file and your code (either in R or Python).
Problem 1 (Boosting, 50 points)
The objective of this problem is to implement the AdaBoost algorithm. We will test the algorithm on handwritten
digits from the USPS data set.
AdaBoost: Assume we are given a training sample $(x^{(i)}, y_i),\ i = 1, \dots, n$, where the $x^{(i)}$ are data values in $\mathbb{R}^d$ and $y_i \in \{-1, +1\}$ are class labels. Along with the training data, we provide the algorithm with a training routine for some classifier $c$ (the "weak learner"). Here is the AdaBoost algorithm for the two-class problem:
1. Initialize weights: $w_i = 1/n$
2. For $b = 1, \dots, B$:
(a) Train a weak learner $c_b$ on the weighted training data. (See the note below.)
Note: step 2(a) can be completed using two different methods: (1) use the weight vector directly in the training of the weak learner, or (2) use the weight vector to sample data points with replacement from the original data, then train the weak learner on the sampled data. Either way will guarantee full credit.
(b) Compute error: $\epsilon_b := \dfrac{\sum_{i=1}^{n} w_i \mathbb{I}\{y_i \neq c_b(x^{(i)})\}}{\sum_{i=1}^{n} w_i}$
(c) Compute voting weights: $\alpha_b = \log\!\left(\frac{1-\epsilon_b}{\epsilon_b}\right)$ or $\alpha_b = \frac{1}{2}\log\!\left(\frac{1-\epsilon_b}{\epsilon_b}\right)$
(d) Recompute weights: $w_i = w_i \exp\!\left(\alpha_b \,\mathbb{I}\{y_i \neq c_b(x^{(i)})\}\right)$
3. Return classifier $\hat{c}_B(x^{(i)}) = \operatorname{sgn}\!\left(\sum_{b=1}^{B} \alpha_b c_b(x^{(i)})\right)$
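To make the loop above concrete, here is a minimal sketch of the boosting loop in Python (the homework accepts R or Python). The weak-learner interface is an assumption for illustration: `fit_weak_learner(X, y, w)` is taken to return an object with a `predict` method, which the assignment does not prescribe; the decision stump described below is one way to fill that role.

```python
import numpy as np

def adaboost(X, y, fit_weak_learner, B):
    """Sketch of AdaBoost. X: (n, d) array; y: labels in {-1, +1}.
    fit_weak_learner(X, y, w) is an assumed interface returning a
    classifier with a .predict(X) method (e.g. a decision stump)."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)                        # step 1: uniform weights
    learners, alphas = [], []
    for b in range(B):
        c_b = fit_weak_learner(X, y, w)            # step 2(a)
        miss = c_b.predict(X) != y                 # I{y_i != c_b(x^(i))}
        eps = np.sum(w * miss) / np.sum(w)         # step 2(b): weighted error
        eps = np.clip(eps, 1e-10, 1 - 1e-10)       # guard the log in this sketch
        alpha = np.log((1 - eps) / eps)            # step 2(c): voting weight
        w = w * np.exp(alpha * miss)               # step 2(d): upweight mistakes
        learners.append(c_b)
        alphas.append(alpha)
    def classify(X_new):                           # step 3: sign of weighted vote
        votes = sum(a * c.predict(X_new) for a, c in zip(alphas, learners))
        return np.sign(votes)
    return classify
```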
Decision stumps: Recall that a stump classifier $c$ is defined by
$$c(x \mid j, \theta, m) := \begin{cases} +m & x_j > \theta \\ -m & \text{otherwise} \end{cases} \qquad (1)$$
Since the stump ignores all entries of $x$ except $x_j$, it is equivalent to a linear classifier defined by an affine hyperplane. The plane is orthogonal to the $j$th axis, with which it intersects at $x_j = \theta$. The orientation of the hyperplane is determined by $m \in \{-1, +1\}$. We will employ stumps as weak learners in our boosting algorithm.
To train stumps on weighted data, use the learning rule
$$(j^*, \theta^*) := \arg\min_{j,\,\theta} \frac{\sum_{i=1}^{n} w_i \mathbb{I}\{y_i \neq c(x^{(i)} \mid j, \theta, m)\}}{\sum_{i=1}^{n} w_i}. \qquad (2)$$
In the implementation of your training routine, first determine an optimal parameter $\theta_j^*$ for each dimension $j = 1, \dots, d$, and then select the $j^*$ for which the cost term in (2) is minimal.
Note: If the data is sampled using the weights $w_i$, then the decision stump can be trained using a loss function other than the weighted 0-1 loss. Using other loss functions to train the weak learner is technically not correct; however, the results are similar.
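As an illustration of rule (2), here is a minimal weighted stump trainer in Python; the class name `Stump` and the exhaustive scan over observed values as thresholds are illustrative choices, not requirements of the assignment. It matches the `fit_weak_learner` interface assumed in the earlier sketch.

```python
import numpy as np

class Stump:
    """Decision stump c(x | j, theta, m) from equation (1)."""
    def __init__(self, j, theta, m):
        self.j, self.theta, self.m = j, theta, m
    def predict(self, X):
        return np.where(X[:, self.j] > self.theta, self.m, -self.m)

def train_stump(X, y, w):
    """Exhaustive search over (j, theta, m) minimizing the weighted 0-1 loss (2).
    O(d * n^2) as written; adequate for a first implementation."""
    n, d = X.shape
    w = w / np.sum(w)                      # normalize so errors lie in [0, 1]
    best, best_loss = None, np.inf
    for j in range(d):
        for theta in np.unique(X[:, j]):   # candidate thresholds: observed values
            pred = np.where(X[:, j] > theta, 1, -1)
            err = np.sum(w * (pred != y))  # weighted error for m = +1
            for m, loss in ((1, err), (-1, 1.0 - err)):  # flipping m flips the error
                if loss < best_loss:
                    best, best_loss = Stump(j, theta, m), loss
    return best
```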
Homework problems:
1. Implement the AdaBoost algorithm in R.
2. Run your algorithm on the USPS data (use the training and test data for the 3s and 8s) and evaluate your
results using cross validation.
More precisely: your AdaBoost algorithm returns a classifier that is a combination of $B$ weak learners. Since it is an incremental algorithm, we can evaluate AdaBoost at every iteration $b$ by considering the sum up to the $b$-th weak learner. At each iteration, perform 5-fold cross validation to estimate the training and test error of the current classifier (that is, the errors measured on the cross-validation training and test sets, respectively). A sketch of this staged evaluation appears after this list.
3. Plot the training error and the test error as a function of $b$.
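As referenced in item 2, here is a minimal Python sketch of the staged evaluation, assuming the boosting run exposes its lists of voting weights and trained learners (the earlier `adaboost` sketch would need to return `alphas` and `learners` rather than only the final classifier); wrapping this in a 5-fold split then gives the cross-validated curves:

```python
import numpy as np

def staged_errors(X_train, y_train, X_test, y_test, alphas, learners):
    """Misclassification rate of the partial ensemble sgn(sum_{b'<=b} alpha_b' c_b')
    for every b, on a single train/test split."""
    F_train = np.zeros(len(y_train))
    F_test = np.zeros(len(y_test))
    train_err, test_err = [], []
    for alpha, c in zip(alphas, learners):
        F_train += alpha * c.predict(X_train)   # running weighted vote
        F_test += alpha * c.predict(X_test)
        train_err.append(np.mean(np.sign(F_train) != y_train))
        test_err.append(np.mean(np.sign(F_test) != y_test))
    return train_err, test_err                  # plot these against b
```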
Submission. Please make sure your solution contains the following:
• Your implementation of AdaBoost.
• Plots of your results (training error and cross-validated test error).
Problem 2 (Basic Theory Related to the Lasso [30 points])
2.i Consider the univariate lasso objective function with no bias:
$$Q(\beta) = \frac{1}{2n}\sum_{i=1}^{n} (y_i - x_i\beta)^2 + \lambda|\beta|$$
Also suppose $x$ is scaled using the formula
$$x_i := \frac{x_i}{\sqrt{\tfrac{1}{n}\sum_{i=1}^{n} x_i^2}}, \qquad i = 1, 2, \dots, n,$$
so that $\frac{1}{n}\sum_{i=1}^{n} x_i^2 = 1$ after scaling. Derive a closed-form expression for the lasso solution, i.e., show that $Q(\beta)$ is minimized at
$$\hat{\beta} = \begin{cases} \frac{1}{n}\sum_i x_i y_i - \lambda & \text{if } \frac{1}{n}\sum_i x_i y_i > \lambda \\ 0 & \text{if } -\lambda \le \frac{1}{n}\sum_i x_i y_i \le \lambda \\ \frac{1}{n}\sum_i x_i y_i + \lambda & \text{if } \frac{1}{n}\sum_i x_i y_i < -\lambda \end{cases}$$
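As a numerical sanity check (not part of the derivation the problem asks for), the closed form above can be compared against a brute-force grid minimization of $Q(\beta)$; all names and the simulated data here are illustrative:

```python
import numpy as np

def lasso_univariate(x, y, lam):
    """Closed-form univariate lasso solution (soft thresholding),
    assuming x is scaled so that np.mean(x**2) == 1."""
    z = np.mean(x * y)                         # (1/n) sum_i x_i y_i
    return np.sign(z) * max(abs(z) - lam, 0.0)

rng = np.random.default_rng(0)
x = rng.normal(size=200)
x = x / np.sqrt(np.mean(x ** 2))               # the scaling from the problem
y = 0.7 * x + rng.normal(size=200)
lam = 0.3

betas = np.linspace(-2.0, 2.0, 4001)           # brute-force grid over beta
Q = 0.5 * np.mean((y[:, None] - np.outer(x, betas)) ** 2, axis=0) + lam * np.abs(betas)
print(lasso_univariate(x, y, lam), betas[np.argmin(Q)])   # should nearly agree
```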
2.ii Consider the multivariate lasso objective function with no bias:
$$Q(\beta) = Q(\beta_1, \beta_2, \dots, \beta_p) = \frac{1}{2n}\sum_{i=1}^{n} \Big(y_i - \sum_{j=1}^{p} \beta_j x_{ij}\Big)^2 + \lambda\sum_{j=1}^{p} |\beta_j|$$
Also suppose that the $j$th feature $x_j$ is scaled using the formula
$$x_{ij} := \frac{x_{ij}}{\sqrt{\tfrac{1}{n}\sum_{i=1}^{n} x_{ij}^2}}, \qquad i = 1, 2, \dots, n; \quad j = 1, 2, \dots, p.$$
Solve the equation $\frac{\partial Q}{\partial \beta_j} = 0$ for $\beta_j$. Your final answer should be
$$\beta_j = S_\lambda\!\left(\frac{1}{n}\, x_j^{T} r^{(j)}\right),$$
where $x_j$ is the $j$th feature, $r^{(j)}$ is the partial residual, and $S_\lambda$ is the soft thresholding operator. See the lecture slides for further details.
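The coordinate-wise update above is the building block of coordinate-descent lasso solvers; the following is a minimal sketch of that idea (an illustration, not the derivation the problem asks for), assuming each column of $X$ is scaled as described so that $\frac{1}{n}\sum_i x_{ij}^2 = 1$:

```python
import numpy as np

def soft_threshold(z, lam):
    """S_lambda(z) = sign(z) * max(|z| - lambda, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for the lasso, assuming every column
    of X satisfies np.mean(X[:, j]**2) == 1."""
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_sweeps):
        for j in range(p):
            r_j = y - X @ beta + X[:, j] * beta[j]        # partial residual r^(j)
            beta[j] = soft_threshold(X[:, j] @ r_j / n, lam)
    return beta
```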
Problem 3 (Regression Trees, ISL 8.4 [20 points])
3.i Sketch the tree corresponding to the partition of the predictor space illustrated in the left-hand panel of Figure 1. The numbers inside the boxes indicate the mean of $Y$ within each region.
3.ii Create a diagram similar to the left-hand panel of Figure 1, using the tree illustrated in the right-hand panel of the same figure. You should divide up the predictor space into the correct regions, and indicate the mean for each region.
Figure 1: Left: A partition of the predictor space corresponding to part 3.i. Right: A tree corresponding to part 3.ii.