Multi-task Learning

Original Source: https://www.coursera.org/specializations/deep-learning

Say we want to find out cars, pedestrians, road signs, and traffic lights in an image. There are two ways to do this with machine learning.

First, we can train 4 neural networks with binary classification.

Second, we can make one model predict multiple classes. For example, our model can predict that there are car and pedestrian in an image. This is called transfer learning.

When do we use multi-task learning?

When training on a set of tasks that could benefit from having shared lower-level features
When amount of data you have for each task is quite similar
When you can train a big enough neural network to do well on all tasks

Architecture

Our prediction $\hat{Y}$ and label $Y$ is a matrix of shape $m\times k$ where $k$ is number of classes.

If $m=3$ and $k=4$(car, pedestrian, road sign, traffic light) and $Y=\left[ {\begin{array}{} 1 & 1 & 0 & 1\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 1\\ \end{array} } \right]$ then there are car, pedestrian and traffic light in $X^{(1)}$, pedestrian in $X^{(2)}$, road sign and traffic light in $X^{(3)}$.

Cost

\[J = -\frac{1}{m}\sum_{i=1}^m \sum_{j=1}^k (y_j^{(i)}\log(\hat{y}_j^{(i)})+(1-y_j^{(i)})\log(1-\hat{y}_j^{(i)}))\]

The cost can be computed even if some entries are not labled.

Share on

Twitter Facebook Google+ LinkedIn

YoonSoo

Multi-task Learning

When do we use multi-task learning?

Architecture

Cost

Share on

Leave a Comment

You May Also Enjoy

Generalized Linear Models (GLM)

“ALBERT: A Lite BERT for Self-supervised Learning of Language Representations” Summarized

“Generative Pretraining from Pixels” Summarized

“Language Models are Few-Shot Learners” Summarized