Tuesday, 16 February 2016

Maths puzzle: Scatter graph with horizontal line

Question

You have a 2D scatter plot with n points and want to draw a horizontal line such that the sum of the squared perpendicular distances between the line and the points is minimised. Where should the line go?

Answer

Intuitively, you would take the mean of the y-coordinates and draw the line so that it crosses the y-axis at that value.
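
As a quick sanity check with made-up numbers (they are not part of the original puzzle), take three points with y-coordinates 1, 2 and 6. Their mean is 3. Summing the squared distances to the line $y = 3$, and then to the nearby line $y = 2$, gives:

$$ (1 - 3)^2 + (2 - 3)^2 + (6 - 3)^2 = 4 + 1 + 9 = 14 $$

$$ (1 - 2)^2 + (2 - 2)^2 + (6 - 2)^2 = 1 + 0 + 16 = 17 $$

The line through the mean gives the smaller total, at least in this example.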

Can we prove this a little more formally?

You want to find the value of $k$ that minimises $\sum_{i=1}^{n} (y_i - k)^2$, the sum of squared vertical distances from the points to the line $y = k$. Differentiate with respect to $k$:

$$ \frac{d}{dk} \sum_{i=1}^{n} (y_i - k)^2 $$

$$ = \frac{d}{dk} \sum_{i=1}^{n} (y_i^2 + k^2 - 2 y_i k) $$

$$ = \sum_{i=1}^{n} (2k - 2 y_i) $$

To find the minimum, set this to 0 and solve:

$$ 0 = \sum_{i=1}^{n} (2k - 2 y_i) $$

$$ \Rightarrow 0 = n k - \sum_{i=1}^{n} y_i $$

$$ \Rightarrow k = \frac{\sum_{i=1}^{n} y_i}{n} $$

The second derivative is $2n$, which is positive, so this stationary point is a minimum.
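
The result can also be checked numerically. The following is only a minimal sketch: the function names and the sample y-values are made up for illustration, and it simply compares the cost at the mean against the cost at a few nearby values of $k$.

```python
def sum_squared_distances(ys, k):
    """Sum of squared vertical distances from the points to the line y = k."""
    return sum((y - k) ** 2 for y in ys)


def best_horizontal_line(ys):
    """Height of the horizontal line derived above: the mean of the y-values."""
    return sum(ys) / len(ys)


# Hypothetical sample data; the x-coordinates do not matter for a horizontal line.
ys = [2.0, 4.0, 4.0, 10.0]
k_best = best_horizontal_line(ys)
cost_at_mean = sum_squared_distances(ys, k_best)

# Moving the line slightly up or down should always increase the cost.
for delta in (-1.0, -0.1, 0.1, 1.0):
    assert cost_at_mean < sum_squared_distances(ys, k_best + delta)

print(k_best, cost_at_mean)  # prints: 5.0 36.0
```

Only the y-coordinates are passed in because the x-coordinates drop out of the problem: the distance from a point to a horizontal line depends only on the point's height.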


