Thursday 25 February 2016

Algorithm to rotate a matrix using transpose

This is a horrible question but there is a neat trick.

To rotate 90 degrees just take transpose then reverse the rows. To rotate -90 (270) degrees take the transpose then reverse the columns.

val matrix = Array(Array(1,2), Array(3,4), Array(5,6))
print(matrix)

1 2
3 4
5 6

Transpose

def transpose(matrix: Array[Array[Int]]): Array[Array[Int]] = {
  matrix.head.indices.map(i => matrix.map(_(i))).toArray
}

print(transpose(matrix))

1 3 5
2 4 6

Rotate 90 degrees clockwise 


Transpose and reverse rows:

def rotate90(matrix: Array[Array[Int]]): Array[Array[Int]] = {
  transpose(matrix).map(_.reverse)
}
print(rotate90(matrix))

5 3 1
6 4 2

Voila.

Rotate 90 degrees anticlockwise


Transpose and reverse columns:

def rotateMinus90(matrix: Array[Array[Int]]): Array[Array[Int]] = {
  val t = transpose(matrix)
  t.head.indices.foreach { i =>
    (0 until (t.length/2)).foreach { j =>
      val temp = t(j)(i)
      t(j)(i) = t(t.length - j - 1)(i)
      t(t.length - j - 1)(i) = temp
    }
  }
  t
}
print(rotateMinus90(matrix))

2 4 6
1 3 5

Full code:


Pimp my library pattern

Let's say I want to add a function to the String class to 'calibrate' data for comparison, eg remove trailing whitespace/convert to lowercase/remove punctuation

I want to call it like " StrING   ".calibrate which would give me "string"


import scala.language.implicitConversions
object StringCalibrationImplicits {

  class CalibratedString(s: String) {
    def calibrate = {
      // also might wanna filter out punctuation marks or whatever      s.trim.toLowerCase
    }
  }

  implicit def calibrate(s: String): CalibratedString = new CalibratedString(s)
  
}

Then from another class

import com.ojha.StringCalibrationImplicits._
val s = "  StrING   "s.calibrate



Tagging in ScalaTest

Really useful for separating tests running in different CI builds.

I usually define a tags.scala which contains the tag definitions:

package com.ojha

import org.scalatest.Tag

object UnitTest extends Tag("com.ojha.UnitTest")

object IntegrationTest extends Tag("com.ojha.IntegrationTest")


Then tag tests like this:

it should "be a cat" taggedAs UnitTest in {
  
}


Then from the sbt console to run only the unit tests:

"test-only -- -n com.ojha.UnitTest"

To run everything except integration tests:

"test-only -- -l com.ojha.IntegrationTest"








Tuesday 16 February 2016

Maths puzzle: Scatter graph with horizontal line

Question

If you have a 2D scatter plot with n points, you want to draw a horizontal line such that the perpendicular distance between the line and the points is minimised. 

Answer

Intuitively, you will take the mean of the y-coordinates and draw the line through the y-axis at this point.

Can we prove this a little more formally?

You basically want to find the minimum $k$ for $\sum_{i=0}^{n} (y_i - k)^2$

$$ \frac{d}{dk} \sum_{i=0}^{n} (y_i - k)^2 $$

$$ = \frac{d}{dk} \sum_{i=0}^{n} (y_i^2 + k^2 - 2 * y_i*k) $$

$$ = \sum_{i=0}^{n} (2*k - 2 * y_i) $$

To find minimum, set to 0 and solve:

$$ 0 = \sum_{i=0}^{n} (2*k - 2 * y_i) $$

$$ => 0 = n * k + \sum_{i=0}^{n}  -  y_i $$

$$ => k = \frac{ \sum_{i=0}^{n}  y_i}{n}$$

Second derivative is $n$ which is positive so this is minimum.



Bit twiddling tricks



  • Power of two: if (n & (n-1) == 0) then n is a power of two
  • Odd or even: if (n & 1 == 0) then n is even
  • Set nth bit of number i to 1:  i | (1 << n)
  • Set nth bit of number i to 0: i & ~(1 << n )

Solution to Codility Chapter 10: Peaks


The 12 Days of Christmas

Question. 

On the first day of Christmas my true love gave to me: a partridge in a pear tree

On the second day of Christmas my true love gave to me: two turtle doves and a partridge in a pear tree

Etc.

On the 12th day of Christmas, of which type of bird/person do you have the greatest number? 

Answer:

Exhaustively, on day 12:

1 * 12 partridges
2 * 11 turtle doves
3 * 10 french hens
4 * 9 calling birds
5 * 8 gold rings
6 * 7 geese a-laying
7 * 6 swans a-swimming
8 * 5 maids a-milking
9 * 4 ladies dancing
10 * 3 lords a-leaping
11 * 2 pipers piping
12 * 1 drummers drumming

(yes i had to google what was what on each day)

So the answer is that you have 42 geese and 42 swans.

Intuitively we can say that the biggest value is when we are multiplying the numbers that are closest together, in this case 6*7. So the answer for the max is $floor(\frac{13}{2}) * (13-floor(\frac{13}{2}) )$

How to prove this mathematically:

Firstly, is there a general formula?

Yes, where $i$ is the day you first getting a particular type of animal then clearly on day twelve you have $i*(13 - i) $

To find the max we differentiate:

$$\frac{d}{di} i*(13 - i) $$

$$ = \frac{d}{di} 13i - i^2 $$

$$ = 13 - 2i $$

Set this to zero and solve

$$ 13 - 2i = 0 => i = \frac{13}{2}$$

Is this definitely maximum? Differentiating again = $-2$ which means this is maximal.

Obviously, our data is discrete, so we check the values on either side of 6.5, 6 and 7. In this case they give the same answer: 6 * 7 and 7*6 = 42.

(Fun fact when I was fourteen my friends and I made up our own version of this song. All I can remember is the following:

5 g strings
4 manly hugs
3 french kisses
2 bulging biceps
And a nice piece of ass for me

I've forgotten everything I ever learned in A level chemistry, but it's good to know that the important stuff sticks. )

Quicksort in scala

Basic Algorithm


Step 1: Choose pivot p as some element in the array.
Step 2: Partition the array so that everything on the left of p is < p and everything on the right > p (this step is $O(n)$).
Step 3: Recurse on the left and right.

Best and worse cases


Intuitively the best case is when the pivot happens to be the median each time. $O(n * log(n))$

The worst case would be choosing pivot = first element where the array is already sorted. $O(n^2)$

Implementation


There are two main implementation to consider. The first is the most intuitive:


Analysis of Lomuto quicksort

We can see that the partition method is $O(n) = a*n + c_1$

The check 'if (r < l)' occurs in constant time, $c_2$

The go method is more interesting. We can say that in the best case, when the partition is at the median every time

$$T(n) = 2 T (\frac{n}{2}) + (a * n + c_1) + c_2 = 2 T (\frac{n}{2}) + a * n + c_3  $$

Constant doesn't really make much difference as $n \rightarrow \infty $ so let's drop it:

$$T(n) = 2 T (\frac{n}{2}) + a * n  $$

$$T(\frac{n}{2}) = 2 T (\frac{n}{4}) + \frac{a * n}{2} $$
$$ => T(n) = 2 \{ 2 T (\frac{n}{4}) + \frac{a * n}{2} \} + a * n $$
$$ => T(n) = 4 T (\frac{n}{4}) + 2 * a * n $$
$$ => T(n) = 2^k * T(\frac{n}{2^k}) + k * a * n $$

We want to find this formula in terms of $T(1)$

$$ 1 = \frac{n}{2^k}$$

$$ => 2^k = n $$

$$ => k = log_2(n) $$

Substitute this value of $k$ into the earlier formula:

$$ => T(n) = 2^{log_2(n)} * T(1) + log_2(n) * a * n $$

$$ => T(n) = n * T(1) + a * n * log_2(n)  $$

$T(1)$ will happen in constant time $c$

$$ => T(n) = n * c + a * n * log_2(n)  $$

So we can see that the time complexity should be $n * log_2(n)$

Hoare Partitioning


The other implementation of quicksort is less intuitive. Apparently it does fewer swaps (x3 fewer) than lomuto, but the analysis is too complicated for me to bother with right now.

def hoare(): Unit = {
  def swap(i: Int, j:Int) = { val t = a(i); a(i) = a(j); a(j) = t }

  def aux(l: Int, r: Int): Unit = {
    val p = a(l + r / 2)
    var i = l
    var j = r

    while (i < j) {
      while (a(i) < p) i += 1            while (a(j) > p) j -= 1
      if (i <= j) { swap(i,j); i+= 1; j -= 1 }

      if (l < j) aux(l, j)
      if (r > i) aux(i,r)

    }
  }
  aux(0, a.length-1)
}





Saturday 6 February 2016

Probability Puzzles

Question 1:

Three people are trying to win a game. Each of them is given a hat that is pink or yellow with equal probability ($\frac{1}{2}$). They can each see the colour of the other two hats, but not their own hat colour. They can define a strategy before playing the game but cannot communicate with each other when the game starts. 

Each of them has to guess the colour of their own hat (the guesses are kept secret from each other).  Valid guesses are 'pink',  'yellow' or 'don't know'. 

They win the game if at least one of them guesses correctly and none of them guesses incorrectly. 

Answer 1: 

The strategy should be that if someone sees the other two people wearing the same colour hat, they should guess the opposite colour. If they see two people wearing different coloured hats then should guess 'don't know'. 

The possible outcomes are the following: 

$$ P(3 pink) = \frac{1}{8} $$

$$ P(3 yellow) = \frac{1}{8} $$

$$ P(2 pink, 1 yellow) = \frac{3}{8} $$

$$ P(1 pink, 2 yellow) = \frac{3}{8} $$

Therefore this strategy will work $\frac{3}{4}$ of the time. 

Question 2: 

A plane has 100 passengers who have each been allocated one of 100 seats. However the first passenger to board the plane chooses his seat randomly (perhaps he lost his ticket). The remaining 99 passengers sit in their own seat if it is unoccupied or also choose a seat randomly if someone is sitting in their allocated seat. What is the probability that the last passenger sits in their own seat? 

Answer 2: 

As soon as somebody sits in the first passenger's seat, they break the cycle and everything after that is fine. Therefore, when the last passenger boards the plane the only seats that can be free are either the first passenger's seat or the last passenger's own seat, with equal probability, so the answer is $\frac{1}{2}. $

Question 3: 

A family has two children.

a) One of them is definitely a girl. What is the probably that the other one is also a girl? 
b) The older one is a girl. What is the probability that the younger one is also a girl? 
c) One of the children is a girl with a very unusual name (let's say Rumplestiltskin). What is the probability that the other child is also a girl? (Note that we don't know whether she is older or younger than her sibling).

Answer 3:

a) Three equal options (G B), (B G) or (G G). Answer = $\frac{1}{3}$ 

b) Two equal options (G B) or (G G). Answer = $\frac{1}{2}$

c) 

Let's say that the overall probability that a girl is called Rumplestiltskin = $\frac{1}{10000}$

Possible events: 

P(Girl called RumplestiltskinBoy) =  $ \frac{1}{10000} * \frac{1}{2} * \frac{1}{2} = \frac{1}{40000}$

P(BoyGirl called Rumplestiltskin) = $ \frac{1}{2} * \frac{1}{2} * \frac{1}{10000} = \frac{1}{40000} $

P(Girl with a normal nameBoy) = $ \frac{9999}{10000} * \frac{1}{2} * \frac{1}{2} = \frac{9999}{40000}$

P(BoyGirl with  normal name) = $ \frac{1}{2} * \frac{1}{2} * \frac{9999}{10000} = \frac{9999}{40000} $

P(Girl with normal nameGirl with normal name) = $\frac{9999}{10000} * \frac{1}{2} * \frac{9999}{10000} * \frac{1}{2} = \frac{99980001}{400000000}$

P(Girl with normal nameGirl called Rumplestiltskin) = $ \frac{9999}{10000} * \frac{1}{2} * \frac{1}{10000} * \frac{1}{2} = \frac{9999}{400000000}$

P(Girl called RumplestiltskinGirl with normal name) = $\frac{1}{10000} * \frac{1}{2} * \frac{9999}{10000} * \frac{1}{2} = \frac{9999}{400000000}$

P(Girl called RumplestiltskinGirl called Rumplestiltskin)* = $\frac{1}{10000} * \frac{1}{2} * \frac{1}{10000} * \frac{1}{2} = \frac{1}{400000000}$ 

Conditional probability: 

$$ P (A \mid B) = \frac{P(A \cap B)}{P(B)}$$

Probability of two girls where one girl is called Rumplestiltskin = $\frac{19999}{400000000}$

Probability that one is called Rumplestiltskin = $\frac{39999}{400000000}$

So the P(two girls given one is called Rumplestiltskin) = 
$$ \frac{19999}{40000000} / \frac{39999}{400000000} = 0.499987... \approx \frac{1}{2}$$

(Yep maths is magic) 

*someone call child services

Question 4:

50% of people that complete in a competition win a prize. 95% of people that won the prize felt that they deserved a prize. 75% of people that did not win a prize thought they deserved a prize. 

Whats the probability of getting a prize in the competition if you feel you deserve one? 

Answer 4:

Bayes theorem:  
$$P(A \mid B) = \frac {P(B \mid A) P(A)} {P(B)} $$

$$ => P(prize \mid deserve) = \frac{ P (deserve \mid prize) P (prize) } { P (deserve) } $$

$$=> P(prize \mid deserve) = \frac{0.95 * 0.5}{0.85} = 0.5588... \approx 0.559 $$

Question 5:

What is the expected number of coin tosses to get n heads in a row? 

Answer 5:

Expected number of coin tosses to get one head: 
$$E_1 = 2$$

To get n heads you first need n-1 heads in a row. Then you can either toss another head with probability $\frac{1}{2}$ or a tail, in which case you need $E_n$ more heads. This defines the following recurrence relation:

$$E_n = \frac{1}{2}(E_{n-1} +1) + \frac{1}{2}(E_{n-1} + 1 + E_n)$$

$$ => E_n = 2 E_{n-1} + 2 $$

Hypothesis: 

$$ E_n = 2^{n+1} - 2$$

With a simple proof by induction we see this is indeed the case. 

Scala with Cats: Answers to revision questions

I'm studying the 'Scala with Cats' book. I want the information to stick so I am applying a technique from 'Ultralearning&#...