Alexandra's Tech: scala

Showing posts with label scala. Show all posts

Thursday, 16 August 2018

Cake pattern

(don't read this post if you are eating because it might make you throw up)

Wednesday, 15 August 2018

Thursday, 25 February 2016

Algorithm to rotate a matrix using transpose

This is a horrible question but there is a neat trick.

To rotate 90 degrees just take transpose then reverse the rows. To rotate -90 (270) degrees take the transpose then reverse the columns.

val matrix = Array(Array(1,2), Array(3,4), Array(5,6))
print(matrix)

1 2
3 4
5 6

Transpose

def transpose(matrix: Array[Array[Int]]): Array[Array[Int]] = {
  matrix.head.indices.map(i => matrix.map(_(i))).toArray
}

print(transpose(matrix))

1 3 5
2 4 6

Rotate 90 degrees clockwise

Transpose and reverse rows:

def rotate90(matrix: Array[Array[Int]]): Array[Array[Int]] = {
  transpose(matrix).map(_.reverse)
}

print(rotate90(matrix))

5 3 1
6 4 2

Voila.

Rotate 90 degrees anticlockwise

Transpose and reverse columns:

def rotateMinus90(matrix: Array[Array[Int]]): Array[Array[Int]] = {
  val t = transpose(matrix)
  t.head.indices.foreach { i =>
    (0 until (t.length/2)).foreach { j =>
      val temp = t(j)(i)
      t(j)(i) = t(t.length - j - 1)(i)
      t(t.length - j - 1)(i) = temp
    }
  }
  t
}

print(rotateMinus90(matrix))

2 4 6
1 3 5

Full code:

Let's say I want to add a function to the String class to 'calibrate' data for comparison, eg remove trailing whitespace/convert to lowercase/remove punctuation

I want to call it like " StrING ".calibrate which would give me "string"

import scala.language.implicitConversions
object StringCalibrationImplicits {

  class CalibratedString(s: String) {
    def calibrate = {
      // also might wanna filter out punctuation marks or whatever      s.trim.toLowerCase
    }
  }

  implicit def calibrate(s: String): CalibratedString = new CalibratedString(s)
  
}

Then from another class

import com.ojha.StringCalibrationImplicits._
val s = "  StrING   "s.calibrate

Tagging in ScalaTest

Really useful for separating tests running in different CI builds.

I usually define a tags.scala which contains the tag definitions:

package com.ojha

import org.scalatest.Tag

object UnitTest extends Tag("com.ojha.UnitTest")

object IntegrationTest extends Tag("com.ojha.IntegrationTest")

Then tag tests like this:

it should "be a cat" taggedAs UnitTest in {
  
}

Then from the sbt console to run only the unit tests:

"test-only -- -n com.ojha.UnitTest"

To run everything except integration tests:

"test-only -- -l com.ojha.IntegrationTest"

Tuesday, 16 February 2016

Solution to Codility Chapter 10: Peaks

Quicksort in scala

Basic Algorithm

Step 1: Choose pivot p as some element in the array.
Step 2: Partition the array so that everything on the left of p is < p and everything on the right > p (this step is $O(n)$).
Step 3: Recurse on the left and right.

Best and worse cases

Intuitively the best case is when the pivot happens to be the median each time. $O(n * log(n))$

The worst case would be choosing pivot = first element where the array is already sorted. $O(n^2)$

Implementation

There are two main implementation to consider. The first is the most intuitive:

Analysis of Lomuto quicksort

We can see that the partition method is $O(n) = a*n + c_1$

The check 'if (r < l)' occurs in constant time, $c_2$

The go method is more interesting. We can say that in the best case, when the partition is at the median every time

$$T(n) = 2 T (\frac{n}{2}) + (a * n + c_1) + c_2 = 2 T (\frac{n}{2}) + a * n + c_3 $$

Constant doesn't really make much difference as $n \rightarrow \infty $ so let's drop it:

$$T(n) = 2 T (\frac{n}{2}) + a * n $$

$$T(\frac{n}{2}) = 2 T (\frac{n}{4}) + \frac{a * n}{2} $$
$$ => T(n) = 2 \{ 2 T (\frac{n}{4}) + \frac{a * n}{2} \} + a * n $$
$$ => T(n) = 4 T (\frac{n}{4}) + 2 * a * n $$
$$ => T(n) = 2^k * T(\frac{n}{2^k}) + k * a * n $$

We want to find this formula in terms of $T(1)$

$$ 1 = \frac{n}{2^k}$$

$$ => 2^k = n $$

$$ => k = log_2(n) $$

Substitute this value of $k$ into the earlier formula:

$$ => T(n) = 2^{log_2(n)} * T(1) + log_2(n) * a * n $$

$$ => T(n) = n * T(1) + a * n * log_2(n) $$

$T(1)$ will happen in constant time $c$

$$ => T(n) = n * c + a * n * log_2(n) $$

So we can see that the time complexity should be $n * log_2(n)$

Hoare Partitioning

The other implementation of quicksort is less intuitive. Apparently it does fewer swaps (x3 fewer) than lomuto, but the analysis is too complicated for me to bother with right now.

def hoare(): Unit = {
  def swap(i: Int, j:Int) = { val t = a(i); a(i) = a(j); a(j) = t }

  def aux(l: Int, r: Int): Unit = {
    val p = a(l + r / 2)
    var i = l
    var j = r

    while (i < j) {
      while (a(i) < p) i += 1            while (a(j) > p) j -= 1
      if (i <= j) { swap(i,j); i+= 1; j -= 1 }

      if (l < j) aux(l, j)
      if (r > i) aux(i,r)

    }
  }
  aux(0, a.length-1)
}

Thursday, 28 January 2016

Formal definition of a Monad

A monad it a parametric type with two functions defined, $flatMap$ and $unit$.

trait Monad[T] {
  def flatMap[U](f: T => Monad[U]): Monad[U]
}

def unit[T](t: T): Monad[T]

Monads must satisfy the following three laws:

Associativity law: $$ m.flatMap(f).flatMap(g) = m.flatMap ( x => f(x).flatmap(g)) $$

Left unit law: $$ unit(x).flatMap(f) = f(x) $$

Right unit law: $$ m.flatMap(unit) = m $$

Wednesday, 27 January 2016

Dijkstra in scala

The easiest way to understand dijkstra is by watching a youtube video.

Step 1: Mark all distance to start node as 0 and distance to all other nodes as infinity.
Step 2: Take current node and update distance to all its linked nodes that are unvisited
Step 3: Choose the next node which is the closest unvisited node.
Repeat steps 2 and 3 until target nodes is marked visited.

The following gist shows the code. Full code on github.

Computing edit distance between strings in Scala

Question: Assuming you have a set of n letter dictionary words ($S$), a start word ($s_0 \in S$) and a target word ($s_t \in S$) find the shortest chain of words ($C$) where one letter differs between consecutive words in the chain and the first and last words in the chain are $s_0$ and $s_t$ respectively.

Eg $S = [cat, cab, car, tat, tar, bar], s_0 = cat, s_t = tar $
$ => C = [cat -> car -> bar -> tar]$

How to compute the edges

The interesting part is the computeEdges function. Here is a naive implementation:

private def computeEdgesSlowly(dictionary: Set[String]): List[UnDiEdge[String]] = {
  val (_, edges) 
     = dictionary.foldLeft (dictionary.drop(1), List.empty[UnDiEdge[String]]) {
        case (acc, word) =>
          val (unvisitedWords, edgeList) = acc
          val edgesForThisWord = unvisitedWords.collect { 
            case (w) if differsByOneLetter(w, word) 
               => word ~ w 
          }
          (unvisitedWords.drop(1), edgeList ++ edgesForThisWord)
      }
  edges
}

For every word in the list we check it against every other unvisisted word in the list meaning that this algorithm is O(n^2) where n = the number of words in the dictionary.

Eg

cat is compared to [ cab, car, tat, tar, bar ]
cab is compared to [ car, tat, tar, bar ]
car is compared to [ tat, tar, bar ]
tat is compared to [ tar, bar ]
tar is compared to [ bar ]

Cleverer algorithm for computing edges

It is possible to compute the edges in O(n) which is a huge improvement on the implementation above. :D :D

Example: [ cat, cab, car, tat, tar, bar ]

Step 1:

Firstly we compute 'subs' for each word in the dictionary

'cat' -> [ '?at' , 'c?t' , 'ca?' ]
'cab' -> [ '?ab' , 'c?b' , 'ca?' ]
etc...

val mapOfSubs = dictionary.map { word =>
  val subs = (0 until word.length) map { case i =>
    word.patch(i, Seq('?'), 1)
  }
  word -> subs
}.toMap

Step 2:

We build a hashmap from the subs

  '?at' -> [ 'cat', 'tat' ]
  'c?t' -> [ 'cat' ]
  'ca?' -> [ 'cat', 'cab', 'car' ]
  '?ab' -> [ 'cab' ]
  'c?b' -> [ 'cab' ]
  etc...

val cachedMap = mutable.HashMap.empty[String, Set[String]]
dictionary.foreach { word =>
  val subs = mapOfSubs.get(word)
  subs.foreach(o => o.foreach {
    sub => 
      cachedMap(sub) 
        = Set(word) ++ cachedMap.getOrElse(sub, Set.empty[String])
  })
}

Step 3:

Now we build up the list of edges. For each word, we get 3 subs for each word. Then we look up each of these 3 subs in the above hashmap to get all the nodes for which the current words has an edge with. We keep a list of nodes we have visited so that we don't add each node twice (the edges are undirected).

val visited = mutable.Seq.empty[String]
dictionary.flatMap { word =>
  word :+ visited
  val subs = mapOfSubs(word)
  subs.flatMap { s =>
    val links = cachedMap(s)
    links.collect { 
      case l if !visited.contains(l) 
        => word ~ l // ~ = undirected edge    }
  }
}.toSeq

Full code is here.

Wednesday, 20 January 2016

Scala: Pass by name vs pass by value

Scala call by value vs call by name

In scala when you pass by name (using =>) the expression gets evaluated every time it is used.

Passing by name and passing by value will result in the same output if pure functions are used and both evaluations terminate.

Note that in scala the default is passing by value which is more performant.

Alexandra's Tech