(don't read this post if you are eating because it might make you throw up)
This blog contains stuff that I don't want to forget but might be useful to someone else too. Posts are short and to the point with enough detail for future me to make sense of.
Showing posts with label scala. Show all posts
Showing posts with label scala. Show all posts
Thursday, 16 August 2018
Wednesday, 15 August 2018
Thursday, 25 February 2016
Algorithm to rotate a matrix using transpose
This is a horrible question but there is a neat trick.
To rotate 90 degrees just take transpose then reverse the rows. To rotate -90 (270) degrees take the transpose then reverse the columns.
1 2
3 4
5 6
1 3 5
2 4 6
Transpose and reverse rows:
5 3 1
6 4 2
Voila.
Transpose and reverse columns:
2 4 6
1 3 5
To rotate 90 degrees just take transpose then reverse the rows. To rotate -90 (270) degrees take the transpose then reverse the columns.
val matrix = Array(Array(1,2), Array(3,4), Array(5,6))
print(matrix)
1 2
3 4
5 6
Transpose
def transpose(matrix: Array[Array[Int]]): Array[Array[Int]] = {
matrix.head.indices.map(i => matrix.map(_(i))).toArray
}
print(transpose(matrix))
1 3 52 4 6
Rotate 90 degrees clockwise
Transpose and reverse rows:
def rotate90(matrix: Array[Array[Int]]): Array[Array[Int]] = {
transpose(matrix).map(_.reverse)
}
print(rotate90(matrix))
5 3 1
6 4 2
Voila.
Rotate 90 degrees anticlockwise
Transpose and reverse columns:
def rotateMinus90(matrix: Array[Array[Int]]): Array[Array[Int]] = {
val t = transpose(matrix)
t.head.indices.foreach { i =>
(0 until (t.length/2)).foreach { j =>
val temp = t(j)(i)
t(j)(i) = t(t.length - j - 1)(i)
t(t.length - j - 1)(i) = temp
}
}
t
}
print(rotateMinus90(matrix))
2 4 6
1 3 5
Full code:
Pimp my library pattern
Let's say I want to add a function to the String class to 'calibrate' data for comparison, eg remove trailing whitespace/convert to lowercase/remove punctuation
I want to call it like " StrING ".calibrate which would give me "string"
Then from another class
I want to call it like " StrING ".calibrate which would give me "string"
import scala.language.implicitConversions
object StringCalibrationImplicits {
class CalibratedString(s: String) {
def calibrate = {
// also might wanna filter out punctuation marks or whatever s.trim.toLowerCase
}
}
implicit def calibrate(s: String): CalibratedString = new CalibratedString(s)
}
Then from another class
import com.ojha.StringCalibrationImplicits._
val s = " StrING "s.calibrate
Tagging in ScalaTest
Really useful for separating tests running in different CI builds.
I usually define a tags.scala which contains the tag definitions:
Then tag tests like this:
Then from the sbt console to run only the unit tests:
"test-only -- -n com.ojha.UnitTest"
To run everything except integration tests:
"test-only -- -l com.ojha.IntegrationTest"
I usually define a tags.scala which contains the tag definitions:
package com.ojha
import org.scalatest.Tag
object UnitTest extends Tag("com.ojha.UnitTest")
object IntegrationTest extends Tag("com.ojha.IntegrationTest")
Then tag tests like this:
it should "be a cat" taggedAs UnitTest in {
}
Then from the sbt console to run only the unit tests:
"test-only -- -n com.ojha.UnitTest"
To run everything except integration tests:
"test-only -- -l com.ojha.IntegrationTest"
Tuesday, 16 February 2016
Quicksort in scala
Basic Algorithm
Step 1: Choose pivot p as some element in the array.
Step 2: Partition the array so that everything on the left of p is < p and everything on the right > p (this step is $O(n)$).
Step 3: Recurse on the left and right.
Best and worse cases
Intuitively the best case is when the pivot happens to be the median each time. $O(n * log(n))$
The worst case would be choosing pivot = first element where the array is already sorted. $O(n^2)$
Implementation
There are two main implementation to consider. The first is the most intuitive:
Analysis of Lomuto quicksort
We can see that the partition method is $O(n) = a*n + c_1$
The check 'if (r < l)' occurs in constant time, $c_2$
The go method is more interesting. We can say that in the best case, when the partition is at the median every time
$$T(n) = 2 T (\frac{n}{2}) + (a * n + c_1) + c_2 = 2 T (\frac{n}{2}) + a * n + c_3 $$
Constant doesn't really make much difference as $n \rightarrow \infty $ so let's drop it:
$$T(n) = 2 T (\frac{n}{2}) + a * n $$
$$T(\frac{n}{2}) = 2 T (\frac{n}{4}) + \frac{a * n}{2} $$
$$ => T(n) = 2 \{ 2 T (\frac{n}{4}) + \frac{a * n}{2} \} + a * n $$
$$ => T(n) = 4 T (\frac{n}{4}) + 2 * a * n $$
$$ => T(n) = 2^k * T(\frac{n}{2^k}) + k * a * n $$
We want to find this formula in terms of $T(1)$
$$ 1 = \frac{n}{2^k}$$
$$ => 2^k = n $$
$$ => k = log_2(n) $$
Substitute this value of $k$ into the earlier formula:
$$ => T(n) = 2^{log_2(n)} * T(1) + log_2(n) * a * n $$
$$ => T(n) = n * T(1) + a * n * log_2(n) $$
$T(1)$ will happen in constant time $c$
$$ => T(n) = n * c + a * n * log_2(n) $$
So we can see that the time complexity should be $n * log_2(n)$
Hoare Partitioning
The other implementation of quicksort is less intuitive. Apparently it does fewer swaps (x3 fewer) than lomuto, but the analysis is too complicated for me to bother with right now.
def hoare(): Unit = {
def swap(i: Int, j:Int) = { val t = a(i); a(i) = a(j); a(j) = t }
def aux(l: Int, r: Int): Unit = {
val p = a(l + r / 2)
var i = l
var j = r
while (i < j) {
while (a(i) < p) i += 1 while (a(j) > p) j -= 1
if (i <= j) { swap(i,j); i+= 1; j -= 1 }
if (l < j) aux(l, j)
if (r > i) aux(i,r)
}
}
aux(0, a.length-1)
}
Thursday, 28 January 2016
Formal definition of a Monad
A monad it a parametric type with two functions defined, $flatMap$ and $unit$.
Monads must satisfy the following three laws:
Associativity law: $$ m.flatMap(f).flatMap(g) = m.flatMap ( x => f(x).flatmap(g)) $$
Left unit law: $$ unit(x).flatMap(f) = f(x) $$
Right unit law: $$ m.flatMap(unit) = m $$
trait Monad[T] { def flatMap[U](f: T => Monad[U]): Monad[U] }
def unit[T](t: T): Monad[T]
Monads must satisfy the following three laws:
Associativity law: $$ m.flatMap(f).flatMap(g) = m.flatMap ( x => f(x).flatmap(g)) $$
Left unit law: $$ unit(x).flatMap(f) = f(x) $$
Right unit law: $$ m.flatMap(unit) = m $$
Wednesday, 27 January 2016
Dijkstra in scala
The easiest way to understand dijkstra is by watching a youtube video.
Step 1: Mark all distance to start node as 0 and distance to all other nodes as infinity.
Step 2: Take current node and update distance to all its linked nodes that are unvisited
Step 3: Choose the next node which is the closest unvisited node.
Repeat steps 2 and 3 until target nodes is marked visited.
The following gist shows the code. Full code on github.
Step 1: Mark all distance to start node as 0 and distance to all other nodes as infinity.
Step 2: Take current node and update distance to all its linked nodes that are unvisited
Step 3: Choose the next node which is the closest unvisited node.
Repeat steps 2 and 3 until target nodes is marked visited.
The following gist shows the code. Full code on github.
Computing edit distance between strings in Scala
Question: Assuming you have a set of n letter dictionary words ($S$), a start word ($s_0 \in S$) and a target word ($s_t \in S$) find the shortest chain of words ($C$) where one letter differs between consecutive words in the chain and the first and last words in the chain are $s_0$ and $s_t$ respectively.
Eg $S = [cat, cab, car, tat, tar, bar], s_0 = cat, s_t = tar $
$ => C = [cat -> car -> bar -> tar]$
To solve this programatically, we can see that this is a graph programming problem - we can build a graph representing the data then find the shortest path within the graph from $s_0$ to $s_t$.

Scala has no graph implementation in the standard collections, but there is a neat little project called scala-graph which you can use.
The interesting part is the computeEdges function. Here is a naive implementation:
For every word in the list we check it against every other unvisisted word in the list meaning that this algorithm is O(n^2) where n = the number of words in the dictionary.
Eg
cat is compared to [ cab, car, tat, tar, bar ]
cab is compared to [ car, tat, tar, bar ]
car is compared to [ tat, tar, bar ]
tat is compared to [ tar, bar ]
tar is compared to [ bar ]
Example: [ cat, cab, car, tat, tar, bar ]
Step 1:
Firstly we compute 'subs' for each word in the dictionary
'cat' -> [ '?at' , 'c?t' , 'ca?' ]
'cab' -> [ '?ab' , 'c?b' , 'ca?' ]
etc...
Step 2:
We build a hashmap from the subs
'?at' -> [ 'cat', 'tat' ]
'c?t' -> [ 'cat' ]
'ca?' -> [ 'cat', 'cab', 'car' ]
'?ab' -> [ 'cab' ]
'c?b' -> [ 'cab' ]
etc...
Step 3:
Now we build up the list of edges. For each word, we get 3 subs for each word. Then we look up each of these 3 subs in the above hashmap to get all the nodes for which the current words has an edge with. We keep a list of nodes we have visited so that we don't add each node twice (the edges are undirected).
Full code is here.
Eg $S = [cat, cab, car, tat, tar, bar], s_0 = cat, s_t = tar $
$ => C = [cat -> car -> bar -> tar]$
Top level solution
To solve this programatically, we can see that this is a graph programming problem - we can build a graph representing the data then find the shortest path within the graph from $s_0$ to $s_t$.

Scala has no graph implementation in the standard collections, but there is a neat little project called scala-graph which you can use.
def computeEditDistance(start: String, end: String, dictionary: Set[String]): Option[Int] = {
val graph = Graph.from(dictionary, computeEdges(dictionary))
val path = graph.get(start).shortestPathTo(graph.get(end))
path.map(_.edges.size)
}
How to compute the edges
The interesting part is the computeEdges function. Here is a naive implementation:
private def computeEdgesSlowly(dictionary: Set[String]): List[UnDiEdge[String]] = {
val (_, edges)
= dictionary.foldLeft (dictionary.drop(1), List.empty[UnDiEdge[String]]) {
case (acc, word) =>
val (unvisitedWords, edgeList) = acc
val edgesForThisWord = unvisitedWords.collect {
case (w) if differsByOneLetter(w, word)
=> word ~ w
}
(unvisitedWords.drop(1), edgeList ++ edgesForThisWord)
}
edges
}
For every word in the list we check it against every other unvisisted word in the list meaning that this algorithm is O(n^2) where n = the number of words in the dictionary.
Eg
cat is compared to [ cab, car, tat, tar, bar ]
cab is compared to [ car, tat, tar, bar ]
car is compared to [ tat, tar, bar ]
tat is compared to [ tar, bar ]
tar is compared to [ bar ]
Cleverer algorithm for computing edges
It is possible to compute the edges in O(n) which is a huge improvement on the implementation above. :D :DExample: [ cat, cab, car, tat, tar, bar ]
Step 1:
Firstly we compute 'subs' for each word in the dictionary
'cat' -> [ '?at' , 'c?t' , 'ca?' ]
'cab' -> [ '?ab' , 'c?b' , 'ca?' ]
etc...
val mapOfSubs = dictionary.map { word =>
val subs = (0 until word.length) map { case i =>
word.patch(i, Seq('?'), 1)
}
word -> subs
}.toMap
Step 2:
We build a hashmap from the subs
'?at' -> [ 'cat', 'tat' ]
'c?t' -> [ 'cat' ]
'ca?' -> [ 'cat', 'cab', 'car' ]
'?ab' -> [ 'cab' ]
'c?b' -> [ 'cab' ]
etc...
val cachedMap = mutable.HashMap.empty[String, Set[String]]
dictionary.foreach { word =>
val subs = mapOfSubs.get(word)
subs.foreach(o => o.foreach {
sub =>
cachedMap(sub)
= Set(word) ++ cachedMap.getOrElse(sub, Set.empty[String])
})
}
Step 3:
Now we build up the list of edges. For each word, we get 3 subs for each word. Then we look up each of these 3 subs in the above hashmap to get all the nodes for which the current words has an edge with. We keep a list of nodes we have visited so that we don't add each node twice (the edges are undirected).
val visited = mutable.Seq.empty[String]
dictionary.flatMap { word =>
word :+ visited
val subs = mapOfSubs(word)
subs.flatMap { s =>
val links = cachedMap(s)
links.collect {
case l if !visited.contains(l)
=> word ~ l // ~ = undirected edge }
}
}.toSeq
Full code is here.
Wednesday, 20 January 2016
Scala: Pass by name vs pass by value
Scala call by value vs call by name

In scala when you pass by name (using =>) the expression gets evaluated every time it is used.
Passing by name and passing by value will result in the same output if pure functions are used and both evaluations terminate.
Note that in scala the default is passing by value which is more performant.

In scala when you pass by name (using =>) the expression gets evaluated every time it is used.
Passing by name and passing by value will result in the same output if pure functions are used and both evaluations terminate.
Note that in scala the default is passing by value which is more performant.
Subscribe to:
Posts (Atom)
Scala with Cats: Answers to revision questions
I'm studying the 'Scala with Cats' book. I want the information to stick so I am applying a technique from 'Ultralearning...
-
The Question: You have to climb up a ladder. The ladder has exactly N rungs, numbered from 1 to N. With each step, you can ascend b...
-
I'm studying the 'Scala with Cats' book. I want the information to stick so I am applying a technique from 'Ultralearning...
-
Tree: 3 5 2 1 4 6 Preorder Traversal root -> left subtree -> right subtree 3 -> 5 -> ...