# in Code

## Recent Entries (Page 2)

• ### In Memory of Ertugrul Söylemez (1985 - 2018)

I was heartbroken to find out recently that Ertugrul Söylemez has passed away suddenly. Many have come forward to express their sadness about this passing and how much this has impacted the Haskell community, and how much of a loss it is for functional programming at large.

They aren’t wrong; Ertugrul was one of the faces of the friendly, warm, encouraging, patient teaching that the Haskell community has grown to be known for. Ertugrul was also one of the original pioneers in the implementation and theory of Functional Reactive Programming and continued to innovate even through this year. His name is now and forever will be synonymous with the “pull-based” variant of functional reactive programming. And the freenode #haskell channel and the Haskell community at large will have one less friendly face who always enjoyed helping new people learn.

However, I wanted to just put some words down about his personal influence on my Haskell, academic, and FOSS career.

When I was a new Haskeller, a lot of things confused me. But the passion of people like Ertugrul to help me understand concepts that I found interesting late into the night was one of the things that really made it worth it.

One of these conversations led to the creation of my first ever Haskell library, auto. auto is essentially a direct translation of one of our conversations (and somewhat of a derivative of his own library netwire), and throughout the entire implementation process he was open to the many questions I had. And some of the features of the library (including implicit serialization) were directly his innovations put into practice.

In a slightly different context, as a new PhD student, I was told to follow wherever my curiosity led me. One of those lines led me to “comonadic” image processing, which was directly inspired by this unfinished article of his.

• ### A Purely Functional Typed Approach to Trainable Models (Part 3)

Hi again! Today we’re going to jump straight into tying together the functional framework described in this series and seeing how it can give us some interesting insight. We’ll then wrap up by talking about the scaffolding needed to turn this all into a working system you can apply today.

The name of the game is a purely functional typed approach to writing trainable models using differentiable programming. Be sure to check out Part 1 and Part 2 if you haven’t, because this is a direct continuation.

My favorite part about this system really is how we have pretty much free rein over how we can combine and manipulate our models, since they are just functions. Combinators (a word I’m going to be using to mean higher-order functions that return functions) tie everything together so well. Some models we might have thought were standalone entities might just be derivable from other models using basic functional combinators. And the best part is that they’re never necessary; just helpful.
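To make the idea of a combinator concrete, here is a minimal sketch. It is not the code from the series (which works with backprop's differentiable values); it treats a model as a plain function, and the names `Model`, `linear`, `mapOutput`, `<~`, `logistic`, and `twoLayer` are all made up for this illustration.

```haskell
-- A model: pick parameters p, get back a function from a to b.
type Model p a b = p -> a -> b

-- Hypothetical base model: a line with parameters (weight, bias).
linear :: Model (Double, Double) Double Double
linear (w, b) x = w * x + b

-- Combinator: post-process the output of any model.
mapOutput :: (b -> c) -> Model p a b -> Model p a c
mapOutput g f p = g . f p

-- Combinator: feed one model into another, pairing up their parameters.
infixr 8 <~
(<~) :: Model p b c -> Model q a b -> Model (p, q) a c
(f <~ g) (p, q) = f p . g q

sigmoid :: Double -> Double
sigmoid z = 1 / (1 + exp (negate z))

-- Logistic regression falls out of linear regression plus a combinator.
logistic :: Model (Double, Double) Double Double
logistic = mapOutput sigmoid linear

-- Two logistic models chained together, again with no new machinery.
twoLayer :: Model ((Double, Double), (Double, Double)) Double Double
twoLayer = logistic <~ logistic
```

Nothing here is necessary: `logistic` could just as well be written out by hand. The combinators only make the relationship between the models explicit.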

Again, if you want to follow along, the source code for the written code in this module is available on github.

• ### A Purely Functional Typed Approach to Trainable Models (Part 2)

Welcome back! We’re going to be jumping right back into describing a vision of a purely functional typed approach to writing trainable models using differentiable programming. If you’re just joining us, be sure to check out Part 1 first!

In the last post, we looked at models as “question and answer” systems. We described them as essentially being functions of type

$f : P \rightarrow (A \rightarrow B)$

Where, for $f_p(x) = y$, you have a “question” $x : A$ and are looking for an “answer” $y : B$. Picking a different $p : P$ will give a different $A \rightarrow B$ function. We claimed that training a model was finding just the right $p$ to use with the model to yield the right $A \rightarrow B$ function that models your situation.

We then noted that if you have a set of $(a, b)$ observations, and your function is differentiable, you can find the gradient of the error of your model on each observation with respect to $p$, which tells you how to nudge a given $p$ in order to reduce how wrong your model is for that observation. By repeatedly making observations and taking those nudges, you can arrive at a suitable $p$ to model any situation.
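The recap above can be sketched in a few lines of plain Haskell. This is only an illustration under simplifying assumptions: the gradient is derived by hand for one specific model and a squared-error loss, whereas the series obtains gradients through automatic differentiation with backprop. The names `Model`, `Params`, `linReg`, `gradSqErr`, `step`, and `train` are hypothetical.

```haskell
import Data.List (foldl')

-- A model: pick parameters p, get back a function from a to b.
type Model p a b = p -> a -> b

-- Parameters of the example model: (slope, intercept).
type Params = (Double, Double)

-- Hypothetical example model: a straight line.
linReg :: Model Params Double Double
linReg (slope, intercept) x = slope * x + intercept

-- Hand-derived gradient of the squared error (f_p(x) - y)^2
-- with respect to (slope, intercept), for a single observation (x, y).
gradSqErr :: Params -> (Double, Double) -> Params
gradSqErr p (x, y) = (2 * err * x, 2 * err)
  where
    err = linReg p x - y

-- One "nudge": move p a small step against the gradient.
step :: Double -> Params -> (Double, Double) -> Params
step rate p@(slope, intercept) obs =
  (slope - rate * dSlope, intercept - rate * dIntercept)
  where
    (dSlope, dIntercept) = gradSqErr p obs

-- Training: fold the nudges over a stream of observations.
train :: Double -> [(Double, Double)] -> Params -> Params
train rate obs p0 = foldl' (step rate) p0 obs
```

For example, feeding `train 0.05` a few thousand observations drawn from the line $y = 2x + 1$, starting from `(0, 0)`, moves the parameters toward a slope of 2 and an intercept of 1.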

This is great if we consider a model as “question and answer”, but sometimes things don’t fit so cleanly. Today, we’re going to be looking at a whole different type of model (“time series” models) and see how they are different, but also how they are really the same.

• ### A Purely Functional Typed Approach to Trainable Models (Part 1)

With the release of backprop, I’ve been exploring the space of parameterized models of all sorts, from linear and logistic regression and other statistical models to artificial neural networks, feed-forward and recurrent (stateful). I wanted to see to what extent we can really apply automatic differentiation and iterative gradient descent-based training to all of these different models. Basically, I wanted to see how far we can take differentiable programming (a la Yann LeCun) as a paradigm for writing trainable models.

Building on other writers, I’m starting to see a picture unifying all of these models, painted in the language of purely typed functional programming. I’m already applying these to models I’m using in real life and in my research, and I thought I’d take some time to put my thoughts to writing in case anyone else finds these illuminating or useful.

As a big picture, I really believe that a purely functional typed approach to differentiable programming is the way to move forward in the future for models like artificial neural networks. In this light, the drawbacks of object-oriented and imperative approaches become very apparent.

I’m not the first person to attempt to build a conceptual framework for these types of models in a purely functional typed sense: Christopher Olah’s famous 2015 post is a great piece that this post heavily builds off of, and is definitely worth a read! We’ll be taking some of his ideas and seeing how they work in real code!

This will be a three-part series, and the intended audience is people who have a passing familiarity with statistical modeling or machine learning/deep learning. The code in these posts is written in Haskell, using the backprop and hmatrix (with hmatrix-backprop) libraries, but the main themes and messages won’t be about Haskell, but rather about differentiable programming in a purely functional typed setting in general. This isn’t a Haskell post as much as it is an exploration, using Haskell syntax/libraries to implement the points. The backprop library is roughly equivalent to autograd in Python, so all of the ideas apply there as well.

The source code for the written code in this module is available on github, if you want to follow along!

• ### The Const Applicative and Monoids

The Applicative typeclass has a somewhat infamous reputation for having opaque laws. There are a lot of great alternative rephrasings of these laws, from many different approaches. For this post, however, I want to talk about Applicative in terms of one of my favorites: Const.
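For readers who have not met it, the connection named in the title can be seen directly in the standard `Const` instance from base, where `pure` produces `mempty` and `<*>` combines with `<>`. The snippet below is a small sketch of that behavior, not an excerpt from the post; `emptyLog`, `combined`, and `sumVia` are names invented here.

```haskell
import Data.Functor.Const (Const (..))
import Data.Monoid (Sum (..))

-- pure throws its argument away and stores mempty instead.
emptyLog :: Const String Int
emptyLog = pure 42

-- <*> never touches any "values"; it only combines the monoid parts.
combined :: Const String Int
combined = Const "ab" <*> Const "cd"

-- Traversing with Const therefore behaves like a monoidal fold.
sumVia :: [Int] -> Int
sumVia = getSum . getConst . traverse (Const . Sum)
```

Here `getConst emptyLog` is the empty string, `getConst combined` is `"abcd"`, and `sumVia [1, 2, 3]` is `6`, so the Applicative operations on `Const m` are the Monoid operations on `m` in disguise.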

• ### You Could Have Invented Matrices!

You could have invented matrices!

Let’s talk about vectors. A vector (denoted as $\mathbf{x}$, a lower-case bold italicized letter) is an element in a vector space, which means that it can be “scaled”, like $c \mathbf{x}$ (the $c$ is called a “scalar” — creative name, right?) and added, like $\mathbf{x} + \mathbf{y}$.

In order for vector spaces and their operations to be valid, they just have to obey some common-sense rules (like associativity, commutativity, distributivity, etc.) that allow us to make meaningful conclusions.
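Written out for vectors $\mathbf{x}, \mathbf{y}, \mathbf{z}$ and scalars $c, d$, a few of these rules look like the following (an illustrative selection, not the full list of vector space axioms):

$$
\begin{aligned}
(\mathbf{x} + \mathbf{y}) + \mathbf{z} &= \mathbf{x} + (\mathbf{y} + \mathbf{z}) && \text{(associativity)} \\
\mathbf{x} + \mathbf{y} &= \mathbf{y} + \mathbf{x} && \text{(commutativity)} \\
c (\mathbf{x} + \mathbf{y}) &= c \mathbf{x} + c \mathbf{y} && \text{(distributivity over vector addition)} \\
(c + d) \mathbf{x} &= c \mathbf{x} + d \mathbf{x} && \text{(distributivity over scalar addition)}
\end{aligned}
$$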

• ### Introducing the backprop library

backprop: hackage / github

I’m excited to announce the first official release of the backprop library (currently at version 0.1.3.0 on hackage)! backprop is a library that allows you to write functions on your heterogeneous values like you would normally, and then takes them and (with reverse-mode automatic differentiation) automatically generates functions computing their gradients. backprop differs from the related ad by working with functions using and transforming different types, instead of only one monomorphic scalar type.

This has been something I’ve been working on for a while (trying to find a good API for heterogeneous automatic differentiation), and I’m happy to finally find something that I feel good about, with the help of a lens-based API.

As a quick demonstration, this post will walk through the creation of a simple neural network implementation (inspired by the TensorFlow Tutorial for beginners) to learn handwritten digit recognition for the MNIST data set. To help tell the story, we’re going to implement it “normally”, using the hmatrix library API, and then rewrite the same thing using backprop and hmatrix-backprop (a drop-in replacement for hmatrix).

• ### "Interpreters a la Carte" in Advent of Code 2017 Duet

This post is just a fun one exploring a wide range of techniques that I applied to solve the Day 18 puzzles of this past year’s great Advent of Code. The puzzles involved interpreting an assembly language on an abstract machine. The twist is that Part A gave you a description of one abstract machine, and Part B gave you a different abstract machine to interpret the same language in.

This twist (one language, but different interpreters/abstract machines) is basically one of the textbook applications of the interpreter pattern in Haskell and functional programming, so it was fun to implement my solution in that pattern — the assembly language source was “compiled” to an abstract monad once, and the difference between Part A and Part B was just a different choice of interpreter.
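As a rough sketch of "one program, two interpreters", here is a tiny hand-rolled version using only base. It is not the solution from the post (which builds on the operational package), and the two-instruction language, `Prog`, `emit`, `recv`, `program`, `runA`, and `runB` are all invented for illustration; they only loosely echo the puzzle.

```haskell
-- A program is a tree of instructions, each holding its continuation.
data Prog a
  = Done a
  | Emit Int (Prog a)
  | Recv (Int -> Prog a)

instance Functor Prog where
  fmap f (Done a)   = Done (f a)
  fmap f (Emit n k) = Emit n (fmap f k)
  fmap f (Recv k)   = Recv (fmap f . k)

instance Applicative Prog where
  pure = Done
  pf <*> pa = pf >>= \f -> fmap f pa

instance Monad Prog where
  Done a   >>= f = f a
  Emit n k >>= f = Emit n (k >>= f)
  Recv k   >>= f = Recv (\n -> k n >>= f)

emit :: Int -> Prog ()
emit n = Emit n (Done ())

recv :: Prog Int
recv = Recv Done

-- Written once, with no commitment to what emit and recv mean.
program :: Prog Int
program = do
  emit 1
  x <- recv
  emit (x * 2)
  pure (x + 1)

-- Interpreter A: recv hands back the most recently emitted value (0 if none).
runA :: Prog a -> (a, [Int])
runA = go 0
  where
    go _ (Done a)      = (a, [])
    go _ (Emit n k)    = let (a, out) = go n k in (a, n : out)
    go lastN (Recv k)  = go lastN (k lastN)

-- Interpreter B: recv reads from an inbox and gives up when it is empty.
runB :: [Int] -> Prog a -> Maybe (a, [Int])
runB _ (Done a)            = Just (a, [])
runB inbox (Emit n k)      = fmap (\(a, out) -> (a, n : out)) (runB inbox k)
runB (i : rest) (Recv k)   = runB rest (k i)
runB [] (Recv _)           = Nothing
```

The same `program` gives `(2, [1, 2])` under `runA` and `Just (11, [1, 20])` under `runB [10]`; only the choice of interpreter differs.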

Even more interesting is that the two machines are only “half different”: there’s one aspect of the virtual machines that is the same between the two parts, and one aspect that is different. This means that we can apply the “data types a la carte” technique in order to mix and match isolated components of virtual machine interpreters, and re-use code whenever possible in assembling our interpreters for our different machines! This can be considered an extension of the traditional interpreter pattern: the modular interpreter pattern.

This blog post will not necessarily be a focused tutorial on this trick/pattern, but rather an explanation of my solution centered around this pattern, where I will also add insight into how I approach and solve non-trivial Haskell problems. We’ll be using the operational package to implement our interpreter pattern program and the type-combinators package to implement the modularity aspect, and along the way we’ll also use mtl typeclasses and classy lenses.

The source code is available online and is executable as a stack script. This post is written to be accessible for early-intermediate Haskell programmers.