in Code

Recent Entries (Page 2)

  • A Purely Functional Typed Approach to Trainable Models (Part 2)

    Welcome back! We’re going to be jumping right back into describing a vision of a purely functional typed approach to writing trainable models using differentiable programming. If you’re just joining us, be sure to check out Part 1 first!

    In the last post, we looked at models as “question and answer” systems. We described them as essentially being functions of type

        f : P \rightarrow (A \rightarrow B)

    Where, for f_p(x) = y, you have a “question” x : A and are looking for an “answer” y : B. Picking a different p : P will give a different A \rightarrow B function. We claimed that training a model was finding just the right p to use with the model to yield the right A \rightarrow B function that models your situation.

    We then noted that if you have a set of (a, b) observations, and your function is differentiable, you can find the gradient of the error of your model on each observation with respect to p, which tells you how to nudge a given p in order to reduce how wrong your model is for that observation. By repeatedly making observations and taking those nudges, you can arrive at a suitable p to model your situation.
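
    As a rough sketch of that training loop (not the code from the series): the names `Model`, `linReg`, `gradErr`, `step` and `train` below are hypothetical, the model is a simple line, and the gradient of the squared error is written out by hand rather than computed by automatic differentiation.

```haskell
import Data.List (foldl')

-- A model takes a parameter p and gives back a "question to answer" function.
type Model p a b = p -> a -> b

-- Hypothetical example model: a line with intercept alpha and slope beta.
linReg :: Model (Double, Double) Double Double
linReg (alpha, beta) x = alpha + beta * x

-- Gradient of the squared error (f_p(x) - y)^2 with respect to
-- p = (alpha, beta), derived by hand for this one model.
gradErr :: (Double, Double) -> (Double, Double) -> (Double, Double)
gradErr (x, y) p = (2 * r, 2 * r * x)
  where
    r = linReg p x - y

-- One nudge: move p against the gradient for a single observation.
step :: Double -> (Double, Double) -> (Double, Double) -> (Double, Double)
step rate (alpha, beta) obs = (alpha - rate * dA, beta - rate * dB)
  where
    (dA, dB) = gradErr obs (alpha, beta)

-- Repeat the nudges over a list of (question, answer) observations.
train :: Double -> (Double, Double) -> [(Double, Double)] -> (Double, Double)
train rate = foldl' (step rate)
```

    Feeding `train` observations sampled from a line such as y = 1 + 2x, with a small enough step size, should move p toward that intercept and slope.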

    This is great if we consider a model as “question and answer”, but sometimes things don’t fit so cleanly. Today, we’re going to be looking at a whole different type of model (“time series” models) and see how they are different, but also how they are really the same.

    Read more … Comments

  • A Purely Functional Typed Approach to Trainable Models (Part 1)

    With the release of backprop, I’ve been exploring the space of parameterized models of all sorts, from linear and logistic regression and other statistical models to artificial neural networks, feed-forward and recurrent (stateful). I wanted to see to what extent we can really apply automatic differentiation and iterative gradient descent-based training to all of these different models. Basically, I wanted to see how far we can take differentiable programming (a la Yann LeCun) as a paradigm for writing trainable models.

    Building on other writers, I’m starting to see a picture unifying all of these models, painted in the language of purely typed functional programming. I’m already applying these to models I’m using in real life and in my research, and I thought I’d take some time to put my thoughts to writing in case anyone else finds these illuminating or useful.

    As a big picture, I really believe that a purely functional typed approach to differentiable programming is the way to move forward in the future for models like artificial neural networks. In this light, the drawbacks of object-oriented and imperative approaches become very apparent.

    I’m not the first person to attempt to build a conceptual framework for these types of models in a purely functional typed sense – Christopher Olah wrote a great piece in 2015 that this post heavily builds off of, and it is definitely worth a read! We’ll be taking some of his ideas and seeing how they work in real code!

    This will be a three-part series, and the intended audience is people who have a passing familiarity with statistical modeling or machine learning/deep learning. The code in these posts is written in Haskell, using the backprop and hmatrix (with hmatrix-backprop) libraries, but the main themes and messages won’t be about Haskell, but rather about differentiable programming in a purely functional typed setting in general. This isn’t a Haskell post as much as it is an exploration, using Haskell syntax/libraries to implement the points. The backprop library is roughly equivalent to autograd in Python, so all of the ideas apply there as well.

    The source code for this post is available on GitHub, if you want to follow along!

    Read more … Comments

  • You Could Have Invented Matrices!

    You could have invented matrices!

    Let’s talk about vectors. A vector (denoted as \mathbf{x}, a lower-case bold italicized letter) is an element in a vector space, which means that it can be “scaled”, like c \mathbf{x} (the c is called a “scalar” — creative name, right?) and added, like \mathbf{x} + \mathbf{y}.

    In order for vector spaces and their operations to be valid, they just have to obey some common-sense rules (like associativity, commutativity, distributivity, etc.) that allow us to make meaningful conclusions.
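
    To make the two operations concrete, here is a minimal sketch in Haskell. The class `VectorSpace`, its methods `scale` and `add`, and the example type `V2` are all hypothetical names chosen for illustration; the rules themselves are not enforced by the compiler and are only noted in comments.

```haskell
-- A hypothetical class capturing the two operations from the text.
-- Instances are expected (but not forced) to obey the usual rules,
-- e.g. scale c (add x y) == add (scale c x) (scale c y).
class VectorSpace v where
  scale :: Double -> v -> v -- "c x"
  add   :: v -> v -> v      -- "x + y"

-- Pairs of Doubles as one example of a vector space.
data V2 = V2 Double Double
  deriving (Eq, Show)

instance VectorSpace V2 where
  scale c (V2 x y) = V2 (c * x) (c * y)
  add (V2 a b) (V2 c d) = V2 (a + c) (b + d)
```

    Checking a rule like distributivity on a few sample values is a quick sanity test for an instance, though it is not a proof.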

    Read more … Comments

  • Introducing the backprop library

    backprop: hackage / github

    I’m excited to announce the first official release of the backprop library (currently at version 0.1.3.0 on hackage)! backprop is a library that allows you to write functions on your heterogeneous values like you would normally, and then takes them and (with reverse-mode automatic differentiation) automatically generates functions computing their gradients. backprop differs from the related ad by working with functions using and transforming different types, instead of only one monomorphic scalar type.

    This has been something I’ve been working on for a while (trying to find a good API for heterogeneous automatic differentiation), and I’m happy to finally find something that I feel good about, with the help of a lens-based API.

    As a quick demonstration, this post will walk through the creation of a simple neural network implementation (inspired by the TensorFlow Tutorial for beginners) to learn handwritten digit recognition for the MNIST data set. To help tell the story, we’re going to be implementing it “normally”, using the hmatrix library API, and then re-write the same thing using backprop and hmatrix-backprop (a drop-in replacement for hmatrix).

    Read more … Comments

  • "Interpreters a la Carte" in Advent of Code 2017 Duet

    This post is just a fun one exploring a wide range of techniques that I applied to solve the Day 18 puzzles of this past year’s great Advent of Code. The puzzles involved interpreting an assembly language on an abstract machine. The twist is that Part A gave you a description of one abstract machine, and Part B gave you a different abstract machine to interpret the same language in.

    This twist (one language, but different interpreters/abstract machines) is basically one of the textbook applications of the interpreter pattern in Haskell and functional programming, so it was fun to implement my solution in that pattern — the assembly language source was “compiled” to an abstract monad once, and the difference between Part A and Part B was just a different choice of interpreter.
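
    The core idea, one program and several interpreters, can be sketched without any libraries. Everything below is a hypothetical toy: the two-instruction language `Instr` and the interpreters `runLast` and `runQueue` are far simpler than the puzzle’s machines and do not use the monadic encoding from the post.

```haskell
-- A toy instruction set (hypothetical, much smaller than the puzzle's).
data Instr = Snd Int | Rcv
  deriving (Show)

-- Interpreter A: Snd remembers a value, Rcv reports the most recent one.
-- The result lists every value reported by Rcv, in order.
runLast :: [Instr] -> [Int]
runLast = go Nothing
  where
    go _    []           = []
    go _    (Snd x : is) = go (Just x) is
    go prev (Rcv : is)   = maybe id (:) prev (go prev is)

-- Interpreter B: Snd enqueues a value, Rcv dequeues the oldest one.
runQueue :: [Instr] -> [Int]
runQueue = go []
  where
    go _       []           = []
    go q       (Snd x : is) = go (q ++ [x]) is
    go (x : q) (Rcv : is)   = x : go q is
    go []      (Rcv : is)   = go [] is -- nothing queued: skip this Rcv

-- The same program, written once and handed to either interpreter.
program :: [Instr]
program = [Snd 1, Snd 2, Rcv, Rcv]
```

    Here `runLast program` gives `[2,2]` while `runQueue program` gives `[1,2]`: the program is unchanged, and only the choice of interpreter differs.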

    Even more interesting is that the two machines are only “half different” – there’s one aspect of the virtual machines that is the same between the two parts, and one aspect that is different. This means that we can apply the “data types a la carte” technique in order to mix and match isolated components of virtual machine interpreters, and re-use code whenever possible in assembling our interpreters for our different machines! This can be considered an extension of the traditional interpreter pattern: the modular interpreter pattern.

    This blog post will not necessarily be a focused tutorial on this trick/pattern, but rather an explanation of my solution centered around this pattern, where I will also add in insight on how I approach and solve non-trivial Haskell problems. We’ll be using the operational package to implement our interpreter pattern program and the type-combinators package to implement the modularity aspect, and along the way we’ll also use mtl typeclasses and classy lenses.

    The source code is available online and is executable as a stack script. This post is written to be accessible for early-intermediate Haskell programmers.

    Read more … Comments

  • Introduction to Singletons (Part 2)

    Welcome back to our journey through the singleton design pattern and the great singletons library!

    This post is a direct continuation of Part 1, so be sure to check that out first if you haven’t already! If you are just jumping in now, I suggest taking some time to go through the exercises if you haven’t already!

    Again, code is built on GHC 8.6.1 with the nightly-2018-09-29 snapshot (so, singletons-2.5). However, unless noted, all of the code should still work with GHC 8.4 and singletons-2.4. All of the code is also available here, and you can drop into a ghci session with all of the bindings in scope by executing the file:

    Read more … Comments

  • Introduction to Singletons (Part 1)

    Real dependent types are coming to Haskell soon! Until then, we have the great singletons library :)

    If you’ve ever run into dependently typed programming in Haskell, you’ve probably encountered mentions of singletons (and the singletons library). This series of articles will be my attempt at giving you the story of the library, the problems it solves, the power that it gives to you, and how you can integrate it into your code today!¹ (Also, after my previous April Fools post, people have been asking me for an actual non-joke singletons post.)

    This post (Part 1) will go over first using the singleton pattern for reflection, then introducing how the singletons library helps us. Part 2 will discuss using the library for reification, to get types that depend on values at runtime. Part 3 will go into the basics of promoting function values to become functions on types in a usable way, and Part 4 will go deeper into the lifting of functions, using the singletons library’s defunctionalization scheme to utilize the higher-order functions we love at the type level.
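
    For a taste of what “the singleton pattern for reflection” means, here is a hand-rolled sketch that does not use the singletons library at all. The types `DoorState` and `SDoorState` and the function `reflect` are hypothetical names for illustration, not the library’s generated code.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- An ordinary value-level type, promoted to the kind level by DataKinds.
data DoorState = Opened | Closed
  deriving (Show, Eq)

-- The singleton: for each type-level DoorState there is exactly one
-- constructor, so matching on it reveals which type we have.
data SDoorState (s :: DoorState) where
  SOpened :: SDoorState 'Opened
  SClosed :: SDoorState 'Closed

-- Reflection: recover the run-time value that corresponds to the type s.
reflect :: SDoorState s -> DoorState
reflect SOpened = Opened
reflect SClosed = Closed
```

    Writing these singleton types and conversion functions by hand for every type gets tedious quickly, which is the kind of boilerplate a library can generate.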

    I definitely am writing this post with the hope that it will be obsolete in a year or two. When dependent types come to Haskell, singletons will be nothing more than a painful historical note. But for now, singletons might be the best way to get your foot in the door and experience the thrill and benefits of dependently typed programming today!


    1. This series will be based on a talk I gave over the summer, and will expand on it.

    Read more … Comments