Project day with Silverpond

Last Thursday I spent a day with the very talented folks at Silverpond (@SilverpondDev). It was a great opportunity to find out more about the consultancy and the projects they work on. Since we have overlapping interests, the idea was to have a 'project day' where we would work together on some sort of demo project or tackle a small part of a much larger project.

I was definitely thrown in the deep end of the deep-learning pool (pardon the pun) and while we didn't really get stuck into the project like we'd hoped, we had some really good conversations. It definitely gave me a much better sense of what they do and how they operate. These guys really know their stuff, and in terms of putting deep learning theory into practice, I was in way over my head, but in a good way. Needless to say my 'to learn' list just got longer.

Approaching deep learning from a solid data analysis background is really challenging, especially when you normally start with a clear scientific question to answer, and solve the problem by fitting models derived [sort of...] from theoretical or empirical observations or correlations that have a real physical basis – research that is driven mainly by the data rather than by the model used to infer what's going on.

With deep learning, you almost have to unlearn the way you approach a problem, accept the black-box nature of things, and accept that you can't necessarily see under the hood. You also need to think conceptually. Deep learning processes are surprisingly abstract and there are a lot of things you need to get your head around: why batch size is important, how to determine appropriate weights, gradient descent optimisation, how to implement loss functions. At times, it feels like there are various "fudge factors" that exist just because they work... or perhaps that's my lack of deep learning knowledge creeping in.
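For what it's worth, here's roughly how I now picture the gradient descent/loss function/batch size machinery – a toy sketch in plain NumPy (nothing to do with Silverpond's code), fitting a straight line by nudging the weights against the gradient of a mean-squared-error loss, one mini-batch at a time:

import numpy as np

# Toy data: y = 3x + 1 plus noise
rng = np.random.RandomState(42)
X = rng.uniform(-1, 1, size=200)
y = 3.0 * X + 1.0 + 0.1 * rng.randn(200)

w, b = 0.0, 0.0     # weights to learn
lr = 0.1            # learning rate
batch_size = 32     # batch size trades gradient noise against speed

for step in range(500):
    idx = rng.choice(len(X), batch_size)   # sample a mini-batch
    xb, yb = X[idx], y[idx]
    err = (w * xb + b) - yb
    loss = np.mean(err ** 2)               # mean-squared-error loss
    grad_w = 2 * np.mean(err * xb)         # d(loss)/dw
    grad_b = 2 * np.mean(err)              # d(loss)/db
    w -= lr * grad_w                       # gradient descent update
    b -= lr * grad_b

print(w, b)   # should end up close to 3 and 1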

Noon's idea was to implement a CPPN (Compositional Pattern-Producing Network) using deeplearn.js and Silverpond's team photos, to create transitions between people based on their CIFAR-10 vectors. His idea came from a series of blog posts where a CPPN was trained on CIFAR-10's frog class in order to generate images of ANY resolution. For reasons I won't go into, this is a really useful thing to be able to do. The images from the blog post were also really striking, so that was a good enough reason for me. It seems computer-generated abstract art is a thing.
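To give a rough sense of why resolution doesn't matter – and this is just a toy sketch, not the code from those blog posts – a CPPN maps pixel coordinates to colours, so the same (here untrained, random) network can be sampled on a grid of any size:

import numpy as np

# Toy CPPN: a small random MLP mapping (x, y, r) -> greyscale value.
rng = np.random.RandomState(0)
W1 = rng.randn(3, 16)
W2 = rng.randn(16, 16)
W3 = rng.randn(16, 1)

def cppn(x, y):
    r = np.sqrt(x**2 + y**2)                    # radius is a common extra input
    h = np.tanh(np.stack([x, y, r], axis=-1) @ W1)
    h = np.tanh(h @ W2)
    return np.tanh(h @ W3)[..., 0]

def render(resolution):
    coords = np.linspace(-1, 1, resolution)
    xx, yy = np.meshgrid(coords, coords)
    return cppn(xx, yy)                         # (resolution, resolution) image

small = render(64)      # thumbnail
large = render(2048)    # the same "image", sampled arbitrarily finely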

Ground Terrain (2016)

Amphibian (2016)

In astronomy a similar motivation would be performing high resolution analyses on lower resolution spectra, e.g. if you're trying to extract velocity dispersions at or below the resolution limit of your spectrograph. 

Anyway, it was a really great idea, but I think we totally underestimated the time it would take for me to get up to speed with what had previously been done. To complicate things just a little more (heck, why not!?), the idea was to implement the guts of the process – the training/optimisation part that was yet to be tackled – using deeplearn.js (despite my lack of knowledge of JavaScript). I was keen to see how this might work. deeplearn.js is an open-source library that brings performant machine learning building blocks to the web, allowing you to train neural networks in a browser or run pre-trained models in inference mode. It looks really useful and there's a whole swag of reasons why you might want to use it. Needless to say my enthusiasm for tackling something challenging was thwarted by my ability to tackle said problem, or rather my ability to tackle it in one day. In all fairness, I was also thwarted by my Python 3 Anaconda installation, which meant wasting a bit of time creating a separate Python environment and installing various packages – something that, although useful, I am loath to do. At some point over the next few weeks, I'll have another stab at it. Just for fun.

The rest of the time was spent talking about various things: Planet Labs imaging, Bayesian clustering, other Silverpond projects, and the potential for an AI Accelerator. To me this is one of their most exciting ideas. From what I've seen they are in a really good position to be the go-to people for deep learning internships in Melbourne. They have the expertise, they appear to be at the centre of a fast-growing community, and if they were to establish the right connections and networks, this could really take off.

I also had a good chat with another Silverponder who I recognised, but couldn't place, or put a name to. Turns out he was one of the students from last year's Swinburne Space Apps Hack team, and he had also taken Virginia's undergraduate astronomy course. After graduating and finishing a three month internship with Silverpond, he is now working on a Unity application that simulates drone imaging and allows you to create labelled data in a 3D gaming environment.

In between meetings and other work, Lyndon also gave me a short demo of Silverbrane.io, a tool Silverpond is developing that allows you to upload data and tag or label objects of interest. Behind Silverbrane is a deep learning algorithm that learns to identify objects as more data is provided. I also had a good chat with Jonathan Chang, their managing director. 

All in all it was a good day, even if the project side of things didn't progress as well as I'd hoped. 

Flirting with TensorFlow

In preparation for my upcoming project day with Silverpond, I spent some time getting familiar with TensorFlow. You can't really read about machine learning without hearing about TensorFlow, and it seemed like a good time to sit down and check it out.

What is TensorFlow?

It's an open-source library for numerical computation – developed by the Google Brain team – that uses data flow graphs and facilitates CPU and GPU computation on a desktop, server or mobile device using a single API. It's most commonly used for machine learning (neural network) research and analysis, and works well with Python.
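A minimal example of the data-flow-graph idea (using the TF 1.x-era API that was current at the time): you build the graph first, and nothing is actually computed until you run it in a session.

import tensorflow as tf

# Build a small data flow graph: nothing is computed yet,
# the operations are just recorded as nodes in the graph.
a = tf.constant(3.0)
b = tf.constant(4.0)
c = a * b + 2.0

# Execution only happens when the graph is run inside a session.
with tf.Session() as sess:
    print(sess.run(c))  # 14.0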

Installing TensorFlow on Mac OSX Sierra

A few months ago I made the decision to try and stick to using my Anaconda Python 3 installation for everything, partly so I didn't have to worry about clashing global Python libraries, multiple installations, or having to set up new Python environments and install packages every time I used software that clashed with existing installations. I had also [finally...] switched over from tcsh to bash [I know, I know...], which meant that some of my older astro software and general tools weren't quite behaving right because I hadn't yet gone through and changed aliases/paths/shell scripts etc. I needed to declutter my laptop, and as part of that I was trying to limit options.

But it turns out TensorFlow and Anaconda aren't the best of friends, although they will play nicely if you know how to handle things. Despite the recommendation, installing TensorFlow centrally with virtualenv didn't work for me. So, with TensorFlow's warnings in mind, I went ahead and used conda to install everything anyway, by running the following commands in my shell:

bash-3.2$ conda create -n tensorflow
bash-3.2$ source activate tensorflow
(tensorflow)$  # At this point your prompt should change
(tensorflow)$ pip install --ignore-installed --upgrade TF_PYTHON_URL

where TF_PYTHON_URL is:
https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.3.0-py3-none-any.whl

Anaconda is not officially supported, so I was expecting to run into problems, but didn't. (In hindsight I should probably have chosen the URL for Python 2.7, as I would later run into problems running code on project day.)
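If you're following along, a quick way to check the install worked is the short validation snippet from the TensorFlow install guide (or something very close to it):

(tensorflow)$ python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!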

TensorFlow Tutorials

Fortunately there are a number of TensorFlow tutorials that increase in difficulty. I began with the Image Recognition tutorial because much of TensorFlow is built on abstract concepts and I sometimes struggle with that. It's very much a black box, and like many researchers I'm not usually comfortable with that; I much prefer seeing under the hood. Normally, anything that magically works first time around is something to be suspicious of. The beauty of TensorFlow is the ability to do something really complicated in literally just a few lines of code. The downside is that you don't get a good handle on what exactly the code is doing, and more importantly, the steps involved and why they matter.
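As an illustration of the 'few lines of code' point, here's a sketch along the lines of the beginner MNIST tutorial (not the Image Recognition one, and again TF 1.x-era API): a softmax classifier trained with mini-batch gradient descent, so batch size, loss function and optimiser all make an appearance.

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])   # flattened 28x28 images
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)        # predicted class probabilities
y_ = tf.placeholder(tf.float32, [None, 10])   # true labels

# Cross-entropy loss, minimised by gradient descent
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), axis=1))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)   # mini-batch of 100
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

correct = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))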

Perhaps more useful – for those starting out – is playing with TensorFlow's Neural Network Playground.

For a more detailed introduction to neural networks, Michael Nielsen’s Neural Networks and Deep Learning is a good place to start, and for a more technical overview, try Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

On the bright side there is a lot of practical material on the TensorFlow website and it's definitely worth ploughing through it all if you have time. It's not just a tool, but a completely new way to approach a problem, complete with its own structure, syntax and quirks. It's also worth having a project in mind to work towards, even if it's just a vague idea. Project-based learning is always easier.

My 'to learn' list just got a little longer....

.Astronomy9 Day Zero Planning


-- BLOG POST IN PROGRESS --

Late last night (AEST) Steve Crawford (SAAO), Becky Smethurst (Nottingham), Arfon Smith (STScI) and I got together over Skype to discuss Day Zero plans for the upcoming .Astronomy conference. I've written about the motivations for Day Zero previously, having organised the very first Day Zero for .Astronomy7.

The Hidden Markov Model demystified

Over the past few months I've been using LinkedIn more and more as a way of finding great articles and tutorials about all things data science and machine learning. While I'm a big fan of Twitter, LinkedIn appears to be a better platform for discovering tech-related articles, written or suggested by fellow astronomers and data scientists. A few days ago Elodie Thilliez's blog post The Hidden Markov Model demystified (Part 1) appeared in my feed. This was a really nice surprise. I haven't seen Elodie since we both left Swinburne. After finishing her PhD last year, she moved straight into a data science role at the Deakin Software & Technology Innovation Lab – DSTIL (formerly the Swinburne Software Innovation Lab – SSIL) – and I was curious to know what she had been up to.

A Hidden Markov Model (HMM) is a statistical Markov model in which the system being modelled is assumed to be a Markov process with unobserved (i.e. hidden) states. The HMM Wikipedia page has a good example of what an HMM is and how it works.

In The Hidden Markov Model demystified (Part 1) Elodie talks about how HMMs are used and illustrates the process with a really simple example. Her follow-up blog post, The Hidden Markov Model demystified (Part 2) – published today – talks about the mathematics, specifically the probabilities involved in the Forward Algorithm, and how to implement an HMM in R.
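Elodie implements her example in R, but to make the Forward Algorithm concrete here's a minimal sketch in Python (a toy model along the lines of the Wikipedia example – hidden weather states, observed activities – not her code):

import numpy as np

states = ["Rainy", "Sunny"]                # hidden states
observations = ["walk", "shop", "clean"]   # observable activities

start_prob = np.array([0.6, 0.4])          # P(initial state)
trans_prob = np.array([[0.7, 0.3],         # P(state_t | state_t-1)
                       [0.4, 0.6]])
emit_prob = np.array([[0.1, 0.4, 0.5],     # P(observation | state)
                      [0.6, 0.3, 0.1]])

def forward(obs_seq):
    """Return P(observation sequence) under the model."""
    # alpha[i] = P(observations so far, current state = i)
    alpha = start_prob * emit_prob[:, obs_seq[0]]
    for obs in obs_seq[1:]:
        alpha = (alpha @ trans_prob) * emit_prob[:, obs]
    return alpha.sum()

# Probability of observing walk -> shop -> clean
print(forward([0, 1, 2]))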