I'm a data scientist working at the intersection of technology and design. Reformed astrophysicist & former e-Research/data consultant.
Flirting with TensorFlow

In preparation for my upcoming project day with Silverpond, I spent some time getting familiar with TensorFlow. You can't really read about machine learning without hearing about TensorFlow, and it seemed like a good time to sit down and check it out.

What is TensorFlow?

It's an open source library for numerical computation, developed by the Google Brain team, that uses data flow graphs and supports CPU and GPU computation on a desktop, server or mobile device through a single API. It's most commonly used for machine learning (neural network) research and analysis, and it works well with Python.
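
To make "data flow graph" a bit more concrete, here is a toy sketch in plain Python. It is not TensorFlow's API: the names `Node`, `constant`, `add`, `multiply` and `run` are made up purely for illustration. What it tries to mimic is the general idea that you first describe a computation as a graph of operations, and nothing is actually calculated until you ask for a result.

```python
# Toy data flow graph in plain Python. This is NOT TensorFlow's API;
# every name below is hypothetical and exists only for illustration.


class Node:
    """One operation in the graph, plus the nodes that feed into it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value  # only used by "const" nodes


def constant(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def multiply(a, b):
    return Node("mul", (a, b))


def run(node):
    """Evaluate a node by first evaluating everything that flows into it."""
    if node.op == "const":
        return node.value
    args = [run(n) for n in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    raise ValueError("unknown op: " + node.op)


# Building the graph does no arithmetic yet...
a = constant(2.0)
b = constant(3.0)
total = multiply(add(a, b), b)  # describes (2 + 3) * 3

# ...the numbers only flow through when the graph is run.
result = run(total)
print(result)  # 15.0
```

The real library is far more involved (tensors rather than single numbers, and a choice of where each operation runs), but the split between describing the graph and running it is the part worth holding on to.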

Installing TensorFlow on macOS Sierra

A few months ago I decided to stick to my Anaconda Python 3 installation for everything. This was partly so I didn't have to worry about clashing global Python libraries or multiple installations, or set up a new Python environment and install packages every time I used software that clashed with an existing installation. I had also [finally...] switched over from tcsh to bash [I know, I know...], which meant that some of my older astro software and general tools weren't quite behaving because I hadn't yet gone through and changed aliases, paths, shell scripts and so on. I needed to declutter my laptop, and as part of that I was trying to limit my options.

But it turns out TensorFlow and Anaconda aren't the best of friends, although they will play nicely if you know how to handle things. Despite being the recommended route, installing TensorFlow centrally with virtualenv didn't work for me. With TensorFlow's warnings in mind, I went ahead and used conda to install everything anyway, by running the following commands in my shell:

bash-3.2$ conda create -n tensorflow
bash-3.2$ source activate tensorflow
(tensorflow)$  # At this point your prompt should change
(tensorflow)$ pip install --ignore-installed --upgrade TF_PYTHON_URL

where TF_PYTHON_URL is the URL of the TensorFlow Python package for your platform and Python version.

Anaconda is not officially supported, so I was expecting to run into problems, but didn't. In hindsight I should probably have chosen the URL for Python 2.7, as I would later run into problems running code on project day.
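
A quick sanity check after an install like this is to ask the active environment which Python it is running and whether it can see TensorFlow at all. The sketch below is written for Python 3 and uses only the standard library. `describe_environment` is a hypothetical helper name, and it only reports what it finds rather than fixing anything.

```python
import importlib.util
import sys


def describe_environment(package="tensorflow"):
    """Report the Python version, interpreter path and whether `package` can be found."""
    return {
        "python_version": "{}.{}".format(*sys.version_info[:2]),
        "interpreter": sys.executable,
        "package": package,
        # find_spec returns None when a top-level package is not installed
        "installed": importlib.util.find_spec(package) is not None,
    }


info = describe_environment()
for key, value in info.items():
    print("{}: {}".format(key, value))
```

Run it with the environment activated. If the interpreter path does not point inside the environment you just created, the install may have landed somewhere other than where you intended.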

TensorFlow Tutorials

Fortunately there are a number of TensorFlow tutorials that increase in difficulty. I began with the Image Recognition tutorial because much of TensorFlow is abstract concepts, and I sometimes struggle with that. It's very much a black box, and like many researchers I'm not usually comfortable with that; I much prefer seeing under the hood. Normally, anything that magically works first time around is something to be suspicious of. The beauty of TensorFlow is the ability to do something really complicated in literally just a few lines of code. The downside is that you don't get a good handle on exactly what the code is doing or, more importantly, the steps involved and why they matter.

Perhaps more useful – for those starting out – is playing with TensorFlow's Neural Network Playground.

For a more detailed introduction to neural networks, Michael Nielsen’s Neural Networks and Deep Learning is a good place to start, and for a more technical overview, try Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

On the bright side, there is a lot of practical material on the TensorFlow website, and it's definitely worth ploughing through it all if you have time. It's not just a tool but a completely new way to approach a problem, complete with its own structure, syntax and quirks. It's also worth having a project in mind to work towards, even if it's just a vague idea. Project-based learning is always easier.

My 'to learn' list just got a little longer....
