Machine Learning

Design your Death: A Portable R&D Demo Evening

 

It’s not often that you get to see what goes on behind the closed doors of a design consultancy. But Portable is no ordinary design consultancy. For a start they are based in Collingwood, which automatically gives them a gold star – my old neighbourhood was just a hop, skip and a jump across Hoddle St.
Secondly, I’ve heard them say that in their early days they aspired to be a kind of Australian IDEO — and doesn’t everyone love IDEO.

I’ve been to a few of their breakfast and evening talks and they are always (a) fantastic, (b) completely packed – standing room only. I first heard about Portable through fellow RHoK buddy and adopted design mentor Zen Tim. He’s possibly the most chilled out and thoughtful person I know. The first time I visited Portable was back in May 2018. Jason Hendry (Partner & Creative Technologist) gave an excellent talk about how they are using a human-centred design approach to building machine learning tools. More recently I heard Joe Sciglitano (Design Lead) talk about empathy in design and what to do with it. Both talks have been brilliant, and since then I’ve been reading through all their design reports. You can check them out here.

 
We don’t just do design. We start conversations with like-minded and diverse groups of people, whether they have a shared interest in design, technology or inspiring social good. 
— Portable, Collingwood.
At Death’s door


Needless to say I knew that their R&D Demo Evening would likely be a pretty special event. I wasn’t disappointed.

I had a great time chatting to Portable and non-Portable folks about all things death and ageing and cancer. I also managed to pick a few Portable brains about tech for social good and the value of working at the intersection of data science and design research – thanks for indulging me Tam Ho.

We were invited to test out and provide feedback on the four prototypes they’ve been developing over the past year. It was such a privilege to talk to their designers about their thought processes, what they define as success, and to hear more about their plans moving forward. Sarah Kaur (Partner & Chief Operations Officer) then launched their most recent report: The Future of Death & Ageing – 81 pages and 17MB of design goodness.

The highlight of the night was catching up with the lovely Martina Clark, founder of Carers Couch (@carerscouch) – I have no doubt her app will be an amazingly good resource for cancer carers. I also met Sally Coldham, founder of Airloom (@AirloomSocial) and a She Starts alum, who is also doing amazing work in this space.

A fantastic night talking about all things data and design.


In case you missed it…

Empathy. Everyone's talking about it. But who's actually doing it? And what do you do with it once you've felt it? Our Human Centered Design specialist Joe takes us on a journey to discover the what, how and why of empathy, and how it's transformed his design practice. Hear all about how feeling stuff can help you win arguments, how to innovate by implementing the radical practice of listening to other people, and how an empathetic approach will not only help you understand your customers, but give you and your team the natural drive to solve some of the trickiest problems they face. With plenty of storytelling, animated GIFs and pop culture references along the way, you'll laugh, you'll cry, but that's kinda the point, ya feel me?


Worth watching this one too…

To initiate the re-boot of Portable Talks, we look at how a human centred design approach can be used to build AI and machine learning tools. Our Tech Lead and AI enthusiast Jason Hendry will cover the basic principles of machine learning and show you how anyone with a computer can begin the process of creating a basic machine learning model.

How AI can save our humanity

A wonderful talk by renowned computer scientist Kai-Fu Lee (@kaifulee).

AI is massively transforming our world, but there's one thing it cannot do: love. In a visionary talk, computer scientist Kai-Fu Lee details how the US and China are driving a deep learning revolution -- and shares a blueprint for how humans can thrive in the age of AI by harnessing compassion and creativity.

Responsible AI

The development of machine learning and artificial intelligence is already having a profound impact on people's lives. With great power comes great responsibility: how do we ensure that the products and services created are fair and inclusive? How do we ensure privacy and security? How do we share tools and resources? How do we share knowledge?

There is a growing movement towards ethical tech and responsible AI practices, with many companies and organisations becoming more transparent about how their products and services are built. I'll be writing more about this at a later date, but in the meantime here are just a few:

 

AHA Moments in Deep Learning

"Zendesk's Answer Bot uses deep learning to understand customer queries, responding with relevant knowledge base articles that allow customers to self-serve. Research and development behind the ML models underpinning Answer Bot has been rewarding but punctuated with pivotal deviations from our charted course"

I recently had the opportunity to hear Zendesk Data Scientists Arwen Griffioen and Chris Hausler talk about their journey from product ideation to launch, starting with a traditional customer-based machine learning approach and ending with a single global deep learning model that serves tens of thousands of accounts.

This was a fantastic talk that gave a really good insight into how the Zendesk Machine Learning team works and what they value. Both Arwen and Chris have research backgrounds, which is always great to see. Arwen has a computer science background and finished her PhD on ecological modelling (using MLA)  in 2015.  Chris has a computational neuroscience background and finished his PhD in 2014.  

Apparently the data team at Zendesk works like a team sport: a real mix of talent, with engineers, software developers and data scientists all working together towards a common goal. I love that any additional training (i.e., deep learning) is done as a team and includes everyone, regardless of their specific role. I’ve heard of data science teams where only the most senior are allowed to upskill at work and then pass on the knowledge — the rest have to do it in their own time — which is ludicrous. High performance teams work when people are encouraged to grow, learn and develop new skills.

The talk began with the anatomy of a data product. I loved their iceberg analogy. While things may appear to be advancing smoothly (at least in press releases, conference talks and shareholder letters), the bulk of the time is really spent researching new methods, trying things that ultimately fail — life would be terribly boring if we had all the answers — designing, testing, and re-engineering.

Supervised classification:

To build the Answer Bot, the team started out with a fairly simple machine learning model. By simple I mean supervised learning: NLP features extracted from a support ticket, and a logistic classifier used to predict the most relevant help document or article. The assumption was that this would be fairly accurate, performant, familiar and explainable. Because the industry, and therefore the context around tickets, varies broadly across Zendesk’s clients, labelled data would need to be provided for each client, and because the Answer Bot learns on the job and improves with more data, you can’t really switch it on from the get-go. For this to work well the team needed to spend a lot of time preprocessing data.
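To make that first iteration a bit more concrete, here's a minimal sketch of what a ticket-to-article classifier of that kind might look like in scikit-learn. The tiny labelled dataset, the TF-IDF features and the pipeline itself are my own illustration, not Zendesk's actual setup.

# Minimal sketch (my illustration, not Zendesk's pipeline): TF-IDF features
# from ticket text plus a logistic classifier that predicts the most
# relevant help article.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical labelled data: ticket text -> ID of the article that resolved it
tickets = [
    "I can't log in to my account",
    "How do I reset my password?",
    "My invoice shows the wrong amount",
    "I was charged twice this month",
]
article_ids = ["login-help", "login-help", "billing-faq", "billing-faq"]

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(tickets, article_ids)

# Suggest an article (and a confidence) for a new ticket
print(model.predict(["forgot my password"]))        # e.g. ['login-help']
print(model.predict_proba(["forgot my password"]))  # probability per article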

Unsupervised classification:

The team explored unsupervised classification, using both tickets and articles as inputs, which worked well, except that it would require ~100,000 different models (one for each client) and takes a really long time to train. Part of the reason is that the same words can have very different meanings depending on the user, and different industries have different sets of words. For example, "ticket" may mean an issued ticket, given so that someone can join a queue, or it could be something that is purchased, for example a movie ticket. Answering a question such as "what do I do if I lose my ticket" requires a good understanding of context. If you try to build a single model with all the words in the dictionary, you're going to run out of parameters pretty quickly.
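As a rough illustration of the unsupervised flavour (again my own sketch, not how Zendesk built it), you can put one account's articles into a TF-IDF space and return the closest article to a ticket by cosine similarity. Because the vocabulary is specific to each account, every account would need its own model, which is exactly the scaling problem the team ran into.

# Rough sketch of unsupervised ticket-to-article matching for ONE account
# (illustration only): fit TF-IDF on that account's articles and return the
# nearest one by cosine similarity. Every account would need its own model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = {
    "lost-ticket": "What to do if you lose your ticket before the movie",
    "refunds": "How to request a refund for a purchased ticket",
}

vectoriser = TfidfVectorizer(stop_words="english")
article_matrix = vectoriser.fit_transform(articles.values())

def suggest(ticket_text):
    ticket_vec = vectoriser.transform([ticket_text])
    scores = cosine_similarity(ticket_vec, article_matrix)[0]
    best = scores.argmax()
    return list(articles.keys())[best], scores[best]

print(suggest("what do I do if I lose my ticket"))  # ('lost-ticket', <similarity>)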

Pivoting to Deep Learning:

This happened quite a few months down the track and came partly out of their journal club. They essentially started from scratch, which required loads of reading and retraining the whole team. There was a lot of uncertainty and not really knowing what they were doing, but they did know that NLP problems work well with deep learning and that the more data you can throw at it, the better. Zendesk has no shortage of data. After the talk I asked Arwen how much she and the team knew about deep learning before coming to Zendesk and her answer was “basically nothing” (I love this company!).

The team split into two groups and tackled various aspects of the problem. I wasn’t surprised to hear they use TensorFlow. I was really pleased to hear Chris say that problem solving is a creative process — the mark of a great researcher, and not something you can learn easily. 

The initial perceptions of deep learning were that you could develop one robust model, that it would work well, and that the more data you threw at it, the better it would work. This is one of my big worries about machine learning and deep learning. Weights are determined as if by magic, loss functions are calculated and "accurate" results are taken as gospel. From my experience with astronomy data, I can tell you right now that if you start with ALL THE CRAPPY DATA you can still get a good fit; after all, you just need to keep adding parameters — seven dimensional string theory anyone? BUT... the result will inevitably be meaningless. Chris summed this up eloquently: "if you put shit in, you're going to get shit out"... or something to that effect. So this is where things get really exciting. This is where you have to go back and figure out each step of the miracle that is deep learning, explore everything that’s going on and what could be implemented, check whether the data introduces unintended biases — it turns out datasets with large numbers of tickets were artificially skewing things — and check whether there are overfitting problems (hint: unless you have an underlying physical model there will almost always be overfitting problems).
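The overfitting check itself doesn't need anything exotic. Here's a toy numpy example (mine, nothing to do with the Answer Bot) that just holds out a validation set and compares training and validation error as you keep adding parameters.

# Toy overfitting check (illustration only): fit polynomials of increasing
# degree and compare the error on the training points with the error on a
# held-out validation set.
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 1, size=40)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)  # noisy data

idx = rng.permutation(x.size)
train, val = idx[:30], idx[30:]          # 30 points to fit, 10 held out

for degree in (1, 3, 9, 15):
    coeffs = np.polyfit(x[train], y[train], degree)
    train_err = np.mean((np.polyval(coeffs, x[train]) - y[train]) ** 2)
    val_err = np.mean((np.polyval(coeffs, x[val]) - y[val]) ** 2)
    print(f"degree {degree:2d}: train MSE {train_err:.3f}, validation MSE {val_err:.3f}")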

Of course the hard work paid off and it sounds like they’ve come up with a bloody good solution. The entire process took six months and it was a good year before the product was considered reliable enough for deployment. They spent quite a lot of time validating the model, developing reliable performance metrics, ensuring consistency, and taking the time to do proper human user testing. I was both surprised and pleased that Zendesk allowed the team to spend so much time researching. Since I’ve never worked at a tech company I’m not sure what would be considered normal, but my impression is that many data science teams are expected to turn data analysis results around on pretty short timescales, regardless of data quality.

Lessons from the team:

  • ML products are really hard work.
  • “Vanilla” ML works really well: logistic regression and random forests go a long way.
  • Always start with the simplest model.
  • Deep learning isn’t magic.
  • When it finally works, it’s great.

Working through Python's Natural Language Toolkit

Python's Natural Language Toolkit (NLTK) is a fantastic resource for dipping your toes into natural language processing. The online documentation is really comprehensive and full of in-depth tutorials. For those who want to get up to speed with NLP, it's worth taking a full day to explore the library and see what it has to offer.

The tutorials are really worth going through and cover the following aspects:

  • How simple programs can help you manipulate and analyse language data, and how to write these programs
  • How key concepts from NLP and linguistics are used to describe and analyse language
  • How data structures and algorithms are used in NLP
  • How language data is stored in standard formats, and how data can be used to evaluate the performance of NLP techniques

Better still, they offer a great starting point for creating your own.
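If you want a taste before committing a whole day, something like the snippet below covers the very basics: tokenising a sentence, tagging parts of speech and counting word frequencies. The example sentence and word list are just made up for illustration.

# A small taste of NLTK: tokenise a sentence, tag parts of speech,
# and count the most common words in a (made-up) piece of text.
import nltk

# One-off downloads of the data these functions rely on
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

sentence = "Portable ran a fantastic R&D demo evening in Collingwood."
tokens = nltk.word_tokenize(sentence)
print(tokens)
print(nltk.pos_tag(tokens))   # (word, part-of-speech) pairs

words = "design data design research data science design".split()
print(nltk.FreqDist(words).most_common(3))   # e.g. [('design', 3), ('data', 2), ...]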

Project day with Silverpond

Last Thursday I spent a day with the very talented folks at Silverpond (@SilverpondDev). It was a great opportunity to find out more about the consultancy and to learn more about the projects they work on. Since we have overlapping interests, the idea was to have a 'project day' where we would work together on some sort of demo project or tackle a small part of a much larger project. 

I was definitely thrown in the deep end of the deep-learning pool (pardon the pun) and while we didn't really get stuck into the project like we'd hoped, we had some really good conversations. It definitely gave me a much better sense of what they do and how they operate. These guys really know their stuff, and in terms of putting deep learning theory into practice, I was in way over my head, but in a good way. Needless to say my 'to learn' list just got longer.

Approaching deep learning when you come from a solid data analysis background is really challenging, especially when you normally approach a problem with a clear scientific question to answer, and where the problem is solved by fitting models derived [sort of... ] from theoretical or empirical observations or correlations that have a real physical basis. In other words, research that is driven mainly by the data rather than by the model used to infer what's going on.

With deep learning, you almost have to unlearn the way you approach a problem and just accept the black box nature of things and that you can't necessarily see under the hood. You also need to think conceptually. Deep learning processes are surprisingly abstract and there are a lot of things you need to get your head around: for example, why batch size is important, how appropriate weights are determined, gradient descent optimisation, and how to implement loss functions. At times, it feels like there are various "fudge factors" that exist just because they work... or perhaps that's my lack of deep learning knowledge creeping in.
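To make at least some of that less abstract, here's a bare-bones numpy sketch (my own toy example, nothing to do with Silverpond's code) of the moving parts that kept coming up: weights, a loss function, gradient descent updates, and a batch size controlling how much data each update sees.

# Bare-bones mini-batch gradient descent on a linear model (illustration only).
# The "weights" are just two numbers, the loss is mean squared error, and the
# batch size sets how many examples contribute to each update.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=500)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=x.size)   # data from y = 3x + 0.5

w, b = 0.0, 0.0                  # weights to learn
lr, batch_size = 0.1, 32         # learning rate and batch size

for epoch in range(20):
    idx = rng.permutation(x.size)
    for start in range(0, x.size, batch_size):
        batch = idx[start:start + batch_size]
        err = (w * x[batch] + b) - y[batch]
        # Gradients of the mean squared error loss with respect to w and b
        w -= lr * 2 * np.mean(err * x[batch])
        b -= lr * 2 * np.mean(err)
    if epoch % 5 == 0 or epoch == 19:
        loss = np.mean((w * x + b - y) ** 2)
        print(f"epoch {epoch:2d}: loss {loss:.4f}, w {w:.2f}, b {b:.2f}")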

Noon's idea was to implement a CPPN (Compositional Pattern-Producing Network) using deeplearn.js and Silverpond's team photos, to create transitions between people based on their CIFAR-10 vectors. His idea came from a series of blog posts where a CPPN was trained on CIFAR-10's frog class in order to generate images of ANY resolution. For reasons I won't go into, this is a really useful thing to be able to do. The images from the blog post were also really striking, so that was a good enough reason for me. It seems computer generated abstract art is a thing.

Ground Terrain (2016)

Amphibian (2016)

In astronomy a similar motivation would be performing high resolution analyses on lower resolution spectra, e.g. if you're trying to extract velocity dispersions at or below the resolution limit of your spectrograph. 
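For anyone wondering what a CPPN actually is in code, here's a tiny numpy sketch of the core idea (an illustration only, not the deeplearn.js implementation we were aiming for): a small network with fixed random weights maps each pixel's (x, y, r) coordinates to an intensity, so the same function can be rendered at whatever resolution you like.

# Tiny CPPN sketch (illustration only): a random fully-connected network maps
# pixel coordinates (x, y, r) to a greyscale intensity. Because the input is
# just coordinates, the same network can be rendered at any resolution.
import numpy as np

rng = np.random.default_rng(0)

# Fixed random weights: 3 inputs (x, y, r) -> a few tanh layers -> 1 output
n_hidden, n_layers = 32, 4
weights = [rng.normal(size=(3, n_hidden))]
weights += [rng.normal(size=(n_hidden, n_hidden)) for _ in range(n_layers - 1)]
w_out = rng.normal(size=(n_hidden, 1))

def cppn_image(size):
    # Coordinate grid in [-1, 1], plus each pixel's distance from the centre
    xs, ys = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    r = np.sqrt(xs**2 + ys**2)
    h = np.stack([xs.ravel(), ys.ravel(), r.ravel()], axis=1)
    for w in weights:
        h = np.tanh(h @ w)
    return np.tanh(h @ w_out).reshape(size, size)   # intensities in [-1, 1]

print(cppn_image(64).shape)    # (64, 64)
print(cppn_image(512).shape)   # (512, 512): the same pattern, higher resolution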

Anyway, it was a really great idea, but I think we totally underestimated the time it would take for me to get up to speed with what had previously been done. To complicate things just a little more (heck, why not!?), the idea was to implement the guts of the process – the training/optimisation part that was yet to be tackled – using deeplearn.js (despite my lack of knowledge of JavaScript). I was keen to see how this might work. deeplearn.js is an open-source library that brings performant machine learning building blocks to the web, allowing you to train neural networks in a browser or run pre-trained models in inference mode. It looks really useful and there are a whole swag of reasons why you might want to use it. Needless to say my enthusiasm for tackling something challenging was thwarted by my ability to tackle said problem, or rather my ability to tackle it in one day. In all fairness, I was also thwarted by my Python 3 Anaconda installation, which meant wasting a bit of time creating a separate Python environment and installing various packages – something that, although useful, I am loath to do. At some point over the next few weeks, I'll have another stab at it. Just for fun.

The rest of the time was spent talking about various things; Planet Labs imaging, Bayesian clustering, other Silverpond projects, and the potential for an AI Accelerator. To me this is one of their most exciting ideas. From what I've seen they are in a really good position to be the go to people for deep learning internships in Melbourne. They have the expertise, they appear to be at the centre of a fast growing community and if they were to establish the right connections and networks, this could really take off.

I also had a good chat with another Silverponder who I recognised, but couldn't place, or put a name to. Turns out he was one of the students from last year's Swinburne Space Apps Hack team, and he had also taken Virginia's undergraduate astronomy course. After graduating and finishing a three month internship with Silverpond, he is now working on a Unity application that simulates drone imaging and allows you to create labelled data in a 3D gaming environment.

In between meetings and other work, Lyndon also gave me a short demo of Silverbrane.io, a tool Silverpond is developing that allows you to upload data and tag or label objects of interest. Behind Silverbrane is a deep learning algorithm that learns to identify objects as more data is provided. I also had a good chat with Jonathan Chang, their managing director. 

All in all it was a good day, even if the project side of things didn't progress as well as I'd hoped. 

Flirting with TensorFlow

In preparation for my upcoming project day with Silverpond, I spent some time getting familiar with TensorFlow. You can't really read about machine learning without hearing about TensorFlow, and it seemed like a good time to sit down and check it out.

What is TensorFlow?

It's an open-source library for numerical computation, developed by the Google Brain team, that uses data flow graphs and facilitates CPU and GPU computation on a desktop, server or mobile device using a single API. It's most commonly used for machine learning (neural network) research and analysis and works well with Python.
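The "data flow graph" part is easier to see in code than in prose. In the TensorFlow 1.x versions I was using, you define the graph first and then run it inside a session; the values below are just placeholders to show the pattern.

# Minimal TensorFlow 1.x example: define a data flow graph, then run it in a session.
import tensorflow as tf

# Build the graph: y = W*x + b (nothing is computed yet)
x = tf.placeholder(tf.float32, shape=[None], name="x")
W = tf.Variable(2.0, name="W")
b = tf.Variable(0.5, name="b")
y = W * x + b

# Run the graph with some actual numbers fed into the placeholder
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y, feed_dict={x: [1.0, 2.0, 3.0]}))   # [2.5 4.5 6.5]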

Installing TensorFlow on Mac OSX Sierra

A few months ago I made the decision to just try and stick to using my Anaconda Python 3 installation for everything. Partly so I didn't have to worry about clashing global Python libraries, multiple installations, or having to set up new Python environments and install packages every time I used software that clashed with existing installations. I had also [finally...] switched over from tcsh to bash [I know, I know...] which meant that some of my older astro software and general tools weren't quite behaving right because I hadn't yet gone through and changed aliases/paths/shell scripts etc. I needed to declutter my laptop and as part of that I was trying to limit options.

But it turns out TensorFlow and Anaconda aren't the best of friends, although they will play nicely if you know how to handle things. Despite the recommendation, installing TensorFlow centrally with virtualenv didn't work for me. With TensorFlow warnings in mind I went ahead and used conda to install everything anyway, by running the following commands in my shell:

bash-3.2$ conda create -n tensorflow
bash-3.2$ source activate tensorflow
(tensorflow)$ # At this point your prompt should change
(tensorflow)$ pip install --ignore-installed --upgrade TF_PYTHON_URL

where  TF_PYTHON_URL is: 
https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.3.0-py3-none-any.whl

Anaconda is not officially supported so I was expecting to run into problems, but didn't (in hindsight I should have probably chosen the URL for Python 2.7 as I would later run into problems running code on project day).

TensorFlow Tutorials

Fortunately there are a number of TensorFlow tutorials that increase in difficulty. I began with the Image Recognition tutorial because much of TensorFlow is abstract concepts and I sometimes struggle with that. It's very much a black box, and like many researchers I'm not usually comfortable with that. I much prefer seeing under the hood. Normally, anything that magically works first time around is something to be suspicious of. The beauty of TensorFlow is the ability to do something really complicated in literally just a few lines of code. The downside is that you don't get a good handle on what the code is doing exactly, and more importantly the steps involved and why they matter.

Perhaps more useful – for those starting out – is playing with TensorFlow's Neural Network Playground.

For a more detailed introduction to neural networks, Michael Nielsen’s Neural Networks and Deep Learning is a good place to start, and for a more technical overview, try Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.

On the bright side there is a lot of practical material on the TensorFlow website and it's definitely worth ploughing through it all if you have time. It's not just a tool, but a completely new way to approach a problem, complete with its own structure, syntax and quirks. It's also worth having a project in mind to work towards, even if it's just a vague idea. Project based learning is always easier.

My 'to learn' list just got a little longer....

The Hidden Markov Model demystified

Over the past few months I've been using LinkedIn more and more as a way of finding great articles and tutorials about all things data science and machine learning. While I'm a big fan of Twitter, LinkedIn appears to be a better platform for discovering tech related articles, written or suggested by fellow astronomers and data scientists. A few days ago Elodie Thilliez's blog post The Hidden Markov Model demystified (Part 1) appeared in my feed. This was a really nice surprise. I haven't seen Elodie since we both left Swinburne. After finishing her PhD last year, she moved straight into a data science role at the Deakin Software & Technology Innovation Lab – DSTIL (formerly the Swinburne Software Innovation Lab – SSIL) and I was curious to know what she had been up to.

The Hidden Markov Model (HMM) is a statistical Markov model in which the system being modelled is assumed to be a Markov process with unobserved (i.e. hidden) states. The Hidden Markov Model (HMM) Wikipedia page has a good example of what HMM is and how it works. 

In The Hidden Markov Model demystified (Part 1), Elodie talks about how HMM is used and illustrates the process with a really simple example. Her follow-up blog post, The Hidden Markov Model demystified (Part 2) – published today – talks about the mathematics, specifically the probabilities involved in the forward algorithm, and how to implement HMM in R.
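Elodie implements everything in R, but the forward algorithm itself is only a few lines in any language. Here's a rough numpy version using the classic two-state weather example (the numbers are my own toy values, not hers): given initial, transition and emission probabilities, it returns the total probability of an observation sequence.

# Forward algorithm for a toy HMM (toy numbers of my own choosing): two hidden
# weather states and three possible observations.
import numpy as np

states = ["Rainy", "Sunny"]
obs_symbols = ["walk", "shop", "clean"]

pi = np.array([0.6, 0.4])          # initial state probabilities
A = np.array([[0.7, 0.3],          # transition probabilities between states
              [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5],     # emission probabilities per state
              [0.6, 0.3, 0.1]])

def forward(observations):
    obs_idx = [obs_symbols.index(o) for o in observations]
    alpha = pi * B[:, obs_idx[0]]              # initialisation
    for t in obs_idx[1:]:
        alpha = (alpha @ A) * B[:, t]          # propagate, then weight by emission
    return alpha.sum()                         # P(observation sequence | model)

print(forward(["walk", "shop", "clean"]))      # likelihood of this sequence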