open workshop - sports analytics research cluster

This week Stuart Morgan (@in2sport) from the Australian Institute of Sport visited Swinburne to discuss current research projects and areas of potential collaboration. In terms of multi-disciplinary e-Research capability this was really interesting. It turns out sports research is not just about biology and physiology (I'm completely ignorant when it comes to Australian sport and sports research), but includes applications of machine learning, predictive analytics, computer vision, sports engineering, and data visualisation. The purpose of Stuart's visit was to start discussions about capability, with a view towards establishing a sports analytics cluster.

I knew that 3D advanced visualisation and pattern recognition were important, and I was aware that astronomers and sport science researchers at Swinburne were working together on time-of-flight data analysis. What I hadn't quite appreciated was the high demand for data scientists and analysts in the sports industry. It seems obvious now, but a lot of research is going into non-invasive tracking methods, for example optical tracking of hockey or rugby players, for training and match reviews as well as real-time decision support during a game. Optical tracking in swimming is another challenge, where lane ropes and swimmer positions are really difficult to see, let alone individual arm movements.

Stuart also talked briefly about one project with Prof. Chris Fluke from the Centre for Astrophysics and Supercomputing. Using depth sensors and time-of-flight cameras they've been working towards creating 3D reconstructions of boxers dancing in a ring.

A number of potential areas of collaboration were identified:

  • Information retrieval
  • Predictive analytics
  • Empirical support for visualisation
  • The ability to map/track body pose
  • Synthetic training environments

These could be pursued by exploiting existing collaborative pathways such as ARC Linkage grants, collaborations with the Swinburne Software Innovation Lab (SSIL), sports research grants and AIS-funded top-up grants for Australian Postgraduate Awards.

The proposed sports analytics cluster would be aligned with current demands in high-performance sport, provide a career pathway for quantitative students, have an international research profile and meet industry needs for research in sports analytics. Ideally it would need something larger, like an overarching data science institute, to provide critical mass and ensure long-term resourcing.

Interactive Map of Optical and Radio Telescopes

 

This week I set aside half of Tuesday and half of Wednesday for my very own hack day. For a while now I've been wanting to create an interactive map of telescopes around the world, partly because I've found that people are often confused* about why the telescopes astronomers use are inevitably on the other side of the planet, and partly because I wanted to see how quickly I could learn some mapping data visualisation and finish a project-in-a-day. This is what I created. Pan and zoom to your heart's content...

Another reason for doing this was to get a feeling for how easy it would be to teach researchers in other disciplines. During our e-Research Symposium last week we discussed the idea of having a Hacky Hour for Swinburne researchers, similar to that organised by the ResBaz folks at Melbourne.

(*Occasionally when I tell people Swinburne has a radio astronomy group they ask if researchers use a telescope on top of the Engineering building. When I worked in Liverpool people were confused when I told them the Liverpool Telescope was in fact in the Canary Islands. When I tell people I used to go to ESO Headquarters in Munich they are surprised to find out that the ESO telescopes are actually situated on top of a mountain in Chile... and then there's the Hubble Space Telescope!!)

Anyway, prior to this I had started looking at using the d3js.org JavaScript library and spent a day working through a couple of Mike Bostock's (the founder) Let's Make a Map/Bubble Map tutorials, which I highly recommend despite the steep learning curve for the CSS and JavaScript illiterate like myself. After a couple of hours I realised that what I wanted to do was actually much simpler and I wasn't too fussed about customisation. Thanks to Steve Bennett (@stevage1), I remembered there was CartoDB. This is an open source, cloud-based, geospatial mapping platform that allows you to make visually stunning maps from your own datasets in very little time and with very little effort.

The following image is an example of the sort of visualisations you can make and they are all available to explore on the CartoDB website.

If you are creating visualisations for education and research purposes you can sign up for a FREE account with the following specifications:

  • Cloud-based application.
  • A maximum of six datasets can be uploaded.
  • 75 Mb account limit (actually quite a lot, my telescope map is only 0.1 Mb because I link to existing online images)
  • Access to common datasets, for example countries, river systems, NYC subways, population data.
  • Public visualisations that can be embedded in websites.
  • Map and SQL APIs that let you interact with data remotely.
  • 10,000 map views/month (eek - let's see how we go...)
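To give a feel for what "interacting with data remotely" through the SQL API might look like, here is a minimal sketch that only builds a request URL (it does not contact any server). The endpoint pattern and the `format` parameter reflect my understanding of the CartoDB SQL API at the time of writing, and should be checked against the current documentation; the account name `example-account` and table name `telescopes` are made up for illustration.

```python
from urllib.parse import urlencode


def build_sql_api_url(account, sql, fmt="geojson"):
    """Build a CartoDB-style SQL API request URL.

    Assumption: the endpoint follows the pattern
    https://<account>.cartodb.com/api/v2/sql?q=<SQL>&format=<format>.
    """
    base = "https://{0}.cartodb.com/api/v2/sql".format(account)
    # urlencode percent-encodes the SQL so it is safe to put in a query string.
    return base + "?" + urlencode({"q": sql, "format": fmt})


# Hypothetical account and table names, for illustration only.
url = build_sql_api_url("example-account", "SELECT * FROM telescopes LIMIT 5")
print(url)
```

Pasting a URL like this into a browser (with a real account and a public table) should return the matching rows, which is handy for checking a dataset without opening the map editor.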

So after about two solid hours of work (mainly setting up the Excel database and getting telescope coordinates from Wikipedia and Google Maps) I had a basic terrain map of optical and radio telescopes around the world. Fortunately a lot of the telescope parameters (location, site name, mirror size, effective aperture, operator, image, and image credit) were available through Wikipedia, so that saved a lot of time creating the database. It also helps to know exactly how you want to present the data before you start. Primarily the data is from these two pages: Wikipedia's List of largest optical reflecting telescopes and the List of large optical telescopes. The map is far from complete; there are many more optical telescopes. Shortly I'll be adding single dish radio telescopes and arrays. But I think this is a good start.
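For anyone who would rather script the dataset than build it in a spreadsheet, the step above can be sketched roughly as follows. This is not how I built my table; it is a minimal illustration that writes a small CSV ready for upload and rejects rows with impossible coordinates. The column names are my own choice, and the two rows use approximate, rounded values purely as examples.

```python
import csv
import io

# Column names are illustrative, not a required CartoDB schema.
FIELDS = ["telescope", "site_name", "latitude", "longitude", "effective_aperture_m"]

# Approximate, rounded values for illustration only.
ROWS = [
    {"telescope": "Gran Telescopio Canarias", "site_name": "Roque de los Muchachos",
     "latitude": 28.76, "longitude": -17.89, "effective_aperture_m": 10.4},
    {"telescope": "Keck I", "site_name": "Mauna Kea",
     "latitude": 19.83, "longitude": -155.47, "effective_aperture_m": 10.0},
]


def validate(row):
    """Return the row unchanged, or raise ValueError if its coordinates are out of range."""
    lat = float(row["latitude"])
    lon = float(row["longitude"])
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range: {0}".format(lat))
    if not -180.0 <= lon <= 180.0:
        raise ValueError("longitude out of range: {0}".format(lon))
    return row


def to_csv(rows):
    """Serialise validated rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(validate(row))
    return buf.getvalue()


csv_text = to_csv(ROWS)
print(csv_text)
```

Swapped latitude and longitude values are the easiest mistake to make when copying coordinates by hand, and a range check like this catches at least some of them before the points land in the wrong ocean.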

I have spent quite a bit of time tinkering with the various mapping options: base maps, various zoom levels, formatting information windows, and switching between clustered telescopes (i.e. multiple telescopes per observatory) and single points. Getting the correct base map was really important for this exercise since major observatories, for example Mauna Kea in Hawaii, Kitt Peak in Arizona, and Roque de los Muchachos in the Canary Islands, can have up to a dozen telescopes tightly packed on a mountain top or crater rim. I also felt that having a terrain map really shows the remoteness of the observatories, not just in terms of location but of the landscapes themselves. Construction is a feat in itself. Incidentally here is a video of a Chilean mountain top being blown off in preparation for the construction of ESO's future E-ELT (European Extremely Large Telescope).

Back to CartoDB. I opted for both 'click' (top box with image) and 'hover' (black info box without image) on a terrain map. Here it is. One advantage over Google Maps is the pop-up information box for each telescope. Included are the name of the site, the country or organisation that operates it, the effective aperture, operating wavelength, mirror type, website, when the telescope was first constructed, image credit (very important and hopefully I got this right - please let me know if I haven't), and data/image source.

There are still a few features I would like to add, and eventually I would like to be able to link this to research and publication data, for example how many published papers can be attributed to each telescope. I'd also really like to add some real science images (raw and fully processed) to each telescope to give visitors a better idea of what the images look like.

In the meantime... back to work. I'll write another post in a few weeks when I add the radio telescopes.

Cheers.

 

research data, citation and software repositories for the savvy astronomer.

Last week I presented a short talk to the Swinburne astronomy group as part of the Director's Lunch talk series. Inspired by the cartoon brainstorming sessions at last year’s e-Research Australasia conference I started thinking about all the things I wish I had known during my early post-doc years and scribbled them down on a scrappy bit of paper (my modus operandi). Of course many of these things, especially the software repositories and online coding courses, didn’t exist back then – oh to imagine a world without MOOCs! – and the line between academic researcher and data scientist/software developer was pretty much a brick wall.

These ideas ended up in a short 10-minute talk which generated a fairly lively discussion. Feel free to download the slides from Speaker Deck.

To be honest, since I have very little computer science geek in me, I would probably still be in the dark about many of these things had I not taken a side step out of academia and into the world of e-Research and data repositories.

So here is my list of research data, citation and software repositories for the savvy astronomer, plus some other tidbits along the way. 

 The List

  • TED Talks: I’m a little obsessed with all things TED and I’m really pleased to see a growing number of Astronomy TED Fellows; check out the talks by Lucianne Walkowicz, Robert Simpson, and Tamara Davis (TEDx) if you haven't already. If I had the confidence and some more interesting stories to tell I would love to be a TED Fellow.
  • Research data & metadata repositories: You've just published a paper and you want to send your data out to the big wide world. How do you do it? This is a tricky one. Many research data repositories only store 'metadata' and although useful this doesn't really help solve the long-term data archiving issue, especially if you're about to move on to the next postdoc. To begin with I would start by checking out these, in addition to the usual astronomy data centres: **Research Data Australia (RDA), Nature's new Scientific Data repository and CSIRO's Data Access Portal. Another great example is MyTardis@Monash.

**Swinburne researchers should check out Swin RedBox - a little metadata store we whipped up last year - or contact me directly about getting data collections into RDA.

  • GitHub & Bitbucket: All astronomers should start using software repositories, especially if you are keen to get involved in collaborative coding. Generally speaking we've been pretty bad about sharing code. Handing over months of hard work to collaborators or the next generation of PhD students, who may not fully appreciate the months of pain and suffering, does not come naturally. Of course those who do tend to go on to have very successful careers. In a world that asserts the Publish or Perish! mantra, your coding skills may give you an edge if your publication record is weak. Regardless, I think you can still become a more savvy astronomer by brushing up on industry skills and standards, and at the very least pretend to be a bona fide software developer. We all suffer imposter syndrome so why not go the whole hog. GitHub is a great tool for doing this. Andy Green gave a really great talk about collaborative coding at the Astroinformatics 2013 conference, in the context of the AAO’s SAMI Galaxy Survey. If you are a little shy about exposing your ad-hoc coding skills to the world, Bitbucket might be a better option - it's also Australian Made. Bitbucket allows you to have an unlimited number of free private repositories (GitHub provides a limited number before you have to start paying) and it has a more user-friendly interface - at least I think it does - with Wiki-like intro pages. Regardless of which one you choose, understanding how software repositories work is becoming increasingly important, especially if you are thinking about transitioning into a Data Science career.
  • GitHub now supports DOIs: If you want to assign a DOI (Digital Object Identifier, a persistent identifier) to your code then choose GitHub. A good example of when you might want to do this is if you're writing a high-impact or survey paper and you want to make the code citable for many years to come.
  • Speaker Deck and Slideshare: A bit of shameless self-promotion never hurt anyone. Speaker Deck is a great way to archive all your talks and you'll be surprised at how many people will be interested in what you do and what you have to say.
  • .Astronomy (dot Astronomy): I’ve written about these “cool kids” before. A bunch of astronomers and citizen scientists leading some really creative projects. Astronomy hack days are on the increase and this shift towards creative coding has been well received by the wider astronomical community. A good example of this was the hack day at the UK's recent National Astronomy Meeting (NAM2014). Rob Simpson wrote about this on his blog, Orbiting Frog. My advice - go and put the kettle on, make yourself a cuppa and check it out. You may as well check out the #NAM2014 Twitter feed while you're at it.
  • Research Bazaar: Not unlike the .Astronomy folks, the University of Melbourne has been doing some great things over the past year in terms of ITS support for researchers. To be honest I’m quite envious and I’d really like to see Swinburne embrace this idea. I believe that David Flanders and Steve Manos are responsible for setting up the blog. In addition to workshops, software bootcamps and all-round great ideas, they will be hosting the first Research Bazaar Conference in February 2015. I’m already getting excited about it. Many of their workshops are open to non-UniMelb researchers.
  • D3js.org: Data-Driven Documents. Great visuals. One of my work colleagues, Samara, is implementing some of these in a new research analytics system, but there is no reason why you can’t create your own collaboration and research visualisations. Again this is a really useful skill to make you stand out, particularly if you want to become a Data Scientist. Tech companies may not know (or care) that you can calculate the specific angular momentum of early-type galaxies but they will be impressed by your ability to whip up data visuals in a few lines of code.
  • Science to Data Science (S2DS) and Insight Data Science fellowships: An alternative career path as a Data Scientist. To be honest, had these organisations existed at the start of 2012 it's quite likely that I would be working in London right now… or maybe San Francisco.

Good luck, and be savvy…

Building bridges: how do you create an e-Research community?

Lately I’ve been thinking a lot about e-Research advocacy and how Swinburne’s small band of e-Science champions could affect the way e-Research projects are funded, resourced and ultimately embraced by the University. As a little fish in a big pond it’s difficult to know where to start, and since e-Research covers a wide range of disciplines there isn’t going to be a one-size-fits-all solution. So I’ve been throwing around a number of ideas: developing an e-Research strategy, defining some sort of governance, establishing targeted working groups to "build bridges" between e-Research projects, disciplines and methodologies, and hosting informal workshops and brainstorming sessions motivated by current researcher needs – a grassroots movement if you like. Since the end goal is not always obvious these strategies are difficult to sell. It doesn't help that we lack critical mass.

Part of the problem stems from the broad definition of e-Research, a definition which tries to encapsulate a range of disciplines from the physical, biological and natural sciences to digital humanities and social sciences. In the UK the term e-Science is more widely used, and is focussed on STEM activities, while in the US the term Cyberinfrastructure is favoured. The term "cyberinfrastructure" was used by the National Science Foundation (NSF) in 2003 in response to the question: how can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists, engineers, scholars, and citizens? (Cyberinfrastructure Framework for 21st Century Science and Engineering was one of the many subsequent vision documents.) Many researchers, particularly particle physicists and astrophysicists, don’t even think of their projects as falling under the ‘e-Research banner’. High performance computing, advanced visualisation, and data mining are the norm. Yet e-Research in the humanities seems to be fundamentally different. So how do you make sure the social sciences aren't left out?

Some of Swinburne’s innovative e-Research projects include cloud-based laboratories for building model universes (Theoretical Astrophysical Observatory), grid computing in social sciences, astronomy and biomedical 3D and interactive visualisation, designing intelligent transport systems, high performance computing (we have our own supercomputer!), policy and “grey literature” databases (Australian Policy Online) and combined research and clinical e-Therapy projects (Mental Health Online). As a technology-focused university we do appear to strongly embrace e-Research projects, many of which have become high-profile assets, perhaps without realising it. Despite this there is a lack of communication – at all levels – and of sharing ideas and expertise. This is a problem faced by many universities. Personally, I’m a fan of the model adopted by the University of California with the recently established Berkeley Institute for Data Science, and of the University of Melbourne Research Bazaar (more than just a blog) that was set up mid-2013. I’ll tell you more about these another time.

But I am excited to tell you that in a few weeks we will be having our own (and I believe the first) e-Research Symposium: a one-day showcase of our most interesting and successful projects followed by brainstorming sessions to start "building bridges" and to figure out the direction Swinburne should take in this space. We’ve had an overwhelmingly positive response so far with 80 registered participants in the first four days, the majority of whom signed up within the first five hours.

I think this is a pretty good start, don’t you think?

I’ll let you know how it goes.