I'm a data scientist working at the intersection of technology and design. Reformed astrophysicist & former e-Research/data consultant.

Research data, citation and software repositories for the savvy astronomer

Last week I presented a short talk to the Swinburne astronomy group as part of the Director's Lunch talk series. Inspired by the cartoon brainstorming sessions at last year’s e-Research Australasia conference, I started thinking about all the things I wish I had known during my early post-doc years and scribbled them down on a scrappy bit of paper (my modus operandi). Of course many of these things, especially the software repositories and online coding courses, didn’t exist back then – oh to imagine a world without MOOCs! – and the line between academic researcher and data scientist/software developer was pretty much a brick wall.

These ideas ended up in a short 10-minute talk which generated a fairly lively discussion. Feel free to download the slides from Speaker Deck.

To be honest, since I have very little computer science geek in me, I would probably still be in the dark about many of these things had I not taken a side step out of academia and into the world of e-Research and data repositories.

So here is my list of research data, citation and software repositories for the savvy astronomer, plus some other tidbits along the way. 

 The List

  • TED Talks:  I’m a little obsessed with all things TED and I’m really pleased to see a growing number of Astronomy TED Fellows; check out the talks by Lucianne Walkowicz, Robert Simpson, and Tamara Davis (TEDx) if you haven't already. If I had the confidence and some more interesting stories to tell I would love to be a TED Fellow.
  • Research data & metadata repositories:  You've just published a paper and you want to send your data out to the big wide world. How do you do it? This is a tricky one. Many research data repositories only store 'metadata', and although that is useful it doesn't really help solve the long-term data archiving issue, especially if you're about to move on to the next postdoc. To begin with I would start by checking out these, in addition to the usual astronomy data centres: **Research Data Australia (RDA), Nature's new Scientific Data repository and CSIRO's Data Access Portal. Another great example is MyTardis@Monash.

**Swinburne researchers should check out Swin RedBox - a little metadata store we whipped up last year - or contact me directly about getting data collections into RDA.

  • GitHub & Bitbucket:  All astronomers should start using software repositories, especially if you are keen to get involved in collaborative coding. Generally speaking we've been pretty bad about sharing code. Handing over months of hard work to collaborators or the next generation of PhD students, who may not fully appreciate the months of pain and suffering, does not come naturally. Of course those who do tend to go on to have very successful careers. In a world that asserts the Publish or Perish! mantra, your coding skills may give you an edge if your publication record is weak. Regardless, I think you can still become a more savvy astronomer by brushing up on industry skills and standards, and at the very least pretend to be a bona fide software developer. We all suffer imposter syndrome so why not go the whole hog. GitHub is a great tool for doing this. Andy Green gave a really great talk about collaborative coding at the Astroinformatics 2013 conference, in the context of the AAO’s SAMI Galaxy Survey. If you are a little shy about exposing your ad-hoc coding skills to the world, Bitbucket might be a better option - it's also Australian Made. Bitbucket allows you to have an unlimited number of free private repositories (GitHub provides a limited number before you have to start paying) and it has a more user-friendly interface - at least I think it does - with Wiki-like intro pages. Regardless of which one you choose, understanding how software repositories work is becoming increasingly important, especially if you are thinking about transitioning into a Data Science career.
  • GitHub now supports DOIs: If you want to assign a DOI (Digital Object Identifier, or persistent URL) to your code then choose GitHub. A good example of when you might want to do this is if you're writing a high-impact or survey paper and you want to make your code citable for many years to come.
  •  Speaker Deck and Slideshare:  A bit of shameless self promotion never hurt anyone. Speaker Deck is a great way to archive all your talks and you'll be surprised at how many people will be interested in what you do and what you have to say.
  • .Astronomy: I’ve written about these “cool kids” before. A bunch of astronomers and citizen scientists leading some really creative projects. Astronomy hack-days are on the increase and this shift towards creative coding has been well received by the wider astronomical community. A good example of this was the recent UK National Astronomy Meeting (NAM2014) hackday. Rob Simpson wrote about this on his blog, Orbiting Frog. My advice - go and put the kettle on, make yourself a cuppa and check it out. You may as well check out the #NAM2014 twitter feed while you're at it.
  • Research Bazaar:  Not unlike the .Astronomy folks, the University of Melbourne has been doing some great things over the past year in terms of ITS support for researchers. To be honest I’m quite envious and I’d really like to see Swinburne embrace this idea. I believe that David Flanders and Steve Manos are responsible for setting up the blog. In addition to workshops, software bootcamps and all-round great ideas, they will be hosting the first Research Bazaar Conference in February 2015. I’m already getting excited about it. Many of their workshops are open to non-UniMelb researchers.
  • D3js.org: Data-Driven Documents. Great visuals. One of my work colleagues, Samara, is implementing some of these in a new research analytics system, but there is no reason why you can’t create your own collaboration and research visualizations. Again, this is a really useful skill to make you stand out, particularly if you want to become a Data Scientist. Tech companies may not know (or care) that you can calculate the specific angular momentum of early-type galaxies but they will be impressed by your ability to whip up data visuals in a few lines of code.
  • Science to Data Science (S2DS) and Insight Data Science fellowships: Alternate career path as a Data Scientist. To be honest, if these organisations had existed at the start of 2012 it's quite likely that I would be working in London right now… or maybe San Francisco.

Good luck! and be savvy…
