TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

5 important Python Data Science advancements of 2015

Sometimes it can be hard to keep up with all the new stuff happening in the Python data science community. And who has the time to watch the videos from the PyData and the SciPy conferences? Well, since I did just that, I might as well also give you a short summary of the most awesome stuff.

4 min read · Dec 14, 2015

Numba

— Just-In-Time compiling

Coding in Python just became a lot more awesome… and a whole lot faster as well. Previously, a significant speed-up came at the cost of rewriting your code and breaking it into compilable pieces. Now you don’t have to rewrite the entire code or pre-compile it! All you need is a decorator.

I will let the code speak for itself:

[Image: code screenshot]
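
In text form, the decorator approach looks roughly like the sketch below. It assumes Numba is installed; the function name `sum_of_squares` is made up for illustration, and the `try`/`except` fallback is only there so the snippet still runs (without any speed-up) when Numba is missing.

```python
try:
    from numba import jit
except ImportError:
    # Fallback so the sketch still runs without Numba (no speed-up then).
    def jit(*args, **kwargs):
        # Support both the bare @jit form and the @jit(...) form.
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]

        def wrap(func):
            return func

        return wrap


@jit(nopython=True)  # compile to machine code on the first call
def sum_of_squares(n):
    # A plain Python loop: slow in the interpreter, fast once compiled.
    total = 0
    for i in range(n):
        total += i * i
    return total


result = sum_of_squares(1000)
print(result)  # 332833500
```

The first call includes the compilation time, so the speed-up only shows on later calls.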

It even warns you about stuff that can be done smarter. If you wanna know a little more, check out this talk from SciPy 2015.

Conda

— replacing pip and virtualenv

Conda is the future. Why? Because it is better than pip for managing packages and better than virtualenv for managing environments. And it is platform independent!

I recommend reading “Why I promote Conda” or “My Python Environment Workflow with Conda” for the authors’ reasons for using Conda.

The Jupyter notebook

— with GitHub… and Binder… and Pineapple… and tmpnb.org!

I didn’t realize this until my IPython notebook suddenly had a new logo. The notebook project has grown big, spread to other languages, and has now been separated from the IPython kernel. This means that a kernel for any language can be written to work with the new Jupyter Notebook. This separation has spawned an entire ecosystem around the Jupyter Notebook.

Here are a few of my favorites.

GitHub rendering

Files of the type .ipynb are now rendered directly inside GitHub. BuzzFeed News is already open sourcing their data analysis this way!

Binder

Lets you instantly turn a GitHub repo into a collection of interactive notebooks that run in the cloud and that you use through your browser. This is going to completely revolutionize science and push data journalism to the next level by making it as easy to reproduce results as it is to share them.

tmpnb.org

Lets you spin up a temporary notebook instantly. Go try it out!

Pineapple

A desktop app for Mac. It still needs a little work, but I already love it!

And the Jupyter team keeps going. Soon the Jupyter notebook will get support for real-time collaboration!

Matplotlib 2.0

— it’s kind of pretty now

Can you believe it? Matplotlib plots now look kind of pretty. The old matplotlib default made you cry every time you needed to plot something, but you can now pick from a handful of predefined, well-known styles like ggplot, Bayesian Methods for Hackers, or FiveThirtyEight, simply by typing, for instance:

import matplotlib.pyplot as plt
plt.style.use('ggplot')

Here is what some of the styles look like:

[Image: the Matplotlib 1.0 default]
[Image: 'ggplot']
[Image: 'bmh' (as in Bayesian Methods for Hackers)]
[Image: 'fivethirtyeight']

Matplotlib 2.0 is going to break backward compatibility, which will hopefully make it easier to use. However, I would still recommend interfacing with matplotlib through Seaborn.

Edit: The styles shown above are not new. What’s new is that matplotlib 2.0 will have a pretty default style. It has not yet been decided which one, but the old matplotlib 1.x style can be brought back using:

plt.style.use('classic')
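
If you want to see which style names your installation ships with, or apply a style to a single figure instead of globally, something like the following sketch should work. It assumes a matplotlib version that includes the `style` module; the plotted data is made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Names that can be passed to plt.style.use() on this installation.
available = sorted(plt.style.available)
print(available)

# Apply a style to this figure only; the global settings are restored
# when the with-block exits.
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2, 3], [0, 1, 4, 9], label="y = x^2")
    ax.legend()
    fig.canvas.draw()

plt.close(fig)
```

Using the context manager keeps a one-off styled plot from changing the look of every other figure in the same session.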

TensorFlow

— A second generation deep learning library from Google

Chances are you did not miss this. If you did, you should check it out. The project still has a few teething problems, but it quickly got more followers than any other open source deep learning library on GitHub. That is primarily because of the insane amount of hype, but let’s see.

The best comparison of deep learning frameworks that I have seen so far is this one.


Published in TDS Archive


Written by Helge Munk Jacobsen

Newly hatched data scientist eager to make the world less boring.
