Python is a very popular programming language these days, so you have probably heard of it already, tried it out, or even use it as your main programming language at work. In this blog post I want to talk about why Python is the language (or at least one great language) when it comes to working with big data.
My inspiration for this post came from a talk by Jake Vanderplas at PyCon 2017. Jake Vanderplas is an astronomer, and in his talk he explains why and how astronomers use Python. You can of course watch the full talk, but in case you do not have the time, I have written a summary for you.
In the first few minutes of his talk he explains how astronomers work and what they do. First he makes clear that astronomers don’t look through classic telescopes as everybody would expect. They use big telescopes like the Hubble Space Telescope or the Kepler telescope. These telescopes collect large chunks of data with various features, but they don’t take clear pictures of, for example, whole planets. To extract information from this data, astronomers write statistical code, mostly in Python.
In recent years astronomers have really explored the possibilities of open source. For example, the whole Kepler project code is online on GitHub. He also talks about the James Webb Space Telescope, which is meant to succeed the Hubble Space Telescope. Instead of visible light it will observe infrared light, to get better images of exoplanets and more information about them in general. Like the code for Kepler, the data analysis code for the James Webb Space Telescope is mostly written in Python and is online on GitHub.
The last telescope he mentions is the Large Synoptic Survey Telescope, a ground-based telescope with the largest digital camera ever built, at about 3 gigapixels. The telescope will take two exposures every 30 seconds and produce 15-30 terabytes per night, with a final 10-year catalog of hundreds of petabytes of data. This is very impressive! And again, the data analysis software is mostly written in Python (and C++), and you can check it out on GitHub.
According to Jake Vanderplas, astronomy has gone through a big revolution in the last ten years. Python has become far more important to science in general, and it seems to have become the primary language for astronomers over that time.
So why Python?
In the second part of his talk, Jake Vanderplas explains why Python is so effective in science. As he tells it, Python was developed as a language for learning to program and was never intended to be a primary language for programmers, but as you may have noticed, it has become one of the ten most important programming languages in recent years. He lists some key reasons why scientists, and astronomers in particular, use Python:
- Interoperability with Other Languages: Scientists work with a wide variety of systems, databases and interfaces. Instead of wasting most of their time trying to fit all the components together, they use Python, because it makes it very easy to connect different system components. “Python is a Glue”, as Jake Vanderplas puts it.
- “Batteries Included” + Third-Party Modules: Another reason Python is so important for scientists is that many tools are already included in Python’s standard library, and the scientific stack has grown considerably in recent years.
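To make the "glue" and "batteries included" points a bit more concrete, here is a minimal sketch that uses only the standard library. It is not taken from the talk: the star names, the brightness values and the function name `mean_brightness_as_json` are made up for illustration. The sketch reads CSV text, aggregates it with SQL through the `sqlite3` module (which wraps the SQLite C library), and hands the result on as JSON.

```python
import csv
import io
import json
import sqlite3

# Hypothetical measurements; the star names and values are invented for this example.
RAW_CSV = """star,brightness
A,10.0
A,12.0
B,7.5
"""


def mean_brightness_as_json(raw_csv: str) -> str:
    """Read CSV text, average the brightness per star with SQL, return JSON."""
    # Step 1: parse the CSV text with the csv module.
    reader = csv.DictReader(io.StringIO(raw_csv))
    rows = [(row["star"], float(row["brightness"])) for row in reader]

    # Step 2: load the rows into an in-memory SQLite database and aggregate.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE obs (star TEXT, brightness REAL)")
        conn.executemany("INSERT INTO obs VALUES (?, ?)", rows)
        result = conn.execute(
            "SELECT star, AVG(brightness) FROM obs GROUP BY star ORDER BY star"
        ).fetchall()
    finally:
        conn.close()

    # Step 3: hand the result on as JSON, e.g. for a web service or another tool.
    return json.dumps(dict(result))


print(mean_brightness_as_json(RAW_CSV))
```

Three different formats (CSV, SQL and JSON) are connected in a few lines without installing anything. For real data analysis you would probably reach for third-party libraries from the scientific stack instead.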
My view on Python
As the last few years have shown us, Python has become very important for the scientific community and the software development industry. Big companies like Google (YouTube), Facebook, Dropbox and Instagram use Python in different ways. It is getting more and more popular, and it is by far not only a language for beginners and students who want to learn their first programming language. There are great libraries out there, and everybody who knows how to program can pick it up and learn it very fast. There are also great machine learning libraries like TensorFlow and scikit-learn, and I think the growth of machine learning in the coming years will be another huge factor in making Python even more important in the future.