How do you solve a problem like big data? With a python of course

Python is fast becoming the programming language of choice for the science community, delivering the tools needed to work with the kind of Big Data that scientific research is now generating. Here, we take a closer look at Python’s rise to fame and why the next generation of scientists will be better prepared for the challenges of dealing with Big Data.

In the early noughties, a little-known programming language called Python began its ascent into the mainstream. Its creator named the language after the 70s BBC comedy show Monty Python’s Flying Circus, and while it contained no dead parrots, Hell’s Grannies or cross-dressing lumberjacks, it was quickly picked up by Google as a robust, commercially orientated language that could easily process large amounts of data across multiple computers.

With Google as an advocate, Python became more widely recognised as an alternative to other languages like Java and C++, and it soon established itself within the scientific community.

One of the challenges facing the life sciences sector is how to deal with the data it generates, particularly on projects like Genomics England’s 100,000 Genomes Project. Each one of the 100,000 genomes sequenced creates 200GB of data. To accurately analyse and draw meaningful conclusions from this data, having the right computer programming in place to work the numbers is mission critical.
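To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above and assumes decimal units (1 TB = 1,000 GB); the variable names are illustrative.

```python
# Rough storage estimate for the 100,000 Genomes Project,
# using the figures quoted above (decimal units: 1 TB = 1,000 GB).
GENOMES = 100_000
GB_PER_GENOME = 200

total_gb = GENOMES * GB_PER_GENOME
total_tb = total_gb / 1_000
total_pb = total_tb / 1_000

print(f"{total_gb:,} GB = {total_tb:,.0f} TB = {total_pb:,.0f} PB")
# prints: 20,000,000 GB = 20,000 TB = 20 PB
```

That works out at around 20 petabytes of raw data for the project as a whole.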

Before Python, scientists typically used tools such as Matlab or SPSS to analyse their work, but these were complex and assumed a level of programming knowledge that many people didn’t have. Python is different: rather than a language made only for programmers, it is an open-source language with simple syntax that is easy to learn but still capable of high-level data analysis.

Because it is easy to pick up and easy to read, Python is now commonly used in the research and data analytics industries, and as it is open source, its community of users is adding to it all the time. Python offers a large collection of libraries which scientists can use, adapt and develop to meet their own needs without starting from scratch. In effect, the wider its adoption in the scientific community, the better it becomes.
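As a small illustration of that readability, the sketch below summarises a handful of measurements using the `statistics` module from Python’s standard library. The readings are hypothetical values invented for the example, not real research data.

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
readings = [4.1, 3.9, 4.4, 4.0, 4.6]

mean = statistics.mean(readings)
spread = statistics.stdev(readings)  # sample standard deviation

print(f"mean = {mean:.2f}, standard deviation = {spread:.2f}")
```

Larger scientific projects usually build on community-maintained libraries rather than the standard library alone, but the code reads in much the same way.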

Given how important programming will be to the future of scientific research, understanding and discovery, it’s comforting to know that these skills are now being taught to children of all ages on the national curriculum. In 2014, England became the first country in the world to mandate computer programming in primary and secondary schools.

Children start learning to write code when they enter school at age five and don’t stop until at least age 16 when they finish their GCSEs. Although the curriculum is not prescriptive about the programming languages it teaches, the widespread use of Python makes it a natural choice.

The emphasis on computing in schools ensures that the next generation moving into science-related careers is digitally literate, able to use information and communication technology to express themselves and develop their ideas. This will go some way to ensuring the UK’s position as a leader in the field of scientific excellence is maintained in the future.

Like every major technological advancement, Python’s ascent could level off in years to come and it may eventually be replaced by a newcomer that does what Python does, only better. Contenders to watch include Julia, Go and Scala. But as long as the science community is able to keep up with the programming languages which will help it the most, Big Data will continue to open doors to new breakthroughs in the medical arena.

If you know your Python from your dead parrot, why not get in touch with Paramount today and ask about our latest data scientist roles?

10th May

industry news
