
Becoming a Data Scientist – What Does it Take?

I’ve been monitoring a curious and lively discussion over on LinkedIn – “Is it necessary to have a Masters Degree to become a data scientist?” The comments I’ve seen express a number of points of view that I think reflect the questions on many people’s minds – both those wanting to become data scientists and those wanting to hire one.

Here are some highlights of the discussion:

  • Some agree that a formal education, possibly including a master’s degree, is appropriate for becoming a data scientist. It was also pointed out that some institutions are building undergraduate programs for data science that are akin to computer science degrees.
  • Others say that “data scientist” is just a title, but what’s important is the work being done. A specific degree is not important. So if you possess knowledge of and practice mathematics, statistics, and computer science then you can call yourself a data scientist.
  • The discussion included references to new programs such as the Columbia Data Science Institute, which is offering a new Masters in Data Science Program, and the new Data Science Specialization on Coursera, sponsored by Johns Hopkins University, which offers a certification.
  • One participant indicated he’d like to reinvent himself and is considering a rather expensive option: enrolling in the Master of Information and Data Science (MIDS) program at UC Berkeley, which costs about $60,000!
  • A well-known data scientist points to a long list of educational programs in data science. He also suggests participating in one or more Kaggle competitions to gain experience, although the competition is stiff.
  • Another suggestion is to get a SAS certification.

And then there was this rather rosy picture:

“Big data knowledge is not very difficult to obtain, and anyone with some needed pre-requisites like existing knowledge of statistics, programming and databases concepts can become a big data professional.”

Ah, if it were only that easy. That leads me to a data science meetup I attended the other night, where one of the presenters was a gentleman who had been programming for 31 years and suddenly decided last year to become a data scientist. He tried to get a jump start on the field by taking Andrew Ng’s popular “Introduction to Machine Learning” on the Coursera MOOC platform. He said this course “destroyed” him and he had to drop it. Then he took the “Computing for Data Analysis” course, also on Coursera, which is basically an introduction to R. Then he completed the Data Science class offered by General Assembly, an intensive in-person course that lasts 11 weeks. And voilà, in less than a year he was a “data scientist.” Uh uh, it’s not that simple.

First of all, any true data scientist requires a firm foundation in mathematical statistics, probability theory, computer science, and machine learning. To fully understand the latter, you need to comprehend the math: calculus, linear algebra, and partial differential equations (PDEs) at a minimum. Next, all the educational resources the gentleman took advantage of are fine for getting you in the running, but only years of experience applying the methods of data science can lead to a secure place in this field.

The moral of this story is that an advanced degree, a master’s or a Ph.D., is certainly a noble way to go about becoming a data scientist. Is such a degree mandatory? Probably not. A mixture of contemporary educational resources (a traditional degree in data science or MOOCs) coupled with years of practical experience is a good equation for success in this field.

Daniel – Managing Editor, insideBIGDATA

 

Comments

  1. terrytimko says:

    Great info, but I can’t help but believe that demand for data scientists will outstrip supply for a very long time, especially given what it takes to become a true data scientist. What is needed is to take work out of the process with new tools, methods and approaches, so that non-data scientists can get insight more quickly and businesses can better leverage their existing data scientists. I’d also add 2 critical success factors for data scientists: 1) business acumen and 2) an understanding of the business the data scientist operates in.

    • Hi Terry, couldn’t agree more, business acumen is an absolute necessity, and domain experience is indeed helpful, although I’ve found that I can acquire this knowledge by interviewing various domain experts in the organization in prep for a data science project. Still, many companies seeking their unicorns will wait a long, long time before hiring a data scientist that just happens to have experience in their industry. — Daniel

  2. Hi Daniel, not sure if the “infamous” label is the data science meetup guy’s opinion or if it’s your evaluation of Andrew Ng’s course. Infamous can mean notoriously tough or that the course content is lousy and therefore has a bad reputation in the data science field. Please clarify. I’d like to do the course but I wouldn’t want to waste my time. I’m registered for the Data Science Specialisation (Johns Hopkins) on Coursera. How do you rate this programme, given the prevailing debate re data science education?

    • Sorry for the poor choice of words. Dr Ng’s class is very popular. I audited the class myself and found it to be excellent, one of the most engaging courses I’ve ever taken. You’d be well-advised to try it out.

  3. That’s a relief, thanks. Your opinion on Coursera’s Data Science specialisation? It seems to cover the necessary skills as discussed in An Introduction to Data Science (Stanton) and What is Data Science (Loukides). Thanks again for your time.
