## 15 Most Common Data Science Interview Questions

Reading Time: 5 minutes

Data Science is a comparatively new field in the tech world, and it can be overwhelming for professionals to find career and interview advice when applying for jobs in this domain. There is also a vast range of skills to acquire before you start preparing for a data science interview. Interviewers look for practical knowledge of data science basics and their industry applications, along with a good knowledge of tools and processes. Here is a list of the 15 most common data science interview questions that might be asked during a job interview. Read along.

### 1. How is Data Science different from Big Data and Data Analytics?

Ans. Data Science utilizes algorithms and tools to draw meaningful and commercially useful insights from raw data. It involves tasks like data modelling, data cleansing, analysis, pre-processing etc.

Big Data is the enormous set of structured, semi-structured, and unstructured data in its raw form generated through various channels.

And finally, Data Analytics provides operational insights into complex business scenarios. It also helps in predicting upcoming opportunities and threats for an organization to exploit.

### 2. What is the use of Statistics in Data Science?

Ans. Statistics provides tools and methods to identify patterns and structures in data and so gain deeper insight into it. It plays a powerful role in Data Science, particularly in data acquisition, exploration, analysis, and validation.

### 3. What is the importance of Data Cleansing?

Ans. As the name suggests, data cleansing is the process of removing or updating information that is incorrect, incomplete, duplicated, irrelevant, or improperly formatted. It matters because it improves the quality of the data, and hence the accuracy and productivity of the processes that rely on it and of the organization as a whole.
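As an illustration, here is a minimal sketch of a cleansing pass in plain Python. The records, the field names (`name`, `age`) and the rules applied are assumptions made up for this example; real pipelines usually need domain-specific rules.

```python
# Hypothetical raw records with typical quality problems
raw_records = [
    {"name": " Alice ", "age": "30"},
    {"name": "alice", "age": "30"},    # duplicate once the name is normalised
    {"name": "Bob", "age": ""},        # incomplete: age is missing
    {"name": "Carol", "age": "abc"},   # incorrect: age is not a number
    {"name": "Dave", "age": "41"},
]

def clean(records):
    """Normalise formatting, then drop invalid and duplicated records."""
    seen = set()
    cleaned = []
    for record in records:
        name = record["name"].strip().title()   # fix improper formatting
        age = record["age"].strip()
        if not age.isdigit():                   # drop incomplete or incorrect ages
            continue
        key = (name, int(age))
        if key in seen:                         # drop duplicates
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": int(age)})
    return cleaned

cleaned = clean(raw_records)
print(cleaned)
```

Only the records for Alice and Dave survive: the second "alice" is a duplicate, and the other two have unusable ages.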

### 4. What is a Linear Regression?

Ans. Linear regression models the response with a first-degree equation of the form Y = mX + C and is used when the response variable is continuous in nature, for example height, weight, or number of hours. It is a simple linear regression if it involves a continuous dependent variable with one independent variable, and a multiple linear regression if it has multiple independent variables.
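To make the equation concrete, the sketch below fits Y = mX + C by ordinary least squares using only the standard library. The data (hours studied against a score) is invented for illustration and was generated from Y = 2X + 1, so the fit should recover those values.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (m, c) that minimise squared error for y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) divided by variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c

# Hypothetical data: one independent variable, one continuous response
hours = [1, 2, 3, 4, 5]
scores = [3, 5, 7, 9, 11]

m, c = fit_simple_linear_regression(hours, scores)
print(f"Y = {m:.2f}X + {c:.2f}")  # prints "Y = 2.00X + 1.00"
```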

### 5. What is logistic regression?

Ans. In logistic regression, the outcome, also called the dependent variable, has a limited number of possible values and is categorical in nature, for example yes/no or true/false. The model passes a linear combination of the inputs, X, through the logistic function to obtain a probability: Y = e^X / (1 + e^X), which is the same as Y = 1 / (1 + e^(-X)).
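A short sketch of how a logistic model turns a numeric input into a yes/no answer: the logistic (sigmoid) function maps any real number to a probability between 0 and 1, and a threshold converts that probability into a category. The weight, bias and the 0.5 threshold below are assumptions chosen for illustration, not fitted values.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, weight, bias):
    """Probability of the positive class for a single input x."""
    return sigmoid(weight * x + bias)

# Hypothetical model parameters (not learned from data)
p = predict_probability(x=2.0, weight=1.5, bias=-3.0)  # linear part is 0, so p is 0.5
label = "yes" if p >= 0.5 else "no"
print(p, label)
```

In practice the weight and bias are estimated from training data, typically by maximising the likelihood.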

### 6. Explain Normal Distribution

Ans. Normal Distribution is also called the Gaussian Distribution. It has the following characteristics:

a. The mean, median, and mode of the distribution coincide

b. The distribution has a bell-shaped curve

c. The total area under the curve is 1

d. Exactly half of the values are to the right of the centre, and the other half to the left of the centre
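The characteristics above can be checked with Python's built-in `statistics.NormalDist`. The mean of 170 and standard deviation of 10 are arbitrary values picked for this sketch (think of heights in centimetres).

```python
from statistics import NormalDist

# Hypothetical distribution: mean 170, standard deviation 10
dist = NormalDist(mu=170, sigma=10)

# a. Mean, median and mode coincide
print(dist.mean, dist.median, dist.mode)

# d. Half of the probability mass lies on each side of the centre
below_centre = dist.cdf(170)

# A well-known consequence of the bell shape: roughly 68% of values
# fall within one standard deviation of the mean
within_one_sd = dist.cdf(180) - dist.cdf(160)
print(below_centre, round(within_one_sd, 4))
```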

### 7. Mention some drawbacks of the linear model

Ans. Here are a few drawbacks of the linear model:

a. It assumes a linear relationship between the predictors and the response, with well-behaved (independent, normally distributed) errors

b. It is not usable for binary or count outcomes

c. It can’t solve certain overfitting problems

### 8. Which one would you choose for text analysis, R or Python?

Ans. Python would be a better choice for text analysis as it has the Pandas library, which provides easy-to-use data structures and high-performance data analysis tools. However, depending on the complexity of the data, one could use whichever suits best.

### 9. What steps do you follow while making a decision tree?

Ans. The steps involved in making a decision tree are:

a. Pick up the complete data set as input

b. Identify a split that would maximize the separation of the classes

c. Apply this split to input data

d. Re-apply steps ‘b’ and ‘c’ to each subset of the divided data

e. Stop when a stopping criterion is met

f. Clean up the tree by pruning
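Step ‘b’ is the heart of the procedure. The sketch below searches a single numeric feature for the threshold that best separates two classes, scoring each candidate by weighted Gini impurity. Gini is one common criterion and is an assumption here (entropy is another), and the `ages` and `bought` data are invented for the example.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split of the form value <= threshold."""
    best_threshold, best_score = None, float("inf")
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        if not left or not right:
            continue  # a split must put data on both sides
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical data: did a customer of a given age buy the product?
ages = [22, 25, 30, 35, 40, 45]
bought = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(ages, bought)
print(threshold, impurity)  # splitting at age <= 30 separates the classes completely
```

A full tree applies this search recursively to each resulting subset until a stopping criterion, such as a maximum depth or a minimum subset size, is met.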

### 10. What is Cross-Validation?

Ans. It is a model validation technique used to assess how well the results of a statistical analysis will generalize to an independent data set. It is mainly used where prediction is the goal and one needs to estimate how accurately a predictive model will perform in practice.

The goal here is to set aside a data set for testing the model during its training phase and so limit overfitting and underfitting issues. The validation and training sets should be drawn from the same distribution to avoid making things worse.
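One common form is k-fold cross-validation, where the data is divided into k parts and each part serves once as the validation set. The sketch below only generates the index splits; the function name `k_fold_indices` is made up for this example, and in practice the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(n_samples=10, k=5))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

A model would be trained on each `train` split and scored on the matching `validation` split, and the k scores averaged to estimate performance.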

### 11. Mention the types of biases that occur during sampling?

Ans. The three types of biases that occur during sampling are:

a. Self-Selection Bias

b. Undercoverage Bias

c. Survivorship Bias

### 12. Explain the Law of Large Numbers

Ans. The ‘Law of Large Numbers’ states that if an experiment is repeated independently a large number of times, the average of the individual results will be close to the expected value. Similarly, the sample variance and sample standard deviation converge towards the variance and standard deviation of the underlying population.
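A quick simulation shows the idea. The expected value of a fair six-sided die is 3.5, and the average of the rolls should drift towards it as the number of rolls grows. The seed is fixed only to make the sketch repeatable.

```python
import random

def average_die_roll(n_rolls, seed=0):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)
    return sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls

# The averages should get closer to the expected value of 3.5 as n grows
for n in (10, 1_000, 100_000):
    print(n, average_die_roll(n))
```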

### 13. What is the importance of A/B testing?

Ans. The goal of A/B testing is to pick the better of two variants. Use cases for this kind of testing include web page or application responsiveness, landing page redesign, banner testing, marketing campaign performance etc.

The first step is to confirm a conversion goal, and then statistical analysis is used to understand which alternative performs better for the given conversion goal.
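The statistical analysis is often a two-proportion z-test on conversion rates, sketched below. The choice of test, the visitor and conversion counts, and the conventional 0.05 significance level are all assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical experiment: variant A converts 200 of 2000 visitors, B converts 260 of 2000
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is statistically significant at the 5% level.")
```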

### 14. What are over-fitting and under-fitting?

Ans. In the case of over-fitting, a statistical model fails to depict the underlying relationship and describes the random error and/or noise instead. It occurs when the model is extremely complex, with too many parameters compared to the number of observations. An overfit model will have poor predictive performance because it overreacts to minor fluctuations in the training data.

In the case of underfitting, the machine learning algorithm or the statistical model fails to capture the underlying trend in the data. It occurs, for example, when trying to fit a linear model to non-linear data. It also has poor predictive performance.
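The toy comparison below contrasts three models on data that follows y = 2x with some noise added to the training points. A constant predictor stands in for an underfit model and a nearest-neighbour "memorizer" stands in for an overfit one. Both are stand-ins chosen for this sketch, and the data is synthetic.

```python
def mean_squared_error(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_constant(xs, ys):
    """Underfit stand-in: always predicts the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Least-squares straight line, which matches the true trend here."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """Overfit stand-in: returns the target of the nearest training point."""
    def predict(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]
    return predict

# Synthetic training data: y = 2x plus alternating +1 / -1 "noise"
train_x = list(range(10))
train_y = [2 * x + (1 if x % 2 == 0 else -1) for x in train_x]
# Noise-free validation points that lie between the training points
val_x = [x + 0.25 for x in range(10)]
val_y = [2 * x for x in val_x]

models = {
    "constant (underfit)": fit_constant(train_x, train_y),
    "line": fit_line(train_x, train_y),
    "memorizer (overfit)": fit_memorizer(train_x, train_y),
}
for name, model in models.items():
    print(name,
          "train:", round(mean_squared_error(model, train_x, train_y), 3),
          "validation:", round(mean_squared_error(model, val_x, val_y), 3))
```

The memorizer has zero training error because it reproduces the noise exactly, yet it does worse than the line on the validation points. The constant model does badly on both sets.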

### 15. Explain Eigenvectors and Eigenvalues

Ans. Eigenvectors are the directions along which a linear transformation acts purely by compressing, flipping, or stretching, without changing the direction itself. They are used to understand linear transformations and are generally calculated for a correlation or covariance matrix.

The eigenvalue is the strength of the transformation in the direction of the eigenvector.
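A small worked example makes this concrete. For the 2x2 symmetric matrix below (chosen arbitrarily for illustration), multiplying an eigenvector by the matrix only rescales it by its eigenvalue, while any other vector changes direction. The closed-form helper works only for symmetric 2x2 matrices; numerical libraries are normally used for anything larger.

```python
from math import sqrt

def multiply(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def eigenvalues_2x2_symmetric(matrix):
    """Eigenvalues of a symmetric 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = matrix
    half_trace = (a + d) / 2
    determinant = a * d - b * c
    spread = sqrt(half_trace ** 2 - determinant)
    return half_trace + spread, half_trace - spread

A = [[2, 1],
     [1, 2]]

print(eigenvalues_2x2_symmetric(A))  # (3.0, 1.0)

# [1, 1] is an eigenvector with eigenvalue 3: it is stretched, not rotated
print(multiply(A, [1, 1]))   # [3, 3]
# [1, -1] is an eigenvector with eigenvalue 1: it is left unchanged
print(multiply(A, [1, -1]))  # [1, -1]
# [1, 0] is not an eigenvector: the result points in a different direction
print(multiply(A, [1, 0]))   # [2, 1]
```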

Stay tuned to this page for more such information on interview questions and career assistance. If you are not confident enough yet and want to prepare more to grab your dream job in the field of Data Science, upskill with Great Learning’s PG program in Data Science Engineering, and learn all about Data Science along with great career support.

## Is Design Thinking PepsiCo’s Secret to Market Dominance?

Reading Time: 4 minutes

PepsiCo CEO Indra Nooyi took up the reins of the company when it was facing a considerable drop in sales. As a way to address this, she revised her business strategy to make it more inclusive for consumers. She famously went after Mauro Porcini and sought his expert advice to redesign PepsiCo’s user experience. Eventually, her team resolved the problems by relying on an iterative process of understanding users and providing intuitive solutions.

Under the leadership of Indra Nooyi, PepsiCo prioritised user-specific solutions, designed products that were more human-centric and earned the company record-breaking revenues apart from accolades. User experience was not a part of PepsiCo’s business strategy until the early 2010s. Whether it was their product packaging, form or function, that human element was missing in the design. Once they focused on customer experience and made design a priority, customers responded by engaging with the brand more. From designing touch-screen fountain machines (Pepsi Spire) to launching a special line of women’s snacks, PepsiCo reconditioned the way consumers interact with products. Mauro Porcini successfully introduced a more consumer-centric PepsiCo to the world with design thinking being the key driver behind all these changes.

The company leveraged design to drive innovation and create a relevant brand experience for its customers. Design thinking helped it change the brand’s visual identity and improve the product itself. Following an iterative prototyping process, PepsiCo was able to align the company goals around the product, helping to refine obscure ideas and overcome potential blockers in the production process.

What really helped PepsiCo’s journey towards success was a deep understanding of consumer needs: the idea that the product had to communicate with the consumer in a way that was unheard of before. Getting a sense of what consumers wanted from each of the products (vending machines, fountains or consumables) and crafting the experience accordingly helped the company reclaim the market.

Pepsi Spire (a series of fountains and vending machines) is the most loved and the first in the design-enhanced line of products. Pepsi Spire allows customers to customize their drinks by interacting with a highly responsive touch-screen fountain. If you are wondering whether design thinking is just about enhancing product packaging, it is not. Pepsi Spire is a classic example of how design thinking can impact all phases of the product-customer experience. The Spire is basically a futuristic machine that speaks to customers and invites them to interact with it. Its intelligent interface reminds customers of their order history and suggests new options based on the customer profile. They can also experience the infusion digitally by watching the whole process of adding their favourite elements to the drink on the screen in real time, right when they select it. This approach extends the enhanced customer experience to the post-product phase and makes it holistic. Pepsi Spire has now become iconic and inspired a series of intelligent vending machines.

### Other Companies Taking a Cue from PepsiCo

Using design thinking to drive business means designing solutions with customers in mind. Not only will that lead to more customer satisfaction, it will also establish businesses as distinguishable brands. What company wouldn’t want that? Global leaders are already using design thinking to align with their customers’ goals and step into the future. Let’s take a look at the top companies that have already benefited from this model.

Apple: Apple is undeniably a classic example of how reconstructing user experience through innovation can lead to revolutionary success. At its core, Apple remains a company that has always championed innovation and delivered unique customer-driven experiences, all thanks to design thinking. Apple products, from the iPhone and MacBook to iOS, bring you not just exquisite usability but also optimised functionality. From providing a holistic user experience to predicting customer needs, Apple has successfully shown the rest of the world how it’s done.

Nike: Nike has been a pioneer in merging sports with fashion. A brand which primarily targeted athletes and helped them enhance performance has now become quite a fashion trailblazer. “Move forward” (their pet phrase) not only dictates their designs but also aptly captures their user imagination. All along, design thinking has been instrumental in shaping their advanced products and services.

Google: Needless to say, Google has been acing the game, and how! Whether it is Google Maps or Google Pixel’s image software, Google products are striking examples of enhanced design. Google teams are constantly thinking ahead of time and designing products and services that answer future customer needs. Google’s constant endeavour to design products with a focus on user experience has established the brand as a world leader in design thinking.

Design Thinking has been around for longer than we think and its focus towards building enhanced user experiences has made it a much coveted strategy for brand building today. To put it in Porcini’s own words,

“People don’t buy, actually, products anymore, they buy experiences that are meaningful to them, they buy solutions that are realistic, that transcend the product, that go beyond the product, and mostly they buy stories that need to be authentic.”

PepsiCo’s success has since then inspired many other companies to rethink their business strategy and hire design thinking experts. If you are an enthusiast, learn more about it here.

## With Career support, I got to interview with many companies – Sai Ramya Machavarapu, Data Analyst at Mercedes Benz.

Reading Time: 3 minutes

A career transition can be a daunting experience for many. But given the right direction, learning, and support, it is more like a cakewalk. That’s why here at Great Learning, we strive to provide the right learning, practical exposure, and complete career support.

### What has your professional journey been like?

I completed my graduation in Electronics and Communications Engineering from Amrita College, Bangalore. Then I moved to the USA to pursue my Masters in Electrical Engineering from the University of Missouri, Kansas-City in the year 2014. I got placed in Reliable Software Resources as a QA Tester and worked until May 2017. I will be joining Mercedes Benz very soon as a Data Analyst.

### How did you develop an interest in Data Science? Why did you choose GL to pursue it?

Previously, I was working as a manual tester for a consulting firm. The job profile involved manual testing for a banking project. The role was very limited and monotonous, so I decided not to go deeper into testing. I left my job and moved back to India. As I was from a non-programming background, I was very sceptical about getting into coding and related fields. I was looking into various technologies, and a friend suggested that I consider Data Science as an option. I attended many seminars and workshops on Data Science organized by various companies. I developed an interest in this field and was looking for a classroom course. On the recommendation of the same friend, I joined Great Learning to pursue PGP-DSE.

### Coming from a non-programming background, was it difficult for you to understand the subjects?

Not at all. As most of the students in the batch were from a non-IT background, the course is designed keeping them in mind. The faculty ensured that the basics were covered. I understood that the course is based on logic, so I slowly picked up pace and, contrary to my presumptions, I didn’t find it difficult. The faculty put in a lot of time and attention towards us and even repeated the sessions whenever required.

### How was your experience of the academic and career support given by GL?

The team was always available, especially Akhila, who helped us thoroughly in preparing for the interviews and gave regular suggestions and feedback to help us improve. Whenever we had any issues, Akhila and the team resolved them on priority.

With Career support, I got to interview with many companies like CTS, Mercedes, etc. Based on my experience, I realized that the curriculum is self-sufficient to crack any interview. The entire course is designed in a way to help us understand the concepts, crack interviews, and guide during the projects.

### What did you like the most in the program?

We were assigned mini projects on the completion of every topic. This gave a lot of hands-on experience of every topic in terms of understanding and its practical application. This hands-on experience on mini projects gave me a lot of confidence and helped me in exams as it gave a recap of all that we had learned in the course. After the completion of the course, during the capstone project, there were many remedial sessions to clear doubts.

### Share your experience of the interview with Mercedes.

The interview was organized by GL at Mercedes’s Bangalore office. The interview included a total of 3 rounds; 2-Technical and 1-HR. The first round included questions based on whatever I had mentioned in my resume and basic questions over Coding, ML, SQL, etc. 2nd round involved questions related to the Business aspect. The final round was an HR round, where they gave me a confirmation after the interview.

### Any advice to aspirants who wish to take this course?

They should be confident in sharing what they know and admit to what they don’t know. Give your 100% to every interview, thinking that this is the last opportunity, as there is huge competition in the market. Their focus should be on developing a strong foundation in whatever they are learning. The interviews are based on the basics and aim to test your understanding of the field. So, have a strong hold on the basics and you will be good enough to crack them.

Upskill with Great Learning’s PG program in Data Science Engineering and unlock your dream career.

## The placement assistance was excellent – Debashis Gogoi, Data Analyst at Indegene.

Reading Time: 2 minutes

Engineering students face a big challenge in picking the right field of specialization and building a successful career within it. Once you understand your core area of interest, upskilling in it with a relevant course could be the key to unlocking your dream career. Here’s how Debashis did it.

### What is your professional background?

I completed my graduation in Civil Engineering from Royal School of Engineering & Technology. After graduation, I worked for 3 months in National Highway project and Gammon India Pvt. Ltd., Guwahati. Then, I moved to Bangalore to pursue the course in Data Science Engineering. Currently, I am working with Indegene as a Data Analyst.

### How did you develop an interest in Data Science and How did you choose GL?

I wanted to pursue graduation in Computer Science Engineering but could not as there was no scope for IT in Assam. Based on the available opportunities, I took a course in Civil Engineering. While working during my internship, I realised that I have a passion for Analytics. So I moved from Assam to Bangalore and took some certification courses from Coursera. Meanwhile, I was looking for Data Science courses and got to know BABI is the No.1 course in India for Analytics. Since the only full-time course was of DSE and it was designed for Freshers like me, I took this course.

### How was the overall experience with Great Learning?

It was a very nice experience. Before joining the course, I checked the curriculum and found it was very extensive. DSE is a 5-month program and I believe GL did justice in delivering the basics and in-depth understanding. The faculty members were industry experts and they spent a good amount of time with almost all topics. The management was very supportive and the placement assistance was excellent. From CV reviews to Mock interviews, everything made the students really comfortable and industry-ready.

### How was your experience at the interview with Indegene?

I got to participate in the placement drives of 6 companies. I got to interview with 3 companies, namely, Kargil Solutions, Evive, and Indegene. With Indegene, there were 2 interview sessions; 1st was a Case Study and 2nd was a Technical Round where they tested me with my basic ML concepts. After the interviews, they offered me the role of Data Analyst.

### Coming from a non-programming background, how easy was it for you to understand the course?

The course is designed with the first week dedicated to Python. Initially, it was a bit tough but then eventually things got easier as we got acquainted with it. The course and the curriculum are very well designed keeping the diversity of the batch in mind.

### Any advice for our future aspirants?

My father always told me, “Patience and Perseverance always pay”. Along with that, working hard, being focused, and believing in oneself will help anyone get the best out of the program. I would suggest that they practice more and participate in a lot of competitions.

Upskill with Great Learning’s PG program in Data Science Engineering and unlock your dream career.

## GL helped me to kick-start my career – Yeknath Merwade, Associate Analyst at Ugam Solutions

Reading Time: 3 minutes

One needs career support the most when they are a fresh graduate. The right direction and support at the right time help multifold in shaping a successful career. What kind of support did Yeknath get? Read on:

### What has your professional background been?

I completed my Graduation in Electrical, Electronics & Communications Engineering in 2018 from Belagavi, Karnataka. I then took a course in Data Science at Great Learning, Bangalore and currently, I am working in Ugam Solutions as an Associate Analyst.

### How did you develop an interest in Data Science?

I finished my graduation with a 58% aggregate score. With this score, I was not eligible to attend interviews for any good role or company. I understood the need to upskill myself, and my father suggested that I read about Data Science, which had created a lot of buzz. After researching online, I developed an interest in it and got fascinated with what this field can do.

### Why did you choose GL to pursue a course in Data Science Engineering?

After viewing the scope and growth opportunities, I immediately started to search for courses. But to choose the best out of them was a task in itself. All I wanted was to take a classroom program as for a fresher it was better compared to online training. I visited GL’s website for weeks and saw it was regularly updated with relevant data and testimonials. I checked the reviews on Google and LinkedIn as well. Finally, I looked at the faculty profiles on LinkedIn and saw their experience. I understood that GL is the best institute in India to study Data Science, so I took up the course here.

### What did you like the most in the program?

There were many things that I loved about the program.

1. The Faculty: Since I looked at the LinkedIn profiles of almost all the teaching professionals, I got to know that they all were Industry experts and had a great experience in their respective fields. When I enrolled myself for the course, I was surprised to see how grounded and friendly they were. Also, they taught us everything from scratch.
2. Course-Curriculum: The course is well designed and well structured. The curriculum is exhaustive and gave me a good understanding of the domain. The course includes what is needed by the industry and everything is accommodated in the syllabus.
3. Career Assistance: I got to sit in the campus drives of 7 companies and got shortlisted in all of them. Apart from this, the CV reviews and Mock Interviews helped me develop confidence and crack interviews. Also, they organized Bootcamps for the students and helped us in all aspects. There were ample opportunities and it got us placed.

Overall it was a nice experience as I got good friends and faculty with whom I learned a lot, and I am still in touch with them. I feel very grateful to GL, which helped me kick-start my career.

### Being from a non-programming background, did you face any issues with the course or the transition?

Initially, it was very hard for me to adjust to the syllabus as I was not at all familiar with coding or programming. The first week of the course started with Python, which was a new thing for me. Here, I would like to mention that the teaching faculty boosted my confidence by mentioning that “It is not rocket science and is easy to learn”. After the EDA session, I felt self-motivated and realised that, irrespective of their branch, one can achieve success in their ventures. Slowly things started to fall in place. I was in regular sync with sessions, and the regular exams and quizzes kept us in constant touch with the topics. In the end, everything was good and great.

### Share your experience of interviewing with Ugam.

I had 4 rounds of interviews: an SQL test of 30 minutes, followed by a case study of another 30 minutes, a technical round, and finally an interview session with the Vice President and HR. The technical round involved questions around the project mentioned in my resume and general technical questions to check my understanding of algorithms. With the VP, the interview was to check how my understanding could contribute to the Analytics team of Ugam, along with general questions from HR. After the interview, I received a job confirmation from them.

### Any advice to our future aspirants of this course?

I would suggest preparing well for Stats and SQL. The material is self-sufficient and includes in-depth content and curriculum. The placement assistance is superb and helps everyone in getting placed. So there is no need to panic about anything. Also, focus on your project, as all my interview questions revolved around my Capstone Project.

Upskill with Great Learning’s PG program in Data Science Engineering and unlock your dream career.

## I got to interview with 3 companies – Pushpendra Nathawat, Programmer Analyst at Cognizant

Reading Time: 2 minutes

Finance has evolved to position itself as an important business function. Given the nature of this domain, it overlaps with analytics in many areas. Finance professionals and executives are finding new ways to leverage this overlap and increase the value of this vertical in their organizations.

### What is your professional background?

I completed my MBA from Tapmi School of Business in the year 2015. I then joined Vodafone and worked as a Relationship Manager for 10 months. I switched to HDFC and worked for over 1.75 years as an Assistant Manager. Currently, I am working with Cognizant as a Programmer Analyst.

### Why did you think of upskilling? Why did you choose Great Learning?

I did an MBA with Finance as my specialization and while working with HDFC, I enrolled myself in Financial Risk Course with IIM Kashipur. Though I had good knowledge in Finance Domain, I had no understanding of Coding or Data Science. I felt the need to upskill and checked for the courses. While searching I found high recommendations for GL. So I left my job in Jaipur and moved to Bangalore to pursue a full-time program in Data Science Engineering with Great Learning.

### What did you like most about the program?

The management, staff, and the faculty, everyone was very helpful. The faculty took a great deal of interest in teaching students and gave a good explanation of every topic. The management was very supportive in providing any assistance whenever the batch needed extra sessions or special classes for having a better understanding of the programming subjects.

### How was your overall experience at Great Learning?

Since I was from a non-programming background, initially it was a bit difficult to follow the specific modules. But later, with the help of the faculty, I could cope with the subjects and it became easier to understand and manage. The faculty was very helpful in providing material and guidance, especially in my weaker areas. They took extra effort in organizing classes on those areas during the weekends. Since I was very new to Data Science, I had to improve a lot in terms of my CV and interview performance. The career assistance provided by GL helped me prepare an impressive CV, and mock interviews prepped me to crack interviews.

### Share your experience of Career fair organized by GL?

I got to interview with 3 companies; Kinara Capital, Credi India, and Cognizant. I cleared the interview with CTS which involved 3 rounds; 2 in Technical of 45 minutes duration each and 1 HR round of interview on the same day. The technical interviews involved testing my knowledge of Machine learning. I got the job confirmation on the same day.

Upskill with Great Learning’s PG program in Data Science Engineering and unlock your dream career.

## The only resolution you should be making in 2017

Reading Time: 5 minutes

Every New Year brings with it the hope of a new beginning in our lives and, along with it, come the myriad resolutions we make to ourselves. Research indicates that most of the resolutions made by people are towards fitness and weight loss. As a result, January becomes a windfall month for most gymnasiums and fitness studios. While most of us don’t become any leaner or fitter with the passing years, the one thing that we can definitely achieve is being a better version of ourselves. To achieve that, you don’t have to make tall promises to yourself. Just Make Learning a Habit.

Learning new things is simple, achievable and one of the most profitable investments you can make each year.

1. Learning is like weight-loss

Let me make an uncanny analogy here: aspiring to become leaner is very similar to wanting to learn something new. Ultimately, you have to change something that’s core in your behaviour to have the desired results. Both these goals need focus, determination and lots of discipline. And lastly, in education just as in weight loss, there are no low-hanging fruits or express results. Both take time to fructify, but once you go the distance, there is no looking back.

2. Why ‘Learning’ in 2017?

The right question here should be ‘Why Not’. There has never been a better time to learn and, frankly speaking, with the changing dynamics of businesses and technology disruption impacting us, if we don’t make learning a habit in 2017 and onwards, our professional credentials will be questionable at best and irrelevant at worst. Learning new skills and upgrading one’s professional capabilities is no longer a matter of choice but a necessity for a fruitful career. In today’s day and age, the half-life of knowledge is forever decreasing, which means that one needs to keep learning to stay professionally relevant. The new reality is that what you learn at 25 will not take you till 35.

3. What should I learn?

This is like standing by an ocean and trying to find the perfect starting point for your swim. What you can learn is limited only by your intellectual bandwidth and interest. For the sake of brevity, let us focus on what the professional in you needs to learn. Depending upon the industry you are in or aspire to be in, you need to understand the trends that are driving growth. If you are unclear about it, you should talk to your seniors from the industry and pick their brains. Pick an area that is affecting most companies in your space and eventually will impact everyone and build your skill sets in that. Professional competencies such as analytics, big data engineering, product management, information security, intellectual property, digital marketing etc. are high growth areas where most companies are struggling for ‘good’ talent. Finding a sweet spot like this and making yourself competent in it will ensure your career benefits from this talent shortage.

4. Where should I learn?

Learning in 2017 will be easier than ever before. From blogs to YouTube or TED, from companies offering online learning to mobile apps, ‘lack of access’ cannot be your excuse to not learning. But having said that, having a plethora of options makes it overwhelming and confusing.

I come across some candidates who know what skills they need to acquire but are not sure if they will be able to learn them. I usually advise them to first test the waters by accessing some free content online. YouTube is usually a good source for this. See if you like what you are learning and are able to grasp it.

5. Why do we fail to learn online?

If you are the kind that does not suffer from such starting troubles, you will usually find your learning options to be either completely online courses or blended courses (online + occasional weekend classroom sessions). Given this spread, how do you decide which format to go for?

Completely online courses provide convenience since you don’t need to attend any classroom sessions. But online learning has been plagued by abysmally low rates of completion. The main reason for this is that most of us learn better in a classroom setting, with peers and faculty whom we can talk to in person.

The flexibility of attending classroom sessions over a few weekends in a month gives you the best of both worlds – the flexibility of online learning and the learning effectiveness of classroom learning. In our blended analytics program, we have seen hundreds of candidates do our program after having done one or multiple online courses. When asked why, the most common response we get is that they feel their learning in the online programs was incomplete. Also, when it comes to acquiring hard skills such as analytics, big data or machine learning, it is important to focus on programs that are more exhaustive and immersive and don’t take a superficial approach by promising to teach something in a matter of hours.

6. What will it take?

Learning is for everyone. Amongst the thousands of candidates who take our programs every year, we see about 30% of them within the 15-30 year experience bracket. While there is no age to imbibe the habit of learning, just like with all good habits, the sooner you do it, the better off you are. Having said that, learning is hard work. Depending upon when you were last in a class, you would need discipline, focus and perseverance to go the whole distance. Usually, we have seen that the first two months are the hardest, but once you settle into a routine within the first sixty days, you will go on to achieve the results you desire. The advice that we give to all our learners is to start small. Begin by dedicating an hour every day for the first 2 weeks, then about 8-10 hours a week for the next thirty days. Small changes in your habits will ultimately lead to big gains in your learning and professional success.

On that note, in 2017, make a promise to yourself. To learn something new and to challenge your professional status quo. Make Learning a habit and build the career you’ve always wanted. Oh and as for fitness, try playing a sport – 5 days a week. It is fun and just as effective (or ineffective) 🙂

## RIP Degree, Hello Competency?

Reading Time: 4 minutes

The inevitable transition of value from Degrees to Competencies in the knowledge economy

I had a conversation in Bangalore recently with a senior technology professional, one with over 20 years of experience in both large and small technology companies and currently in a VP role at one of the poster children of India’s Internet businesses. He said that when he interviews people these days, very rarely does he look at what degree they possess. He is more interested in what they can do and in the most recent course they have done on Coursera.

Shift in Hiring Managers’ mindsets

Conversations with dozens of senior industry professionals over the past couple of years indicate that most of them, in their hiring decisions, look more at what the candidates know (knowledge) and can do (competence) than at what degree they have.

This is a phenomenon, increasingly common, that acknowledges the fact that our undergraduate and post graduate degrees are no longer good enough. The rapid pace of change being driven by technology has meant that we have to be constantly learning to keep up with the latest and best practices.

When we were growing up, people were introduced to each other as an engineer, a doctor, a CA, a lawyer, a commerce graduate, an arts graduate, etc. – essentially tying our identity to our formal education. Job requirements were specified in these terms. The most essential requirement for a job would be the undergraduate or post graduate degree. Today, if you look at the job descriptions on Naukri or LinkedIn for knowledge workers/roles, the degree would be the last or the second last thing that is mentioned, with most of the other requirements specified in terms of competencies: x years of digital marketing experience, y years of data analysis experience, z years of design experience, etc. Unless it is from a very reputed top institution, the degree seems to contribute hardly anything to the interview evaluation. Even the value being added by the reputed institution seems to be attributed more to the filtering and motivation/drive of the candidates that it signals than to the specific degree that is pursued there.

This transition is already manifesting in the recruitment practices of several of the most reputed companies in the world. A few weeks back, Ernst and Young, one of the most reputed consulting firms and a large recruiter of young talent globally, declared in its UK division that it would scrap the UG degree as a recruitment filter and instead rely on its internal competence assessment. Several technology companies like Google, Uber, Facebook, and closer to home, Flipkart, Snapdeal, etc., have started accepting candidates based on informal credentials like the “nanodegrees” from Udacity or the “specializations” and certificates from Coursera, which are merely signals of verified competencies and not accredited degrees or diplomas.

This phenomenon hit home for me recently from the most unlikely of sources. I was attending a talk at one of the TIE conferences on education in Delhi and the keynote speaker was an ex Pro Vice Chancellor of IGNOU. IGNOU is the world’s largest grantor of degrees, having over 1 million enrolees and distributing hundreds of thousands of degrees and diplomas each year. I am sure these degrees are meaningful to a large number of people who were not fortunate enough to go through a full time college experience, and that they qualify them for a large number of government and public sector jobs, thus serving a valuable purpose for them. However, it is widely acknowledged that they hold little value in the knowledge economy due to the poor learning outcomes associated with them.

Given this background, I went into this talk expecting to hear a very traditional perspective on education from this septuagenarian gentleman. However, what I heard blew my mind. He had some of the most progressive and creative ideas I had ever heard on the transformation that is happening and will happen in the world of education and in the global talent markets. One of the points he made really stuck. He said, “Today, what KRA stands for has changed. It stands for ‘Kyun Rakhe Aapko (Why should I keep you)’”. I thought that this captured most succinctly the massive change in focus in the talent market from degree to competency.

Dawn of the Portfolio

If competencies are becoming all important, how does one showcase them or communicate about them? This is being done through creating a “Portfolio” or “body of work” that demonstrates the competence. This approach is not new. It has been widely used in other fields that require creativity and innovation, qualities that are increasingly important in the knowledge economy. A photographer, an architect, an artist, a designer, a writer, a journalist, a film director, a PR executive – all of these professionals are judged by reviewing a portfolio of their prior work. This is now being applied to knowledge professionals as well. Good programmers are being judged by the code libraries they have shared on GitHub, the hackathons they have participated in and their Topcoder rank. Good Data Analytics professionals are being judged by the analytics problems they have solved on platforms like Kaggle. Marketing professionals are being judged by the blog posts and social media presence they have created for their brands.

I believe that this trend will accelerate further as it is in keeping with the general shift towards decision making becoming more and more data driven. When recruiters can make decisions based on data that is directly relevant to them, like the portfolios of the candidates they are considering, they have little reason to depend on a stamp applied by a third party education institution using a methodology that may or may not be relevant to them.

So, it’s high time all knowledge professionals, particularly those at the early stages of their career, start creating their personal portfolios. That will be their currency in the competence-driven world of the future.

## Public Data: A Data Scientist’s dream

Reading Time: 3 minutes

We’ve all heard how data science will transform (if it hasn’t already) the business landscape, touching everything from our supermarkets to our hospitals and our airlines to our credit cards. Most companies in the areas of data science use proprietary information from millions of private transactions to gain insight into our behavior that in turn allows these companies to turn a profit. However, if you are an amateur data scientist, a hobbyist, a student or a data-minded citizen, this information is typically off limits. And a simulation just isn’t enough because it doesn’t meaningfully replicate the complexity and multi-dimensionality of this data.

Public Data Sets

How about all the publicly available information though? Now, here’s an underused treasure trove for data scientists. Concerns about the quality of data aside, open data provides unparalleled opportunities. There are typically no usage restrictions for data in the public domain, and stitching together disparate sources of data (census, crime records, traffic, air pollution, public maps, etc.) gives you the opportunity to test interactions between various data sets. Possibly the most complete list of public datasets is available at this GitHub page.
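
To make the idea of stitching sources together concrete, here is a minimal sketch in Python using only the standard library. The two tiny tables, the column names and the helper functions `read_keyed` and `inner_join` are hypothetical and the numbers are made up purely for illustration; real public data sets will rarely line up this neatly.

```python
import csv
import io

# Hypothetical toy data standing in for two public sources.
# All values are invented for illustration only.
CENSUS_CSV = """district,population
North,120000
South,95000
East,40000
"""

CRIME_CSV = """district,incidents
North,360
South,190
West,75
"""


def read_keyed(text, key):
    """Parse CSV text into a dict mapping the key column to the full row."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def inner_join(left, right):
    """Keep only keys present in both sources and merge their columns."""
    return {k: {**left[k], **right[k]} for k in left if k in right}


census = read_keyed(CENSUS_CSV, "district")
crime = read_keyed(CRIME_CSV, "district")
joined = inner_join(census, crime)

# A cross-source metric that neither data set contains on its own:
# incidents per 1,000 residents.
rates = {
    district: 1000 * int(row["incidents"]) / int(row["population"])
    for district, row in joined.items()
}

for district, rate in sorted(rates.items()):
    print(f"{district}: {rate:.1f} incidents per 1,000 residents")
```

Note that East and West drop out of the result because each appears in only one source. Deciding what to do with such unmatched records is often a large part of the work when combining open data.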

Notice I said ‘concerns about the quality of data’ in the previous paragraph? That can be a massive problem. The biggest impediment to the use of public data is the lack of reliability of data. Often, the data sets are poorly indexed or incomplete. But even more commonly, these public stores of information are stored in formats that are incompatible with data wrangling. Scanned documents and hand-written ledgers don’t lend themselves to easy analysis. So, a large part of public data projects ends up being a transcription effort. Web scraping, dimensionality reduction, imputation, bias removal and normalization are all skills that a data scientist needs to develop when working with public, open data.
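
As a rough illustration of two of the skills listed above, the sketch below shows mean imputation and min-max normalization in plain Python. The function names and the sample readings are assumptions made for this example; mean imputation is only one of many possible strategies and is not necessarily the right one for a given data set.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sensor readings with two gaps (values are made up).
readings = [10.0, None, 30.0, 50.0, None]
filled = impute_mean(readings)
scaled = min_max_normalize(filled)

print(filled)
print(scaled)
```

Imputing before normalizing matters here: the minimum and maximum can only be computed once every gap has been filled.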

Where is all this public data?

Of course, there are some extremely powerful sources of public data with somewhat clean, reliable and ready-to-use data as well. For government and public sector data, the first port of call is India’s Open Government Data Platform, which includes robust data on commodity prices, national statistics, company registers and even government budgets. Macroeconomic data is best sourced from the World Bank or from Google’s Public Data Explorer. The Public Data Explorer stitches together information from a range of sources (IMF, World Bank, OECD, university libraries and even Google’s own data collection efforts), and contains some slick, interactive visualizations. Other interesting sources of data include Reserve Bank data for bank, forex and CPI information and Bhuvan, ISRO’s geo-platform for geographical data.

Recognizing just how time-intensive and complicated data cleaning and collation can be, some interesting companies focus on getting you clean data sets. Not surprisingly, they focus on the most immediately lucrative sector – finance. Quandl provides some intriguing financial data sets for free, including the renowned Big Mac price index, and all the data is designed to be easily imported and ready for use in minutes. Another company challenging the traditional (paid) data powerhouses is StockTwits. Their API allows you to get real-time data for free all day, every day. If you want historical data (going back about 3-5 years), numerous users have downloaded data using StockTwits and created data sets that you can easily repurpose.

Getting competitive

If you’re the sort who likes a competitive challenge rather than tinkering with datasets by yourself, there are some wonderful competition platforms that make public datasets available with a well-defined problem statement. The first port of call is Kaggle, whose competition problems include Flu Forecasting and Automated Essay Scoring. Kaggle also comes with a set of very interesting data sets for the self-driven data scientist. DrivenData is another such platform, albeit with a limited selection of competitions.

Once you’re ready to meet and work meaningfully with others interested in data-driven solutions to social problems, you can seek out global movements like DataKind. Their efforts range from weekend marathons to long-term cross-sector engagements. Earlier this year, DataKind’s Bangalore chapter created a tool to help you understand various aspects of the Union Budget for 2016-17. The source code is public and entirely open to being repurposed for use on any other data set. There are also academic paths to learning and collaboration in data science – the most prominent of which is the University of Chicago’s Data Science for Social Good fellowship.

Public datasets offer the best opportunity to learn, experiment and produce valuable analytical insights to benefit society. In a world where data is an increasingly valuable currency, these public data sets are perhaps the last bastion for the precious, complex data necessary to draw meaningful conclusions about the way we live.