Big Data is the Wind Beneath the Wings of Artificial Intelligence

Artificial Intelligence has been around as an academic discipline since the 1950s, and a number of the machine learning techniques in vogue at the moment are a product of the 20th century. So why is this just now a big deal? Largely because, until recently, our limited ability to process information made it hard to build real-world applications. We’re talking about an era when access to computing power was limited to a few well-funded corporations, and “large-scale data” meant megabytes, not terabytes and petabytes.

The ability to store, move and compute a large amount of data is what lends a whole lot of predictive capability to the algorithms. Pete Johnson, who leads Big Data and AI initiatives at the global financial powerhouse MetLife, corroborates this view and cites three significant developments enabling AI in enterprises:

  1. Previously expensive hardware has now become a commodity, particularly when using the cloud
  2. Integration of multiple sources of data – visual, textual, structured and others – enhancing the richness of information
  3. Scaled up algorithms and techniques like Deep Learning have enhanced our analytical ability

To these, I’d add two recent developments: the prevalence of GPUs (graphics processing units) and “everything as a sensor”. With their massively parallel processing power, GPUs are the rocket fuel that allows big data to be crunched quickly and effectively. And with every machine we interact with capturing information about us, we’re producing data at an unprecedented rate. Our phones are always with us, and every click on our computers yields everything from behavioral data and purchase patterns to interests, political affiliations, and demographic information. It comes as no surprise, then, that in a New Vantage Partners survey of executives from the largest technology companies in the world, 98% of leaders said they are investing heavily in AI and Big Data, and three-quarters of respondents indicated that the availability of larger and more frequent sets of data was driving AI and cognitive learning investments.

Hand-in-hand with the giddy anticipation come some sobering concerns. More data means a greater need to secure it all. The more advanced our algorithms get, the harder it becomes to understand what’s happening under the hood. And how do we get the people in our organizations to stay up to date? These are some of the big and urgent challenges facing our companies as we jump on the AI and Big Data bandwagon.

Data Privacy and Security: We’re collecting data everywhere, and most of us are collecting or releasing data without even realizing it. Do we know which sites use cookies, which apps track our location, and how long our financial and health information is stored (and how securely)? Probably not. We’re in the infancy of privacy and security at scale. Enterprises have historically dealt with these issues by building fortresses within which the data is contained, sheltered from prying eyes. But as even enterprises start to embrace the cloud, APIs, interconnected services and SaaS, this is going to get harder and harder.

Interpretability of algorithms: The power of machine learning models lies in their ability to take big data as input and produce predictions through a series of convoluted steps. The more data we throw at this process, and the more we tweak the models to produce better results, the harder it gets to understand what exactly is going on. This might be why stringent financial regulators are wary of certain models being used in customer or financial decisions: they see a number of real problems with the outputs of these models, including bias and victimization. There is more research now around the interpretability of models, but product designers, analytics professionals and business leaders must all become as aware of the dangers of this technology as they are of its power.
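To make the idea concrete, here is a minimal sketch of one widely used, model-agnostic interpretability technique: permutation importance, which measures how much a model’s accuracy drops when one input feature’s values are permuted across rows. The model, features and data below are hypothetical stand-ins, and a deterministic cyclic shift stands in for the usual random shuffle so the example is reproducible.

```python
# Toy "black box": a fixed scorer over two features (income, age).
# In practice this would be a trained model; here it's a hypothetical stand-in.
def model(row):
    income, age = row
    return 1 if 0.8 * income + 0.1 * age > 50 else 0

def accuracy(rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(rows, labels, feature_idx):
    """Drop in accuracy after permuting one feature column across rows.
    A cyclic shift is used as the permutation to keep the sketch deterministic."""
    baseline = accuracy(rows, labels)
    column = [r[feature_idx] for r in rows]
    column = column[1:] + column[:1]              # fixed permutation of the column
    permuted = [list(r) for r in rows]
    for r, v in zip(permuted, column):
        r[feature_idx] = v
    return baseline - accuracy(permuted, labels)

# Synthetic data where income drives the label and age is effectively noise.
rows = [(i, 30 + (i % 7)) for i in range(20, 120, 5)]
labels = [model(r) for r in rows]

print(permutation_importance(rows, labels, 0))    # income: measurable drop
print(permutation_importance(rows, labels, 1))    # age: no drop
```

A feature the model genuinely relies on (income, here) shows an accuracy drop when permuted, while an irrelevant one (age) shows none – one simple way to peek under the hood of an otherwise opaque model.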

Data-Driven organizations: Perhaps the biggest adaptation that companies have to go through is one of educating their teams. With this level of attention, the executives responding to the New Vantage Partners survey are unanimous in their agreement that data-centricity needs to permeate every level of their organizations. AI and Big Data cannot be things that small, isolated teams work on and worry about. Everybody needs to be aware – so product designers, marketers, lawyers, and leaders are all thinking about these issues every day.

“With great power comes great responsibility.” Never has this adage been truer than in the case of the mass adoption of Big Data and AI. It’s opening hitherto unknown doors to us, and allowing us to mine an incredible amount of value for companies and individuals alike. I hope we also take the time to educate ourselves and learn to be responsible in the use of these powerful technologies.

7 Things You Need to Know About the IT Slump

Harish Subramanian, Director, Growth & New Product Development at Great Learning, recently held a webinar on ‘How To Make it Through the IT Slump on Your Own Terms’. A Kellogg alumnus with vast experience in national and international corporations, Harish explains how to overcome the IT slump and come up to speed with current requirements.

With the fast-evolving IT landscape, it is important to demystify the factors behind the transitions in IT jobs and explain how to make it through the slump on your own terms. Having personally transitioned and reinvented himself to adapt to changing times, Harish’s webinar was about not just surviving the job slump but navigating through it to come out on top.

    While there is a definite cut in jobs, NASSCOM predicts that the firing is being offset by the hiring of resources with higher skill sets. As we all read about a “doom and gloom” scenario in the newspapers, Harish explains what these reports really mean.
    Harish opines that this layoff scenario has been in the offing for a long time. Some of the factors that have contributed to this situation are:

    • Macro-economic factors such as tighter immigration rules, protectionism, and cautiously moving economies. Since most business comes from abroad, he says, we must wait and watch how the situation develops.
    • Another fundamental shift in the IT trends is the requirement for advanced tech instead of process driven IT functions.
    • The slack in the system is being cut: the deep benches kept in anticipation of excess work are being trimmed down.
    • The most well-known cause of the layoffs, automation, is making several tasks machine-driven.
    Yes, there is a way out of the IT job crunch! Upskilling is the favored answer of industry experts who are urging professionals to look towards the future and then assess their current skill set.

    • Look towards gathering skills in robotics and in process and testing automation, as automation has historically swept through every industry and process, from kitchen tools to transportation and even technology itself.
    • As mechanization and automation increase in jobs, gain expertise in multiple skills. It is important to be able to solve problems collaboratively, with an understanding of a variety of problem domains.
    • Every professional now needs to claim their work by developing a body of work. Share your projects and prove your mettle rather than just claiming expertise.
    Emphasize your individual story. Rather than fitting into the checklists that are the norm, highlight the skills that are your strengths. Focus on what sets you apart and how it can bring value to the company. Your narrative should be about how you are unique and irreplaceable, and it should generate interest in you as a prospect.
    With an eye on the future, learn skills that will be relevant 5 years from now and can keep you two steps ahead of the competition. These include:

    • Internet of Things
    • Cyber Security
    • Robotics
    • Big Data Analytics
    • UI/UX
    Think of progressing seamlessly into careers that are extensions of your current skills.
    The current transition towards advanced skills is neither the first nor the last shift in IT, and it is important to learn new skills every few years. Much of what we’re learning now will be automated in a few years, so it is important to keep upgrading your knowledge base.
    Develop a reputation in your field by contributing to and engaging with relevant communities. An excellent way to do this is to become part of open-source projects; you can explore open-source project directories to get started. This can also help you develop a reputation and network with like-minded people.

Watch the webinar here.

How Can You Fix Your Tech Skills Gap?

The recent IT layoffs are a testament to the sharp shift the technology sector is experiencing in jobs and skills. Read on for an article by Harish Subramaniam, Director – Business Development at Great Learning, on “How to fix your tech skills gap?”. In the article, published by Financial Express, Harish talks about the urgent need for employees to reskill themselves with new-age skills such as analytics, big data, cloud computing, and machine learning. Harish also mentions the experiences of a few Great Learning students.


Most of us are inextricably tied to our identities as a ‘mainframes guy’ or a ‘Java Developer’. It’s what we know, we’re proud of it, and we’re very, very comfortable. Until we are not.

The tech skills gap seems to have snuck up on the IT industry as a whole, but it shouldn’t have. The writing’s been on the wall for a long time. Automation is the most commonly attributed cause, but it’s likely only the last straw. No doubt IT firms’ business model is changing fundamentally – with a larger proportion of business likely to come from advanced technology solutions rather than the bread-and-butter process outsourcing. At the same time, the wage arbitrage opportunity has been shrinking over the last decade, and a vast majority of our ‘client countries’, including the US and the UK, have been kicking up an anti-globalisation, protectionist storm. In light of such macroeconomic uncertainty, companies have unsurprisingly taken a ‘wait and see’ approach.

All this while, companies have maintained rather deep benches, with hiring and training contingent on the pipeline of future projects and project extensions. And ‘bench strength’ is the first area of reduction, especially when this contingent talent isn’t trained in the areas that are increasingly most pertinent – analytics, big data, cloud computing, machine learning and the like. This seems patently unfair. You learned what you were told to learn by your firm, and you got really good at your job. Now that’s not enough?

Fear not, though. Competing for a shrinking number of jobs in your area of expertise isn’t your destiny. Nasscom believes that about 40% of the tech and BPO workforce in India need to reskill themselves over the next five years. In a sector that employs an estimated four million people, that’s 1.6 million who need to learn these skills. So, what exactly should you be learning?

So apart from getting skilled in the ‘coming wave’ of technologies, what else can one do? Reinventing yourself takes a lot more than just taking a course or two. For starters, build a ‘body of work’. Employers now need to be shown, not told, what your skills are. So, set up that GitHub account, post your projects there, develop an opinion, and create a portfolio rather than a resume. Even more importantly, learn to learn – so you’re not in this situation again. The cycle time of technology advancement is only getting shorter, and companies face less of a hurdle in the adoption of new technology solutions. So, you need to stay on top of every ‘coming wave’.

Ok, the learning part makes sense. But what do you do with all this new knowledge? How do you reinvent yourself when you’ve built a decade-long career in one area? Let’s take a look at some of our Great Lakes PGP-BABI alumni who have blazed a trail for us.

  • Move within your organization
    Kiran Jangeti (PGP-BABI ’14) had spent over 15 years at Value Labs when he felt the need to upskill himself. With all this experience, the logical place to transition into a career as an analytics leader was his existing organization – where he had a working history, a track record, and trust. He didn’t need to prove his ability to deliver; he only had to augment his repertoire with a set of analytical tools and techniques. Not only is Kiran now the AVP of Global Delivery, he also met some impressive students in his class whom he ended up hiring.
  • Build on what you are good at
    With 18 years’ experience as an IT manager in the financial services domain, Vilas Wakale (PGP-BABI ’15) wanted a change. Having learned analytics, visualization, modeling and a host of new skills, Vilas seized the opportunity to set up his own consulting practice. In this capacity, he’s been able to combine his newly acquired skills with nearly two decades of experience in financial services and IT projects. This complementary combination has helped him work with the Ministry of Home Affairs in a multi-disciplinary data management role with an emphasis on banking.
  • Start up
    Having worked for a decade as an engineer and manager, Gayatri Sukumar (PGP-BABI ’15) looked to start a venture of her own. While it takes some convincing for a company to trust your newfound skills, you don’t need to convince yourself! Her tenacity in applying these new skills to an area of passion – education – led to the birth of her company, Latitude Analytics.

Reinvention isn’t an overnight process. Learning new skills needs to be reinforced by the humility to accept lateral or even lower-grade assignments. A beginner’s mindset certainly helps. And above all, you need to back yourself to succeed.

Further Readings:

How IT Professionals Can Prepare Themselves to Deal with Layoffs

IT Layoffs: The Hire and Fire Scenario

Take These 4 Steps to Survive the IT Layoffs

Will 2017 be the year of virtual reality?

Will 2017 be the year VR lives up to the hype and becomes ubiquitous? Probably not. It’s a very young technology, and the ecosystem isn’t robust enough to have hit the tipping point. So, does it have any future? To answer this, we need to first understand where VR stands in the grand scheme of technological advancement.

Nearly every tech follows some approximation of the hype cycle. First, it’s an emerging technology with little immediate applicability but immense potential. Next, every conceivable use is listed out and the total potential added up to arrive at an absurdly large dollar figure. This leads to companies everywhere forcing it upon their products and customers with limited success. This failure to match the hype causes a ‘bursting of the bubble’. This is when things get really interesting.

In this post-hype valley, truly persistent companies and experts stay to build solid tech experiences. Prices are driven down by incremental advances. An ecosystem of developers and consumers starts to build products that materially make lives better for specific groups of people. Niches are found, and robust applications are built for a set of ‘beachhead markets’.

In the VR market, Oculus, HTC, Sony, and Google have launched headsets with a limited set of apps. While the experience is still clunky (VR sickness, discomfort, and visual glitches are quite common), it is getting closer to ‘consumer-grade’. Simultaneously, Nvidia and AMD’s GPU chips are advancing rapidly, and VR could get much better, much sooner than expected. Without the unrealistic expectations of ‘VR everywhere’, the right companies (Magic Leap, perhaps?) will start to make significant advances somewhat under the radar.

Apart from the quality of tech, its versatility could determine when it moves from a ‘nice-to-have’ to a ‘must-have’. Much like the HD/4K TV – which worked just fine with SD content and gave you the option to make the most of better content as and when it became available – VR hardware is more likely to be a household device if it can do other things while waiting for the apps to catch up.


Breakthrough VR applications are most likely to come from video gaming and immersive entertainment, with healthcare, defense and possibly education following closely behind. As a learning tool, VR will find the most value where the cost of ‘learning by doing’ is highest: disaster recovery, aircraft maintenance, and rare, complex surgery are obvious candidates. The next wave could bring students immersive versions of experiences that are expensive today, like international trips to learn languages and museum visits. By contrast, do we really want to see a professor stand in front of a classroom in fully immersive 3D? Not likely. Whichever way you look at it, the incremental value needs to justify the cost (in dollar terms and in terms of the adoption curve). Some really interesting companies working on educational VR include Alchemy and Discovery, whose rich content libraries could transform the learning of topics that are so much better when viscerally driven home through VR.

2017 may not be the year VR goes mainstream. It may take a couple of years more before that could happen. But following the disillusionment of 2016 and the subsequent shift of focus to other areas as the ‘next big thing’, VR might finally have the opportunity to fly somewhat under the radar and make serious inroads towards becoming an affordable, reliable, quality technology that just works! And then, we may just see it live up to its potential.

Hiring without the Hyperbole


Amidst all the job listings looking for ‘rockstars’, ‘ninjas’ and ‘superstars’, I’m left wondering where the rest of us fit in. If everyone needs a rockstar, what becomes of the diligent yet perfectly ordinary person? In all the talk of A players hiring other As, and Bs hiring Cs, we’ve devalued what it means to be exceptional. No surprise, then, that 90% of job seekers are above average.

You risk missing out on half the population

Men and women view their own abilities differently. A number of studies have shown what many of us always suspected – that women suffer disproportionately from the ‘imposter syndrome’, the belief that their success is undeserved and that they’re on the verge of being found out for their true (lack of) abilities. Women are less likely to describe themselves as exceptional performers and tend to judge themselves more harshly when evaluating their fit for a role. Men, on the other hand, rate themselves much higher in their ability to do (and succeed at) the job.

You risk turning off true rockstars

“The more you know, the more you know you don’t know.” This aphorism rightly captures the essence of the Dunning–Kruger effect. Across four studies, Kruger and Dunning found that “not only do the unskilled reach erroneous conclusions and make unfortunate choices, but their incompetence robs them of the metacognitive ability to realize it”. Most of us want to attract the truly thoughtful, diligent and self-reflective employees, not the shallow self-publicists. Why turn off our target audience at the very outset?

You probably don’t need a rockstar for that

For a number of companies, especially the well-funded behemoths like Google, Facebook and Apple, money is no object. Talent begets success begets more talent. They’re caught in a virtuous loop. For the vast majority of companies, this causes a real problem. Salaries are inflated, as are expectations of perks, stock and growth. But here’s where you get to define your parameters.

Do you really need that 22-year-old genius/Olympian who can program in 5 languages, speak 5 more and effortlessly deliver a TED talk in one take? Chances are, you don’t. Why, then, do we fill our job descriptions with unnecessary ‘nice-to-have’ traits with no regard to practicality? Why don’t we spend an extra few minutes thinking about what we REALLY want them to know and do, and who we’d REALLY like them to be?

Hiring is hard enough without the pressure to find the unicorn. Mistakes when hiring people are among the most expensive mistakes a company can make. It not only takes an inordinate amount of time and money to fix a mis-hire, but it also creates a lot of stress within the teams into which people are hired. All the while, competence, cohesion, openness and a willingness to learn often count for a lot more than people give them credit for. For the sake of our sanity and of creating a more realistic, representative marketplace for skills, let’s end the hyperbole.


Education: Excellence for few or access for many?


Singapore ranks #1 in the global PISA rankings.

Just four US universities (Harvard, Columbia, Chicago, MIT) have produced over 400 Nobel laureates.

About 82% of Danish citizens are enrolled in post-secondary education, while over 2% of the US (over 6 million people) can call themselves doctors – either medical or through a PhD.

So what’s the right measure of a country’s educational prowess? Which of these countries offers the ‘best’ education?

In his great new podcast, Revisionist History, Malcolm Gladwell offers a strong vs. weak link framework to understand advancement and higher education. To use a sports analogy, a strong link sport is one where one superstar or high-performing individual can often influence outcomes, as in the case of basketball. A weak link sport is one where an above average team with no tangible superstars will often beat a mediocre team with one superstar. Leicester City’s league win in last year’s English Premier League is one such example. Gladwell uses this framework to explain why the Industrial Revolution gained momentum in England – where a large number of common folk were proficient tinkerers – rather than in France or Germany where the elite few had attained remarkable heights. He proceeds to compare Stanford – a strong link university – with Rowan University, a small and deliberately weak link university in New Jersey.

What constitutes a good education? And as an educator, should you build strong link systems or weak link institutions? In other words, do you focus on access or global excellence? Do you build institutions that enable access to the largest slice of the population or focus on the few most likely to succeed and build a truly cutting edge system?

The tempting answer is ‘both’. Where possible, we should combine breadth with depth. But let me throw another complication into the mix – what in mathematics is often called initial conditions.

See, the initial conditions – the starting point – aren’t the same for all countries and communities. We don’t have the same populations, resources, values and levels of homogeneity. Here in India, we’ve made a concerted effort to build strong link institutions – IISc, IITs, IIMs and AIIMS. We are justifiably proud of graduates from these institutions, and these graduates occupy a majority of the senior leadership positions in academia, business, medicine and even startups. But we don’t seem to be doing as much about wide-reaching access to good quality education. I’m not talking about nominal access – on paper. I’m talking about access to the kind of education that can help you learn and even master topics, get a good job, perform well in that job, and build a strong career. Millions need this kind of education, the kind of education that launches a million careers in a thousand companies.  We aim to be just this kind of an institution, but I’m sure we are not alone in this endeavor.

If you know of transformative weak-link educational institutions that improve access to high-quality education in India, please write to us. We would love to work with like-minded people to improve learning outcomes.



Public Data: A Data Scientist’s dream

We’ve all heard how data science will transform (if it hasn’t already) the business landscape, touching everything from our supermarkets to our hospitals and our airlines to our credit cards. Most companies working in data science use proprietary information from millions of private transactions to gain insights into our behavior, which in turn allow these companies to turn a profit. However, if you are an amateur data scientist, a hobbyist, a student or a data-minded citizen, this information is typically off limits. And a simulation just isn’t enough, because it doesn’t meaningfully replicate the complexity and multi-dimensionality of real data.

Public Data Sets

How about all the publicly available information though? Now, here’s an underused treasure trove for data scientists. Concerns about the quality of data aside, open data provides unparalleled opportunities. There are typically no usage restrictions for data in the public domain, and stitching together disparate sources of data (census, crime records, traffic, air pollution, public maps, etc.) gives you the opportunity to test interactions between various data sets. Possibly the most complete list of public datasets is available at this GitHub page.
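As a sketch of what that stitching can look like, here is a join of two hypothetical public datasets (census figures and crime records) on a shared district key, using only the Python standard library. The district names and numbers are invented for illustration.

```python
# Two hypothetical public datasets sharing a "district" key.
census = [
    {"district": "North", "population": 120000},
    {"district": "South", "population": 95000},
]
crimes = [
    {"district": "North", "incidents": 340},
    {"district": "South", "incidents": 410},
]

# Index one dataset by key, then join the other against it.
by_district = {row["district"]: row for row in census}
joined = []
for c in crimes:
    base = by_district.get(c["district"])
    if base is None:
        continue  # public datasets rarely line up perfectly; skip unmatched keys
    joined.append({
        **base,
        "incidents": c["incidents"],
        # Normalizing by population makes districts comparable.
        "incidents_per_1000": round(1000 * c["incidents"] / base["population"], 2),
    })

for row in joined:
    print(row["district"], row["incidents_per_1000"])
```

The interesting insights tend to live in exactly these derived, cross-dataset quantities (incidents per 1,000 residents, here) that neither source contains on its own.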

Notice I said ‘concerns about the quality of data’ in the previous paragraph? That can be a massive problem. The biggest impediment to the use of public data is its unreliability. Often, the data sets are poorly indexed or incomplete. Even more commonly, these public stores of information are kept in formats that are incompatible with data wrangling: scanned documents and hand-written ledgers don’t lend themselves to easy analysis. So, a large part of any public data project ends up being a transcription effort. Web scraping, dimensionality reduction, imputation, bias removal and normalization are all skills that a data scientist needs to develop when working with public, open data.
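Two of those wrangling steps can be sketched in a few lines: mean imputation (filling gaps with the average of the observed values) and min–max normalization (rescaling to [0, 1]). The readings below are a hypothetical stand-in for a patchy public dataset, with None marking missing entries.

```python
# A hypothetical, patchy column from a public dataset; None = missing reading.
readings = [12.0, None, 18.0, 15.0, None, 21.0]

# Mean imputation: fill gaps with the mean of the observed values.
observed = [x for x in readings if x is not None]
mean = sum(observed) / len(observed)
imputed = [mean if x is None else x for x in readings]

# Min-max normalization: rescale the imputed column to [0, 1].
lo, hi = min(imputed), max(imputed)
normalized = [(x - lo) / (hi - lo) for x in imputed]

print(imputed)
print([round(x, 3) for x in normalized])
```

Real projects use more careful strategies (imputing per group, guarding against a zero range, flagging which values were filled), but the shape of the work is the same: repair first, rescale second, analyze last.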

Where is all this public data?

Of course, there are some extremely powerful sources of public data with somewhat clean, reliable and ready-to-use data as well. For government and public sector data, the first port of call is India’s Open Government Data Platform, which includes robust data on commodity prices, national statistics, company registers and even government budgets. Macroeconomic data is best sourced from the World Bank or from Google’s Public Data Explorer. The Public Data Explorer stitches together information from a range of sources (IMF, World Bank, OECD, university libraries and even Google’s own data collection efforts), and contains some slick, interactive visualizations. Other interesting sources include Reserve Bank data for bank, forex and CPI information, and Bhuvan, ISRO’s geo-platform, for geographical data.

Recognizing just how time-intensive and complicated data cleaning and collation can be, some interesting companies focus on getting you clean data sets. Not surprisingly, they concentrate on the most immediately lucrative sector: finance. Quandl provides some intriguing financial data sets for free, including the renowned Big Mac index, and all the data is designed to be easily imported and ready for use in minutes. Another company challenging the traditional (paid) data powerhouses is StockTwits. Their API allows you to get real-time data for free, all day, every day. If you want historical data (going back about 3–5 years), numerous users have collected data via StockTwits and created data sets that you can easily repurpose.

Getting competitive

If you’re the sort who likes a competitive challenge rather than tinkering with datasets by yourself, there are some wonderful competition platforms that pair public datasets with a well-defined problem statement. The best known is Kaggle, whose competition problems include flu forecasting and automated essay scoring. Kaggle also hosts a set of very interesting data sets for the self-driven data scientist. Driven Data is another such platform, albeit with a more limited selection of competitions.

Once you’re ready to meet and work meaningfully with others interested in data-driven solutions to social problems, you can seek out global movements like DataKind. Their efforts range from weekend marathons to long-term cross-sector engagements. Earlier this year, DataKind’s Bangalore chapter created a tool to help you understand various aspects of the Union Budget for 2016-17. The source code is public and entirely open to being repurposed for use on any other data set. There are also academic paths to learning and collaboration in data science – the most prominent of which is the University of Chicago’s Data Science for Social Good fellowship.

Public datasets offer the best opportunity to learn, experiment and produce valuable analytical insights to benefit society. In a world where data is an increasingly valuable currency, these public data sets are perhaps the last bastion for the precious, complex data necessary to draw meaningful conclusions about the way we live.