Building Innovation Strategy
This project was done for our Big Data class. My partners were David Farthing, Lyndsay Richter and Kaitlyn Schroeder. Our goal was to formulate an innovation strategy for Novartis, a global pharmaceutical giant. By applying big data techniques to Novartis’ patent data, we uncovered key insights to drive the company’s innovation strategy and strengthen its position in the global marketplace.
The first step was to get the data from the Hadoop cluster. Patent data for many multinational organizations, covering the last 20 years, was available as XML files. We used shell scripts and Apache Pig to extract the patents of Novartis and several competitors from the full data set.
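The extraction step can be illustrated with a small sketch. We did this with shell scripts and Pig on the cluster; the version below swaps in plain Python on a tiny in-memory sample, purely to show the idea of filtering patent records by assignee. The element names (`patent`, `assignee`, `title`, `abstract`), the sample records and the `patents_for` helper are all assumptions, since the real XML schema is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real schema of the patent XML files is an assumption.
SAMPLE_XML = """<patents>
  <patent>
    <assignee>Novartis AG</assignee>
    <title>Crystalline form of a kinase inhibitor</title>
    <abstract>A crystalline form and its use.</abstract>
  </patent>
  <patent>
    <assignee>Example Motors Inc</assignee>
    <title>Gearbox housing</title>
    <abstract>A housing for a gearbox.</abstract>
  </patent>
</patents>"""


def patents_for(xml_text, company):
    """Return title/abstract records whose assignee contains `company` (case-insensitive)."""
    root = ET.fromstring(xml_text)
    needle = company.lower()
    matches = []
    for patent in root.iter("patent"):
        assignee = patent.findtext("assignee", default="")
        if needle in assignee.lower():
            matches.append({
                "assignee": assignee,
                "title": patent.findtext("title", default=""),
                "abstract": patent.findtext("abstract", default=""),
            })
    return matches


novartis = patents_for(SAMPLE_XML, "Novartis")
print(len(novartis), novartis[0]["title"])
```

The same filter can be run once per competitor name to build a comparable data set for each company.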
Then we started analyzing the patent titles and patent abstracts.
We computed metrics such as the number of patents produced per year, the number of patents due to expire per year, the number of patent classes, and the number of patents in each class. We did the same for the competitors. Internet research helped a lot.
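As a rough sketch of how such counts can be derived, the snippet below tallies a few hypothetical patent records. The field names, the sample values, the choice to count by grant year, and the 20-year term counted from the filing year are all assumptions made for illustration, not details of our actual data.

```python
from collections import Counter

# Hypothetical records; field names and values are illustrative only.
patents = [
    {"filed": 2001, "granted": 2004, "patent_class": "514"},
    {"filed": 2001, "granted": 2005, "patent_class": "424"},
    {"filed": 2003, "granted": 2005, "patent_class": "514"},
]

PATENT_TERM_YEARS = 20  # assumption: term counted from the filing year

# Patents per year (here: by grant year, which is an assumption).
granted_by_year = Counter(p["granted"] for p in patents)

# Patents expiring per year under the assumed term.
expiring_by_year = Counter(p["filed"] + PATENT_TERM_YEARS for p in patents)

# Patents per class, and the number of distinct classes.
patents_by_class = Counter(p["patent_class"] for p in patents)
num_classes = len(patents_by_class)

print(granted_by_year, expiring_by_year, patents_by_class, num_classes)
```

Running the same tallies on each competitor's records gives figures that can be compared side by side.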
We used regression to predict the lag time (the difference between the patent filing date and the patent grant date) for different classes of patents.
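A minimal sketch of this kind of model follows. It assumes one simple least-squares line per patent class, with filing year as the only predictor; the actual features and regression variant we used are not specified here, and the rows, `fit_line` and `predict_lag` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (patent_class, filing_year, lag_in_years) rows; not the project's data.
rows = [
    ("514", 2000, 2.0), ("514", 2002, 3.0), ("514", 2004, 4.0),
    ("424", 2000, 3.0), ("424", 2002, 3.0), ("424", 2004, 3.0),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Group the observations by patent class.
by_class = defaultdict(list)
for patent_class, year, lag in rows:
    by_class[patent_class].append((year, lag))

# Fit one line per class.
models = {}
for patent_class, points in by_class.items():
    years, lags = zip(*points)
    models[patent_class] = fit_line(years, lags)


def predict_lag(patent_class, filing_year):
    """Predicted lag in years for a patent of the given class filed in `filing_year`."""
    slope, intercept = models[patent_class]
    return intercept + slope * filing_year


print(predict_lag("514", 2006), predict_lag("424", 2006))
```

Fitting each class separately keeps the sketch simple; a single model with the class as a categorical predictor would be another reasonable choice.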
We used topic modelling on the patent titles to create 10 comprehensive topics into which the patents can be divided.
We gathered drug data, stock price data, and market share data for Novartis and its competitors to aid our analysis.
Find our reports here: