I'm Parag Chandiwal
I’m a data scientist who is ready to join the business world from academia
Journey
Why Data Science?
Quite a number of people have asked me about my switch from software development to data analytics. How did I do it? When did I do it? Why did I do it? Today (May 12th, 2018) felt like a fitting day to answer these questions (I am graduating!). I hope sharing my story will give some insight into what I did to become a data analyst and encourage budding "anythings" everywhere to pursue their passion fiercely.
My first exposure to data was from a project that had nothing to do with data.
While working on a project for Hinsdale Psychiatry, I noticed how the "Patient Healthcare Questionnaire" was used by practitioners to assess the likelihood of common mental disorders. To put it lightly, I was mind-blown and had to find out more. This discovery came at a tipping point for me, because at the time I was in the final year of my Bachelor's degree. It also made me open to new challenges and to pivoting career-wise. Data science seemed to fit right into that.
I created my first learning curve from an answer on Quora.
I found a very helpful answer, which I recommend to this day to anyone starting out in data: "How can I become a Data Scientist?" This answer shaped my first learning path in January 2016. I enrolled in a Master's program in ITM Data Management at Illinois Tech, Chicago. My graduate study centered on object-oriented programming, advanced data analytics and warehousing, automation, and backend engineering.
How I started my blog — where the real learning started
By the end of 2017, I had slowed down on online courses: 90% of them covered the same content and assumed you’re a beginner, so they became repetitive. By this time, I felt ready to start doing personal projects and writing them up on a blog. Being a painter, I knew creativity is not some talent that you either have or don’t. Creativity is born of experience and confidence in your skills, because the possibilities of what can be done expand the more you know.
Projects
What have I been up to?
Oscar Science!
Oscar Prediction Project
Predicting 2018's Win
Rather than relying on gut instinct, I've used the power of data science to help select the film that is most likely going home with the famous gold statue on March 4th.
To understand my methodology, one must first understand a concept called “supervised learning.” Supervised learning is a machine learning approach that learns the relationship between one output and many inputs from labeled examples. In this case, it helps us understand past outcomes (who won Best Picture previously and why) so we can better predict future outcomes (who will win this year and why). I used a publicly available dataset from Thinkful, covering everything from critic ratings to performance at precursor awards. This data informed the model, which was built using scikit-learn, one of the most popular machine learning toolkits in the world.
Through evaluating multiple models, I determined that random forest classification provided the most accurate prediction of previous Oscar winners. When applied to Oscar winners and losers over the past 38 years, this approach made correct predictions in all but 1 year, 2017.
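To make the approach concrete, here is a minimal sketch of random forest classification with scikit-learn. It is not my project code: the data is synthetic, and the two features (`critic_score`, `precursor_wins`), the 30 ceremonies, and the five-year hold-out are all assumptions made for illustration. It shows one way to turn per-film win probabilities into a single predicted winner per year.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical data: 30 past ceremonies with 5 nominees each.
n_years, n_nominees = 30, 5
years = np.repeat(np.arange(1988, 1988 + n_years), n_nominees)

# Two made-up features per film: a critic score and a count of precursor wins.
critic_score = rng.uniform(50, 100, size=years.size)
precursor_wins = rng.integers(0, 6, size=years.size)
X = np.column_stack([critic_score, precursor_wins])

# Synthetic labels: in each year the film with the strongest noisy signal wins.
signal = critic_score / 100 + precursor_wins + rng.normal(0, 0.5, size=years.size)
y = np.zeros(years.size, dtype=int)
for year in np.unique(years):
    idx = np.flatnonzero(years == year)
    y[idx[np.argmax(signal[idx])]] = 1

# Train on all but the last five ceremonies, then test on those five.
train = years < years.max() - 4
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X[train], y[train])

# Pick one winner per held-out year: the nominee with the highest win probability.
win_prob = model.predict_proba(X[~train])[:, 1]
test_years = years[~train]
test_y = y[~train]
correct = 0
for year in np.unique(test_years):
    idx = np.flatnonzero(test_years == year)
    correct += int(test_y[idx[np.argmax(win_prob[idx])]] == 1)

years_tested = np.unique(test_years).size
print(f"Correct in {correct} of {years_tested} held-out years")
```

Ranking nominees within each year, rather than thresholding each film's probability on its own, guarantees exactly one predicted winner per ceremony.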
And for 2018, the model predicted:
And the Oscar goes to...
The Shape of Water
See complete code on my
Working in STEM!
Predicting the likelihood of working in STEM with respect to gender and race
The American Community Survey (ACS) is the largest continuous household survey in the United States, providing a wealth of information about the economic, social, and demographic characteristics of persons, as well as housing characteristics. The primary objective of this project is to examine as many variables as possible to identify factors affecting an individual’s income and to provide a granular snapshot of the lives of many Americans.
The goal of this project is to build a model that predicts the likelihood of working in a STEM (Science, Technology, Engineering, and Math) career based on basic demographics: age, sex, race, and state of origin.
I've created two logistic regression models. The first one models the likelihood of an individual having a degree in science based on their demographics. The second one models the likelihood of an individual with a science degree getting a job in a STEM field, based on their demographics.
With these two models, I provide a high-level overview of which demographic features are most likely to influence disparities at the level of education, and which are most likely to influence disparities at the level of hiring.
1. Underrepresentation of certain races exists at both the level of education and the level of career placement. However, the effect of underrepresentation in education seems to be much greater. In the first chart, Asian Americans are more than twice as likely as Whites to have a degree in STEM, and Black, Puerto Rican, and Mexican Americans are less than half as likely as Whites to have a degree in STEM. For an underrepresented minority who wants to work in STEM, the biggest part of the hurdle is getting a degree in STEM.
2. The gender gap in STEM degrees is much larger among older Americans, while younger women seem to have closed the gap. In STEM careers, however, the gender gap appears to be increasing. Although women are now earning STEM degrees at a rate equal to that of men, women with STEM degrees are still far less likely than their male counterparts to find careers in STEM.
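Comparisons such as "twice as likely" are often read off a logistic regression as odds ratios, which are the exponentiated coefficients. The sketch below illustrates that reading on synthetic data. It is not the ACS analysis itself: the groups "A" and "B", the sample size, and every coefficient are made up, and an odds ratio is only one possible way to express "twice as likely".

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical sample: group "B" is constructed to have twice the odds of group "A".
n = 20000
is_group_b = rng.integers(0, 2, size=n)
age = rng.uniform(25, 65, size=n)

# True model on the log-odds scale (all coefficients are made up).
log_odds = -1.5 + np.log(2.0) * is_group_b - 0.01 * (age - 45)
has_stem_degree = rng.random(n) < 1 / (1 + np.exp(-log_odds))

# Features: a group indicator and age centered at 45.
X = np.column_stack([is_group_b, age - 45])

# A large C makes scikit-learn's default L2 penalty negligible.
model = LogisticRegression(C=1e6, max_iter=1000)
model.fit(X, has_stem_degree)

# Exponentiating a coefficient gives the odds ratio for a one-unit change.
odds_ratios = np.exp(model.coef_[0])
print(f"Odds ratio, group B vs. group A: {odds_ratios[0]:.2f}")
print(f"Odds ratio per extra year of age: {odds_ratios[1]:.3f}")
```

An odds ratio near 2 for the group indicator means group B's odds of holding a degree are about double group A's, holding age fixed. Fitting a second model of the same form on degree holders only, with STEM employment as the outcome, would separate the education stage from the hiring stage.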
See complete code on my
Chi-Town Crime
Tableau report of Chicago's crime
Visualization of crimes in Chicago using Tableau
Chicago has often been in the national headlines for fluctuations in violent crime. My interactive Tableau report helps visualize and confirm the uptick in the crime rate, and shows where and when crimes are happening in Chi-Town.
The Data
I used the Chicago crime dataset from Kaggle. The data includes the date and time of each crime, the type of crime, and location coordinates.
The dashboard consists of three main components:
1) Heatmap (a map of Chicago that visualizes the density of crime in different areas by charge)
2) Time (analysis over time)
3) Type of offence/crime
Experience
My education and experience
Partner
Interested in collaborating with me?
Skills
Questions, Questions, Questions
My graduate study centered on OOP, advanced data analytics and warehousing, automation, and backend engineering. The two years I spent at Illinois Tech instilled in me an appreciation and excitement for the process of scientific research and its underlying mechanics: hypothesis formulation, study design, statistical programming, data collection, data analysis, and result presentation.