- Ability to deliver valuable insights via analytics, machine learning, and other advanced data-driven methods.
- Expert in data modeling, relational database design, and data analytics. Experience manipulating and analyzing large datasets and finding patterns and insights in structured and unstructured data.
- Proficient with the Unix-based command line interface. Experience developing Spark applications in Scala and Python using Spark RDD and Spark SQL.
- Hands-on knowledge of ETL processing, building data pipelines, and data warehousing techniques. Designed Amazon Machine Images (AMIs) for EC2 instances using the AWS CLI and GUI.
- Experience managing multiple projects simultaneously. Experience mentoring and training developers. Deep understanding of processes in various domains such as engineering, finance, and healthcare.
- Consulted with business partners and made recommendations to improve the effectiveness of reporting and big data systems. Integrated new tools and developed technology frameworks and prototypes to accelerate the data integration process and enable the deployment of predictive analytics. Experience in healthcare.
- Experience designing, reviewing, implementing, and optimizing data transformation processes. Able to consolidate, validate, and cleanse data from a wide range of sources, from applications and databases to files. Experience designing and implementing fast, efficient data acquisition using big data processing techniques and tools.
- Deliver exceptional rather than expected results through strategic thinking, innovative problem-solving, and managing teams and change for performance excellence. Self-directed, disciplined, hardworking, detail-oriented, flexible, and confident. Self-motivated team player with excellent interpersonal and communication skills.
Programming and database: SQL, Python, R, Core Java, MySQL, Oracle, SQLite, PostgreSQL
Statistics: Exploratory data analysis, Hypothesis testing, T-test, F-test, ANOVA, Chi-squared test, Regression analysis, Significance and confidence intervals
Tools: Power BI, Tableau, AWS, Advanced Excel, Git, SAS, Visio, TensorFlow, SSIS, ETL
Data Mining/Extraction: NumPy, Pandas, SciPy, sklearn, Keras, Feature selection, Feature extraction
Hadoop Ecosystem: Spark, Hive, HDFS, Kafka, Sqoop, Oozie, Pig, HBase, ZooKeeper, YARN
Cloud Services: AWS, S3, AWS CLI, RDS, CloudFormation, CodePipeline, SQS, SNS, SageMaker
OS and Scripting: XML, JSON, Unix shell scripting, Windows, Linux
ML Algorithms: Regression, SVM, Decision Tree, Random Forest, Bagging & Boosting, KNN, Clustering, Neural Network
Confidential - Minneapolis, MN
- Analyzed trends from Tableau dashboards, made updates to data quality, and reported the results.
- Created calculations such as string manipulations, arithmetic calculations, custom aggregations and ratios, logic statements, and quick table calculations.
- Used statistical techniques to describe data using groups, bins, hierarchies, sorts and filters.
- Extracted and manipulated data from databases using SQL. Cleaned and prepared data for analysis and visualization.
- Optimized slow-running SQL queries through performance tuning and proper indexing strategies.
- Designed data visualizations, interactive dashboards, and trendline analyses using QuickSight.
- Analyzed IoT sensor data and recommended the best-suited ML algorithms for building models.
- Performed extensive data cleaning and ensured data quality, consistency, and integrity using Pandas and NumPy.
- Performed univariate and multivariate analysis to identify underlying patterns in the data and associations between variables. Performed data imputation using the scikit-learn package in Python.
- Participated in feature engineering, including feature intersection generation, feature normalization, and label encoding with scikit-learn preprocessing.
- Worked with machine learning algorithms such as linear and logistic regression, SVMs, and decision trees.
- Used AWS SageMaker to quickly build, train, and deploy machine learning models.
- Gathered business requirements and translated them into technical requirements. Delivered precise functional specifications, including functional hierarchy, workflow, definitions, and outstanding issues, while considering all impacted components from an end-to-end perspective.
- Extracted revenue, sales, and profit data for analysis from sources such as MSSQL, applied business-related transformations, and loaded the results into serving/destination stores such as Postgres.
- Worked on data cleaning to support the development of forecasting models.
- Created SSIS packages to extract data from Excel and MS Access files using Execute SQL Task, Data Flow Task, Execute Package Task, etc., to generate the underlying data for operational and sales performance reports.
- Exported cleaned data from different database sources to CSV files for business and reporting teams.
- Integrated custom visual reports based on requirements using Power BI.
- Developed an ETL framework in Python on Spark (PySpark) capable of handling millions of healthcare customer records.
- Designed and developed a customer data analytics solution using Spark to help the business make better decisions on enrollments.
- Created Hive tables and queried them using HiveQL.
- Automated the data supply to multiple applications used during the Open Enrollment Period (OEP) using SQL stored procedures.
- Designed and developed cross-domain, multi-metric data loading and transformations using Spark to deliver data to Postgres.
- Improved the performance of existing jobs using SparkContext, Spark SQL, and Spark on YARN.
- Worked as a developer in an agile environment. Created databases and tables via migrations and implemented CRUD operations in Python's Django framework.
- Gathered requirements, created sprints, and conducted Scrum meetings, planning and coordinating with the team. Provided ongoing maintenance, bug fixes, and patch sets for the existing application.
- Played a lead role in planning the database structure and end-user options for the dashboard of Confidential .com, which admins used to add, edit, and delete posts.
- Worked with REST APIs and routing, and designed dynamic pages in the Angular and Aurelia frameworks.
- Worked with the digital marketing team to create SEO-friendly pages with meta tags and maintained the required SEO standards. Performed unit testing of modules before pushing to the staging server.
- Designed and created an analytics dashboard for admins. Leveraged knowledge of GitLab version control and AWS cloud storage.
- Worked on full-stack development of the website Confidential .com, a social media platform for schools.