- 7+ years of experience in IT, including 4+ years as a Data Analyst transforming raw data into actionable insight into business processes to guide strategic and tactical decision-making.
- Experience with cloud technologies, including Amazon EC2 and S3, supporting both development and production environments.
- Experience working with RDBMSs including Oracle, DB2, SQL Server, PostgreSQL 9.x, MS Access, and Teradata for faster access to data on HDFS.
- Extensive experience applying data mining solutions to business problems and generating data visualizations using Tableau, Power BI, Birst, and Alteryx.
- Developed Alteryx solutions that supply dashboard data in formats including JSON, CSV, and Excel.
- Experience working with business intelligence and data warehouse software, including SSAS, SSRS, SSIS, Business Objects, Amazon Redshift, Azure Data Warehouse, and Teradata.
- Working experience with data analysis in Python using NumPy, Pandas, and SciPy, and with visualization libraries such as Seaborn and Matplotlib.
- Worked with big data technologies such as Hadoop/HDFS, Spark, MapReduce, Pig, Hive, and Sqoop to extract data from heterogeneous sources (Oracle, flat files, XML, and streaming data sources), load it into the EDW, and transform it for analysis (ETL/ELT).
- Experience with Agile (Scrum) methodology, including participating in daily scrum meetings, sprint planning, and product backlog creation.
Programming Languages: Python, UNIX shell scripting, Perl, Visual Basic, T-SQL, PL/SQL, C#, Hive, Spark, HiveQL, Lua
Tools: SQL Workbench, Teradata SQL Assistant, PuTTY, SSIS, SSAS, SAP Crystal Reports, TOAD, SQL Developer
Big Data Technologies: HDFS, Sqoop, Flume, Oozie, PySpark, Data Lake, HBase, Redshift, Kafka, YARN, Spark Streaming, MLlib, ZooKeeper
Data Analysis libraries: Pandas, NumPy, SciPy, Scikit-learn, Statsmodels, NLTK, Plotly, Matplotlib
Data Modeling Tools: Toad Data Modeler, SQL Server Management Studio, MS Visio, SAP PowerDesigner, Erwin 9.x
Databases: Teradata MVS, MySQL, PostgreSQL, Oracle 12c/11g/10g/9i, MS Access 2016/2010, Hive, SQL Server 2014/2016, Amazon Redshift, Azure SQL Database
Reporting Tools: Crystal Reports XI/2013, SSRS, Business Objects 5.x/6.x, Tableau, Informatica PowerCenter
Cloud Technologies: Amazon Web Services (AWS), Microsoft Azure (familiar), Amazon EC2
Analytics: Alteryx, Tableau, Power BI, MS Excel
Project Execution Methodologies: Agile, Scrum, Lean Six Sigma, Ralph Kimball and Bill Inmon data warehousing methodology
BI Tools: Alteryx, Tableau, Birst, Power BI, SAP Business Objects, SAP Business Intelligence
Operating Systems: Windows Server 2012 R2/2016, UNIX, CentOS
Confidential, Melbourne, FL
- Interact with the Technology Risk team and participate in the development of new solutions to further advance the maturity and risk reporting capabilities.
- Work with data analytics team to ingest, prepare and transform data to produce metrics that inform the technology risk posture.
- Present insights from the weekly and monthly updates to the leadership team, and gather functional and non-functional requirements to enhance the process.
- Migrate data from Teradata to AWS using Python and BI tools such as Alteryx.
- Automate the data flow in Alteryx from data sources (flat files, PostgreSQL database) to an S3 bucket using Python, SQL, and Alteryx's built-in capabilities, and provide data files for Tableau reporting.
- Work on building the automation factory and setting up Alteryx Server to improve the reporting process and enhance the customer experience.
- Write scripts to automate data processing and data access on the AWS (Amazon Web Services) cloud.
- Check data and table structures in the PostgreSQL and Redshift databases and run queries to generate reports.
Environment: Teradata, Redshift, PostgreSQL, Tableau, Birst, SQL, UNIX, Lua, SQL Workbench, Python, BI, DWH, AWS, S3, Alteryx
- Collaborate with business leaders on data initiatives that use data to optimize business KPIs such as revenue and circulation, working alongside a team of data professionals focused on analytics and insight, data engineering, and data science.
- Work with the data analytics team to create user groups for selling targeted advertisements.
- Used A/B testing, multivariate testing, and conversion optimization techniques across digital platforms.
- Worked on migrating data from Teradata to AWS using Python and SQL.
- Created Python programs that automate building Excel sheets from data read from Redshift databases.
- Created instances in AWS and migrated data from the data center to AWS using services including Kinesis Firehose, AWS Snowball, and S3 Transfer Acceleration.
- Worked on multiple AWS instances and configured security groups, Elastic Load Balancer, and AMIs.
- Automated cleaning and processing of 150+ data sources using Python and Informatica.
- Created macros to generate reports on a daily and monthly basis and to move files from Test to Production.
- Performed data analysis on analytic data in Teradata, Hadoop/HIVE/Oozie/Sqoop, and AWS using SQL, Teradata SQL Assistant, Python, Apache Spark, and SQL Workbench.
- Extracted log file data and sent it to HDFS using Flume.
- Created history tables and views on top of the data mart and production databases in response to ad hoc requests, using Teradata, Hadoop/HIVE/Oozie/Sqoop, BTEQ, and UNIX.
- Developed SQL scripts for data loading and table creation in Teradata and Hadoop/HIVE/Oozie/Sqoop.
- Experience in end-to-end BI and DWH implementation, from raw data acquisition through dashboards and KPI measurement, using tools like Tableau.
Environment: Teradata, Oracle, SQL, BTEQ, UNIX, Kinesis Firehose, AWS Snowball, S3 Transfer Acceleration, Teradata SQL Assistant, Python, Apache Spark, SQL Workbench, Hadoop, HIVE, Oozie, Sqoop, BI, DWH, Birst, Elastic Load Balancer, A/B testing, Informatica, Tableau, HDFS, Flume.
Confidential, Madison, WI
- Worked with health services research teams and business analysts to gain a clear overview of large claims databases, electronic medical records, and other healthcare registry data.
- Developed models for personalized medicine and more effective treatment, based on individual health data paired with predictive analytics for better disease assessment.
- Developed and tested hypotheses in support of research and product offerings, and communicated findings to clients in a clear, precise, and actionable manner.
- Worked with Enterprise Data Management and Data Integration teams to identify, understand, and resolve data issues and to improve the efficiency, productivity, and scalability of products and processes.
- Understanding of data management technologies including Hadoop, Python, Hive, Spark, Flume, and Oozie, and cloud technologies such as AWS Redshift and S3.
- Created EC2 instances in AWS and migrated data from the data center to AWS using Snowball and AWS migration service.
- Extensively used Python libraries such as NumPy, Pandas, and SciPy for data wrangling and analysis, and visualization libraries such as Seaborn and Matplotlib for plotting graphs. Presented dashboards using BI analytics tools such as Power BI.
- Worked on multiple AWS instances and configured security groups, Elastic Load Balancer, and AMIs.
- Performed data analysis on analytic data in Teradata, Hadoop/HIVE/Oozie/Sqoop, and AWS using SQL, Teradata SQL Assistant, Python, R, Apache Spark, and SQL Workbench.
Environment: Teradata, Oracle, SQL, BTEQ, UNIX, Python, Apache Spark, SQL Workbench, Hadoop, HIVE, Oozie, Sqoop, Elastic Load Balancer, AMI, A/B testing, Seaborn, Matplotlib, AWS EC2, S3, Redshift, Power BI.
Confidential, Illinois, IL
- Worked with teams to identify and construct relations among various parameters when analyzing customer response data.
- Developed and improved bidding algorithms for daily optimization using Python, continuously analyzed and tested new data sources, and performed research analysis on bidding strategies.
- Developed automated data pipelines from external data sources (web pages, APIs, etc.) to the internal data warehouse (SQL Server, AWS), with export to reporting tools such as Tableau.
- Carried out mathematical operations for calculations using the Python libraries NumPy, SciPy, and Pandas.
- Configured big data workflows to run on top of Hadoop, comprising heterogeneous jobs such as Pig, Hive, and Spark.
- Ensured data matched the business requirements, and designed and deployed reports with drill-down and drop-down menu options as well as parameterized and linked reports using Tableau.
- Defined and created cloud data strategies, including designing a multi-phased implementation using AWS and S3.
- Created views in Tableau Desktop that were published to the internal team for review, further data analysis, and customization using filters and actions; used KPIs to track business performance.
Environment: Python, API, NumPy, SciPy, Pandas, scikit-learn, statsmodels, Hadoop, Pig, Hive, HiveQL, Tableau, AWS, S3, Redshift, JSON, KPI, ETL.
- Involved in the complete software development life cycle (SDLC): requirement analysis, conceptual design, detailed design, development, system testing, and user acceptance testing (UAT).
- Worked with the internal DuPont Pioneer team to analyze, track, and visualize data on genetically modified products.
- Identified and analyzed patterns.
- Designed experiments, tested hypotheses, and built models.
- Conducted advanced data analysis and applied complex algorithms to help stakeholders discover and quantify leading information, using data analytics software and tools including Python, R, Hadoop, Spark, Teradata, and Redshift.
- Worked with statistical analysis methods such as time series analysis, regression models, principal component analysis, and multivariate analysis.
- Designed a test system with Oracle, Oracle SQL, AIX/Unix, and Unix shell scripts.
- Performed data analysis on analytic data in Teradata, Hadoop/HIVE/Oozie/Sqoop, and AWS using SQL, Teradata SQL Assistant, Python, Apache Spark, and SQL Workbench.
- Created daily, weekly, and monthly reports using Teradata, Hadoop/HIVE/Oozie/Sqoop, MS Excel, and UNIX.
Environment: Python, Hadoop, Spark, Oracle, Teradata, Redshift, HIVE, Oozie, Sqoop, SQL Server, AWS, SQL Workbench, SDLC, UAT.
- Worked with domain experts to identify relevant data in large data sets from multiple sources, understand linkages, and develop use cases.
- Assisted in the development of risk management predictive / analytical models to help management identify, measure and manage risk.
- Experience working with Technology, Operations, Legal and Business partners to successfully deliver requirements and translate business logic into functional requirements.
- Created big data clusters with technologies such as Hadoop, Hive, Flume, Sqoop, and Oozie, and used Python with data analytics libraries such as Pandas, NumPy, Matplotlib, and Plotly to efficiently ingest, store, and analyze data.
- Expertise in creating databases, users, tables, triggers, macros, views, stored procedures, functions, packages, joins, and hash indexes in Teradata databases.
- Supported applications using the Jira ticket management system (TMS).
- Knowledge of statistics and experience analyzing datasets with packages such as NumPy, Pandas, and Matplotlib, and automating processes with shell scripts in a Unix environment.
Environment: Python, SQL Developer, Hadoop, Hive, Flume, Sqoop, Oozie, SAS, SPSS, Unix, Oracle/DB2, Teradata, Jira, BTEQ, TOAD, PL/SQL, Teradata SQL Assistant, Excel, Agile, FastExport, FastLoad, MultiLoad.