Data Scientist Resume
Philadelphia, PA
SUMMARY:
- Self-motivated, results-oriented data professional with strong problem-solving and analytical skills and diversified experience in Data Science, Analytics, and Big Data
- Experience in Extraction, Transformation, and Loading (ETL) of data from multiple sources like Flat files, XML files, and Databases
- Good understanding of various Machine Learning algorithms
- Experience with the Data Warehouse life cycle, methodologies, and tools for reporting and data analysis
- Good knowledge of Hadoop cluster architecture and cluster monitoring
- Created action filters, parameters, and calculated sets for preparing dashboards and worksheets in Tableau and Yellowfin
- Experience in building Hive, Pig, and MapReduce scripts
- Good understanding of cloud configuration in Amazon Web Services (AWS)
- Strong experience working with Data Structures and Algorithms
- Strong experience in designing and developing software applications using Spring, Hibernate, J2EE, and other Java technologies
- Good experience in the complete Software Development Life Cycle (SDLC), including planning, design, development, testing, and documentation
- Developed Ant scripts to build and deploy applications and used Maven to build and manage Java projects
- Good knowledge of Amazon EC2, Amazon S3, and Amazon RDS
- Experience working on Web and Application servers like Tomcat, WebLogic, WebSphere
- Good experience in developing RESTful Web Services
- Experience in writing complex SQL queries across multiple relational databases
- Experience in troubleshooting and resolving complex issues in a timely and efficient way
- Possess strong analytical, technical and problem-solving skills
- Strong experience working on Jenkins as a Continuous Integration Tool
- Strong experience in using Postman for API development
- Experience with unit testing using JUnit, TestNG
- Strong industry experience working in an Agile environment
- Experience in developing applications using editors like Eclipse, RubyMine
- Strong working experience with SCM tools like SVN, GIT, Confidential Rational Team Concert
- Experience in designing UML using Rational Rose, Microsoft Visio design tools
- Good knowledge in writing complex SQL queries, Stored Procedures, Views, PL/SQL and RDBMS concepts
- Experience in creating use case models and use case, class, and sequence diagrams using Microsoft Visio and Rational Rose
- Experience in the design and development of systems based on object-oriented analysis and design (OOAD) using Rational Rose
- Excellent writing, presentation, and communication skills; able to facilitate interdisciplinary team endeavors
TECHNICAL SKILLS:
Languages: Java, Python, R, Ruby, C++, SQL
Big Data Techniques: Map-Reduce, Hadoop, HDFS, Hive, Spark
Data Visualization: Tableau, Yellowfin
Business Intelligence Tools: SAS Enterprise Miner 9.3, Business Objects
Documenting & Reporting: Jupyter Notebook
Relational Database: MS Access, Oracle 11g, MySQL, Confidential DB2
NoSQL Database: CouchDB, MongoDB
Analysis Tools: Google Analytics, Google AdWords
Mobile/Web Automation: Calabash, Appium, Selenium WebDriver, Cucumber
Unit testing: JUnit, TestNG
Web development: HTML, CSS, JSP, JavaScript, Node.js, jQuery, AngularJS, XML, JSON
Framework/Others: Spring, Hibernate, Web Services (SOAP, REST)
Cloud Computing: Amazon Web Services (AWS)
IDE: Eclipse, NetBeans, Confidential RAD
Source control software: SVN, Git, Confidential Rational Team Concert
Continuous Integration: Jenkins
Design tools: Rational Rose, Microsoft Visio
Servers: Tomcat, WebLogic, WebSphere Application Server
Modelling Language: UML
Other Software: Charles, Postman
OS: Windows, Linux, Mac OS
PROFESSIONAL EXPERIENCE:
Data Scientist
Confidential, Philadelphia, PA
Responsibilities:
- Built classification and regression predictive models using advanced ML tools and platforms
- Explored and analyzed unstructured, semi-structured data for hidden trends & patterns
- Developed metrics and data streams to help product teams better utilize customer-feedback signals and make intelligent, data-driven decisions to improve products
- Conducted data preparation and built the models using Python and R
- Performed ETL using Pig, Hive, and MapReduce to transform transactional data into denormalized form
- Measured and provided suggestions on improving the effectiveness of in-house feedback channels
- Analyzed customer feedback and reported insights on customer satisfaction and customer experience from in-house sources and social channels
- Optimized Pig scripts using specialized joins, drastically reducing run time
- Identified advocates for marketing campaigns using customer feedback
- Applied a logistic regression model to predict product growth based on usage and customer feedback (see the regression sketch below)
- Created and presented executive dashboards to show the trends in the data
- Explained analysis to top management and advised them
- Interacted with clients to understand and troubleshoot their issues
- Conducted unit testing for the development team within the sandbox environment.
- Collected data from public-facing APIs, validated it, and ingested it via an automated data pipeline (see the pipeline sketch below)
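A minimal sketch of the product-growth logistic regression described above, using synthetic data and assumed features (monthly active users, average rating, support ticket volume), since the actual data and feature set are not part of this resume:

```python
# Minimal sketch of a product-growth classifier; all data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Assumed features: monthly active users, average rating, ticket volume (standardized).
X = rng.normal(size=(500, 3))
# Assumed label: 1 if the product grew the following quarter, else 0.
y = (X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("growth probability for one product:", model.predict_proba(X_test[:1])[0, 1])
```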
Environment: Python, R, SQL, RStudio, Hadoop, MapReduce, Hive, Java, Pig, Spyder, Jupyter Notebook, Anaconda
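A minimal sketch of the automated collect-validate-ingest pipeline described above; the endpoint URL, required fields, and SQLite target are illustrative assumptions rather than details of the actual project:

```python
# Minimal collect -> validate -> ingest sketch; endpoint, schema, and
# SQLite target are hypothetical placeholders.
import sqlite3

import requests

API_URL = "https://api.example.com/v1/feedback"   # hypothetical endpoint
REQUIRED_FIELDS = {"id", "rating", "comment"}     # assumed record schema


def collect(url):
    """Pull raw JSON records from a public-facing API."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()


def validate(records):
    """Keep only records that contain every required field."""
    return [r for r in records if REQUIRED_FIELDS <= set(r)]


def ingest(records, db_path="feedback.db"):
    """Write validated records to a local SQLite table."""
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS feedback (id TEXT, rating INTEGER, comment TEXT)"
        )
        conn.executemany(
            "INSERT INTO feedback VALUES (?, ?, ?)",
            [(r["id"], r["rating"], r["comment"]) for r in records],
        )


if __name__ == "__main__":
    ingest(validate(collect(API_URL)))
```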
Sr Data Engineer
Confidential, Herndon, VA
Responsibilities:
- Applied regression models to salary data and helped the HR team predict salary ranges for employees (see the salary regression sketch below)
- Built models using Python and R and helped management make important business decisions
- Used MapReduce to index large volumes of data for quick access to specific records
- Performed ETL using Pig, Hive, and MapReduce to transform transactional data into denormalized form
- Involved in statistical analysis, data modeling, design, and development for various projects
- Applied regression methods to sales data and provided analysis to the management team
- Created dashboards and reports using structured and unstructured data
- Assisted the team responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, managing and reviewing data backups and Hadoop log files
- Created HBase tables to store various data formats coming from different portfolios
- Worked with teams in various locations nationwide and internationally to understand and accumulate data from different sources
- Worked with the testing teams to fix bugs and ensure smooth and error-free code
- Created, optimized, and modified triggers, complex stored functions, and procedures
- Performed requirement analysis, impact analysis, coding, and unit testing for database production requests and change requests
Environment: Python, R, SQL, RStudio, Hadoop, MapReduce, Hive, Java, Pig, Spyder, Jupyter Notebook
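A minimal sketch of the salary-range regression described above, using synthetic data and assumed predictors (years of experience, job level); the range here is one residual standard deviation around the point estimate, which is an illustrative choice rather than the method actually used:

```python
# Minimal salary-range regression sketch; the columns and data are synthetic.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Assumed predictors: years of experience and job level.
df = pd.DataFrame({
    "years_experience": rng.uniform(0, 20, size=300),
    "job_level": rng.integers(1, 6, size=300),
})
df["salary"] = (
    45_000
    + 3_500 * df["years_experience"]
    + 12_000 * df["job_level"]
    + rng.normal(scale=5_000, size=300)
)

features = ["years_experience", "job_level"]
model = LinearRegression().fit(df[features], df["salary"])

# Point estimate for a hypothetical employee, widened into a range by
# one residual standard deviation on either side.
point = model.predict(pd.DataFrame({"years_experience": [8], "job_level": [3]}))[0]
resid_std = (df["salary"] - model.predict(df[features])).std()
print(f"estimated salary range: {point - resid_std:,.0f} - {point + resid_std:,.0f}")
```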
Data Engineer
Confidential, Framingham, MA
Responsibilities:
- Developed Python scripts to extract, transform, and load data from Amazon S3 into a MySQL database (see the ETL sketch below)
- Set up an Elastic Load Balancer (ELB) to automatically distribute incoming application traffic across multiple Amazon EC2 instances in the cloud
- Installed and configured Apache Hadoop to test the maintenance of log files in the Hadoop cluster
- Installed and configured Hive and Pig on the Hadoop cluster
- Parsed and analyzed Splunk data using Node.js
- Developed simple and complex Map/Reduce Jobs using Hive and Pig.
- Developed MapReduce Programs for data analysis and data cleaning.
- Setup Yellowfin (BI tool) on Amazon EC2 using Yellowfin AMI and created reports using Yellowfin
- Wrote complex SQL queries for retrieving and updating data in MySQL tables
- Setup a development environment in the cloud for the team
- Created views and stored procedures in MySQL
- Built various graphs for business decision-making using the Python matplotlib library (see the chart sketch below)
- Worked on importing and writing data to HBase and reading it back using Hive
- Continuously monitored and managed the Hadoop cluster using Cloudera Manager
- Migrated ETL processes from Oracle to Hive to test data manipulation
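A minimal sketch of the kind of decision-support chart mentioned above; the metric names and numbers are illustrative assumptions, not actual business figures:

```python
# Minimal matplotlib decision-support chart; all figures are illustrative.
import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
orders = [1200, 1350, 1280, 1500, 1620, 1750]   # assumed order volumes
returns = [90, 110, 95, 120, 130, 140]          # assumed return counts

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(months, orders, marker="o", label="Orders")
ax.plot(months, returns, marker="s", label="Returns")
ax.set_xlabel("Month")
ax.set_ylabel("Count")
ax.set_title("Orders vs. Returns (illustrative data)")
ax.legend()
fig.tight_layout()
fig.savefig("orders_vs_returns.png")
```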
Environment: Python, Node.js, Amazon EC2, Hadoop, MapReduce, Hive, Java, Amazon S3, Amazon RDS, MySQL, Yellowfin, Visual Studio Code
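A minimal sketch of the S3-to-MySQL load described above; the bucket, object key, table, and connection details are illustrative assumptions, and the transform step and error handling are omitted for brevity:

```python
# Minimal S3 -> MySQL load sketch; bucket, key, table, and credentials
# are hypothetical placeholders.
import csv
import io

import boto3
import pymysql

BUCKET = "example-bucket"        # hypothetical bucket
KEY = "exports/orders.csv"       # hypothetical object key


def extract(bucket, key):
    """Download a CSV object from S3 and parse it into dicts."""
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    return list(csv.DictReader(io.StringIO(body.decode("utf-8"))))


def load(rows):
    """Insert the parsed rows into a MySQL table."""
    conn = pymysql.connect(host="localhost", user="etl", password="...", database="sales")
    try:
        with conn.cursor() as cur:
            cur.executemany(
                "INSERT INTO orders (order_id, amount) VALUES (%s, %s)",
                [(r["order_id"], r["amount"]) for r in rows],
            )
        conn.commit()
    finally:
        conn.close()


if __name__ == "__main__":
    load(extract(BUCKET, KEY))
```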
Software Engineer
Confidential, Littleton, Massachusetts
Responsibilities:
- Designed and implemented application components in an Agile environment utilizing a test-driven development approach
- Implemented end-to-end UI (JavaScript) and server-side code (Java) for compacting the size of Jena Triplestore
- Developed REST endpoints for cloud ready configuration and UI consumption
- Debugged and fixed existing bugs in the product
- Participated in code review meetings and provided suggestions to the team members
- Implemented JUnit tests to improve the existing automated testing framework
- Identified defects by writing smoke tests and resolved failures in functional tests
- Developed validations using JavaScript for an important feature in the product
- Participated in future release planning
- Created technical documents for the completed stories
- Set up a multi-node environment where team members could test their features during product releases
Environment: Java 1.7, JavaScript, HTML, CSS, JSP, Servlets, JSON, XML, RESTful Web Services, Apache Tomcat 7.0, JUnit, Eclipse, Confidential RTC, CouchDB
Assistant Systems Engineer
Confidential
Responsibilities:
- Determined optimal distribution of merchandise to stores using a Decision Support software application for JCPenney
- Resolved technical issues that were reported by users of Decision Support software application
- Identified issues in the software application and reported to the development team
- Performed smoke testing to check the functionality of important features in the application
- Improved application performance by tuning Oracle queries and introducing views when necessary
- Identified defects in the application by testing it manually
- Gained experience in the complete project life cycle, including planning, design, testing, and documentation
- Coordinated between onsite and offshore teams and took responsibility for sending daily status updates
- Developed validations using JavaScript for important forms in software application
- Developed a module in web application using HTML, CSS, JSP, JavaScript for granting and tracking the loans for customers
- Debugged and resolved bugs in software application that were reported by the testing team
- Involved in writing SQL Queries for retrieving and updating data in tables
- Created UML diagrams based on business requirements and shared them with the team
- Helped the team resolve a critical production issue by framing complex SQL queries and applying the updates in the database, preventing revenue loss for the client
Environment: Java 1.6, HTML, CSS, JavaScript, JSP, Servlets, JSON, XML, Spring, Apache Tomcat, JUnit, Eclipse, SVN