Big Data Analyst/Engineer Resume
Bentonville, Arkansas
PROFESSIONAL SUMMARY:
- Over 2.5 years of professional IT experience in software development, big data management, and the implementation and testing of enterprise-class systems spanning big data frameworks, advanced analytics, and Java/J2EE technologies
- Good understanding of data warehousing, analytics, data visualization, data flows, data validation, and data modeling
- Strong knowledge of business and statistical analysis toolsets such as MS Excel VBA, SQL, R, and Tableau
- Good knowledge of Cloudera, Hortonworks, and Amazon Web Services
- Working knowledge of SQL Server databases and SQL commands, including create, read, update, and delete operations as well as complex joins
- Experience in writing SQL queries for reporting needs
- Skilled in Tableau for data visualization, reporting, and analysis
- Experience in Data Quality Assurance processes
- Skilled in Teradata, SSIS, SSRS, and SSAS using SQL Server 2008 R2
- Good knowledge of AWS components such as VPC, EC2, EBS, and Redshift
- Experience using Sqoop to export data to relational databases for visualization and report generation
- Experience working with the Hadoop ecosystem and NoSQL databases
- Skilled in data transformation and analysis using Spark and Hive
- Experience working on a team that built a big data analytics solution using the Cloudera Hadoop distribution
- Experience with Git for version control, JIRA for project tracking, and Jenkins for continuous integration
- Experienced working with data in different file formats
- Knowledge of unit testing frameworks such as JUnit and MRUnit
- Strong understanding of NoSQL databases such as HBase and Cassandra, and relational databases such as SQL Server, IBM DB2, MySQL 5.0, Oracle 11, and PostgreSQL
- Extensively involved throughout the Software Development Life Cycle, from initial planning to implementation, using Agile and Waterfall methodologies
- Results-oriented self-starter who seeks out challenges, rapidly learns and applies new technologies, and adapts quickly to new environments
TECHNICAL SKILLS:
Languages: T-SQL, PL/SQL, R, Visual Basic, HTML, C, C++, SAS
Database: MS SQL Server 2012, Oracle, MS Access
Hadoop Ecosystem: Hive, Scala, Spark, Cassandra, Mahout, Oozie
ETL Tools: SQL Server Integration Services, Teradata
Reporting tools: Tableau, Advanced MS Excel (Macros, Pivot Tables, VLOOKUP, etc.)
Tools/Methodologies: Power Pivot, MS Visio, MS Project
PROFESSIONAL EXPERIENCE:
Confidential, Bentonville, Arkansas
Big Data Engineer
Responsibilities:
- Involved in the design and development of various enterprise applications using Typesafe technologies such as Scala, the Play Framework, and Akka
- Developed file-to-message conversion using tokenization, detokenization, encryption, and decryption, converting the results into JSON Web Tokens
- Experienced in handling PCI data and pushing it to the Azure cloud
- Responsible for managing and validating data coming from different sources
- Responsible for loading data from UNIX file systems to the Azure cloud (PCI to non-PCI)
- Implemented a streaming application to monitor sales control of items across all US stores using Kafka, Spark, and Apache Flink
- Implemented a streaming application to maintain the item catalogue using Kafka, Flink, and Cassandra (a minimal sketch follows the Environment line below)
- Created Hive external tables for each source table in the data lake and wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data
- Created REST clients to consume data from other partner applications.
- Actively interacted with the business to fully understand system requirements
- Tested in the development environment to ensure the required changes were applied without disturbing existing functionality
- Provided support for system application testing, business application testing, and non-functional testing
- Implemented exception handling and stored procedures for batched code-set lookups
- Designed, built, and coordinated an automated build-and-release CI/CD process using Git, Jenkins, Nexus, and Looper on hybrid IT infrastructure
- Followed the Test-Driven Development (TDD) methodology, developing test cases and unit tests using JUnit and ScalaMock
- Worked with Agile software lifecycle methodologies; created design documents as required and performed coding, debugging, and testing
Environment: Play Framework 2.6.16, Akka 2.0, Apache Camel 2.19, Scala 2.10, Java 1.8, Apache Flink 1.5.2, Apache Spark 1.6, Hive, Cassandra, MongoDB, REST web services, ScalaMock, Looper, Jenkins, Agile methodology, Git.
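A minimal sketch of the Kafka-to-Flink streaming pattern behind the catalogue application above, using Flink 1.5's Scala API. The broker address, topic name, consumer group, and comma-separated record format are hypothetical placeholders, and the sketch prints running counts to stdout where the production job wrote to Cassandra:

    import java.util.Properties

    import org.apache.flink.api.common.serialization.SimpleStringSchema
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011

    object ItemCatalogueStream {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // Hypothetical Kafka connection settings.
        val props = new Properties()
        props.setProperty("bootstrap.servers", "localhost:9092")
        props.setProperty("group.id", "item-catalogue")

        // Consume raw item-update events from a placeholder topic.
        val updates: DataStream[String] = env.addSource(
          new FlinkKafkaConsumer011[String]("item-updates", new SimpleStringSchema(), props))

        // Parse assumed "itemId,storeId,qty" records and keep a running
        // update count per item; the real job upserted into Cassandra instead.
        updates
          .map(_.split(","))
          .filter(_.length == 3)
          .map(fields => (fields(0), 1))
          .keyBy(0)
          .sum(1)
          .print()

        env.execute("item-catalogue-stream")
      }
    }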
Confidential
Big Data Analyst/Engineer - Confidential, Networking Lab, Fall 2017
- Assisted faculty members with classroom instruction, exams, and the tutoring and mentoring of students
- Also worked on establishing lab sessions on Java
- Provided group and individual instruction on database programming (SQL)
Confidential
Big Data Analyst/Engineer - Confidential, Spring 2018
- Provided support to library managers and oversaw daily activities, including registering patrons, updating records, answering reader inquiries, shelving returned items, and cataloguing new resources
- Offered technical support to patrons and employees with computer software questions
Confidential
Big Data Analyst/Engineer
Responsibilities:
- Worked on big data integration and analytics based on Hadoop, Hive, and NoSQL databases
- Worked with highly unstructured and semi-structured large data sets to aggregate and report on them
- Migrated the required data from Oracle and MySQL into HDFS using Sqoop and imported flat files of various formats into HDFS
- Automated all jobs that pull data and load it into Hive tables using Oozie workflows
- Performed various transformations such as sort, join, aggregation, and filter to derive datasets using Apache Spark (see the sketch after this section)
- Stored DataFrames into Hive tables using Python (PySpark)
- Developed various Spark applications using the Spark shell (Scala)
- Involved in creating Hive tables, loading them with data, and writing Hive queries that invoke MapReduce jobs in the backend
- Created Hive tables and developed Spark routines that perform inserts and updates in Hive
- Created and optimized Hive scripts for data analysis based on the requirements
- Developed Oozie workflows to generate monthly report files automatically
- Worked with an Agile methodology to ensure delivery of high-quality work in monthly iterations
Environment: Hadoop, HDFS, Sqoop, Spark, Hive, Oozie, Agile Methodology, HDP 2.3.
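A representative sketch of the Spark transformation and Hive-loading pattern described in the bullets above. The HDFS path, table names, and columns are hypothetical, and the sketch assumes a Spark 2.x SparkSession with Hive support; on the Spark 1.x that shipped with HDP 2.3, a HiveContext would play the same role:

    import org.apache.spark.sql.SparkSession

    object CategorySalesReport {
      def main(args: Array[String]): Unit = {
        // enableHiveSupport lets spark.table and saveAsTable target Hive tables.
        val spark = SparkSession.builder()
          .appName("category-sales-report")
          .enableHiveSupport()
          .getOrCreate()
        import spark.implicits._

        // Hypothetical input: Sqoop-imported order records landed in HDFS as CSV.
        val orders = spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("hdfs:///data/landing/orders")

        // Representative filter / join / aggregate / sort pipeline.
        val items = spark.table("warehouse.items") // assumed Hive dimension table
        val summary = orders
          .filter($"status" === "COMPLETE")
          .join(items, Seq("item_id"))
          .groupBy($"category")
          .count()
          .orderBy($"count".desc)

        // Persist the result as a Hive table for downstream reporting.
        summary.write.mode("overwrite").saveAsTable("warehouse.category_sales")

        spark.stop()
      }
    }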