Big Data Engineer Resume
St. Louis
SUMMARY:
- Six years of total IT experience, including 2+ years across all phases of Hadoop development and 3+ years developing and supporting database applications with SQL/PL/SQL.
- Functional experience in the Insurance Services and Financial Data Services domains.
- Expertise in data management and in implementing Big Data applications using the Hadoop framework.
- Involved in designing, developing and implementing connectivity products that allow efficient exchange of data between our core database engine and the Hadoop ecosystem.
- Excellent understanding of Hadoop architecture and components such as Hadoop 2.0+, HDFS, MapReduce, YARN, Hive 0.12+, HBase 0.98+, Sqoop 1.4.2+, Spark 1.4.0+, Kafka 0.8.1+, ZooKeeper 3.4+, and Oozie 3.3+.
- Hands-on experience installing, configuring, and using Hadoop ecosystem components such as HBase, Oozie, Hive, Sqoop, Pig, and Flume.
- Hands-on experience with Spark Core 1.6, Spark SQL 1.6, and Spark Streaming 1.6 for complex data transformations in Scala (a representative sketch follows this summary).
- Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java.
- Extended Hive and Pig core functionality by writing custom UDFs.
- Experience importing and exporting data with Sqoop between HDFS and relational database systems.
- Excellent understanding of Core Java, data structures, algorithms, object-oriented design (OOD), and design patterns.
- Experienced in database development, ETL, OLAP, and OLTP.
- Hands-on experience in automation testing using Selenium.
- Worked with build-management tools such as SBT and Maven and version-control tools such as Git.
- Experience working with continuous integration (CI) tools such as Jenkins.
- Worked in a test-driven development (TDD) environment using JUnit, ScalaTest, and HiveRunner.
- Experience in Agile Engineering practices.
- Techno-functional responsibilities include interfacing with users, identifying functional and technical gaps, producing estimates, designing custom solutions, developing, leading developers, writing documentation, and providing production support.
- Excellent interpersonal and communication skills; creative, research-minded, technically competent, and results-oriented, with strong problem-solving and leadership abilities.
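As a concrete illustration of the Spark Core/SQL work claimed above, here is a minimal sketch of an RDD-to-DataFrame aggregation in Spark 1.6 with Scala. The input path, delimiter, schema, and object name are hypothetical placeholders, not details from an actual engagement:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ClaimAggregation {
  def main(args: Array[String]): Unit = {
    val sc         = new SparkContext(new SparkConf().setAppName("claim-aggregation"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Parse raw delimited records (assumed headerless) into a DataFrame.
    val claims = sc.textFile("hdfs:///data/claims/*.csv")   // hypothetical path
      .map(_.split(","))
      .filter(_.length == 3)                                // drop malformed rows
      .map(f => (f(0), f(1), f(2).toDouble))
      .toDF("policyId", "state", "amount")

    // Register the DataFrame and aggregate with the Spark SQL 1.6 API.
    claims.registerTempTable("claims")
    sqlContext.sql("SELECT state, SUM(amount) AS total FROM claims GROUP BY state").show()
  }
}
```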
TECHNICAL SKILLS:
Big Data Ecosystem Tools: Apache Hadoop 2.6.0, MapReduce v2, HDFS, Hive 0.14.0, Pig, HBase, ZooKeeper, Sqoop 1.4.6, Kafka, Oozie, Impala
Continuous Integration / Version Control / Tracking: Jenkins, SVN, Git, JIRA
Cloudera Distribution: CDH 5.8
Spark Components: Spark Core 1.6.0, Spark SQL, Spark Streaming
Languages: Scala 2.11, Java 8, C++, Perl, Shell Scripting
Databases: Oracle, NoSQL (MongoDB, Cassandra, HBase)
Cloud Platform: Amazon Web Services (EC2, S3)
Web Development: HTML, CSS, XML, JavaScript
Methodologies: Agile Scrum
PROFESSIONAL EXPERIENCE:
Confidential, St. Louis
Big Data Engineer
Responsibilities:
- Developed real-time data pipelines with Kafka 0.8 to receive data from various financial services.
- Configured Spark Streaming 1.6 with Kafka to build the common learner data model, which consumes data from Kafka in near real time and persists it into HBase (see the sketch after this list).
- Developed Scala 2.11 scripts and UDFs using both the DataFrame/Dataset/SQL and RDD/MapReduce APIs in Spark for data aggregation and queries.
- Integrated HBase with Hive 0.13 and wrote HiveQL for data transformations.
- Transferred data from Hive to Tableau and created visualization reports for Business Intelligence requirements.
- Developed workflows in Oozie 3.3 to automate data-loading tasks and scheduled batch jobs.
- Performed unit testing for Spark and Spark Streaming with ScalaTest and JUnit.
- Used SVN for version control, JIRA for project tracking and Jenkins for continuous integration.
- Wrote shell scripts to automate collection of cluster logs.
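The Kafka-to-HBase path described above might look roughly like the following sketch, assuming the Spark Streaming 1.6 direct-stream API for Kafka 0.8 and the HBase 0.98-era client; the broker address, topic, table, and column family are hypothetical:

```scala
import kafka.serializer.StringDecoder
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.SparkConf
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.streaming.{Seconds, StreamingContext}

object LearnerModelStream {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("learner-model"), Seconds(10))

    // Direct (receiver-less) Kafka stream; offsets are tracked by Spark.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")   // hypothetical broker
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("learner-events"))                        // hypothetical topic

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        // One HBase connection per partition rather than per record.
        val table = new HTable(HBaseConfiguration.create(), "learner_model") // hypothetical table
        records.filter(_._1 != null).foreach { case (key, value) =>
          val put = new Put(Bytes.toBytes(key))
          put.add(Bytes.toBytes("d"), Bytes.toBytes("event"), Bytes.toBytes(value))
          table.put(put)
        }
        table.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```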
Environment: CDH 5.3, Flume, MapReduce, Hive, Pig, NoSQL, Kafka, REST API, Scala, Java EE, Spring, HBase, iBATIS, Oracle 11g, JIRA, HiveRunner, MRUnit, PigUnit, SVN, JUnit, ScalaTest
Confidential, St. Louis
Big Data Engineer
Responsibilities:
- Developed data pipelines with Flume 1.5 to ingest log data from various upstream sources into HDFS.
- Implemented Kafka 0.8 with Flume to import real-time streaming data into HBase.
- Designed and developed databases with Hive 0.13, integrated HBase with Hive, and wrote custom UDFs and HiveQL for faster data processing and analytics (a minimal UDF sketch follows this list).
- Used Sqoop to transfer data between Hive and relational databases.
- Configured Spark Streaming 1.6 with Hive for real-time data analytics per business requirements.
- Involved in loading data from the Linux file system into HDFS using Kettle.
- Worked with the Avro data serialization system to handle JSON data formats.
- Developed workflows in Oozie 3.2 to automate data-loading tasks and scheduled batch jobs.
- Performed unit testing for Hive with HiveRunner and MRUnit.
- Used Git for version control, JIRA for project tracking and Jenkins for continuous integration.
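A custom Hive UDF of the kind mentioned above can be as small as this Scala sketch against Hive's classic reflection-based UDF API; the class name and normalization rule are invented for illustration:

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Hive resolves UDFs reflectively through evaluate(); this one trims
// and upper-cases a string column so joins ignore formatting noise.
class NormalizeCode extends UDF {
  def evaluate(input: Text): Text =
    if (input == null) null
    else new Text(input.toString.trim.toUpperCase)
}
```

Packaged into a jar, such a class would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION and then invoked like any built-in function in HiveQL.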
Environment: CDH 5.x, Spark, Scala, Java 8, Spark Streaming, JUnit, ScalaCheck, Kafka, Flume, HBase, Hive, Tableau, Oozie, Git, JIRA, Jenkins
Confidential
Programmer Analyst
Responsibilities:
- Coded a Java Maven module to perform various Amazon S3 operations (a Scala sketch of the equivalent SDK calls follows this list).
- Wrote custom MapReduce programs to cleanse and enrich data in EMR.
- Loaded transformed data back into S3.
- Loaded data into a MarkLogic server in AWS.
- Coded JavaScript search operations to build a search app on MarkLogic.
- Implemented keyword-search queries with facets (such as number of years and URIs) against skill-set data.
- Coded a Python module for URI-based analysis.
- Built a Python dictionary of predefined keywords from URI crawl data.
- Built a faceted search on top of the data stored in MarkLogic.
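The S3 load steps above could be sketched as follows with the AWS SDK for Java (v1) called from Scala; the bucket, key prefix, and local file path are placeholders:

```scala
import java.io.File
import com.amazonaws.services.s3.AmazonS3Client
import scala.collection.JavaConverters._

object S3Load {
  def main(args: Array[String]): Unit = {
    // Credentials come from the default provider chain (env vars,
    // ~/.aws/credentials, or an EC2 instance profile).
    val s3 = new AmazonS3Client()

    // Push an EMR output file back to S3 (hypothetical bucket/key).
    s3.putObject("emr-output-bucket", "enriched/part-00000", new File("/tmp/part-00000"))

    // Verify what landed under the prefix.
    s3.listObjects("emr-output-bucket", "enriched/")
      .getObjectSummaries.asScala
      .foreach(o => println(s"${o.getKey} (${o.getSize} bytes)"))
  }
}
```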
Environment: Amazon EMR, S3, Java, JavaScript, Python, MarkLogic (NoSQL)
Confidential
Programmer Analyst
Responsibilities:
- Developed complex database objects such as stored procedures, functions, packages, and triggers using Oracle, MySQL, and PL/SQL.
- Implemented and maintained branching and build/release strategies using SVN.
- Led configuration management and workflow development efforts for the development team.
- Automated the testing process using Selenium (a minimal WebDriver sketch follows this list).
- Developed a proof of concept for the application's load-balance testing process.
- Tested the application by creating macros for large data sets.
- Applied a strong understanding of Java project structures.
- Used HTML and CSS to develop prototypes for the application.
- Built the application GUI using Struts 2.3.
- Built pre-install scripts using shell scripting and performed load-balance testing.
- Monitored the application and supported the production environment.
- Eliminated bugs using Lean Six Sigma concepts (Five Whys).
- Built binaries using C++, Perl, and GNU Make.
- Developed complex build, test, provisioning, security, and deployment systems and supported a large community of developers and testers.
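A Selenium test of the kind referenced above might look like this minimal Scala sketch over the Java WebDriver bindings; the URL and element locators are hypothetical:

```scala
import org.openqa.selenium.By
import org.openqa.selenium.firefox.FirefoxDriver

object LoginSmokeTest {
  def main(args: Array[String]): Unit = {
    val driver = new FirefoxDriver()
    try {
      driver.get("http://app.example.com/login")               // hypothetical URL
      driver.findElement(By.name("username")).sendKeys("testuser")
      driver.findElement(By.name("password")).sendKeys("secret")
      driver.findElement(By.id("loginButton")).click()
      assert(driver.getTitle.contains("Dashboard"), "login did not reach the dashboard")
    } finally {
      driver.quit()                                            // always release the browser
    }
  }
}
```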
Environment: Java/J2EE, XML, WebLogic, SQL, PL/SQL, Perl scripts, shell scripts, Tomcat application server
Confidential
Programmer Analyst
Responsibilities:
- Developed complex database objects such as stored procedures, functions, packages, and triggers using Oracle, MySQL, and PL/SQL.
- Resolved production tickets related to the database and Oracle Forms.
- Coordinated with multiple interface teams and assisted them in accessing dependent functionality.
Environment: Java/J2EE, XML, WebLogic, SQL, PL/SQL, Perl scripts, shell scripts, Tomcat application server