Sr. Big Data Engineer Resume
Livonia, MI
SUMMARY
- 7+ years of experience in IT, including the design and development of object-oriented, web-based enterprise applications and big data processing applications.
- Experienced in developing big data applications for processing terabytes of data using the Hadoop ecosystem (HDFS, MapReduce, HBase, Sqoop, Apache Kafka, Hive, Pig, Oozie), with in-depth knowledge of the MR1 (classic) and MR2 (YARN) frameworks.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Expertise in installing, configuring, supporting, and managing big data platforms and the underlying infrastructure of Hadoop clusters.
- Experienced with major components of the Hadoop ecosystem, including Hadoop MapReduce, HDFS, Hive, Pig, and Sqoop.
- Experienced with fast streaming big data components such as Flume, Kafka, Storm, and Spark.
- Excellent understanding of and hands-on experience with NoSQL databases such as Cassandra, MongoDB, and HBase.
- Experienced in Object-Oriented Analysis and Design (OOAD) and software development using UML methodology; good knowledge of J2EE and core Java design patterns.
- Extensive knowledge of creating PL/SQL stored procedures, packages, functions, and cursors against Oracle (9i, 10g, 11g, 12c) and MySQL.
- Experienced in preparing and executing Unit Test Plan and Unit Test Cases using JUnit, MRUnit.
- Experienced with build tools like Maven, Ant and CI tools like Jenkins.
- Excellent experience with version controls like CVS, SVN and Git.
- Experienced in Scrum, Agile and Waterfall models.
- Extensive knowledge in NoSQL databases like HBase, Cassandra.
- Experienced in performing CRUD operations using the HBase Java Client API and REST API.
- Experienced with the Oozie Workflow Engine to automate and parallelize Hadoop MapReduce, Hive, and Pig jobs.
- Experienced with processing different file formats like Avro, XML, JSON and Sequence file formats using MapReduce programs.
- Excellent Java development skills using J2EE frameworks such as Spring, Hibernate, EJB, and Web Services.
- Experienced in implementing SOAP- and REST-based web services.
- Excellent experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
- Strong experience in collecting and storing stream data like log data in HDFS using Apache Flume.
- Experienced in applying MapReduce design patterns to solve complex MapReduce problems.
- Experienced on extending Hive and Pig core functionality by writing custom UDFs.
- Experienced in Hadoop administration activities such as installation and configuration of clusters using Apache and Cloudera distributions.
- Excellent knowledge of Amazon AWS concepts such as the EMR and EC2 web services for fast, efficient processing of big data.
- Experienced in integrating various data sources such as Java applications, RDBMS, shell scripts, spreadsheets, and text files.
- Ability to blend technical expertise with strong conceptual, business, and analytical skills to deliver quality solutions, combining result-oriented problem solving with leadership.
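The MapReduce concepts referenced above can be sketched without a cluster. The following is a plain-Java, dependency-free model of the map/shuffle/reduce flow (class and method names are illustrative, not from any project above); a real job would implement Hadoop's Mapper and Reducer classes instead.

```java
import java.util.*;

// Illustrative, dependency-free model of MapReduce word count:
// map emits (word, 1) pairs, the shuffle groups by key, reduce sums.
public class WordCountModel {

    // "map" phase: tokenize a line into (word, 1) pairs
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "shuffle" + "reduce": group pairs by key and sum the values
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> shuffled = new ArrayList<>();
        for (String line : new String[]{"big data big cluster", "data node"}) {
            shuffled.addAll(map(line));
        }
        System.out.println(reduce(shuffled)); // {big=2, cluster=1, data=2, node=1}
    }
}
```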
TECHNICAL SKILLS
Languages: Java, C, C++, SQL, PL/SQL, XML, HTML, JavaScript
Hadoop/Big Data Technologies: HDFS, MapReduce, Sqoop, Flume, Pig, Hive, Oozie, Impala, ZooKeeper, Cloudera Manager, MongoDB, HBase (NoSQL)
Version Control Tools: GitHub, Bitbucket, CVS, SVN, ClearCase, Visual SourceSafe
Databases: Oracle 8i/9i/10g/11g/12c, MS SQL Server 2005, MySQL, Teradata
Build & Deployment Tools: Maven, Ant, Hudson, Jenkins
Monitoring and Reporting: Tableau, custom shell scripts
Hadoop Distributions: Hortonworks, Cloudera, MapR
PROFESSIONAL EXPERIENCE
Confidential, Livonia MI
Sr. Big Data Engineer
Responsibilities:
- Worked on Hadoop technologies like Pig Latin, Hive, Sqoop and Big Data testing.
- Worked on tools Flume, Kafka, Storm and Spark.
- Developed automated scripts for ingesting roughly 200 TB of data from Teradata as a bi-weekly refresh.
- Developed Hive scripts for end-user/analyst requirements for ad hoc analysis.
- Used partitioning and bucketing concepts in Hive and designed both managed and external tables in Hive for optimized performance.
- Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation, and of how they translate to MapReduce jobs.
- Extensively used Apache Sqoop for efficiently transferring bulk data between Apache Hadoop and relational databases (Oracle) for product-level forecasting.
- Worked in tuning Hive and Pig scripts to improve performance.
- Developed UDFs in Java as needed for use in Pig and Hive queries.
- Extracted the data from Teradata into HDFS using Sqoop.
- Created Sqoop job with incremental load to populate Hive External tables.
- Developed TWS workflow for scheduling and orchestrating the ETL process.
- Used Impala to read, write, and query Hadoop data in HDFS, HBase, or Cassandra.
- Performed functional, non-functional, and performance testing of key systems prior to cutover to AWS.
- Developed programs in Spark based on the application for faster data processing than standard MapReduce programs.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
- Extracted feeds from social media sites such as Twitter.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Configured Hadoop system files to accommodate new data sources and updated the existing Hadoop cluster configuration.
- Involved in gathering business requirements and prepared detailed specifications that follow project guidelines required to develop written programs.
- Worked on importing and exporting data from databases such as Oracle and Teradata into HDFS and Hive using Sqoop.
- Actively participated in code reviews and meetings and resolved technical issues.
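The Java UDFs for Pig and Hive mentioned above center on a single evaluate() method called once per row. Below is a dependency-free sketch of the kind of string-cleanup logic such a UDF wraps; in a real Hive UDF the class would extend org.apache.hadoop.hive.ql.exec.UDF and use Text rather than String, and the "product code" scenario here is hypothetical.

```java
// Illustrative core of a Hive/Pig UDF that normalizes free-text codes.
// A real Hive UDF would extend org.apache.hadoop.hive.ql.exec.UDF and
// operate on org.apache.hadoop.io.Text; this keeps only the evaluate() logic.
public class NormalizeCodeUdf {

    // Hive calls evaluate() once per row; null in, null out.
    public static String evaluate(String raw) {
        if (raw == null) {
            return null;
        }
        // strip whitespace and punctuation, upper-case the rest
        String cleaned = raw.trim().replaceAll("[^A-Za-z0-9]", "").toUpperCase();
        return cleaned.isEmpty() ? null : cleaned;
    }

    public static void main(String[] args) {
        System.out.println(evaluate(" ab-123 "));  // AB123
        System.out.println(evaluate(null));        // null
    }
}
```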
Environment: Java 7, Eclipse, Oracle 12c, Hadoop, MapReduce, HDFS, Kafka, Hive, HBase, TWS, ITG Linux, AWS, MapR, SQL, Talend 5.5.2.
Confidential, NYC NY
Sr. Big Data Engineer
Responsibilities:
- Handled importing of data from various data sources, performed transformations using Hive, MapReduce, loaded data into HDFS and extracted the data from MySQL into HDFS using Sqoop.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Developed simple to complex MapReduce jobs using Hive.
- Analyzed the data by performing Hive queries and running Pig scripts to know user behavior.
- Optimized Map/Reduce jobs to use HDFS efficiently by using various compression mechanisms.
- Created partitioned tables in Hive.
- Extensively used Pig for data cleansing.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Developed the Pig UDF's to pre-process the data for analysis.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Worked on streaming the data into HDFS from web servers using Flume.
- Designed and implemented Hive and Pig UDF's for evaluation, filtering, loading and storing of data.
- Created Hive tables as internal or external per requirements, defined with appropriate static and dynamic partitions for efficiency.
- Wrote Scripts to generate Map Reduce jobs and performed ETL procedures on the data in HDFS.
- Implemented Lateral View in conjunction with UDTFs in Hive.
- Performed complex Joins on the tables in Hive.
- Loaded and transformed large sets of structured and semi-structured data using Hive and Impala.
- Connected Hive and Impala to Tableau reporting tool and generated graphical reports.
- Worked on implementation and maintenance of a Cloudera Hadoop cluster.
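The Lateral View/UDTF work above turns one row with a delimited column into many rows. As a dependency-free illustration (the column names and delimiter are hypothetical), this models what `LATERAL VIEW explode(split(tags, ','))` does in Hive; a real UDTF would extend GenericUDTF and forward rows to the collector.

```java
import java.util.*;

// Illustrative model of Hive's LATERAL VIEW explode(): one input row with a
// delimited column becomes one output row per element, with the other
// columns repeated alongside each element.
public class LateralViewModel {

    // Simulates: SELECT user, tag FROM t LATERAL VIEW explode(split(tags, ',')) x AS tag
    static List<String[]> explode(String user, String tags) {
        List<String[]> rows = new ArrayList<>();
        for (String tag : tags.split(",")) {
            rows.add(new String[]{user, tag.trim()});
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : explode("u1", "sports, news")) {
            System.out.println(row[0] + " | " + row[1]);
        }
        // u1 | sports
        // u1 | news
    }
}
```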
Environment: Hadoop, HDFS, Pig 0.10, Hive, AWS, MapReduce, Sqoop, Java Eclipse, SQL Server, Shell Scripting.
Confidential, SFO, CA
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Imported and exported data into HDFS and Hive using Sqoop.
- Designed and developed Big Data analytics platform for processing customer viewing preferences and social media comments using Java, Hadoop, Hive and Pig.
- Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive structured and unstructured data.
- Experienced in defining job flows.
- Developed and executed custom MapReduce programs, Pig Latin scripts, and HQL queries.
- Used Hadoop FS shell commands for HDFS (Hadoop File System) data loading and manipulation.
- Performed Hive test queries on local sample files and HDFS files.
- Developed and optimized Pig and Hive UDFs (User-Defined Functions) to implement the functionality of external languages as and when required.
- Extensively used Pig for data cleaning and optimization.
- Developed Hive queries to analyze data and generate results.
- Exported data from HDFS to RDBMS via Sqoop for Business Intelligence, visualization and user report generation.
- Analyzed business requirements and cross-verified them against the functionality and features of NoSQL databases such as HBase and Cassandra to determine the optimal database.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Installed and configured Apache Hadoop, Hive and Pig environment on the prototype server.
- Configured SQL database to store Hive metadata.
- Loaded unstructured data into Hadoop File System (HDFS).
- Created ETL jobs to load Twitter JSON data and server data into MongoDB and transported MongoDB into the Data Warehouse.
- Responsible for managing data coming from different sources.
- Responsible for implementing MongoDB to store and analyze unstructured data.
- Supported MapReduce programs running on the cluster.
- Involved in loading data from UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Implemented CDH3 Hadoop cluster on CentOS.
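The data-cleaning MapReduce jobs described above typically filter malformed records before loading. This is a dependency-free sketch of that filtering step; the 3-column CSV layout is a hypothetical example, and in a real job this logic would live inside Mapper.map() and emit to the context rather than return a list.

```java
import java.util.*;

// Illustrative record-cleaning step of the kind a MapReduce mapper performs:
// drop malformed CSV lines (wrong column count or empty key) before loading.
public class CleanRecords {

    static final int EXPECTED_COLUMNS = 3; // hypothetical schema width

    // Returns only well-formed lines; a real job would emit these from map().
    static List<String> clean(List<String> lines) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            String[] cols = line.split(",", -1);
            if (cols.length == EXPECTED_COLUMNS && !cols[0].trim().isEmpty()) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("1,foo,9.5", "bad line", ",foo,1.0", "2,bar,3.2");
        System.out.println(clean(raw)); // [1,foo,9.5, 2,bar,3.2]
    }
}
```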
Environment: Hadoop, MapReduce, HDFS, Hive, Spark, Pig, Java (JDK 1.6), SQL, Cloudera Manager, Sqoop, Storm, Solr, Flume, Cassandra, Oozie, Eclipse
Confidential
Java/J2EE developer
Responsibilities:
- Developed Servlets and Java Server Pages (JSP).
- Wrote pseudo-code for stored procedures.
- Developed PL/SQL queries to generate reports based on client requirements.
- Enhanced the system according to customer requirements.
- Designed and developed UI pages in the CBMS application using the CBMS custom framework, business objects, JDBC, JSP, and JavaScript.
- Involved in business requirement gatherings, development of technical design documents and design of real time eligibility project.
- Developed Real Time Eligibility web service using CBMS custom framework, AJAX 2.0, WSDL and SOAP UI.
- Used JAXB Marshaller and Unmarshaller to marshal and unmarshal WSDL requests.
- Developed all WSDL components and XSDs, producing and consuming WSDL web services using AJAX 1.5 and AJAX 2.0.
- Developed Java services using SQL queries, JDBC, Spring, and Hibernate entities.
- Used Eclipse for development, debugging, and deployment of the code. Created test-case scenarios for functional testing.
- Used JavaScript validation in JSP pages.
- Helped design the database tables for optimal storage of data.
- Coded JDBC calls in the servlets to access the Oracle database tables.
- Responsible for Integration, unit testing, system testing and stress testing for all the phases of project.
- Prepared final guideline document that would serve as a tutorial for the users of this application.
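The project used JAXB to unmarshal WSDL requests; since JAXB needs external jars on modern JDKs, the sketch below illustrates the same idea (pulling a field out of an XML request payload) with the JDK's built-in DOM parser instead. The element names are hypothetical, not from the actual service.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Dependency-free stand-in for JAXB unmarshalling: parse a SOAP-style
// request body and extract one field with the JDK's DOM parser.
public class EligibilityRequestParser {

    // Returns the text of the first <caseId> element, or null if the
    // payload is malformed or the element is absent.
    static String extractCaseId(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("caseId");
            return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
        } catch (Exception e) {
            // malformed payload: treat as no match (real code would log/raise)
            return null;
        }
    }

    public static void main(String[] args) {
        String request = "<eligibilityRequest><caseId>C-42</caseId></eligibilityRequest>";
        System.out.println(extractCaseId(request)); // C-42
    }
}
```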
Environment: Java 1.5, Servlets, J2EE 1.4, JDBC, Oracle 10g, PL/SQL, HTML, JSP, Eclipse, UNIX.