Sr. Data / Hadoop Engineer Resume
San Antonio, TX
SUMMARY
- Around 9 years of work experience in Information Technology, with skills in the analysis, design, development, testing, and deployment of various software applications, including 3+ years of experience implementing Big Data applications.
- Strong skills in developing applications involving Big Data warehouse systems using Hadoop ecosystem tools such as HDFS, MapReduce, Yarn, Hive, Pig, Sqoop, Zookeeper, Oozie and Falcon.
- Knowledge of Storm, Scala and Spark.
- Experience in managing Big Data ingestion and data transformation using Pig and Hive.
- Expertise in writing UDFs to extend the functionality of Pig and Hive.
- Excellent understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and the MapReduce programming paradigm.
- Expertise in writing Hadoop Jobs for analyzing data using Hive and Pig.
- Experience in importing and exporting data between HDFS and relational database systems, and vice versa, using Sqoop.
- Very good understanding of partitioning and bucketing concepts in Hive; designed both managed and external Hive tables to optimize performance.
- Solved performance issues in Hive and Pig scripts through an understanding of join and group-by behavior.
- Good knowledge in integration of various data sources like RDBMS, Spreadsheets, Text files, JSON and XML files.
- Experience with Text, Sequence, RC and ORC file formats and compression codecs such as Zlib and Snappy.
- Experience in Apache Falcon for creating workflows.
- Knowledge of HBase for storing data that requires faster access.
- Knowledge of developing Apache Spark programs in Scala for large-scale data processing, using the in-memory computing capabilities of Spark Core and Spark SQL for faster processing.
- Experience in Agile and Scrum methodologies.
- Experienced in SQL, PL/SQL, Procedures/Functions and Triggers.
- Strong hands-on experience in Bash scripting on Linux.
- Exposure to basic Python programming.
- Familiar with data warehousing and ETL tools like Informatica PowerCenter.
- Hands on experience in designing and coding web applications using Core Java and J2EE.
- Experience in Web Services using XML, HTML and SOAP.
- Involved in developing distributed Enterprise and Web applications using UML, Java/J2EE, Web technologies.
- Exceptional analysis skills with an ability to transform Business requirements into functional and technical specifications.
- Extensive experience in working with Business Analysts, Users, Architects, Infrastructure and support groups for system design, documentation and implementation.
- Experience in Code reviews, fixing defects and enhancing application performance.
- Strong management skills in offshore/onshore delivery models.
- A team player with strong communication, analytical, relationship-management and problem-solving skills.
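The Hive partitioning and bucketing concepts noted above can be illustrated with a short HiveQL sketch; the table, columns and path are hypothetical, not drawn from any actual project:

```sql
-- Hypothetical external table: partitioned by load date, bucketed by customer id,
-- stored as ORC with Snappy compression. Because the table is EXTERNAL,
-- dropping it removes only the metadata and leaves the HDFS files intact.
CREATE EXTERNAL TABLE IF NOT EXISTS sales_txn (
  txn_id      BIGINT,
  customer_id BIGINT,
  amount      DECIMAL(10,2)
)
PARTITIONED BY (load_dt STRING)
CLUSTERED BY (customer_id) INTO 32 BUCKETS
STORED AS ORC
LOCATION '/data/warehouse/sales_txn'
TBLPROPERTIES ('orc.compress'='SNAPPY');
```

Partition pruning on `load_dt` and bucketing on `customer_id` limit the data scanned per query, which is the performance benefit the managed-vs-external table design above is aimed at.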
TECHNICAL SKILLS
Big Data Ecosystems: Hadoop HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, Zookeeper, Falcon, HBase, Storm, Spark
Java & J2EE Technologies: Core Java, Servlets, JSP, JSF, JMS, JDBC
Frameworks: Struts, Hibernate, Spring
Programming languages: C, C++, SQL, PL/SQL, Shell Scripting, Scala, Java, basic Python
Databases: MySQL, MS-SQL Server
Web Technologies: HTML, XML, JavaScript, AJAX, SOAP
ETL Tools: Informatica, Teradata, SQL Server
Methodology: Agile and Waterfall
Tools: Microsoft Word, Excel, PowerPoint, Visio, Box, SharePoint, RallyDev, Jira, WinSCP, Git, SVN, PuTTY, Log4j
Business Domain: Healthcare, Banking, Telecom and Energy
PROFESSIONAL EXPERIENCE
Confidential, San Antonio, TX
Sr. Data / Hadoop Engineer
Responsibilities:
- Actively participated with the development team to meet the specific customer requirements and proposed effective Hadoop solutions.
- Processed HDFS data, created external tables using Hive, and developed reusable scripts to ingest data and repair tables across the project.
- Developed Linux Scripts for data cleansing & preprocessing on huge volumes of data.
- Wrote a custom Java framework that generates Hive DDLs by reading Excel files.
- Developed Pig scripts and Hive scripts to load data files.
- Optimized Hive queries to handle different data sets.
- Identified high-usage tables and converted them to ORC-formatted Hive tables.
- Used Hive schemas to create relations in Pig via HCatalog.
- Created Indexes and buckets for Hive tables to improve the performance of hive queries.
- Imported data into clusters from various sources, such as MySQL, using Sqoop, and exported data from Hive external tables to relational databases for generating reports.
- Wrote Pig UDFs as per requirements.
- Developed Hive queries for mining delimited text and Excel files to verify data transfer success and to support internal and external customer needs.
- Leveraged Falcon workflows.
- Designed and developed Oozie workflows for sequence flow of job execution.
- Integrated with the Hadoop ecosystem to retrieve data using Spark core components, making use of in-memory processing.
- Wrote SQL queries to process data using Spark SQL.
- Managed Agile Software Practice using Rally by creating Product Backlog, Iterations and Sprints in collaboration with the Product Team.
- Utilized Agile Scrum Methodology to manage and organize the team with regular code review sessions.
Environment: HDP 2.2 & 2.3, HDFS, Hive, Pig, Spark, Scala, HCatalog, MapReduce, YARN, Falcon, Linux, Java/JDK 1.7, Git, RallyDev, MySQL.
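The ingest-and-repair and ORC-conversion patterns in the responsibilities above can be sketched in HiveQL; the table names are illustrative only:

```sql
-- After new partition directories land in HDFS, sync the metastore
-- so Hive can see the added partitions.
MSCK REPAIR TABLE sales_txn;

-- Convert a high-usage text-format table to ORC via CTAS for faster reads.
CREATE TABLE sales_txn_orc
STORED AS ORC
AS SELECT * FROM sales_txn;
```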
Confidential, Houston, TX
Data Engineer
Responsibilities:
- Worked on writing various Linux scripts to ingest data from landing zone onto the data lake.
- Developed several shell scripts that act as wrappers to start these Hadoop jobs and set configuration parameters.
- Worked closely with the Hadoop administration team to debug slow-running MapReduce jobs and apply the necessary optimizations.
- Loaded and transformed large sets of structured data.
- Created Hive tables to load data and wrote Hive queries, which run internally as MapReduce jobs. Designed a production process for extracting the final data and forwarding it to end users on an as-needed basis.
- Extracted the data from Teradata into HDFS using Sqoop.
- Managed and reviewed Hadoop log files.
- Developed a Utility framework to generate Excel Reports by exporting data from HDFS.
- Exported data from HDFS to the Teradata client environment.
- Developed MapReduce jobs using Hive, and Pig to extract and analyze data.
- Extensively used Tivoli Workload Scheduler (TWS) to schedule periodical run of various scripts for initial and delta loads for various datasets.
- Actively involved in building a generic Hadoop framework that enables various teams to reuse best practices.
Environment: Apache Hadoop, MapReduce, Hive, Pig, Sqoop, TWS, Java (JDK 1.6), XML, Teradata client, Linux.
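Exporting final data out of HDFS for downstream consumers, as described above, can be sketched as a Hive directory export; the path, table and columns are hypothetical:

```sql
-- Write query results to an HDFS directory as tab-delimited text,
-- ready to be pulled into the Teradata client environment.
INSERT OVERWRITE DIRECTORY '/data/export/daily_summary'
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
SELECT load_dt, COUNT(*) AS txn_count, SUM(amount) AS total_amount
FROM sales_txn
GROUP BY load_dt;
```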
Confidential, Houston, TX
Big Data Hadoop Consultant
Responsibilities:
- Gathering data requirements and identifying sources for acquisition.
- Involved in loading data from edge node to HDFS using shell scripting.
- Wrote Pig Scripts to generate MapReduce jobs and performed ETL procedures on the data in HDFS.
- Created external, partitioned Hive tables and corresponding HDFS locations to load data.
- Developed reusable scripts to ingest data into Hive tables across the project.
- Implemented optimized Hive joins to gather data from different data sources.
- Developed Pig scripts to transform data into standardized structures for consumers.
- Wrote Hive queries for data analysis to meet business requirements.
- Developed scripts for monitoring HDFS data usage and cleanup.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Documented coding best practices.
- Worked on different file formats like Text files, Sequence Files, Record columnar files (RC).
- Experienced in monitoring the application process and suggesting improvements.
- Conducted and participated in project team meetings to gather status and discuss issues and action items.
- Created deployment plan, runbook and implementation checklist.
Environment: Hadoop, MapReduce, HBase, Hive, Pig, Sqoop, Java (JDK 1.6), UNIX.
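The optimized Hive joins mentioned above typically rely on map-side joins for small dimension tables; a minimal sketch, with hypothetical table names:

```sql
-- Let Hive broadcast small tables to mappers instead of shuffling
-- both sides to reducers (a reduce-side join).
SET hive.auto.convert.join=true;
SET hive.mapjoin.smalltable.filesize=25000000; -- ~25 MB small-table threshold

SELECT t.txn_id, d.region
FROM sales_txn t
JOIN dim_customer d
  ON t.customer_id = d.customer_id;
```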
Confidential, Houston, TX
Java Developer
Responsibilities:
- Involved in detailed design documentation, implementation and coding.
- Established schedules and resource requirements by planning, analyzing and documenting the development effort, including timelines, risks, test requirements and performance targets.
- Developed the UI and client-side validations using HTML, CSS, JavaScript and JSP.
- Developed User Interface module using Struts Framework, JSP and Servlets.
- Designed and implemented the database interaction using JDBC.
- Used JDBC to establish connection between the database and the application.
- Wrote SQL queries and stored procedures using PL/SQL.
- Designed and developed enterprise web applications for the internal production support group using Java (J2EE), design patterns and the Struts framework.
- Developed server-side utilities using J2EE technologies (Servlets, JSP).
- Created Functional Design Specification for the technical team.
Environment: Core Java, J2EE, Servlets, JSP, Struts, Hibernate, XML, SQL, PL/SQL, Eclipse IDE, JUnit, JavaScript, HTML and CSS.
Confidential
Java Developer
Responsibilities:
- Involved in requirements gathering and creating functional specifications by interacting with business users.
- Responsible for analysis, design, development and unit testing.
- Performed unit testing before checking in code for QA builds.
- Created web pages using XML, HTML and JavaScript.
- Used AJAX for client-to-server communication.
- Involved in production support.
- Used Log4j and commons-logging frameworks for logging the application flow.
- Developed JavaScript functions for client-side and front-end validations.
- Resolved system defects and performed bug fixes during the testing phase.
- Used SVN for version control, committing updated files to the repository.
Environment: Apache Tomcat, Eclipse IDE, PL/SQL, HTML, AJAX, JavaScript, UML, Windows XP, SVN.
Confidential
Java Developer
Responsibilities:
- Involved in the complete software development life cycle (SDLC) of the application, from requirement analysis to testing.
- Developed the UI using JavaScript, HTML and CSS for interactive cross-browser functionality and complex user interfaces.
- Created complex SQL queries and PL/SQL stored procedures and functions for the back end.
- Prepared the Functional, Design and Test case specifications.
- Involved in writing Stored Procedures to do some database side validations.
- Performed unit testing, system testing and integration testing.
- Developed Unit Test Cases. Used JUNIT for unit testing of the application.
- Provided technical support for production environments: resolved issues, analyzed defects, and designed and implemented fixes. Resolved high-priority defects per schedule.
Environment: Java/J2EE, HTML, JavaScript, CSS, PL/SQL, MySQL, JDBC 3.0, JUnit, Log4j, Eclipse.
