Sr. Big Data Engineer Resume
Washington, DC
SUMMARY:
- Highly skilled IT professional with 9+ years of experience in software engineering, with an emphasis on Big Data application development and Java server-side programming.
- Strong expertise in the Big Data ecosystem, including Spark, Hive, Sqoop, HDFS, MapReduce, Kafka, Oozie, YARN, Pig, HBase, and Flume.
- Strong expertise in building scalable applications using various programming languages (Java, Scala, and Python).
- In-depth knowledge of distributed systems architecture and parallel computing.
- Experience implementing end-to-end data pipelines for serving reporting and data science capabilities.
- Experienced in working with Cloudera, Hortonworks and Amazon EMR clusters.
- Experience fine-tuning Spark applications and Hive queries to improve the overall performance of pipelines.
- Developed production-ready Spark applications using the Spark RDD API, DataFrames, Datasets, Spark SQL, and Spark Streaming.
- Hands-on experience fetching live stream data and ingesting it into HBase tables using Spark Streaming and Apache Kafka (see the sketch at the end of this summary).
- Experience in Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
- In-depth knowledge of importing and exporting data from databases using Sqoop.
- Well versed in writing complex Hive queries using analytical functions.
- Knowledge of writing custom UDFs in Hive to support custom business requirements.
- Solid experience using various file formats such as CSV, TSV, Parquet, ORC, JSON, and Avro.
- Experience using compression codecs such as Gzip and Snappy within Hadoop.
- Strong knowledge of NoSQL databases; worked with HBase, Cassandra, and MongoDB.
- Experience using cloud services such as Amazon EMR, S3, EC2, Redshift, and Athena.
- Extensively used various IDEs such as IntelliJ, NetBeans, and Eclipse.
- Proficient in RDBMS concepts with Oracle, MySQL, DB2, and Teradata, and experienced in writing SQL queries.
- Knowledge of writing shell scripts and scheduling them with cron jobs.
- Experience working with Git repositories and the Jenkins and Maven build tools.
- Developed cross-platform applications using Java, JSP, Servlets, Hibernate, RESTful services, JDBC, JavaScript, XML, and HTML.
- Used Log4J for enabling runtime logging and performed system integration test to ensure quality of the system.
- Experience in using SOAP UI tool to validate the web service.
- Expertise in writing unit test cases using JUnit API.
- Experience in database design, entity relationships, database analysis, SQL programming, PL/SQL stored procedures, packages, and triggers in Oracle.
- Highly self-motivated, with good technical, communication, and interpersonal skills. Able to work reliably under pressure. Committed team player with strong analytical and problem-solving skills and the ability to quickly adapt to new environments and technologies.
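A minimal sketch of the Kafka-to-Spark Streaming-to-HBase pattern referenced above. The broker address, topic, table, and column family names are illustrative, and events are assumed to be pipe-delimited with the row key in the first field:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object StreamToHBase {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("stream-to-hbase"), Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker:9092",               // illustrative broker address
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "clickstream-consumer",
      "auto.offset.reset" -> "latest"
    )

    // Direct stream from an illustrative "clickstream" topic
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("clickstream"), kafkaParams))

    stream.map(_.value).foreachRDD { rdd =>
      rdd.foreachPartition { events =>
        // One HBase connection per partition, not per record
        val conn = ConnectionFactory.createConnection(HBaseConfiguration.create())
        val table = conn.getTable(TableName.valueOf("clicks"))
        events.foreach { line =>
          val parts = line.split("\\|", 2)                // assumed layout: rowKey|payload
          if (parts.length == 2) {
            val put = new Put(Bytes.toBytes(parts(0)))
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("raw"), Bytes.toBytes(parts(1)))
            table.put(put)
          }
        }
        table.close()
        conn.close()
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```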
TECHNICAL SKILLS:
Big Data Ecosystem: MapReduce, HDFS, HIVE, HBase, Pig, Sqoop, Flume, Oozie, Zookeeper, Spark, Kafka
Cloud Platform: AWS (EMR, EC2, Redshift, Athena)
Programming Languages: Java, Scala, Python, SQL, UNIX Shell Scripting
Databases: Oracle 12c/11g, MySQL, MS SQL Server 2016/2014
Version Control: GIT, GitLab, SVN
NoSQL Databases: HBase and MongoDB
Methodologies: Agile
Build Management Tools: Maven, Ant
IDE & Command line tools: Eclipse, IntelliJ
PROFESSIONAL EXPERIENCE:
Confidential, Washington DC
Sr. Big Data Engineer
Responsibilities:
- Created Sqoop scripts to import and export customer profile data between RDBMS and S3 buckets.
- Built custom input adapters to migrate clickstream data from FTP servers to S3.
- Developed various enrichment applications in Spark using Scala to cleanse clickstream data and enrich it with customer profile lookups (see the sketch below).
- Troubleshot Spark applications to improve error tolerance and reliability.
- Used the Spark DataFrame and Spark SQL APIs to implement batch processing jobs.
- Used Apache Kafka and Spark Streaming to consume data from Adobe live stream REST API connections.
- Automated creation and termination of AWS EMR clusters.
- Worked on fine-tuning and performance enhancement of various Spark applications and Hive scripts.
- Used Spark features such as broadcast variables, caching, and dynamic allocation to design more scalable Spark applications.
Environment: AWS EMR, S3, Spark, Hive, Sqoop, Scala, Java, MySQL, Oracle DB, Athena, Redshift.
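A representative sketch of the clickstream enrichment flow described above; the S3 paths, schemas, and column names are illustrative, and the broadcast join mirrors the broadcast-variable usage noted in the last bullet:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object ClickstreamEnrichment {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("clickstream-enrichment").getOrCreate()

    // Raw clickstream landed on S3 by the ingestion adapters (path and columns are illustrative);
    // the dt column comes from dt=YYYY-MM-DD partition folders under the raw path
    val clicks = spark.read.json("s3://bucket/raw/clickstream/")
      .filter("visitor_id is not null")       // basic cleansing
      .dropDuplicates("event_id")

    // Customer profile extract produced by the Sqoop jobs (path is illustrative)
    val profiles = spark.read.parquet("s3://bucket/lookup/customer_profiles/")

    // Broadcast the small lookup side to every executor to avoid a shuffle
    val enriched = clicks.join(broadcast(profiles), Seq("visitor_id"), "left")

    enriched.write.mode("overwrite")
      .partitionBy("dt")
      .parquet("s3://bucket/enriched/clickstream/")

    spark.stop()
  }
}
```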
Confidential, Addison, NJ
Big Data/Hadoop Engineer
Responsibilities:
- Extensively worked with Sqoop to migrate data from RDBMS to HDFS.
- Ingested data from various source systems, including Teradata, MySQL, and Oracle databases.
- Developed Spark applications to perform extract, transform, and load (ETL) processing using Spark RDDs and DataFrames.
- Created Hive external tables on top of data in HDFS and wrote ad-hoc Hive queries to analyze the data based on business requirements (see the sketch below).
- Utilized partitioning and bucketing in Hive to improve query processing times.
- Performed incremental data ingestion using Sqoop, since the existing application generates data on a daily basis.
- Performed data ingestion using Sqoop, Apache Kafka, Spark Streaming, and Flume.
- Migrated/reimplemented MapReduce jobs as Spark applications for better performance.
- Handled data in different file formats such as Avro and Parquet.
- Extensively used the Cloudera Hadoop distribution within the project.
- Used Git for maintaining and versioning the code.
- Created Oozie workflows to automate the data pipelines.
Environment: Cloudera (CDH 5.x), Spark, Scala, Sqoop, Oozie, Hive, HDFS, MySQL, Oracle DB, Teradata
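A minimal sketch of loading a Sqoop-landed daily extract into a partitioned Hive external table with Spark; the database, table, column, and path names are illustrative assumptions:

```scala
import org.apache.spark.sql.SparkSession

object OrdersEtl {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orders-etl")
      .enableHiveSupport()        // register the table in the Hive metastore
      .getOrCreate()

    // Daily extract landed on HDFS by the incremental Sqoop job (path and columns are illustrative)
    val orders = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/data/landing/orders/2017-01-01/")
    orders.createOrReplaceTempView("orders_stg")

    spark.sql("CREATE DATABASE IF NOT EXISTS analytics")

    // External table, partitioned by order date, for the ad-hoc Hive queries
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS analytics.orders (
        |  order_id BIGINT, customer_id BIGINT, amount DOUBLE)
        |PARTITIONED BY (order_date STRING)
        |STORED AS PARQUET
        |LOCATION '/data/warehouse/orders'""".stripMargin)

    // Dynamic partition insert; the partition column goes last in the SELECT
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """INSERT OVERWRITE TABLE analytics.orders PARTITION (order_date)
        |SELECT order_id, customer_id, amount, order_date FROM orders_stg""".stripMargin)

    spark.stop()
  }
}
```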
Confidential, Atlanta, GA
Sr. Big Data/Hadoop Engineer
Responsibilities:
- Wrote complex MapReduce jobs to perform data cleansing and ETL-like processing on the data.
- Worked with different file formats such as text, Avro, and Parquet using MapReduce programs.
- Developed Hive Scripts to create partitioned tables and create various analytical datasets.
- Worked with cross functional consulting teams within the data science and analytics team to design, develop and execute solutions to derive business insights and solve client operational and strategic problems.
- The objective of this project was to build a data lake as a cloud-based solution in AWS using Apache Spark (see the sketch below).
- Extensively used Hive queries to query data in Hive Tables and loaded data into HBase tables.
- Developed Spark scripts using Scala shell commands as per the requirements.
- Developed shell scripts to pull data from third-party systems into the Hadoop file system.
- Exported the processed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Used Hive partitioning and bucketing concepts to increase the performance of Hive query processing.
- Designed Oozie workflows for job scheduling and batch processing.
- Helped analytics team by writing Pig and Hive scripts to perform further detailed analysis of the data processed.
Environment: Java, HDFS, MapReduce, Hive, Pig, MySQL, CDH, IntelliJ, YARN, Sqoop, HBase, Unix Shell Scripting.
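A representative sketch of the AWS data lake flow described above: raw Avro extracts on S3 are rewritten as partitioned Parquet, and a window function builds one analytical dataset. The bucket names, schema, and availability of the spark-avro package are assumptions:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, row_number}

object DataLakeBuild {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("data-lake-build").getOrCreate()

    // Raw Avro extracts staged on S3 (bucket and columns are illustrative)
    val raw = spark.read.format("avro").load("s3://lake-bucket/raw/transactions/")

    // Curated zone: columnar Parquet, partitioned by business date for efficient scans
    raw.write.mode("overwrite")
      .partitionBy("txn_date")
      .parquet("s3://lake-bucket/curated/transactions/")

    // One analytical dataset: latest transaction per account via a window function
    val latest = raw
      .withColumn("rn", row_number().over(
        Window.partitionBy("account_id").orderBy(col("txn_ts").desc)))
      .filter(col("rn") === 1)
      .drop("rn")

    latest.write.mode("overwrite").parquet("s3://lake-bucket/analytics/latest_transaction/")

    spark.stop()
  }
}
```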
Confidential, Atlanta, GA
Bigdata/Hadoop Engineer
Responsibilities:
- Used Avro, Parquet, and JSON file formats and developed custom UDFs for Hive and Pig (see the sketch below).
- Developed and maintained workflow scheduling jobs in Oozie.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Continuously monitored and managed Hadoop cluster using Cloudera Manager.
- Created Hive tables, loaded them with data, and wrote Hive queries.
- Involved in collecting, aggregating and moving data from RDBMS to HDFS using Sqoop.
- Managed and reviewed Hadoop log files.
- Analyzed web logs using Hadoop tools for operational and security-related activities.
- Developed efficient MapReduce programs in Java for filtering out unstructured data.
- Reviewed Hadoop log files to identify issues when jobs failed.
- Ingested application logs into HDFS and processed them using MapReduce jobs.
- Created and maintained the Hive warehouse for Hive analysis.
- Worked with different file formats such as XML, SequenceFiles, CSV, and MapFiles.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Responsible for the design and creation of Hive tables, partitioning, bucketing, loading data, and writing Hive queries.
- Worked with the Oozie workflow engine to run multiple MapReduce, Hive, and Pig jobs.
Environment: HDFS, Hive, Scala, MapReduce, Storm, Java, HBase, Pig, Sqoop, Oozie, MySQL, Tableau.
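A minimal sketch of a custom Hive UDF of the kind mentioned in the first bullet; the class name, function name, and masking rule are illustrative:

```scala
import org.apache.hadoop.hive.ql.exec.UDF

// Illustrative UDF: masks the local part of an email address, keeping the first character.
// Built into a jar, then registered in Hive with:
//   ADD JAR /path/to/udfs.jar;
//   CREATE TEMPORARY FUNCTION mask_email AS 'MaskEmail';
class MaskEmail extends UDF {
  def evaluate(email: String): String = {
    if (email == null || !email.contains("@")) email
    else {
      val Array(local, domain) = email.split("@", 2)
      local.take(1) + "***@" + domain
    }
  }
}
```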
Confidential, Houston, TX
Bigdata/Hadoop Engineer
Responsibilities:
- Involved in creating Hive tables, loading them with data, and writing Hive queries.
- Involved in data ingestion into HDFS using Sqoop and Flume from a variety of sources.
- Developed MapReduce programs to parse the raw data, populate tables and store the refined data in partitioned tables.
- Installed and configured Hadoop and Hadoop stack on a 4-node cluster.
- Experienced in managing and reviewing application log files.
- Ingested application logs into HDFS and processed them using MapReduce jobs.
- Created and maintained the Hive warehouse for Hive analysis.
- Generated test cases for the new MapReduce jobs.
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing (see the sketch below).
- Responsible for the design and creation of Hive tables, partitioning, bucketing, loading data, and writing Hive queries.
- Created HBase tables to store various data formats of personally identifiable information data coming from different portfolios.
- Involved in managing and reviewing Hadoop log files.
- Worked with Oozie workflow engine to run multiple Hive and Pig jobs.
Environment: HDFS, Hive, MapReduce, Storm, Java, HBase, Pig, Sqoop, Shell Scripts, Oozie, MySQL, Eclipse, Web Services, JDBC, and WebSphere.
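A minimal sketch of a map-only cleansing job of the kind described above. The production jobs were written in Java; this sketch uses Scala against the same Hadoop MapReduce API, and the pipe-delimited, 12-field record layout is an assumption:

```scala
import org.apache.hadoop.conf.Configured
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, NullWritable, Text}
import org.apache.hadoop.mapreduce.{Job, Mapper}
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import org.apache.hadoop.util.{Tool, ToolRunner}

// Mapper that keeps only well-formed, pipe-delimited records (assumed 12 non-empty fields)
class CleanseMapper extends Mapper[LongWritable, Text, NullWritable, Text] {
  override def map(key: LongWritable, value: Text,
                   context: Mapper[LongWritable, Text, NullWritable, Text]#Context): Unit = {
    val fields = value.toString.split("\\|", -1)
    if (fields.length == 12 && fields.forall(_.nonEmpty))
      context.write(NullWritable.get(), value)
  }
}

object CleanseJob extends Configured with Tool {
  override def run(args: Array[String]): Int = {
    val job = Job.getInstance(getConf, "raw-data-cleanse")
    job.setJarByClass(getClass)
    job.setMapperClass(classOf[CleanseMapper])
    job.setNumReduceTasks(0)                    // map-only cleansing job
    job.setOutputKeyClass(classOf[NullWritable])
    job.setOutputValueClass(classOf[Text])
    FileInputFormat.addInputPath(job, new Path(args(0)))
    FileOutputFormat.setOutputPath(job, new Path(args(1)))
    if (job.waitForCompletion(true)) 0 else 1
  }

  def main(args: Array[String]): Unit =
    System.exit(ToolRunner.run(CleanseJob, args))
}
```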
Confidential, Philadelphia, PA
Sr. Java/J2EE Developer
Responsibilities:
- Involved in a full life cycle Object Oriented application development - Object Modeling, Database Mapping, GUI Design.
- Developed the J2EE application based on the Service Oriented Architecture.
- Used Design Patterns like Singleton, Factory, Session Facade and DAO.
- Developed Use Case diagrams, Class diagrams and Sequence diagrams to express the detail design.
- Worked with EJB (Session and Entity) to implement the business logic to handle various interactions with the database.
- Created and injected Spring services, Spring controllers, and DAOs to achieve dependency injection and to wire objects of business classes.
- Used Spring bean inheritance to derive beans from already defined parent beans.
- Used the DAO pattern with Hibernate to fetch data from the database and carry out various database operations.
- Used SOAP Lite module to communicate with different web-services based on given WSDL.
- Used Hibernate Transaction Management, Hibernate Batch Transactions, and cache concepts.
- Created complex SQL queries, PL/SQL stored procedures, and functions for the back end.
- Developed various generic JavaScript functions used for validations.
- Developed screens using HTML5, CSS, jQuery, JSP, JavaScript, AJAX and Ext.JS.
- Used Aptana Studio and Sublime to develop and debug application code.
- Used Rational Application Developer (RAD) which is based on Eclipse, to develop and debug application code.
- Created user-friendly GUI interfaces and web pages using HTML, AngularJS, jQuery, and JavaScript.
- Used Log4j utility to generate run-time logs.
- Deployed business components into WebSphere Application Server.
- Developed Functional Requirement Document based on users' requirement.
Environment: Core Java, J2EE, JDK 1.6, Spring 3.0, Hibernate 3.2, Tiles, AJAX, JSP 2.1, Eclipse 3.6, IBM WebSphere 7.0, XML, XSLT, SAX, DOM Parser, HTML, UML, Oracle 10g, PL/SQL, JUnit.