
Big Data Engineer Resume

Dallas, TX

SUMMARY:

  • 6 years of experience in the Information Technology (IT) industry, including 2+ years of hands-on experience with Big Data ecosystem technologies such as Hadoop, MapReduce, Spark, Hive, HBase, Sqoop, Kafka, Oozie, Cassandra and Flume
  • Skilled at developing new applications on Hadoop according to business needs and migrating existing applications to the Hadoop environment.
  • Used NiFi to transform data between components of the Big Data ecosystem.
  • Used Spark Streaming and Kafka to process real-time data.
  • Wrote custom UDFs in Java to extend Hive core functionality.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Worked with RDBMS including MySQL and Oracle.
  • Worked with NoSQL databases including HBase, MongoDB and Cassandra.
  • Developed simple to complex MapReduce and streaming jobs using Scala and Java for data cleansing, filtering and aggregation.
  • Extensive hands-on experience with programming languages including Java, Python and Scala.
  • Proficient in writing HiveQL and SQL queries to achieve data manipulation
  • Performed data transformations across formats including SequenceFile, flat files, XML, JSON, Avro, Parquet and relational tables.
  • Strong in core Java, including Object-Oriented Design (OOD) and Java components such as the Collections Framework, exception handling and the I/O system.
  • Adept at using Sqoop to migrate data between RDBMS, NoSQL databases and HDFS
  • Developed real-time read/write access to very large datasets via HBase.
  • Consolidated MapReduce jobs by implementing Spark.
  • Experience with Apache Spark with Scala, Python and Java
  • Good knowledge of scheduling batch job workflow using Oozie
  • Familiar with development tools and methodologies including JIRA, Agile/Scrum and Waterfall.
  • Experience in collecting, aggregating and moving large amounts of streaming data using Flume, Kafka, Spark Streaming.
  • Demonstrated ability to communicate and gather requirements, partner with enterprise architects, business users, analysts and development teams to deliver rapid iterations of complex solutions.
  • Proficient in Data Visualization by creating multiple dashboards using Tableau
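The MapReduce-style cleansing, filtering and aggregation work described above can be sketched in plain Python. This is an illustrative sketch only: the records and field names are hypothetical, and the production jobs ran on Hadoop in Scala and Java.

```python
from collections import defaultdict

# Hypothetical records: (user_id, event, amount) — illustrative data only
records = [
    ("u1", "purchase", 30.0),
    ("u2", "purchase", 12.5),
    ("u1", "refund", -5.0),
    ("u2", "purchase", 7.5),
]

def map_phase(record):
    """Map step: emit (key, value) pairs, filtering out refunds."""
    user, event, amount = record
    if event == "purchase":  # data cleansing / filtering
        yield (user, amount)

def reduce_phase(pairs):
    """Reduce step: aggregate values per key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

pairs = [kv for rec in records for kv in map_phase(rec)]
totals = reduce_phase(pairs)
print(totals)  # {'u1': 30.0, 'u2': 20.0}
```

In a real MapReduce job the shuffle phase would group pairs by key across nodes; here the grouping happens inside `reduce_phase`.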

TECHNICAL SKILLS:

Hadoop Ecosystem: Apache Hadoop 2.5, Hive, Pig, HBase, Sqoop, Spark 1.6, Kafka, Oozie, Zookeeper

Databases: Oracle, MySQL, SQL, MongoDB, Cassandra

Languages: Python, Java, Scala, SQL, R

Visualization: Tableau, R

Web Technologies: HTML, CSS

PROFESSIONAL EXPERIENCE:

Confidential, Dallas, TX

Big Data Engineer

Responsibilities:

  • Developed data pipeline using Kafka, Sqoop, Hive and Java MapReduce to ingest data into HDFS for analysis.
  • Developed design documents covering all viable approaches and identifying the best one.
  • Aggregated and stored the resulting data in HDFS and HBase.
  • Managed data coming from different sources.
  • Developed business logic using Scala
  • Collected incoming data in real time and processed it with Spark Streaming and Spark SQL.
  • Worked with NoSQL databases such as HBase, creating HBase tables to load large sets of semi-structured data from various sources.
  • Developed scripts to automate end-to-end data management and synchronization across all clusters.
  • Worked with SparkContext, Spark SQL, DataFrames and pair RDDs.
  • Developed functional programs in Scala for connecting the streaming data application and gathering web data.
  • Implemented the workflows using Apache Oozie framework to automate tasks
  • Worked in an Agile environment. Effectively communicated with different levels of the management.
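The real-time processing described above follows a sliding-window aggregation pattern. The plain-Python sketch below illustrates that pattern with hypothetical event data; the production version used Spark Streaming (e.g. its window operations) over Kafka topics.

```python
# Hypothetical event stream: (timestamp_sec, value) tuples — illustrative only
events = [(0, 10), (1, 20), (2, 30), (3, 40), (4, 50), (5, 60)]

def windowed_sums(stream, window=3, slide=1):
    """Sliding-window aggregation, analogous to windowed reductions
    in Spark Streaming: sum all values whose timestamp falls in
    each [start, end) window."""
    results = []
    for end in range(window, stream[-1][0] + 2, slide):
        start = end - window
        total = sum(v for t, v in stream if start <= t < end)
        results.append((start, end, total))
    return results

for start, end, total in windowed_sums(events):
    print(f"window [{start}, {end}): sum = {total}")
```

Spark Streaming performs the same computation incrementally over distributed micro-batches rather than rescanning the full stream.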

Environment: Hadoop 2.5, Hive 1.2, Pig 0.16.0, Spark 1.6, Scala 2.11.8, MapReduce, HBase 1.1.2, Sqoop 1.4.6, Kafka 0.10.0.1

Confidential, Dallas, TX

Big Data Engineer

Responsibilities:

  • Created a production data lake capable of handling transactional processing operations using the Hadoop ecosystem.
  • Built a data ingestion layer using Spark and Sqoop on a distributed cluster.
  • Migrated data from various relational platforms to Hadoop and built a data warehouse using Hive, Oozie and Sqoop.
  • Prepared an ETL pipeline with Sqoop and Hive to regularly ingest data from the source and make it available for consumption.
  • Configured periodic incremental imports of data from Oracle into HDFS using Sqoop.
  • Worked extensively with structured data using HiveQL, including join operations and custom UDFs, and optimized Hive queries.
  • Implemented Spark Scala applications using higher-order functions for both batch and interactive analysis requirements.
  • Developed Spark and Hive jobs to summarize and transform data.
  • Loaded and transformed large sets of structured data using Spark.
  • Gathered requirements from the client and estimated timelines for developing complex Hive queries for logistics applications.
  • Created Hive tables, loaded data, and wrote Hive queries to analyze user request patterns; implemented performance optimizations including partitioning and bucketing.
  • Set up Oozie workflows for Hive/Sqoop actions.
  • Helped design HDFS storage to maintain an efficient number of block replicas.
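The Hive partitioning and bucketing mentioned above can be illustrated with a small Python sketch. The table layout, column names and hash function here are hypothetical; Hive itself lays out one directory per partition value and assigns each row to a bucket file by hashing the bucketing column.

```python
# Hypothetical rows: (user_id, event_date) — illustrative data only
rows = [("u1", "2016-01-01"), ("u2", "2016-01-01"), ("u3", "2016-01-02")]

NUM_BUCKETS = 4

def hash_bucket(key, n=NUM_BUCKETS):
    # Stable toy hash (Python's built-in hash() of str is salted per run)
    return sum(ord(c) for c in key) % n

def placement(user_id, event_date):
    """Mimic Hive's on-disk layout: partition directory + bucket number."""
    partition = f"event_date={event_date}"
    return partition, hash_bucket(user_id)

layout = {}
for user, date in rows:
    layout.setdefault(placement(user, date), []).append(user)
print(layout)
```

Partition pruning lets queries filtered on `event_date` skip whole directories, and bucketing supports efficient sampling and bucketed map joins.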

Environment: Hive 1.2, Sqoop 1.4.6, Hadoop 2.5, Oozie 4.2.0, Spark 1.6, Oracle, Scala 2.11.8

Confidential

Python Developer

Responsibilities:

  • Implemented scalable applications for information identification, extraction, analysis, retrieval.
  • Directed software design and development while remaining focused on client needs.
  • Collaborated closely with other team members to plan, design and develop robust solutions.
  • Interfaced with business analysts, developers and technical support to determine optimal specifications.
  • Evaluated interface between hardware and software.
  • Advised customers regarding maintenance of diverse software systems.

Environment: Ubuntu Linux, Python, OpenCV, Twilio, Raspberry Pi

Confidential

Python Developer

Responsibilities:

  • Designed a dynamic, interactive website that ensured a positive customer experience, resulting in a 40% increase in revenue.
  • Developed, tested and debugged software tools.
  • Implemented website functionality using class-based views and models to store data in SQLite database.
  • Developed the website using Python and the Django web framework, with HTML template tags, JavaScript and Bootstrap on the front end.
  • Implemented test programs and evaluated existing engineering processes.
  • Designed and configured database and back end application programs.
  • Performed research to explore and identify new technological platforms.
  • Collaborated with internal teams to convert end user feedback into meaningful and improved solutions.
  • Resolved ongoing problems and accurately documented progress of project.
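The SQLite-backed storage mentioned above can be sketched with Python's standard library. The actual site used Django's ORM and class-based views; the table name and data below are hypothetical.

```python
import sqlite3

# Minimal sketch: store site data in SQLite and query an aggregate.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (id INTEGER PRIMARY KEY, user TEXT, rating INTEGER)"
)
conn.executemany(
    "INSERT INTO feedback (user, rating) VALUES (?, ?)",
    [("alice", 5), ("bob", 3)],
)
conn.commit()

avg = conn.execute("SELECT AVG(rating) FROM feedback").fetchone()[0]
print(avg)  # 4.0
```

Django models compile to equivalent SQL; a `Feedback` model with a `rating` field and `Feedback.objects.aggregate(Avg("rating"))` would produce the same query.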

Environment: Ubuntu Linux, Python, Django web framework, HTML, CSS, Bootstrap, JavaScript

Confidential

Junior Java Developer

Responsibilities:

  • Participated in requirements analysis and the design of documentation.
  • Involved in development of core modules like ticket reservation, payment, user registration and hotel reservation.
  • Developed the application as per the functional requirements from the analysts.
  • Integrated SOAP web services and mapped the responses to display to the user interface.
  • Involved in designing the entire database for the application.
  • Involved in developing persistence layer using JDBC, SQL and stored procedures.
  • Developed presentation tier using JSP, Servlets, HTML, CSS, JavaScript and jQuery.
  • Deployed the application on the JBoss server.
  • Used Subversion (SVN) for version control, including source code check-ins and check-outs.
  • Participated in scrum meetings as a part of Agile Methodology.

Environment: JSP, Servlets, Java 1.7, JDBC, HTML, CSS, JavaScript, Eclipse, SOAP, JBoss
