
Big Data Engineer / Hadoop Developer Resume

Plano, TX

SUMMARY:

  • Detail-oriented data engineer with 10 years of IT experience, including 3 years as a Hadoop Developer/Big Data Engineer and 4 years of hands-on experience in Oracle Database Administration.
  • Experience implementing big data applications using the Hadoop framework; importing and exporting data between HDFS and relational database systems using Sqoop; Linux shell scripting; Hortonworks and Cloudera distributions; and major Hadoop ecosystem components such as MapReduce, HDFS, Hive, Impala, HBase, Cassandra, Sqoop, Oozie, Kafka, ZooKeeper, YARN, and Spark.
  • 2 years of experience with Kafka: skilled in ingesting streaming data into Spark clusters using Kafka and processing DStreams in Spark, and in working on in-memory Apache Spark applications for ETL/transformations.
  • Experience in implementing continuous deployment systems with Jenkins and AWS OpsWorks.
  • Experience working with large data sets, optimizing the performance of Sqoop jobs, scheduling job flows using Oozie, developing ETL scripts for data cleansing and transformation, developing data marts, and maintaining data warehouses.
  • Experience creating parameterized, drill-through, drill-down, matrix, tabular, and sub-reports, and handling Slowly Changing Dimensions while loading data into the data warehouse.
  • Skilled in Java, Python, Scala, and SQL.
  • Strong analytical skills; customer focused; takes responsibility for projects and drives results through execution. Efficient communicator, demonstrated by working closely with users to identify and resolve problems.

PROFESSIONAL EXPERIENCE:

Confidential, Plano TX

Big Data Engineer / Hadoop Developer

Responsibilities:

  • Analyzed data from multiple customer data sources using Python.
  • Ran, tested, and deployed on Confidential &T server.
  • Performed data transformations, filtering, sorting and aggregation using Hive.
  • Imported and exported data using Sqoop between relational databases and HDFS/Hive tables.
  • Created and updated Linux shell scripts as per business requirements.
  • Supported big data tools and frameworks such as MapReduce, HDFS, Hive, HBase, and Sqoop.
  • Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive semi-structured and unstructured data. Loaded unstructured data into the Hadoop Distributed File System (HDFS).
  • Assisted in Operations and Administration of Hadoop cluster in production.

Confidential, Irving, TX

Big Data Engineer / Hadoop Developer

Responsibilities:

  • Assisted in Operations and Administration of Hadoop cluster in production.
  • Supported big data tools and frameworks such as MapReduce, HDFS, Hive, HBase, Sqoop, Flume, Pig, Oozie, and Splunk.
  • Evaluated architectural decisions around support of data streaming technologies like Storm, Kafka and Spark.
  • Architected the design and build for receipt and loading of data feeds into the Hadoop Cluster.
  • Handled deployment of applications in the AWS environment using various AWS components and best practices.
  • Set up a Tableau Server environment for publishing management-level dashboards.
  • Loaded the aggregated data into a relational database for reporting and analysis, which revealed ways to lower operating costs, boost throughput, and improve product quality.
  • Created Spark RDDs from data files and then applied transformations and actions to them.
  • Created Hive tables with dynamic and static partitioning and bucketing for efficiency. Also created external tables in Hive for staging purposes.
  • Loaded Hive tables with data, wrote Hive queries that run on MapReduce, and created a customized BI tool for management teams to perform query analytics using HiveQL.
  • Loaded data from the Linux file system into HDFS using Linux shell commands.
  • Created and updated Linux shell scripts as per business requirements.
  • Imported and exported data using Sqoop between relational databases and HDFS/Hive tables.
  • Aggregated RDDs based on business requirements, converted them into DataFrames saved as temporary Hive tables for intermediate processing, and stored the results in HBase/Cassandra and RDBMSs.
  • Used Spark SQL to run analytics on large data sets.
  • Developed Spark scripts using Scala shell commands as per business requirements.
  • Converted Cassandra/Hive/MySQL/HBase queries into Spark RDDs using Spark transformations and Scala.
  • Conducted POCs on migrating to Spark and Spark Streaming using Kafka to process live data streams, and compared Spark performance with Hive and SQL.
  • Ingested data from RDBMS and performed data transformations in Spark and exported the transformed data to HBase/Cassandra as per the business requirement.
  • Set up Oozie workflow jobs for Hive/Sqoop/HDFS/Spark actions.
  • Created fact and dimension tables from Hive data for reporting purposes.
  • Integrated Hadoop into traditional ETL, accelerating the extraction, transformation, and loading of massive semi-structured and unstructured data. Loaded unstructured data into the Hadoop Distributed File System (HDFS).
  • Created a data lake by extracting data from various sources into HDFS. Data sources included RDBMSs and files in different formats.

Confidential, Overland Park, KS

AWS Solutions Architect-Associate

Responsibilities:

  • Developed and provided input on hosting applications on the AWS Cloud platform, taking into account infrastructure, sizing, disaster recovery, data protection, and application requirements.
  • Designed and deployed highly available, dynamically scalable, cost-effective, fault-tolerant, and reliable applications on the AWS platform.
  • Contributed to proposals, statements of work, lessons learned and process improvement.
  • Implemented customized AWS solutions for clients with on-premises and cloud environments.
  • Evaluated customers' requirements and made recommendations for implementing, deploying, and provisioning applications on the AWS platform.
  • Implemented continuous deployment systems with Jenkins and AWS OpsWorks.
  • Provided and documented best practices, backup procedures, and troubleshooting guides, and kept infrastructure and architecture drawings current with changes.
  • Created training modules to assist and train new employees during onboarding.

Confidential, Overland Park, KS

Oracle Database Administrator/Sr Business Analyst

Responsibilities:

  • Worked extensively on performance tuning of queries by maintaining aggregates, compression, partitioning, and indexing, and by using hints, stored outlines, and statistics.
  • Resolved user helpdesk tickets for various issues via the Remedy database, from creating user accounts and read-only accounts at the database and table level to assisting users with logging into the database. Experienced in using 10g features such as RMAN, Data Pump, Flashback Recovery, and ASM.
  • Installed Oracle 11g with ASM and OCFS2 file systems.
  • Extensive experience with RMAN backups, hot backups, and logical backups.
  • Experience managing large-scale OLTP databases (5+ TB) in HA and DR setups.
  • Solid experience in Oracle database migration, upgrades, applying patch sets, code reviews, database releases and deployments, and data center consolidation activities. Proficient with cloning, migration, and patching of Oracle databases.
  • Experience performing upgrades, including maintenance and monitoring implementation.

Confidential, Independence, MO

Junior Oracle Database Administrator

Responsibilities:

  • Worked extensively on performance tuning of queries by maintaining aggregates, compression, partitioning, and indexing, and by using hints, stored outlines, and statistics.
  • Resolved user helpdesk tickets for various issues via the Remedy database, from creating user accounts and read-only accounts at the database and table level to assisting users with logging into the database. Experienced in using 10g features such as RMAN, Data Pump, Flashback Recovery, and ASM.
