Data Engineer Resume
Irving, TX
SUMMARY
7 years of overall experience as an Oracle/Application Developer. Solid understanding of Big Data components - Hive, HBase, MongoDB, Oozie, Sqoop, Spark, Kafka, MySQL, Oracle. Developed expertise in Python, PySpark and PyMongo. Good understanding of the SDLC for design and development, data warehousing, relational database systems and ETL tools. Worked with Pandas and Matplotlib. 3 years of experience in C, C++, Visual Basic 6, Visual C++ and VBA. Experienced in Linux shell scripting.
TECHNICAL SKILLS
Hadoop Ecosystem: Hadoop, Spark, Kafka, Hive, Oozie, Sqoop, MapReduce, HDFS and ZooKeeper
Programming Languages and Libraries: Python (Pandas, Matplotlib), C, C++, Visual Basic
Databases: Oracle 9i/10g, MySQL, NoSQL (MongoDB, HBase)
Platforms: Linux, Windows
Methodologies: Agile and Waterfall Model
PROFESSIONAL EXPERIENCE
Data Engineer
Confidential
Responsibilities:
- Expertise in Hadoop ecosystem components such as HDFS, MapReduce, YARN, HBase, Sqoop, Oozie, Spark, ZooKeeper and Hive, along with SQL, PL/SQL and MongoDB, for scalable, distributed and high-performance computing.
- Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture.
- Good knowledge of single-node and multi-node cluster configurations.
- Experienced in tuning Hive performance using partitioning, bucketing and indexing.
- Created Hive tables to store data in HDFS and processed data using HiveQL.
- Transferred MySQL tables to Hive with Sqoop.
- Able to load .csv files into HBase.
- Experienced with Sqoop and with Oozie workflows and scheduling.
- Intermediate to advanced knowledge of Python programming.
- Extensively involved in project activities such as gathering information and analyzing and documenting functional and business requirements.
- Interacted with external vendors in finalizing the business requirements.
- Built and maintained SQL scripts, indexes, and complex queries for data analysis and extraction.
- Developed stored procedures and complex packages extensively using PL/SQL and shell programs.
Technologies used: Oracle 9i, SQL, PL/SQL, Linux, Windows