Data Engineer Resume
SUMMARY
- Over 11 years of IT experience.
- 4+ years of experience in Big Data technologies, including Hadoop 2.6.0 (Cloudera 5.10) and Hortonworks; well versed in Spark, Python, Hive, Impala, HBase, and Sqoop.
- Experience in importing and exporting data between HDFS and Relational Database Systems (RDBMS) using Sqoop (see the Sqoop sketch after this summary).
- Good knowledge of Python for developing and running Spark jobs.
- Good knowledge of AWS cloud services.
- In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce.
- Expert in working with Hive data warehouses: creating tables and distributing data by implementing partitioning and bucketing (see the Hive DDL sketch after this summary).
- Work with technical teams and end users to understand business requirements and identify data solutions.
- Responsible for gathering requirements, performing analysis, and formulating requirement specifications from consistent stakeholder inputs.
- In-depth knowledge of the agile, iterative Software Development Life Cycle (SDLC) process.
- Develop database migration strategies, including schema design and migration of different kinds of database source systems to a Hadoop data lake, and automate migration processes.
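A minimal sketch of the kind of Sqoop import referenced above, driven from Python via subprocess; the JDBC URL, credentials file, table, and target directory are hypothetical placeholders, and Sqoop is assumed to be on the PATH.

```python
import subprocess

# Hypothetical connection details; a real job would read these from config.
sqoop_import = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://dbhost:3306/sales",
    "--username", "etl_user",
    "--password-file", "/user/etl/.db_password",  # keep secrets off the command line
    "--table", "orders",
    "--target-dir", "/data/raw/orders",
    "--num-mappers", "4",
]

# Fail loudly if Sqoop returns a non-zero exit code.
subprocess.run(sqoop_import, check=True)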
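A companion sketch of the partitioned, bucketed Hive table design mentioned in the summary, issued through PySpark's Hive support; the database, table, and column names are illustrative only.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-ddl-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Partition by load date for partition pruning; bucket by customer_id
# to speed up joins and sampling on that key.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales.orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DECIMAL(10,2)
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (customer_id) INTO 16 BUCKETS
    STORED AS ORC
""")
```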
TECHNICAL SKILLS
Operating Systems: Windows, Linux, Unix
Big Data Technologies: Hadoop 2.6.0-cdh5.10 (Cloudera), Hortonworks, Hue, Ambari, Hive, Impala, HBase, Sqoop, Spark 2.3.4, PySpark, Java UDFs
Databases & Tools: Oracle, MS SQL Server, MySQL
Programming Languages: Python 3.7, Java 8, VB, VB.net
Scripting Languages: UNIX Shell Scripting
Development IDE / Tools: Zeppelin Notebook, Eclipse, VSCode
Source Code Control: Git, PVCS, SVN
PROFESSIONAL EXPERIENCE
Data Engineer
Confidential
Responsibilities:
- Developed Sqoop scripts for the initial loading of data.
- Wrote shell scripts to extract Adobe clickstream data and ingest it into HDFS.
- Developed Spark data transformations.
- Developed Python scripts using Spark and machine learning libraries (see the PySpark sketch after this role's technology list).
- Actively participated in requirement gathering and business requirement analysis.
Technologies: Hortonworks on RHEL 7.4, Hive, Spark 2.3.4, Python, Zeppelin Notebook, Unix shell scripting, Teradata, Oracle, Adobe Clickstream, and data modeling.
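A minimal PySpark sketch of the kind of transformation described above, assuming a tab-separated Adobe clickstream extract with hit_time_gmt (epoch seconds) and visitor_id columns; the paths and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-transform").getOrCreate()

# Read a raw tab-separated clickstream extract from HDFS.
clicks = (
    spark.read
    .option("sep", "\t")
    .option("header", "true")
    .csv("/data/raw/adobe_clickstream")
)

# Derive a calendar date from the epoch-seconds hit timestamp,
# then count hits per visitor per day.
daily_hits = (
    clicks
    .withColumn("hit_date", F.to_date(F.from_unixtime("hit_time_gmt")))
    .groupBy("visitor_id", "hit_date")
    .agg(F.count("*").alias("hits"))
)

# Write the curated output to HDFS, partitioned by date.
daily_hits.write.mode("overwrite").partitionBy("hit_date").parquet("/data/curated/daily_hits")
```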
Data Engineer
Confidential
Responsibilities:
- Developed Sqoop jobs with Unix scripts to load data from SQL Server into HDFS.
- Developed Hive queries (HQL) for data mapping and validation (see the validation sketch after this role).
- Developed a customized application using Unix bash scripts for TSV file ingestion (an ingestion sketch also follows this role).
- Wrote shell scripts for loading and ingesting data into HDFS.
- Analyzed the bottlenecks and dependencies of existing systems.
- Responsible for direct interaction with the client to gather requirements.
- Performed requirement analysis and design in building the data lake.
- Worked with the business and architects to understand the impact of requirements and streamline them.
- Acted as an interface between the business and technical teams to identify and resolve issues.
- Actively participated in requirement gathering and business requirement analysis.
Technologies: Cloudera 5.8.3, Hive, Impala, Java UDFs, Unix shell scripts, PySpark, SQL Server.
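A minimal sketch of the kind of Hive validation query mentioned above, run through PySpark's Hive support; the staging and curated table names are hypothetical.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-validation")
    .enableHiveSupport()
    .getOrCreate()
)

# Reconcile row counts between the source staging table and the curated table.
src_cnt = spark.sql("SELECT COUNT(*) AS cnt FROM staging.orders").first()["cnt"]
tgt_cnt = spark.sql("SELECT COUNT(*) AS cnt FROM curated.orders").first()["cnt"]

if src_cnt != tgt_cnt:
    raise ValueError(f"Row count mismatch: staging={src_cnt}, curated={tgt_cnt}")
```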
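And a sketch of the TSV ingestion flow, pushing landed files into HDFS; the original used bash scripts, so this Python version via subprocess is an equivalent illustration, and the directory paths are placeholders.

```python
import subprocess
from pathlib import Path

landing = Path("/landing/tsv")   # local directory where TSV files arrive
archive = landing / "archive"
archive.mkdir(exist_ok=True)
hdfs_target = "/data/raw/tsv"    # HDFS ingestion directory

# Push each landed TSV file into HDFS, then archive the local copy.
for tsv in sorted(landing.glob("*.tsv")):
    subprocess.run(["hdfs", "dfs", "-put", "-f", str(tsv), hdfs_target], check=True)
    tsv.rename(archive / tsv.name)
```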