Big Data Engineer Lead Resume
SUMMARY
- 14 years of total IT experience in analysis, design, development, implementation, support, and testing of software applications, including 4 years in Big Data/Hadoop using HDFS, Hive, Hue, Spark, Kafka, Flume, Pig, Sqoop, HBase, YARN, Oozie, Python, Hortonworks, Cloudera, Unix shell scripting, MapReduce, AWS, EC2, Google Cloud, Core Java, and Agile.
- Working as a Senior Big Data Engineer using Core Java, HiveQL, Sqoop, MapReduce, Scala, Spark SQL, and PySpark.
- Hands-on ETL experience ingesting data from RDBMS sources (SQL Server, MySQL, DB2) into Hive and HDFS using Sqoop.
- Working in PySpark, particularly with SparkSession, and with Pandas and generators in Python (see the sketch at the end of this summary).
- Good exposure to ORC, Avro, and Parquet file formats in HiveQL.
- Working with HiveQL and Spark SQL, with extensive knowledge of SQL Server queries for complex datasets.
- Hands-on experience parsing XML, JSON, and CSV files using Python scripts.
- Good exposure to GitHub repositories and Jenkins for CI/CD.
- Involved in creating, tuning, partitioning, and bucketing tables and writing queries in Hive.
- Working with the TIDAL job automation tool, JIRA tickets, ServiceNow, SecureCRT, SecureFX, and WinSCP.
- Handling thousands of jobs per day, performing data ingestion and tracking job status through the TIDAL automation tool.
- Generating JSON scripts and writing Unix shell scripts to call Sqoop import/export.
- Working with product owners to review requirements, analyze business logic, and translate them into detailed tasks for user stories in Rally.
- Working closely with team members to implement the requirements captured in Rally user stories.
- Working with relational SQL and NoSQL databases, including Oracle, Hive, and HBase, with Sqoop for data transfer.
- Monitoring jobs, queues, and HDFS capacity using ZooKeeper and vendor-specific front-end cluster management tools.
- Worked on the YARN Capacity Scheduler and on commissioning and decommissioning cluster nodes.
- Hands-on experience in security setup with Ranger, Kerberos, and LDAP, and integration with Active Directory.
- Onboarding users to Hadoop: configuration, access control, disk quotas, permissions, etc.
- Good exposure to the Hadoop framework, Big Data concepts, and cluster configuration.
- Configured and worked with a 100-node Hadoop cluster; installed and configured Hadoop, HBase, Hive, Pig, Sqoop, and Flume using the Ambari UI.
- Experience using ServiceNow and BMC Remedy to handle incidents, tickets, and change creation.
- Excellent understanding of Hadoop architecture and the components of Hadoop clusters, including JobTracker, TaskTracker, NameNode, and DataNode.
- Strong experience collecting and storing streaming data, such as log data, in HDFS using Apache Flume.
- Hands-on experience in .NET, Java, JSP, SQL Server, MySQL, Access, VBA macros, Oracle PL/SQL, and Crystal Reports.
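A minimal sketch of the SparkSession-driven Hive workflow referenced above, assuming a Hive-enabled Spark deployment; the database, table, and column names (sales_db.orders, order_date, etc.) are hypothetical placeholders:

```python
# Sketch: query a Hive table through SparkSession and write it back as
# partitioned ORC (one of the file formats listed above). All table and
# column names here are hypothetical.
from pyspark.sql import SparkSession

# SparkSession with Hive support so HiveQL tables are visible to Spark SQL
spark = (
    SparkSession.builder
    .appName("hive-ingestion-sketch")
    .enableHiveSupport()
    .getOrCreate()
)

# Run a HiveQL/Spark SQL query against an existing Hive table
orders = spark.sql("SELECT order_id, order_date, amount FROM sales_db.orders")

# Persist the result as a partitioned ORC table
# (Parquet works the same way via .format("parquet"))
(
    orders.write
    .mode("overwrite")
    .partitionBy("order_date")
    .format("orc")
    .saveAsTable("sales_db.orders_orc")
)

spark.stop()
```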
TECHNICAL SKILLS
Operating Systems: Windows, Linux, Ubuntu (accessed via PuTTY and SecureCRT)
Languages/Technologies: Big Data, Hadoop, HDFS, PySpark, Python, Spark, MapReduce, Hive, Hue, Pig, Sqoop, Flume, Kafka, Storm, Core Java, .NET, VBA Macros, JSP, RESTful APIs, JavaScript, ASP.NET, VB.NET, C#, VB6, Classic ASP
IDEs and Build Tools: Eclipse
Databases: HBase, MySQL 5.0, Oracle 10g, SQL Server
Scripting Tools: Shell Scripting, JavaScript, VBScript
Cloud Technologies: AWS (S3, EC2), Google Cloud
PROFESSIONAL EXPERIENCE
Confidential
Big Data Engineer Lead
Responsibilities:
- Working on data ingestion requests from multiple sources: DB2, Oracle, SQL Server, and MySQL servers.
- Handling thousands of jobs per day, performing data ingestion and tracking job status through the TIDAL automation tool.
- Generating JSON scripts and writing Unix shell scripts to call Sqoop import/export.
- Working on Python scripts for XML and JSON parsing of data based on requirements (see the parsing sketch after this list).
- Ingesting tables from multiple RDBMS sources into Hive and HDFS.
- Handling Spark SQL and HiveQL queries using SparkSession.
- Using Agile methodology, taking on work and delivering output on a story and sprint basis.
- Working on Python and PySpark scripts based on user requirements.
- Provided support to users for diagnosing, reproducing, and fixing Hadoop-related issues.
- Using JIRA and ServiceNow for ticketing, incidents, and change request creation.
- Created a Unix smoke test script for cluster health checks.
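A minimal sketch of the XML/JSON parsing step described above, using only the Python standard library; the file names and tag/field names are hypothetical:

```python
# Sketch: parse an XML feed and a JSON file, then emit one JSON record per
# line for downstream ingestion. File, tag, and field names are hypothetical.
import json
import xml.etree.ElementTree as ET

# Pull one dict per <record> element from the XML document
root = ET.parse("feed.xml").getroot()
records = [
    {"id": rec.get("id"), "status": rec.findtext("status")}
    for rec in root.iter("record")
]

# Merge in records from a JSON payload with a top-level "records" array
with open("feed.json") as fh:
    records.extend(json.load(fh).get("records", []))

# Write newline-delimited JSON, a convenient format for an HDFS put
with open("feed.jsonl", "w") as out:
    for rec in records:
        out.write(json.dumps(rec) + "\n")
```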
Confidential
Big Data Developer
Responsibilities:
- Involved in writing MapReduce jobs, Hive queries, and Pig scripts, and in resolving Hive and HBase issues.
- Involved in Hive performance tuning and writing HiveQL and Pig scripts based on requirements.
- Involved in resolving access issues, performance issues, and patch/upgrade-related issues.
- Involved in cluster configuration and in monitoring cluster activity through the Ambari UI and the Unix CLI.
- Performing and supporting change tasks for the Big Data/Hadoop environments, covering changes across the entire Hadoop ecosystem (e.g., hardware, OS, cluster, Hive, or HBase).
- Validating and/or turning down services before and after any infrastructure (network, hardware, OS) changes.
- Involved in Hive query performance work, table maintenance, and HBase activities.
- Provided support to users for diagnosing, reproducing, and fixing Hadoop-related issues.
- Using BMC Remedy for ticketing, incidents, and change request creation.
- Created a smoke test script for cluster health checks (sketched below).
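The smoke test above is summarized here as a minimal Python sketch; the specific checks it ran are an assumption about what such a script typically covers:

```python
# Sketch of a cluster health smoke test. The checks below (HDFS report,
# HDFS write, trivial Hive query) are illustrative assumptions.
import subprocess
import sys

CHECKS = [
    ("hdfs report", ["hdfs", "dfsadmin", "-report"]),
    ("hdfs write", ["hdfs", "dfs", "-touchz", "/tmp/smoke_test_marker"]),
    ("hive query", ["hive", "-e", "SHOW DATABASES;"]),
]

failed = False
for name, cmd in CHECKS:
    result = subprocess.run(cmd, capture_output=True, text=True)
    status = "OK" if result.returncode == 0 else "FAIL"
    print(f"[{status}] {name}")
    failed = failed or result.returncode != 0

# Non-zero exit lets a scheduler flag the failed health check
sys.exit(1 if failed else 0)
```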
Confidential
Hadoop Developer
Responsibilities:
- Working with product owners to review requirements, analyze business logic, and translate them into detailed tasks for user stories in Rally.
- Participating in daily scrum meetings to collaborate with other scrum team members.
- Working closely with team members to implement the requirements captured in Rally user stories.
- Working with relational SQL and NoSQL databases, including Oracle, Hive, and HBase, with Sqoop for data transfer.
- Involved in Hive performance tuning and in writing Hive UDFs and Pig UDFs based on requirements (see the sketch after this list).
- Involved in importing and exporting data to and from HDFS using Sqoop, and in resolving access, performance, and patch/upgrade-related issues.
- Involved in Hive query performance work, table maintenance, and HBase activities.
- Preparing and reviewing runbooks for change requests and use case documents.
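The Hive UDFs above would normally be Java classes; as a Python-based illustration of the same row-level idea, Hive's TRANSFORM clause can stream rows through a script. The script, table, and column names here are hypothetical:

```python
#!/usr/bin/env python
# clean_status.py - hypothetical row-level transform, invoked from HiveQL as:
#   ADD FILE clean_status.py;
#   SELECT TRANSFORM (id, status)
#   USING 'python clean_status.py' AS (id, status_norm)
#   FROM events;
import sys

# Hive streams each row to stdin as tab-separated text
for line in sys.stdin:
    row_id, status = line.rstrip("\n").split("\t")
    # The UDF-style logic: normalize the status field
    print(f"{row_id}\t{status.strip().lower()}")
```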