Senior Hadoop Developer Resume
NY
SUMMARY:
- 6+ years of IT experience in a variety of industries, including 4+ years of hands-on experience in Big Data technologies.
- In-depth understanding and knowledge of Hadoop architecture and its components, such as HDFS, MapReduce, Job Tracker, Task Tracker, Name Node, Data Node, Resource Manager, and Node Manager.
- Solid understanding of ETL architectures, data movement technologies and database concepts like Partitioning and Performance optimization.
- Design and development experience in ETL of data from various source systems using Sqoop, Hive, Pig, and a data lake for analytics.
- Design and development experience in transformation of data using Hive, Impala, HDFS, Pig, and Spark Core/Spark SQL.
- Experience in analyzing data using HiveQL, Pig Latin, and HBase.
- Experience in building transformation logic using Oracle SQL & PL/SQL.
- Experience in performance tuning of large transformations on Hadoop, Spark, and PL/SQL.
- Strong understanding of data warehouse and data lake technology.
- Experience in developing conceptual, logical, and physical designs for various data types and large data volumes.
- Developed data lakes to manage data coming from different source systems and loaded it into HDFS.
- Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
- Good understanding of MapReduce architecture.
- Familiar with designing data models for different NoSQL databases such as HBase, MongoDB, and Cassandra.
- Good knowledge of Python programming and Unix/shell scripting.
- Strong background in mathematics and very good analytical and problem-solving skills.
TECHNICAL SKILLS:
Hadoop Technologies: HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Flume, Oozie, Apache Spark, Impala.
Hadoop Distribution: Cloudera CDH, Hortonworks HDP, MapR.
Programming Languages: Python, C, SQL, PL/SQL, XML, HTML.
Database Systems: DB2, Oracle, MySQL, Teradata, HBase, MongoDB, Cassandra
Monitoring Tools: Cloudera Manager
Operating Systems: Windows, Linux, UNIX
PROFESSIONAL EXPERIENCE:
Confidential, NY
Senior Hadoop Developer
Responsibilities:
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop (a sketch of this pattern follows this list).
- Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented a nine-node CDH 5.8.x Hadoop cluster on Red Hat Linux.
- Involved in loading data from the Linux file system to HDFS.
- Created HBase tables to store PII data in variable formats coming from different portfolios.
- Implemented a script to transmit information from Oracle to HBase using Sqoop.
- Implemented best income logic using Pig scripts and UDFs.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Responsible to manage data coming from various sources and involved in loading data from UNIX file system to HDFS.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Supported setting up the QA environment and updated configurations for implementing scripts with Pig and Sqoop.
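A minimal HiveQL sketch of the import-then-transform pattern above; the staging table, its columns, and the curated target are hypothetical placeholders for data landed by a Sqoop import, not the actual schema.

    -- Hypothetical staging table over files landed by a Sqoop import from MySQL
    CREATE EXTERNAL TABLE IF NOT EXISTS staging_customers (
      customer_id BIGINT,
      state       STRING,
      balance     DECIMAL(12,2)
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION '/data/staging/customers';

    -- Transform and load into a curated HDFS table
    CREATE TABLE IF NOT EXISTS curated_customer_balances STORED AS ORC AS
    SELECT customer_id,
           UPPER(state)      AS state,
           ROUND(balance, 2) AS balance
    FROM staging_customers
    WHERE customer_id IS NOT NULL;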
Confidential, NY
Hadoop Developer
Responsibilities:
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Created partitions and bucketing across state in Hive to handle structured data (see the sketch after this list).
- Implemented dashboards that run HiveQL queries internally, such as aggregation functions, basic Hive operations, and different kinds of join operations.
- Used Pig for three distinct workloads: pipelines, iterative processing, and research.
- Extensively used Pig to communicate with Hive via HCatalog and with HBase via storage handlers.
- Implemented MapReduce jobs to write data into Avro format.
- Created Hive tables to store the processed results in a tabular format.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Implemented various MapReduce jobs in custom environments and updated the results to HBase tables by generating Hive queries.
- Created tables, secondary indexes, and join indexes in the Teradata development environment for testing.
- Extracted data from other databases through Sqoop, placed it in HDFS, and processed it.
- Captured data logs from web servers into HDFS using Flume.
- Experienced in writing Pig scripts and Pig UDFs to pre-process the data for analysis.
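A minimal HiveQL sketch of the state-partitioned, bucketed table design mentioned above; table and column names are illustrative, not the actual schema.

    -- Structured data partitioned by state and bucketed for join/sampling efficiency
    CREATE TABLE IF NOT EXISTS accounts_by_state (
      account_id BIGINT,
      balance    DECIMAL(12,2),
      opened_on  STRING
    )
    PARTITIONED BY (state STRING)
    CLUSTERED BY (account_id) INTO 32 BUCKETS
    STORED AS ORC;

    -- Load a single partition from a hypothetical staging table
    INSERT OVERWRITE TABLE accounts_by_state PARTITION (state = 'NY')
    SELECT account_id, balance, opened_on
    FROM staging_accounts
    WHERE state = 'NY';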
Environment: z/OS, Windows 7, Hadoop HDFS cluster with MapR 5.1
Confidential, CT
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs for data cleansing and pre-processing.
- Worked on cluster installation, commissioning and decommissioning of data nodes, name node recovery, capacity planning, and slot configuration.
- Developed advanced Hive queries to generate data reports (see the sketch after this list). Successfully delivered 40 sample data reports on time in 2-month delivery cycles.
- Used multithreading, synchronization, caching, and memory management.
- Used Sqoop to import data into HDFS and Hive from other data systems.
- Created and maintained stored procedures, adding and changing them as needed.
- Understood high-level and end-to-end design documents in the Confidential area.
- Understood and translated business requirements into data models.
- Responsible for business deliveries as part of the daily job.
- Performed impact analysis on existing database objects to understand dependencies and performance issues.
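A sketch of the kind of Hive reporting query described above, assuming a hypothetical transactions table; the grouping columns are placeholders.

    -- Monthly aggregation report per region (table and columns hypothetical)
    SELECT region,
           txn_month,
           COUNT(*)              AS txn_count,
           SUM(amount)           AS total_amount,
           ROUND(AVG(amount), 2) AS avg_amount
    FROM transactions
    GROUP BY region, txn_month
    ORDER BY region, txn_month;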
Environment: z/OS, Windows 7, Hadoop HDFS cluster with Cloudera Manager.
Confidential, MA
Hadoop Developer
Responsibilities:
- Built new tenant platforms on project request (ingesting data from the data lake, data modeling, Pig scripts, Hive HQL, HBase, and MongoDB).
- Created reports for the BI team, using Sqoop to move data into HDFS and Hive.
- Involved in requirement analysis, design, and development.
- Exported and imported data between HDFS, HBase, and Hive using Sqoop.
- Worked closely with the business and analytics teams in gathering system requirements.
- Loaded and transformed large sets of structured and semi-structured data.
- Loaded data into Hive partitioned tables using various partitioning techniques (see the sketch after this list).
- Created job flows and scheduled jobs using Oozie.
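A minimal sketch of one partitioning technique mentioned above, dynamic partition loading in Hive; the events tables are hypothetical.

    -- Enable dynamic partitioning for this session
    SET hive.exec.dynamic.partition = true;
    SET hive.exec.dynamic.partition.mode = nonstrict;

    -- Each distinct load_date value in the source creates its own partition
    INSERT OVERWRITE TABLE events_partitioned PARTITION (load_date)
    SELECT event_id, event_type, payload, load_date
    FROM events_staging;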
Confidential, GA
SQL Developer
Responsibilities:
- Worked with internal and external clients for import and normalization of third-party data.
- Documented and maintained database system specifications, diagrams, and connectivity charts.
- Monitored and provided front-line support of daily processes.
- Built and maintained SQL scripts, indexes, and complex queries for data analysis and extraction.
- Performed quality assurance and testing of the SQL Server environment and ensured that all data assets were available.
- Created a PL/SQL module used to integrate existing third-party data into the database (see the sketch after this list).
- Developed several programs to ensure correct data transfer.
- Prepared and updated documentation (data analysis forms, data mapping files, business requirement descriptions, data validation forms, etc.).
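A minimal PL/SQL sketch of a third-party integration module of the kind described above; the staging and target tables are hypothetical, and a real module would add per-source validation and error handling.

    CREATE OR REPLACE PROCEDURE load_third_party_data AS
    BEGIN
      -- Merge staged third-party rows into the target table (names hypothetical)
      MERGE INTO customers tgt
      USING third_party_staging src
      ON (tgt.customer_id = src.customer_id)
      WHEN MATCHED THEN
        UPDATE SET tgt.email = src.email, tgt.updated_at = SYSDATE
      WHEN NOT MATCHED THEN
        INSERT (customer_id, email, updated_at)
        VALUES (src.customer_id, src.email, SYSDATE);
      COMMIT;
    END load_third_party_data;
    /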
Confidential
BI Developer/Data Analyst
Responsibilities:
- Prepared the Business Requirement Specification (BRS) document giving detailed information about the requirements.
- Wrote complex SQL queries and subqueries.
- Designed, developed, and modified stored procedures, packages, and functions for derived columns in table transformations (see the sketch after this list).
- Interacted with the client and different stakeholders.
- Worked on user-defined data types, nested tables, and collections.
- Worked with regular expressions, hierarchical SQL functions, and SQL modeling.
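A short PL/SQL sketch of a function for a derived column of the kind mentioned above; the tiering rule and thresholds are illustrative only.

    CREATE OR REPLACE FUNCTION balance_tier (p_balance IN NUMBER)
      RETURN VARCHAR2
    IS
    BEGIN
      -- Derive a reporting tier from a raw balance (thresholds hypothetical)
      RETURN CASE
               WHEN p_balance >= 100000 THEN 'PLATINUM'
               WHEN p_balance >= 10000  THEN 'GOLD'
               ELSE 'STANDARD'
             END;
    END balance_tier;
    /

    -- Example usage in a derived-column transformation:
    -- SELECT customer_id, balance_tier(balance) AS tier FROM customers;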