Hadoop Engineer Resume
Orlando, Florida
SUMMARY
- 7+ years of experience in software design, development, maintenance, testing, and troubleshooting of enterprise applications.
- Over 4 years of experience in the design, development, maintenance, and support of Big Data analytics using Hadoop ecosystem components such as HDFS, Hive, Pig, Sqoop, ZooKeeper, MapReduce, and Oozie.
- Strong working experience with ingestion, storage, processing and analysis of big data.
- Good experience writing MapReduce programs in Java.
- Expertise in writing Hadoop Jobs for analyzing data using Hive and Pig.
- Good experience designing NiFi flows for data routing, transformation, and mediation logic.
- Successfully loaded files into HDFS from Oracle, SQL Server, and Teradata using Sqoop.
- Experience with SQL, PL/SQL, and database concepts.
- Good experience with job workflow scheduling tools such as Oozie.
- Good understanding of NoSQL databases.
- Experience creating databases, tables, and views in Hive and Impala.
- Experience with performance tuning of MapReduce and Hive jobs.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop ecosystem components.
- Implemented several optimization mechanisms such as Combiners, Distributed Cache, data compression, and custom Partitioners to speed up jobs.
- Worked with Sqoop to import and export data between HDFS/Hive and databases such as MySQL and Oracle.
- Experience with cloud configuration in Amazon Web Services (AWS).
- Experience in working with different data sources like Flat files, XML files and Databases.
- Experience in database design, entity relationships, and database analysis; programming SQL, PL/SQL stored procedures, packages, and triggers in Oracle.
- Experience in various phases of the Software Development Life Cycle (analysis, requirements gathering, design), with expertise in documenting requirement specifications, functional specifications, test plans, source-to-target mappings, and SQL joins.
- Hands on experience with Spark-SQL & Spark Streaming.
- Used Spark SQL and Scala APIs to query and transform data residing in Hive (see the sketch at the end of this summary).
- Knowledge of RDDs, DataFrames, and Datasets.
- Proficient in PL/SQL programming - Stored Procedures, Functions, Packages, SQL tuning, and creation of Oracle Objects - Tables, Views, Materialized Views, Triggers, Sequences, Database Links, and User Defined Data Types.
- Proficient in Oracle 11g/10g/9i/8i, PL/SQL back end applications development using Toad, SQL Plus, and PL/SQL Developer.
- Proficient in the ETL tool Informatica PowerCenter for developing data warehouse loads, with work experience focused on data integration per client requirements.
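A minimal, hypothetical sketch of the Spark SQL over Hive work described above, using Spark's Java API; the database, table, and column names are made up for illustration only.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveQueryExample {
    public static void main(String[] args) {
        // Hive support makes existing Hive tables visible to Spark SQL
        SparkSession spark = SparkSession.builder()
                .appName("HiveQueryExample")
                .enableHiveSupport()
                .getOrCreate();

        // Query a Hive table (database/table/column names are illustrative)
        Dataset<Row> orders = spark.sql(
                "SELECT order_id, order_date, amount FROM sales.orders WHERE order_date >= '2018-01-01'");

        // Simple transformation: daily totals
        Dataset<Row> dailyTotals = orders.groupBy("order_date").sum("amount");
        dailyTotals.show();

        spark.stop();
    }
}
```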
TECHNICAL SKILLS
Hadoop Ecosystem: MapReduce, Hive, Pig, Flume, Sqoop, Oozie.
Monitoring and Automation: Nagios, Ganglia, Cloudera Manager.
Databases: Oracle 9i/10g/11g, SQL Server 2005/2008.
Languages: Java, C.
No SQL databases: Cassandra.
Reporting Tools: Cognos 8.x, Cognos 10.x Report Studio, Query Studio, Metric Studio, Analysis Studio, Event Studio, Framework Manager
Other Tools: SQL Management Studio, Eclipse, Quality Center, Serena Version Control Tool, MATLAB.
PROFESSIONAL EXPERIENCE
Confidential, Orlando, Florida
Hadoop Engineer
Responsibilities:
- Used NiFi to ingest data from various sources into the data lake.
- Worked with PySpark for various transformations.
- Created Hive managed and external tables.
- Used Kafka for the streaming application.
- Created Kafka topics that receive data from source systems via NiFi; a Spark job consumes the topics and pushes the records into an IBM Cloudant database (see the sketch after this project).
- Worked with partitioning, bucketing, and other optimizations in Hive.
- Worked with ORC and JSON file formats and used various compression techniques to make efficient use of HDFS storage.
- Developed and implemented core API services using Spark with Scala.
- Used Rally to track the user stories and tasks to be completed in each sprint.
- Ingested data from Hive into Spark, created DataFrames, and wrote the results to the IBM Cloudant database.
- Used PyDev as the development environment for the business logic that calls REST APIs to create and update documents in the IBM Cloudant database.
- Prepared data for business users with Paxata (a data preparation tool).
- Worked on various production issues during month-end support and provided resolutions without missing any SLAs.
- Used GitHub to set the overall direction of the project and track its progress.
- Used Paxata to deliver data to BI users for building dashboards for the theme park's daily ticket sales.
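A minimal, hypothetical sketch of the Kafka-to-Spark flow described above, written against Spark Structured Streaming's Java API. The broker address and topic name are illustrative, the spark-sql-kafka connector is assumed to be on the classpath, and the Cloudant write is replaced by a console sink to keep the sketch self-contained.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaStreamSketch {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("KafkaStreamSketch")
                .getOrCreate();

        // Read the NiFi-fed topic as a stream; broker and topic names are illustrative
        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker1:9092")
                .option("subscribe", "park-sales-events")
                .load()
                .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)");

        // The real job would push each record to Cloudant via its REST API;
        // here the stream is simply echoed to the console to stay self-contained
        StreamingQuery query = events.writeStream()
                .format("console")
                .option("truncate", "false")
                .start();

        query.awaitTermination();
    }
}
```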
Confidential, New York City, New York
Hadoop Engineer
Responsibilities:
- Processed large sets of structured, semi-structured, and unstructured data and supported the systems application architecture.
- Participated in multiple big data POCs to evaluate different architectures, tools, and vendor products.
- Designed and implemented a Spark Streaming application.
- Pulled data from an Amazon S3 bucket into HDFS.
- Worked with various file formats such as Avro and ORC.
- Used Sqoop to import history data from Teradata and load it into HDFS.
- Built a DI/DQ utility to validate data between Teradata and the Hadoop Distributed File System.
- Responsible for optimizing resource allocation in distributed systems.
- Responsible for monitoring application performance.
- Used Bedrock (an automation tool) to perform Hive incremental updates.
- Worked with RDDs, pair RDDs, and DataFrames.
- Optimized Hive queries using Spark, bringing cluster usage down from roughly 80% of resources to 20% through coalesce and repartition (see the sketch after this project).
- Worked on a POC evaluating the performance of Paxata (a data wrangling tool).
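A minimal, hypothetical sketch of the coalesce/repartition tuning mentioned above, using Spark's Java API; the table, columns, partition counts, and output path are illustrative only.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class PartitionTuningSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("PartitionTuningSketch")
                .enableHiveSupport()
                .getOrCreate();

        // Read a wide Hive table (table and column names are illustrative)
        Dataset<Row> history = spark.sql(
                "SELECT account_id, txn_amount FROM edw.transaction_history");

        // repartition() on the grouping key before the shuffle spreads work evenly across executors
        Dataset<Row> totals = history
                .repartition(200, history.col("account_id"))
                .groupBy("account_id")
                .sum("txn_amount");

        // coalesce() before writing avoids producing thousands of small output files
        totals.coalesce(20)
              .write()
              .mode("overwrite")
              .orc("/data/curated/account_totals");

        spark.stop();
    }
}
```

Repartitioning evens out the shuffle work while coalescing trims the output file count, which is where the resource savings described above would come from.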
Confidential, Delaware, NJ
Big Data/Hadoop Consultant
Responsibilities:
- Developed Big Data Solutions that enabled the business and technology teams to make data-driven decisions on the best ways to acquire customers and provide them business solutions.
- Developed various MapReduce programs to cleanse the data and make it consumable by Hadoop.
- Migrated the existing data to Hadoop from RDBMS (Oracle) using Sqoop for processing the data.
- Used Sqoop export to move the data back to the RDBMS.
- Used various compression codecs to effectively compress the data in HDFS.
- Wrote Pig Latin scripts to run advanced analytics on the collected data.
- Created Hive internal and external tables with appropriate static and dynamic partitions for efficiency.
- Used the Avro SerDe for serialization and deserialization and implemented custom Hive UDFs for date functions (see the sketch after this project).
- Worked on a POC to benchmark the efficiency of Avro vs Parquet.
- Implemented the end-to-end workflow for extraction, processing, and analysis of data using Oozie.
- Applied various optimization techniques to Hive, Pig, and Sqoop jobs.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Unix.
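A minimal, hypothetical sketch of a custom Hive UDF for a date function, as mentioned above; the date formats, class name, and registration statement are illustrative only.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Normalizes dates from MM/dd/yyyy to yyyy-MM-dd; registered in Hive with
//   ADD JAR /path/to/date-udfs.jar;
//   CREATE TEMPORARY FUNCTION normalize_date AS 'NormalizeDateUDF';
public class NormalizeDateUDF extends UDF {

    private final SimpleDateFormat inputFormat = new SimpleDateFormat("MM/dd/yyyy");
    private final SimpleDateFormat outputFormat = new SimpleDateFormat("yyyy-MM-dd");

    public Text evaluate(Text input) {
        if (input == null) {
            return null;
        }
        try {
            return new Text(outputFormat.format(inputFormat.parse(input.toString())));
        } catch (ParseException e) {
            // Return NULL for unparseable values so bad records do not fail the query
            return null;
        }
    }
}
```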
Confidential, Kansas City, Missouri
Big Data/Hadoop Developer
Responsibilities:
- Created Hive tables and loaded and analyzed data using Hive queries.
- Developed and executed custom MapReduce programs, Pig Latin scripts and HQL queries.
- Worked on importing the data from different databases into Hive Partitions directly using Sqoop.
- Performed data analytics in Hive and then exported the metrics to RDBMS using Sqoop.
- Involved in running Hadoop jobs for processing millions of records of text data.
- Extensively used Pig for data cleaning and optimization.
- Implemented complex MapReduce programs to perform map-side joins using the distributed cache (see the sketch after this project).
- Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Extracted tables using Sqoop, placed them in HDFS, and processed the records.
Environment: CDH, Hadoop, HDFS, MapReduce, Hive, Pig, Sqoop, Unix.
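A minimal, hypothetical sketch of a map-side join using the distributed cache, as mentioned above. The lookup file name and record layouts are illustrative, and the driver is assumed to have added the file with job.addCacheFile(new URI("/ref/customers.txt#customers.txt")).

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Map-side join: a small lookup file shipped via the distributed cache is loaded
// once per mapper in setup(), so the join needs no reduce phase.
public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

    private final Map<String, String> customerLookup = new HashMap<>();

    @Override
    protected void setup(Context context) throws IOException {
        // The cached file is symlinked into the task's working directory by name
        try (BufferedReader reader = new BufferedReader(new FileReader("customers.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",");
                customerLookup.put(parts[0], parts[1]); // customerId -> customerName
            }
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Input record layout (illustrative): customerId,orderAmount
        String[] fields = value.toString().split(",");
        String name = customerLookup.get(fields[0]);
        if (name != null) {
            context.write(new Text(fields[0]), new Text(name + "," + fields[1]));
        }
    }
}
```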
Confidential
Java Developer
Responsibilities:
- Involved in Analysis, Design, Coding and Development of custom Interfaces.
- Gathered requirements from the client for designing the Web Pages.
- Gathered specifications for the Library site from different departments and users of the services.
- Assisted in proposing suitable UML class diagrams for the project.
- Wrote SQL scripts to create and maintain the database, roles, users, tables, views, procedures, and triggers in Oracle.
- Designed and implemented the UI using HTML and Java.
- Worked on database interaction layer for insertions, updating and retrieval operations on data.
- Implemented multi-threading functionality using the Java Threading API (see the sketch after this project).
- Analyzed, designed, and developed a few new transactions from the ground up.
- Coordinated and communicated with onsite resources regarding issues raised in the production environment and fixed day-to-day issues.
- Handled release management and code reviews.
- Extensively used JSP and Struts for application development.
- Made partial use of Hibernate, EJB, and web services.
- Involved in all Payday Transactions issue fixes and enhancements.
- Developed web components using JSP, Servlets and JDBC.
- Supported with UAT, Pre-Prod and Production Build management.
- Developed new modules such as Title Pledge.
- Involved in the analysis of Safe/Drawer Transactions, Loan deposit modules and development of Collection Letters.
- Coordinated with the team on fixes and releases.
- Maintained and debugged applications.
- Unit tested and documented website applications and code.
Environment: Java, Java Servlets, JSP, XML, HTML, JavaScript, Oracle 11g, Eclipse 3.3.1
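A minimal, hypothetical sketch of the Java Threading API usage mentioned above; the Runnable task list and pool size stand in for the application's own transaction classes and sizing.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Runs independent units of work (e.g. transaction postings) in parallel on a
// fixed-size thread pool; Runnable tasks stand in for the application's own classes.
public class ParallelTransactionRunner {

    public void runAll(List<Runnable> tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            for (Runnable task : tasks) {
                pool.submit(task); // up to four tasks execute concurrently
            }
        } finally {
            pool.shutdown();                            // stop accepting new work
            pool.awaitTermination(5, TimeUnit.MINUTES); // wait for in-flight tasks to finish
        }
    }
}
```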