Hadoop Developer Resume

Chicago, IL

SUMMARY:

  • 8 years of extensive experience, including 3 years in Big Data and Big Data analytics in the E-commerce, Education, Financials, and Healthcare domains and 5 years in development and implementation of database applications using Oracle 10g/9i, SQL, and PL/SQL.
  • Hands-on experience using Hadoop technologies such as HDFS, HIVE, SQOOP, and Impala.
  • Hands-on experience writing Hive and Pig jobs that run as Map Reduce.
  • Experience importing and exporting data between different systems and the Hadoop file system using SQOOP.
  • Used Hadoop ecosystem components to store and process data, and exported data to Tableau over a live connection.
  • Experience creating databases, tables, and views with HiveQL and Impala, and writing Pig Latin scripts.
  • Strong knowledge of Map Reduce concepts; around 1 year of experience with Spark and Scala.
  • Hands-on experience working with ecosystem components such as Hive, Pig, and Map Reduce.
  • Strong knowledge of Hadoop, Hive, and Hive analytical functions. Proficient in building Map Reduce programs using Hive and Pig.
  • Involved in data migration onto the Hadoop stack from different databases (SQL Server 2008 R2, Oracle, and MySQL).
  • Successfully loaded files to Hive and HDFS from MySQL.
  • Loaded datasets into Hive for ETL operations. Good knowledge of Hadoop cluster architecture and cluster monitoring.
  • Good understanding of cloud configuration in Amazon Web Services (AWS). In-depth understanding of data structures and algorithms.
  • Experience deploying applications on heterogeneous application servers: Tomcat, WebLogic, IBM WebSphere, and Oracle Application Server.
  • Strong written, oral, interpersonal, and presentation skills. Able to perform at a high level, meet deadlines, and adapt to ever-changing priorities.
  • Extensive work experience with different SDLC approaches such as Waterfall and Agile development methodologies.
  • Good communication and presentation skills. Ability to identify and resolve problems both independently and quickly.
  • Moving data from HDFS to RDBMS and vice-versa using SQOOP. Collecting and aggregating large amounts of log data using Apache Flume and staging data in HDFS for further analysis.
  • Installed and configured Hadoop cluster in Test and Production environments.
  • Performed both major and minor upgrades to the existing CDH cluster. Commissioned and decommissioned nodes in the existing cluster.
  • Analyzing/Transforming data with Hive and Pig.

SKILLS:

APACHE HADOOP HDFS (2 years), Hadoop (2 years), Hadoop Distributed File System (2 years), Oracle (6 years), SQL (8 years)

TECHNICAL SKILLS:

Big Data Hadoop Stack: HDFS, MRv2 (YARN), SQOOP, Flume, PIG and Hive

Spark: Spark Core, Spark Streaming, Spark SQL

Machine Learning: Prediction, Classification, Clustering and Time series algorithms

NoSQL: HBase and MongoDB

Programming Language: Java, Python, R and Scala.

Analytics Tools: RStudio, Weka, Excel

RDBMS: MySQL, DB2 and Oracle

Reporting: Tableau, QlikView, D3JS and Excel

WORK EXPERIENCE:

Hadoop Developer

Confidential, Chicago, IL

Responsibilities:

  • Managed and reviewed Hadoop log files. Tested raw data and executed performance scripts.
  • Shared responsibility for administration of Hadoop, Hive and Pig.
  • Developed Map Reduce programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with EDW reference tables and historical metrics.
  • Enabled speedy reviews and first mover advantages by using Oozie to automate data loading into the Hadoop Distributed File System and PIG to pre-process the data.
  • Provided design recommendations and thought leadership to sponsors/stakeholders that improved review processes and resolved technical problems.
  • Developed a clear understanding of the client's business requirements for the risk rating and report modules.
  • Worked on setup of 2-node and 5-node clusters with the CDH3 distribution.
  • Involved in data prediction analysis using the K-Means algorithm.
  • Coordinated discussions with the customer and functional team as required to gather inputs.
  • Worked closely with technology counterparts to communicate the business requirements. Application design and database design.
  • Technical design document preparation.

Environment: Java, Machine Learning, Cloudera, Apache Hadoop, HDFS, Hive, Pig, Apache Spark, Spark Streaming, Spark SQL, Scala, Git.

Hadoop Developer

Confidential, Chicago, IL

Responsibilities:

  • Led the Big Data Analytics solution project to load data from source systems all the way through to the client's Modern Analytics Platform.
  • Analyzed and ingested Policy, Claims, Billing, and Agency data into the client's solution through multiple stages.
  • Wrote multiple Map Reduce programs for extraction, transformation, and aggregation of data from different sources in multiple file formats, including XML, JSON, CSV, and other compressed file formats.
  • Assisted with data capacity planning and node forecasting.
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using SQOOP and automated the SQOOP jobs by scheduling in Oozie.
  • Created Hive scripts to load data from one stage into another and implemented incremental loads with the changed data architecture.
  • Created Hive tables as internal or external tables, per requirements, defined with appropriate static and dynamic partitions and bucketing for efficiency.
  • Performed data analysis and queries with Hive and Pig on Ambari (Hortonworks).
  • Enhanced Hive performance by implementing optimization and compression techniques.
  • Implemented Hive partitioning and bucketing to improve query performance in the Staging layer, which is a de-normalized form of the Analytics Model.
  • Implemented techniques for efficient execution of Hive queries, such as map joins, compressed map/reduce output, and parallel query execution.
  • Issued SQL queries via Impala to process the data stored in HDFS and HBASE.
  • Planned and reviewed the deliverables. Assisted the team in their development and deployment activities.
  • Involved in cluster setup meetings with the administration team.

Environment: Apache Hadoop 2.2.0, Hortonworks, MapReduce, Hive, HBase, HDFS, PIG, Sqoop, Flume, Impala, Spark, Oozie, Kafka, MongoDB, UNIX, Shell Scripting, XML, JSON.

Hadoop Developer

Confidential, Oak Brook, IL

Responsibilities:

  • Analyzed large datasets to provide strategic direction to the company.
  • Involved in system and business analysis.
  • Developed SQL statements to improve back-end communications.
  • Loaded unstructured data into Hadoop File System (HDFS).
  • Created ETL jobs to load Twitter JSON data and server data into MongoDB and transported the MongoDB data into the Data Warehouse.
  • Created reports and dashboards using structured and unstructured data.
  • Involved in importing data from MySQL to HDFS using SQOOP.
  • Involved in writing Hive queries to load and process data in Hadoop File System.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as Map Reduce jobs.
  • Involved in working with Impala for the data retrieval process.
  • Exported data from Impala to the Tableau reporting tool and created dashboards on a live connection.
  • Performed sentiment analysis on reviews of the products on the client's website.
  • Exported the resulting sentiment analysis data to Tableau for creating dashboards.

Environment: Cloudera, CDH4.3, Hadoop, Map Reduce, HDFS, Hive, MongoDB, SQOOP, MySQL, SQL, Impala, Tableau.

Database Developer

Confidential, Northbrook, IL

Responsibilities:

  • Write T-SQL statements and use local and global Temp Tables, Views, and CTEs to support data extraction
  • Create Stored Procedures and Views to support reporting and interface data requirements
  • Create SSIS Packages for interfaces from different end users and different departments, loading data into various output files
  • Design daily/weekly/monthly SSRS reports utilizing multiple types of reports including sub reports, Drill Through/Drill Down, and Cascading Parameterized Reports
  • Work with deployment environment at final stage and establish database connections to tables and datasets
  • Provide migration and integration between SQL Server 2008 and 2012 under SQL Server Management Studio
  • Participate in multiple Software Development Life Cycle (SDLC) phases, including requirement gathering from end users, providing requirement analysis, and designing data mapping documents
  • Provide data cleansing including removal of duplicate data, conforming date and time data type formatting, string editing, and data conversions
  • Update queries and Stored Procedures for existing reports according to updated requirements, and provide maintenance for database environments
  • Developed views that required complex joins while maintaining quick response times.
  • Involved in writing complex queries for the project as required.
  • Involved in loading flat files into database using SQL*Loader.
  • Modified, tested, and debugged the PL/SQL packages, functions, procedures, and triggers according to the requirements.
  • Developed/modified stored procedures, functions, triggers, views and synonyms to implement the business logic.

Environment: Oracle 9i, Windows XP, SQL*Loader, SQL Server 2008/2012, SQL Server Integration Services (SSIS), SQL Server Reporting Services (SSRS), MS Office 2007/2013, Windows 8/7, QlikView, Team Foundation Server.

Oracle PL/SQL Developer

Confidential, Chicago Heights, IL

Responsibilities:

  • Analyzed Business requirements based on the Business Requirement Specification document.
  • Loaded data into Oracle tables using SQL*Loader.
  • Created PL/SQL stored procedures, functions, and packages for moving the data from the staging area to the database.
  • Involved in generating numbers for primary key values using Oracle SEQUENCE objects.
  • Performed extensive query analysis and tuning using indexes and hints, and wrote numerous complex queries involving sub-queries, correlated queries, UNION/UNION ALL, MINUS, inline SQL, and analytical functions.
  • Developed program specifications for PL/SQL Procedures and Functions to do the data migration and conversion.
  • Created a wide range of data types, tables, index types, and scoped variables.
  • Designed the front end interface for the users, using Oracle Forms.
  • Involved in database development by creating Oracle PL/SQL Functions, Procedures, Triggers, Packages, Records and Collections.
  • Involved in development of the ETL process using SQL*Loader and a PL/SQL package.
  • Developed and customized Forms/Reports Using Oracle D2K.
  • Designed Data layouts and Developer Reports using Oracle D2K.
  • Implemented batch jobs (shell scripts) for loading database tables from Flat Files using SQL*Loader.
  • Participated in Performance Tuning using Explain Plan.
  • Created numerous database triggers using PL/SQL.
  • Created UNIX shell and Perl scripts for data file handling and manipulations.

Environment: Oracle 9i/10g, SQL, PL/SQL, SQL*Plus, Oracle D2K, SQL*Loader.

SQL Developer

Confidential, Carbondale, IL

Responsibilities:

  • Generated database SQL Scripts and deployed databases including installation and configuration
  • Plan, design, and implement application database code objects, such as stored procedures and views.
  • Build and maintain SQL scripts, indexes, and complex queries for data analysis and extraction.
  • Provide database coding to support business applications using Sybase T-SQL.
  • Perform quality assurance and testing of SQL server environment.
  • Develop new processes to facilitate import and normalization, including data file for counterparties.
  • Work with business stakeholders, application developers, and production teams and across functional units to identify business needs and discuss solution options.
  • Ensure best practices are applied and integrity of data is maintained through security, documentation, and change management.
  • Developed SQL Scripts to Insert/Update and Delete data in MS SQL database tables
  • Experience in writing PL/SQL and in developing and implementing Stored Procedures
  • Developed complex SQL queries to perform efficient data retrieval operations including stored procedures, triggers etc.
  • Build data connection to the database using MS SQL Server
  • Used different joins, subqueries, and nested queries in SQL.
  • Worked with different sources such as Oracle, SQL, and flat files.
  • Worked on a project to extract data from an XML file to a SQL table and generate data file reports using SQL Server 2008.

Environment: MySQL, SQL Server 2008 (SSRS, SSIS), Visual Studio 2000/2005, MS Excel.
