
Senior Hadoop Developer Resume

Wilmington, DE

SUMMARY:

  • Senior Hadoop Developer with 7+ years of IT experience in software development and support, with expertise in developing strategic methods for deploying big data technologies to efficiently solve large-scale data processing requirements.
  • Successful history of effectively implementing systems and directing key initiatives. Deep-rooted interest in designing and crafting efficient, modern software.
  • Skilled in troubleshooting, with a proven ability to provide creative and effective solutions through highly developed problem-solving skills.
  • Hadoop Developer: Experience in installing, configuring, maintaining, and monitoring Hadoop clusters (Apache, Cloudera, and Sandbox).
  • Hadoop Distributions: Hortonworks, Cloudera CDH4 and CDH5, and Apache Hadoop.
  • Hadoop Ecosystem: Hands-on experience with the Hadoop ecosystem, including HDFS, Sqoop, MapReduce, YARN, Pig, Hive, Impala, ZooKeeper, and Oozie.
  • Worked on enterprise-level Cloudera Hadoop clusters ranging from 500 TB to 2 PB in size.
  • Data Ingestion: Designed flows and configured the individual components using Flume. Efficiently transferred bulk data to and from traditional databases with Sqoop.
  • Data Storage: Experience in maintaining distributed storage on HDFS.
  • Data Processing: Processed data using MapReduce and Spark.
  • Data Analysis: Expertise in analyzing data using Pig scripts, Hive queries, Spark (Python), and Impala.
  • Migrated stored procedures into Hadoop transformations.
  • Management and Monitoring: Maintained and coordinated the ZooKeeper service, and designed and monitored Oozie workflows. Used the Azkaban batch job scheduler to control job workflows.
  • Messaging System: Used Kafka in a proof of concept to achieve faster message transfer across systems.
  • Scripting: Expertise in Hive, Pig, Impala, shell scripting, Perl scripting, and Python.
  • Cloud Platforms: Configured Hadoop clusters in OpenStack and Amazon Web Services (AWS).
  • Visualization Integration: Integrated Tableau and Alteryx using the Impala and Hive ODBC connectors.
  • Java/J2EE: Expertise in Spring Web MVC and Hibernate. Proficient in HQL (Hibernate Query Language).
  • Project Management: Experience with Agile/Scrum project management and Jira.
  • Web Interface Design: HTML, CSS, JavaScript, and Bootstrap.
  • A quick learner with a proclivity for new technology and tools.

TECHNICAL SKILLS:

Java/J2EE, Hue, C++, C, Linux, Windows, UNIX, MapReduce, Hive, Impala, Pig, Sqoop, ZooKeeper, Oozie, Spark, Toad, Flume, HBase, Netezza, DB2, Teradata, MySQL, Oracle, Maven, Eclipse, IntelliJ, JDBC, JSON, Python, Scala, Perl, JavaScript, Autosys, JUnit, NetBeans, PuTTY, WinSCP, FileZilla, Splunk, Log4j.

CHRONOLOGICAL SUMMARY OF EXPERIENCE:

Senior Hadoop Developer

Confidential, Wilmington, DE

Responsibilities:

  • Involved in the design and development of common frameworks and utilities across work streams.
  • Coordinated with business users to understand business requirements as part of development activities.
  • Implemented a common JDBC utility for data sourcing in Spark.
  • Improved the performance of Spark jobs by tuning job-level configuration settings.
  • Optimized and tuned Spark applications using storage-level mechanisms such as persist and cache (see the sketch after this list).
  • Used HBase tables to store Kafka offset values.
  • Handled the data enrichment process in Spark for all dimension and fact tables.
  • Exported final tables to the Essbase multidimensional database for business validation.
  • Handled Spark return codes with a custom method for jobs running in cluster mode.
  • Used broadcast variables for input control files as part of the enrichment process.
  • Used Splunk for log analysis in the UAT and production environments to reduce the operations support burden.
  • Resolved a Spark precision-loss issue by using the Scala BigDecimal data type.
  • Imported data from RDBMS sources such as Oracle and Teradata.
  • Implemented a custom JAR utility for Excel-to-CSV file conversion.
  • Used compression codecs such as Snappy and Gzip for data loads and archival.
  • Ran incremental stats and compute stats as daily and weekly jobs to prevent memory issues and long-running queries in Impala.
  • Wrote JIL scripts to schedule jobs in the Autosys automation tool.
  • Automated daily, monthly, quarterly, and ad hoc data loads in Autosys to run on the scheduled calendar dates.
  • Involved in production support, BAU activities, and release management.
  • Expertise in writing custom UDFs in Hive.
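
A minimal sketch of the persist/cache and broadcast pattern described in the bullets above, using Spark's Java API. The application name, table name, and control-file path are hypothetical stand-ins.

```java
import java.util.List;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.storage.StorageLevel;

public class EnrichmentJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("dimension-enrichment")              // hypothetical app name
                .enableHiveSupport()
                .getOrCreate();

        // Cache a dimension table that several downstream joins reuse;
        // MEMORY_AND_DISK spills to disk instead of recomputing on eviction.
        Dataset<Row> dims = spark.table("stage.dim_account"); // hypothetical table
        dims.persist(StorageLevel.MEMORY_AND_DISK());

        // Ship the small control file to every executor once, rather than
        // re-reading it inside each task.
        List<String> controlRows = spark.read()
                .textFile("/data/control/enrichment_rules.txt") // hypothetical path
                .collectAsList();
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
        Broadcast<List<String>> rules = jsc.broadcast(controlRows);

        // ... enrichment joins against dims, consulting rules.value(), go here ...

        dims.unpersist();
        spark.stop();
    }
}
```

MEMORY_AND_DISK is generally a safer storage level for large dimension tables than the MEMORY_ONLY default, since partitions that do not fit in memory are spilled rather than recomputed.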

Environment: Cloudera Hadoop, Spark, Scala, Hive, HBase, Kafka, Essbase, Shell, Sqoop, XML Workflows, Splunk, Teradata, Oracle, Hue, Impala, SVN, Bitbucket.

Senior Hadoop Developer

Confidential, Charlotte, NC

Responsibilities:

  • Coordinated with business users to gather business requirements and interacted with technical leads on application-level design.
  • Implemented custom file upload processes in PySpark.
  • Implemented a common JDBC utility for data sourcing in Spark (see the sketch after this list).
  • Optimized and tuned Spark applications using persist, cache, and broadcast.
  • Improved the performance of Spark jobs by tuning job-level configuration settings.
  • Involved in edge-node migration for an enterprise-level cluster and rebuilt the application to the new architecture standards.
  • Created Oozie workflows, managing and coordinating jobs and combining multiple jobs sequentially into one unit of work.
  • Imported and exported data across RDBMS sources such as Oracle, Teradata, SQL Server, and Netezza, and Linux systems such as SAS Grid.
  • Handled semi-structured data such as Excel and CSV files, importing it from SAS Grid into HDFS via an SFTP process.
  • Ingested data into Hive tables using Sqoop and SFTP processes.
  • Used compression codecs such as Snappy and Gzip for data loads and archival.
  • Created data pipelines and implemented a wide range of data transformations using Hadoop and Spark.
  • Performed data-level transformations in intermediate tables before forming the final tables.
  • Handled data integrity checks using Hive queries, Hadoop, and Spark.
  • Exposed all reporting tables in Tableau through the Impala server for better performance.
  • Implemented Kerberos security authentication for applications using keytabs.
  • Wrote JIL scripts to schedule jobs in the Autosys automation tool.
  • Automated daily, monthly, quarterly, and ad hoc data loads in Autosys to run on the scheduled calendar dates.
  • Involved in production support, BAU activities, and release management.
  • Expertise in writing custom UDFs in Hive.
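
A minimal sketch of a common JDBC sourcing utility of the kind described above. This project's work was in PySpark; Java is used here for consistency with the other sketches, and the class name, connection details, and split column are hypothetical.

```java
import java.util.Properties;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public final class JdbcSource {
    private JdbcSource() {}

    // Reads one table over JDBC using partitioned parallel scans, so large
    // extracts from Oracle/Teradata/SQL Server/Netezza do not funnel through
    // a single task.
    public static Dataset<Row> read(SparkSession spark, String url, String table,
                                    String user, String password,
                                    String splitColumn, long lower, long upper) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // 8 partitions => 8 concurrent JDBC connections against the source.
        return spark.read().jdbc(url, table, splitColumn, lower, upper, 8, props);
    }
}
```

A Teradata pull would then be a single call such as `JdbcSource.read(spark, "jdbc:teradata://host/db", "sales.orders", user, pass, "order_id", 1L, 10000000L)` (all values hypothetical).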

Environment: Cloudera Hadoop, PySpark, Hive, Pig, Shell, Sqoop, Oozie Workflows, Teradata, Netezza, SQL Server, Oracle, Hue, Impala, Jenkins, Kerberos.

Hadoop Developer

Confidential, Denver CO

Responsibilities:

  • Planned, installed, and configured distributed Hadoop clusters.
  • Used Sqoop to ingest data from MySQL and other sources into HDFS on a regular basis.
  • Configured Hadoop ecosystem tools such as Hive, Pig, ZooKeeper, Flume, Impala, and Sqoop.
  • Loaded data into Hive tables from MySQL using Sqoop, as well as from Pig and Hive outputs.
  • Designed data flows and configured the individual components using Flume.
  • Transferred bulk data to and from traditional databases with Sqoop.
  • Migrated SQL stored procedures into Hadoop transformations.
  • Wrote batch operations across multiple rows for DDL (Data Definition Language) and DML (Data Manipulation Language), using client API calls for improved performance.
  • Grouped and filtered data using Hive queries (HiveQL) and Pig Latin scripts.
  • Queried both managed and external Hive tables using Impala.
  • Implemented partitioning and bucketing in Hive for more efficient data querying.
  • Created Oozie workflows, managing and coordinating jobs and combining multiple jobs sequentially into one unit of work.
  • Maintained HDFS distributed storage and HBase columnar storage.
  • Analyzed data using Pig scripts, Hive queries, and Impala.
  • Maintained and coordinated the ZooKeeper service, and designed and monitored Oozie workflows.
  • Designed and created both managed and external Hive tables depending on requirements.
  • Wrote custom UDFs in Hive (see the sketch after this list).
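
A minimal sketch of a custom Hive UDF of the kind described above; the function and its normalization rule are hypothetical. The older UDF base class matches CDH4-era Hive.

```java
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical UDF that trims and upper-cases a string column.
@Description(name = "normalize_code",
             value = "_FUNC_(str) - returns str trimmed and upper-cased")
public class NormalizeCode extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null; // Hive NULLs arrive as Java nulls and pass through
        }
        return new Text(input.toString().trim().toUpperCase());
    }
}
```

Once packaged into a JAR, the function would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.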

Environment: Cloudera distribution CDH4, Hadoop, MapReduce, MySQL, Linux, Hive, Pig, Impala, Sqoop, ZooKeeper.

Hadoop Developer

Confidential, MA

Responsibilities:

  • Analyzed the Hadoop cluster and various big data analytic tools, including Pig, Hive, and Sqoop.
  • Installed Hadoop, MapReduce, and HDFS, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing (see the sketch after this list).
  • Coordinated with business customers to gather business requirements, interacted with technical peers to derive technical requirements, and delivered the BRD and TDD documents.
  • Extensively involved in the design phase and delivered design documents.
  • Involved in testing and coordinated user testing with the business.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Wrote Hive jobs to parse logs and structure them in tabular format to facilitate effective querying of the log data.
  • Created Hive tables, loaded them with data, and wrote Hive queries that run internally as MapReduce jobs.
  • Experienced in defining job flows.
  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting.
  • Experienced in managing and reviewing Hadoop log files.
  • Used Pig as an ETL tool for transformations, joins, and pre-aggregations before storing the data in HDFS.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Managed data coming from different sources.
  • Utilized the Cloudera distribution of Apache Hadoop.
  • Created the data model for Hive tables.
  • Performed unit testing and delivered unit test plans and results documents.
  • Exported data from HDFS into an RDBMS using Sqoop for report generation and visualization.
  • Worked on the Oozie workflow engine for job scheduling.
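
A minimal sketch of a map-only MapReduce data-cleaning job of the kind described above. The pipe-delimited, five-field record layout is a hypothetical example.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogCleanJob {
    // Drops malformed records and trims each field.
    public static class CleanMapper
            extends Mapper<LongWritable, Text, Text, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context ctx)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\\|", -1);
            if (fields.length != 5) {
                ctx.getCounter("clean", "malformed").increment(1);
                return; // skip records that do not match the expected layout
            }
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) out.append('|');
                out.append(fields[i].trim());
            }
            ctx.write(new Text(out.toString()), NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "log-clean");
        job.setJarByClass(LogCleanJob.class);
        job.setMapperClass(CleanMapper.class);
        job.setNumReduceTasks(0); // map-only: cleaned records go straight to HDFS
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(NullWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```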

Environment: HDFS, Hive, MapReduce, Shell, Pig, Sqoop, Oozie.

Java Developer

Confidential

Responsibilities:

  • Designed the initial web/WAP pages for a better UI per the requirements.
  • Involved in developing the functional flow of the mZone application.
  • Integrated social media APIs into the application.
  • Used Ajax and JavaScript to handle asynchronous requests to the server, and CSS for the look and feel of the application.
  • Designed basic class diagrams, sequence diagrams, and event diagrams as part of the documentation.
  • Created Hibernate POJOs and developed Hibernate mapping files (see the sketch after this list).
  • Tuned back-end Oracle stored procedures using TOAD.
  • Used Hibernate, an object/relational mapping (ORM) solution, to map the MVC model's data representation to the Oracle relational data model with an SQL-based schema.
  • Developed SQL queries and stored procedures in PL/SQL to retrieve data from and insert data into multiple database schemas.
  • Performed unit testing using JUnit and load testing using LoadRunner.
  • Implemented Log4j to trace and track application information.
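
A minimal sketch of a Hibernate-mapped POJO of the kind described above. The entity, table, and columns are hypothetical, and JPA annotations stand in for the hbm.xml mapping files for brevity.

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical POJO mapped to an ACCOUNT table.
@Entity
@Table(name = "ACCOUNT")
public class Account {
    @Id
    @Column(name = "ACCOUNT_ID")
    private Long id;

    @Column(name = "OWNER_NAME")
    private String ownerName;

    public Long getId() { return id; }
    public void setId(Long id) { this.id = id; }
    public String getOwnerName() { return ownerName; }
    public void setOwnerName(String ownerName) { this.ownerName = ownerName; }
}
```

A lookup would then be an HQL query such as `from Account a where a.ownerName = :name`, executed through the Hibernate Session.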

Environment: JSP, Struts, jQuery, Tomcat, CSS, JUnit, Log4j, SQL/PLSQL, Oracle 9i, Hibernate, Web services.

Java Developer

Confidential

Responsibilities:

  • Involved in requirements gathering, requirements analysis, design, development, integration, and deployment.
  • Used JavaScript to perform client-side checks and validations.
  • Extensively used the Spring MVC framework to develop the web layer of the application, and configured the DispatcherServlet in web.xml (see the sketch after this list).
  • Designed and developed the DAO layer using Spring and Hibernate, including the Criteria API.
  • Created and generated Hibernate classes, configured the mapping XML, and managed CRUD operations (insert, update, and delete).
  • Wrote HQL and SQL queries for the Oracle 10g database.
  • Used Log4j for logging messages.
  • Developed unit test classes using JUnit.
  • Developed business components using the Spring Framework and database connections using JDBC.
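
A minimal sketch of a Spring MVC controller of the kind described above, handled by the DispatcherServlet declared in web.xml. The controller, service interface, and view name are hypothetical; the one-line AccountService interface is included only to keep the sketch self-contained.

```java
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;

@Controller
@RequestMapping("/accounts")
public class AccountController {

    private final AccountService accountService;

    @Autowired
    public AccountController(AccountService accountService) {
        this.accountService = accountService;
    }

    // GET /accounts/{id}: the DispatcherServlet routes the request here and
    // a ViewResolver turns the returned logical view name into a JSP.
    @RequestMapping(value = "/{id}", method = RequestMethod.GET)
    public String view(@PathVariable("id") Long id, Model model) {
        model.addAttribute("account", accountService.findById(id));
        return "accountDetail";
    }
}

// Hypothetical service boundary; the domain type is elided.
interface AccountService {
    Object findById(Long id);
}
```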

Environment: Spring Framework, Spring MVC, Hibernate, HQL, Eclipse, JavaScript, AJAX, XML, Log4j, Oracle 9i, WebLogic, TOAD.
