
Hadoop Developer Resume


NY

SUMMARY

  • Around 7 years of overall IT experience in a variety of industries including Finance, Retail and Insurance.
  • Hands-on experience of 2+ years in Big Data analytics and development and 4+ years in Java as well as ETL data warehousing development.
  • Excellent understanding of Hadoop architecture and its components, such as the HDFS framework, Job Tracker, Task Tracker, Name Node, Data Node, MRv1 and MRv2, and the underlying Hadoop framework including storage management.
  • Expertise with tools in Hadoop Ecosystem including Sqoop 1.4.6, Spark 1.6, Hive 1.2.1, MapReduce, Kafka 0.10.x, Zeppelin, Yarn 2.6.x, Pig 0.12 and Zookeeper 3.4.9.
  • Extensive experience in importing and exporting data using Sqoop 1.4.6 from HDFS/Hive to Relational Database Systems (RDBMS) and vice versa.
  • Experience in analyzing data using HiveQL 1.2.1 and MapReduce jobs in Java for data cleansing, transformations, pre-processing and analysis.
  • Expert in writing Pig Latin scripts and HiveQL queries to process and analyze data.
  • Worked with various data formats such as Sequence File, RC, ORC, Parquet, JSON and Avro, and with compression techniques such as Gzip and Snappy.
  • Excellent understanding of Kafka’s streaming platform, Producer, Consumer, Topics, Brokers and Zookeeper.
  • Experience in data processing such as collecting, aggregating and moving data from various sources using Kafka.
  • Expertise in Spark, with in-depth knowledge of Spark SQL and of lazy transformations versus actions (a brief illustration follows this summary).
  • Worked on the back end using Scala 2.10 and Spark 1.6 to implement several aggregation logics.
  • Managed data coming from different sources and involved in HDFS maintenance and loading of structured and unstructured data.
  • Experience in working with Hive 1.2.1 and with NoSQL databases like HBase 1.2.0 and Cassandra 3.0.
  • Well versed with the working of Data Visualization tools such as Tableau 9.3, D3.js.
  • Expertise in web application development using J2EE 7, HTML5, CSS3, Bootstrap, JavaScript, JSON, jQuery, AJAX, Web Services (REST, SOAP), JDBC, Spring MVC 4.x, Apache CXF 3.1.10 and Object-Oriented Programming (OOP).
  • Hands on experience in Unit Testing using ScalaTest 3.0.1 Framework in Test driven Development (TDD) environment.
  • Involved in all stages of Software Development Life Cycle (SDLC) as well as working in Agile based development environment, participating in sprint/iterations and scrum meetings.
  • Expertise in using Git for version control system, JIRA as a bug and issue tracking tool and Jenkins for continuous integration/continuous deployment.
  • Thorough knowledge of infrastructure operations on AWS cloud platforms: EC2, EMR, Kinesis, Kinesis Firehose, EBS, S3, AWS RDS, AWS Lambda, AWS API Gateway, AWS CodeCommit, AWS VPC, AWS IAM Roles and Security Groups.
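To illustrate the lazy transformation/action distinction referenced above, below is a minimal Spark sketch in Java. It is a generic example rather than code from any of the projects listed: the input path and the column names (status, customer_id, amount) are placeholders, and it uses the Spark 2.x SparkSession entry point for brevity (the Spark 1.6 work above would have gone through SQLContext).

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.col;

    public class LazyEvaluationExample {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("LazyEvaluationExample")
                    .getOrCreate();

            // Transformations (filter, groupBy/sum) are lazy: nothing executes yet,
            // Spark only builds up the logical plan.
            Dataset<Row> orders = spark.read().parquet("hdfs:///data/orders");   // placeholder path
            Dataset<Row> totals = orders
                    .filter(col("status").equalTo("COMPLETE"))
                    .groupBy(col("customer_id"))
                    .sum("amount");

            // Actions (count, show, write) trigger the actual computation.
            long customers = totals.count();
            totals.show(10);

            spark.stop();
        }
    }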

TECHNICAL SKILLS

Hadoop Ecosystem: Hadoop 2.7, HDFS 2.6.0, MapReduce 2.6.x, YARN 2.6.x, Spark 1.6, Hive 1.2.1, Pig 0.12.0, Sqoop 1.4.6/1.4.5, Flume 1.6.0, Kafka 0.10.x, Oozie 4.3.0, Zookeeper 3.4.9

Java Technologies: Core Java (JDK 1.5/1.6/1.7/1.8), JSP 2.0/3.0, Servlets, Java Beans, Multithreading, JDBC, RMI, Log4J

Frameworks: Struts 1.2/2.3, Spring 2.x/3.x/4.x, Hibernate 4.3.6, Maven

SOA: RESTful (JAX-RS), SOAP (JAX-WS), WSDL

Programming Languages: Core Java, Scala, J2EE, SQL, C/C++

Amazon Web Services: EC2, EMR, EBS, S3, AWS RDS, R53, ELB, AWS Lambda, AWS API Gateway, CodeCommit, Kinesis, Kinesis Firehose, AWS VPC, Security Groups, IAM Roles, AWS Config, CloudTrail, CloudWatch

Databases: MySQL 5.0, Oracle 10g/11g, SQL Server, Teradata, Snowflake, Cassandra 3.0, HBase 1.2.0

Collaboration & Tools: Git 2.12.0, JIRA 6.4, Jenkins 2.32.2, Putty 0.70, WinSCP, Agile, Waterfall

PROFESSIONAL EXPERIENCE

Confidential, Plano TX

Data Engineer

Responsibilities:

  • Imported structured data from Teradata and SQL Server into Amazon S3 and loaded it into Snowflake.
  • Developed Python scripts to convert structured data into Parquet format for better performance and space utilization (a sketch of this step follows the list below).
  • Created metric and staging tables to load the required raw data from the Parquet files into VARIANT format.
  • Created production tables, loaded data from the dev environment and provided L3 support.
  • Monitored daily, weekly and monthly jobs for issues and failures in Dev and Production.
  • Scheduled daily Production jobs to export the data in CSV/Parquet format to S3 based on the customer requirement.
  • Registered business and technical datasets from Snowflake and S3 in Nebula for metadata management.
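The Python scripts themselves are not reproduced here; the following is a minimal sketch of the same conversion-to-Parquet step, written with Spark's Java DataFrame API for consistency with the other examples in this document. The bucket names, paths and read options are placeholder assumptions rather than actual project values.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SaveMode;
    import org.apache.spark.sql.SparkSession;

    public class ParquetExportJob {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("ParquetExportJob")
                    .getOrCreate();

            // Read the structured extract landed in S3 as CSV (placeholder bucket and path).
            Dataset<Row> raw = spark.read()
                    .option("header", "true")
                    .option("inferSchema", "true")
                    .csv("s3://example-landing-bucket/teradata_extract/");

            // Write columnar Parquet with Snappy compression for better scan performance
            // and a smaller storage footprint; Snowflake can then read this location
            // through an external stage.
            raw.write()
               .mode(SaveMode.Overwrite)
               .option("compression", "snappy")
               .parquet("s3://example-curated-bucket/orders_parquet/");

            spark.stop();
        }
    }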

Environment: Hadoop, Spark, Python, Snowflake, EMR, S3, Teradata, SQL Server, GitHub, HP Service Manager, ServiceNow, Metriplex, Nebula

Confidential, NY

Hadoop Developer

Responsibilities:

  • Designed and implemented Sqoop incremental and delta imports on tables without primary keys or date columns from Oracle, appending directly into the Hive warehouse.
  • Performed transformations, cleansing and filtering on large datasets using Spark jobs, covering structured data from Oracle databases as well as unstructured and semi-structured data from click streams and consumer feedback.
  • Imported data from different sources like HDFS into Spark RDDs and performed computations using Scala to generate the output response.
  • Responsible for analyzing and cleansing raw data by performing Spark jobs on data to study customer behavior.
  • Integrated Kafka and Spark for real-time data processing, using Kafka Producer and Kafka Consumer components (see the consumer sketch after this list).
  • Used Spark Streaming APIs to perform on-the-fly transformations and actions on data arriving from Kafka in near real time.
  • Implemented ad-hoc queries using Hive to perform analytics on structured data.
  • Developed Workflows with the help of Oozie to manage the flow of jobs and wrote Custom Expression Language (EL) functions for complex workflows.
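A minimal sketch of the Kafka Consumer side mentioned above, using the Kafka 0.10 Java client. The broker address, consumer group and topic name are placeholders; in the actual pipeline the records were handed to Spark Streaming for transformation rather than printed.

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ClickStreamConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");          // placeholder broker list
            props.put("group.id", "clickstream-processors");           // placeholder consumer group
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("clickstream"));  // placeholder topic
                while (true) {
                    // Poll for new events and hand each one to downstream cleansing logic.
                    ConsumerRecords<String, String> records = consumer.poll(500);
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("offset=%d key=%s value=%s%n",
                                record.offset(), record.key(), record.value());
                    }
                }
            }
        }
    }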

Environment: CDH 5.7.5, Hadoop 2.7, Spark 1.6.3, Scala 2.10, Hive 1.2.1, Sqoop 1.4.5, Kafka 0.10, Oozie 4.3.0, MapReduce, Zookeeper 3.4.9, Oracle 11g, Putty, Agile, JIRA

Confidential, NY

Big Data Developer

Responsibilities:

  • Performed advanced procedures like text analytics and processing, using the in-memory computing capabilities of Spark.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDD and Scala.
  • Experience with SparkContext, Spark SQL, DataFrames and pair RDDs.
  • Executed Spark programs on log data to transform it into a structured form and identify the products and services users consume and their usage habits.
  • Responsible for analyzing large data sets and identifying spending patterns by developing new Spark programs.
  • Created Cassandra tables and ingested data using the Cassandra Java APIs (see the sketch after this list).
  • Loaded datasets from MySQL into HDFS and Hive on a daily basis.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Implemented extensive Hive queries and created views for ad-hoc and business processing.
  • Analyzed data by running Hive queries (HiveQL) and Pig Latin scripts to study customers and categorize them by satisfaction.
  • Optimized Hive scripts to use HDFS efficiently by using various compression mechanisms.
  • Created Hive schemas using performance techniques like partitioning and bucketing.
  • Proactively monitored systems and services, architecture design and implementation of Hadoop deployment, configuration management, backup, and disaster recovery systems and procedures.
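A minimal sketch of the Cassandra table creation and ingestion mentioned above, using the DataStax Java driver (3.x line, matching Cassandra 3.0). The contact point, keyspace, table, column names and replication settings are placeholder assumptions, not actual project values.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class CassandraIngest {
        public static void main(String[] args) {
            // Placeholder contact point; production code would list real cluster nodes.
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect()) {

                session.execute("CREATE KEYSPACE IF NOT EXISTS analytics "
                        + "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}");
                session.execute("CREATE TABLE IF NOT EXISTS analytics.user_events ("
                        + "user_id text, event_time timestamp, event_type text, "
                        + "PRIMARY KEY (user_id, event_time))");

                // Prepared statements avoid re-parsing the CQL for every ingested row.
                PreparedStatement insert = session.prepare(
                        "INSERT INTO analytics.user_events (user_id, event_time, event_type) "
                        + "VALUES (?, ?, ?)");
                session.execute(insert.bind("u123", new java.util.Date(), "page_view"));
            }
        }
    }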

Environment: CDH 5.7.0, Hadoop 2.6, HDFS, Hive 1.1.0, Pig 0.12.0, Sqoop 1.4.6, Spark 1.6.0, Scala 2.10, Cassandra 3.0, Zookeeper 3.4.5, MapReduce, Putty, MySQL

Confidential

Java/Hadoop Developer

Responsibilities:

  • Involved in the complete Software Development Life Cycle (SDLC) phases of the project.
  • Assisted in the installation and configuration of Apache Hadoop clusters and Hadoop tools for application development, including HDFS, YARN, Sqoop, Flume, Hive, Pig and HBase.
  • Developed a series of MapReduce algorithms to increase performance.
  • Worked on combiners, custom partitioners and the distributed cache to improve the performance of MapReduce jobs, and monitored the data pipeline (a combiner sketch follows this list).
  • Migrated the required data from MySQL into HDFS using Sqoop and imported various formats of flat files into HDFS.
  • Developed MapReduce programs in Java to help the analysis team read, write, delete, update and analyze the big data.
  • Implemented the Spring MVC 4.0 framework in the application and developed code for obtaining bean references using Dependency Injection (DI).
  • Used the Spring RESTful API to create RESTful web services, used JSON as the data format between the front end and the middle-tier controllers, and handled cross-domain requests.
  • Designed Spring controllers to handle user requests and return results, and Spring validators to verify requests, check customer status and generate alert messages.
  • Designed interactive web pages with front end screens using HTML5, CSS3 and JavaScript 1.8.4.
  • Mapped DAOs (one-to-many, one-to-one and many-to-one relations) to Oracle database tables, and Java data types to SQL data types, by creating Hibernate 4 mapping XML files.
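A minimal sketch of the reducer-as-combiner pattern referenced above (partial aggregation on the map side to cut shuffle traffic). The job name, class names, field layout and the tab-delimited input assumption are placeholders rather than actual project code.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class EventCountJob {

        // Mapper emits (event_type, 1) for each record of a tab-delimited log line.
        public static class EventMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text eventType = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] fields = value.toString().split("\t");
                if (fields.length > 1) {
                    eventType.set(fields[1]);
                    context.write(eventType, ONE);
                }
            }
        }

        // Reducer sums the counts; reused as a combiner so partial sums happen map-side,
        // which reduces the data shuffled to reducers.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "event-count");
            job.setJarByClass(EventCountJob.class);
            job.setMapperClass(EventMapper.class);
            job.setCombinerClass(SumReducer.class);   // reducer doubles as combiner: sum is associative
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }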

Environment: JDK 1.5/6, AJAX, Hibernate 3.x, JSP 2.1, Spring, Servlets, Eclipse 3.x, Oracle 10g, MS-SQL, PL/SQL, XML, HTML, JavaScript, WebSphere, JUnit, Log4j, Shell Scripting, UNIX, HDFS, Hadoop 2.3, MapReduce

Confidential

Java Developer

Responsibilities:

  • Developed multithreaded programs using Core Java to measure system performance
  • Implemented Spring MVC in the application.
  • Involved in XML configuration for obtaining bean references in the Spring framework using Dependency Injection (DI) or Inversion of Control (IoC).
  • Used the Hibernate object/relational mapping framework as the persistence layer for interacting with Oracle.
  • Implemented RESTful web services for consuming non-sensitive information (see the controller sketch after this list).
  • Created secure web services using SOAP security extensions and certificates for consuming payment information.
  • Developed custom JSP tags and implemented the UI using JSP, HTML5 and CSS3 with JavaScript validation, providing the user interface and client-server communication.
  • Wrote stored procedures in Oracle 10g using PL/SQL for data entry and retrieval in Reports module.
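A minimal sketch of a Spring MVC RESTful endpoint of the kind described above. The URL, class and field values are placeholders, and a real controller would delegate to an injected service layer rather than build the response inline; @Controller with @ResponseBody is used so the sketch stays valid on Spring 3.x as well as 4.x, and JSON serialization assumes Jackson on the classpath.

    import java.util.HashMap;
    import java.util.Map;
    import org.springframework.stereotype.Controller;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.ResponseBody;

    @Controller
    @RequestMapping("/api/products")
    public class ProductController {

        // Returns non-sensitive product details; @ResponseBody lets Spring serialize
        // the returned map to JSON in the HTTP response body.
        @RequestMapping(value = "/{id}", method = RequestMethod.GET, produces = "application/json")
        @ResponseBody
        public Map<String, Object> getProduct(@PathVariable("id") long id) {
            Map<String, Object> product = new HashMap<String, Object>();
            product.put("id", id);
            product.put("name", "sample-product");   // placeholder; a real service would look this up
            return product;
        }
    }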

Environment: Java 1.6, JSP, Struts, Servlets, Spring, Hibernate, MyEclipse, JavaScript, JSTL, Unix, Shell Script, AJAX, XML, WebSphere Application Server, SQL, PL/SQL, Maven, ORM, WebLogic 10, Web Services (SOAP, RESTful)

Confidential

Programmer Analyst

Responsibilities:

  • Developed components using Java multithreading concepts.
  • Developed various Servlets for handling business logic and data manipulation from the database.
  • Involved in the design of JSPs and Servlets for navigation among the modules.
  • Developed the UI using HTML, JavaScript, JSP and AJAX, and developed business logic and interfacing components using business objects, XML and JDBC.
  • Managed connectivity using JDBC for querying, inserting and data management, including triggers and stored procedures, on the MySQL database (as sketched below).
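A minimal sketch of the JDBC connectivity described above. The connection settings, table and column names are placeholders, and it uses try-with-resources for brevity; code on the older JDKs of that period would have closed resources in finally blocks instead.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class CustomerDao {
        // Placeholder connection settings; real values came from application configuration.
        private static final String URL = "jdbc:mysql://localhost:3306/appdb";
        private static final String USER = "app_user";
        private static final String PASSWORD = "secret";

        // Query a single customer name with a parameterized statement.
        public String findCustomerName(int customerId) throws SQLException {
            String sql = "SELECT name FROM customers WHERE id = ?";
            try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
                 PreparedStatement stmt = conn.prepareStatement(sql)) {
                stmt.setInt(1, customerId);
                try (ResultSet rs = stmt.executeQuery()) {
                    return rs.next() ? rs.getString("name") : null;
                }
            }
        }

        // Insert a row; returns the number of rows affected.
        public int insertCustomer(String name) throws SQLException {
            String sql = "INSERT INTO customers (name) VALUES (?)";
            try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
                 PreparedStatement stmt = conn.prepareStatement(sql)) {
                stmt.setString(1, name);
                return stmt.executeUpdate();
            }
        }
    }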

Environment: Core Java, Multithreading, J2EE, Servlets, JSP, EJB, JMS, UML, Rational Rose, Oracle 8i, WebLogic 8.1, HTML, JavaScript, JUnit, ANT, XML, AJAX
