Sr. Hadoop Developer Resume

New Jersey, NJ

SUMMARY

  • Around 9 years of programming experience across all phases of the Software Development Life Cycle (SDLC).
  • Over 5 years of Big Data experience building highly scalable data analytics applications.
  • Strong experience working with Hadoop ecosystem components such as HDFS, MapReduce, Spark, HBase, Oozie, Hive, Sqoop, Pig, Flume and Kafka.
  • Good hands-on experience working with various Hadoop distributions, mainly Cloudera (CDH), Hortonworks (HDP) and Amazon EMR.
  • Good understanding of Distributed Systems architecture and design principles behind Parallel Computing.
  • Expertise in developing production-ready Spark applications using the Spark Core, DataFrames, Spark SQL, Spark ML (MLlib) and Spark Streaming APIs, along with scikit-learn and TensorFlow.
  • Strong experience troubleshooting failures in Spark applications and fine-tuning Spark applications and Hive queries for better performance.
  • Worked extensively on Hive for building complex data analytical applications.
  • Strong experience writing complex MapReduce jobs, including development of custom input formats and custom record readers.
  • Sound knowledge of map-side joins, reduce-side joins, shuffle and sort, the distributed cache, compression techniques, and multiple Hadoop input and output formats.
  • Worked with Apache NiFi to automate data flow and manage the flow of information between systems.
  • Good experience working with AWS cloud services such as S3, EMR, Redshift, Athena and DynamoDB.
  • Deep understanding of performance tuning and partitioning for optimizing Spark applications.
  • Worked on building real-time data workflows using Kafka, Spark Streaming and HBase.
  • Extensive knowledge of NoSQL databases such as HBase, Cassandra and MongoDB.
  • Solid experience working with CSV, text, sequence file, Avro, Parquet, ORC and JSON data formats.
  • Extensive experience performing ETL on structured and semi-structured data using Pig Latin scripts.
  • Designed and implemented Hive and Pig UDFs in Java for evaluating, filtering, loading and storing data.
  • Experience using the Hadoop ecosystem and processing data with Tableau.
  • Experience with Apache Phoenix to access data stored in HBase.
  • Good knowledge of core programming concepts such as algorithms, data structures and collections.
  • Developed core modules in large cross-platform applications using Java, JSP, Servlets, Hibernate, RESTful web services, JDBC, JavaScript, XML and HTML.
  • Extensive experience developing and deploying applications using WebLogic, Apache Tomcat and JBoss.
  • Development experience with RDBMS, including writing SQL queries, views, stored procedures and triggers.
  • Strong understanding of Software Development Lifecycle (SDLC) and various methodologies (Waterfall, Agile).

TECHNICAL SKILLS

Programming Skills: Java/J2EE, JSP, Servlets, AJAX, EJB, Struts, Spring, JDBC, JavaScript, PHP and Python.

Databases: MySQL, SQL, DB2 and Teradata

Web services: REST, AWS, SOAP, WSDL

Servers: Apache Tomcat, WebSphere, JBoss

Operating Systems: Unix, Linux, Windows, Solaris

IDE tools: MyEclipse, Eclipse, NetBeans

QA Tools: Crashlytics, Fabric

Web UI: HTML, JavaScript, XML, SOAP, WSDL

PROFESSIONAL EXPERIENCE

Confidential, New Jersey, NJ

Sr. Hadoop Developer

Responsibilities:

  • Developed Spark applications using PySpark, utilizing the DataFrame and Spark SQL APIs for faster data processing.
  • Developed highly optimized Spark applications to perform data cleansing, validation, transformation and summarization according to requirements.
  • Built a data pipeline consisting of Spark, Hive, Sqoop and custom-built input adapters to ingest, transform and analyze operational data.
  • Developed Spark jobs and Hive jobs to summarize and transform data.
  • Used Spark for interactive queries, processing of streaming data and integration with a popular NoSQL database for large volumes of data.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark DataFrames and Scala.
  • Used different tools for data integration with different databases and Hadoop.
  • Analyzed SQL scripts and designed solutions to implement them using PySpark.
  • Involved in the installation of Tez and improved query performance.
  • Built real-time data pipelines by developing Kafka producers and Spark Streaming applications to consume the data.
  • Ingested syslog messages, parsed them and streamed the data to Kafka.
  • Imported data from different sources into HDFS using Sqoop, performed transformations using Hive and MapReduce, and loaded the results into HDFS.
  • Exported the analyzed data to relational databases using Sqoop for further visualization and report generation for the BI team.
  • Collected and aggregated large amounts of log data using Flume and staged the data in HDFS for further analysis.
  • Analyzed the data by running Hive queries (HiveQL) to study customer behavior.
  • Helped DevOps engineers deploy code and debug issues.
  • Used Hive to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Developed Hive scripts in HiveQL to de-normalize and aggregate the data.
  • Scheduled and executed workflows in Oozie to run various jobs.
  • Used the Hadoop ecosystem to process data on Amazon Web Services (AWS).

Environment: Hadoop, HDFS, HBase, Spark, Scala, Hive, MapReduce, Sqoop, ETL, Java, PL/SQL, Oracle 11g, Unix/Linux.

Confidential, Daytona Beach, FL

Sr. Hadoop Developer

Responsibilities:

  • Built a Spark framework with Scala and migrated existing PySpark applications to it to improve runtime performance.
  • Developed highly optimized Spark applications to perform data cleansing, validation, transformation and summarization according to requirements.
  • Performed transformations such as de-normalization, cleansing of data sets, date transformations and parsing of complex columns.
  • Worked with different compression codecs such as GZIP, Snappy and BZIP2 in MapReduce, Pig and Hive for better performance.
  • Worked with Apache NiFi to automate data flow and manage the flow of information between systems.
  • Used Ansible for automation of frameworks.
  • Handled Avro, JSON and Apache log data in Hive using custom Hive SerDes.
  • Worked on batch processing and scheduled workflows using Oozie.
  • Installed and configured a multi-node cluster in the cloud on Amazon Web Services (AWS) EC2.
  • Worked in an Agile, sprint-based environment.
  • Used the Knox gateway to provide Hadoop security between users and operators.
  • Deployed Hadoop applications in the cloud using S3 and used Elastic MapReduce (EMR) to run MapReduce jobs on the multi-node cluster.
  • Used HiveQL to create partitioned RC and ORC tables and applied compression techniques to optimize data processing and speed up retrieval.
  • Implemented partitioning, dynamic partitioning and bucketing in Hive for efficient data access.

Environment: Apache Hadoop, HDFS, Cloudera Manager, Java, MapReduce, Eclipse Indigo, Hive, HBase, Pig, Sqoop, Oozie, SQL, Spring.

Confidential, Reston, VA

Hadoop Developer

Responsibilities:

  • Involved in requirement analysis, design, coding and implementation phases of the project.
  • Used Sqoop to load structured data from relational databases into HDFS.
  • Loaded transactional data from Teradata using Sqoop and created Hive tables.
  • Worked on automation of delta feeds from Teradata using Sqoop and from FTP servers to Hive.
  • Performed transformations such as de-normalization, cleansing of data sets, date transformations and parsing of complex columns.
  • Worked with different compression codecs such as GZIP, Snappy and BZIP2 in MapReduce, Pig and Hive for better performance.
  • Worked with Apache NiFi to automate data flow and manage the flow of information between systems.
  • Used Ansible for automation of frameworks.
  • Handled Avro, JSON and Apache log data in Hive using custom Hive SerDes.
  • Worked on batch processing and scheduled workflows using Oozie.
  • Installed and configured a multi-node cluster in the cloud on Amazon Web Services (AWS) EC2.
  • Worked in an Agile, sprint-based environment.
  • Used the Knox gateway to provide Hadoop security between users and operators.
  • Deployed Hadoop applications in the cloud using S3 and used Elastic MapReduce (EMR) to run MapReduce jobs on the multi-node cluster.
  • Used HiveQL to create partitioned RC and ORC tables and applied compression techniques to optimize data processing and speed up retrieval.
  • Implemented partitioning, dynamic partitioning and bucketing in Hive for efficient data access.

Environment: Apache Hadoop, HDFS, Cloudera Manager, Java, MapReduce, Eclipse Indigo, Hive, HBase, Pig, Sqoop, Oozie, SQL, Spring.

Confidential, Fremont, CA

Java Developer

Responsibilities:

  • Involved in all phases of project development: requirements gathering, analysis, design, development, coding, testing and debugging.
  • Implemented MVC architecture using Struts to send and receive data between the front end and the business layer. Integrated Struts and Hibernate to achieve object-relational mapping. Used Apache Struts to develop the web-based components and implemented DAOs.
  • Implemented the Struts framework in the presentation tier for all essential control flow, business-level validations and communication with the business layer.
  • Integrated the Struts application with Hibernate for querying, inserting and managing data in a SQL Server database.
  • Responsible for design and development of a web application in J2EE using the Struts MVC framework.
  • Involved in creating and consuming SOAP-based and RESTful web services.
  • Used SOAP web services for communication between the different internal applications.
  • Used GitHub for version control and consistently produced high-quality code through disciplined and rigorous unit testing. Used Maven scripts for building and deploying the application.
  • Developed the XML schema and Web Services for the data maintenance and structures.
  • Involved in designing test plans and test cases and in overall unit testing of the system.
  • Performed object-oriented analysis and design using UML, including development of class, sequence and state diagrams, and created these diagrams in Microsoft Visio.
  • Worked in an Agile, sprint-based environment.
  • Implemented MVC and DAO J2EE design patterns as part of application development.
  • Used Spring IoC and MVC for enhanced modules.
  • Developed the persistence layer using Hibernate.
  • Used DB2 as the database and wrote SQL and PL/SQL.
  • Designed and developed message-driven beans that consumed messages from the Java message queue.
  • Designed and developed web pages using HTML and CSS, including Ajax controls and XML.
  • Wrote controllers based on Spring MVC that made calls to JSP pages.

Environment: Struts, Spring, HTML, CSS, Java, J2EE, JSP, XML, Eclipse, WebLogic, JavaScript, JavaMail API, Hibernate, SQL Server, JBoss, GitHub, Maven, Agile, JUnit.

Confidential

Java Developer

Responsibilities:

  • Implemented the presentation layer with HTML, CSS and JavaScript.
  • Developed web components using JSP, Servlets and JDBC.
  • Implemented secured cookies using Servlets.
  • Wrote complex SQL queries and stored procedures.
  • Implemented the persistence layer using the Hibernate API.
  • Implemented search queries using the Hibernate Criteria interface.
  • Provided support for loan reports for CB&T.
  • Designed and developed loan reports for Evans Bank using Jasper and iReport.
  • Involved in fixing bugs and unit testing with test cases using JUnit.
  • Performed object-oriented analysis and design using UML, including development of class, sequence and state diagrams, and created these diagrams in Microsoft Visio.
  • Maintained the Jasper server on the client's server and resolved issues.
  • Actively involved in system testing.
  • Fine-tuned SQL queries to improve performance.
  • Designed tables and indexes following normalization principles.
  • Involved in unit testing, integration testing and user acceptance testing.
  • Used Java and SQL day to day to debug and fix issues with client processes.

Environment: Java, Servlets, HTML, JavaScript, JSP, Hibernate, JUnit, Oracle DB, SQL, Jasper Reports, iReport, Maven, Jenkins.
