
Big Data Developer Resume


Warren, NJ

PROFESSIONAL SUMMARY:

  • Around 8 years of experience in the IT industry, including 3+ years in Big Data implementing end-to-end Hadoop solutions alongside Java.
  • Good working experience using Apache Hadoop ecosystem components such as MapReduce, HDFS, Hive, Sqoop, Pig, Oozie, Flume, HBase and ZooKeeper.
  • Experience writing UDFs and integrating them with Hive and Pig.
  • Experience with SequenceFile, Avro and ORC file formats and their compression codecs.
  • Experience with Hadoop distributions: Cloudera and Hortonworks.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experience with workflow scheduling and monitoring using Oozie, and with cluster coordination using ZooKeeper.
  • Extensive knowledge in using SQL Queries for backend database analysis.
  • Strong knowledge of NoSQL column-oriented databases such as HBase and Cassandra, and of their integration with Hadoop clusters.
  • Experience importing and exporting data using Sqoop between HDFS and relational database systems (RDBMS) in both directions.
  • Led many data analysis and integration efforts involving Hadoop along with ETL.
  • Hands-on experience with an enterprise data lake supporting use cases including analytics, processing, storage and reporting of voluminous, rapidly changing, structured and unstructured data.
  • Extensive experience with SQL, PL/SQL and database concepts.
  • Transferred bulk data from RDBMS systems like Teradata into HDFS using Sqoop.
  • Experience in analyzing data using HiveQL, Pig Latin, and custom MapReduce programs in Java.
  • Well-versed in Agile, other SDLC methodologies and can coordinate with owners and SMEs.
  • Worked on different operating systems including UNIX, Linux, and Windows.
  • Diverse experience with Java tools in business, web, and client-server environments, including Java Platform, Enterprise Edition (Java EE), Enterprise JavaBeans (EJB), JavaServer Pages (JSP), Java Servlets (including JNDI), Struts, and Java Database Connectivity (JDBC) technologies.
  • Fluid understanding of multiple programming languages, including C#, C, C++, JavaScript, HTML, and XML.
  • Experience in web application design using open-source MVC frameworks such as Spring and Struts.
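The UDF work mentioned above can be sketched with a minimal Hive streaming script: Hive's TRANSFORM clause pipes rows to a script's stdin as tab-separated text, one row per line. The column layout below (a three-column row with a state code in the second field) is purely illustrative, not a real schema.

```python
def normalize(line):
    """Trim and upper-case the second column of a tab-separated row,
    in the shape Hive's TRANSFORM clause would pipe to the script."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) >= 2:
        fields[1] = fields[1].strip().upper()
    return "\t".join(fields)

# In Hive this would be driven by a stdin loop:
#   for row in sys.stdin: print(normalize(row))
sample = ["101\t nj \t2016-01-01", "102\t ny \t2016-01-02"]
structured = [normalize(row) for row in sample]
```

The same function shape works as a Pig streaming operator, since both tools exchange delimited text with external scripts.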

TECHNICAL SKILLS:

Hadoop Core Services: HDFS, MapReduce, Spark, YARN

Hadoop Distributions: Hortonworks, Cloudera, Apache

NoSQL Databases: HBase, Cassandra

Hadoop Data Services: Hive, Pig, Sqoop, Flume

Hadoop Operational Services: Zookeeper, Oozie

Monitoring Tools: Cloudera Manager

Cloud Computing Tools: Amazon AWS

Languages: C, Java/J2EE, Python, SQL, PL/SQL, Pig Latin, HiveQL, Unix Shell Scripting

Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts

Application Servers: WebLogic, WebSphere, JBoss, Tomcat

Databases: Oracle, MySQL, PostgreSQL, Teradata

Operating Systems: UNIX, Windows, LINUX

Build Tools: Jenkins, Maven, ANT

Development Tools: Microsoft SQL Studio, Toad, Eclipse, NetBeans

Development methodologies: Agile/Scrum

Visualization and Analytics Tools: Tableau, QlikView

PROFESSIONAL EXPERIENCE:

Confidential, Warren, NJ

Big Data Developer

Responsibilities:

  • Involved in the complete Big Data flow of the application: ingesting data from upstream into HDFS, processing it in HDFS, and analyzing it using several tools.
  • Imported data in various formats such as JSON, sequence, text, CSV, Avro and Parquet into the HDFS cluster, using compression for optimization.
  • Ingested data from RDBMS sources such as Oracle, SQL Server and Teradata into HDFS using Sqoop.
  • Configured Hive and wrote Hive UDFs and UDAFs; also created static and dynamic partitions with bucketing.
  • Managing and scheduling Jobs on a Hadoop cluster.
  • Created Hive external tables, loaded data into them and queried the data using HQL.
  • Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data.
  • Developed Pig scripts for the analysis of semi-structured data and Pig Latin scripts to extract data from web server output files and load it into HDFS.
  • Developed Pig UDFs to manipulate data according to business requirements and worked on developing custom Pig loaders.
  • Developed Oozie workflows for scheduling and orchestrating the ETL process.
  • Managed and reviewed the Hadoop log files using Shell scripts.
  • Migrated ETL jobs to Pig scripts to perform transformations, event joins and some pre-aggregations before storing the data in HDFS.
  • Used Hive join queries to join multiple source-system tables and load them into Elasticsearch tables.
  • Experience managing and reviewing huge Hadoop log files.
  • Collected log data from web servers and integrated it into HDFS using Flume.
  • Expertise in designing and creating various analytical reports and Automated Dashboards to help users to identify critical KPIs and facilitate strategic planning in the organization.
  • Involved in cluster maintenance, cluster monitoring and troubleshooting.
  • Created data pipelines per business requirements and scheduled them using Oozie coordinators.
  • Maintained technical documentation for every step of the development environment and for launching Hadoop clusters.
  • Worked on different file formats such as sequence files, XML files and map files using MapReduce programs.
  • Used the Avro data serialization system to handle JSON data formats.
  • Used Amazon Web Services S3 to store large amounts of data in a common repository.
  • Worked with the Data Science team to gather requirements for various data mining projects.
  • Wrote shell scripts to automate rolling day-to-day processes.
  • Built applications using Maven and integrated them with continuous integration servers like Jenkins to build jobs.
  • Used an enterprise data warehouse database to store information and make it accessible across the organization.
  • Worked with BI tools such as Tableau to create weekly, monthly and daily dashboards and reports using Tableau Desktop, and published them.
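The log-structuring step described above (Hive jobs that parse raw web server logs into tabular form) can be illustrated with a small parser that turns each log line into a tab-separated record suitable for a Hive external table. The combined-log-style format and field names here are assumptions, not the actual production schema.

```python
import re

# Apache-style access log: ip, identd, user, [timestamp], "request",
# status, bytes. A simplified pattern for illustration only.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<req>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def to_tsv(line):
    """Return a tab-separated record for Hive, or None for
    unparseable lines (which a real job would route to a reject file)."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None
    return "\t".join(m.group("ip", "ts", "req", "status", "bytes"))
```

A Hive external table with `FIELDS TERMINATED BY '\t'` over the parser's output directory then makes the logs directly queryable.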

Environment: Hadoop, HDFS, Hive, Oozie, Pig, Sqoop, Shell Scripting, HBase, Jenkins, Tableau, Oracle, MySQL, Teradata and AWS.

Confidential, Florham Park, NJ

Big Data Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop
  • Imported data using Sqoop to load data from MySQL into HDFS on a regular basis from various sources.
  • Wrote multiple MapReduce programs to extract, transform and aggregate data from multiple file formats including XML, JSON, CSV and other compressed file formats.
  • Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
  • Involved in loading data from LINUX file system to HDFS.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data in various formats such as text, zip, XML and JSON.
  • Defined job flows and developed simple to complex MapReduce jobs as per requirements.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Developed Pig UDFs to manipulate data according to business requirements and worked on developing custom Pig loaders.
  • Hands-on experience setting up HBase column-based storage repositories for archival and historical data.
  • Responsible for creating Hive tables based on business requirements.
  • Used an enterprise data lake to support use cases including analytics, processing, storage and reporting of voluminous, rapidly changing, structured and unstructured data.
  • Along with the infrastructure team, involved in the design and development of a Kafka- and Storm-based data pipeline.
  • Implemented Partitioning, Dynamic Partitions and Buckets in HIVE for efficient data access.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Involved in data modeling and in sharding and replication strategies for Cassandra.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
  • Knowledge of handling Hive queries using Spark SQL integrated with the Spark environment.
  • Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
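The RDD flow above (load, map to key/value pairs, reduce by key, collect) can be sketched in plain Python so it runs without a cluster; with Spark the same shape would be `sc.textFile(...).map(...).reduceByKey(add).collect()`. The record layout (a state code and an amount per line) is an assumption for illustration.

```python
from collections import defaultdict

def reduce_by_key(pairs):
    """Aggregate (key, value) pairs, mirroring Spark's reduceByKey(add)."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Stand-in for lines read from an input file on HDFS.
records = ["NJ,100", "NY,250", "NJ,50"]
pairs = [(r.split(",")[0], int(r.split(",")[1])) for r in records]
result = reduce_by_key(pairs)
```

The point of the in-memory RDD version is that `pairs` never has to be re-read from disk between transformations, which is where the speedup over MapReduce comes from.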

Environment: Apache Hadoop 2.x, Cloudera, HDFS, MapReduce, Hortonworks, Hive, Pig, HBase, Spark, Scala, Sqoop, Kafka, Flume, Cassandra, Oracle 11g/10g, Linux, XML, MySQL.

Confidential

Hadoop Developer

Responsibilities:

  • Understood business needs, analyzed functional specifications, and mapped them to the design and development of MapReduce programs and algorithms.
  • Optimized Hadoop MapReduce code and Hive and Pig scripts for better scalability, reliability and performance.
  • Developed Oozie workflows for application execution.
  • Performed data migration from legacy RDBMS databases to HDFS using Sqoop.
  • Writing Pig scripts for data processing.
  • Created HBase tables to load large sets of structured, semi-structured and unstructured data coming from UNIX, NoSQL and a variety of portfolios.
  • Implemented Hive tables and HQL Queries for the reports.
  • Imported data from Cassandra into HDFS using an export utility.
  • Involved in developing shell scripts and automating data management for end-to-end integration work.
  • Experience performing data validation using Hive dynamic partitioning and bucketing.
  • Wrote and used complex data types to store and retrieve data using HQL in Hive.
  • Developed Hive queries to analyze reducer output data.
  • Implemented ETL code to load data from multiple sources into HDFS using Pig scripts.
  • Highly involved in designing the next generation data architecture for the Unstructured data.
  • Developed Pig Latin scripts to extract data from source systems.
  • Created and maintained technical documentation for executing Hive queries and Pig scripts.
  • Involved in extracting data from Hive and loading it into an RDBMS using Sqoop.
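The dynamic-partitioning validation mentioned above rests on a simple mechanism: Hive routes each row into a partition directory named after the value of the partition column. A minimal sketch of that routing, with a made-up two-column row layout and partition key:

```python
from collections import defaultdict

def partition_rows(rows, key_index):
    """Group rows by the value at key_index, mirroring Hive's
    .../table/part_col=value/ dynamic-partition directory layout."""
    parts = defaultdict(list)
    for row in rows:
        parts["part_col=%s" % row[key_index]].append(row)
    return dict(parts)

rows = [("a", "2016"), ("b", "2017"), ("c", "2016")]
partitions = partition_rows(rows, 1)
```

Validation then reduces to checking that per-partition row counts match the source, which is what partition-level comparison queries do in practice.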

Environment: HDFS, MapReduce, MySQL, Cassandra, Hive, HBase, Oozie, Pig, ETL, Hortonworks (HDP 2.0), Shell Scripting, Linux, Sqoop, Flume and Oracle 11g.

Confidential

Hadoop Developer

Responsibilities:

  • Involved in importing data from Microsoft SQL Server, MySQL and Teradata into HDFS using Sqoop.
  • Developed workflows in Oozie to automate the tasks of loading data into HDFS.
  • Used Hive to analyze partitioned and bucketed data to compute various reporting metrics.
  • Involved in creating Hive tables, loading data, and writing queries that run internally as MapReduce jobs.
  • Involved in creating Hive external tables for HDFS data.
  • Solved performance issues in Hive and Pig scripts by understanding joins, grouping and aggregation and how they map to MapReduce jobs.
  • Used Spark for transformations, event joins and some aggregations before storing the data into HDFS.
  • Troubleshot and resolved data quality issues and maintained a high level of accuracy in the data being reported.
  • Analyzed large data sets to determine the optimal way to aggregate them.
  • Worked on the Oozie workflow to run multiple Hive and Pig jobs.
  • Involved in creating Hive UDFs.
  • Developed Automated shell script to execute Hive Queries.
  • Involved in processing ingested raw data using Apache Pig.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Worked with different file formats such as JSON, Avro, ORC and Parquet, and compression codecs such as Snappy, zlib and LZ4.
  • Executed HiveQL in Spark using SparkSQL.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs in Scala.
  • Gained Knowledge in creating Tableau dashboard for reporting analyzed data.
  • Expertise with NoSQL databases like HBase.
  • Experienced in managing and reviewing the Hadoop log files.
  • Used GitHub as repository for committing code and retrieving it and Jenkins for continuous integration.
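The codec trade-off behind the format/compression work above (smaller files versus CPU cost) can be shown with zlib from the Python standard library; Snappy and LZ4 behave the same way in principle but need third-party bindings. The sample payload is made up, deliberately repetitive log-like text.

```python
import zlib

# Repetitive, log-like data compresses very well, which is why
# columnar formats like ORC/Parquet pair so well with these codecs.
payload = b"2016-01-01\tGET /index.html\t200\n" * 500

compressed = zlib.compress(payload, 6)          # level 6: balanced speed/size
ratio = len(compressed) / len(payload)          # fraction of original size
restored = zlib.decompress(compressed)          # lossless round trip
```

In practice the codec is picked per table or job (e.g. Snappy for speed on hot data, zlib/gzip for colder archival data), since all of them restore the bytes exactly.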

Environment: HDFS, MapReduce, Sqoop, Hive, Pig, Oozie, MySQL, Eclipse, Git, GitHub, Jenkins.

Confidential

Java Developer

Responsibilities:

  • Involved in various stages of Enhancements in the Application by doing the required analysis, development, and testing.
  • Prepared high- and low-level design documents and implemented digital signature generation.
  • Created use case, class and sequence diagrams for analysis and design of the application.
  • Developed logic and code for registration and validation of enrolling customers.
  • Developed web-based user interfaces using the Struts framework.
  • Handled client-side validations using JavaScript.
  • Wrote SQL queries, stored procedures and enhanced performance by running explain plans.
  • Involved in integration of various Struts actions in the framework.
  • Used the Validation Framework for server-side validations.
  • Created test cases for the Unit and Integration testing.
  • Integrated the front end with the Oracle database using the JDBC API through the JDBC-ODBC bridge driver on the server side.
  • Designed project related documents using MS Visio which includes Use case, Class and Sequence diagrams.
  • Wrote the end-to-end flow, i.e. controller classes, service classes and DAO classes per the Spring MVC design, and wrote business logic using the core Java API and data structures.
  • Used Spring JMS message-driven beans (MDBs) to receive messages from other teams, with IBM MQ for queuing.
  • Developed presentation-layer code using JSP, HTML, AJAX and jQuery.
  • Developed the business layer using Spring (IoC, AOP), DTOs, and JTA.
  • Developed application service components and configured beans using Spring IoC; implemented the persistence layer and configured Ehcache to load static tables into a secondary storage area.
  • Involved in the development of user interfaces using HTML, JSP, JavaScript, Dojo Toolkit, CSS and AJAX.
  • Created tables, triggers, stored procedures, SQL queries, joins, integrity constraints and views for Oracle 11g databases using the Toad tool.
  • Developed the project using industry-standard design patterns such as Singleton, Business Delegate and Factory for better code maintenance and reusability.
  • Developed unit test cases using the JUnit framework to verify code accuracy, with logging via SLF4J + Log4j.
  • Worked with the defect tracking system ClearQuest.
  • Worked with Spring STS as the IDE, deployed to Tomcat and WebSphere servers, and used Maven as the build tool.
  • Responsible for code sanity in the integration stream; used ClearCase as the version control tool.

Environment: Java, J2EE, Spring, Spring Batch, Spring JMS, MyBatis, HTML, CSS, AJAX, jQuery, JavaScript, JSP, XML, UML, JUnit, IBM WebSphere, Maven, ClearCase, SoapUI, Oracle 11g, IBM MQ.
