Sr. Big Data Developer Resume

Alpharetta, GA

PROFESSIONAL SUMMARY:

  • 8+ years of professional experience in requirements analysis, design, development and implementation of Java and client-server technologies.
  • 4+ years of experience in Big Data technologies and the Hadoop ecosystem, including MapReduce, YARN, HDFS, Apache Cassandra, NoSQL, HBase, Oozie, Hive, Tableau, Sqoop, Pig, Storm, Kafka, ZooKeeper and Flume.
  • Currently working extensively on the Spark and Spark Streaming frameworks, using Scala as the main programming language.
  • Used Spark DataFrames, Spark SQL and the Spark RDD API for various data transformations and dataset building (a minimal sketch follows this summary).
  • Worked extensively with Spark Streaming and Apache Kafka to ingest live streaming data.
  • Good knowledge of Cloudera distributions and of AWS services, including Amazon Simple Storage Service (S3), Amazon EC2 and Amazon EMR.
  • Expertise in writing Hadoop Jobs to analyze data using MapReduce, Apache Crunch, Hive and Solr.
  • Extensive experience in working with structured data using HiveQL and join operations, writing custom UDFs, and optimizing Hive queries.
  • Experience using various Hadoop Distributions (Cloudera, Hortonworks, MapR) to fully implement and leverage new Hadoop features.
  • Experience in Apache Flume for collecting, aggregating and moving large volumes of data from sources such as web servers and telnet sources.
  • Extensive experience in writing Pig scripts to transform raw data from several data sources into forming baseline data.
  • Extensive experience in importing/exporting data between RDBMSs and the Hadoop ecosystem using Apache Sqoop.
  • Experienced in using ZooKeeper to coordinate the servers in clusters and to maintain data consistency.
  • Developed MapReduce jobs to automate the transfer of data from HBase.
  • Strong experience in working with UNIX/LINUX environments, writing shell scripts.
  • Extensive experience in working with semi-structured and unstructured data by implementing complex MapReduce programs using design patterns.
  • Extensive experience in developing applications using Java, JSP, Servlets, JavaBeans, JSTL, JSP Custom Tag Libraries, JDBC, JNDI, SQL, AJAX, JavaScript and XML.
  • Experienced in using Agile methodologies including Extreme Programming, Scrum and Test-Driven Development (TDD).
  • Proficient in integrating and configuring the Object-Relational Mapping tool Hibernate in J2EE applications, as well as other open-source frameworks such as Struts and Spring.
  • Configured and developed web applications in Spring, employing the Spring MVC architecture and Inversion of Control.
  • Experience in building and deploying web applications on multiple application servers and middleware platforms, including WebLogic, WebSphere, Apache Tomcat and JBoss.
  • Experience in building, deploying and integrating applications with ANT, Maven.
  • Experience in writing SQL queries, stored procedures, views, functions, and triggers in Oracle 9i/10g/11g and MySQL 4.x/5.x.
  • Good knowledge of Web Services (SOAP, WSDL), XML parsers such as SAX and DOM, AngularJS, and responsive design with Bootstrap.
  • Demonstrated technical expertise, organization and client service skills in various projects undertaken.
  • Strong commitment to organizational work ethics, value based decision-making and managerial skills.
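
Illustrative of the Spark bullet above, a minimal Scala sketch of the DataFrame/RDD transformation style described; the input path, schema and column names are hypothetical, not taken from any actual project.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object TransactionRollup {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("TransactionRollup").getOrCreate()
    import spark.implicits._

    // Hypothetical input: CSV of (customerId, amount, region) records on HDFS.
    val txns = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("hdfs:///data/raw/transactions")

    // DataFrame API: filter out non-positive amounts and aggregate per region.
    val byRegion = txns
      .filter($"amount" > 0)
      .groupBy($"region")
      .agg(sum($"amount").as("total_amount"), count("*").as("txn_count"))

    // The same rollup at the RDD level: map to key/value pairs, reduce by key.
    val totalsByRegion = txns.rdd
      .map(row => (row.getAs[String]("region"), row.getAs[Double]("amount")))
      .reduceByKey(_ + _)
    totalsByRegion.collect().foreach { case (region, total) => println(s"$region: $total") }

    byRegion.write.mode("overwrite").parquet("hdfs:///data/curated/txn_by_region")
    spark.stop()
  }
}

Both passes compute the same rollup; the DataFrame route additionally benefits from Catalyst query optimization.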

TECHNICAL SKILLS:

Big Data Ecosystem: Hadoop, MapReduce, YARN, HDFS, HBase, ZooKeeper, Hive, Hue, Pig, Sqoop, Cassandra, Spark, Oozie, Storm, Flume, Talend, Cloudera Manager, MapR and Hortonworks clusters.

Languages: C, Java, Scala, Python, PL/SQL, Pig Latin, HiveQL, SQL

Java/J2EE Web Technologies: J2EE, EJB, JSF, Servlets, JSP, JSTL, HTML, XHTML, CSS, XML, AngularJS, AJAX.

Frameworks: Struts, Spring 3.x, ORM (Hibernate), JPA, JDBC

Web Services: SOAP, RESTful, JAX-WS

Web Servers: WebLogic, WebSphere, Apache Tomcat.

Scripting Languages: Shell scripting, JavaScript.

Databases: Oracle 9i/10g, Microsoft SQL Server, MySQL, DB2, Teradata, MongoDB, Cassandra, HBase

Design: UML, Rational Rose, E-R Modelling, Microsoft Visio

IDE & Build Tools: Eclipse, NetBeans, ANT and Maven.

Version Control Systems: CVS, SVN, GitHub.

PROFESSIONAL EXPERIENCE:

Confidential, Alpharetta, GA

Sr. Big Data Developer

Responsibilities:

  • Designed and deployed a Hadoop cluster and various Big Data analytic tools, including Pig, Hive, HBase, Oozie, Sqoop, Kafka, Spark and Impala, on the Cloudera distribution.
  • Developed Spark code using Python and Spark SQL for batch processing of large data sets.
  • Involved in creating Hive tables, loading structured data and writing Hive queries.
  • Developed Oozie workflows for automating Sqoop, Spark and Hive scripts.
  • Involved in running Hadoop streaming jobs to process terabytes of text data. Worked with different file formats such as Text, Sequence files, Avro, ORC and Parquet.
  • Configured, supported and maintained all network, firewall, storage, load balancers, operating systems, and software in AWS EC2.
  • Implemented Amazon EMR for Big Data processing on a Hadoop cluster of virtual servers running on Amazon EC2 and S3.
  • Developed processes to read files and perform ETL through Spark RDDs and DataFrames; wrote UDFs over Spark DataFrames.
  • Exported the analyzed data to databases such as Teradata, MySQL and Oracle using Sqoop, for visualization and to generate reports for the BI team.
  • Used Sqoop to perform data transfers across applications involving HDFS and RDBMS.
  • Developed Pig and Hive UDFs in Java to extend Pig and Hive, and wrote Pig scripts for sorting, joining, filtering and grouping the data.
  • Involved in loading and transforming large sets of structured and semi-structured data and analyzed them by running Hive queries.
  • Created Hive tables (external and internal) with static and dynamic partitions and bucketed the tables to improve query efficiency (see the Hive sketch after this list).
  • Used HiveQL to analyze the partitioned and bucketed data and compute various metrics for reporting.
  • Involved in designing and developing HBase tables and storing aggregated data from Hive table.
  • Implemented Spark RDD transformations to map business analysis requirements and applied actions on top of those transformations.
  • Involved in converting Hive/HQL queries into Spark transformations using Spark RDD, Scala and Python.
  • Processed large sets of structured and semi-structured data in Spark and Hadoop and stored them in HDFS.
  • Developed a generic Sqoop import utility to load data from various RDBMS sources.
  • Loaded and transformed large data sets of structured, semi-structured and unstructured data using Hadoop/Big Data concepts.
  • Performed Data Ingestion from multiple internal clients using Apache Kafka.
  • Implemented real time system with Kafka, Storm and Zookeeper.
  • Worked on integrating Apache Kafka with the Spark Streaming process to consume data from external REST APIs and run custom functions (sketched after this list).
  • Converted SQL queries into Spark transformations using Spark RDDs, DataFrames and Scala, and performed broadcast joins on RDDs/DataFrames.
  • Developed Spark code using Scala, Spark SQL and Python for faster testing and processing of data.
  • Analyzed the SQL scripts and designed the solution to implement using PySpark.
  • Also used Scala to perform transformations and apply business logic.
  • Developed Hive queries in Spark-SQL for analysis and processing the data.
  • Worked with SparkContext, Spark SQL, DataFrames, pair RDDs and Spark Streaming.
  • Involved in loading data from the Linux file system to HDFS.
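
A minimal sketch of the partitioned and bucketed Hive table work noted above, issued here through a Hive-enabled SparkSession; the table names, columns and bucket count are illustrative assumptions.

import org.apache.spark.sql.SparkSession

object HivePartitionedLoad {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("HivePartitionedLoad")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical external table, partitioned by load date and bucketed by customer id.
    spark.sql(
      """CREATE EXTERNAL TABLE IF NOT EXISTS sales_curated (
        |  customer_id BIGINT,
        |  amount      DOUBLE
        |)
        |PARTITIONED BY (load_dt STRING)
        |CLUSTERED BY (customer_id) INTO 32 BUCKETS
        |STORED AS ORC
        |LOCATION 'hdfs:///data/curated/sales'""".stripMargin)

    // Dynamic-partition insert from a hypothetical staging table; the partition
    // column must come last in the SELECT list.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql(
      """INSERT OVERWRITE TABLE sales_curated PARTITION (load_dt)
        |SELECT customer_id, amount, load_dt FROM sales_staging""".stripMargin)

    // Note: run in Hive itself the insert also populates buckets; Spark 2.x
    // accepts the statement but does not honor Hive bucketing on write.
    spark.stop()
  }
}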
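
And a sketch of the Kafka-to-Spark Streaming integration from the same list, using the spark-streaming-kafka-0-10 direct stream; the broker address, topic and group id are hypothetical, and the per-batch count stands in for the custom functions mentioned above.

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaEventCounter {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaEventCounter")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Hypothetical broker list, consumer group and topic name.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "event-counter",
      "auto.offset.reset" -> "latest"
    )
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    // Count records per 10-second batch; a real job would apply business logic here.
    stream.map(_.value).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}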

Environment: Big Data, Hadoop, MapReduce, Sqoop, HDFS, HBase, Hive, Pig, Oozie, ZooKeeper, Cassandra, Teradata, MySQL, Oracle, Scala, Spark, Java, UNIX Shell Scripting, AWS EC2, S3, EMR, Kafka.

Confidential, Columbus, OH

Big Data Developer

Responsibilities:

  • Worked in a multi-clustered Hadoop ecosystem environment.
  • Created MapReduce programs using the Java API that filter unnecessary records and identify unique records based on different criteria.
  • Optimized MapReduce programs using combiners, partitioners and custom counters to deliver the best results.
  • Worked with Linux systems and RDBMS databases on a regular basis so that data could be ingested using Sqoop.
  • Designed and implemented Hive queries and functions for evaluation, filtering, loading and storing of data.
  • Created Hive tables and worked on them using HiveQL.
  • Developed a data pipeline using Flume and Spark to store data in HDFS.
  • Performed Big Data processing using Spark, AWS and Redshift.
  • Involved in data acquisition, data pre-processing and data exploration for a telecommunications project in Spark.
  • Performed linear regression using Spark MLlib in Scala (see the MLlib sketch after this list).
  • Continuous monitoring and managing the Hadoop cluster through HDP (Hortonworks Data Platform).
  • Implemented Frameworks using Java and Python to automate the ingestion flow.
  • Loaded CDRs into the Hadoop cluster from a relational database using Sqoop and from other sources using Flume.
  • Implemented collections and the aggregation framework in MongoDB.
  • Processed large volumes of data in parallel using Talend functionality.
  • Involved in loading data from UNIX file system and FTP to HDFS.
  • Designed and implemented batch jobs using MR2, Pig, Hive and Tez.
  • Used Apache Tez for highly optimized data processing.
  • Developed Hive queries to analyze the output data.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS.
  • Developed custom Pig UDFs and custom input formats to perform various levels of optimization.
  • Used Pig to import semi-structured data from Avro files to make serialization faster.
  • Loaded data into HBase using both bulk and non-bulk loads.
  • Used Spark for fast processing of data in Hive and HDFS.
  • Performed batch processing of data sources using Apache Spark and Elasticsearch.
  • Used ZooKeeper to provide coordination services to the cluster.
  • Created Hive queries that helped market analysts spot emerging trends by comparing fresh data with reference tables and historical metrics.
  • Worked on Reporting tools like Tableau to connect with Hive for generating daily reports.
  • Utilized Agile Scrum methodology.
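
A minimal sketch of the Spark MLlib linear regression mentioned in the list above, using the RDD-based mllib API of that era; the HDFS path, feature encoding and iteration count are assumptions for illustration.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.{LabeledPoint, LinearRegressionWithSGD}

object CallVolumeRegression {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("CallVolumeRegression"))

    // Hypothetical input: "label,feature1 feature2 ..." lines on HDFS.
    val data = sc.textFile("hdfs:///data/cdr/features")
    val parsed = data.map { line =>
      val parts = line.split(',')
      LabeledPoint(parts(0).toDouble,
        Vectors.dense(parts(1).split(' ').map(_.toDouble)))
    }.cache()

    // Train a simple linear model; the iteration count is illustrative.
    val model = LinearRegressionWithSGD.train(parsed, numIterations = 100)

    // Evaluate with mean squared error on the training set.
    val mse = parsed.map { p =>
      val err = model.predict(p.features) - p.label
      err * err
    }.mean()
    println(s"Training MSE = $mse")
  }
}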

Environment: Hadoop, HDFS, Pig, Hive, MapReduce, Scala, Flume, Sqoop, Hortonworks, AWS, Redshift, Oozie, ZooKeeper, Avro, Python, Shell Scripting, SQL, Talend, Spark, HBase, MongoDB, Linux, Kafka.

Confidential, Chicago, IL

Hadoop Developer

Responsibilities:

  • Worked on analyzing data and writing Hadoop MapReduce jobs using the Java API, Pig and Hive.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Involved in loading data from edge node to HDFS using shell scripting.
  • Worked on cluster installation, commissioning and decommissioning of data nodes, capacity planning, and slot configuration.
  • Created HBase tables to store variable data formats of PII data coming from different portfolios (see the HBase sketch after this list).
  • Implemented best income logic using Pig scripts and UDFs.
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Managed and reviewed Hadoop log files.
  • Managed jobs using the Fair Scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for cluster maintenance: adding and removing cluster nodes, monitoring and troubleshooting the cluster, and managing and reviewing data backups and Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large data sets to determine the optimal way to aggregate and report on them.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop.
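
A sketch of the HBase table creation referenced above, against the HBase 1.x client API; the table name, column families and sample row are hypothetical.

import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

object PiiTableSetup {
  def main(args: Array[String]): Unit = {
    val conf = HBaseConfiguration.create()
    val connection = ConnectionFactory.createConnection(conf)
    val admin = connection.getAdmin

    // Hypothetical table with one column family per portfolio feed.
    val name = TableName.valueOf("customer_pii")
    if (!admin.tableExists(name)) {
      val desc = new HTableDescriptor(name)
      desc.addFamily(new HColumnDescriptor("profile"))
      desc.addFamily(new HColumnDescriptor("accounts"))
      admin.createTable(desc)
    }

    // Non-bulk load: a single Put keyed by customer id.
    val table = connection.getTable(name)
    val put = new Put(Bytes.toBytes("cust-0001"))
    put.addColumn(Bytes.toBytes("profile"), Bytes.toBytes("name"), Bytes.toBytes("Jane Doe"))
    table.put(put)

    table.close()
    connection.close()
  }
}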

Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Ubuntu, Linux Red Hat

Confidential

Java Application Developer

Responsibilities:

  • Involved in design, development, testing, and implementation of the process systems, working on iterative life-cycle business requirements and creating the Detail Design Document.
  • Developed various helper classes needed using Multithreading.
  • Used Agile methodologies to plan work for every iteration and used a continuous integration tool to make sure the build passed before deploying the code to other environments.
  • Worked with JavaScript libraries such as jQuery, and with JSON.
  • Designed and developed web-based software using Spring MVC Framework and Spring Core.
  • Worked with the Java Collections API for handling data objects between the business layers and the front end.
  • Performed UNIX administration (L1) activities and worked closely with application support teams in deploying jobs in production server.
  • Implemented Controller Classes and Server side validations for account activity, payment history, and transactions.
  • Implemented session beans to handle business logic for fund transfers.
  • Designed and developed RESTful Web Services to provide services to various clients.
  • Designed the user interface of the application using EXT JS, HTML5, CSS3, JSF 2.1, JavaScript and AJAX.
  • Extensive experience with modern frontend templating frameworks for JavaScript, including AngularJS and jQuery.
  • Implemented the Hibernate framework to connect to the database and map Java objects to database tables.

Environment: HTML5, CSS3, JSF 2.1, JavaScript, AJAX, jQuery, JSON, UNIX, Spring.

Confidential

Java Application Developer

Responsibilities:

  • Assisted in designing and programming for the system, which includes development of Process Flow Diagram, Entity Relationship Diagram, Data Flow Diagram and Database Design.
  • Involved in the transactions, login and reporting modules and in customized report generation using controllers; tested and debugged the whole project for proper functionality and documented the modules developed.
  • Designed front-end components using JSF.
  • Involved in developing Java APIs, which communicate with the Java Beans.
  • Implemented MVC architecture using Java, Custom and JSTL tag libraries.
  • Involved in development of POJO classes and writing Hibernate Query Language (HQL) queries.
  • Implemented MVC architecture and DAO design pattern for maximum abstraction of the application and code reusability.
  • Created stored procedures using PL/SQL for data modification.
  • Used XML and XSL for data presentation, report generation and customer feedback documents.
  • Used Java Beans to automate the generation of dynamic reports and for customer transactions.
  • Developed JUnit test cases for regression testing and integrated with ANT build.
  • Implemented Logging framework using Log4J.
  • Involved in code review and documentation review of technical artifacts.

Environment: J2EE/Java, JSP, Servlets, JSF, Hibernate, Spring, JavaBeans, XML, XSL, HTML, DHTML, JavaScript, CVS, JDBC, Log4J, Oracle 9i, IBM WebSphere Application Server.
