Sr. Big Data Developer Resume

Richmond, VA

SUMMARY

  • 8+ years of overall IT experience across a variety of industries, including 5+ years of hands-on experience in Big Data analytics and development.
  • Expertise with the tools in Hadoop Ecosystem including Hive, HDFS, MapReduce, Sqoop, Storm, Spark, Kafka, Yarn, Oozie, and Zookeeper.
  • Excellent knowledge of Hadoop ecosystem components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and the MapReduce programming paradigm.
  • Experience in manipulating/analysing large datasets and finding patterns and insights within structured and unstructured data.
  • Strong experience with Hadoop distributions like Cloudera, MapR, and Hortonworks.
  • Good understanding of NoSQL databases and hands-on work experience in writing applications on NoSQL databases like HBase, Cassandra, and MongoDB.
  • Worked with various HDFS file formats like Avro, Sequence File, and various compression formats like Snappy.
  • Worked on Google Cloud Platform (GCP) services like the Vision API and Instances.
  • Strong experience with the Azure cloud platform.
  • Implemented OLAP multi-dimensional cube functionality using Azure SQL Data Warehouse.
  • Strong understanding of the principles of Data Warehousing concepts like Fact tables, Dimension tables, and Star/Snowflake Schema modeling.
  • Developed simple to complex MapReduce streaming jobs in Python that integrate with Hive.
  • Skilled in developing applications in Python language for multiple platforms.
  • Hands-on experience in application development using Java, and Linux Shell Scripting.
  • Experience with the Oozie workflow scheduler to manage Hadoop jobs as Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Experience in migrating the data using Sqoop from HDFS to Relational Database System and vice-versa according to the client's requirement.
  • Extensive Experience in importing and exporting data using stream processing platforms like Flume and Kafka.
  • Strong knowledge of Apache Spark in a Scala environment.
  • Developed Spark scripts using the Scala shell as per requirements.
  • Good hands-on experience in creating RDDs and DataFrames for the required input data and performing data transformations using Spark and Scala (see the sketch after this list).
  • Good knowledge of real-time data streaming solutions using Apache Spark Streaming, Kafka, and Flume.
  • Experience in designing and developing applications in Spark using Scala to compare the performance of Spark with Hive and SQL/Oracle.
  • Highly experienced in writing complex ANSI SQL queries.
  • Very good experience in complete project life cycle (design, development, testing and implementation) of Client Server and Web applications.
  • Excellent Java development skills using J2EE, J2SE, Servlets, JSP, EJB, JDBC, SOAP and RESTful web services.
  • Strong experience with Data Warehousing ETL concepts using Informatica PowerCenter, OLAP, OLTP, and Autosys.
  • Experienced in working with Amazon Web Services (AWS) using EC2 for computing and S3 as storage mechanism.
  • Strong experience in Object-Oriented Design, Analysis, Development, Testing and Maintenance.
  • Excellent implementation knowledge of Enterprise/Web/Client Server using Java, J2EE.
  • Experienced in using agile approaches, including Extreme Programming, Test-Driven Development and Agile Scrum.
  • Worked in large and small teams for systems requirement, design & development.
  • Key participant in all phases of software development life cycle with Analysis, Design, Development, Integration, Implementation, Debugging, and Testing of Software Applications in client server environment, Object Oriented Technology and Web based applications.
  • Experience using IDEs such as Eclipse and IntelliJ and repositories such as SVN and Git.
  • Experience using the build tools Ant and Maven.
  • Prepared standard code guidelines and analysis and testing documentation.
  • Good interpersonal skills; committed, result-oriented, and hardworking, with a strong drive to learn new technologies.
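
For illustration, below is a minimal Scala sketch of the kind of RDD and DataFrame transformation work described above. The application name, input path, and column names are assumptions for the example, not taken from any specific project listed here.

```scala
import org.apache.spark.sql.SparkSession

object TransformSketch {
  def main(args: Array[String]): Unit = {
    // Illustrative only: input path and column names are placeholders.
    val spark = SparkSession.builder()
      .appName("transform-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Build an RDD from a raw text file and keep only well-formed rows.
    val raw = spark.sparkContext.textFile("hdfs:///data/raw/events.txt")
    val parsed = raw.map(_.split(",")).filter(_.length == 3)

    // Convert to a DataFrame for SQL-style aggregation.
    import spark.implicits._
    val events = parsed.map(f => (f(0), f(1), f(2).toDouble))
      .toDF("userId", "eventType", "amount")

    events.groupBy("eventType").count().show()
    spark.stop()
  }
}
```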

TECHNICAL SKILLS

Big Data Technologies: HDFS, MapReduce, Spark, Kafka, NiFi, YARN, ZooKeeper, Hive, Pig, Sqoop, Flume, Storm, Impala, Oozie.

NoSQL Databases: HBase, Cassandra, MongoDB, Couchbase.

Distributions: Cloudera, Hortonworks, Amazon Web Services, Azure.

Languages: C, Java, Scala, Python, SQL, PL/SQL, Pig Latin, HiveQL, JavaScript, Shell Scripting

Java & J2EE Technologies: Core Java, Servlets, Hibernate, Spring, Struts, JMS, EJB, RESTful

Web Technologies: HTML5, CSS3, JavaScript, JSON

Application Servers: WebLogic, WebSphere, JBoss, Tomcat.

Databases: Microsoft SQL Server, ANSI SQL, MySQL, Oracle, DB2

Operating Systems: UNIX, Linux, Windows

Build Tools: Jenkins, Maven, ANT

Business Intelligence Tools: Tableau, Splunk, QlikView

Development Tools: Microsoft SQL Studio, Eclipse, NetBeans, IntelliJ

Development Methodologies: Agile/Scrum, Waterfall

Version Control Tools: Git, SVN

PROFESSIONAL EXPERIENCE

Confidential, Richmond, VA

Sr. Big Data Developer

Responsibilities:

  • Working in an Agile team to deliver and support required business objectives, using Java, Python, shell scripting, and other related technologies to acquire, ingest, transform, and publish data to and from the Hadoop ecosystem.
  • Performed Data Cleansing using Python and loaded it into the target tables.
  • Handled logical implementation of and interaction with HBase.
  • Used Scala to store streaming data to HDFS and to implement Spark for faster processing of data.
  • Performed daily Sqoop incremental imports scheduled through Oozie.
  • Worked on designing and developing ETL workflows using java for processing data in HDFS/HBase using Oozie.
  • Installed and configured Hadoop MapReduce, HDFS, developed MapReduce jobs in Java for data cleaning and pre-processing.
  • Involved in using HCatalog to access Hive table metadata from MapReduce or Pig code.
  • Created Pig scripts to transform the HDFS data and loaded the data into Hive external table.
  • Worked on large-scale Hadoop YARN cluster for distributed data processing and analysis using Connectors, Spark core, Spark SQL, Sqoop, Pig, Hive and NoSQL databases.
  • Implemented Spark scripts using Scala and Spark SQL to access Hive tables in Spark for faster processing of data (a sketch follows this list).
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala.
  • Implemented Spark RDD transformations, actions to implement business analysis.
  • Connected to HDFS using Pentaho Kettle to read data from Hive tables and perform analysis.
  • Worked on Spark Streaming and Spark SQL to run sophisticated applications on Hadoop.
  • Used Oozie and Oozie coordinators to deploy end-to-end processing pipelines and schedule the workflows.
  • Worked on the concept of quorum with Kafka and ZooKeeper.
  • Created an e-mail notification service that alerts the requesting team upon job completion.
  • Fixed production issues and provided error-free solutions.
  • Coordinated with onsite/offshore team members on a daily basis.
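
As referenced above, here is a minimal Scala sketch of accessing Hive tables through Spark SQL. The database, table, and column names are placeholders invented for illustration.

```scala
import org.apache.spark.sql.SparkSession

object HiveAccessSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-access-sketch")
      .enableHiveSupport()        // lets Spark SQL read tables from the Hive metastore
      .getOrCreate()

    // Read an existing Hive table into a DataFrame and run a query on it.
    val orders = spark.sql("SELECT order_id, status, amount FROM sales.orders")

    val totalsByStatus = orders
      .groupBy("status")
      .sum("amount")

    // Persist the result back to Hive as a managed table.
    totalsByStatus.write.mode("overwrite").saveAsTable("sales.order_totals_by_status")

    spark.stop()
  }
}
```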

Environment: Hadoop, CDH 4, CDH 5, Scala, MapReduce, HDFS, Hive, Pig, Sqoop, HBASE, Flume, Spark SQL, Spark-Streaming, MapR, Python, UNIX Shell Scripting and Cassandra.

Confidential, Texas

Data Engineer

Responsibilities:

  • Working in an Agile team to deliver and support required business objectives, using Java, Python, shell scripting, and other related technologies to acquire, ingest, transform, and publish data to and from the Hadoop ecosystem.
  • Implemented MapReduce programs to handle semi-structured/unstructured data such as XML, JSON, and Avro data files, and sequence files for log files (see the sketch after this list).
  • Developed MapReduce jobs in java for data cleaning and pre-processing.
  • Developed simple and complex MapReduce programs in Java for data analysis on different data formats.
  • Performed data cleansing using Python and loaded it into the target tables.
  • Used Scala to store streaming data to HDFS and to implement Spark for faster processing of data.
  • Worked on large-scale Hadoop YARN cluster for distributed data processing and analysis using Connectors, Spark core, Spark SQL, Sqoop, Pig, Hive and NoSQL databases.
  • Participated in the development, improvement, and maintenance of snowflake database applications.
  • Implemented Spark scripts using Scala and Spark SQL to access Hive tables in Spark for faster processing of data.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Python and Scala.
  • Performed advanced procedures such as text analytics and processing, using the in-memory computing capabilities of Spark with Scala.
  • Implemented Spark RDD transformations, actions to implement business analysis.
  • Connected to HDFS using Pentaho Kettle to read data from Hive tables and perform analysis.
  • Worked on Spark Streaming and Spark SQL to run sophisticated applications on Hadoop.
  • Fixed production issues and provided error-free solutions.
  • Coordinated with onsite/offshore team members on a daily basis.
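
As referenced above, the bullets describe MapReduce handling of semi-structured formats; for brevity, the sketch below shows equivalent reads of JSON and Avro data with Spark DataFrames in Scala, which this resume also covers. The Avro reader assumes the spark-avro package is on the classpath, and all paths and column names are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession

object SemiStructuredSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("semi-structured-sketch")
      .getOrCreate()

    // JSON: Spark infers the schema from the records (paths are placeholders).
    val clicks = spark.read.json("hdfs:///data/logs/clicks/*.json")

    // Avro: requires the spark-avro package on the classpath.
    val profiles = spark.read.format("avro").load("hdfs:///data/profiles/")

    // Join the two sources and keep only rows with the required fields populated.
    val enriched = clicks.join(profiles, Seq("userId"))
      .na.drop(Seq("userId", "pageUrl"))

    enriched.write.mode("overwrite").parquet("hdfs:///data/curated/enriched_clicks/")
    spark.stop()
  }
}
```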

Environment: Hadoop, Spark, Scala, MapReduce, HDFS, Hive, Pig, Sqoop, HBASE, Flume, Spark SQL, Spark-Streaming, MapR, Python, UNIX Shell Scripting and Cassandra.

Confidential, Austin, TX

Hadoop Developer

Responsibilities:

  • Configured Spark Streaming to receive real-time data from Kafka and store the stream data to Cassandra using Scala (see the sketch after this list).
  • Developed Spark code to read data from HDFS and write to Cassandra.
  • Performed real-time analysis of the incoming data using the Kafka consumer API, Kafka topics, and Spark Streaming with Scala.
  • Integrated Apache Storm with Kafka to perform web analytics. Uploaded click stream data from Kafka to HDFS, HBase and Hive by integrating with Storm.
  • Developed Kafka producers and consumers, Cassandra clients, and Spark components on HDFS.
  • Used the Spark-Cassandra Connector to load data to and from Cassandra.
  • Experienced in designing and deployment of Hadoop cluster and various Big Data components including HDFS, Map Reduce, and Zookeeper in Cloudera distribution.
  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark with Scala.
  • Implemented Kafka custom partitioners to send data to different categorized topics.
  • Implemented a messaging system for different data sources using Apache Kafka and configured high-level consumers for online and offline processing.
  • Loaded and transformed large sets of structured and unstructured data.
  • Responsible for handling a team of 4 members at Confidential offshore.
  • Involved in daily SCRUM meetings to discuss the development/process.
  • Worked with data delivery teams to setup new Hadoop users. This job included setting up Linux users, setting up Kerberos principals and testing HDFS, Hive.
  • Worked with the data ingestion team; good understanding of Apache NiFi and its transformations.
  • Scaled the Cassandra cluster based on load patterns.
  • Good understanding of Cassandra data modeling based on applications.
  • Involved in code deployment to Production and providing support to Production support team.
  • Good understanding of Hadoop admin work; maintained the Hadoop cluster using Ambari.
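
As referenced in the first bullets above, here is a minimal Scala sketch of a Kafka-to-Cassandra streaming path using the Kafka direct stream API and the DataStax spark-cassandra-connector. The broker address, topic, keyspace, table, and column names are all placeholders.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._
import com.datastax.spark.connector._   // spark-cassandra-connector

object KafkaToCassandraSketch {
  def main(args: Array[String]): Unit = {
    // Broker, topic, keyspace, and table names below are placeholders.
    val conf = new SparkConf()
      .setAppName("kafka-to-cassandra-sketch")
      .set("spark.cassandra.connection.host", "cassandra-host")
    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "kafka-host:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "clickstream-consumers",
      "auto.offset.reset" -> "latest"
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("clickstream"), kafkaParams)
    )

    // Parse each record and persist it to a Cassandra table with matching columns.
    stream.map(record => record.value.split(","))
      .filter(_.length == 3)
      .map(f => (f(0), f(1), f(2)))
      .foreachRDD(_.saveToCassandra("analytics", "clicks",
        SomeColumns("user_id", "page_url", "event_ts")))

    ssc.start()
    ssc.awaitTermination()
  }
}
```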

Environment: Hadoop, Spark, Scala, Java, MapReduce, HDFS, Cassandra, Ambari, Hive, Pig, Sqoop, Flume, Linux, Python, Kafka, Storm, Shell Scripting, XML, Eclipse, Cloudera, DB2, SQL Server, MySQL, AWS, HBase

Confidential, Kenilworth, NJ

Hadoop Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce, HDFS, developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
  • Imported and exported data between an Oracle 10.2 database and HDFS using Sqoop.
  • Experienced in defining and coordination of job flows.
  • Gained experience in reviewing and managing Hadoop log files.
  • Extracted files from NoSQL database (MongoDB), HBase through Sqoop and placed in HDFS for processing.
  • Involved in Writing Data Refinement Pig Scripts and Hive Queries.
  • Developed a fully customized framework using Python, shell scripting, Sqoop, and Hive.
  • Good knowledge in running Hadoop streaming jobs to process terabytes of XML-format data.
  • Load and transform large sets of structured, semi structured and unstructured data.
  • Coordinated cluster services using ZooKeeper.
  • Implemented object-relational mapping and persistence using the Hibernate ORM.
  • Developed custom validators in Struts and implemented server-side validations using annotations.
  • Used Oracle for the database and WebLogic as the application server.
  • Used Flume to transport logs to HDFS.
  • Experienced in moving data from Hive tables into Cassandra for real-time analytics on Hive tables.
  • Configured connection between HDFS and Tableau using Impala for Tableau developer team.
  • Responsible to manage data coming from different sources.
  • Got good experience with various NoSQL databases.
  • Experienced with handling administration activities using Cloudera Manager.
  • Supported MapReduce programs running on the cluster.
  • Involved in loading data from UNIX file system to HDFS.
  • Installed and configured Hive and wrote Hive UDFs (a sketch follows this list).
  • Involved in creating Hive tables, loading them with data, and writing Hive queries which run internally as MapReduce jobs.
  • Worked on Talend ETL tool, developed and scheduled jobs in Talend integration suite.
  • Modified reports and Talend ETL jobs based on the feedback from QA testers and Users in development and staging environments.
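
As referenced above, below is a minimal sketch of a Hive UDF. It is written in Scala for consistency with the other sketches (Hive UDFs are most commonly written in Java), and the function name and behavior are illustrative only.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Illustrative registration in Hive (jar name and function name are hypothetical):
//   ADD JAR normalize-udf.jar;
//   CREATE TEMPORARY FUNCTION normalize_str AS 'NormalizeString';
class NormalizeString extends UDF {
  // Trim and lower-case a string column; returns null for null input,
  // mirroring standard Hive UDF null handling.
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.toLowerCase)
  }
}
```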

Environment: Apache Hadoop, Java, JDK 1.6, J2EE, JDBC, Servlets, JSP, Linux, XML, WebLogic, SOAP, WSDL, HBase, Hive, Pig, Sqoop, ZooKeeper, NoSQL, R, Mahout, MapReduce, Cloudera, HDFS, Flume, Impala, Tableau, Talend, MySQL, HTML5, CSS, MongoDB

Confidential

Java developer

Responsibilities:

  • Developed JSP for UI and Java classes for business logic.
  • Used XSLT for UI to display XML Data.
  • Utilized JavaScript for client-side validation.
  • Utilized Oracle PL/SQL for database operations using JDBC API.
  • Implemented DAOs for Oracle 8i for DML operations such as inserting, updating, and deleting records (see the sketch after this list).
  • Used VSS for software configuration management.
  • Involved in the design, development and deployment of the Application using Java/J2EE Technologies.
  • Used IDE tool WSAD for development of the application.
  • Developed Application in Jakarta Struts Framework using MVC architecture.
  • Customized all JSP pages with a consistent look and feel using Tiles and CSS (Cascading Style Sheets).
  • Involved in coding for the presentation layer using Apache Struts, XML and JavaScript.
  • Created Action Forms and Action classes for the modules.
  • Developed JSPs that validate information automatically using Ajax.
  • Implemented J2EE design patterns viz. Façade pattern, Singleton Pattern.
  • Created struts-config.xml and tiles-def.xml files.
  • Developed an Ant script to create the war/ear file and deploy the application to the application server.
  • Extensively involved in database activities like designing tables, SQL queries in the application and maintained complex reusable methods which implements stored procedures to fetch data from the database.
  • Used CVS for version control.
  • Also involved in testing and deployment of the application on WebLogic Application Server during integration.
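
As referenced above, here is a minimal sketch of a JDBC-based DAO with parameterized DML statements. It is written in Scala for consistency with the other sketches (the role itself used Java), and the connection details and table layout are placeholders; it also assumes the Oracle JDBC driver is on the classpath.

```scala
import java.sql.{Connection, DriverManager, PreparedStatement}

// Connection URL, credentials, and table layout are illustrative placeholders.
class CustomerDao(url: String, user: String, password: String) {

  // Open a connection, run the body, and always close the connection.
  private def withConnection[T](body: Connection => T): T = {
    val conn = DriverManager.getConnection(url, user, password)
    try body(conn) finally conn.close()
  }

  // INSERT a record using a parameterized statement.
  def insert(id: Long, name: String): Int = withConnection { conn =>
    val ps = conn.prepareStatement("INSERT INTO customers (id, name) VALUES (?, ?)")
    try { ps.setLong(1, id); ps.setString(2, name); ps.executeUpdate() }
    finally ps.close()
  }

  // DELETE a record by primary key.
  def delete(id: Long): Int = withConnection { conn =>
    val ps = conn.prepareStatement("DELETE FROM customers WHERE id = ?")
    try { ps.setLong(1, id); ps.executeUpdate() } finally ps.close()
  }
}
```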

Environment: Java/J2EE, JSP, Servlets, Struts 1.1, Spring, JUnit, Eclipse, Apache Ant, JSP, JavaBeans, JavaScript, Tomcat 4.1, Oracle 9i, XML, XSLT, HTML/DHTML/XHTML, CSS, Tiles, Ajax, DB2 UDB, PL/SQL, XML SPY.

Confidential

Java Developer

Responsibilities:

  • Involved in projects utilizing Java and Java EE web applications to create fully integrated client management systems.
  • Developed the UI using HTML, JavaScript, and JSP, and developed business logic and interfacing components using Business Objects, JDBC, and XML.
  • Participated in user requirement sessions to analysis and gather Business requirements.
  • Developed the user-visible site using Perl, back-end admin sites using Python, and big data components using core Java.
  • Involved in development of the application using Spring Web MVC and other components of the Spring framework.
  • Elaborated Use Cases based on business requirements and was responsible for creation of class Diagrams, Sequence Diagrams.
  • Implemented Object-relation mapping in the persistence layer using Hibernate (ORM) framework.
  • Implemented REST web services with the Jersey API to handle customer requests (see the sketch after this list).
  • Experienced in developing RESTful web services, both consumed and produced.
  • Used Hibernate for the Database connection and Hibernate Query Language (HQL) to add and retrieve the information from the Database.
  • Implemented Spring JDBC for connecting to the Oracle database.
  • Designed the application using the MVC framework for easy maintainability.
  • Provided bug fixing and testing for existing web applications.
  • Involved in full system life cycle and responsible for Developing, Testing, Implementing.
  • Involved in Unit Testing, Integration Testing and System Testing.
  • Implemented Form Beans and their Validations.
  • Wrote Hibernate components.
  • Developed client-side validations with JavaScript.
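
As referenced above, here is a minimal sketch of a JAX-RS resource of the kind Jersey serves. It is written in Scala for consistency with the other sketches, with an invented resource path and payload; in the real service the handler would delegate to the persistence layer rather than return a literal string.

```scala
import javax.ws.rs.{GET, Path, PathParam, Produces}
import javax.ws.rs.core.MediaType

// Resource path and payload shape are illustrative; Jersey discovers the
// class through its normal JAX-RS scanning/registration.
@Path("/customers")
class CustomerResource {

  @GET
  @Path("/{id}")
  @Produces(Array(MediaType.APPLICATION_JSON))
  def byId(@PathParam("id") id: String): String = {
    // Placeholder response; a real implementation would call Hibernate/HQL here.
    s"""{"id": "$id", "status": "active"}"""
  }
}
```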

Environment: Spring, JSP, Servlets, REST, Oracle, AJAX, JavaScript, jQuery, Hibernate, WebLogic, Log4j, HTML, XML, CVS, Eclipse, SOAP Web Services, XSLT, XSD, UNIX, Maven, Mockito, JUnit, Jenkins, Shell Scripting, MVS, ISPF.
