
Big Data Developer Resume


Dallas, TX

SUMMARY

  • 8+ years of overall IT experience, including hands-on experience with Big Data technologies.
  • Experience in installing, configuring, and using Hadoop ecosystem components such as Scala, MapReduce, HDFS, Hive, Sqoop, Storm, Kafka, YARN, HBase, Pig, ZooKeeper, Oozie, and Flume.
  • Well experienced in developing Java MapReduce programs and in Hadoop ecosystem tools such as Spark and Scala programming, Oozie, and Flume.
  • Experience in working with Cassandra.
  • Experience importing and exporting data from databases such as MySQL and Oracle into HDFS and Hive using Sqoop.
  • Extensive experience on several Apache Hadoop projects, including MapReduce programs written with the Hadoop Java API as well as Hive and Pig.
  • Follow software development life cycle (SDLC) practices and ensure all deliverables meet project specifications.
  • Experience in developing custom UDFs in Java to extend Hive and Pig Latin functionality (a minimal UDF sketch follows this summary).
  • In-depth knowledge of statistics, machine learning, and data mining.
  • Worked on NoSQL databases including HBase and MongoDB.
  • Hands on experience on Real Time data tools like Kafka and Storm.
  • Experienced with big data machine learning in Mahout and Spark MLlib.
  • Extensive knowledge of current development and source code management tools (Git, SVN).
  • Working experience in the installation, configuration, and maintenance of Hortonworks Hadoop clusters for application development.
  • Extensive working experience in the Agile software development model, including facilitating scrum sessions as Scrum Master.
  • Familiar with data architecture including data ingestion pipeline design, Hadoop architecture, data modeling and data mining, machine learning and advanced data processing. Experience optimizing ETL workflows.
  • Good communication and interpersonal skills with self-learning attitude.
  • Ability to perform at a high level, meet deadlines, adaptable to ever changing priorities.
  • Experienced with data cleansing in MapReduce and Spark jobs.
  • Hands-on experience developing enterprise Hadoop applications in a Cloudera environment.
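For illustration, a minimal sketch of the kind of custom Hive UDF mentioned above, assuming a simple string-normalization use case; the class name and behavior are placeholders rather than code from the projects listed below.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Minimal Hive UDF sketch: trims and lower-cases a string column.
    // The class name and logic are illustrative placeholders.
    public class NormalizeText extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null;
            }
            return new Text(input.toString().trim().toLowerCase());
        }
    }

Such a UDF would typically be packaged into a JAR, added to the Hive session with ADD JAR, and registered with CREATE TEMPORARY FUNCTION before being used in queries.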

TECHNICAL SKILLS

Programming Languages: Java, C, C++, C# and Scala.

Hadoop Ecosystem: HDFS, MapReduce, Pig, Hive, Sqoop, Flume, Storm, Scala, YARN, ZooKeeper, HBase, Kafka, Hortonworks Data Platform (HDP).

IDE Tools: Eclipse, NetBeans, STS, IntelliJ.

Operating Systems: MS-DOS, Windows, Linux, Unix

Web Technologies: HTML, CSS, Javascript and AJAX.

Databases: Oracle, MySQL and SQL Server.

Application/Web Servers: Apache Tomcat, WebLogic, TFS

Functional Testing Tools: QuickTest Professional, Selenium, LoadRunner, Quality Center, HP ALM, JIRA

PROFESSIONAL EXPERIENCE

Confidential, Dallas, TX

Big Data Developer

Responsibilities:

  • Installed and configured Hadoop MapReduce and HDFS, and developed MapReduce jobs in Java for data cleaning and preprocessing (a cleaning-mapper sketch follows this list).
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Defined job flows and managed and reviewed Hadoop log files.
  • Participated in development/implementation of Cloudera environment.
  • Responsible for running Hadoop streaming jobs to capture and process terabytes of XML data coming from different sources.
  • Developed code to load data from Linux file system to HDFS.
  • Worked on implementing and integrating NoSQL databases such as HBase.
  • Supported Map Reduce Programs running on the cluster.
  • Installed and configured Hive and wrote Hive UDFs in Java and Python.
  • Loaded and transformed large structured, semi-structured, and unstructured datasets.
  • Expertise in data modeling and data warehouse design and development.
  • Used the Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Explored Spark, Kafka, Storm along with other open source projects to create a real-time analytics framework.
  • Explored Spark for improving the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
  • Developed Pig Latin scripts to process the data and wrote UDFs in Java and Python.
  • Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
  • Wrote MapReduce programs in Java to achieve the required output.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Developed Hive queries for data analysis to meet the Business requirements.
  • Used Oozie workflow engine to create the workflows and automate the Map Reduce, Hive, Pig jobs.
  • Supported setting up the QA environment and updating configurations for implementing scripts with Pig and Sqoop; handled cluster coordination through ZooKeeper.
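As a hedged illustration of the data-cleaning MapReduce work referenced in the first bullet of this list, a map-only job might drop malformed delimited records before downstream processing. The field count, delimiter, and class name below are assumptions, not the project's actual code.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-only cleaning step: emits only records with the expected number of fields.
    public class CleanRecordsMapper
            extends Mapper<LongWritable, Text, NullWritable, Text> {

        private static final int EXPECTED_FIELDS = 5;   // illustrative schema width

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(",", -1);
            // Keep only well-formed rows with a non-empty first field.
            if (fields.length == EXPECTED_FIELDS && !fields[0].isEmpty()) {
                context.write(NullWritable.get(), value);
            }
        }
    }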

Environment: Java, Hadoop, Spark, Scala, NoSQL, Kafka, Python, HDFS, Hive, Pig, MapReduce, YARN, Datameer, Flume, Oozie, Linux, Teradata, HCatalog, Eclipse IDE, Git.

Confidential, Arlington Heights, IL

Hadoop Developer

Responsibilities:

  • Developed PIG scripts to transform the raw data into intelligent data as specified by business users.
  • Worked in AWS environment for development and deployment of Custom Hadoop Applications.
  • Worked closely with the data modelers to model the new incoming data sets.
  • Involved in the end-to-end process of Hadoop jobs that used technologies such as Sqoop, Pig, Hive, MapReduce, Spark, and shell scripts (for scheduling a few jobs). Extracted and loaded data into a data lake environment (Amazon S3) using Sqoop, where it was accessed by business users and data scientists.
  • Expertise in designing and deploying Hadoop clusters and various Big Data analytic tools including Pig, Hive, HBase, Oozie, ZooKeeper, Sqoop, Flume, Spark, Impala, and Cassandra with the Hortonworks distribution.
  • Installed Hadoop, MapReduce, HDFS, and AWS components, and developed multiple MapReduce jobs in Pig and Hive for data cleaning and pre-processing.
  • Assisted in upgrading, configuring, and maintaining various Hadoop infrastructure components such as Pig, Hive, and HBase.
  • Used Spark API over Hortonworks Hadoop YARN to perform analytics on data in Hive.
  • Improved the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Developed Spark code using Scala and Spark-SQL/Streaming for faster testing and processing of data.
  • Imported data from different sources such as HDFS and HBase into Spark RDDs.
  • Developed a POC for single-member debugging on Hive/HBase and Spark.
  • Configured, deployed, and maintained multi-node Dev and Test Kafka clusters.
  • Performed transformations, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the final data into HDFS.
  • Loaded data into Spark RDDs and performed in-memory computation to generate the output response (see the Spark sketch after this list).
  • Loaded data into HBase using bulk and non-bulk loads.
  • Experience with Oozie as a workflow scheduler to manage Hadoop jobs through Directed Acyclic Graphs (DAGs) of actions with control flows.
  • Expertise in data modeling and data warehouse design and development.
  • Developed a data pipeline using Kafka and Storm to store data into HDFS.
  • Performed real time analysis on the incoming data.
  • Used Apache Kafka and Apache Storm to gather log data and feed it into HDFS.
  • Automated the extraction of data from warehouses and weblogs by developing workflows and coordinator jobs in Oozie.
  • Performed transformations such as event joins, bot-traffic filtering, and pre-aggregations using Pig.
  • Developed MapReduce jobs to convert data files into Parquet file format.
  • Executed Hive queries on Parquet tables stored in Hive to perform data analysis to meet the business requirements.
  • Configured Oozie workflow to run multiple Hive and Pig jobs which run independently with time and data availability.
  • Optimized MapReduce code and Pig scripts, and performed performance tuning and analysis.
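A minimal sketch of the RDD-style Spark processing described above, written against the Java API for consistency with the rest of this document; the input paths, delimiter, and field positions are illustrative assumptions rather than project code.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    // Loads delimited records from HDFS into an RDD, drops malformed rows,
    // and computes per-key counts in memory before writing the result back.
    public class EventCounts {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("EventCounts");
            JavaSparkContext sc = new JavaSparkContext(conf);

            JavaRDD<String> lines = sc.textFile("hdfs:///data/events/input");

            JavaPairRDD<String, Integer> counts = lines
                    .map(line -> line.split("\t", -1))
                    .filter(fields -> fields.length >= 3)            // keep well-formed rows only
                    .mapToPair(fields -> new Tuple2<>(fields[1], 1)) // key on an illustrative field
                    .reduceByKey(Integer::sum);

            counts.saveAsTextFile("hdfs:///data/events/output");
            sc.stop();
        }
    }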

Environment: MapReduce, HDFS, Hive, Pig, Spark, Spark-Streaming, Spark SQL, Apache Kafka, Sqoop, Java, Scala, CDH4, CDH5, AWS, Eclipse, Oracle, Git, Shell Scripting and Cassandra.

Confidential, Pittsburgh, PA

Technical Specialist - BigData

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Responsible for Cluster maintenance, adding and removing cluster nodes, Cluster Monitoring and Troubleshooting, Manage and review data backups and log files.
  • Optimized Map/Reduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Analyzed data using Hadoop components Hive and Pig.
  • Developed entire frontend and backend modules using Python on Django Web Framework.
  • Wrote Python scripts to parse XML documents and load the data into the database.
  • Worked hands-on with the ETL process.
  • Worked on NoSQL databases including HBase and Elasticsearch (an HBase client sketch follows this list).
  • Worked on importing and exporting data from Oracle and DB2 into HDFS and HIVE using Sqoop.
  • Used Tableau as a reporting and data visualization tool.
  • Responsible for running Hadoop streaming jobs to process terabytes of XML data.
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
  • Involved in loading data from UNIX file system to HDFS.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Responsible for managing and reviewing Hadoop log files. Designed and developed a data management system using MySQL.
  • Responsible for creating Hive tables, loading data, and writing Hive queries.
  • Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
  • Extracted data from Teradata into HDFS using Sqoop, and vice versa.
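A minimal sketch of the kind of HBase client access mentioned above: one row is written and read back through the standard HBase Java client API. The table, column family, and qualifier names are illustrative assumptions.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Writes one row to an HBase table and reads it back.
    public class HBaseRowExample {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("events"))) {

                // Upsert a single cell under an illustrative column family "d".
                Put put = new Put(Bytes.toBytes("row-001"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("status"), Bytes.toBytes("processed"));
                table.put(put);

                // Read the same cell back.
                Result result = table.get(new Get(Bytes.toBytes("row-001")));
                byte[] status = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("status"));
                System.out.println(Bytes.toString(status));
            }
        }
    }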

Environment: Hadoop, Unix, ETL, HDFS, Kafka, Hive, Pig, SQL, Java MapReduce, Spark, Hadoop cluster, HBase, Sqoop, Oozie, Linux, data pipeline, Cloudera Hadoop Distribution, Python, MySQL, Git, MapR-DB.

Confidential

Java/J2EE Developer

Responsibilities:

  • Used CVS for maintaining the source code. Designed, developed, and deployed on Apache Tomcat Server.
  • Involved in analysis, design, and coding in a J2EE environment.
  • Developed Hibernate object/relational mappings according to the database schema (an illustrative entity mapping follows this list).
  • Designed the presentation layer and programmed using HTML, XML, XSL, JSP, JSTL and Ajax.
  • Designed, developed and implemented the business logic required for Security presentation controller.
  • Created XML files to implement most of the wiring needed for Hibernate annotations and Struts configurations.
  • Responsible for developing the forms, which contain the details of the employees, and generating the reports and bills.
  • Involved in designing class and dataflow diagrams using UML in Rational Rose.
  • Created and modified Stored Procedures, Functions, Triggers and Complex SQL Commands using PL/SQL.
  • Developed Shell scripts in UNIX and procedures using SQL and PL/SQL to process the data from the input file and load into the database.
  • Used MVC-Struts framework in the front-end to develop the User Interface.
  • Involved in the implementation of business logic in struts Framework and Hibernate in the back-end.
  • Developed various DAOs in the application using Spring JDBC support to fetch, insert, update, and delete data in the database tables.
  • Involved in code reviews and ensured code quality across the project.
  • Participated in all aspects of application design, development, testing, and implementation, using OOA/OOD or model driven design, and SOA. Hands-on development and testing with web services and SOA.
  • The Hibernate framework was used in the persistence layer for mapping an object-oriented domain model to a relational database.
  • Developed web services and exposed them (as a provider) to other teams for consumption.
  • Actively involved in software development life cycle starting from requirements gathering and performing Object Oriented Analysis.
  • Used core Java concepts in the application such as multithreaded programming and thread synchronization using the wait, notify, and join methods.
  • Created cross-browser compatible and standards-compliant CSS-based page layouts.
  • Involved in maintaining the records of the patients visited along with the prescriptions they were issued in the Database.
  • Performed unit testing on the applications that were developed.
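A minimal sketch of the kind of Hibernate object/relational mapping referenced in this list, using JPA annotations; the entity, table, and column names are placeholders rather than the project's actual schema.

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Maps one illustrative table to a Java entity.
    @Entity
    @Table(name = "EMPLOYEE")
    public class Employee {

        @Id
        @GeneratedValue(strategy = GenerationType.IDENTITY)
        @Column(name = "EMP_ID")
        private Long id;

        @Column(name = "FIRST_NAME", nullable = false)
        private String firstName;

        @Column(name = "LAST_NAME", nullable = false)
        private String lastName;

        public Long getId() { return id; }
        public String getFirstName() { return firstName; }
        public void setFirstName(String firstName) { this.firstName = firstName; }
        public String getLastName() { return lastName; }
        public void setLastName(String lastName) { this.lastName = lastName; }
    }

Once such an entity is registered in the Hibernate configuration, it can be persisted and queried through a Session or EntityManager without hand-written SQL.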

Environment: Unix (shell scripts), J2EE, JSP 1.0, Servlets, Hibernate, JavaScript, JDBC, Oracle 10g, UML, Rational Rose 2000, SQL, PL/SQL, CSS, HTML & XML, Apache Tomcat, Eclipse, MVC, JSP, EJB, Struts, WebLogic 9.0, AJAX, JMS, XSLT, JUnit, log4j, MyEclipse 6.0
