Hadoop Developer Resume
Milwaukee, WI
SUMMARY
- 8+ years of experience across the Software Development Life Cycle, including requirements gathering, documentation, analysis, development, testing, and support. Over 4 years of extensive experience as a Hadoop Developer and Big Data Analyst. Primary technical skills in HDFS, MapReduce, YARN, Pig, Hive, Sqoop, HBase, Flume, Oozie, and ZooKeeper.
- Working experience with Big Data and the Hadoop Distributed File System (HDFS). In-depth understanding of Hadoop architecture and its components, such as HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce concepts.
- Experience writing MapReduce programs with Apache Hadoop to analyze Big Data.
- Hands-on experience with ecosystem tools such as Hive, Pig, Sqoop, MapReduce, Flume, and Oozie. Strong knowledge of Pig and Hive analytical functions, and of extending Hive and Pig core functionality by writing custom UDFs. Created internal and external tables, partitioned tables, and bucketed tables in Hive.
- Developed Pig Latin scripts to extract the data from the output files and applied data transformation logic to load data into HDFS.
- Developed Sqoop scripts to import data from RDBMS to Hive and from RDBMS to HDFS, and to export data from HDFS to RDBMS. Developed custom functionality as Pig and Hive UDFs.
- Knowledge of job workflow scheduling and monitoring tools such as Oozie and Ganglia; NoSQL databases such as HBase and Bigtable; administrative tasks such as installing Hadoop, commissioning, and decommissioning; and ecosystem components such as Flume, Oozie, Hive, and Pig.
- Experience in the design, development, and testing of distributed, Internet/Intranet/E-Commerce, client/server, and database applications, mainly using Java, JDBC, and JavaScript on WebLogic and Apache Tomcat web/application servers, with Oracle and SQL Server databases on Unix and Windows NT platforms.
- Extensive experience with databases such as Oracle, MySQL, and MS SQL Server. Adept at writing SQL and PL/SQL scripts.
- Good experience in writing optimized MapReduce jobs in Java. Experience designing and implementing fast and efficient data acquisition using Big Data processing techniques and tools. Programming knowledge of Spark and Scala.
- Expertise in preparing test cases, documenting, and performing unit and integration testing.
- Developed Pig Latin scripts and Spark SQL scripts for handling data formation. Hands-on experience with IDE tools like Eclipse. Expertise in object-oriented analysis and design, Core Java, OOP concepts, and JScript.
- In-depth understanding of Data Structures and Algorithms and Optimization. Strong knowledge of Software Development Life Cycle and expertise in detailed design documentation.
- Developed Sqoop scripts for importing large datasets from RDBMS to HDFS. Created UDFs in Java and registered them in Pig and Hive.
- Fast learner with strong analytical, communication, and interpersonal skills, interested in problem solving and troubleshooting. Self-motivated, excellent team player, able to take work from problem statement to well-documented design. Excellent business knowledge and ability to work under pressure.
TECHNICAL SKILLS
Hadoop Components: HDFS, MapReduce, Pig, Sqoop, Hive, HBase, Flume, Oozie, Impala
Apache Spark: Spark, Spark SQL, Spark Streaming, Scala.
Database: Oracle 9i/10g, SQL, PL/SQL.
Scripting Language: JavaScript, Pig Script, Shell scripting, DOS/Batch scripting.
Operating Systems: Windows, Linux/Unix.
Versioning Systems: Git, SVN.
NoSQL: Cassandra, HBase.
Languages & Technologies: C, C++, Java, SQL, PL/SQL.
IDE: Eclipse
PROFESSIONAL EXPERIENCE
Confidential, Milwaukee, WI
Hadoop Developer
Responsibilities:
- Involved in HBase setup and in storing data in HBase for use in analysis.
- Developed analytical components using Scala, Spark, Apache Mesos, and Spark Streaming.
- Involved in loading data from the Linux file system to HDFS.
- Developed Spark scripts using Scala shell commands as per the requirements.
- Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
- Worked hands-on with the ETL process; responsible for running Hadoop streaming jobs to process terabytes of XML data.
- Wrote and implemented Apache Pig scripts to load data from and store data into Hive.
- Implemented Spark Streaming applications in Scala.
- Combined visualizations into Interactive Tableau Dashboards and published them to the web portal.
- Imported MySQL data into HDFS using Sqoop.
- Developed custom input formats and data types to parse and process unstructured and semi-structured input data, mapping them into key-value pairs to implement business logic in MapReduce.
- Evaluated information gathered from multiple sources, reconciled conflicts, and classified the information into logical categories.
- Designed and tested data ingestion to handle data from multiple sources into the Enterprise Data Lake.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs, Scala and Python.
- Analyzed the SQL scripts and designed the solution to implement using PySpark.
- Worked on reading multiple data formats on HDFS using Scala.
- Implemented a real-time system with Kafka, Storm, and ZooKeeper.
- Used Spark Streaming to collect data from Kafka in near real time, perform the necessary transformations and aggregations on the fly to build the common learner data model, and persist the data in HBase.
- Developed Hive scripts equivalent to existing Teradata scripts and tuned their performance using various techniques.
- Used Spark Streaming to receive real-time data from Kafka and store the stream data in HDFS and in databases such as HBase, using Scala.
- Handled client communication and participated in requirements gathering with business users.
- Coordinated with offshore and onsite teams to understand the requirements and prepared high-level and low-level design documents from the requirements specification.
Environment: Hortonworks Data Platform (HDP), Hive, Pig, HBase, MapReduce, Flume, Spark, Spark SQL, Spark Streaming, Scala, Python, Kafka, Storm, Oozie, ZooKeeper, Shell/Bash Scripting, YARN, JIRA, JDBC
Confidential, Palo Alto, CA
Hadoop Developer
Responsibilities:
- Developed Apache Pig scripts to process the HDFS data. Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team.
- Created Hive tables to store the processed results in tabular format. Developed Sqoop scripts to enable the interaction between Pig and the MySQL database.
- Wrote MapReduce code that takes log files as input, parses the logs, and structures them in tabular format to facilitate effective querying of the log data.
- Created an external Hive table on top of the parsed data. Designed and developed a Big Data analytics platform for processing customer viewing preferences and social media comments using Java, Hadoop, Hive, and Pig.
- Retrieved data from databases such as MySQL and Oracle into HDFS using Sqoop and ingested it into HBase.
- Implemented Flume to import streaming log data and aggregate it into HDFS.
- Able to integrate state-of-the-art Big Data technologies into the overall architecture and lead a team of developers through the construction, testing, and implementation phases. Involved in requirements gathering, design, development, and testing.
Environment: Cloudera Distribution of Hadoop (CDH), Pig, Hive, Sqoop, MapReduce, Java, UNIX, Flume, MySQL, Oozie, ZooKeeper, Shell/Bash Scripting.
Confidential, Bloomington, IL
Hadoop Developer
Responsibilities:
- Gathered the business requirements from the Business Partners and Subject Matter Experts.
- Involved in writing MapReduce programs.
- Managed and scheduled jobs on a Hadoop cluster using Oozie.
- Responsible for designing and managing the Sqoop jobs that uploaded the data from Oracle to HDFS and Hive.
- Involved in moving log files generated from various sources to HDFS through Flume for further processing.
- Worked on Hue interface for querying the data.
- Developed Pig scripts for data analysis and extended Pig's functionality by developing custom UDFs.
- Created Hive tables to store the processed results in a tabular format.
- Created HBase tables to store variable data formats.
- Utilized cluster coordination services through Zookeeper.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Participated in the requirements gathering and analysis phase of the project, documenting the business requirements by conducting workshops and meetings with various business users.
Environment: Java, Hadoop, Pig, Hive, Hue, Oozie, Sqoop, Flume, HBase, Zookeeper, Oracle 10g, Eclipse.
Confidential
Java Developer
Responsibilities:
- Designed and developed a Rich GUI (RIA) front-end using DHTML, JSP and JavaScript.
- Used Struts framework for developing application and user interfaces.
- Worked on complete design and coding using JSP, Servlets.
- Managed memory for Java classes efficiently by customizing the mark-and-sweep garbage collector algorithm.
- Used Core Java concepts: Collections, Exception Handling, Multithreading, Serialization, etc.
- Developed a Swing desktop client to access Cash Services.
- Used various design patterns such as Singleton, Session Façade, and DAO.
- Involved in writing EJB entity and session beans.
- Worked with WebSphere 5.0 clustering and MQSeries for background processes.
- Validated XML documents against schemas using a SAX parser.
- Implemented Log4j for error logging, debugging, and tracking.
Environment: Java, JSP, .Net, C#, Servlets, Swing, EJB, JMS, AJAX, Oracle, XML, XSLT, HTML, CSS, WebSphere, UML, RAD, TOAD, PL/SQL, JUnit, Apache Ant, CVS, and Log4j.
Confidential
Java Developer
Responsibilities:
- Involved in High level and low-level design of the application.
- Designed the database to support the online application.
- Developed database interaction with JDBC API using SQL Queries and advanced prepared statements.
- Wrote Entity and Session EJBs and deployed EJBs, Servlets, and JSPs that hold the business logic.
- Developed JSPs using the Jakarta Struts Framework (MVC).
- Supported the application in QA and Production environments.
- Followed coding guidelines and maintained code quality.
- Involved in building the code and deploying it on the JBoss application server.
- Involved in validating the application for cross-browser compatibility and user load.
Environment: J2EE, Servlets, JSP, EJB, HTML, DHTML, JavaBeans, WebLogic, UML, Struts, UNIX, and Oracle.
