Hadoop Developer Resume
Rochester, MN
SUMMARY:
- 8+ years of experience in various IT technologies, including 4 years of hands-on experience with Big Data technologies.
- Extensive implementation and working experience with a wide array of tools in the Big Data stack, including HDFS, Spark, MapReduce, Hive, Pig, Flume, Oozie, Sqoop, Kafka, ZooKeeper, and HBase.
- Proficient in installing, configuring, and using Apache Hadoop ecosystem components such as MapReduce, Hive, Pig, Flume, YARN, HBase, Sqoop, Spark, Storm, Kafka, Oozie, and ZooKeeper, as well as AWS.
- Strong understanding of Hadoop daemons and MapReduce concepts.
- Used Informatica PowerCenter for extraction, transformation, and loading (ETL) of data from numerous sources such as flat files, XML documents, and databases.
- Experienced in developing UDFs for Pig and Hive using Java (a minimal UDF sketch follows this list).
- Strong knowledge of Spark, together with Scala, for large-scale streaming data processing.
- Hands-on experience developing UDFs, DataFrames, and SQL queries in Spark SQL.
- Highly skilled in integrating Kafka with Spark Streaming for high-speed data processing.
- Worked with NoSQL databases such as HBase, Cassandra, and MongoDB to extract information and store huge amounts of data.
- Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, and tuple stores.
- Experienced in writing Storm topologies that accept events from a Kafka producer and emit them into Cassandra (a producer sketch also follows this list).
- Able to develop MapReduce programs using Java and Python.
- Hands-on experience provisioning and managing multi-tenant Cassandra clusters in public cloud environments: Amazon Web Services (AWS) EC2 and OpenStack.
- Good understanding of and exposure to Python programming.
- Knowledge of developing NiFi flow prototypes for data ingestion into HDFS.
- Exported and imported data to and from Oracle using SQL Developer for analysis.
- Good experience using Sqoop for traditional RDBMS data pulls.
- Worked with different Hadoop distributions, including Hortonworks and Cloudera.
- Strong database skills in IBM DB2 and Oracle; proficient in database development, including constraints, indexes, views, stored procedures, triggers, and cursors.
- Extensive experience in shell scripting.
- Extensive use of open-source software, including the Eclipse 3.x IDE, and web/application servers such as Apache Tomcat 6.0.
- Experience designing components using UML: use case, class, sequence, deployment, and component diagrams for the requirements.
- Involved in report development using reporting tools such as Tableau; used Excel sheets, flat files, and CSV files to generate ad hoc Tableau reports.
- Broad design, development, and testing experience with Talend Integration Suite, and knowledge of performance tuning of mappings.
- Experience in understanding Hadoop security requirements and integrating with Kerberos authentication and authorization infrastructure.
- Experience with cluster monitoring tools such as Ambari and Apache Hue.
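A minimal sketch of the Hive UDF pattern described above, in Java; the `NormalizeZip` class name and the zip-code cleanup logic are illustrative assumptions, not taken from a specific project:

```java
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Hypothetical example: normalize free-form zip codes to their first five digits.
public final class NormalizeZip extends UDF {
    public Text evaluate(Text input) {
        if (input == null) {
            return null;  // Hive passes NULL through untouched
        }
        String digits = input.toString().replaceAll("[^0-9]", "");
        return digits.length() >= 5 ? new Text(digits.substring(0, 5)) : null;
    }
}
```

Once packaged into a JAR, a UDF like this would be registered in Hive with `ADD JAR` followed by `CREATE TEMPORARY FUNCTION normalize_zip AS 'NormalizeZip';`.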
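The Kafka bullets above describe producing events for Spark Streaming and Storm consumers. Here is a minimal Java producer sketch, assuming a local broker and a hypothetical `events` topic:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class EventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // assumed broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by device id keeps each device's events in partition order
            producer.send(new ProducerRecord<>("events", "device-1", "{\"temp\":72}"));
        }
    }
}
```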
TECHNICAL SKILLS:
- Big Data: HDFS, MapReduce, Hive, YARN, Pig, Sqoop, Kafka, Storm, Flume, Oozie, ZooKeeper, Apache Spark, Impala, NiFi
- Languages: Java, Scala, Python, SQL, Shell Scripting, JavaScript
- NoSQL Databases: HBase, Cassandra, MongoDB, DynamoDB
- Java/J2EE: J2EE, Applets, Swing, JDBC, JSP, Servlets, JSF, jQuery, Struts, Spring, Spring Boot, Hibernate, MVC
- Web Technologies: HTML, AJAX, XML, JSON
- Tools & Platforms: Tableau, Docker, Eclipse, Elasticsearch, AWS, JBoss, Apache Tomcat, Unix, Linux, Windows
PROFESSIONAL EXPERIENCE:
Confidential, Rochester, MN
Hadoop Developer
Responsibilities:
- Used Spark API over Cloudera Hadoop YARN to perform analytics on data.
- Explored Spark to improve the performance and optimize existing algorithms in Hadoop, using Spark Context, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Worked on batch processing of data sources using Apache Spark and Elasticsearch.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala.
- Worked on migrating Pig scripts and MapReduce programs to the Spark DataFrames API and Spark SQL to improve performance (a representative sketch follows this list).
- Experience pushing data from Impala to MicroStrategy.
- Created scripts for importing data from DB2 into HDFS/Hive using Sqoop.
- Loaded data from different sources into Hive using Talend.
- Implemented real-time data ingestion using Kafka.
- Developed a data pipeline using Kafka and Storm to store data into HDFS.
- Used all major ETL transformations to load tables through Informatica mappings.
- Worked with sequence files, RC files, map-side joins, bucketing, and partitioning for Hive performance enhancement and storage improvement.
- Developed Pig scripts to parse the raw data, populate staging tables, and store the refined data in partitioned DB2 tables for business analysis.
- Worked on managing and reviewing Hadoop log files; tested and reported defects from an Agile methodology perspective.
- Used Apache Maven extensively while developing MapReduce programs.
- Coordinated with the business for UAT sign-off.
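A minimal sketch of the Hive-to-Spark migration pattern referenced above, in Java; the `claims` table, its columns, and the aggregation are hypothetical placeholders, and Hive support is assumed to be on the classpath:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.sum;

public class ClaimsAggregation {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("claims-aggregation")
                .enableHiveSupport()   // read/write Hive tables directly
                .getOrCreate();

        // Equivalent of: SELECT member_id, SUM(amount) FROM claims GROUP BY member_id
        Dataset<Row> totals = spark.table("claims")
                .groupBy(col("member_id"))
                .agg(sum(col("amount")).alias("total_amount"));

        totals.write().mode("overwrite").saveAsTable("claims_totals");
        spark.stop();
    }
}
```

The DataFrame form lets Spark's optimizer plan the aggregation, which is typically where the performance gain over equivalent Pig or MapReduce code comes from.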
Confidential, Schaumburg, IL
Hadoop Developer
Responsibilities:
- Worked on a Hadoop cluster using big data analytics tools including Pig, Hive, and MapReduce.
- Collected and aggregated large amounts of log data using Apache Flume, staging the data in HDFS for further analysis.
- Worked on debugging and performance tuning of Hive and Pig jobs.
- Worked in an AWS environment developing and deploying custom Hadoop applications.
- Extracted and stored data in DynamoDB to support the Hadoop application.
- Generated pipelines using PySpark and Hive.
- Created HBase tables to store PII data in various formats coming from different portfolios (a write-path sketch follows this list).
- Experience developing Java applications using Spring Boot.
- Involved in loading data from the Linux file system to HDFS.
- Imported and exported data into HDFS and Hive using Sqoop.
- Experience processing unstructured data using Pig and Hive.
- Developed Spark scripts using Python.
- Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs.
- Assisted in monitoring the Hadoop cluster using tools such as Nagios and Ganglia.
- Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
- Developed Docker images and containers and worked with a Docker registry.
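A minimal sketch of the HBase write path mentioned above, using the standard HBase Java client; the `pii_records` table, the `d` column family, and the `portfolio#record-id` row-key scheme are illustrative assumptions:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class PiiWriter {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();  // picks up hbase-site.xml
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("pii_records"))) {
            // Row key combines portfolio and record id so each portfolio's rows sort together
            Put put = new Put(Bytes.toBytes("portfolioA#1001"));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("ssn_hash"),
                          Bytes.toBytes("hashed-placeholder-value"));
            table.put(put);
        }
    }
}
```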
Confidential, Cincinnati, OH
Hadoop Developer
Responsibilities:
- Installed and configured Hadoop MapReduce and HDFS, and developed multiple MapReduce jobs in Java for data cleansing and preprocessing (a cleansing-mapper sketch follows this list).
- Involved in loading data from the UNIX file system to HDFS.
- Installed and configured Hive and wrote Hive UDFs.
- Imported and exported data into HDFS and Hive using Sqoop.
- Used Cassandra CQL and Java APIs to retrieve data from Cassandra tables.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Worked hands-on with the ETL process.
- Handled importing of data from various sources, performed transformations using Hive and MapReduce, and loaded the data into HDFS.
- Extracted the data from Teradata into HDFS using Sqoop.
- Analyzed the data with Hive queries and Pig scripts to understand user behavior segments such as shopping enthusiasts, travelers, and music lovers.
- Exported the patterns analyzed back into Teradata using Sqoop.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Installed the Oozie workflow engine to run multiple Hive jobs.
- Developed Hive queries to process the data and generate data cubes for visualization.
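A minimal sketch of the data-cleansing MapReduce work described above; the five-field CSV layout and the trimming rules are hypothetical examples of the kind of preprocessing involved:

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical example: keep only well-formed 5-field CSV rows, trimming whitespace.
public class CleansingMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private static final int EXPECTED_FIELDS = 5;  // assumed record width
    private final Text out = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",", -1);
        if (fields.length != EXPECTED_FIELDS) {
            return;  // drop malformed rows before downstream Hive loads
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) sb.append(',');
            sb.append(fields[i].trim());
        }
        out.set(sb.toString());
        context.write(out, NullWritable.get());
    }
}
```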
Confidential, West State Street, ID
Java Developer
Responsibilities:
- Developed, tested, and debugged Java, JSP, and EJB components using Eclipse.
- Implemented J2EE standards and MVC2 architecture using the Struts framework.
- Developed web components using JSP, Servlets, and JDBC (a servlet sketch follows this list).
- Handled client-side validations using JavaScript and was involved in integrating various Struts actions in the framework.
- Created use case, class, and sequence diagrams for application analysis and design.
- Implemented Servlets, JSP, and Ajax to design the user interface.
- Used JSP, JavaScript, HTML5, and CSS for manipulating, validating, and customizing error messages in the user interface.
- Used JBoss for EJB and JTA, and for caching and clustering purposes.
- Used EJBs (session beans) to implement business logic, JMS to send updates to various other applications, and MDBs to route priority requests.
- Wrote SOAP web services for sending data to and receiving data from the external interface.
- Used XSL/XSLT for transforming and displaying reports; developed XML schemas.
- Developed web-based reporting for a monitoring system with HTML and Tiles using the Struts framework.
- Used design patterns such as Business Delegate, Service Locator, Model-View-Controller (MVC), Session Facade, and DAO.
- Involved in fixing defects and unit testing with JUnit test cases.
- Developed stored procedures and triggers in PL/SQL.
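A minimal sketch of the servlet pattern described above, with server-side validation before forwarding to a JSP view; the `AccountLookupServlet` name, the `accountId` parameter, and the JSP path are hypothetical:

```java
import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Hypothetical example: validate a request parameter, then forward to a JSP view.
public class AccountLookupServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String accountId = req.getParameter("accountId");
        // Server-side check mirrors the client-side JavaScript validation
        if (accountId == null || !accountId.matches("\\d+")) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST, "accountId must be numeric");
            return;
        }
        req.setAttribute("accountId", accountId);
        req.getRequestDispatcher("/WEB-INF/jsp/account.jsp").forward(req, resp);
    }
}
```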
Confidential
Java Developer
Responsibilities:
- Implemented server-side programs using Servlets and JSP.
- Designed, developed, and validated the user interface using HTML, JavaScript, XML, and CSS.
- Implemented MVC using the Struts framework.
- Handled database access by implementing a controller servlet.
- Implemented PL/SQL stored procedures and triggers.
- Used JDBC prepared statements, called from Servlets, for database access (a DAO sketch follows this list).
- Designed and documented the stored procedures.
- Made wide use of HTML for web-based design.
- Worked on the database interaction layer, updating and retrieving data from the Oracle database by writing stored procedures.
- Used Spring framework dependency injection and integration with Hibernate; involved in writing JUnit test cases.
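A minimal sketch of the JDBC prepared-statement access described above; the `OrderDao` class, the `orders` table, the Oracle connection string, and the credential handling are all illustrative assumptions:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical example: a DAO method a servlet could call for Oracle access.
public class OrderDao {
    private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/ORCL";  // assumed

    public String findStatus(String orderId) throws SQLException {
        try (Connection conn = DriverManager.getConnection(URL, "app_user",
                                                           System.getenv("DB_PASSWORD"));
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT status FROM orders WHERE order_id = ?")) {
            ps.setString(1, orderId);  // bind variable avoids SQL injection
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("status") : null;
            }
        }
    }
}
```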