Hadoop Developer Resume
SUMMARY
- Around 4 years of professional IT experience, including expertise in Big Data ecosystem technologies.
- Excellent understanding/knowledge of Hadoop architecture and various components such as HDFS, JobTracker, TaskTracker, NameNode, DataNode and MapReduce programming paradigm.
- Hands on experience in installing, configuring, and using Hadoop ecosystem components like Hadoop MapReduce, HDFS, HBase, Zookeeper, Hive, Sqoop, Pig, Oozie, Kafka and Flume.
- Experience in developing custom UDFs for Hive.
- Good knowledge of using Spark for real-time streaming of data into the cluster.
- Hands-on experience with Spark Core and Spark SQL.
- Solid understanding of OLAP concepts and challenges, especially with large data sets and mapping, analysis and documentation of OLAP reports.
- Expertise in developing Pig Latin scripts and using HiveQL and custom MapReduce programs in Java.
- Applied RDBMS concepts to manipulate data and validate results.
- Good knowledge in job scheduling and monitoring through Oozie and ZooKeeper.
- Experience in importing and exporting data using Sqoop between HDFS and relational database systems.
- Capable of designing Spark SQL based on functional specifications.
- Expertise in handling streaming data using Kafka and Flume.
- Good experience as a Java back-end developer, working with relational databases including Oracle and MySQL.
- Experience working with Java and Scala.
- Experience in creating tables, partitioning, bucketing, loading and aggregating data using Hive.
- Experienced in writing procedures, functions and triggers in MySQL.
- Experienced in working with Agile methodology.
- Good understanding of NoSQL databases and hands on work experience in writing applications on NoSQL databases like HBase.
- Expertise in MS Visio for constructing and designing UML diagrams like use case diagrams, class diagrams and sequence diagrams.
- Effectively collaborated with Developers and Analysts to address project requirements, deliverables and any potential road blocks.
- Excellent interpersonal and communication skills, creative, research-minded, technically competent and result-oriented with problem solving and leadership skills.
TECHNICAL SKILLS
Hadoop Ecosystem: HDFS, MapReduce, YARN, Kafka, Flume, Pig, Hive, Apache Spark, Impala, Spark Streaming, Spark SQL, Hue.
Hadoop Distribution: Hortonworks, Cloudera
Hadoop Operational Services: Apache Zookeeper, Oozie
NoSQL Databases: Apache HBase
Cloud Computing Services: AWS (Amazon Web Services), Amazon EC2
Reporting Tool: Tableau
Java & J2EE Technologies: Java, Servlets, JSP, JDBC, Java Beans
IDE Tools: Eclipse, NetBeans
Programming Languages: C, Java, Unix Shell scripting, Scala, SQL, PL/SQL
Databases: Oracle 11g/10g, MS-SQL Server, MySQL
Web Technologies: HTML5, JavaScript, CSS
Web Servers: WebLogic 10.3, WebSphere 6.1, Apache Tomcat 5.5/6.0
Tools: SQL Developer, WinSCP, PuTTY, SOAP UI, MS Visio, JIRA, SharePoint
Version Control Systems: CVS, Tortoise SVN, Git
Operating Systems: Windows 98/2000/XP/Vista/7/8/10, Linux
PROFESSIONAL EXPERIENCE
Confidential, IL
Hadoop Developer
Responsibilities:
- Extracted and updated data in HDFS using the Sqoop import/export command-line utility.
- Involved in streaming data from Kafka topics via Flume into HDFS.
- Implemented monthly partitioning in Hive for each vehicle.
- Enhanced and optimized Spark code to aggregate data and run tasks using the Spark framework.
- Developed jobs for scheduling using Control-M through .sh files.
- Streamed customer emails regarding reviews and complaints into Hadoop via Spark Streaming.
- Developed backend Hadoop dataflows and schemas that serve the vehicle configuration application.
- Used Spark Streaming to consume input from Kafka topics and load it into HDFS and HBase.
- Built Tableau dashboards for a few search-focused applications.
- Used Impala to minimize query response time.
- Rebuilt costing and pricing data daily as the market changed.
- Involved in loading data from UNIX filesystem to HDFS.
- Analyzed large data sets by running Hive queries.
- Implemented Partitioning and Bucketing in Hive to improve the performance.
- Experienced in managing and reviewing Hadoop log files.
- Wrote regular expressions for pattern matching in MapReduce jobs.
- Configured periodic incremental imports of data from MySQL into HDFS and vice versa using Sqoop.
- Prepared technical reports and documentation manuals during the program development.
- Provided 24/7 on-call support in both testing and production environments.
Environment: Hadoop, HDFS, Cloudera, Sqoop, MySQL, Impala, Hive, HBase, Kafka, Flume, Spark, Scala, Spark Streaming, Tableau, UNIX Shell Scripting, Control-M.
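A minimal sketch of the Kafka-to-HDFS streaming path described in this role, using the spark-streaming-kafka-0-10 API. Broker addresses, topic name, group ID, and HDFS paths are illustrative assumptions, not actual project values:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object KafkaToHdfs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaToHdfs")
    val ssc  = new StreamingContext(conf, Seconds(30)) // 30-second micro-batches

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker1:9092",            // hypothetical broker
      "key.deserializer"  -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"          -> "vehicle-telemetry",       // hypothetical consumer group
      "auto.offset.reset" -> "latest"
    )

    // Direct stream over a hypothetical topic
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("vehicle-events"), kafkaParams))

    // Persist each non-empty micro-batch to a time-stamped HDFS directory
    stream.map(_.value).foreachRDD { (rdd, time) =>
      if (!rdd.isEmpty)
        rdd.saveAsTextFile(s"hdfs:///data/raw/vehicle_events/batch-${time.milliseconds}")
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```

The direct stream approach ties Kafka partition offsets to Spark batches, which keeps ingestion restartable without a separate receiver.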
Confidential, CA
Hadoop Developer
Responsibilities:
- Wrote Hive jobs to parse the logs and structured them in tabular format for effective querying.
- Developed multiple MapReduce jobs for data cleaning and preprocessing.
- Worked on setting up signal monitoring by integrating Apache Kafka with Spark Streaming process to consume data from external REST APIs and run custom functions.
- Developed Custom Generic UDAF in Java for calculating the aggregated counts of preferred terms.
- Involved in developing a streaming process using Spark streaming to pull data from an external REST API.
- Developed Spark code and Spark-SQL/Streaming for faster testing and processing of data.
- Worked on setting up the streaming process with Flume for detecting any adverse events.
- Created data movement process and moved large amounts of data from RDBMS to Hadoop and vice-versa using Sqoop.
- Partitioned and bucketed Hive tables to store data efficiently on Hadoop.
- Developed shell scripts and automated end-to-end data management and integration work.
- Developed SQL statements to improve back-end communications.
- Involved in requirement gathering and translated them into technical design in Hadoop.
Environment: Hadoop, Hortonworks, MapReduce, HDFS, Hive, Java, Sqoop, Flume, Spark, Spark Streaming, Kafka, Scala, UNIX Shell Scripting, MySQL.
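A sketch of the aggregation-and-partitioning pattern from this role: counting preferred terms with Spark SQL (standing in for the custom Java UDAF) and writing the result to a date-partitioned Hive table. Database, table, and column names are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PreferredTermCounts {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("PreferredTermCounts")
      .enableHiveSupport() // read/write Hive-managed tables
      .getOrCreate()

    // Hypothetical raw events table populated by the streaming pipeline
    val events = spark.table("raw.adverse_events")

    // Aggregate counts of preferred terms per day
    val counts = events
      .groupBy(col("event_date"), col("preferred_term"))
      .agg(count("*").as("term_count"))

    // Persist as a Hive table partitioned by date for efficient querying
    counts.write
      .mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("curated.preferred_term_counts")
  }
}
```

Partitioning on `event_date` lets Hive and Impala prune to the requested days instead of scanning the whole table.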
Confidential
Java Developer
Responsibilities:
- Developed the application under JEE architecture, designed dynamic and browser compatible user interfaces using JavaScript, HTML5, JSP and CSS.
- Deployed and maintained JSP and Servlet components on Tomcat.
- Implemented storage and retrieval of employee details in the Oracle database, as required by the Administrator, for the employee detail module.
- Developed and utilized J2EE services and JMS components for messaging in WebLogic.
- Configured Hibernate ORM to persist application data into an MS SQL Server database.
- Prepared UML sequence diagrams, activity diagrams and class diagram based on the business requirements using MS Visio tool.
- Implemented and consumed web services using REST and SOAP.
- Coordinated with the QA team and participated in testing.
- Active in the analysis, definition, design, implementation, management, and deployment phases of the full software development life cycle.
- Developed stored procedures to extract data from Oracle database.
- Used Subversion (SVN) for software versioning and as a revision control system.
Environment: Java, JavaScript, Hibernate, HTML5, CVS, Servlets, JSP, MS Visio, PL/SQL, Apache Tomcat, AWS, Oracle 10g.
Confidential
QA
Responsibilities:
- Performed acceptance/stress/functional testing using SOAP UI tool and manual testing techniques.
- Coordinated regression test automation activities across the ERP applications.
- Prepared functional requirement documents, business requirement documents and maintained defect-tracker sheet in project’s SharePoint site.
- Logged and reported defects with concise steps to reproduce in JIRA.
- Monitored and re-tested defects fixed by the development team in JIRA.
- Collaborated with Developers and Analysts to address project requirements, deliverables and any potential road blocks.
Environment: SOAP UI, SharePoint, JavaScript, HTML5, CSS, GitHub, Tableau, JIRA.