
Sr. Hadoop Developer Resume


Austin, TX

SUMMARY

  • About 9 years of experience in application analysis, design, development, maintenance, and support of web and client-server applications in Java/J2EE technologies, including 6+ years with Big Data and Hadoop components such as HDFS, MapReduce, Pig, Hive, YARN, Sqoop, Flume, Spark, Scala, and Kafka.
  • Experience with multiple Hadoop distributions, including Cloudera and Hortonworks.
  • Excellent understanding of NoSQL databases like HBase, Cassandra, and MongoDB.
  • Experience working with structured and unstructured data in various file formats, such as Avro, XML, JSON, and sequence files, using MapReduce programs.
  • Work experience with cloud platforms such as Amazon Web Services (AWS).
  • Implemented custom business logic and performed join optimization, secondary sorting, and custom sorting using MapReduce programs.
  • Experienced in testing and running MapReduce pipelines.
  • Expertise in data ingestion using Sqoop, Apache Kafka, Spark Streaming, and Flume.
  • Implemented business logic using Pig scripts and wrote custom Pig UDFs to analyze data.
  • Performed ETL operations using Pig to join, clean, aggregate, and analyze data.
  • Hands-on experience fetching live stream data from DB2 into HBase tables using Spark Streaming and Apache Kafka.
  • Experience in designing, configuring, and installing DataStax Cassandra.
  • Good understanding of Conceptual, Logical and Physical Data Modeling.
  • Experience with the Oozie workflow engine to automate and parallelize Hadoop MapReduce and Pig jobs.
  • Extensive experience writing SQL queries in HiveQL to perform analytics on data.
  • Experience in performing data validation using Hive dynamic partitioning and bucketing.
  • Experienced in importing and exporting data between RDBMSs such as Teradata and HDFS using Sqoop.
  • Experienced in handling streaming data, such as web server logs, using Flume.
  • Good knowledge of analyzing data with Python scripting for Hadoop Streaming.
  • Worked with Spark DataFrames, Spark SQL, and Spark MLlib extensively.
  • Experience in implementing Spark using Scala and Spark SQL for faster data processing.
  • Extensive hands-on experience accessing and performing CRUD operations against HBase data using the Java API and implementing time-series data management (a minimal sketch follows this list).
  • Hands-on experience with message brokers such as Apache Kafka.
  • Involved in planning different stages of migrating data from RDBMS to Cassandra.
  • Expertise in benchmarking and load testing a Cassandra cluster using the cassandra-stress tool.
  • Involved in various data mining tasks such as pattern mining, classification, and clustering.
  • Experienced in J2EE, Spring, Hibernate, SOAP/REST web services, JMS, JNDI, and EJB.
  • Expertise with application servers and web servers such as Oracle WebLogic, IBM WebSphere, Apache Tomcat, JBoss, and VMware.
  • Experienced in developing unit test cases using MRUnit and JUnit.
  • Experience using Maven and Ant for build automation.
  • Experience working in environments using Agile (Scrum) and Waterfall methodologies.
  • Expertise in database modeling, administration, and development using SQL and PL/SQL in Oracle (8i, 9i, and 10g), MySQL, DB2, and SQL Server environments.
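
A minimal sketch of the HBase CRUD access pattern described above, using the standard HBase client Java API; the table, column family, and row-key scheme are illustrative only:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.*;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseCrudSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("sensor_events"))) { // hypothetical table

                // Create/update: a row key with a reversed timestamp is a common
                // time-series pattern so the most recent events sort first.
                String rowKey = "device42#" + (Long.MAX_VALUE - System.currentTimeMillis());
                Put put = new Put(Bytes.toBytes(rowKey));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("temp"), Bytes.toBytes("21.5"));
                table.put(put);

                // Read the cell back.
                Result result = table.get(new Get(Bytes.toBytes(rowKey)));
                System.out.println(Bytes.toString(
                    result.getValue(Bytes.toBytes("d"), Bytes.toBytes("temp"))));

                // Delete the row.
                table.delete(new Delete(Bytes.toBytes(rowKey)));
            }
        }
    }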

TECHNICAL SKILLS

Big Data / Hadoop: HDFS / MapReduce / Hive / Pig / HBase / YARN / Sqoop / Flume / Oozie / Scala / Kafka / Apache Spark / Spark SQL / AWS / Talend.

Databases / NoSQL: Cassandra / MongoDB / HBase / Hive / SQL / PL/SQL / Oracle.

Web Technologies: HTML / CSS / AJAX / JavaScript / jQuery.

Web Services: SOAP / REST / XML / XSD.

J2EE Frameworks: Hibernate / Spring / JMS / JSF.

Operating Systems: Windows / Unix / Linux.

Methodologies: Agile, Waterfall.

IDEs / Tools: Eclipse / NetBeans / Microsoft Visio.

Build Tools: Maven / Apache Ant / Log4j.

PROFESSIONAL EXPERIENCE

Confidential, Austin, TX

Sr. Hadoop Developer

Responsibilities:

  • Designed a pipeline to collect, clean, and prepare data for analysis using MapReduce, Spark, Pig, Hive, and HBase, with reporting in Tableau.
  • Developed and implemented a script to send large amounts of data to any HTTP server, configurable by number of users, operations, and date range.
  • Created reports in Tableau from HiveQL query results.
  • Created and modified UDFs and UDAFs for Hive and Pig as needed.
  • Managed and reviewed Hadoop log files to identify issues when jobs fail.
  • Used Apache Kafka to handle log messages consumed by multiple systems.
  • Worked with data staging validation using Talend.
  • Involved in unit testing and integration testing with Hue.
  • Worked with Spark DataFrames, Spark SQL, and Spark MLlib extensively.
  • Worked with the data science team to build various predictive models with Spark MLlib.
  • Involved in installing, configuring, and managing Hadoop ecosystem components such as Hive, Pig, Sqoop, and Flume.
  • Hands-on experience importing and exporting data between relational databases and HDFS using Sqoop.
  • Worked on Impala for creating tables and querying data.
  • Implemented a daily workflow for extraction, processing, and analysis of data with Oozie.
  • Processed source data into structured form and stored it in Cassandra.
  • Explored Spark for improving the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN (a minimal sketch follows this list).
  • Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
  • Involved in creating Hive tables and loading them with data.
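
A minimal sketch of the DataFrame/Spark SQL processing described above, shown with Spark's Java API for consistency with the other sketches; the input path, view name, and columns are hypothetical:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class SparkSqlSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .appName("DailyEventCounts") // hypothetical job name
                .getOrCreate();

            // Load JSON source data into a DataFrame and expose it to SQL.
            Dataset<Row> events = spark.read().json("hdfs:///data/events"); // hypothetical path
            events.createOrReplaceTempView("events");

            // Aggregate with Spark SQL instead of hand-written MapReduce.
            Dataset<Row> daily = spark.sql(
                "SELECT event_date, COUNT(*) AS cnt FROM events GROUP BY event_date");

            daily.write().mode("overwrite").parquet("hdfs:///out/daily_counts");
            spark.stop();
        }
    }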

Environment: Hortonworks HDP, Java, Kafka, Pig, Hive, HDFS, Cassandra, UNIX, Spark, Scala, HBase, HiveQL, AWS, Tableau.

Confidential, Long Beach, CA

Hadoop Developer

Responsibilities:

  • Experience using Avro, Parquet, RCFile, and JSON file formats; developed UDFs in Hive and Pig.
  • Developed a custom MapReduce InputFormat to read a specific data format.
  • Developed and maintained workflow scheduling jobs in Oozie.
  • Used Sqoop to transfer data from external sources to HDFS.
  • Understanding of data storage and retrieval techniques, ETL, and databases, including graph stores, relational databases, tuple stores, NoSQL, Hadoop, Pig, MySQL, and Oracle.
  • Imported and exported data into HDFS and Hive using Sqoop.
  • Experience loading and transforming large sets of structured, semi-structured, and unstructured data.
  • Responsible for loading unstructured and semi-structured data from different sources into the Hadoop cluster using Flume.
  • Worked on different file formats such as XML files, sequence files, CSV, and map files.
  • Continuously monitored and managed the Hadoop cluster using Cloudera Manager.
  • Performed POCs using newer technologies such as Spark, Kafka, and Scala.
  • Created Hive tables, loaded them with data, and wrote Hive queries.
  • Involved in collecting, aggregating, and moving data from servers to HDFS using Apache Flume.
  • Experience in managing and reviewing Hadoop log files.
  • Analyzed web logs using Hadoop tools for operational and security-related activities.
  • Used all of Pig's complex data types for handling data.
  • Developed efficient MapReduce programs in Java for filtering out unstructured data (a minimal sketch follows this list).
  • Supported MapReduce programs running on the cluster.
  • Managed and reviewed Hadoop log files to identify issues when jobs fail.
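
A minimal sketch of a map-only MapReduce filtering job of the kind described above; the "ERROR" predicate and argument paths are illustrative:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class LogFilterJob {
        // Map-only job: keep lines matching the predicate, drop everything else.
        // A pure filter needs no reducer.
        public static class FilterMapper
                extends Mapper<LongWritable, Text, Text, NullWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                if (value.toString().contains("ERROR")) { // illustrative predicate
                    context.write(value, NullWritable.get());
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "log-filter");
            job.setJarByClass(LogFilterJob.class);
            job.setMapperClass(FilterMapper.class);
            job.setNumReduceTasks(0); // map-only
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(NullWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }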

Environment: Hadoop, MapReduce, HDFS, HBase, Hive, Pig, Java, XML, SQL, MySQL, Scala, Sqoop, Oozie.

Confidential, Cupertino, CA

Hadoop Consultant

Responsibilities:

  • Involved in creating Hive tables, loading them with data, and writing Hive queries.
  • Involved in data ingestion into HDFS using Sqoop and Flume from a variety of sources.
  • Developed MapReduce programs to parse raw data, populate tables, and store the refined data in partitioned tables.
  • Installed and configured the Hadoop stack on a 4-node cluster.
  • Experienced in managing and reviewing application log files.
  • Ingested application logs into HDFS and processed them using MapReduce jobs.
  • Created and maintained the Hive warehouse for Hive analysis.
  • Generated test cases for the new MR jobs (a minimal MRUnit sketch follows this list).
  • Led and programmed the recommendation logic for various clustering and classification algorithms using Java.
  • Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
  • Responsible for the design and creation of Hive tables, partitioning, bucketing, loading data, and writing Hive queries.
  • Created HBase tables to store various formats of personally identifiable information coming from different portfolios.
  • Involved in managing and reviewing Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
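
A minimal sketch of an MRUnit test case of the kind generated for these MR jobs (MRUnit is named in the summary above); the mapper under test is a trivial, illustrative word-count mapper:

    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Before;
    import org.junit.Test;

    public class TokenMapperTest {
        // Trivial mapper under test: emits (word, 1) per token. Illustrative only.
        public static class TokenMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    context.write(new Text(token), new IntWritable(1));
                }
            }
        }

        private MapDriver<LongWritable, Text, Text, IntWritable> mapDriver;

        @Before
        public void setUp() {
            mapDriver = MapDriver.newMapDriver(new TokenMapper());
        }

        @Test
        public void emitsOneCountPerToken() throws IOException {
            mapDriver.withInput(new LongWritable(0), new Text("hive pig"))
                     .withOutput(new Text("hive"), new IntWritable(1))
                     .withOutput(new Text("pig"), new IntWritable(1))
                     .runTest();
        }
    }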

Environment: HDFS, Hive, Scala, MapReduce, Storm, Java, HBase, Pig, Sqoop, shell scripts, Oozie, MySQL, Tableau, Eclipse, web services, Oracle 11g/SQL, JDBC, and WebSphere Application Server.

Confidential, Fremont, CA

Software Engineer

Responsibilities:

  • Worked on marshalling and unmarshalling XML using the JiBX parser.
  • Interpreted and modified Spring and Hibernate configuration files.
  • Worked on JMS and Message Queue (MQ) configurations.
  • Designed and developed GUI screens for user interfaces using JSP, JavaScript, XSLT, AJAX, XML, HTML, CSS, and JSON.
  • Configured, designed, implemented, and monitored Kafka clusters and connectors.
  • Generated class diagrams and sequence diagrams extensively for the entire process flow using RAD.
  • Consumed external web services by creating service contracts through WSRR from different development centers.
  • Worked on SOAP-based web services; tested them using SoapUI.
  • Used Jenkins to build the application on the server.
  • Developed documentation for the QA environment.
  • Loaded records from the legacy database into Cassandra (a minimal sketch follows this list).
  • Synchronized creates, updates, and deletes of records between the legacy database and Cassandra.
  • Created stored procedures, SQL statements, and triggers for effective retrieval and storage of data in the database.
  • Developed the application using Agile (Scrum) and iterative processes.
  • Used the Apache Log4j logging API to log errors and messages.
  • Deployed applications in UNIX environments for Dev and QA smoke testing.
  • Unit tested the application using JUnit and EasyMock.
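
A minimal sketch of the legacy-database-to-Cassandra load described above, assuming the DataStax Java driver (3.x API) and a JDBC source; the contact point, keyspace, table, and credentials are hypothetical:

    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;

    public class LegacyToCassandraLoader {
        public static void main(String[] args) throws Exception {
            try (Cluster cluster = Cluster.builder()
                     .addContactPoint("127.0.0.1") // hypothetical contact point
                     .build();
                 Session session = cluster.connect("legacy_ks"); // hypothetical keyspace
                 java.sql.Connection jdbc = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//legacy-host:1521/ORCL", "user", "pass")) { // hypothetical source

                // Prepare the Cassandra insert once, bind per legacy row.
                PreparedStatement insert = session.prepare(
                    "INSERT INTO customers (id, name, email) VALUES (?, ?, ?)");

                // Stream rows from the legacy table into Cassandra.
                ResultSet rows = jdbc.createStatement()
                    .executeQuery("SELECT id, name, email FROM customers");
                while (rows.next()) {
                    session.execute(insert.bind(
                        rows.getLong("id"), rows.getString("name"), rows.getString("email")));
                }
            }
        }
    }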

Environment: JDK, Spring Framework, XML, HTML, Cassandra, JSP, Hibernate, ANT, JavaScript, XSLT, CSS, AJAX, JMS, SOAP web services, WebSphere Application Server, PL/SQL, JUnit, Log4j, shell scripting, UNIX.

Confidential

Java/J2EE Developer

Responsibilities:

  • Developed front-end screens using JSP, HTML, and CSS.
  • Developed server-side code using Struts and Servlets.
  • Developed core Java classes for exceptions, utility classes, business delegates, and test cases.
  • Developed SQL queries against MySQL and established database connectivity (a minimal JDBC sketch follows this list).
  • Worked in Eclipse using the Maven plugin for the Eclipse IDE.
  • Designed the user interface of the application using HTML5, CSS3, JSF 2.0, JSP, and JavaScript.
  • Tested the application functionality with JUnit test cases.
  • Developed all the user interfaces using JSP and the Struts framework.
  • Wrote client-side validations using JavaScript.
  • Extensively used jQuery to develop interactive web pages.
  • Experience developing web services for production systems using SOAP and WSDL.
  • Developed the user interface presentation screens using HTML, XML, and CSS.
  • Experience working with Spring using AOP, IoC, and the JDBC template.
  • Developed shell scripts to trigger the Java batch job and send summary emails with the batch job status and processing summary.
  • Developed the application in the Eclipse IDE and deployed it on a Tomcat server.
  • Supported bug fixes and functionality changes.
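
A minimal sketch of MySQL connectivity of the kind described above, assuming plain JDBC with the MySQL Connector/J driver; the URL, credentials, and table are hypothetical:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class MySqlQuerySketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical connection settings; a real application would read
            // these from configuration rather than hard-coding them.
            String url = "jdbc:mysql://localhost:3306/appdb";
            try (Connection conn = DriverManager.getConnection(url, "appuser", "secret");
                 PreparedStatement stmt = conn.prepareStatement(
                     "SELECT id, status FROM orders WHERE status = ?")) {
                stmt.setString(1, "OPEN");
                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id") + " -> " + rs.getString("status"));
                    }
                }
            }
        }
    }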

Environment: Java, Struts 1.1, Servlets, JSP, HTML, CSS, JavaScript, Eclipse 3.2, Tomcat, Maven, MySQL, Windows and Linux, JUnit.
