
Hadoop Developer Resume


Columbus, GA

SUMMARY:

  • 5+ years of overall IT experience in a variety of industries, including hands-on experience with Big Data technologies.
  • 3 years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, HBase, Spark, NoSQL, Oozie, Kafka, ZooKeeper and Flume).
  • In-depth understanding and knowledge of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager and NodeManager.
  • Knowledge on testing with Big Data Technologies like Hadoop, MapReduce, Hive, Pig, HBase, Kafka and Spark.
  • Hands-on experience in installing, configuring and testing ecosystem components like Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, HDP, Cassandra, Sqoop, Pig, Flume.
  • Experience in analyzing data using HiveQL, Pig Latin, HBase and custom MapReduce programs in Java.
  • Good experience in importing and exporting data between HDFS and Relational Database Management Systems using Sqoop.
  • Experience in preparing test plans and executing the test cases.
  • Good experience in testing processes for Hadoop-based application design and implementation.
  • Good knowledge of Java for MapReduce testing.
  • Experience in developing Pig Latin scripts and Hive queries.
  • Experience in scripting for automation and monitoring using Python.
  • Good knowledge of Spark programming in Scala; experienced with Spark SQL, Spark Streaming and complex analytics using Spark over Cloudera Hadoop YARN.
  • Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
  • Sound knowledge of workflow scheduling and coordination tools such as Oozie and ZooKeeper, and of messaging with Kafka.
  • Experienced in using ZooKeeper and Oozie operational services for coordinating the cluster and scheduling workflows.
  • Worked extensively with Dimensional modeling, Data migration, Data cleansing, Data profiling, and ETL Processes features for data warehouses.
  • Expertise in writing MapReduce jobs in Java for processing large sets of structured, semi-structured and unstructured data and storing the results in HDFS (a minimal sketch follows this list).
  • Experience working on NoSQL databases such as HBase.
  • Strong knowledge of open-source technologies and network controllers.
  • Ability to work effectively with associates at all levels within the organization.
  • Strong background in mathematics, with very good analytical and problem-solving skills.
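For illustration, below is a minimal sketch of the kind of Java MapReduce job referenced in this list: a word count over text files in HDFS. The class name and the input/output path arguments are generic placeholders, not code from any of the projects described later.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emit (word, 1) for every token in the input split
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sum the counts emitted for each word
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);   // combine locally before the shuffle
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged as a jar, a job like this would typically be submitted with `hadoop jar wordcount.jar WordCount /input /output`.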

TECHNICAL SKILLS:

Hadoop Technologies: HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Flume, Oozie, Cassandra, YARN, Apache Spark, Impala, Kafka.

Hadoop Distributions: Cloudera CDH, Hortonworks HDP.

Programming Languages: Core Java, Python, SQL, C, HTML.

Database Systems: Oracle, MySQL, HBase, Cassandra

IDE Tools: Eclipse, NetBeans, IntelliJ

Monitoring Tools: Ambari, Cloudera Manager

Operating Systems: Windows, Linux, UNIX

PROFESSIONAL EXPERIENCE:

Confidential, Columbus, GA

Hadoop Developer

Responsibilities:

  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Experienced in working with different Hadoop ecosystem components such as HDFS, MapReduce, HBase, Spark, YARN, Kafka, ZooKeeper, Pig, Hive, Sqoop, Storm, Oozie, Impala and Flume.
  • Importing and exporting data into HDFS from Relational databases and vice versa using Sqoop.
  • In-depth understanding and knowledge of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager and NodeManager.
  • Created partitioned and bucketed tables in Hive (partitioned by state) to handle structured data; a DDL sketch appears after this list.
  • Implemented dashboards backed by HiveQL queries, including aggregation functions, basic Hive operations and different kinds of join operations.
  • Used Pig for three distinct workloads: pipelines, iterative processing and research.
  • Involved in moving all log files generated from various sources to HDFS for further processing through Kafka and Flume.
  • Extensively used Pig to communicate with Hive via HCatalog and with HBase via storage handlers.
  • Implemented MapReduce jobs to write data into Avro format.
  • Created Hive tables to store the processed results in a tabular format.
  • Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Implemented various MapReduce jobs in custom environments and loaded their output into HBase tables via generated Hive queries.
  • Performed Sqoop operations for various file transfers through HBase tables, moving processed data into MongoDB.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala; good experience using the Spark shell and Spark Streaming (see the Spark SQL sketch after this list).
  • Evaluated Oozie for workflow orchestration in the automation of MapReduce jobs, Pig and Hive jobs.
  • Created tables, secondary indexes and join indexes in the Teradata development environment for testing.
  • Extracted files from other databases through Sqoop, placed them in HDFS and processed them.
  • Captured the data logs from web server into HDFS using Flume & Splunk for analysis.
  • Experienced in writing Pig scripts and Pig UDFs to pre-process the data for analysis.
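As a sketch of the partitioned and bucketed Hive tables mentioned above, the Java snippet below issues the DDL and a dynamic-partition load over HiveServer2 JDBC. The table names (claims_by_state, claims_staging), columns, bucket count and connection details are illustrative assumptions, not the project's actual schema.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class HivePartitionedTableSetup {
  public static void main(String[] args) throws Exception {
    // HiveServer2 JDBC endpoint; host, port and credentials are placeholders
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    String url = "jdbc:hive2://hive-server:10000/default";

    try (Connection conn = DriverManager.getConnection(url, "hive", "");
         Statement stmt = conn.createStatement()) {

      // Partition by state and bucket by customer id so state-filtered
      // queries only scan the relevant partitions
      stmt.execute(
          "CREATE TABLE IF NOT EXISTS claims_by_state ("
        + "  claim_id STRING, customer_id STRING, amount DOUBLE) "
        + "PARTITIONED BY (state STRING) "
        + "CLUSTERED BY (customer_id) INTO 16 BUCKETS "
        + "STORED AS ORC");

      // Dynamic-partition insert from a staging table populated via Sqoop
      stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict");
      stmt.execute(
          "INSERT OVERWRITE TABLE claims_by_state PARTITION (state) "
        + "SELECT claim_id, customer_id, amount, state FROM claims_staging");
    }
  }
}
```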
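The Hive-to-Spark conversion noted above was done in Scala on the project; the sketch below shows the same idea using Spark's Java API for consistency with the other examples, reading a Hive table through a SparkSession with Hive support and writing the aggregate back for the BI layer. The application name and table names are assumed for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SparkHiveQuery {
  public static void main(String[] args) {
    // enableHiveSupport() lets Spark read tables registered in the Hive metastore
    SparkSession spark = SparkSession.builder()
        .appName("hive-to-spark")
        .enableHiveSupport()
        .getOrCreate();

    // The same aggregation that previously ran as a HiveQL job,
    // now executed by Spark SQL over YARN
    Dataset<Row> totals = spark.sql(
        "SELECT state, SUM(amount) AS total_amount "
      + "FROM claims_by_state GROUP BY state");

    // Persist the result back as a Hive table for reporting
    totals.write().mode("overwrite").saveAsTable("claim_totals_by_state");

    spark.stop();
  }
}
```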

Environment: HDFS, Hive, Pig, MapReduce, CDH, Spark, AVRO, Sqoop, Oozie, Flume, Teradata, Kafka, Scala, HBase, SQL, Talend, Java, Unix.

Confidential, Charlotte, NC

Hadoop Developer

Responsibilities:

  • Responsible for building scalable distributed data solutions using Hadoop.
  • Handling structured, semi-structured and unstructured data.
  • Worked extensively with Sqoop for importing and exporting the data from HDFS to Relational Database systems and vice-versa.
  • Developed Simple MapReduce Jobs using Hive and Pig.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Extensively used Pig for data cleansing and created partitioned tables in Hive.
  • Managed and reviewed Hadoop log files.
  • Worked with Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing (a map-only cleaning sketch follows this list).
  • Involved in creating Hive tables, loading them with data and writing Hive queries that run internally as MapReduce jobs.
  • Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
  • Integrated Hive and HBase to perform queries using Impala.
  • Responsible to manage data coming from different sources.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Mentored analysts and the test team in writing Hive queries.
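Below is a minimal sketch of a data-cleaning MapReduce job of the kind described above: a map-only job that drops blank, comment and malformed web-server log lines before downstream processing. The expected field count and tab delimiter are assumptions made for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LogCleaner {

  // Map-only job: keep well-formed, non-empty log lines and drop the rest
  public static class CleanMapper extends Mapper<LongWritable, Text, Text, NullWritable> {
    private static final int EXPECTED_FIELDS = 7;  // assumed layout of the log line

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString().trim();
      if (line.isEmpty() || line.startsWith("#")) {
        return;  // skip blank lines and comment/header lines
      }
      if (line.split("\\t").length == EXPECTED_FIELDS) {
        context.write(new Text(line), NullWritable.get());
      }
      // malformed records are silently dropped; a counter could track them
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "log cleaning");
    job.setJarByClass(LogCleaner.class);
    job.setMapperClass(CleanMapper.class);
    job.setNumReduceTasks(0);  // map-only: cleaned lines go straight back to HDFS
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(NullWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```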

Environment: Hadoop, MapReduce, HDFS, Hive, HBase, Sqoop, Impala, Java (JDK 1.6), Pig, Flume, Oracle 11g/10g, MySQL, Eclipse, Shell Scripting, SQL Developer, PuTTY, XML/HTML.

Confidential

Hadoop Developer

Responsibilities:

  • Worked on analyzing Hadoop cluster and different big data analytic tools including Pig, HBase database and Sqoop.
  • Responsible for building scalable distributed data solutions using Hadoop.
  • Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
  • Involved in loading data from LINUX file system to HDFS.
  • Worked on installing cluster, commissioning & decommissioning of data nodes, name node recovery, capacity planning, and slots configuration.
  • Created HBase tables to store variable data formats of data coming from different portfolios.
  • Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
  • Implemented best-income logic using Pig scripts and UDFs (a hypothetical Java UDF sketch follows this list).
  • Implemented test scripts to support test driven development and continuous integration.
  • Worked on tuning the performance of Pig queries.
  • Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
  • Responsible to manage data coming from various sources.
  • Involved in loading data from UNIX file system to HDFS.
  • Loaded and transformed large sets of structured, semi-structured and unstructured data.
  • Provided cluster coordination services through ZooKeeper.
  • Experience in managing and reviewing Hadoop log files.
  • Managed jobs using the Fair Scheduler.
  • Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
  • Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
  • Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
  • Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
  • Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
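Pig UDFs such as the ones mentioned above are written in Java against Pig's EvalFunc API; the sketch below is a hypothetical example of that pattern, returning the larger of two income fields per record. The class name and field semantics are illustrative only, not the project's actual logic.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical UDF illustrating per-record logic pushed into Pig:
// returns the larger of two income figures on each record.
public class BestIncome extends EvalFunc<Double> {

  @Override
  public Double exec(Tuple input) throws IOException {
    if (input == null || input.size() < 2) {
      return null;  // let Pig treat unusable records as null
    }
    Object reportedField = input.get(0);
    Object verifiedField = input.get(1);
    if (reportedField == null || verifiedField == null) {
      return null;
    }
    double reported = ((Number) reportedField).doubleValue();
    double verified = ((Number) verifiedField).doubleValue();
    return Math.max(reported, verified);
  }
}
```

In a Pig Latin script, a UDF like this would be packaged into a jar, loaded with REGISTER, and invoked per record inside a FOREACH ... GENERATE statement.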

Environment: Hadoop, HDFS, Pig, Zookeeper, Sqoop, HBase, Shell Scripting, Ubuntu, Linux Red Hat.

Confidential

Jr. Java Developer

Responsibilities:

  • Involved in design and development phases of Software Development Life Cycle (SDLC).
  • Involved in designing UML Use case diagrams, Class diagrams, and Sequence diagrams using Rational Rose.
  • Followed agile methodology and SCRUM meetings to track, optimize and tailored features to customer needs.
  • Developed the user interface using JSP, JSP tag libraries and JavaScript to simplify the complexities of the application.
  • Developed a Dojo based front end including forms and controls and programmed event handling.
  • Created Action Classes which route submittals to appropriate EJB components and render retrieved information.
  • Used Core java and object-oriented concepts.
  • Used JDBC to connect to backend databases, Oracle and SQL Server 2005 (a minimal DAO sketch follows this list).
  • Proficient in writing SQL queries and stored procedures for multiple databases, Oracle and SQL Server 2005.
  • Wrote stored procedures using PL/SQL. Performed query optimization to achieve faster indexing and make the system more scalable.
  • Deployed the application on Windows using IBM WebSphere Application Server.
  • Used Java Messaging Services (JMS) for reliable and asynchronous exchange of important information such as payment status report.
  • Used Web Services - WSDL and REST - for getting credit card information from a third party.
  • Used ANT scripts to build the application and deployed it on WebSphere Application Server.
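Below is a minimal sketch of the JDBC access pattern described above: querying a table with a PreparedStatement and invoking a PL/SQL stored procedure through a CallableStatement. The connection URL, credentials, table name (payments) and procedure name (update_payment_status) are placeholders, not the application's actual schema.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Types;

public class PaymentStatusDao {

  // Placeholder connection details; real values would come from the app server's configuration
  private static final String URL = "jdbc:oracle:thin:@db-host:1521:ORCL";
  private static final String USER = "app_user";
  private static final String PASSWORD = "secret";

  // Look up the current status of a payment with a parameterized query
  public String findStatus(String paymentId) throws Exception {
    try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
         PreparedStatement ps = conn.prepareStatement(
             "SELECT status FROM payments WHERE payment_id = ?")) {
      ps.setString(1, paymentId);
      try (ResultSet rs = ps.executeQuery()) {
        return rs.next() ? rs.getString("status") : null;
      }
    }
  }

  // Call a PL/SQL stored procedure that updates the status and returns the previous value
  public String updateStatus(String paymentId, String newStatus) throws Exception {
    try (Connection conn = DriverManager.getConnection(URL, USER, PASSWORD);
         CallableStatement cs = conn.prepareCall("{call update_payment_status(?, ?, ?)}")) {
      cs.setString(1, paymentId);
      cs.setString(2, newStatus);
      cs.registerOutParameter(3, Types.VARCHAR);
      cs.execute();
      return cs.getString(3);
    }
  }
}
```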

Environment: Core Java, J2EE, Oracle, SQL Server, JSP, JDK, JavaScript, HTML, CSS, Web Services, Windows.
