Hadoop Developer Resume
Columbus, GA
SUMMARY:
- 5+ years of overall IT experience across a variety of industries, including hands-on experience in Big Data technologies.
- 3 years of comprehensive experience in Big Data processing using Apache Hadoop and its ecosystem (MapReduce, Pig, Hive, Sqoop, HBase, Spark, NoSQL, Oozie, Kafka, ZooKeeper, and Flume).
- In-depth understanding and knowledge of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, and NodeManager.
- Knowledge of testing with Big Data technologies such as Hadoop, MapReduce, Hive, Pig, HBase, Kafka, and Spark.
- Hands-on experience installing, configuring, and testing ecosystem components such as Hadoop MapReduce, HDFS, HBase, ZooKeeper, Oozie, Hive, HDP, Cassandra, Sqoop, Pig, and Flume.
- Experience in analyzing data using HiveQL, Pig Latin, HBase, and custom MapReduce programs in Java.
- Good experience in importing and exporting data between HDFS and relational database management systems using Sqoop.
- Experience in preparing test plans and executing test cases.
- Good experience and knowledge of testing processes for Hadoop-based application design and implementation.
- Good knowledge of Java for MapReduce testing.
- Experience in developing Pig Latin scripts and Hive queries.
- Experience in scripting for automation and monitoring using Python.
- Good knowledge of programming Spark using Scala; experienced with Spark SQL, Spark Streaming, and complex analytics using Spark over Cloudera Hadoop YARN.
- Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
- Sound knowledge of workflow scheduling and cluster coordination tools such as Oozie and ZooKeeper, and of messaging with Kafka.
- Experienced in using ZooKeeper and Oozie operational services for coordinating the cluster and scheduling workflows.
- Worked extensively with dimensional modeling, data migration, data cleansing, data profiling, and ETL processes for data warehouses.
- Expertise in writing MapReduce jobs in Java for processing large sets of structured, semi-structured, and unstructured data and storing the results in HDFS; a minimal sketch follows this summary.
- Experience working on NoSQL databases such as HBase.
- Strong knowledge and understanding of open-source software and network controllers.
- Ability to work effectively with associates at all levels within the organization.
- Strong background in mathematics with very good analytical and problem-solving skills.
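As a hedged illustration of the MapReduce-in-Java experience noted above, the following is a minimal sketch of a job that counts records per state in semi-structured, tab-delimited log lines. The class name (StateCountJob) and the assumption that the state code sits in the third column are illustrative only, not details of any specific project.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical example: count log records per state from tab-delimited input.
public class StateCountJob {

    public static class StateMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text state = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t");
            if (fields.length > 2) {               // skip malformed lines
                state.set(fields[2]);              // assumed: state code in third column
                context.write(state, ONE);
            }
        }
    }

    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "state-count");
        job.setJarByClass(StateCountJob.class);
        job.setMapperClass(StateMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}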
TECHNICAL SKILLS:
Hadoop Technologies: HDFS, MapReduce, Hive, HBase, Pig, Sqoop, Flume, Oozie, Cassandra, YARN, Apache Spark, Impala, Kafka.
Hadoop Distribution: Cloudera CDH, Hortonworks HDP.
Programming Languages: Core Java, Python, SQL, C, HTML.
Database Systems: Oracle, MySQL, HBase, Cassandra
IDE Tools: Eclipse, NetBeans, IntelliJ
Monitoring Tools: Ambari, Cloudera Manager
Operating Systems: Windows, Linux, UNIX
PROFESSIONAL EXPERIENCE:
Confidential, Columbus, GA
Hadoop Developer
Responsibilities:
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Experienced in working with different Hadoop ecosystem components such as HDFS, MapReduce, HBase, Spark, YARN, Kafka, ZooKeeper, Pig, Hive, Sqoop, Storm, Oozie, Impala, and Flume.
- Importing and exporting data into HDFS from Relational databases and vice versa using Sqoop.
- In-depth understanding and knowledge of Hadoop architecture and its components, such as HDFS, MapReduce, JobTracker, TaskTracker, NameNode, DataNode, ResourceManager, and NodeManager.
- Created partitions and bucketing by state in Hive to handle structured data (see the Spark SQL sketch after this job's environment line).
- Implemented dashboards backed by HiveQL queries, including aggregation functions, basic Hive operations, and different kinds of join operations.
- Used Pig for three distinct workloads: pipelines, iterative processing, and research.
- Involved in moving all log files generated from various sources to HDFS for further processing through Kafka and Flume.
- Extensively used Pig to communicate with Hive via HCatalog and with HBase via storage handlers.
- Implemented MapReduce jobs to write data into Avro format (a minimal driver sketch follows this job's environment line).
- Created Hive tables to store the processed results in a tabular format.
- Implemented Spark using Scala and Spark SQL for faster processing and testing of data.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Implemented various MapReduce jobs in custom environments and loaded the results into HBase tables by generating Hive queries.
- Performed Sqoop operations for various file transfers through HBase tables for processing data into MongoDB.
- Involved in converting Hive/SQL queries into Spark transformations using Spark RDDs and Scala; good experience with spark-shell and Spark Streaming.
- Evaluated Oozie for workflow orchestration in the automation of MapReduce, Pig, and Hive jobs.
- Created tables, secondary indexes, join indexes in Teradata development Environment for testing.
- Extracted data from other databases through Sqoop, placed it in HDFS, and processed it.
- Captured the data logs from web server into HDFS using Flume & Splunk for analysis.
- Experienced in writing Pig scripts and Pig UDFs to pre-process the data for analysis.
Environment: HDFS, Hive, Pig, MapReduce, CDH, Spark, AVRO, Sqoop, Oozie, Flume, Teradata, Kafka, Scala, HBase, SQL, Talend, Java, Unix.
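The Avro-output MapReduce work mentioned above can be sketched as a minimal, map-only driver using the org.apache.avro.mapreduce API. The EventRecord schema, field layout (comma-delimited id and payload), and class names are illustrative assumptions, not details from the original project.

import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.mapred.AvroKey;
import org.apache.avro.mapreduce.AvroJob;
import org.apache.avro.mapreduce.AvroKeyOutputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical map-only job that converts delimited text records to Avro files.
public class TextToAvroJob {

    // Illustrative Avro schema: an "EventRecord" with two string fields.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"EventRecord\",\"fields\":["
        + "{\"name\":\"id\",\"type\":\"string\"},"
        + "{\"name\":\"payload\",\"type\":\"string\"}]}");

    public static class ToAvroMapper
            extends Mapper<LongWritable, Text, AvroKey<GenericRecord>, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] parts = value.toString().split(",", 2);
            if (parts.length == 2) {                      // skip malformed lines
                GenericRecord record = new GenericData.Record(SCHEMA);
                record.put("id", parts[0]);
                record.put("payload", parts[1]);
                context.write(new AvroKey<>(record), NullWritable.get());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "text-to-avro");
        job.setJarByClass(TextToAvroJob.class);
        job.setMapperClass(ToAvroMapper.class);
        job.setNumReduceTasks(0);                          // map-only conversion
        AvroJob.setOutputKeySchema(job, SCHEMA);           // write Avro container files
        job.setOutputValueClass(NullWritable.class);
        job.setOutputFormatClass(AvroKeyOutputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}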
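Likewise, the Hive partitioning/bucketing and the Hive-to-Spark conversion work can be illustrated with a small Spark SQL sketch in Java. The table and column names (customer_txn, state, customer_id) are placeholders, and the sketch assumes the table already exists in the Hive metastore.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

// Hypothetical sketch: a Hive-managed table, partitioned by state and bucketed by
// customer id, queried through Spark SQL instead of a plain Hive/MapReduce job.
//
// Illustrative Hive DDL (run in Hive/beeline):
//   CREATE TABLE customer_txn (txn_id STRING, customer_id STRING, amount DOUBLE)
//   PARTITIONED BY (state STRING)
//   CLUSTERED BY (customer_id) INTO 16 BUCKETS
//   STORED AS ORC;
public class HiveOnSparkSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hive-on-spark-sketch")
                .enableHiveSupport()          // reuse the existing Hive metastore
                .getOrCreate();

        // A HiveQL aggregation expressed as a Spark SQL query.
        Dataset<Row> totalsByState = spark.sql(
                "SELECT state, COUNT(*) AS txn_count, SUM(amount) AS total_amount "
              + "FROM customer_txn GROUP BY state");

        totalsByState.show(20, false);
        spark.stop();
    }
}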
Confidential, Charlotte, NC
Hadoop Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Handling structured, semi-structured, and unstructured data.
- Worked extensively with Sqoop for importing and exporting the data from HDFS to Relational Database systems and vice-versa.
- Developed simple MapReduce jobs using Hive and Pig.
- Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Extensively used Pig for data cleansing and created partitioned tables in Hive.
- Managed and reviewed Hadoop log files.
- Worked with Hadoop MapReduce and HDFS; developed multiple MapReduce jobs in Java for data cleaning and processing (see the mapper sketch after this job's environment line).
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Developed Pig Latin scripts to extract the data from the web server output files to load into HDFS.
- Integrated Hive and HBase to perform queries using Impala.
- Responsible for managing data coming from different sources.
- Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
- Mentored the analyst and test teams in writing Hive queries.
Environment: Hadoop, MapReduce, HDFS, Hive, HBase, Sqoop, Impala, Java (jdk1.6), Pig, Flume, Oracle 11/10g, MySQL, Eclipse, Java, Shell Scripting, SQL Developer, Putty, XML/HTML.
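A minimal sketch of the kind of data-cleaning mapper referenced above; the field layout and filtering rules (pipe-delimited records, dropping rows with a missing key) are hypothetical assumptions for illustration.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical cleaning mapper: trims fields, drops malformed or keyless rows,
// and re-emits records as normalized pipe-delimited lines (used in a map-only job).
public class CleanRecordsMapper extends Mapper<LongWritable, Text, Text, NullWritable> {

    private final Text cleaned = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\\|", -1);
        if (fields.length < 3 || fields[0].trim().isEmpty()) {
            return;                                     // skip malformed / keyless rows
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                out.append('|');
            }
            out.append(fields[i].trim().toLowerCase()); // normalize whitespace and case
        }
        cleaned.set(out.toString());
        context.write(cleaned, NullWritable.get());
    }
}

Wired into a map-only Job (setNumReduceTasks(0)) with text input and output formats, a mapper like this drops bad rows before the data is loaded into downstream Hive tables.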
Confidential
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, HBase, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
- Involved in loading data from LINUX file system to HDFS.
- Worked on installing the cluster, commissioning and decommissioning DataNodes, NameNode recovery, capacity planning, and slots configuration.
- Created HBase tables to store data in variable formats coming from different portfolios.
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Implemented best income logic using Pig scripts and UDFs (a minimal UDF sketch follows this job's environment line).
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system and Hadoop updates, patches, and version upgrades as required.
- Responsible for managing data coming from various sources.
- Involved in loading data from UNIX file system to HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Provided cluster coordination services through ZooKeeper.
- Experience in managing and reviewing Hadoop log files.
- Job management using the Fair Scheduler.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large data sets to determine the optimal way to aggregate and report on them.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
Environment: Hadoop, HDFS, Pig, Zookeeper, Sqoop, HBase, Shell Scripting, Ubuntu, Linux Red Hat.
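The Pig UDF work above can be illustrated with a minimal Java EvalFunc sketch; the name and logic (NormalizeIncome, parsing a currency string into a Double) are hypothetical stand-ins for the actual business rule.

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;

// Hypothetical Pig UDF: strips currency symbols and commas from a field and
// returns it as a Double, or null when the value cannot be parsed.
public class NormalizeIncome extends EvalFunc<Double> {

    @Override
    public Double exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0 || input.get(0) == null) {
            return null;
        }
        String raw = input.get(0).toString().replaceAll("[$,\\s]", "");
        try {
            return Double.valueOf(raw);
        } catch (NumberFormatException e) {
            return null;   // bad records are passed through as null
        }
    }
}

Registered from a jar in a Pig script and wrapped with DEFINE, a UDF like this would be applied inside a FOREACH ... GENERATE statement during cleansing.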
Confidential
Jr. Java Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC).
- Involved in designing UML Use case diagrams, Class diagrams, and Sequence diagrams using Rational Rose.
- Followed agile methodology and SCRUM meetings to track, optimize and tailored features to customer needs.
- Developed the user interface using JSP, JSP tag libraries, and JavaScript to simplify the complexities of the application.
- Developed a Dojo based front end including forms and controls and programmed event handling.
- Created Action Classes which route submittals to appropriate EJB components and render retrieved information.
- Used Core java and object-oriented concepts.
- Used JDBC to connect to backend databases, Oracle and SQL Server 2005 (a minimal sketch follows this job's environment line).
- Proficient in writing SQL queries and stored procedures for multiple databases, Oracle and SQL Server 2005.
- Wrote stored procedures using PL/SQL. Performed query optimization to achieve faster indexing and make the system more scalable.
- Deployed the application on Windows using IBM WebSphere Application Server.
- Used Java Messaging Services (JMS) for reliable and asynchronous exchange of important information such as payment status report.
- Used Web Services - WSDL and REST for getting credit card information from third party.
- Used ANT scripts to build the application and deployed it on WebSphere Application Server.
Environment: Core Java, J2EE, Oracle, SQL Server, JSP, JDK, JavaScript, HTML, CSS, Web Services, Windows.
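A minimal sketch of the JDBC access pattern described above; the connection URL, credentials, and procedure name (get_payment_status) are placeholders, not details from the original system.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Types;

// Hypothetical JDBC sketch: connect to Oracle and call a PL/SQL stored procedure.
public class PaymentStatusDao {

    public static void main(String[] args) throws Exception {
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";   // placeholder URL
        try (Connection conn = DriverManager.getConnection(url, "app_user", "app_pass");
             CallableStatement call = conn.prepareCall("{call get_payment_status(?, ?)}")) {
            call.setString(1, "ORDER-1001");              // IN: order id (illustrative)
            call.registerOutParameter(2, Types.VARCHAR);  // OUT: payment status
            call.execute();
            System.out.println("Status: " + call.getString(2));
        }
    }
}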