Hadoop Big Data/Spark Developer Resume
Charlotte, NC
SUMMARY:
- 6+ years of experience in IT and 3+ years of experience in Hadoop/Big Data ecosystems and Java technologies such as Confidential, MapReduce, Oozie, Impala, Apache Pig, Hive, HBase, Spark, Kafka and Sqoop.
- In-depth knowledge of Hadoop architecture and Hadoop daemons such as Name Node, Secondary Name Node, Data Node, Job Tracker and Task Tracker.
- Hands-on experience in writing ad-hoc queries for moving data from Confidential to Hive and analyzing the data using HiveQL.
- Experience in designing and developing POCs using Scala, deploying them on the YARN cluster, and comparing the performance of Spark with Hive and SQL/Teradata.
- Experience in writing MapReduce programs using Apache Hadoop for analyzing Big Data.
- Experience in writing Hadoop jobs for analyzing data using Pig Latin commands.
- Experience in importing and exporting data using Sqoop between relational database systems and Confidential.
- Working knowledge of NoSQL databases like HBase and Cassandra.
- Good knowledge of Amazon AWS concepts like EMR, EC2, EBS, S3 and RDS web services, which provide fast and efficient processing of Big Data.
- Experience in administrative tasks such as installing Hadoop and its ecosystem components such as Hive and Pig in distributed mode.
- Good knowledge of analyzing data in HBase using Hive and Pig.
- Experience in integrating BI tools like Tableau and pulling the required data into the BI tool's in-memory engine.
- Experience in launching EC2 instances in Amazon EMR using the console.
- Expertise in using Apache NiFi for multiple data transformations before loading to Confidential.
- Extended Hive and Pig core functionality by writing custom UDFs, UDAFs and UDTFs (a minimal UDF sketch follows this list).
- Strong data warehousing ETL experience using the DM Express ETL tool.
- Experience in working on Windows and UNIX/Linux platforms with technologies such as Big Data, SQL, XML, HTML, Core Java and Shell Scripting.
- Experience in using Apache Flume for collecting, aggregating and moving large amounts of data from application servers.
- Passionate about working in Big Data and analytics environments.
- Knowledge of reporting tools like Tableau, used for analytics on data in the cloud.
- Extensive experience with SQL, PL/SQL, Shell Scripting and database concepts.
- Experience with front end technologies like HTML, CSS and JavaScript.
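The custom Hive UDFs noted above are typically small classes packaged into a JAR and registered per session. Below is a minimal sketch, assuming the hive-exec dependency is on the classpath; the class name and the ZIP-normalization logic are hypothetical examples, not code from these projects.

```scala
import org.apache.hadoop.hive.ql.exec.UDF
import org.apache.hadoop.io.Text

// Minimal Hive UDF: trims a free-text field and keeps its 5-character prefix.
// Hive locates evaluate() by reflection and calls it once per row.
class NormalizeZip extends UDF {
  def evaluate(input: Text): Text = {
    if (input == null) null
    else new Text(input.toString.trim.take(5))
  }
}
```

Once packaged, the function would be registered in a Hive session with ADD JAR followed by CREATE TEMPORARY FUNCTION normalize_zip AS 'NormalizeZip'.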
TECHNICAL SKILLS:
Databases: DB2, MySQL, Oracle, MS SQL Server, Teradata
Languages: Core Java, Pig Latin, Scala, SQL, HiveQL, Shell Scripting and XML
APIs/Tools: NetBeans, Eclipse, MySQL Workbench, Visual Studio, DM Express
Web Technologies: HTML, XML, JavaScript, CSS
Big Data Ecosystem: Confidential, Spark, Pig, MapReduce, Hive, Impala, Kafka, Sqoop, Flume, HBase
Operating Systems: UNIX, Linux, Windows XP
Visualization Tools: Tableau, Zeppelin
Virtualization Software: VMware, Oracle Virtual Box.
Cloud Computing Services: AWS (Amazon Web Services).
PROFESSIONAL EXPERIENCE:
Confidential, Charlotte, NC
Hadoop Big Data/Spark Developer
Responsibilities:
- Analyzed the requirements to set up a cluster.
- Developed MapReduce programs in Java for parsing the raw data and populating staging tables.
- Created Hive queries to compare the raw data with EDW tables and perform aggregations.
- Imported and exported data into Confidential and Hive using Sqoop.
- Wrote Pig scripts to process the data.
- Developed and designed Hadoop, Spark and Java components.
- Developed Spark programs to parse the raw data, populate staging tables and store the refined data in partitioned tables in the EDW; developed Spark code using Scala and Spark-SQL for faster processing and testing (see the Spark sketch after this list).
- Developed Pig Latin scripts to extract the data from the web server output files and load it into Confidential.
- Involved in HBase setup and storing data into HBase, which is used for further analysis.
- Explored Spark for improving the performance and optimization of the existing algorithms in Hadoop using Spark Context, Spark-SQL, DataFrames, pair RDDs and Spark on YARN, and converted Hive queries into Spark transformations using Spark RDDs.
- Created an application using Kafka that monitors consumer lag within Apache Kafka clusters and is used in production by multiple companies (a minimal lag check is sketched below).
- Developed Unix/Linux Shell Scripts and PL/SQL procedures.
- Worked towards creating real-time data streaming solutions using Apache Spark/Spark Streaming and Kafka.
- Performed optimizations on Spark/Scala code and diagnosed and resolved performance issues.
- Installed and configured Hive and wrote Hive UDFs.
- Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Scala.
- Involved in creating Hive tables, loading them with data and writing Hive queries in HiveQL, which run internally as MapReduce jobs.
- Loaded some of the data into Cassandra for fast retrieval.
- Involved in developing Spark code using Scala and Spark-SQL for faster testing and processing of data, and explored optimizing it using Spark Context, Spark-SQL, pair RDDs and Spark on YARN.
- Exported the analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
- Extracted files from Cassandra through Sqoop, placed them in Confidential and processed them.
- Implemented Big Data solutions on the Hortonworks distribution and the AWS cloud platform.
- Designed complete end-to-end Apache NiFi flows to connect to AWS and store the final output in Confidential.
- Developed Pig Latin scripts for handling data transformation.
- Extracted the data from MySQL into Confidential using Sqoop.
- Managed and monitored the Hadoop cluster using Cloudera Manager.
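As a hedged illustration of the Hive-to-Spark conversion and partitioned EDW loads described above, the sketch below uses Spark-SQL with Hive support; the database, table and column names are assumptions, not the actual schema.

```scala
import org.apache.spark.sql.SparkSession

object RefineRawData {
  def main(args: Array[String]): Unit = {
    // Hive-enabled session so spark.sql and saveAsTable go through the metastore.
    val spark = SparkSession.builder()
      .appName("RefineRawData")
      .enableHiveSupport()
      .getOrCreate()

    // A HiveQL-style aggregate expressed through Spark SQL (placeholder schema).
    val refined = spark.sql(
      """SELECT account_id, txn_date, SUM(amount) AS total_amount
        |FROM staging.raw_transactions
        |GROUP BY account_id, txn_date""".stripMargin)

    // Store the refined data in a date-partitioned table in the EDW.
    refined.write
      .mode("overwrite")
      .partitionBy("txn_date")
      .format("parquet")
      .saveAsTable("edw.daily_account_totals")

    spark.stop()
  }
}
```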
Environment: Hadoop, Cloudera distribution, Hortonworks distribution, AWS, EMR, Azure cloud platform, Confidential, MapReduce, DocumentDB, Kafka, Pig, Hive, Sqoop, Flume, Oozie, ZooKeeper, Core Java, Impala, HiveQL, Spark, UNIX/Linux Shell Scripting.
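The consumer-lag monitoring mentioned above resembles tools such as Burrow; the sketch below shows only the core idea, computing lag as log-end offset minus committed offset via Kafka's AdminClient. The broker address, group id and group name are placeholders.

```scala
import java.util.Properties
import org.apache.kafka.clients.admin.AdminClient
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.serialization.StringDeserializer
import scala.collection.JavaConverters._

object LagCheck {
  def main(args: Array[String]): Unit = {
    val props = new Properties()
    props.put("bootstrap.servers", "broker:9092") // placeholder broker
    props.put("group.id", "lag-check-probe")
    props.put("key.deserializer", classOf[StringDeserializer].getName)
    props.put("value.deserializer", classOf[StringDeserializer].getName)

    val admin = AdminClient.create(props)
    // Committed offsets of the monitored group (group name is a placeholder).
    val committed = admin.listConsumerGroupOffsets("orders-consumer")
      .partitionsToOffsetAndMetadata().get().asScala

    // Log-end offsets for the same partitions, read with a throwaway consumer.
    val consumer = new KafkaConsumer[String, String](props)
    val ends = consumer.endOffsets(committed.keys.toSeq.asJava).asScala

    committed.foreach { case (tp, meta) =>
      println(s"$tp lag=${ends(tp) - meta.offset()}")
    }
    consumer.close(); admin.close()
  }
}
```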
Confidential, Irving, TX
Big Data Developer
Responsibilities:
- Responsible for building scalable distributed data solutions using Hadoop.
- Involved in loading data from the Linux file system to Confidential.
- Working experience with Confidential admin shell commands.
- Experience in ETL methods for data extraction, transformation and loading in corporate-wide ETL solutions and data warehouse tools for reporting and data analysis.
- Understanding/knowledge of Hadoop architecture and various components such as Confidential, Job Tracker, Task Tracker, Name Node and Data Node concepts.
- Developed Kafka producers and consumers, HBase clients, Apache Spark and Hadoop MapReduce jobs, along with components on Confidential and Hive.
- Used the Spark Streaming API with Kafka to build live dashboards (see the streaming sketch after this list); worked on RDD transformations and actions, Spark Streaming, pair RDD operations, checkpointing, and SBT.
- Used Kafka to transfer data from different data systems to Confidential.
- Migrated complex MapReduce programs into Spark RDD transformations and actions.
- Involved in developing a Spark Streaming application for one of the data sources using Scala and Spark, applying the required transformations.
- Developed a script in Scala to read all the Parquet tables in a database and write them out as JSON files, and another script to expose them as structured tables in Hive.
- Implemented Spark RDD transformations to map business analysis logic and applied actions on top of those transformations.
- Used Spark to parse XML files, extract values from tags and load them into multiple Hive tables.
- Experience with different Hadoop distributions such as Cloudera and Hortonworks.
- Hands-on experience with Cassandra.
- Analyzed large data sets by running Hive queries and Pig scripts.
- Created Hive tables and was involved in data loading and writing Hive UDFs.
- Hands-on use of Sqoop to import and export data between RDBMS and Confidential.
- Exported the analyzed data to relational databases using Sqoop for visualization and to generate reports for the BI team.
- Used Sqoop, Avro, Hive, Pig, Java and MapReduce daily to develop ETL, batch processing and data storage functionality.
- Supported implementation and execution of MapReduce programs in a cluster environment.
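A minimal sketch of the Spark Streaming plus Kafka dashboard pattern referenced above, using the direct stream from the spark-streaming-kafka-0-10 integration; the broker, topic, checkpoint path and per-page counting logic are assumptions for illustration.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object ClickStream {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(new SparkConf().setAppName("ClickStream"), Seconds(10))
    ssc.checkpoint("hdfs:///checkpoints/clickstream") // placeholder path

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "broker:9092",           // placeholder broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "dashboard",
      "auto.offset.reset" -> "latest")

    // Direct stream from a placeholder topic; each record value is one CSV event line.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("clicks"), kafkaParams))

    // Count events per page over each 10-second batch as a simple dashboard feed.
    stream.map(r => (r.value.split(',')(0), 1L))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```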
Environment: Hadoop, MapReduce, Hive, Pig, HBase, Sqoop, Kafka, Cassandra, Flume, Java, SQL, Cloudera Manager, Eclipse, Unix Shell Scripting, YARN.
Confidential, NJ
Hadoop Developer
Responsibilities:
- Wrote MapReduce code to parse the data from various sources and store the parsed data into HBase and Hive.
- Integrated MapReduce with HBase to import bulk amounts of data into HBase using MapReduce programs.
- Imported data from different relational data sources like Oracle and Teradata to Confidential using Sqoop.
- Worked on stand-alone as well as distributed Hadoop applications.
- Used Scala to convert Hive/SQL queries into RDD transformations in Apache Spark (see the sketch after this list).
- Used Oozie and Zookeeper to automate the flow of jobs and coordination in the cluster, respectively.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Extensive knowledge of Pig scripts using bags and tuples, and of Pig UDFs to pre-process the data for analysis.
- Implemented Amazon EMR for processing Big Data across a Hadoop cluster of virtual servers.
- Used Teradata in building the Hadoop project as well as the ETL project.
- Developed several shell scripts that act as wrappers to start Hadoop jobs and set the configuration parameters.
- Involved in writing queries using Impala for better and faster processing of data.
- Involved in developing Impala scripts for extraction, transformation and loading of data into the data warehouse.
- Experienced in migrating HiveQL into Impala to minimize query response time.
- Involved in moving all log files generated from various sources to Confidential for further processing through Flume.
- Involved in collecting and aggregating large amounts of log data using Apache Flume and staging the data in Confidential for further analysis.
- Developed testing scripts in Python, prepared test procedures, analyzed test result data and suggested improvements to the system and software.
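To illustrate the Hive/SQL-to-RDD conversion noted above, here is a minimal sketch; the input path, delimiter and column layout are assumptions. The HiveQL being rewritten would be roughly: SELECT category, COUNT(*) FROM events WHERE status = 'OK' GROUP BY category.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HiveToRdd {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveToRdd"))

    // Placeholder input: tab-delimited rows of (id, category, status).
    val events = sc.textFile("hdfs:///data/events") // placeholder path

    val counts = events
      .map(_.split('\t'))
      .filter(cols => cols.length >= 3 && cols(2) == "OK") // WHERE status = 'OK'
      .map(cols => (cols(1), 1L))                          // GROUP BY category ...
      .reduceByKey(_ + _)                                  // ... COUNT(*)

    counts.saveAsTextFile("hdfs:///data/event_counts")     // placeholder path
    sc.stop()
  }
}
```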
Environment: Confidential, MapReduce, Python, CDH5, HBase, NoSQL, Hive, Pig, Hadoop, Sqoop, Impala, YARN, Shell Scripting, Ubuntu, Red Hat Linux.
Confidential
Java / Hadoop Developer
Responsibilities:
- Involved in design and development phases of Software Development Life Cycle (SDLC) using Scrum methodology.
- Worked with application teams to install operating system and Hadoop updates, patches and version upgrades as required.
- Involved in installing and configuring the Hadoop ecosystem and Cloudera Manager using the CDH4 distribution.
- Collected log data from web servers and integrated it into Confidential using Flume.
- Involved in creating Hive tables, loading them with data and writing Hive queries that invoke and run MapReduce jobs in the backend.
- Worked on analyzing the Hadoop cluster and different Big Data analytic tools including MapReduce, Hive and Spark.
- Developed MapReduce programs to parse the raw data and store the pre-aggregated data in partitioned tables (see the sketch after this list).
- Involved in the end-to-end process of Hadoop cluster installation, configuration and monitoring.
- Responsible for building scalable distributed data solutions using Hadoop, and involved in submitting and tracking MapReduce jobs using the Job Tracker.
- Responsible for troubleshooting issues in the execution of MapReduce jobs by inspecting and reviewing log files.
- Worked with HBase in creating tables to load large sets of semi-structured data coming from various sources.
- Created design documents and reviewed with team in addition to assisting the business analyst / project manager in explanations to line of business.
- Responsible for understanding the scope of the project and requirement gathering.
- Involved in analysis, design, construction and testing of the application.
- Developed the web tier using JSP to show account details and summary.
- Designed and developed the UI using JSP, HTML, CSS and JavaScript.
- Used Tomcat web server for development purpose.
- Involved in creation of Test Cases for JUnit Testing.
- Used Oracle as the database and Toad for query execution, and was involved in writing SQL scripts and PL/SQL code for procedures and functions.
- Developed the application using Eclipse and used Maven as the build and deployment tool.
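A hedged sketch of the parse-and-pre-aggregate MapReduce pattern described above, written here in Scala against the Hadoop MapReduce API (the original work was in Java; the tab-delimited layout and column meanings are assumptions):

```scala
import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
import org.apache.hadoop.mapreduce.{Mapper, Reducer}
import scala.collection.JavaConverters._

// Mapper: parse raw tab-delimited lines and emit (accountId, 1) for each valid row.
class ParseMapper extends Mapper[LongWritable, Text, Text, IntWritable] {
  private val one = new IntWritable(1)
  private val outKey = new Text()

  override def map(key: LongWritable, value: Text,
                   ctx: Mapper[LongWritable, Text, Text, IntWritable]#Context): Unit = {
    val cols = value.toString.split('\t')
    if (cols.length >= 2) {   // skip malformed rows
      outKey.set(cols(1))     // placeholder: account id column
      ctx.write(outKey, one)
    }
  }
}

// Reducer: pre-aggregate the per-account counts before loading the partitioned tables.
class SumReducer extends Reducer[Text, IntWritable, Text, IntWritable] {
  override def reduce(key: Text, values: java.lang.Iterable[IntWritable],
                      ctx: Reducer[Text, IntWritable, Text, IntWritable]#Context): Unit = {
    val sum = values.asScala.map(_.get).sum
    ctx.write(key, new IntWritable(sum))
  }
}
```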
Environment: Hadoop, HBase, Confidential, Pig Latin, Sqoop, Hive, Java, J2EE Servlets, JSP, JUnit, AJAX, XML, JavaScript, Maven, Eclipse, Apache Tomcat and Oracle.