Hadoop/Big Data Developer Resume Alpharetta, GA - Hire IT People

PROFESSIONAL SUMMARY:

Overall 5+ years of IT experience in Big Data technologies, Spark, and database development.
Strong experience working with HDFS, MapReduce, Spark, Hive, Sqoop, Flume, Kafka, Oozie, Pig and HBase.
Experience working with DataFrames, RDD, Spark SQL, Spark Streaming, APIs, System Architecture, and Infrastructure Planning.
Experience in usage of Hadoop distribution like Cloudera and Hortonworks.
Worked extensively to integrate Hortonworks with the BI tool Tableau.
Expertise in developing solutions to analyze large data sets efficiently.
Integrated Kafka with Spark Streaming for real-time data processing.
Built real-time Big Data solutions using HBase, handling millions of records.
Implemented Hadoop-based data warehouses and integrated Hadoop with Enterprise Data Warehouse systems.
Experience with the Hadoop framework, the Hadoop Distributed File System (HDFS), and parallel processing implementations.
Skilled in writing MapReduce jobs in Pig and Hive.
Built and supported large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring.
Experience with NoSQL and SQL databases.
Involved in the creation of databases, tables, stored procedures, triggers, and user-defined functions, and in the installation and configuration of SQL Server.
Extensive hands-on experience with Linux and Windows.

TECHNICAL SKILLS:

HADOOP ECOSYSTEM: HDFS, MapReduce, YARN (Cloudera, Hortonworks)

DATA ANALYTICS: Hadoop, Hive, Spark, Pig and Tableau

BIG DATA TOOLS: Sqoop, Oozie, HBase, and Flume

PROGRAMMING LANGUAGES: Python, Scala and Java

DATABASE TOOLS: Oracle, MySQL, MS SQL Server

OPERATING SYSTEMS: Windows, Unix, Linux

PROFESSIONAL EXPERIENCE:

Confidential, Alpharetta, GA

Hadoop/Big Data Developer

Responsibilities:

Work with the Hadoop ecosystem and implement Spark applications in Scala, using the DataFrame and Spark SQL APIs for faster data processing.
Develop Hive queries to process the data and implement partitions and buckets in Hive.
Develop RDDs and DataFrames in Spark and apply several transformations to load data from Hadoop data lakes.
Provide proofs of concept converting file data into Parquet format to improve Hive query processing.
Develop Hive and Pig scripts to join the raw data with the lookup data and to perform aggregations per business requirements.
Filter and clean data using Scala code and SQL queries.
Install the Oozie workflow engine to run multiple MapReduce programs that run independently based on time and data.
Import and export data into HDFS and Hive using Sqoop.
Work with Flume to load the log data from multiple sources directly into HDFS.

Environment: Hadoop, HDFS, Spark, Hive, Kafka, JSON, Linux, HBase, Python, Parquet, Hortonworks, Scala, Sqoop, Flume, SQL.

Confidential, Sandy Springs, GA

PySpark/Hadoop Developer

Responsibilities:

Developed Spark programs in Python and applied principles of functional programming to process complex structured and unstructured data sets.
Analyzed SQL scripts and designed solutions to implement using PySpark.
Developed data processing tasks in PySpark, such as reading data from external sources, merging data, performing data enrichment, and loading into target data destinations.
Used JSON and XML SerDes for serialization and deserialization to load JSON and XML data into Hive tables.
Used Spark SQL to load JSON data, create a SchemaRDD, and load it into Hive tables, and handled structured data using Spark SQL.
Set up HBase and stored data in HBase for later analysis.
Converted MapReduce programs into Spark transformations using Spark RDDs in PySpark.
Implemented Spark using the PySpark API, utilizing DataFrames and the Spark SQL API for faster data processing.

Environment: Hadoop, HDFS, Spark, Spark Core, Spark Streaming, YARN, Hive, Sqoop, ZooKeeper, Flume, Kafka, HBase, Python, SQL scripting, Kerberos, Linux Shell Scripting, JSON, Parquet, Hortonworks.

Confidential, Phoenix, AZ

Hadoop/Big Data Developer

Responsibilities:

Installed and configured Hadoop MapReduce and HDFS; developed multiple MapReduce jobs for data cleaning and preprocessing.
Developed different components of the system, such as Hadoop processes that involve MapReduce and Hive.
Worked with Hive on large volumes of log data to perform trend analysis of user behavior across various online modules.
Installed the Oozie workflow engine to run multiple jobs.
Worked on sequence files, bucketing, and partitioning to improve Hive performance and storage.
Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream the log data from servers.
Involved in HDFS maintenance and loading of structured and unstructured data.
Debugged and identified issues reported by QA with Hadoop jobs by configuring them to run against the local file system.

Environment: Hadoop, Cloudera, Linux, CentOS, MapReduce, HBase, Sqoop, Flume, HDFS, Python, Hive, SQL.

Confidential, Bellevue, WA

Big Data Developer

Responsibilities:

Developed MapReduce programs that work seamlessly on Hadoop clusters.
Worked with SQL, NoSQL, data warehousing, and database administration.
Designed web services for swift data tracking and high-speed data querying.
Tested software prototypes, proposed standards, and smoothly transferred them to operations.
Translated complex functional and technical requirements into detailed designs.
Performed analysis of vast data stores and uncovered insights.
Maintained security and data privacy.
Managed and deployed HBase.
Worked on cluster coordination services through ZooKeeper.
Ran Hadoop streaming jobs to process terabytes of XML-format data.
Created and maintained technical documentation for launching clusters and for executing Hive queries to make UDFs.
Worked with BI tools such as Tableau to create dashboards, including daily, weekly, and monthly reports, using Tableau Desktop, and published them to the HDFS cluster.

Environment: Hadoop, HDFS, MapReduce, Hive, HBase, Oozie, ZooKeeper, Tableau, Java, JSON, Linux, CentOS, Cloudera, Sqoop, Flume, SQL.

Confidential

SQL Developer

Responsibilities:

Gathered business and user requirements.
Involved in the creation of databases, tables, stored procedures, triggers, and user-defined functions, and in the installation and configuration of SQL Server.
Backed up and restored SQL Server databases.
Designed logical models and architecture.
Collected data through the design of survey questionnaires.
Optimized database systems for performance efficiency.
Mapped data between the source and the destination and documented the data mapping spreadsheet.
Supervised data entry into the SQL Server database through application interfaces and performed debugging.

Environment: Oracle 10g, SQL Developer, PL/SQL, Shell scripts, Oracle Forms, SQL*Loader, WebFOCUS reporting, triggers, Wise Package Studio 7.0.
