Big Data Developer Resume
Atlanta, GA
SUMMARY:
- Overall 5 years of IT experience, including hands-on Big Data development.
- Expertise in Hadoop ecosystem tools, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Scala, Kafka, YARN, and Oozie.
- Applied Hive optimization techniques such as partitioning and bucketing.
- Worked with data in multiple formats, including SequenceFile, ORC, XML, JSON, and delimited text/CSV.
- Developed Hive User Defined Functions (UDFs) to transform large volumes of data according to business requirements.
- Wrote Apache Pig scripts to process data in HDFS.
- Ability to learn and adapt quickly and to apply new tools and technologies.
- Experience with Spark SQL queries and DataFrames: importing data, applying transformations, performing read/write operations, and saving results to output directories in HDFS (see the sketch after this list).
- Experience working in Linux environments with common Linux commands.
- Experienced in implementing Spark RDD transformations.
- Used compression codecs such as Snappy to reduce storage and speed up data transfer.
- Queried data in Hive tables with HiveQL and loaded data into HBase tables.
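A minimal PySpark sketch of the Spark SQL workflow described above: read data from HDFS into a DataFrame, transform it, and save the result back to HDFS. The paths, column names, and the sales dataset are hypothetical.

```python
# Minimal PySpark sketch of the Spark SQL workflow above.
# Paths, column names, and the sales dataset are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("sales-etl").getOrCreate()

# Import data: read delimited text/CSV from HDFS into a DataFrame.
df = spark.read.option("header", "true").csv("hdfs:///data/raw/sales")

# Perform transformations with Spark SQL functions.
daily = (df.withColumn("amount", F.col("amount").cast("double"))
           .groupBy("sale_date")
           .agg(F.sum("amount").alias("total_amount")))

# Save the results to an output directory in HDFS (ORC format).
daily.write.mode("overwrite").orc("hdfs:///data/curated/daily_sales")
```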
TECHNICAL SKILLS:
Hadoop Technologies: HDFS, MapReduce, YARN, Spark, Spark SQL, Pig, Hive, Sqoop, Hue, HBase, Oozie, Impala, Flume, Kafka (working knowledge).
IDEs: IntelliJ, Eclipse, Jupyter, PyCharm, Sublime Text.
Databases: Oracle 11g/10g, MS SQL Server, MySQL.
Operating Systems: Unix / Linux, Windows.
Programming Languages: Scala, Python.
Hadoop Distributions: Cloudera, Hortonworks
Tools: MS Office, JIRA, PuTTY, FileZilla, WinSCP
Visualization: Tableau
PROFESSIONAL EXPERIENCE:
Confidential, Atlanta, GA
Big Data Developer
Responsibilities:
- Involved in the design and development phases of the Software Development Life Cycle using Agile methodology.
- Used Sqoop to load data from an Oracle database into Hive.
- Created Hive tables, loaded data, and wrote Hive queries to requirements, with appropriate static and dynamic partitions and bucketing for efficiency.
- Used Flume to collect web logs from online ad servers and push them into HDFS.
- Loaded and transformed large sets of structured and semi-structured data.
- Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
- Enhanced Hive performance by applying optimization and compression techniques.
- Imported data from various sources, performed transformations using Hive, and loaded the data into HDFS.
- Wrote Apache Pig scripts to process data in HDFS.
- Converted Hive/SQL queries into Spark transformations using Spark DataFrames and Scala.
- Implemented partitioning, dynamic partitions, and bucketing in Hive on different file formats to meet business requirements (see the sketch below).
- Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
- Used Tableau to connect to Hive and generate daily data reports.
- Loaded data into Spark RDDs and performed in-memory computation to generate output responses (see the RDD sketch below).
- Wrote HBase queries for different metrics.
- Created DataFrames using Spark SQL and loaded data into the HBase NoSQL database.
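A sketch of the Hive partitioning and bucketing pattern referenced above, issued through Spark SQL with Hive support. The database, table, and column names (including the staging table) are hypothetical.

```python
# Sketch of Hive dynamic partitioning and bucketing via Spark SQL.
# Database, table, and column names are hypothetical; assumes a
# Spark session with Hive support and an existing staging table.
from pyspark.sql import SparkSession

spark = (SparkSession.builder.appName("hive-partitioning")
         .enableHiveSupport().getOrCreate())

# Allow dynamic-partition inserts.
spark.sql("SET hive.exec.dynamic.partition=true")
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

# Partitioned and bucketed Hive table stored as ORC.
spark.sql("""
    CREATE TABLE IF NOT EXISTS ads.click_events (
        user_id BIGINT,
        url     STRING
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# Dynamic-partition insert: the partition value comes from the data.
spark.sql("""
    INSERT OVERWRITE TABLE ads.click_events PARTITION (event_date)
    SELECT user_id, url, event_date
    FROM ads.click_events_staging
""")
```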
Skills: Spark, Spark SQL, HDFS, Hive, Pig, Kafka, Sqoop, HBase, Scala, shell scripting, Linux, MySQL, IntelliJ, Oracle, Git, Tableau.
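A minimal sketch of loading data into a Spark RDD and computing in memory, as in the RDD bullet above. The log path and line layout (status code in the ninth field) are hypothetical.

```python
# Minimal sketch of in-memory computation on a Spark RDD.
# The log path and line layout are hypothetical.
from pyspark import SparkContext

sc = SparkContext(appName="log-metrics")

# Load raw web logs from HDFS into an RDD.
logs = sc.textFile("hdfs:///data/raw/weblogs")

# Count requests per HTTP status code, entirely in memory.
status_counts = (logs.map(lambda line: line.split(" "))
                     .filter(lambda fields: len(fields) > 8)
                     .map(lambda fields: (fields[8], 1))
                     .reduceByKey(lambda a, b: a + b))

# Cache for repeated queries, then materialize the result.
status_counts.cache()
print(status_counts.collect())
```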
Confidential, Chicago, IL
Big Data Engineer
Responsibilities:
- Used Hive to analyze partitioned and bucketed data and computed various metrics for dashboard reporting.
- Optimized MapReduce code and Pig scripts; performed user interface analysis and performance tuning.
- Implemented Spark RDD transformations and wrote Spark SQL queries in Scala and Python to implement business logic.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data, including joins and some pre-aggregations, before storing the data in HDFS.
- Exported analyzed data to relational databases using Sqoop for visualization and report generation by the BI team.
- Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
- Worked with different file formats such as CSV, XML, and JSON, and applied compression techniques to save storage.
- Wrote HBase queries for finding different metrics.
- Partitioned Hive tables and ran scripts in parallel to reduce script run times.
- Performance-tested different file compression techniques to select the best fit (see the sketch below).
- Developed Hive User Defined Functions, compiled them into JARs, added them to HDFS, and executed them from Hive queries (a Python analogue appears after this job's skills list).
- Imported data from Oracle into HDFS and Hive for analytics.
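A sketch of the compression comparison mentioned above: write the same data with different ORC codecs, then compare on-disk size and scan times. The input path and codec choices are illustrative.

```python
# Sketch of comparing compression codecs for the same dataset.
# The input path and codec choices are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("compression-compare").getOrCreate()
df = spark.read.json("hdfs:///data/raw/events")

# Write identical data as ORC with Snappy and with Zlib, then compare
# storage footprint and query time offline to pick the best fit.
df.write.mode("overwrite").option("compression", "snappy") \
    .orc("hdfs:///data/bench/events_orc_snappy")
df.write.mode("overwrite").option("compression", "zlib") \
    .orc("hdfs:///data/bench/events_orc_zlib")
```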
Skills: HDFS, Hive, HBase, Sqoop, Pig, Spark, Scala
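The Hive UDFs above were Java classes packaged as JARs; as a rough Python analogue, this sketch registers a function for use inside SQL queries through Spark. The function name, logic, and sample data are hypothetical.

```python
# Rough Python analogue of a Hive UDF: register a function for SQL use.
# Function name, logic, and sample data are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("udf-sketch").getOrCreate()

# Register a Python function so SQL queries can call it by name.
spark.udf.register("clean_name", lambda s: s.strip().title() if s else None)

spark.createDataFrame([("  john DOE ",)], ["name"]) \
    .createOrReplaceTempView("people")
print(spark.sql("SELECT clean_name(name) AS name FROM people").collect())
```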
Confidential
Python Developer
Responsibilities:
- Worked on Python software development using IDEs including PyCharm, Eclipse, Sublime Text, and Jupyter Notebook.
- Worked with OOP, multithreading, and collections concepts in Python.
- Excellent debugging, problem-solving, and optimization skills.
- Knowledge of Django framework design; developed web screens using CSS, HTML, and Bootstrap.
- Involved in the Software Development Life Cycle, including requirements gathering and design.
- Wrote and executed MySQL database queries from Python using the MySQL Connector/Python and MySQLdb packages (see the sketch after this list).
- Used Subversion version control for regular code reviews and merge requests.
- Proficient with Hive: analyzed data by writing SQL-like queries in HiveQL.
- Good experience working in Linux environments with common Linux commands.
- Delivered code following Test-Driven Development principles; used an IDE to develop the application and JIRA for bug and issue tracking.
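A minimal sketch of running MySQL queries from Python with the mysql-connector-python package, as in the bullet above. The connection details and the employees table are hypothetical.

```python
# Minimal sketch of querying MySQL from Python via mysql-connector-python.
# Connection details and the employees table are hypothetical.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="appuser", password="secret", database="hr")
try:
    cur = conn.cursor()
    # Parameterized query to avoid SQL injection.
    cur.execute("SELECT id, name FROM employees WHERE active = %s", (1,))
    for emp_id, name in cur.fetchall():
        print(emp_id, name)
finally:
    conn.close()
```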
Skills: Python, OOP, MySQL, Django, JIRA, CSS, HTML, Bootstrap, PyCharm, Eclipse, Linux, Hive.