
Big Data Developer Resume


Atlanta, GA

SUMMARY:

  • Overall 5 years of IT experience, including hands-on Big Data development.
  • Expertise with Hadoop ecosystem tools, including Pig, Hive, HDFS, MapReduce, Sqoop, Spark, Scala, Kafka, YARN, and Oozie.
  • Applied Hive optimization techniques such as partitioning and bucketing.
  • Worked with data in multiple formats, including SequenceFile, ORC, XML, JSON, and delimited text/CSV.
  • Developed User Defined Functions (UDFs) in Hive to transform large volumes of data according to business requirements.
  • Wrote Apache Pig scripts to process data in HDFS.
  • Ability to learn quickly and to adapt to and apply new tools and technologies.
  • Experience with Spark SQL queries and DataFrames: importing data, applying transformations, performing read/write operations, and saving results to output directories in HDFS (see the sketch after this list).
  • Experience working in Linux environments and with Linux commands.
  • Experienced in implementing Spark RDD transformations.
  • Used compression codecs such as Snappy to reduce storage and optimize data transfer.
  • Queried data in Hive tables with HiveQL and loaded data into HBase tables.
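
A minimal PySpark sketch of the DataFrame workflow described above: read delimited data from HDFS, apply transformations, and write Snappy-compressed output back to HDFS. The paths, column names, and schema here are hypothetical placeholders, not taken from any actual project.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("hdfs-etl-sketch").getOrCreate()

# Read delimited text from HDFS into a DataFrame (hypothetical path/schema).
events = (spark.read
          .option("header", "true")
          .option("delimiter", ",")
          .csv("hdfs:///data/raw/events"))

# Typical transformations: filter, derive a column, aggregate.
daily = (events
         .filter(F.col("status") == "OK")
         .withColumn("event_date", F.to_date("event_ts"))
         .groupBy("event_date")
         .count())

# Save the results back to HDFS as Snappy-compressed ORC.
(daily.write
 .mode("overwrite")
 .option("compression", "snappy")
 .orc("hdfs:///data/curated/daily_counts"))
```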

TECHNICAL SKILLS:

Hadoop Technologies: HDFS, MapReduce, YARN, Spark, Spark SQL, Pig, Hive, Sqoop, Hue, HBase, Oozie, Impala, Flume, Kafka (working knowledge).

IDEs: IntelliJ, Eclipse, Jupyter, PyCharm, Sublime Text.

Databases: Oracle 11g/10g, MS SQL Server, MySQL.

Operating Systems: Unix / Linux, Windows.

Programming Languages: Scala, Python.

Hadoop Distributions: Cloudera, Hortonworks

Tools: MS Office, JIRA, PuTTY, FileZilla, WinSCP

Visualization: Tableau

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Big Data Developer

Responsibilities:

  • Involved in the design and development phases of the Software Development Life Cycle, using Agile methodology.
  • Used Sqoop to load data from an Oracle database into Hive.
  • Created Hive tables, loaded data, and wrote Hive queries per defined requirements, with appropriate static/dynamic partitioning and bucketing for efficiency (see the sketch after this list).
  • Used Flume to collect web logs from online ad servers and push them into HDFS.
  • Loaded and transformed large sets of structured and semi-structured data.
  • Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
  • Enhanced Hive performance by applying optimization and compression techniques.
  • Imported data from various sources, performed transformations using Hive, and loaded the data into HDFS.
  • Wrote Apache Pig scripts to process data in HDFS.
  • Converted Hive/SQL queries into Spark transformations using Spark DataFrames and Scala.
  • Implemented partitioning, dynamic partitions, and bucketing in Hive on different file formats to meet business requirements.
  • Loaded log data into HDFS using Flume and Kafka and performed ETL integrations.
  • Connected the reporting tool Tableau to Hive to generate daily data reports.
  • Loaded data into Spark RDDs and performed in-memory computation to generate output responses.
  • Wrote HBase queries for different metrics.
  • Created DataFrames using Spark SQL and loaded data into the HBase NoSQL database.
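
A hedged sketch of the static/dynamic partitioning and bucketing pattern mentioned above, executing HiveQL over HiveServer2 with the PyHive library. The table and column names (web_logs, stg_web_logs) and the connection details are illustrative assumptions, not from the actual project.

```python
from pyhive import hive

conn = hive.connect(host="localhost", port=10000, database="default")
cursor = conn.cursor()

# Partition by date and bucket by user id; store as ORC for efficient scans.
cursor.execute("""
    CREATE TABLE IF NOT EXISTS web_logs (
        user_id STRING,
        url     STRING,
        status  INT
    )
    PARTITIONED BY (log_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# Dynamic partitioning derives partition values from the data itself;
# hive.enforce.bucketing is only needed on older Hive versions.
cursor.execute("SET hive.exec.dynamic.partition=true")
cursor.execute("SET hive.exec.dynamic.partition.mode=nonstrict")
cursor.execute("SET hive.enforce.bucketing=true")

cursor.execute("""
    INSERT OVERWRITE TABLE web_logs PARTITION (log_date)
    SELECT user_id, url, status, log_date FROM stg_web_logs
""")
```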

Skills: Spark, Spark SQL, HDFS, Hive, Pig, Kafka, Sqoop, HBase, Scala, shell scripting, Linux, MySQL, IntelliJ, Oracle, Git, Tableau.

Confidential, Chicago, IL

Big Data Engineer

Responsibilities:

  • Used Hive to analyze partitioned and bucketed data and compute various metrics for reporting on the dashboard.
  • Optimized MapReduce code and Pig scripts; performed user interface analysis, performance tuning, and analysis.
  • Implemented Spark RDD transformations and wrote Spark SQL queries in Scala and Python to implement business logic (see the RDD sketch after this list).
  • Loaded and transformed large sets of structured, semi-structured, and unstructured data, including joins and some pre-aggregations, before storing the data in HDFS.
  • Exported the analyzed data to relational databases using Sqoop so our BI team could visualize it and generate reports.
  • Created Hive tables, loaded data, and wrote Hive queries that run internally as MapReduce jobs.
  • Worked with different file formats such as CSV, XML, and JSON, and applied compression techniques to save storage.
  • Wrote HBase queries to compute different metrics.
  • Partitioned Hive tables and ran scripts in parallel to reduce their run time.
  • Performance-tested different file compression techniques to select the best fit.
  • Developed Hive User Defined Functions, compiled them into jars, added them to HDFS, and executed them in Hive queries (see the UDF deployment sketch after this list).
  • Imported data from Oracle into HDFS and Hive for analytics.
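
A minimal PySpark sketch of the RDD transformation style mentioned above (the work also used Scala; Python is shown here for consistency with the other examples). The tab-delimited log format and HDFS paths are assumed placeholders.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-sketch").getOrCreate()
sc = spark.sparkContext

# Hypothetical tab-delimited page-hit logs on HDFS.
lines = sc.textFile("hdfs:///data/raw/page_hits")

# Classic transformation chain: parse, drop malformed rows, count hits per page.
hits_per_page = (lines
                 .map(lambda line: line.split("\t"))
                 .filter(lambda fields: len(fields) >= 2)
                 .map(lambda fields: (fields[1], 1))   # (page, 1)
                 .reduceByKey(lambda a, b: a + b))

# Transformations are lazy; the save action triggers the actual computation.
hits_per_page.saveAsTextFile("hdfs:///data/out/hits_per_page")
```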
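
A second sketch, for the UDF deployment flow above: the UDF itself would be compiled Java/Scala code packaged as a jar, so only the registration and use are shown here, again over HiveServer2 with PyHive. The jar path, function name (clean_url), and class name (com.example.hive.CleanUrlUDF) are hypothetical.

```python
from pyhive import hive

conn = hive.connect(host="localhost", port=10000, database="default")
cursor = conn.cursor()

# Register the jar that was uploaded to HDFS, then bind the UDF class to a name.
cursor.execute("ADD JAR hdfs:///user/shared/jars/hive-udfs.jar")
cursor.execute(
    "CREATE TEMPORARY FUNCTION clean_url AS 'com.example.hive.CleanUrlUDF'")

# The UDF can now be used like any built-in function in HiveQL.
cursor.execute("""
    SELECT clean_url(url) AS page, COUNT(*) AS hits
    FROM web_logs
    GROUP BY clean_url(url)
""")
for page, hits in cursor.fetchall():
    print(page, hits)
```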

Skills: HDFS, Hive, HBase, Sqoop, Pig, Spark, Scala

Confidential

Python Developer

Responsibilities:

  • Developed software in Python using the PyCharm, Eclipse, Sublime Text, and Jupyter Notebook IDEs.
  • Worked with OOP, multithreading, and collections concepts in Python.
  • Excellent debugging, problem-solving, and optimization skills.
  • Knowledge of Django framework design; developed web-based screens with CSS, HTML, and Bootstrap.
  • Involved in the Software Development Life Cycle, including requirements gathering and design.
  • Wrote and executed MySQL database queries from Python using the MySQL Connector/Python and MySQLdb packages (see the sketch after this list).
  • Used Subversion version control for regular code reviews and merge requests.
  • Proficient with Hive and the Hive Query Language.
  • Used Hive to analyze data by writing SQL-like queries in HiveQL.
  • Good experience working in Linux environments and with Linux commands.
  • Delivered code efficiently following Test-Driven Development principles; used an IDE to develop the application and JIRA for bug and issue tracking.
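
A minimal sketch of running MySQL queries from Python with the mysql-connector-python package mentioned above; the host, credentials, database, and users table are placeholder assumptions.

```python
import mysql.connector

conn = mysql.connector.connect(
    host="localhost",
    user="app_user",          # hypothetical credentials
    password="app_password",
    database="app_db",
)
cursor = conn.cursor()
try:
    # Parameterized query: the connector safely escapes the %s placeholder.
    cursor.execute("SELECT id, name FROM users WHERE active = %s", (1,))
    for user_id, name in cursor.fetchall():
        print(user_id, name)
finally:
    cursor.close()
    conn.close()
```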

Skills: Python, OOP, MySQL, Django, JIRA, CSS, HTML, Bootstrap, PyCharm, Eclipse, Linux, Hive.
