
Big Data Engineer Resume


Charlotte, NC

SUMMARY

  • Over 10 years of professional IT experience, including 5 years of Hadoop big data ecosystem experience in the ingestion, storage, querying, processing, and analysis of big data.
  • Experience in importing and exporting data using Sqoop from HDFS to relational database systems and vice versa.
  • Experience in leveraging Hadoop ecosystem components, including Pig and Hive for data analysis, Sqoop for data migration, Oozie for scheduling, and HBase as a NoSQL data store.
  • Experience in installing, configuring, supporting, and managing Hadoop components using Hortonworks.
  • Experience in developing data pipelines using the Kafka-Spark API.
  • Hands-on experience with AWS EC2, S3, Redshift, EMR, RDS, Glue, and DynamoDB.
  • Experience with Hadoop shell commands, writing MapReduce programs, and verifying, managing, and reviewing Hadoop log files.
  • Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
  • Experience in big data analysis using Scala, Python, Pig, and Hive, with a working understanding of Sqoop.
  • Experienced in performance tuning of Spark applications: setting the right batch interval, choosing the correct level of parallelism, and tuning memory.
  • Optimized existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, and pair RDDs.
  • Used Spark for interactive queries, processing of streaming data, and integration with popular NoSQL databases for large data volumes.
  • Tuned Spark job performance by changing configuration properties and using broadcast variables (see the sketch after this list).
  • Worked on performing transformations and actions on RDDs and Spark Streaming data.
  • Extensive knowledge of tuning SQL queries and improving database performance.
  • Experience in managing Hadoop clusters using Cloudera Manager.
  • Managed all CM tools (JIRA, Confluence, Maven, Jenkins, Git, GitHub, Visual Studio) and their usage/process, ensuring traceability, repeatability, quality, and support.
  • Hands-on experience in application development using Java, RDBMS, and Linux shell scripting.
  • Ability to adapt to evolving technology, with a strong sense of responsibility and accomplishment.
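The broadcast-variable tuning mentioned above can be illustrated with a minimal PySpark sketch: a large DataFrame is joined to a small lookup table with an explicit broadcast hint, so the small side is shipped once to every executor and a shuffle of the large side is avoided. The paths, table names, and columns are illustrative assumptions, not details from the projects described here.

    # Minimal PySpark sketch: broadcast a small lookup table to tune a join.
    # All table/column names and paths are illustrative placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import broadcast

    spark = (SparkSession.builder
             .appName("broadcast-join-tuning")
             .config("spark.sql.shuffle.partitions", "200")  # example parallelism setting
             .getOrCreate())

    # Large fact data and a small dimension/lookup table (placeholder paths).
    transactions = spark.read.parquet("hdfs:///data/transactions")
    branch_lookup = spark.read.csv("hdfs:///ref/branches.csv", header=True, inferSchema=True)

    # Broadcasting the small side ships it to every executor once,
    # so the large side is joined locally without being shuffled.
    enriched = transactions.join(broadcast(branch_lookup), on="branch_id", how="left")

    enriched.groupBy("branch_name").count().show()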

TECHNICAL SKILLS

Hadoop/Big Data: HDFS, MapReduce, HBase, Pig, Hive, Sqoop, Oozie, Spark, Spark SQL, Spark Streaming

Languages: Java, SQL, XML, C++, C, WSDL, XHTML, HTML, CSS, JavaScript, AJAX, PL/SQL.

Java Technologies: Java, J2EE, Hibernate, JDBC, Servlets, JSP, JSTL, JavaBeans, jQuery, and EJB.

ETL/ELT Tools: Informatica, Pentaho

Design and Modeling: UML and Rational Rose.

Web Services: SOAP, WSDL, UDDI, SDLC

Scripting languages: JavaScript, Shell Script

Version Control and integration: CVS, ClearCase, SVN, Git, Jenkins.

Databases: Oracle 10g/9i/8i, SQL Server, DB2, MS-Access

Environments: UNIX, Red Hat Linux, Windows 2000/Server 2008/2007, Windows XP.

PROFESSIONAL EXPERIENCE

Big Data Engineer

Confidential, Charlotte, NC

Responsibilities:

  • Implemented advanced procedures such as text analytics and processing using the in-memory computing capabilities of Apache Spark, written in Scala.
  • Used Apache Sqoop to import and export data between HDFS and external RDBMS databases.
  • Used the Spark application master to monitor Spark jobs and capture their logs.
  • Implemented Spark applications using PySpark, Scala, and Spark SQL for faster testing and processing of data.
  • Hands-on experience with major Hadoop ecosystem components, including MapReduce, HDFS, Hive, Pig, HBase, ZooKeeper, Oozie, and Flume.
  • Improved the performance of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Hands-on experience with AWS EC2, S3, Redshift, EMR, and RDS.
  • Migrated the required data from Oracle and MySQL into HDFS using Sqoop, and imported flat files in various formats into HDFS.
  • Proposed an automated system using shell scripts to run the Sqoop jobs.
  • Developed a data pipeline for data processing using the Kafka-Spark API (a minimal streaming sketch follows this list).
  • Developed a strategy for full and incremental loads using Sqoop (see the Sqoop sketch after this list).
  • Implemented a POC to migrate MapReduce jobs to Spark RDD transformations.
  • Good exposure to the Agile software development process.
  • Experience in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
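The Kafka-Spark pipeline mentioned above can be sketched, under assumptions, with Spark Structured Streaming's Kafka source. The broker address, topic, message schema, and output paths below are placeholders rather than the actual project configuration.

    # Minimal PySpark Structured Streaming sketch: read from Kafka, write to HDFS.
    # Broker, topic, schema, and paths are illustrative assumptions.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("kafka-spark-pipeline").getOrCreate()

    event_schema = StructType([
        StructField("account_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("event_time", StringType()),
    ])

    # Subscribe to a Kafka topic (requires the spark-sql-kafka package on the classpath).
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker1:9092")
           .option("subscribe", "transactions")
           .load())

    # Kafka delivers key/value as binary; parse the JSON value into columns.
    events = (raw.selectExpr("CAST(value AS STRING) AS json")
              .select(from_json(col("json"), event_schema).alias("e"))
              .select("e.*"))

    # Append the parsed events to HDFS as Parquet, with a checkpoint for recovery.
    query = (events.writeStream
             .format("parquet")
             .option("path", "hdfs:///data/stream/transactions")
             .option("checkpointLocation", "hdfs:///checkpoints/transactions")
             .outputMode("append")
             .start())

    query.awaitTermination()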
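The full/incremental Sqoop load strategy can likewise be sketched as a small Python wrapper around the sqoop CLI: a full import on the first run, then an incremental append keyed on a check column. The JDBC URL, credentials, table, and column names are hypothetical.

    # Minimal sketch: full vs. incremental Sqoop import driven from Python.
    # The JDBC URL, credentials, table, and check column are placeholders.
    import subprocess

    JDBC_URL = "jdbc:mysql://dbhost:3306/sales"
    TABLE = "orders"
    TARGET_DIR = "/data/raw/orders"

    def sqoop_import(last_value=None):
        """Run a full import when last_value is None, otherwise an incremental append."""
        cmd = [
            "sqoop", "import",
            "--connect", JDBC_URL,
            "--username", "etl_user",
            "--password-file", "/user/etl/.db_password",
            "--table", TABLE,
            "--target-dir", TARGET_DIR,
            "--num-mappers", "4",
        ]
        if last_value is not None:
            # Incremental append: only rows with order_id greater than last_value.
            cmd += ["--incremental", "append",
                    "--check-column", "order_id",
                    "--last-value", str(last_value)]
        subprocess.run(cmd, check=True)

    if __name__ == "__main__":
        sqoop_import()                    # initial full load
        sqoop_import(last_value=100000)   # subsequent incremental load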

Environment: YARN, Ambari, Hive, Java, Sqoop, Spark Core, Spark SQL, MySQL, ADF, Git, Agile, Apache Hadoop, HDFS, Pig, Hive, Hortonworks, Oracle, Tableau, PySpark, SparkR

Big Data Engineer

Confidential, Jacksonville, FL

Responsibilities:

  • Worked with Spark and Scala, mainly on framework exploration for the transition from Hadoop/MapReduce to Spark.
  • Developed Hive queries for the analysts.
  • Implemented partitioning, dynamic partitions, and buckets in Hive (a minimal DDL sketch follows this list).
  • Exported result sets from Hive to MySQL using shell scripts.
  • Migrated the required data from Oracle and MySQL into HDFS using Sqoop, and imported flat files in various formats into HDFS.
  • Proposed an automated system using shell scripts to run the Sqoop jobs.
  • Worked in an Agile development approach with Storm, Flume, Bolt, and Kafka.
  • Very good experience in customer specification study, requirements gathering, system architectural design, and turning requirements into the final product.
  • Imported data from AWS S3 into Spark RDDs and performed transformations and actions on the RDDs (see the RDD sketch after this list).
  • Installed HDFS on AWS EC2 instances and developed multiple MapReduce jobs in Pig and Hive for data cleaning and preprocessing.
  • Experience in interacting with customers and working at client locations for real time field testing of products and services.
  • Ability to work effectively with associates at all levels within the organization.
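The Hive partitioning and bucketing mentioned above can be illustrated with a minimal sketch that issues HiveQL through a Hive-enabled Spark session: a table partitioned by load date, populated with a dynamic-partition insert, plus the DDL for a bucketed table. Database, table, and column names are placeholders.

    # Minimal sketch: dynamic partitioning and bucketing DDL in Hive via Spark SQL.
    # Database, table, and column names are illustrative placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-partitioning")
             .enableHiveSupport()
             .getOrCreate())

    # Allow dynamic partitions so the partition value comes from the data itself.
    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    # Partitioned target table.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS analytics.orders_part (
            order_id    BIGINT,
            customer_id BIGINT,
            amount      DOUBLE
        )
        PARTITIONED BY (load_date STRING)
        STORED AS ORC
    """)

    # Dynamic-partition insert: load_date is taken from the last selected column.
    spark.sql("""
        INSERT OVERWRITE TABLE analytics.orders_part PARTITION (load_date)
        SELECT order_id, customer_id, amount, load_date
        FROM staging.orders_raw
    """)

    # Bucketed table DDL (rows are clustered by a hash of customer_id); bucketed
    # loads are typically run from Hive itself, since Spark does not populate
    # Hive-compatible buckets.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS analytics.orders_bucketed (
            order_id    BIGINT,
            customer_id BIGINT,
            amount      DOUBLE
        )
        CLUSTERED BY (customer_id) INTO 32 BUCKETS
        STORED AS ORC
    """)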
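And a minimal sketch of the S3-to-RDD work noted above: load text data from S3 into an RDD, apply lazy transformations, and trigger them with actions. The bucket, prefix, and record layout are assumptions.

    # Minimal sketch: load S3 data into an RDD and apply transformations/actions.
    # Bucket name, prefix, and record layout are illustrative assumptions.
    from pyspark import SparkContext

    sc = SparkContext(appName="s3-rdd-example")

    # Requires the hadoop-aws connector and AWS credentials to be configured.
    lines = sc.textFile("s3a://example-bucket/logs/2018/*.log")

    # Transformations are lazy: parse each line and keep only error records.
    errors = (lines
              .map(lambda line: line.split("\t"))
              .filter(lambda fields: len(fields) > 2 and fields[2] == "ERROR"))

    # Actions trigger the computation.
    print("error count:", errors.count())
    for record in errors.take(5):
        print(record)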

Environment: YARN, Ambari, Hive, Java, Sqoop, Spark Core, Spark SQL, MySQL, ADF, Git, Agile, Apache Hadoop, HDFS, Pig, Hive, Hortonworks, Oracle, Tableau, PySpark, SparkR

Big Data Engineer

Confidential, Seattle, WA

Responsibilities:

  • Developed Hive queries for the analysts.
  • Executed parameterized Pig, Hive, Impala, and UNIX batches in production.
  • Experience with cloud platforms including Confidential Azure, Databricks, and Confidential Enterprise Cloud.
  • Used Apache Sqoop to import and export data between HDFS and external RDBMS databases.
  • Hands-on experience setting up workflows with the Apache Oozie workflow engine for managing and scheduling Hadoop jobs.
  • Experienced in working with the Spark ecosystem, using Spark SQL with Scala and PySpark queries over different formats such as text and CSV files (a minimal sketch follows this list).
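A minimal PySpark sketch of the Spark SQL queries over CSV and plain-text formats mentioned in the last bullet; the paths, column names, and delimiter are illustrative assumptions.

    # Minimal sketch: Spark SQL over CSV and plain-text files with PySpark.
    # Paths, columns, and delimiters are illustrative placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import split, col

    spark = SparkSession.builder.appName("file-format-queries").getOrCreate()

    # CSV with a header row; let Spark infer the column types.
    customers = spark.read.csv("hdfs:///data/customers.csv", header=True, inferSchema=True)
    customers.createOrReplaceTempView("customers")

    # Plain text: one record per line, pipe-delimited; split into columns manually.
    raw = spark.read.text("hdfs:///data/orders.txt")
    orders = raw.select(
        split(col("value"), r"\|").getItem(0).alias("order_id"),
        split(col("value"), r"\|").getItem(1).alias("customer_id"),
        split(col("value"), r"\|").getItem(2).cast("double").alias("amount"),
    )
    orders.createOrReplaceTempView("orders")

    # Query both views with Spark SQL.
    spark.sql("""
        SELECT c.state, COUNT(*) AS orders, SUM(o.amount) AS total
        FROM orders o JOIN customers c ON o.customer_id = c.customer_id
        GROUP BY c.state
        ORDER BY total DESC
    """).show()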

Environment: Azure, HDInsight, Confidential Azure, Confidential Enterprise Cloud, YARN, Ambari, Hive, Java, Sqoop, Spark Core, Spark SQL, MySQL, ADF, Git, Agile, Apache Hadoop, HDFS, Pig, Hive, Hortonworks, Oracle, Tableau.

Big Data Engineer

Confidential, Seattle, WA

Responsibilities:

  • Understood and analyzed business requirements, high-level design, and detailed design.
  • Extensive scripting in Perl and Python.
  • Designed and developed parsers for different file formats (CSV, XML, binary, ASCII, text, etc.); a minimal parsing sketch follows this list.
  • Extensive usage of the Cloudera Hadoop distribution.
  • Executed parameterized Pig, Hive, Impala, and UNIX batches in production.
  • Managed big data in Hive and Impala (tables, partitioning, ETL/ELT, etc.).
  • Designed and developed file-based data collections in Perl.
  • Extensive usage of Hue and other Cloudera tools.
  • Used JUnit for unit testing of MapReduce jobs.
  • Extensive usage of the NoSQL database HBase.
  • Maintained system integrity of all subcomponents (primarily HDFS, MapReduce, HBase, Cassandra, and Hive).
  • Designed and developed dashboards in ZoomData and wrote complex queries.
  • Worked on shell programming and crontab automation.
  • Monitored system health and logs and responded to any warning or failure conditions.
  • Worked extensively in UNIX and Red Hat environments.
  • Performed testing and bug fixing.
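The file-format parsers referenced in this list were written in Perl; as a hedged illustration of the same idea in Python, the sketch below dispatches CSV and XML inputs to small parsers that normalize records into dictionaries. The field names, XML layout, and sample path are assumptions.

    # Minimal sketch: normalize CSV and XML records into dicts before loading.
    # Field names, the XML tag layout, and file paths are illustrative assumptions.
    import csv
    import xml.etree.ElementTree as ET

    def parse_csv(path):
        """Yield one dict per row, keyed by the header columns."""
        with open(path, newline="") as fh:
            for row in csv.DictReader(fh):
                yield dict(row)

    def parse_xml(path, record_tag="record"):
        """Yield one dict per <record> element, keyed by child tag names."""
        tree = ET.parse(path)
        for rec in tree.getroot().iter(record_tag):
            yield {child.tag: (child.text or "").strip() for child in rec}

    PARSERS = {".csv": parse_csv, ".xml": parse_xml}

    def parse_file(path):
        """Pick a parser based on the file extension."""
        for ext, parser in PARSERS.items():
            if path.lower().endswith(ext):
                return parser(path)
        raise ValueError(f"No parser registered for {path}")

    if __name__ == "__main__":
        for record in parse_file("sample.csv"):
            print(record)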

Environment: Apache Hadoop, HDFS, Perl, Python, Pig, Hive, Java, Sqoop, Cloudera CDH5, Oracle, MySQL, Tableau, AWS, Talend, Elasticsearch, ZoomData, Storm, data governance, Agile.

Sr. Systems Engineer (ATG/Java Developer)

Confidential

Responsibilities:

  • Understood and analyzed business requirements, high-level design, and detailed design.
  • Involved in three releases: eShop 2.0.1, eShop 2.1, and eShop 2.2.
  • Provided high-level systems design, including class diagrams, sequence diagrams, and activity diagrams.
  • Utilized Java/J2EE design patterns (MVC) at various levels of the application, along with ATG frameworks.
  • Worked extensively on DCS (ATG Commerce Suite), using the commerce API to implement the store checkout.
  • Developed JSPs and servlets, with strong skills in web services (REST, SOAP).
  • Served as DB administrator, creating and maintaining all schemas.

Environment: ATG, Java, JSP, Oracle 9i/10g, WebLogic 10.3.5, SOAP, RESTful, SVN, SQL Developer, UNIX, Eclipse, XML, HTML, CSS, JavaScript, AJAX, jQuery, SCA.
