
Big Data Engineer Resume


PROFESSIONAL SUMMARY:

  • 3+ years of IT experience, including 2+ years in big data analytics as a Big Data Engineer with good knowledge of Hadoop ecosystem technologies. Part of two data analytics proof-of-concept implementations, focused mainly on cluster setup and on data extraction, transformation, and loading.
  • Experience developing MapReduce programs with Apache Hadoop to analyze big data according to requirements (a minimal sketch follows this list).
  • Experience with major Hadoop ecosystem components such as Pig, Hive, HBase, Sqoop, and Kafka, and with monitoring them through Cloudera Manager and Ambari.
  • Hands on experience working on NoSQL databases including HBase and its integration with Hadoop cluster.
  • Good working experience using Sqoop to import data from RDBMS into HDFS and vice versa.
  • Involved in full life-cycle projects using object-oriented methodologies and programming (OOP).
  • Experience in working with Oracle using SQL, PL/SQL.
  • Experience in developing Hive Query Language scripts for data analytics.
  • Actively involved in requirements gathering, analysis, design, reviews, coding and code reviews, and unit and integration testing.
  • Working knowledge of J2EE technologies such as Servlets, JSP, Struts, Hibernate, EJB, and JDBC.
  • Knowledge of Web Services and SOA architecture.
  • Designed and developed Microservices business components using Spring Boot.
  • Strong analytical skills with proficiency in debugging and problem solving.
  • Experienced with web/application servers such as IBM WebSphere 5.1/6.0.
  • Good knowledge of stored procedures, functions, and related database objects using SQL and PL/SQL.
  • Expertise in using version control systems such as Git.
  • Familiar with CI/CD tools like Jenkins, Ansible, Chef and Puppet.
  • Knowledge of Amazon Web Services and Microsoft Azure.
  • Strong verbal and written communication skills.
  • Experience working within an agile development process.
  • Worked with big teams and always aim to be a TEAM (Together Everyone Achieves More) player.
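
As referenced in the MapReduce bullet above, here is a minimal word-count-style sketch of such a program; the class names, tokenization rule, and I/O paths are illustrative assumptions rather than code from a specific project.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountSketch {

        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                // Emit (word, 1) for every whitespace-separated token in the line.
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        ctx.write(word, ONE);
                    }
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                // Sum the per-mapper counts for each word.
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word-count-sketch");
            job.setJarByClass(WordCountSketch.class);
            job.setMapperClass(TokenMapper.class);
            job.setCombinerClass(SumReducer.class); // safe: summing is associative
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. an HDFS input dir
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not already exist
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }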

TECHNICAL SKILLS:

Hadoop / Big Data Technologies: HDFS, MapReduce, Spark, Hive, Pig, Sqoop, Flume, Kafka, NiFi, Oozie.

Programming Languages: Python, Java (JDK 1.4/1.5/1.6), HTML, SQL, PL/SQL.

Web Services: SOAP, Apache, REST.

Frameworks: Spring, Hibernate, Struts, EJB, JMS, JSF

Java/J2EE Technology: Servlets, JSP, Web Services, jQuery, JDBC, SOAP, REST, JMS, AJAX, XML.

Operating Systems: UNIX, Windows, LINUX

Databases: Oracle 8i/9i/10g, Microsoft SQL Server, DB2, MySQL 4.x/5.x

NoSQL Databases: HBase, Cassandra

Java IDE: Eclipse 3.x, IBM WebSphere Application Developer, IBM RAD 7.0

Tools: SQL Developer, SoapUI, Ant, Maven, Gradle.

PROFESSIONAL EXPERIENCE:

Confidential

Big Data Engineer

Environment: Apache Spark, HDFS, Java, MapReduce, Hive, HBase, Sqoop, SQL, Knox, Oozie, Cloudera Manager, ZooKeeper, Cloudera.

Responsibilities:

  • Involved in big data requirements analysis and in designing and developing solutions for ETL and business intelligence platforms.
  • Installed Kafka on the Hadoop cluster and configured producers and consumers in Java to stream source data tagged with popular hashtags into HDFS (see the Kafka sketch after this list).
  • Loaded real-time data from various data sources into HDFS using Kafka.
  • Worked on reading multiple data formats on HDFS using Python.
  • Implemented Spark jobs in Python (PySpark) with Spark SQL for faster testing and processing of data.
  • Loaded data into Spark RDDs and performed in-memory computation.
  • Involved in converting Hive/SQL queries into Spark transformations using Spark SQL and the DataFrame API in Python (see the Spark sketch after this list).
  • Analyzed SQL scripts and designed solutions implemented in Python.
  • Improved the performance and optimization of existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
  • Performed transformations, cleaning, and filtering on imported data using the Spark DataFrame API, Hive, and MapReduce, and loaded the final data into Hive.
  • Involved in converting MapReduce programs into Spark transformations using Spark RDDs in Python.
  • Developed Spark scripts using Python and shell commands as required.
  • Worked with NoSQL databases such as HBase, creating tables to load large sets of semi-structured data coming from source systems.
  • Designed and developed the HBase target schema.
  • Used the Oozie workflow scheduler to manage Hadoop jobs with control flows.
  • Visualized reports using Tableau.
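
A hedged sketch of the Kafka producer/consumer work above, using the standard kafka-clients API. The broker address, topic name, group id, and record contents are assumptions; the consumer here only prints records where the real consumer appended them to HDFS.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class HashtagPipelineSketch {
        public static void main(String[] args) {
            // Producer: publish source records keyed by hashtag (hypothetical topic).
            Properties p = new Properties();
            p.put("bootstrap.servers", "broker1:9092");
            p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            p.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(p)) {
                producer.send(new ProducerRecord<>("hashtag-events", "#bigdata", "{\"text\":\"...\"}"));
            }

            // Consumer: poll records; a real sink would append each value to HDFS.
            Properties c = new Properties();
            c.put("bootstrap.servers", "broker1:9092");
            c.put("group.id", "hdfs-sink");
            c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            c.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(c)) {
                consumer.subscribe(Collections.singletonList("hashtag-events"));
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("%s -> %s%n", r.key(), r.value()); // placeholder for HDFS write
                }
            }
        }
    }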
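
The Spark conversion work above was done in PySpark; purely to keep a single language across the sketches in this resume, the same load-transform-store flow is shown through Spark's Java API. The input path, view name, query, and target table are hypothetical.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class SalesEtlSketch {
        public static void main(String[] args) {
            // Hive support lets the cleaned result be saved as a Hive table.
            SparkSession spark = SparkSession.builder()
                    .appName("etl-sketch")
                    .enableHiveSupport()
                    .getOrCreate();

            // Hypothetical input path and format; the real sources varied.
            Dataset<Row> raw = spark.read().json("hdfs:///data/raw/events");

            // Register a temp view so a converted Hive/SQL query runs as Spark SQL.
            raw.createOrReplaceTempView("events");

            // Filtering and aggregation equivalent to a converted Hive query.
            Dataset<Row> daily = spark.sql(
                    "SELECT event_date, COUNT(*) AS cnt " +
                    "FROM events WHERE status = 'VALID' " +
                    "GROUP BY event_date");

            // Load the final data into Hive.
            daily.write().mode("overwrite").saveAsTable("analytics.daily_counts");

            spark.stop();
        }
    }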

Confidential

Big Data Engineer

Environment: Apache Hadoop, HDFS, Java, MapReduce, Hive, Pig, Sqoop, SQL, Knox, Oozie, Ambari, Ranger, ZooKeeper, Hortonworks (HDP).

Responsibilities:

  • Responsible for setting up a 5-node development cluster for a proof of concept that was later implemented as a full-scale project by Fortune Brands.
  • Responsible for installing and configuring Hive, Sqoop, ZooKeeper, Knox, and Oozie on the Hortonworks Hadoop cluster using Ambari.
  • Involved in extracting large sets of structured, semi-structured, and unstructured data.
  • Developed Sqoop scripts to import data from an Oracle database and handled incremental loading of the point-of-sale tables.
  • Created Hive external tables and views on the data imported into HDFS.
  • Developed and implemented Hive scripts for transformations such as evaluation, filtering, and aggregation.
  • Partitioned Hive tables and ran the scripts in parallel to reduce their run time.
  • Developed user-defined functions (UDFs) in Java where required for Hive queries (see the sketch after this list).
  • Worked with data in multiple file formats, including Avro, Parquet, SequenceFile, ORC, and text/CSV.
  • Used Oozie Operational Services for batch processing and scheduling workflows dynamically.
  • Worked on creating end-to-end data pipeline orchestration using Oozie.
  • Developed bash scripts to automate this extraction, transformation, and loading process.
  • Implemented authentication and authorization using Kerberos, Knox, and Apache Ranger.
  • Managed the Hadoop cluster using Ambari.
  • Created roles and user groups in Ambari to grant permitted access to Ambari functions.
  • Working knowledge of MapReduce and YARN architectures.
  • Working knowledge of ZooKeeper.
  • Working knowledge of Tableau.
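
A minimal sketch of the kind of Java Hive UDF mentioned above, built on the classic org.apache.hadoop.hive.ql.exec.UDF base class used in that Hive generation; the function name and normalization rule are illustrative assumptions.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Hypothetical example: normalize a code column before joins/aggregation.
    // Registered in Hive with:
    //   ADD JAR udfs.jar;
    //   CREATE TEMPORARY FUNCTION normalize_code AS 'NormalizeCodeUdf';
    public final class NormalizeCodeUdf extends UDF {
        public Text evaluate(Text input) {
            if (input == null) {
                return null; // preserve SQL NULL semantics
            }
            // Trim whitespace and upper-case so keys match reliably across sources.
            return new Text(input.toString().trim().toUpperCase());
        }
    }
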
Confidential

Java Developer

Environment: Java 1.7, J2EE, Servlets/Filters, JSP, JSTL, Spring IoC, Spring AOP, Spring MVC, Spring Boot, Microservices, Spring REST, Hibernate 3.0, NodeJS, Ajax, HTML5, jQuery, AngularJS, XSD, XML, AWS, EC2, S3, Tomcat, Netflix Eureka, Eclipse STS, Oracle 11g, Maven, JUnit, Log4j, Jenkins, JProfiler, JMeter, Git, Ansible, Chef, JIRA, Mockito.

Responsibilities:

  • Involved in all phases of the SDLC, including requirements collection and the design and analysis of customer specifications from the business analyst.
  • Followed the Agile methodology to implement the application.
  • Designed and developed the application based on the Spring framework using MVC design patterns.
  • Used Spring IoC, AOP, and Spring Boot to implement the middle tier.
  • Responsible for writing and reviewing server-side code using Spring JDBC and Spring's DAO module to execute stored procedures and SQL queries.
  • Worked with Core Java for business logic.
  • Involved in developing the persistence layer using the Hibernate framework and Spring JPA repositories.
  • Published and consumed Web Services using REST and deployed them on WebSphere Application Server.
  • Implemented microservices using Spring Boot and enabled service discovery with a Netflix Eureka server (see the sketch after this list).
  • Used Swagger to document the REST APIs.
  • Used CI/CD tools such as Jenkins, Git/Stash, and Ansible for daily builds and deployments.
  • Used JSON to validate and document the data required by a given application.
  • Used AWS infrastructure and features such as S3, EC2, RDS, and ELB to host the portal.
  • Created quality and production instances using the AWS Console and the AWS CLI; used PuTTY and WinSCP to log in.
  • Used Log4j to capture logs, including runtime exceptions.
  • Created a web application using NodeJS, RESTful services, and MongoDB.
  • Tested the server side with Mocha for NodeJS.
  • Built Maven scripts that compile the code, pre-compile the JSPs, build an EAR file, and deploy the application on the WebSphere application server.
  • Used a Git repository hosted on a cloud platform.
  • Developed the application using Eclipse STS.
  • Developed unit tests with JUnit and Mockito for all developed modules.
  • Wrote SQL queries for Oracle Database.
  • Participated in and contributed to design reviews and code reviews.
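
A minimal sketch of a Spring Boot REST microservice like those described above; the class name, endpoint, and payload are hypothetical, not the actual service.

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.PathVariable;
    import org.springframework.web.bind.annotation.RestController;

    // With spring-cloud-starter-netflix-eureka-client on the classpath and
    // eureka.client.service-url.defaultZone configured, a service like this
    // would also register itself with a Eureka server for discovery.
    @SpringBootApplication
    @RestController
    public class CustomerServiceSketch {

        public static void main(String[] args) {
            SpringApplication.run(CustomerServiceSketch.class, args);
        }

        // Hypothetical endpoint; the real resources and payloads differed.
        @GetMapping("/customers/{id}")
        public String getCustomer(@PathVariable String id) {
            return "{\"id\":\"" + id + "\",\"status\":\"ACTIVE\"}";
        }
    }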

Confidential

Software Trainee

Environment: SQL Server 2005, DDL, DML, Stored Procedures, Views, UDFs, VSS, Waterfall methodology.

Responsibilities:

  • Involved in creating ER diagrams, mapping the data into database objects, and designing the database and tables.
  • Built table relationships and wrote stored procedures to clean the existing data (see the sketch after this list).
  • Developed SQL scripts to insert, update, and delete data.
  • Wrote complex SQL statements using joins, scalar and table-valued user-defined functions, and views.
  • Worked on stored procedures and database triggers.
  • Generated database SQL scripts and deployed databases, including installation and configuration.
  • Created indexes for faster performance and views for controlling user access to data.
  • Performed unit testing, provided bug fixes and deployment support.
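
The cleanup procedures above were written in T-SQL; to keep one language across this resume's sketches, here is a hedged JDBC illustration of invoking such a procedure from Java. The connection string, database, procedure name, and parameter are all assumptions.

    import java.sql.CallableStatement;
    import java.sql.Connection;
    import java.sql.DriverManager;

    public class CleanDataSketch {
        public static void main(String[] args) throws Exception {
            // Hypothetical SQL Server connection string and credentials.
            String url = "jdbc:sqlserver://localhost:1433;databaseName=SalesDb;user=app;password=secret";
            try (Connection conn = DriverManager.getConnection(url);
                 CallableStatement cs = conn.prepareCall("{call dbo.usp_CleanCustomerData(?)}")) {
                cs.setInt(1, 2005); // e.g. clean records loaded for a given year
                cs.execute();
            }
        }
    }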

Confidential

Software Trainee

Responsibilities:

  • Installed and configured Arista 7050T-64 networking switches.
  • Installed Arista EOS and Ubuntu OS on all the lab computers and switches.
  • Served as the on-call person for troubleshooting the switches.
  • Monitored projects performed by students in the lab.
  • Maintained the switches with regular upgrades and patches.
  • Documented the lab activities for future reference.
  • Explained the concepts of data center switches to the students.
  • Responsible for the creation of the lab manual.
  • Developed a dashboard to display a table of sensor values using PHP.
  • Implemented an embedded prototype that collects and monitors moisture, temperature, humidity, and light using an Intel Edison with an Arduino breakout board; it stores the values in AWS DynamoDB and controls the actuators based on the threshold inputs given in the dashboard.
  • Developed an Android application in Android Studio that receives values from sensors over Bluetooth.
  • Implemented a prototype on a PIC16F microcontroller in embedded C that collects sensor values and sends them to an Android application over Bluetooth.
