Big Data Engineer Resume
Cincinnati, Ohio
SUMMARY:
- Professional IT experience, including recent experience in the Big Data/Hadoop ecosystem
- Competent with Hadoop ecosystem components such as MapReduce (MR1), YARN (MR2), HDFS, Kafka, Pig, Hive, Sqoop, HBase, Impala, Spark Streaming, Flume, Spark SQL, ZooKeeper, Oozie, and Hue
- Extensive knowledge of Hadoop architecture and hands-on experience with components such as JobTracker, TaskTracker, NameNode, and DataNode, along with MapReduce concepts and the HDFS framework
- Experienced in using the Spark API over Hadoop MapReduce to perform analytics on data sets
- Experience converting Hive queries into Spark transformations using Spark RDDs (a brief sketch follows this summary)
- Good experience loading data from Oracle and MySQL databases into HDFS using Sqoop (structured data) and Flume (log files and XML)
- Hands-on experience with in-memory data processing in Apache Spark applications written in Python, using the DataFrame and Spark SQL APIs
- Experience working with Relational Database Management Systems (RDBMS)
- Capable of processing large sets of structured, semi-structured and unstructured data and supporting systems application architecture
- Experience with version control tools such as Git and Subversion
- Experience working with the Hortonworks and Cloudera distributions
- Experience with web-based UI development using jQuery, CSS, HTML, HTML5, XHTML, JavaScript, and Bootstrap
- Good experience working in Agile/Scrum development environments; participated in technical discussions with clients and contributed to project analysis and development specs
- Exposure to GraphX and MLlib (machine learning algorithms such as K-Means and Naïve Bayes)
- Exposure to the AWS stack (EC2, S3, RDS, and Lambda) with a focus on high-availability, fault-tolerant environments
- Excellent communication and problem-solving skills, with the ability to work independently or in a team environment
- Experience using CI tools such as Hudson, Jenkins, Nagios, and Docker for continuous integration and end-to-end automation of builds and deployments
- Highly motivated and passionate about learning new technologies quickly
- Exposure to machine learning techniques such as collaborative filtering, clustering, and classification
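
A minimal Scala sketch of the Hive-query-to-Spark-RDD conversion noted above; the sales table and category column are hypothetical, invented purely for illustration:

    import org.apache.spark.sql.SparkSession

    object HiveToSparkSketch {
      def main(args: Array[String]): Unit = {
        // Session with Hive support so Hive tables are visible to Spark
        val spark = SparkSession.builder()
          .appName("HiveToSparkSketch")
          .enableHiveSupport()
          .getOrCreate()

        // Hive query being converted:
        //   SELECT category, COUNT(*) FROM sales GROUP BY category
        // Equivalent Spark transformations over an RDD of (category, 1) pairs:
        val counts = spark.table("sales")
          .rdd
          .map(row => (row.getAs[String]("category"), 1L))
          .reduceByKey(_ + _)

        counts.take(10).foreach(println)
        spark.stop()
      }
    }
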
TECHNICAL SKILLS:
NoSQL: HBase, Cassandra
Java Technologies and Frameworks: J2EE, JDBC, JSP, Struts, Hibernate
Languages: Java, Python, Scala, R, SQL
Web Technologies: JavaScript, CSS, CSS3, HTML, HTML5, Bootstrap, jQuery
Databases: Oracle, MySQL
Web Servers: WebLogic, Apache Tomcat, GlassFish
IDE Tools: Eclipse, NetBeans, IntelliJ
Build Tools: Maven, sbt (Scala Build Tool), Ant
Operating Systems: Linux (Red Hat, Ubuntu, CentOS), Windows 7
WORK EXPERIENCE:
Big Data Engineer
Confidential, Cincinnati, Ohio
Responsibilities:
- Created scripts to load data from UNIX local file system to HDFS
- Extracted data from different databases and copied into HDFS using Sqoop
- Implemented Flume and the Spark framework for real-time data processing
- Created Spark Streaming code to take source files as input
- Used Oozie workflow to automate all the jobs
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team
- Used Hive partitioning and bucketing to optimize data storage and computed various metrics for reporting (a brief sketch follows this list)
- Created tables, oplogs, and source logs for data ingestion in IBM Big SQL
- Involved in developing Hive DDLs to create, drop, and alter Hive tables
- Created Hive queries and UDFs for loading and processing data
- Developed Spark programs using Scala; involved in creating Spark SQL queries and developed Oozie workflow for Spark jobs
- Responsible for cluster maintenance and monitoring, and for commissioning and decommissioning data nodes
- .88;8/;7
- Troubleshot and reviewed data backups; managed and reviewed log files
- Worked weekend L3 shifts for production issues and monitored the data lake using HDP (Hortonworks)
- Used Agile methodology in developing the application; participated in standup meetings and Sprints
- Tools and Technologies: Hadoop MapReduce, HDFS, Hive, Spark, Big SQL, Sqoop, Impala, SQL, Scala, Java (JDK 1.8), Hadoop (Hortonworks), Eclipse, ZooKeeper 3.4.6, Hue, HDP 2.7.3, Ranger, Kerberos, RHEL 7.x, AWS (cloud computing)
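
The partitioning-and-bucketing work above, sketched as Hive DDL issued through Spark SQL; the metrics table, its columns, and the staging path are hypothetical placeholders:

    // spark-shell style sketch; assumes a SparkSession (spark) with Hive support
    // Partition by ingest date and bucket by customer id to prune scans and speed up joins
    spark.sql(
      """CREATE TABLE IF NOT EXISTS metrics (
        |  customer_id BIGINT,
        |  metric_name STRING,
        |  metric_value DOUBLE)
        |PARTITIONED BY (ingest_date STRING)
        |CLUSTERED BY (customer_id) INTO 32 BUCKETS
        |STORED AS ORC""".stripMargin)

    // Load one day's staged data into its partition (path is a placeholder)
    spark.sql(
      """LOAD DATA INPATH '/staging/metrics/2017-01-01'
        |INTO TABLE metrics PARTITION (ingest_date = '2017-01-01')""".stripMargin)
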
Big Data/Hadoop Developer
Confidential
Responsibilities:
- As part of data acquisition, used Sqoop and Flume to ingest data from servers into Hadoop using incremental imports
- Used Sqoop to import data from RDBMS sources into HDFS
- Developed scripts for automating Job Queues and workflows
- Performed transformation, cleaning, and filtering on imported data using Hive and MapReduce, and loaded the data into HDFS
- Worked on performing transformations and actions on RDDs and Spark Streaming data
- Developed real-time data streaming using Spark with Kafka (see the sketch at the end of this list)
- Implemented Spark using Scala and utilized DataFrames and the Spark SQL API for faster data processing
- Collected large amounts of log data using Flume and staged it in HDFS for further analysis
- Configured Kafka to read and write messages from external programs
- Converted Hive queries into Spark transformations using Spark RDDs
- Responsible for creating shell scripts to run Hive jobs
- Imported and exported data into HDFS and Hive using Sqoop
- Worked with different file formats such as SequenceFile, ORC, and Parquet
- Wrote ETL scripts to automate the ETL process
- Worked with Nagios for monitoring and Jenkins for continuous integration
- Worked in a DevOps capacity, responsible for configuring Puppet, Git, Ansible, and Nagios
- Developed Spark scripts using Scala and Python shell commands as per requirements
- Tools and Technologies: Hadoop, Sqoop, Spark (1.5), Spark SQL, Spark Streaming, Hive, HBase, Linux, Scala, IntelliJ, Tableau, UNIX, Shell Scripting, PuTTY, Oozie, ZooKeeper
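
A sketch of the Spark-with-Kafka streaming work above, using the Spark 1.x direct-stream API that matches the Spark 1.5 stack listed; the broker address and topic name are placeholders:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object KafkaStreamSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("KafkaStreamSketch")
        val ssc = new StreamingContext(conf, Seconds(10)) // 10-second micro-batches

        // Broker list and topic are placeholders
        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
        val topics = Set("events")

        val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
          ssc, kafkaParams, topics)

        // Count messages per batch; a real job would transform and persist them
        stream.map(_._2).count().print()

        ssc.start()
        ssc.awaitTermination()
      }
    }
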
Software Engineer
Confidential, NY
Responsibilities:
- Created scripts to load data from UNIX local file system to HDFS
- Extracted data from different databases and copied into HDFS using Sqoop
- Implemented Flume and the Spark framework for real-time data processing
- Created Spark Streaming code to take source files as input
- Used Oozie workflow to automate all the jobs
- Exported the analyzed data into relational databases using Sqoop for visualization and to generate reports for the BI team
- Used Hive partitioning and bucketing to optimize data storage and computed various metrics for reporting
- Involved in developing Hive DDLs to create, drop, and alter Hive tables
- Created Hive queries and UDFs for loading and processing data
- Worked with HBase to perform quick lookups (updates, inserts, and deletes) in Hadoop (see the sketch at the end of this list)
- Developed Spark programs using Scala; involved in creating Spark SQL queries and developed Oozie workflow for Spark jobs
- Performed analytics on structured and unstructured data and managed ingestion of large data volumes using Flume, Kafka, and Sqoop
- Used Bitbucket as the code repository and for version control
- Managed builds and deployments using Maven, Nagios, Jenkins, and Chef SCM tools
- Involved in deployment enhancements and issue resolution
- Created branches, tags, and labels and performed merges in Stash and Git
- Used Agile methodology in developing the application; participated in standup meetings and Sprints
- Tools and Technologies: Hadoop MapReduce, HDFS, Hive, Spark, Flume, Kafka, Sqoop, Impala, SQL, Scala, Java (JDK 1.6), Hadoop (Cloudera), Tableau, Eclipse, CDH 5, ZooKeeper, Hue, Chef, Jenkins
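
A sketch of the HBase quick-lookup pattern above, using the HBase 1.x Java client from Scala; the customers table and info column family are hypothetical:

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Delete, Get, Put}
    import org.apache.hadoop.hbase.util.Bytes

    object HBaseLookupSketch {
      def main(args: Array[String]): Unit = {
        // Table and column family names are placeholders
        val conf = HBaseConfiguration.create()
        val connection = ConnectionFactory.createConnection(conf)
        val table = connection.getTable(TableName.valueOf("customers"))

        // Insert or update a cell
        val put = new Put(Bytes.toBytes("row-1001"))
        put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("name"), Bytes.toBytes("Alice"))
        table.put(put)

        // Point lookup by row key
        val result = table.get(new Get(Bytes.toBytes("row-1001")))
        println(Bytes.toString(result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"))))

        // Delete the row
        table.delete(new Delete(Bytes.toBytes("row-1001")))

        table.close()
        connection.close()
      }
    }
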
Software Engineer
Confidential, NY
Responsibilities:
- Used Spring MVC to implement the Model-View-Controller design pattern
- Used Spring's JDBC support for all CRUD (Create, Read, Update, Delete) operations (a brief sketch follows this list)
- Gained experience with the J2EE framework Spring MVC
- Generated WSDLs for publishing web services
- Used the Maven build tool to build the project
- Used Log4j for logging errors, messages, and performance data
- Worked in Agile environment with active scrum participation
- Tools and Technologies: J2EE, Spring MVC Framework, WS, JavaScript, Oracle, SQL Server, WebLogic 11g, UNIX, Version One, Oxygen XML Editor, Maven, SVN
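
A brief sketch of the Spring JDBC CRUD usage above; the data-source URL, credentials, and customers table are hypothetical:

    import org.springframework.jdbc.core.JdbcTemplate
    import org.springframework.jdbc.datasource.DriverManagerDataSource

    object SpringJdbcSketch {
      def main(args: Array[String]): Unit = {
        // Connection details and table are placeholders
        val dataSource = new DriverManagerDataSource(
          "jdbc:oracle:thin:@//dbhost:1521/app", "app_user", "secret")
        val jdbc = new JdbcTemplate(dataSource)

        // Create
        jdbc.update("INSERT INTO customers (id, name) VALUES (?, ?)", Long.box(1001L), "Alice")

        // Read
        val name = jdbc.queryForObject(
          "SELECT name FROM customers WHERE id = ?", classOf[String], Long.box(1001L))
        println(name)

        // Update
        jdbc.update("UPDATE customers SET name = ? WHERE id = ?", "Alicia", Long.box(1001L))

        // Delete
        jdbc.update("DELETE FROM customers WHERE id = ?", Long.box(1001L))
      }
    }
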
Front End Developer
Confidential, Ohio
Responsibilities:
- Wrote UI/Business validations for the owned use cases
- Executed SQL queries to perform CRUD operations on customer records (see the sketch at the end of this list)
- Assisted developers throughout Scrum, helped the team understand requirements, and assessed the design and code while reviewing solutions for the portals being built
- Used Maven as the build tool and added external dependencies
- Involved in configuring Jenkins for many jobs and handled numerous Jenkins-side issues
- Resolved defects reported by end users and worked on new change requests
- Used jQuery and JavaScript, along with CSS and HTML, to handle all front-page functionality
- Tools and Technologies: Core Java, J2EE, Spring, WS, JavaScript, Oracle, SQL Server, Windows 7, WebLogic 11g, UNIX, Version One, FogBugz, Oxygen XML Editor, Maven, jQuery, SVN
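
A sketch of the CRUD queries on customer records noted above, this time with plain JDBC rather than Spring's wrapper; the connection string and table are hypothetical:

    import java.sql.DriverManager

    object CustomerCrudSketch {
      def main(args: Array[String]): Unit = {
        // Connection string, credentials, and table are placeholders
        val conn = DriverManager.getConnection(
          "jdbc:oracle:thin:@//dbhost:1521/app", "app_user", "secret")
        try {
          // Create
          val insert = conn.prepareStatement("INSERT INTO customers (id, name) VALUES (?, ?)")
          insert.setLong(1, 1001L)
          insert.setString(2, "Alice")
          insert.executeUpdate()

          // Read
          val select = conn.prepareStatement("SELECT name FROM customers WHERE id = ?")
          select.setLong(1, 1001L)
          val rs = select.executeQuery()
          while (rs.next()) println(rs.getString("name"))

          // Update
          val update = conn.prepareStatement("UPDATE customers SET name = ? WHERE id = ?")
          update.setString(1, "Alicia")
          update.setLong(2, 1001L)
          update.executeUpdate()

          // Delete
          val delete = conn.prepareStatement("DELETE FROM customers WHERE id = ?")
          delete.setLong(1, 1001L)
          delete.executeUpdate()
        } finally {
          conn.close()
        }
      }
    }
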
Software Engineer
Confidential
Responsibilities:
- Used a MySQL database to store all data
- Generated all necessary reports from the database
- Implemented web services (WSDLs) using JAX-WS (a brief sketch follows this list)
- Used custom exception handling to deliver clear messages to the end user
- Bug fixing and maintenance of the product
- Created the presentation layer with JavaScript and HTML
- Involved in daily stand up and weekly sprint meetings
- Tools and Technologies: Core Java, J2EE, JDBC, Oracle 11g, Hibernate, Spring, Tomcat Server, Windows 7, Maven, SVN, PuTTY, MySQL
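
A minimal JAX-WS sketch of the WSDL-publishing work above; the service name and operation are invented for illustration:

    import javax.jws.{WebMethod, WebService}
    import javax.xml.ws.Endpoint

    // Hypothetical service; publishing it exposes the generated WSDL at <url>?wsdl
    @WebService
    class ReportService {
      @WebMethod
      def reportCount(reportType: String): Int = {
        // Placeholder logic; the real service would query the MySQL report data
        if (reportType == "daily") 42 else 0
      }
    }

    object PublishReportService {
      def main(args: Array[String]): Unit = {
        Endpoint.publish("http://localhost:8080/reports", new ReportService)
      }
    }
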