
Sr. Big Data/Hadoop Engineer Resume


Atlanta, GA

SUMMARY:

  • 8+ years of professional experience in IT, including analysis, design, coding, testing, implementation and support in Java and Big Data technologies, working with Apache Hadoop ecosystem components.
  • 4+ years of dedicated experience in Hadoop and its components, such as HDFS, MapReduce, Apache Pig, Hive, Sqoop, HBase and Oozie.
  • Involved in writing the Pig scripts and Pig UDFs to pre-process the data for analysis
  • Experience in creating Hive External and Managed tables and writing queries on them
  • Hands-on experience in troubleshooting operational issues and identifying root causes of Hadoop cluster problems.
  • Expertise in managing data from multiple sources and transforming large data sets.
  • Extensively used Sqoop to import data into HDFS from RDBMS and vice-versa.
  • Designed and created Hive external tables using a shared metastore (instead of the default Derby database) with partitioning, dynamic partitioning and buckets.
  • Experience in using the Oozie workflow scheduler to manage Hadoop jobs as a Directed Acyclic Graph (DAG) of actions with control flows.
  • Experience in integrating Hive and HBase for effective operations.
  • Good understanding of serialization and compression formats such as Avro, Snappy and LZO.
  • Experienced in working with the Spark ecosystem, using Spark SQL and Scala queries on different data file formats such as .txt and .csv.
  • Hands-on experience in migrating MapReduce jobs into Spark RDD transformations using Scala.
  • Good experience in Cloudera, Hortonworks & Apache Hadoop distributions.
  • Strong understanding of NoSQL databases and hands-on experience writing applications on NoSQL databases such as HBase, Cassandra, MongoDB, Redis and Neo4j.
  • Working knowledge of major Hadoop ecosystem components: Pig, Hive, Sqoop and Flume.
  • Experience in implementing custom Partitioners and Combiners for effective data distribution (see the sketch after this list).
  • Experience in writing MapReduce jobs for text mining for predictive analysis.
  • Experience in analyzing data using CQL (Cassandra Query Language), HiveQL and Pig Latin programs.
  • Good working knowledge of MapReduce and Apache Pig.
  • Experience in application development using Java, J2EE, EJB, Hibernate, JDBC, Jakarta Struts, JSP and Servlets.
  • Experience in using IDEs such as Eclipse and MyEclipse, and the SVN and CVS repositories.
  • Experience in using the build tools Ant and Maven.
  • Comfortable working under different delivery methodologies, including Agile/Scrum and Waterfall.
  • Excellent communication and analytical skills and flexible to adapt to evolving technology.
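
The following is a minimal sketch of the kind of custom Partitioner mentioned above. The merchant-region key layout and class names are illustrative assumptions, not the actual production code.

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Routes keys that start with a region prefix (e.g. "EAST-merchant42") to the
    // same reducer so related records are aggregated together.
    public class RegionPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text key, IntWritable value, int numPartitions) {
            String region = key.toString().split("-")[0];
            return (region.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

It would be wired into a job with job.setPartitionerClass(RegionPartitioner.class); a Combiner is typically the job's own reducer class registered via job.setCombinerClass(...) to pre-aggregate map output.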

TECHNICAL SKILLS:

Languages: C, C++, Python, Java, J2EE, SQL, PL/SQL, Scala, UML, XML

Hadoop Ecosystem: HDFS, MapReduce, Spark Core, Spark Streaming, Spark SQL, Hive, Pig, Sqoop, Flume, Kafka, Oozie, Zookeeper.

Databases: Oracle 10g/11g, SQL Server, MySQL, DB2

NoSQL: HBase, Cassandra, MongoDB

Application / Web Servers: Apache Tomcat, JBoss, Mongrel, WebLogic, WebSphere

Web Services: SOAP, REST

Operating systems: Windows, Unix, Linux

Microsoft Products: MS Office, MS Visio, MS Project

Frameworks: Spring, Hibernate, Struts

PROFESSIONAL EXPERIENCE:

Confidential, Atlanta, GA

Sr. Big Data/Hadoop Engineer

Roles & Responsibilities:

  • Moved all crawl data flat files generated from various retailers to HDFS for further processing.
  • Wrote Apache Pig scripts to process the HDFS data.
  • Created Hive tables to store the processed results in a tabular format.
  • Developed Sqoop scripts to enable the interaction between Pig and the MySQL database.
  • Wrote script files for processing data and loading it into HDFS.
  • Wrote HDFS CLI commands for day-to-day file management.
  • Developed UNIX shell scripts for creating reports from Hive data.
  • Completely involved in the requirement analysis phase.
  • Created two different users (hduser for performing HDFS operations and mapred for performing MapReduce operations only).
  • Ensured NFS was configured for the NameNode.
  • Set up passwordless SSH across the Hadoop cluster.
  • Wrote Pig scripts to process credit card and debit card transactions for active customers by joining data from HDFS and Hive using HCatalog for various merchants.
  • Responsible for writing a Lucene search program for high-performance, full-featured text search of merchants.
  • Wrote Python UDFs, run via streaming, to apply regular expressions and return the valid merchant codes and names.
  • Created Hive scripts for joining the raw data with the lookup data and for aggregation operations as per the business requirements.
  • Set up cron jobs to delete Hadoop logs, old local job files and temporary cluster files.
  • Set up Hive with MySQL as a remote metastore.
  • Moved all log/text files generated by various products into HDFS location
  • Wrote MapReduce code that takes log files as input, parses them and structures them in a tabular format to facilitate effective querying of the log data (see the mapper sketch after this list).
  • Loading data from UNIX file system to HDFS and vice versa.
  • Implemented real-time data ingestion using Kafka (see the consumer sketch after this list).
  • Created External Hive Tables on top of parsed data.
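
A minimal sketch of the log-parsing mapper described above; the assumed log layout (timestamp, level, message separated by spaces) and class names are illustrative only.

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Turns one raw log line into tab-delimited fields so the output files can be
    // exposed as a tabular Hive external table.
    public class LogParseMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            // Assumed layout: "2019-01-01T10:00:00 INFO message text ..."
            String[] parts = line.toString().split(" ", 3);
            if (parts.length == 3) {
                context.write(new Text(parts[0]), new Text(parts[1] + "\t" + parts[2]));
            }
        }
    }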
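
And a sketch of a Kafka consumer loop for the real-time ingestion bullet, assuming a kafka-clients 2.x dependency; the broker address, group id and topic name are placeholders.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class IngestConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");    // placeholder broker
            props.put("group.id", "ingest-group");               // placeholder group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("crawl-events")); // placeholder topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // Hand each message to the downstream HDFS/Hive write path.
                        System.out.println(record.value());
                    }
                }
            }
        }
    }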

Environment: Hadoop, HDFS, MapReduce, Apache Pig, Hive, Sqoop, Linux, MySQL, Spark, HBase, Hortonworks HDP 2.6.5

Confidential, Chicago, IL

Hadoop Developer/Spark Developer

Roles & Responsibilities:

  • Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Created Hive tables and worked on them using HiveQL.
  • End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
  • Exported analyzed data from HDFS using Sqoop for generating reports.
  • Used MapReduce and Sqoop to load, aggregate, store and analyze web log data from different web servers.
  • Wrote Java UDFs in Pig and Hive to convert card names to upper case and to normalize dates into a suitable format (see the UDF sketch after this list).
  • Responsible for building the Docker containers and scheduling the Oozie workflows run during the sprints.
  • Periodically reviewed Hadoop-related logs, fixed errors and prevented recurring errors by analyzing the warnings.
  • Involved in analyzing system failures, identifying root causes and recommending courses of action.
  • End-to-end involvement in data ingestion, cleansing, and transformation in Hadoop.
  • Developed Hive queries for the analysts.
  • Used Impala to read, write and query the Hadoop data in HDFS alongside Cassandra, and configured Kafka to read and write messages from external programs.
  • Optimized existing algorithms in Hadoop using SparkContext, Spark SQL, DataFrames and Pair RDDs (see the Pair RDD sketch after this list).
  • Provided cluster coordination services through ZooKeeper.
  • Wrote Storm Spouts and Bolts to collect the real-time customer data stream from the Kafka broker, process it and store it into HBase.
  • Analyzed the log files and processed them through Flume.
  • Optimized MapReduce algorithms using combiners and partitioners to deliver the best results, and worked on application performance optimization.
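
A minimal sketch of the kind of Java UDF described above, using Hive's simple UDF API; the function and class names are illustrative.

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    // Upper-cases a card name column; registered in Hive with, e.g.,
    // CREATE TEMPORARY FUNCTION upper_card AS 'UpperCaseCardName';
    public class UpperCaseCardName extends UDF {
        public Text evaluate(Text cardName) {
            if (cardName == null) {
                return null;
            }
            return new Text(cardName.toString().toUpperCase());
        }
    }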
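
And a sketch of the Pair RDD style of optimization, using Spark's Java API; the input path, record layout ("merchantId,amount" per line) and output path are assumptions.

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class TransactionCounts {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("TransactionCounts");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Hypothetical input: one "merchantId,amount" record per line.
            JavaRDD<String> lines = sc.textFile("hdfs:///data/transactions");
            JavaPairRDD<String, Integer> counts = lines
                    .mapToPair(line -> new Tuple2<>(line.split(",")[0], 1))
                    .reduceByKey((a, b) -> a + b);   // aggregates map-side, unlike groupByKey
            counts.saveAsTextFile("hdfs:///out/merchant_counts");

            sc.stop();
        }
    }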

Environment: Hadoop, HDFS, MapReduce, Apache Pig, Hive, Sqoop, Linux, MySQL, Spark

Confidential, Columbus, Ohio

Hadoop Developer

Roles & Responsibilities:

  • Imported data using Sqoop to load data from MySQL to HDFS on a regular basis.
  • Developed scripts and batch jobs to schedule various Hadoop programs.
  • Wrote Flume configuration files for importing streaming log data into MongoDB.
  • Performed masking of sensitive customer data using Flume interceptors (see the interceptor sketch after this list).
  • Used Impala to analyze data ingested into Hive tables and compute various metrics for reporting on the dashboard.
  • Implemented Spark using Scala, utilizing DataFrames and the Spark SQL API for faster processing of data.
  • Wrote Hive queries for data analysis to meet the business requirements.
  • Developed a POC to pull real-time Twitter streams using Kafka, Flume and Spark.
  • Used the Parquet columnar storage format for Hive table creation.
  • Implemented Kerberos security.
  • Created users in Active Directory and mapped the roles in each group for the users in Apache Sentry.
  • Involved in LDAP implementation for different types of access in AD for Hue, Hive and Pig.
  • Handled Hive and Spark tuning end to end, including partitioning/bucketing of ORC tables and executor/driver memory settings.
  • Involved in extracting, transforming and loading data from Hive into an RDBMS.
  • Optimized MapReduce Jobs to use HDFS efficiently by using various compression mechanisms.
  • Involved in extracting, transforming and loading data into the HBase database.
  • Implemented Partitioning and bucketing in Hive based on the requirement.
  • Involved in transforming data within a Hadoop cluster.
  • Used MapReduce to parse weblog data, converting raw weblogs into parsed, delimited records.
  • Written Hive UDFs to extract data from staging tables
  • Developed Hive external tables.
  • Created HBase tables to store JSON data (see the HBase client sketch after this list).
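
A minimal sketch of a masking interceptor of the kind described above, using Flume's Interceptor API; the masking regex and class names are illustrative, not the exact production rules.

    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.flume.Context;
    import org.apache.flume.Event;
    import org.apache.flume.interceptor.Interceptor;

    // Masks long digit runs that look like card numbers before events reach the channel.
    public class MaskingInterceptor implements Interceptor {

        @Override
        public void initialize() { }

        @Override
        public Event intercept(Event event) {
            String body = new String(event.getBody(), StandardCharsets.UTF_8);
            String masked = body.replaceAll("\\b\\d{12,19}\\b", "****MASKED****");
            event.setBody(masked.getBytes(StandardCharsets.UTF_8));
            return event;
        }

        @Override
        public List<Event> intercept(List<Event> events) {
            List<Event> out = new ArrayList<>(events.size());
            for (Event e : events) {
                out.add(intercept(e));
            }
            return out;
        }

        @Override
        public void close() { }

        public static class Builder implements Interceptor.Builder {
            @Override
            public Interceptor build() {
                return new MaskingInterceptor();
            }

            @Override
            public void configure(Context context) { }
        }
    }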
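
And a sketch of writing a JSON payload into an HBase table with the Java client; the table name, column family, row key and payload are placeholders.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class JsonToHBase {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("customer_events"))) {
                String json = "{\"customerId\":\"C100\",\"event\":\"login\"}"; // sample payload
                Put put = new Put(Bytes.toBytes("C100-20190101"));             // sample row key
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("json"), Bytes.toBytes(json));
                table.put(put);
            }
        }
    }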

Environment: Eclipse, JDK 1.8.0, Hadoop 2.8, HDFS, MapReduce, Pig 0.15.0, Hive 2.0, HBase, Kerberos, Apache Maven 3.0.3

Confidential

Java Developer

Roles & Responsibilities:

  • Analyzed the feasibility documents.
  • Coded the business logic methods in core Java.
  • Involved in development of the Action classes and Action Forms based on the Struts framework.
  • Participated in client-side validation and server-side validation.
  • Involved in creating the Struts configuration file and validation file for the skip module using the Struts framework.
  • Developed Java programs, JSP pages and servlets using the Spring framework.
  • Involved in creating database tables and writing complex T-SQL queries and stored procedures in SQL Server.
  • Worked with AJAX framework to get the asynchronous response for the user request and used JavaScript for the validation.
  • Used EJBs in the application and developed Session beans to implement business logic at the middle tier level.
  • Actively involved in writing SQL using SQL Query Builder.
  • Used JAXB to read and manipulate XML properties (see the JAXB sketch after this list).
  • Used JNI to call libraries and other functions implemented in C.
  • Handled server-related issues, new requirements, changes and patch movements.
  • Developed the Restful Web Services for various XSD schemas.
  • Used Servlets to implement Business components.
  • Designed and Developed required Manager Classes for database operations.
  • Developed various Servlets for monitoring the application.
  • Designed the UML class diagram, Sequence diagrams for Trade Services.
  • Designed the complete Hibernate mapping for SQL Server for PDM.
  • Designed the complete JAXB classes mapping for various XSD schemas.
  • Involved in writing JUnit test classes for unit testing.
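
A minimal sketch of reading and writing XML with JAXB, as referenced above; the Trade class and file name are stand-ins for the real XSD-generated classes.

    import java.io.File;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Marshaller;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.bind.annotation.XmlRootElement;

    public class TradeXmlDemo {

        // Stand-in bound class; the real project used classes generated from the XSD.
        @XmlRootElement
        public static class Trade {
            public String id;
            public double amount;
        }

        public static void main(String[] args) throws Exception {
            JAXBContext context = JAXBContext.newInstance(Trade.class);

            // Read (unmarshal) an XML document into the bound Java object.
            Unmarshaller unmarshaller = context.createUnmarshaller();
            Trade trade = (Trade) unmarshaller.unmarshal(new File("trade.xml")); // placeholder file

            // Modify a property and write (marshal) it back out as XML.
            trade.amount = trade.amount * 1.01;
            Marshaller marshaller = context.createMarshaller();
            marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
            marshaller.marshal(trade, System.out);
        }
    }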

Environment: Eclipse Neon, JDK 1.8.0, Java, Servlets, JSP, EJB, XML, SQL Server, Struts, JUnit, SQL, UNIX, UML, Apache Maven 3.0.3

Confidential

Java Developer

Roles & Responsibilities:

  • Identified, reviewed, assessed and resolved production issues.
  • Configured and maintained the associated application components and environments as required.
  • Provided application support to management, team members and end users.
  • Gained experience with the sales functionality and order management.
  • Worked on email template creation with HTML as per the requirements.
  • Configured email notifications based on the requirements.
  • Involved in writing programs for XA transaction management on multiple databases of the application.
  • Developed Java programs, JSP pages and servlets using the Jakarta Struts framework.
  • Involved in creating database tables and writing complex T-SQL queries and stored procedures in SQL Server.
  • Worked with AJAX framework to get the asynchronous response for the user request and used JavaScript for the validation.
  • Used EJBs in the application and developed Session beans to implement business logic at the middle tier level.
  • Actively involved in writing SQL using SQL Query Builder.
  • Involved in coordinating onshore/offshore development and mentoring new team members.
  • Extensively used the Ant tool to build and configure J2EE applications, and used Log4j for logging in the application.
  • Used JAXB to read and manipulate XML properties.
  • Used JNI to call libraries and other functions implemented in C.
  • Used Prototype, MooTools and script.aculo.us for a fluid user interface.
  • Involved in fixing defects and unit testing with test cases using JUnit (see the test sketch after this list).
  • Involved in configuring business components, views, applets, controls, menus and other objects to meet the business requirements.
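
A minimal JUnit 4 sketch of the style of unit test referred to above; the OrderTotal helper is an illustrative stand-in for the real business classes.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class OrderTotalTest {

        // Tiny illustrative helper; real tests exercised actual order-management classes.
        static class OrderTotal {
            static double calculate(double[] lineItems) {
                double total = 0.0;
                for (double amount : lineItems) {
                    total += amount;
                }
                return total;
            }
        }

        @Test
        public void addsLineItemAmounts() {
            assertEquals(20.0, OrderTotal.calculate(new double[] {10.0, 2.5, 7.5}), 0.0001);
        }
    }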

Environment: JDK 1.8.0, Java, Servlets, JSP, XML, SQL Server, JUnit, Eclipse, UNIX, UML, Apache Maven 3.0.3, EJB, XSLT, CVS, J2EE, AJAX, Struts, Hibernate, Ant, Tomcat, JMS, Log4j, Oracle 10g, Solaris, Windows 7/XP
