Hadoop Engineer Support Analyst Resume
Peoria, IL
SUMMARY
- I have 4+ years of experience in the IT industry as a Hadoop Engineer/Analyst, with strong expertise in importing and exporting data between RDBMS and HDFS, Hive tables, and HBase using Sqoop. I have worked on a variety of application incidents to provide insights, perform ad-hoc analysis, and implement fixes in the production environment.
- I have performed data analysis on Hive tables using Hive queries (HiveQL) and have strong experience in system monitoring, development, and support activities for Hadoop.
- In addition, I have developed and tested data ingestion, preparation, and dispatch jobs and have experience working with Spark SQL to process data in Hive tables. I have excellent communication skills, am currently working with the Confidential CDDW support team, and expect to be a strong addition to the Confidential team.
TECHNICAL SKILLS
Hadoop/Big Data: HDFS, MapReduce, Snowflake, Impala, Cloudera, Spark Core, Spark SQL, Hive, Pig, Sqoop, Flume, and ZooKeeper
AWS Components: EC2, EMR, S3, Redshift
Languages and Technologies: SQL, Scala, Pig Latin, HiveQL, and UNIX shell scripting
Operating Systems: Linux, Windows, CentOS, Ubuntu
Databases: MySQL, Oracle 11g/10g/9i, MS SQL Server, HBase, MongoDB
Tools: WinSCP, Wireshark, JIRA, IBM Tivoli
Others: HTML, XML, CSS, JavaScript
PROFESSIONAL EXPERIENCE
Confidential, Peoria, IL
Hadoop Engineer Support Analyst
Environment: Hadoop, HDFS, Cloudera, MapReduce, Hive, Impala, Snowflake, XML, Oozie, Shell Scripting, Postgres, Linux.
Responsibilities:
- The Confidential Digital Data Warehouse (CDDW) platform stages the data received from Confidential's dealers and prepares it for consumption by a wide variety of uses, such as customer portal services, analytics for equipment monitoring, parts pricing, customer lead generation, and other emerging applications.
- Work with others on the Support team to monitor and support various aspects of the CDDW product. When incidents are found, create the proper documentation and initiate conversation with the appropriate team (CDDW Development, CDDW Post Prod) for resolution.
- Debug Oozie workflow failures for jobs that run 24x7 at scheduled frequencies.
- Perform ad-hoc purge and reload activities when they are raised by integrators and consumers.
- Prepare necessary reports (dashboard report, XML report, ...).
- Perform support-related deployment activities by coordinating with the DDSW Dev team during monthly deployments.
- Perform the OLGA monthly purge activity, which deletes the historical data and reloads it from the new files.
- Monitor DFDW daily loads.
- Monitor the BDR job that copies data from prod to QA.
- Monitor Cube Master failures.
- Send dealer XML files on request.
- Suspend and resume all jobs during outage periods by coordinating with the EIM/Hadoop Infra team.
- Create scripts to automate daily support activities (a minimal sketch follows this list).
- Handle Customer Portal incidents (P1).
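A minimal shell sketch of the kind of daily support automation referenced in this list, assuming a single Oozie server; the Oozie URL, mail alias, and report path below are hypothetical placeholders, not actual CDDW configuration.

```bash
#!/usr/bin/env bash
# Hypothetical daily support check: flag killed Oozie coordinators and
# capture HDFS capacity, then mail the report only when attention is needed.

OOZIE_URL="http://oozie-host:11000/oozie"   # hypothetical Oozie endpoint
ALERT_MAIL="cddw-support@example.com"       # hypothetical support alias
REPORT="/tmp/daily_support_$(date +%Y%m%d).log"

{
  echo "=== Daily support check: $(date) ==="

  echo "--- Oozie coordinators killed ---"
  oozie jobs -oozie "$OOZIE_URL" -jobtype coordinator -filter status=KILLED -len 20

  echo "--- HDFS capacity ---"
  hdfs dfs -df -h /
} > "$REPORT" 2>&1

# Mail the report only if a killed coordinator shows up, to keep noise down.
if grep -q "KILLED" "$REPORT"; then
  mail -s "CDDW daily support check: attention needed" "$ALERT_MAIL" < "$REPORT"
fi
```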
Confidential, Waukegan, IL
Hadoop Engineer Support Analyst
Responsibilities:
- Served as data analyst across all environments.
- Hands-on experience installing, configuring, supporting, and managing Hadoop clusters using Cloudera (CDH3, CDH4).
- Monitored day-to-day incidents via ServiceNow.
- Client-interfacing skills: interacted frequently with the Managing Director, Director, Senior Vice President, and Vice President on resolving show-stoppers and critical incidents.
- Responsible for providing off-hours/weekend on-call rotation support.
- Provided support and documentation to the offshore L1 team.
- Worked on different application incidents for AbbVie to provide insights, perform ad-hoc analysis, and implement fixes in the production environment.
- Performed data analysis on Hive tables using Hive queries (HiveQL).
- Involved in estimations and release planning for project release activities.
- Performed valuation, validation, and supporting analysis in the production environment, covering inventory and core metrics.
- Good exposure to handling critical incidents during the quarter.
- Applied data patches for production issues via tactical fixes whenever required and provided suggestions to improve the quality of data consumed by downstream systems.
- Good experience in system monitoring, development, and support activities for Hadoop.
- Involved in developing a workflow for scheduling and orchestrating the ETL process.
- Experience in importing and exporting data between RDBMS and HDFS, Hive tables, and HBase using Sqoop.
- Worked with different file formats such as JSON, XML, Avro data files, and text files.
- Excellent understanding and knowledge of NoSQL databases like HBase and Cassandra.
- Hands-on experience creating Apache Spark RDD transformations on datasets in the Hadoop data lake.
- Experience working with Spark SQL for processing data in Hive tables.
- Developed and tested data ingestion, preparation, and dispatch jobs.
- Worked on HBase table setup and a shell script to automate the ingestion process, as sketched below.
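As a sketch of the ingestion automation in the last bullet: create the HBase table if needed, then load it from the source RDBMS with Sqoop. The JDBC URL, credentials, table, and column-family names are hypothetical placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical ingestion script: set up an HBase table and populate it
# from a relational source using Sqoop's HBase import options.

JDBC_URL="jdbc:mysql://db-host:3306/sales"   # hypothetical source database
SRC_TABLE="orders"                           # hypothetical source table
HBASE_TABLE="orders"
COLUMN_FAMILY="cf"

# Create the HBase table; the error is harmless if it already exists.
echo "create '${HBASE_TABLE}', '${COLUMN_FAMILY}'" | hbase shell

# Import the source table into HBase, keyed on the primary key column.
sqoop import \
  --connect "$JDBC_URL" \
  --username etl_user \
  --password-file /user/etl/.db_password \
  --table "$SRC_TABLE" \
  --hbase-table "$HBASE_TABLE" \
  --column-family "$COLUMN_FAMILY" \
  --hbase-row-key order_id \
  --num-mappers 4
```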
Confidential, Chicago, IL
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytic tools, including Hive, the HBase NoSQL database, and Sqoop.
- Processed HDFS data, created external tables using Hive, and developed scripts to ingest and repair tables that can be reused across the project.
- Created 30 buckets for each Hive table, clustered by client ID, for better performance (optimization) while updating the tables (a DDL sketch follows this entry).
- Created Hive tables, loaded the data using Sqoop, and worked on them using HiveQL.
- Expert in importing and exporting data to and from HDFS using Sqoop and Flume.
- Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from MySQL into HDFS using Sqoop.
- Created monitoring tools to raise alerts on any functionality breakages in the system.
- Worked with the business to support new product installations and expansion projects for data setups.
- Executed queries using Hive and developed MapReduce jobs to analyze data.
- Prepared an ETL framework with the help of Sqoop and Hive to frequently bring in data from the source and make it available for consumption.
- Involved in emitting processed data from Hadoop to relational databases or external file systems using Sqoop, HDFS get, or copyToLocal.
- Involved in loading data from the UNIX file system to HDFS.
- Optimized Hive queries using various file formats such as JSON, Avro, ORC, and Parquet.
Environment: CDH5, MapReduce, HDFS, Hive, Sqoop, Pig, Linux, XML, MySQL, MySQL Workbench, PL/SQL, SQL connector
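A DDL sketch of the 30-bucket layout mentioned above, using a hypothetical table and column names; the transactional (ACID) table property is an assumption added so the table supports updates, which is what the bucketing was paired with.

```bash
#!/usr/bin/env bash
# Hypothetical bucketed Hive table: clustering by client_id into 30 buckets
# keeps each client's rows together, which helps updates and joins on that key.
# The ACID/transactional setting is an assumption, not taken from the resume.

hive -e "
CREATE TABLE IF NOT EXISTS client_transactions (
  txn_id    BIGINT,
  client_id BIGINT,
  amount    DECIMAL(12,2),
  txn_ts    TIMESTAMP
)
CLUSTERED BY (client_id) INTO 30 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');
"
```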
Confidential, Charleston, IL
Graduate Assistant
Responsibilities:
- Involved in maintaining student databases
- Retrieved students' data to verify course intakes for the semester and scholarship details
- Created marks-trend reports for students to improve performance in the next semester
- Involved in administrative tasks such as scheduling professors' classes and maintaining department needs
- Created presentations and documentation on Big Data technologies for professors' subjects
- Involved in creating homework assignments for professors based on Big Data and cloud technologies
- Created web pages for professors to communicate their seminars to students
- Networked with developers to solve student issues
- Worked on Java-related projects, designing database tables for optimal storage of data
Confidential
Java Developer - Internship
Responsibilities:
- Enhanced the system according to customer requirements.
- Developed Servlets and JavaServer Pages (JSP) for the presentation layer.
- Wrote procedures, functions, and triggers as part of accessing the DB layer.
- Developed PL/SQL queries to generate reports based on client requirements.
- Created test case scenarios for functional testing.
- Used HTML and JavaScript validation in JSP pages.
- Helped design the database tables for optimal storage of data.
- Coded JDBC calls in the Servlets to access the Oracle database tables.
- Responsible for integration, unit testing, system testing, and stress testing across all phases of the project.
- Prepared a final guideline document to serve as a tutorial for users of the application.
- Involved in loading data from the Linux file system to HDFS.
- Imported and exported data into HDFS and Hive using Sqoop and Flume.
- Developed MapReduce jobs for log analysis, recommendation, and analytics.
- Wrote MapReduce jobs to generate reports on the number of activities created per day from data dumped from multiple sources, with the output written back to HDFS.
- Responsible for loading the customer's data and event logs from Oracle and Teradata databases into HDFS using Sqoop (see the sketch at the end of this entry).
- Involved in initiating and successfully completing a proof of concept on Sqoop for pre-processing, increased reliability, and ease of scalability over the traditional Oracle database.
- Installed and configured Hadoop HDFS, MapReduce, Pig, Hive, and Sqoop.
- Exported analyzed data from HDFS using Sqoop for generating reports.
- Used MapReduce and Sqoop to load, aggregate, store, and analyze web log data from different web servers.
- Developed Hive queries for the analysts.
- Provided cluster coordination services through ZooKeeper.
- End-to-end performance tuning of Hadoop clusters and Hadoop MapReduce routines against very large data sets.
- Reviewed the HDFS usage and system design for future scalability and fault-tolerance.
Environment: Java 1.5, Servlets, J2EE 1.4, JDBC, SOAP, Oracle 10g, PL SQL, HTML, JSP, Eclipse, UNIX
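A sketch of the Sqoop load from Oracle into HDFS referenced above; the connection string, credentials, source table, and target directory are hypothetical placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical Sqoop import of an Oracle table into a dated HDFS directory,
# storing the rows as Avro data files for downstream MapReduce/Hive jobs.

sqoop import \
  --connect "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB" \
  --username etl_user \
  --password-file /user/etl/.oracle_password \
  --table CUSTOMER.EVENT_LOGS \
  --target-dir "/data/raw/event_logs/$(date +%Y%m%d)" \
  --as-avrodatafile \
  --num-mappers 4
```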