Sr. Big Data Engineer/Lead Resume
Phoenix, AZ
SUMMARY:
- Confidential is a Cloudera Certified professional with over 13 years of Information Technology experience, all of it in database design and architecture, including over 5 years in Big Data architecture; he is being presented for the role of Hadoop Architect.
- Experience in all phases of the software development life cycle and Agile methodology.
- Expertise in implementing, consulting on, and managing Hadoop clusters and ecosystem components such as HDFS, MapReduce, Pig, Hive, Flume, Oozie, and Zookeeper.
- Around 5 years of experience building large-scale distributed data processing, with in-depth knowledge of Hadoop architecture (MR1 and MR2/YARN), and 7+ years of experience in Java.
- Expertise in batch processing using Hadoop MapReduce, Pig, and Hive.
- Significant experience in real-time processing using Spark Streaming (Scala) with Kafka.
- Hands-on experience writing Pig/Hive scripts and custom UDFs.
- Experience in partitioning, bucketing, and joins in Hive.
- Experience in query optimization and performance tuning with Hive.
- Hands-on experience importing and exporting data between RDBMS and HDFS/HBase/Hive via Sqoop, both full-refresh and incremental.
- Hands-on experience loading log data from multiple sources into HDFS via Flume agents.
- Experience configuring and implementing Flume components such as sources, channels, and sinks.
- Experience with HBase, a NoSQL database.
- Experience with various Hadoop distributions: open-source Apache, Cloudera, Hortonworks, and MapR.
- Programming experience in UNIX shell scripting.
- Experience with Agile practices: daily stand-up meetings, writing user stories, evaluating story points, creating and estimating tasks, tracking progress with daily burn-down charts, and completing backlogs.
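The batch-processing pattern named above can be illustrated with a MapReduce-style word count in plain Java: the "map" step tokenizes lines into words and the "reduce" step sums counts per key. This is a minimal sketch; the class name and input data are illustrative, not from any engagement listed here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// MapReduce-style word count using plain Java streams:
// the flatMap acts as the map phase, groupingBy/counting as the reduce phase.
public class WordCount {
    public static Map<String, Long> count(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        Map<String, Long> counts = count(List.of("big data", "big cluster"));
        System.out.println(counts); // big=2, data=1, cluster=1 (order unspecified)
    }
}
```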
TECHNICAL SKILLS:
Programming Languages: Java, Scala
Big Data Technologies: HDFS, MapReduce, YARN, Hive, Hue, Pig, Sqoop, Flume, Oozie, Zookeeper, NoSQL, HBase
RDBMS: MySQL, Oracle, SQL Server, DB2
Data Ingestion Tools: Flume, Sqoop, Kafka
Real-Time Streaming and Processing: Storm, Spark Streaming
Operating Systems: Windows 9x/2000/XP/7/8/10, Linux, UNIX, Mac
Development Tools: Eclipse
Build and Log Tools: Maven
Version Control: SVN, Git
PROFESSIONAL EXPERIENCE:
Confidential, Phoenix, AZ
Sr. Big Data Engineer/Lead
Responsibilities:
- Involved in technical discussions and responsible for architecture design.
- Mentored the team and provided technical solutions.
- Performance-tuned Spark applications, analysing dependencies, storage levels, resource tuning, and memory management.
- Worked on RDDs and DataFrames using Spark SQL.
- Designed and developed shell scripts, Pig scripts, Hive scripts, and MapReduce jobs.
- Wrote Hive queries and partitions to store data in internal tables.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Wrote UNIX shell/Pig scripts to pre-process data stored in the Cornerstone platform.
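The dynamic-partitioning behaviour listed above can be pictured with a small, self-contained Java sketch: each distinct value of the partition column lands in its own `dt=...` directory, which is how Hive lays out a partitioned table. The table path and record fields below are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of Hive dynamic partitioning: rows are grouped under
// directory-style partition paths (one per distinct partition value).
public class DynamicPartitioner {
    public record Row(String id, String dt) {}

    // Group rows under Hive-style partition paths for a given table root.
    public static Map<String, List<Row>> partition(String tableRoot, List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                r -> tableRoot + "/dt=" + r.dt()));
    }

    public static void main(String[] args) {
        var rows = List.of(new Row("1", "2016-01-01"), new Row("2", "2016-01-02"),
                           new Row("3", "2016-01-01"));
        partition("/warehouse/txn", rows).forEach((path, rs) ->
                System.out.println(path + " -> " + rs.size() + " row(s)"));
    }
}
```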
Environment: Hadoop, HDFS, Pig, Hive, Spark, Scala, MapReduce, MapR Distribution
Confidential, Eden Prairie, MN
Senior Big Data Lead Consultant
Responsibilities:
- Designed and developed logical data models for the Legacy and Cornerstone databases.
- Created Sqoop scripts for tables using Linux scripts.
- Created, deleted, and executed Sqoop jobs in the Sqoop metastore.
- Designed HBase row keys and mapped them to RDBMS table column names.
- Mapped HBase table columns to Hive external table columns.
- Performed historical and incremental imports of RDBMS data into HBase tables using the metastore.
- Validated Sqoop, Hive, and HBase scripts.
- Created HBase tables and column families, altered column families, granted permissions on HBase tables, and defined region server space.
- Automated workflows through Oozie.
- Wrote transformations and actions in Scala to process complex data.
- Performed bug fixing and production support for running processes.
- Participated in SCRUM daily stand-ups, sprint planning, backlog grooming, and retrospective meetings.
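One common shape for the HBase row-key design work described above is a salted composite key: a fixed-width salt prefix spreads sequential writes across region servers, and a reversed timestamp makes the newest version of a record sort first. The sketch below is plain Java with illustrative field names, not the actual key scheme used.

```java
// Sketch of a salted, composite HBase row key.
// Format: <salt>|<customerId>|<reversed timestamp>, all fixed-width so
// keys sort predictably as bytes.
public class RowKeyDesign {
    static final int SALT_BUCKETS = 16; // illustrative bucket count

    public static String rowKey(String customerId, long epochMillis) {
        // Salt derived from the natural key keeps all rows for one
        // customer in the same bucket while spreading customers around.
        int salt = Math.floorMod(customerId.hashCode(), SALT_BUCKETS);
        long reversedTs = Long.MAX_VALUE - epochMillis; // newest sorts first
        return String.format("%02d|%s|%019d", salt, customerId, reversedTs);
    }

    public static void main(String[] args) {
        System.out.println(rowKey("CUST-1001", 1_460_000_000_000L));
    }
}
```

Because the timestamp is reversed, a scan from the start of a customer's key range returns the most recent record first.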
Environment: Hadoop, HDFS, MapReduce, Pig, Hive, Oozie, HBase, Spark, Scala, Zookeeper, MapR Distribution
Confidential, Atlanta, GA
Senior Big Data Lead Consultant
Responsibilities:
- Designed the technical architecture and developed various Big Data workflows using MapReduce, Hive, YARN, Kafka, Spark, and Scala.
- Built reusable Hive UDF libraries for business requirements, enabling business analysts to use the UDFs in their Hive queries.
- Used Flume to dump application server logs into HDFS.
- Analysed the logs stored on HDFS and imported the cleaned data into the Hive warehouse, enabling business analysts to write Hive queries.
- Worked with the Elasticsearch search engine for real-time data analytics, integrated with Kibana dashboards.
- Processed Kafka messages using Spark Streaming.
- Applied transformations on RDDs and DataFrames for filtering, mapping, joining, and aggregation.
- Migrated data from RDBMS and wrote events processed by Spark Streaming to Cassandra.
- Stored streaming events in Parquet format.
- Participated in SCRUM daily stand-ups, sprint planning, backlog grooming, and retrospective meetings.
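The Kafka-to-Spark-Streaming processing above follows a micro-batch pattern: messages are pulled from the source and processed one fixed-size batch at a time, with per-key aggregates carried across batches. The self-contained Java sketch below simulates that loop without a Spark dependency; the "key:value" message format and keys are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simulation of micro-batch stream processing: each batch of Kafka-like
// "key:value" messages is folded into a running per-key total, the way a
// stateful streaming aggregation carries state between intervals.
public class MicroBatch {
    public static Map<String, Integer> processBatch(Map<String, Integer> totals,
                                                    List<String> batch) {
        Map<String, Integer> next = new HashMap<>(totals); // state is immutable per batch
        for (String msg : batch) {
            String[] kv = msg.split(":");
            next.merge(kv[0], Integer.parseInt(kv[1]), Integer::sum);
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = Map.of();
        for (List<String> batch : List.of(List.of("clicks:3", "views:10"),
                                          List.of("clicks:2"))) {
            totals = processBatch(totals, batch);
        }
        System.out.println(totals); // clicks=5, views=10 (order unspecified)
    }
}
```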
Environment: MapReduce, Pig, Hive, Flume, JDK 1.6, Linux, Kafka, Spark Streaming, Scala, AWS, Elasticsearch, YARN, Hue, HDFS, Git, Kibana, Linux scripting
Confidential, Bloomington, IL
Senior Big Data Consultant
Responsibilities:
- Wrote MapReduce jobs to process trip summaries, scheduled to execute hourly, daily, weekly, monthly, and quarterly.
- Responsible for loading machine data from different sources into the Hadoop cluster using Flume.
- Used Flume to collect, aggregate, and store log data from different web servers.
- Ingested data into HBase and retrieved it using the Java API.
- Used Spark SQL to extract data from different data sources and place the processed data into NoSQL (MongoDB).
- Used Spark to analyse machine-emitted and sensor data, extracting data sets for meaningful information such as location, driving speed, acceleration, braking speed, and driving patterns.
- Used Git as version control for check-out and check-in of files.
- Reviewed high-level design and code and mentored team members.
- Participated in SCRUM daily stand-ups, sprint planning, backlog grooming, and retrospective meetings.
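The sensor-data analysis above reduces to per-trip aggregations over event streams. As a minimal illustration, the plain-Java sketch below computes a per-trip maximum speed with a keyed merge; the event fields are hypothetical stand-ins for the actual telematics schema.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Per-trip reduction over machine/sensor events: fold speed readings into
// a maximum per trip, the same keyed-merge shape a Spark reduceByKey uses.
public class TripSummary {
    public record Event(String tripId, double speedMph) {}

    public static Map<String, Double> maxSpeedByTrip(List<Event> events) {
        return events.stream().collect(Collectors.toMap(
                Event::tripId,     // key: trip
                Event::speedMph,   // value: reading
                Math::max));       // merge: keep the fastest reading
    }

    public static void main(String[] args) {
        var events = List.of(new Event("t1", 42.0), new Event("t1", 61.5),
                             new Event("t2", 30.0));
        System.out.println(maxSpeedByTrip(events)); // t1=61.5, t2=30.0
    }
}
```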
Environment: Hadoop, MapReduce, OpenStack, Flume-NG, HBase 0.98.2, Spark SQL, Scala, Kafka, HDFS, Zookeeper
Confidential
Big Data Engineer
Responsibilities:
- Analysed the functional specifications.
- Managed data coming from different sources; involved in HDFS maintenance and in loading structured and semi-structured data.
- Loaded data into external tables using Hive scripts.
- Performed aggregations, joins, and transformations using Hive queries.
- Implemented partitions, dynamic partitions, and buckets in Hive.
- Optimized Hive SQL queries, improving job performance.
- Developed Sqoop scripts to import and export data from relational sources, handling incremental loading of customer and transaction data by date.
- Performed Hadoop cluster administration, including adding and removing cluster nodes, capacity planning, performance tuning, cluster monitoring, and troubleshooting.
- Wrote unit test cases for Hive scripts.
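The incremental loading by date mentioned above works by filtering on a saved checkpoint, the same idea behind Sqoop's `--incremental lastmodified` mode: only rows changed after the last-imported date are pulled, and the checkpoint then advances. This plain-Java sketch (hypothetical table and fields) shows the core filter; real Sqoop tracks the checkpoint in its metastore.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of incremental-load selection: keep only rows modified strictly
// after the saved checkpoint date.
public class IncrementalLoad {
    public record Txn(String id, LocalDate modified) {}

    public static List<Txn> newSince(List<Txn> source, LocalDate checkpoint) {
        return source.stream()
                .filter(t -> t.modified().isAfter(checkpoint))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        var src = List.of(new Txn("1", LocalDate.of(2015, 3, 1)),
                          new Txn("2", LocalDate.of(2015, 3, 5)));
        // Only txn "2" changed after the 2015-03-02 checkpoint.
        System.out.println(newSince(src, LocalDate.of(2015, 3, 2)).size()); // 1
    }
}
```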
Environment: Java, Hadoop, HDFS, MapReduce, Pig, Hive, Flume, Zookeeper, Chef
Confidential
Senior Software Engineer
Responsibilities:
- Understood the client's functional requirements to design the technical specifications, develop the system, and subsequently document the requirements.
- Developed class diagrams and sequence diagrams.
- Designed and implemented a separate middleware Java component on Fusion.
- Reviewed high-level design and code and mentored team members.
- Participated in SCRUM daily stand-ups, sprint planning, backlog grooming, and retrospective meetings.
Environment: Java 1.6, Oracle Fusion Middleware, Eclipse, WebSphere, Spring Framework
Confidential
Senior Software Engineer
Responsibilities:
- Understood the client's functional requirements to design the technical specifications, develop the system, and subsequently document the requirements.
- Prepared LLD (class diagrams, sequence diagrams, and activity diagrams) using the Enterprise Architect UML tool.
- Worked on Hibernate, Spring IoC, DAOs, and JSON parsing.
- Prepared unit test cases for the developed UI.
- Responsible for problem tracking, diagnosis, replication, troubleshooting, and resolution of client problems.
Environment: Java, ACG proprietary framework using Dojo, Hibernate, Spring, DB2, RSA, Rational ClearCase, RPM, RQM, Mantis
Java Developer
Responsibilities:
- Understood the client's functional requirements to design the technical specifications, develop the system, and subsequently document the requirements.
- Prepared LLD (class diagrams, sequence diagrams, and activity diagrams) using the Enterprise Architect UML tool.
- Developed RichFaces components on JSF.
- Wrote DAOs and their implementations with Hibernate.
- Wrote Hibernate persistence classes and their mapping documents.
- Implemented J2EE and Java design patterns.
- Wrote TestNG test cases and validations.
- Set up environments for servers and monitored calls.
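The DAO work above follows the standard DAO pattern: one interface per entity with a swappable implementation behind it. As a minimal sketch, the code below uses an in-memory store standing in for a Hibernate-backed implementation; the entity and method names are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// DAO pattern sketch: callers depend on the CustomerDao interface, so the
// persistence mechanism (here an in-memory map, in production an ORM such
// as Hibernate) can be swapped without touching business code.
public class DaoExample {
    public record Customer(String id, String name) {}

    public interface CustomerDao {
        void save(Customer c);
        Optional<Customer> findById(String id);
    }

    public static class InMemoryCustomerDao implements CustomerDao {
        private final Map<String, Customer> store = new HashMap<>();
        public void save(Customer c) { store.put(c.id(), c); }
        public Optional<Customer> findById(String id) {
            return Optional.ofNullable(store.get(id));
        }
    }

    public static void main(String[] args) {
        CustomerDao dao = new InMemoryCustomerDao();
        dao.save(new Customer("c1", "Ada"));
        System.out.println(dao.findById("c1").isPresent()); // true
    }
}
```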
Environment: Windows, Eclipse, Java, JBoss, Hibernate, Spring, JSF with Rich Faces
Confidential
Java Developer
Responsibilities:
- Set up environments for servers and monitored calls.
- Responsible for problem tracking, diagnosis, replication, troubleshooting, and resolution of client problems.
- Supported the contractual service level to achieve prompt response and a high level of customer service.
- Provided service deliverables as defined by agreed client SLAs and objectives.
- Ensured appropriate process standards were met and maintained.
- Involved in preparing ad hoc reports.
- Involved in developing SpamAssassin for the application.
Environment: Windows, UNIX, Java, Struts, Hibernate, Tomcat, Lenya, Remedy Tool, WinSCP, Putty, VPN, Eclipse
Confidential
Associate Projects
Responsibilities:
- Studied CRs, understood functionality, and coded accordingly.
- Developed for the HRMS 6.x application, which supports various clients such as Confidential.
- Debugged and enhanced framework Java objects.
- Performed unit testing.
- Replicated issues and fixed them.
- Maintained versions in CM Synergy.
Environment: Windows, UNIX, Java, Struts, JSP, Servlets, Tomcat, DB2, Eclipse, Lotus Notes 6.5, CM Synergy, JUnit