Senior Consultant Resume
Atlanta, GA
SUMMARY:
- Master Data Engineer and Architect: Expert in cloud-based data services, Hadoop, data lakes, data warehousing, and database technologies. Proficient in architecting data analytics systems and data processing pipelines using Spark, Hive, Sqoop, AWS, Cloudera, and Hortonworks.
- 10+ years of overall experience in the IT industry.
- 5 years of experience in the field of data analytics, data processing and database technologies.
- 5 years of experience with database, data storage, and data platform technologies.
- Specialized in big data platform design and implementation and in developing custom data pipelines for analytics use cases.
- Platforms: Hadoop, Cloudera, Hortonworks, and cloud data analytics platforms (AWS, Azure).
- Expertise with Hadoop ecosystem tools including HDFS, MapReduce, Hive, Sqoop, Spark, Kafka, YARN, Oozie, and ZooKeeper.
- ETL: data extraction, transformation, and loading using Hive and HBase.
- Proficient with HDFS, MapReduce, YARN, Hive, Impala, Sqoop, HBase, and Cloudera.
- Experience importing and exporting data between Hadoop and RDBMS systems using Sqoop and SFTP.
- Extensive knowledge of the development, analysis, and design of ETL methodologies across all phases of the data warehousing life cycle.
- Strong knowledge of the Software Development Life Cycle (SDLC) and expertise in detailed design documentation.
- Excellent understanding of Agile methodologies.
- Experience in managing and leading teams through the project life cycle.
PROFESSIONAL EXPERIENCE:
Senior Consultant
Confidential, Atlanta, GA
- Led the team in strategic planning, implementation, and support for Informatica’s Big Data & Cloud solutions.
- Created and maintained client relationships.
- Developed and documented workflows.
- Evaluated emerging technologies and systems to enhance technology services or replace failing resources.
- Provided team with direction on strategies for architecting and implementing solutions to clients.
- Responsible for supporting and leading project tasks.
- Identified and developed big data and cloud sources and techniques to solve business problems.
- Cross-trained other team members on technologies being developed while continuously learning new technologies from them.
- Aligned solution designs with overall architectures and strategic technologies.
- Analyzed systems and met with end users and business units to define requirements.
- Integrated big data and cloud solutions with Informatica’s software.
Environment: Java, Hadoop, HDFS, Hive, Sqoop, HBase, Eclipse, SQL, Oracle, Oozie, Spark, Zookeeper, AWS EMR, AWS S3, AWS EC2, Unix/Linux, Azure, Informatica
Big Data Architect
Confidential, Atlanta, GA
- Prepared an ETL design document covering the database structure, change data capture, error handling, and restart and refresh strategies.
- Created mapping documents to outline data flow from sources to targets.
- Responsible for supporting and leading 3 developers to achieve project task goals.
- Worked with the architecture team to define conceptual and logical data models.
- Identified and developed big data sources and techniques to solve business problems.
- Cross-trained other team members on technologies being developed while continuously learning new technologies from them.
- Responsible for leading 4 direct reports to achieve project tasks and goals in a timely manner.
- Linux Shell scripting.
- Created high- and low-level specification documents.
Environment: Java, Hadoop, HDFS, Hive, Impala, Sqoop, HBase, Eclipse, SQL, Oracle, Oozie, Spark, WSO2, Zookeeper, AWS EMR, AWS S3, AWS EC2, Unix/Linux
Hadoop Data Architect/Engineer
Confidential, Washington, D.C.
- Prepared an ETL design document covering the database structure, change data capture, error handling, and restart and refresh strategies.
- Created mapping documents to outline data flow from sources to targets.
- Worked with different data feeds such as JSON, CSV, XML, and DAT files, and implemented a data lake concept.
- Most of the infrastructure ran on AWS (AWS EMR distribution for Hadoop, AWS S3 for raw file storage).
- Used a Kafka producer to ingest raw data into Kafka topics and ran a Spark Streaming application to process clickstream events (a sketch follows this section).
- Imported data from HDFS into Spark RDDs.
- Loaded files into Hive and HDFS from Oracle and SQL Server using Sqoop.
Environment: Hadoop, MapReduce, HDFS, Hive, Python, Scala, Kafka, Spark Streaming, Spark SQL, MongoDB, ETL, Oracle, SQL, Sqoop, Zookeeper, AWS EMR, AWS S3, AWS EC2, Git, Unix/Linux, Agile Methodology, Scrum.
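A minimal sketch of the clickstream pipeline described in this section, using PySpark Structured Streaming as one possible way to consume the Kafka topic; the broker address, topic name, event schema, and S3 paths are illustrative assumptions rather than actual project values, and the spark-sql-kafka connector is assumed to be on the classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    # Illustrative clickstream event schema; the real field names were not given.
    event_schema = StructType([
        StructField("user_id", StringType()),
        StructField("page", StringType()),
        StructField("event_time", TimestampType()),
    ])

    spark = SparkSession.builder.appName("clickstream-processing").getOrCreate()

    # Subscribe to the raw Kafka topic (broker and topic names are assumptions).
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker1:9092")
           .option("subscribe", "clickstream")
           .load())

    # Kafka values arrive as bytes; decode them and parse the JSON payload.
    events = (raw.selectExpr("CAST(value AS STRING) AS json")
                 .select(from_json(col("json"), event_schema).alias("event"))
                 .select("event.*"))

    # Land the parsed events in S3 as Parquet (bucket and paths are hypothetical).
    query = (events.writeStream
                   .format("parquet")
                   .option("path", "s3a://example-bucket/clickstream/parsed/")
                   .option("checkpointLocation", "s3a://example-bucket/clickstream/checkpoints/")
                   .start())

    query.awaitTermination()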
Hadoop Data Engineer
Confidential - Jersey City, NJ
- Applied deep knowledge of incremental imports and of Hive partitioning and bucketing concepts needed for optimization (a partitioning sketch follows this section).
- Performed ETL (extract, transform, load) on large sets of structured, semi-structured, and unstructured data and analyzed them by running Hive queries.
- Analyzed data by performing Hive queries (HiveQL) and Impala scripts to study customer behavior.
- Used HBase to store the majority of the data, which needed to be divided by region.
- Created multi-node Hadoop and Spark clusters on AWS instances to generate terabytes of data and store it in HDFS on AWS.
- Created instances with Hadoop installed and running.
- Implemented image conversion and hosting on a static S3 website to maintain a backup of the images.
Environment: Hadoop, HDFS, Hive, MapReduce, Impala, Sqoop, Pig, HBase, Git, Oracle, Oozie, AWS EC2, S3, DynamoDB, YARN
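A hedged sketch of the Hive partitioning approach referenced in this section, issued through PySpark with Hive support; the table names, columns, and region partition key are illustrative assumptions, and bucketing (CLUSTERED BY) is omitted to keep the example short.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-partitioning-sketch")
             .enableHiveSupport()   # assumes a configured Hive metastore
             .getOrCreate())

    # Standard Hive settings that allow dynamic partition inserts.
    spark.sql("SET hive.exec.dynamic.partition = true")
    spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")

    # Partitioned target table; the name and columns are illustrative.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS customer_events (
            customer_id STRING,
            event_type  STRING,
            amount      DOUBLE
        )
        PARTITIONED BY (region STRING)
        STORED AS ORC
    """)

    # Dynamic-partition insert from a hypothetical staging table, so each
    # region's rows land in their own partition directory.
    spark.sql("""
        INSERT OVERWRITE TABLE customer_events PARTITION (region)
        SELECT customer_id, event_type, amount, region
        FROM customer_events_staging
    """)

    # Queries that filter on the partition column prune to a single partition.
    spark.sql("""
        SELECT event_type, COUNT(*) AS events
        FROM customer_events
        WHERE region = 'NE'
        GROUP BY event_type
    """).show()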
Hadoop Data Engineer
Confidential - Savannah, GA
- Involved in creating Hive tables, loading them with data, and writing Hive queries that run internally as MapReduce jobs.
- Involved in collecting, aggregating and moving data from servers to HDFS using Flume.
- Imported and exported data between HDFS and relational data sources such as DB2, SQL Server, and Teradata using Sqoop; performed transformations and read/write operations and saved the results to an output directory in HDFS.
- Wrote Hive jobs to parse the logs and structure them in tabular format to facilitate effective querying of the log data (a parsing sketch follows this section).
Environment: Hadoop Cluster, HDFS, Hive, Sqoop, Linux, Hadoop Map Reduce, HBase, Shell Scripting, Eclipse, Oozie.
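A hedged sketch of the log-structuring work mentioned above, written as a PySpark job; it assumes Apache-style access logs, and the HDFS path, regular expression, and table name are illustrative rather than actual project values.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import regexp_extract

    spark = (SparkSession.builder
             .appName("log-parsing-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Raw log lines as collected into HDFS (the path is an assumption).
    logs = spark.read.text("hdfs:///data/raw/server_logs/")

    # Assume an Apache-style access log and pull out a few fields with a regex.
    LOG_PATTERN = r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+)'

    parsed = logs.select(
        regexp_extract("value", LOG_PATTERN, 1).alias("client_ip"),
        regexp_extract("value", LOG_PATTERN, 2).alias("timestamp"),
        regexp_extract("value", LOG_PATTERN, 3).alias("method"),
        regexp_extract("value", LOG_PATTERN, 4).alias("endpoint"),
        regexp_extract("value", LOG_PATTERN, 5).cast("int").alias("status"),
    )

    # Persist the structured rows as a table so the log data can be queried with SQL.
    parsed.write.mode("overwrite").format("orc").saveAsTable("web_logs")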
Data Analytics Developer
Confidential - Austin, TX
- Designed an archival platform that provided a cost-effective way to store big data using Hadoop and its related technologies.
- Archived data to the Hadoop cluster and performed search, query, and retrieval operations against it.
- Involved in creating Hive tables, loading them with data, and writing Hive queries, which internally run MapReduce jobs. Implemented partitioning, dynamic partitions, and buckets in Hive for optimized data retrieval.
- Connected various data centers and transferred data between them using Sqoop and various ETL tools. Extracted data from RDBMS sources (Oracle, MySQL) to HDFS using Sqoop (an analogous extraction is sketched after this section).
Environment: Java, Bootstrap, Hadoop, HDFS, Hive, MapReduce, Impala, Sqoop, HBase, Git, Eclipse, SQL, Oracle, Oozie
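Sqoop itself is driven from the command line; as a rough PySpark analogue of the same RDBMS-to-HDFS extraction (not the actual Sqoop commands used on the project), a partitioned JDBC read plays the role of Sqoop's parallel mappers. The connection details, table name, key column, and bounds below are placeholders, and the MySQL JDBC driver is assumed to be on the Spark classpath.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("rdbms-extract-sketch").getOrCreate()

    # JDBC connection details are placeholders for a real Oracle/MySQL source.
    orders = (spark.read
              .format("jdbc")
              .option("url", "jdbc:mysql://db-host:3306/sales")
              .option("dbtable", "orders")
              .option("user", "etl_user")
              .option("password", "etl_password")
              .option("numPartitions", 4)             # parallel fetches, like Sqoop mappers
              .option("partitionColumn", "order_id")  # assumed numeric key column
              .option("lowerBound", 1)
              .option("upperBound", 1000000)
              .load())

    # Land the extracted rows in HDFS as Parquet for downstream Hive/Spark use.
    orders.write.mode("overwrite").parquet("hdfs:///data/landing/orders/")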