Senior Hadoop Developer Resume
SUMMARY:
- 14+ years of experience in creating and delivering solutions tied to business growth, organizational development, and system optimization.
- Proven ability in designing, building, and maintaining the architecture and systems for automating predictive models, data analysis, and visualization in distributed and non-distributed environments.
- Expertise in data management, processing, analysis, and visualization tools and platforms, with great proficiency in Spark, Python, Pig, HDFS, Hive, Sqoop, Scala, HBase, Cassandra, SQL, SSIS, MySQL, Kafka, NiFi, Flume, Oozie, Git, Maven, Jira, Jenkins, TensorFlow, Ab Initio, and shell scripting.
- Strong experience delivering advanced technical projects, both as an individual contributor and in technical leadership roles.
- Experienced in manipulating/analyzing large datasets and finding patterns and insights within structured and unstructured data.
- Experienced in using IntelliJ and Maven to build, package and manage projects.
- Experienced in automating, configuring, and deploying instances in AWS environments and data centers; familiar with EC2, CloudWatch, CloudFormation, and managing security groups on AWS.
- Experienced in ETL methodologies and development.
- Wrote Python scripts to manage AWS resources through API calls using the boto SDK and worked with the AWS CLI (see the boto3 sketch after this list).
- Wrote Ansible playbooks to launch AWS instances and used Ansible to manage web applications, configuration files, mount points, and packages.
- In-depth knowledge of AWS cloud services such as compute, networking, storage, and Identity and Access Management.
- Hands-on experience configuring network architecture on AWS with VPCs, subnets, internet gateways, NAT, and route tables.
- Extensively worked on CI/CD pipelines for code deployment, engaging different tools (Git, Jenkins, CodePipeline) in the process, from developer code check-in through production deployment.
- Extensive experience in data analytics, business intelligence and data science.
- Ingested, stored, and managed large data sets from multiple source systems and third parties.
- Experience working with data pipelines, data validation, and data processing.
- Expertise in extract, transform, and load (ETL) tools.
- Extensive exposure to data security, capacity planning, storage management, and governance techniques.
- Skilled in identifying and resolving data processing issues.
- Experience in developing and analyzing ETL mapping logic and writing complex queries.
- Worked closely with data science and analytics teams to identify the correct data segmentation, the best algorithm to apply, and which features to use in a model.
- Extensive knowledge of agile methodology and DevOps tools such as Jenkins, GitHub, and Jira.
- Self-starter, self-motivated, desire for learning and ability to work independently.
- Experience with data compression techniques, Avro/Parquet schemas, table partitioning, and performance tuning.
- Expertise in Transact-SQL and SQL queries, query optimization, performance tuning, and the creation of database objects, stored procedures, and views.
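A minimal sketch of the kind of boto scripting described above, written against boto3 (the SDK's current generation); the region, AMI ID, key pair, security group, and tag values are placeholders rather than details from any real environment:

```python
# Minimal boto3 sketch: launch and tag an EC2 instance, then list running
# instances. All identifiers below are placeholders.
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

# Launch a single instance (hypothetical AMI, key pair, and security group).
instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",          # placeholder AMI
    InstanceType="t2.micro",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",                    # placeholder key pair
    SecurityGroupIds=["sg-0123456789abcdef0"],
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "Name", "Value": "etl-worker"}],
    }],
)

# List instances currently in the 'running' state.
for inst in ec2.instances.filter(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
):
    print(inst.id, inst.instance_type, inst.public_ip_address)
```

The same resource object also exposes stop and terminate calls, which is typically how routine instance management gets scripted.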
TECHNICAL SKILLS:
Advanced: Apache Spark (2+ yrs), Python (3+ yrs), Scala (2+ yrs), Hive (3+ yrs), Pig (3+ yrs), Sqoop (3+ yrs), SQL (14 yrs), HBase (3+ yrs), AWS (3+ yrs), Linux/Unix (5 yrs), MySQL (6+ yrs), SSIS (13 yrs)
Intermediate: Jenkins (2 yrs), C++ (2 yrs), C# (2+ yrs), Java (2 yrs), Jira (2 yrs), shell scripting (3 yrs), and Stash (2 yrs).
Basic: Machine learning, Docker, Talend
Database Technologies: HQL, HBase, SQL Server 2000/2005/2008 R2/2012/2014/2016, Oracle 12c, and MySQL
ETL Tools: Sqoop, Flume, Airflow, Talend and SSIS
PROFESSIONAL EXPERIENCE:
Confidential
Senior Hadoop Developer
- Participated in selecting big data solutions and building data repository & reporting.
- Designed and supported big data ingestion and transformation pipelines.
- Analyzed complex, high-volume data from varying sources using Hadoop ecosystem tools.
- Wrote Python scripts to manage AWS resources through API calls using the boto SDK and worked with the AWS CLI.
- Wrote Ansible playbooks to launch AWS instances and used Ansible to manage web applications, configuration files, mount points, and packages.
- Developed Spark scripts using PySpark per requirements.
- Experienced in performance tuning of Spark applications: setting the right batch interval, the correct level of parallelism, and memory usage.
- Designed, developed, and maintained data integration programs in Hadoop and RDBMS environments, with both traditional and non-traditional source systems, for data access and analysis.
- Analyzed existing SQL scripts and designed solutions to implement them using PySpark.
- Imported data from AWS S3 into Spark RDDs and performed transformations and actions.
- Worked on batch processing and real-time data processing with Spark Streaming using a Lambda architecture (a streaming sketch follows this list).
- Worked on data extraction and data integration from different data sources into a Hadoop data lake, creating ETL pipelines using Spark, MapReduce, PySpark, Sqoop, and Hive.
- Hands-on experience developing ETL data pipelines using PySpark on AWS EMR (see the ETL sketch after this list).
- Used EMR to transform and move large amounts of data into and out of S3 and DynamoDB.
- Worked on use cases to improve payment recovery and pay-by-date processes.
- Provided metrics to management; proactively identified data quality issues and coordinated with stakeholders to ensure data integrity.
- Strong experience with Spark and Python; developed in Hadoop and its ecosystem, including Hive, HBase, Pig, HDFS, Flume, and Sqoop.
- Participated in data protection and compliance use cases.
- Designed data ingestion workflows and mappings to extract, transform, and load data into staging areas.
- Strong experience with, and exposure to, ETL tools and data warehouse projects.
- Researched, recommended, and implemented enhancements that improved system reliability and performance.
- Worked closely with business teams to understand requirements and showcase proofs of concept (POCs).
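A minimal PySpark sketch of the S3-to-Parquet ETL pattern described above, as it might run on EMR; the bucket, paths, column names, and tuning values are hypothetical and only illustrate which knobs are involved:

```python
# Minimal S3-to-Parquet ETL step; all names and settings are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("s3-etl-sketch")
    .config("spark.sql.shuffle.partitions", "200")   # level of parallelism
    .config("spark.executor.memory", "4g")           # memory tuning
    .getOrCreate()
)

# Read raw CSV from S3 (assumes the S3 connector and credentials are set up).
raw = (
    spark.read
    .option("header", "true")
    .csv("s3a://example-bucket/raw/payments/")
)

# Transformations: drop bad rows, cast types, derive a partition column.
cleaned = (
    raw.where(F.col("amount").isNotNull())
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("pay_date", F.to_date("pay_date"))
       .withColumn("year", F.year("pay_date"))
)

# Write as Parquet, partitioned by year for efficient downstream queries.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("year")
    .parquet("s3a://example-bucket/curated/payments/")
)
```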
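And one way the real-time (speed) layer of a Lambda architecture might look with Spark Structured Streaming; the broker, topic, and sink are placeholders, and the spark-sql-kafka connector is assumed to be on the classpath:

```python
# Speed-layer sketch: consume events from Kafka and emit windowed counts.
# Broker, topic, and sink are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("speed-layer-sketch").getOrCreate()

events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
    .option("subscribe", "payments")                   # placeholder topic
    .load()
    .selectExpr("CAST(value AS STRING) AS value", "timestamp")
)

# Count events per 1-minute window (a real job would parse the payload).
counts = (
    events
    .withWatermark("timestamp", "5 minutes")
    .groupBy(F.window("timestamp", "1 minute"))
    .count()
)

query = (
    counts.writeStream
    .outputMode("update")
    .format("console")   # a real speed layer would write to a serving store
    .trigger(processingTime="30 seconds")  # batch-interval analogue
    .start()
)
query.awaitTermination()
```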
Confidential
Senior SQL Server DBA /SQL Server Developer
- Hired into a full-time Senior SQL Server Developer/DBA position following two years in an initial consulting role.
- Planned, documented, and performed a seamless migration from SQL Server 2005 to SQL Server 2012.
- Proactively researched and retired a mainframe system after transitioning the file transfer process to a managed file transfer system, which significantly enhanced the process.
- Documented and configured production servers and migrated SQL objects and data between two geographically separate data centers.
- Automated legacy jobs, ad-hoc queries, and backup and restore processes (a backup sketch follows this list).
- Tuned and reduced the processing time of batch and transactional processes.
- Led support for two spinoff activities, one of which required a data center change.
- Designed and implemented DR solutions: log shipping, mirroring, and AlwaysOn availability groups.
- Performed yearly disaster recovery exercises, i.e., switching the roles of production and disaster recovery servers.
- Participated in yearly audit and reconciliation processes.
- Worked on billing system consolidation.
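A sketch of how a backup step like this might be scripted in Python with pyodbc; the server, database, and backup path are placeholders, and a login with BACKUP privileges is assumed:

```python
# Automated full backup of one database; names and paths are placeholders.
import datetime
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqlprod01;DATABASE=master;Trusted_Connection=yes;",
    autocommit=True,  # BACKUP DATABASE cannot run inside a transaction
)

stamp = datetime.date.today().strftime("%Y%m%d")
backup_file = rf"D:\Backups\BillingDB_{stamp}.bak"  # placeholder path

# BACKUP DATABASE is standard T-SQL; [BillingDB] is a hypothetical database.
conn.cursor().execute(
    "BACKUP DATABASE [BillingDB] "
    f"TO DISK = N'{backup_file}' WITH INIT, CHECKSUM"
)
conn.close()
```

Scheduled via SQL Server Agent or cron, a script like this replaces hand-run backup jobs.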
Confidential
SQL Server and ETL Developer
- Consulted with the client company to provide SQL Server and ETL development solutions.
- Planned, documented, and performed a seamless SQL Server migration.
- Developed strategy processes to manage the business process of collecting bad debt, involving several collection agencies.
- Wrote complex queries and developed stored procedures and user-defined functions (a stored procedure call sketch follows this list).
- Analyzed, tuned and optimized the performance of complex T-SQL statements.
- Prepared understanding documents and test cases, and reviewed requirements with clients.
- Tuned and rewrote stored procedures and functions that had performance issues.
- Reviewed SQL Server database objects before promoting them to production.
- Participated in and coordinated the maintenance, modification, and creation of database structures.
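A sketch of the pattern for calling such a stored procedure from Python with pyodbc; the server, database, procedure, parameter, and column names are all hypothetical:

```python
# Parameterized stored procedure call; every identifier is a placeholder.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=sqldev01;DATABASE=Collections;Trusted_Connection=yes;"
)
cursor = conn.cursor()

# Parameter markers keep the call safe from SQL injection and let the
# server reuse the execution plan.
cursor.execute("EXEC dbo.usp_GetDelinquentAccounts @MinDaysPastDue = ?", 90)
for row in cursor.fetchall():
    print(row.AccountId, row.Balance)  # hypothetical result columns

conn.close()
```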
Environment:
- Operating Systems: Windows Server 2012/2008/2003/2000.
- Databases: MS SQL Server 2016/2012/2008/2005/2000, Oracle 11g PL/SQL on Linux, HBase, MySQL, and Hive.