Sr Big Data/Hadoop Developer Resume
Houston, TX
SUMMARY
- Around 10 years of experience across software development, system testing, and quality assurance, implementing test plans, test cases and test processes that fueled swift corrective actions, including leading teams. Implemented Extract, Load and Transform (ELT) pipelines with Apache Pig.
- About 3 years of work experience in ingestion, storage, querying, processing and analysis of Big Data, with hands-on experience in Hadoop ecosystem development including MapReduce, HDFS, Hive, HiveQL, Pig, Spark, Spark SQL, Spark Streaming, YARN, HBase, MongoDB, ZooKeeper, Sqoop, Flume, Impala, Oozie.
- Recognized software engineer with a proven track record of consistently exceeding performance goals. Identified, developed and maintained technical process improvements and application process flows. Implemented Web Server Log Analysis with Apache Spark, Spark SQL.
- Data warehousing and data management with Hive; query optimization with partitioning, bucketing and indexing; handling different data formats such as Avro and Parquet.
- Knowledge of integrating Pig and Hive with HCatalog, Hive with HBase via the HBase storage handler, and Pig with HBase via HBaseStorage.
- Structured data ingestion from RDBMS (Oracle, MySQL, SQL Server) into Hadoop (HDFS, Hive and HBase) using Sqoop (RDBMS <-> Hadoop). Implemented incremental inserts with Sqoop. Workflow design, scheduling and coordination with Oozie.
- Code development, maintenance and debugging/diagnosing of logs with HUE.
- Designed cost-effective application solutions by creating new or enhancing existing software.
- Earned a “Customer Appreciation” award for saving $150,000 after automating a monthly process. Experience translating business requirements into technical designs.
- Developed and executed test plans and documented system flows.
- Strong understanding of back office accounting and telecom inventory management.
- Executed Payment Card Industry (PCI) security compliance engagements.
- Proficient in onsite/offshore model management; experienced in handling and guiding a six-member team. Experience achieving Sarbanes-Oxley (SOX) Act compliance guidelines.
- Strong collaboration, team-building, interpersonal and communication skills, with proficiency at grasping new technical concepts quickly and applying them productively.
- Participated in post-implementation reviews (PIR) of both application development content and process to maximize and share learning.
- Experienced with Apache Spark: improved the performance of existing Hadoop algorithms using Spark Context, Spark SQL, DataFrames, pair RDDs and Spark on YARN.
- Experience and strong knowledge of Spark Core, Spark Streaming and Spark SQL. Assisted in extending Hive and Pig core functionality by writing custom UDFs.
- IBM Certified Database Associate: DB2 10.1 Fundamentals.
- Capable of processing large sets of structured and semi-structured data, and of supporting systems application architecture.
- Strong Knowledge of Banking domain in Payment Card, Payment transfers and ACH settlement.
- Familiar with data architecture including data ingestion pipeline design, Hadoop information architecture.
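The web server log analysis noted above was built on Spark and Spark SQL; as an illustrative, dependency-free sketch of the same idea in plain Python (the Common Log Format and the sample lines below are assumptions, not the original data), parsing access-log lines and aggregating HTTP status codes looks like this:

```python
import re
from collections import Counter

# Regex for the Common Log Format (assumed format; adjust for the actual logs).
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def status_counts(lines):
    """Count HTTP status codes across raw access-log lines."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            counts[m.group("status")] += 1
    return counts

# Hypothetical sample log lines for illustration only.
sample = [
    '10.0.0.1 - - [01/Jan/2016:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '10.0.0.2 - - [01/Jan/2016:10:00:01 +0000] "GET /missing HTTP/1.1" 404 512',
    '10.0.0.1 - - [01/Jan/2016:10:00:02 +0000] "POST /login HTTP/1.1" 200 256',
]
counts = status_counts(sample)
```

In a Spark SQL job the same aggregation would be a `groupBy("status").count()` over a DataFrame of parsed lines; the regex-and-Counter version shows the core transformation without a cluster.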
TECHNICAL SKILLS
Operating Systems: Windows 7/8/10, MVS Z-OS
Software Tools: MS Office, Eclipse, NetBeans IDE
Big Data Ecosystems: Hadoop, MapReduce, HDFS, HBase, ZooKeeper, Hive, Apache Spark, Pig, Sqoop, Oozie, Flume, Spark SQL
Programming: Core Java, Python, Scala, C, UNIX, PL/SQL
Databases: DB2, MySQL, MongoDB, NoSQL, MS SQL Server, Oracle
Test Automation: Quick Test Pro (QTP)
Methodologies: Agile, Waterfall model
PROFESSIONAL EXPERIENCE
Sr Big Data/Hadoop Developer
Confidential, Houston, TX
Responsibilities:
- Analyzed large and critical datasets using HDFS, Map Reduce, Hive, Pig.
- Imported data from the MySQL database into HDFS using Sqoop.
- Analyzed and transformed stored data by writing MapReduce or Pig jobs based on business requirements.
- Participated in multiple big data POCs to evaluate different architectures, tools and vendor products. Worked on log analysis of large datasets related to the retail and financial industries.
- Responsible for developing a data pipeline using Flume, Sqoop and Pig to extract data from weblogs and store it in HDFS. Experienced in implementing Spark RDD transformations and actions for business analysis.
- Used Sqoop to move data from relational databases into HDFS and HBase, and vice versa.
- Implemented Spark using Python and Spark SQL for faster testing and processing of data.
- Created HBase column families to store various data types coming from various sources.
- Implemented partitioning, dynamic partitions and buckets in Hive.
- Played a key role in understanding user requirements, translating business requirements into technical solutions and documenting them for the clearXchange project.
- Worked on the clearXchange project, which is used for real-time money settlement in the US.
- Ensured reviews and testing were carried out per testing standards; reviewed programs for standards and syntax using review checklists.
- Developed testing strategies for efficient, error-free testing (e.g., preparing unit test plans and conducting unit test reviews).
- Used iterative, incremental software development practices following the Agile methodology.
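The Hive partitioning and bucketing bullet above relies on deterministic hash bucketing (Hive's `CLUSTERED BY ... INTO n BUCKETS`). A rough Python analogy of the bucket-assignment step (bucket count and key names are illustrative; Hive's actual hash function differs):

```python
import zlib

NUM_BUCKETS = 4  # illustrative value, as in CLUSTERED BY (key) INTO 4 BUCKETS

def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Deterministic bucket assignment, analogous to Hive's hash(key) % num_buckets.
    Uses CRC32 rather than Python's salted hash() so results are stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets

# Hypothetical customer keys; every key always maps to the same bucket file.
keys = ["cust_1001", "cust_1002", "cust_1003", "cust_1004"]
assignments = {k: bucket_for(k) for k in keys}
```

Because the assignment is deterministic, joins on the bucketed key can be done bucket-by-bucket, which is what makes bucket map joins in Hive cheap.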
Environment: HDFS, MapReduce, Hive, Pig, MySQL, Sqoop, Flume, HBase, Spark, Python, Scala
Hadoop Developer
Confidential - New Jersey
Responsibilities:
- Worked closely with client decision makers to understand their requirements and provide solutions.
- Prepared project estimates based on unique business requirements.
- Optimized MapReduce code and Pig scripts; performed performance tuning and analysis.
- Executed Data Flow Management with Pig.
- Worked on Log Analysis and Test Analytics with Spark.
- Worked on exporting data into relational database using Sqoop for making it available for visualization for the BI team.
- Coordinated user acceptance testing and obtained business user sign-off for final production deployment; handled onsite/offshore communications and clarifications; and worked with business and development counterparts to resolve issues posing risks to testing progress.
- Led technical tasks on tight deadlines to meet the project delivery schedule.
- Provided 24/7 operational support for all production processes.
- Interacted with the Trintech system to transmit journal entry files via an SFTP transfer mechanism.
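The Sqoop export above moves processed Hadoop data into a relational table for the BI team. As a minimal stand-in for that batched JDBC export, using Python's built-in sqlite3 in place of the target RDBMS (table name, columns and row values below are hypothetical, not from the original project):

```python
import sqlite3

# Hypothetical rows as they might arrive from HDFS (e.g., a Hive query result).
rows = [
    ("2016-01-01", "retail", 125000.0),
    ("2016-01-02", "retail", 98000.0),
    ("2016-01-02", "finance", 143500.0),
]

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the target RDBMS
conn.execute("CREATE TABLE daily_revenue (day TEXT, segment TEXT, revenue REAL)")

# Batch insert, analogous to Sqoop's batched INSERT export over JDBC.
conn.executemany("INSERT INTO daily_revenue VALUES (?, ?, ?)", rows)
conn.commit()

# The BI team can now query the relational copy directly.
total = conn.execute("SELECT SUM(revenue) FROM daily_revenue").fetchone()[0]
```

Sqoop parallelizes this pattern across mappers; the sketch shows only the per-batch insert-and-commit shape.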
Environment: MapReduce, Pig, Spark, Sqoop, Oozie, Python
IT Analyst
Confidential
Responsibilities:
- Worked on a development and enhancement project that analyzed, designed and created new features based on Change Requests (CRs) and Project Service Requests (PSRs) raised by Avis.
- Focused on defect free delivery within the proper estimated time and cost (ETC) and work breakdown structure (WBS).
- Trained colleagues on best practices for back-office processes and system job flow.
- Communicated effectively with personnel at major vendors such as Bank of America, FDMS, Bottomline and Bank of Montreal to transmit payment files that contain PCI data.
Environment: HDFS, Map Reduce, Pig, Hive, Core Java, PL/SQL, Unix, QTP
Technical Associate
Confidential
Responsibilities:
- Spearheaded major activities for COSMOSS, a generic order handling and provisioning system targeted at business customers, with front and back office applications and support.
- Prepared and executed QTP scripts for test automation for COSMOSS-GENIUS interface for regression test cases to automate manual regression testing by 60%.
- Implemented projects for migrating COSMOSS circuits from one site to another.
- Managed work packages, including baselines, the CI register, the traceability matrix and schedules.
Environment: Core Java, PL/SQL, Unix, QTP