Senior Hadoop Developer Resume
Chicago, IL
SUMMARY:
- About 12 years of IT experience in software design, development, analysis, maintenance, testing, support, and troubleshooting in the Banking, Insurance, and Telecom industries.
- Over 3 years of experience with Hadoop, Hive, Impala, Pig, Sqoop, Flume, Kafka, Oozie, YARN, and Spark, designing and implementing MapReduce and Spark jobs to support distributed processing of large data sets on the Hadoop cluster.
- About 3 years of Data Warehousing experience using Informatica Power Center in OLAP and OLTP environments.
- Expertise in architecting, designing, developing, testing, implementing, supporting, and troubleshooting applications in programming languages/platforms such as Java, Scala, Python, J2EE, C, C++, SAS, COBOL, Web Services, and PL/1.
- Worked on integrated BI/ETL tools and Mainframe applications.
- Expertise in creating test plans, test cases and test scripts.
- Expertise in Unit Testing, Regression Testing, Integration Testing, User Acceptance Testing, production implementation, and maintenance.
TECHNICAL SKILLS:
Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Zookeeper, Hive, Pig, Sqoop, Flume, Kafka, Spark, and Oozie.
Programming Languages: C, C++, Java, J2EE, Web Services, COBOL, SAS, Scala, PL/SQL, PL/1
Databases: Cassandra, HBase, DB2, Oracle, MS Access, MySQL, MongoDB, SQL Server, Teradata, and IMS
Scripting Languages: JavaScript, Python, SQL, Shell Scripting
Operating Systems: Windows, LINUX, z/OS
Modeling Tools: MS Visio
Project Management: Technical Leadership, Project Coordination, Training and Development, SME, Requirements Analysis, Deployment Planning.
BI Tools: Tableau, Platfora, SAS
Version Control Tools: SVN, GIT
Build Tools: Maven
PROFESSIONAL EXPERIENCE:
Confidential, Chicago, IL
Senior Hadoop Developer
Responsibilities:
- Architect, design, develop, test, deploy, and support Big Data applications on the Hadoop cluster with MapReduce (Java), Spark (Scala/Python), Kafka, Sqoop, Flume, HBase, Pig, Hive/Beeline, and Impala.
- Importing large volumes of data from RDBMS sources (DB2, Oracle) into HDFS using Sqoop (a Python-driven Sqoop sketch follows this list).
- Exporting data from Hadoop to SQL Server to create data marts.
- Stored, accessed, and processed data in different file formats such as Avro, ORC, and Parquet.
- Developed multiple Spark jobs in Scala/Python for data cleaning, pre-processing, and aggregation (see the PySpark sketch after this list).
- Developing Shell/Python programs that generate code at run time to process data and automate workflows in the Hadoop/Spark environment.
- Developing Spark-based transformation and analytical jobs.
- Involved in creating Hive tables, loading them with data, and writing Hive/Impala queries.
- Loading XML files into HDFS and processing them with Spark/Hive.
- Creating tables in HBase.
- Loading data into HBase using Pig, Hive, and the Java APIs.
- Performance tuning Hive, Pig, and Spark applications and the HBase NoSQL database.
- Developed UDFs to implement complex transformations on Hadoop.
- Load log data into HDFS using Kafka.
- Setting up Kafka brokers, producers, and consumers.
- Setting up data pipelines into Hadoop with Kafka, Flume, and Sqoop.
- Created partitioned tables in Hive for best performance and faster querying.
- Designed NoSQL schemas in HBase.
- Developed Spark ETL/ELT processes with Sqoop, Scala/Python, Hive, and Pig.
- Perform analysis and author Spark SQL queries for business-facing reports.
- Managing and scheduling jobs on a Hadoop cluster using Control-M.
- Responsible for managing data coming from different sources.
- Experience in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Working in Agile and Scrum software development methodologies.
- Working on a highly secured Kerberized cluster with no compromise on data security or security policies.
- Supports development and testing teams with Hadoop environment needs, code promotions, and code deployments.
- Ensure the quality and integrity of data are maintained as design and functional requirements change.
- Work closely with Hadoop admins.
- Develop and document design/implementation impacts based on system monitoring reports.
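Illustrative sketch (PySpark), referenced from the Spark data-cleaning bullet above: a minimal cleaning/aggregation job that writes its result as a Hive table partitioned for faster querying. The paths, database, table, and column names are placeholders, not the actual project schema.

# Minimal PySpark sketch: clean raw records, aggregate them, and write a
# partitioned Hive table in Parquet. Names and paths are illustrative only.
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("txn-cleanse-aggregate")
         .enableHiveSupport()          # allow reading/writing Hive tables
         .getOrCreate())

# Read raw data previously landed on HDFS (e.g. by a Sqoop import).
raw = spark.read.parquet("/data/raw/transactions")

# Basic cleansing: drop duplicates, discard rows missing key fields,
# and normalize a string column.
clean = (raw.dropDuplicates(["txn_id"])
            .filter(F.col("account_id").isNotNull())
            .withColumn("channel", F.lower(F.trim(F.col("channel")))))

# Daily aggregate per account.
daily = (clean.groupBy("account_id", "txn_date")
              .agg(F.count("*").alias("txn_count"),
                   F.sum("amount").alias("total_amount")))

# Write as a Hive table partitioned by txn_date for partition pruning.
(daily.write
      .mode("overwrite")
      .format("parquet")
      .partitionBy("txn_date")
      .saveAsTable("analytics.daily_account_summary"))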
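Illustrative sketch (Python), referenced from the Sqoop import bullet above: a small wrapper that generates sqoop import commands at run time for a list of source tables and launches them. The JDBC URL, credentials path, table names, and HDFS directories are assumed placeholders, not the actual configuration.

# Hypothetical Python wrapper that builds and runs sqoop import commands.
# Connection details, table list, and target paths are placeholders.
import subprocess

JDBC_URL = "jdbc:db2://db2host:50000/SALESDB"
PASSWORD_FILE = "/user/etl/.db2_password"   # HDFS path read by Sqoop
TABLES = ["CUSTOMER", "ACCOUNT", "TRANSACTION"]

def build_sqoop_cmd(table):
    """Assemble the sqoop import command for one source table."""
    return [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", "etl_user",
        "--password-file", PASSWORD_FILE,
        "--table", table,
        "--target-dir", "/data/raw/%s" % table.lower(),
        "--as-parquetfile",          # land the data on HDFS as Parquet
        "--num-mappers", "4",
        "--delete-target-dir",       # make the load re-runnable
    ]

if __name__ == "__main__":
    for tbl in TABLES:
        cmd = build_sqoop_cmd(tbl)
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)   # fail fast if an import errors out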
Environment: Linux, Hadoop, Hive, HBase, GIT, Pig, Java, Python, Scala, Flume, Kafka, MapReduce, Sqoop, Spark, SQL, DB2, Teradata, and Oracle.
Confidential, Elmhurst, IL
Lead Hadoop Developer
Responsibilities:
- Worked on a multi-node Hadoop environment with MapReduce, Kafka, Sqoop, Oozie, Flume, HBase, Pig, Hive, and Impala.
- Worked with different file formats such as Avro, ORC, and Parquet.
- Involved in migration of data from existing RDBMS (Oracle and SQL Server) to Hadoop using Sqoop for processing data.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Developing Python scripts to process data and automate workflows in the Spark environment.
- Involved in creating Hive Tables, loading with data and writing Hive queries, which will invoke and run MapReduce jobs in the backend.
- Creating tables in HBase.
- Loading data into HBase using Pig, Hive, and the Java APIs (an illustrative Python sketch follows this list).
- Performance tuning in HBase.
- Creating datasets and building lenses on Platfora.
- Creating Vizboards (dashboards) on Platfora.
- Exposure to Spark iterative processing.
- Extract and load data from DB2 and Mainframe tape files and copy it over to HDFS.
- Analyzed the data using Hive queries and Pig scripts to study customer behavior.
- Developed UDFs to implement business logic in Hadoop.
- Load log data into HDFS using Flume.
- Created partitioned tables in Hive for best performance and faster querying.
- Designed and implemented Map Reduce jobs to support distributed data processing.
- Designed NoSQL schemas in HBase.
- Developed MapReduce ETL in Java and Pig.
- Responsible for performing extensive data validation using Hive.
- Imported and exported the data using Sqoop between HDFS and relational database systems.
- Managing and scheduling jobs on a Hadoop cluster using Oozie and Control-M.
- Responsible for managing data coming from different sources.
- Experience in loading and transforming large sets of structured, semi-structured, and unstructured data.
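Illustrative sketch (Python), referenced from the HBase load bullet above. The project loads used Pig, Hive, and the Java API; purely as an illustration, the same row-put pattern is shown here through happybase (the HBase Thrift gateway). The host, table name, column family, and row-key scheme are assumed placeholders.

# Illustrative HBase load via happybase; all names are placeholders.
import happybase

connection = happybase.Connection("hbase-thrift-host", port=9090)
table = connection.table("customer_profile")

rows = [
    ("cust#1001", {"info:name": "A. Smith", "info:segment": "retail"}),
    ("cust#1002", {"info:name": "B. Jones", "info:segment": "smb"}),
]

# Batch the puts so they are flushed to the region servers together.
with table.batch(batch_size=1000) as batch:
    for row_key, columns in rows:
        # happybase expects bytes for row keys, qualifiers, and values.
        batch.put(row_key.encode(),
                  {col.encode(): val.encode() for col, val in columns.items()})

connection.close()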
Environment: Linux, Hadoop, Hive, HBase, GIT, Pig, Java, Python, Scala, Flume, Kafka, MapReduce, MongoDB, Sqoop, Spark, SQL, DB2, Teradata, and Oracle.
Confidential, Bloomington, IL
Hadoop Developer
Responsibilities:
- Developed Big Data analytic models using Hive.
- Developed multiple MapReduce jobs in Java for data cleaning and pre-processing.
- Involved in creating Hive Tables, loading with data and writing Hive queries, which will invoke and run MapReduce jobs in the backend.
- Extract and load data from DB2 and Mainframe tape files and copy it over to HDFS.
- Create statistical reports as needed for quantitative analysis by the business management team.
- Importing and exporting data into HDFS, HBase, and Hive using Sqoop.
- Automate processes and transform data with Python scripting.
- Analyzed the data using Hive queries and Pig scripts to study customer behavior.
- Developed UDFs to implement business logic in Hadoop.
- Developed extraction modules using Hive from policy and bank data.
- Designed and implemented Map Reduce jobs to support distributed data processing.
- Processed large data sets on the Hadoop cluster.
- Designed NoSQL schemas in HBase.
- Loading data into HBase using Pig, Hive, and the Java APIs.
- Developed MapReduce ETL in Java and Pig.
- Responsible for performing extensive data validation using Hive (see the validation sketch after this list).
- Imported and exported the data using Sqoop between HDFS and relational database systems.
- Creating datasets and building lenses on Platfora.
- Creating Vizboards (dashboards) on Platfora.
- Managing and scheduling jobs on a Hadoop cluster using Oozie.
- Responsible for managing data coming from different sources.
- Experience in loading and transforming large sets of structured, semi-structured, and unstructured data.
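Illustrative sketch (Python), referenced from the data-validation bullet above: a simple reconciliation that compares an expected row count against the loaded Hive table over HiveServer2 using PyHive. The host, database, table, partition value, and expected count are assumptions made for illustration only.

# Illustrative row-count validation over HiveServer2 using PyHive.
from pyhive import hive

EXPECTED_ROW_COUNT = 1250000   # e.g. taken from the source extract log

conn = hive.connect(host="hiveserver2-host", port=10000, database="staging")
cursor = conn.cursor()

# Count rows for one load date in the target Hive table.
cursor.execute(
    "SELECT COUNT(*) FROM policy_claims WHERE load_date = '2015-06-30'"
)
(hive_count,) = cursor.fetchone()

if hive_count == EXPECTED_ROW_COUNT:
    print("Validation passed: %d rows" % hive_count)
else:
    raise SystemExit("Validation failed: Hive has %d rows, expected %d"
                     % (hive_count, EXPECTED_ROW_COUNT))

cursor.close()
conn.close()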
Environment: Linux, Hadoop, Hive, HBase, Pig, Python, Platfora, Flume, MapReduce, MongoDB, Sqoop, SQL, SVN, DB2, Teradata, and Oracle.
Confidential, Bloomington, IL
ETL Developer
Responsibilities:
- Analyzed source systems and worked with business analysts to identify, study, and understand requirements and translate them into ETL code.
- Performed analysis on the quality and source of data to determine the accuracy of reported information and analyzed data relationships between systems.
- Worked on the complete life cycle of extraction, transformation, and loading of data using Informatica.
- Used Informatica's features to implement Type I and Type II changes in slowly changing dimension tables (the SCD sketch after this list illustrates the Type II pattern) and developed complex mappings to facilitate daily, weekly, and monthly data loads.
- Prepared high-level design document for extracting data from complex relational database tables, data conversions, transformation and loading into specific formats.
- Designed and developed the Mappings using various transformations to suit the business user requirements and business rules to load data from Oracle, SQL Server, DB2, Teradata and flat file.
- Developed standard and re-usable mappings and mapplets using various transformations like Expression, Lookups, Joiner, Filter, Source Qualifier, Sorter, Update strategy and Sequence generator.
- Used Debugger to test the data flow and fix the mappings.
- Created Sessions and Workflows to load data from the SQL Server, flat file, DB2, Teradata, and Oracle sources that exist on servers located at various locations all over the country.
- Involved in unit testing, Integration testing and User acceptance testing of the mappings.
- Used various performance enhancement techniques to improve session and workflow performance.
- Performed performance tuning on sources, targets, mappings, and SQL (query optimization).
- Responsible for creating business solutions for Incremental and full loads.
- Installed and configured the Power Center tool, including database connections.
- Providing technical support and troubleshooting issues for business users.
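Illustrative sketch (Python), referenced from the slowly-changing-dimension bullet above. The actual Type I/II logic was built in Informatica mappings; the short pandas script below only illustrates the Type II pattern conceptually (expire the current row version, insert a new one with effective dates), using made-up columns and data.

# Conceptual SCD Type II illustration (expire-and-insert), not the
# Informatica implementation; columns and sample data are made up.
import pandas as pd

HIGH_DATE = "9999-12-31"   # open-ended sentinel marking the active version
LOAD_DATE = "2011-04-01"

# Current dimension rows: only active versions carry the sentinel end date.
dim = pd.DataFrame({
    "cust_id": [1, 2],
    "city": ["Chicago", "Tampa"],
    "eff_start": ["2010-01-01", "2010-01-01"],
    "eff_end": [HIGH_DATE, HIGH_DATE],
})

# Incoming snapshot from the source system.
src = pd.DataFrame({"cust_id": [1, 2], "city": ["Chicago", "Orlando"]})

# Find customers whose tracked attribute changed since the active version.
merged = dim[dim["eff_end"] == HIGH_DATE].merge(
    src, on="cust_id", suffixes=("_old", "_new"))
changed = merged[merged["city_old"] != merged["city_new"]]

# Type II step 1: close out the current version of each changed customer.
dim.loc[dim["cust_id"].isin(changed["cust_id"]) &
        (dim["eff_end"] == HIGH_DATE), "eff_end"] = LOAD_DATE

# Type II step 2: insert a new row version carrying the updated attribute.
new_rows = changed[["cust_id", "city_new"]].rename(columns={"city_new": "city"})
new_rows["eff_start"] = LOAD_DATE
new_rows["eff_end"] = HIGH_DATE
dim = pd.concat([dim, new_rows], ignore_index=True)

print(dim.sort_values(["cust_id", "eff_start"]))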
Environment: Informatica 8.6.1, Business Objects, Oracle 10g, SQL Server 2008, DB2, Teradata, Flat Files, Windows 2000, Unix.
Confidential, Tampa, FL
Informatica Developer
Responsibilities:
- Identified data source system integration issues and proposed feasible integration solutions.
- Partnered with Business Users and DW Designers to understand the processes of Development Methodology, and then implement the ideas in Development accordingly.
- Worked with Data modeler in developing STAR Schemas and Snowflake schemas.
- Created Oracle PL/SQL queries, stored procedures, packages, triggers, and cursors, and set up backup and recovery for the various tables.
- Identified and tracked slowly changing dimensions (SCDs).
- Extracted data from Oracle, flat file, and Excel sources and used Joiner, Expression, Aggregator, Lookup, Stored Procedure, Filter, Router, and Update Strategy transformations to load data into the target systems.
- Created reusable Mailing alerts, events, Tasks, Sessions, reusable worklets and workflows in Workflow manager.
- Scheduled the workflows at specified frequency according to the business requirements and monitored the workflows using Workflow Monitor.
- Fixed invalid mappings, debugged mappings in the Designer, and performed unit and integration testing of Informatica sessions and workflows.
- Extensively used TOAD for source and target database activities.
- Involved in the development and testing of individual data marts, Informatica mappings and update processes.
- Created the repository, users, groups, and their privileges using Informatica Repository Manager.
- Involved in writing UNIX shell scripts for the Informatica ETL tool to run the sessions.
- Generated simple reports from the data marts using Business Objects.
Environment: Informatica Power Center 8.6.1, Business Objects XI, Oracle 9i, PL/SQL, SQL, UNIX Shell Programming, UNIX, and Windows NT.
Confidential, Irving, TX
Mainframe Developer
Responsibilities:
- Understanding the High level, Low level design documents and technical specification documents.
- Coding complex application programs using COBOL, PL/1, CICS, MVS JCL, VSAM, DB2, TSO, and SAS, and testing the same.
- Preparing the validation and loading jobs with LOAD RESUME and LOAD REPLACE.
- Resolving trouble tickets in Lotus Notes.
- Supporting the weekly production installs.
- Preparing the NDM and SFTP jobs.
- Migrating data from IMS to DB2.
- Working in a multi-LPAR MVS environment.
- Creating and updating control cards.
- Scheduling jobs in CA7.
- Used INSYNC to perform various operations on the database.
- Used various IBM utilities like DFSORT, IEBGENER, and IDCAMS to write various jobs depending on requirements.
Confidential, Addison, TX
Programmer Analyst
Responsibilities:
- 24x7 application production support, first point of contact.
- Involved in Production support by Monitoring Batch cycle, accepting production tickets and solving the same.
- Application enhancements using COBOL, PL/1, JCL, VSAM, DB2.
- Involved in production installs and creating ENDEVOR packages for production implementation.
- Refresh various test environments with production data and run test cycles in various test environments depending on the request from the client.
- Write complex queries using SPUFI to perform various operations on database.
- Used NDM tool to transfer the data.
- Working with PROCs and JCL overrides.
- Working in a multi-LPAR MVS environment.
- Working with control cards.
- FTP various files to the mainframe and modify/send them to the client as per the request.
- Used File-Aid to create various VSAM clusters and GDGs.
- Testing programs using various testing tools like XPEDITOR and debugging run time problems.
- Perform various activities on the database using the ‘DATABASE COMMANDS’ utility, such as ‘termination of active utilities’ to overcome ‘Resource Unavailability’ problems.
Confidential, Charlotte, NC
Programmer Analyst
Responsibilities:
- Understanding requirements and preparation/review of High level, Low level design documents and technical specification documents.
- Migrating the data from IMS to DB2.
- Coding the Applications using COBOL, CICS, MVS JCL, VSAM, DB2, and TSO and testing the same.
- Conducting quality reviews.
- Writing the Stored Procedures.
- Data mapping.
- Working with the WebSphere MQ messaging system.
- Preparing the validation and loading jobs.
- Preparing the NDM jobs.
- Scheduling the jobs in CA7.
Environment: OS/390, COBOL, JCL, DB2, IMS, VSAM, CICS, File-Aid, File-Aid/DB2, QMF