
Big Data / Data Lake Production Support Resume



  • 5 years of experience as a Big Data / Data Lake / Data Warehousing Production Support Analyst, covering data analysis, ETL design, development, QA testing, enhancement, production support, and maintenance of data lake and data warehouse applications using Hadoop, HBase, Hive, Sqoop, IBM InfoSphere Information Server DataStage 11.5, SQL, Unix shell scripting, Oracle, DB2, SQL Server, Cognos, Control-M, and Autosys.
  • Highly motivated, with the ability to work effectively both on a team and independently. Excellent interpersonal and communication skills; experienced in working with senior-level managers, business people, and developers across multiple disciplines. Able to grasp and apply new concepts quickly and effectively.
  • Good technical knowledge of data lake / HDFS systems (HBase, HDFS, Hive, Phoenix, Infogix, Pig, Python, Linux, Zena, Control-M), relational and dimensional data models (star and snowflake schemas), fact and dimension tables, slowly changing dimensions (SCD), normalization, data extraction, and data integration.
  • Provided frameworks for cost reduction, such as proactive monitoring alerts on Hadoop queues and edge nodes, Pepperdata alerts, incident auto-assignment, and auto-resolve/auto-transfer.
  • Automated several report-generation tasks; currently focused on reducing incidents, failures, and manual effort, and on finding and implementing offshoring opportunities.
  • Experienced in daily sync-up meetings with offshore teams and client SMEs, and in ad hoc business-user meetings.
  • Prepares daily, weekly, and monthly status reports covering all project-critical incidents and updates for the client, including the abend tracker, HPI tracker, SLA baselining, incident variance, SI estimate vs. actual, CSI tracker, WSR, release coordination, the non-ticketed activities sheet, release fallout, debt-category data, and defect tracking.
  • Coordinates with the business, product, and ITG teams on government submission activities; sends status communications, MPP updates, and job-monitoring reports.
  • Performs daily checkout monitoring of critical processes and overall application health, with daily offshore/onshore handshakes: defect analysis and documentation, daily ticket-assignment sessions, new-process and release transparency, and SI KT sessions and meetings.
  • Handles all file-processing issues and communicates with the business and users after each file process.
  • Involved in new service-intake discovery/KA sessions and MPH walkthroughs.
  • Performs analyses of duplicate and missing files, and handles requests from internal teams and queries from business users.
  • Experienced in performance tuning of all business-critical, SLA-bound code (at the job or step level as needed).
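The duplicate- and missing-file analyses mentioned above can be sketched as a small utility; this is a minimal, illustrative version, and the directory layout and manifest format are assumptions, not the actual production process.

```python
import hashlib
from pathlib import Path

def audit_landing_zone(landing_dir, expected_names):
    """Compare files in a landing directory against an expected manifest.

    Returns (missing, duplicates): names absent from the directory, and
    groups of files whose contents are byte-identical (same MD5 digest).
    """
    landing = Path(landing_dir)
    present = {p.name: p for p in landing.iterdir() if p.is_file()}
    missing = sorted(set(expected_names) - set(present))

    # Group files by content digest; any group larger than one is a duplicate set.
    by_digest = {}
    for name, path in present.items():
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        by_digest.setdefault(digest, []).append(name)
    duplicates = [sorted(group) for group in by_digest.values() if len(group) > 1]
    return missing, duplicates
```

In practice the expected-name list would come from the day's file-transmission schedule, and the result would feed the communication sent to the business after each file process.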


Big Data Technologies: Hadoop, MapReduce, Pig Latin, Hive, HBase, ZooKeeper, Cloudera Distribution of Hadoop, Apache Spark, Scala, Apache Kafka

Data Loading Techniques: Sqoop, Flume, Kafka

ETL Tools: IBM InfoSphere DataStage (versions 7.x, 8.x, and 11.x)

RDBMS: Teradata, Oracle, MySQL, DB2

Scheduling: Autosys, Control-M, Zena

Languages & APIs: Scala, Java, Pig, Hive, HBase, MapReduce, Spark, SQL, PL/SQL, Unix shell

Frameworks: Apache Hadoop, Apache Spark, MapReduce

Operating Systems: Linux/Unix, Windows

NoSQL: HBase

Application Development Tools: WinSCP, PuTTY, SQL Developer, HP ALM

Methodologies: Agile, Scrum, Waterfall


Confidential, IL



  • Planning new service intakes, from discovery through steady state.
  • Prepares daily, weekly, and monthly status reports covering all project-critical incidents and updates for the client, including the abend tracker, HPI tracker, SLA baselining, incident variance, SI estimate vs. actual, CSI tracker, WSR, release coordination, the non-ticketed activities sheet, release fallout, debt-category data, and defect tracking.
  • Finds automation and continuous-service-improvement opportunities in the existing production code, fixes what is feasible, and recommends out-of-scope items to the automation teams.
  • Application support for existing deployed projects.
  • Daily batch-cycle monitoring, issue fixing, and alerting.
  • Operational metrics reporting and support
  • Support the Data Lake Infrastructure issues
  • Scheduling system support (Autosys, Control-M, Zena)
  • User Access and Security work
  • Release Management Support Activities
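The daily batch-cycle monitoring and alerting above can be sketched as follows. This is an illustrative stand-in: the status-report format and job names are hypothetical, standing in for whatever export the scheduler (Autosys, Control-M, or Zena) actually provides.

```python
import csv
import io

# Hypothetical one-line-per-job status export from the scheduler.
SAMPLE_REPORT = """\
job,status
DL_INGEST_CLAIMS,SUCCESS
DL_HIVE_LOAD,FAILURE
DL_SQOOP_MEMBERS,SUCCESS
DL_EXPORT_FEED,FAILURE
"""

def failed_jobs(report_text):
    """Return the names of jobs whose status is not SUCCESS."""
    reader = csv.DictReader(io.StringIO(report_text))
    return [row["job"] for row in reader if row["status"] != "SUCCESS"]

def build_alert(failures):
    """Format a checkout alert; empty string when the cycle is clean."""
    if not failures:
        return ""
    return "Batch checkout: %d abend(s): %s" % (len(failures), ", ".join(failures))
```

A script along these lines would run during the daily checkout window and feed the offshore/onshore handshake and the abend tracker.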

Environment: HBase, HDFS, Hive, PIG, Sqoop, Scala, Spark, MapReduce, Unix Shell Script, Python, Phoenix, Infogix, Linux, Zena, Control-M

Confidential, Deerfield, IL



  • Developed DataStage jobs, SQL, and Unix scripts based on an understanding of the client's requirements.
  • Involved in performance tuning of jobs, identifying and resolving performance issues.
  • Supported the operations team in resolving user issues.
  • Helped the front-end team load aggregate tables and validate the data.
  • Involved in unit testing; supported the operations team in resolving user issues.
  • Scheduled and automated the whole data-loading process based on job dependencies.
  • Participated in all stages of the development life cycle, including requirement analysis, design, development, and implementation of the system.
  • Worked on ETL QA and complete testing.
  • Collaborated with business analysts to review the business specifications of the project and to gather the ETL requirements.
  • Worked with data architects on table design and was involved in modifying technical specifications.
  • Involved in extraction, transformation, and loading of data.
  • Worked on error-handling techniques and tuned the ETL flow for better performance.
  • Performed code reviews of DataStage jobs; worked with the development team to develop reliable, maintainable ETL jobs that process high-volume data from various sources.
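The ETL QA and data-validation work above typically boils down to reconciling source and target extracts. The sketch below is a minimal, hypothetical version of that check; the column names are illustrative, and in practice the extracts would come from the databases rather than in-memory lists.

```python
def reconcile(source_rows, target_rows, key):
    """Row-level ETL reconciliation between source and target extracts.

    Rows are dicts; `key` names the business-key column. Returns two
    sorted lists: keys missing from the target, and keys present in
    both whose remaining columns differ.
    """
    src = {r[key]: r for r in source_rows}
    tgt = {r[key]: r for r in target_rows}
    missing = sorted(k for k in src if k not in tgt)
    mismatched = sorted(k for k in src if k in tgt and src[k] != tgt[k])
    return missing, mismatched
```

A check like this, run after each load, catches both dropped rows and silently corrupted values before the business sees them in reports.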

Environment: IBM InfoSphere Information Server DataStage 11.5, SQL, PL/SQL, Unix shell scripting, Oracle, DB2, SQL Server, Cognos, MicroStrategy, Autosys.

Confidential, MI

ETL Developer


  • Created ETL mappings with the DataStage integration suite to pull data from sources, apply transformations, and load the data into the target database.
  • Prepared an ETL mapping document for every mapping, and a data migration document, for smooth transfer of the project from the development environment to testing and then to production.
  • Designed and implemented ETL for data loads from heterogeneous sources into SQL Server and Oracle target databases, for fact tables and Slowly Changing Dimensions (SCD Type 1 and SCD Type 2).
  • Developed ETL jobs for data exchange to and from the database server and various other systems, including RDBMS, XML, CSV, and flat-file structures.
  • Used Control-M to schedule ETL jobs on daily, weekly, monthly, and yearly cycles.
  • Gained complete software development life cycle (SDLC) experience, from business analysis through development, testing, deployment, and documentation.
  • Designed jobs in DataStage 7.5.2 PX; developed mappings in DataStage to load the data warehouse and data marts.
  • Created sequences that chain the DataStage jobs in the project.
  • Designed, developed, and performance-tuned DataStage ETL jobs.
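The SCD Type 2 loads above would in practice be DataStage jobs or SQL merges; as an illustration only, the same logic can be sketched in Python, with the dimension modeled as a list of versioned rows. The column names here are hypothetical.

```python
from datetime import date

def scd2_apply(dimension, incoming, today=None):
    """Apply one batch of source rows to an SCD Type 2 dimension.

    Each dimension row is a dict with keys: key, attrs, start, end,
    current. A changed attribute set expires the open row and inserts a
    new current one; unchanged rows are left alone; new keys are inserted.
    """
    today = today or date.today()
    open_rows = {row["key"]: row for row in dimension if row["current"]}
    for src in incoming:
        row = open_rows.get(src["key"])
        if row is not None and row["attrs"] == src["attrs"]:
            continue                      # no change: keep the open row
        if row is not None:
            row["current"] = False        # Type 2: expire the old version
            row["end"] = today
        dimension.append({"key": src["key"], "attrs": src["attrs"],
                          "start": today, "end": None, "current": True})
    return dimension
```

The same effective-date bookkeeping is what the SCD stage in DataStage (or a SQL `MERGE` with an expiry `UPDATE`) performs against the target dimension table.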

Environment: Oracle 10g/11g, DataStage 7.5.2, DB2, AIX, Control-M
