Sr. Hadoop Developer Resume
Columbus, OH
PROFILE:
- MBA candidate with more than 6 years of professional experience in IT, including management system development, databases & analytics, and business intelligence, along with 2 years of experience in Big Data analytics as a Hadoop Developer. Strong knowledge of the System Development Life Cycle (SDLC) using Agile & Waterfall methodologies.
- Skilled in various areas of IT such as Big Data, ETL, database development, and data analytics, as well as project management & business management strategies such as Business Process Automation (BPA), Business Process Improvement (BPI), and Business Process Re-engineering (BPR).
- Experienced in HDFS Architecture and Cluster concepts.
- Hands-on experience with Hortonworks & Cloudera Distribution Including Apache Hadoop (CDH).
- Administered Hadoop ecosystem components such as HDFS, MapReduce, Hive, Sqoop, Pig, Flume, HBase, ZooKeeper, Kafka & Spark.
- Optimized MapReduce jobs using Combiners and Partitioners to analyze big data per requirements. Performed a POC on bucketing & partitioning to analyze cluster performance.
- Developed HiveQL & Pig Latin scripts for data analytics.
- Read data from local, XML, Excel, and JSON files in Python using the pandas module.
- Read data from SQL databases and from the web through APIs, and processed it for further use in Python using the pandas module.
- Learning Spark; developed Spark scripts using Scala shell commands as required.
- Experience in ETL design and development.
- Proficient in data mapping and in converting logical data models to physical database designs in a data warehousing environment.
- Experience with databases such as DB2, Oracle 9i, Oracle 10g, MySQL, SQL Server, and MS Access.
- Experience in developing test cases and performing unit and integration testing; QA experience with test methodologies and manual/automated testing.
- Strong hands-on experience with production support.
- Strong knowledge & experience of Project Management Knowledge Areas, Process groups & MIS.
- Experienced in financial data analysis using BI tools such as Tableau to provide growth insight about the product & market.
- Worked in an iterative, Agile Scrum SDLC with a strong ability to estimate and scope development projects.
- Extensive experience with various types of UML diagrams, including Use Case diagrams, Behavior diagrams (Sequence and Activity diagrams), and Data Flow diagrams.
- Prepared business reviews and presentations; fluent public speaking skills & strong client relationships.
- Strong analytical, logical, and troubleshooting skills; flexible in adapting to new technologies & business functions.
- Excellent verbal & written communication skills with a sense of ownership and drive to get things done in diverse teams.
TECHNICAL EXPERTISE:
Hadoop/Big Data Technologies: HDFS, MapReduce, Hive, Pig, Sqoop, Flume, HBase, Oozie, ZooKeeper, YARN, Spark, Kafka, Python
Programming Languages: C/C++, PL/SQL, Java, Python, Spark
Application Packages: MS Visio, MS Project, MS Office Suite, RDBMS, Oracle 11g
Operating Systems: UNIX, LINUX, Windows
Databases: Oracle 8i/9i/10g, Microsoft SQL Server, DB2 & MySQL 4.x/5.x
IDEs: Eclipse 3.x, PyCharm, Pystudio, PyScripter
Tools: Eclipse, SQL Developer, Informatica, Tableau 8.x/9.x, TortoiseSVN
Scripting: Bash, SQL, HiveQL, Shell Scripting
Methodologies: SDLC, SCRUM, Agile, UML, Waterfall model
PROFESSIONAL EXPERIENCE:
Confidential, Columbus, OH
Sr. Hadoop Developer
Responsibilities:
- Worked as a core team member on the complete architectural development & implementation of Hadoop technology at Confidential, from the ground up to a stable Hadoop ecosystem.
- Involved in architectural decisions on Hadoop for application development and on ecosystem components such as Hive, Pig, ZooKeeper, Flume, HBase, and Sqoop.
- Performed research & a POC on Hive to analyze partitioned and bucketed data and compute various metrics to determine performance on the Hadoop cluster.
- Used Sqoop to import data from RDBMS into HDFS and to export data from HDFS back to RDBMS.
- Involved in installing and configuring Kerberos for the authentication of users and Hadoop daemons.
- Created Hive tables; involved in data loading and writing Hive UDFs.
- Created Hive external tables on the existing HDFS file systems.
- Experienced in managing and reviewing Hadoop log files.
- Performed data analysis and queries with Hive and Pig on Ambari (Hortonworks).
- Imported and exported data between SQL Server/Teradata and HDFS/Hive using Sqoop.
- Performed development, deployment, job scheduling, testing, validation & troubleshooting of data from development to production environments.
- Involved in writing shell scripts to manage Access Control List (ACL) permissions for roles, Active Directory, UNIX directories & file systems.
- Explored Spark to improve the performance and optimization of existing Hadoop algorithms using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN.
- Developed Spark code using Spark SQL and Spark Streaming for faster testing and processing of data.
- Loaded data into Spark RDDs and performed in-memory computation to generate the output response.
- Provided technical and analytical expertise for complex, specialized report requests requiring higher-level data analysis and data management services.
- Performed a POC/analysis on the number of mappers & reducers required for fast processing, given the data size and cluster size.
- Involved in documenting the high-level & detailed design of the complete development phase.
- Monitored production deployment for successful implementation of data.
- Worked extensively on the company's change control process required for Hadoop implementation & documented it to help the Hadoop team understand & follow the overall process.
- Worked on source-to-target data mapping to analyze data transformations for business analysis.
- Established healthy working relationships with the team & management to ensure an efficient & productive workspace; team members helped one another to avoid project delays and meet higher management expectations.
- Learned new technologies through company training & self-study to contribute to continuous innovation in technology.
Environment: Hadoop, MapReduce, Hive, HDFS, Pig, Sqoop, Flume, HBase, Spark, Python, ZooKeeper, Ambari (Hortonworks), Tidal, SQL Server, Teradata, MySQL, PL/SQL, UNIX, TortoiseSVN, MS Visio, TFS, PyCharm, Pystudio, PyScripter
Confidential, Austin, TX
Hadoop Consultant
Responsibilities:
- Configured Apache Hadoop clusters for application development, along with Hadoop tools such as Hive, Pig, ZooKeeper, and Sqoop.
- Developed shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
- Involved in collecting and aggregating large amounts of log data and staging it in HBase/HDFS for further analysis.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Used Sqoop to import and export data from HDFS to RDBMS and vice versa.
- Imported and exported data between Oracle/DB2 and HDFS/Hive using Sqoop.
- Involved in writing shell scripts to automate rolling day-to-day processes.
- Performed data analysis and querying with Hive and Pig on Cloudera Distribution Including Apache Hadoop (CDH).
- Transformed massive amounts of raw data into actionable analytics, including financial, market, and product analysis, using BI tools.
Environment: Hadoop, MapReduce, Hue, Hive, HDFS, Pig, Sqoop, Cloudera, ZooKeeper, CDH4 & CDH5, Oracle, PL/SQL, Linux, Tableau
Confidential
Asst. Manager (Big Data Analytics & Information Management)
Responsibilities:
- Involved in developing the BI solution architecture with the team, including reporting, ETL, database platforms, and applications, and in driving BI design to enable greater data quality, integrity, consistency, and access.
- Supported business decision-making through the design and development of ETL processes and data warehouse infrastructure: developed and evaluated logical and physical models, defined strategies, prepared transformation rules, mapped data from source to target, and developed application workflows.
- Reviewed, maintained, and updated metadata and business logic documentation on the BI environment for users.
- Created the TDS, support documents, and architecture review documents.
- Strong knowledge of business processes; performed gap analysis, cost-benefit analysis, and SWOT analysis, and documented requirements.
- Proficient with statistical reporting, analysis and documentation in ERP software systems.
- Involved in interacting with business users and in gathering and analyzing source data for enterprise data warehouse schemas.
- Mentored and trained 4 personnel to assume duties as promotions were implemented.
- Experienced with financial and clinical data, including claims and other clinical program data.
- Created reports from multiple sources using Web Intelligence, including combined queries, slice and dice, drill-down, cross-tab, and master-detail reports.
- Implemented project BETA (Business Excellence through Transformation Activity), a change management initiative to improve business operations through new ideas.
- Performed staff scheduling and annual performance reviews for all employees.
- Worked on design analysis, requirements management, business modeling and validation.
- Created budget templates to support forecasting, allocations and department reporting.
- Completed month-end department audits to eliminate shrinkage.
- Leveraged technical, commercial, and engineering capabilities in the application engineering and embedded software space to enable tangible customer success.
Confidential
Senior Engineer (Business Intelligence)
Responsibilities:
- Provided recommendations and requirements for possible enhancements that will expand the scope of possible analytics and reporting solutions or simplify their development.
- Audited and validated the data warehouse (DW) to ensure accuracy of data and related dimensions, measures, and reports.
- Identified any resource dependencies, such as access to and availability of environments (source, staging, target system), tools, software licenses, or personnel.
- Administered users, user groups, and scheduled instances for reports in Tableau.
- Monitored on-going reports to ensure that refreshes occur as scheduled.
- Gathered Business Requirements, analyzed data/workflows, and defined the system scope.
- Identified, researched, coordinated, and implemented process improvement opportunities. Prepared testing scenarios and conducted walkthroughs.
Confidential
Engineer (Data Analyst)
Responsibilities:
- Wrote PL/SQL procedures for database jobs and other monthly and weekly maintenance tasks.
- Researched & worked on performance tuning daily and supported production readiness testing in the performance lab database.
- Performed schema refreshes from Production to QA and other test environments for testing.
- Provided on call support for various database issues like Oracle errors, slow performance and system maintenance issues.
- Migrated SQL Server databases from one server to another during maintenance windows.
- Scheduled cron jobs for day-to-day database jobs and other monitoring tasks at the database and UNIX level.
- Involved in analyzing real-time data and performance tuning.
- Worked with Production, QA and development database servers.
- Worked on identifying and troubleshooting bugs.
Environment: Oracle, PL/SQL, Linux, Teradata, Informatica PowerCenter, MS Visio, MS Word, UML, Business Objects XI R2, Web Intelligence XI R2, Desktop Intelligence, Windows Server 2003, MS Project
