- 7+ years of experience in Business Analytics, Data Modeling, and Data Warehousing.
- Hands-on experience with Hortonworks and Cloudera Hadoop platforms.
- Worked on the Microsoft Azure platform, deploying IaaS and PaaS solutions.
- Worked with data analytics tools such as Spark, Hive, and Kafka.
- Worked with Hive, creating tables and writing queries for analysis and reporting.
- Working knowledge of Python, SQL, and Shell scripting.
- Involved in designing and building Data pipelines.
- Led data migration efforts between RDBMS and Hadoop environments.
- Worked with Hadoop ecosystem tools such as Hive, Sqoop, Flume, and ZooKeeper.
- Experienced working with NoSQL databases like HBase.
- Experienced in data warehousing (DWH), ETL best practices, and relational databases.
- Good knowledge of data architecture, data modeling, and data integration.
- Experienced with Git, Perforce, and other source code management tools.
- Experienced in all phases of the Software Development Life Cycle.
- Excellent communication skills; worked with multiple teams across the organization.
Data Analytics: Hadoop, MapReduce, Spark, Kafka, Storm, Hive.
Databases: HBase, Oracle, SQL Server, Teradata
Data Management Tools: Sqoop, ZooKeeper, Flume
Data Reporting Tools: Tableau, Zeppelin
ETL: Informatica PowerCenter
Languages: Python, SQL, Shell Scripting
Version Control SW: Perforce, Git, Jenkins, JIRA
IDE & Tools: Eclipse IDE, IntelliJ, Maven
Confidential, San Jose, CA
Big Data Engineer
- Worked on the FACT Reconciliation financial project on the Hadoop platform.
- Fixed JIRA defects; resolved variance issues observed in data; performed duplicate removal, data cleansing, data aggregation, and reporting.
- Worked with Braintree clients to resolve issues reported on Fuse reports and internal Rugo tables; wrote scripts to check periodic variance reported in the ledger.
- Queried and manipulated PROD and QA data sets using the Hive client.
- Worked on data quality issues using Jupyter notebooks and the Hive client.
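The deduplication and periodic-variance checks above can be sketched as follows. This is a minimal illustration using SQLite as a stand-in for Hive; the ledger table, its columns, and the sample values are all hypothetical, not the actual project schema.

```python
import sqlite3

# In-memory SQLite stands in for the Hive warehouse; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ledger (txn_id TEXT, period TEXT, amount REAL);
INSERT INTO ledger VALUES
  ('t1', '2020-01', 100.0),
  ('t1', '2020-01', 100.0),  -- duplicate row to be removed
  ('t2', '2020-01', 250.0),
  ('t3', '2020-02', 300.0);
""")

# Deduplicate: keep one row per (txn_id, period). In HiveQL this could
# equally be done with ROW_NUMBER() OVER (PARTITION BY txn_id, period).
dedup = conn.execute("""
    SELECT txn_id, period, MAX(amount) AS amount
    FROM ledger
    GROUP BY txn_id, period
""").fetchall()

# Periodic variance check: total per period after deduplication.
totals = dict(conn.execute("""
    SELECT period, SUM(amount) FROM (
        SELECT txn_id, period, MAX(amount) AS amount
        FROM ledger GROUP BY txn_id, period
    ) GROUP BY period
""").fetchall())
print(totals)  # e.g. {'2020-01': 350.0, '2020-02': 300.0}
```

The per-period totals can then be compared against the expected ledger figures to flag variances.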
Confidential, Mountain View, CA
- As part of the Data Services team, built a Hadoop cluster on the Azure platform and deployed various data analytics solutions.
- Worked on the Confidential Telemetry project, using Spark and Kafka to create data aggregations and develop meaningful insights based on cyber threat levels for the product team.
- Ingested Learning Platform data and developed Spark transformations to categorize data and generate course information and employee participation metric reports.
- Used Hive extensively for data analysis in streaming and batch processing jobs.
- Developed custom Python code to parse inbound CCS data of Confidential.
- Worked with the CPE, Product, Data Science, and Business teams to translate business requirements into technical specifications and to implement POCs.
- Performed performance tuning of Spark and Hive jobs, as well as OS-level tuning.
- Experienced working with various file formats such as CSV, Avro, Parquet, and JSON.
- Experienced working on data quality issues.
- Collected streaming log data and stored it in HDFS using Flume.
- Led data migration efforts from RDBMS sources such as Teradata into the Hadoop cluster for data processing, transformation, storage, and BI reporting.
- Involved in building and customizing Data pipelines using Azure Data Factory.
- Worked with ETL team members, Microsoft solution architects, and the infrastructure and networking teams to coordinate and implement migration projects.
- Generated BI reports in Tableau and published larger datasets using Tableau Server.
- Worked with Pentaho Data Integration to integrate multiple data sources, such as Splunk, for the Confidential Global Security team; supported R&D activities.
- Troubleshot issues in HDFS and in other tools such as Hive, Spark, and Oozie.
- Presented project technical details to management and documented them in Confluence.
- Handled and resolved many production application issues; worked with Hortonworks architects and analysts to obtain bug fixes.
Environment: Hortonworks, Azure, Hive, Spark, Kafka, Sqoop, Flume, Teradata.
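The categorize-and-aggregate step described for the Learning Platform data can be sketched in plain Python. In the actual pipeline this logic ran as Spark transformations; the record fields, categorization rule, and sample data here are hypothetical stand-ins.

```python
from collections import defaultdict

# Hypothetical Learning Platform records; the real data was ingested into
# Spark and transformed there (roughly: withColumn + groupBy + agg).
records = [
    {"employee": "e1", "course": "Security 101", "minutes": 30},
    {"employee": "e2", "course": "Security 101", "minutes": 45},
    {"employee": "e1", "course": "Spark Basics", "minutes": 60},
]

def categorize(course):
    # Toy categorization rule; the real categories came from business requirements.
    return "security" if "Security" in course else "engineering"

# Aggregate participation metrics per course category.
metrics = defaultdict(lambda: {"employees": set(), "minutes": 0})
for r in records:
    cat = categorize(r["course"])
    metrics[cat]["employees"].add(r["employee"])
    metrics[cat]["minutes"] += r["minutes"]

report = {c: {"participants": len(m["employees"]), "minutes": m["minutes"]}
          for c, m in metrics.items()}
print(report)
```

The resulting report (distinct participants and total minutes per category) is the kind of employee-participation metric that fed the generated reports.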
Confidential, San Diego CA
Big Data Engineer
- Involved in creating Hive tables, loading data, and writing Hive queries.
- Wrote and implemented custom Hive UDFs per business requirements.
- Worked on Hive performance tuning and Hadoop MapReduce operation optimization.
- Responsible for Performance tuning, Hadoop cluster management, patching, resolving network issues, cluster monitoring and reviewing log data to fix issues.
- Managed and reviewed Hadoop log files to optimize MapReduce job performance.
- Imported and exported data into HDFS and Hive using Sqoop.
- Performed analysis using Hive on the partitioned and bucketed data to compute various metrics for reporting using Tableau.
- Developed shell scripts to run Hive scripts according to business needs.
- Worked with and gained good expertise in NoSQL databases such as HBase and Cassandra.
- Experienced in PL/SQL scripting and programming.
- Trained and coordinated with offshore team members to meet tight deadlines.
Environment: Hadoop, Hive, Pig, Kafka, Storm, Python, MS SQL Server.
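Custom row-level logic like the Hive UDF work above is most often written in Java, but Hive can also stream rows through an external script via its TRANSFORM clause. The sketch below shows that pattern; the column names and normalization rule are hypothetical examples, not the actual business logic.

```python
import io
import sys

def transform(lines, out):
    # Hive's TRANSFORM clause streams rows to the script as tab-separated
    # text on stdin and reads tab-separated rows back from stdout.
    # This toy script normalizes a hypothetical 'region' column and
    # passes 'amount' through unchanged.
    for line in lines:
        region, amount = line.rstrip("\n").split("\t")
        out.write(f"{region.strip().upper()}\t{amount}\n")

# Demo with an in-memory stream; in production Hive pipes rows through
# stdin/stdout: transform(sys.stdin, sys.stdout)
buf = io.StringIO()
transform(["  west \t10\n", "East\t20\n"], buf)
print(buf.getvalue())
```

On the Hive side this would be invoked with something like `SELECT TRANSFORM(region, amount) USING 'python normalize.py' AS (region, amount) FROM sales;` (table and script names hypothetical).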
Confidential, San Diego
Sr. System Administrator
- Performed security management by maintaining roles, privileges, and user profiles; added and removed users from various Security Groups in the domain (auto groups).
- Ran, scheduled, and enabled SQL jobs; troubleshot failed jobs, provided job log files to dev teams, and monitored the health of the production databases.
- Monitored mirroring, log shipping, and replication.
- Scheduled backup jobs and performed object-level recovery using the maxima tool.
- Monitored online and scheduled jobs.
Environment: Oracle, MS SQL Server, DB2, Tableau v7 (Desktop/Server), MS Visio, Word, Excel, Access, HTML, XML, Agile Methodology, Shell Scripting.
- Configured and maintained various MySQL databases for applications such as Drupal.
- Set up single sign-on on application servers using SSL keys.
- Worked on setting up ProcessMaker and integrating it with Shibboleth (SSO).
- Worked on creating a cluster of MySQL databases on Mac OS.
- Experienced with installing Drupal multi-site and migrating data.
Environment: Oracle, MS SQL Server, DB2.