Big Data Engineer/Data Scientist Resume

Atlanta, GA

SUMMARY:

  • Tech lead and management professional with 10+ years of overall experience, including extensive work in big data, machine learning, and deep learning.
  • Prepared system architecture and component design documentation and performed code reviews.
  • Experience with big data tools: Hadoop, Spark, Hive, Pig, HBase, Sqoop, Kafka, etc.
  • Implemented a big data solution to mask PII (Personally Identifiable Information) in non-production instances with encrypted values.
  • Proficient in SQL queries and triggers; developed a notification system that pushes a notification into a Flume agent when data is added to MySQL, backed by MySQL triggers.
  • Developed Python and R scripts to run various data models, including quantitative, statistical, and clustering models.

TECHNICAL SKILLS:

Operating Systems: Mac, Windows 10, Linux, Ubuntu

Programming Languages: Python, SQL, PL/SQL, HTML5, CSS3, JavaScript, Java, REST, Shell Scripts, R

Big Data: Hadoop, MapReduce, Hive, Pig, HBase, Spark, API Integration

Machine Learning: Feature engineering, KNN, SVM, Naïve Bayes, Decision Trees & Forests

NLP: LSA, topic modeling, dimensionality reduction, data mining

Deep Learning: Long Short-Term Memory Networks, Multilayer Perceptron Network

Methodologies: Object Oriented, SDLC, Agile, Waterfall

Software: Microsoft Office, Microsoft Visio, Microsoft Project

Databases: MongoDB, PostgreSQL, Oracle 8, 8i, 9i, 10g and 11g, SQL Server 2008/2012, MySQL, DB2

Reports: Tableau, Crystal Reports

Tools: AWS, Docker, Toad, Visual Studio .NET, SQL*Plus, SQL*Loader, Git, GitHub

ERP: Oracle Financials R11 & R12, SAP, Banner

ETL: Informatica, SQL Server Integration Services, SQL Server DTS, DataStage

EXPERIENCE:

Confidential, Atlanta, GA

Big Data Engineer/Data Scientist

Responsibilities:

  • Teradata to Hadoop Migration & Data Analytics project
  • Responsible for the Teradata to Hadoop migration, working with the Data Science team.
  • Automated the data migration from Teradata to Hadoop using MS Azure.
  • Developed Python and R scripts to run various data models, including quantitative, statistical, and clustering models (see the sketch after this list).
  • Developed front-end applications using Java and Apache Camel for on-demand scheduling and triggering of the migration scripts.
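
As an illustration of the modeling scripts above, here is a minimal Python clustering sketch; the file name, feature selection, and cluster count are hypothetical assumptions, not details from the project.

```python
# Minimal clustering sketch (hypothetical data and parameters).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical input: a CSV extract of numeric features from the migrated data.
df = pd.read_csv("features.csv")
X = StandardScaler().fit_transform(df.select_dtypes("number"))

# Fit k-means with an assumed cluster count of 5.
model = KMeans(n_clusters=5, n_init=10, random_state=42)
df["cluster"] = model.fit_predict(X)

print(df.groupby("cluster").size())
```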

Technology: Hadoop Ecosystem, HDFS, MapReduce, Flume, Sqoop, Oozie, Hive, Spark, Spark SQL, Java, Python, Kafka, R

Confidential, San Francisco

Big Data Engineer/Data Scientist

Responsibilities:

  • Requirement gathering and analysis, business process documentation, system architecture design, use cases, test cases, user acceptance testing, user training
  • This system pulls information from multiple data sources and ingests it into the system's data lake.
  • It helps identify trends (price moving up or down, whether a stock price aligns with or deviates from sector trends, and stock volume patterns) to support profitable stock purchasing and selling.
  • Module lead for data ingestion; involved in the overall architecture design for the system.
  • Prepared system architecture and component design documentation and performed code reviews. Experience with big data tools: Hadoop, Spark, Hive, Pig, HBase, Sqoop, Kafka, etc.
  • Used Agile development processes and practices
  • Created the design architecture for data ingestion from multiple sources such as RDBMS and Amazon S3
  • Developed a Sqoop incremental import job, shell script, and cron job for importing data into HDFS (see the Sqoop sketch after this list)
  • Imported data from HDFS into Hive using Hive commands
  • Created Hive partitions on date and stock for the imported data
  • Developed a PySpark script that dynamically downloads the Amazon S3 data files into HDFS (see the PySpark sketch after this list).
  • Developed a notification system to push notifications into a Flume agent when data is added to MySQL; wrote MySQL triggers for the notification system.
  • Created and configured the Flume agent with the proper source, channel, and sinks to add the data into HDFS
  • Created PySpark RDDs for data transformation
  • Wrote SQL queries and triggers
  • Implemented incremental import for S3 CSV files
  • Worked with structured and unstructured data from RDBMS and CSV sources.
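
Since the incremental import was driven from a shell script and cron, here is a minimal Python sketch of a wrapper a cron job could invoke; the connection string, table, check column, and paths are hypothetical, and it assumes the sqoop CLI is installed and on the PATH.

```python
# Minimal sketch of a Sqoop incremental import wrapper (hypothetical values).
import subprocess

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://db-host:3306/stocks",  # hypothetical database
    "--username", "etl_user",
    "--password-file", "/user/etl/.sqoop_pw",         # avoids an inline password
    "--table", "stock_quotes",                        # hypothetical table
    "--incremental", "append",
    "--check-column", "quote_id",                     # monotonically increasing key
    "--last-value", "0",                              # normally tracked between runs
    "--target-dir", "/data/raw/stock_quotes",
    "-m", "4",
]

# Raise on failure so cron surfaces the error.
subprocess.run(cmd, check=True)
```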
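
The S3 download step can be illustrated with a minimal PySpark sketch; it assumes the cluster has the S3A connector (hadoop-aws) and credentials configured, and the bucket and paths are hypothetical.

```python
# Minimal PySpark sketch: copy S3 CSV files into HDFS (hypothetical paths).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-to-hdfs-ingest").getOrCreate()

# Read all CSV files under the (hypothetical) S3 prefix.
df = spark.read.option("header", "true").csv("s3a://example-stocks-bucket/daily/")

# Land the data in HDFS unchanged; downstream Hive tables point here.
df.write.mode("append").option("header", "true").csv("hdfs:///data/raw/s3_daily/")
```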

Data Masking Project:

  • Implemented a big data solution to mask PII (Personally Identifiable Information) in non-production instances with encrypted values
  • Data came from different sources (Unix/Mainframe); loaded the CSV files into a MySQL database
  • Imported data into HDFS and Hive using Sqoop; wrote Hive UDFs for masking email and phone fields (see the masking sketch after this list)
  • Inserted masked data into the Hive tables using the Hive UDFs, performed Sqoop exports from the masked Hive tables to MySQL tables, and developed Hive queries to process the data for visualization
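
The project's masking UDFs were written for Hive; as an illustration in Python, here is an equivalent PySpark UDF sketch. The table and column names and the digest-based scheme are hypothetical stand-ins for the actual encryption logic.

```python
# PySpark equivalent of the email/phone masking UDFs (hypothetical names/scheme).
import hashlib
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("pii-masking").enableHiveSupport().getOrCreate()

def mask_value(value):
    # Replace a PII value with a deterministic digest so joins still line up.
    if value is None:
        return None
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]

mask_udf = udf(mask_value, StringType())

# Hypothetical source table with email and phone columns.
customers = spark.table("staging.customers")
masked = (customers
          .withColumn("email", mask_udf("email"))
          .withColumn("phone", mask_udf("phone")))

# Write the masked rows to the Hive table used by non-production instances.
masked.write.mode("overwrite").saveAsTable("masked.customers")
```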

Responsibilities:

  • Built data pipelines to load and transform large sets of structured, semi-structured, and unstructured data.
  • Imported data from HDFS into Hive using HiveQL
  • Involved in creating Hive tables and loading and analyzing data using Hive queries
  • Created Hive partitioned and bucketed tables to improve performance (see the DDL sketch after this list).
  • Developed a Sqoop import job, shell script, and cron job for importing data into HDFS
  • Used Tableau for visualization and building dashboards
  • Explored Spark components such as SparkContext, Spark SQL, DataFrames, pair RDDs, and accumulators to improve the performance and optimization of existing algorithms
  • Processed millions of records using Hadoop jobs
  • Experience with object-oriented and functional scripting languages: Python, Java, C#, C++, Scala, etc.
  • Implemented Spark code in Python for RDD transformations and actions in Spark applications (see the RDD sketch after this list)
  • Worked with leadership to understand scope, derive estimates, schedule and allocate work, manage tasks and projects, and present status updates to IT and business leaders as required.
  • Defined and contributed to the development of standards, guidelines, design patterns, and common development frameworks and components.
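
A partitioned, bucketed Hive table of the kind described above can be sketched in Python via Spark SQL; the database, table, and column names are hypothetical.

```python
# Sketch of a partitioned, bucketed Hive table created via Spark SQL
# (database, table, and column names are hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hive-ddl").enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS stocks.daily_quotes (
        symbol STRING,
        open   DOUBLE,
        close  DOUBLE,
        volume BIGINT
    )
    PARTITIONED BY (trade_date STRING)    -- prunes scans to the dates queried
    CLUSTERED BY (symbol) INTO 32 BUCKETS -- speeds up joins and sampling on symbol
    STORED AS ORC
""")
```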
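
And here is a minimal sketch of RDD transformations and actions in PySpark; the input path and record layout are hypothetical.

```python
# Minimal sketch of RDD transformations and actions in PySpark
# (input path and record layout are hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-transforms").getOrCreate()
sc = spark.sparkContext

# Hypothetical CSV lines: symbol,price,volume
lines = sc.textFile("hdfs:///data/raw/s3_daily/")

# Transformations are lazy: parse, drop malformed rows, key by symbol.
records = (lines
           .map(lambda line: line.split(","))
           .filter(lambda f: len(f) == 3 and f[1].replace(".", "", 1).isdigit())
           .map(lambda f: (f[0], float(f[1]))))

# Per-symbol (sum, count) accumulators, then an action to compute averages.
sums = records.aggregateByKey((0.0, 0),
                              lambda acc, v: (acc[0] + v, acc[1] + 1),
                              lambda a, b: (a[0] + b[0], a[1] + b[1]))
averages = sums.mapValues(lambda s: s[0] / s[1]).collect()
print(averages[:10])
```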

Technology: Hadoop Ecosystem, HDFS, MapReduce, Flume, Sqoop, Hive, Spark, Spark SQL, Python

Confidential, Atlanta, GA

Application Developer/Data Analyst

Responsibilities:

  • Involved in creating tables, views, stored procedures, functions, packages, and database triggers for outbound and inbound interfaces using PL/SQL.
  • Created reports using Tableau and SAP Business Objects to display student educational data from Banner Student System
  • Cleaned, mapped, and transformed student data to meet the university's database schema
  • Developed ad-hoc SQL queries based on client requirements

Confidential, Atlanta, GA

Database Developer

Responsibilities:

  • Introduced an Agile development environment and practices (Scrum), including daily stand-ups and retrospectives, to respond more effectively to changing business needs and increase team productivity
  • Developed backend procedures, functions and packages using PL/SQL to support C# applications

Confidential, Monroeville, PA

Production Support Analyst

Responsibilities:

  • Maintained and modified existing UNIX shell scripts and programs
  • Enhanced and modified batch jobs in UNIX and used cron for scheduling jobs
  • Developed and maintained new SQL queries, functions, packages, store procedures and reports
  • Extensive experience coordinating with various stakeholders and project teams

Confidential

Financial Analyst

Responsibilities:

  • Involved in writing PL/SQL programs for data conversion and data integration
  • Created financial reports using Crystal Reports
  • Involved in creating PL/SQL packages, procedures, triggers, and cursors.

Confidential, Middletown, NY

Oracle Financial Analyst

Responsibilities:

  • Oracle E-Business Suite Applications Developer responsible for Financial Services reporting development
  • Proficient in Forms and Reports development utilizing the Oracle Applications General Ledger, Accounts Payable, Purchasing, Projects, Fixed Assets, Inventory, Order Management, Accounts Receivable, Cash Management, and iProcurement modules
  • Trained and managed developers upgrading Oracle E-Business Suite Financials R11 to R12
