Big Data Engineer/Data Scientist Resume

Atlanta, GA

SUMMARY:

  • Tech lead and management professional with 10+ years of overall experience, including extensive work in big data, machine learning, and deep learning.
  • Prepared system architecture and component design documentation and performed code reviews.
  • Experience with big data tools: Hadoop, Spark, Hive, Pig, HBase, Sqoop, Kafka, etc.
  • Implemented a big data solution to mask PII (Personally Identifiable Information) in non-production instances with encrypted values.
  • Proficient in SQL queries and triggers; developed a notification system that pushes a notification into a Flume agent when data is added to MySQL, backed by MySQL triggers.
  • Developed Python and R scripts to run various data models, including quantitative, statistical, and clustering models.

TECHNICAL SKILLS:

Operating Systems: Mac, Windows 10, Linux, Ubuntu

Programming Languages: Python, SQL, PL/SQL, HTML5, CSS3, JavaScript, Java, REST, Shell Scripts, R

Big Data: Hadoop, MapReduce, Hive, Pig, HBase, Spark, API Integration

Machine Learning: Feature engineering, KNN, SVM, Naïve Bayes, Decision Trees & Forests

NLP: LSA, topic modeling, dimensionality reduction, data mining

Deep Learning: Long Short-Term Memory Networks, Multilayer Perceptron Network

Methodologies: Object Oriented, SDLC, Agile, Waterfall

Software: Microsoft Office, Microsoft Visio, Microsoft Project

Databases: MongoDB, PostgreSQL, Oracle 8, 8i, 9i, 10g and 11g, SQL Server 2008/2012, MySQL, DB2

Reports: Tableau, Crystal Reports

Tools: AWS, Docker, Toad, Visual Studio .NET, SQL*Plus, SQL*Loader, Git, GitHub

ERP: Oracle Financials R11 & R12, SAP, Banner

ETL: Informatica, SQL Server Integration Services, SQL Server DTS, DataStage

EXPERIENCE:

Confidential, Atlanta, GA

Big Data Engineer/Data Scientist

Responsibilities:

  • Teradata to Hadoop Migration & Data Analytics project
  • Responsible for the Teradata to Hadoop migration, working with the Data Science team.
  • Automated the data migration from Teradata to Hadoop using MS Azure.
  • Developed Python and R scripts to run various data models, including quantitative, statistical, and clustering models (see the sketch after this list).
  • Developed front-end applications using Java and Apache Camel for on-demand scheduling and triggering of the migration scripts.
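
As an illustration of the modeling scripts above, here is a minimal Python clustering sketch; the file name, feature selection, and cluster count are hypothetical assumptions, not details from the project.

```python
# Minimal clustering sketch (hypothetical data and parameters).
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical input: a CSV extract of numeric features from the migrated data.
df = pd.read_csv("features.csv")
X = StandardScaler().fit_transform(df.select_dtypes("number"))

# Fit k-means with an assumed cluster count of 5.
model = KMeans(n_clusters=5, n_init=10, random_state=42)
df["cluster"] = model.fit_predict(X)

print(df.groupby("cluster").size())
```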

Technology: Hadoop Ecosystem, HDFS, MapReduce, Flume, Sqoop, Oozie, Hive, Spark, Spark SQL, Java, Python, Kafka, R

Confidential, San Francisco

Big Data Engineer/Data Scientist

Responsibilities:

  • Requirement gathering and analysis, business process documentation, system architecture design, use cases, test cases, user acceptance testing, user training
  • This system pulls information from multiple data sources and ingests it into the system's data lake.
  • It helps identify trends (price moving up or down, whether a stock price aligns with or deviates from sector trends, and stock volume patterns) to support profitable stock purchasing and selling.
  • Module lead for data ingestion; involved in the overall architecture design for the system.
  • Prepared system architecture and component design documentation and performed code reviews. Experience with big data tools: Hadoop, Spark, Hive, Pig, HBase, Sqoop, Kafka, etc.
  • Used Agile development processes and practices
  • Created the design architecture for data ingestion from multiple sources such as RDBMS and Amazon S3
  • Developed a Sqoop incremental import job, shell script, and cron job for importing data into HDFS (see the Sqoop sketch after this list)
  • Imported data from HDFS into Hive using Hive commands
  • Created Hive partitions on date and stock for the imported data
  • Developed a PySpark script that dynamically downloads the Amazon S3 data files into HDFS (see the PySpark sketch after this list).
  • Developed a notification system to push notifications into a Flume agent when data is added to MySQL; wrote MySQL triggers for the notification system.
  • Created and configured the Flume agent with the proper source, channel, and sinks to add the data into HDFS
  • Created PySpark RDDs for data transformation
  • Wrote SQL queries and triggers
  • Implemented incremental import for S3 CSV files
  • Worked with structured and unstructured data from RDBMS and CSV sources.
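
Since the incremental import was driven from a shell script and cron, here is a minimal Python sketch of a wrapper a cron job could invoke; the connection string, table, check column, and paths are hypothetical, and it assumes the sqoop CLI is installed and on the PATH.

```python
# Minimal sketch of a Sqoop incremental import wrapper (hypothetical values).
import subprocess

cmd = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://db-host:3306/stocks",  # hypothetical database
    "--username", "etl_user",
    "--password-file", "/user/etl/.sqoop_pw",         # avoids an inline password
    "--table", "stock_quotes",                        # hypothetical table
    "--incremental", "append",
    "--check-column", "quote_id",                     # monotonically increasing key
    "--last-value", "0",                              # normally tracked between runs
    "--target-dir", "/data/raw/stock_quotes",
    "-m", "4",
]

# Raise on failure so cron surfaces the error.
subprocess.run(cmd, check=True)
```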
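
The S3 download step can be illustrated with a minimal PySpark sketch; it assumes the cluster has the S3A connector (hadoop-aws) and credentials configured, and the bucket and paths are hypothetical.

```python
# Minimal PySpark sketch: copy S3 CSV files into HDFS (hypothetical paths).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-to-hdfs-ingest").getOrCreate()

# Read all CSV files under the (hypothetical) S3 prefix.
df = spark.read.option("header", "true").csv("s3a://example-stocks-bucket/daily/")

# Land the data in HDFS unchanged; downstream Hive tables point here.
df.write.mode("append").option("header", "true").csv("hdfs:///data/raw/s3_daily/")
```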

Data Masking Project:

  • Implemented a big data solution to mask PII (Personally Identifiable Information) in non-production instances with encrypted values
  • Data came from different sources (Unix/Mainframe); loaded the CSV files into a MySQL database
  • Imported data into HDFS and Hive using Sqoop; wrote Hive UDFs for masking email and phone fields (see the masking sketch after this list)
  • Inserted masked data into the Hive tables using the Hive UDFs, performed Sqoop exports from the masked Hive tables to MySQL tables, and developed Hive queries to process the data for visualization
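
The project's masking UDFs were written for Hive; as an illustration in Python, here is an equivalent PySpark UDF sketch. The table and column names and the digest-based scheme are hypothetical stand-ins for the actual encryption logic.

```python
# PySpark equivalent of the email/phone masking UDFs (hypothetical names/scheme).
import hashlib
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("pii-masking").enableHiveSupport().getOrCreate()

def mask_value(value):
    # Replace a PII value with a deterministic digest so joins still line up.
    if value is None:
        return None
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]

mask_udf = udf(mask_value, StringType())

# Hypothetical source table with email and phone columns.
customers = spark.table("staging.customers")
masked = (customers
          .withColumn("email", mask_udf("email"))
          .withColumn("phone", mask_udf("phone")))

# Write the masked rows to the Hive table used by non-production instances.
masked.write.mode("overwrite").saveAsTable("masked.customers")
```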

Responsibilities:

  • Built data pipelines to load and transform large sets of structured, semi-structured, and unstructured data.
  • Imported data from HDFS into Hive using HiveQL
  • Involved in creating Hive tables and loading and analyzing data using Hive queries
  • Created Hive partitioned and bucketed tables to improve performance (see the DDL sketch after this list).
  • Developed a Sqoop import job, shell script, and cron job for importing data into HDFS
  • Used Tableau for visualization and building dashboards
  • Explored Spark components such as SparkContext, Spark SQL, DataFrames, pair RDDs, and accumulators to improve the performance and optimization of existing algorithms
  • Processed millions of records using Hadoop jobs
  • Experience with object-oriented and functional scripting languages: Python, Java, C#, C++, Scala, etc.
  • Implemented Spark code in Python for RDD transformations and actions in Spark applications (see the RDD sketch after this list)
  • Worked with leadership to understand scope, derive estimates, schedule and allocate work, manage tasks and projects, and present status updates to IT and business leaders as required.
  • Defined and contributed to the development of standards, guidelines, design patterns, and common development frameworks and components.
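
A partitioned, bucketed Hive table of the kind described above can be sketched in Python via Spark SQL; the database, table, and column names are hypothetical.

```python
# Sketch of a partitioned, bucketed Hive table created via Spark SQL
# (database, table, and column names are hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("hive-ddl").enableHiveSupport().getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS stocks.daily_quotes (
        symbol STRING,
        open   DOUBLE,
        close  DOUBLE,
        volume BIGINT
    )
    PARTITIONED BY (trade_date STRING)    -- prunes scans to the dates queried
    CLUSTERED BY (symbol) INTO 32 BUCKETS -- speeds up joins and sampling on symbol
    STORED AS ORC
""")
```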
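
And here is a minimal sketch of RDD transformations and actions in PySpark; the input path and record layout are hypothetical.

```python
# Minimal sketch of RDD transformations and actions in PySpark
# (input path and record layout are hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-transforms").getOrCreate()
sc = spark.sparkContext

# Hypothetical CSV lines: symbol,price,volume
lines = sc.textFile("hdfs:///data/raw/s3_daily/")

# Transformations are lazy: parse, drop malformed rows, key by symbol.
records = (lines
           .map(lambda line: line.split(","))
           .filter(lambda f: len(f) == 3 and f[1].replace(".", "", 1).isdigit())
           .map(lambda f: (f[0], float(f[1]))))

# Per-symbol (sum, count) accumulators, then an action to compute averages.
sums = records.aggregateByKey((0.0, 0),
                              lambda acc, v: (acc[0] + v, acc[1] + 1),
                              lambda a, b: (a[0] + b[0], a[1] + b[1]))
averages = sums.mapValues(lambda s: s[0] / s[1]).collect()
print(averages[:10])
```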

Technology: Hadoop Ecosystem, HDFS, MapReduce, Flume, Sqoop, Hive, Spark, Spark SQL, Python

Confidential, Atlanta, GA

Application Developer/Data Analyst

Responsibilities:

  • Involved in creating tables, views, stored procedures, functions, packages, and database triggers for outbound and inbound interfaces using PL/SQL.
  • Created reports using Tableau and SAP Business Objects to display student educational data from Banner Student System
  • Cleaned, mapped, and transformed student data to meet the university's database schema
  • Developed ad-hoc SQL queries based on client requirements

Confidential, Atlanta, GA

Database Developer

Responsibilities:

  • Introduced an Agile development environment and practices (Scrum), including daily stand-ups and retrospectives, to respond more effectively to changing business needs and increase team productivity
  • Developed backend procedures, functions and packages using PL/SQL to support C# applications

Confidential, Monroeville, PA

Production Support Analyst

Responsibilities:

  • Maintained and modified existing UNIX shell scripts and programs
  • Enhanced and modified batch jobs in UNIX and used cron for scheduling jobs
  • Developed and maintained new SQL queries, functions, packages, store procedures and reports
  • Extensive experience coordinating with various stakeholders and project teams

Confidential

Financial Analyst

Responsibilities:

  • Involved in writing PL/SQL programs for data conversion and data integration
  • Created financial reports using Crystal Reports
  • Involved in creating PL/SQL packages, procedures, triggers, and cursors.

Confidential, Middletown, NY

Oracle Financial Analyst

Responsibilities:

  • Oracle E-Business Suite Applications Developer responsible for Financial Services reporting development
  • Proficient in Forms and Reports development utilizing the Oracle Applications General Ledger, Accounts Payable, Purchasing, Projects, Fixed Assets, Inventory, Order Management, Accounts Receivable, Cash Management, and iProcurement modules
  • Trained and managed developers upgrading Oracle E-Business Suite Financials R11 to R12
