Data Engineer Resume
OH
SUMMARY
- 5+ years of IT industry experience in Data Engineering, Data Science, and Full Stack Development.
- Experienced in working with Google Cloud Platform and Hadoop distributions such as Cloudera and Hortonworks.
- Hands-on experience with major components of the Hadoop ecosystem, including HDFS, Hive, Spark, Sqoop, Oozie, and YARN.
- Experienced in importing and exporting data between RDBMS and HDFS/Hive.
- Good working knowledge of creating and maintaining Hive tables, partitions, and bucketing; wrote Spark SQL for data analysis and processing to meet business requirements.
- Experienced in Python full-stack development with the Flask framework.
- Developed and monitored IBM DataStage jobs using various Processing and Debug stages.
- Experienced in creating data quality rules in Ataccama to validate ETL processes in IBM DataStage.
- Deployed and tested Python Flask applications in Azure DevOps.
- Developed Terraform code to create Google Cloud Storage bucket objects and BigQuery tables.
- Hands-on experience with machine learning algorithms such as logistic regression, random forest, linear regression, and K-means.
- Highly skilled in using visualization tools such as Tableau, R, and Spotfire.
- Highly motivated, with a strong sense of achievement and the willingness to learn and adapt to new technologies.
- Strong team player with the ability to quickly triage and troubleshoot complex problems.
TECHNICAL SKILLS
Big Data Stacks: Hadoop stack (Hive, Spark, YARN, Sqoop, Oozie), NoSQL (Cassandra, HBase)
Languages: C, C++, Java, Python, JavaScript, HTML, XML, PHP, R, MATLAB, SQL
Cloud Platforms: Azure DevOps; Google Cloud (Storage, Pub/Sub, BigQuery)
Operating Systems: Windows, UNIX/Linux
Databases: MySQL, Oracle 9i/10g, SQL Server 2009, MariaDB, IBM DB2
ETL Tools: IBM DataStage 11.5
Data Quality Tool: Ataccama 12.5
Data Design / Modeling Tools: Erwin, MS Visio
Data Science Tools: KNIME, Alteryx, Jupyter
Visualization Tools: R, Tableau, Spotfire
PROFESSIONAL EXPERIENCE
Data Engineer
Confidential, OH
Responsibilities:
- Gather business requirements, create documentation, and analyze data in IBM DB2.
- Develop and monitor IBM DataStage jobs using Processing stages such as Transformer, Aggregator, Filter, Lookup, Remove Duplicates, Merge, Copy, Join, and Sort, and Development/Debug stages such as Row Generator, Column Generator, Head, Tail, and Peek.
- Migrate commercial customer data from DB2 to Salesforce nCino using DataStage 11.5.
- Debug, test, and fix data transformation processes in various stages of parallel jobs.
- Create data quality rules in the Ataccama IDE.
- Write complex SQL queries and perform query performance tuning.
- Work on the production deployment and support process.
Environment: IBM DataStage 11.5, Python 3.7, Ataccama v12.5, AQT, IBM DB2, Alation, MS Visio
Full Stack Data Engineer
Confidential, GA
Responsibilities:
- Responsible for developing and automating the network engineering tool eROM.
- Responsible for requirement gathering, development, deployment, testing, and management of one complete project.
- Created front-end forms and graphs in JavaScript that help users enter input and review data.
- Implemented a Flask web service to connect the front end and back end and to perform mathematical calculations and transformations.
- Loaded data into MariaDB using Python scripts.
- Deployed and tested the application in the Azure DevOps Dev, Test, and Prod environments.
Environment: JavaScript, Python 3.7, Anaconda, Flask, MariaDB, Red Hat 7.
Data Engineer / Analyst
Confidential, OH
Responsibilities:
- Responsible for developing infrastructure for Google Cloud Platform and participated in cloud architecture meetings.
- Developed Python code to send bucket/table notifications via Pub/Sub.
- Loaded supply chain data into BigQuery daily on an incremental basis using Google Dataproc, GCS buckets, Hive, Python, and gsutil.
- Performed exploratory data analysis on large sets of data at rest in Hadoop to build a curated data layer for data science activities.
- Contributed to all stages of data science or decision modeling projects, including problem formulation, solution development and deployment.
- Worked with business teams to translate business-relevant scientific, engineering, and commercial problems into questions that can be addressed using data science.
- Well versed in one or more of the following software packages: scikit-learn, NumPy, pandas, Jupyter, Matplotlib, SciPy, and Keras.
- Experienced in solving problems using one or more of the following techniques: regression, decision trees, random forest, boosting, PCA, and K-means.
Environment: GCP, Cloudera Hadoop, BigQuery, Pub/Sub, Hue, Python, YAML, Spark
Data Engineer
Confidential, TX
Responsibilities:
- Responsible for developing schemas for drilling data in NoSQL Database.
- Developed scripts to ingest data from external systems to Hadoop.
- Responsible for developing data pipelines using StreamSets.
- Ingested data from the WellView server into HDFS storage.
- Batch loading of various drilling data types into a big data store.
- Used GIT repository for tracking changes and coordinating work in the team.
- Parsed, enriched, ingested, and tested the quality of drilling data using Python and Apache Spark.
- Developed data processing pipelines and machine learning algorithms (e.g., regression, random forest) to predict dysfunctions.
- Used various Python libraries (Matplotlib, Plotly, and Dash) to visualize data.
- Supported ad hoc query and extract requests from other teams.
- Visualized data in TIBCO Spotfire.
Java AEM Developer
Confidential
Responsibilities:
- Develop and Design website for Insurance domain using Adobe AEM.
- Good experience with Java, JSP, CRX, JCR, Felix, OSGi, and other technologies involved in deploying solutions based on the Adobe AEM framework.
- Used CRXDE and Brackets for component and template development, and Eclipse Mars for Java-based implementations.
- Coordinated with third-party vendors to implement a web chat component using AJAX and RESTful services.
- Performed unit testing on various Jira tickets and components.
- Worked on version migration issues from CQ5.5 to AEM6.1.
- Executed the process of object model design, implementation, and unit testing.
Environment: Azure, Hadoop, Hive, Impala, HDFS, Spyder, Jupyter, Apache Spark, Cassandra, Python, StreamSets, CentOS, TIBCO Spotfire, Microsoft VSTS, GIT, Adobe CQ5.x/AEM 6.x, JSP, JCR, CRXDE, DAM, OSGi, HTML, CSS, JavaScript, Eclipse, JVM 1.8, AngularJS, Maven, Apache Tomcat, JIRA.
Software Developer
Confidential, TX
Responsibilities:
- Implement machine learning algorithms in R and build natural language processing systems.
- Collect, track, and integrate multiple sources of big data.
- Maintain SQL scripts to create and populate tables in data warehouse for daily reporting.
- Use statistical modeling and machine learning techniques to build models.
- Construct supervised machine learning models (logistic regression, support vector machine, K-nearest neighbors, etc.) in R and Java.
- Work with business teams to create Hive queries for ad hoc analysis.
- Publish blog posts to promote the company's analytics platform.
- Evaluate the performance of various algorithms/models/strategies on real-world datasets.
- Use analytical tools and regression analysis to create predictive models.
- Use shiny dashboard, dygraphs, and plotly to develop professional-quality interfaces for data interaction.
Environment: R, Java, Oracle 10g, Tableau, SAS, Hive