
Data Engineer Resume


Wilmington, DE

SUMMARY

  • Experience in design and development of real-time streaming applications using Spark Streaming and Scala.
  • Over 7 years of hands-on experience in design, development, testing, implementation, maintenance, and enhancement of various IT projects, with experience in Java/J2EE and Big Data, implementing end-to-end Hadoop solutions.
  • Extensive experience in installing, configuring, and using ecosystem components like Hadoop MapReduce, HDFS, Sqoop, YARN, Pig, ZooKeeper, Hive, Impala, and Spark.
  • Good experience in writing Spark applications using Python, Scala, and SQL.
  • Implemented Spark scripts using Scala and Spark SQL to load Hive tables into Spark for faster processing of data (a minimal sketch follows this list).
  • Performed ETL processing with Spark using Scala for processing and validation of raw data logs.
  • Expertise in publish/subscribe event streaming systems such as Apache Kafka, MapR Event Streams, and Flink.
  • Worked on Oracle, DB2, and MySQL databases and NoSQL databases like HBase, MapR-DB, Cassandra, and MongoDB.
  • Worked on Amazon AWS services such as EMR, EC2 instances, Lambda, S3 buckets, Glue, and Athena, which provide fast and efficient processing.
  • Extensive experience with IDEs such as Eclipse and Atom.
  • Worked on visualization tools like Tableau, Power BI, and Grafana.
  • Experience working with Git and Bitbucket.
  • Extensively worked with Docker and build tools like Jenkins.
  • Used Agile (Scrum) methodologies for software development.
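
As an illustration of the Spark SQL work noted above, the following is a minimal sketch of reading a Hive table into Spark with Scala. The database, table, and column names are hypothetical placeholders, and it assumes a Spark build with Hive support enabled.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: read a Hive table through Spark SQL and aggregate it.
    // Table and column names are placeholders, not from any actual project.
    object HiveReadSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("HiveReadSketch")
          .enableHiveSupport()
          .getOrCreate()

        // Query the Hive-metastore-backed table directly with Spark SQL.
        val totals = spark.sql(
          "SELECT customer_id, SUM(amount) AS total FROM sales_db.transactions GROUP BY customer_id")

        totals.show(20)
        spark.stop()
      }
    }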

TECHNICAL SKILLS

Big Data Technologies: HDFS, Hive, MapReduce, Spark, Sqoop, Pig, Apache Flume, HBase, Apache Kafka, Oozie

Languages: Scala, Core Java, Unix Shell scripts, SQL

Databases: Oracle, DB2, SQL Server, MySQL, HBase, MapR-DB

IDEs: Eclipse, IntelliJ, Atom

Other Tools & Packages: Cloudera Manager, MapR Control System (MCS), Replicate, SVN, JUnit, Maven, ANT, GitHub, Bitbucket, PuTTY, StreamSets Data Collector, Power BI, Grafana, Tableau, FileZilla

SDLC Methodology: Agile

Operating Systems: Linux/UNIX, Windows

PROFESSIONAL EXPERIENCE

Confidential, Wilmington, DE

Data Engineer

Responsibilities:

  • Followed Agile Scrum methodology that included iterative application development, weekly sprints, and stand-up meetings.
  • Worked with analysts to determine and understand business requirements.
  • Worked on analyzing the Hadoop stack and different big data analytic tools, including Pig, Hive, the HBase database, and Sqoop.
  • Involved in creating Spark applications in Scala using cache and map/reduce functions to process data.
  • Wrote complex SQL queries using advanced SQL concepts like Aggregate functions.
  • Extracted data from multiple databases using Sqoop import queries and ingested it into Hive tables.
  • Developed the Nomination QA application using Scala, Spark, and DataFrames to read data from Hive tables on the YARN framework.
  • Configured Spark Streaming with Kafka to pull information and store it in HDFS (see the sketch after this list).
  • Developed Kafka consumers to consume data from Kafka topics.
  • Implemented Spark Core in Scala to process data in memory.
  • Created Oozie workflows for Hadoop-based jobs including Sqoop, Hive, and Pig.
  • Created Hive external tables, loaded the data into the tables, and queried the data using HQL.
  • Supported MapReduce programs running on the cluster and performed cluster monitoring, maintenance, and troubleshooting.
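
The streaming ingestion described above could look roughly like the sketch below, which uses Spark Structured Streaming with the spark-sql-kafka connector to land Kafka records in HDFS. The broker address, topic name, and HDFS paths are hypothetical, and the original work may equally have used the older DStream API.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: consume a Kafka topic and persist the raw records to HDFS.
    object KafkaToHdfsSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("KafkaToHdfsSketch").getOrCreate()

        val stream = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092") // hypothetical broker
          .option("subscribe", "events")                     // hypothetical topic
          .load()
          .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")

        val query = stream.writeStream
          .format("parquet")
          .option("path", "hdfs:///data/raw/events")         // hypothetical landing path
          .option("checkpointLocation", "hdfs:///checkpoints/events")
          .start()

        query.awaitTermination()
      }
    }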

Environment: Hadoop, Spark, Scala, Amazon EMR, S3, EC2, Sqoop, Kafka, Jira, Jenkins

Confidential, Austin, TX

Data Engineer

Responsibilities:

  • Followed Agile Scrum methodology that included iterative application development, weekly sprints, and stand-up meetings.
  • Extracted data from multiple databases using Sqoop import queries and ingested it into Hive tables.
  • Documented the data flow from the application through Kafka and Storm into HDFS and Hive tables.
  • Developed the Nomination QA application using Scala, Spark, and DataFrames to read data from Hive tables on the YARN framework.
  • Developed Spark scripts using the Scala shell as per the requirements.
  • Worked closely with the Kafka admin team and set up Kafka clusters in the QA and production environments.
  • Implemented new dimensions in the Spark application based on business requirements.
  • Responsible for storing processed data in MongoDB.
  • Wrote queries to fetch data from different tables using joins, sub-queries, correlated sub-queries, and derived tables on the SQL Server platform.
  • Created and enhanced Teradata stored procedures to generate automated testing SQL.
  • Designed and implemented the test environment on AWS.
  • Created various views in Tableau (tree maps, heat maps, scatter plots).
  • Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources like S3 (ORC/Parquet/text files) into AWS Redshift.
  • Performed data extraction, aggregation, and consolidation of Adobe data within AWS Glue using PySpark (see the sketch after this list).
  • Responsible for account management, IAM management, and cost management.
  • Created S3 buckets, managed S3 bucket policies, and utilized S3 and Glacier for storage and backup on AWS.
  • Created monitors, alarms, and notifications for EC2 hosts using CloudWatch, CloudTrail, and SNS.
  • Involved in setting up Kafka and ZooKeeper producer-consumer components for the big data environments.
  • Installed and configured Hive, Pig, Sqoop, and Oozie on the Hadoop cluster.
  • Experienced in data modeling in SQL and NoSQL databases.
  • Worked with JIRA and Jenkins within the project.
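
The Glue extraction step mentioned above ran as PySpark; the sketch below shows the equivalent read-aggregate-write logic in plain Spark Scala to stay consistent with the other examples here. The bucket paths and column names are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    // Minimal sketch: read Parquet campaign data from S3, consolidate it,
    // and write the aggregated result back partitioned by day.
    object CampaignAggSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("CampaignAggSketch").getOrCreate()

        val campaigns = spark.read.parquet("s3a://example-bucket/raw/campaigns/")

        val daily = campaigns
          .groupBy(col("campaign_id"), col("event_date"))
          .agg(count("*").as("events"), sum("clicks").as("total_clicks"))

        daily.write
          .mode("overwrite")
          .partitionBy("event_date")
          .parquet("s3a://example-bucket/curated/campaign_daily/")

        spark.stop()
      }
    }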

Environment: Hadoop, Spark, Scala, Amazon EMR, S3, EC2, Sqoop, Kafka, Jira, Jenkins

Confidential, McLean, VA

Data Engineer

Responsibilities:

  • Followed Agile Scrum methodology that included iterative application development, weekly sprints, and stand-up meetings.
  • As a developer, worked with the development and operations teams to implement the necessary tools and processes to support builds, deployments, testing, and infrastructure.
  • Worked on S3 bucket whitelisting in the Risk Dev account and fixed the security groups in the AWS QA environment.
  • Built data pipelines to automate access to AWS S3 buckets and retrieve the required information from them (see the sketch after this list). Worked on AWS Lambda.
  • Created external tables with partitions using Hive, AWS Athena, and Redshift.
  • Updated the security groups using Confidential IAM roles in the AWS Risk Development account.
  • Updated numerous Confluence pages.
  • Updated and worked on the metadata in Snowflake.
  • Worked on Confidential internal technologies such as Bogie (build tool) and Nebula.
  • Developed numerous CloudFormation templates to deploy EC2 instances based on requirements.
  • Updated the S3 buckets in prod Nebula and worked on deploying EMR.
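
The bucket-access step referenced above could be sketched as below, using the AWS SDK for Java v2 from Scala to list what sits under a prefix before pulling it. The bucket name and prefix are hypothetical, credentials and region are assumed to come from the default provider chain, and the actual pipelines may have used Lambda instead.

    import software.amazon.awssdk.services.s3.S3Client
    import software.amazon.awssdk.services.s3.model.ListObjectsV2Request
    import scala.jdk.CollectionConverters._

    // Minimal sketch: list objects under a prefix so a pipeline can decide what to fetch.
    object S3ListSketch {
      def main(args: Array[String]): Unit = {
        val s3 = S3Client.create()

        val request = ListObjectsV2Request.builder()
          .bucket("example-risk-dev-bucket")   // hypothetical bucket
          .prefix("incoming/")                 // hypothetical prefix
          .build()

        s3.listObjectsV2(request).contents().asScala.foreach { obj =>
          println(s"${obj.key()} (${obj.size()} bytes)")
        }

        s3.close()
      }
    }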

Environment: AWS EMR, S3, Lambda, RDS, Bogie, Docker, Jenkins, Git, Python, Atom.

Confidential, Columbus, IN

Data Engineer

Responsibilities:

  • Followed Agile Scrum methodology that included iterative application development, weekly sprints, and stand-up meetings.
  • Created external and managed Hive tables and worked on them using HiveQL (see the sketch after this list).
  • Validated the MapReduce, Pig, and Hive scripts by pulling the data from Hadoop and validating it against the data in the files and reports.
  • Configured real-time streaming pipeline from DB2 to HDFS using Apache Kafka.
  • Used JIRA for the issue tracking and bug reporting.
  • Implemented Spark code using Scala and Spark SQL for faster processing and testing of data.
  • Wrote stored procedures, triggers, and functions using SQL Navigator to perform operations on the Oracle database.
  • Installed the NameNode, Secondary NameNode, YARN components (ResourceManager, NodeManager, ApplicationMaster), and DataNodes.
  • Developed ETL processes using the Jitterbit Harmony cloud integration tool.
  • Implemented AWS solutions using EC2, S3, RDS, EBS, Elastic Load Balancer, and Auto Scaling groups.
  • Exported data to Teradata using Sqoop.
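
A minimal sketch of the external-table work mentioned above is shown below; it issues the HiveQL through Spark's Hive support so it matches the other Scala examples, though the same DDL runs directly in Hive. The database, schema, and HDFS location are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession

    // Minimal sketch: create, repair, and query a partitioned external Hive table.
    object ExternalTableSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ExternalTableSketch")
          .enableHiveSupport()
          .getOrCreate()

        spark.sql(
          """CREATE EXTERNAL TABLE IF NOT EXISTS staging.orders (
            |  order_id STRING,
            |  amount   DOUBLE,
            |  ts       TIMESTAMP)
            |PARTITIONED BY (load_date STRING)
            |STORED AS PARQUET
            |LOCATION 'hdfs:///data/staging/orders'""".stripMargin)

        // Pick up any partitions already written under the external location.
        spark.sql("MSCK REPAIR TABLE staging.orders")

        spark.sql("SELECT load_date, COUNT(*) AS row_count FROM staging.orders GROUP BY load_date").show()
        spark.stop()
      }
    }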

Environment: Hadoop, HDFS, Pig, Hive, Spark, Scala, MapReduce, AWS

Confidential, Denver, CO

Data Engineer

Responsibilities:

  • Used Hue and MapR Control System (MCS) to monitor and troubleshoot Spark jobs.
  • Worked on data validation using Hive, wrote Hive UDFs (see the sketch after this list), and imported and exported data into HDFS and Hive using Sqoop.
  • Experienced in developing scripts for transformations using Scala.
  • Used the Spark API over Hadoop YARN as the execution engine for data analytics using Hive.
  • Used Pig and Hive in the analysis of data.
  • Worked on UNIX shell scripts and automated the ETL processes using UNIX shell scripting.
  • Built dashboards and visualizations on top of MapR-DB and MapR Hive using Tableau and Oracle Data Visualizer Desktop.
  • Built real-time visualizations on top of OpenTSDB using Grafana.
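
The Hive UDF work noted above could look like the short Scala sketch below, which assumes the classic org.apache.hadoop.hive.ql.exec.UDF API from the hive-exec dependency; the cleanup rule itself (trim and uppercase) is purely illustrative.

    import org.apache.hadoop.hive.ql.exec.UDF
    import org.apache.hadoop.io.Text

    // Minimal sketch of a simple Hive UDF written in Scala.
    class CleanCode extends UDF {
      // Hive resolves evaluate() by reflection; keep it null-safe.
      def evaluate(input: Text): Text = {
        if (input == null) null
        else new Text(input.toString.trim.toUpperCase)
      }
    }

After packaging the class into a jar, it would be registered in Hive with ADD JAR and CREATE TEMPORARY FUNCTION before being used in queries.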

Environment: Hadoop, HDFS, Pig, Hive, Spark, Scala, MapReduce

Confidential, Alpharetta, GA

Big Data Developer

Responsibilities:

  • Configured StreamSets Data Collector with MapR Event Streams to stream real-time data from different sources (databases and files) into MapR topics.
  • Created scripts for importing data into HDFS/Hive using Sqoop from DB2.
  • Worked on data validation using HIVE and written Hive UDFs.
  • Built dashboards and visualizations on top of Hive using Power BI.
  • Worked on data lake processing of the manufacturing data with Spark and Scala, storing the results in Hive tables for further analysis with Tableau (see the sketch after this list).
  • Developed SQL queries to perform joins on the tables in MySQL.
  • Wrote Hive UDFs to extract data from staging tables.
  • Involved in creating Hive tables and loading them with data.
  • Developed workflow in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
  • Created stored procedures in MySQL to improve data handling and ETL Transactions.
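
The data-lake step described above could be sketched roughly as below: read raw manufacturing files, apply a light transformation, and persist the result as a Hive table for Tableau. The landing path, columns, and table name are hypothetical placeholders.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    // Minimal sketch: raw manufacturing files -> cleaned Hive table.
    object ManufacturingLoadSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ManufacturingLoadSketch")
          .enableHiveSupport()
          .getOrCreate()

        val raw = spark.read
          .option("header", "true")
          .csv("maprfs:///datalake/raw/manufacturing/")    // hypothetical landing path

        val cleaned = raw
          .withColumn("measured_at", to_timestamp(col("measured_at")))
          .filter(col("machine_id").isNotNull)

        cleaned.write
          .mode("overwrite")
          .saveAsTable("analytics.manufacturing_readings") // Hive table used by Tableau

        spark.stop()
      }
    }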

Environment: Hadoop, HDFS, Pig, Hive, Spark, MapReduce, Java

Confidential

Java Developer

Responsibilities:

  • Involved in requirement, design and development phases of the application.
  • Worked with DBA for the creation of new tables and new fields in the database.
  • Developed custom tags (JSTL) to support custom user interfaces.
  • Developed the application using Struts Framework that leverages classical Model View Controller (MVC) architecture.
  • Created new Action Forms to access the form data.
  • Used Multithreading in programming to improve overall performance.
  • Created a RESTful web service for updating customer data sent from external systems.
  • Converted data into JSON using JSP tags.
  • Developed the front-end user interface using HTML5, JavaScript, CSS3, JSON, and jQuery.

Environment: Java, Oracle DB, HTML, JavaScript, and CSS
