Big Data Engineer Resume
Virginia Beach, VA
SUMMARY
- 3 years of experience in the software development industry, mainly focused on Big Data technologies and distributed computing.
- Experience developing MapReduce programs on Apache Hadoop for processing Big Data.
- Hands-on experience creating databases, schemas, and tables in PostgreSQL.
- Experience with SQL, PL/SQL and database concepts.
- Responsible for growth of new and existing accounts, utilizing knowledge of PaaS, IaaS, and SaaS.
- Experience creating SQL objects such as tables, stored procedures, views, indexes, triggers, functions, user-defined data types, rules, and defaults.
- Hands-on experience with the Hadoop ecosystem: HDFS for data storage and MapReduce for distributed processing.
- Good knowledge of Apache Spark 2.0 using Scala.
- Knowledge of real-time data processing using Apache Spark.
- Hands-on experience converting Hive/SQL queries into Spark transformations using Spark RDDs and Spark SQL (see the sketch after this list).
- Hands-on experience coding Apache Spark applications in Python for faster data processing.
- Worked on various individual projects relevant to data mining.
- Good understanding of concepts such as enterprise data warehousing, ETL, data modeling, and data mapping.
- Experience in designing and maintaining high-performing ETL/ELT processes.
- Experience working with Hive: creating tables, distributing data through partitioning and bucketing, and writing and optimizing Hive SQL queries per client requirements.
- Hands-on experience creating external and internal (managed) tables in Hive.
- Developed UNIX shell scripts for generating reports from Hive data.
- Work experience with cloud infrastructure such as Google Cloud and Microsoft Azure.
- Experience with NoSQL databases such as HBase and MongoDB.
- Ability to meet deadlines and handle multiple tasks; flexible with work schedules; excellent communication skills.
- Articulate in written and verbal communication, with strong interpersonal, analytical, and organizational skills.
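A minimal sketch of the Hive-to-Spark conversion described above; the sales table and its region and amount columns are hypothetical placeholders, not from a specific engagement:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.sum

    object HiveToSpark {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("HiveToSpark")
          .enableHiveSupport()   // read existing Hive tables through the metastore
          .getOrCreate()

        // The original HiveQL query, run unchanged through Spark SQL
        val bySql = spark.sql(
          "SELECT region, SUM(amount) AS total FROM sales GROUP BY region")

        // The same query expressed as DataFrame transformations
        val byApi = spark.table("sales")
          .groupBy("region")
          .agg(sum("amount").as("total"))

        byApi.show()
        spark.stop()
      }
    }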
TECHNICAL SKILLS
Programming: Python, R, Scala, Bash scripting, and Linux kernel development.
Hadoop ecosystem: Spark, MapReduce, Pig, Hive, Sqoop, and HBase.
Methodologies: Agile and Waterfall.
SQL databases: PostgreSQL and MySQL Server.
NoSQL databases: Cassandra, MongoDB, and HBase.
Data warehousing and data mining: Teradata, Hive, RapidMiner, and R.
Cloud services: Google Cloud and Microsoft Azure.
Apache Spark: Spark Core, DataFrames, Spark SQL, and Spark Streaming.
IDEs: Eclipse, IntelliJ, Eclipse Scala IDE, and PyCharm (Python).
PROFESSIONAL EXPERIENCE
Confidential, Virginia Beach
Big Data Engineer
Environment: Hadoop HDFS, MapReduce, Apache Spark 2.1.1, Apache Kafka 0.10.2, Spark Streaming, Apache Cassandra 3.10, Apache ZooKeeper 3.4.10, Microsoft Azure, and Scala 2.11.
Responsibilities:
- Worked on a data pipeline (data processing job) built on Spark Streaming and Kafka.
- Read input data, applied transformations, and wrote out the resulting output.
- Created a program with multiple pipelines.
- Worked with pipeline transforms, composite transforms, and root transforms.
- Processed both streaming and batch data using Apache Spark and Spark Streaming.
- Streamed data through the Kafka cluster; the collected data was fed to Spark Core via the Spark Streaming API.
- Data was divided into small chunks (micro-batches) and processed.
- Used Spark for cleaning, processing, extracting relevant fields and performing aggregations.
- Performed data transformations using Spark RDDs and DataFrames.
- Kafka was used as the messaging system to gather data from different sources, which was then processed and pushed into Kafka topics.
- Developed a Kafka consumer program in Scala (see the sketch after this list).
- Used Cassandra as the NoSQL database and gained solid working experience with NoSQL databases.
- Hadoop HDFS was used for archiving incoming data and performing ETL.
- Used Apache ZooKeeper for maintaining configuration information, naming, and providing distributed synchronization and group services.
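A minimal sketch of the Scala Kafka consumer with Spark Streaming described above, assuming Spark 2.1 with the spark-streaming-kafka-0-10 integration; the broker address, topic name, group id, and aggregation logic are placeholders:

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object KafkaStreamJob {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("KafkaStreamJob")
        val ssc = new StreamingContext(conf, Seconds(10))   // 10-second micro-batches

        // Kafka consumer configuration (broker and group id are placeholders)
        val kafkaParams = Map[String, Object](
          "bootstrap.servers" -> "broker1:9092",
          "key.deserializer" -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id" -> "pipeline-consumer",
          "auto.offset.reset" -> "latest",
          "enable.auto.commit" -> (false: java.lang.Boolean))

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

        // Clean records, extract a key field, and aggregate per micro-batch
        stream.map(_.value)
          .filter(_.nonEmpty)
          .map(line => (line.split(",")(0), 1L))
          .reduceByKey(_ + _)
          .print()

        ssc.start()
        ssc.awaitTermination()
      }
    }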
Confidential
Hadoop Developer
Environment: Hadoop, HDFS, Pig, Sqoop, HBase, Shell Scripting, Apache Crunch, Ubuntu, and Red Hat Linux.
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytics tools, including Pig, the HBase database, and Sqoop.
- Responsible for building scalable distributed data solutions using Hadoop.
- Implemented a nine-node CDH3 Hadoop cluster on Red Hat Linux.
- Implemented the Apache Crunch library on top of MapReduce and Spark for data aggregation.
- Involved in loading data from the Linux file system into HDFS (see the sketch after this list).
- Worked on cluster installation, commissioning and decommissioning of DataNodes, NameNode recovery, capacity planning, and slot configuration.
- Created HBase tables to store variable data formats of PII (personally identifiable information) data coming from different portfolios.
- Implemented a script to transmit sysprin information from Oracle to HBase using Sqoop.
- Implemented best income logic using Pig scripts and UDFs.
- Implemented test scripts to support test driven development and continuous integration.
- Worked on tuning the performance of Pig queries.
- Worked with application teams to install operating system, Hadoop updates, patches, version upgrades as required.
- Responsible for managing data coming from different sources.
- Involved in loading data from the file system into HDFS.
- Loaded and transformed large sets of structured, semi-structured, and unstructured data.
- Provided cluster coordination services through ZooKeeper.
- Experience in managing and reviewing Hadoop log files.
- Managed jobs using the Fair Scheduler.
- Exported the analyzed data to the relational databases using Sqoop for visualization and to generate reports for the BI team.
- Responsible for cluster maintenance, adding and removing cluster nodes, cluster monitoring and troubleshooting, manage and review data backups, manage and review Hadoop log files.
- Installed the Oozie workflow engine to run multiple Hive and Pig jobs.
- Analyzed large amounts of data sets to determine optimal way to aggregate and report on it.
- Supported in setting up QA environment and updating configurations for implementing scripts with Pig and Sqoop.
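A minimal sketch of loading files from the Linux file system into HDFS, as mentioned above, using the Hadoop FileSystem API from Scala; the source and destination paths are placeholders:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    object LoadToHdfs {
      def main(args: Array[String]): Unit = {
        // Picks up core-site.xml / hdfs-site.xml from the classpath
        val conf = new Configuration()
        val fs = FileSystem.get(conf)

        // Copy a local file into an HDFS landing directory (paths are placeholders)
        val localFile = new Path("file:///data/incoming/records.csv")
        val hdfsDir   = new Path("/user/hadoop/landing/")
        fs.copyFromLocalFile(localFile, hdfsDir)

        fs.close()
      }
    }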
Confidential
Java Developer
Environment: JAVA, Eclipse IDE, HTML, PL/SQL.
Responsibilities:
- Worked on designing and developing the web application user interface and implemented its related functionality in Java/J2EE for the product.
- Designed and developed applications using JSP, Servlets and HTML.
- Used the Hibernate ORM module as an object-relational mapping tool for back-end operations.
- Provided the Hibernate configuration and mapping files, and was involved in integrating Struts with the Hibernate libraries.
- Extensively used Java multi-threading for downloading files from a URL (see the sketch after this list).
- Extensively used Eclipse IDE for developing, debugging, integrating and deploying the application.
- Developed Web Service client interface for invoking the methods using SOAP.
- Created a navigation component that reads the next-page details from an XML config file.
- Developed applications with HTML, JSP and Tag libraries.
- Developed required stored procedures and database functions using PL/SQL.
- Developed, tested, and debugged various components on WebLogic Application Server.
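A minimal sketch of the multi-threaded file-download pattern mentioned above, written in Scala against the same java.net and java.util.concurrent JVM APIs; the URLs and pool size are placeholders:

    import java.io.FileOutputStream
    import java.net.URL
    import java.util.concurrent.Executors

    object ParallelDownloader {
      def main(args: Array[String]): Unit = {
        // Placeholder URLs; each download runs on its own pool thread
        val urls = Seq(
          "https://example.com/files/report1.pdf",
          "https://example.com/files/report2.pdf")

        val pool = Executors.newFixedThreadPool(4)

        urls.foreach { address =>
          pool.submit(new Runnable {
            override def run(): Unit = {
              val in  = new URL(address).openStream()
              val out = new FileOutputStream(address.split("/").last)
              try {
                // Copy the remote stream to a local file in 8 KB chunks
                val buf = new Array[Byte](8192)
                var n = in.read(buf)
                while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
              } finally { in.close(); out.close() }
            }
          })
        }

        pool.shutdown()
      }
    }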
Software Assurance and Risk Mitigation
Confidential, Montgomery
Responsibilities:
- Studied iPad technology in hospitals: the usage of iPads by doctors to enter all the required patient information.
- Security challenges: CIA (confidentiality, integrity, availability).
- Gained experience and a deeper understanding of software-induced security risks and how to manage them.
- Worked with FIPS 200, NIST SP 800-30 Rev. 1, and NIST SP 800-53 Rev. 4.
Confidential, Montgomery, Alabama
Quantitative Risk Assessment and Management
Responsibilities:
- Conducted a research survey as part of the Cyber Systems and Information Security degree program requirements.
- The survey comprises questions related to risk areas associated with the respondent's particular field of work.
- The purpose of the survey is to utilize experienced company personnel to help identify vulnerabilities in the area of banking.
- The survey will assist in extracting the specific threats behind each vulnerability and any countermeasures employed to minimize the risk/impact of vulnerability exploitation.
- Once a sufficient number of sample surveys are collected from each company, the results will be input into a security risk assessment program developed at the university.
- The program will analyze the vulnerability, threat, and countermeasure survey results, calculate a cost to mitigate the identified vulnerabilities, and produce an overall residual risk percentage.
- The program will then optimize the results down to a desired risk percentage.
- The optimized results include a cost to optimize and recommendations on areas to focus on to achieve the best return on investment.
- The results of the optimization will be made available to the company along with a detailed explanation of the findings.
- The results of the survey/optimization process may be used in published articles from the university, but the actual company name will never be identified.