Big Data Engineer Resume
Charlotte, NC
SUMMARY
- 6+ years of experience in analysis, design, and development using Big Data technologies and Java.
- Experience with Hadoop, HDFS, Hive, Pig, MapReduce, and Spark.
- Configured ZooKeeper, Flume, Kafka, and Sqoop on existing Hadoop clusters.
- Hands-on experience with Hadoop applications, including administration, configuration management, monitoring, debugging, and performance tuning.
- Experience with various databases and sources, including Oracle, Netezza, MySQL, SQL Server, DB2, Postgres, and mainframes.
- Participated in requirements analysis, reviews, and working sessions to understand requirements and system design.
- Knowledge of consolidating data into a single repository using data lakes.
- Experience developing front ends using JSF, JavaScript, HTML, XHTML, and CSS.
- Experience working with web/application servers: IBM WebSphere, Oracle WebLogic, and Apache Tomcat.
- Experience designing highly transactional websites using J2EE technologies and handling design and implementation in Eclipse.
TECHNICAL SKILLS
Languages: Java, Python, R, Scala
Platforms: LINUX, Windows
Big Data: Hadoop, HDFS, MapReduce, Pig, ZooKeeper, Hive, Sqoop, Flume, Kafka, Spark, Impala
J2SE / J2EE Technologies: Java, J2EE, JDBC, JSF, JSP, Web Services, Maven
Web Technologies: HTML, XHTML, CSS, JavaScript, JSF, AJAX, QlikView, XML, Shell Script
Cloud Technologies: AWS, EC2, S3, Redshift, Data Pipeline, EMR
Web/Application Servers: WebSphere, WebLogic Application Server, Apache Tomcat
IDE / Tools: Eclipse, IntelliJ, RStudio
Methodologies: Agile, Scrum, Kanban
PROFESSIONAL EXPERIENCE
Confidential, Charlotte, NC
Big Data Engineer
Responsibilities:
- Used Sqoop to pull data from RDBMS sources such as Teradata, Netezza, and Oracle and store it in Hadoop (see the ingest sketch after this list).
- Created external Hive tables to store and query the loaded data.
- Loaded data monthly, weekly, or daily depending on the portfolio.
- Data included retail, auto, cards, home loans, and reference data.
- Joined retail data spread across mainframes and RDBMS sources and stored it in a single location.
- Scrubbed historical data in Hive tables and in files located in HDFS.
- Applied optimization techniques such as partitioning and bucketing.
- Created an internal shell-script tool to compare RDBMS and Hadoop data and verify that source and target match (see the reconciliation sketch below).
- Worked with copybook files, converting them from ASCII and binary formats, storing them in HDFS, and creating Hive tables so mainframes could be decommissioned and Hadoop made the primary source; did the same for exports back to the mainframes.
- Wrote Pig scripts to transform the data into structured formats.
- Worked with Text, Avro, and Parquet file formats, with Snappy as the default compression.
- Created Oozie workflows to automate the process in a structured manner.
- Stored data in three layers: raw, intermediate, and publish.
- Used Impala to query data in the publish layer so other teams and business users could access it with faster processing.
- Worked with Autosys and created JIL definitions with job dependencies so that jobs run in parallel and are fully automated.
- Used the Eclipse IDE to review new files, existing files, and required modifications.
- Used an SVN repository to check in and check out code.
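A minimal shell sketch of the ingest pattern described above, assuming a Sqoop pull from Oracle followed by a partitioned external Hive table; the connection string, table names, and HDFS paths are hypothetical placeholders.

    #!/bin/bash
    # Hypothetical example: pull a portfolio table from Oracle into HDFS with Sqoop,
    # then expose it through a partitioned external Hive table.
    # Connection details, table names, and paths are placeholders.

    sqoop import \
      --connect jdbc:oracle:thin:@//oracle-host:1521/ORCL \
      --username etl_user --password-file /user/etl/.ora_pwd \
      --table RETAIL.ACCOUNTS \
      --target-dir /data/raw/retail/accounts/load_dt=2016-01-31 \
      --as-avrodatafile \
      --num-mappers 8

    hive -e "
    CREATE EXTERNAL TABLE IF NOT EXISTS raw_db.accounts (
      account_id BIGINT,
      customer_id BIGINT,
      balance DECIMAL(18,2)
    )
    PARTITIONED BY (load_dt STRING)
    STORED AS AVRO
    LOCATION '/data/raw/retail/accounts';

    ALTER TABLE raw_db.accounts ADD IF NOT EXISTS
      PARTITION (load_dt='2016-01-31')
      LOCATION '/data/raw/retail/accounts/load_dt=2016-01-31';
    "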
Environment: Hadoop, HDFS, Cloudera, Hive, Impala, Shell Script, Eclipse, SVN, Linux, Oozie, Autosys, Teradata, Netezza, Oracle.
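A simplified sketch of the kind of source-to-target reconciliation script mentioned above, assuming row counts are compared between Oracle and Hive; credentials, host, and table names are hypothetical.

    #!/bin/bash
    # Hypothetical reconciliation check: compare a row count in the source RDBMS
    # against the corresponding Hive table and flag any mismatch.
    TABLE="RETAIL.ACCOUNTS"
    HIVE_TABLE="raw_db.accounts"
    LOAD_DT="2016-01-31"

    # Row count from the source database (sqlplus used here as an illustration).
    SRC_COUNT=$(sqlplus -s etl_user/"$ORA_PWD"@ORCL <<EOF
    SET HEADING OFF FEEDBACK OFF PAGESIZE 0
    SELECT COUNT(*) FROM ${TABLE};
    EOF
    )

    # Row count from the Hive table for the same load date.
    TGT_COUNT=$(hive -S -e "SELECT COUNT(*) FROM ${HIVE_TABLE} WHERE load_dt='${LOAD_DT}';")

    if [ "$(echo "$SRC_COUNT" | tr -d '[:space:]')" = "$(echo "$TGT_COUNT" | tr -d '[:space:]')" ]; then
      echo "MATCH: ${TABLE} -> ${HIVE_TABLE} (${TGT_COUNT} rows)"
    else
      echo "MISMATCH: source=${SRC_COUNT} target=${TGT_COUNT}" >&2
      exit 1
    fi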
Confidential, Charlotte, NC
Hadoop Engineer
Responsibilities:
- Managed several Hadoop clusters and other Hadoop ecosystem services in development and production environments.
- Worked closely with engineering teams and participated in infrastructure and framework development.
- Worked on POCs in an R&D environment with Hive2, Spark SQL, and Kafka before offering these services to application teams.
- Used Spark SQL to create structured data with DataFrames, querying other data sources through JDBC and Hive.
- Automated deployment and management of Hadoop services, including implementing monitoring.
- Worked closely with the Alpide team, ensuring all issues were addressed and resolved quickly.
- Contributed to the evolving architecture of our services to meet changing requirements for scaling, reliability, performance, manageability, and cost.
- Performed capacity planning of Hadoop clusters based on application requirements.
- Conducted peer reviews with application teams for their releases and ensured they maintained standards.
- Created Sentry policy files to give business users access to the required databases and tables through Impala in the dev, UAT, and prod environments (see the policy sketch after this list).
- Migrated existing data from RDBMS (Netezza, Oracle, and Teradata) to Hadoop using Sqoop, and ingested server logs into HDFS using Flume.
- Created managed and external tables in Hive and implemented partitioning and bucketing for space and performance efficiency.
- Used Impala for select queries so business users could retrieve tables faster.
- Developed an Oozie shell wrapper implementing a re-run process for common workflows and sub-workflows (see the wrapper sketch below).
- Used the Autosys scheduler to automate jobs.
- Used various file formats (Avro, Parquet, JSON, Text) with Snappy compression.
- Used a CVS repository to check in and check out code.
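An illustrative file-based Sentry policy along the lines of the one described above; the group, role, database, and path names are hypothetical placeholders.

    #!/bin/bash
    # Hypothetical file-based Sentry policy granting business users read-only
    # access to a publish database through Impala/Hive; names are placeholders.
    cat > sentry-provider.ini <<'EOF'
    [groups]
    # OS/LDAP group -> Sentry role
    biz_users = publish_read_role

    [roles]
    # Read-only access to every table in the publish database
    publish_read_role = server=server1->db=publish_db->table=*->action=select
    EOF

    # Push the policy file to the location the Sentry file provider reads from.
    hdfs dfs -put -f sentry-provider.ini /user/hive/sentry/sentry-provider.ini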
Environment: Hadoop, HDFS, Hive, Sqoop, Impala, Flume, Spark SQL, Kafka, Python, Oozie, Autosys, Linux, Oracle, Netezza, CVS, Cloudera.
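A condensed sketch of an Oozie re-run wrapper of the kind described above, assuming failed-node re-runs; the Oozie URL and properties file path are placeholders.

    #!/bin/bash
    # Hypothetical wrapper: submit an Oozie workflow, or re-run only its failed
    # nodes when an existing workflow ID is supplied. URL and paths are placeholders.
    OOZIE_URL="http://oozie-host:11000/oozie"
    PROPS="/apps/etl/job.properties"
    WF_ID="$1"   # optional: existing workflow ID to re-run

    if [ -z "$WF_ID" ]; then
      # Fresh run
      oozie job -oozie "$OOZIE_URL" -config "$PROPS" -run
    else
      # Re-run, skipping nodes that already succeeded
      oozie job -oozie "$OOZIE_URL" -config "$PROPS" \
        -Doozie.wf.rerun.failnodes=true \
        -rerun "$WF_ID"
    fi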
Confidential, Jersey City, NJ
Big Data Developer
Responsibilities:
- Worked closely with business sponsors on architectural solutions to meet their business needs.
- Conducted information-sharing and teaching sessions to raise awareness of industry trends and upcoming initiatives, ensuring alignment between business strategies and goals and solution architecture designs.
- Performance-tuned the application at various layers (MapReduce, Hive).
- Used QlikView to create a visual interface for real-time data processing.
- Implemented partitioning, dynamic partitioning, and bucketing in Hive (see the sketch after this list).
- Imported and exported data between HDFS and various databases: Netezza, Oracle, MySQL, and DB2.
- Automated the process of pulling data from source systems into Hadoop and exporting it as JSON files to a specified location.
- Migrated Hive queries to Impala (see the Impala sketch below).
- Worked with various file formats (Avro, Parquet, Text) using Snappy compression.
- Created analysis batch-job prototypes using Hadoop, Pig, Oozie, Hue, and Hive.
- Used a Git repository to check in and check out code.
- Documented operational problems following standards and procedures using JIRA.
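A brief sketch of the dynamic-partitioning and bucketing pattern referenced above, run through the Hive CLI; the database, table, and column names are hypothetical.

    # Hypothetical example: load a staging table into a partitioned, bucketed
    # target using Hive dynamic partitioning. Names are placeholders.
    hive -e "
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;
    SET hive.enforce.bucketing=true;

    CREATE TABLE IF NOT EXISTS analytics.txn_by_day (
      txn_id BIGINT,
      account_id BIGINT,
      amount DECIMAL(18,2)
    )
    PARTITIONED BY (txn_date STRING)
    CLUSTERED BY (account_id) INTO 32 BUCKETS
    STORED AS PARQUET;

    -- The partition column comes last in the SELECT list and is resolved at run time.
    INSERT OVERWRITE TABLE analytics.txn_by_day PARTITION (txn_date)
    SELECT txn_id, account_id, amount, txn_date
    FROM staging.transactions;
    "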
Environment: Hadoop, HDFS, MapReduce, Hive, Impala, Pig, Sqoop, Java, Linux shell scripting, Oracle, Netezza, MySQL, DB2, QlikView, Git.
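A small sketch of running a migrated query through impala-shell, including a metadata refresh after a Hive-side load; the host and table names are placeholders.

    # Hypothetical example: after a table is loaded or altered in Hive, refresh
    # Impala's metadata and run the migrated query through impala-shell.
    IMPALAD="impala-host:21000"

    impala-shell -i "$IMPALAD" -q "INVALIDATE METADATA analytics.txn_by_day;"

    impala-shell -i "$IMPALAD" -B --output_delimiter=',' -q "
    SELECT txn_date, COUNT(*) AS txn_count, SUM(amount) AS total_amount
    FROM analytics.txn_by_day
    GROUP BY txn_date
    ORDER BY txn_date;
    " -o daily_totals.csv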
Confidential
Java Developer
Responsibilities:
- Used the class-responsibility-collaborator (CRC) model to identify and organize classes in the Hospital Management System.
- Used sequence diagrams to show the object interactions involved in the use cases of a system user.
- Involved in Database Design by creating Data Flow Diagram (Process Model) and ER Diagram (Data Model).
- Designed HTML screens with JSP for the front-end.
- Made JDBC calls from the servlets to the database.
- Involved in designing stored procedures to extract and calculate billing information, connecting to Oracle.
- Formatted results from the database as HTML reports for the client.
- Used JavaScript for client-side validation.
- Used servlets as controllers and entity/session beans for business logic.
- Used WebLogic to deploy the application in local and development environments.
- Used Eclipse for building the application.
- Participated in user review meetings and used Test Director to periodically log development issues, production problems, and bugs.
- Implemented and supported the project through development and unit testing into the production environment.
- Used CVS Version manager for source control and CVS Tracker for change control management.
Environment: Java, JSP, JDBC, JavaScript, HTML, WebLogic, Eclipse, and CVS.