Big Data Developer Resume
Dallas, TX
SUMMARY:
- Extensive IT experience in Big Data technologies, Data Management/Analytics, Data visualization.
- Worked in various domains including E-commerce, Automotive, and Manufacturing.
- Technical experience with Hortonworks 2.6.5 and Cloudera 4 distributions and Hadoop working environments including Hadoop 2.8.3, Hive 2.1.1, Sqoop 1.99.7, Flume 1.7.0, HBase 2.0.0, Nifi 2.x, Apache Spark 2.2.1, Scala 2.12.0, and Kafka 1.3.2.
- Technically skilled at developing new applications on Hadoop according to business needs and converting existing applications to the Hadoop environment.
- Exposure to analyzing data using HiveQL, HBase 1.3.0, and MapReduce programs in Java.
- Good understanding of workload management, schedulers, scalability and distributed platform architectures.
- Experience in Spark 2.2.1 programming with Scala and Python for high-volume data processing (see the aggregation sketch after this summary).
- Experience in collecting, processing, and aggregating large amounts of streaming data using Kafka 1.3.2 and Spark Streaming.
- In-Depth understanding of Spark Architecture including Spark Core, Spark SQL, Data Frames.
- Experience in importing and exporting data using Sqoop 1.99.7 from HDFS to RDBMS and vice-versa
- Experience in building ETL pipelines using NIFI 2.x.
- Involved in creating Hive tables with partitioning and bucketing, loading data, and writing Hive queries.
- Experience in working with RDBMS including Oracle and MySQL 5.x
- Experience in developing scalable solutions using NoSQL databases including Cassandra 3.10, HBase 1.3.0
- Experience in working with AWS services such as EC2, Kinesis, and S3.
- Familiar with software development tools like Git and JIRA.
- Exposure to various software development methodologies like Agile and Waterfall.
- A good team player who can work independently in a fast-paced, multitasking environment, and a self-motivated learner.
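The sketch below illustrates the kind of Spark DataFrame aggregation in Scala referenced above. It is a minimal example only: the input path, column names, and output location are hypothetical placeholders rather than details from any specific project.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object SalesAggregation {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("SalesAggregation").getOrCreate()

        // Hypothetical input: raw records landed on HDFS as CSV
        val sales = spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("hdfs:///data/raw/sales")

        // Aggregate with the DataFrame API: total and average amount per region and day
        val dailyRevenue = sales
          .groupBy(col("region"), col("sale_date"))
          .agg(sum("amount").as("total_amount"), avg("amount").as("avg_amount"))

        // Persist the curated result back to HDFS
        dailyRevenue.write.mode("overwrite").parquet("hdfs:///data/curated/daily_revenue")

        spark.stop()
      }
    }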
TECHNICAL SKILLS:
Cloud Technologies: Snowflake, AWS
Real Time Streaming: Apache Storm, Apache Kafka 1.3.2
Bigdata Technologies: Spark 2.1.0, Hive 2.1.1, HDFS, MapReduce, HBase 1.3.0, Nifi 2.x, Sqoop 1.99.7, Flume 1.7.0, Oozie 5.x
Database: Oracle 12c, SQL Server, MySQL, Db2
Hadoop Distributions: Cloudera 5.8.3, Hortonworks 2.5
Programming Languages: Scala 2.12.0, Python 3, Java 8, Shell scripting
Dashboard: Elastic Search, Kibana, Ambari
Operating System: Windows 10, CentOS 7.3, Mac OS 10.12.3
Data Warehousing: Teradata, Snowflake
IDEs: Eclipse 4.6, Visual Studio 2016, IntelliJ
PROFESSIONAL EXPERIENCE:
Confidential
Big Data Developer
Responsibilities:
- Developed Scala scripts and UDFs using both DataFrames and RDDs in Spark 2.1.0 for data aggregation (see the sketch at the end of this section).
- Used Spark SQL to create schema RDDs and load them into Hive tables.
- Developed Spark 2.1.0 code using Scala and Spark-SQL for faster processing of data.
- Improved organization of the data using techniques such as Hive partitioning and bucketing.
- Extracted data from MySQL databases to HDFS using Apache Nifi 2.x.
- Optimized Hive 2.0.x queries and joins to get better results for ad-hoc queries.
- Involved in creating Oozie 3.1.3 workflow and coordinator jobs to kick off Hive jobs on time for data availability.
- Involved in deploying the applications in AWS.
- Used Agile methodology for project management and Git for source code control.
Environment: Apache Spark 2.1.0, Nifi 2.x, HDFS 2.6.1, Hive 2.0.x, Hadoop distribution of Cloudera 5.9, Linux, Eclipse, MySQL 5.x
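A minimal sketch of the Spark 2.1 pattern described in this section: a Scala UDF applied through the DataFrame API, with the aggregated result loaded into a partitioned Hive table. The database, table, column, and path names are assumptions made for illustration.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object OrderEnrichment {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("OrderEnrichment")
          .enableHiveSupport()          // needed to write into Hive tables
          .getOrCreate()

        // Hypothetical UDF: normalize free-text status codes before aggregating
        val normalizeStatus = udf((s: String) => if (s == null) "UNKNOWN" else s.trim.toUpperCase)

        val orders = spark.read.parquet("hdfs:///data/staging/orders")

        val dailyTotals = orders
          .withColumn("status", normalizeStatus(col("status")))
          .groupBy(col("order_date"), col("status"))
          .agg(sum("amount").as("total_amount"))

        // Load the aggregate into a Hive table partitioned by order_date
        dailyTotals.write
          .mode("overwrite")
          .partitionBy("order_date")
          .saveAsTable("analytics.daily_order_totals")

        spark.stop()
      }
    }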
Confidential - Dallas, TX
Big data developer
Responsibilities:
- Developed Spark 2.0 applications using RDDs and DataFrames to perform data cleansing, data transformations, and data aggregations.
- Extracted, transformed, and loaded (ETL) data from multiple federated data sources in Spark 2.0.
- Experience in in-memory computations with Spark RDDs for faster responses.
- Experience in handling large datasets using data partitioning, shared variables in Spark 2.0, effective and efficient joins, and various data transformations.
- Experience in tuning Spark 2.0 applications: setting the right batch interval, the correct level of parallelism, and memory usage.
- Implemented Apache Nifi 1.7.x flow topologies to perform cleansing operations before moving data into HDFS
- Developed Spark Streaming applications to perform the necessary operations in real time and persist the results into HBase (see the sketch at the end of this section).
- Utilized Spark SQL with the DataFrames API to provide efficient structured data processing.
- Experience in Spark application submission over a variety of cluster managers.
- Well versed in configuring Kafka 2.1.0 topics and scheduling Oozie workflows
Environment: Hadoop, Spark 2.0, Scala, Kafka 2.1.0, Hive, CDH 4.7.1, HBase, Nifi 1.7.x, Oozie, Linux, ETL
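A minimal sketch of the streaming pattern described in this section, assuming the spark-streaming-kafka-0-10 integration and the standard HBase client API. The broker address, topic, table, column family, and batch interval are placeholders; creating the HBase connection per partition is a simplification for readability.

    import java.util.UUID

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

    object EventStreamToHBase {
      def main(args: Array[String]): Unit = {
        // Batch interval is a tuning knob; 10 seconds here is only a placeholder
        val conf = new SparkConf().setAppName("EventStreamToHBase")
        val ssc  = new StreamingContext(conf, Seconds(10))

        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "broker1:9092",          // assumed broker address
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "event-stream-consumer",
          "auto.offset.reset"  -> "latest"
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          LocationStrategies.PreferConsistent,
          ConsumerStrategies.Subscribe[String, String](Seq("events"), kafkaParams)
        )

        // Persist each micro-batch into HBase; table, column family, and qualifier are hypothetical
        stream.foreachRDD { rdd =>
          rdd.foreachPartition { records =>
            val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
            val table = connection.getTable(TableName.valueOf("events"))
            records.foreach { record =>
              val rowKey = Option(record.key()).getOrElse(UUID.randomUUID().toString)
              val put = new Put(Bytes.toBytes(rowKey))
              put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes(record.value()))
              table.put(put)
            }
            table.close()
            connection.close()
          }
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }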
Confidential - Dallas, TX
Hadoop Developer
Responsibilities:
- Developed data pipelines using Flume and Sqoop to extract data from weblogs and store it in HDFS.
- Used SQOOP 1.4.6 for importing and exporting data into HDFS and Hive.
- Involved in processing ingested raw data using MapReduce, Hive.
- Experience in moving processed data from Hadoop to relational databases or external file systems using Sqoop and the HDFS get/copyToLocal commands.
- Collected and aggregated large amounts of data using Apache Flume 1.6.0 and staged the data in HDFS for further analysis.
- Used Hue to run Hive queries and created day-wise partitions in Hive to improve performance.
- Developed, validated and maintained HiveQL queries
- Implemented partitioning and bucketing concepts in Hive and designed both managed and external tables for optimized performance (see the sketch at the end of this section).
- Involved in developing shell scripts to orchestrate execution of all other scripts (Hive and MapReduce) and move the data files within and outside of HDFS.
- Involved in using HCatalog to access Hive table metadata from MapReduce code.
- Supported MapReduce programs running on the cluster.
- Wrote Hive queries for data analysis to meet the business requirements.
Environment: Hadoop (HDFS/MapReduce), Hive, SQOOP, Hue, SQL, Linux
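A minimal sketch of the Hive table design described in this section, issued over a HiveServer2 JDBC connection from Scala. The endpoint, database, table, and column names are hypothetical; the point is the day-wise partitioned external table over raw files alongside a managed, bucketed table for ad-hoc queries.

    import java.sql.DriverManager

    object HiveTableSetup {
      def main(args: Array[String]): Unit = {
        // Assumed HiveServer2 endpoint; database, table, and column names are hypothetical
        Class.forName("org.apache.hive.jdbc.HiveDriver")
        val conn = DriverManager.getConnection("jdbc:hive2://hiveserver:10000/weblogs", "etl_user", "")
        val stmt = conn.createStatement()

        // External table over the raw weblog files, partitioned by day
        stmt.execute(
          """CREATE EXTERNAL TABLE IF NOT EXISTS weblogs_raw (
            |  user_id STRING, url STRING, response_code INT
            |)
            |PARTITIONED BY (log_date STRING)
            |ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
            |LOCATION '/data/raw/weblogs'""".stripMargin)

        // Register a day-wise partition once its files have landed in HDFS
        stmt.execute(
          "ALTER TABLE weblogs_raw ADD IF NOT EXISTS PARTITION (log_date='2017-01-15') " +
            "LOCATION '/data/raw/weblogs/2017-01-15'")

        // Managed, bucketed ORC table that ad-hoc Hive queries read from
        stmt.execute(
          """CREATE TABLE IF NOT EXISTS weblogs_curated (
            |  user_id STRING, url STRING, response_code INT
            |)
            |PARTITIONED BY (log_date STRING)
            |CLUSTERED BY (user_id) INTO 16 BUCKETS
            |STORED AS ORC""".stripMargin)

        stmt.close()
        conn.close()
      }
    }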
Confidential
Hadoop Developer
Responsibilities:
- Developed workflow in SSIS to automate the tasks of loading the data into HDFS and processing using Hive.
- Moved relational database data into HDFS and Hive dynamic-partition tables using Sqoop and staging tables (see the sketch at the end of this section).
- Stored data in Parquet file format in Hive.
- Performed analytics and drew insights from the data using Hive.
- Designed and created Hive external tables using a shared metastore instead of Derby, with partitioning, dynamic partitioning, and buckets.
- Implemented Sqoop scripts to load data into Hive.
- Worked on data ingestion from Oracle to Hive and was involved in different data migration activities.
- Involved in fixing various issues related to data quality, data availability and data stability.
- Worked on the Hue interface for loading data into HDFS and querying it.
Environment: Hadoop, SQOOP, Hive, Oozie, SSIS, Linux
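A minimal sketch of the staging-to-Hive load described in this section, again over HiveServer2 JDBC. The table and column names are hypothetical; the dynamic-partition properties and the INSERT OVERWRITE ... PARTITION statement are the standard Hive mechanism for this kind of load into a Parquet table.

    import java.sql.DriverManager

    object StagingToParquetLoad {
      def main(args: Array[String]): Unit = {
        // Assumed HiveServer2 endpoint; database, table, and column names are hypothetical
        Class.forName("org.apache.hive.jdbc.HiveDriver")
        val conn = DriverManager.getConnection("jdbc:hive2://hiveserver:10000/sales", "etl_user", "")
        val stmt = conn.createStatement()

        // Allow fully dynamic partitioning for the load
        stmt.execute("SET hive.exec.dynamic.partition=true")
        stmt.execute("SET hive.exec.dynamic.partition.mode=nonstrict")

        // Target table stored as Parquet and partitioned by load date
        stmt.execute(
          """CREATE TABLE IF NOT EXISTS orders (
            |  order_id BIGINT, customer_id BIGINT, amount DOUBLE
            |)
            |PARTITIONED BY (load_date STRING)
            |STORED AS PARQUET""".stripMargin)

        // Move data from the Sqoop-loaded staging table; Hive derives the
        // partition value from the last column of the SELECT list
        stmt.execute(
          """INSERT OVERWRITE TABLE orders PARTITION (load_date)
            |SELECT order_id, customer_id, amount, load_date
            |FROM orders_staging""".stripMargin)

        stmt.close()
        conn.close()
      }
    }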
Confidential
Data Analyst
Responsibilities:
- Exported data from the RDBMS to CSV files for each month and each service category.
- Wrote SQL queries for data analysis and filtering out the required data for further processing.
- Performed SQL queries to extract data from Oracle SQL database.
- Performed initial descriptive data analysis and generated statistical reports.
- Developed regression algorithms to classify wire-down incidents as energized or non-energized and automated the detection procedure.
- Established an executive dashboard to demonstrate project achievements and effectively communicate the results.
- Generated weekly reports to discuss with the fault rectifying teams.
Environment: Tableau, MySQL, Excel
Confidential
SQL Developer
Responsibilities:
- Managed connectivity using JDBC for querying, inserting, and data management, including triggers and stored procedures.
- Used JDBC for database connectivity.
- Wrote SQL queries, stored procedures and database triggers on the database objects.
- Analyzed the data and created dashboards using Tableau.
- Used SQL queries and JDBC prepared statements for retrieving data from the MySQL database (see the sketch at the end of this section).
- Actively participated in and provided constructive feedback during daily stand-up meetings and weekly iteration review meetings.
Environment: Java 1.6, J2EE, Tableau, Eclipse, MySQL
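A minimal sketch of the JDBC prepared-statement access described in this section. It is written in Scala for consistency with the other sketches in this document (the project itself used Java 1.6); the connection URL, credentials, and table and column names are placeholders.

    import java.sql.{Date, DriverManager}

    object CustomerOrderQuery {
      def main(args: Array[String]): Unit = {
        // Assumed MySQL endpoint and credentials; schema, table, and column names are placeholders
        val conn = DriverManager.getConnection("jdbc:mysql://dbhost:3306/orders", "app_user", "secret")

        // Prepared statement: bind variables avoid SQL injection and repeated parsing
        val ps = conn.prepareStatement(
          "SELECT order_id, amount FROM customer_orders WHERE customer_id = ? AND order_date >= ?")
        ps.setLong(1, 1001L)
        ps.setDate(2, Date.valueOf("2014-01-01"))

        val rs = ps.executeQuery()
        while (rs.next()) {
          println(s"${rs.getLong("order_id")} -> ${rs.getDouble("amount")}")
        }

        rs.close()
        ps.close()
        conn.close()
      }
    }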