Developer/Programmer Analyst Resume
NY
SUMMARY
- Over 8 years of experience in data analysis, data modeling, and implementation of enterprise-class systems spanning Big Data, data integration, object-oriented programming, and advanced analytics.
- Analyzed Cassandra/SQL scripts and designed solutions for implementation in Scala.
- Expertise in Big Data technologies and Hadoop ecosystem tools such as Flume, Sqoop, HBase, ZooKeeper, Oozie, MapReduce, Hive, Pig, and YARN.
- Extracted and loaded data into MongoDB using the mongoimport and mongoexport command-line utilities.
- Developed collections in MongoDB and performed aggregations on them (see the sketch below).
- Hands-on experience in installation, configuration, management, and deployment of Big Data solutions and the underlying infrastructure of Hadoop clusters using Cloudera and Hortonworks distributions.
- Experience with Hadoop distributions including Cloudera (CDH3, CDH4, and CDH5), Hortonworks (HDP), and Amazon Elastic MapReduce (EMR).
- Extensive experience in Spark/Scala and MapReduce (MRv1 and MRv2/YARN); expert in creating Pig and Hive UDFs in Java to analyze data efficiently.
- Experience in writing MapReduce code in Java per business requirements.
- Developed MapReduce jobs to automate data transfer from HBase.
- Strong experience working with Elastic MapReduce (EMR) and setting up environments on AWS.
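For illustration, a minimal sketch of the kind of MongoDB collection aggregation described above, in Python with pymongo; the connection string, the sales database, and the orders collection are hypothetical placeholders, not details from any actual engagement:

```python
# Minimal sketch of a MongoDB aggregation with pymongo.
# The connection string, database name, and "orders" collection
# are hypothetical placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["sales"]

# Group orders by customer, sum order totals, and keep only
# customers with more than 10 orders, sorted by total spend.
pipeline = [
    {"$group": {
        "_id": "$customer_id",
        "total_spent": {"$sum": "$amount"},
        "order_count": {"$sum": 1},
    }},
    {"$match": {"order_count": {"$gt": 10}}},
    {"$sort": {"total_spent": -1}},
]

for doc in db.orders.aggregate(pipeline):
    print(doc["_id"], doc["total_spent"], doc["order_count"])
```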
TECHNICAL SKILLS
Big Data/Hadoop Framework: Spark, HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, ZooKeeper, Flume, HBase, Amazon AWS (EMR)
Databases: MySQL, Oracle, HBase, MongoDB, and Cassandra.
Languages: C, C++, Java, Python 2.7/3.x, R, SQL, PL/SQL, Pig Latin, HiveQL.
Web Technologies: JSP, XML, Servlets, JavaBeans, JDBC, AWT.
Operating Systems: Windows XP/7/8/10, UNIX, CentOS, Ubuntu, and macOS.
Front-End: HTML/HTML5, CSS3, JavaScript/jQuery.
Development Tools: Microsoft Visual Studio, DbVisualizer, Eclipse, TOAD, IntelliJ IDEA, MySQL Workbench, PyCharm, Sublime Text, PL/SQL Developer.
Version Control: Git, SVN.
Reporting Tools: Tableau, SAP BusinessObjects.
Office Tools: Microsoft Office Suite.
Development Methodologies: Agile/Scrum, Waterfall.
Environment: Hadoop, MapReduce, Sqoop, HDFS, HBase, Hive, Pig, Oozie, Spark, Kafka, Cassandra, AWS, Elasticsearch, Java, Oracle 10g, MySQL, Ubuntu, HDP.
PROFESSIONAL EXPERIENCE
Confidential, NY
Developer/Programmer Analyst
Responsibilities:
- Worked with Spark Streaming to ingest live data from Kafka.
- Developed data pipelines to support the work of data scientists.
- Developed consumer and producer applications using Apache Kafka.
- Developed ETL processes using Apache Spark.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Worked on developing internal testing tools written in Python.
- Consumed JSON messages from Kafka and processed them with Spark Streaming to capture updates (see the sketch after this list).
- Used AWS services such as EC2 and S3 for small-data-set processing and storage; maintained the cluster on AWS EMR.
- Used Spark SQL in Databricks to parse structured and unstructured data.
- Set up Docker and Elasticsearch to support the streaming data pipeline.
- Used the Snowflake cloud data warehouse to query data.
- Worked with external data sources such as LexisNexis via REST/GraphQL APIs and used Databricks Spark SQL for analysis.
- Worked with AWS RDS and DynamoDB for cloud-based data storage.
- Used the Python json module to write data to text files and parse Unicode characters.
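A minimal sketch of consuming JSON messages from Kafka with Spark Structured Streaming in PySpark; the topic name, message schema, and S3 paths are assumptions for illustration, and the spark-sql-kafka connector must be on the classpath:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# Launch with the Kafka connector, e.g.:
#   spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.5.0 this_script.py
spark = SparkSession.builder.appName("kafka-json-updates").getOrCreate()

# Hypothetical schema for the JSON update messages.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("status", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read the raw Kafka stream; broker address and topic are placeholders.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "localhost:9092")
       .option("subscribe", "updates")
       .load())

# Kafka values arrive as bytes: cast to string, then parse the JSON payload.
updates = (raw.selectExpr("CAST(value AS STRING) AS json")
           .select(from_json(col("json"), schema).alias("data"))
           .select("data.*"))

# Persist parsed updates; the S3 locations are illustrative.
query = (updates.writeStream
         .format("parquet")
         .option("path", "s3a://example-bucket/updates/")
         .option("checkpointLocation", "s3a://example-bucket/checkpoints/updates/")
         .outputMode("append")
         .start())
query.awaitTermination()
```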
Environment: AWS, Python, Hive, Spark SQL, Apache Kafka, SQL, PySpark, Cassandra, Ab Initio, Snowflake, Databricks, GitHub, Bash, Teradata, ETL, Spark Streaming, Docker, Kubernetes, Terraform
Confidential, San Francisco, CA
Sr. Hadoop Developer
Responsibilities:
- Configured Spark Streaming to ingest live data from Kafka and stored the stream data in HDFS.
- Developed MapReduce programs to run refined queries on big data.
- Created Hive tables and worked on them for data analysis to meet the requirements.
- Developed a framework to load and transform large sets of unstructured data from UNIX systems into Hive tables.
- Worked with the business team to create Hive queries for ad hoc access.
- Used the Spark API over Cloudera Hadoop YARN to perform analytics on data in Hive.
- Analyzed data using Hive queries, Pig scripts, Spark SQL, and Spark Streaming.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Worked on developing internal testing tools written in Python.
- Developed scripts that load data into Spark RDDs and perform in-memory computation to generate output (see the sketch after this list).
- Used Cloud SQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform.
- Consumed JSON messages from Kafka and processed them with Spark Streaming to capture UI updates.
- Wrote live, real-time processing and core jobs using Spark Streaming with Kafka as the data pipeline.
- Optimized HiveQL/Pig scripts using execution engines such as Tez and Spark.
- Tested Apache Tez, an extensible framework for building high-performance batch and interactive data processing applications, on Pig and Hive jobs.
- Worked extensively with AWS cloud services such as EC2, S3, EBS, RDS, and VPC.
- Migrated an existing on-premises application to AWS; used EC2 and S3 for small-data-set processing and storage and maintained the Hadoop cluster on AWS EMR.
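As an illustration of the RDD loading and in-memory computation mentioned above, a small PySpark sketch; the HDFS path and the comma-separated log format are assumptions:

```python
# Illustrative sketch of loading data into a Spark RDD and doing
# in-memory computation; the log path and record layout are assumed.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-compute").getOrCreate()
sc = spark.sparkContext

# Each line is assumed to be "timestamp,level,message".
lines = sc.textFile("hdfs:///data/app_logs/*.log")

parsed = lines.map(lambda line: line.split(",", 2))
errors = parsed.filter(lambda parts: len(parts) == 3 and parts[1] == "ERROR")

# Cache in memory since the RDD is reused by two actions below.
errors.cache()

print("total errors:", errors.count())

# Count errors per hour (first 13 chars of an ISO timestamp, e.g. "2017-03-01T14").
per_hour = (errors.map(lambda parts: (parts[0][:13], 1))
            .reduceByKey(lambda a, b: a + b)
            .sortByKey())
for hour, n in per_hour.take(24):
    print(hour, n)
```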
Environment: Hortonworks, Hadoop, Python, MapReduce, HDFS, Hive, Pig, Sqoop, Apache Kafka, Apache Storm, Oozie, SQL, Flume, Spark, HBase, Cassandra, Informatica, Java, GitHub
Confidential, Sunnyvale, CA
Hadoop Developer
Responsibilities:
- Worked on analyzing the Hadoop cluster and different big data analytic tools including Pig, Hive, HBase, and Sqoop.
- Performed analysis on implementing Spark using Scala and wrote sample Spark programs using PySpark.
- Good exposure to development with HTML, Bootstrap, and Scala.
- Developed Spark code using Scala and Spark SQL/Streaming for faster testing and processing of data.
- Worked on migrating MapReduce programs into Spark transformations using Spark and Scala.
- Worked in the AWS environment for development and deployment of custom Hadoop applications.
- Strong experience working with Elastic MapReduce (EMR) and setting up environments on Amazon EC2 instances.
- Created Oozie workflows to automate data ingestion using Sqoop and process incremental log data ingested by Flume using Pig.
- Implemented partitioning, dynamic partitions, and bucketing in Hive (see the sketch after this list).
- Migrated Hive queries and UDFs to Spark SQL.
- Used Sqoop extensively to import/export data between RDBMS and Hive tables, including incremental imports, and created Sqoop jobs keyed on the last saved value.
- Developed MapReduce Java programs with custom Writables to load web server logs into HBase.
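A brief sketch of Hive partitioning and bucketing driven from PySpark; the table names, columns, and staging_events source table are illustrative, and how strictly bucketed inserts are enforced depends on the Hive/Spark versions in use:

```python
# Sketch of Hive partitioning and bucketing from PySpark; table
# names and columns are illustrative, not from an actual schema.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("hive-partitioning")
         .enableHiveSupport()
         .getOrCreate())

# Partition by date and bucket by user_id into 32 buckets.
spark.sql("""
    CREATE TABLE IF NOT EXISTS events (
        user_id BIGINT,
        action  STRING
    )
    PARTITIONED BY (event_date STRING)
    CLUSTERED BY (user_id) INTO 32 BUCKETS
    STORED AS ORC
""")

# Enable dynamic partitioning so partitions are derived from the data;
# hive.enforce.bucketing matters on older Hive versions.
spark.sql("SET hive.exec.dynamic.partition = true")
spark.sql("SET hive.exec.dynamic.partition.mode = nonstrict")
spark.sql("SET hive.enforce.bucketing = true")

# staging_events is an assumed, pre-existing source table; the
# partition column must come last in the SELECT list.
spark.sql("""
    INSERT OVERWRITE TABLE events PARTITION (event_date)
    SELECT user_id, action, event_date FROM staging_events
""")
```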
Environment: Hadoop, MapReduce, Spark, Kafka, HDFS, Hive, Pig, Oozie, Core Java, Python, Eclipse, HBase, Flume, Cloudera, Oracle, UNIX Shell Scripting
Confidential, Boston, MA
Sr. Software Engineer
Responsibilities:
- Served as Build and Release Engineer for a team spanning multiple development teams and simultaneous software releases.
- Provided configuration management and build support for more than 8 applications, built and deployed to production and lower environments, and analyzed and resolved compilation and deployment errors related to code development, branching, merging, and building of source code.
- Created and maintained ClearCase repositories, projects, streams, and baselines.
- Used the Spring framework to enable controller-class access to Hibernate.
- Designed and developed several Flex UI screens and custom components using MXML and ActionScript.
- Installed Build Forge and Build Forge agents on all servers; created users and managed access control.
- Scripting experience in Python, PHP, Bash, PowerShell, and Rundeck for automation (see the sketch after this list).
- Expert in Azure services and capabilities (VMs, VNets, ExpressRoute, Azure AD, Load Balancers, Azure SQL, SCCM, SCOM, etc.).
- Worked on DB2 databases to keep database tables in sync across all environments.
- Deployed code on WebSphere application servers for production, QA, and development environments.
- Used HP Quality Center and JIRA to track all defects and changes related to the build and release team.
- Used JIRA for ticket tracking, change management, and Agile/Scrum processes.
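A small sketch of the kind of Python release-automation scripting referenced above; the environment map, host names, and artifact path are hypothetical:

```python
# Sketch of release automation in Python; hosts, paths, and the
# artifact name are hypothetical placeholders.
import subprocess
import sys

ENVIRONMENTS = {
    "qa": ["qa-app-01.example.com"],
    "prod": ["prod-app-01.example.com", "prod-app-02.example.com"],
}

def deploy(env: str, artifact: str) -> None:
    """Copy the build artifact to each host in the target environment."""
    for host in ENVIRONMENTS[env]:
        result = subprocess.run(
            ["scp", artifact, f"{host}:/opt/app/releases/"],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            sys.exit(f"deploy to {host} failed: {result.stderr.strip()}")
        print(f"deployed {artifact} to {host}")

if __name__ == "__main__":
    # Usage: python deploy.py qa myapp-1.2.3.war
    deploy(sys.argv[1], sys.argv[2])
```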
Environment: Java, Oracle SQL, PL/SQL, Python, JSP, SVN, J2EE, Servlets, UNIX, JIRA.
Confidential
Software Engineer
Responsibilities:
- Extensively involved in requirements analysis and system implementation.
- Actively involved in SDLC phases including analysis, design, and development.
- Responsible for developing modules and assisting in deployment per the client's requirements.
- Implemented the application using JSP, with servlets implementing the business logic.
- Developed utility and helper classes and server-side functionality using servlets.
- Created DAO classes and wrote various SQL queries to perform DML operations on the data per the requirements.
- Created custom exceptions and implemented exception handling using try, catch, and finally blocks.
- Developed the user interface using JSP, JavaScript, and CSS.
- Implemented user session tracking in JSP.
- Involved in designing the DB schema for the application.
- Implemented complex SQL queries and reusable triggers, functions, and stored procedures using PL/SQL.
- Worked on pair programming, code reviews, and debugging.
- Involved in tool development, testing, and bug fixing.
- Performed unit testing for various modules.
- Involved in UAT, production deployments, and support activities.
Environment: Java, J2EE, Oracle SQL, PL/SQL, Quality Center, JSP, Servlets.