Big Data and Analytics for Decision Making

Business Analytics Program

Program Overview
The business world has entered the era of Big Data. Organizations are using analytics to gain customers, generate new revenues, and save costs. Executives at large and medium-sized companies and emerging startups wonder how best to harness Big Data. Meanwhile, data scientists and analysts are learning to explore Big Data technologies in revolutionary ways and trying to understand how to better harness those technologies to benefit their organizations.

There is no doubt that organizations have become more competitive through the use of analytics. A recent study by the MIT Sloan Management Review indicates that 67% of companies use data analytics to gain a competitive advantage, compared with only 37% in 2010. First Tennessee Bank, for example, lowered its marketing costs by 20% and increased its return on investment by more than 600% by using data analytics. Likewise, the use of business analytics by smaller companies will likely lead to large productivity improvements.
 

The UTC College of Business is offering a new executive education series on Big Data and Analytics. Participants can choose from several modules: some are created for middle and top-level executives and focus on the managerial and decision-making aspects of Big Data technologies, while others are created for middle and top-level managers and focus on using Big Data technologies for processing and analysis.

The series will be led by Dr. Arben Asllani, Marvin E. White Professor of Business Analytics, and Dr. Ashish Gupta, Associate Professor of Analytics and Director for the Big Data & Analytics Research Center in the UTC College of Business.

Program Outline

Module I: An Overview of Big Data Analytics
This two-day course focuses on the managerial and decision-making aspects of Big Data technologies. The recommended audience for this module is middle and top-level executives who are interested in learning about the Big Data paradigm and how to improve their decision-making skills through analytics.

Day 1:

  • Introduction to business analytics
  • Descriptive, predictive, and prescriptive analytics
  • Factors that led to the era of Big Data
  • Volume, variety, and velocity of Big Data
  • Opportunities and challenges of Big Data: lessons learned from organizations
  • Data visualization principles and examples
  • Implementing a business analytics strategy


Day 2:

  • Relational versus Big Database approaches
  • Introduction to Hadoop: HDFS and MapReduce
  • HDFS for reliable storage of massive amounts of data; MapReduce for large-scale data processing
  • Introduction to Hadoop Ecosystem
  • Integrating Hadoop into the modern data center: a cost-benefit analysis
  • Security concerns of Big Data: a technical, legal, and ethical framework
  • Handling a data security breach: exercise

 

Module II: Data Analytics with MapReduce, Pig, Hive, and Impala
This three-day course focuses on using Big Data technologies for data processing and analysis. The recommended audience for this module is middle and top-level managers and business analysts who are interested in learning about tools such as MapReduce, Pig, Hive, and Impala. For a better understanding of the topics, knowledge of a programming language and of SQL commands is recommended. The topics discussed in the first module (An Overview of Big Data Analytics) serve as a good prerequisite: they allow beginners to follow the series of commands successfully and to gain a better understanding of Big Data analytics.

Day 1:

  • The anatomy of Hadoop and its ecosystem
  • A closer look at MapReduce: exercise
  • Introduction to Pig
  • Basic data analysis with Pig: exercise
  • Processing complex data with Pig: exercise


Day 2:

  • Multi-dataset operations with Pig: exercise
  • Pig troubleshooting and optimization
  • Introduction to Impala and Hive: exercise
  • Querying with Impala and Hive: exercise


Day 3:

  • Impala and Hive Data Management
  • Relational data analysis with Impala & Hive: exercise
  • Analyzing text & complex data with Hive: exercise
  • Hive optimizations
  • Choosing the best tool for the job

 

Module III: Big Data and Cognitive Computing for Unstructured Data
This three-day course focuses on using Big Data technologies to work with disparate data and to develop analytic models for smart decision-making. The course applies predictive and descriptive models to Big Data and also covers the development of cognitive applications. The recommended audience for this module is middle and top-level managers and business analysts who are interested in building predictive and descriptive applications on Big Data platforms such as Hadoop and Teradata Aster, and in learning about tools such as IBM Watson and IBM Bluemix. For a better understanding of the topics, knowledge of a programming language (preferably Python), SQL, and Hadoop is recommended. The topics discussed in the first and second modules serve as a good prerequisite for this module: they allow beginners to follow the series of commands successfully and to gain a better understanding of Big Data analytics.

Day 1: Assaying, Modifying & Recoding Big Data

  • Understanding disparate data sources
  • Advanced Data Science Applications: case examples
  • Building Data Science applications
  • Working with the Data: exercises
  • Introduction to Data Transformation: exercises
  • Understanding Time series Data: exercise

Day 2: Developing Smart Applications

  • Predictive Analytics: exercise
  • Association & Clustering: exercise
  • Intro to Recommendation systems: case examples
  • Intro to Apache Mahout
  • Using Mahout to develop Recommendation Systems: exercises

 

Module IV: Real Time Big Data Analytics with Apache Spark
This three-day course focuses on real-time Big Data technologies and streaming data. The module is based on learning and building applications with the revolutionary Apache Spark and its ecosystem, which is used to build the real-time analytics systems of the future. The topics covered in Modules II and III serve as a good prerequisite for this module. An understanding of all three earlier modules is necessary, and familiarity with a programming language such as Python or Scala will be helpful.

Day 1: Introduction to Spark

  • Introduction to streaming data and the Internet of Things (IoT)
  • Introduction to streaming tools
  • The Anatomy of Spark
  • Using Spark Shell: exercise
  • Introduction to Resilient Distributed Datasets: exercise

Day 2: Data Processing with Spark

  • Advanced work with RDD (pairs & partitioning): exercises
  • Running and integrating Spark and HDFS on a cluster: exercise

Day 3: Solving business problems with Spark

  • Solving business problems using Spark: case examples
  • Building and Running Spark Applications: exercises
  • Spark Streaming and Fine Tuning Spark: exercise
  • Using the Machine Learning Library with Spark