Luddy Online Programs

Online programs

Advance your career from anywhere in the world. With flexible online courses taught by the same expert faculty who teach on campus, you’ll earn a world-renowned IU degree—on your own schedule.

Artificial Intelligence Graduate Certificate

Level up your technical expertise with the online Certificate in Artificial Intelligence and become the expert companies need.

Data Science Graduate Certificate

Quickly and conveniently acquire new skills in topics such as data analysis, cloud computing, health and medicine, statistics, and data mining.

Data Science Master's

The online M.S. in Data Science from the Luddy School offers working professionals the flexibility to advance their careers while gaining specialized knowledge in data science.

Available courses

The online courses offered for our Online Master’s and Online Certificate programs are listed below. Please check the official Schedule of Classes for section numbers and the instructor of record each Fall, Spring, and Summer term.

If you need additional information about a course, you are welcome to reach out to the instructor directly.

Note: Unless otherwise specified, all courses listed are worth three (3) credit hours.

Computer Science

Term offered: Fall

Algorithms are at the heart of any computer-related task. In this course, we will teach how to approach the meta-task of algorithm building, as well as look at individual algorithms. We will use mathematical tools for designing and analyzing our algorithms, and get some simple hands-on coding experience. If you’re a non-CS major, or someone who cares more about applications than theory, then chances are this course is the one you want.

At the completion of this course, you will be able to:

Know, use, and, if necessary, modify a range of algorithms and data structures for well-known problems.
Given a problem, be able to develop an algorithmic solution to it.
Be able to analyze the above solution for correctness and efficiency.
Given different algorithms, be able to analyze and compare them.
Have basic knowledge of complexity, upper and lower bounds.
Have basic experience in coding simple algorithms.

Term offered: Fall

Prerequisite(s): Experience with programming, data structures, and algorithms will be assumed. Assignments will involve a substantial amount of programming in Python. In addition, we will encounter math of various kinds, including linear algebra, probability theory, and basic calculus.

This course covers the fundamentals of Artificial Intelligence, and is aimed at M.S., early Ph.D., and advanced undergraduate students in Computer Science and Data Science, as well as students in other related fields who have a strong computing background. Topics will include (tentatively):

AI overview: Goals, history, progress, challenges.
Problem solving and search: Uninformed search, heuristics, A*, local search.
Applied search: Game playing, constraint satisfaction, planning.
Reasoning under uncertainty: Uncertainty representation, probabilistic models, probabilistic inference, Bayesian and Markov networks.
Machine learning: Decision trees, neural networks, support vector machines.
Applications: Computer vision, natural language processing, robotics.

Term(s) offered: Spring

Prerequisite(s): It is assumed that you can program in various styles (imperative, functional, and object-oriented), have knowledge of algebra and elements of discrete math, as well as data structures and algorithms.

Introduction to database concepts and systems. Topics include:

Database models and systems: especially relational, object-oriented, semi-structured, and graph data models
Query languages and aspects of database programming
Database design and modeling
Components of query processing
Data structures and algorithms for efficient query processing
Introduction to transaction management: concurrency and recovery

Term(s) offered: Spring

Prerequisite(s): CSCI B551 or equivalent is required. You will need to be proficient in a general-purpose programming language (Python or C/C++); you should be able to implement basic matrix operations using basic data structures of the language (e.g. matrix multiplication using arrays). Exposure to linear algebra, basic calculus, machine learning, graph theory, probability theory, geometry, and statistics will be extremely helpful.

This is an introductory course in computer vision. We will give a broad overview of the field, with a slight bias towards some topics to reflect current research trends (e.g. object recognition, deep learning). The emphasis will be on algorithms, mathematical models, and techniques that are broadly applicable to many problems not only in vision but also in other areas of AI and CS. Topics will include (tentatively and not necessarily in this order):

Basic image manipulation: digital image representation, image filtering, morphology
Feature detection: edge detection, corner detection, invariant interest point detection
Segmentation
Recognition: subspace methods, Hausdorff/chamfer matching, bags-of-words, pictorial structures
Learning for vision: support vector machines, neural networks, deep learning
Geometry and image transformations: camera models, image transformations, 2D & 3D geometry
Graphical models: Markov models, Markov Random Fields, belief propagation, graph cuts
Motion: Optical flow, parametric motion
Reconstruction: stereo, structure from motion, shape from X, image restoration

Term(s) offered: Fall, Spring

Prerequisite(s): To register, successful completion of the Entrance Exam with a score of 6/10 is required. After completing the exam, please email your score to the Office of Online Education so that permission can be granted.

If you want to become a machine learning practitioner, a better problem solver, or maybe even consider a career in machine learning research, then this course is for you. However, for a novice, the theoretical concepts behind machine learning can be quite overwhelming.

This course focuses on introducing theoretical concepts and algorithms in a step-by-step manner, while infusing them with intuition, examples, and Python Jupyter notebooks. In this spirit, you will study core ML algorithms, while also working through numerous example applications of machine learning. Concrete examples help illustrate the broader concepts by putting the learned material directly into action. This combination of theory and hands-on practice will help you master core ML concepts and algorithms that are used not only in Silicon Valley but throughout the world, while also offering intuitive yet informative explanations of how machine learning algorithms work, how to use them, and most importantly, how to avoid the most common pitfalls.

For those with a stronger interest in ML theory and development, this course will provide an optional track that delves into the theory a little more deeply and culminates in coding up core ML algorithms from scratch and possibly extending them.

Data Science

Term(s) offered: Spring, Summer

Prerequisite(s): Python

Databases are central to data science as the means of storing and managing data. Relational databases have powered major industries for decades and are still widely adopted. In the new era of big data, the database landscape is undergoing significant change. Many non-relational databases have become an important part of the enterprise data architecture of companies. Relational databases were developed long before the Internet and the Web to tackle the issues of centrally controlled data storage and management. NoSQL databases emerged with the rise of Internet and Web applications to connect companies with customers (i.e., online or mobile) and to support agile development that adapts to faster changes. The new challenges of being agile and being able to accommodate data variability and data integration drive enterprises to turn to NoSQL database technology. It is important for every data scientist to master current database skills and to know about the future of databases in a world of NoSQL.

This course provides a basic overview of the current database landscape and tools, starting with relational databases and moving to several different NoSQL databases, such as MongoDB, Neo4j, Cassandra, and Redis.

Term(s) offered: Fall

Prerequisite(s): R, Python, and Statistics.

The goal of this course is to develop the practical skills needed to perform applied data science research. The course is organized around each stage of the data science workflow (setting expectations, exploratory data analysis, modeling, interpreting, and communicating results) and covers algorithms, best practices, and evaluation criteria. Both good and bad application examples will be discussed to help students develop a deeper understanding and intuition about the choice of algorithm or visualization for the data task, the development of best practices, and the methods for evaluating the results of different approaches. Lectures and readings will provide students with a theoretical foundation for research, and assignments will provide hands-on practice for developing practical skills.

Term(s) offered: Spring (Special course dates apply)

This course is designed to provide a foundation in the use of modeling techniques in managerial decision-making. The course will cover three separate areas of modeling – forecasting, computer simulation and optimization. Computer simulation will be introduced and we will follow up on more advanced aspects of the topics in this course.

In particular, we will concentrate on input and output analysis for simulation models. In optimization, we will cover several different areas including linear programming, integer programming, nonlinear programming and genetic algorithms. We will also spend two weeks on forecasting and cover a broad overview of key forecasting techniques.

Upon completion of this course, students should be able to:

Understand how analytical techniques and tools are used to provide solutions to operational problems in various business functional areas including finance, economics, operations, and marketing.
Develop analytical models to analyze various business problems.
Recommend sound solutions to complex business problems based on the results of their analysis.
Solve complex problems using analytical techniques and tools on spreadsheets using various spreadsheet-based add-ins. These include using Excel Solver for linear and integer programming problems, StatTools for statistical analysis, and @RISK for probabilistic simulations and risk analysis.

Term(s) offered: Fall, Every other Spring

Prerequisite(s): Machine Learning and Python

Natural Language Processing (NLP) has become an essential skill in many daily tasks for Data Scientists. From cleaning and parsing to extracting and computing, a scientist often faces challenging questions:

Data Wrangling - how to process and clean structured and unstructured data
Data Collection - how to extract text information
Data Analysis - how to summarize and categorize text data
Data Understanding - how to interpret natural language data

In this course, you will be introduced to NLP basics and will be guided through the most common NLP tasks for data analysis. In the first half of the course you will learn NLP processing skills. In the second half of the course you will dive into the domain-specific NLP techniques for data analysis featuring Healthcare, Banking, Marketing, Customer Service, and Technology domains.

This course is designed to prepare you for more advanced Data Science courses (Machine Learning and Deep Learning) as well as for more linguistic theory-oriented courses (Computational Linguistics) to enhance and refine your NLP skills.

Term(s) offered: Fall, Spring, Summer

Prerequisite(s): Basic algebra

This course provides a gentle, yet intense, introduction to programming in Python for students with little or no prior experience in programming. Python is an open-source language that allows rapid application development of scalable software systems, is object-oriented by design, and provides an excellent platform for doing data science. The course will focus on planning and organizing programs, and developing high-quality, working software that solves real-world problems.

Students will:

Learn how to design and implement scalable Python programs that solve real-world problems, with a focus on Data Science applications.
Learn top-down and object-oriented approaches to software design.
Learn data structures and algorithms used in numeric and text data processing.

Term(s) offered: Every other Spring

Prerequisite(s): STAT S519 or equivalent

This hands-on course provides a guided platform to learn and practice critical time-series analysis skills. It covers time series regression and exploratory data analysis, ARMA/ARIMA models, model identification/estimation/linear operators, Fourier analysis, spectral estimation, and state space models. The analyses will be performed using the freely available packages astsa, xts, and zoo. Lectures and readings are obligatory. R [RStudio & R Markdown] and GitHub [GitHub Desktop] are required.

Terms offered: Summer

Prerequisites: Basic proficiency in Python and data wrangling, familiarity with data analysis tools (e.g., Pandas, NumPy), and introductory knowledge of statistical analysis and machine learning are recommended.

This course is designed for data science students eager to apply analytical skills to biomedical research using the NIH’s All of Us Research Program data. It emphasizes leveraging large-scale, diverse datasets to explore real-world biomedical questions, with a focus on hands-on group projects using the All of Us (AoU) workbench. Students will gain practical experience analyzing genomics, clinical, and socioeconomic data to address population health, disease risk, and personalized medicine applications.

Terms offered: Fall, Spring

Prerequisites: Basic proficiency in Python

This course provides a practical introduction to fundamental and cutting-edge techniques in Artificial Intelligence. Students will gain hands-on experience in developing and applying AI models across various domains. The course is structured around seven distinct modules, allowing students to tailor their learning experience by selecting at least four modules of interest. Each module introduces practical implementation of specific AI methodologies, equipping students with the skills necessary to tackle real-world AI challenges. You may enroll in 1-3 credit hours per academic term.

Learning Outcomes: Upon successful completion of this course, students will be able to:

Understand the core concepts and principles behind various AI techniques.
Develop and implement AI models using relevant programming frameworks and libraries (e.g., PyTorch).
Critically evaluate the strengths and weaknesses of different AI approaches for specific tasks.
Effectively communicate technical concepts and findings related to AI projects.

Terms offered: Spring

Data science is a means to an end. The end is to answer questions or solve problems for the world (companies, nonprofit/nongovernmental organizations, retail and institutional investors, governments, regulators, politicians, journalists, employees, customers, and communities). This course aims to prepare students for a career that applies data science to answer questions or solve problems for (for-profit) companies. The focus is less on data science, because students in this program have already studied and/or will study many courses on data science, and more on business, because D590 may be a student’s only business course.

Data science enables one to quantify the business problem/question, identify the cause(s) for the problem/question, and hopefully provide alternatives to solve the problem or answer the question. More broadly, this course aims to make you a better consumer of statistical information (e.g., a news article that reports that X% of your customers like your product Y) and better able to critique how the statistics were arrived at.

Students will learn to:

Identify the appropriate statistical method to analyze data
Use IBM Corporation’s SPSS Statistics software program to analyze data
Report in both written and oral forms the insights from the data analysis
Recommend what business decision(s) to make

Term(s) offered: Fall, Spring, Summer

Prerequisite(s): To register, an offer letter from the hiring entity must be submitted to the Office of Online Education with a Graduate Internship form. Please contact the Office of Online Education for further instructions.

Graduate Internship credit can be awarded to students undertaking a significant experiential learning opportunity through a company, organization, nonprofit, etc. Students are responsible for securing their own internships, but should contact Luddy Career Services for assistance and resources. Students will participate in an internship for at least 6 weeks, with no less than 160 hours of supervised work. A student cannot earn more than three (3) credit hours in the course and the experience must be integral to their curricula.

Term(s) offered: Fall, Spring

Prerequisite(s): MS student in their final year of the program, or minimum completion of 18 credit hours in program.

This course is designed to help students experience the complexities and nuances of applying data science in the real world. Students will work in teams to tackle real-world problems in ongoing and new projects defined by a project sponsor. Project sponsors can be academics or industry practitioners. Students will need to work with the project sponsor and other team members to understand the problem domain, decide on a role, identify where their data science skills can be applied, and to work on a solution; in this regard, much of the course is about moving from ambiguity to an achievable outcome. During the course, students will also study aspects of data science consulting and project management through weekly reading assignments. The emphasis in this course is on the learning experience over the technical outcomes in the projects.

After completing this course, students will have practical experience working in a complex team environment using their skills to solve a real data science problem in an application domain; be able to go into an ambiguous situation and identify concrete opportunities; and understand multiple perspectives on data science consulting and project management.

Term(s) offered: Fall, Spring, Summer

Data Science On-Ramp is a variable-credit, asynchronous course composed of several beginner and advanced mini-topics aimed at building and enhancing your data science skills and knowledge of technologies. Each topic covers 4-6 weeks of materials and will be counted as one credit hour. You may enroll in 1-3 credit hours per academic term; individual topic selection will be administered through the course’s Canvas site during the first week of each term. All topics will have weekly discussion requirements and deadlines for time management. If you enroll in 3 credit hours of On-Ramp, expect to spend 9-12 hours per week on three individual topics. Topics are designed to be completed sequentially (one at a time) or concurrently.

Note: No more than three (3) credit hours of On-Ramp credit may be applied to the Data Science program requirements effective Spring 2019.

On-Ramp topics include:

Basics of Scala
Data Processing
Deep Learning Principles
Introduction to Hadoop Framework
Introduction to Spark
Kaggle Cases
Machine Learning with Python
Machine Learning with PySpark
Machine Learning with R
Natural Language Processing in Python
Data Visualization Using Tableau
Web Scraping

Terms offered: Fall, Spring, Summer

Prerequisite(s): To register, a project proposal must be submitted to the Office of Online Education with an Independent Study form. Please contact the Office of Online Education for further instructions.

Independent study courses allow students to conduct individualized projects under the supervision of a faculty member. Up to three (3) credit hours may be earned to conduct research or to explore specific areas of data science that are not well covered by any specific formal course. The course is managed by a supervising faculty member in conjunction with the proposed learning goals of the student. The student and the faculty member discuss and propose goals, topics, and projects.

Intelligent Systems Engineering

Term offered: Spring

Prerequisite(s): STAT S519 or equivalent, CSCI P556 strongly suggested

Machine Learning for Signal Processing teaches advanced machine learning concepts while also encompassing many signal processing applications: students are exposed to those applications during the lectures and via homework (e.g., speech denoising, music source separation, stereo image matching, temporally ordered Twitter streams, EEG recordings, image segmentation, etc.). The lectures are structured in a problem-solving way, where machine learning models are introduced to solve specific motivating problems. The course starts from basic unsupervised and supervised machine learning models, but it also delves into more advanced topics including kernel methods, probabilistic topic modeling, hashing, Kalman filtering, boosting, and so on. It is strongly recommended that students have some background in probability theory, optimization, and linear algebra, although the course is homework-heavy and programming oriented.

Term offered: Fall

Prerequisite(s): A high comfort-level with systems programming and debugging. The assignments in this course will include nontrivial programming in the language of your choice.

This course covers basic concepts of the programming models and tools of cloud computing that support data-intensive science applications. Students will get to know the latest research topics in cloud platforms, parallel algorithms, storage, and high-level languages, building proficiency with a complex ecosystem of tools that span many disciplines. The course has the following objectives:

Provide a basic introduction to cloud computing
Introduce the concept of cloud data centers
Get familiar with cloud Infrastructure as a Service offerings such as OpenStack, Azure, or AWS
Get familiar with cloud infrastructure such as Docker and Kubernetes
Program cloud services
Understand the differences between virtual machines and containers
Develop sophisticated, programming-language-independent REST services
Learn advanced programming models for clouds such as Map/Reduce, Messaging, and GraphQL
Explore Go for cloud computing
Demonstrate knowledge of clouds while developing a significant project
Explore state-of-the-art cloud technologies and services while providing a section and summary and commenting on its use for the cloud
Learn how edge computing is enhancing cloud services and infrastructure
Learn how to set up a cloud based on using commodity hardware

Term offered: Fall

Prerequisite(s): Intermediate C experience, familiarity with Linux/Unix command-line utilities.

This one-semester college course, “Introduction to High Performance Computing”, is offered as an entry-level, hands-on learning experience in supercomputing. It provides the essential concepts, knowledge, and skills needed either to begin a career in supercomputing or to use it effectively within other disciplines that depend on it. This course will also serve those interested in HPC engineering and design, software development, or system administration. The goal of the course is to engender a new generation of computer and computational scientists who are expert in the development, operation, and application of high performance computing systems and prepared to address future challenges demanding capability and expertise in HPC. The course is interdisciplinary, combining critical elements from hardware technology and architecture, system software and tools, and programming models and application algorithms with the cross-cutting theme of performance management and measurement. It provides hands-on experience with strong educational reinforcement through experimental exercises. The topic areas to be covered by this one-semester course include:

Introduction and overview of HPC,
Large scale applications and parallel algorithmic methods,
Enabling technologies for logic, memory, and communication,
Parallel architectures including SMPs, commodity clusters, MPPs, and GPUs,
Performance metrics, monitoring, measurement, and benchmarking,
Programming methods and tools including MPI, OpenMP, and OpenACC, and
Scientific visualization, performance and correctness debugging, and resource management.

Term offered: Fall

Prerequisite(s): STAT S519, CSCI P556; ENGR E511 or DSCI D590 Intro to NLP for Data Sciences helpful but not required

Deep Learning Systems is a comprehensive deep learning course that starts with the basics of neural networks, the principles of deeper neural networks, and deep learning-specific optimization techniques. Then, the course introduces core deep learning models that are widely used in various application fields, such as convolutional neural networks, recurrent neural networks (LSTMs and GRUs), and embedding models for both text/language modeling and signal processing. The course also covers some generative models such as variational autoencoders, generative adversarial networks, and autoregressive models. The course also captures the engineering aspect of neural networks, mainly the network compression algorithms that reduce the cost of run-time inference in hardware deployment. The course consists of programming-oriented homework, as well as final projects.

Term offered: Fall

Prerequisites: Knowledge of a programming language, the ability to pick up other programming languages as needed, willingness to enhance your knowledge from online resources and additional literature. You will need access to a “modern” computer that allows using virtual machines and/or containers. Knowledge of material taught by ENGR E516 is desirable and will make project execution easier. ENGR E516 and this class can be taken in parallel.

This class investigates the use of clouds running data analytics collaboratively for processing Big Data to solve problems in Big Data Applications and Analytics. Case studies such as Netflix recommender systems, Genomic data, Sports, Health, and more will be discussed.

The course has the following objectives:

Provide an introduction to Big Data
Provide an introduction to Big Data Analytics
Provide overviews of different Big Data Application areas
Explore state-of-the-art big data and cloud technologies and services while providing a write-up about them and exploring them practically with a section that you develop
Reinforce the theoretical knowledge with a project that you conduct in one of the application areas.

Term offered: Spring

The visual representation of information requires a deep understanding of human perceptual and cognitive capabilities, data mining and visualization algorithms, interface and interaction design, as well as creativity. Data, such as Twitter, books, or social networks, is typically non-spatial and needs to be mapped into a physical space that represents relationships contained in the information faithfully and efficiently. If done successfully, data visualizations combine human and machine intelligence to solve tasks that neither could accomplish alone.

This course provides an overview of the state of the art in information visualization. It teaches the process of producing effective temporal, geospatial, topical, and network visualizations. Students get the chance to use tools such as Tableau, D3.js, OpenRefine, Gephi, and Plot.ly. Students have the opportunity to collaborate on real-world projects for a variety of clients.

Specifically, the course covers:

visualization frameworks that guide development
data analysis algorithms that enable extraction of structures and trends in data
major visualization and interaction techniques
discussions of systems that drive research and development
trends, opportunities, and challenges in the field

Informatics

Term(s) offered: Fall

Prerequisite(s): Some programming background is necessary. A specific language is not required, but it is assumed you can pick up new languages where needed for this course. One of the labs will be related to buffer overflows in C. This course also assumes you are savvy with the Linux command line.

This course is an extensive survey of network security. The course materials cover threats to information confidentiality, integrity, and availability in different Internet layers, and defense mechanisms that control these threats. The course also provides a necessary foundation in network security, such as cryptographic primitives/protocols, authentication, authorization, and access control technologies, as well as hands-on experience through programming assignments and course projects.

Term offered: Fall

The course will use the tools of economics to better understand computer security. This is not a course in economics research in that no new tools will be discovered and no new ground will be broken in economic theory. The understanding of economics required for this course is modest, and a strong mathematical background with no economics will certainly suffice. There is no textbook.

At its core, this course should improve the decision-making that any organization requires of its security professionals. In addition to the fundamental language of decision-making, the course will identify the dimensions of organizational and economic behavior that impinge upon the success of organizational technical choices.

Term offered: Fall

Prerequisite(s): Python, R, and C (or C++)

Machine learning techniques have been successful in analyzing biological data because of their capabilities in handling randomness and uncertainty of data noise and in generalization. In this class, we will learn about the classical machine learning techniques, such as naive Bayes, principal component analysis, clustering, and neural networks, using biological problems.

We will also learn about recent developments and applications of dimensionality reduction and deep networks and their successful applications to solve some biological problems. Finally, we will learn about probabilistic models (including Markov models, Hidden Markov models, and Bayesian networks) for biological sequence analysis and systems biology.

Assessments will be take-home assignments and five (5) programming exercises.

By the end of this class, students will:

Learn more about the ML approaches that are unique to bioinformatics and their applications
Implement simple ML approaches
Be able to use ML approaches implemented in R & Python packages (e.g. scikit-learn) to solve real-world problems
Understand and appreciate biological problems that can be solved using ML approaches

Term offered: Spring

Prerequisite(s): A reasonable programming background is necessary. Courses in operating systems, networking, and computer architecture are helpful but not necessary. You are not required to know any particular language, but rather it is assumed you can pick up new languages if needed for the course.

This course is targeted at graduate students. This course covers the design and analysis of secure systems, including identifying security goals and risks, threat modeling, defense, integrating different technologies to achieve security goals, developing security protocols and policies, implementing security protocols and secure coding. Some real-world scenarios that have many security requirements will be studied.

Terms offered: Fall, Spring

Prerequisite(s): Students are expected to have undergraduate level expertise in computational thinking, but not a strong programming background. Experience with Linux File System and MySQL will be helpful prior to taking this class.

Data is abundant, and its abundance offers potential for new discovery as well as economic and social gain. However, data can be difficult to use, not to mention noisy and inadequately contextualized. The gap from data to knowledge can be too wide, whether because of limits in technology or policy or because the data is not easily combined with other data. This course will examine the underlying principles and technologies needed to capture, clean, contextualize, store, access, and trust data for a repurposed use. Students in this course will be introduced to the capabilities and benefits of big data, key components of big data projects, and major steps in data analysis and visualization.

The following concepts are covered in the course:

Big data in science and business
Data pipelines
Complexity in software systems
Modeling data storage in NoSQL stores
Data replication
Distributed computing
Data coding and cleaning
Data provenance
Data trustworthiness
Economies of data sharing

Students are expected to put 6-7 hours every week into the course, which includes time spent on readings, reflections, and engaging with instructional content.

Term(s) offered: Fall, Summer

Prerequisite(s): Because producing visualizations using the Python data and visualization stack is an integral part of the course, a good understanding and working knowledge of programming is required, as well as working knowledge of open-source libraries. It is recommended that students have a basic understanding of mathematics, statistics, and the Web (HTML, CSS, JavaScript, and JSON).

From TV news to cutting-edge scientific papers, from a home office to the largest companies in the world, data visualization is extensively used to reveal patterns in data and to tell stories. More and more data is collected, and more and more decisions are made through data analysis. Data visualization is indispensable for understanding data, and thus is an essential skill for every knowledge worker. This course is an introduction to basic statistical data analysis and visualization. We will learn fundamentals of data visualization in the context of perception, integrity, design, statistics, types of data, and visualization techniques. The hands-on exercises using the Python stack aim to equip you with practical data visualization skills and they will be an integral part of the course.

By the end of the course, you are expected to be able to understand, explain, and manipulate basic types of data, analyze them by applying basic exploratory visualization techniques, and create explanatory visualizations. You will also be able to evaluate the effectiveness of data visualizations based on the principles of human perception, design, types of data, and visualization techniques.

Term offered: Spring

Prerequisite(s): The course will require a good foundation of mathematics, statistics, and programming, although there is no formal prerequisite. Key topics are probability, statistics, linear algebra, data structures, and algorithms. Python is used as the main programming language and it will be very helpful to be proficient in Python.

Networks, or graphs, provide a unifying framework to study complex systems, such as living organisms, societies, and many techno-social systems. This graduate-level course focuses on the fundamental concepts as well as key applications of network science. The course will cover recent advances in network science with respect to statistical properties and models of real-world networks, network algorithms, and practical applications. Topics include how information and diseases spread in our society, measures and algorithms for quantifying importance, link prediction, and community detection.

By the end of the course, students are expected to be able to identify, construct, and analyze networks by choosing and applying appropriate methods and algorithms. Students are also expected to be able to explain, both mathematically and conceptually, the key network concepts and statistical properties, and their implications.

Information and Library Science

Term offered: Spring

Prerequisite(s): Knowing linear algebra and basic statistics is helpful.

With the exponential growth of the Web in the past decades, we are facing a flood of information.

The success of GYM (Google, Yahoo and MSN) has shown that Information Retrieval is a key component in helping users find the information they need. The course introduces the information retrieval theories and concepts underlying all search applications. We will investigate techniques used in modern search engines and demonstrate their significance through experiments.

At the end of this course, students will be able to:

Understand the mechanisms of the most important and up-to-date retrieval theories and models
Design and implement search engines using retrieval models
Work in teams or individually to build their own search components and interfaces
Explain how information retrieval is used in related fields, e.g. digital libraries, online shopping, and multimedia environments
Enhance their search skills using a variety of search engines such as Google and Bing

Term offered: Fall

Prerequisite: Adequate knowledge of Python to read, modify, and write code independently.

This course is intended to introduce you to the burgeoning field of Social Media Mining. We will explore what, exactly, is meant by the term "Social Media," and why anyone would be interested in mining it. After establishing some basic definitions and motivations, we will spend the rest of the course learning various techniques and methods that are currently employed to extract meaningful signals from the growing flood of social media data. In pursuit of this goal, I will provide hands-on, guided exercises using Python, and we will also read academic papers where authors share their methods, research questions, and insights from mining the social web.

Statistics

Term(s) offered: Fall, Spring, Summer

Prerequisite(s): Intermediate algebra skills, such as comfort with functions, logarithms, and college-level mathematical notation. To register, please email the Statistics Department at statdept@iu.edu and include your 10-digit UID.

This course introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics include 1- and 2-sample location problems, the one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and/or statistical methods to practical situations and/or actual data sets.

At the end of this course, students will be able to:

Characterize uncertainty and variation using probability
Summarize data using computer graphics and numerical measures of center, spread, and association
Assess whether observed data fits a probability model and understand the implications for analysis
Explain what significance probabilities (P-values) and confidence intervals mean, and identify common misinterpretations
Compare two or more samples or sets of measurements to draw scientific conclusions
Apply statistical models to real data and recognize their uses and limitations

Term offered: Spring

Prerequisite(s): STAT S519 is required for enrollment in STAT S580. You should already know how to calculate probabilities, using software or otherwise, for fundamental probability distributions like the binomial and the normal. You should also know the forms and interpretations of t-tests, confidence intervals, and the simple linear regression line. You should have some experience with R. To register, please email the Statistics Department at statdept@iu.edu and include your 10-digit UID.

This course is a survey of statistical methods that do not rely on parametric assumptions. Knowledge of introductory statistics at the level of S320/S520 is assumed; this course is in some ways a sequel. As such, it will review the parametric techniques learned in that and similar introductory courses, and compare them to nonparametric alternatives to see when one technique outperforms another. The course material will include:

EDA and basic concepts
Nonparametric tests
Empirical distributions and the bootstrap
Multiple linear regression
Nonparametric and penalized regression
GLMs and other advanced models

O’Neill School of Public and Environmental Affairs

Terms offered: Fall, Spring, Summer

Prerequisite(s): To register, please email the O'Neill Records Office at oneillrc@iu.edu and include your 10-digit UID.

Application of statistical analysis to issues in public and environmental affairs and related fields. Addresses descriptive statistics, statistical inference, the nature of random variables, sampling distributions, point and interval estimation of parameters (mean, standard deviation, etc.), hypothesis testing, analysis of variance, and bivariate and multivariate regression. Emphasizes practical aspects of applying such methods, appropriately interpreting the results of these statistical analysis tools, and gaining a meaningful understanding of how statistical analysis can be misused or erroneously executed. Use of computer tools for carrying out statistical analysis (primarily SAS) is also a major emphasis.

Term offered: Spring

Prerequisite(s): A prerequisite for the class is a graduate-level, introductory statistics course that includes coverage of the simple (two-variable) regression model and an introduction to multivariate regression. To register, please email the O'Neill Records Office at oneillrc@iu.edu and include your 10-digit UID.

Intermediate-level perspective on statistical concepts and techniques for analyzing and modeling complex systems via regression analysis. Includes estimating the parameters of such models based on existing data, testing hypotheses about these systems, forecasting, correcting for violations of assumptions, and dealing with commonly encountered problems such as near multicollinearity. Primarily focused on single-equation regression models and the extension of these models to a variety of situations, but includes an introduction to simultaneous equation models. Application of these techniques to problems and policies in public and environmental affairs, as well as general social sciences.

Available courses

Computer Science

CSCI B505: Applied Algorithms

CSCI B551: Elements of Artificial Intelligence

CSCI B561: Advanced Database Concepts

CSCI B657: Computer Vision

CSCI P556: Applied Machine Learning

Data Science

DSCI D532: Applied Database Technologies

DSCI D590: Applied Data Science

DSCI D590: Optimization and Simulation for Business Analytics

DSCI D590: Introduction to NLP for Data Science

DSCI D590: Introduction to Python Programming

DSCI D590: Time Series Analysis

DSCI D590: Biomedical Data Science in Practice

DSCI D590: Artificial Intelligence On-ramp

DSCI D590: Data Science in Business

DSCI D591: Graduate Internship in Data Science

DSCI D592: Data Science in Practice

DSCI D595: Data Science On-Ramp

DSCI D699: Independent Study in Data Science

Intelligent Systems Engineering

ENGR E511: Machine Learning for Signal Processing

ENGR E516: Engineering Cloud Computing

ENGR E517: High Performance Computing

ENGR E533: Deep Learning Systems

ENGR E534: Big Data Applications

ENGR E583: Information Visualization

Informatics

INFO I520: Security for Networked Systems

INFO I525: Organizational Informatics and Economics of Security

INFO I529: Machine Learning in Bioinformatics

INFO I533: Systems and Protocol Security and Information Assurance

INFO I535: Management, Access, and Use of Big and Complex Data

INFO I590: Data Visualization

INFO I606: Network Science

Information and Library Science

ILS Z534: Search

ILS Z639: Social Media Mining

Statistics

STAT S519: Introduction to Statistics

STAT S580: Introduction to Regression Models and Nonparametrics

O’Neill School of Public and Environmental Affairs

SPCN V506: Statistical Analysis for Effective Decision-making

SPCN P507: Data Analysis and Modeling for Public Affairs
