
Online programs
Advance your career from anywhere in the world. With flexible online courses taught by the same expert faculty who teach on campus, you’ll earn a world-renowned IU degree—on your own schedule.

Level up your technical expertise with the online Certificate in Artificial Intelligence and become the expert companies need.
Quickly and conveniently acquire new skills in topics such as data analysis, cloud computing, health and medicine, statistics, and data mining.
The online M.S. in Data Science from the Luddy School offers working professionals the flexibility to advance their careers while gaining specialized knowledge in data science.
The online courses offered for our Online Master’s and Online Certificate programs are listed below. Please check the official Schedule of Classes for section numbers and instructor of record each Fall, Spring, and Summer term.
If you need additional content regarding a course, you are welcome to reach out to the instructor directly.
Note: Unless otherwise specified, all courses listed are worth three (3) credit hours.
Term offered: Fall
Algorithms are at the heart of any computer-related task. In this course, we will teach how to approach the meta-task of algorithm building, as well as look at individual algorithms. We will use mathematical tools for designing and analyzing our algorithms, and get some simple hands-on coding experience. If you’re a non-CS major, or someone who cares more about applications than theory, then chances are this course is the one you want.
At the completion of this course, you will be able to:
Term offered: Fall
Prerequisite(s): Experience with programming, data structures, and algorithms will be assumed. Assignments will involve a substantial amount of programming in Python. In addition, we will encounter math of various kinds, including linear algebra, probability theory, and basic calculus.
This course covers the fundamentals of Artificial Intelligence, and is aimed at M.S., early Ph.D., and advanced undergraduate students in Computer Science and Data Science, as well as students in other related fields who have a strong computing background. Topics will include (tentatively):
Term(s) offered: Spring
Prerequisite(s): It is assumed that you can program in various styles (imperative, functional, and object-oriented), have knowledge of algebra and elements of discrete math, as well as data structures and algorithms.
Introduction to database concepts and systems. Topics include:
Term(s) offered: Spring
Prerequisite(s): CSCI B551 or equivalent is required. You will need to be proficient in a general-purpose programming language (Python or C/C++); you should be able to implement basic matrix operations using basic data structures of the language (e.g. matrix multiplication using arrays). Exposure to linear algebra, basic calculus, machine learning, graph theory, probability theory, geometry, and statistics will be extremely helpful.
This is an introductory course in computer vision. We will give a broad overview of the field, with a slight bias towards some topics to reflect current research trends (e.g. object recognition, deep learning). The emphasis will be on algorithms, mathematical models, and techniques that are broadly applicable to many problems not only in vision but also in other areas of AI and CS. Topics will include (tentatively and not necessarily in this order):
Term(s) offered: Fall, Spring
Prerequisite(s): To register, successful completion of the Entrance Exam with a score of 6/10 is required. After completing the exam, please forward your score to the Office of Online Education via email to be granted permission.
If you want to become a machine learning practitioner, a better problem solver, or maybe even consider a career in machine learning research, then this course is for you. However, for a novice, the theoretical concepts behind machine learning can be quite overwhelming.
This course focuses on introducing theoretical concepts and algorithms in a step-by-step manner, while infusing them with intuition, examples, and Python Jupyter notebooks. In this spirit, you will study core ML algorithms, while also working through numerous example applications of machine learning. Concrete examples help illustrate the broader concepts by putting the learned material directly into action. This combination of theory and hands-on practice will help you master core ML concepts and algorithms that are used not only in Silicon Valley but throughout the world, while also offering intuitive yet informative explanations of how machine learning algorithms work, how to use them, and most importantly, how to avoid the most common pitfalls.
For those with a stronger interest in ML theory and development, this course will provide an optional track that delves a little more deeply into the theory and culminates in coding up core ML algorithms from scratch and possibly extending them.
Term(s) offered: Spring, Summer
Prerequisite(s): Python
Databases are central to data science as the means to store and manage data. Relational databases have empowered the main industries for decades and are still widely adopted. In the new era of big data, the database landscape is undergoing significant change. Many non-relational databases have become an important part of the enterprise data architecture of companies. Relational databases were developed long before the Internet and the Web to tackle the issues of centrally controlled data storage and management. NoSQL databases emerged with the rise of Internet and Web applications to connect companies with customers (i.e., online or mobile) and to develop with agility to adapt to faster changes. The new challenges of being agile and being able to accommodate data variability/data integration drive enterprises to turn to NoSQL database technology. It is important for every data scientist to master the skills of current databases and know about the future of databases in a world of NoSQL.
This course provides a basic overview of the current database landscape and tools, starting with relational databases and moving to several different NoSQL databases, such as MongoDB, Neo4j, Cassandra, and Redis.
Term(s) offered: Fall
Prerequisite(s): R, Python, and Statistics.
The goal of this course is to develop practical skills needed to perform
applied data science research. The course is organized around each stage of
the data science workflow (setting expectations, exploratory data analysis,
modeling, interpreting, and communicating results) and covers algorithms,
best practices, and evaluation criteria. Both good and bad application
examples will be discussed to help students develop a deeper understanding
and intuition about the choice of algorithm or visualization for the data
task, the development of the best practices, and the methods for evaluating
results of different approaches. Lectures and readings will provide students
with a theoretical foundation for research, and assignments will provide
hands-on practice for developing practical skills.
Term(s) offered: Spring (Special course dates apply)
This course is designed to provide a foundation in the use of modeling techniques in managerial decision-making. The course will cover three separate areas of modeling – forecasting, computer simulation and optimization. Computer simulation will be introduced and we will follow up on more advanced aspects of the topics in this course.
In particular, we will concentrate on input and output analysis for simulation models. In optimization, we will cover several different areas including linear programming, integer programming, nonlinear programming and genetic algorithms. We will also spend two weeks on forecasting and cover a broad overview of key forecasting techniques.
Upon completion of this course, students should be able to:
Term(s) offered: Fall, Every other Spring
Prerequisite(s): Machine Learning and Python
Natural Language Processing (NLP) has become an essential skill in many daily tasks for Data Scientists. From cleaning and parsing to extracting and computing, a scientist often faces challenging questions:
In this course, you will be introduced to NLP basics and will be guided through the most common NLP tasks for data analysis. In the first half of the course you will learn NLP processing skills. In the second half of the course you will dive into the domain-specific NLP techniques for data analysis featuring Healthcare, Banking, Marketing, Customer Service, and Technology domains.
This course is designed to prepare you for more advanced Data Science courses (Machine Learning and Deep Learning) as well as for more linguistic theory-oriented courses (Computational Linguistics) to enhance and refine your NLP skills.
Term(s) offered: Fall, Spring, Summer
Prerequisite(s): Basic algebra
This course provides a gentle, yet intense, introduction to programming in Python for students with little or no prior experience in programming. Python is an open-source language that allows rapid development of scalable software systems, is object-oriented by design, and provides an excellent platform for doing data science. The course will focus on planning and organizing programs, and developing high-quality, working software that solves real-world problems.
Students will:
Term(s) offered: Every other Spring
Prerequisite(s): STAT S519 or equivalent
This is a hands-on course providing a guided platform to learn and practice critical time-series analysis skills. This course will cover time series regression and exploratory data analysis, ARMA/ARIMA models, model identification/estimation/linear operators, Fourier analysis, spectral estimation, and state space models. The analyses will be performed using the freely available R packages astsa, xts, and zoo. Lectures and reading are obligatory. R [RStudio & R Markdown] and GitHub [GitHub Desktop] are required.
Terms offered: Summer
Prerequisites: Basic proficiency in Python and data wrangling, familiarity with data analysis tools (e.g., Pandas, NumPy), and introductory knowledge of statistical analysis and machine learning are recommended.
This course is designed for data science students eager to apply analytical skills to biomedical research using the NIH’s AllofUs Research Program data. It emphasizes leveraging large-scale, diverse datasets to explore real-world biomedical questions, with a focus on hands-on group projects using the AllofUs (AoU) workbench. Students will gain practical experience analyzing genomics, clinical, and socioeconomic data to address population health, disease risk, and personalized medicine applications.
Terms offered: Fall, Spring
Prerequisites: Basic proficiency in Python
This course provides a practical introduction to fundamental and cutting-edge techniques in Artificial Intelligence. Students will gain hands-on experience in developing and applying AI models across various domains. The course is structured around seven distinct modules, allowing students to tailor their learning experience by selecting at least four modules of interest. Each module introduces practical implementation of specific AI methodologies, equipping students with the skills necessary to tackle real-world AI challenges. You may enroll in 1-3 credit hours per academic term.
Learning Outcomes: Upon successful completion of this course, students will be able to:
Terms offered: Spring
Data science is a means to an end. The end is to answer questions or solve problems for the world (companies, nonprofit/nongovernmental organizations, retail and institutional investors, governments, regulators, politicians, journalists, employees, customers, and communities). This course aims to prepare students for a career that applies data science to answer questions or solve problems for (for-profit) companies. The focus is less on data science, because students in this program have already studied and/or will study many courses on data science, and more on business, because D590 may be a student's only business course.
Data science enables one to quantify the business problem/question, identify the cause(s) for the problem/question, and hopefully provide alternatives to solve the problem or answer the question. More broadly, this course aims to make you a better consumer of statistical information (e.g., a news article that reports that X% of your customers like your product Y) and to help you critique how the statistics were arrived at.
Students will learn to:
Term(s) offered: Fall, Spring, Summer
Prerequisite(s): To register, an offer letter from the hiring entity must be submitted to the Office of Online Education with a Graduate Internship form. Please contact the Office of Online Education for further instructions.
Graduate Internship credit can be awarded to students undertaking a significant experiential learning opportunity through a company, organization, nonprofit, etc. Students are responsible for securing their own internships, but should contact Luddy Career Services for assistance and resources. Students will participate in an internship for at least 6 weeks, with no less than 160 hours of supervised work. A student cannot earn more than three (3) credit hours in the course and the experience must be integral to their curricula.
Term(s) offered: Fall, Spring
Prerequisite(s): MS student in their final year of the program, or minimum completion of 18 credit hours in program.
This course is designed to help students experience the complexities and nuances of applying data science in the real world. Students will work in teams to tackle real-world problems in ongoing and new projects defined by a project sponsor. Project sponsors can be academics or industry practitioners. Students will need to work with the project sponsor and other team members to understand the problem domain, decide on a role, identify where their data science skills can be applied, and to work on a solution; in this regard, much of the course is about moving from ambiguity to an achievable outcome. During the course, students will also study aspects of data science consulting and project management through weekly reading assignments. The emphasis in this course is on the learning experience over the technical outcomes in the projects.
After completing this course, students will have practical experience working in a complex team environment using their skills to solve a real data science problem in an application domain; be able to go into an ambiguous situation and identify concrete opportunities; and understand multiple perspectives on data science consulting and project management.
Term(s) offered: Fall, Spring, Summer
Data Science On-Ramp is a variable-credit, asynchronous course comprised of several beginner and advanced mini-topics aimed at building and enhancing your data science skills and technologies. Each topic covers 4-6 weeks of materials and will be counted as one credit hour. You may enroll in 1-3 credit hours per academic term; individual topic selection will be administered through the course’s Canvas site during the first week of each term. All topics will have weekly discussion requirements and deadlines for time management. If you enroll in 3 credit hours of On-Ramp, expect to spend 9-12 hours per week on three individual topics. Topics are designed to be completed sequentially (one at a time) or concurrently.
Note: No more than three (3) credit hours of On-Ramp credit may be applied to the Data Science program requirements effective Spring 2019.
On-Ramp topics include:
Terms offered: Fall, Spring, Summer
Prerequisite(s): To register, a project proposal must be submitted to the Office of Online Education with an Independent Study form. Please contact the Office of Online Education for further instructions.
Independent study courses allow students to conduct individualized projects under the supervision of a faculty member. Up to three (3) credit hours may be earned to conduct research or to explore specific areas of data science that are not well covered by any specific formal course. The course is managed by a supervising faculty in conjunction with the proposed learning goals of the student. The student and the faculty discuss and propose goals, topics and projects.
Term offered: Spring
Prerequisite(s): STAT S519 or equivalent, CSCI P556 strongly suggested
Machine Learning for Signal Processing teaches advanced machine learning
concepts, while it also encompasses many signal processing applications:
students are exposed to those signal processing applications during the
lectures and via homework (e.g., speech denoising, music source separation,
stereo image matching, temporally ordered Twitter streams, EEG recordings,
image segmentation, etc). The lectures are structured in a problem-solving
way, where the machine learning models are introduced to solve specific
motivating problems. It starts from basic unsupervised and supervised
machine learning models, but it also delves into more advanced topics
including kernel methods, probabilistic topic modeling, hashing, Kalman
filtering, boosting, and so on. It is strongly recommended that students
have some background in probability theory, optimization, and
linear algebra, although the course is homework-heavy and programming
oriented.
Term offered: Fall
Prerequisite(s): A high comfort-level with systems programming and debugging. The assignments in this course will include nontrivial programming in the language of your choice.
This course covers basic concepts of the programming models and tools of cloud computing that support data-intensive science applications. Students will get to know the latest research topics in cloud platforms, parallel algorithms, storage, and high-level languages, building proficiency with a complex ecosystem of tools that spans many disciplines. The course has the following objectives:
Term offered: Fall
Prerequisite(s): Intermediate C experience, familiarity with Linux/Unix command-line utilities.
This one-semester college course, “Introduction to High Performance Computing”, is offered as an entry-level hands-on learning experience in supercomputing, providing the essential concepts, knowledge, and skills needed to begin a career in supercomputing or to use it as an effective means of achieving goals within other disciplines that depend on it. This course will also serve those interested in HPC engineering and design, software development, or system administration. The goal of the course is to engender a new generation of computer and computational scientists, expert in the development, operation, and application of high performance computing systems and prepared to address future challenges demanding capability and expertise in HPC. The course is interdisciplinary, combining critical elements from hardware technology and architecture, system software and tools, and programming models and application algorithms with the cross-cutting theme of performance management and measurement. It provides hands-on experience with strong educational reinforcement through experimental exercises. The topic areas to be covered by this one-semester course include:
Term offered: Fall
Prerequisite(s): STAT S519, CSCI P556; ENGR E511 or DSCI D590 Intro to NLP for Data Sciences helpful but not required
Deep Learning Systems is a comprehensive deep learning course that starts with
the basics of neural networks, principles of deeper neural networks, and deep
learning-specific optimization techniques. Then, the course introduces core
deep learning models that are widely used in various application fields,
such as convolutional neural networks, recurrent neural networks (LSTMs and
GRUs), and embedding models for both text/language modeling and signal
processing. The course also covers some generative models such as
variational autoencoders, generative adversarial networks, and
autoregressive models. The course also captures the engineering aspect of
the neural networks, mainly the network compression algorithms to reduce the
cost of run-time inference in the hardware deployment. The course consists
of programming-oriented homework, as well as final projects.
Term offered: Fall
Prerequisites: Knowledge of a programming language, the ability to pick up other programming languages as needed, willingness to enhance your knowledge from online resources and additional literature. You will need access to a “modern” computer that allows using virtual machines and/or containers. Knowledge of material taught by ENGR E516 is desirable and will make project execution easier. ENGR E516 and this class can be taken in parallel.
This class investigates the use of clouds running data analytics collaboratively for processing Big Data to solve problems in Big Data Applications and Analytics. Case studies such as Netflix recommender systems, Genomic data, Sports, Health, and more will be discussed.
The course has the following objectives:
Term offered: Spring
The visual representation of information requires a deep understanding of human perceptual and cognitive capabilities, data mining and visualization algorithms, interface and interaction design, as well as creativity. Data, such as Twitter, books, or social networks, is typically non-spatial and needs to be mapped into a physical space that represents relationships contained in the information faithfully and efficiently. If done successfully, data visualizations combine human and machine intelligence to solve tasks that neither could accomplish alone.
This course provides an overview of the state of the art in information visualization. It teaches the process of producing effective temporal, geospatial, topical, and network visualizations. Students get the chance to use tools such as Tableau, D3.js, OpenRefine, Gephi, and Plot.ly. Students have the opportunity to collaborate on real-world projects for a variety of clients.
Specifically, the course covers:
Term(s) offered: Fall
Prerequisite(s): Some programming background is necessary. A specific language is not required, but it is assumed you can pick up new languages where needed for this course. One of the labs will be related to buffer overflows in C. This course also assumes you are savvy with the Linux command line.
This course is an extensive survey of network security. The course materials cover threats to information confidentiality, integrity, and availability in different Internet layers, and defense mechanisms that control these threats. The course also provides a necessary foundation in network security, such as cryptographic primitives/protocols, authentication, authorization, and access control technologies, along with hands-on experience through programming assignments and course projects.
Term offered: Fall
The course will use the tools of economics to better understand computer security. This is not a course in economics research in that no new tools will be discovered and no new ground will be broken in economic theory. The understanding of economics required for this course is modest, and a strong mathematical background with no economics will certainly suffice. There is no textbook.
At its core, this course should improve the decision-making that any organization requires of its security professionals. In addition to the fundamental language of decision-making, the course will identify the dimensions of organizational and economic behavior that impinge upon the success of organizational technical choices.
Term offered: Fall
Prerequisite(s): Python, R, and C (or C++)
Machine learning techniques have been successful in analyzing biological
data because of their capabilities in handling randomness and uncertainty of
data noise and in generalization. In this class, we will learn about classical machine learning techniques, such as naive Bayes, principal component analysis, clustering, and neural networks, using biological problems.
We will also learn about recent developments and applications of dimensionality reduction and deep networks and their successful applications to solve some biological problems. Finally, we will learn about probabilistic models (including Markov models, Hidden Markov models, and Bayesian networks) for biological sequence analysis and systems biology.
Assessments will be take-home assignments and five (5) programming exercises.
By the end of this class, students will:
Term offered: Spring
Prerequisite(s): A reasonable programming background is necessary. Courses in operating systems, networking, and computer architecture are helpful but not necessary. You are not required to know any particular language, but rather it is assumed you can pick up new languages if needed for the course.
This course is targeted at graduate students. This course covers the design and analysis of secure systems, including identifying security goals and risks, threat modeling, defense, integrating different technologies to achieve security goals, developing security protocols and policies, implementing security protocols and secure coding. Some real-world scenarios that have many security requirements will be studied.
Prerequisite(s): Students are expected to have undergraduate level expertise in computational thinking, but not a strong programming background. Experience with Linux File System and MySQL will be helpful prior to taking this class.
Data is abundant, and its abundance offers potential for new discovery as well as economic and social gain. However, data can be difficult to use, not to mention noisy and inadequately contextualized. The gap from data to knowledge can be too big because of limits in technology or policy, or because the data is not easily combined with other data. This course will examine the underlying principles and technologies needed to capture, clean, contextualize, store, access, and trust data for a repurposed use. Students in this course will be introduced to capabilities and benefits of big data, key components of big data projects, and major steps in data analysis and visualization.
The following concepts are covered in the course:
It is expected that a student will put 6-7 hours every week into the course, which includes time spent on readings, reflections, and engaging with instructional content.
Term(s) offered: Fall, Summer
Prerequisite(s): Because producing visualizations using the Python data & visualization stack is an integral part of the course, a good understanding and working knowledge of programming, as well as working knowledge of using open-source libraries, are required. It is recommended that students have a basic understanding of mathematics, statistics, and the Web (HTML, CSS, JavaScript, and JSON).
From TV news to cutting-edge scientific papers, from a home office to the largest companies in the world, data visualization is extensively used to reveal patterns in data and to tell stories. More and more data is collected, and more and more decisions are made through data analysis. Data visualization is indispensable for understanding data, and thus is an essential skill for every knowledge worker. This course is an introduction to basic statistical data analysis and visualization. We will learn fundamentals of data visualization in the context of perception, integrity, design, statistics, types of data, and visualization techniques. The hands-on exercises using the Python stack aim to equip you with practical data visualization skills and they will be an integral part of the course.
By the end of the course, you are expected to be able to understand, explain, and manipulate basic types of data, analyze them by applying basic exploratory visualization techniques, and create explanatory visualizations. You will also be able to evaluate the effectiveness of data visualizations based on the principles of human perception, design, types of data, and visualization techniques.
Term offered: Spring
Prerequisite(s): The course will require a good foundation of mathematics, statistics, and programming, although there is no formal prerequisite. Key topics are probability, statistics, linear algebra, data structures, and algorithms. Python is used as the main programming language and it will be very helpful to be proficient in Python.
Networks, or graphs, provide a unifying framework to study complex systems, such as living organisms, societies, and many techno-social systems. This graduate-level course focuses on the fundamental concepts as well as key applications of network science. The course will cover recent advances in network science with respect to statistical properties and models of real-world networks, network algorithms, and practical applications. Topics include: how information and diseases spread in our society, measures and algorithms for quantifying importance, link prediction, and community detection.
By the end of the course, students are expected to be able to identify, construct, and analyze networks by choosing and applying appropriate methods and algorithms. Students are also expected to be able to explain, both mathematically and conceptually, the key network concepts and statistical properties, and their implications.
Term offered: Spring
Prerequisite(s): Knowing linear algebra and basic statistics is helpful.
With the exponential growth of the Web in the past decades, we are facing a flood of information.
The success of GYM (Google, Yahoo, and MSN) has shown that Information Retrieval is a key component in helping users access target information based on their needs. The course introduces information retrieval theories and concepts underlying all search applications. We will investigate techniques used in modern search engines and demonstrate their significance by experiment.
At the end of this course, students will be able to
Term offered: Fall
Prerequisite: Adequate knowledge of Python to read, modify, and write code independently.
This course is intended to introduce you to the burgeoning field of Social Media Mining. We will explore what, exactly, is meant by the term "Social Media," and why anyone would be interested in mining it. After establishing some basic definitions and motivations, we will spend the rest of the course learning various techniques and methods that are currently employed to extract meaningful signals from the growing flood of social media data. In pursuit of this goal, I will provide hands-on, guided exercises using Python, and we will also read academic papers where authors share their methods, research questions, and insights from mining the social web.
Term(s) offered: Fall, Spring, Summer
Prerequisite(s): Intermediate algebra skills, such as comfort with functions, logarithms, and college-level mathematical notation. To register, please email the Statistics Department at statdept@iu.edu and include your 10-digit UID.
This course introduces the basic concepts of statistical inference through a careful study of several important procedures. Topics include 1- and 2-sample location problems, the one-way analysis of variance, and simple linear regression. Most assignments involve applying probability models and/or statistical methods to practical situations and/or actual data sets.
At the end of this course, students will be able to
Term offered: Spring
Prerequisite(s): STAT S519 is required for enrollment in STAT S580. You should already know how to calculate probabilities using software or otherwise for the fundamental probability distributions like the binomial and the normal. You should also know the forms and interpretations of t-tests, confidence intervals, and the simple linear regression line. You should have some experience with R. To register, please email the Statistics Department at statdept@iu.edu and include your 10-digit UID.
This course is a survey of statistical methods that do not rely on parametric assumptions. Knowledge of introductory statistics at the level of S320/S520 is assumed; this course is in some ways a sequel. As such, it will review the parametric techniques learned in that and similar introductory courses, and compare them to nonparametric alternatives to see when one technique outperforms another. The course material will include:
Terms offered: Fall, Spring, Summer
Prerequisite(s): To register, please email the O'Neill Records Office at oneillrc@iu.edu and include your 10-digit UID.
Application of statistical analysis to issues in public and environmental affairs and related fields. Addresses descriptive statistics, statistical inference, the nature of random variables, sampling distributions, point and interval estimation of parameters (mean, standard deviation, etc.), hypothesis testing, analysis of variance, and bivariate and multivariate regression. Emphasizes practical aspects of applying such methods, appropriately interpreting the results of these statistical analysis tools, and gaining a meaningful understanding of how statistical analysis can be misused or erroneously executed. Use of computer tools for carrying out statistical analysis (primarily SAS) is also a major emphasis.
Term offered: Spring
Prerequisite(s): A prerequisite for the class is a graduate-level, introductory statistics course that includes coverage of the simple (two-variable) regression model and an introduction to multivariate regression. To register, please email the O'Neill Records Office at oneillrc@iu.edu and include your 10-digit UID.
Intermediate-level perspective on statistical concepts and techniques for analyzing and modeling complex systems via regression analysis. Includes estimating the parameters of such models based on existing data, testing hypotheses about these systems, forecasting, correcting for violations of assumptions, and dealing with commonly encountered problems such as near multicollinearity. Primarily focused on single equation regression models and the extension of these models to a variety of situations, but includes an introduction to simultaneous equation models. Application of these techniques to problems and policies in public and environmental affairs, as well as general social sciences.