CS X456.01 - Data and Data Preparation for AI
Course Description
This course provides an in-depth understanding of the data collection, cleaning, preprocessing, and preparation techniques essential for building reliable AI models. Students will learn methods to explore and analyze data, apply feature engineering, and manage large datasets. The course combines theoretical learning with hands-on projects so that students develop practical data preparation skills.
Learner Outcomes
- Understand methods of data collection, cleaning, and preprocessing.
- Explore data using visual and statistical techniques for insights.
- Learn data labeling strategies and synthetic data generation for AI models.
- Apply feature engineering techniques to improve model accuracy.
- Master the practices for data management and storage.
- Complete a practical hands-on project in preparing datasets for machine learning models.
Prerequisites
1. Basic Programming Skills (Python): Knowledge of Python is essential, as this course involves hands-on data manipulation using libraries such as Pandas, NumPy, and Matplotlib. Suggested prerequisite course: "Introduction to Python" or equivalent experience.
2. Introductory Statistics and Probability: Familiarity with basic statistics (mean, median, variance) and probability is important for understanding data distributions, exploratory data analysis, and feature engineering techniques. Suggested prerequisite course: "Introduction to Statistics" or similar.
3. Basic Knowledge of Machine Learning Concepts: An understanding of supervised and unsupervised learning and a general familiarity with how machine learning models work will help students put data preparation steps in context. Suggested prerequisite course: "Introduction to Machine Learning" or an equivalent course covering basic ML algorithms.
4. Familiarity with Data Structures: Students should know how to work with data structures like arrays, lists, and dictionaries, as these are frequently used in data manipulation tasks.
Applies Towards the Following Certificates
- Applied Artificial Intelligence : Required