The course covers the use of big data tools, cloud computing platforms, and MLOps practices to process, manage, and analyze massive datasets efficiently, with emphasis on building robust data pipelines and deploying machine learning for real-world applications. Students gain practical experience with distributed computing frameworks, scalable data storage and processing tools, and cloud services to address the challenges posed by the growing volume and complexity of contemporary data. Topics include: Hadoop MapReduce, Hive, Spark, Streaming Analytics, Cloud Computing, NoSQL, and MLOps.
prereq: MSBA 6311, MSBA 6321, or instructor consent