Course for international guest/part-time students
- Faculty
- Faculty of Science
- Organization
- TTK Department of the Physics of Complex Systems
- Code
- dsexplorf20vm
- Title
- Data Exploration and Visualization
- Usual semester
- Spring
- Published semester
- 2025/26/2
- ECTS
- 6
- Language
- en
- Learning outcomes
- Learn the concepts and methods of data exploration, analysis and visualization, and gain practical experience in them, to support research and development in any field where data is generated, including physics.
  a) Knowledge: The student knows the numerical and computational methods of physics, as well as the fields of mathematics and computer science related to physics. The student is familiar with advanced methods of scientific research, self-education and communication. The student has a high level of scientific knowledge, knows the elements of practice based on it, and is able to systematize them. The student is familiar with the numerical tools and methods related to physics that are needed to practice the profession at an advanced level. The student is familiar with the terminology of data science and has extensive knowledge of the literature of the field.
  b) Abilities: The student is able to develop and operate high-quality industrial, IT and measurement systems based on physical laws and state-of-the-art processes, and is able to work in the fields of computational physics. Through regular professional self-education, the student is able to absorb new results in data science and apply them creatively in his/her work. The student is able to investigate the processes and systems of the field with methods accepted in data science practice. Based on in-depth knowledge of data science, the student is able to design data exploration, analysis and visualization workflows and apply them in the field. The student is able to continuously expand his/her knowledge and to continue studies in a doctoral school. With the knowledge and problem-solving skills acquired during the studies, the student is able to fill independent and complementary positions in any field where data needs to be managed.
  c) Attitude: The student is characterized by creativity, flexibility, problem-recognition and problem-solving skills, intuition, a methodical approach and data processing skills. The student is characterized by sensitivity to the environment, a positive attitude to professional development, and a commitment to quality work. The student actively cooperates with colleagues and participates constructively in group work. The student formulates the problems of his/her field of expertise clearly for both professionals and lay people. The student constantly strives to expand his/her knowledge and acquire new skills.
  d) Autonomy and responsibility: The student can autonomously handle and work with any given data and create reproducible reports that clearly indicate the source of the data.
- Course content
- The aim of the course is for students to gain practical skills in accessing large databases and datasets, handling data stored in different formats, exploring and distilling these data, and presenting and visualizing the information gathered. During the course students will encounter databases from multiple disciplines. Completing several projects allows students to gain experience in this field, which provides a firm foundation for later courses on theoretical data mining and advanced computing laboratories.
  1. Data types: images, time series, tables, graphs, textual data
  2. Standards of file and data formats
  3. Raw and processed data, metadata, data cleansing
  4. Developing open source software
  5. Accessing data locally and through the web, APIs
  6. Accessing scientific databases
  7. Using relational databases
  8. Transforming, sorting and combining data
  9. Basics of time series analysis
  10. Basics of image processing
  11. Dimension reduction, clustering
  12. Infographics, visualization
  13. Interactive data exploration tools
  14. Reproducible research
- Assessment method
- Students have to complete 8-10 assignments. Each assignment is graded on two aspects: (1) results and conclusions, and (2) presentation and report quality.
- Bibliography
- * Wes McKinney: Python for Data Analysis (O’Reilly 2013)
  * Joel Grus: Data Science from Scratch (O’Reilly 2015)
  * Reproducible Analysis, https://buildmedia.readthedocs.org/media/pdf/reproducible-analysis-workshop/latest/reproducible-analysis-workshop.pdf
  * Why build dashboards? https://towardsdatascience.com/dashboards-are-dead-b9f12eeb2ad2
  * Scientific code development and workflow, https://towardsdatascience.com/workflow-for-reportable-reusable-and-reproducible-computational-research-45d036c8a908