class: center, middle, inverse, title-slide

# Brief Summary: ggplot2 Learning Reflections
## SDS 192: Introduction to Data Science
### Shiya Cao, Statistical & Data Sciences, Smith College
### Fall 2024
---

# Load Packages

* "You have to load tidyverse in order to use ggplot2."

---

# Visualization Aesthetics

* "ggplot needs arguments to specify which dataset is being plotted and the aesthetic (which variable is mapped to which axis)"
* "I have learned how to map different aesthetic features, like color or fill, x and y, size, and shape to the ggplot function."
* "You can use size and shape to distinguish the different types of data (categorical/numerical)"
* "Color and fill function are also designed uniquely for plots like scatterplot or bar plots."
* "how to use the R Brewer Color package"
* "I've learned how to change the colors of my bars, point, lines, etc. in R using both packages and manual."
* "I learned about the data-to-ink ratio"

---

# Visualization Context

* "I've learned how to label my graphs and make sure they have titles and axis names that make them easy to read and understand."
* "Use labs() function to customize the x-axis/y-axis labels and titles in plots, making the plots more readable and informative."

---

# Composing & Interpreting Plots

* "I have learned how to visualize data by choosing a data frame, mapping variables to aesthetics, and choosing a geom."
* "I have learned why the structure of the ggplot function is the way it is, instead of mindlessly using the code"
* "You can visualize data in many different ways depending on the type of variables and data."
* "I've learned how to make scatterplots, bar graphs, line graphs, and boxplots in R."
* "I learned how to use ggplot to draw histogram, points, bar and boxplots."
* "the function geom_TYPEOFGRAPH( ) is needed to visualize any of the 5 named graphs"
* "Plots such as histogram and bar plots are designed for different types of variables (e.g. categorical, quantitative)."
* "Use '+' in order to link together functions."
* "how to use the facet_wrap function"

---

# Address Overplotting

* "jitter...is an amazing tool to add random noise to data so we can see a clear relationship between individual points. But we also need to be careful of how much jitter we want to add to avoid distortion of data"

---

# Coding

* "You need to make sure your spelling is correct otherwise the code will not run. Errors are our friends and they can help us figure out what is wrong with our code."
* "Adding space after operands makes it clearer for coder to debug, helping isolate errors and ensures the plot behaves as expected."
* "I have learned general coding knowledge/terminology (what functions and arguments are, downloading packages, etc)."
* "The basic 'grammar' of coding (where to add spaces, new lines, how to add comments, etc)"
* "Practice more, and you will code skillfully."

---

# MVP

* "I learned about creating a minimally viable product."
* "I have also learned not to let perfect be the enemy of good"
* "you can code by writing a simple function that works and progressively making it more complex, and error codes are extremely helpful"
* "Coding is iterative, and you don't have to get it right the first time"

---

# Datasets

* "how to view and understand large data sets"

---

# Authoring Markdown Documents

* "how to author and render a Quarto document"

---

# GitHub

* "I have learned how to commit code and push it to platforms like Github."
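
---

# Example: Putting the Pieces Together

A minimal sketch that ties several of the reflections above into one plot: a data frame, aesthetic mappings, a geom, layers linked with `+`, `labs()`, `facet_wrap()`, jitter, and a ColorBrewer palette. It assumes the `mpg` data frame bundled with ggplot2, which is an illustrative choice and not a dataset named in these slides. The jitter amounts are also arbitrary assumptions.

```r
# Loading ggplot2 directly is enough for this sketch;
# library(tidyverse) would attach it as well.
library(ggplot2)

# Assumption: mpg is the example data frame that ships with ggplot2.
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy, color = class)) +
  # Small random noise to reduce overplotting; the amounts are kept small
  # so the points stay close to their true values.
  geom_jitter(width = 0.05, height = 0.2, alpha = 0.7) +
  # One panel per drivetrain type.
  facet_wrap(vars(drv)) +
  # ColorBrewer palette for the categorical variable `class`.
  scale_color_brewer(palette = "Set2") +
  # Titles and axis labels give the plot its context.
  labs(
    title = "Highway mileage by engine displacement",
    x = "Engine displacement (liters)",
    y = "Highway miles per gallon",
    color = "Vehicle class"
  )

# Printing the object draws the plot.
print(p)
```

In the spirit of the MVP slide, one way to build this up is to start with only `ggplot()` and `geom_jitter()`, confirm that it runs, and then add one layer at a time.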