# This shows you where R is currently "standing"
getwd()
[1] "/Users/alexnewhouse/alexbnewhouse.github.io"
If you’ve ever found yourself confused about why R can’t find your data file, or why your script works on your computer but not your classmate’s, or why you keep getting “file not found” errors even though you know the file is there—this tutorial is for you!
Understanding how your computer organizes files and how R navigates through them is absolutely crucial for doing data analysis. It’s like learning to read a map before going on a road trip. Once you understand the basic concepts, you’ll spend less time wrestling with technical issues and more time focusing on your actual research.
A file system is simply how your computer organizes and stores files. Think of it like a giant filing cabinet with folders inside folders inside folders. Every file on your computer has an “address” that tells you exactly where to find it.
Let’s start with some basics that apply to both Windows and Mac:
- Files have names with extensions (like my_data.csv or homework.R)
Your home directory is a special folder that belongs just to you. It’s like your personal office space on the computer. This is where your operating system stores your personal files like Documents, Downloads, Desktop, etc.
On Windows, your home directory is typically:
C:\Users\YourName\
Inside your home directory, you’ll find familiar folders like:
- Documents\ - where many programs save files by default
- Desktop\ - the files you see on your desktop
- Downloads\ - where your web browser saves downloaded files
- Pictures\, Music\, Videos\ - for media files
On Mac, your home directory is typically:
/Users/YourName/
Inside your home directory, you’ll find familiar folders like:
- Documents/ - where many programs save files by default
- Desktop/ - the files you see on your desktop
- Downloads/ - where your web browser saves downloaded files
- Pictures/, Music/, Movies/ - for media files
Why does this matter for R? When you first open RStudio, it usually sets your working directory to your home directory. Understanding this helps you navigate to where your files actually are!
On Windows, your file system starts with drive letters like C:, D:, etc. The main drive is usually C:. Folders are separated by backslashes (\).
A typical Windows path might look like:
C:\Users\YourName\Documents\POLS101\homework1.R
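This means:
- C: is the drive
- Users is the folder that holds every user’s home directory
- YourName is your home directory
- Documents is a folder inside your home directory
- POLS101 is a folder inside Documents
- homework1.R is the file itself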
On Mac, the file system starts with a forward slash (/) and doesn’t use drive letters. Folders are separated by forward slashes (/).
A typical Mac path might look like:
/Users/YourName/Documents/POLS101/homework1.R
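This means:
- The first / is the top (root) of the file system
- Users is the folder that holds every user’s home directory
- YourName is your home directory
- Documents is a folder inside your home directory
- POLS101 is a folder inside Documents
- homework1.R is the file itself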
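As a rough illustration, base R can pull a path apart into these pieces. The path below is the tutorial's placeholder, not a real file on your machine; these functions only work on the text of the path, so nothing needs to exist.

```r
# Placeholder path from the example above; it does not need to exist
example_path <- "/Users/YourName/Documents/POLS101/homework1.R"

# The file name is the last piece of the path
file_name <- basename(example_path)

# The folder that contains the file is everything before it
containing_folder <- dirname(example_path)

# Split on "/" to see each step along the way
# (the first piece is empty because the path starts with "/")
path_pieces <- strsplit(example_path, "/", fixed = TRUE)[[1]]

print(file_name)
print(containing_folder)
print(path_pieces)
```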
Your working directory is like your current location in the file system. It’s where R is “standing” right now. When you tell R to open a file, it looks for that file starting from your working directory.
Think of it this way: if you’re in a library and someone says “go get the book on the third shelf,” you need to know which section of the library you’re currently in to know which third shelf they mean!
Let’s see what your current working directory is:
getwd()
[1] "/Users/alexnewhouse/alexbnewhouse.github.io"
When you first open RStudio, your working directory is usually set to your home folder. But this might not be where your data files are stored!
The Problem: You can see your data file in your file explorer/finder, but R says it can’t find it.
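What’s happening: Just because you can see the file doesn’t mean R knows where to look for it. By default, R looks for a file name only in your current working directory. It does not search subfolders or the rest of your computer, so a file inside a subfolder needs that folder in its path (for example, data/survey_results.csv).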
The Solution: Either move your file to your working directory, or tell R exactly where to find your file.
The Problem: Your code works on your computer but breaks on your classmate’s computer.
What’s happening: You’ve written the specific path that exists on your computer, but your classmate has a different username and maybe even a different folder structure.
The Solution: Use relative paths (explained below) or set up your project properly.
The Problem: Your files are scattered all over your computer, making it hard to keep track of what goes with what project.
The Solution: Create a dedicated folder for each project and keep everything related to that project inside it.
Understanding the difference between absolute and relative paths is crucial:
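An absolute path gives the complete address from the very top of your file system, such as C:\Users\YourName\Documents\POLS101\homework1.R on Windows or /Users/YourName/Documents/POLS101/homework1.R on Mac.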
A relative path gives directions from your current location (working directory):
data/survey_results.csv
This says: “from where I am now, go into the ‘data’ folder and find ‘survey_results.csv’”
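To make the relationship concrete, here is a small sketch in base R. It glues the working directory onto the front of the relative path, which is roughly what R does behind the scenes when it resolves one. The file does not need to exist for this to run.

```r
# A relative path: directions starting from the working directory
relative_path <- "data/survey_results.csv"

# Joining the working directory to it gives the matching absolute path
absolute_path <- file.path(getwd(), relative_path)

print(relative_path)
print(absolute_path)
```

The absolute version will differ on every computer, which is why the relative version is the one to put in shared code.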
For each research project or class, create a dedicated folder:
Inside your project folder, create subfolders like:
POLS101/
├── data/ # Store your datasets here
├── scripts/ # Store your R code here
├── output/ # Store graphs, tables, etc. here
└── documents/ # Store papers, notes, etc. here
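You can also build this layout from R instead of clicking through your file browser. The sketch below uses base R's dir.create(). The name project_root is just for illustration, and it points at a temporary folder so the demo doesn't touch your real Documents folder.

```r
# Demo location: a temporary folder, so nothing is created in your real Documents.
# For a real project, point this at your own folder instead.
project_root <- file.path(tempdir(), "POLS101")

# The four subfolders from the layout above
subfolders <- c("data", "scripts", "output", "documents")

for (folder in subfolders) {
  # recursive = TRUE also creates POLS101 itself if it is missing;
  # showWarnings = FALSE keeps R quiet if the folder already exists
  dir.create(file.path(project_root, folder), recursive = TRUE, showWarnings = FALSE)
}

# Confirm what was created
print(list.files(project_root))
```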
RStudio has a fantastic feature called “Projects” that makes file management much easier. When you create an RStudio Project, it automatically sets your working directory to the project folder.
Here’s how to create a new project:
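- In RStudio, choose File > New Project, then New Directory, then New Project
- Name the project and choose where to put it (on Windows, somewhere like C:\Users\YourName\Documents\; on Mac, somewhere like /Users/YourName/Documents/)
- Click Create Project
Now let’s look at how to actually work with files in your R code: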
If you’re not using RStudio Projects, you might need to set your working directory manually:
Pro tip: Notice that even on Windows, you can use forward slashes (/) in R code. This makes your code more portable between operating systems!
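Here is a minimal sketch of setting the working directory by hand with setwd(). To keep it runnable anywhere, it moves into R's temporary folder; project_folder is a stand-in that you would replace with the path to your own project.

```r
# Remember where R is standing now so we can go back afterwards
original_wd <- getwd()

# Demo folder that is guaranteed to exist; swap in your own project folder,
# e.g. "C:/Users/YourName/Documents/POLS101" (forward slashes work on Windows too)
project_folder <- tempdir()

setwd(project_folder)
print(getwd())

# Return to the original working directory
setwd(original_wd)
```

setwd() stops with an error if the folder doesn't exist, so a typo in the path shows up right away.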
Once your working directory is set correctly, you can use relative paths:
# This will work for anyone who has the same folder structure
library(tidyverse)
# Load data from the 'data' subfolder
survey_data <- read_csv("data/survey_results.csv")
# Create a plot of income against age
age_income_plot <- survey_data %>%
  ggplot(aes(x = age, y = income)) +
  geom_point()
# Save it to the 'output' subfolder (ggsave() is its own call, not a layer added with +)
ggsave("output/age_income_plot.png", plot = age_income_plot)
Before trying to load a file, you can check if R can find it:
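One way to do that check, using only base R. The path is the one from the example above; if the file isn't there, the sketch prints where R is looking and what it can see, which is usually enough to spot the problem.

```r
# Path to check, relative to the working directory
data_path <- "data/survey_results.csv"

found <- file.exists(data_path)

if (found) {
  message("Found it: ", data_path)
} else {
  # Show where R is looking and what it can see there
  message("R cannot find ", data_path)
  message("Working directory: ", getwd())
  print(list.files())
}
```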
One of the biggest sources of confusion for students is understanding how R Markdown (.Rmd) files handle file paths differently when you’re working interactively versus when you knit the document. This is critically important for completing assignments!
When you run code chunks interactively in RStudio (clicking the green arrow or pressing Ctrl/Cmd+Enter), R uses your current working directory. But when you knit your R Markdown document, R temporarily changes the working directory to wherever your .Rmd file is saved.
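This means:
- When you run chunks interactively, paths are relative to getwd() (your current working directory)
- When you knit, paths are relative to the folder where your .Rmd file is saved
Here’s a typical scenario that trips up students: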
The Problem:
Your working directory is set to POLS101/
You run this code in a chunk and it works fine:
But when you knit, you get “file not found” errors!
What’s happening: When you knit, R looks for data/survey_data.csv starting from the folder where homework1.Rmd is saved. If that folder is POLS101/, R looks for POLS101/data/survey_data.csv, which is correct. The errors appear when the two starting points differ: either the .Rmd file is saved somewhere other than POLS101/ (for example, in a subfolder), or your working directory was set to something else when you ran the chunk interactively.
The easiest solution is to save your .Rmd file in the main project folder:
POLS101/ # <- Your project folder
├── homework1.Rmd # <- Save your .Rmd here
├── data/
│ └── survey_data.csv
└── output/
└── plots/
Then use relative paths from your project root:
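A sketch of what those paths might look like, assuming the folder layout above and a working directory of POLS101/. The plot file name is made up for illustration. The sketch uses base R's read.csv() to avoid extra packages; read_csv() from the tidyverse takes the path the same way.

```r
# Paths written relative to the project root (POLS101/ in the layout above)
data_path <- file.path("data", "survey_data.csv")
plot_path <- file.path("output", "plots", "age_income_plot.png")  # hypothetical file name

print(data_path)
print(plot_path)

# Only read the file if it is really there, so this sketch runs anywhere
if (file.exists(data_path)) {
  survey_data <- read.csv(data_path)
  print(head(survey_data))
}
```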
The here package (highly recommended!)
The here package solves this problem elegantly:
The here() function automatically finds your project root (where your .Rproj file is, or other indicators) and builds paths from there.
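A short sketch of here() in action, assuming the here package is installed and using the file name from the layout above. The path is built whether or not the file exists, so it is worth checking before you read it.

```r
# Run only if the here package is installed: install.packages("here")
if (requireNamespace("here", quietly = TRUE)) {
  # The folder that here has identified as the project root
  print(here::here())

  # Build a path from the root, one folder or file name per argument
  data_path <- here::here("data", "survey_data.csv")
  print(data_path)

  # Check that the file is really there before trying to read it
  print(file.exists(data_path))
} else {
  message("The here package is not installed.")
}
```

Because the path starts from the project root rather than the working directory, the same line gives the same answer whether you run it interactively or knit.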
Add this to the top of your R Markdown document to see where knitting thinks it is:
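A minimal version of such a chunk, using only base R:

```r
# Where is R standing right now?
cat("Working directory:", getwd(), "\n")

# What can R see from there? The data/ folder should be listed
# if relative paths like "data/survey_data.csv" are going to work
print(list.files())
```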
Here’s a workflow that works reliably for assignments:
- Use the here package in your .Rmd file
# Example assignment .Rmd structure:
# This should work both interactively and when knitting
library(tidyverse)
library(here) # Recommended!
# Load data (works both ways)
data <- read_csv(here("data", "assignment_data.csv"))
# Do your analysis
summary_stats <- data %>%
  summarize(mean_age = mean(age),
            mean_income = mean(income))
# Build the plot first, then save it (works both ways)
scatter_plot <- data %>%
  ggplot(aes(x = age, y = income)) +
  geom_point()
# ggsave() is its own call, not a layer added with +
ggsave(here("output", "scatter_plot.png"), plot = scatter_plot)
Before submitting your assignment, always:
If your .Rmd file won’t knit due to file path errors:
Step 1: Add this diagnostic chunk at the top of your .Rmd:
Step 2: Compare the output when you:
- Run the chunk interactively
- Knit the document
Step 3: Adjust your file paths based on what you see. If knitting shows you’re in a different location than you expected, adjust your paths accordingly.
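One possible diagnostic chunk for these steps is sketched below. It relies on the knitr.in.progress option, which knitr sets while a document is being knitted, to label which situation you are in. The path in path_to_check is a placeholder; replace it with the file your chunk fails to load.

```r
# knitr sets this option while a document is being knitted
is_knitting <- isTRUE(getOption("knitr.in.progress"))

cat("Knitting right now?", is_knitting, "\n")
cat("Working directory: ", getwd(), "\n")

# Placeholder path; replace with the file your chunk fails to load
path_to_check <- "data/survey_data.csv"
cat("Can R see", path_to_check, "from here?", file.exists(path_to_check), "\n")
```

If the working directory printed while knitting differs from the one printed interactively, that difference is the reason the same relative path works in one case and fails in the other.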
Many students lose points on assignments because:
Understanding this concept will save you time and frustration, and help ensure your assignments are complete when you submit them!
Let’s walk through setting up a complete research project:
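Using File Explorer (Windows):
- Go to C:\Users\YourName\Documents\
- Create your project folder, then add these subfolders inside it: data, scripts, output, documents
Using Finder (Mac):
- Go to /Users/YourName/Documents/
- Create your project folder, then add these subfolders inside it: data, scripts, output, documents
Create a new R script and try this code: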
# Check where you are
getwd()
# Create some sample data
library(tidyverse)
sample_data <- data.frame(
id = 1:10,
score = rnorm(10, mean = 75, sd = 10)
)
# Save it to your data folder
write_csv(sample_data, "data/sample_data.csv")
# Load it back (to test that it works)
loaded_data <- read_csv("data/sample_data.csv")
print(loaded_data)
# Create a simple plot
sample_plot <- loaded_data %>%
  ggplot(aes(x = id, y = score)) +
  geom_point() +
  geom_line() +
  labs(title = "Sample Data Plot",
       x = "ID",
       y = "Score")
# Save it to your output folder (ggsave() is its own call, not a layer added with +)
ggsave("output/sample_plot.png", plot = sample_plot)
print("Success! Your project structure is working correctly.")
Solutions:
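- Check your working directory with getwd()
- Check whether R can see the file with file.exists("your_filename")
- See what is in your working directory with list.files()
The Problem: On Windows, OneDrive often takes over your Documents, Desktop, and Pictures folders and moves them into its own synced folder. This can cause major confusion because your file paths might look like: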
C:\Users\YourName\OneDrive\Documents\POLS101\data.csv
instead of the expected:
C:\Users\YourName\Documents\POLS101\data.csv
What’s happening: Microsoft OneDrive “hijacks” your default folders to sync them to the cloud. While this can be useful for backup, it changes where your files are actually stored and can make file paths unpredictable.
Common signs you have OneDrive issues:
Solutions:
Check your actual file locations - Use getwd() in R to see where you really are, and list.files() to see what’s actually in your working directory.
Use the full OneDrive path - If your files are in OneDrive, use the full path:
Move your project outside OneDrive - Create your R projects in a folder that’s NOT managed by OneDrive:
C:\Users\YourName\R_Projects\POLS101\
Turn off OneDrive folder redirection (advanced):
Use RStudio Projects - They help manage paths automatically, regardless of where your project folder is located.
Pro tip: If you’re unsure whether OneDrive is affecting your files, open File Explorer and look at the address bar when you navigate to Documents. If it shows “OneDrive” in the path, then OneDrive is managing that folder.
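You can run a similar check from inside R. This sketch simply looks for the text "OneDrive" in your working directory, assuming the synced folder has "OneDrive" in its name as in the example path above.

```r
# Use forward slashes throughout so the check reads the same on every system
current_wd <- normalizePath(getwd(), winslash = "/", mustWork = FALSE)

# Does any part of the path mention OneDrive?
in_onedrive <- grepl("OneDrive", current_wd, fixed = TRUE)

if (in_onedrive) {
  message("Your working directory is inside OneDrive: ", current_wd)
} else {
  message("No OneDrive folder in your working directory: ", current_wd)
}
```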
If your folders or files have spaces in their names, you need to put quotes around the entire path:
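A small sketch of what this looks like. The folder and file names are invented for the demo, and everything is created in R's temporary folder. Note that one pair of quotes wraps the whole path; no extra escaping is needed for the spaces.

```r
# A demo folder whose name contains spaces, created in a temporary location
folder_with_spaces <- file.path(tempdir(), "My POLS 101 Files")
dir.create(folder_with_spaces, showWarnings = FALSE)

# The whole path sits inside one pair of quotes, spaces included
file_path <- file.path(folder_with_spaces, "survey results.csv")
write.csv(data.frame(id = 1:3), file_path, row.names = FALSE)

# Reading it back works the same as for any other path
print(read.csv(file_path))
```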
The here package
The here package is fantastic for making your code work across different computers and operating systems:
R has some built-in shortcuts for common locations:
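The sketch below shows a few of them in base R. The POLS101 folder in the last line is just the tutorial's running example.

```r
# "~" stands for your home directory
print(path.expand("~"))

# "." is the current working directory
print(normalizePath("."))

# ".." is the folder one level above the working directory
print(normalizePath(".."))

# file.path() joins the pieces with forward slashes for you
print(file.path("~", "Documents", "POLS101"))
```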
- Check where you are with getwd()
- Use forward slashes / in your R code
- Use file.exists() and list.files() to debug issues
The most important thing is to be intentional about where you put your files and consistent in how you organize your projects. A little bit of organization at the beginning will save you hours of frustration later!
Remember: everyone struggles with file paths and working directories when they’re learning R. Don’t get discouraged—with practice, it becomes second nature. And when in doubt, ask for help!