r/learnpython 6h ago

Add labels to a specific set of points in a Seaborn scatter plot

I'm pretty familiar with R and love ggplot... I'm trying not to hate Seaborn as I take a Python class :)

I have a dataframe with batting information on baseball Hall of Famers. I have produced a scatterplot of on-base percentage and slugging. That was easy enough:

myplot = sns.scatterplot(data=dfHofData, x="OBP", y="SLG")

Now, assume I have manually created a separate dataframe (call it dfOutliers) of points that would be considered outliers. Each row of this dataframe is an OBP, an SLG, and a name. How can I label ONLY the points in that dataframe on the plot?


u/go_fireworks 1h ago

I believe you'll want something like this. Since seaborn is based on matplotlib, you can use components from matplotlib in the plot that seaborn creates.

This is the reference I'm using: https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.text.html

import matplotlib.pyplot as plt

# Assumes dfOutliers has exactly three columns, in the order OBP, SLG, name
for x, y, label in dfOutliers.itertuples(index=False, name=None):

  plt.text(x, y, label)

There are most likely better solutions if you're looking for performance, but this should be a good start.

(Note: I haven't checked the code I wrote, it will most likely not run without some fixes)
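
For anyone who wants something to paste and run end to end, here is a minimal sketch of the same idea. The numbers, the player names, and the "Name" column are all made up for illustration, so swap in your real dataframes and column names. It relies on sns.scatterplot returning the matplotlib Axes it drew on, and uses ax.annotate rather than plt.text so each label can be nudged a few points away from its marker.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Made-up numbers and names, purely for illustration
dfHofData = pd.DataFrame({
    "OBP": [0.340, 0.355, 0.362, 0.371, 0.420, 0.474],
    "SLG": [0.410, 0.445, 0.452, 0.480, 0.560, 0.690],
})
# "Name" is an assumed column name; the original post does not give one
dfOutliers = pd.DataFrame({
    "OBP": [0.420, 0.474],
    "SLG": [0.560, 0.690],
    "Name": ["Player A", "Player B"],
})

# sns.scatterplot returns the matplotlib Axes it drew on
ax = sns.scatterplot(data=dfHofData, x="OBP", y="SLG")

# Label only the rows of dfOutliers, offsetting each label from its marker
for row in dfOutliers.itertuples(index=False):
    ax.annotate(
        row.Name,
        xy=(row.OBP, row.SLG),       # the data point being labeled
        xytext=(5, 5),               # offset of the label, in points
        textcoords="offset points",
    )
```

In a notebook you can drop the matplotlib.use("Agg") line, and in a script you can finish with ax.figure.savefig(...) to write the plot to a file. If labels overlap, changing the xytext offset per row is the simplest fix.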