Introduction to the Python Set Data Structure
So far in this section of the course on Python data structures, we've covered lists, dictionaries, and tuples. In this last section, we are going to cover one more data structure: the set.

Now I left this one for last because you're not going to be using sets a ton when you're building out Python applications. However, you will come across them every once in a while, and I would be remiss if I did not show you the syntax and the reason why you might want to use a set.

If you're familiar with the dictionary syntax and the list syntax, a set looks like a merging of both, and we'll talk about a few of the key characteristics that make up a set as we build out our example. I'm going to set up a set of tags. Instead of using the square brackets we would use for a list, we use curly brackets, so it looks like we're creating a dictionary. We're not, though, because we're not using key-value pairs.

Instead we're just listing elements like we would in a traditional list, so I can say 'python', 'coding', and 'tutorials'. Now if I come down here and print out tags, everything works as normal. You can see we have tutorials, python, and coding, displayed with the set syntax of curly brackets. Notice that the output order differs from the order I typed, because a set is unordered.
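As a minimal sketch of the syntax described above (the name `tags` follows the guide, while `empty_tags` is only an illustrative name), a set literal uses curly brackets with plain elements rather than key-value pairs. One related detail worth knowing is that empty curly brackets create a dictionary, so an empty set has to be built with `set()`.

```python
# A set literal: curly brackets, but no key-value pairs
tags = {'python', 'coding', 'tutorials'}

print(type(tags))  # <class 'set'>
print(len(tags))   # 3

# Empty curly brackets create a dictionary, not a set,
# so an empty set is built with set() (illustrative name)
empty_tags = set()
print(type({}))          # <class 'dict'>
print(type(empty_tags))  # <class 'set'>
```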

Now let's talk about probably the most important reason why you would ever want to use a set: a set requires that all of its elements are unique. So if you ever need a data structure that looks a lot like a list but cannot allow duplicates, then a set might be a good pick for you. Notice how we have three elements here. If I add a duplicate item, such as coding again, and run this one more time, we still have three elements even though coding is listed twice. The duplicate is not included in the output.

That is very important: whenever we're using a set, it is guaranteed to contain only unique elements, and that is one of the top reasons to choose one.
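One common way to take advantage of this guarantee is removing duplicates from a list. The sketch below assumes some illustrative data (`tag_list` and `unique_tags` are made-up names, not part of the guide's example) and relies only on the built-in `set()` constructor and the `add` method.

```python
# A list happily keeps duplicates (illustrative data)
tag_list = ['python', 'coding', 'tutorials', 'coding']

# Converting to a set drops the repeated 'coding'
unique_tags = set(tag_list)

print(len(tag_list))     # 4
print(len(unique_tags))  # 3

# Adding an element that already exists leaves the set unchanged
unique_tags.add('python')
print(len(unique_tags))  # 3
```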

Now let's talk about how we can query our set. If you try to work with a set like you would a list, then you're going to run into some issues. If I say print and then tags with an index of 0, you may think this is going to give me python, because that's exactly how a list would function.

tags = {
  'python',
  'coding',
  'tutorials',
  'coding'
}
print(tags[0])

However, you can see that an error is already flagged right there. Let's try it anyway. When I run it, we get a traceback pointing at line 10, which is the print statement, and the message says that a set object does not support indexing. So this tells us exactly what our problem is.

So I'm going to comment that line out and add a little nope right here as a reminder: do not do that.

# Nope
# print(tags[0])
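If you do need to get at individual elements, a couple of alternatives are available. This is only a sketch under the assumption that you either want to visit every element or are happy to work with a sorted copy; `sorted_tags` is an illustrative name. The exact wording of the error message varies between Python versions, but the exception type is `TypeError`.

```python
tags = {'python', 'coding', 'tutorials'}

try:
    tags[0]
except TypeError as error:
    # The exact wording of the message varies between Python versions
    print(f'Indexing failed: {error}')

# Looping visits every element (in no guaranteed order)
for tag in tags:
    print(tag)

# sorted() returns a list, which does support indexing
sorted_tags = sorted(tags)
print(sorted_tags[0])  # coding
```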

But how can you query a set? It's actually kind of a cool syntax, so let's see what that looks like. I'm going to create a couple of queries. For query_one, I want to check whether python is in this set, so I can say 'python' in tags. When I print this out, it tells me that it is True.

So what our query gives us back is not the element, because we obviously already know what the element is. Instead, we are asking whether python exists in this set of tags, and if it does, the expression returns True.

Now if we duplicate this and create another query, I'm going to say query_two, check whether 'ruby' is in tags, and print that out. We get False, because ruby does not exist in the set of tags.

So we have two key types of behavior. If you ever need a collection of items and you don't need the full functionality of a list, but you do need the elements to be unique, then a set may be a good choice.

Also, if you have a collection of elements and you want to check whether an element exists in it, a set gives you a really nice syntax for doing that.
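Because the `in` expression evaluates to True or False, it can go straight into a conditional. Here is a small sketch of that idea; the `has_tag` helper is a hypothetical function written for illustration and is not part of the guide's code.

```python
tags = {'python', 'coding', 'tutorials'}

def has_tag(tag_set, name):
    """Return True if name is an element of tag_set (hypothetical helper)."""
    return name in tag_set

if has_tag(tags, 'python'):
    print('Tagged with python')

# 'not in' is the inverse check
if 'ruby' not in tags:
    print('No ruby tag here')
```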

Code

# Uniqueness
tags = {
  'python',
  'coding',
  'tutorials',
  'coding'
}

print(tags)

# Nope: a set does not support indexing, so this line raises a TypeError
# print(tags[0])

query_one = 'python' in tags
query_two = 'ruby' in tags

print(query_one)
print(query_two)